An Ontology-Based Tourism Recommender System Based on Spreading Activation Model
NASA Astrophysics Data System (ADS)
Bahramian, Z.; Abbaspour, R. Ali
2015-12-01
A tourist has limited time and budget and therefore needs to select points of interest (POIs) optimally. Because the available information about POIs is overwhelming, it is difficult for a tourist to select the most appropriate ones given his or her preferences. In this paper, a new travel recommender system is proposed to overcome the information overload problem. A recommender system (RS) evaluates the overwhelming number of POIs and provides personalized recommendations to users based on their preferences. A content-based recommendation system is proposed, which uses information about the user's preferences and the POIs and calculates a degree of similarity between them. It selects the POIs that have the highest similarity with the user's preferences. The proposed content-based recommender system is enhanced using ontological information about the tourism domain to represent both the user profile and the recommendable POIs. The proposed ontology-based recommendation process is performed in three steps: an ontology-based content analyzer, an ontology-based profile learner, and an ontology-based filtering component. The user's feedback adapts the user's preferences using a Spreading Activation (SA) strategy. The results show that the proposed recommender system is effective and improves on the overall performance of traditional content-based recommender systems.
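The abstract above does not give the authors' formulas, so the following is only a minimal sketch of the general idea: feedback on one concept is spread to related ontology concepts, and POIs are then ranked by cosine similarity to the updated profile. The toy ontology, the edge weights, the `decay` and `threshold` parameters, and the POI names are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy tourism ontology: concept -> [(related concept, edge weight)].
ONTOLOGY = {
    "Museum": [("Culture", 0.8)],
    "Gallery": [("Culture", 0.8)],
    "Culture": [("Museum", 0.5), ("Gallery", 0.5)],
    "Park": [("Nature", 0.8)],
    "Nature": [("Park", 0.5)],
}

# Hypothetical POIs described by the ontology concepts they belong to.
POIS = {
    "City Museum": {"Museum": 1.0, "Culture": 1.0},
    "Art Gallery": {"Gallery": 1.0, "Culture": 1.0},
    "Botanic Garden": {"Park": 1.0, "Nature": 1.0},
}


def spread_activation(profile, feedback, decay=0.5, threshold=0.05):
    """Add feedback to the profile and propagate it to related concepts.

    Each concept passes activation on at most once, which guarantees
    termination; activations below `threshold` are not propagated further.
    """
    updated = defaultdict(float, profile)
    queue = list(feedback.items())
    fired = set()
    while queue:
        concept, activation = queue.pop(0)
        updated[concept] += activation
        if concept in fired or abs(activation) < threshold:
            continue
        fired.add(concept)
        for neighbour, weight in ONTOLOGY.get(concept, []):
            queue.append((neighbour, activation * weight * decay))
    return dict(updated)


def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(profile, pois, top_n=2):
    """Return the names of the POIs most similar to the profile."""
    ranked = sorted(pois, key=lambda name: cosine(profile, pois[name]), reverse=True)
    return ranked[:top_n]


# The user slightly likes parks, then gives positive feedback on a museum.
updated = spread_activation({"Park": 0.2}, {"Museum": 1.0})
top = recommend(updated, POIS, top_n=2)
```

In this sketch, positive feedback on a museum also raises the score of the related "Culture" and "Gallery" concepts, so a gallery can be recommended even though the user never rated one.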
Ontology-Based Retrieval of Spatially Related Objects for Location Based Services
NASA Astrophysics Data System (ADS)
Haav, Hele-Mai; Kaljuvee, Aivi; Luts, Martin; Vajakas, Toivo
Advanced Location Based Service (LBS) applications have to integrate information stored in GIS, information about users' preferences (profile), contextual information, and information about the application itself. Ontology engineering provides methods to semantically integrate several data sources. We propose an ontology-driven LBS development framework: the paper describes the architecture of ontologies and their usage for retrieval of spatially related objects relevant to the user. Our main contribution is to enable personalised ontology-driven LBS by providing a novel approach for defining personalised semantic spatial relationships by means of ontologies. The approach is illustrated by an industrial case study.
A UML profile for the OBO relation ontology.
Guardia, Gabriela D A; Vêncio, Ricardo Z N; de Farias, Cléver R G
2012-01-01
Ontologies have increasingly been used in the biomedical domain, which has prompted the emergence of different initiatives to facilitate their development and integration. The Open Biological and Biomedical Ontologies (OBO) Foundry consortium provides a repository of life-science ontologies, which are developed according to a set of shared principles. This consortium has developed an ontology called OBO Relation Ontology aiming at standardizing the different types of biological entity classes and associated relationships. Since ontologies are primarily intended to be used by humans, the use of graphical notations for ontology development facilitates the capture, comprehension and communication of knowledge between its users. However, OBO Foundry ontologies are captured and represented basically using text-based notations. The Unified Modeling Language (UML) provides a standard and widely-used graphical notation for modeling computer systems. UML provides a well-defined set of modeling elements, which can be extended using a built-in extension mechanism named Profile. Thus, this work aims at developing a UML profile for the OBO Relation Ontology to provide a domain-specific set of modeling elements that can be used to create standard UML-based ontologies in the biomedical domain.
Ontology Performance Profiling and Model Examination: First Steps
NASA Astrophysics Data System (ADS)
Wang, Taowei David; Parsia, Bijan
"[Reasoner] performance can be scary, so much so, that we cannot deploy the technology in our products." - Michael Shepard. What are typical OWL users to do when their favorite reasoner never seems to return? In this paper, we present our first steps considering this problem. We describe the challenges and our approach, and present a prototype tool to help users identify reasoner performance bottlenecks with respect to their ontologies. We then describe 4 case studies on synthetic and real-world ontologies. While the anecdotal evidence suggests that the service can be useful for both ontology developers and reasoner implementors, much more is desired.
A UML profile for the OBO relation ontology
2012-01-01
Background Ontologies have increasingly been used in the biomedical domain, which has prompted the emergence of different initiatives to facilitate their development and integration. The Open Biological and Biomedical Ontologies (OBO) Foundry consortium provides a repository of life-science ontologies, which are developed according to a set of shared principles. This consortium has developed an ontology called OBO Relation Ontology aiming at standardizing the different types of biological entity classes and associated relationships. Since ontologies are primarily intended to be used by humans, the use of graphical notations for ontology development facilitates the capture, comprehension and communication of knowledge between its users. However, OBO Foundry ontologies are captured and represented basically using text-based notations. The Unified Modeling Language (UML) provides a standard and widely-used graphical notation for modeling computer systems. UML provides a well-defined set of modeling elements, which can be extended using a built-in extension mechanism named Profile. Thus, this work aims at developing a UML profile for the OBO Relation Ontology to provide a domain-specific set of modeling elements that can be used to create standard UML-based ontologies in the biomedical domain. Results We have studied the OBO Relation Ontology, the UML metamodel and the UML profiling mechanism. Based on these studies, we have proposed an extension to the UML metamodel in conformance with the OBO Relation Ontology and we have defined a profile that implements the extended metamodel. Finally, we have applied the proposed UML profile in the development of a number of fragments from different ontologies. Particularly, we have considered the Gene Ontology (GO), the PRotein Ontology (PRO) and the Xenopus Anatomy and Development Ontology (XAO). 
Conclusions The use of an established and well-known graphical language in the development of biomedical ontologies provides a more intuitive form of capturing and representing knowledge than using only text-based notations. The use of the profile requires the domain expert to reason about the underlying semantics of the concepts and relationships being modeled, which helps prevent the introduction of inconsistencies in an ontology under development and facilitates the identification and correction of errors in an already defined ontology. PMID:23095840
Gibert, Karina; Valls, Aida; Riaño, David
2008-01-01
One of the tasks towards the definition of a knowledge model for home care is the definition of the different roles of the users involved in the system. The roles determine the actions and services that can or must be performed by each type of user. In this paper the experience of building an ontology to represent the home-care users and their associated information is presented, in a proposal for a standard model of a Home-Care support system to the European Community.
Supporting Multi-view User Ontology to Understand Company Value Chains
NASA Astrophysics Data System (ADS)
Zuo, Landong; Salvadores, Manuel; Imtiaz, Sm Hazzaz; Darlington, John; Gibbins, Nicholas; Shadbolt, Nigel R.; Dobree, James
The objective of the Market Blended Insight (MBI) project is to develop web based techniques to improve the performance of UK Business to Business (B2B) marketing activities. The analysis of company value chains is a fundamental task within MBI because it is an important model for understanding the market place and the company interactions within it. The project has aggregated rich data profiles of 3.7 million companies that form the active UK business community. The profiles are augmented by Web extractions from heterogeneous sources to provide unparalleled business insight. Advances by the Semantic Web in knowledge representation and logic reasoning allow flexible integration of data from heterogeneous sources, transformation between different representations and reasoning about their meaning. The MBI project has identified that the market insight and analysis interests of different types of users are difficult to maintain using a single domain ontology. Therefore, the project has developed a technique to undertake a plurality of analyses of value chains by deploying a distributed multi-view ontology to capture different user views over the classification of companies and their various relationships.
Semantically Enhanced Recommender Systems
NASA Astrophysics Data System (ADS)
Ruiz-Montiel, Manuela; Aldana-Montes, José F.
Recommender Systems have become a significant area in the context of web personalization, given the large amount of available data. Ontologies can be widely exploited in recommender systems, since they provide a means of classifying and discovering new information about the items to recommend, about user profiles, and even about their context. We have developed a semantically enhanced recommender system based on ontologies of this kind. In this paper we present a description of the proposed system.
Adaptive Semantic and Social Web-based learning and assessment environment for the STEM
NASA Astrophysics Data System (ADS)
Babaie, Hassan; Atchison, Chris; Sunderraman, Rajshekhar
2014-05-01
We are building a cloud- and Semantic Web-based personalized, adaptive learning environment for the STEM fields that integrates and leverages Social Web technologies to allow instructors and authors of learning material to collaborate in semi-automatic development and update of their common domain and task ontologies and building their learning resources. The semi-automatic ontology learning and development minimize issues related to the design and maintenance of domain ontologies by knowledge engineers who do not have any knowledge of the domain. The social web component of the personal adaptive system will allow individual and group learners to interact with each other and discuss their own learning experience and understanding of course material, and resolve issues related to their class assignments. The adaptive system will be capable of representing key knowledge concepts in different ways and difficulty levels based on learners' differences, and lead to different understanding of the same STEM content by different learners. It will adapt specific pedagogical strategies to individual learners based on their characteristics, cognition, and preferences, allow authors to assemble remotely accessed learning material into courses, and provide facilities for instructors to assess (in real time) the perception of students of course material, monitor their progress in the learning process, and generate timely feedback based on their understanding or misconceptions. The system applies a set of ontologies that structure the learning process, with multiple user friendly Web interfaces. 
These include the learning ontology (models learning objects, educational resources, and learning goal); context ontology (supports adaptive strategy by detecting student situation); domain ontology (structures concepts and context); learner ontology (models student profile, preferences, and behavior); task ontologies; technological ontology (defines devices and places that surround the student); pedagogy ontology; and learner ontology (defines time constraint, comment, profile).
SPONGY (SPam ONtoloGY): Email Classification Using Two-Level Dynamic Ontology
2014-01-01
Email is one of the most common communication methods between people on the Internet. However, the increase of email misuse/abuse has resulted in an increasing volume of spam emails over recent years. An experimental system has been designed and implemented with the hypothesis that this method would outperform existing techniques, and the experimental results showed that the proposed ontology-based approach indeed improves spam filtering accuracy significantly. In this paper, two levels of ontology spam filters were implemented: a first-level global ontology filter and a second-level user-customized ontology filter. The global ontology filter alone filtered about 91% of spam, which is comparable with other methods. The user-customized ontology filter was created based on the specific user's background as well as the filtering mechanism used in the global ontology filter creation. The main contributions of the paper are (1) to introduce an ontology-based multilevel filtering technique that uses both a global ontology and an individual filter for each user to increase spam filtering accuracy and (2) to create a spam filter in the form of an ontology, which is user-customized, scalable, and modularized, so that it can be embedded in many other systems for better performance. PMID:25254240
SPONGY (SPam ONtoloGY): email classification using two-level dynamic ontology.
Youn, Seongwook
2014-01-01
Email is one of the most common communication methods between people on the Internet. However, the increase of email misuse/abuse has resulted in an increasing volume of spam emails over recent years. An experimental system has been designed and implemented with the hypothesis that this method would outperform existing techniques, and the experimental results showed that the proposed ontology-based approach indeed improves spam filtering accuracy significantly. In this paper, two levels of ontology spam filters were implemented: a first-level global ontology filter and a second-level user-customized ontology filter. The global ontology filter alone filtered about 91% of spam, which is comparable with other methods. The user-customized ontology filter was created based on the specific user's background as well as the filtering mechanism used in the global ontology filter creation. The main contributions of the paper are (1) to introduce an ontology-based multilevel filtering technique that uses both a global ontology and an individual filter for each user to increase spam filtering accuracy and (2) to create a spam filter in the form of an ontology, which is user-customized, scalable, and modularized, so that it can be embedded in many other systems for better performance.
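The abstract does not describe how the two filter levels are combined, so the sketch below only illustrates the layering idea under a simple assumption: a shared global filter scores a message, and a per-user filter adjusts that score. The `OntologyFilter` class, the weighted terms, and the additive combination are hypothetical stand-ins, not the SPONGY implementation.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyFilter:
    """Toy stand-in for an ontology filter: weighted indicator terms."""
    spam_terms: dict = field(default_factory=dict)
    ham_terms: dict = field(default_factory=dict)

    def score(self, text):
        """Return spam evidence minus legitimate-mail evidence for a message."""
        words = text.lower().split()
        spam = sum(self.spam_terms.get(w, 0.0) for w in words)
        ham = sum(self.ham_terms.get(w, 0.0) for w in words)
        return spam - ham


def classify(text, global_filter, user_filter, threshold=1.0):
    """Level 1 (global) and level 2 (user) scores are summed (an assumption)."""
    total = global_filter.score(text) + user_filter.score(text)
    return "spam" if total >= threshold else "ham"


# Hypothetical global knowledge shared by all users.
GLOBAL = OntologyFilter(spam_terms={"lottery": 1.5, "mortgage": 1.0})

# Hypothetical user who works with mortgages, so the word is normal for them.
BROKER = OntologyFilter(ham_terms={"mortgage": 1.5})
DEFAULT_USER = OntologyFilter()

message = "new mortgage rates"
for_broker = classify(message, GLOBAL, BROKER)
for_default = classify(message, GLOBAL, DEFAULT_USER)
```

The same message is classified differently for the two users, which is the kind of personalization a second, user-customized level is meant to provide.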
RuleGO: a logical rules-based tool for description of gene groups by means of Gene Ontology
Gruca, Aleksandra; Sikora, Marek; Polanski, Andrzej
2011-01-01
Genome-wide expression profiles obtained with the use of DNA microarray technology provide an abundance of experimental data on biological and molecular processes. Such amounts of data need to be further analyzed and interpreted in order to draw biological conclusions from the experimental results. The analysis requires a lot of experience and is usually a time-consuming process. Thus, various annotation databases are frequently used to improve the whole process of analysis. Here, we present RuleGO, a web-based application that allows the user to describe gene groups on the basis of logical rules that include Gene Ontology (GO) terms in their premises. The presented application allows obtaining rules that reflect the co-appearance of GO terms describing the genes supported by the rules. The ontology level and the number of co-appearing GO terms are adjusted automatically. The user only limits the space of possible solutions. The RuleGO application is freely available at http://rulego.polsl.pl/. PMID:21715384
Semantic technologies in a decision support system
NASA Astrophysics Data System (ADS)
Wasielewska, K.; Ganzha, M.; Paprzycki, M.; Bǎdicǎ, C.; Ivanovic, M.; Lirkov, I.
2015-10-01
The aim of our work is to design a decision support system based on ontological representation of domain(s) and semantic technologies. Specifically, we consider the case when Grid / Cloud user describes his/her requirements regarding a "resource" as a class expression from an ontology, while the instances of (the same) ontology represent available resources. The goal is to help the user to find the best option with respect to his/her requirements, while remembering that user's knowledge may be "limited." In this context, we discuss multiple approaches based on semantic data processing, which involve different "forms" of user interaction with the system. Specifically, we consider: (a) ontological matchmaking based on SPARQL queries and class expression, (b) graph-based semantic closeness of instances representing user requirements (constructed from the class expression) and available resources, and (c) multicriterial analysis based on the AHP method, which utilizes expert domain knowledge (also ontologically represented).
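Approach (c) can be illustrated with a small sketch. It assumes the common row geometric-mean approximation of AHP priority weights, which the abstract does not specify; the criteria, pairwise judgements, resource names, and normalised scores are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    `pairwise[i][j]` states how much more important criterion i is than
    criterion j, with pairwise[j][i] == 1 / pairwise[i][j].
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def rank_resources(resources, criteria, weights):
    """Order resources by weighted sum of scores (scores in [0, 1], higher is better)."""
    def total(name):
        return sum(w * resources[name][c] for c, w in zip(criteria, weights))
    return sorted(resources, key=total, reverse=True)


# Hypothetical expert judgements: cpu > memory > price.
CRITERIA = ["cpu", "memory", "price"]
PAIRWISE = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]

# Hypothetical available resources with already normalised scores.
RESOURCES = {
    "node-a": {"cpu": 0.9, "memory": 0.5, "price": 0.2},
    "node-b": {"cpu": 0.4, "memory": 0.9, "price": 0.9},
}

weights = ahp_weights(PAIRWISE)
ranking = rank_resources(RESOURCES, CRITERIA, weights)
```

A full AHP implementation would also compute a consistency ratio for the judgement matrix; that check is omitted here to keep the sketch short.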
Prestat, Emmanuel; David, Maude M.; Hultman, Jenni; ...
2014-09-26
A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. 'profiles') were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.
Yan, Xianghe; Peng, Yun; Meng, Jianghong; Ruzante, Juliana; Fratamico, Pina M; Huang, Lihan; Juneja, Vijay; Needleman, David S
2011-01-01
Several factors have hindered effective use of information and resources related to food safety due to inconsistency among semantically heterogeneous data resources, lack of knowledge on profiling of food-borne pathogens, and knowledge gaps among research communities, government risk assessors/managers, and end-users of the information. This paper discusses technical aspects in the establishment of a comprehensive food safety information system consisting of the following steps: (a) computational collection and compiling publicly available information, including published pathogen genomic, proteomic, and metabolomic data; (b) development of ontology libraries on food-borne pathogens and design automatic algorithms with formal inference and fuzzy and probabilistic reasoning to address the consistency and accuracy of distributed information resources (e.g., PulseNet, FoodNet, OutbreakNet, PubMed, NCBI, EMBL, and other online genetic databases and information); (c) integration of collected pathogen profiling data, Foodrisk.org ( http://www.foodrisk.org ), PMP, Combase, and other relevant information into a user-friendly, searchable, "homogeneous" information system available to scientists in academia, the food industry, and government agencies; and (d) development of a computational model in semantic web for greater adaptability and robustness.
Noy, Natalya; Tudorache, Tania; Nyulas, Csongor; Musen, Mark
2010-01-01
Ontologies have become a critical component of many applications in biomedical informatics. However, the landscape of the ontology tools today is largely fragmented, with independent tools for ontology editing, publishing, and peer review: users develop an ontology in an ontology editor, such as Protégé; and publish it on a Web server or in an ontology library, such as BioPortal, in order to share it with the community; they use the tools provided by the library or mailing lists and bug trackers to collect feedback from users. In this paper, we present a set of tools that bring the ontology editing and publishing closer together, in an integrated platform for the entire ontology lifecycle. This integration streamlines the workflow for collaborative development and increases integration between the ontologies themselves through the reuse of terms. PMID:21347039
Hwang, Wonil; Salvendy, Gavriel
2005-06-10
Ontologies, as a possible element of organizational memory information systems, appear to support organizational learning. Ontology tools can be used to share knowledge among the members of an organization. However, current ontology-viewing user interfaces of ontology tools do not fully support organizational learning, because most of them lack proper history representation in their display. In this study, a conceptual model was developed that emphasized the role of ontology in the organizational learning cycle and explored the integration of history representation in the ontology display. Based on the experimental results from a split-plot design with 30 participants, two conclusions were derived: first, appropriately selected history representations in the ontology display help users to identify changes in the ontologies; and second, compatibility between types of ontology display and history representation is more important than ontology display and history representation in themselves.
Faceted Visualization of Three Dimensional Neuroanatomy By Combining Ontology with Faceted Search
Veeraraghavan, Harini; Miller, James V.
2013-01-01
In this work, we present a faceted-search based approach for visualization of anatomy by combining a three dimensional digital atlas with an anatomy ontology. Specifically, our approach provides a drill-down search interface that exposes the relevant pieces of information (obtained by searching the ontology) for a user query. Hence, the user can produce visualizations starting with minimally specified queries. Furthermore, by automatically translating the user queries into the controlled terminology, our approach eliminates the need for the user to use controlled terminology. We demonstrate the scalability of our approach using an abdominal atlas and the same ontology. We implemented our visualization tool on the open-source 3D Slicer software. We present results of our visualization approach by combining a modified Foundational Model of Anatomy (FMA) ontology with the Surgical Planning Laboratory (SPL) Brain 3D digital atlas, and geometric models specific to patients computed using the SPL brain tumor dataset. PMID:24006207
Faceted visualization of three dimensional neuroanatomy by combining ontology with faceted search.
Veeraraghavan, Harini; Miller, James V
2014-04-01
In this work, we present a faceted-search based approach for visualization of anatomy by combining a three dimensional digital atlas with an anatomy ontology. Specifically, our approach provides a drill-down search interface that exposes the relevant pieces of information (obtained by searching the ontology) for a user query. Hence, the user can produce visualizations starting with minimally specified queries. Furthermore, by automatically translating the user queries into the controlled terminology, our approach eliminates the need for the user to use controlled terminology. We demonstrate the scalability of our approach using an abdominal atlas and the same ontology. We implemented our visualization tool on the open-source 3D Slicer software. We present results of our visualization approach by combining a modified Foundational Model of Anatomy (FMA) ontology with the Surgical Planning Laboratory (SPL) Brain 3D digital atlas, and geometric models specific to patients computed using the SPL brain tumor dataset.
OLSVis: an animated, interactive visual browser for bio-ontologies
2012-01-01
Background More than one million terms from biomedical ontologies and controlled vocabularies are available through the Ontology Lookup Service (OLS). Although OLS provides ample possibility for querying and browsing terms, the visualization of parts of the ontology graphs is rather limited and inflexible. Results We created the OLSVis web application, a visualiser for browsing all ontologies available in the OLS database. OLSVis shows customisable subgraphs of the OLS ontologies. Subgraphs are animated via a real-time force-based layout algorithm which is fully interactive: each time the user makes a change, e.g. browsing to a new term, hiding, adding, or dragging terms, the algorithm performs smooth and only essential reorganisations of the graph. This assures an optimal viewing experience, because subsequent screen layouts are not grossly altered, and users can easily navigate through the graph. URL: http://ols.wordvis.com Conclusions The OLSVis web application provides a user-friendly tool to visualise ontologies from the OLS repository. It broadens the possibilities to investigate and select ontology subgraphs through a smooth visualisation method. PMID:22646023
Lamprecht, Daniel; Strohmaier, Markus; Helic, Denis; Nyulas, Csongor; Tudorache, Tania; Noy, Natalya F.; Musen, Mark A.
2015-01-01
The need to examine the behavior of different user groups is a fundamental requirement when building information systems. In this paper, we present Ontology-based Decentralized Search (OBDS), a novel method to model the navigation behavior of users equipped with different types of background knowledge. Ontology-based Decentralized Search combines decentralized search, an established method for navigation in social networks, and ontologies to model navigation behavior in information networks. The method uses ontologies as an explicit representation of background knowledge to inform the navigation process and guide it towards navigation targets. By using different ontologies, users equipped with different types of background knowledge can be represented. We demonstrate our method using four biomedical ontologies and their associated Wikipedia articles. We compare our simulation results with baseline approaches and with results obtained from a user study. We find that our method produces click paths that have properties similar to those originating from human navigators. The results suggest that our method can be used to model human navigation behavior in systems that are based on information networks, such as Wikipedia. This paper makes the following contributions: (i) to the best of our knowledge, this is the first work to demonstrate the utility of ontologies in modeling human navigation, and (ii) it yields new insights and understanding about the mechanisms of human navigation in information networks. PMID:26568745
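As a rough illustration of the idea, the sketch below simulates a navigator who only sees the links on the current page and always clicks the one whose concept is closest to the target in an ontology. The greedy rule, the shortest-path distance, the toy graphs, and the assumption that each article maps to exactly one identically named concept are all simplifications chosen for this example, not details taken from the paper.

```python
from collections import deque

# Hypothetical link network between articles (directed).
NETWORK = {
    "Disease": ["Infection", "Cancer"],
    "Infection": ["Virus", "Bacteria"],
    "Cancer": ["Tumor"],
    "Virus": ["Influenza"],
    "Bacteria": [],
    "Tumor": [],
    "Influenza": [],
}

# Hypothetical ontology used as background knowledge (undirected adjacency).
ONTOLOGY = {
    "Disease": ["Infection", "Cancer"],
    "Infection": ["Disease", "Virus", "Bacteria"],
    "Cancer": ["Disease", "Tumor"],
    "Virus": ["Infection", "Influenza"],
    "Bacteria": ["Infection"],
    "Tumor": ["Cancer"],
    "Influenza": ["Virus"],
}


def ontology_distance(ontology, source, target):
    """Shortest-path length between two concepts (breadth-first search)."""
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in ontology.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return float("inf")


def decentralized_search(network, ontology, start, target, max_steps=20):
    """Greedy navigation using only local links plus ontology knowledge."""
    path, visited, current = [start], {start}, start
    for _ in range(max_steps):
        if current == target:
            break
        candidates = [n for n in network.get(current, ()) if n not in visited]
        if not candidates:
            break  # dead end: no unvisited outgoing links
        current = min(candidates, key=lambda n: ontology_distance(ontology, n, target))
        path.append(current)
        visited.add(current)
    return path


click_path = decentralized_search(NETWORK, ONTOLOGY, "Disease", "Influenza")
```

Swapping in a different ontology changes the distances and therefore the simulated click path, which is how different kinds of background knowledge can be represented.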
Using ontologies for structuring organizational knowledge in Home Care assistance.
Valls, Aida; Gibert, Karina; Sánchez, David; Batet, Montserrat
2010-05-01
Information Technologies and Knowledge-based Systems can significantly improve the management of complex distributed health systems, where supporting multidisciplinarity is crucial and communication and synchronization between the different professionals and tasks become essential. This work proposes the use of the ontological paradigm to describe the organizational knowledge of such complex healthcare institutions as a basis to support their management. The ontology engineering process is detailed, as well as the way to keep the ontology up to date in the face of changes. The paper also analyzes how such an ontology can be exploited in a real healthcare application and the role of the ontology in the customization of the system. The particular case of senior Home Care assistance is addressed, as this is a highly distributed field as well as a strategic goal in an ageing Europe. The proposed ontology design is based on a Home Care medical model defined by a European consortium of Home Care professionals, framed in the scope of the K4Care European project (FP6). Due to the complexity of the model and the knowledge gap between the textual medical model and the strict formalization of an ontology, an ontology engineering methodology (On-To-Knowledge) has been followed.
After applying the On-To-Knowledge steps, the following results were obtained: the feasibility study concluded that the ontological paradigm and the expressiveness of modern ontology languages were sufficient to describe the required medical knowledge; after the kick-off and refinement stages, a complete and unambiguous definition of the Home Care model, including its main components and interrelations, was obtained; the formalization stage expressed HC medical entities in the form of ontological classes, which are interrelated by means of hierarchies, properties and semantically rich class restrictions; the evaluation, carried out by exploiting the ontology in a knowledge-driven e-health application running on a real scenario, showed that the ontology design and its exploitation brought several benefits with regard to flexibility, adaptability and work efficiency from the end-user point of view; for the maintenance stage, two software tools are presented, aimed at the incorporation and modification of healthcare units and the personalization of ontological profiles. The paper shows that the ontological paradigm and the expressiveness of modern ontology languages can be exploited not only to represent terminology in an unambiguous way, but also to formalize the interrelations and organizational structures involved in a real and distributed healthcare environment. This kind of ontology facilitates adaptation to changes in the healthcare organization or Care Units, supports the creation of profile-based interaction models in a transparent and seamless way, and increases the reusability and generality of the developed software components. As a conclusion of the exploitation of the developed ontology in a real medical scenario, we can say that an ontology formalizing organizational interrelations is a key component for building effective distributed knowledge-driven e-health systems. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.
Agile development of ontologies through conversation
NASA Astrophysics Data System (ADS)
Braines, Dave; Bhattal, Amardeep; Preece, Alun D.; de Mel, Geeth
2016-05-01
Ontologies and semantic systems are necessarily complex but offer great potential in terms of their ability to fuse information from multiple sources in support of situation awareness. Current approaches do not place the ontologies directly into the hands of the end user in the field but instead hide them away behind traditional applications. We have been experimenting with human-friendly ontologies and conversational interactions to enable non-technical business users to interact with and extend these dynamically. In this paper we outline our approach via a worked example, covering: OWL ontologies, ITA Controlled English, Sensor/mission matching and conversational interactions between human and machine agents.
Building a semi-automatic ontology learning and construction system for geosciences
NASA Astrophysics Data System (ADS)
Babaie, H. A.; Sunderraman, R.; Zhu, Y.
2013-12-01
We are developing an ontology learning and construction framework that allows continuous, semi-automatic knowledge extraction, verification, validation, and maintenance by a potentially very large group of collaborating domain experts in any geosciences field. The system brings geoscientists from the sidelines to the center stage of ontology building, allowing them to collaboratively construct and enrich new ontologies, and merge, align, and integrate existing ontologies and tools. These constantly evolving ontologies can more effectively address the community's interests, purposes, tools, and change. The goal is to minimize the cost and time of building ontologies, and maximize the quality, usability, and adoption of ontologies by the community. Our system will be a domain-independent ontology learning framework that applies natural language processing, allowing users to enter their ontology in a semi-structured form, and a combined Semantic Web and Social Web approach that enables the direct participation of geoscientists who have no skill in the design and development of domain ontologies. A controlled natural language (CNL) interface and an integrated authoring and editing tool automatically convert syntactically correct CNL text into formal OWL constructs. The WebProtege-based system will allow a potentially large group of geoscientists, from multiple domains, to crowdsource and participate in the structuring of their knowledge model by sharing their knowledge through critiquing, testing, verifying, adopting, and updating of the concept models (ontologies). We will use cloud storage for all data and knowledge base components of the system, such as users, domain ontologies, discussion forums, and semantic wikis that can be accessed and queried by geoscientists in each domain. We will use NoSQL databases such as MongoDB as a service in the cloud environment.
MongoDB uses the lightweight JSON format, which makes it convenient and easy to build Web applications using just HTML5 and JavaScript, thereby avoiding the cumbersome server-side coding present in traditional approaches. The JSON format used in MongoDB is also suitable for storing and querying RDF data. We will store the domain ontologies and associated linked data in JSON/RDF formats. Our Web interface will be built upon the open source and configurable WebProtege ontology editor. We will develop a simplified mobile version of our user interface which will automatically detect the hosting device and adjust the user interface layout to accommodate different screen sizes. We will also use Semantic MediaWiki, which allows the user to store and query the data within the wiki pages. By using HTML5, JavaScript, and WebGL, we aim to create an interactive, dynamic, and multi-dimensional user interface that presents various geosciences data sets in a natural and intuitive way.
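As a rough illustration of the idea of keeping RDF data in JSON documents, the following sketch groups triples by subject into one JSON-ready record each. The document layout (`_id`, `statements`), the helper names, and the example IRIs under `http://example.org/` are assumptions made for this sketch, not the layout used by the authors. Only the Python standard library is used; no database calls are shown.

```python
import json

RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical triples; the example.org IRIs are placeholders.
TRIPLES = [
    ("http://example.org/geo#Fault", RDFS + "subClassOf", "http://example.org/geo#Structure"),
    ("http://example.org/geo#Fault", RDFS + "label", "Fault"),
    ("http://example.org/geo#Fold", RDFS + "subClassOf", "http://example.org/geo#Structure"),
]


def triples_to_documents(triples):
    """Group (subject, predicate, object) triples into one dict per subject."""
    docs = {}
    for subject, predicate, obj in triples:
        doc = docs.setdefault(subject, {"_id": subject, "statements": []})
        # Predicates are stored as values, not keys, so IRIs containing
        # dots never end up as field names.
        doc["statements"].append({"predicate": predicate, "object": obj})
    return list(docs.values())


def subjects_with(docs, predicate, obj):
    """Return the subjects that carry the given predicate/object pair."""
    wanted = {"predicate": predicate, "object": obj}
    return [d["_id"] for d in docs if wanted in d["statements"]]


documents = triples_to_documents(TRIPLES)
serialized = json.dumps(documents, indent=2)  # ready to hand to a document store
```

Keeping predicates as values rather than as field names is a design choice of this sketch; other JSON encodings of RDF are equally possible.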
A knowledge-driven approach to biomedical document conceptualization.
Zheng, Hai-Tao; Borchert, Charles; Jiang, Yong
2010-06-01
Biomedical document conceptualization is the process of clustering biomedical documents based on ontology-represented domain knowledge. The result of this process is the representation of the biomedical documents by a set of key concepts and their relationships. Most clustering methods cluster documents based on invariant domain knowledge. The objective of this work is to develop an effective method to cluster biomedical documents based on various user-specified ontologies, so that users can exploit the concept structures of documents more effectively. We develop a flexible framework to allow users to specify the knowledge bases, in the form of ontologies. Based on the user-specified ontologies, we develop a key concept induction algorithm, which uses latent semantic analysis to identify key concepts and cluster documents. A corpus-related ontology generation algorithm is developed to generate the concept structures of documents. Based on two biomedical datasets, we evaluate the proposed method and five other clustering algorithms. The clustering results of the proposed method outperform the five other algorithms in terms of key concept identification. With respect to the first biomedical dataset, our method has the F-measure values 0.7294 and 0.5294 based on the MeSH ontology and gene ontology (GO), respectively. With respect to the second biomedical dataset, our method has the F-measure values 0.6751 and 0.6746 based on the MeSH ontology and GO, respectively. Both results outperform the five other algorithms in terms of F-measure. Based on the MeSH ontology and GO, the generated corpus-related ontologies show informative conceptual structures. The proposed method enables users to specify the domain knowledge to exploit the conceptual structures of biomedical document collections. In addition, the proposed method is able to extract the key concepts and cluster the documents with a relatively high precision. Copyright 2010 Elsevier B.V. All rights reserved.
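For readers unfamiliar with the F-measure reported above, the sketch below computes precision, recall, and their harmonic mean for a set of identified key concepts against a reference set. The abstract does not state which F-measure variant was used, so the balanced F1 form and the example concept names are assumptions.

```python
def precision_recall(predicted, reference):
    """Precision and recall of a predicted concept set against a reference set."""
    predicted, reference = set(predicted), set(reference)
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical key concepts, for illustration only.
found = {"neoplasm", "apoptosis", "kinase", "membrane"}
gold = {"neoplasm", "apoptosis", "receptor"}
p, r = precision_recall(found, gold)
score = f_measure(p, r)
```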
Su, Chuan-Jun; Chiang, Chang-Yu; Chih, Meng-Chun
2014-03-07
Good physical fitness generally makes the body less prone to common diseases. A personalized exercise plan that promotes a balanced approach to fitness helps improve fitness, while inappropriate forms of exercise can have adverse consequences for health. This paper aims to develop an ontology-driven knowledge-based system for generating custom-designed exercise plans based on a user's profile and health status, incorporating international-standard Health Level Seven International (HL7) data on physical fitness and health screening. The generated plans are exposed through Representational State Transfer (REST) style web services, which can be accessed from any Internet-enabled device and deployed in cloud computing environments. To ensure the practicality of the generated exercise plans, the encapsulated knowledge used as a basis for inference in the system is acquired from domain experts. The proposed Ubiquitous Exercise Plan Generation for Personalized Physical Fitness (UFIT) will not only improve health-related fitness by generating personalized exercise plans, but also help users avoid inappropriate workouts. PMID:24608002
Research of three level match method about semantic web service based on ontology
NASA Astrophysics Data System (ADS)
Xiao, Jie; Cai, Fang
2011-10-01
An important step in Web service applications is the discovery of useful services. Traditional technologies such as UDDI and WSDL rely on keywords for service discovery, with the disadvantages of user intervention, lack of semantic description, and low accuracy. To cope with these problems, OWL-S is introduced and extended with QoS attributes to describe the attributes and functions of Web services. A three-level service matching algorithm based on ontology and QoS is proposed in this paper. Our algorithm can match Web services by utilizing the service profile and QoS parameters together with the input and output of the service. Simulation results show that it greatly enhances the speed of service matching while also guaranteeing high accuracy.
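The staged filtering described above can be sketched as follows. This is not the authors' algorithm: the order of the three levels, the use of exact string and set comparisons in place of ontology-based reasoning, and every class, field, and service name are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    category: str            # stands in for the service profile
    inputs: frozenset
    outputs: frozenset
    qos: dict = field(default_factory=dict)


@dataclass
class Request:
    category: str
    inputs: frozenset         # what the requester can supply
    outputs: frozenset        # what the requester needs back
    min_qos: dict = field(default_factory=dict)


def three_level_match(request, services):
    """Filter services by profile, then input/output, then QoS thresholds."""
    # Level 1: profile match (exact category here; a real matcher would
    # reason over an ontology of categories).
    level1 = [s for s in services if s.category == request.category]
    # Level 2: the request must supply every input and receive every output.
    level2 = [s for s in level1
              if s.inputs <= request.inputs and request.outputs <= s.outputs]
    # Level 3: every requested QoS attribute must meet its minimum.
    level3 = [s for s in level2
              if all(s.qos.get(k, 0.0) >= v for k, v in request.min_qos.items())]
    return [s.name for s in level3]


SERVICES = [
    Service("a", "weather", frozenset({"city"}), frozenset({"forecast"}), {"availability": 0.99}),
    Service("b", "weather", frozenset({"city", "key"}), frozenset({"forecast"}), {"availability": 0.999}),
    Service("c", "maps", frozenset({"city"}), frozenset({"route"}), {"availability": 0.99}),
    Service("d", "weather", frozenset({"city"}), frozenset({"forecast"}), {"availability": 0.90}),
]
REQUEST = Request("weather", frozenset({"city"}), frozenset({"forecast"}), {"availability": 0.95})
matches = three_level_match(REQUEST, SERVICES)
```

Running the cheap profile check first shrinks the candidate list before the more detailed comparisons, which is one plausible reading of why a staged design could speed up matching.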
Beger, Christoph; Uciteli, Alexandr; Herre, Heinrich
2017-01-01
The number of ontologies usable across a wide range of domains is growing steadily. BioPortal alone embraces over 500 published ontologies with nearly 8 million classes. In contrast, the vast informative content of these ontologies is only directly intelligible to experts. To overcome this deficiency, ontologies could be represented as web portals, which do not require knowledge about ontologies and their semantics, but still carry as much information as possible to the end-user. Furthermore, the conception of a complex web portal is a sophisticated process. Many entities must be analyzed and linked to existing terminologies. Ontologies are a decent solution for gathering and storing this complex data and its dependencies. Hence, automated imports of ontologies into web portals could support both mentioned scenarios. The Content Management System (CMS) Drupal 8 is one of many solutions for developing web presentations with little knowledge of programming languages, and it is suitable for representing ontological entities. We developed the Drupal Upper Ontology (DUO), which models concepts of Drupal's architecture, such as nodes, vocabularies and links. DUO can be imported into ontologies to map their entities to Drupal's concepts. Because of Drupal's lack of import capabilities, we implemented the Simple Ontology Loader in Drupal (SOLID), a Drupal 8 module, which allows Drupal administrators to import ontologies based on DUO. Our module generates content in Drupal from existing ontologies and makes it accessible to the general public. Moreover, Drupal offers a tagging system which may be amplified with multiple standardized and established terminologies by importing them with SOLID. Our Drupal module shows that ontologies can be used to model the content of a CMS and, vice versa, that a CMS is suitable for representing ontologies in a user-friendly way.
Ontological entities are presented to the user as discrete pages with all appropriate properties, links and tags.
Region Evolution eXplorer - A tool for discovering evolution trends in ontology regions.
Christen, Victor; Hartung, Michael; Groß, Anika
2015-01-01
A large number of life science ontologies have been developed to support different application scenarios such as gene annotation or functional analysis. The continuous accumulation of new insights and knowledge affects specific portions of ontologies and thus leads to their adaptation. Therefore, it is valuable to study which ontology parts have been extensively modified or remained unchanged. Users can monitor the evolution of an ontology to improve its further development or apply the knowledge in their applications. Here we present REX (Region Evolution eXplorer), a web-based system for exploring the evolution of ontology parts (regions). REX provides an analysis platform for currently about 1,000 versions of 16 well-known life science ontologies. Interactive workflows allow an explorative analysis of changing ontology regions and can be used to study evolution trends over long-term periods. REX is a web application providing an interactive and user-friendly interface to identify (un)stable regions in large life science ontologies. It is available at http://www.izbi.de/rex.
Ong, Edison; Xiang, Zuoshuang; Zhao, Bin; Liu, Yue; Lin, Yu; Zheng, Jie; Mungall, Chris; Courtot, Mélanie; Ruttenberg, Alan; He, Yongqun
2017-01-01
Linked Data (LD) aims to achieve interconnected data by representing entities using Uniform Resource Identifiers (URIs), and sharing information using the Resource Description Framework (RDF) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.ontobee.org/) is a linked ontology data server that stores ontology information using RDF triple store technology and supports query, visualization and linkage of ontology terms. Ontobee is also the default linked data server for publishing and browsing biomedical ontologies in the Open Biological Ontology (OBO) Foundry (http://obofoundry.org) library. Ontobee currently hosts more than 180 ontologies (including 131 OBO Foundry Library ontologies) with over four million terms. Ontobee provides a user-friendly web interface for querying and visualizing the details and hierarchy of a specific ontology term. Using the eXtensible Stylesheet Language Transformation (XSLT) technology, Ontobee is able to dereference a single ontology term URI, and then output RDF/eXtensible Markup Language (XML) for computer processing or display the HTML information on a web browser for human users. Statistics and detailed information are generated and displayed for each ontology listed in Ontobee. In addition, a SPARQL web interface is provided for custom advanced SPARQL queries of one or multiple ontologies. PMID:27733503
Where to Publish and Find Ontologies? A Survey of Ontology Libraries
d'Aquin, Mathieu; Noy, Natalya F.
2011-01-01
One of the key promises of the Semantic Web is its potential to enable and facilitate data interoperability. The ability of data providers and application developers to share and reuse ontologies is a critical component of this data interoperability: if different applications and data sources use the same set of well-defined terms for describing their domain and data, it will be much easier for them to “talk” to one another. Ontology libraries are the systems that collect ontologies from different sources and facilitate the tasks of finding, exploring, and using these ontologies. Thus ontology libraries can serve as a link in enabling diverse users and applications to discover, evaluate, use, and publish ontologies. In this paper, we provide a survey of the growing, and surprisingly diverse, landscape of ontology libraries. We highlight how the varying scope and intended use of the libraries affects their features, content, and potential exploitation in applications. From reviewing eleven ontology libraries, we identify a core set of questions that ontology practitioners and users should consider in choosing an ontology library for finding ontologies or publishing their own. We also discuss the research challenges that emerge from this survey, for the developers of ontology libraries to address. PMID:22408576
A Semantic Approach for Knowledge Discovery to Help Mitigate Habitat Loss in the Gulf of Mexico
NASA Astrophysics Data System (ADS)
Ramachandran, R.; Maskey, M.; Graves, S.; Hardin, D.
2008-12-01
Noesis is a meta-search engine and a resource aggregator that uses domain ontologies to provide scoped search capabilities. Ontologies enable Noesis to help users refine their searches for information on the open web and in hidden web locations such as data catalogues with standardized, but discipline-specific vocabularies. Through its ontologies Noesis provides a guided refinement of search queries which produces complete and accurate searches while reducing the user's burden to experiment with different search strings. All search results are organized by categories (e.g. all results from Google are grouped together) which may be selected or omitted according to the desire of the user. During the past two years ontologies were developed for sea grasses in the Gulf of Mexico and were used to support a habitat restoration demonstration project. Currently these ontologies are being augmented to address the special characteristics of mangroves. These new ontologies will extend the demonstration project to broader regions of the Gulf including protected mangrove locations in coastal Mexico. Noesis contributes to the decision making process by producing a comprehensive list of relevant resources based on the semantic information contained in the ontologies. Ontologies are organized in tree-like taxonomies, where the child nodes represent the Specializations and the parent nodes represent the Generalizations of a node or concept. Specializations can be used to provide a more detailed search, while generalizations are used to make the search broader. Ontologies are also used to link two syntactically different terms to one semantic concept (synonyms). Appending a synonym to the query expands the search, thus providing better search coverage. Every concept has a set of properties that are neither in the same inheritance hierarchy (Specializations / Generalizations) nor equivalent (synonyms).
These are called Related Concepts and they are captured in the ontology through property relationships. By using Related Concepts users can search for resources with respect to a particular property. Noesis automatically generates searches that include all of these capabilities, removing the burden from the user and producing broader and more accurate search results. This presentation will demonstrate the features of Noesis and describe its application to habitat studies in the Gulf of Mexico.
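The use of Specializations, Generalizations, and synonyms for query refinement can be illustrated with a small sketch. The toy ontology, the quoting style, and the `OR` syntax are assumptions for this example and do not reflect how Noesis itself builds its queries.

```python
# Toy ontology; entries are illustrative only.
ONTOLOGY = {
    "seagrass": {
        "parent": "aquatic vegetation",
        "children": ["turtle grass", "shoal grass"],
        "synonyms": ["sea grass"],
    },
    "turtle grass": {"parent": "seagrass", "children": [], "synonyms": []},
    "shoal grass": {"parent": "seagrass", "children": [], "synonyms": []},
    "aquatic vegetation": {"parent": None, "children": ["seagrass"], "synonyms": ["aquatic plants"]},
}


def with_synonyms(term):
    """OR the term with its synonyms to widen coverage of one concept."""
    terms = [term] + ONTOLOGY.get(term, {}).get("synonyms", [])
    return " OR ".join(f'"{t}"' for t in terms)


def narrower_queries(term):
    """One query per Specialization (child node) for a more detailed search."""
    return [with_synonyms(child) for child in ONTOLOGY.get(term, {}).get("children", [])]


def broader_query(term):
    """Query on the Generalization (parent node), or None at the root."""
    parent = ONTOLOGY.get(term, {}).get("parent")
    return with_synonyms(parent) if parent else None
```

For example, `with_synonyms("seagrass")` yields `"seagrass" OR "sea grass"`, while `broader_query("seagrass")` moves up one level to the parent concept and its synonyms.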
Usadel, Björn; Nagel, Axel; Steinhauser, Dirk; Gibon, Yves; Bläsing, Oliver E; Redestig, Henning; Sreenivasulu, Nese; Krall, Leonard; Hannah, Matthew A; Poree, Fabien; Fernie, Alisdair R; Stitt, Mark
2006-12-18
Microarray technology has become a widely accepted and standardized tool in biology. The first microarray data analysis programs were developed to support pair-wise comparison. However, as microarray experiments have become more routine, large-scale experiments that investigate multiple time points or sets of mutants or transgenics have become more common. To extract biological information from such high-throughput expression data, it is necessary to develop efficient analytical platforms, which combine manually curated gene ontologies with efficient visualization and navigation tools. Currently, most tools focus on a few limited biological aspects, rather than offering a holistic, integrated analysis. Here we introduce PageMan, a multiplatform, user-friendly, and stand-alone software tool that annotates, investigates, and condenses high-throughput microarray data in the context of functional ontologies. It includes a GUI tool to transform different ontologies into a suitable format, enabling the user to compare and choose between different ontologies. It is equipped with several statistical modules for data analysis, including over-representation analysis and Wilcoxon statistical testing. Results are exported in a graphical format for direct use, or for further editing in graphics programs. PageMan provides a fast overview of single treatments and allows genome-level responses to be compared across several microarray experiments covering, for example, stress responses at multiple time points. This aids in searching for trait-specific changes in pathways using mutants or transgenics, analyzing development time-courses, and comparing between species.
In a case study, we analyze the results of publicly available microarrays of multiple cold stress experiments using PageMan, and compare the results to a previously published meta-analysis. PageMan offers a complete user's guide, a web-based over-representation analysis as well as a tutorial, and is freely available at http://mapman.mpimp-golm.mpg.de/pageman/. PageMan allows multiple microarray experiments to be efficiently condensed into a single page graphical display. The flexible interface allows data to be quickly and easily visualized, facilitating comparisons within experiments and to published experiments, thus enabling researchers to gain a rapid overview of the biological responses in the experiments.
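Over-representation analysis asks whether an ontology category contains more responsive genes than chance would predict. The abstract does not say which test PageMan applies, so the one-sided hypergeometric test below is only one common choice, shown as an assumption. The function name and the example counts are hypothetical.

```python
from math import comb  # available in Python 3.8 and later


def overrepresentation_pvalue(total_genes, in_category, selected, selected_in_category):
    """P(X >= selected_in_category) for X ~ Hypergeometric(total, in_category, selected).

    total_genes          -- genes measured on the array
    in_category          -- genes annotated to the ontology category
    selected             -- genes called responsive
    selected_in_category -- responsive genes that fall in the category
    """
    upper = min(in_category, selected)
    denominator = comb(total_genes, selected)
    tail = sum(
        comb(in_category, i) * comb(total_genes - in_category, selected - i)
        for i in range(selected_in_category, upper + 1)
    )
    return tail / denominator


# Hypothetical counts: 40 of 200 responsive genes fall in a 300-gene category
# on an array of 10,000 genes.
p_value = overrepresentation_pvalue(10_000, 300, 200, 40)
```

In practice one p-value is computed per category, so some correction for multiple testing would normally follow; that step is outside this sketch.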
OntoFox: web-based support for ontology reuse
2010-01-01
Background: Ontology development is a rapidly growing area of research, especially in the life sciences domain. To promote collaboration and interoperability between different projects, the OBO Foundry principles require that these ontologies be open and non-redundant, avoiding duplication of terms through the re-use of existing resources. As current options to do so present various difficulties, a new approach, MIREOT, allows specifying import of single terms. Initial implementations allow for controlled import of selected annotations and certain classes of related terms. Findings: OntoFox (http://ontofox.hegroup.org/) is a web-based system that allows users to input terms, fetch selected properties, annotations, and certain classes of related terms from the source ontologies and save the results using the RDF/XML serialization of the Web Ontology Language (OWL). Compared to an initial implementation of MIREOT, OntoFox allows additional and more easily configurable options for selecting and rewriting annotation properties, and for inclusion of all or a computed subset of terms between low and top level terms. Additional methods for including related classes include a SPARQL-based ontology term retrieval algorithm that extracts terms related to a given set of signature terms and an option to extract the hierarchy rooted at a specified ontology term. OntoFox's output can be directly imported into a developer's ontology. OntoFox currently supports term retrieval from a selection of 15 ontologies accessible via SPARQL endpoints and allows users to extend this by specifying additional endpoints. An OntoFox application in the development of the Vaccine Ontology (VO) is demonstrated. Conclusions: OntoFox provides a timely publicly available service, providing different options for users to collect terms from external ontologies, making them available for reuse by import into client OWL ontologies. PMID:20569493
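The option of including the terms that lie between a low-level term and a top-level term can be pictured with a small walk up a class hierarchy. This sketch is not OntoFox's implementation: it assumes a single-parent hierarchy held in a dictionary, and the term labels are invented for illustration.

```python
# Hypothetical child -> parent map (single inheritance assumed).
PARENT = {
    "term D": "term C",
    "term C": "term B",
    "term B": "term A",
    "term A": "root",
}


def terms_between(low_term, top_term, parent=PARENT):
    """Return the chain of terms from low_term up to and including top_term."""
    path = [low_term]
    current = low_term
    while current != top_term:
        if current not in parent:
            raise ValueError(f"{top_term!r} is not an ancestor of {low_term!r}")
        current = parent[current]
        path.append(current)
    return path


chain = terms_between("term D", "term B")
```

Importing the whole chain keeps the selected term attached to its original context, while stopping at the chosen top-level term avoids pulling in the rest of the source ontology.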
What Four Million Mappings Can Tell You about Two Hundred Ontologies
NASA Astrophysics Data System (ADS)
Ghazvinian, Amir; Noy, Natalya F.; Jonquet, Clement; Shah, Nigam; Musen, Mark A.
The field of biomedicine has embraced the Semantic Web probably more than any other field. As a result, there is a large number of biomedical ontologies covering overlapping areas of the field. We have developed BioPortal—an open community-based repository of biomedical ontologies. We analyzed ontologies and terminologies in BioPortal and the Unified Medical Language System (UMLS), creating more than 4 million mappings between concepts in these ontologies and terminologies based on the lexical similarity of concept names and synonyms. We then analyzed the mappings and what they tell us about the ontologies themselves, the structure of the ontology repository, and the ways in which the mappings can help in the process of ontology design and evaluation. For example, we can use the mappings to guide users who are new to a field to the most pertinent ontologies in that field, to identify areas of the domain that are not covered sufficiently by the ontologies in the repository, and to identify which ontologies will serve well as background knowledge in domain-specific tools. While we used a specific (but large) ontology repository for the study, we believe that the lessons we learned about the value of a large-scale set of mappings to ontology users and developers are general and apply in many other domains.
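Mapping concepts by the lexical similarity of their names and synonyms can be sketched as a normalize-and-compare step. The normalization rule used here (lowercasing and collapsing punctuation) and the two toy ontologies are assumptions; the abstract does not specify the exact matching procedure.

```python
import re
from collections import defaultdict


def normalize(label):
    """Lowercase a label and collapse runs of non-alphanumerics to one space."""
    return re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()


def lexical_mappings(ontology_a, ontology_b):
    """Pair concepts whose names or synonyms are equal after normalization.

    Each ontology is a dict: concept id -> list of names and synonyms.
    """
    index = defaultdict(set)
    for concept_id, labels in ontology_b.items():
        for label in labels:
            index[normalize(label)].add(concept_id)

    mappings = set()
    for concept_id, labels in ontology_a.items():
        for label in labels:
            for other_id in index.get(normalize(label), ()):
                mappings.add((concept_id, other_id))
    return sorted(mappings)


# Hypothetical concept identifiers and labels.
ONTO_A = {"A:1": ["Myocardial Infarction", "heart attack"], "A:2": ["Stroke"]}
ONTO_B = {"B:9": ["Heart-Attack"], "B:10": ["stroke", "cerebrovascular accident"]}
pairs = lexical_mappings(ONTO_A, ONTO_B)
```

Indexing one ontology by normalized label keeps the comparison roughly linear in the number of labels, which matters when millions of mappings are produced.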
Ontological Approach to Military Knowledge Modeling and Management
2004-03-01
federated search mechanism has to reformulate user queries (expressed using the ontology) in the query languages of the different sources (e.g. SQL...ontologies as a common terminology – Unified query to perform federated search • Query processing – Ontology mapping to sources reformulate queries
ERIC Educational Resources Information Center
Bachore, Zelalem
2012-01-01
Ontology not only is considered to be the backbone of the semantic web but also plays a significant role in distributed and heterogeneous information systems. However, ontology still faces limited application and adoption to date. One of the major problems is that prevailing engineering-oriented methodologies for building ontologies do not…
A sensor and video based ontology for activity recognition in smart environments.
Mitchell, D; Morrow, Philip J; Nugent, Chris D
2014-01-01
Activity recognition is used in a wide range of applications including healthcare and security. In a smart environment activity recognition can be used to monitor and support the activities of a user. There have been a range of methods used in activity recognition including sensor-based approaches, vision-based approaches and ontological approaches. This paper presents a novel approach to activity recognition in a smart home environment which combines sensor and video data through an ontological framework. The ontology describes the relationships and interactions between activities, the user, objects, sensors and video data.
The Relationship between User Expertise and Structural Ontology Characteristics
ERIC Educational Resources Information Center
Waldstein, Ilya Michael
2014-01-01
Ontologies are commonly used to support application tasks such as natural language processing, knowledge management, learning, browsing, and search. Literature recommends considering specific context during ontology design, and highlights that a different context is responsible for problems in ontology reuse. However, there is still no clear…
Ong, Edison; Xiang, Zuoshuang; Zhao, Bin; Liu, Yue; Lin, Yu; Zheng, Jie; Mungall, Chris; Courtot, Mélanie; Ruttenberg, Alan; He, Yongqun
2017-01-04
Linked Data (LD) aims to achieve interconnected data by representing entities using Uniform Resource Identifiers (URIs), and sharing information using the Resource Description Framework (RDF) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.ontobee.org/) is a linked ontology data server that stores ontology information using RDF triple store technology and supports query, visualization and linkage of ontology terms. Ontobee is also the default linked data server for publishing and browsing biomedical ontologies in the Open Biological Ontology (OBO) Foundry (http://obofoundry.org) library. Ontobee currently hosts more than 180 ontologies (including 131 OBO Foundry Library ontologies) with over four million terms. Ontobee provides a user-friendly web interface for querying and visualizing the details and hierarchy of a specific ontology term. Using the eXtensible Stylesheet Language Transformation (XSLT) technology, Ontobee is able to dereference a single ontology term URI, and then output RDF/eXtensible Markup Language (XML) for computer processing or display the HTML information on a web browser for human users. Statistics and detailed information are generated and displayed for each ontology listed in Ontobee. In addition, a SPARQL web interface is provided for custom advanced SPARQL queries of one or multiple ontologies. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
A Uniform Ontology for Software Interfaces
NASA Technical Reports Server (NTRS)
Feyock, Stefan
2002-01-01
It is universally the case that computer users who are not also computer specialists prefer to deal with computers in terms of a familiar ontology, namely that of their application domains. For example, the well-known Windows ontology assumes that the user is an office worker, and therefore should be presented with a "desktop environment" featuring entities such as (virtual) file folders, documents, appointment calendars, and the like, rather than a world of machine registers and machine language instructions, or even the DOS command level. The central theme of this research has been the proposition that the user interacting with a software system should have at his disposal both the ontology underlying the system and a model of the system. This information is necessary for the understanding of the system in use, as well as for the automatic generation of assistance for the user, both in solving the problem for which the application is designed, and for providing guidance in the capabilities and use of the system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sorokine, Alexandre
2011-10-01
The Simple Ontology Format (SOFT) library and file format specification provides a set of simple tools for developing and maintaining ontologies. The library, implemented as a Perl module, supports parsing and verification of files in SOFT format, operations on ontologies (adding, removing, or filtering entities), and conversion of ontologies into other formats. SOFT allows users to quickly create ontologies using only a basic text editor, verify them, and portray them in a graph layout system using customized styles.
Modularizing Spatial Ontologies for Assisted Living Systems
NASA Astrophysics Data System (ADS)
Hois, Joana
Assisted living systems are intended to support daily-life activities in user homes by automating and monitoring the behavior of the environment while interacting with the user in a non-intrusive way. The knowledge base of such systems therefore has to define thematically different aspects of the environment mostly related to space, such as basic spatial floor plan information, pieces of technical equipment in the environment and their functions and spatial ranges, activities users can perform, entities that occur in the environment, etc. In this paper, we present thematically different ontologies, each of which describes environmental aspects from a particular perspective. The resulting modular structure allows the selection of application-specific ontologies as necessary. This hides information and reduces complexity in terms of the represented spatial knowledge and reasoning practicability. We motivate and present the different spatial ontologies applied to an ambient assisted living application.
OntologyWidget - a reusable, embeddable widget for easily locating ontology terms.
Beauheim, Catherine C; Wymore, Farrell; Nitzberg, Michael; Zachariah, Zachariah K; Jin, Heng; Skene, J H Pate; Ball, Catherine A; Sherlock, Gavin
2007-09-13
Biomedical ontologies are being widely used to annotate biological data in a computer-accessible, consistent and well-defined manner. However, due to their size and complexity, annotating data with appropriate terms from an ontology is often challenging for experts and non-experts alike, because there exist few tools that allow one to quickly find relevant ontology terms to easily populate a web form. We have produced a tool, OntologyWidget, which allows users to rapidly search for and browse ontology terms. OntologyWidget can easily be embedded in other web-based applications. OntologyWidget is written using AJAX (Asynchronous JavaScript and XML) and has two related elements. The first is a dynamic auto-complete ontology search feature. As a user enters characters into the search box, the appropriate ontology is queried remotely for terms that match the typed-in text, and the query results populate a drop-down list with all potential matches. Upon selection of a term from the list, the user can locate this term within a generic and dynamic ontology browser, which comprises the second element of the tool. The ontology browser shows the paths from a selected term to the root as well as parent/child tree hierarchies. We have implemented web services at the Stanford Microarray Database (SMD), which provide the OntologyWidget with access to over 40 ontologies from the Open Biological Ontology (OBO) website 1. Each ontology is updated weekly. Adopters of the OntologyWidget can either use SMD's web services, or elect to rely on their own. Deploying the OntologyWidget can be accomplished in three simple steps: (1) install Apache Tomcat 2 on one's web server, (2) download and install the OntologyWidget servlet stub that provides access to the SMD ontology web services, and (3) create an html (HyperText Markup Language) file that refers to the OntologyWidget using a simple, well-defined format. 
We have developed OntologyWidget, an easy-to-use ontology search and display tool that can be used on any web page by creating a simple html description. OntologyWidget provides a rapid auto-complete search function paired with an interactive tree display. We have developed a web service layer that communicates between the web page interface and a database of ontology terms. We currently store 40 of the ontologies from the OBO website [1], as well as several others. These ontologies are automatically updated on a weekly basis. OntologyWidget can be used in any web-based application to take advantage of the ontologies we provide via web services or any other ontology that is provided elsewhere in the correct format. The full source code for the JavaScript and description of the OntologyWidget is available from http://smd.stanford.edu/ontologyWidget/.
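The two elements described above (auto-complete search, then locating the chosen term in a tree) can be sketched in a few lines. This is only an illustration, not OntologyWidget's code: the toy `PARENTS` ontology, the function names, and the choice of case-insensitive prefix matching are all assumptions made for the example, and the real tool queries a remote web service rather than an in-memory dictionary.

```python
# Hypothetical toy ontology: each term maps to its parent (None marks the root).
PARENTS = {
    "anatomical entity": None,
    "organ": "anatomical entity",
    "heart": "organ",
    "heart valve": "heart",
    "liver": "organ",
}


def autocomplete(prefix, terms, limit=10):
    """Return up to `limit` terms starting with `prefix` (case-insensitive)."""
    needle = prefix.strip().lower()
    if not needle:
        return []
    matches = sorted(t for t in terms if t.lower().startswith(needle))
    return matches[:limit]


def path_to_root(term, parents):
    """Follow parent links from `term` to the root, returning every term visited."""
    path = []
    current = term
    while current is not None:
        path.append(current)
        current = parents[current]
    return path


def children(term, parents):
    """Return the direct children of `term`, sorted for stable display."""
    return sorted(child for child, parent in parents.items() if parent == term)


if __name__ == "__main__":
    print(autocomplete("hea", PARENTS))
    print(path_to_root("heart valve", PARENTS))
    print(children("organ", PARENTS))
```

In a browser-based widget the `autocomplete` call would typically run on the server for each keystroke, with the drop-down populated from its result.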
OntologyWidget – a reusable, embeddable widget for easily locating ontology terms
Beauheim, Catherine C; Wymore, Farrell; Nitzberg, Michael; Zachariah, Zachariah K; Jin, Heng; Skene, JH Pate; Ball, Catherine A; Sherlock, Gavin
2007-01-01
Background Biomedical ontologies are being widely used to annotate biological data in a computer-accessible, consistent and well-defined manner. However, due to their size and complexity, annotating data with appropriate terms from an ontology is often challenging for experts and non-experts alike, because there exist few tools that allow one to quickly find relevant ontology terms to easily populate a web form. Results We have produced a tool, OntologyWidget, which allows users to rapidly search for and browse ontology terms. OntologyWidget can easily be embedded in other web-based applications. OntologyWidget is written using AJAX (Asynchronous JavaScript and XML) and has two related elements. The first is a dynamic auto-complete ontology search feature. As a user enters characters into the search box, the appropriate ontology is queried remotely for terms that match the typed-in text, and the query results populate a drop-down list with all potential matches. Upon selection of a term from the list, the user can locate this term within a generic and dynamic ontology browser, which comprises the second element of the tool. The ontology browser shows the paths from a selected term to the root as well as parent/child tree hierarchies. We have implemented web services at the Stanford Microarray Database (SMD), which provide the OntologyWidget with access to over 40 ontologies from the Open Biological Ontology (OBO) website [1]. Each ontology is updated weekly. Adopters of the OntologyWidget can either use SMD's web services, or elect to rely on their own. Deploying the OntologyWidget can be accomplished in three simple steps: (1) install Apache Tomcat [2] on one's web server, (2) download and install the OntologyWidget servlet stub that provides access to the SMD ontology web services, and (3) create an html (HyperText Markup Language) file that refers to the OntologyWidget using a simple, well-defined format. 
Conclusion We have developed OntologyWidget, an easy-to-use ontology search and display tool that can be used on any web page by creating a simple html description. OntologyWidget provides a rapid auto-complete search function paired with an interactive tree display. We have developed a web service layer that communicates between the web page interface and a database of ontology terms. We currently store 40 of the ontologies from the OBO website [1], as well as several others. These ontologies are automatically updated on a weekly basis. OntologyWidget can be used in any web-based application to take advantage of the ontologies we provide via web services or any other ontology that is provided elsewhere in the correct format. The full source code for the JavaScript and description of the OntologyWidget is available from http://smd.stanford.edu/ontologyWidget/. PMID:17854506
Ontologies for Effective Use of Context in E-Learning Settings
ERIC Educational Resources Information Center
Jovanovic, Jelena; Gasevic, Dragan; Knight, Colin; Richards, Griff
2007-01-01
This paper presents an ontology-based framework aimed at explicit representation of context-specific metadata derived from the actual usage of learning objects and learning designs. The core part of the proposed framework is a learning object context ontology, that leverages a range of other kinds of learning ontologies (e.g., user modeling…
Malone, James; Brown, Andy; Lister, Allyson L; Ison, Jon; Hull, Duncan; Parkinson, Helen; Stevens, Robert
2014-01-01
Biomedical ontologists to date have concentrated on ontological descriptions of biomedical entities such as gene products and their attributes, phenotypes and so on. Recently, effort has diversified to descriptions of the laboratory investigations by which these entities were produced. However, much biological insight is gained from the analysis of the data produced from these investigations, and there is a lack of adequate descriptions of the wide range of software that is central to bioinformatics. We need to describe how data are analyzed for discovery, audit trails, provenance and reproducibility. The Software Ontology (SWO) is a description of software used to store, manage and analyze data. Input to the SWO has come from beyond the life sciences, but its main focus is the life sciences. We used agile techniques to gather input for the SWO and keep engagement with our users. The result is an ontology that meets the needs of a broad range of users by describing software, its information processing tasks, data inputs and outputs, data formats, versions and so on. Recently, the SWO has incorporated EDAM, a vocabulary for describing data and related concepts in bioinformatics. The SWO is currently being used to describe software used in multiple biomedical applications. The SWO is another element of the biomedical ontology landscape that is necessary for the description of biomedical entities and how they were discovered. An ontology of software used to analyze data produced by investigations in the life sciences can be made in such a way that it covers the important features requested and prioritized by its users. The SWO thus fits into the landscape of biomedical ontologies and is produced using techniques designed to keep it in line with users' needs. The Software Ontology is available under an Apache 2.0 license at http://theswo.sourceforge.net/; the Software Ontology blog can be read at http://softwareontology.wordpress.com.
Definition of an Ontology Matching Algorithm for Context Integration in Smart Cities
Otero-Cerdeira, Lorena; Rodríguez-Martínez, Francisco J.; Gómez-Rodríguez, Alma
2014-01-01
In this paper we describe a novel proposal in the field of smart cities: using an ontology matching algorithm to guarantee the automatic information exchange between the agents and the smart city. A smart city is composed of different types of agents that behave as producers and/or consumers of the information in the smart city. In our proposal, the data from the context is obtained by sensor and device agents while users interact with the smart city by means of user or system agents. The knowledge of each agent, as well as the smart city's knowledge, is semantically represented using different ontologies. To have an open city that is fully accessible to any agent, and therefore to provide enhanced services to the users, there is the need to ensure a seamless communication between agents and the city, regardless of their inner knowledge representations, i.e., ontologies. To meet this goal we use ontology matching techniques; specifically, we have defined a new ontology matching algorithm called OntoPhil to be deployed within a smart city, which has never been done before. OntoPhil was tested on the benchmarks provided by the well-known evaluation initiative, Ontology Alignment Evaluation Initiative, and also compared to other matching algorithms, although these algorithms were not specifically designed for smart cities. Additionally, specific tests involving a smart city's ontology and different types of agents were conducted to validate the usefulness of OntoPhil in the smart city environment. PMID:25494353
Definition of an Ontology Matching Algorithm for Context Integration in Smart Cities.
Otero-Cerdeira, Lorena; Rodríguez-Martínez, Francisco J; Gómez-Rodríguez, Alma
2014-12-08
In this paper we describe a novel proposal in the field of smart cities: using an ontology matching algorithm to guarantee the automatic information exchange between the agents and the smart city. A smart city is composed of different types of agents that behave as producers and/or consumers of the information in the smart city. In our proposal, the data from the context is obtained by sensor and device agents while users interact with the smart city by means of user or system agents. The knowledge of each agent, as well as the smart city's knowledge, is semantically represented using different ontologies. To have an open city that is fully accessible to any agent, and therefore to provide enhanced services to the users, there is the need to ensure a seamless communication between agents and the city, regardless of their inner knowledge representations, i.e., ontologies. To meet this goal we use ontology matching techniques; specifically, we have defined a new ontology matching algorithm called OntoPhil to be deployed within a smart city, which has never been done before. OntoPhil was tested on the benchmarks provided by the well-known evaluation initiative, Ontology Alignment Evaluation Initiative, and also compared to other matching algorithms, although these algorithms were not specifically designed for smart cities. Additionally, specific tests involving a smart city's ontology and different types of agents were conducted to validate the usefulness of OntoPhil in the smart city environment.
ExAtlas: An interactive online tool for meta-analysis of gene expression data.
Sharov, Alexei A; Schlessinger, David; Ko, Minoru S H
2015-12-01
We have developed ExAtlas, an on-line software tool for meta-analysis and visualization of gene expression data. In contrast to existing software tools, ExAtlas compares multi-component data sets and generates results for all combinations (e.g. all gene expression profiles versus all Gene Ontology annotations). ExAtlas handles both users' own data and data extracted semi-automatically from the public repository (GEO/NCBI database). ExAtlas provides a variety of tools for meta-analyses: (1) standard meta-analysis (fixed effects, random effects, z-score, and Fisher's methods); (2) analyses of global correlations between gene expression data sets; (3) gene set enrichment; (4) gene set overlap; (5) gene association by expression profile; (6) gene specificity; and (7) statistical analysis (ANOVA, pairwise comparison, and PCA). ExAtlas produces graphical outputs, including heatmaps, scatter-plots, bar-charts, and three-dimensional images. Some of the most widely used public data sets (e.g. GNF/BioGPS, Gene Ontology, KEGG, GAD phenotypes, BrainScan, ENCODE ChIP-seq, and protein-protein interaction) are pre-loaded and can be used for functional annotations.
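Of the standard meta-analysis options listed above, Fisher's method is the easiest to show compactly. The sketch below is a generic textbook version and makes no claim about how ExAtlas implements it; the function name is an assumption. It combines k independent p-values through the statistic X = -2 Σ ln p_i, which under the null hypothesis follows a chi-square distribution with 2k degrees of freedom. Because the degrees of freedom are even, the upper tail has a closed form and no statistics library is needed.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i)
    and combined_p is the chi-square upper tail with 2k degrees of freedom.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in the interval (0, 1]")

    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0

    # Closed form for even degrees of freedom (2k):
    #   P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


if __name__ == "__main__":
    stat, p = fisher_combined_p([0.04, 0.10, 0.30])
    print(f"statistic={stat:.3f}, combined p={p:.4f}")
```

With a single input the combined p-value equals that input, which is a convenient sanity check.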
The Semantic Retrieval of Spatial Data Service Based on Ontology in SIG
NASA Astrophysics Data System (ADS)
Sun, S.; Liu, D.; Li, G.; Yu, W.
2011-08-01
Research on the SIG (Spatial Information Grid) mainly addresses how to connect different computing resources so that users can use all the resources in the Grid transparently and seamlessly. In SIG, spatial data services are described by several kinds of specifications, which use different meta-information for each kind of service. This kind of standardization cannot resolve the problem of semantic heterogeneity, which may prevent users from obtaining the required resources. This paper tries to resolve two kinds of semantic heterogeneity (name heterogeneity and structure heterogeneity) in ontology-based spatial data service retrieval. In addition, based on the hierarchical subsumption relationships among concepts in the ontology, the query words can be extended so that more resources can be matched and found for the user. These applications of ontology in spatial data resource retrieval can help to improve the capability of keyword matching and find more related resources.
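The query-extension idea in this abstract can be illustrated with a small sketch. Everything here is hypothetical: the concept names, the `SUBCLASSES` and `SYNONYMS` tables, and the service records are invented for the example and do not come from the paper. A synonym table stands in for resolving name heterogeneity, and walking the subsumption hierarchy downward extends the query to more specific concepts.

```python
# Hypothetical subsumption hierarchy: concept -> direct sub-concepts.
SUBCLASSES = {
    "water body": ["river", "lake"],
    "river": ["stream"],
    "lake": [],
    "stream": [],
}

# Hypothetical synonym table mapping alternative names to canonical concepts.
SYNONYMS = {"waterbody": "water body", "creek": "stream"}


def expand_query(term, subclasses, synonyms):
    """Map `term` to its canonical concept, then add every subsumed concept."""
    canonical = synonyms.get(term.lower(), term.lower())
    seen = set()
    stack = [canonical]
    while stack:
        concept = stack.pop()
        if concept in seen:
            continue
        seen.add(concept)
        stack.extend(subclasses.get(concept, []))
    return sorted(seen)


def match_services(term, services, subclasses, synonyms):
    """Return names of services whose keywords overlap the expanded query."""
    wanted = set(expand_query(term, subclasses, synonyms))
    return sorted(name for name, keywords in services.items() if wanted & set(keywords))


if __name__ == "__main__":
    services = {"svc-rivers": ["river"], "svc-roads": ["road"], "svc-streams": ["stream"]}
    print(expand_query("waterbody", SUBCLASSES, SYNONYMS))
    print(match_services("waterbody", services, SUBCLASSES, SYNONYMS))
```

Plain keyword matching on "waterbody" would find neither service in this toy data, while the expanded query finds both water-related ones.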
Ku, Hao-Hsiang
2015-01-01
Nowadays, people can easily use a smartphone to get wanted information and requested services. Hence, this study designs and proposes a Golf Swing Injury Detection and Evaluation open service platform with an Ontology-oriented clustering case-based reasoning mechanism, called GoSIDE, based on Arduino and the Open Service Gateway Initiative (OSGi). GoSIDE is a three-tier architecture composed of Mobile Users, Application Servers and a Cloud-based Digital Convergence Server. A mobile user has a smartphone and Kinect sensors to detect the user's Golf swing actions and to interact with iDTV. An application server hosts the Intelligent Golf Swing Posture Analysis Model (iGoSPAM) to check a user's Golf swing actions and to alert the user when he performs erroneous actions. The Cloud-based Digital Convergence Server hosts Ontology-oriented Clustering Case-based Reasoning (CBR) for Quality of Experiences (OCC4QoE), which is designed to provide QoE services through QoE-based Ontology strategies, rules and events for this user. Furthermore, GoSIDE will automatically trigger OCC4QoE and deliver popular rules for a new user. Experiment results illustrate that GoSIDE can provide appropriate detections for Golfers. Finally, GoSIDE can be a reference model for researchers and engineers.
Analysis and visualization of disease courses in a semantically-enabled cancer registry.
Esteban-Gil, Angel; Fernández-Breis, Jesualdo Tomás; Boeker, Martin
2017-09-29
Regional and epidemiological cancer registries are important for cancer research and the quality management of cancer treatment. Many technological solutions are available to collect and analyse data for cancer registries nowadays. However, the lack of a well-defined common semantic model is a problem when user-defined analyses and data linking to external resources are required. The objectives of this study are: (1) design of a semantic model for local cancer registries; (2) development of a semantically-enabled cancer registry based on this model; and (3) semantic exploitation of the cancer registry for analysing and visualising disease courses. Our proposal is based on our previous results and experience working with semantic technologies. Data stored in a cancer registry database were transformed into RDF employing a process driven by OWL ontologies. The semantic representation of the data was then processed to extract semantic patient profiles, which were exploited by means of SPARQL queries to identify groups of similar patients and to analyse the disease timelines of patients. Based on the requirements analysis, we have produced a draft of an ontology that models the semantics of a local cancer registry in a pragmatic extensible way. We have implemented a Semantic Web platform that allows transforming and storing data from cancer registries in RDF. This platform also permits users to formulate incremental user-defined queries through a graphical user interface. The query results can be displayed in several customisable ways. The complex disease timelines of individual patients can be clearly represented. Different events, e.g. different therapies and disease courses, are presented according to their temporal and causal relations. The presented platform is an example of the parallel development of ontologies and applications that take advantage of semantic web technologies in the medical field. 
The semantic structure of the representation renders it easy to analyse key figures of the patients and their evolution at different granularity levels.
User centered and ontology based information retrieval system for life sciences.
Sy, Mohameth-François; Ranwez, Sylvie; Montmain, Jacky; Regnault, Armelle; Crampes, Michel; Ranwez, Vincent
2012-01-25
Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Some improvements have been offered by semantic Web technologies and applications based on domain ontologies. In life science, for instance, the Gene Ontology is widely exploited in genomic applications and the Medical Subject Headings is the basis of biomedical publications indexation and information retrieval process proposed by PubMed. However current search engines suffer from two main drawbacks: there is limited user interaction with the list of retrieved resources and no explanation for their adequacy to the query is provided. Users may thus be confused by the selection and have no idea on how to adapt their queries so that the results match their expectations. This paper describes an information retrieval system that relies on domain ontology to widen the set of relevant documents that is retrieved and that uses a graphical rendering of query results to favor user interactions. Semantic proximities between ontology concepts and aggregating models are used to assess documents adequacy with respect to a query. The selection of documents is displayed in a semantic map to provide graphical indications that make explicit to what extent they match the user's query; this man/machine interface favors a more interactive and iterative exploration of data corpus, by facilitating query concepts weighting and visual explanation. We illustrate the benefit of using this information retrieval system on two case studies one of which aiming at collecting human genes related to transcription factors involved in hemopoiesis pathway. The ontology based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/. This environment is a first step towards a user centred application in which the system enlightens relevant information to provide decision help.
User centered and ontology based information retrieval system for life sciences
2012-01-01
Background Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Some improvements have been offered by semantic Web technologies and applications based on domain ontologies. In life science, for instance, the Gene Ontology is widely exploited in genomic applications and the Medical Subject Headings is the basis of biomedical publications indexation and information retrieval process proposed by PubMed. However current search engines suffer from two main drawbacks: there is limited user interaction with the list of retrieved resources and no explanation for their adequacy to the query is provided. Users may thus be confused by the selection and have no idea on how to adapt their queries so that the results match their expectations. Results This paper describes an information retrieval system that relies on domain ontology to widen the set of relevant documents that is retrieved and that uses a graphical rendering of query results to favor user interactions. Semantic proximities between ontology concepts and aggregating models are used to assess documents adequacy with respect to a query. The selection of documents is displayed in a semantic map to provide graphical indications that make explicit to what extent they match the user's query; this man/machine interface favors a more interactive and iterative exploration of data corpus, by facilitating query concepts weighting and visual explanation. We illustrate the benefit of using this information retrieval system on two case studies one of which aiming at collecting human genes related to transcription factors involved in hemopoiesis pathway. Conclusions The ontology based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/. 
This environment is a first step towards a user centred application in which the system enlightens relevant information to provide decision help. PMID:22373375
NCBO Ontology Recommender 2.0: an enhanced approach for biomedical ontology recommendation.
Martínez-Romero, Marcos; Jonquet, Clement; O'Connor, Martin J; Graybeal, John; Pazos, Alejandro; Musen, Mark A
2017-06-07
Ontologies and controlled terminologies have become increasingly important in biomedical research. Researchers use ontologies to annotate their data with ontology terms, enabling better data integration and interoperability across disparate datasets. However, the number, variety and complexity of current biomedical ontologies make it cumbersome for researchers to determine which ones to reuse for their specific needs. To overcome this problem, in 2010 the National Center for Biomedical Ontology (NCBO) released the Ontology Recommender, which is a service that receives a biomedical text corpus or a list of keywords and suggests ontologies appropriate for referencing the indicated terms. We developed a new version of the NCBO Ontology Recommender. Called Ontology Recommender 2.0, it uses a novel recommendation approach that evaluates the relevance of an ontology to biomedical text data according to four different criteria: (1) the extent to which the ontology covers the input data; (2) the acceptance of the ontology in the biomedical community; (3) the level of detail of the ontology classes that cover the input data; and (4) the specialization of the ontology to the domain of the input data. Our evaluation shows that the enhanced recommender provides higher quality suggestions than the original approach, providing better coverage of the input data, more detailed information about their concepts, increased specialization for the domain of the input data, and greater acceptance and use in the community. In addition, it provides users with more explanatory information, along with suggestions of not only individual ontologies but also groups of ontologies to use together. It also can be customized to fit the needs of different ontology recommendation scenarios. Ontology Recommender 2.0 suggests relevant ontologies for annotating biomedical text data. 
It combines the strengths of its predecessor with a range of adjustments and new features that improve its reliability and usefulness. Ontology Recommender 2.0 recommends over 500 biomedical ontologies from the NCBO BioPortal platform, where it is openly available (both via the user interface at http://bioportal.bioontology.org/recommender , and via a Web service API).
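One simple way to combine the four criteria named in this abstract is a weighted average of per-criterion scores. The sketch below is only that: the weights, the score values, and the ontology names are hypothetical, and the abstract does not state how Ontology Recommender 2.0 actually aggregates its criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriterionScores:
    """Per-ontology scores in [0, 1] for the four criteria in the abstract."""
    coverage: float
    acceptance: float
    detail: float
    specialization: float


# Hypothetical weights; they are illustrative and not taken from the paper.
WEIGHTS = {"coverage": 0.4, "acceptance": 0.2, "detail": 0.2, "specialization": 0.2}


def aggregate(scores, weights=WEIGHTS):
    """Weighted average of the four criterion scores."""
    total_weight = sum(weights.values())
    weighted = sum(getattr(scores, name) * weight for name, weight in weights.items())
    return weighted / total_weight


def rank(candidates, weights=WEIGHTS):
    """Return ontology names ordered from highest to lowest aggregate score."""
    return sorted(candidates, key=lambda name: aggregate(candidates[name], weights), reverse=True)


if __name__ == "__main__":
    candidates = {
        "ONTO_A": CriterionScores(0.9, 0.5, 0.5, 0.5),
        "ONTO_B": CriterionScores(0.5, 0.9, 0.5, 0.5),
    }
    for name in rank(candidates):
        print(name, round(aggregate(candidates[name]), 3))
```

Passing a different `weights` dictionary is one way such a scheme could be customized for different recommendation scenarios.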
Sorokine, Alexandre; Schlicher, Bob G.; Ward, Richard C.; ...
2015-05-22
This paper describes an original approach to generating scenarios for the purpose of testing the algorithms used to detect special nuclear materials (SNM) that incorporates the use of ontologies. Separating the signal of SNM from the background requires sophisticated algorithms. To assist in developing such algorithms, there is a need for scenarios that capture a very wide range of variables affecting the detection process, depending on the type of detector being used. To provide such a capability, we developed an ontology-driven information system (ODIS) for generating scenarios that can be used for testing of algorithms for SNM detection. The ontology-driven scenario generator (ODSG) is an ODIS based on information supplied by subject matter experts and other documentation. The details of the creation of the ontology, the development of the ontology-driven information system, and the design of the web user interface (UI) are presented along with specific examples of scenarios generated using the ODSG. We demonstrate that the paradigm behind the ODSG is capable of addressing the problem of semantic complexity at both the user and developer levels. Compared to traditional approaches, an ODIS provides benefits such as faithful representation of the users' domain conceptualization, simplified management of very large and semantically diverse datasets, and the ability to handle frequent changes to the application and the UI. Furthermore, the approach makes possible the generation of a much larger number of specific scenarios based on limited user-supplied information.
WebProtégé: A Collaborative Ontology Editor and Knowledge Acquisition Tool for the Web
Tudorache, Tania; Nyulas, Csongor; Noy, Natalya F.; Musen, Mark A.
2012-01-01
In this paper, we present WebProtégé—a lightweight ontology editor and knowledge acquisition tool for the Web. With the wide adoption of Web 2.0 platforms and the gradual adoption of ontologies and Semantic Web technologies in the real world, we need ontology-development tools that are better suited for the novel ways of interacting, constructing and consuming knowledge. Users today take Web-based content creation and online collaboration for granted. WebProtégé integrates these features as part of the ontology development process itself. We tried to lower the entry barrier to ontology development by providing a tool that is accessible from any Web browser, has extensive support for collaboration, and a highly customizable and pluggable user interface that can be adapted to any level of user expertise. The declarative user interface enabled us to create custom knowledge-acquisition forms tailored for domain experts. We built WebProtégé using the existing Protégé infrastructure, which supports collaboration on the back end side, and the Google Web Toolkit for the front end. The generic and extensible infrastructure allowed us to easily deploy WebProtégé in production settings for several projects. We present the main features of WebProtégé and its architecture and describe briefly some of its uses for real-world projects. WebProtégé is free and open source. An online demo is available at http://webprotege.stanford.edu. PMID:23807872
Noesis: Ontology based Scoped Search Engine and Resource Aggregator for Atmospheric Science
NASA Astrophysics Data System (ADS)
Ramachandran, R.; Movva, S.; Li, X.; Cherukuri, P.; Graves, S.
2006-12-01
The goal for search engines is to return results that are both accurate and complete. The search engines should find only what you really want and find everything you really want. Search engines (even meta search engines) lack semantics. The basis for search is simply based on string matching between the user's query term and the resource database and the semantics associated with the search string is not captured. For example, if an atmospheric scientist is searching for "pressure" related web resources, most search engines return inaccurate results such as web resources related to blood pressure. In this presentation Noesis, which is a meta-search engine and a resource aggregator that uses domain ontologies to provide scoped search capabilities will be described. Noesis uses domain ontologies to help the user scope the search query to ensure that the search results are both accurate and complete. The domain ontologies guide the user to refine their search query and thereby reduce the user's burden of experimenting with different search strings. Semantics are captured by refining the query terms to cover synonyms, specializations, generalizations and related concepts. Noesis also serves as a resource aggregator. It categorizes the search results from different online resources such as education materials, publications, datasets, web search engines that might be of interest to the user.
Developing a Domain Ontology: the Case of Water Cycle and Hydrology
NASA Astrophysics Data System (ADS)
Gupta, H.; Pozzi, W.; Piasecki, M.; Imam, B.; Houser, P.; Raskin, R.; Ramachandran, R.; Martinez Baquero, G.
2008-12-01
A semantic web ontology enables semantic data integration and semantic smart searching. Several organizations have attempted to implement smart registration and integration or searching using ontologies. These are the NOESIS (NSF project: LEAD) and HydroSeek (NSF project: CUAHSI HIS) data discovery engines and the NSF project GEON. All three applications use ontologies to discover data from multiple sources and projects. The NASA WaterNet project was established to identify creative, innovative ways to bridge NASA research results to real world applications, linking decision support needs to available data, observations, and modeling capability. WaterNet (NASA project) utilized the smart query tool Noesis as a testbed to test whether different ontologies (and different catalog searches) could be combined to match resources with user needs. NOESIS contains the upper level SWEET ontology that accepts plug-in domain ontologies to refine user search queries, reducing the burden of multiple keyword searches. Another smart search interface is HydroSeek, developed for CUAHSI, which uses a multi-layered concept search ontology, tagging variable names from any number of data sources to specific leaf and higher level concepts on which the search is executed. This approach has proven to be quite successful in mitigating semantic heterogeneity, as the user does not need to know the semantic specifics of each data source system but just uses a set of common keywords to discover the data for a specific temporal and geospatial domain. This presentation will show that tests with Noesis and HydroSeek lead to the conclusion that the construction of a complex and highly heterogeneous water cycle ontology requires multiple ontology modules. To illustrate the complexity and heterogeneity of a water cycle ontology, HydroSeek successfully utilizes WaterOneFlow to integrate data across multiple different data collections, such as USGS NWIS.
However, different methodologies are employed by the Earth Science, the Hydrological, and Hydraulic Engineering Communities, and each community employs models that require different input data. If a sub-domain ontology is created for each of these, describing water balance calculations, then the resulting structure of the semantic network describing these various terms can be rather complex, heterogeneous, and overlapping, and will require "mapping" between equivalent terms in the ontologies, along with the development of an upper level conceptual or domain ontology to utilize and link to those already in existence.
Data Quality Screening Service
NASA Technical Reports Server (NTRS)
Strub, Richard; Lynnes, Christopher; Hearty, Thomas; Won, Young-In; Fox, Peter; Zednik, Stephan
2013-01-01
A report describes the Data Quality Screening Service (DQSS), which is designed to help automate the filtering of remote sensing data on behalf of science users. Whereas this process often involves much research through quality documents followed by laborious coding, the DQSS is a Web Service that provides data users with data pre-filtered to their particular criteria, while at the same time guiding the user with filtering recommendations of the cognizant data experts. The DQSS design is based on a formal semantic Web ontology that describes data fields and the quality fields for applying quality control within a data product. The accompanying code base handles several remote sensing datasets and quality control schemes for data products stored in Hierarchical Data Format (HDF), a common format for NASA remote sensing data. Together, the ontology and code support a variety of quality control schemes through the implementation of the Boolean expression with simple, reusable conditional expressions as operands. Additional datasets are added to the DQSS simply by registering instances in the ontology if they follow a quality scheme that is already modeled in the ontology. New quality schemes are added by extending the ontology and adding code for each new scheme.
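The phrase "Boolean expression with simple, reusable conditional expressions as operands" can be pictured as small comparison predicates combined with AND/OR and then applied to each data value. The sketch below is an illustration under stated assumptions, not DQSS code: the field names (`quality_flag`, `cloud_fraction`, `temp`), the thresholds, and the helper names are all hypothetical, and plain dictionaries stand in for HDF data and quality fields.

```python
import operator

# Comparison operators available to a conditional expression.
COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


def condition(field, op, threshold):
    """Build a reusable predicate such as `quality_flag <= 1`."""
    compare = COMPARATORS[op]
    return lambda record: compare(record[field], threshold)


def all_of(*conditions):
    """Boolean AND over conditional expressions."""
    return lambda record: all(cond(record) for cond in conditions)


def any_of(*conditions):
    """Boolean OR over conditional expressions."""
    return lambda record: any(cond(record) for cond in conditions)


def screen(records, data_field, keep, fill_value=None):
    """Return the data values, replacing those that fail `keep` with `fill_value`."""
    return [rec[data_field] if keep(rec) else fill_value for rec in records]


if __name__ == "__main__":
    records = [
        {"temp": 280.0, "quality_flag": 0, "cloud_fraction": 0.1},
        {"temp": 275.0, "quality_flag": 2, "cloud_fraction": 0.1},
        {"temp": 290.0, "quality_flag": 1, "cloud_fraction": 0.8},
    ]
    keep = all_of(condition("quality_flag", "<=", 1), condition("cloud_fraction", "<", 0.5))
    print(screen(records, "temp", keep))
```

Because each predicate is built from a field name, an operator, and a threshold, a new dataset that follows an existing scheme only needs new parameter values, which loosely mirrors registering instances rather than writing new code.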
A Context-Aware Tourism Recommender System Based on a Spreading Activation Method
NASA Astrophysics Data System (ADS)
Bahramian, Z.; Abbaspour, R. Ali; Claramunt, C.
2017-09-01
Users planning a trip to a given destination often search for the most appropriate points of interest, a non-straightforward task because the range of information available is very large and not very well structured. The research presented in this paper introduces a context-aware tourism recommender system that overcomes the information overload problem by providing personalized recommendations based on the user's preferences. It also incorporates contextual information to improve the recommendation process. As previous context-aware tourism recommender systems suffer from a lack of formal definition to represent contextual information and user's preferences, the proposed system is enhanced using an ontology approach. We also apply a spreading activation technique to contextualize user preferences and learn the user profile dynamically according to the user's feedback. The proposed method gives more weight in the spreading process to nodes whose preference values are assigned directly by the user. An experimental application to the city of Tehran demonstrates the overall performance of the proposed context-aware tourism recommender system.
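As a rough illustration of spreading activation over an ontology of POI categories, the sketch below propagates preference values along weighted links. It is a generic textbook variant, not the authors' algorithm: the graph, the link weights, and the parameters `decay`, `threshold` and `direct_boost` are assumptions. `direct_boost` is one possible way to give extra effect to nodes whose values the user assigned directly.

```python
from collections import defaultdict
from typing import Dict


def spread_activation(graph: Dict[str, Dict[str, float]],
                      initial: Dict[str, float],
                      decay: float = 0.5,
                      threshold: float = 0.05,
                      direct_boost: float = 1.5,
                      max_steps: int = 3) -> Dict[str, float]:
    """Propagate preference values from user-rated nodes to related concepts.

    graph:   node -> {neighbour: link weight in [0, 1]}
    initial: preference values assigned directly by the user
    """
    activation = defaultdict(float)
    frontier = {}
    for node, value in initial.items():
        activation[node] = value
        # Assumption: directly rated nodes emit boosted energy.
        frontier[node] = value * direct_boost

    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbour, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                if passed >= threshold:  # drop negligible activation
                    next_frontier[neighbour] += passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical fragment of a tourism ontology.
graph = {
    "museum": {"history_museum": 0.9, "art_gallery": 0.6},
    "history_museum": {"palace": 0.5},
}
scores = spread_activation(graph, {"museum": 1.0})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

The decay factor and threshold keep activation from spreading indefinitely, so concepts far from anything the user rated receive little or no score.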
ODISEES: A New Paradigm in Data Access
NASA Astrophysics Data System (ADS)
Huffer, E.; Little, M. M.; Kusterer, J.
2013-12-01
As part of its ongoing efforts to improve access to data, the Atmospheric Science Data Center has developed a high-precision Earth Science domain ontology (the 'ES Ontology') implemented in a graph database ('the Semantic Metadata Repository') that is used to store detailed, semantically-enhanced, parameter-level metadata for ASDC data products. The ES Ontology provides the semantic infrastructure needed to drive the ASDC's Ontology-Driven Interactive Search Environment for Earth Science ('ODISEES'), a data discovery and access tool, and will support additional data services such as analytics and visualization. The ES ontology is designed on the premise that naming conventions alone are not adequate to provide the information needed by prospective data consumers to assess the suitability of a given dataset for their research requirements; nor are current metadata conventions adequate to support seamless machine-to-machine interactions between file servers and end-user applications. Data consumers need information not only about what two data elements have in common, but also about how they are different. End-user applications need consistent, detailed metadata to support real-time data interoperability. The ES ontology is a highly precise, bottom-up, queriable model of the Earth Science domain that focuses on critical details about the measurable phenomena, instrument techniques, data processing methods, and data file structures. Earth Science parameters are described in detail in the ES Ontology and mapped to the corresponding variables that occur in ASDC datasets. Variables are in turn mapped to well-annotated representations of the datasets that they occur in, the instrument(s) used to create them, the instrument platforms, the processing methods, etc., creating a linked-data structure that allows both human and machine users to access a wealth of information critical to understanding and manipulating the data. 
The mappings are recorded in the Semantic Metadata Repository as RDF-triples. An off-the-shelf Ontology Development Environment and a custom Metadata Conversion Tool comprise a human-machine/machine-machine hybrid tool that partially automates the creation of metadata as RDF-triples by interfacing with existing metadata repositories and providing a user interface that solicits input from a human user, when needed. RDF-triples are pushed to the Ontology Development Environment, where a reasoning engine executes a series of inference rules whose antecedent conditions can be satisfied by the initial set of RDF-triples, thereby generating the additional detailed metadata that is missing in existing repositories. A SPARQL Endpoint, a web-based query service and a Graphical User Interface allow prospective data consumers - even those with no familiarity with NASA data products - to search the metadata repository to find and order data products that meet their exact specifications. A web-based API will provide an interface for machine-to-machine transactions.
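The inference step described above, in which rules fire when their antecedents are satisfied by existing triples and thereby generate additional metadata, can be sketched as a tiny forward-chaining loop. The rule, the predicate names (`ex:measuredBy`, `ex:mountedOn`, `ex:observedFrom`) and the identifiers are hypothetical and are not taken from the ES Ontology; plain tuples stand in for RDF triples.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def infer(triples: Iterable[Triple]) -> Set[Triple]:
    """Apply one illustrative rule until no new triples appear.

    Rule (an assumption for this sketch):
      (v, ex:measuredBy, i) and (i, ex:mountedOn, p)
        => (v, ex:observedFrom, p)
    """
    facts = set(triples)
    while True:
        derived = set()
        for subj, pred, obj in facts:
            if pred != "ex:measuredBy":
                continue
            for subj2, pred2, obj2 in facts:
                if pred2 == "ex:mountedOn" and subj2 == obj:
                    derived.add((subj, "ex:observedFrom", obj2))
        if derived <= facts:  # fixed point reached: nothing new was inferred
            return facts
        facts |= derived


# Hypothetical starting metadata.
triples = [
    ("ex:variable_1", "ex:measuredBy", "ex:instrument_A"),
    ("ex:instrument_A", "ex:mountedOn", "ex:platform_X"),
]
closure = infer(triples)
print(sorted(closure))
```

A production system would use a triple store and a reasoner rather than nested loops; the sketch only shows how missing metadata can be derived from what is already recorded.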
Using ontology-based annotation to profile disease research
Coulet, Adrien; LePendu, Paea; Shah, Nigam H
2012-01-01
Background Profiling the allocation and trend of research activity is of interest to funding agencies, administrators, and researchers. However, the lack of a common classification system hinders the comprehensive and systematic profiling of research activities. This study introduces ontology-based annotation as a method to overcome this difficulty. Analyzing over a decade of funding data and publication data, the trends of disease research are profiled across topics, across institutions, and over time. Results This study introduces and explores the notions of research sponsorship and allocation and shows that leaders of research activity can be identified within specific disease areas of interest, such as those with high mortality or high sponsorship. The funding profiles of disease topics readily cluster themselves in agreement with the ontology hierarchy and closely mirror the funding agency priorities. Finally, four temporal trends are identified among research topics. Conclusions This work utilizes disease ontology (DO)-based annotation to profile effectively the landscape of biomedical research activity. By using DO in this manner a use-case driven mechanism is also proposed to evaluate the utility of classification hierarchies. PMID:22494789
Standardized description of scientific evidence using the Evidence Ontology (ECO)
Chibucos, Marcus C.; Mungall, Christopher J.; Balakrishnan, Rama; Christie, Karen R.; Huntley, Rachael P.; White, Owen; Blake, Judith A.; Lewis, Suzanna E.; Giglio, Michelle
2014-01-01
The Evidence Ontology (ECO) is a structured, controlled vocabulary for capturing evidence in biological research. ECO includes diverse terms for categorizing evidence that supports annotation assertions including experimental types, computational methods, author statements and curator inferences. Using ECO, annotation assertions can be distinguished according to the evidence they are based on such as those made by curators versus those automatically computed or those made via high-throughput data review versus single test experiments. Originally created for capturing evidence associated with Gene Ontology annotations, ECO is now used in other capacities by many additional annotation resources including UniProt, Mouse Genome Informatics, Saccharomyces Genome Database, PomBase, the Protein Information Resource and others. Information on the development and use of ECO can be found at http://evidenceontology.org. The ontology is freely available under Creative Commons license (CC BY-SA 3.0), and can be downloaded in both Open Biological Ontologies and Web Ontology Language formats at http://code.google.com/p/evidenceontology. Also at this site is a tracker for user submission of term requests and questions. ECO remains under active development in response to user-requested terms and in collaborations with other ontologies and database resources. Database URL: Evidence Ontology Web site: http://evidenceontology.org PMID:25052702
Hybrid ontology for semantic information retrieval model using keyword matching indexing system.
Uthayan, K R; Mala, G S Anandha
2015-01-01
Ontology is the process of growth and elucidation of concepts of an information domain being common for a group of users. Establishing ontology into information retrieval is a normal method to develop searching effects of relevant information users require. Keywords matching process with historical or information domain is significant in recent calculations for assisting the best match for specific input queries. This research presents a better querying mechanism for information retrieval which integrates the ontology queries with keyword search. The ontology-based query is changed into a primary order to predicate logic uncertainty which is used for routing the query to the appropriate servers. Matching algorithms characterize warm area of researches in computer science and artificial intelligence. In text matching, it is more dependable to study semantics model and query for conditions of semantic matching. This research develops the semantic matching results between input queries and information in ontology field. The contributed algorithm is a hybrid method that is based on matching extracted instances from the queries and information field. The queries and information domain is focused on semantic matching, to discover the best match and to progress the executive process. In conclusion, the hybrid ontology in semantic web is sufficient to retrieve the documents when compared to standard ontology.
Hybrid Ontology for Semantic Information Retrieval Model Using Keyword Matching Indexing System
Uthayan, K. R.; Anandha Mala, G. S.
2015-01-01
Ontology is the process of growth and elucidation of concepts of an information domain being common for a group of users. Establishing ontology into information retrieval is a normal method to develop searching effects of relevant information users require. Keywords matching process with historical or information domain is significant in recent calculations for assisting the best match for specific input queries. This research presents a better querying mechanism for information retrieval which integrates the ontology queries with keyword search. The ontology-based query is changed into a primary order to predicate logic uncertainty which is used for routing the query to the appropriate servers. Matching algorithms characterize warm area of researches in computer science and artificial intelligence. In text matching, it is more dependable to study semantics model and query for conditions of semantic matching. This research develops the semantic matching results between input queries and information in ontology field. The contributed algorithm is a hybrid method that is based on matching extracted instances from the queries and information field. The queries and information domain is focused on semantic matching, to discover the best match and to progress the executive process. In conclusion, the hybrid ontology in semantic web is sufficient to retrieve the documents when compared to standard ontology. PMID:25922851
NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources
Jonquet, Clement; LePendu, Paea; Falconer, Sean; Coulet, Adrien; Noy, Natalya F.; Musen, Mark A.; Shah, Nigam H.
2011-01-01
The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will enable us to facilitate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index—a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics “under the hood.” PMID:21918645
NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources.
Jonquet, Clement; Lependu, Paea; Falconer, Sean; Coulet, Adrien; Noy, Natalya F; Musen, Mark A; Shah, Nigam H
2011-09-01
The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will enable us to facilitate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index-a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics "under the hood."
NASA Astrophysics Data System (ADS)
García, Isaías; Benavides, Carmen; Alaiz, Héctor; Alonso, Angel
2013-08-01
This paper describes research on the use of knowledge models (ontologies) for building computer-aided educational software in the field of control engineering. Ontologies are able to represent in the computer a very rich conceptual model of a given domain. This model can be used later for a number of purposes in different software applications. In this study, domain ontology about the field of lead-lag compensator design has been built and used for automatic exercise generation, graphical user interface population and interaction with the user at any level of detail, including explanations about why things occur. An application called Onto-CELE (ontology-based control engineering learning environment) uses the ontology for implementing a learning environment that can be used for self and lifelong learning purposes. The experience has shown that the use of knowledge models as the basis for educational software applications is capable of showing students the whole complexity of the analysis and design processes at any level of detail. A practical experience with postgraduate students has shown the mentioned benefits and possibilities of the approach.
Ontology-Driven Search and Triage: Design of a Web-Based Visual Interface for MEDLINE.
Demelo, Jonathan; Parsons, Paul; Sedig, Kamran
2017-02-02
Diverse users need to search health and medical literature to satisfy open-ended goals such as making evidence-based decisions and updating their knowledge. However, doing so is challenging due to at least two major difficulties: (1) articulating information needs using accurate vocabulary and (2) dealing with large document sets returned from searches. Common search interfaces such as PubMed do not provide adequate support for exploratory search tasks. Our objective was to improve support for exploratory search tasks by combining two strategies in the design of an interactive visual interface by (1) using a formal ontology to help users build domain-specific knowledge and vocabulary and (2) providing multi-stage triaging support to help mitigate the information overload problem. We developed a Web-based tool, Ontology-Driven Visual Search and Triage Interface for MEDLINE (OVERT-MED), to test our design ideas. We implemented a custom searchable index of MEDLINE, which comprises approximately 25 million document citations. We chose a popular biomedical ontology, the Human Phenotype Ontology (HPO), to test our solution to the vocabulary problem. We implemented multistage triaging support in OVERT-MED, with the aid of interactive visualization techniques, to help users deal with large document sets returned from searches. Formative evaluation suggests that the design features in OVERT-MED are helpful in addressing the two major difficulties described above. Using a formal ontology seems to help users articulate their information needs with more accurate vocabulary. In addition, multistage triaging combined with interactive visualizations shows promise in mitigating the information overload problem. Our strategies appear to be valuable in addressing the two major problems in exploratory search. Although we tested OVERT-MED with a particular ontology and document collection, we anticipate that our strategies can be transferred successfully to other contexts. 
©Jonathan Demelo, Paul Parsons, Kamran Sedig. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 02.02.2017.
Ontology-Driven Search and Triage: Design of a Web-Based Visual Interface for MEDLINE
2017-01-01
Background Diverse users need to search health and medical literature to satisfy open-ended goals such as making evidence-based decisions and updating their knowledge. However, doing so is challenging due to at least two major difficulties: (1) articulating information needs using accurate vocabulary and (2) dealing with large document sets returned from searches. Common search interfaces such as PubMed do not provide adequate support for exploratory search tasks. Objective Our objective was to improve support for exploratory search tasks by combining two strategies in the design of an interactive visual interface by (1) using a formal ontology to help users build domain-specific knowledge and vocabulary and (2) providing multi-stage triaging support to help mitigate the information overload problem. Methods We developed a Web-based tool, Ontology-Driven Visual Search and Triage Interface for MEDLINE (OVERT-MED), to test our design ideas. We implemented a custom searchable index of MEDLINE, which comprises approximately 25 million document citations. We chose a popular biomedical ontology, the Human Phenotype Ontology (HPO), to test our solution to the vocabulary problem. We implemented multistage triaging support in OVERT-MED, with the aid of interactive visualization techniques, to help users deal with large document sets returned from searches. Results Formative evaluation suggests that the design features in OVERT-MED are helpful in addressing the two major difficulties described above. Using a formal ontology seems to help users articulate their information needs with more accurate vocabulary. In addition, multistage triaging combined with interactive visualizations shows promise in mitigating the information overload problem. Conclusions Our strategies appear to be valuable in addressing the two major problems in exploratory search. 
Although we tested OVERT-MED with a particular ontology and document collection, we anticipate that our strategies can be transferred successfully to other contexts. PMID:28153818
2014-01-01
Motivation Biomedical ontologists to date have concentrated on ontological descriptions of biomedical entities such as gene products and their attributes, phenotypes and so on. Recently, effort has diversified to descriptions of the laboratory investigations by which these entities were produced. However, much biological insight is gained from the analysis of the data produced from these investigations, and there is a lack of adequate descriptions of the wide range of software that are central to bioinformatics. We need to describe how data are analyzed for discovery, audit trails, provenance and reproducibility. Results The Software Ontology (SWO) is a description of software used to store, manage and analyze data. Input to the SWO has come from beyond the life sciences, but its main focus is the life sciences. We used agile techniques to gather input for the SWO and keep engagement with our users. The result is an ontology that meets the needs of a broad range of users by describing software, its information processing tasks, data inputs and outputs, data formats versions and so on. Recently, the SWO has incorporated EDAM, a vocabulary for describing data and related concepts in bioinformatics. The SWO is currently being used to describe software used in multiple biomedical applications. Conclusion The SWO is another element of the biomedical ontology landscape that is necessary for the description of biomedical entities and how they were discovered. An ontology of software used to analyze data produced by investigations in the life sciences can be made in such a way that it covers the important features requested and prioritized by its users. The SWO thus fits into the landscape of biomedical ontologies and is produced using techniques designed to keep it in line with user’s needs. 
Availability The Software Ontology is available under an Apache 2.0 license at http://theswo.sourceforge.net/; the Software Ontology blog can be read at http://softwareontology.wordpress.com. PMID:25068035
Applications of Ontology Design Patterns in Biomedical Ontologies
Mortensen, Jonathan M.; Horridge, Matthew; Musen, Mark A.; Noy, Natalya F.
2012-01-01
Ontology design patterns (ODPs) are a proposed solution to facilitate ontology development, and to help users avoid some of the most frequent modeling mistakes. ODPs originate from similar approaches in software engineering, where software design patterns have become a critical aspect of software development. There is little empirical evidence for ODP prevalence or effectiveness thus far. In this work, we determine the use and applicability of ODPs in a case study of biomedical ontologies. We encoded ontology design patterns from two ODP catalogs. We then searched for these patterns in a set of eight ontologies. We found five patterns of the 69 patterns. Two of the eight ontologies contained these patterns. While ontology design patterns provide a vehicle for capturing formally reoccurring models and best practices in ontology design, we show that today their use in a case study of widely used biomedical ontologies is limited. PMID:23304337
Semantics-driven modelling of user preferences for information retrieval in the biomedical domain.
Gladun, Anatoly; Rogushina, Julia; Valencia-García, Rafael; Béjar, Rodrigo Martínez
2013-03-01
A large amount of biomedical and genomic data are currently available on the Internet. However, data are distributed into heterogeneous biological information sources, with little or even no organization. Semantic technologies provide a consistent and reliable basis with which to confront the challenges involved in the organization, manipulation and visualization of data and knowledge. One of the knowledge representation techniques used in semantic processing is the ontology, which is commonly defined as a formal and explicit specification of a shared conceptualization of a domain of interest. The work presented here introduces a set of interoperable algorithms that can use domain and ontological information to improve information-retrieval processes. This work presents an ontology-based information-retrieval system for the biomedical domain. This system, with which some experiments have been carried out that are described in this paper, is based on the use of domain ontologies for the creation and normalization of lightweight ontologies that represent user preferences in a determined domain in order to improve information-retrieval processes.
Ontology-based classification of remote sensing images using spectral rules
NASA Astrophysics Data System (ADS)
Andrés, Samuel; Arvor, Damien; Mougenot, Isabelle; Libourel, Thérèse; Durieux, Laurent
2017-05-01
Earth Observation data is of great interest for a wide spectrum of scientific domain applications. Enhanced access to remote sensing images for "domain" experts thus represents a great advance since it allows users to interpret remote sensing images based on their domain expert knowledge. However, such an advantage can also turn into a major limitation if this knowledge is not formalized, and is thus difficult to share with and be understood by other users. In this context, knowledge representation techniques such as ontologies should play a major role in the future of remote sensing applications. We implemented an ontology-based prototype to automatically classify Landsat images based on explicit spectral rules. The ontology is designed in a very modular way in order to achieve a generic and versatile representation of concepts we consider of utmost importance in remote sensing. The prototype was tested on four subsets of Landsat images and the results confirmed the potential of ontologies to formalize expert knowledge and classify remote sensing images.
Conceptual Web Users' Actions Prediction for Ontology-Based Browsing Recommendations
NASA Astrophysics Data System (ADS)
Robal, Tarmo; Kalja, Ahto
The Internet consists of thousands of web sites with different kinds of structures. However, users are browsing the web according to their informational expectations towards the web site searched, having an implicit conceptual model of the domain in their minds. Nevertheless, people tend to repeat themselves and have partially shared conceptual views while surfing the web, finding some areas of web sites more interesting than others. Herein, we take advantage of the latter and provide a model and a study on predicting users' actions based on the web ontology concepts and their relations.
Web information retrieval based on ontology
NASA Astrophysics Data System (ADS)
Zhang, Jian
2013-03-01
The purpose of Information Retrieval (IR) is to find a set of documents that are relevant to a specific information need of a user. The traditional Information Retrieval model commonly used in commercial search engines is based on a keyword indexing system and Boolean logic queries. One big drawback of traditional information retrieval is that it typically retrieves information without an explicitly defined domain of interest to the user, so that a lot of irrelevant information is returned, which burdens the user with picking useful answers out of these irrelevant results. In order to tackle this issue, many semantic web information retrieval models have been proposed recently. The main advantage of the Semantic Web is to enhance search mechanisms with the use of ontology mechanisms. In this paper, we present our approach to personalizing a web search engine based on ontology. In addition, key techniques are also discussed in our paper. Compared to previous research, our work concentrates on semantic similarity and the whole process, including query submission and information annotation.
An Ontology for the Discovery of Time-series Data
NASA Astrophysics Data System (ADS)
Hooper, R. P.; Choi, Y.; Piasecki, M.; Zaslavsky, I.; Valentine, D. W.; Whitenack, T.
2010-12-01
An ontology was developed to enable a single-dimensional keyword search of time-series data collected at fixed points, such as stream gage records, water quality observations, or repeated biological measurements collected at fixed stations. The hierarchical levels were developed to allow navigation from general concepts to more specific ones, terminating in a leaf concept, which is the specific property measured. For example, the concept “nutrient” has child concepts of “nitrogen”, “phosphorus”, and “carbon”; each of these child concepts is then broken into the actual constituent measured (e.g., “total kjeldahl nitrogen” or “nitrate + nitrite”). In this way, a non-expert user can find all nutrients containing nitrogen without knowing all the species measured, but an expert user can go immediately to the compound of interest. In addition, a property, such as dissolved silica, can appear as a leaf concept under nutrients or weathering products. This flexibility allows users from various disciplines to find properties of interest. The ontology can be viewed at http://water.sdsc.edu/hiscentral/startree.aspx. Properties measured by various data publishers (e.g., universities and government agencies) are tagged with leaf concepts from this ontology. A discovery client, HydroDesktop, creates a search request by defining the spatial and temporal extent of interest and a keyword taken from the discovery ontology. Metadata returned from the catalog describes the time series which meet the specified search criteria. This ontology is considered to be an initial description of physical, chemical and biological properties measured in water and suspended sediment. Future plans call for creating a moderated forum for the scientific community to add to and to modify this ontology. Further information for the Hydrologic Information Systems project, of which this is a part, is available at http://his.cuahsi.org.
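The navigation from a general concept down to leaf concepts can be sketched with a small parent-to-children mapping. Only the concept names quoted in the abstract are used; the dictionary layout, the placement of "dissolved silica" directly under "nutrient", and the function name are assumptions for illustration, and "phosphorus" and "carbon" are left unexpanded, so they behave as leaves in this toy hierarchy.

```python
from typing import Dict, List, Set

# Toy fragment of the hierarchy; a concept may have more than one parent.
CHILDREN: Dict[str, List[str]] = {
    "nutrient": ["nitrogen", "phosphorus", "carbon", "dissolved silica"],
    "nitrogen": ["total kjeldahl nitrogen", "nitrate + nitrite"],
    "weathering product": ["dissolved silica"],
}


def leaf_concepts(concept: str,
                  children: Dict[str, List[str]] = CHILDREN) -> Set[str]:
    """Return every leaf concept reachable from `concept`."""
    kids = children.get(concept, [])
    if not kids:
        return {concept}  # no children recorded, so this is a leaf
    leaves: Set[str] = set()
    for kid in kids:
        leaves |= leaf_concepts(kid, children)
    return leaves


# A non-expert searches on the general keyword and gets all specific species.
nitrogen_leaves = leaf_concepts("nitrogen")
print(sorted(nitrogen_leaves))

# The same leaf is reachable from two different branches.
print("dissolved silica" in leaf_concepts("nutrient"))
print("dissolved silica" in leaf_concepts("weathering product"))
```

Expanding a keyword into its leaf concepts before querying the catalog is what lets a search on a general term match series that were tagged only with specific leaf concepts.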
Versioning System for Distributed Ontology Development
2016-03-15
provides guidelines for evaluating the impact of the version changes. ...conformance to a clear set of development and versioning guidelines to assure that changes and extensions can be integrated back into the “main development... guidelines for evolution of an ontology would have considerably helped the users of the ontology in these situations. The currently accessible
Towards Agile Ontology Maintenance
NASA Astrophysics Data System (ADS)
Luczak-Rösch, Markus
Ontologies are an appropriate means to represent knowledge on the Web. Research on ontology engineering reached practices for an integrative lifecycle support. However, a broader success of ontologies in Web-based information systems remains unreached while the more lightweight semantic approaches are rather successful. We assume, paired with the emerging trend of services and microservices on the Web, new dynamic scenarios gain momentum in which a shared knowledge base is made available to several dynamically changing services with disparate requirements. Our work envisions a step towards such a dynamic scenario in which an ontology adapts to the requirements of the accessing services and applications as well as the user's needs in an agile way and reduces the experts' involvement in ontology maintenance processes.
Text-Content-Analysis based on the Syntactic Correlations between Ontologies
NASA Astrophysics Data System (ADS)
Tenschert, Axel; Kotsiopoulos, Ioannis; Koller, Bastian
The work presented in this chapter is concerned with the analysis of semantic knowledge structures, represented in the form of Ontologies, through which Service Level Agreements (SLAs) are enriched with new semantic data. The objective of the enrichment process is to enable SLA negotiation in a way that is much more convenient for Service Users. For this purpose the deployment of an SLA-Management-System as well as the development of an analyzing procedure for Ontologies is required. This chapter will refer to the BREIN, the FinGrid and the LarKC projects. The analyzing procedure examines the syntactic correlations of several Ontologies whose focus lies in the field of mechanical engineering. A method of analyzing text and content is developed as part of this procedure. In order to do so, we introduce a formalism as well as a method for understanding content. The analysis and methods are integrated into an SLA Management System which enables a Service User to interact with the system as a service by negotiating the user requests and including the semantic knowledge. Through negotiation between Service User and Service Provider the analysis procedure considers the user requests by extending the SLAs with semantic knowledge. Through this the economic use of an SLA-Management-System is increased by the enhancement of SLAs with semantic knowledge structures. The main focus of this chapter is the analyzing procedure, that is, the Text-Content-Analysis, which provides the mentioned semantic knowledge structures.
DEVA: An extensible ontology-based annotation model for visual document collections
NASA Astrophysics Data System (ADS)
Jelmini, Carlo; Marchand-Maillet, Stephane
2003-01-01
The description of visual documents is a fundamental aspect of any efficient information management system, but the process of manually annotating large collections of documents is tedious and far from being perfect. The need for a generic and extensible annotation model therefore arises. In this paper, we present DEVA, an open, generic and expressive multimedia annotation framework. DEVA is an extension of the Dublin Core specification. The model can represent the semantic content of any visual document. It is described in the ontology language DAML+OIL and can easily be extended with external specialized ontologies, adapting the vocabulary to the given application domain. In parallel, we present the Magritte annotation tool, which is an early prototype that validates the DEVA features. Magritte allows users to annotate image collections manually. It is designed with a modular and extensible architecture, which enables the user to dynamically adapt the user interface to specialized ontologies merged into DEVA.
SAFOD Brittle Microstructure and Mechanics Knowledge Base (BM2KB)
NASA Astrophysics Data System (ADS)
Babaie, Hassan A.; Broda Cindi, M.; Hadizadeh, Jafar; Kumar, Anuj
2013-07-01
Scientific drilling near Parkfield, California has established the San Andreas Fault Observatory at Depth (SAFOD), which provides the solid earth community with short range geophysical and fault zone material data. The BM2KB ontology was developed in order to formalize the knowledge about brittle microstructures in the fault rocks sampled from the SAFOD cores. A knowledge base, instantiated from this domain ontology, stores and presents the observed microstructural and analytical data with respect to implications for brittle deformation and mechanics of faulting. These data can be searched on the knowledge base's Web interface by selecting a set of terms (classes, properties) from different drop-down lists that are dynamically populated from the ontology. In addition to this general search, a query can also be conducted to view data contributed by a specific investigator. A search by sample is done using the EarthScope SAFOD Core Viewer that allows a user to locate samples on high resolution images of core sections belonging to different runs and holes. The class hierarchy of the BM2KB ontology was initially designed using the Unified Modeling Language (UML), which was used as a visual guide to develop the ontology in OWL using the Protégé ontology editor. Various Semantic Web technologies such as the RDF, RDFS, and OWL ontology languages, SPARQL query language, and Pellet reasoning engine, were used to develop the ontology. An interactive Web application interface was developed through Jena, a Java-based framework, with AJAX technology, JSP pages, and Java servlets, and deployed via an Apache Tomcat server. The interface allows the registered user to submit data related to their research on a sample of the SAFOD core. The submitted data, after initial review by the knowledge base administrator, are added to the extensible knowledge base and become available in subsequent queries to all types of users.
The interface facilitates inference capabilities in the ontology, supports SPARQL queries, allows for modifications based on successive discoveries, and provides an accessible knowledge base on the Web.
Semantic Integration for Marine Science Interoperability Using Web Technologies
NASA Astrophysics Data System (ADS)
Rueda, C.; Bermudez, L.; Graybeal, J.; Isenor, A. W.
2008-12-01
The Marine Metadata Interoperability Project, MMI (http://marinemetadata.org) promotes the exchange, integration, and use of marine data through enhanced data publishing, discovery, documentation, and accessibility. A key effort is the definition of an Architectural Framework and Operational Concept for Semantic Interoperability (http://marinemetadata.org/sfc), which is complemented with the development of tools that realize critical use cases in semantic interoperability. In this presentation, we describe a set of such Semantic Web tools that allow performing important interoperability tasks, ranging from the creation of controlled vocabularies and the mapping of terms across multiple ontologies, to the online registration, storage, and search services needed to work with the ontologies (http://mmisw.org). This set of services uses Web standards and technologies, including Resource Description Framework (RDF), Web Ontology Language (OWL), Web services, and toolkits for Rich Internet Application development. We will describe the following components: MMI Ontology Registry: The MMI Ontology Registry and Repository provides registry and storage services for ontologies. Entries in the registry are associated with projects defined by the registered users. Also, sophisticated search functions, for example according to metadata items and vocabulary terms, are provided. Client applications can submit search requests using the W3C SPARQL Query Language for RDF. Voc2RDF: This component converts an ASCII comma-delimited set of terms and definitions into an RDF file. Voc2RDF facilitates the creation of controlled vocabularies by using a simple form-based user interface. Created vocabularies and their descriptive metadata can be submitted to the MMI Ontology Registry for versioning and community access. VINE: The Vocabulary Integration Environment component allows the user to map vocabulary terms across multiple ontologies.
Various relationships can be established, for example exactMatch, narrowerThan, and subClassOf. VINE can compute inferred mappings based on the given associations. Attributes about each mapping, like comments and a confidence level, can also be included. VINE also supports registering and storing resulting mapping files in the Ontology Registry. The presentation will describe the application of semantic technologies in general, and our planned applications in particular, to solve data management problems in the marine and environmental sciences.
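The Voc2RDF conversion described above (a comma-delimited set of terms and definitions turned into RDF) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MMI implementation: the function name `csv_to_turtle`, the example namespace, and the choice of SKOS properties and Turtle output are all hypothetical.

```python
import csv
import io
from urllib.parse import quote

SKOS = "http://www.w3.org/2004/02/skos/core#"


def _escape(text: str) -> str:
    """Escape backslashes, quotes and newlines for a Turtle string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def csv_to_turtle(csv_text: str, namespace: str) -> str:
    """Convert 'term,definition' rows into SKOS concepts in Turtle.

    `namespace` is a hypothetical base URI chosen by the caller.
    """
    lines = [f"@prefix skos: <{SKOS}> .", ""]
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or malformed rows
        term, definition = row[0].strip(), row[1].strip()
        # Build a URI-safe local name from the term (a simplification).
        local = quote(term.replace(" ", "_"), safe="")
        lines.append(f"<{namespace}{local}> a skos:Concept ;")
        lines.append(f'    skos:prefLabel "{_escape(term)}" ;')
        lines.append(f'    skos:definition "{_escape(definition)}" .')
    return "\n".join(lines)
```

Calling `csv_to_turtle("salinity,Salt content of sea water\n", "http://example.org/voc/")` yields one `skos:Concept` block per input row; the resulting text could then be loaded by any RDF toolkit.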
Brochhausen, Mathias; Spear, Andrew D.; Cocos, Cristian; Weiler, Gabriele; Martín, Luis; Anguita, Alberto; Stenzhorn, Holger; Daskalaki, Evangelia; Schera, Fatima; Schwarz, Ulf; Sfakianakis, Stelios; Kiefer, Stephan; Dörr, Martin; Graf, Norbert; Tsiknakis, Manolis
2017-01-01
Objective This paper introduces the objectives, methods and results of ontology development in the EU co-funded project Advancing Clinico-genomic Trials on Cancer – Open Grid Services for Improving Medical Knowledge Discovery (ACGT). While the available data in the life sciences has recently grown both in amount and quality, the full exploitation of it is being hindered by the use of different underlying technologies, coding systems, category schemes and reporting methods on the part of different research groups. The goal of the ACGT project is to contribute to the resolution of these problems by developing an ontology-driven, semantic grid services infrastructure that will enable efficient execution of discovery-driven scientific workflows in the context of multi-centric, post-genomic clinical trials. The focus of the present paper is the ACGT Master Ontology (MO). Methods ACGT project researchers undertook a systematic review of existing domain and upper-level ontologies, as well as of existing ontology design software, implementation methods, and end-user interfaces. This included the careful study of best practices, design principles and evaluation methods for ontology design, maintenance, implementation, and versioning, as well as for use on the part of domain experts and clinicians. 
Results To date, the results of the ACGT project include (i) the development of a master ontology (the ACGT-MO) based on clearly defined principles of ontology development and evaluation; (ii) the development of a technical infrastructure (the ACGT Platform) that implements the ACGT-MO utilizing independent tools, components and resources that have been developed based on open architectural standards, and which includes an application updating and evolving the ontology efficiently in response to end-user needs; and (iii) the development of an Ontology-based Trial Management Application (ObTiMA) that integrates the ACGT-MO into the design process of clinical trials in order to guarantee automatic semantic integration without the need to perform a separate mapping process. PMID:20438862
Samwald, Matthias; Lim, Ernest; Masiar, Peter; Marenco, Luis; Chen, Huajun; Morse, Thomas; Mutalik, Pradeep; Shepherd, Gordon; Miller, Perry; Cheung, Kei-Hoi
2009-01-01
The amount of biomedical data available in Semantic Web formats has been rapidly growing in recent years. While these formats are machine-friendly, user-friendly web interfaces allowing easy querying of these data are typically lacking. We present "Entrez Neuron", a pilot neuron-centric interface that allows for keyword-based queries against a coherent repository of OWL ontologies. These ontologies describe neuronal structures, physiology, mathematical models and microscopy images. The returned query results are organized hierarchically according to brain architecture. Where possible, the application makes use of entities from the Open Biomedical Ontologies (OBO) and the 'HCLS knowledgebase' developed by the W3C Interest Group for Health Care and Life Science. It makes use of the emerging RDFa standard to embed ontology fragments and semantic annotations within its HTML-based user interface. The application and underlying ontologies demonstrate how Semantic Web technologies can be used for information integration within a curated information repository and between curated information repositories. It also demonstrates how information integration can be accomplished on the client side, through simple copying and pasting of portions of documents that contain RDFa markup.
KA-SB: from data integration to large scale reasoning
Roldán-García, María del Mar; Navas-Delgado, Ismael; Kerzazi, Amine; Chniber, Othmane; Molina-Castro, Joaquín; Aldana-Montes, José F
2009-01-01
Background The analysis of information in the biological domain is usually focused on the analysis of data from single on-line data sources. Unfortunately, studying a biological process requires having access to dispersed, heterogeneous, autonomous data sources. In this context, an analysis of the information is not possible without the integration of such data. Methods KA-SB is a querying and analysis system for final users based on combining a data integration solution with a reasoner. Thus, the tool has been created with a process divided into two steps: 1) KOMF, the Khaos Ontology-based Mediator Framework, is used to retrieve information from heterogeneous and distributed databases; 2) the integrated information is crystallized in a (persistent and high performance) reasoner (DBOWL). This information could be further analyzed later (by means of querying and reasoning). Results In this paper we present a novel system that combines the use of a mediation system with the reasoning capabilities of a large scale reasoner to provide a way of finding new knowledge and of analyzing the integrated information from different databases, which is retrieved as a set of ontology instances. This tool uses a graphical query interface to build user queries easily, which shows a graphical representation of the ontology and allows users to build queries by clicking on the ontology concepts. Conclusion These kinds of systems (based on KOMF) will provide users with very large amounts of information (interpreted as ontology instances once retrieved), which cannot be managed using traditional main memory-based reasoners. We propose a process for creating persistent and scalable knowledgebases from sets of OWL instances obtained by integrating heterogeneous data sources with KOMF. This process has been applied to develop a demo tool, which uses the BioPax Level 3 ontology as the integration schema, and integrates UNIPROT, KEGG, CHEBI, BRENDA and SABIORK databases. PMID:19796402
Ontology-based, Tissue MicroArray oriented, image centered tissue bank
Viti, Federica; Merelli, Ivan; Caprera, Andrea; Lazzari, Barbara; Stella, Alessandra; Milanesi, Luciano
2008-01-01
Background Tissue MicroArray technique is becoming increasingly important in pathology for the validation of experimental data from transcriptomic analysis. This approach produces many images which need to be properly managed, if possible with an infrastructure able to support tissue sharing between institutes. Moreover, the available frameworks oriented to Tissue MicroArray provide good storage for clinical patient, sample treatment and block construction information, but their utility is limited by the lack of data integration with biomolecular information. Results In this work we propose a Tissue MicroArray web oriented system to support researchers in managing bio-samples and, through the use of ontologies, enables tissue sharing aimed at the design of Tissue MicroArray experiments and results evaluation. Indeed, our system provides ontological description both for pre-analysis tissue images and for post-process analysis image results, which is crucial for information exchange. Moreover, working on well-defined terms it is then possible to query web resources for literature articles to integrate both pathology and bioinformatics data. Conclusions Using this system, users associate an ontology-based description to each image uploaded into the database and also integrate results with the ontological description of biosequences identified in every tissue. Moreover, it is possible to integrate the ontological description provided by the user with a full compliant gene ontology definition, enabling statistical studies about correlation between the analyzed pathology and the most commonly related biological processes. PMID:18460177
Ontological Issues in Higher Levels of Information Fusion: User Refinement of the Fusion Process
2003-01-01
fusion question, the thing that is separates the Greek We explore the higher-level purpose of fusion systems by philosophical questions and modern day...the The Greeks focused on both data fusion and the Fusion02 conference there are common fusion questions philosophical questions of an ontology - the...data World of Visible Things Belief (pistis) fusion - user refinement. The rest of the paper is as Appearances follows: Section 2 details the Greek
G-Bean: an ontology-graph based web tool for biomedical literature retrieval
2014-01-01
Background Currently, most people use NCBI's PubMed to search the MEDLINE database, an important bibliographical information source for life science and biomedical information. However, PubMed has some drawbacks that make it difficult to find relevant publications pertaining to users' individual intentions, especially for non-expert users. To ameliorate the disadvantages of PubMed, we developed G-Bean, a graph based biomedical search engine, to search biomedical articles in MEDLINE database more efficiently. Methods G-Bean addresses PubMed's limitations with three innovations: (1) Parallel document index creation: a multithreaded index creation strategy is employed to generate the document index for G-Bean in parallel; (2) Ontology-graph based query expansion: an ontology graph is constructed by merging four major UMLS (Version 2013AA) vocabularies, MeSH, SNOMEDCT, CSP and AOD, to cover all concepts in National Library of Medicine (NLM) database; a Personalized PageRank algorithm is used to compute concept relevance in this ontology graph and the Term Frequency - Inverse Document Frequency (TF-IDF) weighting scheme is used to re-rank the concepts. The top 500 ranked concepts are selected for expanding the initial query to retrieve more accurate and relevant information; (3) Retrieval and re-ranking of documents based on user's search intention: after the user selects any article from the existing search results, G-Bean analyzes user's selections to determine his/her true search intention and then uses more relevant and more specific terms to retrieve additional related articles. The new articles are presented to the user in the order of their relevance to the already selected articles. Results Performance evaluation with 106 OHSUMED benchmark queries shows that G-Bean returns more relevant results than PubMed does when using these queries to search the MEDLINE database. 
PubMed could not even return any search result for some OHSUMED queries because it failed to form the appropriate Boolean query statement automatically from the natural language query strings. G-Bean is available at http://bioinformatics.clemson.edu/G-Bean/index.php. Conclusions G-Bean addresses PubMed's limitations with ontology-graph based query expansion, automatic document indexing, and user search intention discovery. It shows significant advantages in finding relevant articles from the MEDLINE database to meet the information need of the user. PMID:25474588
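The ontology-graph query expansion step relies on Personalized PageRank to score concepts relative to the query. A minimal power-iteration sketch is given below; it is not G-Bean's code. The toy graph, the damping factor of 0.85, the iteration count, and the function name are assumptions made for illustration only.

```python
from typing import Dict, List


def personalized_pagerank(
    graph: Dict[str, List[str]],
    seeds: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power-iteration Personalized PageRank on a directed graph.

    `graph` maps each node to its out-neighbours. `seeds` holds the
    non-negative restart weights (e.g. concepts found in the query);
    they are normalised to sum to 1.
    """
    total = sum(seeds.values())
    if total <= 0:
        raise ValueError("seeds must contain at least one positive weight")
    nodes = set(graph) | set(seeds)
    nodes |= {n for nbrs in graph.values() for n in nbrs}
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # Every node first receives its share of the restart probability.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            out = graph.get(node, [])
            if out:
                share = damping * rank[node] / len(out)
                for nbr in out:
                    nxt[nbr] += share
            else:
                # Dangling node: send its mass back to the restart set.
                for n in nodes:
                    nxt[n] += damping * rank[node] * restart[n]
        rank = nxt
    return rank
```

Concepts reachable from the seed concepts receive non-zero scores, while unrelated parts of the graph stay at zero, which is what makes the ranking "personalized" to the query.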
G-Bean: an ontology-graph based web tool for biomedical literature retrieval.
Wang, James Z; Zhang, Yuanyuan; Dong, Liang; Li, Lin; Srimani, Pradip K; Yu, Philip S
2014-01-01
Currently, most people use NCBI's PubMed to search the MEDLINE database, an important bibliographical information source for life science and biomedical information. However, PubMed has some drawbacks that make it difficult to find relevant publications pertaining to users' individual intentions, especially for non-expert users. To ameliorate the disadvantages of PubMed, we developed G-Bean, a graph based biomedical search engine, to search biomedical articles in MEDLINE database more efficiently. G-Bean addresses PubMed's limitations with three innovations: (1) Parallel document index creation: a multithreaded index creation strategy is employed to generate the document index for G-Bean in parallel; (2) Ontology-graph based query expansion: an ontology graph is constructed by merging four major UMLS (Version 2013AA) vocabularies, MeSH, SNOMEDCT, CSP and AOD, to cover all concepts in National Library of Medicine (NLM) database; a Personalized PageRank algorithm is used to compute concept relevance in this ontology graph and the Term Frequency - Inverse Document Frequency (TF-IDF) weighting scheme is used to re-rank the concepts. The top 500 ranked concepts are selected for expanding the initial query to retrieve more accurate and relevant information; (3) Retrieval and re-ranking of documents based on user's search intention: after the user selects any article from the existing search results, G-Bean analyzes user's selections to determine his/her true search intention and then uses more relevant and more specific terms to retrieve additional related articles. The new articles are presented to the user in the order of their relevance to the already selected articles. Performance evaluation with 106 OHSUMED benchmark queries shows that G-Bean returns more relevant results than PubMed does when using these queries to search the MEDLINE database. 
PubMed could not even return any search result for some OHSUMED queries because it failed to form the appropriate Boolean query statement automatically from the natural language query strings. G-Bean is available at http://bioinformatics.clemson.edu/G-Bean/index.php. G-Bean addresses PubMed's limitations with ontology-graph based query expansion, automatic document indexing, and user search intention discovery. It shows significant advantages in finding relevant articles from the MEDLINE database to meet the information need of the user.
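The abstract names TF-IDF as the weighting scheme used to re-rank concepts but does not say which variant is used. The sketch below shows the textbook form (relative term frequency times the natural log of inverse document frequency); the function name and the tokenised input format are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} dict per tokenised document.

    weight = (count of term in doc / doc length) * ln(N / df(term)),
    where N is the number of documents and df the document frequency.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

With this definition a term that occurs in every document gets weight zero, so only terms that discriminate between documents influence the ranking.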
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jarocki, John Charles; Zage, David John; Fisher, Andrew N.
LinkShop is a software tool for applying the method of Linkography to the analysis of time-sequence data. LinkShop provides command line, web, and application programming interfaces (API) for input and processing of time-sequence data, abstraction models, and ontologies. The software creates graph representations of the abstraction model, ontology, and derived linkograph. Finally, the tool allows the user to perform statistical measurements of the linkograph and refine the ontology through direct manipulation of the linkograph.
A methodological approach for designing a usable ontology-based GUI in healthcare.
Lasierra, N; Kushniruk, A; Alesanco, A; Borycki, E; García, J
2013-01-01
This paper presents a methodological approach to the design and evaluation of an interface for an ontology-based system used for designing care plans for monitoring patients at home. In order to define the care plans, physicians need a tool for creating instances of the ontology and configuring some rules. Our purpose is to develop an interface to allow clinicians to interact with the ontology. Although ontology-driven applications do not necessarily present the ontology in the user interface, it is our hypothesis that showing selected parts of the ontology in a "usable" way could enhance clinicians' understanding and make the definition of the care plans easier. Based on prototyping and iterative testing, this methodology combines visualization techniques and usability methods. Preliminary results obtained after a formative evaluation indicate the effectiveness of the suggested combination.
A multi-ontology approach to annotate scientific documents based on a modularization technique.
Gomes, Priscilla Corrêa E Castro; Moura, Ana Maria de Carvalho; Cavalcanti, Maria Cláudia
2015-12-01
Scientific text annotation has become an important task for biomedical scientists. Nowadays, there is an increasing need for the development of intelligent systems to support new scientific findings. Public databases available on the Web provide useful data, but much more useful information is only accessible in scientific texts. Text annotation may help as it relies on the use of ontologies to maintain annotations based on a uniform vocabulary. However, it is difficult to use an ontology, especially one that covers a large domain. In addition, since scientific texts explore multiple domains, which are covered by distinct ontologies, it becomes even more difficult to deal with such a task. Moreover, there are dozens of ontologies in the biomedical area, and they are usually big in terms of the number of concepts. It is in this context that ontology modularization can be useful. This work presents an approach to annotate scientific documents using modules of different ontologies, which are built according to a module extraction technique. The main idea is to analyze a set of single-ontology annotations on a text to find out the user interests. Based on these annotations a set of modules are extracted from a set of distinct ontologies, and are made available for the user, for complementary annotation. The reduced size and focus of the extracted modules tend to facilitate the annotation task. An experiment was conducted to evaluate this approach, with the participation of a bioinformatician specialist of the Laboratory of Peptides and Proteins of the IOC/Fiocruz, who was interested in discovering new drug targets aiming at the combat of tropical diseases. Copyright © 2015 Elsevier Inc. All rights reserved.
WebProtégé: a collaborative Web-based platform for editing biomedical ontologies.
Horridge, Matthew; Tudorache, Tania; Nyulas, Csongor; Vendetti, Jennifer; Noy, Natalya F; Musen, Mark A
2014-08-15
WebProtégé is an open-source Web application for editing OWL 2 ontologies. It contains several features to aid collaboration, including support for the discussion of issues, change notification and revision-based change tracking. WebProtégé also features a simple user interface, which is geared towards editing the kinds of class descriptions and annotations that are prevalent throughout biomedical ontologies. Moreover, it is possible to configure the user interface using views that are optimized for editing Open Biomedical Ontology (OBO) class descriptions and metadata. Some of these views are shown in the Supplementary Material and can be seen in WebProtégé itself by configuring the project as an OBO project. WebProtégé is freely available for use on the Web at http://webprotege.stanford.edu. It is implemented in Java and JavaScript using the OWL API and the Google Web Toolkit. All major browsers are supported. For users who do not wish to host their ontologies on the Stanford servers, WebProtégé is available as a Web app that can be run locally using a Servlet container such as Tomcat. Binaries, source code and documentation are available under an open-source license at http://protegewiki.stanford.edu/wiki/WebProtege. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Sahoo, Satya S; Ramesh, Priya; Welter, Elisabeth; Bukach, Ashley; Valdez, Joshua; Tatsuoka, Curtis; Bamps, Yvan; Stoll, Shelley; Jobst, Barbara C; Sajatovic, Martha
2016-10-01
We present Insight as an integrated database and analysis platform for epilepsy self-management research as part of the national Managing Epilepsy Well Network. Insight is the only available informatics platform for accessing and analyzing integrated data from multiple epilepsy self-management research studies with several new data management features and user-friendly functionalities. The features of Insight include, (1) use of Common Data Elements defined by members of the research community and an epilepsy domain ontology for data integration and querying, (2) visualization tools to support real time exploration of data distribution across research studies, and (3) an interactive visual query interface for provenance-enabled research cohort identification. The Insight platform contains data from five completed epilepsy self-management research studies covering various categories of data, including depression, quality of life, seizure frequency, and socioeconomic information. The data represents over 400 participants with 7552 data points. The Insight data exploration and cohort identification query interface has been developed using Ruby on Rails Web technology and open source Web Ontology Language Application Programming Interface to support ontology-based reasoning. We have developed an efficient ontology management module that automatically updates the ontology mappings each time a new version of the Epilepsy and Seizure Ontology is released. The Insight platform features a Role-based Access Control module to authenticate and effectively manage user access to different research studies. User access to Insight is managed by the Managing Epilepsy Well Network database steering committee consisting of representatives of all current collaborating centers of the Managing Epilepsy Well Network. 
New research studies are being continuously added to the Insight database and the size as well as the unique coverage of the dataset allows investigators to conduct aggregate data analysis that will inform the next generation of epilepsy self-management studies. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Ontology-Based High-Level Context Inference for Human Behavior Identification
Villalonga, Claudia; Razzaq, Muhammad Asif; Khan, Wajahat Ali; Pomares, Hector; Rojas, Ignacio; Lee, Sungyoung; Banos, Oresti
2016-01-01
Recent years have witnessed huge progress in the automatic identification of individual primitives of human behavior, such as activities or locations. However, the complex nature of human behavior demands more abstract contextual information for its analysis. This work presents an ontology-based method that combines low-level primitives of behavior, namely activity, locations and emotions, unprecedented to date, to intelligently derive more meaningful high-level context information. The paper contributes with a new open ontology describing both low-level and high-level context information, as well as their relationships. Furthermore, a framework building on the developed ontology and reasoning models is presented and evaluated. The proposed method proves to be robust while identifying high-level contexts even in the event of erroneously-detected low-level contexts. Despite reasonable inference times being obtained for a relevant set of users and instances, additional work is required to scale to long-term scenarios with a large number of users. PMID:27690050
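The idea of deriving a high-level context from low-level primitives (activity, location, emotion) can be illustrated with a plain rule matcher. This is a deliberately simplified stand-in for the paper's ontology and reasoner: the rule table, the context labels, and the function name below are all hypothetical and are not taken from the published ontology.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical rules: (high-level label, required low-level values).
# Rules are checked in order; more specific rules come first.
RULES: List[Tuple[str, Dict[str, str]]] = [
    ("Amusement", {"activity": "sitting", "location": "mall", "emotion": "happy"}),
    ("OfficeWork", {"activity": "sitting", "location": "office"}),
    ("Commuting", {"activity": "standing", "location": "transport"}),
]


def infer_context(low_level: Dict[str, str]) -> Optional[str]:
    """Return the first high-level context whose conditions all hold.

    `low_level` maps a primitive name (activity, location, emotion)
    to its detected value. Primitives a rule does not mention are
    ignored, so an unrelated misdetection does not block the match.
    """
    for label, required in RULES:
        if all(low_level.get(key) == value for key, value in required.items()):
            return label
    return None  # no rule applies
```

Because a rule only inspects the primitives it names, a wrong emotion reading does not prevent `OfficeWork` from being inferred; a real ontology reasoner would express the same constraints as class definitions rather than a Python table.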
Automated software system for checking the structure and format of ACM SIG documents
NASA Astrophysics Data System (ADS)
Mirza, Arsalan Rahman; Sah, Melike
2017-04-01
Microsoft (MS) Office Word is one of the most commonly used software tools for creating documents. MS Word 2007 and above uses XML to represent the structure of MS Word documents. Metadata about the documents are automatically created using Office Open XML (OOXML) syntax. We develop a new framework, called ADFCS (Automated Document Format Checking System), that takes advantage of the OOXML metadata in order to extract semantic information from MS Office Word documents. In particular, we develop a new ontology for Association for Computing Machinery (ACM) Special Interest Group (SIG) documents for representing the structure and format of these documents by using OWL (Web Ontology Language). Then, the metadata is extracted automatically in RDF (Resource Description Framework) according to this ontology using the developed software. Finally, we generate extensive rules in order to infer whether the documents are formatted according to ACM SIG standards. This paper introduces the ACM SIG ontology, metadata extraction process, inference engine, ADFCS online user interface, system evaluation and user study evaluations.
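Since a .docx file is a ZIP archive whose main part is `word/document.xml`, the kind of structural metadata mentioned above can be read with the Python standard library alone. The sketch below is not ADFCS; it only shows one plausible first step (listing each paragraph's style identifier and text). The function name and the "(default)" placeholder for unstyled paragraphs are assumptions.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List, Tuple

# WordprocessingML main namespace used by OOXML documents.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def paragraph_styles(docx_bytes: bytes) -> List[Tuple[str, str]]:
    """Return (style id, text) for every paragraph in a .docx file."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    result = []
    for para in root.iter(f"{{{W}}}p"):
        # The style id lives in <w:pPr><w:pStyle w:val="..."/></w:pPr>.
        style = para.find("w:pPr/w:pStyle", NS)
        style_id = style.get(f"{{{W}}}val") if style is not None else "(default)"
        # Paragraph text is spread over <w:t> elements inside runs.
        text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
        result.append((style_id, text))
    return result
```

A format checker could compare the returned style identifiers against the styles a template requires; font sizes and margins would need further parts of the package, such as `word/styles.xml`.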
Whetzel, Patricia L; Noy, Natalya F; Shah, Nigam H; Alexander, Paul R; Nyulas, Csongor; Tudorache, Tania; Musen, Mark A
2011-07-01
The National Center for Biomedical Ontology (NCBO) is one of the National Centers for Biomedical Computing funded under the NIH Roadmap Initiative. Contributing to the national computing infrastructure, NCBO has developed BioPortal, a web portal that provides access to a library of biomedical ontologies and terminologies (http://bioportal.bioontology.org) via the NCBO Web services. BioPortal enables community participation in the evaluation and evolution of ontology content by providing features to add mappings between terms, to add comments linked to specific ontology terms and to provide ontology reviews. The NCBO Web services (http://www.bioontology.org/wiki/index.php/NCBO_REST_services) enable this functionality and provide a uniform mechanism to access ontologies from a variety of knowledge representation formats, such as Web Ontology Language (OWL) and Open Biological and Biomedical Ontologies (OBO) format. The Web services provide multi-layered access to the ontology content, from getting all terms in an ontology to retrieving metadata about a term. Users can easily incorporate the NCBO Web services into software applications to generate semantically aware applications and to facilitate structured data collection.
Federated ontology-based queries over cancer data
2012-01-01
Background Personalised medicine provides patients with treatments that are specific to their genetic profiles. It requires efficient data sharing of disparate data types across a variety of scientific disciplines, such as molecular biology, pathology, radiology and clinical practice. Personalised medicine aims to offer the safest and most effective therapeutic strategy based on the gene variations of each subject. In particular, this is valid in oncology, where knowledge about genetic mutations has already led to new therapies. Current molecular biology techniques (microarrays, proteomics, epigenetic technology and improved DNA sequencing technology) enable better characterisation of cancer tumours. The vast amounts of data, however, coupled with the use of different terms - or semantic heterogeneity - in each discipline makes the retrieval and integration of information difficult. Results Existing software infrastructures for data-sharing in the cancer domain, such as caGrid, support access to distributed information. caGrid follows a service-oriented model-driven architecture. Each data source in caGrid is associated with metadata at increasing levels of abstraction, including syntactic, structural, reference and domain metadata. The domain metadata consists of ontology-based annotations associated with the structural information of each data source. However, caGrid's current querying functionality is given at the structural metadata level, without capitalising on the ontology-based annotations. This paper presents the design of and theoretical foundations for distributed ontology-based queries over cancer research data. Concept-based queries are reformulated to the target query language, where join conditions between multiple data sources are found by exploiting the semantic annotations. The system has been implemented, as a proof of concept, over the caGrid infrastructure. The approach is applicable to other model-driven architectures. 
A graphical user interface has been developed, supporting ontology-based queries over caGrid data sources. An extensive evaluation of the query reformulation technique is included. Conclusions To support personalised medicine in oncology, it is crucial to retrieve and integrate molecular, pathology, radiology and clinical data in an efficient manner. The semantic heterogeneity of the data makes this a challenging task. Ontologies provide a formal framework to support querying and integration. This paper provides an ontology-based solution for querying distributed databases over service-oriented, model-driven infrastructures. PMID:22373043
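The abstract above says that join conditions between data sources are found by exploiting semantic annotations. The following is a minimal sketch of that idea only, not the paper's reformulation algorithm or the caGrid API: it assumes each attribute carries a single concept identifier, and every source name, attribute name and concept ID is hypothetical.

```python
from itertools import combinations

# Hypothetical annotations: data source -> attribute -> ontology concept ID.
# None of these names come from caGrid; they are invented for illustration.
ANNOTATIONS = {
    "pathology_db": {"patient_id": "C:Patient", "tumour_grade": "C:Grade"},
    "genomics_db": {"subject": "C:Patient", "gene_symbol": "C:Gene"},
    "imaging_db": {"pid": "C:Patient", "modality": "C:Modality"},
}


def find_join_conditions(annotations):
    """Return (source_a, attr_a, source_b, attr_b, concept) for every pair of
    attributes in different sources that share the same concept annotation."""
    joins = []
    for (src_a, attrs_a), (src_b, attrs_b) in combinations(sorted(annotations.items()), 2):
        for attr_a, concept_a in sorted(attrs_a.items()):
            for attr_b, concept_b in sorted(attrs_b.items()):
                if concept_a == concept_b:
                    joins.append((src_a, attr_a, src_b, attr_b, concept_a))
    return joins


def to_predicates(joins):
    """Render each candidate join as a readable equality predicate."""
    return [f"{a}.{x} = {b}.{y}" for a, x, b, y, _ in joins]


if __name__ == "__main__":
    for predicate in to_predicates(find_join_conditions(ANNOTATIONS)):
        print(predicate)
```

A real system would also need to check that the matched attributes are type-compatible and that the shared concept actually identifies the same entities, which this sketch does not attempt.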
Huang, Jingshan; Gutierrez, Fernando; Strachan, Harrison J; Dou, Dejing; Huang, Weili; Smith, Barry; Blake, Judith A; Eilbeck, Karen; Natale, Darren A; Lin, Yu; Wu, Bin; Silva, Nisansa de; Wang, Xiaowei; Liu, Zixing; Borchert, Glen M; Tan, Ming; Ruttenberg, Alan
2016-01-01
As a special class of non-coding RNAs (ncRNAs), microRNAs (miRNAs) perform important roles in numerous biological and pathological processes. The realization of miRNA functions depends largely on how miRNAs regulate specific target genes. It is therefore critical to identify, analyze, and cross-reference miRNA-target interactions to better explore and delineate miRNA functions. Semantic technologies can help in this regard. We previously developed a miRNA domain-specific application ontology, Ontology for MIcroRNA Target (OMIT), whose goal was to serve as a foundation for semantic annotation, data integration, and semantic search in the miRNA field. In this paper we describe our continuing effort to develop the OMIT, and demonstrate its use within a semantic search system, OmniSearch, designed to facilitate knowledge capture of miRNA-target interaction data. Important changes in the current version of OMIT are summarized as: (1) following a modularized ontology design (with 2559 terms imported from the NCRO ontology); (2) encoding all 1884 human miRNAs (vs. 300 in previous versions); and (3) setting up a GitHub project site along with an issue tracker for more effective community collaboration on the ontology development. The OMIT ontology is free and open to all users, accessible at: http://purl.obolibrary.org/obo/omit.owl. The OmniSearch system is also free and open to all users, accessible at: http://omnisearch.soc.southalabama.edu/index.php/Software.
A four stage approach for ontology-based health information system design.
Kuziemsky, Craig E; Lau, Francis
2010-11-01
To describe and illustrate a four stage methodological approach to capture user knowledge in a biomedical domain area, use that knowledge to design an ontology, and then implement and evaluate the ontology as a health information system (HIS). A hybrid participatory design-grounded theory (GT-PD) method was used to obtain data and code them for ontology development. Prototyping was used to implement the ontology as a computer-based tool. Usability testing evaluated the computer-based tool. An empirically derived domain ontology and set of three problem-solving approaches were developed as a formalized model of the concepts and categories from the GT coding. The ontology and problem-solving approaches were used to design and implement a HIS that tested favorably in usability testing. The four stage approach illustrated in this paper is useful for designing and implementing an ontology as the basis for a HIS. The approach extends existing ontology development methodologies by providing an empirical basis for theory incorporated into ontology design. Copyright © 2010 Elsevier B.V. All rights reserved.
Supporting ontology adaptation and versioning based on a graph of relevance
NASA Astrophysics Data System (ADS)
Sassi, Najla; Jaziri, Wassim; Alharbi, Saad
2016-11-01
Ontologies have recently become a topic of interest in computer science since they are seen as a semantic support to make explicit and enrich data models as well as to ensure interoperability of data. Moreover, supporting ontology adaptation becomes essential, mainly when using ontologies in changing environments. An important issue when dealing with ontology adaptation is the management of several versions. Ontology versioning is a complex and multifaceted problem as it should take into account change management, version storage and access, consistency issues, etc. The purpose of this paper is to propose an approach and tool for ontology adaptation and versioning. A series of techniques are proposed to 'safely' evolve a given ontology and produce a new consistent version. The ontology versions are ordered in a graph according to their relevance. The relevance is computed based on four criteria: conceptualisation, usage frequency, abstraction and completeness. The techniques to carry out the versioning process are implemented in the Consistology tool, which has been developed to assist users in expressing adaptation requirements and managing ontology versions.
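The abstract names four relevance criteria but does not say how they are combined. As a hedged illustration only, the sketch below assumes each criterion has already been scored on a 0 to 1 scale and combines them with a weighted average; the scale, the equal default weights and the flat ranking (rather than the graph used by the authors) are all assumptions, and none of this reflects the Consistology tool itself.

```python
from dataclasses import dataclass

# The four criteria named in the abstract; the identifiers are our own spelling.
CRITERIA = ("conceptualisation", "usage_frequency", "abstraction", "completeness")


@dataclass
class Version:
    name: str
    scores: dict  # criterion -> value in [0, 1] (assumed scale)


def relevance(version, weights=None):
    """Weighted average of the four criterion scores (assumed aggregation)."""
    weights = weights or {c: 1.0 for c in CRITERIA}
    total = sum(weights[c] for c in CRITERIA)
    return sum(weights[c] * version.scores[c] for c in CRITERIA) / total


def rank(versions, weights=None):
    """Order versions from most to least relevant."""
    return sorted(versions, key=lambda v: relevance(v, weights), reverse=True)


if __name__ == "__main__":
    v1 = Version("v1", {c: 0.5 for c in CRITERIA})
    v2 = Version("v2", {"conceptualisation": 1.0, "usage_frequency": 0.0,
                        "abstraction": 1.0, "completeness": 1.0})
    for v in rank([v1, v2]):
        print(v.name, round(relevance(v), 3))
```

Passing a different `weights` dictionary changes the ordering, which is one simple way to explore how sensitive a ranking is to each criterion.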
Samwald, Matthias; Lim, Ernest; Masiar, Peter; Marenco, Luis; Chen, Huajun; Morse, Thomas; Mutalik, Pradeep; Shepherd, Gordon; Miller, Perry; Cheung, Kei-Hoi
2013-01-01
The amount of biomedical data available in Semantic Web formats has been rapidly growing in recent years. While these formats are machine-friendly, user-friendly web interfaces allowing easy querying of these data are typically lacking. We present “Entrez Neuron”, a pilot neuron-centric interface that allows for keyword-based queries against a coherent repository of OWL ontologies. These ontologies describe neuronal structures, physiology, mathematical models and microscopy images. The returned query results are organized hierarchically according to brain architecture. Where possible, the application makes use of entities from the Open Biomedical Ontologies (OBO) and the ‘HCLS knowledgebase’ developed by the W3C Interest Group for Health Care and Life Science. It makes use of the emerging RDFa standard to embed ontology fragments and semantic annotations within its HTML-based user interface. The application and underlying ontologies demonstrate how Semantic Web technologies can be used for information integration within a curated information repository and between curated information repositories. They also demonstrate how information integration can be accomplished on the client side, through simple copying and pasting of portions of documents that contain RDFa markup. PMID:19745321
Context-aware recommender system based on ontology for recommending tourist destinations at Bandung
NASA Astrophysics Data System (ADS)
Rizaldy Hafid Arigi, L.; Abdurahman Baizal, Z. K.; Herdiani, Anisa
2018-03-01
A recommender system is software that provides personalized recommendations suited to users' needs. Recommender systems have been widely implemented in various domains, including tourism. One approach to more personalized recommendations is the use of contextual information. This paper proposes a context-aware, ontology-based recommender system in the tourism domain. The system is capable of recommending tourist destinations by using user preferences for tourism categories and contextual information such as user location, weather around tourist destinations and the closing time of a destination. Based on the evaluation, the system has an accuracy of 0.94 (item recommendation precision evaluated by an expert) and 0.58 (implicitly, from system-end user interaction). Based on the evaluation of user satisfaction, the system provides a satisfaction level of more than 0.7 (scale 0 to 1) for the factors of speed in providing liked recommendations (PE), informative description of recommendations (INF) and user trust (TR).
Enhancing Users' Participation in Business Process Modeling through Ontology-Based Training
NASA Astrophysics Data System (ADS)
Macris, A.; Malamateniou, F.; Vassilacopoulos, G.
Successful business process design requires active participation of users who are familiar with organizational activities and business process modelling concepts. Hence, there is a need to provide users with reusable, flexible, agile and adaptable training material in order to enable them to instil their knowledge and expertise in business process design and automation activities. Knowledge reusability is of paramount importance in designing training material on process modelling since it enables users to participate actively in process design/redesign activities stimulated by the changing business environment. This paper presents a prototype approach for the design and use of training material that provides significant advantages to both the designer (knowledge - content reusability and semantic web enabling) and the user (semantic search, knowledge navigation and knowledge dissemination). The approach is based on externalizing domain knowledge in the form of ontology-based knowledge networks (i.e. training scenarios serving specific training needs) so that it is made reusable.
An Ontology-Based Approach to Incorporate User-Generated Geo-Content Into SDI
NASA Astrophysics Data System (ADS)
Deng, D.-P.; Lemmens, R.
2011-08-01
The Web is changing the way people share and communicate information because of the emergence of various Web technologies, which enable people to contribute information on the Web. User-Generated Geo-Content (UGGC) is a potential resource of geographic information. Due to the different production methods, UGGC often cannot fit into formal geographic information models. There is a semantic gap between UGGC and formal geographic information. To integrate UGGC into geographic information, this study conducts an ontology-based process to bridge this semantic gap. This ontology-based process includes five steps: Collection, Extraction, Formalization, Mapping, and Deployment. In addition, this study applies this process to Twitter messages relevant to the Japan earthquake disaster. By using this process, we extract disaster relief information from Twitter messages and develop a knowledge base for GeoSPARQL queries on disaster relief information.
Semi-automated ontology generation and evolution
NASA Astrophysics Data System (ADS)
Stirtzinger, Anthony P.; Anken, Craig S.
2009-05-01
Extending the notion of data models or object models, ontology can provide rich semantic definition not only to the meta-data but also to the instance data of domain knowledge, making these semantic definitions available in machine readable form. However, the generation of an effective ontology is a difficult task involving considerable labor and skill. This paper discusses an Ontology Generation and Evolution Processor (OGEP) aimed at automating this process, only requesting user input when un-resolvable ambiguous situations occur. OGEP directly attacks the main barrier which prevents automated (or self learning) ontology generation: the ability to understand the meaning of artifacts and the relationships the artifacts have to the domain space. OGEP leverages existing lexical to ontological mappings in the form of WordNet, and Suggested Upper Merged Ontology (SUMO) integrated with a semantic pattern-based structure referred to as the Semantic Grounding Mechanism (SGM) and implemented as a Corpus Reasoner. The OGEP processing is initiated by a Corpus Parser performing a lexical analysis of the corpus, reading in a document (or corpus) and preparing it for processing by annotating words and phrases. After the Corpus Parser is done, the Corpus Reasoner uses the parts of speech output to determine the semantic meaning of a word or phrase. The Corpus Reasoner is the crux of the OGEP system, analyzing, extrapolating, and evolving data from free text into cohesive semantic relationships. The Semantic Grounding Mechanism provides a basis for identifying and mapping semantic relationships. By blending together the WordNet lexicon and SUMO ontological layout, the SGM is given breadth and depth in its ability to extrapolate semantic relationships between domain entities. The combination of all these components results in an innovative approach to user assisted semantic-based ontology generation. 
This paper will describe the OGEP technology in the context of the architectural components referenced above and identify a potential technology transition path to Scott AFB's Tanker Airlift Control Center (TACC) which serves as the Air Operations Center (AOC) for the Air Mobility Command (AMC).
Dead simple OWL design patterns
DOE Office of Scientific and Technical Information (OSTI.GOV)
Osumi-Sutherland, David; Courtot, Melanie; Balhoff, James P.
Bio-ontologies typically require multiple axes of classification to support the needs of their users. Development of such ontologies can only be made scalable and sustainable by the use of inference to automate classification via consistent patterns of axiomatization. Many bio-ontologies originating in OBO or OWL follow this approach. These patterns need to be documented in a form that requires minimal expertise to understand and edit and that can be validated and applied using any of the various programmatic approaches to working with OWL ontologies. We describe a system, Dead Simple OWL Design Patterns (DOS-DPs), which fulfills these requirements, illustrating the system with examples from the Gene Ontology. In conclusion, the rapid adoption of DOS-DPs by multiple ontology development projects illustrates both the ease of use and the pressing need for the simple design pattern system we have developed.
Dead simple OWL design patterns
Osumi-Sutherland, David; Courtot, Melanie; Balhoff, James P.; ...
2017-06-05
Bio-ontologies typically require multiple axes of classification to support the needs of their users. Development of such ontologies can only be made scalable and sustainable by the use of inference to automate classification via consistent patterns of axiomatization. Many bio-ontologies originating in OBO or OWL follow this approach. These patterns need to be documented in a form that requires minimal expertise to understand and edit and that can be validated and applied using any of the various programmatic approaches to working with OWL ontologies. We describe a system, Dead Simple OWL Design Patterns (DOS-DPs), which fulfills these requirements, illustrating the system with examples from the Gene Ontology. In conclusion, the rapid adoption of DOS-DPs by multiple ontology development projects illustrates both the ease of use and the pressing need for the simple design pattern system we have developed.
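To make the idea of a simple, programmatically applicable design pattern concrete, here is a small sketch of filling a text template with variable bindings. It is loosely modelled on the printf-style templates the abstract's system is known for, but the dictionary layout, field names and example terms below are illustrative assumptions rather than the DOS-DP specification, which should be consulted for the real schema.

```python
# Hypothetical pattern: field names and structure are assumptions for this sketch.
PATTERN = {
    "pattern_name": "part_of_location",
    "vars": ["entity", "location"],
    "name": {"text": "%s of %s", "vars": ["entity", "location"]},
    "equivalentTo": {
        "text": "%s and ('part of' some %s)",
        "vars": ["entity", "location"],
    },
}


def fill(field, bindings):
    """Substitute the bound values into one printf-style template field."""
    values = tuple(bindings[v] for v in field["vars"])
    return field["text"] % values


def instantiate(pattern, bindings):
    """Produce a label and a class expression string for one set of bindings."""
    missing = [v for v in pattern["vars"] if v not in bindings]
    if missing:
        raise KeyError(f"unbound variables: {missing}")
    return {key: fill(pattern[key], bindings) for key in ("name", "equivalentTo")}


if __name__ == "__main__":
    print(instantiate(PATTERN, {"entity": "epithelium", "location": "lung"}))
```

Because the same pattern generates both the label and the logical definition, every term built from it is axiomatized consistently, which is what lets a reasoner automate classification.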
OMOGENIA: A Semantically Driven Collaborative Environment
NASA Astrophysics Data System (ADS)
Liapis, Aggelos
Ontology creation can be thought of as a social procedure. Indeed the concepts involved in general need to be elicited from communities of domain experts and end-users by teams of knowledge engineers. Many problems in ontology creation appear to resemble certain problems in software design, particularly with respect to the setup of collaborative systems. For instance, the resolution of conceptual conflicts between formalized ontologies is a major engineering problem as ontologies move into widespread use on the semantic web. Such conflict resolution often requires human collaboration and cannot be achieved by automated methods with the exception of simple cases. In this chapter we discuss research in the field of computer-supported cooperative work (CSCW) that focuses on classification and which throws light on ontology building. Furthermore, we present a semantically driven collaborative environment called OMOGENIA as a natural way to display and examine the structure of an evolving ontology in a collaborative setting.
An ontology based trust verification of software license agreement
NASA Astrophysics Data System (ADS)
Lu, Wenhuan; Li, Xiaoqing; Gan, Zengqin; Wei, Jianguo
2017-08-01
When software is installed or downloaded, users are presented with a lengthy document stating their rights and obligations, which many people lack the patience to read or understand. This can lead users to distrust the software. In this paper, we propose an ontology-based verification approach for software license agreements. First, we propose an ontology model for the software license agreement domain. The domain ontology is constructed with the proposed methodology from copyright laws and 30 software license agreements. The License Ontology can act as part of a generalized copyright law knowledge model and can also serve as a visualization of software licenses. Based on this ontology, a software-license-oriented text summarization approach is proposed, and its performance shows that it can improve the accuracy of software license summarization. Based on the summarization, the underlying purpose of the software license can be explicitly explored for trust verification.
Semantic technologies improving the recall and precision of the Mercury metadata search engine
NASA Astrophysics Data System (ADS)
Pouchard, L. C.; Cook, R. B.; Green, J.; Palanisamy, G.; Noy, N.
2011-12-01
The Mercury federated metadata system [1] was developed at the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC), a NASA-sponsored effort holding datasets about biogeochemical dynamics, ecological data, and environmental processes. Mercury currently indexes over 100,000 records from several data providers conforming to community standards, e.g. EML, FGDC, FGDC Biological Profile, ISO 19115 and DIF. With the breadth of sciences represented in Mercury, the potential exists to address some key interdisciplinary scientific challenges related to climate change, its environmental and ecological impacts, and mitigation of these impacts. However, this wealth of metadata also hinders pinpointing datasets relevant to a particular inquiry. We implemented a semantic solution after concluding that traditional search approaches cannot improve the accuracy of the search results in this domain because: a) unlike everyday queries, scientific queries seek to return specific datasets with numerous parameters that may or may not be exposed to search (Deep Web queries); b) the relevance of a dataset cannot be judged by its popularity, as each scientific inquiry tends to be unique; and c) each domain science has its own terminology, more or less curated, consensual, and standardized depending on the domain. The same terms may refer to different concepts across domains (homonyms), and different terms may mean the same thing (synonyms). Interdisciplinary research is arduous because an expert in a domain must become fluent in the language of another, just to find relevant datasets. Thus, we decided to use scientific ontologies because they can provide a context for a free-text search, in a way that string-based keywords never will. With added context, relevant datasets are more easily discoverable. To enable search and programmatic access to ontology entities in Mercury, we are using an instance of the BioPortal ontology repository.
Mercury accesses ontology entities using the BioPortal REST API by passing a search parameter to BioPortal that may return domain context, parameter attribute, or entity annotations depending on the entity's associated ontological relationships. As Mercury's facetted search is popular with users, the results are displayed as facets. Unlike a facetted search however, the ontology-based solution implements both restrictions (improving precision) and expansions (improving recall) on the results of the initial search. For instance, "carbon" acquires a scientific context and additional key terms or phrases for discovering domain-specific datasets. A limitation of our solution is that the user must perform an additional step. Another limitation is that the quality of the newly discovered metadata is contingent upon the quality of the ontologies we use. Our solution leverages Mercury's federated capabilities to collect records from heterogeneous domains, and BioPortal's storage, curation and access capabilities for ontology entities. With minimal additional development, our approach builds on two mature systems for finding relevant datasets for interdisciplinary inquiries. We thus indicate a path forward for linking environmental, ecological and biological sciences. References: [1] Devarakonda, R., Palanisamy, G., Wilson, B. E., & Green, J. M. (2010). Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics, 3(1-2), 87-94.
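The abstract describes expansions (improving recall) and restrictions (improving precision) applied to an initial search. The sketch below illustrates that two-step idea with a toy in-memory ontology; it does not call BioPortal or Mercury, and the ontology entries, record titles and function names are all hypothetical.

```python
# Toy stand-in for ontology lookups; the related terms and domains are invented.
ONTOLOGY = {
    "carbon": {
        "related": ["CO2", "soil organic matter"],
        "domain": "biogeochemistry",
    },
}

# Hypothetical metadata records.
RECORDS = [
    {"id": 1, "title": "Soil carbon measurements", "domain": "biogeochemistry"},
    {"id": 2, "title": "Carbon fibre test data", "domain": "materials"},
    {"id": 3, "title": "CO2 flux tower observations", "domain": "biogeochemistry"},
    {"id": 4, "title": "Forest biomass plots", "domain": "ecology"},
]


def plain_search(term, records):
    """Baseline: substring match on the literal search term only."""
    return [r["id"] for r in records if term.lower() in r["title"].lower()]


def expand(term):
    """Expansion step: add ontology-related phrases to the query."""
    entry = ONTOLOGY.get(term.lower(), {})
    return [term] + entry.get("related", [])


def semantic_search(term, records, restrict_domain=None):
    """Match any expanded phrase, then optionally keep one domain only."""
    phrases = [p.lower() for p in expand(term)]
    hits = [r for r in records if any(p in r["title"].lower() for p in phrases)]
    if restrict_domain is not None:
        hits = [r for r in hits if r["domain"] == restrict_domain]
    return [r["id"] for r in hits]


if __name__ == "__main__":
    print(plain_search("carbon", RECORDS))
    print(semantic_search("carbon", RECORDS))
    print(semantic_search("carbon", RECORDS, restrict_domain="biogeochemistry"))
```

In this toy data the expansion recovers a record that never mentions the literal term, and the restriction then drops a record from an unrelated domain, mirroring the recall and precision effects the abstract describes.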
Ontological Problem-Solving Framework for Dynamically Configuring Sensor Systems and Algorithms
Qualls, Joseph; Russomanno, David J.
2011-01-01
The deployment of ubiquitous sensor systems and algorithms has led to many challenges, such as matching sensor systems to compatible algorithms which are capable of satisfying a task. Compounding the challenges is the lack of the requisite knowledge models needed to discover sensors and algorithms and to subsequently integrate their capabilities to satisfy a specific task. A novel ontological problem-solving framework has been designed to match sensors to compatible algorithms to form synthesized systems, which are capable of satisfying a task and then assigning the synthesized systems to high-level missions. The approach designed for the ontological problem-solving framework has been instantiated in the context of a persistence surveillance prototype environment, which includes profiling sensor systems and algorithms to demonstrate proof-of-concept principles. Even though the problem-solving approach was instantiated with profiling sensor systems and algorithms, the ontological framework may be useful with other heterogeneous sensing-system environments. PMID:22163793
A top-level ontology of functions and its application in the Open Biomedical Ontologies.
Burek, Patryk; Hoehndorf, Robert; Loebe, Frank; Visagie, Johann; Herre, Heinrich; Kelso, Janet
2006-07-15
A clear understanding of functions in biology is a key component in accurate modelling of molecular, cellular and organismal biology. Using the existing biomedical ontologies it has been impossible to capture the complexity of the community's knowledge about biological functions. We present here a top-level ontological framework for representing knowledge about biological functions. This framework lends greater accuracy, power and expressiveness to biomedical ontologies by providing a means to capture existing functional knowledge in a more formal manner. An initial major application of the ontology of functions is the provision of a principled way in which to curate functional knowledge and annotations in biomedical ontologies. Further potential applications include the facilitation of ontology interoperability and automated reasoning. A major advantage of the proposed implementation is that it is an extension to existing biomedical ontologies, and can be applied without substantial changes to these domain ontologies. The Ontology of Functions (OF) can be downloaded in OWL format from http://onto.eva.mpg.de/. Additionally, a UML profile and supplementary information and guides for using the OF can be accessed from the same website.
Gene Ontology Consortium: going forward
2015-01-01
The Gene Ontology (GO; http://www.geneontology.org) is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge. Here we describe improvements and expansions to several branches of the ontology, as well as updates that have allowed us to more efficiently disseminate the GO and capture feedback from the research community. The Gene Ontology Consortium (GOC) has expanded areas of the ontology such as cilia-related terms, cell-cycle terms and multicellular organism processes. We have also implemented new tools for generating ontology terms based on a set of logical rules making use of templates, and we have made efforts to increase our use of logical definitions. The GOC has a new and improved web site summarizing new developments and documentation, serving as a portal to GO data. Users can perform GO enrichment analysis, and search the GO for terms, annotations to gene products, and associated metadata across multiple species using the all-new AmiGO 2 browser. We encourage and welcome the input of the research community in all biological areas in our continued effort to improve the Gene Ontology. PMID:25428369
The agent-based spatial information semantic grid
NASA Astrophysics Data System (ADS)
Cui, Wei; Zhu, YaQiong; Zhou, Yong; Li, Deren
2006-10-01
Based on an analysis of the characteristics of multi-agent systems and geographic ontology, the concept of the Agent-based Spatial Information Semantic Grid (ASISG) is defined and its architecture is proposed. ASISG is composed of multi-agent systems and a geographic ontology. The multi-agent systems are composed of User Agents, a General Ontology Agent, Geo-Agents, Broker Agents, Resource Agents, Spatial Data Analysis Agents, Spatial Data Access Agents, a Task Execution Agent and a Monitor Agent. The architecture of ASISG has three layers: the fabric layer, the grid management layer and the application layer. The fabric layer, which is composed of the Data Access Agent, Resource Agent and Geo-Agent, encapsulates the data of the spatial information system so that it exposes a conceptual interface to the grid management layer. The grid management layer, which is composed of the General Ontology Agent, Task Execution Agent, Monitor Agent and Data Analysis Agent, uses a hybrid method to manage all resources registered in the General Ontology Agent, which is described by a General Ontology System. The hybrid method combines resource dissemination and resource discovery. Resource dissemination pushes resources from the Local Ontology Agents to the General Ontology Agent, and resource discovery pulls resources from the General Ontology Agent to the Local Ontology Agents. A Local Ontology Agent is derived from a specific domain and describes the semantic information of a local GIS. The Local Ontology Agents can be filtered by their nature to construct a virtual organization that provides a global schema. The virtual organization lightens the burden on users because they need not search for information site by site manually. The application layer, which is composed of the User Agent, Geo-Agent and Task Execution Agent, provides a corresponding interface to a domain user.
The functions that ASISG should provide are: 1) It integrates different spatial information systems on the semantic level. The grid management layer establishes a virtual environment that seamlessly integrates all GIS nodes. 2) When the resource management system searches data on different spatial information systems, it transfers the meaning held by the different Local Ontology Agents rather than accessing data directly, so search and query operate on the semantic level. 3) The data access procedure is transparent to users; that is, they can access information from a remote site as if it were on a local disk, because the General Ontology Agent automatically links data through the Data Agents that link ontology concepts to GIS data. 4) The capability of processing massive spatial data. Storing, accessing and managing massive spatial data from TB to PB; efficiently analyzing and processing spatial data to produce models, information and knowledge; and providing 3D and multimedia visualization services. 5) The capability of high-performance computing and processing on spatial information. Solving spatial problems with high precision, high quality, and on a large scale; and processing spatial information in real time or on time, with high speed and high efficiency. 6) The capability of sharing spatial resources. Distributed heterogeneous spatial information resources are shared, integrated and made interoperable on the semantic level, so as to make the best use of spatial information resources such as computing resources, storage devices, spatial data (integrated from GIS, RS and GPS), spatial applications and services, and GIS platforms. 7) The capability of integrating legacy GIS systems. An ASISG can be used not only to construct new advanced spatial application systems but also to integrate legacy GIS systems, so as to preserve extensibility and inheritance and protect users' investment. 8) The capability of collaboration.
Large-scale spatial information applications and services always involve different departments in different geographic places, so remote and uniform services are needed. 9) The capability of supporting integration of heterogeneous systems. Large-scale spatial information systems are always composite applications, so ASISG should provide interoperation and consistency by adopting open and applied technology standards. 10) The capability of adapting to dynamic changes. Business requirements, application patterns, management strategies, and IT products change constantly in any department, so ASISG should be self-adaptive. Two examples are provided in this paper; they show in detail how to design a semantic grid based on multi-agent systems and ontology. In conclusion, the semantic grid of spatial information systems could improve the integration and interoperability of the spatial information grid.
The BioPrompt-box: an ontology-based clustering tool for searching in biological databases.
Corsi, Claudio; Ferragina, Paolo; Marangoni, Roberto
2007-03-08
High-throughput molecular biology provides new data at an incredible rate, so that the increase in the size of biological databanks is enormous and very rapid. This scenario generates severe problems not only at indexing time, where suitable algorithmic techniques for data indexing and retrieval are required, but also at query time, since a user query may produce such a large set of results that their browsing and "understanding" becomes humanly impractical. This problem is well known to the Web community, where a new generation of Web search engines is being developed, like Vivisimo. These tools organize on-the-fly the results of a user query in a hierarchy of labeled folders that ease their browsing and knowledge extraction. We investigate this approach on biological data, and propose the so-called BioPrompt-box software system, which deploys ontology-driven clustering strategies for making the searching process of biologists more efficient and effective. The BioPrompt-box (Bpb) defines a document as a biological sequence plus its associated meta-data taken from the underlying databank--like references to ontologies or to external databanks, and plain texts as comments of researchers and (title, abstracts or even body of) papers. Bpb offers several tools to customize the search and the clustering process over its indexed documents. The user can search a set of keywords within a specific field of the document schema, or can execute Blast to find documents relative to homologous sequences. In both cases the search task returns a set of documents (hits) which constitute the answer to the user query. Since the number of hits may be large, Bpb clusters them into groups of homogeneous content, organized as a hierarchy of labeled clusters. The user can actually choose among several ontology-based hierarchical clustering strategies, each offering a different "view" of the returned hits.
Bpb computes these views by exploiting the meta-data present within the retrieved documents such as the references to Gene Ontology, the taxonomy lineage, the organism and the keywords. Of course, the approach is flexible enough to leave room for future additions of other meta-information. The ultimate goal of the clustering process is to provide the user with several different readings of the (maybe numerous) query results and show possible hidden correlations among them, thus improving their browsing and understanding. Bpb is a powerful search engine that makes it very easy to perform complex queries over the indexed databanks (currently only UNIPROT is considered). The ontology-based clustering approach is efficient and effective, and could thus be applied successfully to larger databanks, like GenBank or EMBL.
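The abstract describes grouping hits into a hierarchy of labeled clusters using ontology references in their meta-data. The sketch below shows one simple way to do this, by placing each hit in a folder for every annotated term and each of its ancestors. It is an illustration under assumed data, not Bpb's actual clustering strategy; the term hierarchy and hit identifiers are hypothetical.

```python
from collections import defaultdict

# Hypothetical is-a hierarchy (child -> parent), loosely in the style of
# Gene Ontology molecular function terms.
PARENT = {
    "kinase activity": "catalytic activity",
    "hydrolase activity": "catalytic activity",
    "catalytic activity": "molecular function",
    "DNA binding": "binding",
    "binding": "molecular function",
}

# Hypothetical hits: identifier -> ontology terms found in the hit's meta-data.
HITS = {
    "P1": ["kinase activity"],
    "P2": ["hydrolase activity"],
    "P3": ["DNA binding"],
    "P4": ["kinase activity", "DNA binding"],
}


def ancestors(term):
    """Yield the term itself and then each ancestor up to the root."""
    while term is not None:
        yield term
        term = PARENT.get(term)


def cluster(hits):
    """Map each folder label to the sorted hits annotated at or below it."""
    folders = defaultdict(set)
    for hit, terms in hits.items():
        for term in terms:
            for label in ancestors(term):
                folders[label].add(hit)
    return {label: sorted(members) for label, members in folders.items()}


if __name__ == "__main__":
    for label, members in sorted(cluster(HITS).items()):
        print(f"{label}: {members}")
```

A hit with several annotations appears in several folders, which is one way such a view can surface correlations among results that a flat list would hide.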
The BioPrompt-box: an ontology-based clustering tool for searching in biological databases
Corsi, Claudio; Ferragina, Paolo; Marangoni, Roberto
2007-01-01
Background High-throughput molecular biology provides new data at an incredible rate, so that the increase in the size of biological databanks is enormous and very rapid. This scenario generates severe problems not only at indexing time, where suitable algorithmic techniques for data indexing and retrieval are required, but also at query time, since a user query may produce such a large set of results that their browsing and "understanding" becomes humanly impractical. This problem is well known to the Web community, where a new generation of Web search engines is being developed, like Vivisimo. These tools organize on-the-fly the results of a user query in a hierarchy of labeled folders that ease their browsing and knowledge extraction. We investigate this approach on biological data, and propose the so-called BioPrompt-box software system, which deploys ontology-driven clustering strategies for making the searching process of biologists more efficient and effective. Results The BioPrompt-box (Bpb) defines a document as a biological sequence plus its associated meta-data taken from the underlying databank, such as references to ontologies or to external databanks, and plain texts such as comments of researchers and (title, abstracts or even body of) papers. Bpb offers several tools to customize the search and the clustering process over its indexed documents. The user can search a set of keywords within a specific field of the document schema, or can execute Blast to find documents related to homologous sequences. In both cases the search task returns a set of documents (hits) which constitute the answer to the user query. Since the number of hits may be large, Bpb clusters them into groups of homogeneous content, organized as a hierarchy of labeled clusters. The user can choose among several ontology-based hierarchical clustering strategies, each offering a different "view" of the returned hits.
Bpb computes these views by exploiting the meta-data present within the retrieved documents, such as the references to Gene Ontology, the taxonomy lineage, the organism and the keywords. Of course, the approach is flexible enough to leave room for future additions of other meta-information. The ultimate goal of the clustering process is to provide the user with several different readings of the (maybe numerous) query results and show possible hidden correlations among them, thus improving their browsing and understanding. Conclusion Bpb is a powerful search engine that makes it very easy to perform complex queries over the indexed databanks (currently only UNIPROT is considered). The ontology-based clustering approach is efficient and effective, and could thus be applied successfully to larger databanks, like GenBank or EMBL. PMID:17430575
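The idea of organizing hits into a hierarchy of labeled clusters from their taxonomy lineage can be illustrated with a minimal sketch. This is not the Bpb implementation: the hit records, their "id" and "lineage" fields, and the function names below are hypothetical and chosen only for illustration.

```python
# Minimal sketch (assumed data model): each hit carries a taxonomy lineage,
# and hits sharing a lineage prefix end up in the same labeled folder.

def cluster_by_lineage(hits):
    """Build a nested tree of clusters keyed by successive lineage labels."""
    root = {"hits": [], "children": {}}
    for hit in hits:
        node = root
        for label in hit["lineage"]:
            node = node["children"].setdefault(
                label, {"hits": [], "children": {}}
            )
        node["hits"].append(hit["id"])
    return root

def count_hits(node):
    """Count the hits stored in a cluster and in all of its sub-clusters."""
    return len(node["hits"]) + sum(
        count_hits(child) for child in node["children"].values()
    )

def render(node, depth=0):
    """Return indented 'label (count)' lines, one per cluster."""
    lines = []
    for label in sorted(node["children"]):
        child = node["children"][label]
        lines.append("  " * depth + f"{label} ({count_hits(child)})")
        lines.extend(render(child, depth + 1))
    return lines

# Hypothetical hits, not taken from any real databank.
hits = [
    {"id": "P1", "lineage": ["Eukaryota", "Metazoa"]},
    {"id": "P2", "lineage": ["Eukaryota", "Fungi"]},
    {"id": "P3", "lineage": ["Bacteria"]},
    {"id": "P4", "lineage": ["Eukaryota", "Metazoa"]},
]
tree = cluster_by_lineage(hits)
folder_view = render(tree)
```

Swapping the "lineage" field for another path-like annotation (for example, a chain of ontology ancestors) would give a different "view" of the same hits, which is the flexibility the abstract describes.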
NASA Astrophysics Data System (ADS)
Li, Y.; Jiang, Y.; Yang, C. P.; Armstrong, E. M.; Huang, T.; Moroni, D. F.; McGibbney, L. J.
2016-12-01
Big oceanographic data have been produced, archived and made available online, but finding the right data for scientific research and application development is still a significant challenge. A long-standing problem in data discovery is how to find the interrelationships between keywords and data, as well as the intrarelationships of the two individually. Most previous research attempted to solve this problem by building domain-specific ontologies either manually or through automatic machine learning techniques. The former is costly, labor intensive and hard to keep up-to-date, while the latter is prone to noise and may be difficult for humans to understand. Large-scale user behavior data modelling represents a largely untapped, unique, and valuable source for discovering semantic relationships among domain-specific vocabulary. In this article, we propose a search engine framework for mining and utilizing dataset relevancy from oceanographic dataset metadata, user behaviors, and existing ontologies. The objective is to improve the discovery accuracy of oceanographic data and reduce the time scientists need to discover, download and reformat data for their projects. Experiments and a search example show that the proposed search engine helps both scientists and general users search with better ranking results, recommendation, and ontology navigation.
OntoCAT -- simple ontology search and integration in Java, R and REST/JavaScript
2011-01-01
Background Ontologies have become an essential asset in the bioinformatics toolbox and a number of ontology access resources are now available, for example, the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately, map their contents to research data, and much of this effort is being replicated across different research groups. Results OntoCAT provides a seamless programming interface to query heterogeneous ontology resources including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy to learn Java, Bioconductor/R and REST web service commands enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App Engine application. Conclusions OntoCAT provides a robust, configurable solution for accessing ontology terms specified locally and from remote services, is available as a stand-alone tool and has been tested thoroughly in the ArrayExpress, MOLGENIS, EFO and Gen2Phen phenotype use cases. Availability http://www.ontocat.org PMID:21619703
OntoCAT--simple ontology search and integration in Java, R and REST/JavaScript.
Adamusiak, Tomasz; Burdett, Tony; Kurbatova, Natalja; Joeri van der Velde, K; Abeygunawardena, Niran; Antonakaki, Despoina; Kapushesky, Misha; Parkinson, Helen; Swertz, Morris A
2011-05-29
Ontologies have become an essential asset in the bioinformatics toolbox and a number of ontology access resources are now available, for example, the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately, map their contents to research data, and much of this effort is being replicated across different research groups. OntoCAT provides a seamless programming interface to query heterogeneous ontology resources including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy to learn Java, Bioconductor/R and REST web service commands enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App Engine application. OntoCAT provides a robust, configurable solution for accessing ontology terms specified locally and from remote services, is available as a stand-alone tool and has been tested thoroughly in the ArrayExpress, MOLGENIS, EFO and Gen2Phen phenotype use cases. http://www.ontocat.org.
ARMOUR - A Rice miRNA: mRNA Interaction Resource.
Sanan-Mishra, Neeti; Tripathi, Anita; Goswami, Kavita; Shukla, Rohit N; Vasudevan, Madavan; Goswami, Hitesh
2018-01-01
ARMOUR was developed as A Rice miRNA:mRNA interaction resource. This informative and interactive database includes the experimentally validated expression profiles of miRNAs under different developmental and abiotic stress conditions across seven Indian rice cultivars. This comprehensive database covers 689 known and 1664 predicted novel miRNAs and their expression profiles in more than 38 different tissues or conditions along with their predicted/known target transcripts. The understanding of miRNA:mRNA interactome in regulation of functional cellular machinery is supported by the sequence information of the mature and hairpin structures. ARMOUR provides flexibility to users in querying the database using multiple ways like known gene identifiers, gene ontology identifiers, KEGG identifiers and also allows on the fly fold change analysis and sequence search query with inbuilt BLAST algorithm. ARMOUR database provides a cohesive platform for novel and mature miRNAs and their expression in different experimental conditions and allows searching for their interacting mRNA targets, GO annotation and their involvement in various biological pathways. The ARMOUR database includes a provision for adding more experimental data from users, with an aim to develop it as a platform for sharing and comparing experimental data contributed by research groups working on rice.
ERIC Educational Resources Information Center
Zeng, Qingtian; Zhao, Zhongying; Liang, Yongquan
2009-01-01
User's knowledge requirement acquisition and analysis are very important for a personalized or user-adaptive learning system. Two approaches to capture user's knowledge requirement about course content within an e-learning system are proposed and implemented in this paper. The first approach is based on the historical data accumulated by an…
Ontology Negotiation Between Scientific Archives
NASA Technical Reports Server (NTRS)
Bailin, Sidney C.; Truszkowski, Walt; Obenschain, Arthur F. (Technical Monitor)
2001-01-01
This paper describes an approach to ontology negotiation between information agents. Ontologies are declarative (data driven) expressions of an agent's "world": the objects, operations, facts, and rules that constitute the logical space within which an agent performs. Ontology negotiation enables agents to cooperate in performing a task, even if they are based on different ontologies. The process allows agents to discover ontology conflicts and then, through incremental interpretation, clarification, and explanation, establish a common basis for communicating with each other. The need for ontology negotiation stems from the proliferation of information sources and of agents with widely varying specialty expertise. The unmanageability of massive amounts of web-based information is already becoming apparent. It is starting to have an impact on professions that rely on distributed archived information. If the expansion continues at its present rate without an ontology negotiation process being introduced, there will soon be no way to ensure the accuracy and completeness of information that scientists obtain from sources other than their own experiments. Ontology negotiation is becoming increasingly recognized as a crucial element of scalable agent technology. This is because agents, by their very nature, are supposed to operate with a fair amount of autonomy and independence from their end-users. Part of this independence is the ability to enlist other agents for help in performing a task (such as locating information on the web). The agents enlisted for help may be "owned" by a different end-user or organization (such as a document archive), and there is no guarantee that they will use the same terminology or understand the same concepts (objects, operators, theorems, rules) as the recruiting agent. For NASA, the need for ontology negotiation arises at the boundaries between scientific disciplines.
For example: modeling the effects of global warming might involve knowledge about imaging, climate analysis, ecology, demographics, industrial economics, and biology. The need for ontology negotiation also arises at the boundaries between scientific programs. For example, a Principal Investigator may want to use information from a previous mission to complement downloads from the instruments currently deployed.
Profiling structured product labeling with NDF-RT and RxNorm
2012-01-01
Background Structured Product Labeling (SPL) is a document markup standard approved by Health Level Seven (HL7) and adopted by United States Food and Drug Administration (FDA) as a mechanism for exchanging drug product information. The SPL drug labels contain rich information about FDA approved clinical drugs. However, the lack of linkage to standard drug ontologies hinders their meaningful use. NDF-RT (National Drug File Reference Terminology) and NLM RxNorm as standard drug ontology were used to standardize and profile the product labels. Methods In this paper, we present a framework that intends to map SPL drug labels with existing drug ontologies: NDF-RT and RxNorm. We also applied existing categorical annotations from the drug ontologies to classify SPL drug labels into corresponding classes. We established the classification and relevant linkage for SPL drug labels using the following three approaches. First, we retrieved NDF-RT categorical information from the External Pharmacologic Class (EPC) indexing SPLs. Second, we used the RxNorm and NDF-RT mappings to classify and link SPLs with NDF-RT categories. Third, we profiled SPLs using RxNorm term type information. In the implementation process, we employed a Semantic Web technology framework, in which we stored the data sets from NDF-RT and SPLs into a RDF triple store, and executed SPARQL queries to retrieve data from customized SPARQL endpoints. Meanwhile, we imported RxNorm data into MySQL relational database. Results In total, 96.0% SPL drug labels were mapped with NDF-RT categories whereas 97.0% SPL drug labels are linked to RxNorm codes. We found that the majority of SPL drug labels are mapped to chemical ingredient concepts in both drug ontologies whereas a relatively small portion of SPL drug labels are mapped to clinical drug concepts. 
Conclusions The profiling outcomes produced by this study would provide useful insights on meaningful use of FDA SPL drug labels in clinical applications through standard drug ontologies such as NDF-RT and RxNorm. PMID:23256517
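The Methods section above mentions loading data into an RDF triple store and retrieving it with SPARQL queries. As a rough illustration of what such a query does, here is a minimal pure-Python sketch of triple-pattern matching, the core operation behind a SPARQL basic graph pattern. It does not use a real triple store or SPARQL engine, and every identifier in the sample data (label:001, rx:A, ndfrt:ClassX, hasIngredient, mappedTo) is invented for illustration rather than taken from SPL, RxNorm, or NDF-RT.

```python
# Minimal sketch: an in-memory list of (subject, predicate, object) triples
# queried with patterns in which terms starting with "?" are variables.

def match(triples, pattern):
    """Yield variable bindings for a single triple pattern."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding

def query(triples, patterns):
    """Join several patterns on shared variables and return all solutions."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            # Substitute variables that earlier patterns already bound.
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(triples, bound):
                next_solutions.append({**solution, **binding})
        solutions = next_solutions
    return solutions

# Hypothetical data: labels linked to ingredients, ingredients linked to classes.
triples = [
    ("label:001", "hasIngredient", "rx:A"),
    ("label:002", "hasIngredient", "rx:B"),
    ("rx:A", "mappedTo", "ndfrt:ClassX"),
    ("rx:B", "mappedTo", "ndfrt:ClassY"),
    ("label:003", "hasIngredient", "rx:A"),
]

# "Which labels contain an ingredient that is mapped to ClassX?"
results = query(triples, [
    ("?label", "hasIngredient", "?drug"),
    ("?drug", "mappedTo", "ndfrt:ClassX"),
])
labels_in_class_x = sorted(solution["?label"] for solution in results)
```

A production setup would delegate this join to the triple store's SPARQL endpoint; the sketch only shows how linking labels to classes through an intermediate mapping reduces to joining patterns on a shared variable.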
Profiling structured product labeling with NDF-RT and RxNorm.
Zhu, Qian; Jiang, Guoqian; Chute, Christopher G
2012-12-20
Structured Product Labeling (SPL) is a document markup standard approved by Health Level Seven (HL7) and adopted by United States Food and Drug Administration (FDA) as a mechanism for exchanging drug product information. The SPL drug labels contain rich information about FDA approved clinical drugs. However, the lack of linkage to standard drug ontologies hinders their meaningful use. NDF-RT (National Drug File Reference Terminology) and NLM RxNorm as standard drug ontology were used to standardize and profile the product labels. In this paper, we present a framework that intends to map SPL drug labels with existing drug ontologies: NDF-RT and RxNorm. We also applied existing categorical annotations from the drug ontologies to classify SPL drug labels into corresponding classes. We established the classification and relevant linkage for SPL drug labels using the following three approaches. First, we retrieved NDF-RT categorical information from the External Pharmacologic Class (EPC) indexing SPLs. Second, we used the RxNorm and NDF-RT mappings to classify and link SPLs with NDF-RT categories. Third, we profiled SPLs using RxNorm term type information. In the implementation process, we employed a Semantic Web technology framework, in which we stored the data sets from NDF-RT and SPLs into a RDF triple store, and executed SPARQL queries to retrieve data from customized SPARQL endpoints. Meanwhile, we imported RxNorm data into MySQL relational database. In total, 96.0% SPL drug labels were mapped with NDF-RT categories whereas 97.0% SPL drug labels are linked to RxNorm codes. We found that the majority of SPL drug labels are mapped to chemical ingredient concepts in both drug ontologies whereas a relatively small portion of SPL drug labels are mapped to clinical drug concepts. 
The profiling outcomes produced by this study would provide useful insights on meaningful use of FDA SPL drug labels in clinical applications through standard drug ontologies such as NDF-RT and RxNorm.
Summarizing an Ontology: A "Big Knowledge" Coverage Approach.
Zheng, Ling; Perl, Yehoshua; Elhanan, Gai; Ochs, Christopher; Geller, James; Halper, Michael
2017-01-01
Maintenance and use of a large ontology, consisting of thousands of knowledge assertions, are hampered by its scope and complexity. It is important to provide tools for summarization of ontology content in order to facilitate user "big picture" comprehension. We present a parameterized methodology for the semi-automatic summarization of major topics in an ontology, based on a compact summary of the ontology, called an "aggregate partial-area taxonomy", followed by manual enhancement. An experiment is presented to test the effectiveness of such summarization measured by coverage of a given list of major topics of the corresponding application domain. SNOMED CT's Specimen hierarchy is the test-bed. A domain-expert provided a list of topics that serves as a gold standard. The enhanced results show that the aggregate taxonomy covers most of the domain's main topics.
The Orthology Ontology: development and applications.
Fernández-Breis, Jesualdo Tomás; Chiba, Hirokazu; Legaz-García, María Del Carmen; Uchiyama, Ikuo
2016-06-04
Computational comparative analysis of multiple genomes provides valuable opportunities to biomedical research. In particular, orthology analysis can play a central role in comparative genomics; it guides establishing evolutionary relations among genes of organisms and allows functional inference of gene products. However, the wide variations in current orthology databases necessitate the research toward the shareability of the content that is generated by different tools and stored in different structures. Exchanging the content with other research communities requires making the meaning of the content explicit. The need for a common ontology has led to the creation of the Orthology Ontology (ORTH) following the best practices in ontology construction. Here, we describe our model and major entities of the ontology that is implemented in the Web Ontology Language (OWL), followed by the assessment of the quality of the ontology and the application of the ORTH to existing orthology datasets. This shareable ontology enables the possibility to develop Linked Orthology Datasets and a meta-predictor of orthology through standardization for the representation of orthology databases. The ORTH is freely available in OWL format to all users at http://purl.org/net/orth . The Orthology Ontology can serve as a framework for the semantic standardization of orthology content and it will contribute to a better exploitation of orthology resources in biomedical research. The results demonstrate the feasibility of developing shareable datasets using this ontology. Further applications will maximize the usefulness of this ontology.
A Social Network System Based on an Ontology in the Korea Institute of Oriental Medicine
NASA Astrophysics Data System (ADS)
Kim, Sang-Kyun; Han, Jeong-Min; Song, Mi-Young
In this paper, we propose a social network based on an ontology for the Korea Institute of Oriental Medicine (KIOM). By using the social network, researchers can find collaborators and share research results with others, so that studies in Korean Medicine fields can be stimulated. For this purpose, personal profiles, scholarships, careers, licenses, academic activities, research results, and personal connections are first collected for all researchers in KIOM. After the relationships and hierarchy among ontology classes and the attributes of the classes are defined by analyzing the collected information, a social network ontology is constructed using FOAF and OWL. This ontology can be easily interconnected with other social networks through FOAF and supports reasoning based on the OWL ontology. In the future, we will construct a search and reasoning system using the ontology. Moreover, once the social network is active, we will open it to the whole Korean Medicine field.
Sojic, Aleksandra; Terkaj, Walter; Contini, Giorgia; Sacco, Marco
2016-05-04
The public health initiatives for obesity prevention are increasingly exploiting the advantages of smart technologies that can register various kinds of data related to physical, physiological, and behavioural conditions. Since individual features and habits vary among people, the design of appropriate intervention strategies for motivating changes in behavioural patterns towards a healthy lifestyle requires the interpretation and integration of collected information, while considering individual profiles in a personalised manner. The ontology-based modelling is recognised as a promising approach in facing the interoperability and integration of heterogeneous information related to characterisation of personal profiles. The presented ontology captures individual profiles across several obesity-related knowledge-domains structured into dedicated modules in order to support inference about health condition, physical features, behavioural habits associated with a person, and relevant changes over time. The modularisation strategy is designed to facilitate ontology development, maintenance, and reuse. The domain-specific modules formalised in the Web Ontology Language (OWL) integrate the domain-specific sets of rules formalised in the Semantic Web Rule Language (SWRL). The inference rules follow a modelling pattern designed to support personalised assessment of health condition as age- and gender-specific. The test cases exemplify a personalised assessment of the obesity-related health conditions for the population of teenagers. The paper addresses several issues concerning the modelling of normative concepts related to obesity and depicts how the public health concern impacts classification of teenagers according to their phenotypes. The modelling choices regarding the ontology-structure are explained in the context of the modelling goal to integrate multiple knowledge-domains and support reasoning about the individual changes over time. 
The presented modularisation pattern enhances reusability of the domain-specific modules across various health care domains.
Using a User-Interactive QA System for Personalized E-Learning
ERIC Educational Resources Information Center
Hu, Dawei; Chen, Wei; Zeng, Qingtian; Hao, Tianyong; Min, Feng; Wenyin, Liu
2008-01-01
A personalized e-learning framework based on a user-interactive question-answering (QA) system is proposed, in which a user-modeling approach is used to capture personal information of students and a personalized answer extraction algorithm is proposed for personalized automatic answering. In our approach, a topic ontology (or concept hierarchy)…
Development of Health Information Search Engine Based on Metadata and Ontology
Song, Tae-Min; Jin, Dal-Lae
2014-01-01
Objectives The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Methods Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. Results A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Conclusions Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers. PMID:24872907
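The metadata schema described above, the Dublin Core Metadata Element Set extended with one element for the target audience, can be sketched as a simple record structure. The fifteen element names are those of the Dublin Core element set; the "audience" key name, the helper functions, and the sample records are assumptions made for illustration and are not taken from the study.

```python
# Minimal sketch: Dublin Core's fifteen elements plus an assumed "audience" key.

DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)
SCHEMA = DUBLIN_CORE_ELEMENTS + ("audience",)

def make_record(**values):
    """Build a metadata record, rejecting elements outside the schema."""
    unknown = set(values) - set(SCHEMA)
    if unknown:
        raise ValueError(f"unknown elements: {sorted(unknown)}")
    return {name: values.get(name) for name in SCHEMA}

def search(records, subject=None, audience=None):
    """Return titles of records matching the given subject and/or audience."""
    found = []
    for record in records:
        if subject is not None and record["subject"] != subject:
            continue
        if audience is not None and record["audience"] != audience:
            continue
        found.append(record["title"])
    return found

# Hypothetical records for illustration only.
records = [
    make_record(title="Managing blood pressure at home",
                subject="hypertension", audience="patients"),
    make_record(title="Hypertension treatment guideline summary",
                subject="hypertension", audience="clinicians"),
    make_record(title="Eating well with diabetes",
                subject="diabetes", audience="patients"),
]
patient_hits = search(records, subject="hypertension", audience="patients")
```

Filtering on the audience element is what lets the same subject term return different material for different readers; in the study the subject vocabulary is mapped to SNOMED CT, which this sketch replaces with plain strings.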
Development of health information search engine based on metadata and ontology.
Song, Tae-Min; Park, Hyeoun-Ae; Jin, Dal-Lae
2014-04-01
The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers.
Incremental Ontology-Based Extraction and Alignment in Semi-structured Documents
NASA Astrophysics Data System (ADS)
Thiam, Mouhamadou; Bennacer, Nacéra; Pernelle, Nathalie; Lô, Moussa
SHIRI is an ontology-based system for the integration of semi-structured documents related to a specific domain. The system’s purpose is to allow users to access relevant parts of documents as answers to their queries. SHIRI uses RDF/OWL for the representation of resources and SPARQL for querying them. It relies on an automatic, unsupervised and ontology-driven approach for extraction, alignment and semantic annotation of tagged elements of documents. In this paper, we focus on the Extract-Align algorithm, which exploits a set of named entity and term patterns to extract term candidates to be aligned with the ontology. It proceeds in an incremental manner in order to populate the ontology with terms describing instances of the domain and to reduce access to external resources such as the Web. We evaluate it on an HTML corpus related to calls for papers in computer science, and the results that we obtain are very promising. These results show how the incremental behaviour of the Extract-Align algorithm enriches the ontology and how the number of terms (or named entities) aligned directly with the ontology increases.
Semantic Similarity between Web Documents Using Ontology
NASA Astrophysics Data System (ADS)
Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh
2018-06-01
The World Wide Web is a source of information available in the structure of interlinked web pages. However, extracting significant information with the assistance of a search engine is difficult, because web information is written mainly in natural language and is intended for human readers. Several efforts have been made at computing semantic similarity between documents using words, concepts and concept relationships, but the outcomes available still do not meet user requirements. This paper proposes a novel technique for computing semantic similarity between documents that takes into account not only the concepts available in the documents but also the relationships between those concepts. In our approach, documents are processed by building an ontology of each document using a base ontology and a dictionary containing concept records. Each such record is made up of the probable words which represent a given concept. Finally, the document ontologies are compared to find their semantic similarity, taking the relationships among concepts into account. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results compared to existing techniques.
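The abstract above does not give the similarity formula, so the following is only a sketch of the general idea of comparing documents by both their concepts and the relationships between those concepts. It assumes each document has already been reduced to a set of concepts and a set of (concept, relation, concept) triples, and it swaps in a weighted Jaccard overlap as the measure; the weight, the function names, and the sample documents are all assumptions.

```python
# Minimal sketch: similarity as a weighted mix of concept overlap and
# relation overlap (Jaccard index), not the measure used in the paper.

def jaccard(a, b):
    """Jaccard index of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def doc_similarity(doc_a, doc_b, relation_weight=0.5):
    """Combine concept-level and relation-level overlap into one score."""
    concept_sim = jaccard(doc_a["concepts"], doc_b["concepts"])
    relation_sim = jaccard(doc_a["relations"], doc_b["relations"])
    return (1 - relation_weight) * concept_sim + relation_weight * relation_sim

# Hypothetical document ontologies.
doc_a = {
    "concepts": {"car", "engine", "fuel"},
    "relations": {("car", "hasPart", "engine"), ("engine", "uses", "fuel")},
}
doc_b = {
    "concepts": {"car", "engine", "driver"},
    "relations": {("car", "hasPart", "engine"), ("driver", "drives", "car")},
}
score = doc_similarity(doc_a, doc_b)
concepts_only = doc_similarity(doc_a, doc_b, relation_weight=0.0)
```

With these inputs the concept overlap is 2/4 and the relation overlap is 1/3, so the relation-aware score is lower than the concepts-only score: two documents can share vocabulary while connecting the concepts differently.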
GONUTS: the Gene Ontology Normal Usage Tracking System
Renfro, Daniel P.; McIntosh, Brenley K.; Venkatraman, Anand; Siegele, Deborah A.; Hu, James C.
2012-01-01
The Gene Ontology Normal Usage Tracking System (GONUTS) is a community-based browser and usage guide for Gene Ontology (GO) terms and a community system for general GO annotation of proteins. GONUTS uses wiki technology to allow registered users to share and edit notes on the use of each term in GO, and to contribute annotations for specific genes of interest. By providing a site for generation of third-party documentation at the granularity of individual terms, GONUTS complements the official documentation of the Gene Ontology Consortium. To provide examples for community users, GONUTS displays the complete GO annotations from seven model organisms: Saccharomyces cerevisiae, Dictyostelium discoideum, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus and Arabidopsis thaliana. To support community annotation, GONUTS allows automated creation of gene pages for gene products in UniProt. GONUTS will improve the consistency of annotation efforts across genome projects, and should be useful in training new annotators and consumers in the production of GO annotations and the use of GO terms. GONUTS can be accessed at http://gowiki.tamu.edu. The source code for generating the content of GONUTS is available upon request. PMID:22110029
Semantic Similarity between Web Documents Using Ontology
NASA Astrophysics Data System (ADS)
Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh
2018-03-01
The World Wide Web is a source of information available in the structure of interlinked web pages. However, extracting significant information with the assistance of a search engine is difficult, because web information is written mainly in natural language and is intended for human readers. Several efforts have been made at computing semantic similarity between documents using words, concepts and concept relationships, but the outcomes available still do not meet user requirements. This paper proposes a novel technique for computing semantic similarity between documents that takes into account not only the concepts available in the documents but also the relationships between those concepts. In our approach, documents are processed by building an ontology of each document using a base ontology and a dictionary containing concept records. Each such record is made up of the probable words which represent a given concept. Finally, the document ontologies are compared to find their semantic similarity, taking the relationships among concepts into account. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results compared to existing techniques.
Gene Ontology Consortium: going forward.
2015-01-01
The Gene Ontology (GO; http://www.geneontology.org) is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge. Here we describe improvements and expansions to several branches of the ontology, as well as updates that have allowed us to more efficiently disseminate the GO and capture feedback from the research community. The Gene Ontology Consortium (GOC) has expanded areas of the ontology such as cilia-related terms, cell-cycle terms and multicellular organism processes. We have also implemented new tools for generating ontology terms based on a set of logical rules making use of templates, and we have made efforts to increase our use of logical definitions. The GOC has a new and improved web site summarizing new developments and documentation, serving as a portal to GO data. Users can perform GO enrichment analysis, and search the GO for terms, annotations to gene products, and associated metadata across multiple species using the all-new AmiGO 2 browser. We encourage and welcome the input of the research community in all biological areas in our continued effort to improve the Gene Ontology. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
The Plant Ontology: A Tool for Plant Genomics.
Cooper, Laurel; Jaiswal, Pankaj
2016-01-01
The use of controlled, structured vocabularies (ontologies) has become a critical tool for scientists in the post-genomic era of massive datasets. Adoption and integration of common vocabularies and annotation practices enables cross-species comparative analyses and increases data sharing and reusability. The Plant Ontology (PO; http://www.plantontology.org/) describes plant anatomy, morphology, and the stages of plant development, and offers a database of plant genomics annotations associated with the PO terms. The scope of the PO has grown from its original design covering only rice, maize, and Arabidopsis, and now includes terms to describe all green plants from angiosperms to green algae. This chapter introduces how the PO and other related ontologies are constructed and organized, including languages and software used for ontology development, and provides an overview of the key features. Detailed instructions illustrate how to search and browse the PO database and access the associated annotation data. Users are encouraged to provide input on the ontology through the online term request form and contribute datasets for integration in the PO database.
Tutorial on Protein Ontology Resources
Arighi, Cecilia; Drabkin, Harold; Christie, Karen R.; Ross, Karen; Natale, Darren
2017-01-01
The Protein Ontology (PRO) is the reference ontology for proteins in the Open Biomedical Ontologies (OBO) foundry and consists of three sub-ontologies representing protein classes of homologous genes, proteoforms (e.g., splice isoforms, sequence variants, and post-translationally modified forms), and protein complexes. PRO defines classes of proteins and protein complexes, both species-specific and species non-specific, and indicates their relationships in a hierarchical framework, supporting accurate protein annotation at the appropriate level of granularity, analyses of protein conservation across species, and semantic reasoning. In the first section of this chapter, we describe the PRO framework, including categories of PRO terms and the relationship of PRO to other ontologies and protein resources. Next, we provide a tutorial about the PRO website (proconsortium.org), where users can browse and search the PRO hierarchy, view reports on individual PRO terms, and visualize relationships among PRO terms in a hierarchical table view, a multiple sequence alignment view, and a Cytoscape network view. Finally, we describe several examples illustrating the unique and rich information available in PRO. PMID:28150233
Research on designing ontologies for location-based services
NASA Astrophysics Data System (ADS)
Cheng, Gang; Du, Qingyun; Cai, Zhongliang; Huang, Maojun; Zhao, Haiyun
2007-06-01
With the widespread application of Location-Based Services (LBS), the call for more semantic and accurate services is emerging. From a semantic viewpoint, the major characteristic of, and challenge for, LBS is the fact that they serve as a mediator between a possibly unknown user and possibly a priori unknown services. While some geographic information technology standards provide the basis for syntactic interoperability, they do not yet provide methods for dealing with problems of semantic heterogeneity. In this paper we design ontologies for LBS which are used for the identification and association of semantically corresponding concepts to overcome these semantic problems. In order to better understand the semantic content of the data in LBS, we analyze several elements of both the data and the services involved. Then, we model these data and services in a way that captures their peculiarities and allows them to be shared between users and services and exchanged among different LBS when desired. For this, we use the Protégé-OWL plug-in to create a hybrid hierarchy of ontologies that enhances the semantic content of both the user information and the services. To justify the design choices and show their applicability, we present a simple example from a characteristic real-world application.
Zhang, Bing; Schmoyer, Denise; Kirov, Stefan; Snoddy, Jay
2004-01-01
Background Microarray and other high-throughput technologies are producing large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in the gene sets. Results We have created a web-based tool for data analysis and data visualization for sets of genes called GOTree Machine (GOTM). This tool was originally intended to analyze sets of co-regulated genes identified from microarray analysis but is adaptable for use with other gene sets from other high-throughput analyses. GOTree Machine generates a GOTree, a tree-like structure to navigate the Gene Ontology Directed Acyclic Graph for input gene sets. This system provides user friendly data navigation and visualization. Statistical analysis helps users to identify the most important Gene Ontology categories for the input gene sets and suggests biological areas that warrant further study. GOTree Machine is available online at . Conclusion GOTree Machine has a broad application in functional genomic, proteomic and other high-throughput methods that generate large sets of interesting genes; its primary purpose is to help users sort for interesting patterns in gene sets. PMID:14975175
Information Pre-Processing using Domain Meta-Ontology and Rule Learning System
NASA Astrophysics Data System (ADS)
Ranganathan, Girish R.; Biletskiy, Yevgen
Around the globe, extraordinary numbers of documents are being created by Enterprises and by users outside these Enterprises. The documents created in the Enterprises constitute the main focus of the present chapter. These documents are subjected to a great deal of machine processing. When these documents are used for machine processing, the lack of semantics for the information they contain may cause that information to be misinterpreted, thereby inhibiting the productiveness of computer-assisted analytical work. Hence, it would be profitable for the Enterprises to use well-defined domain ontologies, which will serve as rich source(s) of semantics for the information in the documents. These domain ontologies can be created manually, semi-automatically or fully automatically. The focus of this chapter is to propose an intermediate solution which will enable relatively easy creation of these domain ontologies. The process of extracting and capturing domain ontologies from these voluminous documents requires extensive involvement of domain experts and application of ontology learning methods that are substantially labor intensive; therefore, some intermediate solutions which would assist in capturing domain ontologies must be developed. This chapter proposes a solution in this direction which involves building a meta-ontology that serves as an intermediate information source for the main domain ontology and as a rapid approach to conceptualizing a domain of interest from a huge amount of source documents. This meta-ontology can be populated with ontological concepts, attributes and relations from documents, and then refined in order to form a better domain ontology, either through automatic ontology learning methods or some other relevant ontology building approach.
An ontology for sensor networks
NASA Astrophysics Data System (ADS)
Compton, Michael; Neuhaus, Holger; Bermudez, Luis; Cox, Simon
2010-05-01
Sensors and networks of sensors are important ways of monitoring and digitizing reality. As the number and size of sensor networks grow, so too does the amount of data collected. Users of such networks typically need to discover the sensors and data that fit their needs without necessarily understanding the complexities of the network itself. The burden on users is eased if the network and its data are expressed in terms of concepts familiar to the users and their job functions, rather than in terms of the network or how it was designed. Furthermore, the task of collecting and combining data from multiple sensor networks is made easier if metadata about the data and the networks is stored in a format and conceptual model that are amenable to machine reasoning and inference. While the OGC's (Open Geospatial Consortium) SWE (Sensor Web Enablement) standards provide for the description of and access to data and metadata for sensors, they do not provide facilities for abstraction, categorization, and reasoning consistent with standard technologies. Once sensors and networks are described using rich semantics (that is, by using logic to describe the sensors, the domain of interest, and the measurements) then reasoning and classification can be used to analyse and categorise data, relate measurements with similar information content, and manage, query and task sensors. This will enable types of automated processing and logical assurance built on OGC standards. The W3C SSN-XG (Semantic Sensor Networks Incubator Group) is producing a generic ontology to describe sensors, their environment and the measurements they make. The ontology provides definitions for the structure of sensors and observations, leaving the details of the observed domain unspecified. This allows abstract representations of real world entities, which are not observed directly but through their observable qualities.
Domain semantics, units of measurement, time and time series, and location and mobility ontologies can be easily attached when instantiating the ontology for any particular sensors in a domain. After a review of previous work on the specification of sensors, the group is developing the ontology in conjunction with use case development. Part of the difficulty of such work is that relevant concepts from for example OGC standards and other ontologies must be identified and aligned and also placed in a consistent and logically correct way into the ontology. In terms of alignment with OGC's SWE, the ontology is intended to be able to model concepts from SensorML and O&M. Similar to SensorML and O&M, the ontology is based around concepts of systems, processes, and observations. It supports the description of the physical and processing structure of sensors. Sensors are not constrained to physical sensing devices: rather a sensor is anything that can estimate or calculate the value of a phenomenon, so a device or computational process or combination could play the role of a sensor. The representation of a sensor in the ontology links together what is measured (the domain phenomena), the sensor's physical and other properties and its functions and processing. Parts of the ontology are well aligned with SensorML and O&M, but parts are not, and the group is working to understand how differences from (and alignment with) the OGC standards affect the application of the ontology.
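The idea that a sensor is "anything that can estimate or calculate the value of a phenomenon" can be illustrated with a small data model. This is a hedged sketch only: the class and field names (`Sensor`, `Observation`, `estimate`, `observed_property`) are hypothetical and are not terms from the SSN-XG ontology, SensorML, or O&M.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Observation:
    """One observed value, linked back to the sensor that produced it."""
    observed_property: str   # the domain phenomenon, e.g. "air_temperature"
    value: float
    unit: str
    sensor_id: str


@dataclass
class Sensor:
    """Anything that can estimate a value: a device read-out or a computation."""
    sensor_id: str
    observed_property: str
    unit: str
    estimate: Callable[[], float]

    def observe(self) -> Observation:
        return Observation(self.observed_property, self.estimate(),
                           self.unit, self.sensor_id)


# Two stand-ins for physical devices (fixed readings, for illustration only).
thermo_1 = Sensor("thermo-1", "air_temperature", "Cel", lambda: 21.5)
thermo_2 = Sensor("thermo-2", "air_temperature", "Cel", lambda: 22.5)


def mean_of(sensors):
    """Average the current readings of several sensors."""
    return sum(s.observe().value for s in sensors) / len(sensors)


# A computational process playing the role of a sensor.
virtual = Sensor("mean-temp", "air_temperature", "Cel",
                 lambda: mean_of([thermo_1, thermo_2]))
obs = virtual.observe()
```

Because the device-backed and computed sensors share one interface, client code can query either without knowing how the value was produced, which mirrors the abstraction the abstract describes.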
A Process for the Representation of openEHR ADL Archetypes in OWL Ontologies.
Porn, Alex Mateus; Peres, Leticia Mara; Didonet Del Fabro, Marcos
2015-01-01
ADL is a formal language for expressing archetypes, independent of standards or domain. However, its specification is not precise enough regarding the specialization and semantics of archetypes, which makes implementation difficult and leaves few tools available. Archetypes may be implemented using other languages such as XML or OWL, increasing integration with Semantic Web tools. Exchanging and transforming data can be better implemented with semantics-oriented models, for example using OWL, a language defined by the W3C for defining and instantiating Web ontologies. OWL permits the user to define significant, detailed, precise and consistent distinctions among classes, properties and relations, ensuring greater consistency of knowledge than ADL techniques. This paper presents a process for representing openEHR ADL archetypes in OWL ontologies. The process consists of converting ADL archetypes into OWL ontologies and validating the resulting OWL ontologies using mutation testing.
Semi-automatic Data Integration using Karma
NASA Astrophysics Data System (ADS)
Garijo, D.; Kejriwal, M.; Pierce, S. A.; Houser, P. I. Q.; Peckham, S. D.; Stanko, Z.; Hardesty Lewis, D.; Gil, Y.; Pennington, D. D.; Knoblock, C.
2017-12-01
Data integration applications are ubiquitous in scientific disciplines. A state-of-the-art data integration system accepts both a set of data sources and a target ontology as input, and semi-automatically maps the data sources in terms of concepts and relationships in the target ontology. Mappings can be both complex and highly domain-specific. Once such a semantic model, expressing the mapping using community-wide standard, is acquired, the source data can be stored in a single repository or database using the semantics of the target ontology. However, acquiring the mapping is a labor-prone process, and state-of-the-art artificial intelligence systems are unable to fully automate the process using heuristics and algorithms alone. Instead, a more realistic goal is to develop adaptive tools that minimize user feedback (e.g., by offering good mapping recommendations), while at the same time making it intuitive and easy for the user to both correct errors and to define complex mappings. We present Karma, a data integration system that has been developed over multiple years in the information integration group at the Information Sciences Institute, a research institute at the University of Southern California's Viterbi School of Engineering. Karma is a state-of-the-art data integration tool that supports an interactive graphical user interface, and has been featured in multiple domains over the last five years, including geospatial, biological, humanities and bibliographic applications. Karma allows a user to import their own ontology and datasets using widely used formats such as RDF, XML, CSV and JSON, can be set up either locally or on a server, supports a native backend database for prototyping queries, and can even be seamlessly integrated into external computational pipelines, including those ingesting data via streaming data sources, Web APIs and SQL databases. 
We illustrate a Karma workflow at a conceptual level, along with a live demo, and show use cases of Karma specifically for the geosciences. In particular, we show how Karma can be used intuitively to obtain the mapping model between case study data sources and a publicly available and expressive target ontology that has been designed to capture a broad set of concepts in geoscience with standardized, easily searchable names.
ERIC Educational Resources Information Center
Choi, Yunseon
2016-01-01
Introduction: The purpose of this paper is to provide a framework for building a consumer health ontology using social tags. This would assist health users when they are accessing health information and increase the number of documents relevant to their needs. Methods: In order to extract concepts from social tags, this study conducted an…
Ontologies, Knowledge Bases and Knowledge Management
2002-07-01
AFRL-IF-RS-TR-2002-163 Final Technical Report July 2002 ONTOLOGIES, KNOWLEDGE BASES AND KNOWLEDGE MANAGEMENT USC Information ...and layer additional information necessary to make specific uses of the knowledge in this core. Finally, while we were able to find adequate solutions... knowledge base and inference engine. Figure 3.2: SDA Editor Interface 46 Although the SDA has access to information about the situation, we wanted the user
Semantic Data Access Services at NASA's Atmospheric Science Data Center
NASA Astrophysics Data System (ADS)
Huffer, E.; Hertz, J.; Kusterer, J.
2012-12-01
The corpus of Earth Science data products at the Atmospheric Science Data Center at NASA's Langley Research Center comprises a widely heterogeneous set of products, even among those whose subject matter is very similar. Two distinct data products may both contain data on the same parameter, for instance, solar irradiance; but the instruments used, and the circumstances under which the data were collected and processed, may differ significantly. Understanding the differences is critical to using the data effectively. Data distribution services must be able to provide prospective users with enough information to allow them to meaningfully compare and evaluate the data products offered. Semantic technologies - ontologies, triple stores, reasoners, linked data - offer functionality for addressing this issue. Ontologies can provide robust, high-fidelity domain models that serve as common schema for discovering, evaluating, comparing and integrating data from disparate products. Reasoning engines and triple stores can leverage ontologies to support intelligent search applications that allow users to discover, query, retrieve, and easily reformat data from a broad spectrum of sources. We argue that because of the extremely complex nature of scientific data, data distribution systems should wholeheartedly embrace semantic technologies in order to make their data accessible to a broad array of prospective end users, and to ensure that the data they provide will be clearly understood and used appropriately by consumers. 
Toward this end, we propose a distribution system in which formal ontological models that accurately and comprehensively represent the ASDC's data domain, and fully leverage the expressivity and inferential capabilities of first order logic, are used to generate graph-based representations of the relevant relationships among data sets, observational systems, metadata files, and geospatial, temporal and scientific parameters to help prospective data consumers navigate directly to relevant data sets and query, subset, retrieve and compare the measurement and calculation data they contain. A critical part of developing semantically-enabled data distribution capabilities is developing an ontology that adequately describes 1) the data products - their structure, their content, and any supporting documentation; 2) the data domain - the objects and processes that the products denote; and 3) the relationship between the data and the domain. The ontology, in addition, should be machine readable and capable of integrating with the larger data distribution system to provide an interactive user experience. We will demonstrate how a formal, high-fidelity, queriable ontology representing the atmospheric science domain objects and data products, together with a robust set of inference rules for generating interactive graphs, allows researchers to navigate quickly and painlessly through the large volume of data at the ASDC. Scientists will be able to discover data products that exactly meet their particular criteria, link to information about the instruments and processing methods that generated the data; and compare and contrast related products.
Knowledge acquisition and learning process description in context of e-learning
NASA Astrophysics Data System (ADS)
Kiselev, B. G.; Yakutenko, V. A.; Yuriev, M. A.
2017-01-01
This paper investigates the problem of designing e-learning and MOOC systems. It describes instructional design-based approaches to e-learning systems design: IMS Learning Design, MISA and TELOS. To solve this problem we present Knowledge Field of Educational Environment with Competence boundary conditions (KFEEC), an instructional engineering method for self-learning systems design. It is based on the simplified TELOS approach and enables users to create their individual learning paths by choosing prerequisite and target competencies. The paper provides the ontology model for the described instructional engineering method, real-life use cases and the classification of the presented model. The ontology model consists of 13 classes and 15 properties. Some of them are inherited from Knowledge Field of Educational Environment and some are new and describe competence boundary conditions and knowledge validation objects. The ontology model uses logical constraints and is described using the OWL 2 standard. To give TELOS users a better understanding of our approach, we list the mapping between TELOS and KFEEC.
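Building a learning path from prerequisite and target competencies can be sketched as simple forward chaining. This is an illustration under assumptions, not the method from the paper: the unit catalogue, the competency names, and the function `plan_learning_path` are all hypothetical, and the greedy strategy may include units that a minimal path would not need.

```python
def plan_learning_path(units, known, targets):
    """Greedy forward chaining over learning units.

    units   -- dict: unit name -> (required competencies, provided competencies)
    known   -- competencies the learner already has
    targets -- competencies the learner wants to reach
    Returns an ordered list of unit names, or None if the targets are unreachable.
    """
    acquired = set(known)
    remaining = dict(units)
    path = []
    while not set(targets) <= acquired:
        progressed = False
        for name in sorted(remaining):
            requires, provides = remaining[name]
            # Take the first unit that is unlocked and teaches something new.
            if requires <= acquired and not provides <= acquired:
                path.append(name)
                acquired |= provides
                del remaining[name]
                progressed = True
                break
        if not progressed:
            return None
    return path


# Hypothetical catalogue of units.
units = {
    "intro_programming": (set(), {"variables", "loops"}),
    "data_structures": ({"loops"}, {"lists", "trees"}),
    "algorithms": ({"lists", "trees"}, {"sorting"}),
    "web_design": (set(), {"html"}),
}

path = plan_learning_path(units, known=set(), targets={"sorting"})
unreachable = plan_learning_path(units, known=set(), targets={"quantum"})
```

In this example the path passes through the three units that lead to "sorting", and the unrelated "web_design" unit is never reached because the targets are satisfied first.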
SAFOD Brittle Microstructure and Mechanics Knowledge Base (SAFOD BM2KB)
NASA Astrophysics Data System (ADS)
Babaie, H. A.; Hadizadeh, J.; di Toro, G.; Mair, K.; Kumar, A.
2008-12-01
We have developed a knowledge base to store and present the data collected by a group of investigators studying the microstructures and mechanics of brittle faulting using core samples from the SAFOD (San Andreas Fault Observatory at Depth) project. The investigations are carried out with a variety of analytical and experimental methods primarily to better understand the physics of strain localization in fault gouge. The knowledge base instantiates a specially designed brittle rock deformation ontology developed at Georgia State University. The inference rules embedded in the semantic web languages, such as OWL, RDF, and RDFS, which are used in our ontology, allow the Pellet reasoner used in this application to derive additional truths about the ontology and knowledge of this domain. Access to the knowledge base is via a public website, which is designed to provide the knowledge acquired by all the investigators involved in the project. The stored data will be products of studies such as: experiments (e.g., high-velocity friction experiment), analyses (e.g., microstructural, chemical, mass transfer, mineralogical, surface, image, texture), microscopy (optical, HRSEM, FESEM, HRTEM), tomography, porosity measurement, microprobe, and cathodoluminescence. Data about laboratories, experimental conditions, methods, assumptions, equipment, and mechanical properties and lithology of the studied samples will also be presented on the website per investigation. The ontology was modeled applying the UML (Unified Modeling Language) in Rational Rose, and implemented in OWL-DL (Web Ontology Language) using the Protégé ontology editor. The UML model was converted to OWL-DL by first mapping it to Ecore (.ecore) and Generator model (.genmodel) with the help of the EMF (Eclipse Modeling Framework) plugin in Eclipse. The Ecore model was then mapped to a .uml file, which later was converted into an .owl file and subsequently imported into the Protégé ontology editing environment.
The web interface was developed in Java using Eclipse as the IDE. The web interfaces to query and submit data were implemented applying JSP, servlets, JavaScript, and AJAX. The Jena API, a Java framework for building Semantic Web applications, was used to develop the web interface. Jena provided a programmatic environment for RDF, RDFS, and OWL, and a SPARQL query engine. Building web applications with AJAX helps retrieve data from the server asynchronously in the background without interfering with the display and behavior of the existing page. The application was deployed on an Apache Tomcat server at GSU. The SAFOD BM2KB website provides user-friendly search, submit, feedback, and other services. The General Search option allows users to search the knowledge base by selecting the classes (e.g., Experiment, Surface Analysis), their respective attributes (e.g., apparatus, date performed), and the relationships to other classes (e.g., Sample, Laboratory). The Search by Sample option allows users to search the knowledge base based on sample number. The Search by Investigator option lets users search the knowledge base by choosing an investigator who is involved in this project. The website also allows users to submit new data. The Submit Data option opens a page where users can submit the SAFOD data to our knowledge base by selecting specific classes and attributes. The submitted data then become available for query as part of the knowledge base. The SAFOD BM2KB can be accessed from the main SAFOD website.
COMODI: an ontology to characterise differences in versions of computational models in biology.
Scharm, Martin; Waltemath, Dagmar; Mendes, Pedro; Wolkenhauer, Olaf
2016-07-11
Open model repositories provide ready-to-reuse computational models of biological systems. Models within those repositories evolve over time, leading to different model versions. Taken together, the underlying changes reflect a model's provenance and thus can give valuable insights into the studied biology. Currently, however, changes cannot be semantically interpreted. To improve this situation, we developed an ontology of terms describing changes in models. The ontology can be used by scientists and within software to characterise model updates at the level of single changes. When studying or reusing a model, these annotations help with determining the relevance of a change in a given context. We manually studied changes in selected models from BioModels and the Physiome Model Repository. Using the BiVeS tool for difference detection, we then performed an automatic analysis of changes in all models published in these repositories. The resulting set of concepts led us to define candidate terms for the ontology. In a final step, we aggregated and classified these terms and built the first version of the ontology. We present COMODI, an ontology needed because COmputational MOdels DIffer. It empowers users and software to describe changes in a model on the semantic level. COMODI also enables software to implement user-specific filter options for the display of model changes. Finally, COMODI is a step towards predicting how a change in a model influences the simulation results. COMODI, coupled with our algorithm for difference detection, ensures the transparency of a model's evolution, and it enhances the traceability of updates and error corrections. COMODI is encoded in OWL. It is openly available at http://comodi.sems.uni-rostock.de/ .
Przydzial, Magdalena J; Bhhatarai, Barun; Koleti, Amar; Vempati, Uma; Schürer, Stephan C
2013-12-15
Novel tools need to be developed to help scientists analyze large amounts of available screening data with the goal to identify entry points for the development of novel chemical probes and drugs. As the largest class of drug targets, G protein-coupled receptors (GPCRs) remain of particular interest and are pursued by numerous academic and industrial research projects. We report the first GPCR ontology to facilitate integration and aggregation of GPCR-targeting drugs and demonstrate its application to classify and analyze a large subset of the PubChem database. The GPCR ontology, based on previously reported BioAssay Ontology, depicts available pharmacological, biochemical and physiological profiles of GPCRs and their ligands. The novelty of the GPCR ontology lies in the use of diverse experimental datasets linked by a model to formally define these concepts. Using a reasoning system, GPCR ontology offers potential for knowledge-based classification of individuals (such as small molecules) as a function of the data. The GPCR ontology is available at http://www.bioassayontology.org/bao_gpcr and the National Center for Biomedical Ontologies Web site.
Tissue enrichment analysis for C. elegans genomics.
Angeles-Albores, David; N Lee, Raymond Y; Chan, Juancarlos; Sternberg, Paul W
2016-09-13
Over the last ten years, there has been explosive development in methods for measuring gene expression. These methods can identify thousands of genes altered between conditions, but understanding these datasets and forming hypotheses based on them remains challenging. One way to analyze these datasets is to associate ontologies (hierarchical, descriptive vocabularies with controlled relations between terms) with genes and to look for enrichment of specific terms. Although Gene Ontology (GO) is available for Caenorhabditis elegans, it does not include anatomical information. We have developed a tool for identifying enrichment of C. elegans tissues among gene sets and generated a website GUI where users can access this tool. Since a common drawback of ontology enrichment analyses is their verbosity, we developed a very simple filtering algorithm to reduce the ontology size by an order of magnitude. We adjusted these filters and validated our tool using a set of 30 gold standards from Expression Cluster data in WormBase. We show that our tool can discriminate between embryonic and larval tissues and can identify tissues down to the single-cell level. We used our tool to identify multiple neuronal tissues that are down-regulated due to pathogen infection in C. elegans. Our Tissue Enrichment Analysis (TEA) can be found within WormBase, and can be downloaded using Python's standard pip installer. It tests a slimmed-down C. elegans tissue ontology for enrichment of specific terms and provides users with a text and graphic representation of the results.
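Term enrichment of the kind described above is commonly tested with a hypergeometric tail probability. The abstract does not state which test the tool uses, so the sketch below is a generic illustration and does not use the TEA package or its API. The gene identifiers, the tissue annotations, and the function names are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k), where X counts annotated genes in a sample of N genes drawn
    without replacement from M background genes, n of which are annotated."""
    upper = min(n, N)
    total = comb(M, N)
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return tail / total


def enrichment(gene_set, annotations, background):
    """Return an uncorrected enrichment p-value for every ontology term."""
    background = set(background)
    gene_set = set(gene_set) & background
    results = {}
    for term, annotated in annotations.items():
        annotated = set(annotated) & background
        hits = len(gene_set & annotated)
        results[term] = hypergeom_sf(hits, len(background),
                                     len(annotated), len(gene_set))
    return results


# Hypothetical data: 20 background genes and two tissue terms.
background = [f"g{i}" for i in range(20)]
annotations = {
    "neuron": [f"g{i}" for i in range(0, 5)],
    "muscle": [f"g{i}" for i in range(5, 10)],
}
gene_set = ["g0", "g1", "g2", "g3", "g10"]

pvalues = enrichment(gene_set, annotations, background)
```

Four of the five input genes carry the "neuron" annotation, so that term receives a small p-value, while "muscle" has no hits and its p-value is 1.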
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arpin, Bettina Karin Schimanski; Jones, Brian S.; Bemesderfer, Joy
2010-06-01
Are your employees unhappy with internal corporate search? Frequent complaints include: too many results to sift through; results are unrelated/outdated; employees aren't sure which terms to search for. One way to improve intranet search is to implement a controlled vocabulary ontology. Employing this takes the guesswork out of searching, makes search efficient and precise, educates employees about the lingo used within the corporation, and allows employees to contribute to the corpus of terms. It promotes internal corporate search to rival its superior sibling, internet search. We will cover our experiences, lessons learned, and conclusions from implementing a controlled vocabulary ontology at Sandia National Laboratories. The work focuses on construction of this ontology from the content perspective and the technical perspective. We'll discuss the following: (1) The tool we used to build a polyhierarchical taxonomy; (2) Examples of two methods of indexing the content: traditional 'back of the book' and folksonomy word-mapping; (3) Tips on how to build future search capabilities while building the basic controlled vocabulary; (4) How to implement the controlled vocabulary as an ontology that mimics Google's search suggestions; (5) Making the user experience more interactive and intuitive; and (6) Sorting suggestions based on preferred, alternate and related terms using SPARQL queries. In summary, future improvements will be presented, including permitting end-users to add, edit and remove terms, and filtering on different subject domains.
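Item (6) above, ordering search suggestions by term type, can be illustrated without a triple store. The abstract describes doing this with SPARQL queries; the sketch below swaps in a plain in-memory sort in Python to show only the ranking idea. The vocabulary entries, `TERM_RANK`, and `suggest` are hypothetical names introduced for illustration.

```python
# Lower rank sorts first: preferred terms, then alternates, then related terms.
TERM_RANK = {"preferred": 0, "alternate": 1, "related": 2}


def suggest(prefix, vocabulary, limit=5):
    """Return labels that start with `prefix`, preferred terms first.

    vocabulary -- iterable of (label, kind) pairs, kind being a TERM_RANK key
    """
    prefix = prefix.lower()
    matches = [
        (TERM_RANK[kind], label.lower(), label)
        for label, kind in vocabulary
        if label.lower().startswith(prefix)
    ]
    # Sort by term type first, then alphabetically within each type.
    matches.sort()
    return [label for _, _, label in matches[:limit]]


# Hypothetical controlled vocabulary.
vocabulary = [
    ("laser optics", "related"),
    ("LASER", "alternate"),
    ("Laser", "preferred"),
    ("Laser safety", "preferred"),
    ("Lathe", "preferred"),
]

suggestions = suggest("las", vocabulary)
```

In a SPARQL-backed deployment the same ordering would typically be expressed in the query itself; the in-memory version is only meant to make the ranking rule concrete.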
Building an Ontology-driven Database for Clinical Immune Research
Ma, Jingming
2006-01-01
Clinical research on immune response usually generates a huge amount of biomedical testing data over a certain period of time. User-friendly data management systems based on relational databases help immunologists/clinicians to fully manage the data. On the other hand, the same biological assays, such as ELISPOT and flow cytometric assays, are involved in immunological experiments regardless of the study purpose. The reuse of biological knowledge is one of the driving forces behind this ontology-driven data management. Therefore, an ontology-driven database will help to handle different clinical immune research studies and help immunologists/clinicians easily understand each other's immunological data. We will discuss some outlines for building an ontology-driven data management system for clinical immune research (ODMim). PMID:17238637
The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery
2014-01-01
The Semanticscience Integrated Ontology (SIO) is an ontology to facilitate biomedical knowledge discovery. SIO features a simple upper level comprised of essential types and relations for the rich description of arbitrary (real, hypothesized, virtual, fictional) objects, processes and their attributes. SIO specifies simple design patterns to describe and associate qualities, capabilities, functions, quantities, and informational entities including textual, geometrical, and mathematical entities, and provides specific extensions in the domains of chemistry, biology, biochemistry, and bioinformatics. SIO provides an ontological foundation for the Bio2RDF linked data for the life sciences project and is used for semantic integration and discovery for SADI-based semantic web services. SIO is freely available to all users under a creative commons by attribution license. See website for further information: http://sio.semanticscience.org. PMID:24602174
AOP-DB Frontend: A user interface for the Adverse Outcome Pathways Database.
The EPA Adverse Outcome Pathway Database (AOP-DB) is a database resource that aggregates association relationships between AOPs, genes, chemicals, diseases, pathways, species orthology information, ontologies. The AOP-DB frontend is a simple yet powerful AOP-DB user interface in...
A bibliometric and visual analysis of global geo-ontology research
NASA Astrophysics Data System (ADS)
Li, Lin; Liu, Yu; Zhu, Haihong; Ying, Shen; Luo, Qinyao; Luo, Heng; Kuai, Xi; Xia, Hui; Shen, Hang
2017-02-01
In this paper, the results of a bibliometric and visual analysis of geo-ontology research articles collected from the Web of Science (WOS) database between 1999 and 2014 are presented. The numbers of national institutions and published papers are visualized and a global research heat map is drawn, illustrating an overview of global geo-ontology research. In addition, we present a chord diagram of countries and perform a visual cluster analysis of a knowledge co-citation network of references, disclosing potential academic communities and identifying key points, main research areas, and future research trends. The International Journal of Geographical Information Science, Progress in Human Geography, and Computers & Geosciences are the most active journals. The USA makes the largest contributions to geo-ontology research by virtue of its highest numbers of independent and collaborative papers, and its dominance was also confirmed in the country chord diagram. The majority of institutions are in the USA, Western Europe, and Eastern Asia. Wuhan University, University of Munster, and the Chinese Academy of Sciences are notable geo-ontology institutions. Keywords such as "Semantic Web," "GIS," and "space" have attracted a great deal of attention. "Semantic granularity in ontology-driven geographic information systems," "Ontologies in support of activities in geographical space," and "A translation approach to portable ontology specifications" have the highest cited centrality. Geographical space, computer-human interaction, and ontology cognition are the three main research areas of geo-ontology. The semantic mismatch between the producers and users of ontology data, as well as error propagation in interdisciplinary and cross-linguistic data reuse, need to be solved. In addition, the development of geo-ontology modeling primitives based on OWL (Web Ontology Language) and methods to automatically rework data in the Semantic Web are needed.
Furthermore, the topological relations between geographical entities still require further study.
Representing virus-host interactions and other multi-organism processes in the Gene Ontology.
Foulger, R E; Osumi-Sutherland, D; McIntosh, B K; Hulo, C; Masson, P; Poux, S; Le Mercier, P; Lomax, J
2015-07-28
The Gene Ontology project is a collaborative effort to provide descriptions of gene products in a consistent and computable language, and in a species-independent manner. The Gene Ontology is designed to be applicable to all organisms but up to now has been largely under-utilized for prokaryotes and viruses, in part because of a lack of appropriate ontology terms. To address this issue, we have developed a set of Gene Ontology classes that are applicable to microbes and their hosts, improving both coverage and quality in this area of the Gene Ontology. Describing microbial and viral gene products brings with it the additional challenge of capturing both the host and the microbe. Recognising this, we have worked closely with annotation groups to test and optimize the GO classes, and we describe here a set of annotation guidelines that allow the controlled description of two interacting organisms. Building on the microbial resources already in existence such as ViralZone, UniProtKB keywords and MeGO, this project provides an integrated ontology to describe interactions between microbial species and their hosts, with mappings to the external resources above. Housing this information within the freely-accessible Gene Ontology project allows the classes and annotation structure to be utilized by a large community of biologists and users.
Hastings, Janna; de Matos, Paula; Dekker, Adriano; Ennis, Marcus; Harsha, Bhavana; Kale, Namrata; Muthukrishnan, Venkatesh; Owen, Gareth; Turner, Steve; Williams, Mark; Steinbeck, Christoph
2013-01-01
ChEBI (http://www.ebi.ac.uk/chebi) is a database and ontology of chemical entities of biological interest. Over the past few years, ChEBI has continued to grow steadily in content, and has added several new features. In addition to incorporating all user-requested compounds, our annotation efforts have emphasized immunology, natural products and metabolites in many species. All database entries are now 'is_a' classified within the ontology, meaning that all of the chemicals are available to semantic reasoning tools that harness the classification hierarchy. We have completely aligned the ontology with the Open Biomedical Ontologies (OBO) Foundry-recommended upper level Basic Formal Ontology. Furthermore, we have aligned our chemical classification with the classification of chemical-involving processes in the Gene Ontology (GO), and as a result of this effort, the majority of chemical-involving processes in GO are now defined in terms of the ChEBI entities that participate in them. This effort necessitated incorporating many additional biologically relevant compounds. We have incorporated additional data types including reference citations, and the species and component for metabolites. Finally, our website and web services have had several enhancements, most notably the provision of a dynamic new interactive graph-based ontology visualization.
Ontology Design of Influential People Identification Using Centrality
NASA Astrophysics Data System (ADS)
Maulana Awangga, Rolly; Yusril, Muhammad; Setyawan, Helmi
2018-04-01
Identifying influential people, represented as nodes in a graph, is commonly done through social network analysis. Social network data has users as nodes and relations as edges, forming a friend-relation graph. This research examines the different meanings of the relations between nodes in a social network. Ontology is a suitable discipline for describing social network data at the conceptual and domain levels, and it captures more of the essential relationships in a social network than a plain graph does. The World Wide Web Consortium has proposed ontologies as a standard for knowledge representation on the semantic web. The formal data representation uses the Resource Description Framework (RDF) and the Web Ontology Language (OWL), which is strategic for Open Knowledge-Based website data. Because the ontology is used for the semantic description of relationships in the social network, it is open to extension: a semantic relationship ontology can be developed by adding and modifying different relationships, with the identification of influential people as the conclusion. This research proposes a model using OWL and RDF for identifying influential people in a social network. The study uses degree centrality, betweenness centrality, and closeness centrality measurements for data validation. In conclusion, influential people on Facebook can be identified by applying the proposed ontology model to the Group, Photos, Photo Tag, Friends, Events and Works data.
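The three centrality measures named in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the graph, the user names, and the function names are hypothetical, the graph is assumed to be undirected, unweighted, and connected, and betweenness is computed with Brandes' algorithm and left unnormalized.

```python
from collections import deque

def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {v: (len(nbrs) / (n - 1) if n > 1 else 0.0) for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

def betweenness_centrality(graph):
    """Unnormalized betweenness (Brandes' algorithm) for an undirected graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                          # nodes in order of discovery
        preds = {v: [] for v in graph}      # predecessors on shortest paths
        sigma = {v: 0 for v in graph}       # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):           # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical friend graph: "ana" is friends with everyone else.
friends = {
    "ana": {"budi", "citra", "dewi"},
    "budi": {"ana"},
    "citra": {"ana"},
    "dewi": {"ana"},
}
degree = degree_centrality(friends)
closeness = closeness_centrality(friends)
betweenness = betweenness_centrality(friends)
```

In this toy graph "ana" ranks highest on all three measures, which is the kind of agreement one would look for when validating a candidate influential user.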
Towards a Consistent and Scientifically Accurate Drug Ontology.
Hogan, William R; Hanna, Josh; Joseph, Eric; Brochhausen, Mathias
2013-01-01
Our use case for comparative effectiveness research requires an ontology of drugs that enables querying National Drug Codes (NDCs) by active ingredient, mechanism of action, physiological effect, and therapeutic class of the drug products they represent. We conducted an ontological analysis of drugs from the realist perspective, and evaluated existing drug terminology, ontology, and database artifacts from (1) the technical perspective, (2) the perspective of pharmacology and medical science, (3) the perspective of description logic semantics (if they were available in Web Ontology Language or OWL), and (4) the perspective of our realism-based analysis of the domain. No existing resource was sufficient. Therefore, we built the Drug Ontology (DrOn) in OWL, which we populated with NDCs and other classes from RxNorm using only content created by the National Library of Medicine. We also built an application that uses DrOn to query for NDCs as outlined above, available at: http://ingarden.uams.edu/ingredients. The application uses an OWL-based description logic reasoner to execute end-user queries. DrOn is available at http://code.google.com/p/dr-on.
Common IED exploitation target set ontology
NASA Astrophysics Data System (ADS)
Russomanno, David J.; Qualls, Joseph; Wowczuk, Zenovy; Franken, Paul; Robinson, William
2010-04-01
The Common IED Exploitation Target Set (CIEDETS) ontology provides a comprehensive semantic data model for capturing knowledge about sensors, platforms, missions, environments, and other aspects of systems under test. The ontology also includes representative IEDs, modeled as explosives, camouflage, concealment objects, and other background objects, which together comprise an overall threat scene. The ontology is represented using the Web Ontology Language and the SPARQL Protocol and RDF Query Language, which ensures portability of the acquired knowledge base across applications. The resulting knowledge base is a component of the CIEDETS application, which is intended to support the end user sensor test and evaluation community. CIEDETS associates a system under test to a subset of cataloged threats based on the probability that the system will detect the threat. The associations between systems under test, threats, and the detection probabilities are established based on a hybrid reasoning strategy, which applies a combination of heuristics and simplified modeling techniques. Besides supporting the CIEDETS application, which is focused on efficient and consistent system testing, the ontology can be leveraged in a myriad of other applications, including serving as a knowledge source for mission planning tools.
Android Based Mobile Environment for Moodle Users
ERIC Educational Resources Information Center
de Clunie, Gisela T.; Clunie, Clifton; Castillo, Aris; Rangel, Norman
2013-01-01
This paper is about the development of a platform that eases, throughout Android based mobile devices, mobility of users of virtual courses at Technological University of Panama. The platform deploys computational techniques such as "web services," design patterns, ontologies and mobile technologies to allow mobile devices communicate…
Towards Semantic Modelling of Business Processes for Networked Enterprises
NASA Astrophysics Data System (ADS)
Furdík, Karol; Mach, Marián; Sabol, Tomáš
The paper presents an approach to the semantic modelling and annotation of business processes and information resources, as it was designed within the FP7 ICT EU project SPIKE to support creation and maintenance of short-term business alliances and networked enterprises. A methodology for the development of the resource ontology, as a shareable knowledge model for semantic description of business processes, is proposed. Systematically collected user requirements, conceptual models implied by the selected implementation platform as well as available ontology resources and standards are employed in the ontology creation. The process of semantic annotation is described and illustrated using an example taken from a real application case.
ODISEES: Ontology-Driven Interactive Search Environment for Earth Sciences
NASA Technical Reports Server (NTRS)
Rutherford, Matthew T.; Huffer, Elisabeth B.; Kusterer, John M.; Quam, Brandi M.
2015-01-01
This paper discusses the Ontology-driven Interactive Search Environment for Earth Sciences (ODISEES) project currently being developed to aid researchers attempting to find usable data among an overabundance of closely related data. ODISEES' ontological structure relies on a modular, adaptable concept modeling approach, which allows the domain to be modeled more or less as it is without worrying about terminology or external requirements. In the model, variables are individually assigned semantic content based on the characteristics of the measurements they represent, allowing intuitive discovery and comparison of data without requiring the user to sift through large numbers of data sets and variables to find the desired information.
Using Linked Open Data and Semantic Integration to Search Across Geoscience Repositories
NASA Astrophysics Data System (ADS)
Mickle, A.; Raymond, L. M.; Shepherd, A.; Arko, R. A.; Carbotte, S. M.; Chandler, C. L.; Cheatham, M.; Fils, D.; Hitzler, P.; Janowicz, K.; Jones, M.; Krisnadhi, A.; Lehnert, K. A.; Narock, T.; Schildhauer, M.; Wiebe, P. H.
2014-12-01
The MBLWHOI Library is a partner in the OceanLink project, an NSF EarthCube Building Block, applying semantic technologies to enable knowledge discovery, sharing and integration. OceanLink is testing ontology design patterns that link together: two data repositories, Rolling Deck to Repository (R2R) and Biological and Chemical Oceanography Data Management Office (BCO-DMO); the MBLWHOI Library Institutional Repository (IR) Woods Hole Open Access Server (WHOAS); National Science Foundation (NSF) funded awards; and American Geophysical Union (AGU) conference presentations. The Library is collaborating with scientific users, data managers, DSpace engineers, experts in ontology design patterns, and user interface developers to make WHOAS, a DSpace repository, linked open data enabled. The goal is to allow searching across repositories without any of the information providers having to change how they manage their collections. The tools developed for DSpace will be made available to the community of users. There are 257 registered DSpace repositories in the United States and over 1700 worldwide. Outcomes include: integration of DSpace with the OpenRDF Sesame triple store to provide a SPARQL endpoint for the storage and query of RDF representations of DSpace resources, mapping of DSpace resources to the OceanLink ontology, and a DSpace "data" add-on to provide resolvable linked open data representations of DSpace resources.
Ontology-based topic clustering for online discussion data
NASA Astrophysics Data System (ADS)
Wang, Yongheng; Cao, Kening; Zhang, Xiaoming
2013-03-01
With the rapid development of online communities, mining and extracting quality knowledge from online discussions becomes very important for the industrial and marketing sector, as well as for e-commerce applications and government. Most of the existing techniques model a discussion as a social network of users represented by a user-based graph without considering the content of the discussion. In this paper we propose a new multilayered model to analyze online discussions. The user-based and message-based representations are combined in this model. A novel clustering method based on frequent concept sets is used to cluster the original online discussion network into topic space. Domain ontology is used to improve the clustering accuracy. Parallel methods are also used to make the algorithms scalable to very large data sets. Our experimental study shows that the model and algorithms are effective when analyzing large scale online discussion data.
OpenBiodiv-O: ontology of the OpenBiodiv knowledge management system.
Senderov, Viktor; Simov, Kiril; Franz, Nico; Stoev, Pavel; Catapano, Terry; Agosti, Donat; Sautter, Guido; Morris, Robert A; Penev, Lyubomir
2018-01-18
The biodiversity domain, and in particular biological taxonomy, is moving in the direction of semantization of its research outputs. The present work introduces OpenBiodiv-O, the ontology that serves as the basis of the OpenBiodiv Knowledge Management System. Our intent is to provide an ontology that fills the gaps between ontologies for biodiversity resources, such as DarwinCore-based ontologies, and semantic publishing ontologies, such as the SPAR Ontologies. We bridge this gap by providing an ontology focusing on biological taxonomy. OpenBiodiv-O introduces classes, properties, and axioms in the domains of scholarly biodiversity publishing and biological taxonomy and aligns them with several important domain ontologies (FaBiO, DoCO, DwC, Darwin-SW, NOMEN, ENVO). By doing so, it bridges the ontological gap across scholarly biodiversity publishing and biological taxonomy and allows for the creation of a Linked Open Dataset (LOD) of biodiversity information (a biodiversity knowledge graph) and enables the creation of the OpenBiodiv Knowledge Management System. A key feature of the ontology is that it is an ontology of the scientific process of biological taxonomy and not of any particular state of knowledge. This feature allows it to express a multiplicity of scientific opinions. The resulting OpenBiodiv knowledge system may gain a high level of trust in the scientific community as it does not force a scientific opinion on its users (e.g. practicing taxonomists, library researchers, etc.), but rather provides the tools for experts to encode different views as science progresses. OpenBiodiv-O provides a conceptual model of the structure of a biodiversity publication and the development of related taxonomic concepts. It also serves as the basis for the OpenBiodiv Knowledge Management System.
Discovering Beaten Paths in Collaborative Ontology-Engineering Projects using Markov Chains
Walk, Simon; Singer, Philipp; Strohmaier, Markus; Tudorache, Tania; Musen, Mark A.; Noy, Natalya F.
2014-01-01
Biomedical taxonomies, thesauri and ontologies, in the form of the International Classification of Diseases as a taxonomy or the National Cancer Institute Thesaurus as an OWL-based ontology, play a critical role in acquiring, representing and processing information about human health. With increasing adoption and relevance, biomedical ontologies have also significantly increased in size. For example, the 11th revision of the International Classification of Diseases, which is currently under active development by the World Health Organization, contains nearly 50,000 classes representing a vast variety of different diseases and causes of death. This evolution in terms of size was accompanied by an evolution in the way ontologies are engineered. Because no single individual has the expertise to develop such large-scale ontologies, ontology-engineering projects have evolved from small-scale efforts involving just a few domain experts to large-scale projects that require effective collaboration between dozens or even hundreds of experts, practitioners and other stakeholders. Understanding the way these different stakeholders collaborate will enable us to improve editing environments that support such collaborations. In this paper, we uncover how large ontology-engineering projects, such as the International Classification of Diseases in its 11th revision, unfold by analyzing usage logs of five different biomedical ontology-engineering projects of varying sizes and scopes using Markov chains. We discover intriguing interaction patterns (e.g., which properties users frequently change after specific given ones) that suggest that large collaborative ontology-engineering projects are governed by a few general principles that determine and drive development.
From our analysis, we identify commonalities and differences between different projects that have implications for project managers, ontology editors, developers and contributors working on collaborative ontology-engineering projects and tools in the biomedical domain. PMID:24953242
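As a rough illustration of how usage logs can be turned into a Markov chain, the sketch below estimates transition probabilities between edited properties by counting consecutive pairs. It assumes a first-order chain and maximum-likelihood estimates; the abstract does not state the order or estimation method used, and the session data and property names here are invented for the example.

```python
from collections import Counter, defaultdict

def transition_probabilities(sessions):
    """Estimate first-order transition probabilities from edit sequences.

    Each session is a list of states (here: the property a user changed),
    in the order the changes were logged.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Count every consecutive (current, next) pair in the session.
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    probabilities = {}
    for state, followers in counts.items():
        total = sum(followers.values())
        probabilities[state] = {nxt: c / total for nxt, c in followers.items()}
    return probabilities

# Hypothetical usage log: which property was edited, per user session.
sessions = [
    ["title", "definition", "synonym"],
    ["title", "definition", "title"],
    ["title", "synonym"],
]
chain = transition_probabilities(sessions)
most_likely_after_title = max(chain["title"], key=chain["title"].get)
```

Reading off the largest entry in each row of `chain` gives the "beaten path" from one property to the next under these assumptions.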
Deploying the ODISEES Ontology-guided Search in the NASA Earth Exchange (NEX)
NASA Astrophysics Data System (ADS)
Huffer, E.; Gleason, J. L.; Cotnoir, M.; Spaulding, R.; Deardorff, G.
2016-12-01
Robust, semantically rich metadata can support data discovery and access, and facilitate machine-to-machine transactions with services such as data subsetting, regridding, and reformatting. Despite this, for users not already familiar with the data in a given archive, most metadata is insufficient to help them find appropriate data for their projects. With this in mind, the Ontology-driven Interactive Search Environment (ODISEES) Data Discovery Portal was developed to enable users to find and download data variables that satisfy precise, parameter-level criteria, even when they know little or nothing about the naming conventions employed by data providers, or where suitable data might be archived. ODISEES relies on an Earth science ontology and metadata repository that provide an ontological framework for describing NASA data holdings with enough detail and fidelity to enable researchers to find, compare and evaluate individual data variables. Users can search for data by indicating the specific parameters desired, and comparing the results in a table that lets them quickly determine which data is most suitable. ODISEES and OLYMPUS, a tool for generating the semantically enhanced metadata used by ODISEES, are being developed in collaboration with the NASA Earth Exchange (NEX) project at the NASA Ames Research Center to prototype a robust data discovery and access service that could be made available to NEX users. NEX is a collaborative platform that provides researchers with access to TB to PB-scale datasets and analysis tools to operate on those data. By integrating ODISEES into the NEX Web Portal we hope to enable NEX users to locate datasets relevant to their research and download them directly into the NAS environment, where they can run applications using those datasets on the NAS supercomputers. 
This poster will describe the prototype integration of ODISEES into the NEX portal development environment, the mechanism implemented to use NASA APIs to retrieve data, and the approach to transfer data into the NAS supercomputing environment. Finally, we will describe the end-to-end demonstration of the capabilities implemented. This work was funded by the Advanced Information Systems Technology Program of NASA's Research Opportunities in Space and Earth Science.
Bratsas, Charalampos; Koutkias, Vassilis; Kaimakamis, Evangelos; Bamidis, Panagiotis; Maglaveras, Nicos
2007-01-01
Medical Computational Problem (MCP) solving is related to medical problems and their computerized algorithmic solutions. In this paper, an extension of an ontology-based model to fuzzy logic is presented, as a means to enhance the information retrieval (IR) procedure in semantic management of MCPs. We present herein the methodology followed for the fuzzy expansion of the ontology model, the fuzzy query expansion procedure, as well as an appropriate ontology-based Vector Space Model (VSM) that was constructed for efficient mapping of user-defined MCP search criteria and MCP acquired knowledge. The relevant fuzzy thesaurus is constructed by calculating the simultaneous occurrences of terms and the term-to-term similarities derived from the ontology that utilizes UMLS (Unified Medical Language System) concepts by using Concept Unique Identifiers (CUI), synonyms, semantic types, and broader-narrower relationships for fuzzy query expansion. The current approach constitutes a sophisticated advance for effective, semantics-based MCP-related IR.
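The combination of fuzzy query expansion and a Vector Space Model can be sketched as follows. This is only an illustration under stated assumptions: the thesaurus entries, similarity values, threshold, and term names are hypothetical, the expansion rule (weight times similarity, keeping the maximum) is one common choice rather than the one used in the paper, and matching is done with plain cosine similarity.

```python
import math

def expand_query(query_weights, thesaurus, threshold=0.5):
    """Add related terms whose fuzzy similarity meets the threshold.

    An added term receives the original term's weight scaled by the
    similarity; if several paths reach a term, the largest weight wins.
    """
    expanded = dict(query_weights)
    for term, weight in query_weights.items():
        for related, similarity in thesaurus.get(term, {}).items():
            if similarity >= threshold:
                candidate = weight * similarity
                expanded[related] = max(expanded.get(related, 0.0), candidate)
    return expanded

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical fuzzy thesaurus: term -> {related term: similarity in [0, 1]}.
thesaurus = {"apnea": {"sleep apnea": 0.8, "snoring": 0.3}}
query = {"apnea": 1.0}
document = {"sleep apnea": 1.0}   # a stored MCP description, as term weights

expanded = expand_query(query, thesaurus)
score_before = cosine(query, document)
score_after = cosine(expanded, document)
```

Without expansion the query and the document share no terms, so the score is zero; after expansion the related term lets the document match with a reduced weight.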
The Evidence and Conclusion Ontology (ECO): Supporting GO Annotations.
Chibucos, Marcus C; Siegele, Deborah A; Hu, James C; Giglio, Michelle
2017-01-01
The Evidence and Conclusion Ontology (ECO) is a community resource for describing the various types of evidence that are generated during the course of a scientific study and which are typically used to support assertions made by researchers. ECO describes multiple evidence types, including evidence resulting from experimental (i.e., wet lab) techniques, evidence arising from computational methods, statements made by authors (whether or not supported by evidence), and inferences drawn by researchers curating the literature. In addition to summarizing the evidence that supports a particular assertion, ECO also offers a means to document whether a computer or a human performed the process of making the annotation. Incorporating ECO into an annotation system makes it possible to leverage the structure of the ontology such that associated data can be grouped hierarchically, users can select data associated with particular evidence types, and quality control pipelines can be optimized. Today, over 30 resources, including the Gene Ontology, use the Evidence and Conclusion Ontology to represent both evidence and how annotations are made.
ERIC Educational Resources Information Center
Garcia-Barriocanal, Elena; Sicilia, Miguel-Angel; Sanchez-Alonso, Salvador; Lytras, Miltiadis
2011-01-01
Web 2.0 technologies can be considered a loosely defined set of Web application styles that foster a kind of media consumer more engaged, and usually active in creating and maintaining Internet contents. Thus, Web 2.0 applications have resulted in increased user participation and massive user-generated (or user-published) open multimedia content,…
The Cell Ontology 2016: enhanced content, modularization, and ontology interoperability.
Diehl, Alexander D; Meehan, Terrence F; Bradford, Yvonne M; Brush, Matthew H; Dahdul, Wasila M; Dougall, David S; He, Yongqun; Osumi-Sutherland, David; Ruttenberg, Alan; Sarntivijai, Sirarat; Van Slyke, Ceri E; Vasilevsky, Nicole A; Haendel, Melissa A; Blake, Judith A; Mungall, Christopher J
2016-07-04
The Cell Ontology (CL) is an OBO Foundry candidate ontology covering the domain of canonical, natural biological cell types. Since its inception in 2005, the CL has undergone multiple rounds of revision and expansion, most notably in its representation of hematopoietic cells. For in vivo cells, the CL focuses on vertebrates but provides general classes that can be used for other metazoans, which can be subtyped in species-specific ontologies. Recent work on the CL has focused on extending the representation of various cell types, and developing new modules in the CL itself, and in related ontologies in coordination with the CL. For example, the Kidney and Urinary Pathway Ontology was used as a template to populate the CL with additional cell types. In addition, subtypes of the class 'cell in vitro' have received improved definitions and labels to provide for modularity with the representation of cells in the Cell Line Ontology and Reagent Ontology. Recent changes in the ontology development methodology for CL include a switch from OBO to OWL for the primary encoding of the ontology, and an increasing reliance on logical definitions for improved reasoning. The CL is now mandated as a metadata standard for large functional genomics and transcriptomics projects, and is used extensively for annotation, querying, and analyses of cell type specific data in sequencing consortia such as FANTOM5 and ENCODE, as well as for the NIAID ImmPort database and the Cell Image Library. The CL is also a vital component used in the modular construction of other biomedical ontologies-for example, the Gene Ontology and the cross-species anatomy ontology, Uberon, use CL to support the consistent representation of cell types across different levels of anatomical granularity, such as tissues and organs. 
The ongoing improvements to the CL make it a valuable resource to both the OBO Foundry community and the wider scientific community, and we continue to experience increased interest in the CL both among developers and within the user community.
Dönitz, Jürgen; Wingender, Edgar
2012-01-01
The semantic web depends on the use of ontologies to let electronic systems interpret contextual information. Optimally, the handling and access of ontologies should be completely transparent to the user. As a means to this end, we have developed a service that attempts to bridge the gap between experts in a certain knowledge domain, ontologists, and application developers. The ontology-based answers (OBA) service introduced here can be embedded into custom applications to grant access to the classes of ontologies and their relations as most important structural features as well as to information encoded in the relations between ontology classes. Thus computational biologists can benefit from ontologies without detailed knowledge about the respective ontology. The content of ontologies is mapped to a graph of connected objects which is compatible to the object-oriented programming style in Java. Semantic functions implement knowledge about the complex semantics of an ontology beyond the class hierarchy and “partOf” relations. By using these OBA functions an application can, for example, provide a semantic search function, or (in the examples outlined) map an anatomical structure to the organs it belongs to. The semantic functions relieve the application developer from the necessity of acquiring in-depth knowledge about the semantics and curation guidelines of the used ontologies by implementing the required knowledge. The architecture of the OBA service encapsulates the logic to process ontologies in order to achieve a separation from the application logic. A public server with the current plugins is available and can be used with the provided connector in a custom application in scenarios analogous to the presented use cases. The server and the client are freely available if a project requires the use of custom plugins or non-public ontologies. 
The OBA service and further documentation are available at http://www.bioinf.med.uni-goettingen.de/projects/oba. PMID:23060901
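The abstract's example of mapping an anatomical structure to the organs it belongs to can be illustrated with a small graph traversal. The following is a minimal sketch, not the OBA API: the `PART_OF` table, the `ORGANS` set, and the function `organs_containing` are hypothetical names with toy data, written in Python rather than the service's Java.

```python
from collections import deque

# Hypothetical toy data: structure -> structures it is directly part of.
PART_OF = {
    "glomerulus": ["nephron"],
    "nephron": ["renal cortex"],
    "renal cortex": ["kidney"],
    "alveolus": ["lung"],
}
# Hypothetical set of terms that count as organs in this toy example.
ORGANS = {"kidney", "lung"}


def organs_containing(structure):
    """Return every organ reachable from `structure` by following partOf edges."""
    found = set()
    seen = {structure}
    queue = deque([structure])
    while queue:
        node = queue.popleft()
        if node in ORGANS:
            found.add(node)
        for parent in PART_OF.get(node, []):
            if parent not in seen:  # guard against revisiting shared ancestors
                seen.add(parent)
                queue.append(parent)
    return found
```

Calling `organs_containing("glomerulus")` walks glomerulus, nephron, renal cortex, kidney and reports the kidney. Wrapping the traversal in one function is the point of a semantic function as the abstract describes it: the caller does not need to know how the relations are curated.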
Semantic solutions to Heliophysics data access
NASA Astrophysics Data System (ADS)
Narock, T. W.; Vandegriff, J. D.; Weigel, R. S.
2011-12-01
Within the domain of Heliophysics, data discovery is being actively addressed. However, data diversity in the returned results has proven to be a significant barrier to integrated multi-mission analysis. Software is being actively developed (e.g. Vandegriff and Brown, 2008) that is data format and measurement type agnostic. However, such approaches rely on an a priori definition of common baseline parameters, units, and coordinate systems onto which all data will be mapped. In this work, we describe our efforts at utilizing a task ontology (Guarino, 1998) to model the steps involved in data transformation within Heliophysics. Thus, given Heliophysics logic and heterogeneous input data, we are able to develop software that is able to infer the set of steps required to compute user specified parameters. Such a framework offers flexibility by allowing users to define their own preferred sets of parameters, units, and coordinate systems they would like in their analysis. In addition, the storage of this information as ontology instances means they are external to source code and are easily shareable and extensible. The additional inclusion of a provenance ontology allows us to capture the historical record of each data analysis session for future review. We describe our use of existing task and provenance ontologies and provide example use cases as well as potential future applications. References J. Vandegriff and L. Brown, (2010), A framework for reading and unifying heliophysics time series data, Earth Science Informatics, Volume 3, Numbers 1-2, Pages 75-86 N. Guarino, (1998), Formal Ontology in Information Systems, Proceedings of FOIS'98, Trento, Italy, 6-8 June 1998. Amsterdam, IOS Press, pp. 3-15.
OBO to UML: Support for the development of conceptual models in the biomedical domain.
Waldemarin, Ricardo C; de Farias, Cléver R G
2018-04-01
A conceptual model abstractly defines a number of concepts and their relationships for the purposes of understanding and communication. Once a conceptual model is available, it can also be used as a starting point for the development of a software system. The development of conceptual models using the Unified Modeling Language (UML) facilitates the representation of modeled concepts and allows software developers to directly reuse these concepts in the design of a software system. The OBO Foundry represents the most relevant collaborative effort towards the development of ontologies in the biomedical domain. The development of UML conceptual models in the biomedical domain may benefit from the use of domain-specific semantics and notation. Further, the development of these models may also benefit from the reuse of knowledge contained in OBO ontologies. This paper investigates the support for the development of conceptual models in the biomedical domain using UML as a conceptual modeling language and using the support provided by the OBO Foundry for the development of biomedical ontologies, namely entity kind and relationship types definitions provided by the Basic Formal Ontology (BFO) and the OBO Core Relations Ontology (OBO Core), respectively. Further, the paper investigates the support for the reuse of biomedical knowledge currently available in OBOFFF ontologies in the development of these conceptual models. The paper describes a UML profile for the OBO Core Relations Ontology, which basically defines a number of stereotypes to represent BFO entity kinds and OBO Core relationship types definitions. The paper also presents a support toolset consisting of a graphical editor named OBO-RO Editor, which directly supports the development of UML models using the extensions defined by our profile, and a command-line tool named OBO2UML, which directly converts an OBOFFF ontology into a UML model. Copyright © 2018 Elsevier Inc. All rights reserved.
OWLing Clinical Data Repositories With the Ontology Web Language
Pastor, Xavier; Lozano, Esther
2014-01-01
Background The health sciences are based upon information. Clinical information is usually stored and managed by physicians with precarious tools, such as spreadsheets. The biomedical domain is more complex than other domains that have adopted information and communication technologies as pervasive business tools. Moreover, medicine continuously changes its corpus of knowledge because of new discoveries and the rearrangements in the relationships among concepts. This scenario makes it especially difficult to offer good tools to answer the professional needs of researchers and constitutes a barrier that needs innovation to discover useful solutions. Objective The objective was to design and implement a framework for the development of clinical data repositories, capable of facing the continuous change in the biomedicine domain and minimizing the technical knowledge required from final users. Methods We combined knowledge management tools and methodologies with relational technology. We present an ontology-based approach that is flexible and efficient for dealing with complexity and change, integrated with a solid relational storage and a Web graphical user interface. Results Onto Clinical Research Forms (OntoCRF) is a framework for the definition, modeling, and instantiation of data repositories. It does not need any database design or programming. All required information to define a new project is explicitly stated in ontologies. Moreover, the user interface is built automatically on the fly as Web pages, whereas data are stored in a generic repository. This allows for immediate deployment and population of the database as well as instant online availability of any modification. Conclusions OntoCRF is a complete framework to build data repositories with a solid relational storage. 
Driven by ontologies, OntoCRF deals with complexity and change more flexibly and efficiently than traditional systems and does not require highly skilled technical staff, which facilitates the engineering of clinical software systems. PMID:25599697
OWLing Clinical Data Repositories With the Ontology Web Language.
Lozano-Rubí, Raimundo; Pastor, Xavier; Lozano, Esther
2014-08-01
The health sciences are based upon information. Clinical information is usually stored and managed by physicians with precarious tools, such as spreadsheets. The biomedical domain is more complex than other domains that have adopted information and communication technologies as pervasive business tools. Moreover, medicine continuously changes its corpus of knowledge because of new discoveries and the rearrangements in the relationships among concepts. This scenario makes it especially difficult to offer good tools to answer the professional needs of researchers and constitutes a barrier that needs innovation to discover useful solutions. The objective was to design and implement a framework for the development of clinical data repositories, capable of facing the continuous change in the biomedicine domain and minimizing the technical knowledge required from final users. We combined knowledge management tools and methodologies with relational technology. We present an ontology-based approach that is flexible and efficient for dealing with complexity and change, integrated with a solid relational storage and a Web graphical user interface. Onto Clinical Research Forms (OntoCRF) is a framework for the definition, modeling, and instantiation of data repositories. It does not need any database design or programming. All required information to define a new project is explicitly stated in ontologies. Moreover, the user interface is built automatically on the fly as Web pages, whereas data are stored in a generic repository. This allows for immediate deployment and population of the database as well as instant online availability of any modification. OntoCRF is a complete framework to build data repositories with a solid relational storage. Driven by ontologies, OntoCRF is more flexible and efficient to deal with complexity and change than traditional systems and does not require very skilled technical people facilitating the engineering of clinical software systems.
NASA Astrophysics Data System (ADS)
Elag, M.; Goodall, J. L.
2013-12-01
Hydrologic modeling often requires the re-use and integration of models from different disciplines to simulate complex environmental systems. Component-based modeling introduces a flexible approach for integrating physical-based processes across disciplinary boundaries. Several hydrologic-related modeling communities have adopted the component-based approach for simulating complex physical systems by integrating model components across disciplinary boundaries in a workflow. However, it is not always straightforward to create these interdisciplinary models due to the lack of sufficient knowledge about a hydrologic process. This shortcoming is a result of using informal methods for organizing and sharing information about a hydrologic process. A knowledge-based ontology provides such standards and is considered the ideal approach for overcoming this challenge. The aims of this research are to present the methodology used in analyzing the basic hydrologic domain in order to identify hydrologic processes, the ontology itself, and how the proposed ontology is integrated with the Water Resources Component (WRC) ontology. The proposed ontology standardizes the definitions of a hydrologic process, the relationships between hydrologic processes, and their associated scientific equations. The objective of the proposed Hydrologic Process (HP) Ontology is to advance the idea of creating a unified knowledge framework for components' metadata by introducing a domain-level ontology for hydrologic processes. The HP ontology is a step toward an explicit and robust domain knowledge framework that can be evolved through the contribution of domain users. Analysis of the hydrologic domain is accomplished using the Formal Concept Approach (FCA), in which the infiltration process, an important hydrologic process, is examined. Two infiltration methods, the Green-Ampt and Philip's methods, were used to demonstrate the implementation of information in the HP ontology. 
Furthermore, a SPARQL service is provided for semantic-based querying of the ontology.
Pitfalls of Ontology in Medicine.
Aldosari, Bakheet; Alanazi, Abdullah; Househ, Mowafa
2017-01-01
Much research has been done in the last few decades in clinical research, medicine, life sciences, etc., leading to an exponential increase in the generation of data. Managing this vast information not only requires integration of the data, but also a means to analyze, relate, and retrieve it. Ontology, in the field of medicine, describes the concepts of medical terminologies and the relations between them, thus enabling the sharing of medical knowledge. Ontology-based analyses are associated with a risk that errors in modeling may degrade the quality of the results. Identifying flawed practices or anomalies in ontologies is a crucial issue to be addressed by researchers. In this paper, we review the negative sides of ontology in the field of medicine. Our study results show that ontologies are perceived as a mere tool to represent medical knowledge, thus relying more on the computer science-based understanding of medical terms. While this approach may be sufficient for data entry systems, in which the users merely need to browse the hierarchy and select relevant terms, it may not suffice for the real-world scenario of dealing with complex patient records, which are not only grammatically complex, but also are sometimes documented in many native languages. In conclusion, more research is required in identifying poor practices and anomalies in the development of ontologies by computer scientists within the field of medicine.
Towards ontology personalization to enrich social conversations on AAC systems
NASA Astrophysics Data System (ADS)
Mancilla V., Daniela; Sastoque H., Sebastian; Iregui G., Marcela
2015-01-01
Communication is one of the essential needs of human beings. Augmentative and Alternative Communication (AAC) systems seek to help people with physical disorders that limit their natural communication to generate oral and written language. These systems present significant challenges such as: the composition of consistent messages according to syntactic and semantic rules, the improvement of message production times, the application to social contexts and, consequently, the incorporation of user-specific information. This work presents an original ontology personalization approach for an AAC instant messaging system incorporating personalized information to improve the efficacy and efficiency of message production. This proposal is based on a projection of a general ontology into a more specific one, avoiding storage redundancy and data coupling, and represents a significant opportunity to enrich the communication capabilities of current AAC systems. The evaluation was performed on a case study based on an AAC system for assistance in composing messages. The results show that adding user-specific information allows the generation of enriched phrases, thereby improving the accuracy of the message and facilitating the communication process.
Caniza, Horacio; Romero, Alfonso E; Heron, Samuel; Yang, Haixuan; Devoto, Alessandra; Frasca, Marco; Mesiti, Marco; Valentini, Giorgio; Paccanaro, Alberto
2014-08-01
We present GOssTo, the Gene Ontology semantic similarity Tool, a user-friendly software system for calculating semantic similarities between gene products according to the Gene Ontology. GOssTo is bundled with six semantic similarity measures, including both term- and graph-based measures, and has extension capabilities to allow the user to add new similarities. Importantly, for any measure, GOssTo can also calculate the Random Walk Contribution that has been shown to greatly improve the accuracy of similarity measures. GOssTo is very fast, easy to use, and it allows the calculation of similarities on a genomic scale in a few minutes on a regular desktop machine. GOssTo is available both as a stand-alone application running on GNU/Linux, Windows and MacOS from www.paccanarolab.org/gossto and as a web application from www.paccanarolab.org/gosstoweb. The stand-alone application features a simple and concise command line interface for easy integration into high-throughput data processing pipelines. Contact: alberto@cs.rhul.ac.uk. © The Author 2014. Published by Oxford University Press.
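To make the idea of a term-based semantic similarity measure concrete, here is a minimal sketch of a Resnik-style score on a toy ontology. The abstract does not name the six bundled measures, so Resnik is used purely as a familiar example and is not claimed to be GOssTo's implementation. The term names, the `PARENTS` table, and the annotation counts are all hypothetical.

```python
import math

# Hypothetical toy ontology: term -> direct is_a parents.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}
# Hypothetical number of gene products annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "binding": 2,
    "catalysis": 4,
    "dna_binding": 1,
    "rna_binding": 1,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    result, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in result:
            result.add(current)
            stack.extend(PARENTS[current])
    return result


def information_content(term):
    """IC(t) = -log p(t); p(t) counts annotations to t and to its descendants."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(n for t, n in DIRECT_COUNTS.items() if term in ancestors(t))
    return -math.log(covered / total)


def resnik(term_a, term_b):
    """Similarity = information content of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)
```

In this toy data, `dna_binding` and `rna_binding` share the ancestor `binding`, which covers half of all annotations, so their score is log 2. `dna_binding` and `catalysis` share only the root, which carries no information, so their score is zero.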
Improving ontology matching with propagation strategy and user feedback
NASA Astrophysics Data System (ADS)
Li, Chunhua; Cui, Zhiming; Zhao, Pengpeng; Wu, Jian; Xin, Jie; He, Tianxu
2015-07-01
Markov logic networks, which unify probabilistic graphical models and first-order logic, provide an excellent framework for ontology matching. The existing approach requires a threshold to produce matching candidates and uses a small set of constraints acting as a filter to select the final alignments. We introduce a novel match propagation strategy to model the influences between potential entity mappings across ontologies, which can help to identify the correct correspondences and recover missed correspondences. Estimating an appropriate threshold is a difficult task. We propose an interactive method for threshold selection through which we obtain an additional measurable improvement. Experiments on a public dataset demonstrate the effectiveness of the proposed approach in terms of the quality of the resulting alignment.
BioPortal: An Open-Source Community-Based Ontology Repository
NASA Astrophysics Data System (ADS)
Noy, N.; NCBO Team
2011-12-01
Advances in computing power and new computational techniques have changed the way researchers approach science. In many fields, one of the most fruitful approaches has been to use semantically aware software to break down the barriers among disparate domains, systems, data sources, and technologies. Such software facilitates data aggregation, improves search, and ultimately allows the detection of new associations that were previously not detectable. Achieving these analyses requires software systems that take advantage of the semantics and that can intelligently negotiate domains and knowledge sources, identifying commonality across systems that use different and conflicting vocabularies, while understanding apparent differences that may be concealed by the use of superficially similar terms. An ontology, a semantically rich vocabulary for a domain of interest, is the cornerstone of software for bridging systems, domains, and resources. However, as ontologies become the foundation of all semantic technologies in e-science, we must develop an infrastructure for sharing ontologies, finding and evaluating them, integrating and mapping among them, and using ontologies in applications that help scientists process their data. BioPortal [1] is an open-source on-line community-based ontology repository that has been used as a critical component of semantic infrastructure in several domains, including biomedicine and bio-geochemical data. BioPortal uses social approaches in the Web 2.0 style to bring structure and order to the collection of biomedical ontologies. It enables users to provide and discuss a wide array of knowledge components, from submitting the ontologies themselves, to commenting on and discussing classes in the ontologies, to reviewing ontologies in the context of their own ontology-based projects, to creating mappings between overlapping ontologies and discussing and critiquing the mappings. 
Critically, it provides web-service access to all its content, enabling its integration in semantically enriched applications. [1] Noy, N.F., Shah, N.H., et al., BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Res, 2009. 37(Web Server issue): p. W170-3.
Mining Rare Associations between Biological Ontologies
Benites, Fernando; Simon, Svenja; Sapozhnikova, Elena
2014-01-01
The constantly increasing volume and complexity of available biological data requires new methods for their management and analysis. An important challenge is the integration of information from different sources in order to discover possible hidden relations between already known data. In this paper we introduce a data mining approach which relates biological ontologies by mining cross and intra-ontology pairwise generalized association rules. Its advantage is sensitivity to rare associations, for these are important for biologists. We propose a new class of interestingness measures designed for hierarchically organized rules. These measures allow one to select the most important rules and to take into account rare cases. They favor rules with an actual interestingness value that exceeds the expected value. The latter is calculated taking into account the parent rule. We demonstrate this approach by applying it to the analysis of data from Gene Ontology and GPCR databases. Our objective is to discover interesting relations between two different ontologies or parts of a single ontology. The association rules that are thus discovered can provide the user with new knowledge about underlying biological processes or help improve annotation consistency. The obtained results show that produced rules represent meaningful and quite reliable associations. PMID:24404165
Mining rare associations between biological ontologies.
Benites, Fernando; Simon, Svenja; Sapozhnikova, Elena
2014-01-01
The constantly increasing volume and complexity of available biological data requires new methods for their management and analysis. An important challenge is the integration of information from different sources in order to discover possible hidden relations between already known data. In this paper we introduce a data mining approach which relates biological ontologies by mining cross and intra-ontology pairwise generalized association rules. Its advantage is sensitivity to rare associations, for these are important for biologists. We propose a new class of interestingness measures designed for hierarchically organized rules. These measures allow one to select the most important rules and to take into account rare cases. They favor rules with an actual interestingness value that exceeds the expected value. The latter is calculated taking into account the parent rule. We demonstrate this approach by applying it to the analysis of data from Gene Ontology and GPCR databases. Our objective is to discover interesting relations between two different ontologies or parts of a single ontology. The association rules that are thus discovered can provide the user with new knowledge about underlying biological processes or help improve annotation consistency. The obtained results show that produced rules represent meaningful and quite reliable associations.
CoMetaR: A Collaborative Metadata Repository for Biomedical Research Networks.
Stöhr, Mark R; Helm, Gudrun; Majeed, Raphael W; Günther, Andreas
2017-01-01
The German Center for Lung Research (DZL) is a research network with the aim of researching respiratory diseases. Performing consortium-wide queries through a single interface requires a uniform conceptual structure. No single terminology covers all our concepts. To achieve a broadly accepted and complete ontology, we developed "CoMetaR", a platform for collaborative metadata management. Anyone can browse and discuss the ontology, while editing is restricted to authenticated users.
A User-Centric Knowledge Creation Model in a Web of Object-Enabled Internet of Things Environment
Kibria, Muhammad Golam; Fattah, Sheik Mohammad Mostakim; Jeong, Kwanghyeon; Chong, Ilyoung; Jeong, Youn-Kwae
2015-01-01
User-centric service features in a Web of Object-enabled Internet of Things environment can be provided by using a semantic ontology that classifies and integrates objects on the World Wide Web as well as shares and merges context-aware information and accumulated knowledge. The semantic ontology is applied on a Web of Object platform to virtualize the real world physical devices and information to form virtual objects that represent the features and capabilities of devices in the virtual world. Detailed information and functionalities of multiple virtual objects are combined with service rules to form composite virtual objects that offer context-aware knowledge-based services, where context awareness plays an important role in enabling automatic modification of the system to reconfigure the services based on the context. Converting the raw data into meaningful information and connecting the information to form the knowledge and storing and reusing the objects in the knowledge base can both be expressed by semantic ontology. In this paper, a knowledge creation model that synchronizes a service logistic model and a virtual world knowledge model on a Web of Object platform has been proposed. To realize the context-aware knowledge-based service creation and execution, a conceptual semantic ontology model has been developed and a prototype has been implemented for a use case scenario of emergency service. PMID:26393609
A User-Centric Knowledge Creation Model in a Web of Object-Enabled Internet of Things Environment.
Kibria, Muhammad Golam; Fattah, Sheik Mohammad Mostakim; Jeong, Kwanghyeon; Chong, Ilyoung; Jeong, Youn-Kwae
2015-09-18
User-centric service features in a Web of Object-enabled Internet of Things environment can be provided by using a semantic ontology that classifies and integrates objects on the World Wide Web as well as shares and merges context-aware information and accumulated knowledge. The semantic ontology is applied on a Web of Object platform to virtualize the real world physical devices and information to form virtual objects that represent the features and capabilities of devices in the virtual world. Detailed information and functionalities of multiple virtual objects are combined with service rules to form composite virtual objects that offer context-aware knowledge-based services, where context awareness plays an important role in enabling automatic modification of the system to reconfigure the services based on the context. Converting the raw data into meaningful information and connecting the information to form the knowledge and storing and reusing the objects in the knowledge base can both be expressed by semantic ontology. In this paper, a knowledge creation model that synchronizes a service logistic model and a virtual world knowledge model on a Web of Object platform has been proposed. To realize the context-aware knowledge-based service creation and execution, a conceptual semantic ontology model has been developed and a prototype has been implemented for a use case scenario of emergency service.
Legaz-García, María Del Carmen; Dentler, Kathrin; Fernández-Breis, Jesualdo Tomás; Cornet, Ronald
2017-01-01
ArchMS is a framework that represents clinical information and knowledge using ontologies in OWL, which facilitates semantic interoperability and thereby the exploitation and secondary use of clinical data. However, it does not yet support the automated assessment of quality of care. CLIF is a stepwise method to formalize quality indicators. The method has been implemented in the CLIF tool which supports its users in generating computable queries based on a patient data model which can be based on archetypes. To enable the automated computation of quality indicators using ontologies and archetypes, we tested whether ArchMS and the CLIF tool can be integrated. We successfully automated the process of generating SPARQL queries from quality indicators that have been formalized with CLIF and integrated them into ArchMS. Hence, ontologies and archetypes can be combined for the execution of formalized quality indicators.
Ontology Development and Evolution in the Accident Investigation Domain
NASA Technical Reports Server (NTRS)
Carvalho, Robert; Berrios, Dan; Williams, James
2004-01-01
InvestigationOrganizer (IO) is a collaborative semantic web system designed to support the conduct of mishap investigations. IO provides a common repository for a wide range of mishap related information, allowing investigators to integrate evidence, causal models, and investigation results. IO has been used to support investigations ranging from a small property damage case to the loss of the Space Shuttle Columbia. Through IO's use in these investigations, we have learned significant lessons about the application of ontologies and semantic systems to solving real-world problems. This paper will describe the development of the ontology within IO, from the initial development, through its growth in response to user requests during use in investigations, to the recent work that was done to control the results of that growth. This paper will also describe the lessons learned from this experience and how they may apply to the implementation of future ontologies and semantic systems.
Gene Ontology: Pitfalls, Biases, and Remedies.
Gaudet, Pascale; Dessimoz, Christophe
2017-01-01
The Gene Ontology (GO) is a formidable resource, but there are several considerations about it that are essential to understand the data and interpret it correctly. The GO is sufficiently simple that it can be used without deep understanding of its structure or how it is developed, which is both a strength and a weakness. In this chapter, we discuss some common misinterpretations of the ontology and the annotations. A better understanding of the pitfalls and the biases in the GO should help users make the most of this very rich resource. We also review some of the misconceptions and misleading assumptions commonly made about GO, including the effect of data incompleteness, the importance of annotation qualifiers, and the transitivity or lack thereof associated with different ontology relations. We also discuss several biases that can confound aggregate analyses such as gene enrichment analyses. For each of these pitfalls and biases, we suggest remedies and best practices.
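The point about transitivity can be illustrated with a small sketch that follows only selected relation types when propagating a term upward. Everything here is a hypothetical toy: the edge list is invented for illustration and is not taken from the Gene Ontology, and the choice of which relations propagate (`PROPAGATING`) is an assumption of this sketch rather than a rule quoted from the chapter.

```python
# Hypothetical toy edges: (child term, relation, parent term).
EDGES = [
    ("mitotic spindle assembly", "is_a", "spindle assembly"),
    ("spindle assembly", "part_of", "cell division"),
    ("regulation of spindle assembly", "regulates", "spindle assembly"),
]
# Assumption for this sketch: only these relations are followed when propagating.
PROPAGATING = {"is_a", "part_of"}


def propagated_terms(term):
    """Return `term` plus every term reachable through propagating relations."""
    reached, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current in reached:
            continue
        reached.add(current)
        for child, relation, parent in EDGES:
            if child == current and relation in PROPAGATING:
                stack.append(parent)
    return reached
```

With these toy edges, an annotation to "mitotic spindle assembly" propagates to "spindle assembly" and "cell division", whereas an annotation to "regulation of spindle assembly" stays where it is because the `regulates` edge is not followed. Treating every edge as if it were `is_a` would silently merge the two cases, which is the kind of misinterpretation the chapter warns about.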
A Lexical-Ontological Resource for Consumer Healthcare
NASA Astrophysics Data System (ADS)
Cardillo, Elena; Serafini, Luciano; Tamilin, Andrei
In Consumer Healthcare Informatics it is still difficult for laypeople to find, understand and act on health information, due to the persistent communication gap between specialized medical terminology and that used by healthcare consumers. Furthermore, existing clinically-oriented terminologies cannot provide sufficient support when integrated into consumer-oriented applications, so there is a need to create consumer-friendly terminologies reflecting the different ways healthcare consumers express and think about health topics. Following this direction, this work suggests a way to support the design of an ontology-based system that mitigates this gap, using knowledge engineering and semantic web technologies. The system is based on the development of a consumer-oriented medical terminology that will be integrated with other medical domain ontologies and terminologies into a medical ontology repository. This will support consumer-oriented healthcare systems, such as Personal Health Records, by providing many knowledge services to help users in accessing and managing their healthcare data.
A Lexical-Ontological Resource for Consumer Healthcare
NASA Astrophysics Data System (ADS)
Cardillo, Elena
In Consumer Healthcare Informatics it is still difficult for laypersons to understand and act on health information, due to the persistent communication gap between specialized medical terminology and that used by healthcare consumers. Furthermore, existing clinically-oriented terminologies cannot provide sufficient support when integrated into consumer-oriented applications, so there is a need to create consumer-friendly terminologies reflecting the different ways healthcare consumers express and think about health topics. Following this direction, this work suggests a way to support the design of an ontology-based system that mitigates this gap, using knowledge engineering and Semantic Web technologies. The system is based on the development of a consumer-oriented medical terminology which will be integrated with other existing domain ontologies/terminologies into a medical ontology repository. This will support consumer-oriented healthcare systems by providing many knowledge services to help users in accessing and managing their healthcare data.
Mi, Huaiyu; Huang, Xiaosong; Muruganujan, Anushya; Tang, Haiming; Mills, Caitlin; Kang, Diane; Thomas, Paul D
2017-01-04
The PANTHER database (Protein ANalysis THrough Evolutionary Relationships, http://pantherdb.org) contains comprehensive information on the evolution and function of protein-coding genes from 104 completely sequenced genomes. PANTHER software tools allow users to classify new protein sequences, and to analyze gene lists obtained from large-scale genomics experiments. In the past year, major improvements include a large expansion of classification information available in PANTHER, as well as significant enhancements to the analysis tools. Protein subfamily functional classifications have more than doubled due to progress of the Gene Ontology Phylogenetic Annotation Project. For human genes (as well as a few other organisms), PANTHER now also supports enrichment analysis using pathway classifications from the Reactome resource. The gene list enrichment tools include a new 'hierarchical view' of results, enabling users to leverage the structure of the classifications/ontologies; the tools also allow users to upload genetic variant data directly, rather than requiring prior conversion to a gene list. The updated coding single-nucleotide polymorphisms (SNP) scoring tool uses an improved algorithm. The hidden Markov model (HMM) search tools now use HMMER3, dramatically reducing search times and improving accuracy of E-value statistics. Finally, the PANTHER Tree-Attribute Viewer has been implemented in JavaScript, with new views for exploring protein sequence evolution. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
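As a rough illustration of what a gene list enrichment test computes, the sketch below evaluates a hypergeometric over-representation p-value in Python. The counts and the function name `enrichment_p_value` are hypothetical, and the abstract does not state which statistic the PANTHER tools use, so this is a sketch under that assumption and not the tool's own method.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """Return P(X >= hits) when `sample` genes are drawn without replacement
    from `population` genes, of which `annotated` belong to the class."""
    total = comb(population, sample)
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20 reference genes, 5 annotated to a class,
# an uploaded list of 4 genes, 3 of which carry the annotation.
p = enrichment_p_value(population=20, annotated=5, sample=4, hits=3)
print(f"{p:.4f}")
```

A small p-value suggests the class is over-represented in the list; in practice a multiple-testing correction would follow, since many classes are tested at once.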
Ramzan, Asia; Wang, Hai; Buckingham, Christopher
2014-01-01
Clinical decision support systems (CDSSs) often base their knowledge and advice on human expertise. Knowledge representation needs to be in a format that can be easily understood by human users as well as supporting ongoing knowledge engineering, including evolution and consistency of knowledge. This paper reports on the development of an ontology specification for managing knowledge engineering in a CDSS for assessing and managing risks associated with mental-health problems. The Galatean Risk and Safety Tool, GRiST, represents mental-health expertise in the form of a psychological model of classification. The hierarchical structure was directly represented in the machine using an XML document. Functionality of the model and knowledge management were controlled using attributes in the XML nodes, with an accompanying paper manual for specifying how end-user tools should behave when interfacing with the XML. This paper explains the advantages of using the web-ontology language, OWL, as the specification, details some of the issues and problems encountered in translating the psychological model to OWL, and shows how OWL benefits knowledge engineering. The conclusions are that OWL can have an important role in managing complex knowledge domains for systems based on human expertise without impeding the end-users' understanding of the knowledge base. The generic classification model underpinning GRiST makes it applicable to many decision domains and the accompanying OWL specification facilitates its implementation.
Uciteli, Alexandr; Herre, Heinrich
2015-01-01
The specification of metadata in clinical and epidemiological study projects absorbs significant expense. The validity and quality of the collected data depend heavily on the precise and semantically correct representation of their metadata. In various research organizations, which are planning and coordinating studies, the required metadata are specified differently, depending on many conditions, e.g., on the study management software used. The latter does not always meet the needs of a particular research organization, e.g., with respect to the relevant metadata attributes and structuring possibilities. The objective of the research, set forth in this paper, is the development of a new approach for ontology-based representation and management of metadata. The basic features of this approach are demonstrated by the software tool OntoStudyEdit (OSE). The OSE is designed and developed according to the three-ontology method. This method for developing software is based on the interactions of three different kinds of ontologies: a task ontology, a domain ontology and a top-level ontology. The OSE can be easily adapted to different requirements, and it supports an ontologically founded representation and efficient management of metadata. The metadata specifications can be imported from various sources; they can be edited with the OSE, and they can be exported in/to several formats, which are used, e.g., by different study management software. Advantages of this approach are the adaptability of the OSE by integrating suitable domain ontologies, the ontological specification of mappings between the import/export formats and the DO, the specification of the study metadata in a uniform manner and its reuse in different research projects, and an intuitive data entry for non-expert users.
Hogan, William R; Wagner, Michael M; Brochhausen, Mathias; Levander, John; Brown, Shawn T; Millett, Nicholas; DePasse, Jay; Hanna, Josh
2016-08-18
We developed the Apollo Structured Vocabulary (Apollo-SV), an OWL2 ontology of phenomena in infectious disease epidemiology and population biology, as part of a project whose goal is to increase the use of epidemic simulators in public health practice. Apollo-SV defines a terminology for use in simulator configuration. Apollo-SV is the product of an ontological analysis of the domain of infectious disease epidemiology, with particular attention to the inputs and outputs of nine simulators. Apollo-SV contains 802 classes for representing the inputs and outputs of simulators, of which approximately half are new and half are imported from existing ontologies. The most important Apollo-SV class for users of simulators is infectious disease scenario, which is a representation of an ecosystem at simulator time zero that has at least one infection process (a class) affecting at least one population (also a class). Other important classes represent ecosystem elements (e.g., households), ecosystem processes (e.g., infection acquisition and infectious disease), censuses of ecosystem elements (e.g., censuses of populations), and infectious disease control measures. In the larger project, which created an end-user application that can send the same infectious disease scenario to multiple simulators, Apollo-SV serves as the controlled terminology and strongly influences the design of the message syntax used to represent an infectious disease scenario. As we added simulators for different pathogens (e.g., malaria and dengue), the core classes of Apollo-SV have remained stable, suggesting that our conceptualization of the information required by simulators is sound. Despite adhering to the OBO Foundry principle of orthogonality, we could not reuse Infectious Disease Ontology classes as the basis for infectious disease scenarios. We thus defined new classes in Apollo-SV for host, pathogen, infection, infectious disease, colonization, and infection acquisition.
Unlike IDO, our ontological analysis extended to existing mathematical models of key biological phenomena studied by infectious disease epidemiology and population biology. Our ontological analysis as expressed in Apollo-SV was instrumental in developing a simulator-independent representation of infectious disease scenarios that can be run on multiple epidemic simulators. Our experience suggests the importance of extending ontological analysis of a domain to include existing mathematical models of the phenomena studied by the domain. Apollo-SV is freely available at: http://purl.obolibrary.org/obo/apollo_sv.owl .
MIRO: guidelines for minimum information for the reporting of an ontology.
Matentzoglu, Nicolas; Malone, James; Mungall, Chris; Stevens, Robert
2018-01-18
Creation and use of ontologies has become a mainstream activity in many disciplines, in particular, the biomedical domain. Ontology developers often disseminate information about these ontologies in peer-reviewed ontology description reports. There appears to be, however, a high degree of variability in the content of these reports. Often, important details are omitted such that it is difficult to gain a sufficient understanding of the ontology, its content and method of creation. We propose the Minimum Information for Reporting an Ontology (MIRO) guidelines as a means to facilitate a higher degree of completeness and consistency between ontology documentation, including published papers, and ultimately a higher standard of report quality. A draft of the MIRO guidelines was circulated for public comment in the form of a questionnaire, and we subsequently collected 110 responses from ontology authors, developers, users and reviewers. We report on the feedback of this consultation, including comments on each guideline, and present our analysis on the relative importance of each MIRO information item. These results were used to update the MIRO guidelines, mainly by providing more detailed operational definitions of the individual items and assigning degrees of importance. Based on our revised version of MIRO, we conducted a review of 15 recently published ontology description reports from three important journals in the Semantic Web and Biomedical domain and analysed them for compliance with the MIRO guidelines. We found that only 41.38% of the information items were covered by the majority of the papers (and deemed important by the survey respondents) and a large number of important items are not covered at all, like those related to testing and versioning policies. 
We believe that the community-reviewed MIRO guidelines can contribute to improving significantly the quality of ontology description reports and other documentation, in particular by increasing consistent reporting of important ontology features that are otherwise often neglected.
Fish Ontology framework for taxonomy-based fish recognition
Ali, Najib M.; Khan, Haris A.; Then, Amy Y-Hui; Ving Ching, Chong; Gaur, Manas
2017-01-01
Life science ontologies play an important role in the Semantic Web. Given the diversity in fish species and the associated wealth of information, it is imperative to develop an ontology capable of linking and integrating this information in an automated fashion. As such, we introduce the Fish Ontology (FO), an automated classification architecture of existing fish taxa which provides taxonomic information on unknown fish based on metadata restrictions. It is designed to support knowledge discovery, provide semantic annotation of fish and fisheries resources, data integration, and information retrieval. Automated classification for unknown specimens is a unique feature that currently does not appear to exist in other known ontologies. Examples of automated classification for major groups of fish are demonstrated, showing the inferred information by introducing several restrictions at the species or specimen level. The current version of FO has 1,830 classes, includes widely used fisheries terminology, and models major aspects of fish taxonomy, grouping, and character. With more than 30,000 known fish species globally, the FO will be an indispensable tool for fish scientists and other interested users. PMID:28929028
Informatics in radiology: radiology gamuts ontology: differential diagnosis for the Semantic Web.
Budovec, Joseph J; Lam, Cesar A; Kahn, Charles E
2014-01-01
The Semantic Web is an effort to add semantics, or "meaning," to empower automated searching and processing of Web-based information. The overarching goal of the Semantic Web is to enable users to more easily find, share, and combine information. Critical to this vision are knowledge models called ontologies, which define a set of concepts and formalize the relations between them. Ontologies have been developed to manage and exploit the large and rapidly growing volume of information in biomedical domains. In diagnostic radiology, lists of differential diagnoses of imaging observations, called gamuts, provide an important source of knowledge. The Radiology Gamuts Ontology (RGO) is a formal knowledge model of differential diagnoses in radiology that includes 1674 differential diagnoses, 19,017 terms, and 52,976 links between terms. Its knowledge is used to provide an interactive, freely available online reference of radiology gamuts ( www.gamuts.net ). A Web service allows its content to be discovered and consumed by other information systems. The RGO integrates radiologic knowledge with other biomedical ontologies as part of the Semantic Web. © RSNA, 2014.
OrthoDB v8: update of the hierarchical catalog of orthologs and the underlying free software.
Kriventseva, Evgenia V; Tegenfeldt, Fredrik; Petty, Tom J; Waterhouse, Robert M; Simão, Felipe A; Pozdnyakov, Igor A; Ioannidis, Panagiotis; Zdobnov, Evgeny M
2015-01-01
Orthology, refining the concept of homology, is the cornerstone of evolutionary comparative studies. With the ever-increasing availability of genomic data, inference of orthology has become instrumental for generating hypotheses about gene functions crucial to many studies. This update of the OrthoDB hierarchical catalog of orthologs (http://www.orthodb.org) covers 3027 complete genomes, including the most comprehensive set of 87 arthropods, 61 vertebrates, 227 fungi and 2627 bacteria (sampling the most complete and representative genomes from over 11,000 available). In addition to the most extensive integration of functional annotations from UniProt, InterPro, GO, OMIM, model organism phenotypes and COG functional categories, OrthoDB uniquely provides evolutionary annotations including rates of ortholog sequence divergence, copy-number profiles, sibling groups and gene architectures. We re-designed the entirety of the OrthoDB website from the underlying technology to the user interface, enabling the user to specify species of interest and to select the relevant orthology level by the NCBI taxonomy. The text searches allow use of complex logic with various identifiers of genes, proteins, domains, ontologies or annotation keywords and phrases. Gene copy-number profiles can also be queried. This release comes with the freely available underlying ortholog clustering pipeline (http://www.orthodb.org/software). © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Topographic mapping data semantics through data conversion and enhancement: Chapter 7
Varanka, Dalia; Carter, Jonathan; Usery, E. Lynn; Shoberg, Thomas; Edited by Ashish, Naveen; Sheth, Amit P.
2011-01-01
This paper presents research on the semantics of topographic data for triples and ontologies to blend the capabilities of the Semantic Web and The National Map of the U.S. Geological Survey. Automated conversion of relational topographic data of several geographic sample areas to the triple data model standard resulted in relatively poor semantic associations. Further research employed vocabularies of feature type and spatial relation terms. A user interface was designed to model the capture of non-standard terms relevant to public users and to map those terms to existing data models of The National Map through the use of ontology. Server access for the study area triple stores was made publicly available, illustrating how the development of linked data may transform institutional policies to open government data resources to the public. This paper presents these data conversion and research techniques that were tested as open linked data concepts leveraged through a user-centered interface and open USGS server access to the public.
Speechlinks: Robust Cross-Lingual Tactical Communication Aids
2008-06-01
domain, the ontology based translation has proven to be challenging to build in this domain, however recent developments show promising results...assignments, and the effect of domain knowledge on those requirements. • Improving the front end of the speech recognizer remains one of the most challenging ...users by being very selective. 4.2.3.2 Analysis of the Normal user type inference result Figure 4.11 shows one of the most challenging users to
Annotation of phenotypic diversity: decoupling data curation and ontology curation using Phenex.
Balhoff, James P; Dahdul, Wasila M; Dececchi, T Alexander; Lapp, Hilmar; Mabee, Paula M; Vision, Todd J
2014-01-01
Phenex (http://phenex.phenoscape.org/) is a desktop application for semantically annotating the phenotypic character matrix datasets common in evolutionary biology. Since its initial publication, we have added new features that address several major bottlenecks in the efficiency of the phenotype curation process: allowing curators during the data curation phase to provisionally request terms that are not yet available from a relevant ontology; supporting quality control against annotation guidelines to reduce later manual review and revision; and enabling the sharing of files for collaboration among curators. We decoupled data annotation from ontology development by creating an Ontology Request Broker (ORB) within Phenex. Curators can use the ORB to request a provisional term for use in data annotation; the provisional term can be automatically replaced with a permanent identifier once the term is added to an ontology. We added a set of annotation consistency checks to prevent common curation errors, reducing the need for later correction. We facilitated collaborative editing by improving the reliability of Phenex when used with online folder sharing services, via file change monitoring and continual autosave. With the addition of these new features, and in particular the Ontology Request Broker, Phenex users have been able to focus more effectively on data annotation. Phenoscape curators using Phenex have reported a smoother annotation workflow, with much reduced interruptions from ontology maintenance and file management issues.
Sankar, Punnaivanam; Alain, Krief; Aghila, Gnanasekaran
2010-05-24
We have developed a model structure-editing tool, ChemEd, programmed in Java, which allows drawing chemical structures on a graphical user interface (GUI) by selecting appropriate structural fragments defined in a fragment library. The terms representing the structural fragments are organized in a fragment ontology to provide conceptual support. ChemEd describes the chemical structure in an XML document (ChemFul) with rich semantics explicitly encoding the details of the chemical bonding, the hybridization status, and the electron environment around each atom. The document can be further processed through suitable algorithms and with the support of external chemical ontologies to generate understandable reports about the functional groups present in the structure and their specific environment.
NASA Astrophysics Data System (ADS)
Wood, Chris
2016-04-01
Under the Marine Strategy Framework Directive (MSFD), EU Member States are mandated to achieve or maintain 'Good Environmental Status' (GES) in their marine areas by 2020, through a series of Programmes of Measures (PoMs). The Celtic Seas Partnership (CSP), an EU LIFE+ project, aims to support policy makers, special-interest groups, users of the marine environment, and other interested stakeholders on MSFD implementation in the Celtic Seas geographical area. As part of this support, a metadata portal has been built to provide a signposting service to datasets that are relevant to MSFD within the Celtic Seas. To ensure that the metadata has the widest possible reach, a linked data approach was employed to construct the database. Although the metadata are stored in a traditional RDBMS, the metadata are exposed as linked data via the D2RQ platform, allowing virtual RDF graphs to be generated. SPARQL queries can be executed against the end-point allowing any user to manipulate the metadata. D2RQ's mapping language, based on Turtle, was used to map a wide range of relevant ontologies to the metadata (e.g. the Provenance Ontology (prov-o), Ocean Data Ontology (odo), Dublin Core Elements and Terms (dc & dcterms), Friend of a Friend (foaf), and Geospatial ontologies (geo)) allowing users to browse the metadata, either via SPARQL queries or by using D2RQ's HTML interface. The metadata were further enhanced by mapping relevant parameters to the NERC Vocabulary Server, itself built on a SPARQL endpoint. Additionally, a custom web front-end was built to enable users to browse the metadata and express queries through an intuitive graphical user interface that requires no prior knowledge of SPARQL.
As well as providing means to browse the data via MSFD-related parameters (Descriptor, Criteria, and Indicator), the metadata records include the dataset's country of origin, the list of organisations involved in the management of the data, and links to any relevant INSPIRE-compliant services relating to the dataset. The web front-end therefore enables users to effectively filter, sort, or search the metadata. As the MSFD timeline requires Member States to review their progress on achieving or maintaining GES every six years, the timely development of this metadata portal will not only aid interested stakeholders in understanding how member states are meeting their targets, but also shows how linked data can be used effectively to support policy makers and associated legislative bodies.
DMTO: a realistic ontology for standard diabetes mellitus treatment.
El-Sappagh, Shaker; Kwak, Daehan; Ali, Farman; Kwak, Kyung-Sup
2018-02-06
Treatment of type 2 diabetes mellitus (T2DM) is a complex problem. A clinical decision support system (CDSS) based on massive and distributed electronic health record data can facilitate the automation of this process and enhance its accuracy. The most important component of any CDSS is its knowledge base. This knowledge base can be formulated using ontologies. The formal description logic of ontology supports the inference of hidden knowledge. Building a complete, coherent, consistent, interoperable, and sharable ontology is a challenge. This paper introduces the first version of the newly constructed Diabetes Mellitus Treatment Ontology (DMTO) as a basis for shared-semantics, domain-specific, standard, machine-readable, and interoperable knowledge relevant to T2DM treatment. It is a comprehensive ontology and provides the highest coverage and the most complete picture of coded knowledge about T2DM patients' current conditions, previous profiles, and T2DM-related aspects, including complications, symptoms, lab tests, interactions, treatment plan (TP) frameworks, and glucose-related diseases and medications. It adheres to the design principles recommended by the Open Biomedical Ontologies Foundry and is based on ontological realism that follows the principles of the Basic Formal Ontology and the Ontology for General Medical Science. DMTO is implemented under Protégé 5.0 in Web Ontology Language (OWL) 2 format and is publicly available through the National Center for Biomedical Ontology's BioPortal at http://bioportal.bioontology.org/ontologies/DMTO . The current version of DMTO includes more than 10,700 classes, 277 relations, 39,425 annotations, 214 semantic rules, and 62,974 axioms. We provide proof of concept for this approach to modeling TPs. The ontology is able to collect and analyze most features of T2DM as well as customize chronic TPs with the most appropriate drugs, foods, and physical exercises. 
DMTO is ready to be used as a knowledge base for semantically intelligent and distributed CDSS systems.
Spatial information semantic query based on SPARQL
NASA Astrophysics Data System (ADS)
Xiao, Zhifeng; Huang, Lei; Zhai, Xiaofang
2009-10-01
How can the efficiency of spatial information inquiries be enhanced in today's fast-growing information age? We are rich in geospatial data but poor in up-to-date geospatial information and knowledge that are ready to be accessed by public users. This paper adopts an approach to querying spatial semantics by building an ontology in Web Ontology Language (OWL) format and using the SPARQL Protocol and RDF Query Language (SPARQL) to search spatial semantic relations. Establishing spatial semantics that support effective spatial reasoning is important for performing semantic queries. In contrast to earlier keyword-based and information retrieval techniques that rely on syntax, we use semantic approaches in our spatial query system. Semantic approaches need to be built on an ontology, so we use OWL to describe spatial information extracted from the large-scale map of Wuhan. Spatial information expressed by an ontology with formal semantics is available to machines for processing and to people for understanding. The approach is illustrated by a case study that uses SPARQL to query geo-spatial ontology instances of Wuhan. The paper shows that using SPARQL to search OWL ontology instances can ensure the accuracy and applicability of the results. The results also indicate that constructing a geo-spatial semantic query system has positive effects on forming spatial queries and on retrieval.
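To make the idea of searching spatial semantic relations concrete, here is a minimal Python sketch of how a SPARQL basic graph pattern is matched against ontology instances. It avoids any RDF library and uses a tiny in-memory triple set instead; the `ex:` names, the `ex:adjacentTo` relation and the sample places are all invented for illustration and are not taken from the paper's Wuhan ontology.

```python
# Toy triple store; every IRI-like name below is made up for illustration.
TRIPLES = {
    ("ex:EastLake", "rdf:type", "ex:Lake"),
    ("ex:WuhanUniversity", "rdf:type", "ex:University"),
    ("ex:WuhanUniversity", "ex:adjacentTo", "ex:EastLake"),
    ("ex:YellowCraneTower", "rdf:type", "ex:Landmark"),
}

# The SPARQL query this sketch imitates (kept as a string for reference only).
SPARQL = """
SELECT ?place WHERE {
  ?place ex:adjacentTo ?lake .
  ?lake  rdf:type      ex:Lake .
}
"""


def match(pattern, binding):
    """Yield extended bindings for one triple pattern; '?x' terms are variables."""
    for triple in TRIPLES:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def query(patterns):
    """Join the triple patterns left to right, as a basic graph pattern would."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b for old in bindings for b in match(pattern, old)]
    return bindings


results = query([("?place", "ex:adjacentTo", "?lake"), ("?lake", "rdf:type", "ex:Lake")])
places = sorted(b["?place"] for b in results)
print(places)
```

A real system would hand the query string to a SPARQL engine; the point of the sketch is only that the answer follows from relations in the data, not from keyword matching.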
Ontology-based geospatial data query and integration
Zhao, T.; Zhang, C.; Wei, M.; Peng, Z.-R.
2008-01-01
Geospatial data sharing is an increasingly important subject as large amounts of data are produced by a variety of sources, stored in incompatible formats, and accessible through different GIS applications. Past efforts to enable sharing have produced standardized data formats such as GML and data access protocols such as Web Feature Service (WFS). While these standards help enable client applications to gain access to heterogeneous data stored in different formats from diverse sources, the usability of the access is limited due to the lack of data semantics encoded in the WFS feature types. Past research has used ontology languages to describe the semantics of geospatial data, but ontology-based queries cannot be applied directly to legacy data stored in databases or shapefiles, or to feature data in WFS services. This paper presents a method to enable ontology queries on spatial data available from WFS services and on data stored in databases. We do not create ontology instances explicitly and thus avoid the problems of data replication. Instead, user queries are rewritten to WFS getFeature requests and SQL queries to the database. The method also has the benefit of being able to utilize existing tools of databases, WFS, and GML while enabling queries based on ontology semantics. © 2008 Springer-Verlag Berlin Heidelberg.
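The rewriting step described above can be sketched in Python as follows. The class-to-feature-type mapping, the `onto:` and `hydro:` names, the example endpoint and the function `rewrite_to_getfeature` are all hypothetical; only the key-value parameters (`service`, `version`, `request`, `typeName`, `maxFeatures`) follow the WFS 1.1.0 GetFeature convention. The paper's actual rewriting rules are not given in the abstract, so this is an assumption-based sketch.

```python
from urllib.parse import urlencode

# Hypothetical mapping from ontology classes to WFS feature types.
CLASS_TO_FEATURE_TYPE = {
    "onto:River": "hydro:rivers",
    "onto:Road": "transport:roads",
}


def rewrite_to_getfeature(base_url: str, onto_class: str, max_features: int = 10) -> str:
    """Rewrite a request for instances of an ontology class into a GetFeature URL."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": CLASS_TO_FEATURE_TYPE[onto_class],
        "maxFeatures": str(max_features),
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; no request is sent, the URL is only constructed.
url = rewrite_to_getfeature("http://example.org/wfs", "onto:River")
print(url)
```

Because the query is translated at request time, no ontology instances have to be materialized, which is the replication problem the abstract says the method avoids.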
DOE Office of Scientific and Technical Information (OSTI.GOV)
Diehl, Alexander D.; Meehan, Terrence F.; Bradford, Yvonne M.
Background: The Cell Ontology (CL) is an OBO Foundry candidate ontology covering the domain of canonical, natural biological cell types. Since its inception in 2005, the CL has undergone multiple rounds of revision and expansion, most notably in its representation of hematopoietic cells. For in vivo cells, the CL focuses on vertebrates but provides general classes that can be used for other metazoans, which can be subtyped in species-specific ontologies. Construction and content: Recent work on the CL has focused on extending the representation of various cell types, and developing new modules in the CL itself, and in related ontologies in coordination with the CL. For example, the Kidney and Urinary Pathway Ontology was used as a template to populate the CL with additional cell types. In addition, subtypes of the class 'cell in vitro' have received improved definitions and labels to provide for modularity with the representation of cells in the Cell Line Ontology and Reagent Ontology. Recent changes in the ontology development methodology for CL include a switch from OBO to OWL for the primary encoding of the ontology, and an increasing reliance on logical definitions for improved reasoning. Utility and discussion: The CL is now mandated as a metadata standard for large functional genomics and transcriptomics projects, and is used extensively for annotation, querying, and analyses of cell type specific data in sequencing consortia such as FANTOM5 and ENCODE, as well as for the NIAID ImmPort database and the Cell Image Library. The CL is also a vital component used in the modular construction of other biomedical ontologies; for example, the Gene Ontology and the cross-species anatomy ontology, Uberon, use CL to support the consistent representation of cell types across different levels of anatomical granularity, such as tissues and organs.
Conclusions: The ongoing improvements to the CL make it a valuable resource to both the OBO Foundry community and the wider scientific community, and we continue to experience increased interest in the CL both among developers and within the user community.
OLS Client and OLS Dialog: Open Source Tools to Annotate Public Omics Datasets.
Perez-Riverol, Yasset; Ternent, Tobias; Koch, Maximilian; Barsnes, Harald; Vrousgou, Olga; Jupp, Simon; Vizcaíno, Juan Antonio
2017-10-01
The availability of user-friendly software to annotate biological datasets and experimental details is becoming essential in data management practices, both in local storage systems and in public databases. The Ontology Lookup Service (OLS, http://www.ebi.ac.uk/ols) is a popular centralized service to query, browse and navigate biomedical ontologies and controlled vocabularies. Recently, the OLS framework has been completely redeveloped (version 3.0), including enhancements in the data model, like the added support for Web Ontology Language based ontologies, among many other improvements. However, the new OLS is not backwards compatible and new software tools are needed to enable access to this widely used framework now that the previous version is no longer available. We here present the OLS Client as a free, open-source Java library to retrieve information from the new version of the OLS. It enables rapid tool creation by providing a robust, pluggable programming interface and common data model to programmatically access the OLS. The library has already been integrated and is routinely used by several bioinformatics resources and related data annotation tools. Secondly, we also introduce an updated version of the OLS Dialog (version 2.0), a Java graphical user interface that can be easily plugged into Java desktop applications to access the OLS. The software and related documentation are freely available at https://github.com/PRIDE-Utilities/ols-client and https://github.com/PRIDE-Toolsuite/ols-dialog. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool.
Mazandu, Gaston K; Chimusa, Emile R; Mbiyavanga, Mamana; Mulder, Nicola J
2016-02-01
Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Linux using Python under free software (GNU General Public Licence). Contact: gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
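For readers unfamiliar with information content-based measures, the following Python sketch computes one well-known member of that family, Resnik similarity, on a toy term graph. The term names, the graph and the annotation counts are invented, and the code does not use or reproduce the A-DaGO-Fun API; it also assumes each gene is annotated to exactly one term, which simplifies the counting.

```python
from math import log

# Hypothetical mini-DAG: term -> list of parent terms.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}
# Hypothetical number of genes annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "binding": 2, "catalysis": 4, "dna_binding": 1, "rna_binding": 1}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(t) = -log p(t), where p(t) counts annotations to t and its descendants."""
    total = sum(DIRECT_COUNTS.values())
    cumulative = sum(n for t, n in DIRECT_COUNTS.items() if term in ancestors(t))
    return -log(cumulative / total)


def resnik(a, b):
    """Resnik similarity: IC of the most informative common ancestor."""
    return max(information_content(t) for t in ancestors(a) & ancestors(b))


print(resnik("dna_binding", "rna_binding"))
print(resnik("dna_binding", "catalysis"))
```

Two terms that share only the root get a similarity of zero, because the root carries no information; terms under a rarer common ancestor score higher.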
NASA Astrophysics Data System (ADS)
Liu, Jia; Liu, Longli; Xue, Yong; Dong, Jing; Hu, Yingcui; Hill, Richard; Guang, Jie; Li, Chi
2017-01-01
Workflow for remote sensing quantitative retrieval is the "bridge" between Grid services and Grid-enabled application of remote sensing quantitative retrieval. Workflow averts low-level implementation details of the Grid and hence enables users to focus on higher levels of application. The workflow for remote sensing quantitative retrieval plays an important role in remote sensing Grid and Cloud computing services, which can support the modelling, construction and implementation of large-scale complicated applications of remote sensing science. The validation of workflow is important in order to support the large-scale sophisticated scientific computation processes with enhanced performance and to minimize potential waste of time and resources. To research the semantic correctness of user-defined workflows, in this paper, we propose a workflow validation method based on tacit knowledge research in the remote sensing domain. We first discuss the remote sensing model and metadata. Through detailed analysis, we then discuss the method of extracting the domain tacit knowledge and expressing the knowledge with ontology. Additionally, we construct the domain ontology with Protégé. Through our experimental study, we verify the validity of this method in two ways, namely data source consistency error validation and parameters matching error validation.
Classification of user interfaces for graph-based online analytical processing
NASA Astrophysics Data System (ADS)
Michaelis, James R.
2016-05-01
In the domain of business intelligence, user-oriented software for conducting multidimensional analysis via Online Analytical Processing (OLAP) is now commonplace. In this setting, datasets commonly have well-defined sets of dimensions and measures around which analysis tasks can be conducted. However, many forms of data used in intelligence operations - deriving from social networks, online communications, and text corpora - will consist of graphs with varying forms of potential dimensional structure. Hence, enabling OLAP over such data collections requires explicit definition and extraction of supporting dimensions and measures. Further, as Graph OLAP remains an emerging technique, limited research has been done on its user interface requirements, namely on the effective pairing of interface designs with different types of graph-derived dimensions and measures. This paper presents a novel technique for pairing user interface designs with Graph OLAP datasets, rooted in Analytic Hierarchy Process (AHP) driven comparisons. Attributes of the classification strategy are encoded through an AHP ontology, developed in our alternate work and extended to support pairwise comparison of interfaces, specifically according to their ability, as perceived by Subject Matter Experts (SMEs), to support dimensions and measures corresponding to Graph OLAP dataset attributes. To frame this discussion, a survey is provided both on existing variations of Graph OLAP and on existing interface designs previously applied in multidimensional analysis settings. Following this, a review of our AHP ontology is provided, along with a listing of corresponding dataset and interface attributes applicable toward SME recommendation structuring. A walkthrough of AHP-based recommendation encoding via the ontology-based approach is then provided. The paper concludes with a short summary of proposed future directions seen as essential for this research area.
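For readers unfamiliar with AHP, the sketch below shows how pairwise comparison judgments are typically turned into priority weights. It uses the row geometric-mean approximation rather than the principal-eigenvector computation, and the three-alternative matrix is a made-up example; nothing here is taken from the paper's ontology or its expert data.

```python
import math


def ahp_priorities(matrix):
    """Approximate AHP priority weights with the row geometric-mean method.

    matrix[i][j] says how strongly alternative i is preferred over j;
    a reciprocal matrix has matrix[j][i] == 1 / matrix[i][j].
    """
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical expert judgments comparing three interface designs
# for one dataset attribute (values invented for illustration).
PAIRWISE = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights = ahp_priorities(PAIRWISE)
```

The resulting weights sum to 1 and rank the first design highest; for a perfectly consistent matrix the geometric-mean weights coincide with the eigenvector weights.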
The Semantic Web: From Representation to Realization
NASA Astrophysics Data System (ADS)
Thórisson, Kristinn R.; Spivack, Nova; Wissner, James M.
A semantically-linked web of electronic information - the Semantic Web - promises numerous benefits including increased precision in automated information sorting, searching, organizing and summarizing. Realizing this requires significantly more reliable meta-information than is readily available today. It also requires a better way to represent information that supports unified management of diverse data and diverse manipulation methods: from basic keywords to various types of artificial intelligence, to the highest level of intelligent manipulation - the human mind. How this is best done is far from obvious. Relying solely on hand-crafted annotation and ontologies, or solely on artificial intelligence techniques, seems less likely to succeed than a combination of the two. In this paper we describe an integrated, complete solution to these challenges that has already been implemented and tested with hundreds of thousands of users. It is based on an ontological representational level we call SemCards that combines ontological rigour with flexible user interface constructs. SemCards are machine- and human-readable digital entities that allow non-experts to create and use semantic content, while empowering machines to better assist and participate in the process. SemCards enable users to easily create semantically-grounded data that in turn acts as examples for automation processes, creating a positive iterative feedback loop of metadata creation and refinement between user and machine. They provide a holistic solution to the Semantic Web, supporting powerful management of the full lifecycle of data, including its creation, retrieval, classification, sorting and sharing. We have implemented the SemCard technology on the semantic Web site Twine.com, showing that the technology is indeed versatile and scalable. Here we present the key ideas behind SemCards and describe the initial implementation of the technology.
The Cell Ontology 2016: enhanced content, modularization, and ontology interoperability
Diehl, Alexander D.; Meehan, Terrence F.; Bradford, Yvonne M.; ...
2016-07-04
Background: The Cell Ontology (CL) is an OBO Foundry candidate ontology covering the domain of canonical, natural biological cell types. Since its inception in 2005, the CL has undergone multiple rounds of revision and expansion, most notably in its representation of hematopoietic cells. For in vivo cells, the CL focuses on vertebrates but provides general classes that can be used for other metazoans, which can be subtyped in species-specific ontologies. Construction and content: Recent work on the CL has focused on extending the representation of various cell types, and developing new modules in the CL itself, and in related ontologies in coordination with the CL. For example, the Kidney and Urinary Pathway Ontology was used as a template to populate the CL with additional cell types. In addition, subtypes of the class 'cell in vitro' have received improved definitions and labels to provide for modularity with the representation of cells in the Cell Line Ontology and Reagent Ontology. Recent changes in the ontology development methodology for CL include a switch from OBO to OWL for the primary encoding of the ontology, and an increasing reliance on logical definitions for improved reasoning. Utility and discussion: The CL is now mandated as a metadata standard for large functional genomics and transcriptomics projects, and is used extensively for annotation, querying, and analyses of cell type specific data in sequencing consortia such as FANTOM5 and ENCODE, as well as for the NIAID ImmPort database and the Cell Image Library. The CL is also a vital component used in the modular construction of other biomedical ontologies; for example, the Gene Ontology and the cross-species anatomy ontology, Uberon, use CL to support the consistent representation of cell types across different levels of anatomical granularity, such as tissues and organs.
Conclusions: The ongoing improvements to the CL make it a valuable resource to both the OBO Foundry community and the wider scientific community, and we continue to experience increased interest in the CL both among developers and within the user community.
Tcheremenskaia, Olga; Benigni, Romualdo; Nikolova, Ivelina; Jeliazkova, Nina; Escher, Sylvia E; Batke, Monika; Baier, Thomas; Poroikov, Vladimir; Lagunin, Alexey; Rautenberg, Micha; Hardy, Barry
2012-04-24
The OpenTox Framework, developed by the partners in the OpenTox project (http://www.opentox.org), aims at providing a unified access to toxicity data, predictive models and validation procedures. Interoperability of resources is achieved using a common information model, based on the OpenTox ontologies, describing predictive algorithms, models and toxicity data. As toxicological data may come from different, heterogeneous sources, a deployed ontology, unifying the terminology and the resources, is critical for the rational and reliable organization of the data, and its automatic processing. The following related ontologies have been developed for OpenTox: a) Toxicological ontology - listing the toxicological endpoints; b) Organs system and Effects ontology - addressing organs, targets/examinations and effects observed in in vivo studies; c) ToxML ontology - representing semi-automatic conversion of the ToxML schema; d) OpenTox ontology - representation of OpenTox framework components: chemical compounds, datasets, types of algorithms, models and validation web services; e) ToxLink-ToxCast assays ontology and f) OpenToxipedia community knowledge resource on toxicology terminology. OpenTox components are made available through standardized REST web services, where every compound, data set, and predictive method has a unique resolvable address (URI), used to retrieve its Resource Description Framework (RDF) representation, or to initiate the associated calculations and generate new RDF-based resources. The services support the integration of toxicity and chemical data from various sources, the generation and validation of computer models for toxic effects, seamless integration of new algorithms and scientifically sound validation routines and provide a flexible framework, which allows building arbitrary number of applications, tailored to solving different problems by end users (e.g. toxicologists).
The OpenTox toxicological ontology projects may be accessed via the OpenTox ontology development page http://www.opentox.org/dev/ontology; the OpenTox ontology is available as OWL at http://opentox.org/api/1 1/opentox.owl, the ToxML - OWL conversion utility is an open source resource available at http://ambit.svn.sourceforge.net/viewvc/ambit/branches/toxml-utils/
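The idea of a resolvable URI that returns an RDF representation can be sketched with HTTP content negotiation. The example below only builds the request and does not send it; the resource URI is a placeholder, and the choice of the `application/rdf+xml` media type is an assumption about what a given service would accept, not something stated in the abstract.

```python
import urllib.request


def build_rdf_request(resource_uri):
    """Build (but do not send) a GET request that asks for RDF/XML."""
    return urllib.request.Request(
        resource_uri,
        headers={"Accept": "application/rdf+xml"},
        method="GET",
    )


# Hypothetical resource URI, used only for illustration.
request = build_rdf_request("http://example.org/compound/1")

# To actually fetch the representation from a live service, one would call:
#     with urllib.request.urlopen(request) as response:
#         rdf_xml = response.read()
```

Keeping request construction separate from sending makes the sketch usable without network access; the returned document could then be handed to any RDF parser.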
Data-driven Ontology Development: A Case Study at NASA's Atmospheric Science Data Center
NASA Astrophysics Data System (ADS)
Hertz, J.; Huffer, E.; Kusterer, J.
2012-12-01
Well-founded ontologies are key to enabling transformative semantic technologies and accelerating scientific research. One example is semantically enabled search and discovery, making scientific data accessible and more understandable by accurately modeling a complex domain. The ontology creation process remains a challenge for many anxious to pursue semantic technologies. The key may be that the creation process -- whether formal, community-based, automated or semi-automated -- should encompass not only a foundational core and supplemental resources but also a focus on the purpose or mission the ontology is created to support. Are there tools or processes to de-mystify, assess or enhance the resulting ontology? We suggest that comparison and analysis of a domain-focused ontology can be made using text engineering tools for information extraction, tokenizers, named entity transducers and others. The results are analyzed to ensure the ontology reflects the core purpose of the domain's mission and that the ontology integrates and describes the supporting data in the language of the domain - how the science is analyzed and discussed among all users of the data. Commonalities and relationships among domain resources describing the Clouds and the Earth's Radiant Energy System (CERES) Bi-Directional Scan (BDS) datasets from NASA's Atmospheric Science Data Center are compared. The domain resources include: a formal ontology created for CERES; scientific works such as papers, conference proceedings and notes; information extracted from the datasets (i.e., header metadata); and BDS scientific documentation (Algorithm Theoretical Basis Documents, collection guides, data quality summaries and others). These resources are analyzed using the open source software General Architecture for Text Engineering, a mature framework for computational tasks involving human language.
PhysiomeSpace: digital library service for biomedical data
Testi, Debora; Quadrani, Paolo; Viceconti, Marco
2010-01-01
Every research laboratory has a wealth of biomedical data locked up, which, if shared with other experts, could dramatically improve biomedical and healthcare research. With the PhysiomeSpace service, it is now possible with a few clicks to share with selected users biomedical data in an easy, controlled and safe way. The digital library service is managed using a client–server approach. The client application is used to import, fuse and enrich the data information according to the PhysiomeSpace resource ontology and upload/download the data to the library. The server services are hosted on the Biomed Town community portal, where through a web interface, the user can complete the metadata curation and share and/or publish the data resources. A search service capitalizes on the domain ontology and on the enrichment of metadata for each resource, providing a powerful discovery environment. Once the users have found the data resources they are interested in, they can add them to their basket, following a metaphor popular in e-commerce web sites. When all the necessary resources have been selected, the user can download the basket contents into the client application. The digital library service is now in beta and open to the biomedical research community. PMID:20478910
PhysiomeSpace: digital library service for biomedical data.
Testi, Debora; Quadrani, Paolo; Viceconti, Marco
2010-06-28
Every research laboratory has a wealth of biomedical data locked up, which, if shared with other experts, could dramatically improve biomedical and healthcare research. With the PhysiomeSpace service, it is now possible with a few clicks to share with selected users biomedical data in an easy, controlled and safe way. The digital library service is managed using a client-server approach. The client application is used to import, fuse and enrich the data information according to the PhysiomeSpace resource ontology and upload/download the data to the library. The server services are hosted on the Biomed Town community portal, where through a web interface, the user can complete the metadata curation and share and/or publish the data resources. A search service capitalizes on the domain ontology and on the enrichment of metadata for each resource, providing a powerful discovery environment. Once the users have found the data resources they are interested in, they can add them to their basket, following a metaphor popular in e-commerce web sites. When all the necessary resources have been selected, the user can download the basket contents into the client application. The digital library service is now in beta and open to the biomedical research community.
Schweitzer, M; Lasierra, N; Hoerbst, A
2015-01-01
Increasing flexibility from the user's perspective and enabling workflow-based interaction facilitates easy, user-friendly use of EHRs in healthcare professionals' daily work. To offer such versatile EHR functionality, our approach is based on the execution of clinical workflows by means of a composition of semantic web services. The backbone of such an architecture is an ontology that makes it possible to represent clinical workflows and facilitates the selection of suitable services. In this paper we present the methods and results of observations of routine diabetes consultations, which were conducted in order to identify those workflows and the relations among the included tasks. The workflows were first modeled in BPMN and then generalized. As a next step in our study, interviews will be conducted with clinical personnel to validate the modeled workflows.
Algorithmic and user study of an autocompletion algorithm on a large medical vocabulary.
Sevenster, Merlijn; van Ommering, Rob; Qian, Yuechen
2012-02-01
Autocompletion supports human-computer interaction in software applications that let users enter textual data. We are motivated by the use case in which medical professionals enter ontology concepts, catering to the ongoing demand for structured and standardized data in medicine. Our first goal is to give an algorithmic analysis of one particular autocompletion algorithm, called the multi-prefix matching algorithm, which suggests terms whose words' prefixes contain all words in the string typed by the user; in this sense, "opt ner me" matches "optic nerve meningioma". Second, we aim to investigate how well it supports users entering concepts from a large and comprehensive medical vocabulary (SNOMED CT). We give a concise description of the multi-prefix algorithm and sketch how it can be optimized to meet the required response time. Performance is compared to a baseline algorithm, which gives suggestions that extend the string typed by the user to the right; e.g., "optic nerve m" gives "optic nerve meningioma", but "opt ner me" does not. We conduct a user experiment in which 12 participants are invited to complete 40 SNOMED CT terms with the baseline algorithm and another set of 40 SNOMED CT terms with the multi-prefix algorithm. Our results show that users need significantly fewer keystrokes when supported by the multi-prefix algorithm than when supported by the baseline algorithm. The proposed algorithm is a competitive candidate for searching and retrieving terms from a large medical ontology. Copyright © 2011 Elsevier Inc. All rights reserved.
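The difference between the two matching rules can be illustrated with a few lines of code. This is a naive sketch based only on the description in the abstract, not the authors' optimized implementation: the three-term vocabulary is invented, matching is done by a linear scan, and a typed word may match any word of the term (the same term word may serve more than one typed word).

```python
def multi_prefix_match(query, term):
    """True if every typed word is a prefix of some word in the term."""
    term_words = term.lower().split()
    return all(
        any(word.startswith(token) for word in term_words)
        for token in query.lower().split()
    )


def baseline_match(query, term):
    """True if the term extends the typed string to the right."""
    return term.lower().startswith(query.lower())


# Tiny illustrative vocabulary (not an extract of SNOMED CT).
TERMS = [
    "optic nerve meningioma",
    "optic neuritis",
    "meningioma of spinal cord",
]


def suggest(query, matcher):
    """Return all vocabulary terms accepted by the given matcher."""
    return [term for term in TERMS if matcher(query, term)]
```

With this vocabulary, `suggest("opt ner me", multi_prefix_match)` returns only "optic nerve meningioma", while `suggest("opt ner me", baseline_match)` returns nothing; the baseline needs the longer input "optic nerve m" to reach the same term, which is the keystroke saving the study measures.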
AIM: a personal view of where I have been and where we might be going.
Rector, A
2001-08-01
My own career in medical informatics and AI in medicine has oscillated between concerns with medical records and concerns with knowledge representation with decision support as a pivotal integrating issue. It has focused on using AI to organise information and reduce 'muddle' and improve the user interfaces to produce 'useful and usable systems' to help doctors with a 'humanly impossible task'. Increasingly knowledge representation and ontologies have become the fulcrum for orchestrating re-use of information and integration of systems. Encouragingly, the dilemma between computational tractability and expressiveness is lessening, and ontologies and description logics are joining the mainstream both in AI in Medicine and in Intelligent Information Management generally. It has been shown possible to scale up ontologies to meet medical needs, and increasingly ontologies are playing a key role in meeting the requirements to scale up the complexity of clinical systems to meet the ever increasing demands brought about by new emphasis on reduction of errors, clinical accountability, and the explosion of knowledge on the Web.
The neuron classification problem
Bota, Mihail; Swanson, Larry W.
2007-01-01
A systematic account of neuron cell types is a basic prerequisite for determining the vertebrate nervous system global wiring diagram. With comprehensive lineage and phylogenetic information unavailable, a general ontology based on structure-function taxonomy is proposed and implemented in a knowledge management system, and a prototype analysis of select regions (including retina, cerebellum, and hypothalamus) presented. The supporting Brain Architecture Knowledge Management System (BAMS) Neuron ontology is online and its user interface allows queries about terms and their definitions, classification criteria based on the original literature and “Petilla Convention” guidelines, hierarchies, and relations—with annotations documenting each ontology entry. Combined with three BAMS modules for neural regions, connections between regions and neuron types, and molecules, the Neuron ontology provides a general framework for physical descriptions and computational modeling of neural systems. The knowledge management system interacts with other web resources, is accessible in both XML and RDF/OWL, is extendible to the whole body, and awaits large-scale data population requiring community participation for timely implementation. PMID:17582506
Phenex: ontological annotation of phenotypic diversity.
Balhoff, James P; Dahdul, Wasila M; Kothari, Cartik R; Lapp, Hilmar; Lundberg, John G; Mabee, Paula; Midford, Peter E; Westerfield, Monte; Vision, Todd J
2010-05-05
Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge. Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices. Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.
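An Entity-Quality annotation pairs an anatomical entity term with a phenotypic quality term and attaches the pair to a taxon. The sketch below shows one possible in-memory shape for such a record. It is not Phenex's data model; the class names, the identifiers, and the labels are placeholders invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    term_id: str  # placeholder identifier, not a real ontology ID
    label: str


@dataclass(frozen=True)
class EQAnnotation:
    taxon: Term    # term from a taxonomy ontology
    entity: Term   # term from an anatomy ontology
    quality: Term  # term from a phenotypic quality ontology

    def as_text(self):
        """Render the annotation as a short human-readable statement."""
        return f"{self.taxon.label}: {self.entity.label} is {self.quality.label}"


# Hypothetical character state recorded for one taxon.
annotation = EQAnnotation(
    taxon=Term("TAXON:0000001", "Example species"),
    entity=Term("ANATOMY:0000001", "dorsal fin"),
    quality=Term("QUALITY:0000001", "absent"),
)
```

Because each field is an ontology term rather than free text, annotations from different studies that use the same identifiers can be compared or grouped directly.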
Ontology for the asexual development and anatomy of the colonial chordate Botryllus schlosseri.
Manni, Lucia; Gasparini, Fabio; Hotta, Kohji; Ishizuka, Katherine J; Ricci, Lorenzo; Tiozzo, Stefano; Voskoboynik, Ayelet; Dauga, Delphine
2014-01-01
Ontologies provide an important resource to integrate information. For developmental biology and comparative anatomy studies, ontologies of a species are used to formalize and annotate data that are related to anatomical structures, their lineage and timing of development. Here, we have constructed the first ontology for anatomy and asexual development (blastogenesis) of a bilaterian, the colonial tunicate Botryllus schlosseri. Tunicates, like Botryllus schlosseri, are non-vertebrates and the only chordate taxon species that reproduce both sexually and asexually. Their tadpole larval stage possesses structures characteristic of all chordates, i.e. a notochord, a dorsal neural tube, and gill slits. Larvae settle and metamorphose into individuals that are either solitary or colonial. The latter reproduce both sexually and asexually and these two reproductive modes lead to essentially the same adult body plan. The Botryllus schlosseri Ontology of Development and Anatomy (BODA) will facilitate the comparison between both types of development. BODA uses the rules defined by the Open Biomedical Ontologies Foundry. It is based on studies that investigate the anatomy, blastogenesis and regeneration of this organism. BODA features allow the users to easily search and identify anatomical structures in the colony, to define the developmental stage, and to follow the morphogenetic events of a tissue and/or organ of interest throughout asexual development. We invite the scientific community to use this resource as a reference for the anatomy and developmental ontology of B. schlosseri and encourage recommendations for updates and improvements.
Ontology for the Asexual Development and Anatomy of the Colonial Chordate Botryllus schlosseri
Manni, Lucia; Gasparini, Fabio; Hotta, Kohji; Ishizuka, Katherine J.; Ricci, Lorenzo; Tiozzo, Stefano; Voskoboynik, Ayelet; Dauga, Delphine
2014-01-01
Ontologies provide an important resource to integrate information. For developmental biology and comparative anatomy studies, ontologies of a species are used to formalize and annotate data that are related to anatomical structures, their lineage and timing of development. Here, we have constructed the first ontology for anatomy and asexual development (blastogenesis) of a bilaterian, the colonial tunicate Botryllus schlosseri. Tunicates, like Botryllus schlosseri, are non-vertebrates and the only chordate taxon species that reproduce both sexually and asexually. Their tadpole larval stage possesses structures characteristic of all chordates, i.e. a notochord, a dorsal neural tube, and gill slits. Larvae settle and metamorphose into individuals that are either solitary or colonial. The latter reproduce both sexually and asexually and these two reproductive modes lead to essentially the same adult body plan. The Botryllus schlosseri Ontology of Development and Anatomy (BODA) will facilitate the comparison between both types of development. BODA uses the rules defined by the Open Biomedical Ontologies Foundry. It is based on studies that investigate the anatomy, blastogenesis and regeneration of this organism. BODA features allow the users to easily search and identify anatomical structures in the colony, to define the developmental stage, and to follow the morphogenetic events of a tissue and/or organ of interest throughout asexual development. We invite the scientific community to use this resource as a reference for the anatomy and developmental ontology of B. schlosseri and encourage recommendations for updates and improvements. PMID:24789338
An ontological knowledge framework for adaptive medical workflow.
Dang, Jiangbo; Hedayati, Amir; Hampel, Ken; Toklu, Candemir
2008-10-01
As emerging technologies, the semantic Web and SOA (Service-Oriented Architecture) allow BPMS (Business Process Management System) to automate business processes that can be described as services, which in turn can be used to wrap existing enterprise applications. BPMS provides tools and methodologies to compose Web services that can be executed as business processes and monitored by BPM (Business Process Management) consoles. Ontologies are a formal declarative knowledge representation model. They provide a foundation upon which machine-understandable knowledge can be obtained, and as a result, they make machine intelligence possible. Healthcare systems can adopt these technologies to become ubiquitous, adaptive, and intelligent, and thus serve patients better. This paper presents an ontological knowledge framework that covers the healthcare domains that a hospital encompasses, from medical and administrative tasks to hospital assets, medical insurance, patient records, drugs, and regulations. Our ontology therefore makes our vision of personalized healthcare possible by capturing all necessary knowledge for a complex personalized healthcare scenario involving patient care, insurance policies, drug prescriptions, and compliance. For example, our ontology enables a workflow management system to allow users, from physicians to administrative assistants, to manage and even create new context-aware medical workflows and execute them on-the-fly.
An Ontology-Based GIS for Genomic Data Management of Rumen Microbes
Jelokhani-Niaraki, Saber; Minuchehr, Zarrin; Nassiri, Mohammad Reza
2015-01-01
During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS)-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data. PMID:25873847
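Linear referencing stores features as measured positions along a line instead of as independent geometries. The sketch below mimics that idea with a small event table of genes along a genome coordinate axis. It does not use ArcGIS, Protégé, or MySQL; the gene identifiers, coordinates, and attribute values are made up, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GeneEvent:
    gene_id: str
    start: int    # "from" measure along the genome, in base pairs
    end: int      # "to" measure along the genome, in base pairs
    product: str  # editable attribute of the event


# Hypothetical event table; all values are invented for illustration.
EVENT_TABLE = [
    GeneEvent("geneA", 100, 950, "hypothetical protein"),
    GeneEvent("geneB", 1200, 2100, "cellulase (illustrative)"),
    GeneEvent("geneC", 2050, 3000, "transporter (illustrative)"),
]


def events_in_window(events, window_start, window_end):
    """Return events whose interval overlaps the requested genome window."""
    return [e for e in events if e.start <= window_end and e.end >= window_start]


def update_attribute(events, gene_id, product):
    """Change the product attribute of one gene event and return it."""
    for event in events:
        if event.gene_id == gene_id:
            event.product = product
            return event
    raise KeyError(gene_id)
```

Querying a window such as 2000 to 2060 returns the two overlapping genes, and editing a row in the table changes what any later query or display of that gene would show, which is the interaction the abstract describes.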
An Ontology-Based GIS for Genomic Data Management of Rumen Microbes.
Jelokhani-Niaraki, Saber; Tahmoorespur, Mojtaba; Minuchehr, Zarrin; Nassiri, Mohammad Reza
2015-03-01
During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS)-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data.
Research on land registration procedure ontology of China
NASA Astrophysics Data System (ADS)
Zhao, Zhongjun; Du, Qingyun; Zhang, Weiwei; Liu, Tao
2009-10-01
Land registration is a public act that records state-owned land use rights, collective land ownership, collective land use rights, land mortgages, servitudes, and other land rights that require registration under laws and regulations in land registration books. Land registration is one of the important government affairs, so it is very important to standardize, optimize and humanize the process of land registration. The management work of an organization is realized through a variety of workflows. Process knowledge is in essence a kind of methodological knowledge and a system that includes both core and relational knowledge. In this paper, ontology is introduced into the field of land registration and management in order to optimize the flow of land registration, to promote automation and intelligent services in land registration affairs, and to provide humanized and intelligent service for multiple types of users. This paper builds a land registration procedure ontology by defining its key concepts, which represent the kinds of land registration processes, and by mapping those processes to OWL-S. The land registration procedure ontology is intended as the starting point and basis of the Web service.
2012-01-01
Background The OpenTox Framework, developed by the partners in the OpenTox project (http://www.opentox.org), aims at providing a unified access to toxicity data, predictive models and validation procedures. Interoperability of resources is achieved using a common information model, based on the OpenTox ontologies, describing predictive algorithms, models and toxicity data. As toxicological data may come from different, heterogeneous sources, a deployed ontology, unifying the terminology and the resources, is critical for the rational and reliable organization of the data, and its automatic processing. Results The following related ontologies have been developed for OpenTox: a) Toxicological ontology – listing the toxicological endpoints; b) Organs system and Effects ontology – addressing organs, targets/examinations and effects observed in in vivo studies; c) ToxML ontology – representing semi-automatic conversion of the ToxML schema; d) OpenTox ontology– representation of OpenTox framework components: chemical compounds, datasets, types of algorithms, models and validation web services; e) ToxLink–ToxCast assays ontology and f) OpenToxipedia community knowledge resource on toxicology terminology. OpenTox components are made available through standardized REST web services, where every compound, data set, and predictive method has a unique resolvable address (URI), used to retrieve its Resource Description Framework (RDF) representation, or to initiate the associated calculations and generate new RDF-based resources. The services support the integration of toxicity and chemical data from various sources, the generation and validation of computer models for toxic effects, seamless integration of new algorithms and scientifically sound validation routines and provide a flexible framework, which allows building arbitrary number of applications, tailored to solving different problems by end users (e.g. toxicologists). 
Availability The OpenTox toxicological ontology projects may be accessed via the OpenTox ontology development page http://www.opentox.org/dev/ontology; the OpenTox ontology is available as OWL at http://opentox.org/api/1.1/opentox.owl; the ToxML-OWL conversion utility is an open source resource available at http://ambit.svn.sourceforge.net/viewvc/ambit/branches/toxml-utils/. PMID:22541598
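The REST pattern described in the OpenTox abstract above (every resource has a URI from which its RDF representation can be retrieved) can be illustrated with a minimal sketch. It uses only the Python standard library and builds the request without sending it. The resource URI and the `rdf_request` helper are hypothetical, and the `application/rdf+xml` media type is an assumption about how a client might ask for RDF; the abstract does not specify the content negotiation details.

```python
from urllib.request import Request


def rdf_request(resource_uri: str) -> Request:
    """Build (but do not send) a GET request asking for an RDF/XML representation."""
    return Request(
        resource_uri,
        headers={"Accept": "application/rdf+xml"},  # assumed media type
        method="GET",
    )


# Hypothetical resource URI, for illustration only.
req = rdf_request("http://example.org/compound/1")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual retrieval; that step is left out here so the sketch does not depend on any live service.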
The MMI Device Ontology: Enabling Sensor Integration
NASA Astrophysics Data System (ADS)
Rueda, C.; Galbraith, N.; Morris, R. A.; Bermudez, L. E.; Graybeal, J.; Arko, R. A.; Mmi Device Ontology Working Group
2010-12-01
The Marine Metadata Interoperability (MMI) project has developed an ontology for devices to describe sensors and sensor networks. This ontology is implemented in the W3C Web Ontology Language (OWL) and provides an extensible conceptual model and controlled vocabularies for describing heterogeneous instrument types, with different data characteristics, and their attributes. It can help users populate metadata records for sensors; associate devices with their platforms, deployments, measurement capabilities and restrictions; aid in discovery of sensor data, both historic and real-time; and improve the interoperability of observational oceanographic data sets. We developed the MMI Device Ontology following a community-based approach. By building on and integrating other models and ontologies from related disciplines, we sought to facilitate semantic interoperability while avoiding duplication. Key concepts and insights from various communities, including the Open Geospatial Consortium (e.g., SensorML and Observations and Measurements specifications), Semantic Web for Earth and Environmental Terminology (SWEET), and W3C Semantic Sensor Network Incubator Group, have significantly enriched the development of the ontology. Individuals ranging from instrument designers, science data producers and consumers to ontology specialists and other technologists contributed to the work. Applications of the MMI Device Ontology are underway for several community use cases. These include vessel-mounted multibeam mapping sonars for the Rolling Deck to Repository (R2R) program and description of diverse instruments on deepwater Ocean Reference Stations for the OceanSITES program. These trials involve creation of records completely describing instruments, either by individual instances or by manufacturer and model.
Individual terms in the MMI Device Ontology can be referenced with their corresponding Uniform Resource Identifiers (URIs) in sensor-related metadata specifications (e.g., SensorML, NetCDF). These identifiers can be resolved through a web browser, or other client applications via HTTP against the MMI Ontology Registry and Repository (ORR), where the ontology is maintained. SPARQL-based query capabilities, which are enhanced with reasoning, along with several supported output formats, allow the effective interaction of diverse client applications with the semantic information associated with the device ontology. In this presentation we describe the process for the development of the MMI Device Ontology and illustrate extensions and applications that demonstrate the benefits of adopting this semantic approach, including example queries involving inference. We also highlight the issues encountered and future work.
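The idea of a query "involving inference", mentioned in the MMI abstract above, can be sketched without a SPARQL engine: asking for all instances of a class should also return instances of its subclasses. The toy triple set below is an assumption made for illustration; the `ex:` terms are hypothetical and are not taken from the MMI Device Ontology, and the plain-Python traversal stands in for what a reasoner-backed SPARQL endpoint would do.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical triples, for illustration only.
TRIPLES: Set[Triple] = {
    ("ex:MultibeamSonar", "rdfs:subClassOf", "ex:Sonar"),
    ("ex:Sonar", "rdfs:subClassOf", "ex:Device"),
    ("ex:sonar42", "rdf:type", "ex:MultibeamSonar"),
    ("ex:ctd7", "rdf:type", "ex:Device"),
}


def subclasses(cls: str, triples: Set[Triple]) -> Set[str]:
    """Return cls together with all of its direct and indirect subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == "rdfs:subClassOf" and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(cls: str, triples: Set[Triple]) -> Set[str]:
    """Return instances typed with cls or with any of its subclasses."""
    classes = subclasses(cls, triples)
    return {s for s, p, o in triples if p == "rdf:type" and o in classes}


print(sorted(instances_of("ex:Device", TRIPLES)))
```

Without the subclass step, a query for `ex:Device` would miss `ex:sonar42`, which is only typed as the more specific `ex:MultibeamSonar`.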
Díaz-Rodríguez, Natalia; Cadahía, Olmo León; Cuéllar, Manuel Pegalajar; Lilius, Johan; Calvo-Flores, Miguel Delgado
2014-01-01
Human activity recognition is a key task in ambient intelligence applications to achieve proper ambient assisted living. There has been remarkable progress in this domain, but some challenges still remain to obtain robust methods. Our goal in this work is to provide a system that allows the modeling and recognition of a set of complex activities in real life scenarios involving interaction with the environment. The proposed framework is a hybrid model that comprises two main modules: a low level sub-activity recognizer, based on data-driven methods, and a high-level activity recognizer, implemented with a fuzzy ontology to include the semantic interpretation of actions performed by users. The fuzzy ontology is fed by the sub-activities recognized by the low level data-driven component and provides fuzzy ontological reasoning to recognize both the activities and their influence in the environment with semantics. An additional benefit of the approach is the ability to handle vagueness and uncertainty in the knowledge-based module, which substantially outperforms the treatment of incomplete and/or imprecise data with respect to classic crisp ontologies. We validate these advantages with the public CAD-120 dataset (Cornell Activity Dataset), achieving an accuracy of 90.1% and 91.07% for low-level and high-level activities, respectively. This entails an improvement over fully data-driven or ontology-based approaches. PMID:25268914
NASA Astrophysics Data System (ADS)
Paulraj, D.; Swamynathan, S.; Madhaiyan, M.
2012-11-01
Web Service composition has become indispensable as a single web service cannot satisfy complex functional requirements. Composition of services has received much interest to support business-to-business (B2B) or enterprise application integration. An important component of the service composition is the discovery of relevant services. In Semantic Web Services (SWS), service discovery is generally achieved by using service profile of Ontology Web Languages for Services (OWL-S). The profile of the service is a derived and concise description but not a functional part of the service. The information contained in the service profile is sufficient for atomic service discovery, but it is not sufficient for the discovery of composite semantic web services (CSWS). The purpose of this article is two-fold: first to prove that the process model is a better choice than the service profile for service discovery. Second, to facilitate the composition of inter-organisational CSWS by proposing a new composition method which uses process ontology. The proposed service composition approach uses an algorithm which performs a fine grained match at the level of atomic process rather than at the level of the entire service in a composite semantic web service. Many works carried out in this area have proposed solutions only for the composition of atomic services and this article proposes a solution for the composition of composite semantic web services.
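The difference between matching at the level of the whole service and matching at the level of atomic processes, as argued in the abstract above, can be sketched as follows. This is not the authors' algorithm: the class names, the example service, and the use of exact string comparison for inputs and outputs are all assumptions made for illustration (OWL-S matchmaking would normally compare ontology concepts rather than strings).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class AtomicProcess:
    name: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


@dataclass(frozen=True)
class CompositeService:
    name: str
    processes: Tuple[AtomicProcess, ...]


def service_level_match(svc: CompositeService, provided: Set[str], wanted: Set[str]) -> bool:
    """Coarse match: treat the composite service as one unit with pooled inputs/outputs."""
    all_inputs = set().union(*(p.inputs for p in svc.processes))
    all_outputs = set().union(*(p.outputs for p in svc.processes))
    return all_inputs <= provided and wanted <= all_outputs


def process_level_matches(svc: CompositeService, provided: Set[str], wanted: Set[str]) -> List[str]:
    """Fine-grained match: names of atomic processes that can serve the request alone."""
    return [p.name for p in svc.processes if p.inputs <= provided and wanted <= p.outputs]


# Hypothetical composite service, for illustration only.
booking = CompositeService(
    name="TravelBooking",
    processes=(
        AtomicProcess("SearchFlights", frozenset({"City", "Date"}), frozenset({"FlightList"})),
        AtomicProcess("BookFlight", frozenset({"FlightId", "CreditCard"}), frozenset({"Ticket"})),
    ),
)

provided, wanted = {"City", "Date"}, {"FlightList"}
print(service_level_match(booking, provided, wanted))
print(process_level_matches(booking, provided, wanted))
```

In this example the coarse match rejects the service because the request cannot supply every input of every process, while the fine-grained match still finds the one atomic process that satisfies the request.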
A semantic proteomics dashboard (SemPoD) for data management in translational research.
Jayapandian, Catherine P; Zhao, Meng; Ewing, Rob M; Zhang, Guo-Qiang; Sahoo, Satya S
2012-01-01
One of the primary challenges in translational research data management is breaking down the barriers between the multiple data silos and the integration of 'omics data with clinical information to complete the cycle from the bench to the bedside. The role of contextual metadata, also called provenance information, is a key factor in effective data integration, reproducibility of results, correct attribution of original source, and answering research queries involving "What", "Where", "When", "Which", "Who", "How", and "Why" (also known as the W7 model). But at present there is limited or no effective approach to managing and leveraging provenance information for integrating data across studies or projects. Hence, there is an urgent need for a paradigm shift in creating a "provenance-aware" informatics platform to address this challenge. We introduce an ontology-driven, intuitive Semantic Proteomics Dashboard (SemPoD) that uses provenance together with domain information (semantic provenance) to enable researchers to query, compare, and correlate different types of data across multiple projects, and allow integration with legacy data to support their ongoing research. The SemPoD platform, currently in use at the Case Center for Proteomics and Bioinformatics (CPB), consists of three components: (a) Ontology-driven Visual Query Composer, (b) Result Explorer, and (c) Query Manager. Currently, SemPoD allows provenance-aware querying of 1153 mass-spectrometry experiments from 20 different projects. SemPoD uses the systems molecular biology provenance ontology (SysPro) to support a dynamic query composition interface, which automatically updates the components of the query interface based on previous user selections and efficiently prunes the result set using a "smart filtering" approach.
The SysPro ontology re-uses terms from the PROV-ontology (PROV-O) being developed by the World Wide Web Consortium (W3C) provenance working group, the minimum information required for reporting a molecular interaction experiment (MIMIx), and the minimum information about a proteomics experiment (MIAPE) guidelines. SemPoD was evaluated in terms of both user feedback and the scalability of the system. SemPoD is an intuitive and powerful provenance ontology-driven data access and query platform that uses the MIAPE and MIMIx metadata guidelines to create an integrated view over large-scale systems molecular biology datasets. SemPoD leverages the SysPro ontology to create an intuitive dashboard for biologists to compose queries, explore the results, and use a query manager for storing queries for later use. SemPoD can be deployed over many existing database applications storing 'omics data, including, as illustrated here, the LabKey data-management system. The initial user feedback evaluating the usability and functionality of SemPoD has been very positive, and it is being considered for wider deployment beyond the proteomics domain and in other 'omics centers.
An Ontology-based Context-aware System for Smart Homes: E-care@home.
Alirezaie, Marjan; Renoux, Jennifer; Köckemann, Uwe; Kristoffersson, Annica; Karlsson, Lars; Blomqvist, Eva; Tsiftes, Nicolas; Voigt, Thiemo; Loutfi, Amy
2017-07-06
Smart home environments have a significant potential to provide for long-term monitoring of users with special needs in order to promote the possibility to age at home. Such environments are typically equipped with a number of heterogeneous sensors that monitor both health and environmental parameters. This paper presents a framework called E-care@home, consisting of an IoT infrastructure, which provides information with an unambiguous, shared meaning across IoT devices, end-users, relatives, health and care professionals and organizations. We focus on integrating measurements gathered from heterogeneous sources by using ontologies in order to enable semantic interpretation of events and context awareness. Activities are deduced using an incremental answer set solver for stream reasoning. The paper demonstrates the proposed framework using an instantiation of a smart environment that is able to perform context recognition based on the activities and the events occurring in the home.
An Ontology-based Context-aware System for Smart Homes: E-care@home
Alirezaie, Marjan; Köckemann, Uwe; Kristoffersson, Annica; Karlsson, Lars; Blomqvist, Eva; Voigt, Thiemo; Loutfi, Amy
2017-01-01
Smart home environments have a significant potential to provide for long-term monitoring of users with special needs in order to promote the possibility to age at home. Such environments are typically equipped with a number of heterogeneous sensors that monitor both health and environmental parameters. This paper presents a framework called E-care@home, consisting of an IoT infrastructure, which provides information with an unambiguous, shared meaning across IoT devices, end-users, relatives, health and care professionals and organizations. We focus on integrating measurements gathered from heterogeneous sources by using ontologies in order to enable semantic interpretation of events and context awareness. Activities are deduced using an incremental answer set solver for stream reasoning. The paper demonstrates the proposed framework using an instantiation of a smart environment that is able to perform context recognition based on the activities and the events occurring in the home. PMID:28684686
Where to search top-K biomedical ontologies?
Oliveira, Daniela; Butt, Anila Sahar; Haller, Armin; Rebholz-Schuhmann, Dietrich; Sahay, Ratnesh
2018-03-20
Searching for precise terms and terminological definitions in the biomedical data space is problematic, as researchers find overlapping, closely related and even equivalent concepts in a single or multiple ontologies. Search engines that retrieve ontological resources often suggest an extensive list of search results for a given input term, which leads to the tedious task of selecting the best-fit ontological resource (class or property) for the input term and reduces user confidence in the retrieval engines. A systematic evaluation of these search engines is necessary to understand their strengths and weaknesses in different search requirements. We have implemented seven comparable Information Retrieval ranking algorithms to search through ontologies and compared them against four search engines for ontologies. Free-text queries have been performed, the outcomes have been judged by experts and the ranking algorithms and search engines have been evaluated against the expert-based ground truth (GT). In addition, we propose a probabilistic GT that is developed automatically to provide deeper insights and confidence to the expert-based GT as well as evaluating a broader range of search queries. The main outcome of this work is the identification of key search factors for biomedical ontologies together with search requirements and a set of recommendations that will help biomedical experts and ontology engineers to select the best-suited retrieval mechanism in their search scenarios. We expect that this evaluation will allow researchers and practitioners to apply the current search techniques more reliably and that it will help them to select the right solution for their daily work. The source code (of seven ranking algorithms), ground truths and experimental results are available at https://github.com/danielapoliveira/bioont-search-benchmark.
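To make the notion of ranking ontology terms for a free-text query concrete, here is a minimal sketch of one classic Information Retrieval scoring function, TF-IDF, applied to term labels. The abstract above does not name the seven algorithms that were implemented, so TF-IDF is only an assumed example; the term identifiers, labels, and the smoothed IDF formula are likewise illustrative choices.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def rank(query: str, labels: Dict[str, str]) -> List[str]:
    """Return term ids ordered by descending TF-IDF score for the query."""
    docs = {term_id: tokenize(label) for term_id, label in labels.items()}
    n_docs = len(docs)

    # Document frequency: in how many labels does each token occur?
    df: Counter = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores: Dict[str, float] = {}
    for term_id, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for word in tokenize(query):
            if tf[word]:
                idf = math.log((n_docs + 1) / (df[word] + 1)) + 1.0  # smoothed IDF
                score += (tf[word] / len(tokens)) * idf
        scores[term_id] = score

    # Ties are broken by term id so the ordering is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical term ids and labels, for illustration only.
labels = {
    "T1": "cell death",
    "T2": "programmed cell death",
    "T3": "cell cycle",
}
ranking = rank("cell death", labels)
print(ranking)
```

The shortest label that contains both query words ranks first because term frequency is normalized by label length; a benchmark such as the one described would compare many such design choices against expert judgments.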
Rot, Gregor; Parikh, Anup; Curk, Tomaz; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaz
2009-08-25
Bioinformatics often leverages on recent advancements in computer science to support biologists in their scientific discovery process. Such efforts include the development of easy-to-use web interfaces to biomedical databases. Recent advancements in interactive web technologies require us to rethink the standard submit-and-wait paradigm, and craft bioinformatics web applications that share analytical and interactive power with their desktop relatives, while retaining simplicity and availability. We have developed dictyExpress, a web application that features a graphical, highly interactive explorative interface to our database that consists of more than 1000 Dictyostelium discoideum gene expression experiments. In dictyExpress, the user can select experiments and genes, perform gene clustering, view gene expression profiles across time, view gene co-expression networks, perform analyses of Gene Ontology term enrichment, and simultaneously display expression profiles for a selected gene in various experiments. Most importantly, these tasks are achieved through web applications whose components are seamlessly interlinked and immediately respond to events triggered by the user, thus providing a powerful explorative data analysis environment. dictyExpress is a precursor for a new generation of web-based bioinformatics applications with simple but powerful interactive interfaces that resemble that of the modern desktop. While dictyExpress serves mainly the Dictyostelium research community, it is relatively easy to adapt it to other datasets. We propose that the design ideas behind dictyExpress will influence the development of similar applications for other model organisms.
Rot, Gregor; Parikh, Anup; Curk, Tomaz; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaz
2009-01-01
Background Bioinformatics often leverages on recent advancements in computer science to support biologists in their scientific discovery process. Such efforts include the development of easy-to-use web interfaces to biomedical databases. Recent advancements in interactive web technologies require us to rethink the standard submit-and-wait paradigm, and craft bioinformatics web applications that share analytical and interactive power with their desktop relatives, while retaining simplicity and availability. Results We have developed dictyExpress, a web application that features a graphical, highly interactive explorative interface to our database that consists of more than 1000 Dictyostelium discoideum gene expression experiments. In dictyExpress, the user can select experiments and genes, perform gene clustering, view gene expression profiles across time, view gene co-expression networks, perform analyses of Gene Ontology term enrichment, and simultaneously display expression profiles for a selected gene in various experiments. Most importantly, these tasks are achieved through web applications whose components are seamlessly interlinked and immediately respond to events triggered by the user, thus providing a powerful explorative data analysis environment. Conclusion dictyExpress is a precursor for a new generation of web-based bioinformatics applications with simple but powerful interactive interfaces that resemble that of the modern desktop. While dictyExpress serves mainly the Dictyostelium research community, it is relatively easy to adapt it to other datasets. We propose that the design ideas behind dictyExpress will influence the development of similar applications for other model organisms. PMID:19706156
Semantically optiMize the dAta seRvice operaTion (SMART) system for better data discovery and access
NASA Astrophysics Data System (ADS)
Yang, C.; Huang, T.; Armstrong, E. M.; Moroni, D. F.; Liu, K.; Gui, Z.
2013-12-01
Abstract: We present a Semantically optiMize the dAta seRvice operaTion (SMART) system for better data discovery and access across the NASA data systems, Global Earth Observation System of Systems (GEOSS) Clearinghouse and Data.gov, to help scientists select Earth observation data that better fit their needs in four aspects: 1. Integrating and interfacing the SMART system to include the functionality of a) semantic reasoning based on Jena, an open source semantic reasoning engine, b) semantic similarity calculation, c) recommendation based on spatiotemporal, semantic, and user workflow patterns, and d) ranking results based on similarity between search terms and data ontology. 2. Collaborating with data user communities to a) capture science data ontology and record relevant ontology triple stores, b) analyze and mine user search and download patterns, c) integrate SMART into metadata-centric discovery system for community-wide usage and feedback, and d) customize the data discovery, search and access user interface to include the ranked results, recommendation components, and semantic-based navigation. 3. Laying the groundwork to interface the SMART system with other data search and discovery systems as an open source data search and discovery solution. The SMART system leverages NASA, GEO, and FGDC data discovery, search and access for the Earth science community by enabling scientists to readily discover and access data appropriate to their endeavors, increasing the efficiency of data exploration and decreasing the time that scientists must spend on searching, downloading, and processing the datasets most applicable to their research. By incorporating the SMART system, we aim to substantially reduce the time devoted to discovering the most applicable dataset, thereby reducing the number of user inquiries and the time and resources a data center expends in addressing them.
Keywords: EarthCube; ECHO, DAACs, GeoPlatform; Geospatial Cyberinfrastructure
An Integrated Children Disease Prediction Tool within a Special Social Network.
Apostolova Trpkovska, Marika; Yildirim Yayilgan, Sule; Besimi, Adrian
2016-01-01
This paper proposes a social network with an integrated children's disease prediction system built on the specially designed Children General Disease Ontology (CGDO). This ontology consists of children's diseases, their relationships with symptoms, and Semantic Web Rule Language (SWRL) rules designed specifically for predicting diseases. The prediction process starts when the user enters the observed signs and symptoms, which are then mapped to the CGDO ontology. Once the data are mapped, the prediction results are presented. The prediction phase executes the SWRL rules, which extract the details of the predicted disease. The motivation behind the development of this system is to spread knowledge about children's diseases and their symptoms in a very simple way using the specialized social networking website www.emama.mk.
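The rule-driven prediction step described in the abstract above can be sketched in plain Python. This is only an analogy for how a SWRL-style rule fires: a rule's conclusion holds when all of its conditions are present among the entered symptoms. The rule table uses placeholder names (`symptom_a`, `disease_x`, and so on) that are hypothetical and carry no medical meaning; they are not taken from CGDO.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Each rule: (predicted disease, symptoms that must all be present).
# Placeholder names only; these are not real CGDO rules.
RULES: List[Tuple[str, FrozenSet[str]]] = [
    ("disease_x", frozenset({"symptom_a", "symptom_b"})),
    ("disease_y", frozenset({"symptom_b", "symptom_c"})),
]


def predict(observed: Iterable[str]) -> List[str]:
    """Return every disease whose rule conditions are all satisfied."""
    present = set(observed)
    return [disease for disease, required in RULES if required <= present]


print(predict({"symptom_a", "symptom_b", "symptom_d"}))
```

A symptom that no rule mentions, such as `symptom_d` here, is simply ignored, and a rule with even one missing condition does not fire.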
Large-scale gene function analysis with the PANTHER classification system.
Mi, Huaiyu; Muruganujan, Anushya; Casagrande, John T; Thomas, Paul D
2013-08-01
The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.
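As an illustration of the kind of statistical test used when analyzing a gene list against functional categories, the sketch below computes an upper-tail binomial p-value for over-representation: the probability of seeing at least k genes from a category in a list of n genes, when a fraction p of the reference genes belong to that category. The abstract above only says "a number of statistical tests", so the choice of the binomial test and all the numbers in the example are assumptions.

```python
import math


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical numbers, for illustration only.
reference_genes = 20000   # genes in the reference list
reference_in_cat = 400    # reference genes annotated with the category
list_genes = 100          # genes in the uploaded list
list_in_cat = 8           # uploaded genes annotated with the category

p_category = reference_in_cat / reference_genes
expected = list_genes * p_category
p_value = binomial_upper_tail(list_in_cat, list_genes, p_category)
print(f"expected {expected:.1f}, observed {list_in_cat}, p = {p_value:.2e}")
```

In practice many categories are tested at once, so a multiple-testing correction would normally be applied to p-values like this one; that step is omitted from the sketch.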
Demir, E; Babur, O; Dogrusoz, U; Gursoy, A; Nisanci, G; Cetin-Atalay, R; Ozturk, M
2002-07-01
The availability of entire genome sequences shifts scientific curiosity towards identifying the function of genomes on a large scale, as in genome studies. In the near future, data produced about cellular processes at the molecular level will accumulate at an accelerating rate as a result of proteomics studies. In this regard, it is essential to develop tools for storing, integrating, accessing, and analyzing this data effectively. We define an ontology for a comprehensive representation of cellular events. The ontology presented here enables integration of fragmented or incomplete pathway information and supports manipulation and incorporation of the stored data, as well as multiple levels of abstraction. Based on this ontology, we present the architecture of an integrated environment named Patika (Pathway Analysis Tool for Integration and Knowledge Acquisition). Patika is composed of a server-side, scalable, object-oriented database and client-side editors to provide an integrated, multi-user environment for visualizing and manipulating networks of cellular events. This tool features automated pathway layout, functional computation support, advanced querying and a user-friendly graphical interface. We expect that Patika will be a valuable tool for rapid knowledge acquisition, interpretation of large-scale microarray data, disease gene identification, and drug development. A prototype of Patika is available upon request from the authors.
PhenoTips: patient phenotyping software for clinical and research use.
Girdea, Marta; Dumitriu, Sergiu; Fiume, Marc; Bowdin, Sarah; Boycott, Kym M; Chénier, Sébastien; Chitayat, David; Faghfoury, Hanna; Meyn, M Stephen; Ray, Peter N; So, Joyce; Stavropoulos, Dimitri J; Brudno, Michael
2013-08-01
We have developed PhenoTips: open source software for collecting and analyzing phenotypic information for patients with genetic disorders. Our software combines an easy-to-use interface, compatible with any device that runs a Web browser, with a standardized database back end. The PhenoTips' user interface closely mirrors clinician workflows so as to facilitate the recording of observations made during the patient encounter. Collected data include demographics, medical history, family history, physical and laboratory measurements, physical findings, and additional notes. Phenotypic information is represented using the Human Phenotype Ontology; however, the complexity of the ontology is hidden behind a user interface, which combines simple selection of common phenotypes with error-tolerant, predictive search of the entire ontology. PhenoTips supports accurate diagnosis by analyzing the entered data, then suggesting additional clinical investigations and providing Online Mendelian Inheritance in Man (OMIM) links to likely disorders. By collecting, classifying, and analyzing phenotypic information during the patient encounter, PhenoTips allows for streamlining of clinic workflow, efficient data entry, improved diagnosis, standardization of collected patient phenotypes, and sharing of anonymized patient phenotype data for the study of rare disorders. Our source code and a demo version of PhenoTips are available at http://phenotips.org. © 2013 WILEY PERIODICALS, INC.
EuroPhenome: a repository for high-throughput mouse phenotyping data
Morgan, Hugh; Beck, Tim; Blake, Andrew; Gates, Hilary; Adams, Niels; Debouzy, Guillaume; Leblanc, Sophie; Lengger, Christoph; Maier, Holger; Melvin, David; Meziane, Hamid; Richardson, Dave; Wells, Sara; White, Jacqui; Wood, Joe; de Angelis, Martin Hrabé; Brown, Steve D. M.; Hancock, John M.; Mallon, Ann-Marie
2010-01-01
The broad aim of biomedical science in the postgenomic era is to link genomic and phenotype information to allow deeper understanding of the processes leading from genomic changes to altered phenotype and disease. The EuroPhenome project (http://www.EuroPhenome.org) is a comprehensive resource for raw and annotated high-throughput phenotyping data arising from projects such as EUMODIC. EUMODIC is gathering data from the EMPReSSslim pipeline (http://www.empress.har.mrc.ac.uk/) which is performed on inbred mouse strains and knock-out lines arising from the EUCOMM project. The EuroPhenome interface allows the user to access the data via the phenotype or genotype. It also allows the user to access the data in a variety of ways, including graphical display, statistical analysis and access to the raw data via web services. The raw phenotyping data captured in EuroPhenome is annotated by an annotation pipeline which automatically identifies statistically different mutants from the appropriate baseline and assigns ontology terms for that specific test. Mutant phenotypes can be quickly identified using two EuroPhenome tools: PhenoMap, a graphical representation of statistically relevant phenotypes, and mining for a mutant using ontology terms. To assist with data definition and cross-database comparisons, phenotype data is annotated using combinations of terms from biological ontologies. PMID:19933761
Samwald, Matthias; Miñarro Giménez, Jose Antonio; Boyce, Richard D; Freimuth, Robert R; Adlassnig, Klaus-Peter; Dumontier, Michel
2015-02-22
Every year, hundreds of thousands of patients experience treatment failure or adverse drug reactions (ADRs), many of which could be prevented by pharmacogenomic testing. However, the primary knowledge needed for clinical pharmacogenomics is currently dispersed over disparate data structures and captured in unstructured or semi-structured formalizations. This is a source of potential ambiguity and complexity, making it difficult to create reliable information technology systems for enabling clinical pharmacogenomics. We developed Web Ontology Language (OWL) ontologies and automated reasoning methodologies to meet the following goals: 1) provide a simple and concise formalism for representing pharmacogenomic knowledge, 2) find errors and insufficient definitions in pharmacogenomic knowledge bases, 3) automatically assign alleles and phenotypes to patients, 4) match patients to clinically appropriate pharmacogenomic guidelines and clinical decision support messages and 5) facilitate the detection of inconsistencies and overlaps between pharmacogenomic treatment guidelines from different sources. We evaluated different reasoning systems and tested our approach with a large collection of publicly available genetic profiles. Our methodology proved to be a novel and useful choice for representing, analyzing and using pharmacogenomic data. The Genomic Clinical Decision Support (Genomic CDS) ontology represents 336 SNPs with 707 variants; 665 haplotypes related to 43 genes; 22 rules related to drug-response phenotypes; and 308 clinical decision support rules. OWL reasoning identified CDS rules with overlapping target populations but differing treatment recommendations. Only a modest number of clinical decision support rules were triggered for a collection of 943 public genetic profiles. We found significant performance differences across available OWL reasoners.
The ontology-based framework we developed can be used to represent, organize and reason over the growing wealth of pharmacogenomic knowledge, as well as to identify errors, inconsistencies and insufficient definitions in source data sets or individual patient data. Our study highlights both advantages and potential practical issues with such an ontology-based approach.
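One finding reported above is that reasoning identified decision support rules with overlapping target populations but differing recommendations. The sketch below shows the underlying check in a much simplified form, with plain set intersection standing in for OWL reasoning. The rule format, gene names, phenotype labels, and advice strings are all hypothetical, and the sketch assumes that a rule places no constraint on genes it does not mention.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Criteria = Dict[str, Set[str]]  # gene -> phenotypes the rule applies to

# Hypothetical rules, for illustration only.
rules = [
    {"id": "rule_1", "criteria": {"GENE_A": {"poor"}}, "advice": "reduce dose"},
    {"id": "rule_2",
     "criteria": {"GENE_A": {"poor", "intermediate"}, "GENE_B": {"normal"}},
     "advice": "avoid drug"},
    {"id": "rule_3", "criteria": {"GENE_A": {"normal"}}, "advice": "standard dose"},
]


def populations_overlap(a: Criteria, b: Criteria) -> bool:
    """True if some patient could satisfy both rules at once.

    Genes mentioned by only one rule are treated as unconstrained in the other.
    """
    shared_genes = a.keys() & b.keys()
    return all(a[gene] & b[gene] for gene in shared_genes)


def conflicting_pairs(rule_list: List[dict]) -> List[Tuple[str, str]]:
    """Pairs of rules that can hit the same patient but give different advice."""
    return [
        (r1["id"], r2["id"])
        for r1, r2 in combinations(rule_list, 2)
        if r1["advice"] != r2["advice"]
        and populations_overlap(r1["criteria"], r2["criteria"])
    ]


print(conflicting_pairs(rules))
```

A flagged pair is not necessarily an error; it marks a spot where guidelines from different sources would need to be reconciled by a domain expert.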
Huang, Jingshan; Eilbeck, Karen; Smith, Barry; Blake, Judith A; Dou, Dejing; Huang, Weili; Natale, Darren A; Ruttenberg, Alan; Huan, Jun; Zimmermann, Michael T; Jiang, Guoqian; Lin, Yu; Wu, Bin; Strachan, Harrison J; He, Yongqun; Zhang, Shaojie; Wang, Xiaowei; Liu, Zixing; Borchert, Glen M; Tan, Ming
2016-01-01
In recent years, sequencing technologies have enabled the identification of a wide range of non-coding RNAs (ncRNAs). Unfortunately, annotation and integration of ncRNA data has lagged behind their identification. Given the large quantity of information being obtained in this area, there emerges an urgent need to integrate what is being discovered by a broad range of relevant communities. To this end, the Non-Coding RNA Ontology (NCRO) is being developed to provide a systematically structured and precisely defined controlled vocabulary for the domain of ncRNAs, thereby facilitating the discovery, curation, analysis, exchange, and reasoning of data about structures of ncRNAs, their molecular and cellular functions, and their impacts upon phenotypes. The goal of NCRO is to serve as a common resource for annotations of diverse research in a way that will significantly enhance integrative and comparative analysis of the myriad resources currently housed in disparate sources. It is our belief that the NCRO ontology can perform an important role in the comprehensive unification of ncRNA biology and, indeed, fill a critical gap in both the Open Biological and Biomedical Ontologies (OBO) Library and the National Center for Biomedical Ontology (NCBO) BioPortal. Our initial focus is on the ontological representation of small regulatory ncRNAs, which we see as the first step in providing a resource for the annotation of data about all forms of ncRNAs. The NCRO ontology is free and open to all users, accessible at: http://purl.obolibrary.org/obo/ncro.owl.
NASA Technical Reports Server (NTRS)
Wales, Roxana C.; Shalin, Valerie L.; Bass, Deborah S.
2004-01-01
This paper focuses on the development and use of the abbreviated names as well as an emergent ontology associated with making requests for action of a distant robotic rover during the 2003-2004 NASA Mars Exploration Rover (MER) mission, run by the Jet Propulsion Laboratory. The infancy of the domain of Martian telerobotic science, in which specialists request work from a rover moving through the landscape, as well as the need to consider the interdisciplinary teams involved in the work required an empirical approach. The formulation of this ontology is grounded in human behavior and work practice. The purpose of this paper is to identify general issues for an ontology of action (specifically for requests for action), while maintaining sensitivity to the users, tools and the work system within a specific technical domain. We found that this ontology of action must take into account a dynamic environment, changing in response to the movement of the rover, changes on the rover itself, as well as be responsive to the purposeful intent of the science requestors. Analysis of MER mission events demonstrates that the work practice and even robotic tool usage changes over time. Therefore, an ontology must adapt and represent both incremental change and revolutionary change, and the ontology can never be more than a partial agreement on the conceptualizations involved. Although examined in a rather unique technical domain, the general issues pertain to the control of any complex, distributed work system as well as the archival record of its accomplishments.
Mainstream web standards now support science data too
NASA Astrophysics Data System (ADS)
Richard, S. M.; Cox, S. J. D.; Janowicz, K.; Fox, P. A.
2017-12-01
The science community has developed many models and ontologies for representation of scientific data and knowledge. In some cases these have been built as part of coordinated frameworks. For example, the biomedical community's OBO Foundry federates applications covering various aspects of life sciences, which are united through reference to a common foundational ontology (BFO). The SWEET ontology, originally developed at NASA and now governed through ESIP, is a single large unified ontology for earth and environmental sciences. On a smaller scale, GeoSciML provides a UML and corresponding XML representation of geological mapping and observation data. Some of the key concepts related to scientific data and observations have recently been incorporated into domain-neutral mainstream ontologies developed by the World Wide Web Consortium through their Spatial Data on the Web working group (SDWWG). OWL-Time has been enhanced to support temporal reference systems needed for science, and has been deployed in a linked data representation of the International Chronostratigraphic Chart. The Semantic Sensor Network ontology has been extended to cover samples and sampling, including relationships between samples. Gridded data and time-series are supported by applications of the statistical data-cube ontology (QB) for earth observations (the EO-QB profile) and spatio-temporal data (QB4ST). These standard ontologies and encodings can be used directly for science data, or can provide a bridge to specialized domain ontologies. There are a number of advantages in alignment with the W3C standards. The W3C vocabularies use discipline-neutral language and thus support cross-disciplinary applications directly without complex mappings. The W3C vocabularies are already aligned with the core ontologies that are the building blocks of the semantic web. The W3C vocabularies are each tightly scoped, thus encouraging good practices in the combination of complementary small ontologies. 
The W3C vocabularies are hosted on well known, reliable infrastructure. The W3C SDWWG outputs are being selectively adopted by the general schema.org discovery framework.
Kozaki, Kouji; Yamagata, Yuki; Mizoguchi, Riichiro; Imai, Takeshi; Ohe, Kazuhiko
2017-06-19
Medical ontologies are expected to contribute to the effective use of medical information resources that store considerable amounts of data. In this study, we focused on disease ontology because the complicated mechanisms of diseases are related to concepts across various medical domains. The authors developed a River Flow Model (RFM) of diseases, which captures diseases as the causal chains of abnormal states. It represents causes of diseases, disease progression, and downstream consequences of diseases, which is compliant with the intuition of medical experts. In this paper, we discuss a fact repository for causal chains of disease based on the disease ontology. It could be a valuable knowledge base for advanced medical information systems. We developed the fact repository for causal chains of diseases based on our disease ontology and abnormality ontology. This section summarizes these two ontologies. It is developed as linked data so that information scientists can access it using SPARQL queries through a Resource Description Framework (RDF) model for causal chains of diseases. We designed the RDF model as an implementation of the RFM for the fact repository based on the ontological definitions of the RFM. 1554 diseases and 7080 abnormal states in six major clinical areas, which are extracted from the disease ontology, are published as linked data (RDF) with a SPARQL endpoint (accessible API). Furthermore, the authors developed Disease Compass, a navigation system for disease knowledge. Disease Compass can browse the causal chains of a disease and obtain related information, including abnormal states, through two web services that provide general information from linked data, such as DBpedia, and 3D anatomical images. Disease Compass can provide a complete picture of disease-associated processes in such a way that fits with a clinician's understanding of diseases. 
Therefore, it supports user exploration of disease knowledge with access to pertinent information from a variety of sources.
Knowledge Evolution in Distributed Geoscience Datasets and the Role of Semantic Technologies
NASA Astrophysics Data System (ADS)
Ma, X.
2014-12-01
Knowledge evolves in geoscience, and the evolution is reflected in datasets. In a context with distributed data sources, the evolution of knowledge may cause considerable challenges to data management and re-use. For example, a short news item published in 2009 (Mascarelli, 2009) revealed the geoscience community's concern that the International Commission on Stratigraphy's change to the definition of Quaternary may bring heavy reworking of geologic maps. Now we are in the era of the World Wide Web, and geoscience knowledge is increasingly modeled and encoded in the form of ontologies and vocabularies by using semantic technologies. Accordingly, knowledge evolution leads to a consequence called ontology dynamics. Flouris et al. (2008) summarized 10 topics of general ontology changes/dynamics such as: ontology mapping, morphism, evolution, debugging and versioning, etc. Ontology dynamics affects several stages of a data life cycle and causes challenges, such as: the request for reworking of the extant data in a data center, semantic mismatch among data sources, differing understanding of the same dataset between data providers and data users, as well as error propagation in cross-discipline data discovery and re-use (Ma et al., 2014). This presentation will analyze the best practices in the geoscience community so far and summarize a few recommendations to reduce the negative impacts of ontology dynamics in a data life cycle, including: communities of practice and collaboration on ontology and vocabulary building, linking data records to standardized terms, and methods for (semi-)automatic reworking of datasets using semantic technologies. References: Flouris, G., Manakanatas, D., Kondylakis, H., Plexousakis, D., Antoniou, G., 2008. Ontology change: classification and survey. The Knowledge Engineering Review 23 (2), 117-152. Ma, X., Fox, P., Rozell, E., West, P., Zednik, S., 2014. 
Ontology dynamics in a data life cycle: Challenges and recommendations from a Geoscience Perspective. Journal of Earth Science 25 (2), 407-412. Mascarelli, A.L., 2009. Quaternary geologists win timescale vote. Nature 459, 624.
A semantic medical multimedia retrieval approach using ontology information hiding.
Guo, Kehua; Zhang, Shigeng
2013-01-01
Searching useful information from unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach which can reflect the users' query intent. Firstly, semantic annotations will be given to the multimedia documents in the medical multimedia database. Secondly, the ontology that represented semantic information will be hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate a good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches.
A Two-Stage Composition Method for Danger-Aware Services Based on Context Similarity
NASA Astrophysics Data System (ADS)
Wang, Junbo; Cheng, Zixue; Jing, Lei; Ota, Kaoru; Kansen, Mizuo
Context-aware systems detect a user's physical and social contexts based on sensor networks, and provide services that adapt to the user accordingly. Representing, detecting, and managing the contexts are important issues in context-aware systems. Composition of contexts is a useful method for these tasks, since it can detect a context by automatically composing small pieces of information to discover a service. Danger-aware services are a kind of context-aware service that needs a description of relations between a user and his/her surrounding objects and between users. However, when the existing composition methods are applied to danger-aware services, they show the following shortcomings: (1) they do not provide an explicit method for representing the composition of multi-user contexts, and (2) they have no flexible reasoning mechanism based on similarity of contexts, so they can only provide services that exactly follow the predefined context reasoning rules. Therefore, in this paper, we propose a two-stage composition method based on context similarity to solve the above problems. The first stage is composition of the useful information to represent the context for a single user. The second stage is composition of multi-user contexts to provide services by considering the relation of users. Finally, the danger degree of the detected context is computed by using context similarity between the detected context and the predefined context. Context is dynamically represented based on two-stage composition rules and a Situation theory based Ontology, which combines the advantages of Ontology and Situation theory. We implement the system in an indoor ubiquitous environment, and evaluate the system through two experiments with the support of subjects. The experiment results show the method is effective, and the accuracy of danger detection is acceptable to a danger-aware system.
Frishkoff, Gwen; Sydes, Jason; Mueller, Kurt; Frank, Robert; Curran, Tim; Connolly, John; Kilborn, Kerry; Molfese, Dennis; Perfetti, Charles; Malony, Allen
2011-01-01
We present MINEMO (Minimal Information for Neural ElectroMagnetic Ontologies), a checklist for the description of event-related potentials (ERP) studies. MINEMO extends MINI (Minimal Information for Neuroscience Investigations) to the ERP domain. Checklist terms are explicated in NEMO, a formal ontology that is designed to support ERP data sharing and integration. MINEMO is also linked to an ERP database and web application (the NEMO portal). Users upload their data and enter MINEMO information through the portal. The database then stores these entries in RDF (Resource Description Framework), along with summary metrics, i.e., spatial and temporal metadata. Together these spatial, temporal, and functional metadata provide a complete description of ERP data and the context in which these data were acquired. The RDF files then serve as inputs to ontology-based labeling and meta-analysis. Our ultimate goal is to represent ERPs using a rich semantic structure, so results can be queried at multiple levels, to stimulate novel hypotheses and to promote a high-level, integrative account of ERP results across diverse study methods and paradigms. PMID:22180824
NASA Astrophysics Data System (ADS)
Wasielewska, K.; Ganzha, M.
2012-10-01
In this paper we consider combining ontologically demarcated information with Saaty's Analytic Hierarchy Process (AHP) [1] for the multicriterial assessment of offers during contract negotiations. The context for the proposal is provided by the Agents in Grid project (AiG; [2]), which aims at development of an agent-based infrastructure for efficient resource management in the Grid. In the AiG project, software agents representing users can either (1) join a team and earn money, or (2) find a team to execute a job. Moreover, agents form teams, managers of which negotiate with clients and workers terms of potential collaboration. Here, ontologically described contracts (Service Level Agreements) are the results of autonomous multiround negotiations. Therefore, taking into account the relatively complex nature of the negotiated contracts, multicriterial assessment of proposals plays a crucial role. The AHP method is based on pairwise comparisons of criteria and relies on the judgement of a panel of experts. It measures how well an offer serves the objective of a decision maker. In this paper, we propose how the AHP method can be used to assess ontologically described contract proposals.
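The pairwise-comparison step of AHP mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three criteria (price, deadline, reliability), the comparison values, and the offer ratings are hypothetical, and the row geometric-mean method is used here as a common approximation to the principal-eigenvector weights of Saaty's method.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how much more important criterion i is than
    criterion j on Saaty's 1-9 scale; pairwise[j][i] is its reciprocal.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def score_offer(weights, ratings):
    """Weighted sum of an offer's per-criterion ratings (each in 0..1)."""
    return sum(w * r for w, r in zip(weights, ratings))

# Hypothetical criteria: price, deadline, reliability.
pairwise = [
    [1.0, 3.0, 5.0],   # price compared with price, deadline, reliability
    [1 / 3, 1.0, 2.0], # deadline
    [1 / 5, 1 / 2, 1.0],  # reliability
]
weights = ahp_weights(pairwise)

# Two hypothetical offers rated on the same three criteria.
offer_a = score_offer(weights, [0.9, 0.4, 0.5])
offer_b = score_offer(weights, [0.5, 0.9, 0.9])
print(weights, offer_a, offer_b)
```

With these illustrative judgements price dominates the weighting, so the offer that rates best on price scores higher overall even though it is weaker on the other two criteria.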
Constructing Adverse Outcome Pathways: a Demonstration of ...
The adverse outcome pathway (AOP) provides a conceptual framework to evaluate and integrate chemical toxicity and its effects across the levels of biological organization. As such, it is essential to develop a resource-efficient and effective approach to extend molecular initiating events (MIEs) of chemicals to their downstream phenotypes of a greater regulatory relevance. A number of ongoing public phenomics (high throughput phenotyping) efforts have been generating abundant phenotypic data annotated with ontology terms. These phenotypes can be analyzed semantically and linked to MIEs of interest, all in the context of a knowledge base integrated from a variety of ontologies for various species and knowledge domains. In such analyses, two phenotypic profiles (PPs; anchored by genes or diseases) each characterized by multiple ontology terms are compared for their semantic similarities within a common ontology graph, but across boundaries of species and knowledge domains. Taking advantage of publicly available ontologies and software tool kits, we have implemented an OS-Mapping (Ontology-based Semantics Mapping) approach as a Java application, and constructed a network of 19383 PPs as nodes with edges weighted by their pairwise semantic similarity scores. Individual PPs were assembled from public phenomics data. Out of a possible 1.87×10^8 pairwise connections among these nodes, about 71% have similarity scores between 0.2 and the maximum possible of 1.0.
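The pairwise-connection count quoted in this abstract follows from the number of unordered pairs among 19383 nodes. A short check, using only the figures given in the abstract (the variable names are illustrative):

```python
from math import comb

n_profiles = 19383                 # phenotypic profiles (nodes) reported in the abstract
n_pairs = comb(n_profiles, 2)      # unordered pairs: n * (n - 1) / 2
print(n_pairs)                     # 187840653, i.e. about 1.87 x 10^8

# Roughly 71% of the pairs are reported to score between 0.2 and 1.0.
n_similar = round(0.71 * n_pairs)
print(n_similar)
```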
NASA Astrophysics Data System (ADS)
Pérez-Luque, A. J.; Pérez-Pérez, R.; Bonet-García, F. J.; Magaña, P. J.
2015-05-01
The implementation of the Natura 2000 network requires methods to assess the conservation status of habitats. This paper shows a methodological approach that combines the use of (satellite) Earth observation with ontologies to monitor Natura 2000 habitats and assess their functioning. We have created an ontological system called Savia that can describe both the ecosystem functioning and the behaviour of abiotic factors in a Natura 2000 habitat. This system is able to automatically download images from MODIS products, create indicators and compute temporal trends for them. We have developed an ontology that takes into account the different concepts and relations about indicators and temporal trends, and the spatio-temporal components of the datasets. All the information generated from datasets and MODIS images is stored in a knowledge base according to the ontology. Users can formulate complex questions using a SPARQL end-point. This system has been tested and validated in a case study that uses Quercus pyrenaica Willd. forests as a target habitat in Sierra Nevada (Spain), a Natura 2000 site. We assess ecosystem functioning using NDVI. The selected abiotic factor is snow cover. Savia provides useful data regarding these two variables and reflects relationships between them.
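As an illustration of the kind of indicator and temporal trend this abstract refers to, the sketch below computes NDVI from near-infrared and red reflectance and fits a least-squares slope to a yearly series. The abstract does not say how Savia computes its trends, so the linear slope is an assumption, and the reflectance values are invented for the example.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)

def linear_trend(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical yearly (NIR, red) reflectance pairs for one pixel.
reflectance = [(0.40, 0.12), (0.42, 0.11), (0.45, 0.10), (0.47, 0.10)]
series = [ndvi(nir, red) for nir, red in reflectance]
slope = linear_trend(series)   # positive slope: NDVI increasing per year
print(series, slope)
```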
NASA Astrophysics Data System (ADS)
Vega, Francisco; Pérez, Wilson; Tello, Andrés; Saquicela, Victor; Espinoza, Mauricio; Solano-Quinde, Lizandro; Vidal, Maria-Esther; La Cruz, Alexandra
2015-12-01
Advances in medical imaging have fostered medical diagnosis based on digital images. Consequently, the number of image-based diagnostic studies is increasing, and collaborative work and tele-radiology systems are required to scale up effectively to this diagnosis trend. We tackle the problem of the collaborative access of medical images, and present WebMedSA, a framework to manage large datasets of medical images. WebMedSA relies on a PACS and supports the ontological annotation, as well as segmentation and visualization of the images based on their semantic description. Ontological annotations can be performed directly on the volumetric image or at different image planes (e.g., axial, coronal, or sagittal); furthermore, annotations can be complemented after applying a segmentation technique. WebMedSA is based on three main steps: (1) RDF-ization process for extracting, anonymizing, and serializing metadata comprised in DICOM medical images into RDF/XML; (2) Integration of different biomedical ontologies (using L-MOM library), making this approach ontology independent; and (3) segmentation and visualization of annotated data which is further used to generate new annotations according to expert knowledge, and validation. Initial user evaluations suggest that WebMedSA facilitates the exchange of knowledge between radiologists, and provides the basis for collaborative work among them.
Ontology-Based Administration of Web Directories
NASA Astrophysics Data System (ADS)
Horvat, Marko; Gledec, Gordan; Bogunović, Nikola
Administration of a Web directory and maintenance of its content and the associated structure is a delicate and labor intensive task performed exclusively by human domain experts. Consequently, there is an imminent risk of a directory structure becoming unbalanced, uneven and difficult to use for all except a few users proficient with the particular Web directory and its domain. These problems emphasize the need to establish two important issues: i) generic and objective measures of Web directories structure quality, and ii) mechanism for fully automated development of a Web directory's structure. In this paper we demonstrate how to formally and fully integrate Web directories with the Semantic Web vision. We propose a set of criteria for evaluation of a Web directory's structure quality. Some criterion functions are based on heuristics while others require the application of ontologies. We also suggest an ontology-based algorithm for construction of Web directories. By using ontologies to describe the semantics of Web resources and Web directories' categories it is possible to define algorithms that can build or rearrange the structure of a Web directory. Assessment procedures can provide feedback and help steer the ontology-based construction process. The issues raised in the article can be equally applied to new and existing Web directories.
Share Repository Framework: Component Specification and Otology
2008-04-23
Palantir Technologies has created one such software application to support the DoD intelligence community by providing robust capabilities for...managing data from various sources. The Palantir tool is based on user-defined ontologies and supports multiple representation and analysis tools
Blank, Carrine E; Cui, Hong; Moore, Lisa R; Walls, Ramona L
2016-01-01
MicrO is an ontology of microbiological terms, including prokaryotic qualities and processes, material entities (such as cell components), chemical entities (such as microbiological culture media and medium ingredients), and assays. The ontology was built to support the ongoing development of a natural language processing algorithm, MicroPIE (or, Microbial Phenomics Information Extractor). During the MicroPIE design process, we realized there was a need for a prokaryotic ontology which would capture the evolutionary diversity of phenotypes and metabolic processes across the tree of life, capture the diversity of synonyms and information contained in the taxonomic literature, and relate microbiological entities and processes to terms in a large number of other ontologies, most particularly the Gene Ontology (GO), the Phenotypic Quality Ontology (PATO), and the Chemical Entities of Biological Interest (ChEBI). We thus constructed MicrO to be rich in logical axioms and synonyms gathered from the taxonomic literature. MicrO currently has ~14550 classes (~2550 of which are new, the remainder being microbiologically-relevant classes imported from other ontologies), connected by ~24,130 logical axioms (5,446 of which are new), and is available at (http://purl.obolibrary.org/obo/MicrO.owl) and on the project website at https://github.com/carrineblank/MicrO. MicrO has been integrated into the OBO Foundry Library (http://www.obofoundry.org/ontology/micro.html), so that other ontologies can borrow and re-use classes. Term requests and user feedback can be made using MicrO's Issue Tracker in GitHub. We designed MicrO such that it can support the ongoing and future development of algorithms that can leverage the controlled vocabulary and logical inference power provided by the ontology. 
By connecting microbial classes with large numbers of chemical entities, material entities, biological processes, molecular functions, and qualities using a dense array of logical axioms, we intend MicrO to be a powerful new tool to increase the computing power of bioinformatics tools such as the automated text mining of prokaryotic taxonomic descriptions using natural language processing. We also intend MicrO to support the development of new bioinformatics tools that aim to develop new connections between microbial phenotypes and genotypes (i.e., the gene content in genomes). Future ontology development will include incorporation of pathogenic phenotypes and prokaryotic habitats.
NASA Astrophysics Data System (ADS)
Sunitha, A.; Babu, G. Suresh
2014-11-01
Recent studies in the decision making efforts in the area of public healthcare systems have been tremendously inspired and influenced by the entry of ontology. Ontology driven systems result in the effective implementation of healthcare strategies for the policy makers. The central source of knowledge is the ontology containing all the relevant domain concepts such as locations, diseases, environments and their domain sensitive inter-relationships, which is the prime objective, concern and motivation behind this paper. The paper further focuses on the development of a semantic knowledge-base for public healthcare system. This paper describes the approach and methodologies in bringing out a novel conceptual theme in establishing a firm linkage between three different ontologies related to diseases, places and environments in one integrated platform. This platform correlates the real-time mechanisms prevailing within the semantic knowledgebase and establishes their inter-relationships for the first time in India. This is hoped to formulate a strong foundation for establishing a much awaited basic need for a meaningful healthcare decision making system in the country. Introduction through a wide range of best practices facilitates the adoption of this approach for better appreciation, understanding and long term outcomes in the area. The methods and approach illustrated in the paper relate to health mapping methods, reusability of health applications, and interoperability issues based on mapping of the data attributes with ontology concepts in generating semantic integrated data driving an inference engine for user-interfaced semantic queries.
The Plant Ontology as a Tool for Comparative Plant Anatomy and Genomic Analyses
Cooper, Laurel; Walls, Ramona L.; Elser, Justin; Gandolfo, Maria A.; Stevenson, Dennis W.; Smith, Barry; Preece, Justin; Athreya, Balaji; Mungall, Christopher J.; Rensing, Stefan; Hiss, Manuel; Lang, Daniel; Reski, Ralf; Berardini, Tanya Z.; Li, Donghui; Huala, Eva; Schaeffer, Mary; Menda, Naama; Arnaud, Elizabeth; Shrestha, Rosemary; Yamazaki, Yukiko; Jaiswal, Pankaj
2013-01-01
The Plant Ontology (PO; http://www.plantontology.org/) is a publicly available, collaborative effort to develop and maintain a controlled, structured vocabulary (‘ontology’) of terms to describe plant anatomy, morphology and the stages of plant development. The goals of the PO are to link (annotate) gene expression and phenotype data to plant structures and stages of plant development, using the data model adopted by the Gene Ontology. From its original design covering only rice, maize and Arabidopsis, the scope of the PO has been expanded to include all green plants. The PO was the first multispecies anatomy ontology developed for the annotation of genes and phenotypes. Also, to our knowledge, it was one of the first biological ontologies that provides translations (via synonyms) in non-English languages such as Japanese and Spanish. As of Release #18 (July 2012), there are about 2.2 million annotations linking PO terms to >110,000 unique data objects representing genes or gene models, proteins, RNAs, germplasm and quantitative trait loci (QTLs) from 22 plant species. In this paper, we focus on the plant anatomical entity branch of the PO, describing the organizing principles, resources available to users and examples of how the PO is integrated into other plant genomics databases and web portals. We also provide two examples of comparative analyses, demonstrating how the ontology structure and PO-annotated data can be used to discover the patterns of expression of the LEAFY (LFY) and terpene synthase (TPS) gene homologs. PMID:23220694
NASA Technical Reports Server (NTRS)
Hegde, Mahabaleshwara; Strub, Richard F.; Lynnes, Christopher S.; Fang, Hongliang; Teng, William
2008-01-01
Mirador is a web interface for searching Earth Science data archived at the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC). Mirador provides keyword-based search and guided navigation for providing efficient search and access to Earth Science data. Mirador employs the power of Google's universal search technology for fast metadata keyword searches, augmented by additional capabilities such as event searches (e.g., hurricanes), searches based on location gazetteer, and data services like format converters and data sub-setters. The objective of guided data navigation is to present users with multiple guided navigation in Mirador is an ontology based on the Global Change Master directory (GCMD) Directory Interchange Format (DIF). Current implementation includes the project ontology covering various instruments and model data. Additional capabilities in the pipeline include Earth Science parameter and applications ontologies.
SPARQL Query Re-writing Using Partonomy Based Transformation Rules
NASA Astrophysics Data System (ADS)
Jain, Prateek; Yeh, Peter Z.; Verma, Kunal; Henson, Cory A.; Sheth, Amit P.
Often the information present in a spatial knowledge base is represented at a different level of granularity and abstraction than the query constraints. For querying ontologies containing spatial information, the precise relationships between spatial entities have to be specified in the basic graph pattern of the SPARQL query, which can result in long and complex queries. We present a novel approach to help users intuitively write SPARQL queries to query spatial data, rather than relying on knowledge of the ontology structure. Our framework re-writes queries, using transformation rules to exploit part-whole relations between geographical entities to address the mismatches between query constraints and knowledge base. Our experiments were performed on completely third party datasets and queries. Evaluations were performed on the Geonames dataset using questions from the National Geographic Bee serialized into SPARQL and on the British Administrative Geography Ontology using questions from a popular trivia website. These experiments demonstrate high precision in retrieval of results and ease in writing queries.
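The idea of rewriting a query so that it also follows part-whole links can be sketched as below. This is not the authors' rule set: it swaps in a SPARQL 1.1 property path as the rewriting device, the `ex:` prefix and the `ex:locatedIn` and `ex:partOf` predicates are hypothetical, and plain string replacement stands in for proper query parsing.

```python
def rewrite_with_partonomy(query, direct="ex:locatedIn", part_of="ex:partOf"):
    """Expand a direct containment predicate into a property path that
    also follows zero or more part-whole links.

    Naive string rewrite for illustration only; a real system would
    operate on the parsed basic graph pattern.
    """
    return query.replace(f" {direct} ", f" {direct}/{part_of}* ")

# Hypothetical user query: the knowledge base may only state that a city
# is located in a county, and that the county is part of England.
query = (
    "PREFIX ex: <http://example.org/>\n"
    "SELECT ?city WHERE { ?city ex:locatedIn ex:England . }"
)
rewritten = rewrite_with_partonomy(query)
print(rewritten)
```

The user writes the constraint at the granularity they think in (the country), and the rewritten pattern bridges to the finer granularity stored in the knowledge base.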
SOBA: sequence ontology bioinformatics analysis.
Moore, Barry; Fan, Guozhen; Eilbeck, Karen
2010-07-01
The advent of cheaper, faster sequencing technologies has pushed the task of sequence annotation from the exclusive domain of large-scale multi-national sequencing projects to that of research laboratories and small consortia. The bioinformatics burden placed on these laboratories, some with very little programming experience, can be daunting. Fortunately, there exist software libraries and pipelines designed with these groups in mind, to ease the transition from an assembled genome to an annotated and accessible genome resource. We have developed the Sequence Ontology Bioinformatics Analysis (SOBA) tool to provide a simple statistical and graphical summary of an annotated genome. We envisage its use during annotation jamborees, genome comparison and for use by developers for rapid feedback during annotation software development and testing. SOBA also provides annotation consistency feedback to ensure correct use of terminology within annotations, and guides users to add new terms to the Sequence Ontology when required. SOBA is available at http://www.sequenceontology.org/cgi-bin/soba.cgi.
An Ontology-Based Conceptual Model For Accumulating And Reusing Knowledge In A DMAIC Process
NASA Astrophysics Data System (ADS)
Nguyen, ThanhDat; Kifor, Claudiu Vasile
2015-09-01
DMAIC (Define, Measure, Analyze, Improve, and Control) is an important knowledge-based process used to enhance the quality of processes. However, it is difficult to access DMAIC knowledge. Conventional approaches have difficulty structuring and reusing DMAIC knowledge, mainly because that knowledge is not represented and organized systematically. In this article, we address the problem with a conceptual model that combines the DMAIC process, knowledge management, and ontology engineering. The main idea of our model is to utilize ontologies to represent the knowledge generated by each of the DMAIC phases. We build five different knowledge bases for storing all knowledge of the DMAIC phases, with the support of necessary tools and appropriate techniques from the Information Technology area. Consequently, these knowledge bases make knowledge available to experts, managers, and web users during or after DMAIC execution in order to share and reuse existing knowledge.
Buttigieg, Pier Luigi; Pafilis, Evangelos; Lewis, Suzanna E.; ...
2016-09-23
Background: The Environment Ontology (ENVO; http://www.environmentontology.org/), first described in 2013, is a resource and research target for the semantically controlled description of environmental entities. The ontology's initial aim was the representation of the biomes, environmental features, and environmental materials pertinent to genomic and microbiome-related investigations. However, the need for environmental semantics is common to a multitude of fields, and ENVO's use has steadily grown since its initial description. We have thus expanded, enhanced, and generalised the ontology to support its increasingly diverse applications. Methods: We have updated our development suite to promote expressivity, consistency, and speed: we now develop ENVO in the Web Ontology Language (OWL) and employ templating methods to accelerate class creation. We have also taken steps to better align ENVO with the Open Biological and Biomedical Ontologies (OBO) Foundry principles and interoperate with existing OBO ontologies. Further, we applied text-mining approaches to extract habitat information from the Encyclopedia of Life and automatically create experimental habitat classes within ENVO. Results: Relative to its state in 2013, ENVO's content, scope, and implementation have been enhanced and much of its existing content revised for improved semantic representation. ENVO now offers representations of habitats, environmental processes, anthropogenic environments, and entities relevant to environmental health initiatives and the global Sustainable Development Agenda for 2030. Several branches of ENVO have been used to incubate and seed new ontologies in previously unrepresented domains such as food and agronomy. The current release version of the ontology, in OWL format, is available at http://purl.obolibrary.org/obo/envo.owl. 
Conclusions: ENVO has been shaped into an ontology which bridges multiple domains including biomedicine, natural and anthropogenic ecology, 'omics, and socioeconomic development. Through continued interactions with our users and partners, particularly those performing data archiving and sythesis, we anticipate that ENVO's growth will accelerate in 2017. As always, we invite further contributions and collaboration to advance the semantic representation of the environment, ranging from geographic features and environmental materials, across habitats and ecosystems, to everyday objects in household settings.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buttigieg, Pier Luigi; Pafilis, Evangelos; Lewis, Suzanna E.
Background: The Environment Ontology (ENVO; http://www.environmentontology.org/), first described in 2013, is a resource and research target for the semantically controlled description of environmental entities. The ontology's initial aim was the representation of the biomes, environmental features, and environmental materials pertinent to genomic and microbiome-related investigations. However, the need for environmental semantics is common to a multitude of fields, and ENVO's use has steadily grown since its initial description. We have thus expanded, enhanced, and generalised the ontology to support its increasingly diverse applications. Methods: We have updated our development suite to promote expressivity, consistency, and speed: we now develop ENVO in the Web Ontology Language (OWL) and employ templating methods to accelerate class creation. We have also taken steps to better align ENVO with the Open Biological and Biomedical Ontologies (OBO) Foundry principles and interoperate with existing OBO ontologies. Further, we applied text-mining approaches to extract habitat information from the Encyclopedia of Life and automatically create experimental habitat classes within ENVO. Results: Relative to its state in 2013, ENVO's content, scope, and implementation have been enhanced and much of its existing content revised for improved semantic representation. ENVO now offers representations of habitats, environmental processes, anthropogenic environments, and entities relevant to environmental health initiatives and the global Sustainable Development Agenda for 2030. Several branches of ENVO have been used to incubate and seed new ontologies in previously unrepresented domains such as food and agronomy. The current release version of the ontology, in OWL format, is available at http://purl.obolibrary.org/obo/envo.owl.
Conclusions: ENVO has been shaped into an ontology which bridges multiple domains including biomedicine, natural and anthropogenic ecology, 'omics, and socioeconomic development. Through continued interactions with our users and partners, particularly those performing data archiving and synthesis, we anticipate that ENVO's growth will accelerate in 2017. As always, we invite further contributions and collaboration to advance the semantic representation of the environment, ranging from geographic features and environmental materials, across habitats and ecosystems, to everyday objects in household settings.
Buttigieg, Pier Luigi; Pafilis, Evangelos; Lewis, Suzanna E; Schildhauer, Mark P; Walls, Ramona L; Mungall, Christopher J
2016-09-23
The Environment Ontology (ENVO; http://www.environmentontology.org/ ), first described in 2013, is a resource and research target for the semantically controlled description of environmental entities. The ontology's initial aim was the representation of the biomes, environmental features, and environmental materials pertinent to genomic and microbiome-related investigations. However, the need for environmental semantics is common to a multitude of fields, and ENVO's use has steadily grown since its initial description. We have thus expanded, enhanced, and generalised the ontology to support its increasingly diverse applications. We have updated our development suite to promote expressivity, consistency, and speed: we now develop ENVO in the Web Ontology Language (OWL) and employ templating methods to accelerate class creation. We have also taken steps to better align ENVO with the Open Biological and Biomedical Ontologies (OBO) Foundry principles and interoperate with existing OBO ontologies. Further, we applied text-mining approaches to extract habitat information from the Encyclopedia of Life and automatically create experimental habitat classes within ENVO. Relative to its state in 2013, ENVO's content, scope, and implementation have been enhanced and much of its existing content revised for improved semantic representation. ENVO now offers representations of habitats, environmental processes, anthropogenic environments, and entities relevant to environmental health initiatives and the global Sustainable Development Agenda for 2030. Several branches of ENVO have been used to incubate and seed new ontologies in previously unrepresented domains such as food and agronomy. The current release version of the ontology, in OWL format, is available at http://purl.obolibrary.org/obo/envo.owl . ENVO has been shaped into an ontology which bridges multiple domains including biomedicine, natural and anthropogenic ecology, 'omics, and socioeconomic development. 
Through continued interactions with our users and partners, particularly those performing data archiving and synthesis, we anticipate that ENVO's growth will accelerate in 2017. As always, we invite further contributions and collaboration to advance the semantic representation of the environment, ranging from geographic features and environmental materials, across habitats and ecosystems, to everyday objects in household settings.
The Semantic eScience Framework
NASA Astrophysics Data System (ADS)
McGuinness, Deborah; Fox, Peter; Hendler, James
2010-05-01
The goal of this effort is to design and implement a configurable and extensible semantic eScience framework (SESF). Configuration requires research into accommodating different levels of semantic expressivity and user requirements from use cases. Extensibility is being achieved in a modular approach to the semantic encodings (i.e. ontologies) performed in community settings, i.e. an ontology framework into which specific applications all the way up to communities can extend the semantics for their needs. We report on how we are accommodating the rapid advances in semantic technologies and tools and the sustainable software path for the future (certain) technical advances. In addition to a generalization of the current data science interface, we will present plans for an upper-level interface suitable for use by clearinghouses, and/or educational portals, digital libraries, and other disciplines. SESF builds upon previous work in the Virtual Solar-Terrestrial Observatory (VSTO). The VSTO utilizes leading edge knowledge representation, query and reasoning techniques to support knowledge-enhanced search, data access, integration, and manipulation. It encodes term meanings and their inter-relationships in ontologies and uses these ontologies and associated inference engines to semantically enable the data services. The Semantically-Enabled Science Data Integration (SESDI) project implemented data integration capabilities among three sub-disciplines (solar radiation, volcanic outgassing and atmospheric structure) using extensions to existing modular ontologies and used the VSTO data framework, while adding smart faceted search and semantic data registration tools. The Semantic Provenance Capture in Data Ingest Systems (SPCDIS) project has added explanation provenance capabilities to an observational data ingest pipeline for images of the Sun, providing a set of tools to answer diverse end user questions such as "Why does this image look bad?". http://tw.rpi.edu/portal/SESF
The Semantic eScience Framework
NASA Astrophysics Data System (ADS)
Fox, P. A.; McGuinness, D. L.
2009-12-01
The goal of this effort is to design and implement a configurable and extensible semantic eScience framework (SESF). Configuration requires research into accommodating different levels of semantic expressivity and user requirements from use cases. Extensibility is being achieved in a modular approach to the semantic encodings (i.e. ontologies) performed in community settings, i.e. an ontology framework into which specific applications all the way up to communities can extend the semantics for their needs. We report on how we are accommodating the rapid advances in semantic technologies and tools and the sustainable software path for the future (certain) technical advances. In addition to a generalization of the current data science interface, we will present plans for an upper-level interface suitable for use by clearinghouses, and/or educational portals, digital libraries, and other disciplines. SESF builds upon previous work in the Virtual Solar-Terrestrial Observatory (VSTO). The VSTO utilizes leading edge knowledge representation, query and reasoning techniques to support knowledge-enhanced search, data access, integration, and manipulation. It encodes term meanings and their inter-relationships in ontologies and uses these ontologies and associated inference engines to semantically enable the data services. The Semantically-Enabled Science Data Integration (SESDI) project implemented data integration capabilities among three sub-disciplines (solar radiation, volcanic outgassing and atmospheric structure) using extensions to existing modular ontologies and used the VSTO data framework, while adding smart faceted search and semantic data registration tools. The Semantic Provenance Capture in Data Ingest Systems (SPCDIS) project has added explanation provenance capabilities to an observational data ingest pipeline for images of the Sun, providing a set of tools to answer diverse end user questions such as "Why does this image look bad?".
Modelling and approaching pragmatic interoperability of distributed geoscience data
NASA Astrophysics Data System (ADS)
Ma, Xiaogang
2010-05-01
Interoperability of geodata, which is essential for sharing information and discovering insights within a cyberinfrastructure, is receiving increasing attention. A key requirement of interoperability in the context of geodata sharing is that data provided by local sources can be accessed, decoded, understood and appropriately used by external users. Various researchers have discussed that there are four levels in data interoperability issues: system, syntax, schematics and semantics, which respectively relate to the platform, encoding, structure and meaning of geodata. Ontology-driven approaches have been significantly studied addressing schematic and semantic interoperability issues of geodata in the last decade. There are different types, e.g. top-level ontologies, domain ontologies and application ontologies and display forms, e.g. glossaries, thesauri, conceptual schemas and logical theories. Many geodata providers are maintaining their identified local application ontologies in order to drive standardization in local databases. However, semantic heterogeneities often exist between these local ontologies, even though they are derived from equivalent disciplines. In contrast, common ontologies are being studied in different geoscience disciplines (e.g., NAMD, SWEET, etc.) as a standardization procedure to coordinate diverse local ontologies. Semantic mediation, e.g. mapping between local ontologies, or mapping local ontologies to common ontologies, has been studied as an effective way of achieving semantic interoperability between local ontologies thus reconciling semantic heterogeneities in multi-source geodata. Nevertheless, confusion still exists in the research field of semantic interoperability. One problem is caused by eliminating elements of local pragmatic contexts in semantic mediation. 
In contrast to a common domain ontology, which is context-independent, local application ontologies are closely tied to elements of local pragmatic contexts (e.g., people, time, location, intention, procedure, consequence, etc.) and are thus context-dependent. Eliminating these elements inevitably leads to information loss in semantic mediation between local ontologies. Correspondingly, the understanding and effect of exchanged data in a new context may differ from that in its original context. Another problem is the dilemma of how to balance flexibility and standardization of local ontologies, because ontologies are not fixed but continuously evolving. It is commonly recognized that a unified ontology cannot replace all local ontologies, because they are context-dependent and need flexibility. However, without coordinating standards, freely developed local ontologies and databases will require an enormous amount of mediation work between them. Finding a balance between standardization and flexibility for evolving ontologies, in a practical sense, requires negotiations (i.e. conversations, agreements and collaborations) between different local pragmatic contexts. The purpose of this work is to set up a computer-friendly model representing local pragmatic contexts (i.e. geodata sources), and to propose a practical semantic negotiation procedure for approaching pragmatic interoperability between local pragmatic contexts. Information agents, objective facts and subjective dimensions are reviewed as elements of a conceptual model for representing pragmatic contexts. The author uses them to draw up a practical semantic negotiation procedure for approaching pragmatic interoperability of distributed geodata.
The proposed conceptual model and semantic negotiation procedure were encoded with Description Logic, and then applied to analyze and manipulate semantic negotiations between different local ontologies within the National Mineral Resources Assessment (NMRA) project of China, which involves multi-source and multi-subject geodata sharing.
Collaborative development of predictive toxicology applications
2010-01-01
OpenTox provides an interoperable, standards-based Framework for the support of predictive toxicology data management, algorithms, modelling, validation and reporting. It is relevant to satisfying the chemical safety assessment requirements of the REACH legislation as it supports access to experimental data, (Quantitative) Structure-Activity Relationship models, and toxicological information through an integrating platform that adheres to regulatory requirements and OECD validation principles. Initial research defined the essential components of the Framework including the approach to data access, schema and management, use of controlled vocabularies and ontologies, architecture, web service and communications protocols, and selection and integration of algorithms for predictive modelling. OpenTox provides end-user oriented tools to non-computational specialists, risk assessors, and toxicological experts in addition to Application Programming Interfaces (APIs) for developers of new applications. OpenTox actively supports public standards for data representation, interfaces, vocabularies and ontologies, Open Source approaches to core platform components, and community-based collaboration approaches, so as to progress system interoperability goals. The OpenTox Framework includes APIs and services for compounds, datasets, features, algorithms, models, ontologies, tasks, validation, and reporting which may be combined into multiple applications satisfying a variety of different user needs. OpenTox applications are based on a set of distributed, interoperable OpenTox API-compliant REST web services. The OpenTox approach to ontology allows for efficient mapping of complementary data coming from different datasets into a unifying structure having a shared terminology and representation. 
Two initial OpenTox applications are presented as an illustration of the potential impact of OpenTox for high-quality and consistent structure-activity relationship modelling of REACH-relevant endpoints: ToxPredict which predicts and reports on toxicities for endpoints for an input chemical structure, and ToxCreate which builds and validates a predictive toxicity model based on an input toxicology dataset. Because of the extensible nature of the standardised Framework design, barriers of interoperability between applications and content are removed, as the user may combine data, models and validation from multiple sources in a dependable and time-effective way. PMID:20807436
Collaborative development of predictive toxicology applications.
Hardy, Barry; Douglas, Nicki; Helma, Christoph; Rautenberg, Micha; Jeliazkova, Nina; Jeliazkov, Vedrin; Nikolova, Ivelina; Benigni, Romualdo; Tcheremenskaia, Olga; Kramer, Stefan; Girschick, Tobias; Buchwald, Fabian; Wicker, Joerg; Karwath, Andreas; Gütlein, Martin; Maunz, Andreas; Sarimveis, Haralambos; Melagraki, Georgia; Afantitis, Antreas; Sopasakis, Pantelis; Gallagher, David; Poroikov, Vladimir; Filimonov, Dmitry; Zakharov, Alexey; Lagunin, Alexey; Gloriozova, Tatyana; Novikov, Sergey; Skvortsova, Natalia; Druzhilovsky, Dmitry; Chawla, Sunil; Ghosh, Indira; Ray, Surajit; Patel, Hitesh; Escher, Sylvia
2010-08-31
OpenTox provides an interoperable, standards-based Framework for the support of predictive toxicology data management, algorithms, modelling, validation and reporting. It is relevant to satisfying the chemical safety assessment requirements of the REACH legislation as it supports access to experimental data, (Quantitative) Structure-Activity Relationship models, and toxicological information through an integrating platform that adheres to regulatory requirements and OECD validation principles. Initial research defined the essential components of the Framework including the approach to data access, schema and management, use of controlled vocabularies and ontologies, architecture, web service and communications protocols, and selection and integration of algorithms for predictive modelling. OpenTox provides end-user oriented tools to non-computational specialists, risk assessors, and toxicological experts in addition to Application Programming Interfaces (APIs) for developers of new applications. OpenTox actively supports public standards for data representation, interfaces, vocabularies and ontologies, Open Source approaches to core platform components, and community-based collaboration approaches, so as to progress system interoperability goals. The OpenTox Framework includes APIs and services for compounds, datasets, features, algorithms, models, ontologies, tasks, validation, and reporting which may be combined into multiple applications satisfying a variety of different user needs. OpenTox applications are based on a set of distributed, interoperable OpenTox API-compliant REST web services.
The OpenTox approach to ontology allows for efficient mapping of complementary data coming from different datasets into a unifying structure having a shared terminology and representation. Two initial OpenTox applications are presented as an illustration of the potential impact of OpenTox for high-quality and consistent structure-activity relationship modelling of REACH-relevant endpoints: ToxPredict which predicts and reports on toxicities for endpoints for an input chemical structure, and ToxCreate which builds and validates a predictive toxicity model based on an input toxicology dataset. Because of the extensible nature of the standardised Framework design, barriers of interoperability between applications and content are removed, as the user may combine data, models and validation from multiple sources in a dependable and time-effective way.
SNOMED CT module-driven clinical archetype management.
Allones, J L; Taboada, M; Martinez, D; Lozano, R; Sobrido, M J
2013-06-01
To explore semantic search to improve management and user navigation in clinical archetype repositories. In order to support semantic searches across archetypes, an automated method based on SNOMED CT modularization is implemented to transform clinical archetypes into SNOMED CT extracts. Concurrently, query terms are converted into SNOMED CT concepts using the search engine Lucene. Retrieval is then carried out by matching query concepts with the corresponding SNOMED CT segments. A test collection of the 16 clinical archetypes, including over 250 terms, and a subset of 55 clinical terms from two medical dictionaries, MediLexicon and MedlinePlus, were used to test our method. The keyword-based service supported by the OpenEHR repository offered us a benchmark to evaluate the enhancement of performance. In total, our approach reached 97.4% precision and 69.1% recall, providing a substantial improvement of recall (more than 70%) compared to the benchmark. Exploiting medical domain knowledge from ontologies such as SNOMED CT may overcome some limitations of the keyword-based systems and thus improve the search experience of repository users. An automated approach based on ontology segmentation is an efficient and feasible way for supporting modeling, management and user navigation in clinical archetype repositories. Copyright © 2013 Elsevier Inc. All rights reserved.
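The retrieval and evaluation steps described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the concept identifiers, the archetype names, and the term lookup table are invented, and a plain dictionary stands in for the Lucene-based term-to-concept step. It shows only the general idea of matching query concepts against per-archetype concept sets and then scoring the result with precision and recall.

```python
# Hypothetical concept sets per archetype (stand-ins for SNOMED CT extracts).
ARCHETYPE_CONCEPTS = {
    "blood_pressure": {"C1", "C2"},
    "body_weight": {"C3"},
    "heart_rate": {"C2", "C4"},
}

# Hypothetical term-to-concept lookup (a dictionary replaces the search engine step).
TERM_TO_CONCEPTS = {"pulse": {"C4"}, "pressure": {"C1"}, "cardiovascular": {"C2"}}


def search(term):
    """Return archetypes whose concept set shares a concept with the query term."""
    concepts = TERM_TO_CONCEPTS.get(term, set())
    return {name for name, cs in ARCHETYPE_CONCEPTS.items() if cs & concepts}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant (0.0 when undefined)."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


retrieved = search("cardiovascular")
scores = precision_recall(
    retrieved, relevant={"blood_pressure", "heart_rate", "body_weight"}
)
```

In this toy case every retrieved archetype is relevant but one relevant archetype is missed, which mirrors the high-precision, lower-recall profile reported in the abstract.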
The Gene Ontology of eukaryotic cilia and flagella.
Roncaglia, Paola; van Dam, Teunis J P; Christie, Karen R; Nacheva, Lora; Toedt, Grischa; Huynen, Martijn A; Huntley, Rachael P; Gibson, Toby J; Lomax, Jane
2017-01-01
Recent research into ciliary structure and function provides important insights into inherited diseases termed ciliopathies and other cilia-related disorders. This wealth of knowledge needs to be translated into a computational representation to be fully exploitable by the research community. To this end, members of the Gene Ontology (GO) and SYSCILIA Consortia have worked together to improve representation of ciliary substructures and processes in GO. Members of the SYSCILIA and Gene Ontology Consortia suggested additions and changes to GO, to reflect new knowledge in the field. The project initially aimed to improve coverage of ciliary parts, and was then broadened to cilia-related biological processes. Discussions were documented in a public tracker. We engaged the broader cilia community via direct consultation and by referring to the literature. Ontology updates were implemented via ontology editing tools. So far, we have created or modified 127 GO terms representing parts and processes related to eukaryotic cilia/flagella or prokaryotic flagella. A growing number of biological pathways are known to involve cilia, and we continue to incorporate this knowledge in GO. The resulting expansion in GO allows more precise representation of experimentally derived knowledge, and SYSCILIA and GO biocurators have created 199 annotations to 50 human ciliary proteins. The revised ontology was also used to curate mouse proteins in a collaborative project. The revised GO and annotations, used in comparative 'before and after' analyses of representative ciliary datasets, improve enrichment results significantly. Our work has resulted in a broader and deeper coverage of ciliary composition and function. These improvements in ontology and protein annotation will benefit all users of GO enrichment analysis tools, as well as the ciliary research community, in areas ranging from microscopy image annotation to interpretation of high-throughput studies. 
We welcome feedback to further enhance the representation of cilia biology in GO.
Design and Application of an Ontology for Component-Based Modeling of Water Systems
NASA Astrophysics Data System (ADS)
Elag, M.; Goodall, J. L.
2012-12-01
Many Earth system modeling frameworks have adopted an approach of componentizing models so that a large model can be assembled by linking a set of smaller model components. These model components can then be more easily reused, extended, and maintained by a large group of model developers and end users. While there has been a notable increase in component-based model frameworks in the Earth sciences in recent years, there has been less work on creating framework-agnostic metadata and ontologies for model components. Well defined model component metadata is needed, however, to facilitate sharing, reuse, and interoperability both within and across Earth system modeling frameworks. To address this need, we have designed an ontology for the water resources community named the Water Resources Component (WRC) ontology in order to advance the application of component-based modeling frameworks across water related disciplines. Here we present the design of the WRC ontology and demonstrate its application for integration of model components used in watershed management. First we show how the watershed modeling system Soil and Water Assessment Tool (SWAT) can be decomposed into a set of hydrological and ecological components that adopt the Open Modeling Interface (OpenMI) standard. Then we show how the components can be used to estimate nitrogen losses from land to surface water for the Baltimore Ecosystem study area. Results of this work are (i) a demonstration of how the WRC ontology advances the conceptual integration between components of water related disciplines by handling the semantic and syntactic heterogeneity present when describing components from different disciplines and (ii) an investigation of a methodology by which large models can be decomposed into a set of model components that can be well described by populating metadata according to the WRC ontology.
An open annotation ontology for science on web 3.0
2011-01-01
Background: There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Methods: Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and deployed to users at a major pharmaceutical company and a major academic center for feedback and additional requirements. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. Results: This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables “stand-off” or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO’s Google Code page: http://code.google.com/p/annotation-ontology/ .
Conclusions: The Annotation Ontology meets critical requirements for an open, freely shareable model in OWL, of annotation metadata created against scientific documents on the Web. We believe AO can become a very useful common model for annotation metadata on Web documents, and will enable biomedical domain ontologies to be used quite widely to annotate the scientific literature. Potential collaborators and those with new relevant use cases are invited to contact the authors. PMID:21624159
An open annotation ontology for science on web 3.0.
Ciccarese, Paolo; Ocana, Marco; Garcia Castro, Leyla Jael; Das, Sudeshna; Clark, Tim
2011-05-17
There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and deployed to users at a major pharmaceutical company and a major academic center for feedback and additional requirements. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables "stand-off" or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO's Google Code page: http://code.google.com/p/annotation-ontology/ .
The Annotation Ontology meets critical requirements for an open, freely shareable model in OWL, of annotation metadata created against scientific documents on the Web. We believe AO can become a very useful common model for annotation metadata on Web documents, and will enable biomedical domain ontologies to be used quite widely to annotate the scientific literature. Potential collaborators and those with new relevant use cases are invited to contact the authors.
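The notion of "stand-off" annotation in this abstract, metadata that is anchored to a position in a document without modifying the document, can be sketched as follows. This is an illustration only and does not reproduce AO's actual OWL classes or properties: the class names `TextSelector` and `Annotation`, their fields, and the example URLs are all hypothetical. The sketch uses one possible anchoring method, quoting the target text together with a little surrounding context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSelector:
    """Anchors an annotation by quoting the target text with surrounding context."""
    prefix: str
    exact: str
    suffix: str


@dataclass(frozen=True)
class Annotation:
    document_url: str  # the annotated document, stored separately and never modified
    selector: TextSelector
    topic: str         # identifier of an ontology term (placeholder value below)
    creator: str       # provenance: who or what produced the annotation


def resolve(selector, text):
    """Return (start, end) offsets of the selected span in text, or None if not found."""
    needle = selector.prefix + selector.exact + selector.suffix
    pos = text.find(needle)
    if pos < 0:
        return None
    start = pos + len(selector.prefix)
    return start, start + len(selector.exact)


document_text = "Mutations in the gene cause early onset of the disease."
note = Annotation(
    document_url="http://example.org/paper.html",
    selector=TextSelector(prefix="in the ", exact="gene", suffix=" cause"),
    topic="http://example.org/ontology/Gene",
    creator="text-mining-pipeline",
)
span = resolve(note.selector, document_text)
```

Because the annotation lives apart from the document, the annotator needs no write access to the document, and the `creator` field hints at how provenance could be recorded for both human and algorithmic annotations.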
DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures.
Mazandu, Gaston K; Mulder, Nicola J
2013-09-25
The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.
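As a rough illustration of what an annotation-based information content (IC) measure computes, the sketch below derives IC as the negative log of the fraction of annotations at or below a term, and then applies the Resnik measure (the IC of the most informative common ancestor), which is one well-known IC-based similarity. The tiny term graph and annotation counts are invented for the example and are not real GO data, and the abstract does not specify which measures or formulas the tool itself uses.

```python
import math

# Hypothetical toy DAG: child term -> set of parent terms (not real GO data).
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "rna_binding": {"binding"},
}

# Hypothetical number of annotations made directly to each term.
DIRECT_COUNTS = {
    "root": 0, "binding": 2, "catalysis": 4, "dna_binding": 1, "rna_binding": 1,
}


def ancestors(term):
    """Return the term itself plus all of its ancestors in the DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """Annotation-based IC: -log of the fraction of annotations at or below a term."""
    cumulative = {t: 0 for t in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += count
    total = cumulative["root"]
    return {t: -math.log(c / total) for t, c in cumulative.items() if c > 0}


def resnik(term_a, term_b, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(ic[t] for t in common)


ic = information_content()
sibling_similarity = resnik("dna_binding", "rna_binding", ic)
```

Computing the IC table once and reusing it for every similarity query is the kind of precomputation the abstract credits for rapid responses; here the root term gets an IC of zero, so terms that share only the root are scored as unrelated.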
Spatial cyberinfrastructures, ontologies, and the humanities.
Sieber, Renee E; Wellen, Christopher C; Jin, Yuan
2011-04-05
We report on research into building a cyberinfrastructure for Chinese biographical and geographic data. Our cyberinfrastructure contains (i) the McGill-Harvard-Yenching Library Ming Qing Women's Writings database (MQWW), the only online database on historical Chinese women's writings, (ii) the China Biographical Database, the authority for Chinese historical people, and (iii) the China Historical Geographical Information System, one of the first historical geographic information systems. Key to this integration is that linked databases retain separate identities as bases of knowledge, while they possess sufficient semantic interoperability to allow for multidatabase concepts and to support cross-database queries on an ad hoc basis. Computational ontologies create underlying semantics for database access. This paper focuses on the spatial component in a humanities cyberinfrastructure, which includes issues of conflicting data, heterogeneous data models, disambiguation, and geographic scale. First, we describe the methodology for integrating the databases. Then we detail the system architecture, which includes a tier of ontologies and schema. We describe the user interface and applications that allow for cross-database queries. For instance, users should be able to analyze the data, examine hypotheses on spatial and temporal relationships, and generate historical maps with datasets from MQWW for research, teaching, and publication on Chinese women writers, their familial relations, publishing venues, and the literary and social communities. Last, we discuss the social side of cyberinfrastructure development, as people are considered to be as critical as the technical components for its success.
IPRStats: visualization of the functional potential of an InterProScan run.
Kelly, Ryan J; Vincent, David E; Friedberg, Iddo
2010-12-21
InterPro is a collection of protein signatures for the classification and automated annotation of proteins. InterProScan is a software tool that scans protein sequences against InterPro member databases using a variety of profile-based, hidden Markov model and position-specific scoring matrix methods. It not only combines a set of analysis tools, but also performs data look-up from various sources, as well as some redundancy removal. InterProScan is robust and scalable, able to run on any machine from a netbook to a large cluster. However, when performing whole-genome or metagenome analysis, there is a need for a fast statistical visualization of the results to gain a good initial grasp of the functional potential of the sequences in the analyzed data set. This is especially important when analyzing and comparing metagenomic or metaproteomic data sets. IPRStats is a tool for the visualization of InterProScan results. InterProScan results are parsed from the InterProScan XML or EBIXML file into an SQLite or MySQL database. The results for each signature database scan are read and displayed as pie charts or bar charts as summary statistics. A table is also provided, where each entry is a signature (e.g. a Pfam entry) accompanied by one or more Gene Ontology terms, if InterProScan was run using the Gene Ontology option. We present a platform-independent, open source licensed tool that is useful for InterProScan users who wish to view the summary of their results in a rapid and concise fashion.
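The database-backed summary step described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions, not IPRStats code: the table layout, the function names, and the sample hits are hypothetical, and the XML parsing stage is replaced by an already-parsed list of tuples.

```python
import sqlite3

# Hypothetical, already-parsed hits:
# (protein_id, member_database, signature, go_term or None)
HITS = [
    ("P1", "Pfam", "PF00069", "GO:0004672"),
    ("P2", "Pfam", "PF00069", "GO:0004672"),
    ("P2", "SMART", "SM00220", None),
    ("P3", "Pfam", "PF00071", "GO:0005525"),
]


def load_hits(hits):
    """Store parsed hits in an in-memory SQLite database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hit (protein TEXT, db TEXT, signature TEXT, go TEXT)"
    )
    conn.executemany("INSERT INTO hit VALUES (?, ?, ?, ?)", hits)
    return conn


def signature_counts(conn, db):
    """Per-signature hit counts for one member database, most frequent first."""
    rows = conn.execute(
        "SELECT signature, COUNT(*) FROM hit WHERE db = ? "
        "GROUP BY signature ORDER BY COUNT(*) DESC, signature",
        (db,),
    )
    return rows.fetchall()


conn = load_hits(HITS)
for signature, count in signature_counts(conn, "Pfam"):
    print(signature, count)
```

The `(signature, count)` pairs returned here are the kind of summary that a pie chart or bar chart would be drawn from.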
Ontology and modeling patterns for state-based behavior representation
NASA Technical Reports Server (NTRS)
Castet, Jean-Francois; Rozek, Matthew L.; Ingham, Michel D.; Rouquette, Nicolas F.; Chung, Seung H.; Kerzhner, Aleksandr A.; Donahue, Kenneth M.; Jenkins, J. Steven; Wagner, David A.; Dvorak, Daniel L.;
2015-01-01
This paper provides an approach to capture state-based behavior of elements, that is, the specification of their state evolution in time, and the interactions amongst them. Elements can be components (e.g., sensors, actuators) or environments, and are characterized by state variables that vary with time. The behaviors of these elements, as well as interactions among them are represented through constraints on state variables. This paper discusses the concepts and relationships introduced in this behavior ontology, and the modeling patterns associated with it. Two example cases are provided to illustrate their usage, as well as to demonstrate the flexibility and scalability of the behavior ontology: a simple flashlight electrical model and a more complex spacecraft model involving instruments, power and data behaviors. Finally, an implementation in a SysML profile is provided.
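The idea of representing behavior as constraints on state variables can be illustrated with the flashlight case mentioned above. The sketch below is a deliberately small Python rendering, not the paper's ontology or its SysML profile: the state variables, the single constraint, and all names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical state variables of a flashlight at one instant."""
    switch_closed: bool
    battery_charged: bool
    lamp_on: bool


# Each constraint relates state variables; behavior is whatever satisfies them.
CONSTRAINTS = {
    "lamp_follows_circuit": lambda s: s.lamp_on
    == (s.switch_closed and s.battery_charged),
}


def violated(state):
    """Names of the constraints that do not hold in the given state."""
    return [name for name, holds in CONSTRAINTS.items() if not holds(state)]


def consistent_timeline(timeline):
    """True if every state in a sequence of states satisfies all constraints."""
    return all(not violated(state) for state in timeline)


timeline = [
    State(switch_closed=False, battery_charged=True, lamp_on=False),
    State(switch_closed=True, battery_charged=True, lamp_on=True),
]
print(consistent_timeline(timeline))
print(violated(State(switch_closed=True, battery_charged=False, lamp_on=True)))
```

Adding an element (for example a second lamp) then amounts to adding state variables and constraints rather than rewriting procedural logic, which is one reading of the flexibility the abstract claims.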
Evolving BioAssay Ontology (BAO): modularization, integration and applications
Abeyruwan, Saminda; Vempati, Uma D; Küçük-McGinty, Hande; Visser, Ubbo; Koleti, Amar; Mir, Ahsan; Sakurai, Kunie; Chung, Caty; Bittker, Joshua A; Clemons, Paul A; Brudz, Steve; Siripala, Anosha; Morales, Arturo J; Romacker, Martin; Twomey, David; Bureeva, Svetlana; Lemmon, Vance; Schürer, Stephan C
2014-01-01
The lack of established standards to describe and annotate biological assays and screening outcomes in the domain of drug and chemical probe discovery is a severe limitation to utilize public and proprietary drug screening data to their maximum potential. We have created the BioAssay Ontology (BAO) project (http://bioassayontology.org) to develop common reference metadata terms and definitions required for describing relevant information of low- and high-throughput drug and probe screening assays and results. The main objectives of BAO are to enable effective integration, aggregation, retrieval, and analyses of drug screening data. Since we first released BAO on the BioPortal in 2010 we have considerably expanded and enhanced BAO and we have applied the ontology in several internal and external collaborative projects, for example the BioAssay Research Database (BARD). We describe the evolution of BAO with a design that enables modeling complex assays including profile and panel assays such as those in the Library of Integrated Network-based Cellular Signatures (LINCS). One of the critical questions in evolving BAO is the following: how can we provide a way to efficiently reuse and share among various research projects specific parts of our ontologies without violating the integrity of the ontology and without creating redundancies. This paper provides a comprehensive answer to this question with a description of a methodology for ontology modularization using a layered architecture. Our modularization approach defines several distinct BAO components and separates internal from external modules and domain-level from structural components. This approach facilitates the generation/extraction of derived ontologies (or perspectives) that can suit particular use cases or software applications.
We describe the evolution of BAO related to its formal structures, engineering approaches, and content to enable modeling of complex assays and integration with other ontologies and datasets. PMID:25093074
A Semantic Medical Multimedia Retrieval Approach Using Ontology Information Hiding
Guo, Kehua; Zhang, Shigeng
2013-01-01
Searching for useful information in unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach which can reflect the users' query intent. Firstly, semantic annotations are given to the multimedia documents in the medical multimedia database. Secondly, the ontology that represents the semantic information is hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches. PMID:24082915
VuWiki: An Ontology-Based Semantic Wiki for Vulnerability Assessments
NASA Astrophysics Data System (ADS)
Khazai, Bijan; Kunz-Plapp, Tina; Büscher, Christian; Wegner, Antje
2014-05-01
The concept of vulnerability, as well as its implementation in vulnerability assessments, is used in various disciplines and contexts ranging from disaster management and reduction to ecology, public health or climate change and adaptation, and a corresponding multitude of ideas about how to conceptualize and measure vulnerability exists. Three decades of research in vulnerability have generated a complex and growing body of knowledge that challenges newcomers, practitioners and even experienced researchers. To provide a structured representation of the knowledge field "vulnerability assessment", we have set up an ontology-based semantic wiki for reviewing and representing vulnerability assessments: VuWiki, www.vuwiki.org. Based on a survey of 55 vulnerability assessment studies, we first developed an ontology as an explicit reference system for describing vulnerability assessments. We developed the ontology in a theoretically controlled manner based on general systems theory and guided by principles for ontology development in the field of earth and environment (Raskin and Pan 2005). Four key questions form the first level "branches" or categories of the developed ontology: (1) Vulnerability of what? (2) Vulnerability to what? (3) What reference framework was used in the vulnerability assessment?, and (4) What methodological approach was used in the vulnerability assessment? These questions correspond to the basic, abstract structure of the knowledge domain of vulnerability assessments and have been deduced from theories and concepts of various disciplines. The ontology was then implemented in a semantic wiki which allows for the classification and annotation of vulnerability assessments. As a semantic wiki, VuWiki does not aim at "synthesizing" a holistic and overarching model of vulnerability. 
Instead, it provides both scientists and practitioners with a uniform ontology as a reference system and offers easy and structured access to the knowledge field of vulnerability assessments with the possibility for any user to retrieve assessments using specific research criteria. Furthermore, VuWiki can serve as a collaborative knowledge platform that allows for the active participation of those generating and using the knowledge represented in the wiki.
NASA Astrophysics Data System (ADS)
Macris, Aristomenis M.; Georgakellos, Dimitrios A.
Technology selection decisions such as equipment purchasing and supplier selection are decisions of strategic importance to companies. These decisions are usually complex and unstructured, and thus difficult to capture in a way that is efficiently reusable. Knowledge reusability is of paramount importance since it enables users to participate actively in process design/redesign activities stimulated by the changing technology selection environment. This paper addresses the technology selection problem through an ontology-based approach that captures and makes reusable the equipment purchasing process and assists in (a) identifying the specifications requested by the users' organization, (b) identifying those offered by various candidate vendors' organizations and (c) performing specifications gap analysis as a prerequisite for effective and efficient technology selection. This approach has practical appeal, operational simplicity, and the potential for both immediate and long-term strategic impact. An example from the iron and steel industry is also presented to illustrate the approach.
A Hyperbolic Ontology Visualization Tool for Model Application Programming Interface Documentation
NASA Technical Reports Server (NTRS)
Hyman, Cody
2011-01-01
Spacecraft modeling, a critically important part of validating planned spacecraft activities, is currently carried out using a time-consuming method of mission-to-mission model implementation and integration. A current project in early development, Integrated Spacecraft Analysis (ISCA), aims to remedy this hindrance by providing reusable architectures and reducing time spent integrating models with planning and sequencing tools. The principal objective of this internship was to develop a user interface for an experimental ontology-based structure visualization of navigation and attitude control system modeling software. To satisfy this, a number of tree and graph visualization tools were researched and a Java-based hyperbolic graph viewer was selected for experimental adaptation. Early results show promise in the ability to organize and display large amounts of spacecraft model documentation efficiently and effectively through a web browser. This viewer serves as a conceptual implementation for future development, but trials with both ISCA developers and end users should be performed to truly evaluate the effectiveness of continued development of such visualizations.
Exploration of SWRL Rule Bases through Visualization, Paraphrasing, and Categorization of Rules
NASA Astrophysics Data System (ADS)
Hassanpour, Saeed; O'Connor, Martin J.; Das, Amar K.
Rule bases are increasingly being used as repositories of knowledge content on the Semantic Web. As the size and complexity of these rule bases increase, developers and end users need methods of rule abstraction to facilitate rule management. In this paper, we describe a rule abstraction method for Semantic Web Rule Language (SWRL) rules that is based on lexical analysis and a set of heuristics. Our method results in a tree data structure that we exploit in creating techniques to visualize, paraphrase, and categorize SWRL rules. We evaluate our approach by applying it to several biomedical ontologies that contain SWRL rules, and show how the results reveal rule patterns within the rule base. We have implemented our method as a plug-in tool for Protégé-OWL, the most widely used ontology modeling software for the Semantic Web. Our tool allows users to rapidly explore content and patterns in SWRL rule bases, enabling their acquisition and management.
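As a rough illustration of rule abstraction by lexical analysis, the sketch below splits SWRL rules written in the common human-readable form (`atom ^ atom -> atom`) into atoms, reduces each rule to a signature of predicate names and arities, and groups rules that share a signature. This is a much simpler stand-in for the tree structure and heuristics the abstract describes; the regular expression, the function names, and the sample rules are all assumptions.

```python
import re

# One atom looks like name(arg, arg, ...); names may carry a prefix such as swrlb:
ATOM = re.compile(r"([\w:]+)\(([^)]*)\)")


def parse_rule(rule):
    """Split 'body -> head' and tokenize each side into (predicate, args) atoms."""
    body_text, head_text = rule.split("->")

    def atoms(text):
        return [
            (name, [arg.strip() for arg in args.split(",")])
            for name, args in ATOM.findall(text)
        ]

    return {"body": atoms(body_text), "head": atoms(head_text)}


def signature(parsed):
    """Abstract a rule to predicate/arity pairs, ignoring variable names."""

    def shape(atoms):
        return tuple(sorted(f"{name}/{len(args)}" for name, args in atoms))

    return shape(parsed["body"]), shape(parsed["head"])


def categorize(rules):
    """Group rules whose abstracted signatures are identical."""
    groups = {}
    for rule in rules:
        groups.setdefault(signature(parse_rule(rule)), []).append(rule)
    return groups


# Hypothetical rules; the first two differ only in variable names.
RULES = [
    "Person(?p) ^ hasAge(?p, ?a) -> Adult(?p)",
    "Person(?x) ^ hasAge(?x, ?y) -> Adult(?x)",
    "Drug(?d) -> Treatment(?d)",
]
for sig, members in categorize(RULES).items():
    print(sig, len(members))
```

Grouping by signature is one simple way to surface repeated rule patterns in a rule base; a real implementation would also need to handle built-ins, literals containing parentheses, and nested class expressions.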
Medication Reconciliation: Work Domain Ontology, Prototype Development, and a Predictive Model
Markowitz, Eliz; Bernstam, Elmer V.; Herskovic, Jorge; Zhang, Jiajie; Shneiderman, Ben; Plaisant, Catherine; Johnson, Todd R.
2011-01-01
Medication errors can result from administration inaccuracies at any point of care and are a major cause for concern. To develop a successful Medication Reconciliation (MR) tool, we believe it necessary to build a Work Domain Ontology (WDO) for the MR process. A WDO defines the explicit, abstract, implementation-independent description of the task by separating the task from work context, application technology, and cognitive architecture. We developed a prototype based upon the WDO and designed to adhere to standard principles of interface design. The prototype was compared to Legacy Health System’s and Pre-Admission Medication List Builder MR tools via a Keystroke-Level Model analysis for three MR tasks. The analysis found the prototype requires the fewest mental operations, completes tasks in the fewest steps, and completes tasks in the least amount of time. Accordingly, we believe that developing an MR tool, based upon the WDO and user interface guidelines, improves user efficiency and reduces cognitive load. PMID:22195146
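A Keystroke-Level Model comparison of the kind described above boils down to summing operator times over the action sequence each interface requires. The sketch below shows that arithmetic using commonly cited textbook operator times (K keystroke, P point, H home hands, M mental preparation); the exact values are an assumption here, and the two task sequences are invented for illustration and are not taken from the study.

```python
# Commonly cited KLM operator times in seconds; treat these as assumptions.
OPERATOR_SECONDS = {
    "K": 0.2,   # press a key or button
    "P": 1.1,   # point with a mouse to a target
    "H": 0.4,   # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def klm_time(sequence):
    """Predicted execution time for a string of operator codes."""
    return round(sum(OPERATOR_SECONDS[op] for op in sequence), 2)


def summarize(sequence):
    """The three quantities compared in the abstract: mental ops, steps, time."""
    return {
        "mental_operations": sequence.count("M"),
        "steps": len(sequence),
        "seconds": klm_time(sequence),
    }


# Hypothetical operator sequences for the same task in two tools.
TASKS = {
    "prototype": "MPKPK",
    "legacy": "MPKMPKHMPK",
}
for tool, sequence in TASKS.items():
    print(tool, summarize(sequence))
```

Because the model is purely additive, a design that removes one mental preparation step saves more predicted time than one that removes several keystrokes, which is why the count of mental operations is reported alongside total time.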
Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun
2016-09-14
Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprehends 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs . 
The Ontology of Biological and Clinical Statistics (OBCS) is a community-based open source ontology in the domain of biological and clinical statistics. OBCS is a timely ontology that represents statistics-related terms and their relations in a rigorous fashion, facilitates standard data analysis and integration, and supports reproducible biological and clinical research.
Motivation and Organizational Principles for Anatomical Knowledge Representation
Rosse, Cornelius; Mejino, José L.; Modayur, Bharath R.; Jakobovits, Rex; Hinshaw, Kevin P.; Brinkley, James F.
1998-01-01
Objective: Conceptualization of the physical objects and spaces that constitute the human body at the macroscopic level of organization, specified as a machine-parseable ontology that, in its human-readable form, is comprehensible to both expert and novice users of anatomical information. Design: Conceived as an anatomical enhancement of the UMLS Semantic Network and Metathesaurus, the anatomical ontology was formulated by specifying defining attributes and differentia for classes and subclasses of physical anatomical entities based on their partitive and spatial relationships. The validity of the classification was assessed by instantiating the ontology for the thorax. Several transitive relationships were used for symbolically modeling aspects of the physical organization of the thorax. Results: By declaring Organ as the macroscopic organizational unit of the body, and defining the entities that constitute organs and higher level entities constituted by organs, all anatomical entities could be assigned to one of three top level classes (Anatomical structure, Anatomical spatial entity and Body substance). The ontology accommodates both the systemic and regional (topographical) views of anatomy, as well as diverse clinical naming conventions of anatomical entities. Conclusions: The ontology formulated for the thorax is extendible to microscopic and cellular levels, as well as to other body parts, in that its classes subsume essentially all anatomical entities that constitute the body. Explicit definitions of these entities and their relationships provide the first requirement for standards in anatomical concept representation. Conceived from an anatomical viewpoint, the ontology can be generalized and mapped to other biomedical domains and problem solving tasks that require anatomical knowledge. PMID:9452983
SciFlo: Semantically-Enabled Grid Workflow for Collaborative Science
NASA Astrophysics Data System (ADS)
Yunck, T.; Wilson, B. D.; Raskin, R.; Manipon, G.
2005-12-01
SciFlo is a system for Scientific Knowledge Creation on the Grid using a Semantically-Enabled Dataflow Execution Environment. SciFlo leverages Simple Object Access Protocol (SOAP) Web Services and the Grid Computing standards (WS-* standards and the Globus Alliance toolkits), and enables scientists to do multi-instrument Earth Science by assembling reusable SOAP Services, native executables, local command-line scripts, and python codes into a distributed computing flow (a graph of operators). SciFlo's XML dataflow documents can be a mixture of concrete operators (fully bound operations) and abstract template operators (late binding via semantic lookup). All data objects and operators can be both simply typed (simple and complex types in XML schema) and semantically typed using controlled vocabularies (linked to OWL ontologies such as SWEET). By exploiting ontology-enhanced search and inference, one can discover (and automatically invoke) Web Services and operators that have been semantically labeled as performing the desired transformation, and adapt a particular invocation to the proper interface (number, types, and meaning of inputs and outputs). The SciFlo client & server engines optimize the execution of such distributed data flows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. The scientist injects a distributed computation into the Grid by simply filling out an HTML form or directly authoring the underlying XML dataflow document, and results are returned directly to the scientist's desktop. A Visual Programming tool is also being developed, but it is not required. Once an analysis has been specified for a granule or day of data, it can be easily repeated with different control parameters and over months or years of data. SciFlo uses and preserves semantics, and also generates and infers new semantic annotations. 
Specifically, the SciFlo engine uses semantic metadata to understand (infer) what it is doing and potentially improve the data flow; preserves semantics by saving links to the semantics of (metadata describing) the input datasets, related datasets, and the data transformations (algorithms) used to generate downstream products; generates new metadata by allowing the user to add semantic annotations to the generated data products (or simply accept automatically generated provenance annotations); and infers new semantic metadata by understanding and applying logic to the semantics of the data and the transformations performed. Much ontology development still needs to be done but, nevertheless, SciFlo documents provide a substrate for using and preserving more semantics as ontologies develop. We will give a live demonstration of the growing SciFlo network using an example dataflow in which atmospheric temperature and water vapor profiles from three Earth Observing System (EOS) instruments are retrieved using SOAP (geo-location query & data access) services, co-registered, and visually & statistically compared on demand (see http://sciflo.jpl.nasa.gov for more information).
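The notion of a dataflow as a graph of operators can be sketched in a few lines with Python's standard `graphlib` module (Python 3.9 or later). This is only a local, single-process illustration of dependency-ordered execution: the operator names, the `run_dataflow` function, and the sample values are hypothetical, and none of SciFlo's SOAP services, XML documents, or semantic lookup is modeled.

```python
from graphlib import TopologicalSorter


def run_dataflow(operators, dependencies, inputs):
    """Run operators in dependency order.

    operators:    name -> callable taking the upstream results as arguments
    dependencies: name -> ordered list of upstream node names
    inputs:       name -> value for nodes that are plain data sources
    """
    results = dict(inputs)
    for name in TopologicalSorter(dependencies).static_order():
        if name in results:
            continue  # data source supplied by the caller
        upstream = [results[dep] for dep in dependencies.get(name, [])]
        results[name] = operators[name](*upstream)
    return results


# Hypothetical flow: two retrieved profiles are co-registered, then compared.
DEPENDENCIES = {
    "coregister": ["temperature", "water_vapor"],
    "compare": ["coregister"],
}
OPERATORS = {
    "coregister": lambda temps, vapors: list(zip(temps, vapors)),
    "compare": lambda pairs: len(pairs),
}
INPUTS = {"temperature": [250.0, 260.0], "water_vapor": [1.0, 2.0]}

results = run_dataflow(OPERATORS, DEPENDENCIES, INPUTS)
print(results["coregister"])
print(results["compare"])
```

Keeping the graph description (`DEPENDENCIES`) separate from the operator implementations is what makes it possible to rerun the same analysis with different inputs or to swap in a different operator for a node.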
Automated Database Mediation Using Ontological Metadata Mappings
Marenco, Luis; Wang, Rixin; Nadkarni, Prakash
2009-01-01
Objective: To devise an automated approach for integrating federated database information using database ontologies constructed from their extended metadata. Background: One challenge of database federation is that the granularity of representation of equivalent data varies across systems. Dealing effectively with this problem is analogous to dealing with precoordinated vs. postcoordinated concepts in biomedical ontologies. Model Description: The authors describe an approach based on ontological metadata mapping rules defined with elements of a global vocabulary, which allows a query specified at one granularity level to fetch data, where possible, from databases within the federation that use different granularities. This is implemented in OntoMediator, a newly developed production component of our previously described Query Integrator System. OntoMediator's operation is illustrated with a query that accesses three geographically separate, interoperating databases. An example based on SNOMED also illustrates the applicability of high-level rules to support the enforcement of constraints that can prevent inappropriate curator or power-user actions. Summary: A rule-based framework simplifies the design and maintenance of systems where categories of data must be mapped to each other, for the purpose of either cross-database query or curation of the contents of compositional controlled vocabularies. PMID:19567801
Textpresso: An Ontology-Based Information Retrieval and Extraction System for Biological Literature
Müller, Hans-Michael; Kenny, Eimear E
2004-01-01
We have developed Textpresso, a new text-mining system for scientific literature whose capabilities go far beyond those of a simple keyword search engine. Textpresso's two major elements are a collection of the full text of scientific articles split into individual sentences, and the implementation of categories of terms for which a database of articles and individual sentences can be searched. The categories are classes of biological concepts (e.g., gene, allele, cell or cell group, phenotype, etc.) and classes that relate two objects (e.g., association, regulation, etc.) or describe one (e.g., biological process, etc.). Together they form a catalog of types of objects and concepts called an ontology. After this ontology is populated with terms, the whole corpus of articles and abstracts is marked up to identify terms of these categories. The current ontology comprises 33 categories of terms. A search engine enables the user to search for one or a combination of these tags and/or keywords within a sentence or document, and as the ontology allows word meaning to be queried, it is possible to formulate semantic queries. Full text access increases recall of biological data types from 45% to 95%. Extraction of particular biological facts, such as gene-gene interactions, can be accelerated significantly by ontologies, with Textpresso automatically performing nearly as well as expert curators to identify sentences; in searches for two uniquely named genes and an interaction term, the ontology confers a 3-fold increase of search efficiency. Textpresso currently focuses on Caenorhabditis elegans literature, with 3,800 full text articles and 16,000 abstracts. The lexicon of the ontology contains 14,500 entries, each of which includes all versions of a specific word or phrase, and it includes all categories of the Gene Ontology database. 
Textpresso is a useful curation tool, as well as search engine for researchers, and can readily be extended to other organism-specific corpora of text. Textpresso can be accessed at http://www.textpresso.org or via WormBase at http://www.wormbase.org. PMID:15383839
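The category-based search described above can be illustrated with a toy lexicon and a sentence-level filter. The sketch below is not Textpresso code: the lexicon entries, gene-like names, and sentences are invented placeholders, and the tokenizer is a simple regular expression. It shows the shape of a query such as "two distinct gene names plus a regulation term in one sentence".

```python
import re

# Hypothetical lexicon: category -> set of lowercase terms (placeholder names).
LEXICON = {
    "gene": {"abc-1", "xyz-2", "qrs-3"},
    "regulation": {"regulates", "represses", "activates"},
}


def tag_sentence(sentence):
    """Map each category to the lexicon terms found in the sentence."""
    tokens = re.findall(r"[\w-]+", sentence.lower())
    tags = {}
    for category, terms in LEXICON.items():
        hits = [token for token in tokens if token in terms]
        if hits:
            tags[category] = hits
    return tags


def search(sentences, min_counts):
    """Keep sentences with at least n distinct terms for each category."""
    matches = []
    for sentence in sentences:
        tags = tag_sentence(sentence)
        if all(
            len(set(tags.get(category, []))) >= n
            for category, n in min_counts.items()
        ):
            matches.append(sentence)
    return matches


# Made-up sentences used only to exercise the filter.
SENTENCES = [
    "abc-1 activates xyz-2 in this made-up example.",
    "xyz-2 is mentioned alone here.",
    "abc-1 and abc-1 again, but nothing regulates a second gene.",
]
print(search(SENTENCES, {"gene": 2, "regulation": 1}))
```

Splitting the corpus into sentences before tagging is what lets a query require that the terms co-occur closely, rather than anywhere in a document.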
Service composition towards increasing end-user accessibility.
Kaklanis, Nikolaos; Votis, Konstantinos; Tzovaras, Dimitrios
2015-01-01
This paper presents the Cloud4all Service Synthesizer Tool, a framework that enables efficient orchestration of accessibility services, as well as their combination into complex forms, providing more advanced functionalities for increasing accessibility for end-users with various types of functional limitations. The supported services are described formally within an ontology, thus enabling semantic service composition. The proposed service composition approach is based on semantic matching between service specifications on the one hand and user needs/preferences and the current context of use on the other. The use of automatic composition of accessibility services can significantly enhance end-users' accessibility, especially in cases where assistive solutions are not available on their device.
Towards refactoring the Molecular Function Ontology with a UML profile for function modeling.
Burek, Patryk; Loebe, Frank; Herre, Heinrich
2017-10-04
Gene Ontology (GO) is the largest resource for cataloging gene products. This resource grows steadily and, naturally, this growth raises issues regarding the structure of the ontology. Moreover, modeling and refactoring large ontologies such as GO is generally far from being simple, as a whole as well as when focusing on certain aspects or fragments. It seems that human-friendly graphical modeling languages such as the Unified Modeling Language (UML) could be helpful in connection with these tasks. We investigate the use of UML for making the structural organization of the Molecular Function Ontology (MFO), a sub-ontology of GO, more explicit. More precisely, we present a UML dialect, called the Function Modeling Language (FueL), which is suited for capturing functions in an ontologically founded way. FueL is equipped, among other features, with language elements that arise from studying patterns of subsumption between functions. We show how to use this UML dialect for capturing the structure of molecular functions. Furthermore, we propose and discuss some refactoring options concerning fragments of MFO. FueL enables the systematic, graphical representation of functions and their interrelations, including making information explicit that is currently either implicit in MFO or is mainly captured in textual descriptions. Moreover, the considered subsumption patterns lend themselves to the methodical analysis of refactoring options with respect to MFO. On this basis we argue that the approach can increase the comprehensibility of the structure of MFO for humans and can support communication, for example, during revision and further development.
A Semantically Enabled Metadata Repository for Solar Irradiance Data Products
NASA Astrophysics Data System (ADS)
Wilson, A.; Cox, M.; Lindholm, D. M.; Nadiadi, I.; Traver, T.
2014-12-01
The Laboratory for Atmospheric and Space Physics (LASP) has been conducting research in atmospheric and space science for over 60 years and providing the associated data products to the public. LASP has a long history, in particular, of making space-based measurements of the solar irradiance, which serves as crucial input to several areas of scientific research, including solar-terrestrial interactions, atmospheric science, and climate. LISIRD, the LASP Interactive Solar Irradiance Data Center, serves these datasets to the public, including solar spectral irradiance (SSI) and total solar irradiance (TSI) data. The LASP extended metadata repository, LEMR, is a database of information about the datasets served by LASP, such as parameters, uncertainties, temporal and spectral ranges, current version, alerts, etc. It serves as the definitive, single source of truth for that information. The database is populated with information garnered via web forms and automated processes. Dataset owners keep the information current and verified for datasets under their purview. This information can be pulled dynamically for many purposes. Web sites such as LISIRD can include this information in web page content as it is rendered, ensuring users get current, accurate information. It can also be pulled to create metadata records in various metadata formats, such as SPASE (for heliophysics) and ISO 19115. Once these records are made available to the appropriate registries, our data will be discoverable by users coming in via those organizations. The database is implemented as an RDF triplestore, a collection of instances of subject-predicate-object data entities identifiable with a URI. This capability, coupled with SPARQL over HTTP read access, enables semantic queries over the repository contents. To create the repository we leveraged VIVO, an open source semantic web application, to manage and create new ontologies and populate repository content.
A variety of ontologies were used in creating the triplestore, including ontologies that came with VIVO such as FOAF. Also, the W3C DCAT ontology was integrated and extended to describe properties of our data products that we needed to capture, such as spectral range. The presentation will describe the architecture, ontology issues, and tools used to create LEMR and plans for its evolution.
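To make the triplestore idea concrete, the following is a minimal sketch of how metadata held as RDF-style triples can be queried by pattern matching, which is the core operation behind a SPARQL basic graph pattern. It is not the LEMR implementation: the `ex:` identifiers, the `ex:spectralRange` property, the placeholder values, and the helper names `match` and `dataset_titles` are all assumptions made for illustration.

```python
from typing import Optional

Triple = tuple[str, str, str]

# Hypothetical triples. The ex: URIs, the ex:spectralRange property and all
# values are illustrative placeholders, not actual LEMR content.
TRIPLES: list[Triple] = [
    ("ex:dataset/ssi", "rdf:type", "dcat:Dataset"),
    ("ex:dataset/ssi", "dct:title", "Solar Spectral Irradiance"),
    ("ex:dataset/ssi", "ex:spectralRange", "example-range"),
    ("ex:dataset/tsi", "rdf:type", "dcat:Dataset"),
    ("ex:dataset/tsi", "dct:title", "Total Solar Irradiance"),
]


def match(
    triples: list[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def dataset_titles(triples: list[Triple]) -> list[str]:
    """Join two patterns: '?d rdf:type dcat:Dataset' and '?d dct:title ?t'."""
    datasets = {s for s, _, _ in match(triples, p="rdf:type", o="dcat:Dataset")}
    return sorted(o for s, _, o in match(triples, p="dct:title") if s in datasets)
```

Calling `dataset_titles(TRIPLES)` joins the two patterns on the shared subject, in the way a SPARQL query with two triple patterns would; a real deployment would send such a query to the repository's SPARQL endpoint over HTTP instead.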
A Framework for Integrating Oceanographic Data Repositories
NASA Astrophysics Data System (ADS)
Rozell, E.; Maffei, A. R.; Beaulieu, S. E.; Fox, P. A.
2010-12-01
Oceanographic research covers a broad range of science domains and requires a tremendous amount of cross-disciplinary collaboration. Advances in cyberinfrastructure are making it easier to share data across disciplines through the use of web services and community vocabularies. Best practices in the design of web services and vocabularies to support interoperability amongst science data repositories are only starting to emerge. Strategic design decisions in these areas are crucial to the creation of end-user data and application integration tools. We present S2S, a novel framework for deploying customizable user interfaces to support the search and analysis of data from multiple repositories. Our research methods follow the Semantic Web methodology and technology development process developed by Fox et al. This methodology stresses the importance of close scientist-technologist interactions when developing scientific use cases, keeping the project well scoped and ensuring the result meets a real scientific need. The S2S framework motivates the development of standardized web services with well-described parameters, as well as the integration of existing web services and applications in the search and analysis of data. S2S also encourages the use and development of community vocabularies and ontologies to support federated search and reduce the amount of domain expertise required in the data discovery process. S2S utilizes the Web Ontology Language (OWL) to describe the components of the framework, including web service parameters, and OpenSearch as a standard description for web services, particularly search services for oceanographic data repositories. We have created search services for an oceanographic metadata database, a large set of quality-controlled ocean profile measurements, and a biogeographic search service. 
S2S provides an application programming interface (API) that can be used to generate custom user interfaces, supporting data and application integration across these repositories and other web resources. Although initially targeted towards a general oceanographic audience, the S2S framework shows promise in many science domains, inspired in part by the broad disciplinary coverage of oceanography. This presentation will cover the challenges addressed by the S2S framework, the research methods used in its development, and the resulting architecture for the system. It will demonstrate how S2S is remarkably extensible, and can be generalized to many science domains. Given these characteristics, the framework can simplify the process of data discovery and analysis for the end user, and can help to shift the responsibility of search interface development away from data managers.
Evaluating Health Information Systems Using Ontologies
Anderberg, Peter; Larsson, Tobias C; Fricker, Samuel A; Berglund, Johan
2016-01-01
Background: There are several frameworks that attempt to address the challenges of evaluation of health information systems by offering models, methods, and guidelines about what to evaluate, how to evaluate, and how to report the evaluation results. Model-based evaluation frameworks usually suggest universally applicable evaluation aspects but do not consider case-specific aspects. On the other hand, evaluation frameworks that are case specific, by eliciting user requirements, limit their output to the evaluation aspects suggested by the users in the early phases of system development. In addition, these case-specific approaches extract different sets of evaluation aspects from each case, making it challenging to collectively compare, unify, or aggregate the evaluation of a set of heterogeneous health information systems. Objectives: The aim of this paper is to find a method capable of suggesting evaluation aspects for a set of one or more health information systems, whether similar or heterogeneous, by organizing, unifying, and aggregating the quality attributes extracted from those systems and from an external evaluation framework. Methods: On the basis of the available literature in semantic networks and ontologies, a method (called Unified eValuation using Ontology; UVON) was developed that can organize, unify, and aggregate the quality attributes of several health information systems into a tree-style ontology structure. The method was extended to integrate its generated ontology with the evaluation aspects suggested by model-based evaluation frameworks. An approach was developed to extract evaluation aspects from the ontology that also considers evaluation case practicalities such as the maximum number of evaluation aspects to be measured or their required degree of specificity.
The method was applied and tested in Future Internet Social and Technological Alignment Research (FI-STAR), a project of 7 cloud-based eHealth applications that were developed and deployed across European Union countries. Results: The relevance of the evaluation aspects created by the UVON method for the FI-STAR project was validated by the corresponding stakeholders of each case. These evaluation aspects were extracted from a UVON-generated ontology structure that reflects both the internally declared required quality attributes in the 7 eHealth applications of the FI-STAR project and the evaluation aspects recommended by the Model for ASsessment of Telemedicine applications (MAST) evaluation framework. The extracted evaluation aspects were used to create questionnaires (for the corresponding patients and health professionals) to evaluate each individual case and the whole of the FI-STAR project. Conclusions: The UVON method can provide a relevant set of evaluation aspects for a heterogeneous set of health information systems by organizing, unifying, and aggregating the quality attributes through ontological structures. Those quality attributes can be either suggested by evaluation models or elicited from the stakeholders of those systems in the form of system requirements. The method continues to be systematic, context sensitive, and relevant across a heterogeneous set of health information systems. PMID:27311735
Evaluating Health Information Systems Using Ontologies.
Eivazzadeh, Shahryar; Anderberg, Peter; Larsson, Tobias C; Fricker, Samuel A; Berglund, Johan
2016-06-16
There are several frameworks that attempt to address the challenges of evaluation of health information systems by offering models, methods, and guidelines about what to evaluate, how to evaluate, and how to report the evaluation results. Model-based evaluation frameworks usually suggest universally applicable evaluation aspects but do not consider case-specific aspects. On the other hand, evaluation frameworks that are case specific, by eliciting user requirements, limit their output to the evaluation aspects suggested by the users in the early phases of system development. In addition, these case-specific approaches extract different sets of evaluation aspects from each case, making it challenging to collectively compare, unify, or aggregate the evaluation of a set of heterogeneous health information systems. The aim of this paper is to find a method capable of suggesting evaluation aspects for a set of one or more health information systems-whether similar or heterogeneous-by organizing, unifying, and aggregating the quality attributes extracted from those systems and from an external evaluation framework. On the basis of the available literature in semantic networks and ontologies, a method (called Unified eValuation using Ontology; UVON) was developed that can organize, unify, and aggregate the quality attributes of several health information systems into a tree-style ontology structure. The method was extended to integrate its generated ontology with the evaluation aspects suggested by model-based evaluation frameworks. An approach was developed to extract evaluation aspects from the ontology that also considers evaluation case practicalities such as the maximum number of evaluation aspects to be measured or their required degree of specificity. 
The method was applied and tested in Future Internet Social and Technological Alignment Research (FI-STAR), a project of 7 cloud-based eHealth applications that were developed and deployed across European Union countries. The relevance of the evaluation aspects created by the UVON method for the FI-STAR project was validated by the corresponding stakeholders of each case. These evaluation aspects were extracted from a UVON-generated ontology structure that reflects both the internally declared required quality attributes in the 7 eHealth applications of the FI-STAR project and the evaluation aspects recommended by the Model for ASsessment of Telemedicine applications (MAST) evaluation framework. The extracted evaluation aspects were used to create questionnaires (for the corresponding patients and health professionals) to evaluate each individual case and the whole of the FI-STAR project. The UVON method can provide a relevant set of evaluation aspects for a heterogeneous set of health information systems by organizing, unifying, and aggregating the quality attributes through ontological structures. Those quality attributes can be either suggested by evaluation models or elicited from the stakeholders of those systems in the form of system requirements. The method continues to be systematic, context sensitive, and relevant across a heterogeneous set of health information systems.
Intelligence Reach for Expertise (IREx)
NASA Astrophysics Data System (ADS)
Hadley, Christina; Schoening, James R.; Schreiber, Yonatan
2015-05-01
IREx is a search engine for next-generation analysts to find collaborators. U.S. Army Field Manual 2.0 (Intelligence) calls for collaboration within and outside the area of operations, but finding the best collaborator for a given task can be challenging. IREx will be demonstrated as part of Actionable Intelligence Technology Enabled Capability Demonstration (AI-TECD) at the E15 field exercises at Ft. Dix in July 2015. It includes a Task Model for describing a task and its prerequisite competencies, plus a User Model (i.e., a user profile) for individuals to assert their capabilities and other relevant data. These models build on a canonical suite of ontologies, which enables robust queries and also keeps the models logically consistent. IREx also supports learning validation, where a learner who has completed a course module can search and find a suitable task to practice and demonstrate that their new knowledge can be used in the real world for its intended purpose. The IREx models are in the initial phase of a process to develop them as an IEEE standard. This initiative is currently an approved IEEE Study Group; the next steps are a standards working group, then a balloting group, and, if all goes well, an IEEE standard.
NASA Astrophysics Data System (ADS)
Wright, D. J.; Lassoued, Y.; Dwyer, N.; Haddad, T.; Bermudez, L. E.; Dunne, D.
2009-12-01
Coastal mapping plays an important role in informing marine spatial planning, resource management, maritime safety, hazard assessment and even national sovereignty. As such, there is now a plethora of data/metadata catalogs, pre-made maps, tabular and text information on resource availability and exploitation, and decision-making tools. A recent trend has been to encapsulate these in a special class of web-enabled geographic information systems called a coastal web atlas (CWA). While multiple benefits are derived from tailor-made atlases, there is great value added from the integration of disparate CWAs. CWAs linked to one another can query more successfully to optimize planning and decision-making. If a dataset is missing in one atlas, it may be immediately located in another. Similar datasets in two atlases may be combined to enhance study in either region. But how best to achieve semantic interoperability to mitigate vague data queries, concepts or natural language semantics when retrieving and integrating data and information? We report on the development of a new prototype seeking to interoperate between two initial CWAs: the Marine Irish Digital Atlas (MIDA) and the Oregon Coastal Atlas (OCA). These two mature atlases are used as a testbed for more regional connections, with the intent for the OCA to use lessons learned to develop a regional network of CWAs along the west coast, and for MIDA to do the same in building and strengthening atlas networks with the UK, Belgium, and other parts of Europe. Our prototype uses semantic interoperability via services harmonization and ontology mediation, allowing local atlases to use their own data structures and vocabularies (ontologies). We use standard technologies such as OGC Web Map Services (WMS) for delivering maps, and OGC Catalogue Service for the Web (CSW) for delivering and querying ISO-19139 metadata. The metadata records of a given CWA use a given ontology of terms, called the local ontology.
Human or machine users formulate their requests using a common ontology of metadata terms, called the global ontology. A CSW mediator rewrites the user’s request into CSW requests over local CSWs using their own (local) ontologies, collects the results and sends them back to the user. To extend the system, we have recently added global maritime boundaries and are also considering nearshore ocean observing system data. Ongoing work includes adding WFS, error management, and exception handling, enabling Smart Searches, and writing full documentation. This prototype is a central research project of the new International Coastal Atlas Network (ICAN), a group of 30+ organizations from 14 nations (and growing) dedicated to seeking interoperability approaches to CWAs in support of coastal zone management and the translation of coastal science to coastal decision-making.
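The mediation step described above (rewrite a global-ontology request into local-ontology requests, query each local catalog, merge the results) can be sketched as follows. This is a toy illustration under stated assumptions, not the ICAN prototype: the term mappings, the in-memory catalogs standing in for local CSW endpoints, and the function names `rewrite` and `mediate` are all hypothetical.

```python
from typing import Optional

# Hypothetical mappings from global-ontology terms to each atlas's local term.
LOCAL_ONTOLOGIES: dict[str, dict[str, str]] = {
    "MIDA": {"shoreline": "coastline", "seabed": "sea floor"},
    "OCA": {"shoreline": "shore", "seabed": "seafloor"},
}

# Hypothetical stand-ins for the local CSW catalogs: local keyword -> records.
LOCAL_CATALOGS: dict[str, dict[str, list[str]]] = {
    "MIDA": {"coastline": ["MIDA coastline record"]},
    "OCA": {"shore": ["OCA shore record"], "seafloor": ["OCA seafloor record"]},
}


def rewrite(global_term: str, atlas: str) -> Optional[str]:
    """Translate a global term into one atlas's local vocabulary, if mapped."""
    return LOCAL_ONTOLOGIES[atlas].get(global_term)


def mediate(global_term: str) -> list[tuple[str, str]]:
    """Rewrite the request per atlas, query each catalog, and merge results."""
    results: list[tuple[str, str]] = []
    for atlas in sorted(LOCAL_ONTOLOGIES):
        local_term = rewrite(global_term, atlas)
        if local_term is None:
            continue  # this atlas has no equivalent concept
        for record in LOCAL_CATALOGS[atlas].get(local_term, []):
            results.append((atlas, record))
    return results
```

The point of the design is that each atlas keeps its own vocabulary; only the mediator needs to know the mappings, so adding a new atlas means adding one mapping table and leaving the existing catalogs unchanged.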
Finding My Needle in the Haystack: Effective Personalized Re-ranking of Search Results in Prospector
NASA Astrophysics Data System (ADS)
König, Florian; van Velsen, Lex; Paramythis, Alexandros
This paper provides an overview of Prospector, a personalized Internet meta-search engine, which utilizes a combination of ontological information, ratings-based models of user interests, and complementary theme-oriented group models to recommend (through re-ranking) search results obtained from an underlying search engine. Re-ranking brings “closer to the top” those items that are of particular interest to a user or have high relevance to a given theme. A user-based, real-world evaluation has shown that the system is effective in promoting results of interest, but lags behind Google in user acceptance, possibly due to the absence of features popularized by said search engine. Overall, users would consider employing a personalized search engine to perform searches with terms that require disambiguation and/or contextualization.
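As an illustration of personalized re-ranking in general, the sketch below blends a result's original position with a per-topic interest score from a user model. The abstract does not give Prospector's scoring formula, so the linear blend, the `weight` parameter, the single-topic labelling of results, and the function name `rerank` are all assumptions.

```python
def rerank(
    results: list[tuple[str, str]],
    interests: dict[str, float],
    weight: float = 0.5,
) -> list[str]:
    """Re-rank (url, topic) results from an underlying engine.

    Assumed scoring, for illustration only:
        score = weight * interest(topic) + (1 - weight) * rank_score
    where rank_score falls linearly from 1 for the first result, and
    interest values are taken to lie in [0, 1].
    """
    n = len(results)
    scored = []
    for pos, (url, topic) in enumerate(results):
        rank_score = (n - pos) / n
        interest = interests.get(topic, 0.0)
        score = weight * interest + (1 - weight) * rank_score
        # -pos breaks ties in favour of the original ordering.
        scored.append((score, -pos, url))
    scored.sort(reverse=True)
    return [url for _, _, url in scored]
```

With `weight=0` the original order is preserved; raising `weight` moves results on topics the user has rated highly closer to the top.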
Fast and Accurate Metadata Authoring Using Ontology-Based Recommendations.
Martínez-Romero, Marcos; O'Connor, Martin J; Shankar, Ravi D; Panahiazar, Maryam; Willrett, Debra; Egyedi, Attila L; Gevaert, Olivier; Graybeal, John; Musen, Mark A
2017-01-01
In biomedicine, high-quality metadata are crucial for finding experimental datasets, for understanding how experiments were performed, and for reproducing those experiments. Despite the recent focus on metadata, the quality of metadata available in public repositories continues to be extremely poor. A key difficulty is that the typical metadata acquisition process is time-consuming and error prone, with weak or nonexistent support for linking metadata to ontologies. There is a pressing need for methods and tools to speed up the metadata acquisition process and to increase the quality of metadata that are entered. In this paper, we describe a methodology and set of associated tools that we developed to address this challenge. A core component of this approach is a value recommendation framework that uses analysis of previously entered metadata and ontology-based metadata specifications to help users rapidly and accurately enter their metadata. We performed an initial evaluation of this approach using metadata from a public metadata repository.
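One simple way to picture a value recommender of this kind is to rank candidate values for a field by how often they occur in previously entered records, keeping only values permitted by the field's ontology-based specification. The sketch below does exactly that and nothing more; it is not the authors' framework, and the field name, the allowed-term set, and the function name `recommend` are hypothetical.

```python
from collections import Counter


def recommend(
    field: str,
    previous_records: list[dict[str, str]],
    allowed_terms: set[str],
    top_n: int = 3,
) -> list[tuple[str, float]]:
    """Suggest values for `field`, ranked by relative frequency.

    Only values that appear in `allowed_terms` (standing in for the terms
    an ontology-based specification permits) are counted, so misspelled
    or free-text entries never become suggestions.
    """
    counts = Counter(
        record[field]
        for record in previous_records
        if field in record and record[field] in allowed_terms
    )
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(value, count / total) for value, count in ranked[:top_n]]
```

For example, given earlier records whose hypothetical `tissue` field holds `liver`, `liver`, `lung` and the misspelling `livr`, and an allowed set containing `liver`, `lung` and `kidney`, the function suggests `liver` first and `lung` second and ignores the misspelling.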
Fast and Accurate Metadata Authoring Using Ontology-Based Recommendations
Martínez-Romero, Marcos; O’Connor, Martin J.; Shankar, Ravi D.; Panahiazar, Maryam; Willrett, Debra; Egyedi, Attila L.; Gevaert, Olivier; Graybeal, John; Musen, Mark A.
2017-01-01
In biomedicine, high-quality metadata are crucial for finding experimental datasets, for understanding how experiments were performed, and for reproducing those experiments. Despite the recent focus on metadata, the quality of metadata available in public repositories continues to be extremely poor. A key difficulty is that the typical metadata acquisition process is time-consuming and error prone, with weak or nonexistent support for linking metadata to ontologies. There is a pressing need for methods and tools to speed up the metadata acquisition process and to increase the quality of metadata that are entered. In this paper, we describe a methodology and set of associated tools that we developed to address this challenge. A core component of this approach is a value recommendation framework that uses analysis of previously entered metadata and ontology-based metadata specifications to help users rapidly and accurately enter their metadata. We performed an initial evaluation of this approach using metadata from a public metadata repository. PMID:29854196
SPARQL Assist language-neutral query composer
2012-01-01
Background: SPARQL query composition is difficult for the lay-person, and even the experienced bioinformatician in cases where the data model is unfamiliar. Moreover, established best-practices and internationalization concerns dictate that the identifiers for ontological terms should be opaque rather than human-readable, which further complicates the task of synthesizing queries manually. Results: We present SPARQL Assist: a Web application that addresses these issues by providing context-sensitive type-ahead completion during SPARQL query construction. Ontological terms are suggested using their multi-lingual labels and descriptions, leveraging existing support for internationalization and language-neutrality. Moreover, the system utilizes the semantics embedded in ontologies, and within the query itself, to help prioritize the most likely suggestions. Conclusions: To ensure success, the Semantic Web must be easily available to all users, regardless of locale, training, or preferred language. By enhancing support for internationalization, and moreover by simplifying the manual construction of SPARQL queries through the use of controlled-natural-language interfaces, we believe we have made some early steps towards simplifying access to Semantic Web resources. PMID:22373327
SPARQL assist language-neutral query composer.
McCarthy, Luke; Vandervalk, Ben; Wilkinson, Mark
2012-01-25
SPARQL query composition is difficult for the lay-person, and even the experienced bioinformatician in cases where the data model is unfamiliar. Moreover, established best-practices and internationalization concerns dictate that the identifiers for ontological terms should be opaque rather than human-readable, which further complicates the task of synthesizing queries manually. We present SPARQL Assist: a Web application that addresses these issues by providing context-sensitive type-ahead completion during SPARQL query construction. Ontological terms are suggested using their multi-lingual labels and descriptions, leveraging existing support for internationalization and language-neutrality. Moreover, the system utilizes the semantics embedded in ontologies, and within the query itself, to help prioritize the most likely suggestions. To ensure success, the Semantic Web must be easily available to all users, regardless of locale, training, or preferred language. By enhancing support for internationalization, and moreover by simplifying the manual construction of SPARQL queries through the use of controlled-natural-language interfaces, we believe we have made some early steps towards simplifying access to Semantic Web resources.
Morrison, Norman; Hancock, David; Hirschman, Lynette; Dawyndt, Peter; Verslyppe, Bert; Kyrpides, Nikos; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver; Grethe, Jeff; Booth, Tim; Sterk, Peter; Nenadic, Goran; Field, Dawn
2011-04-29
In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Using Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, Silva and Pubmed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on both the quality of data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources.
DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures
2013-01-01
Background: The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. Results: We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. Conclusions: The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis. PMID:24067102
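For readers unfamiliar with information content, the sketch below shows one common annotation-based definition, IC(t) = -ln p(t), where p(t) is the fraction of annotated proteins that carry term t or any of its descendants, together with Resnik's similarity (the IC of the most informative common ancestor). It is an illustration on a toy hierarchy, not DaGO-Fun's code: the term names, protein identifiers, and function names are all made up.

```python
import math

# Toy hierarchy (term -> parents) and toy annotations; purely hypothetical,
# not real GO terms or GOA data.
PARENTS: dict[str, list[str]] = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
}
ANNOTATIONS: dict[str, set[str]] = {
    "p1": {"A1"},
    "p2": {"A2"},
    "p3": {"B"},
    "p4": {"A1"},
}


def ancestors(term: str) -> set[str]:
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content() -> dict[str, float]:
    """IC(t) = -ln(p(t)); annotations propagate up to every ancestor."""
    counts = {term: 0 for term in PARENTS}
    for terms in ANNOTATIONS.values():
        covered: set[str] = set()
        for term in terms:
            covered |= ancestors(term)
        for term in covered:
            counts[term] += 1  # each protein counted once per term
    total = counts["root"]
    return {t: -math.log(c / total) for t, c in counts.items() if c > 0}


def resnik(t1: str, t2: str, ic: dict[str, float]) -> float:
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(ic[t] for t in common if t in ic)
```

Because the counts depend only on the ontology and the annotation set, the IC table can be computed once and reused for every query, which is the reason precomputing it allows a rapid response.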
Spatial cyberinfrastructures, ontologies, and the humanities
Sieber, Renee E.; Wellen, Christopher C.; Jin, Yuan
2011-01-01
We report on research into building a cyberinfrastructure for Chinese biographical and geographic data. Our cyberinfrastructure contains (i) the McGill-Harvard-Yenching Library Ming Qing Women's Writings database (MQWW), the only online database on historical Chinese women's writings, (ii) the China Biographical Database, the authority for Chinese historical people, and (iii) the China Historical Geographical Information System, one of the first historical geographic information systems. Key to this integration is that linked databases retain separate identities as bases of knowledge, while they possess sufficient semantic interoperability to allow for multidatabase concepts and to support cross-database queries on an ad hoc basis. Computational ontologies create underlying semantics for database access. This paper focuses on the spatial component in a humanities cyberinfrastructure, which includes issues of conflicting data, heterogeneous data models, disambiguation, and geographic scale. First, we describe the methodology for integrating the databases. Then we detail the system architecture, which includes a tier of ontologies and schema. We describe the user interface and applications that allow for cross-database queries. For instance, users should be able to analyze the data, examine hypotheses on spatial and temporal relationships, and generate historical maps with datasets from MQWW for research, teaching, and publication on Chinese women writers, their familial relations, publishing venues, and the literary and social communities. Last, we discuss the social side of cyberinfrastructure development, as people are considered to be as critical as the technical components for its success. PMID:21444819
Morrison, Norman; Hancock, David; Hirschman, Lynette; Dawyndt, Peter; Verslyppe, Bert; Kyrpides, Nikos; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver; Grethe, Jeff; Booth, Tim; Sterk, Peter; Nenadic, Goran; Field, Dawn
2011-01-01
In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Using Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, Silva and Pubmed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on both the quality of data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources. PMID:21677865
Research and application of knowledge resources network for product innovation.
Li, Chuan; Li, Wen-qiang; Li, Yan; Na, Hui-zhen; Shi, Qian
2015-01-01
To enhance the knowledge-service capabilities of a product innovation design service platform, a method is proposed for acquiring knowledge resources that support product innovation from the Internet and actively pushing that knowledge to users. Through ontology-based knowledge modeling for product innovation, an integrated architecture for the knowledge resources network is put forward. The acquisition of network knowledge resources using a focused crawler and web services is studied. Knowledge is actively pushed to users on the basis of user behavior analysis and knowledge evaluation, in order to encourage users to participate in the platform. Finally, an application example illustrates the effectiveness of the method.
NASA Astrophysics Data System (ADS)
Shepherd, Adam; Arko, Robert; Krisnadhi, Adila; Hitzler, Pascal; Janowicz, Krzysztof; Chandler, Cyndy; Narock, Tom; Cheatham, Michelle; Schildhauer, Mark; Jones, Matt; Raymond, Lisa; Mickle, Audrey; Finin, Tim; Fils, Doug; Carbotte, Suzanne; Lehnert, Kerstin
2015-04-01
Integrating datasets for new use cases is one of the common drivers for adopting semantic web technologies. Even though linked data principles enable this type of activity over time, the task of reconciling new ontological commitments for newer use cases can be daunting. This situation was faced by the Biological and Chemical Oceanography Data Management Office (BCO-DMO) as it sought to integrate its existing linked data with other data repositories to address newer scientific use cases as a partner in the GeoLink Project. To achieve a successful integration with other GeoLink partners, BCO-DMO's metadata would need to be described using the new ontologies developed by the GeoLink partners - a situation that could impact semantic inferencing, pre-existing software and external users of BCO-DMO's linked data. This presentation describes the process of how GeoLink is bridging the gap between local, pre-existing ontologies to achieve scientific metadata integration for all its partners through the use of ontology design patterns. GeoLink, an NSF EarthCube Building Block, brings together experts from the geosciences, computer science, and library science in an effort to improve discovery and reuse of data and knowledge. Its participating repositories include content from field expeditions, laboratory analyses, journal publications, conference presentations, theses/reports, and funding awards that span scientific studies from marine geology to marine ecology and biogeochemistry to paleoclimatology. GeoLink's outcomes include a set of reusable ontology design patterns (ODPs) that describe core geoscience concepts, a network of Linked Data published by participating repositories using those ODPs, and tools to facilitate discovery of related content in multiple repositories.
Integrating Semantic Information in Metadata Descriptions for a Geoscience-wide Resource Inventory.
NASA Astrophysics Data System (ADS)
Zaslavsky, I.; Richard, S. M.; Gupta, A.; Valentine, D.; Whitenack, T.; Ozyurt, I. B.; Grethe, J. S.; Schachne, A.
2016-12-01
Integrating semantic information into legacy metadata catalogs is a challenging issue and so far has been mostly done on a limited scale. We present the experience of CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability), an NSF EarthCube Building Block project, in creating a large cross-disciplinary catalog of geoscience information resources to enable cross-domain discovery. The project developed a pipeline for automatically augmenting resource metadata, in particular generating keywords that describe metadata documents harvested from multiple geoscience information repositories or contributed by geoscientists through various channels including surveys and domain resource inventories. The pipeline examines available metadata descriptions using text parsing, vocabulary management and semantic annotation and graph navigation services of GeoSciGraph. GeoSciGraph, in turn, relies on a large cross-domain ontology of geoscience terms, which bridges several independently developed ontologies or taxonomies including SWEET, ENVO, YAGO, GeoSciML, GCMD, SWO, and CHEBI. The ontology content enables automatic extraction of keywords reflecting science domains, equipment used, geospatial features, measured properties, methods, processes, etc. We specifically focus on issues of cross-domain geoscience ontology creation, resolving several types of semantic conflicts among component ontologies or vocabularies, and constructing and managing facets for improved data discovery and navigation. The ontology and keyword generation rules are iteratively improved as pipeline results are presented to data managers for selective manual curation via a CINERGI Annotator user interface.
We present lessons learned from applying the CINERGI metadata augmentation pipeline to a number of federal agency and academic data registries, in the context of several use cases that require data discovery and integration across multiple earth science data catalogs of varying quality and completeness. The inventory is accessible at http://cinergi.sdsc.edu, and the CINERGI project web page is http://earthcube.org/group/cinergi
Multi-source and ontology-based retrieval engine for maize mutant phenotypes
Green, Jason M.; Harnsomburana, Jaturon; Schaeffer, Mary L.; Lawrence, Carolyn J.; Shyu, Chi-Ren
2011-01-01
Model Organism Databases, including the various plant genome databases, collect and enable access to massive amounts of heterogeneous information, including sequence data, gene product information, images of mutant phenotypes, etc., as well as textual descriptions of many of these entities. While a variety of basic browsing and search capabilities are available to allow researchers to query and peruse the names and attributes of phenotypic data, next-generation search mechanisms that allow querying and ranking of text descriptions are much less common. In addition, the plant community needs an innovative way to leverage the existing links in these databases to search groups of text descriptions simultaneously. Furthermore, though much time and effort have been devoted to the development of plant-related ontologies, the knowledge embedded in these ontologies remains largely unused in available plant search mechanisms. Addressing these issues, we have developed a unique search engine for mutant phenotypes from MaizeGDB. This advanced search mechanism integrates various text description sources in MaizeGDB to aid a user in retrieving desired mutant phenotype information. Currently, descriptions of mutant phenotypes, loci and gene products are utilized collectively for each search, though expansion of the search mechanism to include other sources is straightforward. The retrieval engine, to our knowledge, is the first engine to exploit the content and structure of available domain ontologies, currently the Plant and Gene Ontologies, to expand and enrich retrieval results in major plant genomic databases. Database URL: http://www.PhenomicsWorld.org/QBTA.php PMID:21558151
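The idea of using an ontology's structure to expand retrieval can be illustrated with a minimal sketch. The toy term hierarchy, the term names and the `expand_query` helper below are hypothetical; they are not taken from MaizeGDB or from the Plant or Gene Ontologies, and the abstract does not say that the engine expands queries in exactly this way. The sketch only shows one simple form of expansion: adding every descendant of a query term.

```python
from collections import deque

# Hypothetical parent -> children hierarchy (illustrative only,
# not real Plant Ontology or Gene Ontology content).
TOY_ONTOLOGY = {
    "leaf": ["leaf blade", "leaf sheath"],
    "leaf blade": ["leaf tip"],
    "kernel": ["endosperm", "embryo"],
}


def expand_query(terms, ontology):
    """Return the query terms plus all their descendants, breadth-first, without duplicates."""
    expanded = []
    seen = set()
    queue = deque(terms)
    while queue:
        term = queue.popleft()
        if term in seen:
            continue
        seen.add(term)
        expanded.append(term)
        # Children of the current term are queued for later expansion.
        queue.extend(ontology.get(term, []))
    return expanded


print(expand_query(["leaf"], TOY_ONTOLOGY))
# ['leaf', 'leaf blade', 'leaf sheath', 'leaf tip']
```

A search over the expanded term list would then also match descriptions that mention only a more specific term than the one the user typed.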
Moving Controlled Vocabularies into the Semantic Web
NASA Astrophysics Data System (ADS)
Thomas, R.; Lowry, R. K.; Kokkinaki, A.
2015-12-01
One of the issues with legacy oceanographic data formats is that the only tool available for describing what a measurement is and how it was made is a single metadata tag known as the parameter code. The British Oceanographic Data Centre (BODC) has been helping the international oceanographic community gain maximum benefit from this through a controlled vocabulary known as the BODC Parameter Usage Vocabulary (PUV). Over time this has grown to over 34,000 entries, some of which have preferred labels with over 400 bytes of descriptive information detailing what was measured and how. A decade ago the BODC pioneered making this information available in a more useful form with the implementation of a prototype vocabulary server (NVS) that referenced each 'parameter code' as a URL. This developed into the current server (NVS V2) in which the parameter URL resolves into an RDF document based on the SKOS data model which includes a list of resource URLs mapped to the 'parameter'. For example the parameter code for a contaminant in biota, such as 'cadmium in Mytilus edulis', carries RDF triples leading to the entry for Mytilus edulis in the WoRMS and for cadmium in the ChEBI ontologies. By providing links into these external ontologies the information captured in a 1980s parameter code now conforms to the Linked Data paradigm of the Semantic Web, vastly increasing the descriptive information accessible to a user. This presentation will describe the next steps along the road to the Semantic Web with the development of a SPARQL end point [1] to expose the PUV plus the 190 other controlled vocabularies held in NVS. Whilst this is ideal for those fluent in SPARQL, most users require something a little more user-friendly and so the NVS browser [2] was developed over the end point to allow less technical users to query the vocabularies and navigate the NVS ontology. This tool integrates into an editor that allows vocabulary content to be manipulated by authorised users outside BODC.
Having placed Linked Data tooling over a single SPARQL end point the obvious future development for this system is to support semantic interoperability outside NVS by the incorporation of federated SPARQL end points in the USA and Australia during the ODIP II project. [1] https://vocab.nerc.ac.uk/sparql [2] https://www.bodc.ac.uk/data/codes_and_formats/vocabulary_search/
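As a rough illustration of what querying such an end point involves, the sketch below builds a SPARQL query over SKOS preferred labels and encodes it as a GET URL. The end point address is the one given in the abstract; everything else is an assumption: the `build_label_search_url` helper is hypothetical, the query shape is just one plausible way to search labels, and whether the server accepts the query at this exact address in this form is not confirmed by the source. The code only constructs the URL and does not contact the server.

```python
from urllib.parse import urlencode

# End point address as given in the abstract's footnote.
NVS_SPARQL_ENDPOINT = "https://vocab.nerc.ac.uk/sparql"


def build_label_search_url(keyword, limit=10):
    """Build a GET URL asking for SKOS concepts whose preferred label contains `keyword`.

    Note: the keyword is pasted into the query as-is, so this sketch is only
    suitable for trusted, simple keywords (no quotes or backslashes).
    """
    query = f"""
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {{
  ?concept skos:prefLabel ?label .
  FILTER(CONTAINS(LCASE(STR(?label)), "{keyword.lower()}"))
}} LIMIT {int(limit)}
"""
    # The SPARQL protocol passes the query text in a `query` parameter.
    return NVS_SPARQL_ENDPOINT + "?" + urlencode({"query": query.strip()})


url = build_label_search_url("cadmium")
print(url)
```

The resulting URL could be fetched with any HTTP client; the response format would depend on the server's content negotiation, which is not described in the abstract.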
Meeting medical terminology needs--the Ontology-Enhanced Medical Concept Mapper.
Leroy, G; Chen, H
2001-12-01
This paper describes the development and testing of the Medical Concept Mapper, a tool designed to facilitate access to online medical information sources by providing users with appropriate medical search terms for their personal queries. Our system is valuable for patients whose knowledge of medical vocabularies is inadequate to find the desired information, and for medical experts who search for information outside their field of expertise. The Medical Concept Mapper maps synonyms and semantically related concepts to a user's query. The system is unique because it integrates our natural language processing tool, i.e., the Arizona (AZ) Noun Phraser, with human-created ontologies, the Unified Medical Language System (UMLS) and WordNet, and our computer generated Concept Space, into one system. Our unique contribution results from combining the UMLS Semantic Net with Concept Space in our deep semantic parsing (DSP) algorithm. This algorithm establishes a medical query context based on the UMLS Semantic Net, which allows Concept Space terms to be filtered so as to isolate related terms relevant to the query. We performed two user studies in which Medical Concept Mapper terms were compared against human experts' terms. We conclude that the AZ Noun Phraser is well suited to extract medical phrases from user queries, that WordNet is not well suited to provide strictly medical synonyms, that the UMLS Metathesaurus is well suited to provide medical synonyms, and that Concept Space is well suited to provide related medical terms, especially when these terms are limited by our DSP algorithm.
Solovieva, Elena; Shikanai, Toshihide; Fujita, Noriaki; Narimatsu, Hisashi
2018-04-18
Inherited mutations in glyco-related genes can affect the biosynthesis and degradation of glycans and result in severe genetic diseases and disorders. The Glyco-Disease Genes Database (GDGDB), which provides information about these diseases and disorders as well as their causative genes, has been developed by the Research Center for Medical Glycoscience (RCMG) and released in April 2010. GDGDB currently provides information on about 80 genetic diseases and disorders caused by single-gene mutations in glyco-related genes. Many biomedical resources provide information about genetic disorders and genes involved in their pathogenesis, but resources focused on genetic disorders known to be related to glycan metabolism are lacking. With the aim of providing more comprehensive knowledge on genetic diseases and disorders of glycan biosynthesis and degradation, we enriched the content of the GDGDB database and improved the methods for data representation. We developed the Genetic Glyco-Diseases Ontology (GGDonto) and an RDF/SPARQL-based user interface using Semantic Web technologies. In particular, we represented the GGDonto content using Semantic Web languages, such as RDF, RDFS, SKOS, and OWL, and created an interactive user interface based on SPARQL queries. This user interface provides features to browse the hierarchy of the ontology, view detailed information on diseases and related genes, and find relevant background information. Moreover, it provides the ability to filter and search information by faceted and keyword searches. Focused on the molecular etiology, pathogenesis, and clinical manifestations of genetic diseases and disorders of glycan metabolism and developed as a knowledge-base for this scientific field, GGDonto provides comprehensive information on various topics, including links to aid the integration with other scientific resources.
The availability and accessibility of this knowledge will help users better understand how genetic defects impact the metabolism of glycans as well as how this impaired metabolism affects various biological functions and human health. In this way, GGDonto will be useful in fields related to glycoscience, including cell biology, biotechnology, and biomedical and pharmaceutical research.
Bard, Jonathan
2012-01-01
This paper describes a new ontology of human developmental anatomy covering the first 49 days [Carnegie stages (CS)1–20], primarily structured around the parts of organ systems and their development. The ontology includes more than 2000 anatomical entities (AEs) that range from the whole embryo, through organ systems and organ parts down to simple or leaf tissues (groups of cells with the same morphological phenotype), as well as features such as cavities. Each AE has assigned to it a set of facts of the form
Bard, Jonathan
2012-11-01
This paper describes a new ontology of human developmental anatomy covering the first 49 days [Carnegie stages (CS)1-20], primarily structured around the parts of organ systems and their development. The ontology includes more than 2000 anatomical entities (AEs) that range from the whole embryo, through organ systems and organ parts down to simple or leaf tissues (groups of cells with the same morphological phenotype), as well as features such as cavities. Each AE has assigned to it a set of facts of the form
Buildings classification from airborne LiDAR point clouds through OBIA and ontology driven approach
NASA Astrophysics Data System (ADS)
Tomljenovic, Ivan; Belgiu, Mariana; Lampoltshammer, Thomas J.
2013-04-01
In recent years, airborne Light Detection and Ranging (LiDAR) data has proved to be a valuable information resource for a vast number of applications ranging from land cover mapping to individual surface feature extraction from complex urban environments. To extract information from LiDAR data, users apply prior knowledge. Unfortunately, there is no consistent initiative for structuring this knowledge into data models that can be shared and reused across different applications and domains. The absence of such models poses great challenges to data interpretation, data fusion and integration as well as information transferability. The intention of this work is to describe the design, development and deployment of an ontology-based system to classify buildings from airborne LiDAR data. The novelty of this approach consists of the development of a domain ontology that specifies explicitly the knowledge used to extract features from airborne LiDAR data. The overall goal of this approach is to investigate the possibility of classifying features of interest from LiDAR data by means of a domain ontology. The proposed workflow is applied to the building extraction process for the region of "Biberach an der Riss" in South Germany. Strip-adjusted and georeferenced airborne LiDAR data is processed based on geometrical and radiometric signatures stored within the point cloud. Region-growing segmentation algorithms are applied and segmented regions are exported to the GeoJSON format. Subsequently, the data is imported into the ontology-based reasoning process used to automatically classify exported features of interest. Based on the ontology it becomes possible to define domain concepts, associated properties and relations. As a consequence, the resulting specific body of knowledge restricts possible interpretation variants. Moreover, ontologies are machine-processable and thus it is possible to run reasoning on top of them.
Available reasoners (FaCT++, JESS, Pellet) are used to check the consistency of the developed ontologies, and logical reasoning is performed to infer implicit relations between defined concepts. The ontology for the definition of building is specified using the Web Ontology Language (OWL). It is the most widely used ontology language that is based on Description Logics (DL). DL allows the description of internal properties of modelled concepts (roof typology, shape, area, height etc.) and relationships between objects (IS_A, MEMBER_OF/INSTANCE_OF). It captures terminological knowledge (TBox) as well as assertional knowledge (ABox) - that represents facts about concept instances, i.e. the buildings in airborne LiDAR data. To assess the classification accuracy, ground truth data generated by visual interpretation and calculated classification results in terms of precision and recall are used. The advantages of this approach are: (i) flexibility, (ii) transferability, and (iii) extendibility - i.e. the ontology can be extended with further concepts, data properties and object properties.
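The accuracy assessment mentioned above relies on precision and recall. A minimal sketch of how these two measures could be computed for the "building" class is given below; the segment identifiers are made up for illustration and the `precision_recall` helper is hypothetical, not part of the described system.

```python
def precision_recall(predicted, ground_truth):
    """Compute precision and recall for one class from two collections of segment ids."""
    predicted = set(predicted)
    ground_truth = set(ground_truth)
    # Segments that were classified as buildings and really are buildings.
    true_positives = len(predicted & ground_truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical segment ids, for illustration only.
classified_as_building = {1, 2, 3, 4}
reference_buildings = {2, 3, 4, 5, 6}

p, r = precision_recall(classified_as_building, reference_buildings)
print(f"precision={p:.2f} recall={r:.2f}")
# precision=0.75 recall=0.60
```

Here three of the four segments labelled as buildings are correct (precision 0.75), while three of the five reference buildings were found (recall 0.60).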
Connecting Provenance with Semantic Descriptions in the NASA Earth Exchange (NEX)
NASA Astrophysics Data System (ADS)
Votava, P.; Michaelis, A.; Nemani, R. R.
2012-12-01
NASA Earth Exchange (NEX) is a data, modeling and knowledge collaboratory that houses NASA satellite data, climate data and ancillary data where a focused community may come together to share modeling and analysis codes, scientific results, knowledge and expertise on a centralized platform. Some of the main goals of NEX are transparency and repeatability, and to that end we have been adding components that enable tracking of provenance of both scientific processes and datasets produced by these processes. As scientific processes become more complex, they are often developed collaboratively and it becomes increasingly important for the research team to be able to track the development of the process and the datasets that are produced along the way. Additionally, we want to be able to link the processes and the datasets developed on NEX to existing information and knowledge, so that the users can query and compare the provenance of any dataset or process with regard to the component-specific attributes such as data quality, geographic location, related publications, user comments and annotations etc. We have developed several ontologies that describe datasets and workflow components available on NEX using the OWL ontology language as well as a simple ontology that provides a linking mechanism to the collected provenance information. The provenance is captured in two ways - we utilize the existing provenance infrastructure of VisTrails, which is used as a workflow engine on NEX, and we extend the captured provenance using the PROV data model expressed through the PROV-O ontology. We do this in order to link and query the provenance more easily in the context of the existing NEX information and knowledge. The captured provenance graph is processed and stored using RDFLib with a MySQL backend and can be queried using either RDFLib or SPARQL. As a concrete example, we show how this information is captured during an anomaly detection process in large satellite datasets.
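To make the idea of a queryable provenance graph concrete, the sketch below stores a few provenance statements as subject-predicate-object triples and walks them to find the inputs a dataset depends on. It deliberately uses plain Python lists rather than RDFLib so that it runs without extra packages. The predicate names `prov:wasGeneratedBy` and `prov:used` come from PROV-O, but the dataset and activity names and both helper functions are hypothetical and are not taken from NEX.

```python
# Provenance statements as (subject, predicate, object) triples.
# Predicate names follow PROV-O; dataset and activity names are hypothetical.
TRIPLES = [
    ("anomaly_map", "prov:wasGeneratedBy", "detect_anomalies"),
    ("detect_anomalies", "prov:used", "ndvi_composite"),
    ("ndvi_composite", "prov:wasGeneratedBy", "build_composite"),
    ("build_composite", "prov:used", "raw_satellite_scenes"),
]


def objects(subject, predicate, triples):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def upstream_entities(entity, triples):
    """Follow wasGeneratedBy/used links to list every entity the given one depends on."""
    found = []
    stack = [entity]
    while stack:
        current = stack.pop()
        for activity in objects(current, "prov:wasGeneratedBy", triples):
            for source in objects(activity, "prov:used", triples):
                if source not in found:
                    found.append(source)
                    stack.append(source)
    return found


print(upstream_entities("anomaly_map", TRIPLES))
# ['ndvi_composite', 'raw_satellite_scenes']
```

In an RDF store the same traversal would typically be written as a SPARQL query over the two properties; the list-based version above only mirrors that logic.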
Conservation-Oriented Hbim. The Bimexplorer Web Tool
NASA Astrophysics Data System (ADS)
Quattrini, R.; Pierdicca, R.; Morbidoni, C.; Malinverni, E. S.
2017-05-01
The application of (H)BIM within the domain of Architectural Historical Heritage has huge potential that can also be exploited within the restoration domain. The work presents a novel approach to solving the widespread interoperability issue related to data enrichment in BIM environments, by developing and testing a web tool based on a specific workflow, using as the case study a Romanesque church in Portonovo, Ancona, Italy. Following the need to make the data, organized in a BIM environment, usable for the different actors involved in the restoration phase, we have created a pipeline that takes advantage of existing BIM platforms and semantic-web technologies, enabling the end user to query a repository composed of semantically structured data. The pipeline consists of four major steps: i) modelling an ontology with the main information needs for the domain of interest, providing a data structure that can be leveraged to inform the data-enrichment phase and, later, to meaningfully query the data; ii) data enrichment, by creating a set of shared parameters reflecting the properties in our domain ontology; iii) structuring data in a machine-readable format (through a data conversion) to represent the domain (ontology) and analyse data of specific buildings respectively; iv) development of a demonstrative data exploration web application based on the faceted browsing paradigm and able to exploit both structured metadata and 3D visualization. The application can be configured by a domain expert to reflect a given domain ontology, and used by an operator to query and explore the data in a more efficient and reliable way. With the proposed solution the analysis of data can be reused together with the 3D model, providing the end-user with a non-proprietary tool; in this way, the planned maintenance or the restoration project becomes more collaborative and interactive, optimizing the whole process of HBIM data collection.
The IHMC CmapTools software in research and education: a multi-level use case in Space Meteorology
NASA Astrophysics Data System (ADS)
Messerotti, Mauro
2010-05-01
The IHMC (Institute for Human and Machine Cognition, Florida University System, USA) CmapTools software is a powerful multi-platform tool for knowledge modelling in graphical form based on concept maps. In this work we present its application for the high-level development of a set of multi-level concept maps in the framework of Space Meteorology to act as the kernel of a space meteorology domain ontology. This is an example of a research use case, as a domain ontology coded in machine-readable form via e.g. OWL (Web Ontology Language) is suitable to be an active layer of any knowledge management system embedded in a Virtual Observatory (VO). Apart from being manageable at machine level, concept maps developed via CmapTools are intrinsically human-readable and can embed hyperlinks and objects of many kinds. Therefore they are suitable to be published on the web: the coded knowledge can be exploited for educational purposes by the students and the public, as the level of information can be naturally organized among linked concept maps in progressively increasing complexity levels. Hence CmapTools and its advanced version COE (Concept-map Ontology Editor) represent effective and user-friendly software tools for high-level knowledge representation in research and education.
2006-12-01
speed of search engines improves the efficiency of such methods, effectiveness is not improved. The objective of this thesis is to construct and test...interest, users are assisted in finding a relevant set of key terms that will aid the search engines in narrowing, widening, or refocusing a Web search
Ontology-Based Peer Exchange Network (OPEN)
ERIC Educational Resources Information Center
Dong, Hui
2010-01-01
In current Peer-to-Peer networks, distributed and semantic-free indexing is widely used by systems adopting "Distributed Hash Table" ("DHT") mechanisms. Although such systems typically solve a user query rather fast in a deterministic way, they only support a very narrow search scheme, namely the exact hash key match. Furthermore, DHT systems put…
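The limitation of exact hash key matching can be seen in a minimal sketch. The single in-memory `index`, the `dht_key`, `publish` and `lookup` helpers and the choice of SHA-1 with 16 nodes are all assumptions made for illustration; a real DHT distributes the buckets across peers and is not described in this detail by the abstract.

```python
import hashlib


def dht_key(term, num_nodes=16):
    """Map a term to a node index by hashing it, as a DHT-style lookup would."""
    digest = hashlib.sha1(term.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


# node index -> {term: [documents]}; stands in for the peers' local tables.
index = {}


def publish(term, document):
    """Store a document under the node responsible for the term's hash."""
    index.setdefault(dht_key(term), {}).setdefault(term, []).append(document)


def lookup(term):
    """Exact-match lookup: only the identical term string is found."""
    return index.get(dht_key(term), {}).get(term, [])


publish("ontology", "doc-1")
print(lookup("ontology"))    # ['doc-1']
print(lookup("ontologies"))  # []
```

A semantically related query such as "ontologies" finds nothing, because the lookup depends on the exact string that was hashed at publication time.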
An Assistant for Loading Learning Object Metadata: An Ontology Based Approach
ERIC Educational Resources Information Center
Casali, Ana; Deco, Claudia; Romano, Agustín; Tomé, Guillermo
2013-01-01
In recent years, the development of different Repositories of Learning Objects has increased. Users can retrieve these resources for reuse and personalization through searches in web repositories. High quality metadata is key for a successful retrieval. Learning Objects are described with metadata usually in the standard…
Reasoning and Ontologies for Personalized E-Learning in the Semantic Web
ERIC Educational Resources Information Center
Henze, Nicola; Dolog, Peter; Nejdl, Wolfgang
2004-01-01
The challenge of the semantic web is the provision of distributed information with well-defined meaning, understandable for different parties. Particularly, applications should be able to provide individually optimized access to information by taking the individual needs and requirements of the users into account. In this paper we propose a…
NASA Astrophysics Data System (ADS)
Chmiel, P.; Ganzha, M.; Jaworska, T.; Paprzycki, M.
2017-10-01
Nowadays, as part of the systematic growth in the volume and variety of information that can be found on the Internet, we also observe a dramatic increase in the sizes of available image collections. There are many ways to help users browse and select images of interest. One popular approach is Content-Based Image Retrieval (CBIR) systems, which allow users to search for images that match their interests, expressed in the form of images (query by example). However, we believe that image search and retrieval could take advantage of semantic technologies. We have decided to test this hypothesis. Specifically, on the basis of knowledge captured in the CBIR, we have developed a domain ontology of residential real estate (detached houses, in particular). This allows us to semantically represent each image (and its constitutive architectural elements) represented within the CBIR. The proposed ontology was extended to capture not only the elements resulting from image segmentation, but also "spatial relations" between them. As a result, a new approach to querying the image database (semantic querying) has materialized, thus extending capabilities of the developed system.
VitisExpDB: a database resource for grape functional genomics.
Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L
2008-02-28
The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores approximately 320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of a non-redundant set of approximately 20,000 ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. The developed database provides a genomic resource to the grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website http://cropdisease.ars.usda.gov/vitis_at/main-page.htm.
VitisExpDB: A database resource for grape functional genomics
Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L
2008-01-01
Background: The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. Description: VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores ~320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of a non-redundant set of ~20,000 ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. Conclusion: The developed database provides a genomic resource to the grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website . PMID:18307813
Profiling users in the UNIX os environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dao, V N P; Vemuri, R; Templeton, S J
2000-09-29
This paper presents results obtained by using a method of profiling a user based on the login host, the login time, the command set, and the command set execution time of the profiled user. It is assumed that the user is logging onto a UNIX host on a computer network. The paper concentrates on two areas: short-term and long-term profiling. In short-term profiling the focus is on profiling the user at a given session where user characteristics do not change much. In long-term profiling, the duration of observation is over a much longer period of time. The latter is more challenging because of a phenomenon called concept or profile drift. Profile drift occurs when a user logs onto a host for an extended period of time (over several sessions).
clusterProfiler: an R package for comparing biological themes among gene clusters.
Yu, Guangchuang; Wang, Li-Gen; Han, Yanyan; He, Qing-Yu
2012-05-01
Increasing quantitative data generated from transcriptomics and proteomics require integrative strategies for analysis. Here, we present an R package, clusterProfiler that automates the process of biological-term classification and the enrichment analysis of gene clusters. The analysis module and visualization module were combined into a reusable workflow. Currently, clusterProfiler supports three species, including humans, mice, and yeast. Methods provided in this package can be easily extended to other species and ontologies. The clusterProfiler package is released under Artistic-2.0 License within Bioconductor project. The source code and vignette are freely available at http://bioconductor.org/packages/release/bioc/html/clusterProfiler.html.
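Enrichment analysis of a gene cluster is commonly carried out with a hypergeometric test; the abstract does not state which statistic the package uses, so treating it as hypergeometric here is an assumption. The sketch below is written in Python purely for illustration and is not clusterProfiler code; the `enrichment_p_value` helper and the example numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the cluster, k: cluster genes annotated with the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of seeing k, k+1, ..., upper annotated genes.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Toy numbers: all 5 cluster genes carry a term that 5 of 10 background genes have.
p = enrichment_p_value(k=5, n=5, K=5, N=10)
print(p)  # 1/252, about 0.00397
```

In practice such p-values are computed for many terms at once and then adjusted for multiple testing, a step this sketch leaves out.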
Rue-Albrecht, Kévin; McGettigan, Paul A; Hernández, Belinda; Nalpas, Nicolas C; Magee, David A; Parnell, Andrew C; Gordon, Stephen V; MacHugh, David E
2016-03-11
Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines.
A semantically-aided architecture for a web-based monitoring system for carotid atherosclerosis.
Kolias, Vassileios D; Stamou, Giorgos; Golemati, Spyretta; Stoitsis, Giannis; Gkekas, Christos D; Liapis, Christos D; Nikita, Konstantina S
2015-08-01
Carotid atherosclerosis is a multifactorial disease and its clinical diagnosis depends on the evaluation of heterogeneous clinical data, such as imaging exams, biochemical tests and the patient's clinical history. The lack of interoperability between Health Information Systems (HIS) prevents physicians from acquiring all the data necessary for the diagnostic process. In this paper, a semantically-aided architecture is proposed for a web-based monitoring system for carotid atherosclerosis that is able to gather and unify heterogeneous data with the use of an ontology and to create a common interface for data access, enhancing the interoperability of HIS. The architecture is based on an application ontology of carotid atherosclerosis that is used to (a) integrate heterogeneous data sources on the basis of semantic representation and ontological reasoning and (b) access the critical information using SPARQL query rewriting and ontology-based data access services. The architecture was tested over a carotid atherosclerosis dataset consisting of the imaging exams and the clinical profile of 233 patients, using a set of complex queries constructed by the physicians. The proposed architecture was evaluated with respect to the complexity of the queries that the physicians could make and the retrieval speed. The proposed architecture gave promising results in terms of interoperability, ontology-based integration of heterogeneous data sources, and expanded query and retrieval capabilities in HIS.
Drug target ontology to classify and integrate drug discovery data.
Lin, Yu; Mehta, Saurabh; Küçük-McGinty, Hande; Turner, John Paul; Vidovic, Dusica; Forlin, Michele; Koleti, Amar; Nguyen, Dac-Trung; Jensen, Lars Juhl; Guha, Rajarshi; Mathias, Stephen L; Ursu, Oleg; Stathias, Vasileios; Duan, Jianbin; Nabizadeh, Nooshin; Chung, Caty; Mader, Christopher; Visser, Ubbo; Yang, Jeremy J; Bologa, Cristian G; Oprea, Tudor I; Schürer, Stephan C
2017-11-09
One of the most successful approaches to develop new small molecule therapeutics has been to start from a validated druggable protein target. However, only a small subset of potentially druggable targets has attracted significant research and development resources. The Illuminating the Druggable Genome (IDG) project develops resources to catalyze the development of likely targetable, yet currently understudied prospective drug targets. A central component of the IDG program is a comprehensive knowledge resource of the druggable genome. As part of that effort, we have developed a framework to integrate, navigate, and analyze drug discovery data based on formalized and standardized classifications and annotations of druggable protein targets, the Drug Target Ontology (DTO). DTO was constructed by extensive curation and consolidation of various resources. DTO classifies the four major drug target protein families, GPCRs, kinases, ion channels and nuclear receptors, based on phylogenecity, function, target development level, disease association, tissue expression, chemical ligand and substrate characteristics, and target-family specific characteristics. The formal ontology was built using a new software tool to auto-generate most axioms from a database while supporting manual knowledge acquisition. A modular, hierarchical implementation facilitates ontology development and maintenance and makes use of various external ontologies, thus integrating the DTO into the ecosystem of biomedical ontologies. As a formal OWL-DL ontology, DTO contains asserted and inferred axioms. Modeling data from the Library of Integrated Network-based Cellular Signatures (LINCS) program illustrates the potential of DTO for contextual data integration and nuanced definition of important drug target characteristics. DTO has been implemented in the IDG user interface Portal, Pharos and the TIN-X explorer of protein target disease relationships.
DTO was built based on the need for a formal semantic model for druggable targets including various related information such as protein, gene, protein domain, protein structure, binding site, small molecule drug, mechanism of action, protein tissue localization, disease association, and many other types of information. DTO will further facilitate the otherwise challenging integration and formal linking to biological assays, phenotypes, disease models, drug poly-pharmacology, binding kinetics and many other processes, functions and qualities that are at the core of drug discovery. The first version of DTO is publicly available via the website http://drugtargetontology.org/ , Github ( http://github.com/DrugTargetOntology/DTO ), and the NCBO Bioportal ( http://bioportal.bioontology.org/ontologies/DTO ). The long-term goal of DTO is to provide such an integrative framework and to populate the ontology with this information as a community resource.
Research and Application of Knowledge Resources Network for Product Innovation
Li, Chuan; Li, Wen-qiang; Li, Yan; Na, Hui-zhen; Shi, Qian
2015-01-01
To enhance the knowledge service capabilities of a product innovation design service platform, a method is proposed for acquiring knowledge resources that support product innovation from the Internet and actively pushing that knowledge to users. Through ontology-based knowledge modeling for product innovation, an integrated architecture for the knowledge resources network is put forward. The technology for acquiring network knowledge resources based on a focused crawler and web services is studied. Active knowledge push is provided through user behavior analysis and knowledge evaluation in order to improve users' enthusiasm for participating in the platform. Finally, an application example is presented to demonstrate the effectiveness of the method. PMID:25884031
Wagner, Michael M.; Levander, John D.; Brown, Shawn; Hogan, William R.; Millett, Nicholas; Hanna, Josh
2013-01-01
This paper describes the Apollo Web Services and Apollo-SV, its related ontology. The Apollo Web Services give an end-user application a single point of access to multiple epidemic simulators. An end user can specify an analytic problem—which we define as a configuration and a query of results—exactly once and submit it to multiple epidemic simulators. The end user represents the analytic problem using a standard syntax and vocabulary, not the native languages of the simulators. We have demonstrated the feasibility of this design by implementing a set of Apollo services that provide access to two epidemic simulators and two visualizer services. PMID:24551417
CellFinder: a cell data repository
Stachelscheid, Harald; Seltmann, Stefanie; Lekschas, Fritz; Fontaine, Jean-Fred; Mah, Nancy; Neves, Mariana; Andrade-Navarro, Miguel A.; Leser, Ulf; Kurtz, Andreas
2014-01-01
CellFinder (http://www.cellfinder.org) is a comprehensive one-stop resource for molecular data characterizing mammalian cells in different tissues and in different development stages. It is built from carefully selected data sets stemming from other curated databases and the biomedical literature. To date, CellFinder describes 3394 cell types and 50 951 cell lines. The database currently contains 3055 microscopic and anatomical images, 205 whole-genome expression profiles of 194 cell/tissue types from RNA-seq and microarrays and 553 905 protein expressions for 535 cells/tissues. Text mining of a corpus of >2000 publications followed by manual curation confirmed expression information on ∼900 proteins and genes. CellFinder’s data model can seamlessly represent entities from single cells to the organ level, incorporate mappings between homologous entities in different species and describe processes of cell development and differentiation. Its ontological backbone currently consists of 204 741 ontology terms incorporated from 10 different ontologies unified under the novel CELDA ontology. CellFinder’s web portal allows searching, browsing and comparing the stored data, interactive construction of developmental trees and navigating the partonomic hierarchy of cells and tissues through a unique body browser designed for life scientists and clinicians. PMID:24304896
KinView: A visual comparative sequence analysis tool for integrated kinome research
McSkimming, Daniel Ian; Dastgheib, Shima; Baffi, Timothy R.; Byrne, Dominic P.; Ferries, Samantha; Scott, Steven Thomas; Newton, Alexandra C.; Eyers, Claire E.; Kochut, Krzysztof J.; Eyers, Patrick A.
2017-01-01
Multiple sequence alignments (MSAs) are a fundamental analysis tool used throughout biology to investigate relationships between protein sequence, structure, function, evolutionary history, and patterns of disease-associated variants. However, their widespread application in systems biology research is currently hindered by the lack of user-friendly tools to simultaneously visualize, manipulate and query the information conceptualized in large sequence alignments, and the challenges in integrating MSAs with multiple orthogonal data such as cancer variants and post-translational modifications, which are often stored in heterogeneous data sources and formats. Here, we present the Multiple Sequence Alignment Ontology (MSAOnt), which represents a profile or consensus alignment in an ontological format. Subsets of the alignment are easily selected through the SPARQL Protocol and RDF Query Language for downstream statistical analysis or visualization. We have also created the Kinome Viewer (KinView), an interactive integrative visualization that places eukaryotic protein kinase cancer variants in the context of natural sequence variation and experimentally determined post-translational modifications, which play central roles in the regulation of cellular signaling pathways. Using KinView, we identified differential phosphorylation patterns between tyrosine and serine/threonine kinases in the activation segment, a major kinase regulatory region that is often mutated in proliferative diseases. We discuss cancer variants that disrupt phosphorylation sites in the activation segment, and show how KinView can be used as a comparative tool to identify differences and similarities in natural variation, cancer variants and post-translational modifications between kinase groups, families and subfamilies. Based on KinView comparisons, we identify and experimentally characterize a regulatory tyrosine (Y177) in the PLK4 C-terminal activation segment region termed the P+1 loop.
To further demonstrate the application of KinView in hypothesis generation and testing, we formulate and validate a hypothesis explaining a novel predicted loss-of-function variant (D523N) in the regulatory spine of PKCβ, a recently identified tumor suppressor kinase. KinView provides a novel, extensible interface for performing comparative analyses between subsets of kinases and for integrating multiple types of residue-specific annotations in user-friendly formats. PMID:27731453
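Selecting a subset of an ontologically encoded alignment amounts to matching graph patterns over subject-predicate-object facts, which is what a SPARQL query does. The Python sketch below imitates that idea with a tiny in-memory triple list; it is not SPARQL and does not use MSAOnt's vocabulary. All residue identifiers, predicate names, and annotation labels are hypothetical.

```python
# Hypothetical facts about aligned residues, stored as (subject, predicate, object).
triples = [
    ("res1", "inSequence", "KINASE_X"),
    ("res1", "alignedPosition", 12),
    ("res1", "hasAnnotation", "phosphosite"),
    ("res2", "inSequence", "KINASE_Y"),
    ("res2", "alignedPosition", 12),
    ("res2", "hasAnnotation", "cancer_variant"),
    ("res3", "inSequence", "KINASE_Z"),
    ("res3", "alignedPosition", 12),
    ("res3", "hasAnnotation", "phosphosite"),
    ("res4", "inSequence", "KINASE_X"),
    ("res4", "alignedPosition", 30),
    ("res4", "hasAnnotation", "phosphosite"),
]


def match(pattern, store):
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for triple in store:
        if all(p is None or p == v for p, v in zip(pattern, triple)):
            yield triple


def select_residues(position, annotation, store):
    """Residues at one alignment column that also carry a given annotation."""
    at_position = {s for s, _, _ in match((None, "alignedPosition", position), store)}
    annotated = {s for s, _, _ in match((None, "hasAnnotation", annotation), store)}
    # The intersection plays the role of a conjunctive graph pattern.
    return sorted(at_position & annotated)


selected = select_residues(12, "phosphosite", triples)
print(selected)
```

A real deployment would load the ontology into an RDF store and express the same conjunction as a SPARQL basic graph pattern.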
Bigdely-Shamlo, Nima; Cockfield, Jeremy; Makeig, Scott; Rognon, Thomas; La Valle, Chris; Miyakoshi, Makoto; Robbins, Kay A.
2016-01-01
Real-world brain imaging by EEG requires accurate annotation of complex subject-environment interactions in event-rich tasks and paradigms. This paper describes the evolution of the Hierarchical Event Descriptor (HED) system for systematically describing both laboratory and real-world events. HED version 2, first described here, provides the semantic capability of describing a variety of subject and environmental states. HED descriptions can include stimulus presentation events on screen or in virtual worlds, experimental or spontaneous events occurring in the real world environment, and events experienced via one or multiple sensory modalities. Furthermore, HED 2 can distinguish between the mere presence of an object and its actual (or putative) perception by a subject. Although the HED framework has implicit ontological and linked data representations, the user-interface for HED annotation is more intuitive than traditional ontological annotation. We believe that hiding the formal representations allows for a more user-friendly interface, making consistent, detailed tagging of experimental, and real-world events possible for research users. HED is extensible while retaining the advantages of having an enforced common core vocabulary. We have developed a collection of tools to support HED tag assignment and validation; these are available at hedtags.org. A plug-in for EEGLAB (sccn.ucsd.edu/eeglab), CTAGGER, is also available to speed the process of tagging existing studies. PMID:27799907
mlCAF: Multi-Level Cross-Domain Semantic Context Fusioning for Behavior Identification.
Razzaq, Muhammad Asif; Villalonga, Claudia; Lee, Sungyoung; Akhtar, Usman; Ali, Maqbool; Kim, Eun-Soo; Khattak, Asad Masood; Seung, Hyonwoo; Hur, Taeho; Bang, Jaehun; Kim, Dohyeong; Ali Khan, Wajahat
2017-10-24
The emerging research on automatic identification of user's contexts from the cross-domain environment in ubiquitous and pervasive computing systems has proved to be successful. Monitoring the diversified user's contexts and behaviors can help in controlling lifestyles associated with chronic diseases using context-aware applications. However, availability of cross-domain heterogeneous contexts provides a challenging opportunity for their fusion to obtain abstract information for further analysis. This work extends our previous work from a single domain (i.e., physical activity) to multiple domains (physical activity, nutrition and clinical) for context-awareness. We propose the multi-level Context-aware Framework (mlCAF), which fuses the multi-level cross-domain contexts in order to arbitrate richer behavioral contexts. This work explicitly focuses on key challenges linked to multi-level context modeling, reasoning and fusioning based on the mlCAF open-source ontology. More specifically, it addresses the interpretation of contexts from three different domains and their fusioning into richer contextual information. This paper contributes in terms of ontology evolution with additional domains, context definitions, rules and inclusion of semantic queries. For the framework evaluation, multi-level cross-domain contexts collected from 20 users were used to ascertain abstract contexts, which served as basis for behavior modeling and lifestyle identification. The experimental results indicate a context recognition average accuracy of around 92.65% for the collected cross-domain contexts.
Towards a Context-Aware Proactive Decision Support Framework
2013-11-15
initiative that has developed text analytic technology that crosses the semantic gap into the area of event recognition and representation. The...recognizing operational context, and techniques for recognizing context shift. Additional research areas include: • Adequately capturing users...Universal Interaction Context Ontology [12] might serve as a foundation • Instantiating formal models of decision making based on information seeking
Case and Model Driven Dynamic Template Linking
2005-06-01
store the trips in a PostgreSQL database (www.postgresql.org) and the values stored in this database could be re-used to provide values for similar trips...Preferences YES Yes but limited Print Form YES NO Close Form YES NO Just “X” Quit YES NO Just “X” Show User Action History YES NO 6.5 DAML Ontologies
Framework for End-User Programming of Cross-Smart Space Applications
Palviainen, Marko; Kuusijärvi, Jarkko; Ovaska, Eila
2012-01-01
Cross-smart space applications are specific types of software services that enable users to share information, monitor the physical and logical surroundings and control it in a way that is meaningful for the user's situation. For developing cross-smart space applications, this paper makes two main contributions: it introduces (i) a component design and scripting method for end-user programming of cross-smart space applications and (ii) a backend framework of components that interwork to support the brunt of the RDFScript translation, and the use and execution of ontology models. Before end-user programming activities, the software professionals must develop easy-to-apply Driver components for the APIs of existing software systems. Thereafter, end-users are able to create applications from the commands of the Driver components with the help of the provided toolset. The paper also introduces the reference implementation of the framework, tools for the Driver component development and end-user programming of cross-smart space applications and the first evaluation results on their application. PMID:23202169
SUPERFAMILY 1.75 including a domain-centric gene ontology method.
de Lima Morais, David A; Fang, Hai; Rackham, Owen J L; Wilson, Derek; Pethica, Ralph; Chothia, Cyrus; Gough, Julian
2011-01-01
The SUPERFAMILY resource provides protein domain assignments at the structural classification of protein (SCOP) superfamily level for over 1400 completely sequenced genomes, over 120 metagenomes and other gene collections such as UniProt. All models and assignments are available to browse and download at http://supfam.org. A new hidden Markov model library based on SCOP 1.75 has been created and a previously ignored class of SCOP, coiled coils, is now included. Our scoring component now uses HMMER3, which is orders of magnitude faster and produces superior results. A cloud-based pipeline was implemented and is publicly available on the Amazon Web Services Elastic Compute Cloud. The SUPERFAMILY reference tree of life has been improved, allowing the user to highlight a chosen superfamily, family or domain architecture on the tree of life. The most significant advance in SUPERFAMILY is that it now contains a domain-based gene ontology (GO) at the superfamily and family levels. A new methodology was developed to ensure a high quality GO annotation. The new methodology is general purpose and has been used to produce domain-based phenotypic ontologies in addition to GO.
PPDB - A tool for investigation of plants physiology based on gene ontology.
Sharma, Ajay Shiv; Gupta, Hari Om; Prasad, Rajendra
2014-09-02
Representing the way forward, from functional genomics and its ontology to functional understanding and physiological model, in a computationally tractable fashion is one of the ongoing challenges faced by computational biology. To address this challenge, we herein feature the applications of contemporary database management to the development of PPDB, a searching and browsing tool for the Plants Physiology Database that is based upon the mining of a large amount of gene ontology data currently available. The working principles and search options associated with the PPDB are publicly available and freely accessible on-line ( http://www.iitr.ernet.in/ajayshiv/ ) through a user-friendly environment generated by means of Drupal-6.24. Given that genes are expressed in temporally and spatially characteristic patterns and that their functionally distinct products often reside in specific cellular compartments and may be part of one or more multi-component complexes, this sort of work is intended to be relevant for investigating the functional relationships of gene products at a system level and, thus, helps us approach the full physiology.
PPDB: A Tool for Investigation of Plants Physiology Based on Gene Ontology.
Sharma, Ajay Shiv; Gupta, Hari Om; Prasad, Rajendra
2015-09-01
Representing the way forward, from functional genomics and its ontology to functional understanding and physiological model, in a computationally tractable fashion is one of the ongoing challenges faced by computational biology. To address this challenge, we herein feature the applications of contemporary database management to the development of PPDB, a searching and browsing tool for the Plants Physiology Database that is based upon the mining of a large amount of gene ontology data currently available. The working principles and search options associated with the PPDB are publicly available and freely accessible online ( http://www.iitr.ac.in/ajayshiv/ ) through a user-friendly environment generated by means of Drupal-6.24. Given that genes are expressed in temporally and spatially characteristic patterns and that their functionally distinct products often reside in specific cellular compartments and may be part of one or more multicomponent complexes, this sort of work is intended to be relevant for investigating the functional relationships of gene products at a system level and, thus, helps us approach the full physiology.
ChEBI in 2016: Improved services and an expanding collection of metabolites
Hastings, Janna; Owen, Gareth; Dekker, Adriano; Ennis, Marcus; Kale, Namrata; Muthukrishnan, Venkatesh; Turner, Steve; Swainston, Neil; Mendes, Pedro; Steinbeck, Christoph
2016-01-01
ChEBI is a database and ontology containing information about chemical entities of biological interest. It currently includes over 46 000 entries, each of which is classified within the ontology and assigned multiple annotations including (where relevant) a chemical structure, database cross-references, synonyms and literature citations. All content is freely available and can be accessed online at http://www.ebi.ac.uk/chebi. In this update paper, we describe recent improvements and additions to the ChEBI offering. We have substantially extended our collection of endogenous metabolites for several organisms including human, mouse, Escherichia coli and yeast. Our front-end has also been reworked and updated, improving the user experience, removing our dependency on Java applets in favour of embedded JavaScript components and moving from a monthly release update to a ‘live’ website. Programmatic access has been improved by the introduction of a library, libChEBI, in Java, Python and Matlab. Furthermore, we have added two new tools, namely an analysis tool, BiNChE, and a query tool for the ontology, OntoQuery. PMID:26467479
Pierneef, Rian; Cronje, Louis; Bezuidt, Oliver; Reva, Oleg N.
2015-01-01
The Predicted Genomic Islands database (Pre_GI) is a comprehensive repository of prokaryotic genomic islands (islands, GIs) freely accessible at http://pregi.bi.up.ac.za/index.php . Pre_GI, Version 2015, catalogues 26 744 islands identified in 2407 bacterial/archaeal chromosomes and plasmids. It provides an easy-to-use interface that allows users to query the database with a variety of fields, parameters and associations. Pre_GI is constructed to be a web-resource for the analysis of ontological roads between islands and cartographic analysis of the global fluxes of mobile genetic elements through bacterial and archaeal taxonomic borders. Comparison of newly identified islands against Pre_GI presents an alternative avenue to identify their ontology, origin and relative time of acquisition. Pre_GI aims to aid research on horizontal transfer events and materials through providing data and tools for holistic investigation of migration of genes through ecological niches and taxonomic boundaries. Database URL: http://pregi.bi.up.ac.za/index.php , Version 2015 PMID:26200753
Endeavour update: a web resource for gene prioritization in multiple species
Tranchevent, Léon-Charles; Barriot, Roland; Yu, Shi; Van Vooren, Steven; Van Loo, Peter; Coessens, Bert; De Moor, Bart; Aerts, Stein; Moreau, Yves
2008-01-01
Endeavour (http://www.esat.kuleuven.be/endeavourweb; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes. Using a training set of genes known to be involved in a biological process of interest, our approach consists of (i) inferring several models (based on various genomic data sources), (ii) applying each model to the candidate genes to rank those candidates against the profile of the known genes and (iii) merging the several rankings into a global ranking of the candidate genes. In the present article, we describe the latest developments of Endeavour. First, we provide a web-based user interface, besides our Java client, to make Endeavour more universally accessible. Second, we support multiple species: in addition to Homo sapiens, we now provide gene prioritization for three major model organisms: Mus musculus, Rattus norvegicus and Caenorhabditis elegans. Third, Endeavour makes use of additional data sources and now includes numerous databases: ontologies and annotations, protein–protein interactions, cis-regulatory information, gene expression data sets, sequence information and text-mining data. We tested the novel version of Endeavour on 32 recent disease gene associations from the literature. Additionally, we describe a number of recent independent studies that made use of Endeavour to prioritize candidate genes for obesity and Type II diabetes, cleft lip and cleft palate, and pulmonary fibrosis. PMID:18508807
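Steps (ii) and (iii) above, ranking candidates per data source and then merging the rankings, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the per-source similarity scores are made up, the gene and source names are placeholders, and a plain mean of ranks is used as a simple stand-in because the abstract does not specify how the rankings are merged.

```python
from statistics import mean


def rank_candidates(scores: dict[str, float]) -> dict[str, int]:
    """Rank 1 goes to the candidate most similar to the training profile."""
    ordered = sorted(scores, key=lambda gene: (-scores[gene], gene))
    return {gene: position + 1 for position, gene in enumerate(ordered)}


def merge_rankings(per_source_scores: dict[str, dict[str, float]]) -> list[str]:
    """Merge per-source rankings into one global ranking by mean rank."""
    rankings = [rank_candidates(scores) for scores in per_source_scores.values()]
    # Only candidates scored by every source are merged in this sketch.
    genes = set.intersection(*(set(r) for r in rankings))
    mean_rank = {gene: mean(r[gene] for r in rankings) for gene in genes}
    return sorted(genes, key=lambda gene: (mean_rank[gene], gene))


# Hypothetical similarity scores of three candidates under three data sources.
per_source_scores = {
    "expression": {"GENE_A": 0.9, "GENE_B": 0.4, "GENE_C": 0.7},
    "interactions": {"GENE_A": 0.6, "GENE_B": 0.2, "GENE_C": 0.8},
    "text_mining": {"GENE_A": 0.5, "GENE_B": 0.3, "GENE_C": 0.1},
}

global_ranking = merge_rankings(per_source_scores)
print(global_ranking)
```

Mean rank treats every data source equally and ignores missing values; a production system would need a merging rule that tolerates candidates absent from some sources.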
GeneTools--application for functional annotation and statistical hypothesis testing.
Beisvag, Vidar; Jünge, Frode K R; Bergum, Hallgeir; Jølsum, Lars; Lydersen, Stian; Günther, Clara-Cecilie; Ramampiaro, Heri; Langaas, Mette; Sandvik, Arne K; Laegreid, Astrid
2006-10-24
Modern biology has shifted from "one gene" approaches to methods for genomic-scale analysis like microarray technology, which allow simultaneous measurement of thousands of genes. This has created a need for tools facilitating interpretation of biological data in "batch" mode. However, such tools often leave the investigator with large volumes of apparently unorganized information. To meet this interpretation challenge, gene-set, or cluster testing has become a popular analytical tool. Many gene-set testing methods and software packages are now available, most of which use a variety of statistical tests to assess the genes in a set for biological information. However, the field is still evolving, and there is a great need for "integrated" solutions. GeneTools is a web-service providing access to a database that brings together information from a broad range of resources. The annotation data are updated weekly, guaranteeing that users get the most recently available data. Data submitted by the user are stored in the database, where they can easily be updated, shared between users and exported in various formats. GeneTools provides three different tools: i) NMC Annotation Tool, which offers annotations from several databases like UniGene, Entrez Gene, SwissProt and GeneOntology, in both single- and batch search mode. ii) GO Annotator Tool, where users can add new gene ontology (GO) annotations to genes of interest. These user-defined GO annotations can be used in further analysis or exported for public distribution. iii) eGOn, a tool for visualization and statistical hypothesis testing of GO category representation. As the first GO tool, eGOn supports hypothesis testing for three different situations (master-target situation, mutually exclusive target-target situation and intersecting target-target situation). An important additional function is an evidence-code filter that allows users to select the GO annotations for the analysis.
GeneTools is the first "all in one" annotation tool, providing users with a rapid extraction of highly relevant gene annotation data for e.g. thousands of genes or clones at once. It allows a user to define and archive new GO annotations and it supports hypothesis testing related to GO category representations. GeneTools is freely available through www.genetools.no
Content-based Music Search and Recommendation System
NASA Astrophysics Data System (ADS)
Takegawa, Kazuki; Hijikata, Yoshinori; Nishida, Shogo
Recently, the volume of music data on the Internet has increased rapidly. This has increased the cost to users of finding music that suits their preferences in such a large data set. We propose a content-based music search and recommendation system. This system has an interface for searching and finding music data and an interface for editing the user profile that music recommendation requires. By exploiting the visualization of the feature space of music and the visualization of the user profile, the user can search music data and edit the user profile. Furthermore, by exploiting the information that can be acquired from each visualized object in a mutually complementary manner, we make it easier for the user to search music data and edit the user profile. Concretely, the system gives the user information obtained from the user profile when searching music data and information obtained from the feature space of music when editing the user profile.
Food for thought ... A toxicology ontology roadmap.
Hardy, Barry; Apic, Gordana; Carthew, Philip; Clark, Dominic; Cook, David; Dix, Ian; Escher, Sylvia; Hastings, Janna; Heard, David J; Jeliazkova, Nina; Judson, Philip; Matis-Mitchell, Sherri; Mitic, Dragana; Myatt, Glenn; Shah, Imran; Spjuth, Ola; Tcheremenskaia, Olga; Toldo, Luca; Watson, David; White, Andrew; Yang, Chihae
2012-01-01
Foreign substances can have a dramatic and unpredictable adverse effect on human health. In the development of new therapeutic agents, it is essential that the potential adverse effects of all candidates be identified as early as possible. The field of predictive toxicology strives to profile the potential for adverse effects of novel chemical substances before they occur, both with traditional in vivo experimental approaches and increasingly through the development of in vitro and computational methods which can supplement and reduce the need for animal testing. To be maximally effective, the field needs access to the largest possible knowledge base of previous toxicology findings, and such results need to be made available in such a fashion so as to be interoperable, comparable, and compatible with standard toolkits. This necessitates the development of open, public, computable, and standardized toxicology vocabularies and ontologies so as to support the applications required by in silico, in vitro, and in vivo toxicology methods and related analysis and reporting activities. Such ontology development will support data management, model building, integrated analysis, validation and reporting, including regulatory reporting and alternative testing submission requirements as required by guidelines such as the REACH legislation, leading to new scientific advances in a mechanistically-based predictive toxicology. Numerous existing ontology and standards initiatives can contribute to the creation of a toxicology ontology supporting the needs of predictive toxicology and risk assessment. Additionally, new ontologies are needed to satisfy practical use cases and scenarios where gaps currently exist. Developing and integrating these resources will require a well-coordinated and sustained effort across numerous stakeholders engaged in a public-private partnership. 
In this communication, we set out a roadmap for the development of an integrated toxicology ontology, harnessing existing resources where applicable. We describe the stakeholders' requirements analysis from the academic and industry perspectives, timelines, and expected benefits of this initiative, with a view to engagement with the wider community.
Addressing the Challenges of Multi-Domain Data Integration with the SemantEco Framework
NASA Astrophysics Data System (ADS)
Patton, E. W.; Seyed, P.; McGuinness, D. L.
2013-12-01
Data integration across multiple domains will continue to be a challenge with the proliferation of big data in the sciences. Data origination issues and how data are manipulated are critical to enable scientists to understand and consume disparate datasets as research becomes more multidisciplinary. We present the SemantEco framework as an exemplar for designing an integrative portal for data discovery, exploration, and interpretation that uses best practice W3C Recommendations. We use the Resource Description Framework (RDF) with extensible ontologies described in the Web Ontology Language (OWL) to provide graph-based data representation. Furthermore, SemantEco ingests data via the software package csv2rdf4lod, which generates data provenance using the W3C provenance recommendation (PROV). Our presentation will discuss benefits and challenges of semantic integration, their effect on runtime performance, and how the SemantEco framework assisted in identifying performance issues and improved query performance across multiple domains by an order of magnitude. SemantEco benefits from a semantic approach that provides an 'open world', which allows data to incrementally change just as it does in the real world. SemantEco modules may load new ontologies and data using the W3C's SPARQL Protocol and RDF Query Language via HTTP. Modules may also provide user interface elements for applications and query capabilities to support new use cases. Modules can associate with domains, which are first-class objects in SemantEco. This enables SemantEco to perform integration and reasoning both within and across domains on module-provided data. The SemantEco framework has been used to construct a web portal for environmental and ecological data. The portal includes water and air quality data from the U.S. 
Geological Survey (USGS) and Environmental Protection Agency (EPA) and species observation counts for birds and fish from the Avian Knowledge Network and the Santa Barbara Long Term Ecological Research, respectively. We provide regulation ontologies using OWL2 datatype facets to detect out-of-range measurements for environmental standards set by the EPA, among others. Users adjust queries using module-defined facets, and a map presents the resulting measurement sites. Custom icons identify sites that violate regulations, making them easy to locate. Selecting a site gives the option of charting spatially proximate data from different domains over time. Our portal currently provides 1.6 billion triples of scientific data in RDF. We segment data by ZIP code; reasoning over 2157 measurements with our EPA regulation ontology, which contains 131 regulations, takes 2.5 seconds on a 2.4 GHz Intel Core 2 Quad with 8 GB of RAM. SemantEco's modular design and reasoning capabilities make it an exemplar for building multidisciplinary data integration tools that provide data access to scientists and the general population alike. Its provenance tracking provides accountability and its reasoning services can assist users in interpreting data. Future work includes support for geographical queries using the Open Geospatial Consortium's GeoSPARQL standard.
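The out-of-range detection described in this abstract can be pictured with a small sketch. It does not use OWL or a reasoner; it only mimics, in plain Python, the effect of a datatype facet (a numeric bound attached to a regulated characteristic). The class names, characteristic names, limits, and site identifiers are all assumptions made for illustration; they are not EPA figures and not SemantEco identifiers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Regulation:
    """A numeric bound on one characteristic (stand-in for a datatype facet)."""
    characteristic: str
    min_inclusive: Optional[float] = None
    max_inclusive: Optional[float] = None

    def violated_by(self, value: float) -> bool:
        # A value violates the regulation if it falls outside either bound.
        if self.min_inclusive is not None and value < self.min_inclusive:
            return True
        if self.max_inclusive is not None and value > self.max_inclusive:
            return True
        return False


@dataclass(frozen=True)
class Measurement:
    site: str
    characteristic: str
    value: float


def find_violations(measurements: List[Measurement],
                    regulations: List[Regulation]) -> List[Measurement]:
    """Return the measurements that break a regulation on their characteristic."""
    by_characteristic = {r.characteristic: r for r in regulations}
    return [
        m for m in measurements
        if m.characteristic in by_characteristic
        and by_characteristic[m.characteristic].violated_by(m.value)
    ]


# Hypothetical limits and readings, chosen only to exercise the check.
REGULATIONS = [
    Regulation("ph", min_inclusive=6.5, max_inclusive=8.5),
    Regulation("arsenic_mg_per_l", max_inclusive=0.01),
]
MEASUREMENTS = [
    Measurement("site-1", "ph", 7.2),
    Measurement("site-2", "ph", 9.1),
    Measurement("site-3", "arsenic_mg_per_l", 0.05),
    Measurement("site-4", "turbidity_ntu", 3.0),  # unregulated in this sketch
]

violating_sites = [m.site for m in find_violations(MEASUREMENTS, REGULATIONS)]
print(violating_sites)
```

In a portal of the kind described, the sites returned by such a check would be the ones given a distinct map icon; an ontology-based approach keeps the bounds in the regulation ontology rather than in application code.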
Druzinsky, Robert E.; Balhoff, James P.; Crompton, Alfred W.; Done, James; German, Rebecca Z.; Haendel, Melissa A.; Herrel, Anthony; Herring, Susan W.; Lapp, Hilmar; Mabee, Paula M.; Muller, Hans-Michael; Mungall, Christopher J.; Sternberg, Paul W.; Van Auken, Kimberly; Vinyard, Christopher J.; Williams, Susan H.; Wall, Christine E.
2016-01-01
Background: In recent years large bibliographic databases have made much of the published literature of biology available for searches. However, the capabilities of the search engines integrated into these databases for text-based bibliographic searches are limited. To enable searches that deliver the results expected by comparative anatomists, an underlying logical structure known as an ontology is required. Development and Testing of the Ontology: Here we present the Mammalian Feeding Muscle Ontology (MFMO), a multi-species ontology focused on anatomical structures that participate in feeding and other oral/pharyngeal behaviors. A unique feature of the MFMO is that a simple, computable, definition of each muscle, which includes its attachments and innervation, is true across mammals. This construction mirrors the logical foundation of comparative anatomy and permits searches using language familiar to biologists. Further, it provides a template for muscles that will be useful in extending any anatomy ontology. The MFMO is developed to support the Feeding Experiments End-User Database Project (FEED, https://feedexp.org/), a publicly-available, online repository for physiological data collected from in vivo studies of feeding (e.g., mastication, biting, swallowing) in mammals. Currently the MFMO is integrated into FEED and also into two literature-specific implementations of Textpresso, a text-mining system that facilitates powerful searches of a corpus of scientific publications. We evaluate the MFMO by asking questions that test the ability of the ontology to return appropriate answers (competency questions). We compare the results of queries of the MFMO to results from similar searches in PubMed and Google Scholar. Results and Significance: Our tests demonstrate that the MFMO is competent to answer queries formed in the common language of comparative anatomy, but PubMed and Google Scholar are not. 
Overall, our results show that by incorporating anatomical ontologies into searches, an expanded and anatomically comprehensive set of results can be obtained. The broader scientific and publishing communities should consider taking up the challenge of semantically enabled search capabilities. PMID:26870952
NASA Astrophysics Data System (ADS)
Ma, X.; Zheng, J. G.; Goldstein, J.; Duggan, B.; Xu, J.; Du, C.; Akkiraju, A.; Aulenbach, S.; Tilmes, C.; Fox, P. A.
2013-12-01
The periodical National Climate Assessment (NCA) of the US Global Change Research Program (USGCRP) [1] produces reports about findings of global climate change and the impacts of climate change on the United States. Those findings are of great public and academic concern and are used in policy and management decisions, which makes the provenance information of findings in those reports especially important. The USGCRP is developing a Global Change Information System (GCIS), in which the NCA reports and associated provenance information are the primary records. We modeled and developed Semantic Web applications for the GCIS. By applying a use case-driven iterative methodology [2], we developed an ontology [3] to represent the content structure of a report and the associated provenance information. We also mapped the classes and properties in our ontology into the W3C PROV-O ontology [4] to realize the formal presentation of provenance. We successfully implemented the ontology in several pilot systems for a recent National Climate Assessment report (i.e., the NCA3). They provide users with the functionality to browse and search provenance information by topics of interest. Provenance information of the NCA3 has been made structured and interoperable by applying the developed ontology. Besides the pilot systems we developed, other tools and services are also able to interact with the data in the context of the 'Web of data' and thus create added value. Our research shows that the use case-driven iterative method bridges the gap between Semantic Web researchers and earth and environmental scientists and can be deployed rapidly for developing Semantic Web applications. Our work also provides first-hand experience in re-using the W3C PROV-O ontology in the field of earth and environmental sciences, as the PROV-O ontology was only recently ratified (on 04/30/2013) by the W3C as a recommendation and relevant applications are still rare. 
[1] http://www.globalchange.gov [2] Fox, P., McGuinness, D.L., 2008. TWC Semantic Web Methodology. Accessible at: http://tw.rpi.edu/web/doc/TWC_SemanticWebMethodology [3] https://scm.escience.rpi.edu/svn/public/projects/gcis/trunk/rdf/schema/GCISOntology.ttl [4] http://www.w3.org/TR/prov-o/
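As a rough illustration of what structured, PROV-O-style provenance for a report finding might look like, the sketch below builds a handful of triples in plain Python and follows derivation links back to the sources. The `prov:` terms used (`Entity`, `Activity`, `wasGeneratedBy`, `used`, `wasDerivedFrom`) come from the W3C PROV-O vocabulary; everything under the `http://example.org/` namespace, and the function names, are hypothetical and are not identifiers from the GCIS ontology.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/gcis-sketch/"  # hypothetical namespace for this sketch


def describe_finding(finding_id, dataset_id, activity_id):
    """Return triples saying a finding was produced by an analysis of a dataset."""
    finding, dataset, activity = EX + finding_id, EX + dataset_id, EX + activity_id
    return [
        (finding, RDF_TYPE, PROV + "Entity"),
        (dataset, RDF_TYPE, PROV + "Entity"),
        (activity, RDF_TYPE, PROV + "Activity"),
        (finding, PROV + "wasGeneratedBy", activity),
        (activity, PROV + "used", dataset),
        (finding, PROV + "wasDerivedFrom", dataset),
    ]


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def trace_sources(triples, entity):
    """Follow prov:wasDerivedFrom links transitively from one entity."""
    derived_from = {}
    for s, p, o in triples:
        if p == PROV + "wasDerivedFrom":
            derived_from.setdefault(s, []).append(o)
    found, stack = [], [entity]
    while stack:
        current = stack.pop()
        for source in derived_from.get(current, []):
            if source not in found:
                found.append(source)
                stack.append(source)
    return found


triples = describe_finding("finding-1", "dataset-1", "analysis-1")
# One more hop: the dataset itself was derived from raw observations.
triples.append((EX + "dataset-1", PROV + "wasDerivedFrom", EX + "raw-observations-1"))

print(to_ntriples(triples))
print(trace_sources(triples, EX + "finding-1"))
```

A browsing interface of the kind the abstract mentions would, in effect, expose traversals like `trace_sources` so a reader can move from a finding to the data behind it.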
NASA Astrophysics Data System (ADS)
Pozzi, W.; Fekete, B.; Piasecki, M.; McGuinness, D.; Fox, P.; Lawford, R.; Vorosmarty, C.; Houser, P.; Imam, B.
2008-12-01
The inadequacies of water cycle observations for monitoring long-term changes in the global water system, as well as their feedback into the climate system, pose a major constraint on sustainable development of water resources and improvement of water management practices. Hence, the Group on Earth Observations (GEO) has established Task WA-08-01, "Integration of in situ and satellite data for water cycle monitoring," an integrative initiative combining different types of satellite and in situ observations related to key variables of the water cycle with model outputs for improved accuracy and global coverage. This presentation proposes development of the Rapid, Integrated Monitoring System for the Water Cycle (Global-RIMS)--already employed by the GEO Global Terrestrial Network for Hydrology (GTN-H)--as either one of the main components or linked with the Asian system to constitute the modeling system of GEOSS for water cycle monitoring. We further propose expanded, augmented capability to run multiple grids to embrace some of the heterogeneous methods and formats of the Earth Science, Hydrology, and Hydraulic Engineering communities. Different methodologies are employed by the Earth Science (land surface modeling), the Hydrological (GIS), and the Hydraulic Engineering communities, with each community employing models that require different input data. Data will be routed as input variables to the models through web services, allowing satellite and in situ data to be integrated together within the modeling framework. Semantic data integration will provide the automation to enable this system to operate in near-real-time. Multiple data collections for ground water, precipitation, soil moisture satellite data, such as SMAP, and lake data will require multiple low level ontologies, and an upper level ontology will permit user-friendly water management knowledge to be synthesized. These ontologies will have to have overlapping terms mapped and linked together so that they can cover an even wider net of data sources. The goal is to develop the means to link together the upper level and lower level ontologies and to have these registered within the GEOSS Registry. Actual operational ontologies that would link to models or link to data collections containing input variables required by models would have to be nested underneath this top level ontology, analogous to the mapping that has been carried out among ontologies within GEON.
Context-Based Tourism Information Filtering with a Semantic Rule Engine
Lamsfus, Carlos; Martin, David; Alzua-Sorzabal, Aurkene; López-de-Ipiña, Diego; Torres-Manzanera, Emilio
2012-01-01
This paper presents the CONCERT framework, a push/filter information consumption paradigm, based on a rule-based semantic contextual information system for tourism. CONCERT offers a specific view of the notion of context from a human mobility perspective. It focuses on the particular characteristics and requirements of travellers and addresses the drawbacks found in other approaches. Additionally, CONCERT suggests the use of digital broadcasting as a push communication technology, whereby tourism information is disseminated to mobile devices. This information is then automatically filtered by a network of ontologies and offered to tourists on the screen. The results of the experiments provide evidence that the information disseminated through digital broadcasting can be manipulated by the network of ontologies, providing contextualized information that produces user satisfaction. PMID:22778584
Combining Semantic and Lexical Methods for Mapping MedDRA to VCM Icons.
Lamy, Jean-Baptiste; Tsopra, Rosy
2018-01-01
VCM (Visualization of Concept in Medicine) is an iconic language that represents medical concepts, such as disorders, by icons. VCM has a formal semantics described by an ontology. The icons can be used in medical software for providing a visual summary or enriching texts. However, the use of VCM icons in user interfaces requires mapping standard medical terminologies to VCM. Here, we present a method combining semantic and lexical approaches for mapping MedDRA to VCM. The method takes advantage of the hierarchical relations in MedDRA. It also analyzes the groups of lemmas in the terms' labels, and relies on a manual mapping of these groups to the concepts in the VCM ontology. We evaluate the method on 50 terms. Finally, we discuss the method and suggest perspectives.
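One way to picture a combination of lexical and hierarchical mapping is sketched below. This is not the authors' algorithm: it assumes, for illustration only, that a term is first matched through lemma groups found in its label and, failing that, inherits the mapping of its nearest ancestor. The lemma table, the parent links, and the target concept names are toy values and are not taken from MedDRA or from the VCM ontology.

```python
# Toy stand-in for lemmatization: only the plural forms used below are handled.
LEMMAS = {"infections": "infection", "disorders": "disorder"}

# Hypothetical manual mapping from groups of lemmas to target concepts.
LEMMA_GROUP_TO_CONCEPT = {
    frozenset({"cardiac"}): "heart",
    frozenset({"renal"}): "kidney",
    frozenset({"infection"}): "infection",
}

# Hypothetical child -> parent links standing in for a terminology hierarchy.
PARENT = {
    "Myocarditis": "Cardiac disorders",
    "Pyelonephritis": "Renal infections",
    "Renal infections": "Renal disorders",
}


def lemmas_of(label):
    """Lowercase, split on whitespace, and normalize the few known plurals."""
    return {LEMMAS.get(token, token) for token in label.lower().split()}


def lexical_concepts(label):
    """Concepts whose whole lemma group occurs in the label."""
    present = lemmas_of(label)
    return {
        concept
        for group, concept in LEMMA_GROUP_TO_CONCEPT.items()
        if group <= present
    }


def map_term(label):
    """Try the label itself, then walk up the hierarchy until something maps."""
    current, visited = label, set()
    while current is not None and current not in visited:
        visited.add(current)
        concepts = lexical_concepts(current)
        if concepts:
            return concepts
        current = PARENT.get(current)
    return set()


for term in ("Renal infections", "Myocarditis", "Pyelonephritis", "Unlisted term"):
    print(term, "->", sorted(map_term(term)))
```

The fallback to ancestors is what lets a label with no recognizable lemma group (here "Myocarditis") still receive a coarse mapping; whether that coarse mapping is acceptable is the kind of question a manual evaluation would have to answer.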
Towards exergaming commons: composing the exergame ontology for publishing open game data.
Bamparopoulos, Giorgos; Konstantinidis, Evdokimos; Bratsas, Charalampos; Bamidis, Panagiotis D
2016-01-01
It has been shown that exergames have multiple benefits for physical, mental and cognitive health. Only recently, however, have researchers started considering them as health monitoring tools, through collection and analysis of game metrics data. In light of this and initiatives like the Quantified Self, there is an emerging need to open the data produced by health games and their associated metrics in order for them to be evaluated by the research community in an attempt to quantify their potential health, cognitive and physiological benefits. We have developed an ontology that describes exergames using the Web Ontology Language (OWL); it is available at http://purl.org/net/exergame/ns#. After an investigation of key components of exergames, relevant ontologies were incorporated, while necessary classes and properties were defined to model these components. A JavaScript framework was also developed in order to apply the ontology to online exergames. Finally, a SPARQL Endpoint is provided to enable open data access to potential clients through the web. Exergame components include details for players and game sessions, as well as data produced during these game-playing sessions. The description of the game includes elements such as goals, game controllers and presentation hardware used; in addition, concepts from already existing ontologies are reused/repurposed. Game sessions include information related to the player, the date and venue where the game was played, as well as the results/scores that were produced/achieved. The games were subsequently played by 14 users in multiple game sessions, and the results derived from these sessions were published in a triplestore as open data. We model concepts related to exergames by providing a standardized structure for reference and comparison. 
This is the first work that publishes data from actual exergame sessions on the web, facilitating the integration and analysis of the data, while allowing open data access through the web in an effort to enable the concept of Open Trials for Active and Healthy Ageing.
Eagle-i: Making Invisible Resources, Visible
Haendel, M.; Wilson, M.; Torniai, C.; Segerdell, E.; Shaffer, C.; Frost, R.; Bourges, D.; Brownstein, J.; McInnerney, K.
2010-01-01
RP-134 The eagle-i Consortium – Dartmouth College, Harvard Medical School, Jackson State University, Morehouse School of Medicine, Montana State University, Oregon Health and Science University (OHSU), the University of Alaska, the University of Hawaii, and the University of Puerto Rico – aims to make invisible resources for scientific research visible by developing a searchable network of resource repositories at research institutions nationwide. Now in early development, it is hoped that the system will scale beyond the consortium at the end of the two-year pilot. Data Model & Ontology: The eagle-i ontology development team at the OHSU Library is generating the data model and ontologies necessary for resource indexing and querying. Our indexing system will enable cores and research labs to represent resources within a defined vocabulary, leading to more effective searches and better linkage between data types. This effort is being guided by active discussions within the ontology community (http://RRontology.tk) bringing together relevant preexisting ontologies in a logical framework. The goal of these discussions is to provide context for interoperability and domain-wide standards for resource types used throughout biomedical research. Research community feedback is welcomed. Architecture Development, led by a team at Harvard, includes four main components: tools for data collection, management and curation; an institutional resource repository; a federated network; and a central search application. Each participating institution will populate and manage their repository locally, using data collection and curation tools. To help improve search performance, data tools will support the semi-automatic annotation of resources. A central search application will use a federated protocol to broadcast queries to all repositories and display aggregated results. 
The search application will leverage the eagle-i ontologies to help guide users to valid queries via auto-suggestions and taxonomy browsing and improve search result quality via concept-based search and synonym expansion. Website: http://eagle-i.org. NIH/NCRR ARRA award #U24RR029825
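The combination of synonym expansion and federated querying described in this abstract can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the synonym groups, repository names, and records are invented, the "federated protocol" is reduced to a loop over in-memory lists, and none of the function names correspond to eagle-i software.

```python
# Hypothetical synonym groups; in an ontology-backed system these would be
# derived from the labels and synonyms attached to each concept.
SYNONYM_GROUPS = [
    {"mouse", "mus musculus"},
    {"antibody", "immunoglobulin"},
]

# Hypothetical per-institution repositories, each a list of resource labels.
REPOSITORIES = {
    "institution_a": ["Mus musculus knockout line", "Zebrafish facility"],
    "institution_b": ["Mouse imaging core", "Immunoglobulin assay kit"],
}


def expand(term):
    """Return the term together with every synonym from its group, lowercased."""
    term = term.lower()
    for group in SYNONYM_GROUPS:
        if term in group:
            return set(group)
    return {term}


def search_repository(records, terms):
    """Return the records whose label contains any of the search terms."""
    return [r for r in records if any(t in r.lower() for t in terms)]


def federated_search(repositories, term):
    """Broadcast the expanded query to every repository and merge the hits."""
    terms = expand(term)
    results = []
    for name in sorted(repositories):
        for record in search_repository(repositories[name], terms):
            results.append((name, record))
    return results


print(federated_search(REPOSITORIES, "Mouse"))
print(federated_search(REPOSITORIES, "antibody"))
```

Without the expansion step, a query for "mouse" would miss the record labelled "Mus musculus knockout line"; that gap is the motivation for concept-based rather than purely string-based search.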
Proposal for Re-Usable TODO Knowledge Management System RESTER
NASA Astrophysics Data System (ADS)
Saga, Ryosuke; Kageyama, Akinori; Tsuji, Hiroshi
This paper describes how to reuse a series of ad-hoc tasks such as special meeting arrangement and equipment procurement. Our RESTER (Reusable TODO Synthesizer) allows a group to reuse a series of tasks that are recorded in a case database. Given a specific event, RESTER retrieves a similar case and repairs it using an ontology that describes the relationships among concepts in the organization. A user has a chance to check the modified case and to update it if he finds an incorrect repair caused by a deficient ontology. The user is also requested to judge whether the retrieved case works or not. If he judges it useful, the case is reused more frequently. Thus, RESTER works under the premise of human-computer collaboration. Based on the presented framework, this paper identifies several desirable attributes: (1) RESTER allows a group to externalize its experience on jobs, (2) externalized experiences are connected in the case database, (3) a case is internalized by another group when it is retrieved and repaired for a new event, (4) a new job generated from the previous similar job of one group is socialized by the other group.
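The retrieve-then-repair cycle described above resembles case-based reasoning, and a minimal sketch may make it concrete. Everything here is an assumption made for illustration: the slot names, the concept hierarchy, the similarity weights, and the string-substitution repair are not taken from RESTER.

```python
# Hypothetical concept -> parent links standing in for the organization's ontology.
PARENT = {
    "room_a": "meeting_room",
    "room_b": "meeting_room",
    "projector": "equipment",
    "whiteboard": "equipment",
}

# Hypothetical case database: each case pairs an event description with its TODO list.
CASES = [
    {"event": {"kind": "special_meeting", "place": "room_a", "gear": "projector"},
     "tasks": ["reserve room_a", "borrow projector", "send agenda"]},
    {"event": {"kind": "procurement", "gear": "whiteboard"},
     "tasks": ["request quote for whiteboard", "get approval"]},
]


def same_parent(a, b):
    """True if both concepts are known and share a parent in the ontology."""
    return PARENT.get(a) is not None and PARENT.get(a) == PARENT.get(b)


def similarity(event_a, event_b):
    """Score 1 per identical slot, 0.5 per sibling concept, averaged over all slots."""
    slots = set(event_a) | set(event_b)
    if not slots:
        return 0.0
    score = 0.0
    for slot in slots:
        a, b = event_a.get(slot), event_b.get(slot)
        if a is None or b is None:
            continue
        if a == b:
            score += 1.0
        elif same_parent(a, b):
            score += 0.5
    return score / len(slots)


def retrieve(cases, event):
    """Pick the stored case whose event is most similar to the new one."""
    return max(cases, key=lambda case: similarity(case["event"], event))


def repair(case, event):
    """Swap sibling concepts into the task list; flag slots the ontology cannot justify."""
    tasks, needs_review = list(case["tasks"]), []
    for slot, new in event.items():
        old = case["event"].get(slot)
        if old is None or old == new:
            continue
        if same_parent(old, new):
            tasks = [task.replace(old, new) for task in tasks]
        else:
            needs_review.append(slot)
    return tasks, needs_review


new_event = {"kind": "special_meeting", "place": "room_b", "gear": "projector"}
best = retrieve(CASES, new_event)
print(repair(best, new_event))
```

The `needs_review` list mirrors the human-in-the-loop step in the abstract: when the ontology has no basis for a substitution, the slot is handed back to the user instead of being repaired silently.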
NASA Astrophysics Data System (ADS)
Cacciotti, R.; Valach, J.; Kuneš, P.; Čerňanský, M.; Blaško, M.; Křemen, P.
2013-07-01
The complex nature of cultural heritage conservation creates a need for a systematic but flexible organization of expert knowledge in the field. Such organization should comprehensively address the interrelations and complementarity among the different factors that come into play in the understanding of diagnostic and intervention problems. The purpose of MONDIS is to support this kind of organization. The approach consists in applying an ontological representation to the field of heritage conservation in order to establish an appropriate processing of data. The system replicates, in a computer-readable form, the basic dependencies among factors influencing the description, diagnosis and intervention of damages to immovable objects. More specifically, MONDIS allows users to enter and search entries concerning object description, structural evolution, location characteristics and risk, component, material properties, surveys and measurements, damage typology, damage triggering events and possible interventions. The system supports search features typical of standard databases, as it allows for the digitization of a wide range of information including professional reports, books, articles and scientific papers. It also allows for computer-aided retrieval of information tailored to the user's requirements. The foreseen outputs will include a web user interface and a mobile application for visual inspection purposes.
OntoFire: an ontology-based geo-portal for wildfires
NASA Astrophysics Data System (ADS)
Kalabokidis, K.; Athanasis, N.; Vaitis, M.
2011-12-01
With the proliferation of geospatial technologies on the Internet, the role of geo-portals (i.e. gateways to Spatial Data Infrastructures) in the area of wildfire management emerges. However, keyword-based techniques often frustrate users when looking for data of interest in geo-portal environments, while little attention has been paid to shifting from conventional keyword-based to navigation-based mechanisms. The presented OntoFire system is an ontology-based geo-portal about wildfires. Through the proposed navigation mechanisms, relationships between the data can be discovered that would otherwise remain hidden when using conventional querying techniques alone. End users can use the browsing interface to find resources of interest by using the navigation mechanisms provided. Data providers can use the publishing interface to submit new metadata to the catalogue, or to modify or remove existing metadata. The proposed approach can improve the discovery of valuable information that is necessary to set priorities for disaster mitigation and prevention strategies. OntoFire aspires to be a focal point of integration and management of a very large amount of information, contributing in this way to the dissemination of knowledge and to the preparedness of the operational stakeholders.
USDA-ARS's Scientific Manuscript database
In the current study, we compared chicken gene transcriptional profiles following primary and secondary infections with Eimeria acervulina using a 9.6K avian intestinal intraepithelial lymphocyte cDNA microarray (AVIELA). Gene Ontology analysis showed that primary infection significantly modulated ...
NASA Astrophysics Data System (ADS)
Fox, P.; McGuinness, D.; Cinquini, L.; West, P.; Garcia, J.; Zednik, S.; Benedict, J.
2008-05-01
This presentation will demonstrate how users and other data providers can utilize the Virtual Solar-Terrestrial Observatory (VSTO) to find, access and use diverse data holdings from the disciplines of solar, solar-terrestrial and space physics. VSTO provides a web portal, web services and a native application programming interface for various levels of users. Since these access methods are based on semantic web technologies and refer to the VSTO ontology, users also have the option of taking advantage of value-added services when accessing and using the data. We present examples of both conventional use of VSTO and advanced semantic use. Finally, we present our future directions for VSTO and semantic data frameworks in general.
NASA Astrophysics Data System (ADS)
Fox, P.
2007-05-01
This presentation will demonstrate how users and other data providers can utilize the Virtual Solar-Terrestrial Observatory (VSTO) to find, access and use diverse data holdings from the disciplines of solar, solar-terrestrial and space physics. VSTO provides a web portal, web services and a native application programming interface for various levels of users. Since these access methods are based on semantic web technologies and refer to the VSTO ontology, users also have the option of taking advantage of value-added services when accessing and using the data. We present examples of both conventional use of VSTO and advanced semantic use. Finally, we present our future directions for VSTO and semantic data frameworks in general.
A Microarray Tool Provides Pathway and GO Term Analysis.
Koch, Martin; Royer, Hans-Dieter; Wiese, Michael
2011-12-01
Analysis of gene expression profiles is no longer exclusively a task for bioinformatic experts. However, gaining statistically significant results is challenging and requires both biological knowledge and computational know-how. Here we present a novel, user-friendly microarray reporting tool called maRt. The software provides access to bioinformatic resources, like gene ontology terms and biological pathways, by use of the DAVID and BioMart web services. Results are summarized in structured HTML reports, each presenting a different layer of information. In these reports, contents from diverse sources are integrated and interlinked. To speed up processing, maRt takes advantage of the multi-core technology of modern desktop computers by using parallel processing. Since the software is built upon an RCP infrastructure, it might be a starting point for developers aiming to integrate novel R-based applications. Installer, documentation and various kinds of tutorials are available under the LGPL license at the website of our institute, http://www.pharma.uni-bonn.de/www/mart. This software is free for academic use. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Druzinsky, Robert E.; Balhoff, James P.; Crompton, Alfred W.; ...
2016-02-12
Here we present the Mammalian Feeding Muscle Ontology (MFMO), a multi-species ontology focused on anatomical structures that participate in feeding and other oral/pharyngeal behaviors. A unique feature of the MFMO is that a simple, computable, definition of each muscle, which includes its attachments and innervation, is true across mammals. This construction mirrors the logical foundation of comparative anatomy and permits searches using language familiar to biologists. Further, it provides a template for muscles that will be useful in extending any anatomy ontology. The MFMO is developed to support the Feeding Experiments End-User Database Project (FEED, https://feedexp.org/), a publicly-available, online repository for physiological data collected from in vivo studies of feeding (e.g., mastication, biting, swallowing) in mammals. Currently the MFMO is integrated into FEED and also into two literature-specific implementations of Textpresso, a text-mining system that facilitates powerful searches of a corpus of scientific publications. We evaluate the MFMO by asking questions that test the ability of the ontology to return appropriate answers (competency questions). Lastly, we compare the results of queries of the MFMO to results from similar searches in PubMed and Google Scholar. Our tests demonstrate that the MFMO is competent to answer queries formed in the common language of comparative anatomy, but PubMed and Google Scholar are not. Overall, our results show that by incorporating anatomical ontologies into searches, an expanded and anatomically comprehensive set of results can be obtained. The broader scientific and publishing communities should consider taking up the challenge of semantically enabled search capabilities.
Knowledge Representation and Management, It's Time to Integrate!
Dhombres, F; Charlet, J
2017-08-01
Objectives: To select, present, and summarize the best papers published in 2016 in the field of Knowledge Representation and Management (KRM). Methods: A comprehensive and standardized review of the medical informatics literature was performed based on a PubMed query. Results: Among the 1,421 retrieved papers, the review process resulted in the selection of four best papers focused on the integration of heterogeneous data via the development and the alignment of terminological resources. In the first article, the authors provide a curated and standardized version of the publicly available US FDA Adverse Event Reporting System. Such a resource will improve the quality of the underlying data, and enable standardized analyses using common vocabularies. The second article describes a project developed in order to facilitate heterogeneous data integration in the i2b2 framework. The originality is to allow users to integrate the data described in different terminologies and to build a new repository, with a unique model able to support the representation of the various data. The third paper is dedicated to modeling the association between multiple phenotypic traits described within the Human Phenotype Ontology (HPO) and the corresponding genotype in the specific context of rare diseases (rare variants). Finally, the fourth paper presents solutions to annotation-ontology mapping in genome-scale data. Of particular interest in this work is the Experimental Factor Ontology (EFO) and its generic association model, the Ontology of Biomedical AssociatioN (OBAN). Conclusion: Ontologies have started to show their efficiency to integrate medical data for various tasks in medical informatics: electronic health records data management, clinical research, and knowledge-based systems development. Georg Thieme Verlag KG Stuttgart.
NeuroNames: an ontology for the BrainInfo portal to neuroscience on the web.
Bowden, Douglas M; Song, Evan; Kosheleva, Julia; Dubach, Mark F
2012-01-01
BrainInfo ( http://braininfo.org ) is a growing portal to neuroscientific information on the Web. It is indexed by NeuroNames, an ontology designed to compensate for ambiguities in neuroanatomical nomenclature. The 20-year old ontology continues to evolve toward the ideal of recognizing all names of neuroanatomical entities and accommodating all structural concepts about which neuroscientists communicate, including multiple concepts of entities for which neuroanatomists have yet to determine the best or 'true' conceptualization. To make the definitions of structural concepts unambiguous and terminologically consistent we created a 'default vocabulary' of unique structure names selected from existing terminology. We selected standard names by criteria designed to maximize practicality for use in verbal communication as well as computerized knowledge management. The ontology of NeuroNames accommodates synonyms and homonyms of the standard terms in many languages. It defines complex structures as models composed of primary structures, which are defined in unambiguous operational terms. NeuroNames currently relates more than 16,000 names in eight languages to some 2,500 neuroanatomical concepts. The ontology is maintained in a relational database with three core tables: Names, Concepts and Models. BrainInfo uses NeuroNames to index information by structure, to interpret users' queries and to clarify terminology on remote web pages. NeuroNames is a resource vocabulary of the NLM's Unified Medical Language System (UMLS, 2011) and the basis for the brain regions component of NIFSTD (NeuroLex, 2011). The current version has been downloaded to hundreds of laboratories for indexing data and linking to BrainInfo, which attracts some 400 visitors/day, downloading 2,000 pages/day.
NASA Astrophysics Data System (ADS)
Peckham, S. D.; Kelbert, A.; Rudan, S.; Stoica, M.
2016-12-01
Standardized metadata for models is the key to reliable and greatly simplified coupling in model coupling frameworks like CSDMS (Community Surface Dynamics Modeling System). This model metadata also helps model users to understand the important details that underpin computational models and to compare the capabilities of different models. These details include simplifying assumptions on the physics, governing equations and the numerical methods used to solve them, discretization of space (the grid) and time (the time-stepping scheme), state variables (input or output), model configuration parameters. This kind of metadata provides a "deep description" of a computational model that goes well beyond other types of metadata (e.g. author, purpose, scientific domain, programming language, digital rights, provenance, execution) and captures the science that underpins a model. While having this kind of standardized metadata for each model in a repository opens up a wide range of exciting possibilities, it is difficult to collect this information and a carefully conceived "data model" or schema is needed to store it. Automated harvesting and scraping methods can provide some useful information, but they often result in metadata that is inaccurate or incomplete, and this is not sufficient to enable the desired capabilities. In order to address this problem, we have developed a browser-based tool called the MCM Tool (Model Component Metadata) which runs on notebooks, tablets and smart phones. This tool was partially inspired by the TurboTax software, which greatly simplifies the necessary task of preparing tax documents. It allows a model developer or advanced user to provide a standardized, deep description of a computational geoscience model, including hydrologic models. Under the hood, the tool uses a new ontology for models built on the CSDMS Standard Names, expressed as a collection of RDF files (Resource Description Framework). 
This ontology is based on core concepts such as variables, objects, quantities, operations, processes and assumptions. The purpose of this talk is to present details of the new ontology and to then demonstrate the MCM Tool for several hydrologic models.
Improvements to the Ontology-based Metadata Portal for Unified Semantics (OlyMPUS)
NASA Astrophysics Data System (ADS)
Linsinbigler, M. A.; Gleason, J. L.; Huffer, E.
2016-12-01
The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support Earth Science data consumers and data providers, enabling the latter to register data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS complements the ODISEES data discovery system with an intelligent tool to enable data producers to auto-generate semantically enhanced metadata and upload it to the metadata repository that drives ODISEES. Like ODISEES, the OlyMPUS metadata provisioning tool leverages robust semantics, a NoSQL database and query engine, an automated reasoning engine that performs first- and second-order deductive inferencing, and uses a controlled vocabulary to support data interoperability and automated analytics. The ODISEES data discovery portal leverages this metadata to provide a seamless data discovery and access experience for data consumers who are interested in comparing and contrasting the multiple Earth science data products available across NASA data centers. OlyMPUS will support scientists' services and tools for performing complex analyses and identifying correlations and non-obvious relationships across all types of Earth System phenomena using the full spectrum of NASA Earth Science data available. By providing an intelligent discovery portal that supplies users - both human users and machines - with detailed information about data products, their contents and their structure, ODISEES will reduce the level of effort required to identify and prepare large volumes of data for analysis. This poster will explain how OlyMPUS leverages deductive reasoning and other technologies to create an integrated environment for generating and exploiting semantically rich metadata.
The MMI Semantic Framework: Rosetta Stones for Earth Sciences
NASA Astrophysics Data System (ADS)
Rueda, C.; Bermudez, L. E.; Graybeal, J.; Alexander, P.
2009-12-01
Semantic interoperability, the exchange of meaning among computer systems, is needed to successfully share data in Ocean Science and across all Earth sciences. The best approach toward semantic interoperability requires a designed framework, and operationally tested tools and infrastructure within that framework. Currently available technologies make a scientific semantic framework feasible, but its development requires sustainable architectural vision and development processes. This presentation outlines the MMI Semantic Framework, including recent progress on it and its client applications. The MMI Semantic Framework consists of tools, infrastructure, and operational and community procedures and best practices, to meet short-term and long-term semantic interoperability goals. The design and prioritization of the semantic framework capabilities are based on real-world scenarios in Earth observation systems. We describe some key use cases, as well as the associated requirements for building the overall infrastructure, which is realized through the MMI Ontology Registry and Repository. This system includes support for community creation and sharing of semantic content, ontology registration, version management, and seamless integration of user-friendly tools and application programming interfaces. The presentation describes the architectural components for semantic mediation, registry and repository for vocabularies, ontology, and term mappings. We show how the technologies and approaches in the framework can address community needs for managing and exchanging semantic information. We will demonstrate how different types of users and client applications exploit the tools and services for data aggregation, visualization, archiving, and integration. Specific examples from OOSTethys (http://www.oostethys.org) and the Ocean Observatories Initiative Cyberinfrastructure (http://www.oceanobservatories.org) will be cited. 
Finally, we show how semantic augmentation of web services standards could be performed using framework tools.
NASA Astrophysics Data System (ADS)
Zaslavsky, I.; Valentine, D.; Richard, S. M.; Gupta, A.; Meier, O.; Peucker-Ehrenbrink, B.; Hudman, G.; Stocks, K. I.; Hsu, L.; Whitenack, T.; Grethe, J. S.; Ozyurt, I. B.
2017-12-01
EarthCube Data Discovery Hub (DDH) is an EarthCube Building Block project using technologies developed in CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability) to enable geoscience users to explore a growing portfolio of EarthCube-created and other geoscience-related resources. Over 1 million metadata records are available for discovery through the project portal (cinergi.sdsc.edu). These records are retrieved from data facilities, including federal, state and academic sources, or contributed by geoscientists through workshops, surveys, or other channels. CINERGI metadata augmentation pipeline components 1) provide semantic enhancement based on a large ontology of geoscience terms, using text analytics to generate keywords with references to ontology classes, 2) add spatial extents based on place names found in the metadata record, and 3) add organization identifiers to the metadata. The records are indexed and can be searched via a web portal and standard search APIs. The added metadata content improves discoverability and interoperability of the registered resources. Specifically, the addition of ontology-anchored keywords enables faceted browsing and lets users navigate to datasets related by variables measured, equipment used, science domain, processes described, geospatial features studied, and other dataset characteristics that are generated by the pipeline. DDH also lets data curators access and edit the automatically generated metadata records using the CINERGI metadata editor, accept or reject the enhanced metadata content, and consider it in updating their metadata descriptions. 
We consider several complex data discovery workflows, in environmental seismology (quantifying sediment and water fluxes using seismic data), marine biology (determining available temperature, location, weather and bleaching characteristics of coral reefs related to measurements in a given coral reef survey), and river geochemistry (discovering observations relevant to geochemical measurements outside the tidal zone, given specific discharge conditions).
Intelligent resource discovery using ontology-based resource profiles
NASA Technical Reports Server (NTRS)
Hughes, J. Steven; Crichton, Dan; Kelly, Sean; Crichton, Jerry; Tran, Thuy
2004-01-01
Successful resource discovery across heterogeneous repositories is strongly dependent on the semantic and syntactic homogeneity of the associated resource descriptions. Ideally, resource descriptions are easily extracted from pre-existing standardized sources, expressed using standard syntactic and semantic structures, and managed and accessed within a distributed, flexible, and scalable software framework.
NASA Astrophysics Data System (ADS)
Demir, I.; Sermet, M. Y.
2016-12-01
Nobody is immune from extreme events or natural hazards that can lead to large-scale consequences for the nation and public. One of the solutions to reduce the impacts of extreme events is to invest in improving resilience with the ability to better prepare, plan, recover, and adapt to disasters. The National Research Council (NRC) report discusses how to increase resilience to extreme events through a vision of a resilient nation in the year 2030. The report highlights the importance of data, information, gaps and knowledge challenges that need to be addressed, and suggests that every individual access the risk and vulnerability information to make their communities more resilient. This abstract presents our project on developing a resilience framework for flooding to improve societal preparedness, with the following objectives: (a) develop a generalized ontology for extreme events with primary focus on flooding; (b) develop a knowledge engine with voice recognition, artificial intelligence, natural language processing, and inference engine. The knowledge engine will utilize the flood ontology and concepts to connect user input to relevant knowledge discovery outputs on flooding; (c) develop a data acquisition and processing framework from existing environmental observations, forecast models, and social networks. The system will utilize the framework, capabilities and user base of the Iowa Flood Information System (IFIS) to populate and test the system; (d) develop a communication framework to support user interaction and delivery of information to users. The interaction and delivery channels will include voice and text input via web-based system (e.g. IFIS), agent-based bots (e.g. Microsoft Skype, Facebook Messenger), smartphone and augmented reality applications (e.g. smart assistant), and automated web workflows (e.g. IFTTT, CloudWork) to open the knowledge discovery for flooding to thousands of community extensible web workflows.
Xie, Jiangan; Codd, Christopher; Mo, Kevin; He, Yongqun
2016-01-01
M. bovis strain Bacillus Calmette–Guérin (BCG) has been the only licensed live attenuated vaccine against tuberculosis (TB) for nearly one century and has also been approved as a therapeutic vaccine for bladder cancer treatment since 1990. During its long time usage, different adverse events (AEs) have been reported. However, the AEs associated with the BCG preventive TB vaccine and therapeutic cancer vaccine have not been systematically compared. In this study, we systematically collected various BCG AE data mined from the US VAERS database and PubMed literature reports, identified statistically significant BCG-associated AEs, and ontologically classified and compared these AEs related to these two types of BCG vaccine. From 397 VAERS BCG AE case reports, we identified 64 AEs statistically significantly associated with the BCG TB vaccine and 14 AEs with the BCG cancer vaccine. Our meta-analysis of 41 peer-reviewed journal reports identified 48 AEs associated with the BCG TB vaccine and 43 AEs associated with the BCG cancer vaccine. Among all identified AEs from VAERS and literature reports, 25 AEs belong to serious AEs. The Ontology of Adverse Events (OAE)-based ontological hierarchical analysis indicated that the AEs associated with the BCG TB vaccine were enriched in immune system (e.g., lymphadenopathy and lymphadenitis), skin (e.g., skin ulceration and cyanosis), and respiratory system (e.g., cough and pneumonia); in contrast, the AEs associated with the BCG cancer vaccine mainly occurred in the urinary system (e.g., dysuria, pollakiuria, and hematuria). With these distinct AE profiles detected, this study also discovered three AEs (i.e., chills, pneumonia, and C-reactive protein increased) shared by the BCG TB vaccine and bladder cancer vaccine. 
Furthermore, our deep investigation of 24 BCG-associated death cases from VAERS identified the important effects of age, vaccine co-administration, and immunosuppressive status on the final BCG-associated death outcome. PMID:27749923
Semantic biomedical resource discovery: a Natural Language Processing framework.
Sfakianaki, Pepi; Koumakis, Lefteris; Sfakianakis, Stelios; Iatraki, Galatia; Zacharioudakis, Giorgos; Graf, Norbert; Marias, Kostas; Tsiknakis, Manolis
2015-09-30
A plethora of publicly available biomedical resources currently exists and is growing rapidly. In parallel, specialized repositories are being developed, indexing numerous clinical and biomedical tools. The main drawback of such repositories is the difficulty in locating appropriate resources for a clinical or biomedical decision task, especially for non-Information Technology expert users. In parallel, although NLP research in the clinical domain has been active since the 1960s, progress in the development of NLP applications has been slow and lags behind progress in the general NLP domain. The aim of the present study is to investigate the use of semantics for biomedical resources annotation with domain specific ontologies and exploit Natural Language Processing methods in empowering the non-Information Technology expert users to efficiently search for biomedical resources using natural language. A Natural Language Processing engine which can "translate" free text into targeted queries, automatically transforming a clinical research question into a request description that contains only terms of ontologies, has been implemented. The implementation is based on information extraction techniques for text in natural language, guided by integrated ontologies. Furthermore, knowledge from robust text mining methods has been incorporated to map descriptions into suitable domain ontologies in order to ensure that the biomedical resources descriptions are domain oriented and enhance the accuracy of services discovery. The framework is freely available as a web application at ( http://calchas.ics.forth.gr/ ). For our experiments, a range of clinical questions were established based on descriptions of clinical trials from the ClinicalTrials.gov registry as well as recommendations from clinicians. 
Domain experts manually identified the available tools in a tools repository which are suitable for addressing the clinical questions at hand, either individually or as a set of tools forming a computational pipeline. The results were compared with those obtained from an automated discovery of candidate biomedical tools. For the evaluation of the results, precision and recall measurements were used. Our results indicate that the proposed framework has a high precision and low recall, implying that the system returns essentially more relevant results than irrelevant. Adequate biomedical ontologies, sufficient NLP tools, and biomedical annotation systems of sufficient quality are already available for the implementation of a biomedical resources discovery framework, based on the semantic annotation of resources and the use of NLP techniques. The results of the present study demonstrate the clinical utility of the application of the proposed framework, which aims to bridge the gap between clinical questions in natural language and efficient dynamic biomedical resources discovery.
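As an illustration of the evaluation measures named in this abstract, the following minimal sketch computes precision and recall for one query by comparing the tools a system returns against the tools selected by experts. The tool identifiers and the helper name `precision_recall` are invented for illustration; they are not taken from the study or its repository.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved items against a gold-standard set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: share of returned items that are relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant items that were returned.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical tool identifiers, invented for this example only.
expert_selected = {"tool_a", "tool_b", "tool_c", "tool_d"}
system_returned = {"tool_a", "tool_b"}

precision, recall = precision_recall(system_returned, expert_selected)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case every returned tool is relevant (precision 1.0) but only half of the relevant tools are found (recall 0.5), which is the high-precision, low-recall pattern the abstract reports.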
NASA Astrophysics Data System (ADS)
Piasecki, M.; Beran, B.
2007-12-01
Search engines have changed the way we see the Internet. The ability to find the information by just typing in keywords was a big contribution to the overall web experience. While the conventional search engine methodology worked well for textual documents, locating scientific data remains a problem since they are stored in databases not readily accessible by search engine bots. Considering different temporal, spatial and thematic coverage of different databases, especially for interdisciplinary research it is typically necessary to work with multiple data sources. These sources can be federal agencies which generally offer national coverage or regional sources which cover a smaller area with higher detail. However for a given geographic area of interest there often exists more than one database with relevant data. Thus being able to query multiple databases simultaneously is a desirable feature that would be tremendously useful for scientists. Development of such a search engine requires dealing with various heterogeneity issues. In scientific databases, systems often impose controlled vocabularies which ensure that they are generally homogeneous within themselves but are semantically heterogeneous when moving between different databases. This defines the boundaries of possible semantic related problems making it easier to solve than with the conventional search engines that deal with free text. We have developed a search engine that enables querying multiple data sources simultaneously and returns data in a standardized output despite the aforementioned heterogeneity issues between the underlying systems. This application relies mainly on metadata catalogs or indexing databases, ontologies and webservices with virtual globe and AJAX technologies for the graphical user interface. Users can trigger a search of dozens of different parameters over hundreds of thousands of stations from multiple agencies by providing a keyword, a spatial extent, i.e. 
a bounding box, and a temporal bracket. As part of this development we have also added an environment that allows users to do some of the semantic tagging, i.e. the linkage of a variable name (which can be anything they desire) to defined concepts in the ontology structure which in turn provides the backbone of the search engine.
Myneni, Sahiti; Amith, Muhammad; Geng, Yimin; Tao, Cui
2015-01-01
Adolescent and Young Adult (AYA) cancer survivors manage an array of health-related issues. Survivorship Care Plans (SCPs) have the potential to empower these young survivors by providing information regarding treatment summary, late-effects of cancer therapies, healthy lifestyle guidance, coping with work-life-health balance, and follow-up care. However, current mHealth infrastructure used to deliver SCPs has been limited in terms of flexibility, engagement, and reusability. The objective of this study is to develop an ontology-driven survivor engagement framework to facilitate rapid development of mobile apps that are targeted, extensible, and engaging. The major components include ontology models, patient engagement features, and behavioral intervention technologies. We apply the proposed framework to characterize individual building blocks ("survivor digilegos"), which form the basis for mHealth tools that address user needs across the cancer care continuum. Results indicate that the framework (a) allows identification of AYA survivorship components, (b) facilitates infusion of engagement elements, and (c) integrates behavior change constructs into the design architecture of survivorship applications. Implications for design of patient-engaging chronic disease management solutions are discussed.
FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation.
Bolleman, Jerven T; Mungall, Christopher J; Strozzi, Francesco; Baran, Joachim; Dumontier, Michel; Bonnal, Raoul J P; Buels, Robert; Hoehndorf, Robert; Fujisawa, Takatomo; Katayama, Toshiaki; Cock, Peter J A
2016-06-13
Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. We have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned "omics" areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe - and potentially merge - sequence annotations from multiple sources. Data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.
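To make the subject-predicate-object idea concrete, here is a minimal sketch that builds the triples for one feature location and prints them as N-Triples. It uses only the Python standard library. The FALDO namespace and term names (`location`, `Region`, `begin`, `end`, `ExactPosition`, `ForwardStrandPosition`, `position`, `reference`) are written as recalled from the published vocabulary and should be checked against the ontology itself; the `http://example.org/` resources and the helper functions are hypothetical.

```python
FALDO = "http://biohackathon.org/resource/faldo#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"
EX = "http://example.org/"  # hypothetical namespace for the example data


def position_triples(node, offset, reference, forward=True):
    """Triples describing one exact, stranded position on a reference sequence."""
    strand = "ForwardStrandPosition" if forward else "ReverseStrandPosition"
    return [
        (node, RDF_TYPE, FALDO + "ExactPosition"),
        (node, RDF_TYPE, FALDO + strand),
        (node, FALDO + "position", offset),
        (node, FALDO + "reference", reference),
    ]


def region_triples(feature, begin, end, reference, forward=True):
    """Triples linking a feature to a region bounded by a begin and an end position."""
    region = feature + "/region"
    begin_node, end_node = region + "/begin", region + "/end"
    triples = [
        (feature, FALDO + "location", region),
        (region, RDF_TYPE, FALDO + "Region"),
        (region, FALDO + "begin", begin_node),
        (region, FALDO + "end", end_node),
    ]
    triples += position_triples(begin_node, begin, reference, forward)
    triples += position_triples(end_node, end, reference, forward)
    return triples


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples; int objects become typed literals."""
    lines = []
    for s, p, o in triples:
        obj = f'"{o}"^^<{XSD_INT}>' if isinstance(o, int) else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


triples = region_triples(EX + "gene1", 100, 250, EX + "chr1")
print(to_ntriples(triples))
```

Because the position is expressed as plain triples rather than a file-format-specific coordinate field, the same structure could be loaded into any triple store and queried with SPARQL.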
Heart health risk assessment system: a nonintrusive proposal using ontologies and expert rules.
Garcia-Valverde, Teresa; Muñoz, Andrés; Arcas, Francisco; Bueno-Crespo, Andrés; Caballero, Alberto
2014-01-01
According to the World Health Organization, the world's leading cause of death is heart disease, with nearly two million deaths per year. Although some factors are not possible to change, there are some keys that help to prevent heart diseases. One of the most important keys is to keep an active daily life, with moderate exercise. However, deciding what a moderate exercise is or when a slightly abnormal heart rate value is a risk depends on the person and the activity. In this paper we propose a context-aware system that is able to determine the activity the person is performing in an unobtrusive way. Then, we have defined an ontology to represent the available knowledge about the person (biometric data, fitness status, medical information, etc.) and her current activity (level of intensity, heart rate recommended for that activity, etc.). With such knowledge, a set of expert rules based on this ontology is involved in a reasoning process to infer levels of alerts or suggestions for the users when the intensity of the activity is detected as dangerous for her health. We show how this approach can be accomplished by using only everyday devices such as a smartphone and a smartwatch.
FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation
Bolleman, Jerven T.; Mungall, Christopher J.; Strozzi, Francesco; ...
2016-06-13
Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. In this paper, we have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned “omics” areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe – and potentially merge – sequence annotations from multiple sources. Finally, data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.
Li, Jia; Xia, Yunni; Luo, Xin
2014-01-01
OWL-S, one of the most important Semantic Web service ontologies proposed to date, provides a core ontological framework and guidelines for describing the properties and capabilities of web services in an unambiguous, computer interpretable form. Predicting the reliability of composite service processes specified in OWL-S allows service users to decide whether the process meets the quantitative quality requirement. In this study, we consider the runtime quality of services to be fluctuating and introduce a dynamic framework to predict the runtime reliability of services specified in OWL-S, employing the Non-Markovian stochastic Petri net (NMSPN) and the time series model. The framework includes the following steps: obtaining the historical response time series of individual service components; fitting these series with an autoregressive moving-average (ARMA) model and predicting the future firing rates of service components; mapping the OWL-S process into a NMSPN model; employing the predicted firing rates as the model input of NMSPN and calculating the normal completion probability as the reliability estimate. In the case study, a comparison between the static model and our approach based on experimental data is presented and it is shown that our approach achieves higher prediction accuracy.
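The first two steps listed in this abstract (fit a time-series model to historical response times, then predict a firing rate) can be sketched roughly as follows. This is a simplified stand-in, not the authors' method: it fits a first-order autoregressive model, AR(1), by least squares instead of a full ARMA model, it assumes the firing rate is the reciprocal of the predicted response time, and the response-time data are invented. The NMSPN mapping and the completion-probability calculation are not shown.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var if var else 0.0
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_next(series):
    """One-step-ahead forecast from the fitted AR(1) model."""
    c, phi = fit_ar1(series)
    return c + phi * series[-1]


# Invented response times (seconds) for one hypothetical service component.
response_times = [0.50, 0.55, 0.60, 0.65, 0.70]

predicted_time = forecast_next(response_times)
# Assumption: firing rate taken as the reciprocal of the predicted response time.
firing_rate = 1.0 / predicted_time
print(f"predicted response time: {predicted_time:.3f} s, firing rate: {firing_rate:.3f} per s")
```

For this steadily rising series the forecast continues the trend to about 0.75 s, so the predicted rate is lower than one computed from the historical mean, which is the kind of difference a static model would miss.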
A use case study on late stent thrombosis for ontology-based temporal reasoning and analysis.
Clark, Kim; Sharma, Deepak; Qin, Rui; Chute, Christopher G; Tao, Cui
2014-01-01
In this paper, we show how we have applied the Clinical Narrative Temporal Relation Ontology (CNTRO) and its associated temporal reasoning system (the CNTRO Timeline Library) to trend temporal information within medical device adverse event report narratives. 238 narratives documenting occurrences of late stent thrombosis adverse events from the Food and Drug Administration's (FDA) Manufacturing and User Facility Device Experience (MAUDE) database were annotated and evaluated using the CNTRO Timeline Library to identify, order, and calculate the duration of temporal events. The CNTRO Timeline Library had a 95% accuracy in correctly ordering events within the 238 narratives. 41 narratives included an event in which the duration was documented, and the CNTRO Timeline Library had an 80% accuracy in correctly determining these durations. 77 narratives included documentation of a duration between events, and the CNTRO Timeline Library had a 76% accuracy in determining these durations. This paper also includes an example of how this temporal output from the CNTRO ontology can be used to verify recommendations for length of drug administration, and proposes that these same tools could be applied to other medical device adverse event narratives in order to identify currently unknown temporal trends.
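The three temporal tasks evaluated in this abstract (identifying events, ordering them, and calculating durations) can be illustrated with a small sketch using Python's standard `datetime` module. This is not the CNTRO Timeline Library or its API; the event names and dates are invented and stand in for annotations extracted from a narrative.

```python
from datetime import date

# Hypothetical annotated events from one narrative, in the order they were mentioned.
events = [
    ("thrombosis_detected", date(2010, 9, 15)),
    ("stent_implanted", date(2009, 3, 1)),
    ("drug_therapy_stopped", date(2010, 3, 1)),
]


def order_events(events):
    """Return events sorted chronologically by their date."""
    return sorted(events, key=lambda event: event[1])


def duration_days(events, start_name, end_name):
    """Number of days between two named events."""
    lookup = dict(events)
    return (lookup[end_name] - lookup[start_name]).days


timeline = [name for name, _ in order_events(events)]
therapy_days = duration_days(events, "stent_implanted", "drug_therapy_stopped")
print(timeline)
print(f"days of drug therapy after implantation: {therapy_days}")
```

A duration computed this way is the kind of output that could be compared with a recommended length of drug administration, as the abstract suggests.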
Heart Health Risk Assessment System: A Nonintrusive Proposal Using Ontologies and Expert Rules
2014-01-01
According to the World Health Organization, the world's leading cause of death is heart disease, with nearly two million deaths per year. Although some factors are not possible to change, there are some keys that help to prevent heart diseases. One of the most important keys is to keep an active daily life, with moderate exercise. However, deciding what a moderate exercise is or when a slightly abnormal heart rate value is a risk depends on the person and the activity. In this paper we propose a context-aware system that is able to determine the activity the person is performing in an unobtrusive way. Then, we have defined an ontology to represent the available knowledge about the person (biometric data, fitness status, medical information, etc.) and her current activity (level of intensity, heart rate recommended for that activity, etc.). With such knowledge, a set of expert rules based on this ontology is involved in a reasoning process to infer levels of alerts or suggestions for the user when the intensity of the activity is detected as dangerous for her health. We show how this approach can be accomplished by using only everyday devices such as a smartphone and a smartwatch. PMID:25045715
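A rule of the kind described above might look like the following Python sketch. It is an illustration only, not the authors' rule base: the intensity zones are hypothetical values, the `220 - age` estimate of maximum heart rate is a common rule of thumb used here as an assumption, and none of this is clinical guidance.

```python
# Hypothetical heart-rate zones as fractions of the estimated maximum heart rate.
ZONES = {
    "light": (0.50, 0.60),
    "moderate": (0.60, 0.75),
    "vigorous": (0.75, 0.90),
}

def max_heart_rate(age):
    """Rule-of-thumb estimate (assumption, not taken from the paper)."""
    return 220 - age

def assess(age, intensity, heart_rate):
    """Return 'alert', 'suggestion' or 'ok' for a measured heart rate."""
    low, high = ZONES[intensity]
    hr_max = max_heart_rate(age)
    if heart_rate > high * hr_max:
        return "alert"        # above the range recommended for this activity
    if heart_rate < low * hr_max:
        return "suggestion"   # below the range; the activity could be intensified
    return "ok"

print(assess(40, "moderate", 150))
```

In a full system the person's profile and the activity would come from the ontology, and the rule would be evaluated by a reasoner rather than by a hard-coded function.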
FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bolleman, Jerven T.; Mungall, Christopher J.; Strozzi, Francesco
Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. In this paper, we have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned “omics” areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe – and potentially merge – sequence annotations from multiple sources. Finally, data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.
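To make the idea of a location expressed as subject-predicate-object triples concrete, here is a minimal Python sketch that builds such triples as plain tuples. The class and property names (`Region`, `ExactPosition`, `begin`, `end`, `position`, `reference`) and the namespace follow our reading of the FALDO vocabulary and should be checked against the published ontology; the `example.org` URIs and the helper `region_triples` are hypothetical.

```python
FALDO = "http://biohackathon.org/resource/faldo#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def region_triples(region, reference, begin, end):
    """Describe a region on a reference sequence as a list of triples."""
    begin_node, end_node = region + "/begin", region + "/end"
    triples = [
        (region, RDF_TYPE, FALDO + "Region"),
        (region, FALDO + "begin", begin_node),
        (region, FALDO + "end", end_node),
    ]
    # Each boundary is its own node carrying a coordinate and its reference sequence.
    for node, coordinate in ((begin_node, begin), (end_node, end)):
        triples += [
            (node, RDF_TYPE, FALDO + "ExactPosition"),
            (node, FALDO + "position", coordinate),
            (node, FALDO + "reference", reference),
        ]
    return triples

triples = region_triples(
    "http://example.org/feature/1/region",  # hypothetical region URI
    "http://example.org/sequence/chr1",     # hypothetical reference sequence URI
    100,
    250,
)
for t in triples:
    print(t)
```

Because each boundary names its own reference sequence, the same pattern can describe positions independently of the file format the annotation came from.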
Multicriteria analysis of ontologically represented information
NASA Astrophysics Data System (ADS)
Wasielewska, K.; Ganzha, M.; Paprzycki, M.; Bǎdicǎ, C.; Ivanovic, M.; Lirkov, I.
2014-11-01
Our current work concerns the development of a decision support system for the software selection problem. The main idea is to utilize expert knowledge to help the user in selecting the best software / method / computational resource to solve a computational problem. Obviously, this involves multicriterial decision making and the key open question is: which method to choose. The context of the work is provided by the Agents in Grid (AiG) project, where the software selection (and thus multicriterial analysis) is to be realized when all information concerning the problem, the hardware and the software is ontologically represented. Initially, we have considered the Analytical Hierarchy Process (AHP), which is well suited for the hierarchical data structures (e.g., such that have been formulated in terms of ontologies). However, due to its well-known shortcomings, we have decided to extend our search for the multicriterial analysis method best suited for the problem in question. In this paper we report results of our search, which involved: (i) TOPSIS (Technique for Order Preference by Similarity to Ideal Solution), (ii) PROMETHEE, and (iii) GRIP (Generalized Regression with Intensities of Preference). We also briefly argue why other methods have not been considered as valuable candidates.
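Of the methods listed, TOPSIS is compact enough to sketch in a few lines of Python. The version below is a generic textbook formulation (vector normalisation, weighted distances to the ideal and anti-ideal solutions), not the AiG implementation; the candidate solvers, criteria, and weights are hypothetical.

```python
import math

def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row).

    matrix  -- rows are alternatives, columns are criteria
    weights -- one weight per criterion
    benefit -- True if larger is better for that criterion, False if smaller is
    Assumes no criterion column is all zeros and the alternatives are not all identical.
    """
    n = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical solvers described by (accuracy, runtime in seconds).
scores = topsis([[0.9, 10], [0.7, 30], [0.8, 20]], [0.5, 0.5], [True, False])
print(scores)
```

The alternative with the highest score is the one closest to the ideal solution and farthest from the anti-ideal one.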
Ontology based log content extraction engine for a posteriori security control.
Azkia, Hanieh; Cuppens-Boulahia, Nora; Cuppens, Frédéric; Coatrieux, Gouenou
2012-01-01
In a posteriori access control, users are accountable for actions they performed and must provide evidence, when required by some legal authorities for instance, to prove that these actions were legitimate. Generally, log files contain the needed data to achieve this goal. This logged data can be recorded in several formats; we consider here IHE-ATNA (Integrating the healthcare enterprise-Audit Trail and Node Authentication) as log format. The difficulty lies in extracting useful information regardless of the log format. A posteriori access control frameworks often include a log filtering engine that provides this extraction function. In this paper we define and enforce this function by building an IHE-ATNA based ontology model, which we query using SPARQL, and show how the a posteriori security controls are made effective and easier based on this function.
Knowledge-Centric Management of Business Rules in a Pharmacy
NASA Astrophysics Data System (ADS)
Puustjärvi, Juha; Puustjärvi, Leena
A business rule defines or constrains some aspect of the business. In the healthcare sector many of the business rules are dictated by law or medical regulations, which are constantly changing. This is a challenge for healthcare organizations. Although several commercial business rule management systems are available, the problem from the pharmacies' point of view is that these systems are overly geared towards the automation and manipulation of business rules, while the main need in pharmacies lies in easy retrieval of business rules within daily routines. Another problem is that business rule management systems are isolated in the sense that they have their own data stores that cannot be accessed by other information systems used in pharmacies. As a result, a pharmacist is burdened by accessing many systems within a single user task. In order to avoid this problem we have modeled business rules, as well as their relationships to other relevant information, in OWL (Web Ontology Language) such that the ontology is shared among the pharmacy's applications. In this way we can avoid the problems of isolated applications and replicated data. The ontology also encourages pharmacies' business agility, i.e., the ability to react more rapidly to the changes required by new business rules. The deployment of the ontology requires that stored business rules are annotated with appropriate metadata descriptions, which are presented in the RDF/XML serialization format. However, neither the designer nor the pharmacists are burdened by the RDF/XML format, as there are sophisticated graphical editors that can be used.
Gupta, Amarnath; Bug, William; Marenco, Luis; Qian, Xufei; Condit, Christopher; Rangarajan, Arun; Müller, Hans Michael; Miller, Perry L.; Sanders, Brian; Grethe, Jeffrey S.; Astakhov, Vadim; Shepherd, Gordon; Sternberg, Paul W.; Martone, Maryann E.
2009-01-01
The overarching goal of the NIF (Neuroscience Information Framework) project is to be a one-stop-shop for Neuroscience. This paper provides a technical overview of how the system is designed. The technical goal of the first version of the NIF system was to develop an information system that a neuroscientist can use to locate relevant information from a wide variety of information sources by simple keyword queries. Although the user would provide only keywords to retrieve information, the NIF system is designed to treat them as concepts whose meanings are interpreted by the system. Thus, a search for a term should find a record containing synonyms of the term. The system is targeted to find information from web pages, publications, databases, web sites built upon databases, XML documents and any other modality in which such information may be published. We have designed a system to achieve this functionality. A central element in the system is an ontology called NIFSTD (for NIF Standard) constructed by amalgamating a number of known and newly developed ontologies. NIFSTD is used by our ontology management module, called OntoQuest, to perform ontology-based search over data sources. The NIF architecture currently provides three different mechanisms for searching heterogeneous data sources including relational databases, web sites, XML documents and full text of publications. Version 1.0 of the NIF system is currently in beta test and may be accessed through http://nif.nih.gov. PMID:18958629
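The behaviour described above, where a keyword is treated as a concept so that records containing its synonyms are also found, can be sketched in Python as follows. This is a simplified illustration, not OntoQuest or NIFSTD: the synonym table, the records, and the function names are hypothetical.

```python
# Hypothetical concept -> synonym sets; a real system would read these from an ontology.
SYNONYMS = {
    "neuron": {"neuron", "nerve cell"},
    "glia": {"glia", "glial cell", "neuroglia"},
}

RECORDS = [
    "Purkinje nerve cell morphology",
    "Glial cell imaging",
    "Neuron firing rates",
]

def expand(term, synonyms):
    """Return every label of the concept that `term` belongs to."""
    term = term.lower()
    for labels in synonyms.values():
        if term in labels:
            return labels
    return {term}  # unknown term: fall back to the keyword itself

def search(term, records, synonyms):
    """Return records mentioning the term or any of its synonyms."""
    labels = expand(term, synonyms)
    return [r for r in records if any(label in r.lower() for label in labels)]

hits = search("neuron", RECORDS, SYNONYMS)
print(hits)
```

A plain keyword match for "neuron" would miss the first record; expanding the query through the synonym set is what recovers it.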
Gupta, Amarnath; Bug, William; Marenco, Luis; Qian, Xufei; Condit, Christopher; Rangarajan, Arun; Müller, Hans Michael; Miller, Perry L; Sanders, Brian; Grethe, Jeffrey S; Astakhov, Vadim; Shepherd, Gordon; Sternberg, Paul W; Martone, Maryann E
2008-09-01
The overarching goal of the NIF (Neuroscience Information Framework) project is to be a one-stop-shop for Neuroscience. This paper provides a technical overview of how the system is designed. The technical goal of the first version of the NIF system was to develop an information system that a neuroscientist can use to locate relevant information from a wide variety of information sources by simple keyword queries. Although the user would provide only keywords to retrieve information, the NIF system is designed to treat them as concepts whose meanings are interpreted by the system. Thus, a search for a term should find a record containing synonyms of the term. The system is targeted to find information from web pages, publications, databases, web sites built upon databases, XML documents and any other modality in which such information may be published. We have designed a system to achieve this functionality. A central element in the system is an ontology called NIFSTD (for NIF Standard) constructed by amalgamating a number of known and newly developed ontologies. NIFSTD is used by our ontology management module, called OntoQuest, to perform ontology-based search over data sources. The NIF architecture currently provides three different mechanisms for searching heterogeneous data sources including relational databases, web sites, XML documents and full text of publications. Version 1.0 of the NIF system is currently in beta test and may be accessed through http://nif.nih.gov.
Ontology Based Quality Evaluation for Spatial Data
NASA Astrophysics Data System (ADS)
Yılmaz, C.; Cömert, Ç.
2015-08-01
Many institutions will be providing data to the National Spatial Data Infrastructure (NSDI). The current technical background of the NSDI is based on syntactic web services. It is expected that this will be replaced by semantic web services. The quality of the data provided is important in terms of the decision-making process and the accuracy of transactions. Therefore, the data quality needs to be tested. This topic has been neglected in Turkey. Data quality control for the NSDI may be done by private or public "data accreditation" institutions. A methodology is required for data quality evaluation. There are studies on data quality including ISO standards, academic studies and software to evaluate spatial data quality. The ISO 19157 standard defines the data quality elements. Proprietary software such as 1Spatial's 1Validate and ESRI's Data Reviewer offer quality evaluation based on their own classification of rules. Commonly, rule-based approaches are used for geospatial data quality checks. In this study, we look for the technical components to devise and implement a rule-based approach with ontologies using free and open source software in a semantic web context. The semantic web uses ontologies to deliver well-defined web resources and make them accessible to end-users and processes. We have created an ontology conforming to the geospatial data and defined some sample rules to show how to test data with respect to data quality elements, including attribute, topo-semantic and geometrical consistency, using free and open source software. To test data against rules, sample GeoSPARQL queries are created and associated with specifications.
Individualised training to address variability of radiologists' performance
NASA Astrophysics Data System (ADS)
Sun, Shanghua; Taylor, Paul; Wilkinson, Louise; Khoo, Lisanne
2008-03-01
Computer-based tools are increasingly used for training and the continuing professional development of radiologists. We propose an adaptive training system to support individualised learning in mammography, based on a set of real cases, which are annotated with educational content by experienced breast radiologists. The system has knowledge of the strengths and weaknesses of each radiologist's performance: each radiologist is assessed to compute a profile showing how they perform on different sets of cases, classified by type of abnormality, breast density, and perceptual difficulty. We also assess variability in cognitive aspects of image perception, classifying errors made by radiologists as errors of search, recognition or decision. This is a novel element in our approach. The profile is used to select cases to present to the radiologist. The intelligent and flexible presentation of these cases distinguishes our system from existing training tools. The training cases are organised and indexed by an ontology we have developed for breast radiologist training, which is consistent with the radiologists' profile. Hence, the training system is able to select appropriate cases to compose an individualised training path, addressing the variability of the radiologists' performance. A substantial part of the system, the ontology, has been evaluated on a large number of cases, and the training system is under implementation for further evaluation.
Effective Filtering of Query Results on Updated User Behavioral Profiles in Web Mining
Sadesh, S.; Suganthe, R. C.
2015-01-01
Web with tremendous volume of information retrieves result for user related queries. With the rapid growth of web page recommendation, results retrieved based on data mining techniques did not offer higher performance filtering rate because relationships between user profile and queries were not analyzed in an extensive manner. At the same time, existing user profile based prediction in web data mining is not exhaustive in producing personalized result rate. To improve the query result rate on dynamics of user behavior over time, Hamilton Filtered Regime Switching User Query Probability (HFRS-UQP) framework is proposed. HFRS-UQP framework is split into two processes, where filtering and switching are carried out. The data mining based filtering in our research work uses the Hamilton Filtering framework to filter user result based on personalized information on automatic updated profiles through search engine. Maximized result is fetched, that is, filtered out with respect to user behavior profiles. The switching performs accurate filtering updated profiles using regime switching. The updating in profile change (i.e., switches) regime in HFRS-UQP framework identifies the second- and higher-order association of query result on the updated profiles. Experiment is conducted on factors such as personalized information search retrieval rate, filtering efficiency, and precision ratio. PMID:26221626
Adaptive User Profiles in Pervasive Advertising Environments
NASA Astrophysics Data System (ADS)
Alt, Florian; Balz, Moritz; Kristes, Stefanie; Shirazi, Alireza Sahami; Mennenöh, Julian; Schmidt, Albrecht; Schröder, Hendrik; Goedicke, Michael
Modern advertising environments try to provide more efficient ads by targeting customers based on their interests. Various approaches exist today as to how information about the users' interests can be gathered. Users can deliberately and explicitly provide this information, or users' shopping behaviors can be analyzed implicitly. We implemented an advertising platform to simulate an advertising environment and present adaptive profiles, which let users set up profiles based on a self-assessment, and enhance those profiles with information about their real shopping behavior as well as about their activity intensity. Additionally, we explain how pervasive technologies such as Bluetooth can be used to create a profile anonymously and unobtrusively.
Ontology and Taxonomy Design and Development for Personalised Web-Based Learning Systems
ERIC Educational Resources Information Center
Yalcinalp, Serpil; Gulbahar, Yasemin
2010-01-01
Recent developments and new directions in education have emphasised learners' needs, profile and pedagogical aspects by focusing on learner-centered approaches in educational settings. e-Learning, on the other hand, guarantees learners the opportunity of learning in their own way, and leads to new considerations in course design. e-Learning is…
Developing an Ontology for Ocean Biogeochemistry Data
NASA Astrophysics Data System (ADS)
Chandler, C. L.; Allison, M. D.; Groman, R. C.; West, P.; Zednik, S.; Maffei, A. R.
2010-12-01
Semantic Web technologies offer great promise for enabling new and better scientific research. However, significant challenges must be met before the promise of the Semantic Web can be realized for a discipline as diverse as oceanography. Evolving expectations for open access to research data combined with the complexity of global ecosystem science research themes present a significant challenge, and one that is best met through an informatics approach. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) is funded by the National Science Foundation Division of Ocean Sciences to work with ocean biogeochemistry researchers to improve access to data resulting from their respective programs. In an effort to improve data access, BCO-DMO staff members are collaborating with researchers from the Tetherless World Constellation (Rensselaer Polytechnic Institute) to develop an ontology that formally describes the concepts and relationships in the data managed by the BCO-DMO. The project required transforming a legacy system of human-readable, flat files of metadata to well-ordered controlled vocabularies to a fully developed ontology. To improve semantic interoperability, terms from the BCO-DMO controlled vocabularies are being mapped to controlled vocabulary terms adopted by other oceanographic data management organizations. While the entire process has proven to be difficult, time-consuming and labor-intensive, the work has been rewarding and is a necessary prerequisite for the eventual incorporation of Semantic Web tools. From the beginning of the project, development of the ontology has been guided by a use case based approach. The use cases were derived from data access related requests received from members of the research community served by the BCO-DMO. The resultant ontology satisfies the requirements of the use cases and reflects the information stored in the metadata database. 
The BCO-DMO metadata database currently contains information that powers several different user and machine-to-machine interfaces to the BCO-DMO data repositories. One goal of the ontology development project is to enable subsequent development of semantically-enabled components (e.g. faceted search) to enhance the power of those interfaces. Addition of semantic capabilities to the existing data interfaces will improve data access through enhanced data discovery. In addition to sharing the ontology, we will describe the challenges encountered thus far in the project, the technologies currently being used, and the strategies associated with the use case based informatics approach.
MELLO: Medical lifelog ontology for data terms from self-tracking and lifelog devices.
Kim, Hye Hyeon; Lee, Soo Youn; Baik, Su Youn; Kim, Ju Han
2015-12-01
The increasing use of health self-tracking devices is making the integration of heterogeneous data and shared decision-making more challenging. Computational analysis of lifelog data has been hampered by the lack of semantic and syntactic consistency among lifelog terms and related ontologies. Medical lifelog ontology (MELLO) was developed by identifying lifelog concepts and relationships between concepts, and it provides clear definitions by following ontology development methods. MELLO aims to support the classification and semantic mapping of lifelog data from diverse health self-tracking devices. MELLO was developed using the General Formal Ontology method with a manual iterative process comprising five steps: (1) defining the scope of lifelog data, (2) identifying lifelog concepts, (3) assigning relationships among MELLO concepts, (4) developing MELLO properties (e.g., synonyms, preferred terms, and definitions) for each MELLO concept, and (5) evaluating representative layers of the ontology content. An evaluation was performed by classifying 11 devices into 3 classes by subjects, and performing pairwise comparisons of lifelog terms among 5 devices in each class as measured using the Jaccard similarity index. MELLO represents a comprehensive knowledge base of 1998 lifelog concepts, with 4996 synonyms for 1211 (61%) concepts and 1395 definitions for 926 (46%) concepts. The MELLO Browser and MELLO Mapper provide convenient access and support the annotation of non-standard proprietary terms with MELLO (http://mello.snubi.org/). MELLO covers 88.1% of lifelog terms from 11 health self-tracking devices and uses simple string matching to match semantically similar terms provided by various devices that are not yet integrated. The results from the comparisons of Jaccard similarities between simple string matching and MELLO matching revealed increases of 2.5-, 2.2-, and 5.7-fold for the physical activity, body measure, and sleep classes, respectively.
MELLO is the first ontology for representing health-related lifelog data with rich contents including definitions, synonyms, and semantic relationships. MELLO fills the semantic gap between heterogeneous lifelog terms that are generated by diverse health self-tracking devices. The unified representation of lifelog terms facilitated by MELLO can help describe an individual's lifestyle and environmental factors, which can be included with user-generated data for clinical research and thereby enhance data integration and sharing. Copyright © 2015. Published by Elsevier Ireland Ltd.
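The evaluation in this record compares Jaccard similarity under simple string matching with Jaccard similarity after mapping terms to shared concepts. A minimal Python sketch of that comparison is given below; the device term lists and the term-to-concept mapping are hypothetical and are not taken from MELLO.

```python
def jaccard(a, b):
    """Jaccard similarity index |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def to_concepts(terms, mapping):
    """Lower-case each term and replace it with its mapped concept, if any."""
    return {mapping.get(t.lower(), t.lower()) for t in terms}

# Hypothetical lifelog terms exported by two devices.
device_a = ["Steps", "Sleep time", "Weight"]
device_b = ["step count", "total sleep", "weight"]

# Hypothetical term -> concept mapping standing in for an ontology lookup.
MAPPING = {
    "steps": "step_count",
    "step count": "step_count",
    "sleep time": "sleep_duration",
    "total sleep": "sleep_duration",
}

string_similarity = jaccard(to_concepts(device_a, {}), to_concepts(device_b, {}))
concept_similarity = jaccard(to_concepts(device_a, MAPPING), to_concepts(device_b, MAPPING))
print(string_similarity, concept_similarity)
```

With string matching only "weight" is shared, whereas after mapping all three terms coincide, which is the kind of increase the reported fold changes summarise.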
The Earth System CoG Collaboration Environment
NASA Astrophysics Data System (ADS)
DeLuca, C.; Murphy, S.; Cinquini, L.; Treshansky, A.; Wallis, J. C.; Rood, R. B.; Overeem, I.
2013-12-01
The Earth System CoG supports collaborative Earth science research and product development in virtual organizations that span multiple projects and communities. It provides access to data, metadata, and visualization services along with tools that support open project governance, and it can be used to host individual projects or to profile projects hosted elsewhere. All projects on CoG are described using a project ontology - an organized common vocabulary - that exposes information needed for collaboration and decision-making. Projects can be linked into a network, and the underlying ontology enables consolidated views of information across the network. This access to information promotes the creation of active and knowledgeable project governance, at both individual and aggregate project levels. CoG is being used to support software development projects, model intercomparison projects, training classes, and scientific programs. Its services and ontology are customizable by project. This presentation will provide an overview of CoG, review examples of current use, and discuss how CoG can be used as knowledge and coordination hub for networks of projects in the Earth Sciences.
Unsupervised user similarity mining in GSM sensor networks.
Shad, Shafqat Ali; Chen, Enhong
2013-01-01
Mobility data has attracted researchers for the past few years because of its rich context and spatiotemporal nature; this information can be used for potential applications like early warning systems, route prediction, traffic management, advertisement, social networking, and community finding. All the mentioned applications are based on mobility profile building and user trend analysis, where mobility profile building is done through significant places extraction, user's actual movement prediction, and context awareness. However, significant places extraction and user's actual movement prediction for mobility profile building are not trivial tasks. In this paper, we present a user similarity mining-based methodology through user mobility profile building by using the semantic tagging information provided by the user and basic GSM network architecture properties, based on an unsupervised clustering approach. As the mobility information is in low-level raw form, our proposed methodology successfully converts it to high-level meaningful information by using the cell-Id location information rather than previously used location capturing methods like GPS, Infrared, and Wi-Fi for profile mining and user similarity mining.
Development and Application of Ontologies in Support of Earth and Space Science Education
NASA Astrophysics Data System (ADS)
Fox, S. P.; Manduca, C. A.; Iverson, E.
2007-12-01
Through its work in supporting improved science education the Science Education Resource Center (SERC) has developed and applied a set of Earth and Space Science vocabularies. These controlled vocabularies play a central role in supporting user exploration of our educational materials. The set of over 50 vocabularies run the gamut from small vocabularies with a narrowly targeted use, to broader vocabularies that span multiple disciplines and are applied across multiple projects and collections. Typical specialized vocabularies cover disciplinary themes such as tectonic setting (with terms such as mid-ocean ridge, passive margin, and craton) as well as interdisciplinary work such as geology and human health (with terms such as radionuclides and airborne transport processes). To support project-specific customization of vocabularies while retaining the benefits of cross-project reuse our systems allow for dynamic mapping of terms among multiple vocabularies based on semantic equivalencies. The end result is a weaving of related vocabularies into an ontological network that is exposed as specific vocabularies that employ the natural language of the collections and communities that use them. Our process for vocabulary development is community driven and reflects our experiences in aligning terminology with disciplinary-specific expectations. These experiences include rectifying language differences across disciplines in building a Geoscience Quantitative Skills vocabulary through work with both the Mathematics and Geoscience communities, as well as the iterative development of a vocabulary spanning Earth and Space science through the aggregation of smaller vocabularies, each developed by scientists for use within their own discipline. The vocabularies are exposed as key navigational features in over 100 faceted search interfaces within the web sites of a dozen Earth and Space Science Education projects. 
Within these faceted search interfaces the terms in the vocabularies act as guideposts and browsing links for the users. Only terms relevant to the current collection, or search return, are exposed to the users giving them an immediate sense of the scope and focus of the collection. In using vocabularies to drive these sorts of discovery processes it is critical that vocabularies not only have clear semantics so they can be applied consistently, but also have appropriate evocative meaning for the users of the search interface. It is this immediate evocative meaning, rather than the precisely defined semantics that will end up driving user search behavior and in the end determine the efficacy of the vocabulary as an applied tool. We will outline our experiences in developing and applying these vocabularies within the context of geoscience education and explore how the broader themes that emerge can inform the development and use of ontologies throughout Earth and space science.
Water Quality Vocabulary Development and Deployment
NASA Astrophysics Data System (ADS)
Simons, B. A.; Yu, J.; Cox, S. J.
2013-12-01
Semantic descriptions of observed properties and associated units of measure are fundamental to understanding of environmental observations, including groundwater, surface water and marine water quality. Semantic descriptions can be captured in machine-readable ontologies and vocabularies, thus providing support for the annotation of observation values from the disparate data sources with appropriate and accurate metadata, which is critical for achieving semantic interoperability. However, current stand-alone water quality vocabularies provide limited support for cross-system comparisons or data fusion. To enhance semantic interoperability, the alignment of water-quality properties with definitions of chemical entities and units of measure in existing widely-used vocabularies is required. Modern ontologies and vocabularies are expressed, organized and deployed using Semantic Web technologies. We developed an ontology for observed properties (i.e. a model for expressing appropriate controlled vocabularies) which extends the NASA/TopQuadrant QUDT ontology for Unit and QuantityKind with two additional classes and two properties (see accompanying paper by Cox, Simons and Yu). We use our ontology to populate the Water Quality vocabulary with a set of individuals of each of the four key classes (and their subclasses), and add appropriate relationships between these individuals. This ontology is aligned with other relevant stand-alone Water Quality vocabularies and domain ontologies. Developing the Water Quality vocabulary involved two main steps. First, the Water Quality vocabulary was populated with individuals of the ObservedProperty class, which was determined from a census of existing datasets and services. Each ObservedProperty individual relates to other individuals of Unit and QuantityKind (taken from QUDT where possible), and to IdentifiedObject individuals. 
As a large fraction of observed water quality data are classified by the chemical substance involved, the IdentifiedObject individuals are linked to the ChEBI ontology for definitions of chemical substances. Second, to allow compatibility with SKOS-based tools and to ensure the vocabulary does not violate the meta-modelling constraints of the OWL-DL profile, the relevant classes in QUDT are declared to be subclasses of SKOS Concept and a shadow SKOS view of ChEBI was generated (as ChEBI models all elements and substances as OWL classes). The provenance of each SKOS concept shadowing an OWL class is recorded using the PROV-O ontology. Some aspects of these processing steps can be automated through SPARQL queries, while other aspects must be done manually. For maintenance and provenance purposes, the complete vocabulary and ontologies are persisted in around 20 separate RDF files (in addition to the QUDT and ChEBI sources), each of which constitutes a separate RDF graph and reflects the various aspects of the above steps. The vocabularies are published in multiple ways:
- For download as files from the ontology URI
- At a SPARQL endpoint
- Through a URI-based SKOS API (SISSvoc)
- Through search UIs built on top of the SPARQL endpoint or SISSvoc service
2014-06-01
from the ODM standard. Leveraging SPARX EA’s Java application programming interface (API), the team built a tool called OWL2EA that can ingest an OWL...server MySQL creates the physical schema that enables a user to store and retrieve data conforming to the vocabulary of the JC3IEDM. 6. GENERATING AN
2008-09-01
IWPC 21 Berners-Lee, Tim. (1999). Weaving the Web. New York: HarperCollins Publishers, Inc. 22... Berners-Lee, Tim. (1999). Weaving the Web. New York: HarperCollins Publishers, Inc. Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The Semantic...environment where software agents roaming from page to page can readily carry out sophisticated tasks for users. T. Berners-Lee, J. Hendler, and O
USI: a fast and accurate approach for conceptual document annotation.
Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent
2015-03-14
Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task is time-consuming and requires a high level of expertise in both the domain and the related ontology. Different strategies have thus been proposed to ease this indexing process, each one taking advantage of the features of the document. In this paper we present USI (User-oriented Semantic Indexer), a fast and intuitive method for indexing tasks. We introduce a solution to suggest a conceptual annotation for new entities based on related, already indexed documents. Our results, compared to those obtained by previous authors using the MeSH thesaurus and a dataset of biomedical papers, show that the method surpasses text-specific methods in terms of both quality and speed. Evaluations are done via usual metrics and semantic similarity. By relying only on neighbor documents, the User-oriented Semantic Indexer does not need a representative learning set. Yet, it provides better results than the other approaches by giving a consistent annotation scored with a global criterion, instead of one score per concept.
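The idea of annotating a new document from its already indexed neighbors can be sketched in Python as follows. This is a deliberately simplified stand-in, not the USI algorithm: it scores each concept separately by summed neighbor similarity, whereas the abstract states that USI scores a whole annotation with a global criterion. The function name, the input shape and the similarity values are assumptions.

```python
from collections import Counter


def suggest_annotation(neighbors, top_n=3):
    """Suggest concepts for a new document from its indexed neighbors.

    neighbors: list of (similarity, concepts) pairs, where similarity is a
    float in [0, 1] and concepts is a set of ontology concept labels.
    """
    scores = Counter()
    for similarity, concepts in neighbors:
        for concept in concepts:
            # Closer neighbors contribute more weight to their concepts.
            scores[concept] += similarity
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [concept for concept, _ in ranked[:top_n]]


# Illustrative neighbors annotated with MeSH headings.
neighbors = [
    (0.9, {"Neoplasms", "Humans"}),
    (0.7, {"Neoplasms", "Mice"}),
    (0.4, {"Humans", "Genomics"}),
]
print(suggest_annotation(neighbors, top_n=2))  # ['Neoplasms', 'Humans']
```

Because only neighbor annotations are consulted, no training set is needed, which matches the property the abstract highlights.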
Smarter Earth Science Data System
NASA Technical Reports Server (NTRS)
Huang, Thomas
2013-01-01
The explosive growth in Earth observational data in the recent decade demands a better method of interoperability across heterogeneous systems. The Earth science data system community has mastered the art of storing large volumes of observational data, but it is still unclear how this traditional method will scale over time as we enter the age of Big Data. Indexed search solutions such as Apache Solr (Smiley and Pugh, 2011) provide fast, scalable search via keywords or phrases without any reasoning or inference. Modern search solutions such as Google's Knowledge Graph (Singhal, 2012) and Microsoft Bing all utilize semantic reasoning to improve the accuracy of their searches. The Earth science user community is demanding an intelligent solution to help them find the right data for their research. The Ontological System for Context Artifacts and Resources (OSCAR) (Huang et al., 2012) was created in response to the DARPA Adaptive Vehicle Make (AVM) program's need for an intelligent context model management system to empower its terrain simulation subsystem. The core component of OSCAR is the Environmental Context Ontology (ECO), which is built using the Semantic Web for Earth and Environmental Terminology (SWEET) (Raskin and Pan, 2005). This paper presents the current data archival methodology within a NASA Earth science data center and discusses using the Semantic Web to improve the way we capture and serve data to our users.
Data Model Management for Space Information Systems
NASA Technical Reports Server (NTRS)
Hughes, J. Steven; Crichton, Daniel J.; Ramirez, Paul; Mattmann, Chris
2006-01-01
The Reference Architecture for Space Information Management (RASIM) suggests the separation of the data model from software components to promote the development of flexible information management systems. RASIM allows the data model to evolve independently from the software components and results in a robust implementation that remains viable as the domain changes. However, the development and management of data models within RASIM are difficult and time-consuming tasks involving the choice of a notation, the capture of the model, its validation for consistency, and the export of the model for implementation. Current limitations to this approach include the inability to capture comprehensive domain knowledge, the loss of significant modeling information during implementation, the lack of model visualization and documentation capabilities, and exports being limited to one or two schema types. The advent of the Semantic Web and its demand for sophisticated data models has addressed this situation by providing a new level of data model management in the form of ontology tools. In this paper we describe the use of a representative ontology tool to capture and manage a data model for a space information system. The resulting ontology is implementation independent. Novel on-line visualization and documentation capabilities are available automatically, and the ability to export to various schemas can be added through tool plug-ins. In addition, the ingestion of data instances into the ontology allows validation of the ontology and results in a domain knowledge base. Semantic browsers are easily configured for the knowledge base. For example, the export of the knowledge base to RDF/XML and RDFS/XML and the use of open source metadata browsers provide ready-made user interfaces that support both text- and facet-based search. This paper will present the Planetary Data System (PDS) data model as a use case and describe the import of the data model into an ontology tool. 
We will also describe the current effort to provide interoperability with the European Space Agency (ESA)/Planetary Science Archive (PSA) which is critically dependent on a common data model.
SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data.
Pang, Chao; Sollie, Annet; Sijtsma, Anna; Hendriksen, Dennis; Charbon, Bart; de Haan, Mark; de Boer, Tommy; Kelpin, Fleur; Jetten, Jonathan; van der Velde, Joeri K; Smidt, Nynke; Sijmons, Rolf; Hillege, Hans; Swertz, Morris A
2015-01-01
There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required, which involves matching of original (unstructured or locally coded) data to widely used coding or ontology systems such as SNOMED CT (clinical terms), ICD-10 (International Classification of Disease) and HPO (Human Phenotype Ontology). This data curation is usually a time-consuming process performed by a human expert. To help mechanize this process, we have developed SORTA, a computer-aided system for rapidly encoding free text or locally coded values to a formal coding system or ontology. SORTA matches original data values (uploaded in semicolon-delimited format) to a target coding system (uploaded in Excel spreadsheet, OWL (Web Ontology Language) or OBO (Open Biomedical Ontologies) format). It then semi-automatically shortlists candidate codes for each data value using Lucene and n-gram based matching algorithms, and can also learn from matches chosen by human experts. We evaluated SORTA's applicability in two use cases. For the LifeLines biobank, we used SORTA to recode 90 000 free text values (including 5211 unique values) about physical exercise to MET (Metabolic Equivalent of Task) codes. For the CINEAS clinical symptom coding system, we used SORTA to map to HPO, enriching HPO when necessary (315 terms matched so far). Out of the shortlists at rank 1, we found a precision/recall of 0.97/0.98 in LifeLines and of 0.58/0.45 in CINEAS. More importantly, users found the tool both a major time saver and a quality improvement because SORTA reduced the chances of human mistakes. Thus, SORTA can dramatically ease data (re)coding tasks and we believe it will prove useful for many more projects. 
Database URL: http://molgenis.org/sorta or as an open source download from http://www.molgenis.org/wiki/SORTA. © The Author(s) 2015. Published by Oxford University Press. PMID:26385205
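The n-gram part of the shortlisting step can be sketched in Python as below. This is an illustration under stated assumptions, not SORTA's implementation: it uses character bigrams with a Dice coefficient, leaves out the Lucene search and the learning from expert choices, and the code table is made up.

```python
def bigrams(text):
    """Return the set of character bigrams of a lowercased, space-padded string."""
    padded = f" {text.lower().strip()} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def dice(a, b):
    """Dice coefficient between the bigram sets of two strings."""
    grams_a, grams_b = bigrams(a), bigrams(b)
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def shortlist(value, codes, top_n=3):
    """Rank candidate codes by the similarity of their label to a free-text value."""
    ranked = sorted(codes.items(), key=lambda item: -dice(value, item[1]))
    return [(code, round(dice(value, label), 2)) for code, label in ranked[:top_n]]


# Hypothetical code table; the identifiers are placeholders, not real MET codes.
codes = {"C1": "running", "C2": "swimming", "C3": "cycling"}
print(shortlist("runing", codes, top_n=2))
```

A misspelled value such as "runing" still ranks "running" first, which is why n-gram matching suits free text entered by study participants. A human expert would then confirm or reject the top candidates.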
Visual analysis of large heterogeneous social networks by semantic and structural abstraction.
Shen, Zeqian; Ma, Kwan-Liu; Eliassi-Rad, Tina
2006-01-01
Social network analysis is an active area of study beyond sociology. It uncovers the invisible relationships between actors in a network and provides understanding of social processes and behaviors. It has become an important technique in a variety of application areas such as the Web, organizational studies, and homeland security. This paper presents a visual analytics tool, OntoVis, for understanding large, heterogeneous social networks, in which nodes and links could represent different concepts and relations, respectively. These concepts and relations are related through an ontology (also known as a schema). OntoVis is named such because it uses information in the ontology associated with a social network to semantically prune a large, heterogeneous network. In addition to semantic abstraction, OntoVis also allows users to do structural abstraction and importance filtering to make large networks manageable and to facilitate analytic reasoning. All these unique capabilities of OntoVis are illustrated with several case studies.
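Semantic abstraction, as described above, amounts to pruning a typed network by the ontology types the analyst selects. The Python sketch below shows that filtering step only; the data layout, the type names and the function name are assumptions, and OntoVis's structural abstraction and importance filtering are not covered.

```python
def semantic_abstraction(nodes, edges, keep_types):
    """Keep only nodes whose ontology type is selected, plus edges between them.

    nodes: dict mapping node id -> ontology type
    edges: list of (source, target) pairs
    keep_types: set of ontology types chosen by the analyst
    """
    kept = {node for node, node_type in nodes.items() if node_type in keep_types}
    kept_edges = [(s, t) for s, t in edges if s in kept and t in kept]
    return kept, kept_edges


# Hypothetical heterogeneous network with three node types.
nodes = {
    "alice": "person",
    "bob": "person",
    "acme": "organization",
    "paris": "location",
}
edges = [("alice", "bob"), ("alice", "acme"), ("bob", "paris"), ("acme", "paris")]

kept, kept_edges = semantic_abstraction(nodes, edges, {"person", "organization"})
print(sorted(kept))  # ['acme', 'alice', 'bob']
print(kept_edges)    # [('alice', 'bob'), ('alice', 'acme')]
```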
The OCareCloudS project: Toward organizing care through trusted cloud services.
De Backere, Femke; Ongenae, Femke; Vannieuwenborg, Frederic; Van Ooteghem, Jan; Duysburgh, Pieter; Jansen, Arne; Hoebeke, Jeroen; Wuyts, Kim; Rossey, Jen; Van den Abeele, Floris; Willems, Karen; Decancq, Jasmien; Annema, Jan Henk; Sulmon, Nicky; Van Landuyt, Dimitri; Verstichel, Stijn; Crombez, Pieter; Ackaert, Ann; De Grooff, Dirk; Jacobs, An; De Turck, Filip
2016-01-01
The increasing elderly population and the shift from acute to chronic illness make it difficult to care for people in hospitals and rest homes. Moreover, elderly people, if given a choice, want to stay at home as long as possible. In this article, the methodologies to develop a cloud-based semantic system, offering valuable information and knowledge-based services, are presented. The information and services are related to the different personal living hemispheres of the patient, namely the daily care-related needs, the social needs and the daily life assistance. Ontologies are used to facilitate the integration, analysis, aggregation and efficient use of all the available data in the cloud. By using an interdisciplinary research approach, where user researchers, (ontology) engineers, researchers and domain stakeholders are at the forefront, a platform can be developed that offers great added value for patients who want to grow old in their own home and for their caregivers.
SemanticOrganizer: A Customizable Semantic Repository for Distributed NASA Project Teams
NASA Technical Reports Server (NTRS)
Keller, Richard M.; Berrios, Daniel C.; Carvalho, Robert E.; Hall, David R.; Rich, Stephen J.; Sturken, Ian B.; Swanson, Keith J.; Wolfe, Shawn R.
2004-01-01
SemanticOrganizer is a collaborative knowledge management system designed to support distributed NASA projects, including diverse teams of scientists, engineers, and accident investigators. The system provides a customizable, semantically structured information repository that stores work products relevant to multiple projects of differing types. SemanticOrganizer is one of the earliest and largest semantic web applications deployed at NASA to date, and has been used in diverse contexts ranging from the investigation of Space Shuttle Columbia's accident to the search for life on other planets. Although the underlying repository employs a single unified ontology, access control and ontology customization mechanisms make the repository contents appear different for each project team. This paper describes SemanticOrganizer, its customization facilities, and a sampling of its applications. The paper also summarizes some key lessons learned from building and fielding a successful semantic web application across a wide-ranging set of domains with diverse users.
From Web Directories to Ontologies: Natural Language Processing Challenges
NASA Astrophysics Data System (ADS)
Zaihrayeu, Ilya; Sun, Lei; Giunchiglia, Fausto; Pan, Wei; Ju, Qi; Chi, Mingmin; Huang, Xuanjing
Hierarchical classifications are used pervasively by humans as a means to organize their data and knowledge about the world. One of their main advantages is that natural language labels, used to describe their contents, are easily understood by human users. However, at the same time, this is also one of their main disadvantages, as these same labels are ambiguous and very hard for software agents to reason about. This fact creates an insuperable hindrance to embedding classifications in the Semantic Web infrastructure. This paper presents an approach to converting classifications into lightweight ontologies, and it makes the following contributions: (i) it identifies the main NLP problems related to the conversion process and shows how they are different from the classical problems of NLP; (ii) it proposes heuristic solutions to these problems, which are especially effective in this domain; and (iii) it evaluates the proposed solutions by testing them on DMoz data.
Design and development of linked data from the National Map
Usery, E. Lynn; Varanka, Dalia E.
2012-01-01
The development of linked data on the World-Wide Web provides the opportunity for the U.S. Geological Survey (USGS) to supply its extensive volumes of geospatial data, information, and knowledge in a machine interpretable form and reach users and applications that heretofore have been unavailable. To pilot a process to take advantage of this opportunity, the USGS is developing an ontology for The National Map and converting selected data from nine research test areas to a Semantic Web format to support machine processing and linked data access. In a case study, the USGS has developed initial methods for legacy vector and raster formatted geometry, attributes, and spatial relationships to be accessed in a linked data environment maintaining the capability to generate graphic or image output from semantic queries. The description of an initial USGS approach to developing ontology, linked data, and initial query capability from The National Map databases is presented.
Towards a Ubiquitous User Model for Profile Sharing and Reuse
de Lourdes Martinez-Villaseñor, Maria; Gonzalez-Mendoza, Miguel; Hernandez-Gress, Neil
2012-01-01
People interact with systems and applications through several devices and are willing to share information about preferences, interests and characteristics. Social networking profiles, data from advanced sensors attached to personal gadgets, and semantic web technologies such as FOAF and microformats are valuable sources of personal information that could provide a fair understanding of the user, but profile information is scattered over different user models. Some researchers in the ubiquitous user modeling community envision the need to share user model information from heterogeneous sources. In this paper, we address the syntactic and semantic heterogeneity of user models in order to enable user modeling interoperability. We present a dynamic user profile structure based on the Simple Knowledge Organization System (SKOS) to provide knowledge representation for a ubiquitous user model. We propose a two-tier matching strategy for concept schema alignment to enable user modeling interoperability. Our proposal is demonstrated in the application scenario of sharing and reusing data in order to deal with overweight and obesity. PMID:23201995
CP-ABE Based Privacy-Preserving User Profile Matching in Mobile Social Networks
Cui, Weirong; Du, Chenglie; Chen, Jinchao
2016-01-01
Privacy-preserving profile matching, a challenging task in mobile social networks, has received growing attention in recent years. In this paper, we propose a novel scheme that is based on ciphertext-policy attribute-based encryption to tackle this problem. In our scheme, a user can submit a preference-profile and search for users with a matching profile in decentralized mobile social networks. In this process, neither any participant's profile nor the submitted preference-profile is exposed. Meanwhile, a secure communication channel can be established between the pair of successfully matched users. In contrast to existing related schemes, which are mainly based on secure multi-party computation, our scheme provides verifiability (neither the initiator nor any unmatched user can cheat the other by pretending to be matched) and requires few interactions among users. We provide a thorough security analysis and performance evaluation of our scheme, and show its advantages in terms of security, efficiency and usability over state-of-the-art schemes. PMID:27337001
Tao, Shiqiang; Cui, Licong; Wu, Xi; Zhang, Guo-Qiang
2017-01-01
To help researchers better access clinical data, we developed a prototype query engine called DataSphere for exploring large-scale integrated clinical data repositories. DataSphere expedites data importing using a NoSQL data management system and dynamically renders its user interface for concept-based querying tasks. DataSphere provides an interactive query-building interface together with query translation and optimization strategies, which enable users to build and execute queries effectively and efficiently. We successfully loaded a dataset of one million patients for University of Kentucky (UK) Healthcare into DataSphere with more than 300 million clinical data records. We evaluated DataSphere by comparing it with an instance of i2b2 deployed at UK Healthcare, demonstrating that DataSphere provides enhanced user experience for both query building and execution. PMID:29854239
COMICS: Cartoon Visualization of Omics Data in Spatial Context Using Anatomical Ontologies
Travin, Dmitrii; Popov, Iaroslav; Guler, Arzu Tugce; Medvedev, Dmitry; van der Plas-Duivesteijn, Suzanne; Varela, Monica; Kolder, Iris C R M; Meijer, Annemarie H; Spaink, Herman P; Palmblad, Magnus
2018-01-05
COMICS is an interactive and open-access web platform for integration and visualization of molecular expression data in anatomograms of zebrafish, carp, and mouse model systems. Anatomical ontologies are used to map omics data across experiments and between an experiment and a particular visualization in a data-dependent manner. COMICS is built on top of several existing resources. Zebrafish and mouse anatomical ontologies with their controlled vocabulary (CV) and defined hierarchy are used with the ontoCAT R package to aggregate data for comparison and visualization. Libraries from the QGIS geographical information system are used with the R packages “maps” and “maptools” to visualize and interact with molecular expression data in anatomical drawings of the model systems. COMICS allows users to upload their own data from omics experiments, using any gene or protein nomenclature they wish, as long as CV terms are used to define anatomical regions or developmental stages. Common nomenclatures such as the ZFIN gene names and UniProt accessions are provided additional support. COMICS can be used to generate publication-quality visualizations of gene and protein expression across experiments. Unlike previous tools that have used anatomical ontologies to interpret imaging data in several animal models, including zebrafish, COMICS is designed to take spatially resolved data generated by dissection or fractionation and display this data in visually clear anatomical representations rather than large data tables. COMICS is optimized for ease-of-use, with a minimalistic web interface and automatic selection of the appropriate visual representation depending on the input data. PMID:29083911
NASA Astrophysics Data System (ADS)
Lin, Y.; Chen, X.
2016-12-01
Land cover classification systems used in remote sensing image data have been developed to meet the needs for depicting land covers in scientific investigations and policy decisions. However, accuracy assessments of a spate of data sets demonstrate that, compared with the real physiognomy, each thematic map of a specific land cover classification system contains some unavoidable flaws and unintended deviations. This work proposes a web-based land cover classification system, an integrated prototype, based on an ontology model of various classification systems, each of which is assigned the same weight in the final determination of land cover type. Ontology, a formal explication of specific concepts and relations, is employed in this prototype to build up the connections among different systems and to resolve naming conflicts. The process is initialized by measuring the semantic similarity between the terminologies in the systems and the search key to produce a set of satisfying classifications. It then searches the predefined relations among the concepts of all classification systems to generate classification maps with the user-specified land cover type highlighted, based on a probability calculated from the votes of data sets that adopt different classification systems. The present system is verified and validated by comparing the classification results with those of the most common systems. Owing to the full consideration and meaningful expression of each classification system using ontology, and the convenience that the web brings, this system, as a preliminary model, proposes a flexible and extensible architecture for classification system integration and data fusion, thereby providing a strong foundation for future work.
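The equal-weight voting described above can be sketched in Python. The sketch assumes the ontology alignment has already been reduced to a lookup table from each system's class name to a shared concept; the system names, class names and table contents are hypothetical, and the semantic similarity search that the abstract describes is not reproduced.

```python
from collections import Counter

# Hypothetical alignment: (classification system, class name) -> shared concept.
TERM_TO_CONCEPT = {
    ("systemA", "Evergreen Forest"): "forest",
    ("systemA", "Cropland"): "cropland",
    ("systemB", "Tree Cover"): "forest",
    ("systemB", "Cultivated"): "cropland",
    ("systemC", "Woodland"): "forest",
}


def vote(labels):
    """Turn per-system labels for one pixel into concept probabilities.

    labels: dict mapping classification system -> class name assigned to the pixel.
    Every system casts one vote, so all systems carry the same weight.
    """
    concepts = [TERM_TO_CONCEPT[(system, name)] for system, name in labels.items()]
    counts = Counter(concepts)
    total = len(concepts)
    return {concept: count / total for concept, count in counts.items()}


pixel = {"systemA": "Evergreen Forest", "systemB": "Cultivated", "systemC": "Woodland"}
print(vote(pixel))  # forest gets 2 of 3 votes, cropland 1 of 3
```

Pixels where the probability of the user-specified land cover type is high would then be highlighted on the output map.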
CDAO-Store: Ontology-driven Data Integration for Phylogenetic Analysis
Chisham, Brandon; Wright, Ben; Le, Trung; Son, Tran Cao; Pontelli, Enrico
2011-04-15
Background The Comparative Data Analysis Ontology (CDAO) is an ontology developed, as part of the EvoInfo and EvoIO groups supported by the National Evolutionary Synthesis Center, to provide semantic descriptions of data and transformations commonly found in the domain of phylogenetic analysis. The core concepts of the ontology enable the description of phylogenetic trees and associated character data matrices. Results Using CDAO as the semantic back-end, we developed a triple-store, named CDAO-Store. CDAO-Store is a RDF-based store of phylogenetic data, including a complete import of TreeBASE. CDAO-Store provides a programmatic interface, in the form of web services, and a web-based front-end, to perform both user-defined as well as domain-specific queries; domain-specific queries include search for nearest common ancestors, minimum spanning clades, filter multiple trees in the store by size, author, taxa, tree identifier, algorithm or method. In addition, CDAO-Store provides a visualization front-end, called CDAO-Explorer, which can be used to view both character data matrices and trees extracted from the CDAO-Store. CDAO-Store provides import capabilities, enabling the addition of new data to the triple-store; files in PHYLIP, MEGA, nexml, and NEXUS formats can be imported and their CDAO representations added to the triple-store. Conclusions CDAO-Store is made up of a versatile and integrated set of tools to support phylogenetic analysis. To the best of our knowledge, CDAO-Store is the first semantically-aware repository of phylogenetic data with domain-specific querying capabilities. The portal to CDAO-Store is available at http://www.cs.nmsu.edu/~cdaostore. PMID:21496247
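One of the domain-specific queries mentioned above, the nearest common ancestor of two nodes, can be sketched in Python on a plain child-to-parent map. This is not how CDAO-Store evaluates the query (it works over an RDF triple-store); the tree, the node names and the function names are assumptions made for illustration.

```python
def ancestors(parent, node):
    """Return the path from node up to the root, including node itself."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def nearest_common_ancestor(parent, a, b):
    """Return the deepest node on both root paths, or None if there is none.

    A node counts as its own ancestor here.
    """
    seen = set(ancestors(parent, a))
    for node in ancestors(parent, b):
        if node in seen:
            return node
    return None


# Hypothetical rooted tree given as child -> parent.
parent = {
    "human": "primates",
    "chimp": "primates",
    "primates": "mammals",
    "mouse": "mammals",
}
print(nearest_common_ancestor(parent, "human", "chimp"))  # primates
print(nearest_common_ancestor(parent, "human", "mouse"))  # mammals
```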
Chiba, Hirokazu; Nishide, Hiroyo; Uchiyama, Ikuo
2015-01-01
Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We have constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we have constructed the Ortholog Ontology (OrthO). While the OrthO is a compact ontology for general use, it is designed to be extended to the description of database-specific concepts. On the basis of OrthO, we described the ortholog information from our Microbial Genome Database for Comparative Analysis (MBGD) in the form of Resource Description Framework (RDF) and made it available through the SPARQL endpoint, which accepts arbitrary queries specified by users. In this framework based on the OrthO, the biological data of different organisms can be integrated using the ortholog information as a hub. Besides, the ortholog information from different data sources can be compared with each other using the OrthO as a shared ontology. Here we show some examples demonstrating that the ortholog information described in RDF can be used to link various biological data such as taxonomy information and Gene Ontology. Thus, the ortholog database using the Semantic Web technology can contribute to biological knowledge discovery through integrative data analysis.
Unsupervised User Similarity Mining in GSM Sensor Networks
Shad, Shafqat Ali; Chen, Enhong
2013-01-01
Mobility data has attracted researchers for the past few years because of its rich context and spatiotemporal nature; this information can be used for potential applications like early warning systems, route prediction, traffic management, advertisement, social networking, and community finding. All the mentioned applications are based on mobility profile building and user trend analysis, where mobility profile building is done through significant places extraction, user's actual movement prediction, and context awareness. However, significant places extraction and user's actual movement prediction for mobility profile building are not trivial tasks. In this paper, we present a methodology for user similarity mining through user mobility profile building, using the semantic tagging information provided by the user and basic GSM network architecture properties, based on an unsupervised clustering approach. As the mobility information is in low-level raw form, our proposed methodology successfully converts it to high-level meaningful information by using the cell-Id location information rather than previously used location capturing methods like GPS, Infrared, and Wifi for profile mining and user similarity mining. PMID:23576905
Exploitation of Semantic Building Model in Indoor Navigation Systems
NASA Astrophysics Data System (ADS)
Anjomshoaa, A.; Shayeganfar, F.; Tjoa, A. Min
2009-04-01
There are many types of indoor and outdoor navigation tools and methodologies available. A majority of these solutions are based on Global Positioning Systems (GPS) and instant video and image processing. These approaches are ideal for open world environments where very little information about the target location is available, but for large scale building environments such as hospitals, governmental offices, etc., the end-user will need more detailed information about the surrounding context, which is especially important in the case of people with special needs. This paper presents a smart indoor navigation solution that is based on Semantic Web technologies and the Building Information Model (BIM). The proposed solution is also aligned with Google Android's concepts to illustrate the realization of results.
Keywords: IAI IFCXML, Building Information Model, Indoor Navigation, Semantic Web, Google Android, People with Special Needs
1 Introduction
The built environment is a central factor in our daily life and a big portion of human life is spent inside buildings. Traditionally, buildings are documented using building maps and plans produced with IT tools such as computer-aided design (CAD) applications. Documenting the maps electronically is already pervasive, but CAD drawings do not meet the requirements for effective building models that can be shared with other building-related applications such as indoor navigation systems. Navigation in the built environment is not a new issue; however, with the advances in emerging technologies like GPS, mobile and networked environments, and the Semantic Web, new solutions have been suggested to enrich the traditional building maps and convert them to smart information resources that can be reused in other applications and improve the interpretability with building inhabitants and building visitors. Other important issues that should be addressed in building navigation scenarios are location tagging and end-user communication.
The available solutions for location tagging are mostly based on proximity sensors and the information is bound to sensor references. In the solution proposed in this paper, the sensors simply play a role similar to annotations in the Semantic Web world. Hence the sensor data, in the ontology sense, bridges the gap between sensed information and the building model. By combining these two and applying the proper inference rules, building visitors will be able to reach their destinations with instant support from their communication devices such as handhelds, wearable computers, mobiles, etc. In a typical scenario of this kind, the user's profile will be delivered to the smart building (via building ad-hoc services) and the appropriate route for the user will be calculated and delivered to the user's end-device. The route is calculated by considering all constraints and requirements of the end user. So, for example, if the user is using a wheelchair, the calculated route should not contain stairs or narrow corridors that the wheelchair cannot pass through. The user then starts to navigate through the building by following the instructions of the end-device, which are in turn generated from the calculated route. During the navigation process, the end-device should also interact with the smart building to sense the locations by reading the surrounding tags. So, for example, when a visually impaired person arrives at an unknown space, the tags will be sensed and the relevant information will be delivered to the user in the proper way of communication. For example, the building model can be used to generate a voice message for a blind person about a space and tell him/her that "the space has 3 doors, and the door on the left should be chosen which needs to be pushed to open". In this paper we will mainly focus on automatic generation of semantic building information models (Semantic BIM) and delivery of results to the end user.
Combining the building information model with the environment and user constraints using Semantic Web technologies will make many scenarios conceivable. The generated IFC ontology, which is based on the commonly accepted IFC (Industry Foundation Classes) standard, can be used as the basis of information sharing between buildings, people, and applications. The proposed solution aims to facilitate building navigation in an intuitive and extendable way that is easy to use by end-users and at the same time easy to maintain and manage by building administrators.
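The wheelchair scenario described in this abstract can be sketched as a route search that drops every connection the user cannot use. The sketch below is a plain breadth-first search over a hand-written graph, assuming each connection carries a set of tags such as "stairs" or "narrow"; the function name, the tag names, and the toy building are hypothetical, and the paper's own approach derives this information from an IFC-based ontology with inference rules rather than from a hard-coded list.

```python
from collections import deque


def accessible_route(edges, start, goal, forbidden):
    """Shortest route (in hops) that avoids edges carrying a forbidden tag."""
    graph = {}
    for u, v, tags in edges:
        if tags & forbidden:
            continue  # e.g. skip stairs for a wheelchair user
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, []).append(u)

    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Rebuild the path by following the back-pointers.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None  # no admissible route exists


# Hypothetical toy building.
toy_edges = [
    ("lobby", "floor1", {"stairs"}),
    ("lobby", "lift", set()),
    ("lift", "floor1", set()),
    ("floor1", "ward", {"narrow"}),
    ("floor1", "corridor", set()),
    ("corridor", "ward", set()),
]
print(accessible_route(toy_edges, "lobby", "ward", {"stairs", "narrow"}))
```

Passing an empty `forbidden` set returns the unconstrained shortest route, which makes it easy to compare the two cases for the same building.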
Towards A Self Adaptive System for Social Wellness.
Khattak, Asad Masood; Khan, Wajahat Ali; Pervez, Zeeshan; Iqbal, Farkhund; Lee, Sungyoung
2016-04-13
Advancements in science and technology have highlighted the importance of robust healthcare services, lifestyle services and personalized recommendations. For this purpose, patient daily life activity recognition, profile information, and patient personal experience are required. In this research work we focus on improving the general health and life status of the elderly through the use of innovative services that align dietary intake with daily life and health activity information. Dynamic provisioning of personalized healthcare and life-care services is based on the patient's daily life activities recognized using a smart phone. To achieve this, an ontology-based approach is proposed, where all the daily life activities and patient profile information are modeled in an ontology. Then the semantic context is exploited with an inference mechanism that enables fine-grained situation analysis for personalized service recommendations. A generic system architecture is proposed that facilitates context information storage and exchange, profile information, and the newly recognized activities. The system exploits the patient's situation using semantic inference and provides recommendations for appropriate nutrition and activity related services. The proposed system is extensively evaluated for the claims and for its dynamic nature. The experimental results are very encouraging and have shown better accuracy than the existing system. The proposed system has also performed better in terms of the system support for a dynamic knowledge-base and the personalized recommendations.
OntoTrader: An Ontological Web Trading Agent Approach for Environmental Information Retrieval
Iribarne, Luis; Padilla, Nicolás; Ayala, Rosa; Asensio, José A.; Criado, Javier
2014-01-01
Modern Web-based Information Systems (WIS) are becoming increasingly necessary to provide support for users who are in different places with different types of information, by facilitating their access to the information, decision making, workgroups, and so forth. Design of these systems requires the use of standardized methods and techniques that enable a common vocabulary to be defined to represent the underlying knowledge. Thus, mediation elements such as traders enrich the interoperability of web components in open distributed systems. These traders must operate with other third-party traders and/or agents in the system, which must also use a common vocabulary for communication between them. This paper presents the OntoTrader architecture, an Ontological Web Trading agent based on the OMG ODP trading standard. It also presents the ontology needed by some system agents to communicate with the trading agent and the behavioral framework for the SOLERES OntoTrader agent, an Environmental Management Information System (EMIS). This framework implements a “Query-Searching/Recovering-Response” information retrieval model using a trading service, SPARQL notation, and the JADE platform. The paper also presents reflection, delegation, and federation mediation models and describes formalization, an experimental testing environment in three scenarios, and a tool which allows our proposal to be evaluated and validated. PMID:24977211
Hayman, G Thomas; Laulederkind, Stanley J F; Smith, Jennifer R; Wang, Shur-Jen; Petri, Victoria; Nigam, Rajni; Tutaj, Marek; De Pons, Jeff; Dwinell, Melinda R; Shimoyama, Mary
2016-01-01
The Rat Genome Database (RGD; http://rgd.mcw.edu/) provides critical datasets and software tools to a diverse community of rat and non-rat researchers worldwide. To meet the needs of the many users whose research is disease oriented, RGD has created a series of Disease Portals and has prioritized its curation efforts on the datasets important to understanding the mechanisms of various diseases. Gene-disease relationships for three species, rat, human and mouse, are annotated to capture biomarkers, genetic associations, molecular mechanisms and therapeutic targets. To generate gene-disease annotations more effectively and in greater detail, RGD initially adopted the MEDIC disease vocabulary from the Comparative Toxicogenomics Database and adapted it for use by expanding this framework with the addition of over 1000 terms to create the RGD Disease Ontology (RDO). The RDO provides the foundation for, at present, 10 comprehensive disease area-related dataset and analysis platforms at RGD, the Disease Portals. Two major disease areas are the focus of data acquisition and curation efforts each year, leading to the release of the related Disease Portals. Collaborative efforts to realize a more robust disease ontology are underway. Database URL: http://rgd.mcw.edu. © The Author(s) 2016. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Shi, Wei; Wang, Hongwei; He, Shaoyi
2013-12-01
Sentiment analysis of microblogging texts can facilitate both organisations' public opinion monitoring and governments' development of response strategies. Nevertheless, most existing analysis methods have been applied to Twitter, with little sentiment analysis of Chinese microblogging (Weibo), and they generally rely on a large amount of manually annotated training data or on machine learning to perform sentiment classification, which makes them difficult to apply. This paper addresses these problems and employs a sentiment ontology model to examine sentiment analysis of Chinese microblogging. We conduct a sentiment analysis of all public microblogging posts about the '7.23 Wenzhou Train Collision' broadcast by Sina microblogging users between 23 July and 1 August 2011. For every day in this time period, we first extract eight dimensions of sentiment (expect, joy, love, surprise, anxiety, sorrow, angry, and hate), and then build a fuzzy sentiment ontology based on HowNet and semantic similarity for sentiment analysis; we also establish methods for computing the influence and sentiment of microblogging texts; and we finally explore the change of public sentiment after the '7.23 Wenzhou Train Collision'. The results show that the established sentiment analysis method performs well in application, and that the change of different emotional values can reflect the success or failure of the government's guidance of public opinion.
Toward a Blended Ontology: Applying Knowledge Systems to ...
Bionanomedicine and environmental research share a need for common terms and ontologies. This study applied knowledge systems, data mining, and bibliometrics used in nano-scale ADME research from 1991 to 2011. The prominence of nano-ADME in environmental research began to exceed the publication rate in medical research in 2006. That trend appears to continue as a result of the growing number of products in commerce using nanotechnology, that is, a 5-fold growth in the number of countries with nanomaterials research centers. Funding for this research virtually did not exist prior to 2002, whereas today both medical and environmental research is funded globally. Key nanoparticle research began with pharmacology and therapeutic drug-delivery and contrasting agents, but the advances have found utility in the environmental research community. As evidence, ultrafine aerosols and aquatic colloids research increased 6-fold, indicating a new emphasis on environmental nanotoxicology. User-directed expert elicitation from the engineering and chemical/ADME domains can be combined with appropriate Boolean logic and queries to define the corpus of nanoparticle interest. The study combined pharmacological expertise and informatics to identify the corpus by building logical conclusions and observations. Publication records informatics can lead to an enhanced understanding of the connectivity between fields, as well as overcoming the differences in ontology between the fields. The National Exposure Resea
Reliability Prediction of Ontology-Based Service Compositions Using Petri Net and Time Series Models
Li, Jia; Xia, Yunni; Luo, Xin
2014-01-01
OWL-S, one of the most important Semantic Web service ontologies proposed to date, provides a core ontological framework and guidelines for describing the properties and capabilities of web services in an unambiguous, computer interpretable form. Predicting the reliability of composite service processes specified in OWL-S allows service users to decide whether the process meets the quantitative quality requirement. In this study, we consider the runtime quality of services to be fluctuating and introduce a dynamic framework to predict the runtime reliability of services specified in OWL-S, employing the Non-Markovian stochastic Petri net (NMSPN) and the time series model. The framework includes the following steps: obtaining the historical response time series of individual service components; fitting these series with an autoregressive moving-average (ARMA) model and predicting the future firing rates of service components; mapping the OWL-S process into an NMSPN model; employing the predicted firing rates as the model input of the NMSPN and calculating the normal completion probability as the reliability estimate. In the case study, a comparison between the static model and our approach based on experimental data is presented and it is shown that our approach achieves higher prediction accuracy. PMID:24688429
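The time-series step of this framework can be illustrated with a deliberately reduced sketch. It fits an AR(1) model, which is the simplest special case ARMA(1, 0), by ordinary least squares and forecasts the next response times. This is a stand-in for the full ARMA fit described above, not the authors' procedure. Converting a forecast response time into a firing rate as its reciprocal is also an assumption made here (it holds for an exponentially distributed delay); the function names are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series, steps):
    """Iterate the fitted recurrence `steps` times past the last observation."""
    c, phi = fit_ar1(series)
    predictions, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


def firing_rate(response_time):
    """Assumed conversion: rate of a transition whose mean delay is response_time."""
    return 1.0 / response_time


# Made-up response times (seconds) that follow x[t] = 1 + 0.5 * x[t-1] exactly.
history = [10.0, 6.0, 4.0, 3.0, 2.5]
next_time = forecast(history, 1)[0]
print(next_time, firing_rate(next_time))
```

A real series would not fit the recurrence exactly, and a library ARMA implementation with proper order selection would replace `fit_ar1`; the predicted rates would then parameterise the transitions of the Petri net model.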
Semantic Location Extraction from Crowdsourced Data
NASA Astrophysics Data System (ADS)
Koswatte, S.; Mcdougall, K.; Liu, X.
2016-06-01
Crowdsourced Data (CSD) has recently received increased attention in many application areas including disaster management. Convenience of production and use, data currency and abundance are some of the key reasons for this high interest. Conversely, quality issues like incompleteness, credibility and relevancy prevent the direct use of such data in important applications like disaster management. Moreover, the availability of location information in CSD is problematic, as it remains very low in many crowdsourced platforms such as Twitter. Also, this recorded location is mostly related to the mobile device or user location and often does not represent the event location. In CSD, the event location is discussed descriptively in the comments in addition to the recorded location (which is generated by means of the mobile device's GPS or the mobile communication network). This study attempts to semantically extract the CSD location information with the help of an ontological gazetteer and other available resources. 2011 Queensland flood tweets and Ushahidi Crowd Map data were semantically analysed to extract the location information with the support of the Queensland Gazetteer, which was converted to an ontological gazetteer, and a global gazetteer. Some preliminary results show that the use of ontologies and semantics can improve the accuracy of place name identification in CSD and the process of location information extraction.
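The basic idea of finding event locations in the text of a message, rather than in its recorded coordinates, can be sketched with a plain gazetteer lookup. The sketch below uses simple whole-word string matching with longest-name-first priority; it is a much weaker stand-in for the ontological gazetteer the study describes, and the function name, the short name list, and the example message are all hypothetical.

```python
import re


def extract_places(text, gazetteer_names):
    """Return gazetteer names mentioned in text, preferring longer names."""
    remaining = text.lower()
    found = []
    for name in sorted(gazetteer_names, key=len, reverse=True):
        pattern = r"\b" + re.escape(name.lower()) + r"\b"
        if re.search(pattern, remaining):
            found.append(name)
            # Blank out the match so a shorter name nested inside it
            # (e.g. "Brisbane" in "Brisbane River") is not counted again.
            remaining = re.sub(pattern, lambda m: " " * len(m.group()), remaining)
    return found


# Hypothetical gazetteer excerpt and message.
names = ["Brisbane", "Brisbane River", "Ipswich", "Toowoomba"]
message = "Water rising near Brisbane River, roads cut in Ipswich"
print(extract_places(message, names))
```

An ontological gazetteer would add what this lookup lacks, such as alternative names, feature types, and containment relations between places.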
Determining the semantic similarities among Gene Ontology terms.
Taha, Kamal
2013-05-01
We present in this paper novel techniques that determine the semantic relationships among Gene Ontology (GO) terms. We implemented these techniques in a prototype system called GoSE, which resides between the user application and the GO database. Given a set S of GO terms, GoSE returns another set S' of GO terms, where each term in S' is semantically related to each term in S. Most current research is focused on determining the semantic similarities among GO terms based solely on their IDs and proximity to one another in the GO graph structure, while overlooking the contexts of the terms, which may lead to erroneous results. The context of a GO term T is the set of other terms whose existence in the GO graph structure is dependent on T. We propose novel techniques that determine the contexts of terms based on the concept of existence dependency. We present a stack-based sort-merge algorithm employing these techniques for determining the semantic similarities among GO terms. We evaluated GoSE experimentally and compared it with three existing methods. The results of measuring the semantic similarities among genes in KEGG and Pfam pathways retrieved from the DBGET and Sanger Pfam databases, respectively, have shown that our method outperforms the other three methods in recall and precision.
Shin, Sung Hee; Yun, Eun Kyoung
2011-06-01
This study was conducted to explore the profiles of online health information users in terms of certain psychological characteristics and to suggest guidelines for the provision of better user-oriented health information service. A cross-sectional study design was used with convenience sampling by Web-based questionnaire survey in Korea. To analyze health information user profiles on the Internet, a two-step cluster analysis was conducted. The results reveal that online health information users can be classified into four groups according to their level of subjective knowledge and health concern. The findings also suggest that four clusters that exhibit distinct profile patterns exist. The findings of this study would be useful for health portal developers who would like to understand users' characteristics and behaviors and to provide more user-oriented service in a satisfactory manner. It is suggested that to develop a full understanding of users' behaviors regarding Internet health information service, further research would be needed to explore users' various needs, their preferences, and relevant factors among users across a variety of health problem-addressing Web sites at different professional levels.
Hook, Sharon E; Skillman, Ann D; Gopalan, Banu; Small, Jack A; Schultz, Irvin R
2008-03-01
Among proposed uses for microarrays in environmental toxicology is the identification of key contributors to toxicity within a mixture. However, it remains uncertain whether the transcriptomic profiles resulting from exposure to a mixture have patterns of altered gene expression that contain identifiable contributions from each toxicant component. We exposed isogenic rainbow trout, Oncorhynchus mykiss, to sublethal levels of ethynylestradiol, 2,2',4,4'-tetrabromodiphenyl ether, and chromium VI or to a mixture of all three toxicants. Fluorescently labeled complementary DNAs (cDNAs) were generated and hybridized against a commercially available Salmonid array spotted with 16,000 cDNAs. Data were analyzed using analysis of variance (p<0.05) with a Benjamini-Hochberg multiple test correction (Genespring [Agilent] software package) to identify up- and downregulated genes. Gene clustering patterns that can be used as "expression signatures" were determined using hierarchical cluster analysis. The gene ontology terms associated with significantly altered genes were also used to identify functional groups that were associated with toxicant exposure. A cross-ontological analytics approach was used to assign functional annotations to genes with "unknown" function. Our analysis indicates that transcriptomic profiles resulting from the mixture exposure resemble those of the individual contaminant exposures, but are not a simple additive list. However, patterns of altered genes representative of each component of the mixture are clearly discernible, and the functional classes of genes altered represent the individual components of the mixture. These findings indicate that the use of microarrays to identify transcriptomic profiles may aid in the identification of key stressors within a chemical mixture, ultimately improving environmental assessment.
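The multiple test correction named in this abstract can be sketched in a few lines. The following is a minimal version of the standard Benjamini-Hochberg step-up procedure at a false discovery rate of 0.05; it is a textbook illustration, not a reproduction of the Genespring implementation, and the function name and the p-values are made up.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Made-up p-values for four genes.
print(benjamini_hochberg([0.02, 0.03, 0.035, 0.9]))
```

Note the step-up behaviour in the example: the two smallest p-values fail their own thresholds, yet they are still rejected because the third-ranked value passes its threshold.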
Systematically linking tranSMART, Galaxy and EGA for reusing human translational research data
Zhang, Chao; Bijlard, Jochem; Staiger, Christine; Scollen, Serena; van Enckevort, David; Hoogstrate, Youri; Senf, Alexander; Hiltemann, Saskia; Repo, Susanna; Pipping, Wibo; Bierkens, Mariska; Payralbe, Stefan; Stringer, Bas; Heringa, Jaap; Stubbs, Andrew; Bonino Da Silva Santos, Luiz Olavo; Belien, Jeroen; Weistra, Ward; Azevedo, Rita; van Bochove, Kees; Meijer, Gerrit; Boiten, Jan-Willem; Rambla, Jordi; Fijneman, Remond; Spalding, J. Dylan; Abeln, Sanne
2017-01-01
The availability of high-throughput molecular profiling techniques has provided more accurate and informative data for regular clinical studies. Nevertheless, complex computational workflows are required to interpret these data. Over the past years, the data volume has been growing explosively, requiring robust human data management to organise and integrate the data efficiently. For this reason, we set up an ELIXIR implementation study, together with the Translational research IT (TraIT) programme, to design a data ecosystem that is able to link raw and interpreted data. In this project, the data from the TraIT Cell Line Use Case (TraIT-CLUC) are used as a test case for this system. Within this ecosystem, we use the European Genome-phenome Archive (EGA) to store raw molecular profiling data; tranSMART to collect interpreted molecular profiling data and clinical data for corresponding samples; and Galaxy to store, run and manage the computational workflows. We can integrate these data by linking their repositories systematically. To showcase our design, we have structured the TraIT-CLUC data, which contain a variety of molecular profiling data types, for storage in both tranSMART and EGA. The metadata provided allows referencing between tranSMART and EGA, fulfilling the cycle of data submission and discovery; we have also designed a data flow from EGA to Galaxy, enabling reanalysis of the raw data in Galaxy. In this way, users can select patient cohorts in tranSMART, trace them back to the raw data and perform (re)analysis in Galaxy. Our conclusion is that the majority of metadata does not necessarily need to be stored (redundantly) in both databases, but that instead FAIR persistent identifiers should be available for well-defined data ontology levels: study, data access committee, physical sample, data sample and raw data file. This approach will pave the way for the stable linkage and reuse of data. PMID:29123641
Sarntivijai, Sirarat; Xiang, Zuoshuang; Shedden, Kerby A.; Markel, Howard; Omenn, Gilbert S.; Athey, Brian D.; He, Yongqun
2012-01-01
Vaccine adverse events (VAEs) are adverse bodily changes occurring after vaccination. Understanding the adverse event (AE) profiles is a crucial step to identify serious AEs. Two different types of seasonal influenza vaccines have been used on the market: trivalent (killed) inactivated influenza vaccine (TIV) and trivalent live attenuated influenza vaccine (LAIV). Different adverse event profiles induced by these two groups of seasonal influenza vaccines were studied based on the data drawn from the CDC Vaccine Adverse Event Report System (VAERS). Extracted from VAERS were 37,621 AE reports for four TIVs (Afluria, Fluarix, Fluvirin, and Fluzone) and 3,707 AE reports for the only LAIV (FluMist). The AE report data were analyzed by a novel combinatorial, ontology-based detection of AE method (CODAE). CODAE detects AEs using Proportional Reporting Ratio (PRR), Chi-square significance test, and base level filtration, and groups identified AEs by ontology-based hierarchical classification. In total, 48 TIV-enriched and 68 LAIV-enriched AEs were identified (PRR>2, Chi-square score >4, and the number of cases >0.2% of total reports). These AE terms were classified using the Ontology of Adverse Events (OAE), MedDRA, and SNOMED-CT. The OAE method provided better classification results than the two other methods. Thirteen out of 48 TIV-enriched AEs were related to neurological and muscular processing such as paralysis, movement disorders, and muscular weakness. In contrast, 15 out of 68 LAIV-enriched AEs were associated with inflammatory response and respiratory system disorders. There was evidence of two severe adverse events (Guillain-Barre Syndrome and paralysis) present in TIV. Although these severe adverse events occurred at a low incidence rate, they were found to be more significantly enriched in TIV-vaccinated patients than in LAIV-vaccinated patients.
Therefore, our novel combinatorial bioinformatics analysis discovered that LAIV had a lower chance of inducing these two severe adverse events than TIV. In addition, our meta-analysis found that all previously reported positive correlations between GBS and influenza vaccine immunization were based on trivalent influenza vaccines instead of monovalent influenza vaccines. PMID:23209624
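The three screening criteria quoted in this abstract (PRR>2, Chi-square score >4, cases >0.2% of total reports) can be sketched on a 2x2 table of report counts. The sketch assumes the usual definitions: a and b are reports with and without the adverse event for the vaccine of interest, c and d the same for the comparison vaccines; PRR is the ratio of the two reporting proportions; and the chi-square statistic is the uncorrected Pearson form. Whether CODAE uses exactly these variants, and whether the 0.2% base is the report total of the vaccine of interest, are assumptions; the function names and counts are hypothetical.

```python
def prr(a, b, c, d):
    """Proportional Reporting Ratio: [a / (a + b)] / [c / (c + d)]."""
    return (a / (a + b)) / (c / (c + d))


def chi_square(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def is_enriched(a, b, c, d, min_prr=2.0, min_chi2=4.0, min_fraction=0.002):
    """Apply the three thresholds; the case filter uses a / (a + b)."""
    return (
        prr(a, b, c, d) > min_prr
        and chi_square(a, b, c, d) > min_chi2
        and a > min_fraction * (a + b)
    )


# Made-up counts, for illustration only (not taken from VAERS).
print(prr(10, 90, 5, 95), chi_square(10, 90, 5, 95), is_enriched(10, 90, 5, 95))
```

The functions divide by the marginal totals, so a table with an empty row or column (for example c = 0) needs to be handled before calling them.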
Support of surgical process modeling by using adaptable software user interfaces
NASA Astrophysics Data System (ADS)
Neumuth, T.; Kaschek, B.; Czygan, M.; Goldstein, D.; Strauß, G.; Meixensberger, J.; Burgert, O.
2010-03-01
Surgical Process Modeling (SPM) is a powerful method for acquiring data about the evolution of surgical procedures. Surgical Process Models are used in a variety of use cases including evaluation studies, requirements analysis and procedure optimization, surgical education, and workflow management scheme design. This work proposes the use of adaptive, situation-aware user interfaces for observation support software for SPM. We developed a method to support the observer's modeling by using an ontological knowledge base. This is used to drive the graphical user interface for the observer to restrict the search space of terminology depending on the current situation. The evaluation study shows that the workload of the observer was decreased significantly by using adaptive user interfaces. 54 SPM observation protocols were analyzed by using the NASA Task Load Index, and it was shown that the use of the adaptive user interface significantly relieves the observer in the workload criteria effort, mental demand and temporal demand, helping him to concentrate on his essential task of modeling the Surgical Process.
Residential Consumption Scheduling Based on Dynamic User Profiling
NASA Astrophysics Data System (ADS)
Mangiatordi, Federica; Pallotti, Emiliano; Del Vecchio, Paolo; Capodiferro, Licia
The deployment of household appliances and electric vehicles raises electricity demand in residential areas and the impact on the building's electrical power. Variations in electricity consumption across the day may affect both the design of the electrical generation facilities and the electricity bill, especially when dynamic pricing is applied. This paper focuses on an energy management system able to control the day-ahead electricity demand in a residential area, taking into account both the variability of energy production costs and the profiling of the users. The user's behavior is dynamically profiled on the basis of the tasks performed during the previous days and the tasks foreseen for the current day. Depending on the size of the user tasks and on their flexibility in time, home inhabitants are grouped into one of N energy profiles using a k-means algorithm. For a fixed energy generation cost, each energy profile is associated with a different hourly energy cost. The goal is to identify any bad user profile and to charge it a higher bill. An example of a bad profile is a user who runs many consumption tasks and allows little flexibility in rescheduling them. The proposed energy management system automatically schedules the tasks by solving a multi-objective optimization problem based on an MPSO strategy. The goals, when identifying bad user profiles, are to reduce the peak-to-average ratio in energy demand and to minimize energy costs, promoting virtuous behaviors.
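The peak-to-average ratio objective mentioned above can be made concrete with a small sketch. It is not the MPSO scheduler from the paper: for illustration it simply tries every allowed start hour of one flexible task and keeps the one that gives the lowest ratio. All function names, loads and task parameters are hypothetical.

```python
def peak_to_average_ratio(hourly_load):
    """Peak demand divided by mean demand over the scheduling horizon."""
    average = sum(hourly_load) / len(hourly_load)
    if average == 0:
        raise ValueError("load profile is all zeros")
    return max(hourly_load) / average


def add_task(hourly_load, start_hour, duration, power_kw):
    """Return a copy of the profile with a task added (wraps around midnight)."""
    new_load = list(hourly_load)
    for hour in range(start_hour, start_hour + duration):
        new_load[hour % len(new_load)] += power_kw
    return new_load


def best_start_hour(base_load, duration, power_kw, allowed_starts):
    """Exhaustive search over allowed start hours (a stand-in for MPSO)."""
    return min(
        allowed_starts,
        key=lambda s: peak_to_average_ratio(
            add_task(base_load, s, duration, power_kw)),
    )


# Hypothetical day-ahead base load in kW with an evening peak at 19:00.
base = [1.0] * 24
base[19] = 3.0
print(best_start_hour(base, duration=2, power_kw=2.0, allowed_starts=[18, 2]))
```

A user with a wide set of allowed start hours gives the scheduler more room to flatten the profile, which is the flexibility the abstract rewards.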
linkedISA: semantic representation of ISA-Tab experimental metadata.
González-Beltrán, Alejandra; Maguire, Eamonn; Sansone, Susanna-Assunta; Rocca-Serra, Philippe
2014-01-01
Reporting and sharing experimental metadata, such as the experimental design, characteristics of the samples, and procedures applied, along with the analysis results, in a standardised manner ensures that datasets are comprehensible and, in principle, reproducible, comparable and reusable. Furthermore, sharing datasets in formats designed for consumption by both humans and machines will maximize their use. The Investigation/Study/Assay (ISA) open source metadata tracking framework facilitates standards-compliant collection, curation, visualization, storage and sharing of datasets, leveraging other platforms to enable analysis and publication. The ISA software suite includes several components used in an increasingly diverse set of life science and biomedical domains; it is underpinned by a general-purpose format, ISA-Tab, and conversions exist into formats required by public repositories. While ISA-Tab works well mainly as a human-readable format, we have also implemented a linked data approach to semantically define the ISA-Tab syntax. We present a semantic web representation of the ISA-Tab syntax that complements ISA-Tab's syntactic interoperability with semantic interoperability. We introduce the linkedISA tool for converting ISA-Tab to the Resource Description Framework (RDF), supporting mappings from the ISA syntax to multiple community-defined, open ontologies and capitalising on user-provided ontology annotations in the experimental metadata. We describe insights from the implementation and how annotations can be expanded, driven by the metadata. We applied the conversion tool as part of Bio-GraphIIn, a web-based application supporting integration of the semantically-rich experimental descriptions.
Designed in a user-friendly manner, the Bio-GraphIIn interface hides most of the complexities from the users, exposing a familiar tabular view of the experimental description to allow seamless interaction with the RDF representation, and visualising descriptors to drive the query over the semantic representation of the experimental design. In addition, we defined queries over the linkedISA RDF representation and demonstrated its use over the linkedISA conversion of datasets from Nature's Scientific Data online publication. Our linked data approach has allowed us to: 1) make the ISA-Tab semantics explicit and machine-processable, 2) exploit the existing ontology-based annotations in the ISA-Tab experimental descriptions, 3) augment the ISA-Tab syntax with new descriptive elements, and 4) visualise and query elements related to the experimental design. Reasoning over ISA-Tab metadata and associated data will facilitate data integration and knowledge discovery.
EUMIS - an open portal framework for interoperable marine environmental services
NASA Astrophysics Data System (ADS)
Hamre, T.; Sandven, S.; Leadbetter, A.; Gouriou, V.; Dunne, D.; Grant, M.; Treguer, M.; Torget, Ø.
2012-04-01
NETMAR (Open service network for marine environmental data) is an FP7 project that aims to develop a pilot European Marine Information System (EUMIS) for searching, downloading and integrating satellite, in situ and model data from ocean and coastal areas. EUMIS will use a semantic framework coupled with ontologies for identifying and accessing distributed data, such as near-real time, forecast and historical data. Four pilots have been defined to clarify the needs for satellite, in situ and model based products and services in selected user communities. The pilots are:
· Pilot 1: Arctic Sea Ice Monitoring and Forecasting
· Pilot 2: Oil spill drift forecast and shoreline cleanup assessment services in France
· Pilot 3: Ocean colour - Marine Ecosystem, Research and Monitoring
· Pilot 4: International Coastal Atlas Network (ICAN) for coastal zone management
NETMAR is developing a set of data delivery services for the targeted user communities by means of standard web-GIS and OPeNDAP protocols. Processing services and adaptive service chaining services will also be developed, to enable users to generate new products suited to their needs. Both the data retrieved from online repositories and the dynamically generated products can be accessed and visualised in the EUMIS portal. For this purpose, a GIS Viewer, a Service Chaining Editor and an Ontology Browser/Discovery Client have been developed and integrated in EUMIS. The EUMIS portal is developed using a portal framework that is compliant with the JSR-168 (Java Portlet Specification 1.0) and JSR-286 (Java Portlet Specification 2.0) standards. These standards define the interface (contract) and lifecycle management for a portal system component, a portlet, which can be implemented in a number of programming languages, not only Java. The GIS Viewer is developed using a combination of Java, JavaScript and JSF (e.g. MapFaces).
The Service chaining editor is implemented in JavaScript (using different libraries like jQuery and WireIt), and the Ontology Browser/Discovery Client by means of Adobe Flex. In addition to the portlets developed in the project, we have also used several of the pre-built portlets that come with the Liferay Community Edition portal framework, notably the wiki, forum and RSS feed portlets. The presentation will focus on the developed system components and show some examples of products and services from the defined pilots.
Tripathi, Kumar Parijat; Evangelista, Daniela; Zuccaro, Antonio; Guarracino, Mario Rosario
2015-01-01
RNA-seq is a new tool for measuring RNA transcript counts, using high-throughput sequencing with extraordinary accuracy. It provides a quantitative means to explore the transcriptome of an organism of interest. However, turning such extremely large data sets into biological knowledge is a problem, and biologist-friendly tools are lacking. In our lab, we developed Transcriptator, a web application based on a computational Python pipeline with a user-friendly Java interface. This pipeline uses the web services available for the BLAST (Basic Local Alignment Search Tool), QuickGO and DAVID (Database for Annotation, Visualization and Integrated Discovery) tools. It offers a report on the statistical analysis of functional and Gene Ontology (GO) annotation enrichment. It helps users to identify enriched biological themes, particularly GO terms, pathways, domains, gene/protein features and information related to protein-protein interactions. It clusters the transcripts based on functional annotations and generates a tabular report of functional and gene ontology annotations for each transcript submitted to the web server. The implementation of QuickGO web services in our pipeline enables users to carry out GO-Slim analysis, whereas the integration of PORTRAIT (Prediction of transcriptomic non coding RNA (ncRNA) by ab initio methods) helps to identify non-coding RNAs and their regulatory role in the transcriptome. In summary, Transcriptator is a useful software tool for both NGS and array data. It helps users to characterize de novo assembled reads obtained from NGS experiments on non-referenced organisms, and it also performs functional enrichment analysis of differentially expressed transcripts/genes for both RNA-seq and micro-array experiments. It generates easy-to-read tables and interactive charts for better understanding of the data. The pipeline is modular in nature and provides an opportunity to add new plugins in the future.
The web application is freely available at: http://www-labgtp.na.icar.cnr.it/Transcriptator.
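To make the idea of GO annotation enrichment concrete, here is a minimal sketch of a one-sided hypergeometric test, a common generic choice for over-representation analysis. It is an assumption that a test of this kind is appropriate here: the abstract does not say which statistic Transcriptator reports, and DAVID applies its own scoring. The function name and the counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N -- number of annotated transcripts in the background
    K -- background transcripts carrying the GO term
    n -- transcripts in the list of interest
    k -- transcripts in the list that carry the GO term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Exact integer arithmetic; comb() returns 0 for impossible draws.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 40 of 200 selected transcripts carry a term that
# annotates 500 of 10000 background transcripts.
print(hypergeom_enrichment_p(40, 200, 500, 10000))
```

In practice the p-values for many GO terms would also need a multiple-testing correction, which this sketch leaves out.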
IoT-Based User-Driven Service Modeling Environment for a Smart Space Management System
Choi, Hoan-Suk; Rhee, Woo-Seop
2014-01-01
The existing Internet environment has been extended to the Internet of Things (IoT) as an emerging new paradigm. The IoT connects various physical entities. These entities have communication capability and deploy the observed information to various service areas such as building management, energy-saving systems, surveillance services, and smart homes. These services are designed and developed by professional service providers. Moreover, users' needs have become more complicated and personalized with the spread of user-participation services such as social media and blogging. Therefore, some active users want to create their own services to satisfy their needs, but the existing IoT service-creation environment is difficult for the non-technical user because it requires a programming capability to create a service. To solve this problem, we propose the IoT-based user-driven service modeling environment to provide an easy way to create IoT services. Also, the proposed environment deploys the defined service to another user. Through the personalization and customization of the defined service, the value and dissemination of the service are increased. This environment also provides ontology-based context-information processing that produces and describes the context information for the IoT-based user-driven service. PMID:25420153
NASA Astrophysics Data System (ADS)
Fazliev, A.
2009-04-01
The information and knowledge layers of an information-computational system for water spectroscopy are described. Semantic metadata have been studied for all the tasks of the domain information model that underlie these layers. The principle by which semantic metadata are determined, and the mechanisms by which they are used to systematize information in molecular spectroscopy, are presented. The software developed for working with semantic metadata is described as well. Within the Semantic Web framework, a domain model is built on an explicit specification of its conceptualization, in other words, on its ontologies. The conceptualization of molecular spectroscopy was described in Refs. [1, 2]. In those works, two chains of tasks were selected as a zeroth approximation for describing the knowledge domain: the direct tasks chain and the inverse tasks chain. The solution schemes of these tasks define an approximation of the data layer for the knowledge domain conceptualization. The properties of spectroscopy task solutions lead to a step-by-step extension of the molecular spectroscopy conceptualization, and the information layer of the information system corresponds to this extension. An advantage of a molecular spectroscopy model designed as a chain of tasks is that data and metadata can be defined explicitly at each step of solving the tasks in the chain. The metadata structure (the properties of task solutions) in the knowledge domain also has the form of a chain, in which the input data and metadata of the previous task become metadata of the following tasks. The term metadata is used in its narrow sense: metadata are the properties of spectroscopy task solutions. Semantic metadata represented by means of OWL [3] are formed automatically, and they are individuals of classes (A-box). The union of the T-box and the A-box is an ontology that can be processed by an inference engine.
In this work we analyzed the formation of individuals of the applied ontologies of molecular spectroscopy, as well as the software used to create them by means of the OWL DL language. The results of this work are presented in the form of an information layer and a knowledge layer in the W@DIS information system [4].
1 FORMATION OF INDIVIDUALS OF WATER SPECTROSCOPY APPLIED ONTOLOGY
The applied tasks ontology contains an explicit description of the input and output data of the physical tasks solved in the two chains of molecular spectroscopy tasks. Besides the physical concepts related to spectroscopy task solutions, an information source, which is a key concept of the knowledge domain information model, is also used. Each solution of a knowledge domain task is linked to an information source that contains a reference to the published task solution, the molecule and the task solution properties. Each information source allows us to identify a certain knowledge domain task solution contained in the information system. The classes of the water spectroscopy applied ontology are formed on the basis of the taxonomy of molecular spectroscopy concepts. They are defined by constraints on properties of the selected conceptualization. The applied ontology in the W@DIS information system is extended according to two scenarios. In the first, individuals (ontology facts or axioms) are formed when a task solution is uploaded into the information system. In the second, operations on the molecular spectroscopy taxonomy and individuals are performed solely by the user; the Protege ontology editor was used for this purpose. Software was designed and implemented for the formation, processing and visualization of individuals of knowledge domain tasks. The method of individual formation determines the sequence of steps in generating the individuals of the created ontology. Task solution properties (metadata) have qualitative and quantitative values.
Qualitative metadata describe the qualitative side of a task, such as the solution method or other information that can be explicitly specified by object properties of the OWL DL language. Quantitative metadata describe quantitative properties of a task solution, such as the minimal and maximal data values or other information that can be obtained explicitly by programmed algorithmic operations. These metadata correspond to DatatypeProperty properties of the OWL specification language. Quantitative metadata can be obtained automatically during data upload into the information system. Since ObjectProperty values are objects, processing of qualitative metadata requires logical constraints. For tasks solved within the W@DIS ICS, qualitative metadata can be formed automatically (for example, in the spectral functions calculation task). The methods used to translate qualitative metadata into quantitative metadata are characterized as a coarsened representation of knowledge in the knowledge domain. The existence of two ways of obtaining data is a key point in the formation of the applied ontology of a molecular spectroscopy task:
- the experimental method (metadata for experimental data contain a description of the equipment, experiment conditions and so on) at the initial stage, with inverse task solution at the following stages;
- the calculation method (metadata for calculated data are closely related to the metadata used to describe the physical and mathematical models of molecular spectroscopy).
2 SOFTWARE FOR ONTOLOGY OPERATION
Data collection in the water spectroscopy information system is organized as a workflow that contains such operations as information source creation, entry of bibliographic data on publications, formation of the uploaded data schema and so on. Metadata are generated in the information source as well. Two methods are used for their formation: automatic metadata generation and manual metadata generation (performed by the user).
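The automatic generation of quantitative metadata during data upload, as described above, might look roughly like the following sketch. It computes the minimal and maximal data values and a count from an uploaded list of wavenumbers and returns them as subject-property-value triples that resemble OWL DatatypeProperty assertions. The property names, the source identifier and the data are hypothetical; the abstract does not list the actual W@DIS vocabulary.

```python
def quantitative_metadata(source_id, wavenumbers):
    """Derive quantitative metadata triples from uploaded numeric data.

    source_id   -- identifier of the information source (hypothetical)
    wavenumbers -- uploaded transition wavenumbers in cm^-1 (hypothetical)
    """
    if not wavenumbers:
        raise ValueError("no data uploaded for this information source")
    return [
        (source_id, "hasMinWavenumber", min(wavenumbers)),
        (source_id, "hasMaxWavenumber", max(wavenumbers)),
        (source_id, "hasNumberOfTransitions", len(wavenumbers)),
    ]


for triple in quantitative_metadata("source_001", [1594.7, 3657.1, 3755.9]):
    print(triple)
```

Qualitative metadata, by contrast, point at other individuals (for example a solution method), so they cannot be derived by arithmetic over the data alone.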
Software support for the actions related to metadata formation is implemented by the META+ module. The functions of the META+ module can be divided into two groups: the first contains the functions needed by the software developer, and the second the functions needed by a user of the information system. The META+ module functions needed by the developer are:
1. creation of the taxonomy (T-boxes) of applied ontology classes of knowledge domain tasks;
2. creation of instances of task classes;
3. creation of data schemes of tasks in the form of an XML-pattern based on XML-syntax; the XML-pattern is developed for the instances generator and created according to certain rules imposed on the software generator implementation;
4. implementation of algorithms for calculating metadata values;
5. creation of a request interface and an additional knowledge processing function for the solution of these tasks;
6. unification of the created functions and interfaces into one information system.
The following sequence is universal for the generation of task classes' individuals that form chains. Special interfaces for managing user operations are designed for the software developer in the META+ module. There are means for updating qualitative metadata values when data are re-uploaded to an information source.
The list of functions needed by the end user contains:
- visualization and editing of data sets, taking into account their metadata, e.g. display of the unique number of bands in transitions for a certain data source;
- export of OWL/RDF models from the information system to the environment in XML-syntax;
- visualization of instances of classes of applied ontology tasks on molecular spectroscopy;
- import of OWL/RDF models into the information system and their integration with the domain vocabulary;
- formation of additional knowledge of the knowledge domain for the construction of ontological instances of task classes using GTML-formats and their processing;
- formation of additional knowledge in the knowledge domain for the construction of instances of task classes, using a software algorithm for data set processing;
- a semantic search function, implemented through an interface that formulates questions in the form of related triplets in order to obtain an adequate answer.
3 STRUCTURE OF META+ MODULE
The META+ software module that provides the above functions contains the following components:
- a knowledge base that stores the semantic metadata and taxonomies of the information system;
- the software libraries POWL and RAP [5], created by third-party developers and providing access to the ontological storage;
- function classes and libraries that form the core of the module and perform the tasks of forming, storing and visualizing class instances;
- configuration files and module patterns that allow one to adjust and organize the operation of different functional blocks.
The META+ module also contains scripts and patterns implemented according to the rules of the W@DIS information system development environment:
- scripts for interaction with the environment by means of the software core of the information system.
These scripts organize web-oriented interactive communication;
- patterns for visualizing the functionality implemented by the scripts.
The software core of the scientific information-computational system W@DIS is built on the MVC (Model - View - Controller) design pattern, which allows us to separate the logic of the application from its representation. It implements the interaction of three logical components, providing interactivity with the environment via the Web and performing its preprocessing. The functions of the "Controller" logical component are implemented by scripts designed according to the rules imposed by the software core of the information system. Each script is a definite object-oriented class with an obligatory script initiation method called "start". The functions that present the results of domain application operation (i.e. the "View" component) are sets of HTML-patterns that allow one to visualize these results with the help of additional constructions processed by the software core of the system. Besides interacting with the software core of the scientific information system, the module also works with the configuration files of the software core and its database. Such an organization of work provides closer integration with the software core and a deeper and more adequate connection in operating system support.
4 CONCLUSION
In this work the problems of creating semantic metadata in an information system oriented toward information representation in the area of molecular spectroscopy have been discussed. The method of forming semantic metadata and functions, as well as the implementation and structure of the META+ module, have been described. The architecture of the META+ module is closely related to the existing software of the "Molecular spectroscopy" scientific information system. The module is implemented using modern approaches to the development of Web-oriented applications.
It uses the existing applied interfaces. The developed software allows us to:
- perform automatic metadata annotation of calculated task solutions directly in the information system;
- perform automatic metadata annotation of task solutions obtained outside the information system when their results are uploaded, forming an instance of the solved task on the basis of the entered data;
- use ontological instances of task solutions to identify data in the information tasks of viewing, comparison and search solved by the information system;
- export applied task ontologies for work with them by external means;
- solve the task of semantic search according to a pattern, using a question-answer type interface.
5 ACKNOWLEDGEMENT
The authors are grateful to RFBR for the financial support of the development of the distributed information system for molecular spectroscopy.
REFERENCES
1. A.D. Bykov, A.Z. Fazliev, N.N. Filippov, A.V. Kozodoev, A.I. Privezentsev, L.N. Sinitsa, M.V. Tonkov and M.Yu. Tretyakov, Distributed information system on atmospheric spectroscopy // Geophysical Research Abstracts, SRef-ID: 1607-7962/gra/EGU2007-A-01906, 2007, v. 9, p. 01906.
2. A.I. Privezentsev, A.Z. Fazliev, Applied task ontology for molecular spectroscopy information resources systematization. The Proceedings of 9th Russian scientific conference "Electronic libraries: advanced methods and technologies, electronic collections" - RCDL'2007, Pereslavl Zalesskii, 2007, part 1, P. 201-210.
3. OWL Web Ontology Language Semantics and Abstract Syntax, W3C Recommendation 10 February 2004, http://www.w3.org/TR/2004/REC-owl-semantics-20040210/
4. W@DIS information system, http://wadis.saga.iao.ru
5. RAP library, http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/.
An Ontology-Based, Mobile-Optimized System for Pharmacogenomic Decision Support at the Point-of-Care
Miñarro-Giménez, Jose Antonio; Blagec, Kathrin; Boyce, Richard D.; Adlassnig, Klaus-Peter; Samwald, Matthias
2014-01-01
Background: The development of genotyping and genetic sequencing techniques and their evolution towards low costs and quick turnaround have encouraged a wide range of applications. One of the most promising applications is pharmacogenomics, where genetic profiles are used to predict the most suitable drugs and drug dosages for the individual patient. This approach aims to ensure appropriate medical treatment and avoid, or properly manage, undesired side effects. Results: We developed the Medicine Safety Code (MSC) service, a novel pharmacogenomics decision support system, to provide physicians and patients with the ability to represent pharmacogenomic data in computable form and to provide pharmacogenomic guidance at the point-of-care. Pharmacogenomic data of individual patients are encoded as Quick Response (QR) codes and can be decoded and interpreted with common mobile devices without requiring a centralized repository for storing genetic patient data. In this paper, we present the first fully functional release of this system and describe its architecture, which utilizes Web Ontology Language 2 (OWL 2) ontologies to formalize pharmacogenomic knowledge and to provide clinical decision support functionalities. Conclusions: The MSC system provides a novel approach for enabling the implementation of personalized medicine in clinical routine. PMID:24787444
Effectively identifying user profiles in network and host metrics
NASA Astrophysics Data System (ADS)
Murphy, John P.; Berk, Vincent H.; Gregorio-de Souza, Ian
2010-04-01
This work presents a collection of methods used to effectively identify users of computer systems based on their particular usage of the software and the network. Not only are we able to identify individual computer users by their behavioral patterns, we are also able to detect significant deviations in their typical computer usage over time, or compared to a group of their peers. For instance, most people have a small and relatively unique selection of regularly visited websites, certain email services, daily work hours, and typical preferred applications for mandated tasks. We argue that these habitual patterns are sufficiently specific to identify fully anonymized network users. We demonstrate that with only a modest data collection capability, profiles of individual computer users can be constructed so as to uniquely identify a profiled user from among their peers. As time progresses and habits or circumstances change, the methods presented update each profile so that changes in user behavior can be reliably detected over both abrupt and gradual time frames, without losing the ability to identify the profiled user. The primary benefit of our methodology is that it allows one to efficiently detect deviant behaviors, such as subverted user accounts or organizational policy violations. Thanks to their relative robustness, these techniques can be used in scenarios with very diverse data collection capabilities and data privacy requirements. In addition to behavioral change detection, the generated profiles can also be compared against pre-defined examples of known adversarial patterns.
Exploring personalized searches using tag-based user profiles and resource profiles in folksonomy.
Cai, Yi; Li, Qing; Xie, Haoran; Min, Huaqin
2014-10-01
With the increase in resource-sharing websites such as YouTube and Flickr, many shared resources have arisen on the Web. Personalized searches have become more important and challenging since users demand higher retrieval quality. To achieve this goal, personalized searches need to take users' personalized profiles and information needs into consideration. Collaborative tagging (also known as folksonomy) systems allow users to annotate resources with their own tags, which provides a simple but powerful way for organizing, retrieving and sharing different types of social resources. In this article, we examine the limitations of previous tag-based personalized searches. To handle these limitations, we propose a new method to model user profiles and resource profiles in collaborative tagging systems. We use a normalized term frequency to indicate the preference degree of a user on a tag. A novel search method using such profiles of users and resources is proposed to facilitate the desired personalization in resource searches. In our framework, instead of the keyword matching or similarity measurement used in previous works, the relevance measurement between a resource and a user query (termed the query relevance) is treated as a fuzzy satisfaction problem of a user's query requirements. We implement a prototype system called the Folksonomy-based Multimedia Retrieval System (FMRS). Experiments using the FMRS data set and the MovieLens data set show that our proposed method outperforms baseline methods. Copyright © 2014 Elsevier Ltd. All rights reserved.
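The tag-based user profile mentioned above can be sketched as follows. The sketch assumes that "normalized term frequency" means a tag's count divided by the total number of tag assignments made by the user; the paper may normalize differently (for example by the most frequent tag), and the function name and tags are hypothetical.

```python
from collections import Counter


def tag_profile(tag_assignments):
    """Map each tag to its share of the user's tag assignments.

    tag_assignments -- one entry per (resource, tag) annotation by the user.
    """
    counts = Counter(tag_assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: count / total for tag, count in counts.items()}


# Hypothetical annotation history of one user.
profile = tag_profile(["jazz", "jazz", "rock", "live"])
print(profile)
```

A resource profile can be built the same way from the tags that all users assigned to one resource, which gives the two profiles a common scale before any relevance measure is applied.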
NASA Astrophysics Data System (ADS)
Criado, Javier; Padilla, Nicolás; Iribarne, Luis; Asensio, Jose-Andrés
Due to the globalization of the information and knowledge society on the Internet, modern Web-based Information Systems (WIS) must be flexible and prepared to be easily accessible and manageable in real-time. Recently, special interest has been given to the globalization of information through a common vocabulary (i.e., ontologies) and to the standardized way in which information is retrieved on the Web (i.e., powerful search engines and intelligent software agents). These same principles of globalization and standardization should also be valid for the user interfaces of WIS, but these are still built on traditional development paradigms. In this paper we present an approach to reduce the globalization/standardization gap in the generation of WIS user interfaces by using a real-time "bottom-up" composition perspective with COTS-interface components (interface widgets) and trading services.
TabSQL: a MySQL tool to facilitate mapping user data to public databases.
Xia, Xiao-Qin; McClelland, Michael; Wang, Yipeng
2010-06-23
With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data.
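The kind of query the abstract above describes (user data joined against a downloaded annotation table) can be sketched as follows. TabSQL itself is MySQL-based; this sketch uses Python's built-in `sqlite3` module as a stand-in, and the table names, column names and values are hypothetical.

```python
import sqlite3

# In-memory database standing in for the MySQL backend.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_data (gene TEXT, fold_change REAL);
    CREATE TABLE annotation (gene TEXT, go_term TEXT);
    INSERT INTO user_data VALUES ('geneA', 2.5), ('geneB', 0.4), ('geneC', 3.1);
    INSERT INTO annotation VALUES ('geneA', 'term1'), ('geneC', 'term2');
""")

# Join the user's measurements with the annotation table and keep
# only the genes above a chosen threshold.
rows = conn.execute("""
    SELECT u.gene, u.fold_change, a.go_term
    FROM user_data AS u
    JOIN annotation AS a ON a.gene = u.gene
    WHERE u.fold_change > 2.0
    ORDER BY u.gene
""").fetchall()

for row in rows:
    print(row)
```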
TabSQL: a MySQL tool to facilitate mapping user data to public databases
2010-01-01
Background: With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. Results: We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. Conclusions: TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data. PMID:20573251
A model-driven privacy compliance decision support for medical data sharing in Europe.
Boussi Rahmouni, H; Solomonides, T; Casassa Mont, M; Shiu, S; Rahmouni, M
2011-01-01
Clinical practitioners and medical researchers often have to share health data with other colleagues across Europe. Privacy compliance in this context is very important but challenging. Automated privacy guidelines are a practical way of increasing users' awareness of privacy obligations and help eliminate unintentional breaches of privacy. In this paper we present an ontology-plus-rules based approach to privacy decision support for the sharing of patient data across European platforms. We use ontologies to model the required domain and context information about data sharing and privacy requirements. In addition, we use a set of Semantic Web Rule Language rules to reason about legal privacy requirements that are applicable to a specific context of data disclosure. We make the complete set invocable through a semantic web application acting as an interactive privacy guideline system, which can then invoke the full model in order to provide decision support. When asked, the system will generate privacy reports applicable to a specific case of data disclosure described by the user. Reports showing guidelines per Member State may also be obtained. The advantage of this approach lies in the expressiveness and extensibility of the modelling and inference languages adopted and the ability they confer to reason with complex requirements interpreted from high level regulations. However, the system cannot at this stage fully simulate the role of an ethics committee or review board.
Multiple Interests of Users in Collaborative Tagging Systems
NASA Astrophysics Data System (ADS)
Au Yeung, Ching-Man; Gibbins, Nicholas; Shadbolt, Nigel
Performance of recommender systems depends on whether the user profiles contain accurate information about the interests of the users, and this in turn relies on whether enough information about their interests can be collected. Collaborative tagging systems allow users to use their own words to describe their favourite resources, resulting in some user-generated categorisation schemes commonly known as folksonomies. Folksonomies thus contain rich information about the interests of the users, which can be used to support various recommender systems. Our analysis of the folksonomy in Delicious reveals that the interests of a single user can be very diverse. Traditional methods for representing interests of users are usually not able to reflect such diversity. We propose a method to construct user profiles of multiple interests from folksonomies based on a network clustering technique. Our evaluation shows that the proposed method is able to generate user profiles which reflect the diversity of user interests and can be used as a basis of providing more focused recommendation to the users.
Enhanced functionalities for annotating and indexing clinical text with the NCBO Annotator.
Tchechmedjiev, Andon; Abdaoui, Amine; Emonet, Vincent; Melzi, Soumia; Jonnagaddala, Jitendra; Jonquet, Clement
2018-06-01
Second use of clinical data commonly involves annotating biomedical text with terminologies and ontologies. The National Center for Biomedical Ontology Annotator is a frequently used annotation service, originally designed for biomedical data, but not very suitable for clinical text annotation. In order to add new functionalities to the NCBO Annotator without hosting or modifying the original Web service, we have designed a proxy architecture that enables seamless extensions by pre-processing of the input text and parameters, and post-processing of the annotations. We have then implemented enhanced functionalities for annotating and indexing free text such as: scoring, detection of context (negation, experiencer, temporality), new output formats and coarse-grained concept recognition (with UMLS Semantic Groups). In this paper, we present the NCBO Annotator+, a Web service which incorporates these new functionalities as well as a small set of evaluation results for concept recognition and clinical context detection on two standard evaluation tasks (Clef eHealth 2017, SemEval 2014). The Annotator+ has been successfully integrated into the SIFR BioPortal platform (an implementation of NCBO BioPortal for French biomedical terminologies and ontologies) to annotate English text. A Web user interface is available for testing and ontology selection (http://bioportal.lirmm.fr/ncbo_annotatorplus); however, the Annotator+ is meant to be used through the Web service application programming interface (http://services.bioportal.lirmm.fr/ncbo_annotatorplus). The code is openly available, and we also provide a Docker packaging to enable easy local deployment to process sensitive (e.g. clinical) data in-house (https://github.com/sifrproject). Contact: andon.tchechmedjiev@lirmm.fr. Supplementary data are available at Bioinformatics online.
Geoscience Australia Publishes Sample Descriptions using W3C standards
NASA Astrophysics Data System (ADS)
Car, N. J.; Cox, S. J. D.; Bastrakova, I.; Wyborn, L. A.
2017-12-01
The recent revision of the W3C Semantic Sensor Network Ontology (SSN) has focused on three key concerns: (1) extending the scope of the ontology to include sampling and actuation as well as observation and sensing; (2) modularizing the ontology into a simple core with few classes and properties and little formal axiomatization, supplemented by additional modules that formalize the semantics and extend the scope; and (3) alignments with several existing applications and upper ontologies. These enhancements mean that SSN can now be used as the basis for publishing descriptions of geologic samples as Linked Data. Geoscience Australia maintains a database of about three million samples, collected over 50 years through ocean core, terrestrial rock and hydrochemistry borehole projects, almost all of which are held in the special-purpose GA samples repository. Access to descriptions of these samples as Linked Data has recently been enabled. The sample descriptions can be viewed in various machine-readable formalizations, including IGSN (XML & RDF), Dublin Core (XML & RDF) and SSN (RDF), as well as web landing-pages for people. Of particular importance is the support for encoding relationships between samples, and between samples and the surveys, boreholes, and traverses to which they are related, as well as between samples processed for analytical purposes and their parents, siblings, and back to the original field samples. The SSN extension for Sample Relationships provides an extensible, semantically rich mechanism to capture any relationship necessary to explain the provenance of observation results obtained from samples. Sample citation is facilitated through the use of URI-based persistent identifiers which resolve to samples' landing pages. The sample system also allows PROV pingbacks to be received for samples when users of them record provenance for their actions.
Cimino, James J.; Ayres, Elaine J.; Remennik, Lyubov; Rath, Sachi; Freedman, Robert; Beri, Andrea; Chen, Yang; Huser, Vojtech
2013-01-01
The US National Institutes of Health (NIH) has developed the Biomedical Translational Research Information System (BTRIS) to support researchers’ access to translational and clinical data. BTRIS includes a data repository, a set of programs for loading data from NIH electronic health records and research data management systems, an ontology for coding the disparate data with a single terminology, and a set of user interface tools that provide access to identified data from individual research studies and data across all studies from which individually identifiable data have been removed. This paper reports on unique design elements of the system, progress to date and user experience after five years of development and operation. PMID:24262893
The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog)
MacArthur, Jacqueline; Bowler, Emily; Cerezo, Maria; Gil, Laurent; Hall, Peggy; Hastings, Emma; Junkins, Heather; McMahon, Aoife; Milano, Annalisa; Morales, Joannella; Pendlington, Zoe May; Welter, Danielle; Burdett, Tony; Hindorff, Lucia; Flicek, Paul; Cunningham, Fiona; Parkinson, Helen
2017-01-01
The NHGRI-EBI GWAS Catalog has provided data from published genome-wide association studies since 2008. In 2015, the database was redesigned and relocated to EMBL-EBI. The new infrastructure includes a new graphical user interface (www.ebi.ac.uk/gwas/), ontology supported search functionality and an improved curation interface. These developments have improved the data release frequency by increasing automation of curation and providing scaling improvements. The range of available Catalog data has also been extended with structured ancestry and recruitment information added for all studies. The infrastructure improvements also support scaling for larger arrays, exome and sequencing studies, allowing the Catalog to adapt to the needs of evolving study design, genotyping technologies and user needs in the future. PMID:27899670
NASA Astrophysics Data System (ADS)
Gao, Wei; Zhu, Linli; Wang, Kaiyun
2015-12-01
Ontology, a model of knowledge representation and storage, has had extensive applications in pharmaceutics, social science, chemistry and biology. In the age of “big data”, the constructed concepts are often represented as higher-dimensional data by scholars, and thus the sparse learning techniques are introduced into ontology algorithms. In this paper, based on the alternating direction augmented Lagrangian method, we present an ontology optimization algorithm for ontological sparse vector learning, and a fast version of such ontology technologies. The optimal sparse vector is obtained by an iterative procedure, and the ontology function is then obtained from the sparse vector. Four simulation experiments show that our ontological sparse vector learning model has a higher precision ratio on plant ontology, humanoid robotics ontology, biology ontology and physics education ontology data for similarity measuring and ontology mapping applications.
Budak, Gungor; Srivastava, Rajneesh; Janga, Sarath Chandra
2017-06-01
RNA-binding proteins (RBPs) control the regulation of gene expression in eukaryotic genomes at post-transcriptional level by binding to their cognate RNAs. Although several variants of CLIP (crosslinking and immunoprecipitation) protocols are currently available to study the global protein-RNA interaction landscape at single-nucleotide resolution in a cell, currently there are very few tools that can facilitate understanding and dissecting the functional associations of RBPs from the resulting binding maps. Here, we present Seten, a web-based and command line tool, which can identify and compare processes, phenotypes, and diseases associated with RBPs from condition-specific CLIP-seq profiles. Seten uses BED files resulting from most peak calling algorithms, which include scores reflecting the extent of binding of an RBP on the target transcript, to provide both traditional functional enrichment as well as gene set enrichment results for a number of gene set collections including BioCarta, KEGG, Reactome, Gene Ontology (GO), Human Phenotype Ontology (HPO), and MalaCards Disease Ontology for several organisms including fruit fly, human, mouse, rat, worm, and yeast. It also provides an option to dynamically compare the associated gene sets across data sets as bubble charts, to facilitate comparative analysis. Benchmarking of Seten using eCLIP data for IGF2BP1, SRSF7, and PTBP1 against their corresponding CRISPR RNA-seq in K562 cells as well as randomized negative controls, demonstrated that its gene set enrichment method outperforms functional enrichment, with scores significantly contributing to the discovery of true annotations. Comparative performance analysis using these CRISPR control data sets revealed significantly higher precision and comparable recall to that observed using ChIP-Enrich. 
Seten's web interface currently provides precomputed results for about 200 CLIP-seq data sets and both command line as well as web interfaces can be used to analyze CLIP-seq data sets. We highlight several examples to show the utility of Seten for rapid profiling of various CLIP-seq data sets. Seten is available on http://www.iupui.edu/~sysbio/seten/. © 2017 Budak et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Exploratory visualization of earth science data in a Semantic Web context
NASA Astrophysics Data System (ADS)
Ma, X.; Fox, P. A.
2012-12-01
Earth science data are increasingly unlocked from their local 'safes' and shared online with the global science community as well as the average citizen. The European Union (EU)-funded project OneGeology-Europe (1G-E, www.onegeology-europe.eu) is a typical project that promotes work in that direction. The 1G-E web portal provides easy access to distributed geological data resources across participating EU member states. Similar projects can also be found in other countries or regions, such as the geoscience information network USGIN (www.usgin.org) in the United States, the groundwater information network GIN-RIES (www.gw-info.net) in Canada and the earth science infrastructure AuScope (www.auscope.org.au) in Australia. While data are increasingly made available online, we currently face a shortage of tools and services that support information and knowledge discovery with such data. One reason is that earth science data are recorded in professional language and terms, and people without background knowledge cannot understand their meanings well. The Semantic Web provides a new context to help computers as well as users to better understand meanings of data and conduct applications. In this study we aim to chain up Semantic Web technologies (e.g., vocabularies/ontologies and reasoning), data visualization (e.g., an animation underpinned by an ontology) and online earth science data (e.g., available as Web Map Service) to develop functions for information and knowledge discovery. We carried out a case study with data of the 1G-E project. We set up an ontology of geological time scale using the encoding languages of SKOS (Simple Knowledge Organization System) and OWL (Web Ontology Language) from W3C (World Wide Web Consortium, www.w3.org). Then we developed a Flash animation of geological time scale by using the ActionScript language.
The animation is underpinned by the ontology and the interrelationships between concepts of geological time scale are visualized in the animation. We linked the animation and the ontology to the online geological data of the 1G-E project and developed interactive applications. The animation was used to show legends of rock age layers in geological maps dynamically. In turn, these legends were used as control panels to filter out and generalize geospatial features of certain rock ages on map layers. We tested the functions with maps of various EU member states. As a part of the initial results, legends for rock age layers of the individual EU national maps were generated, and the functions for filtering and generalization were examined with the map of the United Kingdom. Though new challenges are arising in the tests, like those caused by synonyms (e.g., 'Lower Cambrian' and 'Terreneuvian'), the initial results achieved the designed goals of information and knowledge discovery by using the ontology-underpinned animation. This study shows that (1) visualization lowers the barrier of ontologies, (2) integrating ontologies and visualization adds value to online earth science data services, and (3) exploratory visualization supports the procedure of data processing as well as the display of results.
Provenance Usage in the OceanLink Project
NASA Astrophysics Data System (ADS)
Narock, T.; Arko, R. A.; Carbotte, S. M.; Chandler, C. L.; Cheatham, M.; Fils, D.; Finin, T.; Hitzler, P.; Janowicz, K.; Jones, M.; Krisnadhi, A.; Lehnert, K. A.; Mickle, A.; Raymond, L. M.; Schildhauer, M.; Shepherd, A.; Wiebe, P. H.
2014-12-01
A wide spectrum of maturing methods and tools, collectively characterized as the Semantic Web, is helping to vastly improve the dissemination of scientific research. The OceanLink project, an NSF EarthCube Building Block, is utilizing semantic technologies to integrate geoscience data repositories, library holdings, conference abstracts, and funded research awards. Provenance is a vital component in meeting both the scientific and engineering requirements of OceanLink. Provenance plays a key role in justification and understanding when presenting users with results aggregated from multiple sources. In the engineering sense, provenance enables the identification of new data and the ability to determine which data sources to query. Additionally, OceanLink will leverage human and machine computation for crowdsourcing, text mining, and co-reference resolution. The results of these computations, and their associated provenance, will be folded back into the constituent systems to continually enhance precision and utility. We will touch on the various roles provenance is playing in OceanLink as well as present our use of the PROV Ontology and associated Ontology Design Patterns.
Enabling complex queries to drug information sources through functional composition.
Peters, Lee; Mortensen, Jonathan; Nguyen, Thang; Bodenreider, Olivier
2013-01-01
Our objective was to enable an end-user to create complex queries to drug information sources through functional composition, by creating sequences of functions from application program interfaces (API) to drug terminologies. The development of a functional composition model seeks to link functions from two distinct APIs. An ontology was developed using Protégé to model the functions of the RxNorm and NDF-RT APIs by describing the semantics of their input and output. A set of rules was developed to define the interoperable conditions for functional composition. The operational definition of interoperability between function pairs is established by executing the rules on the ontology. We illustrate that the functional composition model supports common use cases, including checking interactions for RxNorm drugs and deploying allergy lists defined in reference to drug properties in NDF-RT. This model supports the RxMix application (http://mor.nlm.nih.gov/RxMix/), an application we developed for enabling complex queries to the RxNorm and NDF-RT APIs.
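The idea of composing functions by matching the semantics of their inputs and outputs can be sketched as below. This is only an illustration under a simplifying assumption: two functions are treated as composable when the output type of the first equals the input type of the second. The function names, API labels and type labels are hypothetical and do not correspond to actual RxNorm or NDF-RT API calls, and the paper's rules may be richer than a plain equality check.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ApiFunction:
    name: str          # hypothetical function name
    api: str           # hypothetical API label
    input_type: str    # semantic type of the input
    output_type: str   # semantic type of the output

def composable(first, second):
    """Assumed rule: the output of `first` can feed the input of `second`."""
    return first.output_type == second.input_type

def chains(functions):
    """List every ordered pair of distinct functions that can be composed."""
    return [
        (f.name, g.name)
        for f in functions
        for g in functions
        if f is not g and composable(f, g)
    ]

functions = [
    ApiFunction("find_id", "API-1", "DrugName", "DrugId"),
    ApiFunction("get_interactions", "API-2", "DrugId", "InteractionList"),
    ApiFunction("get_properties", "API-2", "DrugId", "PropertyList"),
]

print(chains(functions))
```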
Developing an ontological explosion knowledge base for business continuity planning purposes.
Mohammadfam, Iraj; Kalatpour, Omid; Golmohammadi, Rostam; Khotanlou, Hasan
2013-01-01
Industrial accidents are among the most known challenges to business continuity. Many organisations have lost their reputation following devastating accidents. To manage the risks of such accidents, it is necessary to accumulate sufficient knowledge regarding their roots, causes and preventive techniques. The required knowledge might be obtained through various approaches, including databases. Unfortunately, many databases are hampered by (among other things) static data presentations, a lack of semantic features, and the inability to present accident knowledge as discrete domains. This paper proposes the use of Protégé software to develop a knowledge base for the domain of explosion accidents. Such a structure has a higher capability to improve information retrieval compared with common accident databases. To accomplish this goal, a knowledge management process model was followed. The ontological explosion knowledge base (EKB) was built for further applications, including process accident knowledge retrieval and risk management. The paper will show how the EKB has a semantic feature that enables users to overcome some of the search constraints of existing accident databases.
Pafilis, Evangelos; Frankild, Sune P; Schnetzer, Julia; Fanini, Lucia; Faulwetter, Sarah; Pavloudi, Christina; Vasileiadou, Katerina; Leary, Patrick; Hammock, Jennifer; Schulz, Katja; Parr, Cynthia Sims; Arvanitidis, Christos; Jensen, Lars Juhl
2015-06-01
The association of organisms to their environments is a key issue in exploring biodiversity patterns. This knowledge has traditionally been scattered, but textual descriptions of taxa and their habitats are now being consolidated in centralized resources. However, structured annotations are needed to facilitate large-scale analyses. Therefore, we developed ENVIRONMENTS, a fast dictionary-based tagger capable of identifying Environment Ontology (ENVO) terms in text. We evaluate the accuracy of the tagger on a new manually curated corpus of 600 Encyclopedia of Life (EOL) species pages. We use the tagger to associate taxa with environments by tagging EOL text content monthly, and integrate the results into the EOL to disseminate them to a broad audience of users. The software and the corpus are available under the open-source BSD and the CC-BY-NC-SA 3.0 licenses, respectively, at http://environments.hcmr.gr. © The Author 2015. Published by Oxford University Press.
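A dictionary-based tagger of the kind described above can be sketched in a few lines. This is a toy illustration and not the ENVIRONMENTS implementation: the dictionary entries are hypothetical, the identifiers are placeholders and not real ENVO accessions, and matching is a plain case-insensitive whole-word search that prefers longer names.

```python
import re

# Hypothetical name-to-identifier dictionary with placeholder identifiers.
DICTIONARY = {
    "lake": "ENVO:PLACEHOLDER_LAKE",
    "coral reef": "ENVO:PLACEHOLDER_CORAL_REEF",
    "forest": "ENVO:PLACEHOLDER_FOREST",
}

def tag(text, dictionary=DICTIONARY):
    """Return (start, end, identifier) for every dictionary name found in text."""
    # Longest names first, so multi-word names win over shorter overlaps.
    names = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.end(), dictionary[m.group(1).lower()])
        for m in pattern.finditer(text)
    ]

text = "Found near a coral reef and in a forest lake."
hits = tag(text)
for start, end, identifier in hits:
    print(text[start:end], identifier)
```

This sketch does not handle plurals or spelling variants; a production tagger would need an expanded dictionary or normalization step for those.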
CASAS: A tool for composing automatically and semantically astrophysical services
NASA Astrophysics Data System (ADS)
Louge, T.; Karray, M. H.; Archimède, B.; Knödlseder, J.
2017-07-01
Multiple astronomical datasets are available through the internet and the astrophysical Distributed Computing Infrastructure (DCI) called the Virtual Observatory (VO). Some scientific workflow technologies exist for retrieving and combining data from those sources. However, the selection of relevant services, the automation of workflow composition and the lack of user-friendly platforms remain concerns. This paper presents CASAS, a tool for semantic web services composition in astrophysics. This tool proposes automatic composition of astrophysical web services and brings a semantics-based, automatic composition of workflows. It widens the choice of services and eases the use of heterogeneous services. Semantic web services composition relies on ontologies for elaborating the services composition; this work is based on Astrophysical Services ONtology (ASON). ASON had its structure mostly inherited from the VO services capacities. Nevertheless, our approach is not limited to the VO and brings VO plus non-VO services together without the need for premade recipes. CASAS is available for use through a simple web interface.
An ontology-based search engine for protein-protein interactions
2010-01-01
Background: Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. Results: We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Conclusion: Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology. PMID:20122195
Antanaviciute, Agne; Watson, Christopher M; Harrison, Sally M; Lascelles, Carolina; Crinnion, Laura; Markham, Alexander F; Bonthron, David T; Carr, Ian M
2015-12-01
Exome sequencing has become a de facto standard method for Mendelian disease gene discovery in recent years, yet identifying disease-causing mutations among thousands of candidate variants remains a non-trivial task. Here we describe a new variant prioritization tool, OVA (ontology variant analysis), in which user-provided phenotypic information is exploited to infer deeper biological context. OVA combines a knowledge-based approach with a variant-filtering framework. It reduces the number of candidate variants by considering genotype and predicted effect on protein sequence, and scores the remainder on biological relevance to the query phenotype. We take advantage of several ontologies in order to bridge knowledge across multiple biomedical domains and facilitate computational analysis of annotations pertaining to genes, diseases, phenotypes, tissues and pathways. In this way, OVA combines information regarding molecular and physical phenotypes and integrates both human and model organism data to effectively prioritize variants. By assessing performance on both known and novel disease mutations, we show that OVA performs biologically meaningful candidate variant prioritization and can be more accurate than another recently published candidate variant prioritization tool. OVA is freely accessible at http://dna2.leeds.ac.uk:8080/OVA/index.jsp. Supplementary data are available at Bioinformatics online. Contact: umaan@leeds.ac.uk. © The Author 2015. Published by Oxford University Press.
An ontology-based search engine for protein-protein interactions.
Park, Byungkyu; Han, Kyungsook
2010-01-18
Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology.
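The abstract does not spell out how the modified Gödel numbers are built, so the following is only a sketch of the general idea under an assumed encoding: each ontology term gets a distinct prime, and a protein is encoded as the product of the primes of its annotated terms and all their ancestors. A search condition is then satisfied exactly when its prime divides the protein's code, which is how a more specific annotation can match a more general query term. The toy ontology, annotations and interactions are hypothetical.

```python
from math import prod

# Toy ontology: each term lists its direct parents (hypothetical data).
PARENTS = {
    "root": [],
    "binding": ["root"],
    "protein binding": ["binding"],
    "catalytic activity": ["root"],
}
# Assumed encoding: one distinct prime per term.
PRIMES = dict(zip(PARENTS, [2, 3, 5, 7]))

def ancestors_and_self(term):
    """Collect a term together with all of its ancestors."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen

def protein_code(terms):
    """Product of the primes of every annotated term and its ancestors."""
    covered = set().union(*(ancestors_and_self(t) for t in terms))
    return prod(PRIMES[t] for t in covered)

def satisfies(code, condition):
    """A condition holds when its prime is a factor of the code."""
    return code % PRIMES[condition] == 0

# Hypothetical annotations and interaction list.
ANNOTATIONS = {"P1": ["protein binding"], "P2": ["catalytic activity"]}
INTERACTIONS = {"Q": ["P1", "P2"]}

def partners(query, condition):
    """Interaction partners of `query` whose code satisfies `condition`."""
    return [
        p for p in INTERACTIONS[query]
        if satisfies(protein_code(ANNOTATIONS[p]), condition)
    ]

print(partners("Q", "binding"))
```

Here P1 is annotated only with "protein binding", yet it is returned for the broader condition "binding" because the ancestor's prime is folded into its code.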
GoGene: gene annotation in the fast lane.
Plake, Conrad; Royer, Loic; Winnenburg, Rainer; Hakenberg, Jörg; Schroeder, Michael
2009-07-01
High-throughput screens such as microarrays and RNAi screens produce huge amounts of data. They typically result in hundreds of genes, which are often further explored and clustered via enriched GeneOntology terms. The strength of such analyses is that they build on high-quality manual annotations provided with the GeneOntology. However, the weakness is that annotations are restricted to process, function and location and that they do not cover all known genes in model organisms. GoGene addresses this weakness by complementing high-quality manual annotation with high-throughput text mining extracting co-occurrences of genes and ontology terms from literature. GoGene contains over 4,000,000 associations between genes and gene-related terms for 10 model organisms extracted from more than 18,000,000 PubMed entries. It covers not only process, function and location of genes, but also biomedical categories such as diseases, compounds, techniques and mutations. By bringing it all together, GoGene provides the most recent and most complete facts about genes and can rank them according to novelty and importance. GoGene accepts keywords, gene lists, gene sequences and protein sequences as input and supports search for genes in PubMed, EntrezGene and via BLAST. Since all associations of genes to terms are supported by evidence in the literature, the results are transparent and can be verified by the user. GoGene is available at http://gopubmed.org/gogene.
FROG - Fingerprinting Genomic Variation Ontology
Bhardwaj, Anshu
2015-01-01
Genetic variations play a crucial role in differential phenotypic outcomes. Given the complexity in establishing this correlation and the enormous data available today, it is imperative to design machine-readable, efficient methods to store, label, search and analyze this data. A semantic approach, FROG: “FingeRprinting Ontology of Genomic variations”, is implemented to label variation data, based on its location, function and interactions. FROG has six levels to describe the variation annotation, namely, chromosome, DNA, RNA, protein, variations and interactions. Each level is a conceptual aggregation of logically connected attributes, each of which comprises various properties for the variant. For example, at the chromosome level, one of the attributes is location of variation, which has two properties, allosomes or autosomes. Another attribute is variation kind, which has four properties, namely indel, deletion, insertion and substitution. Likewise, there are 48 attributes and 278 properties to capture the variation annotation across six levels. Each property is then assigned a bit score, which in turn leads to generation of a binary fingerprint based on the combination of these properties (mostly taken from existing variation ontologies). FROG is a novel and unique method designed for the purpose of labeling the entire variation data generated to date for efficient storage, search and analysis. A web-based platform is designed as a test case for users to navigate sample datasets and generate fingerprints. The platform is available at http://ab-openlab.csir.res.in/frog. PMID:26244889
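The property-to-bit mapping described above can be illustrated with a small sketch. It uses only the two attributes and six properties named in the abstract; the bit order, the dictionary-based variant representation and the helper names are assumptions for illustration, and the real scheme spans 48 attributes and 278 properties.

```python
# One bit per (attribute, property) pair, in an assumed fixed order.
PROPERTIES = [
    ("location", "allosomes"),
    ("location", "autosomes"),
    ("variation kind", "indel"),
    ("variation kind", "deletion"),
    ("variation kind", "insertion"),
    ("variation kind", "substitution"),
]

def fingerprint(variant):
    """Set a bit to 1 when the variant has that property for that attribute."""
    return "".join(
        "1" if variant.get(attr) == prop else "0"
        for attr, prop in PROPERTIES
    )

def shared_bits(fp_a, fp_b):
    """Count the properties that two fingerprints have in common."""
    return sum(a == b == "1" for a, b in zip(fp_a, fp_b))

# Hypothetical variants described by their attribute values.
snv = fingerprint({"location": "autosomes", "variation kind": "substitution"})
deletion = fingerprint({"location": "autosomes", "variation kind": "deletion"})

print(snv, deletion, shared_bits(snv, deletion))
```

Fixed-length bit strings of this kind can be compared with simple bitwise operations, which is one reason fingerprints suit storage and search.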
NASA Astrophysics Data System (ADS)
Strzelecki, M.; Iwaniak, A.; Łukowicz, J.; Kaczmarek, I.
2013-10-01
Nowadays, spatial information is used not only by professionals, but also by ordinary citizens, who use it for their daily activities. The Open Data initiative states that data should be freely and unreservedly available to all users. This also applies to spatial data. As spatial data becomes widely available, it is essential to publish it in a form that guarantees the possibility of integrating it with other, heterogeneous data sources. Interoperability is the possibility to combine spatial data sets from different sources in a consistent way as well as to provide access to them. Providing syntactic interoperability based on well-known data formats is relatively simple, unlike providing semantic interoperability, because the data can be interpreted in multiple ways. One of the issues connected with the problem of achieving interoperability is data harmonization. It is a process of providing access to spatial data in a representation that allows combining it with other harmonized data in a coherent way by using a common set of data product specifications. Spatial data harmonization is performed by defining reclassification and transformation rules (a mapping schema) for the source application schema. Creating those rules is a very demanding task which requires wide domain knowledge and a detailed look into application schemas. The paper focuses on proposing methods for supporting the data harmonization process through automated or supervised creation of mapping schemas with the use of ontologies, ontology matching methods and Semantic Web technologies.