EIIS: An Educational Information Intelligent Search Engine Supported by Semantic Services
ERIC Educational Resources Information Center
Huang, Chang-Qin; Duan, Ru-Lin; Tang, Yong; Zhu, Zhi-Ting; Yan, Yong-Jian; Guo, Yu-Qing
2011-01-01
The semantic web brings a new opportunity for efficient information organization and search. To meet the special requirements of the educational field, this paper proposes an intelligent search engine enabled by educational semantic support service, where three kinds of searches are integrated into Educational Information Intelligent Search (EIIS)…
Semantic interpretation of search engine resultant
NASA Astrophysics Data System (ADS)
Nasution, M. K. M.
2018-01-01
In semantic, logical language can be interpreted in various forms, but the certainty of meaning is included in the uncertainty, which directly always influences the role of technology. One results of this uncertainty applies to search engines as user interfaces with information spaces such as the Web. Therefore, the behaviour of search engine results should be interpreted with certainty through semantic formulation as interpretation. Behaviour formulation shows there are various interpretations that can be done semantically either temporary, inclusion, or repeat.
Dao, Tien Tuan; Hoang, Tuan Nha; Ta, Xuan Hien; Tho, Marie Christine Ho Ba
2013-02-01
Human musculoskeletal system resources of the human body are valuable for the learning and medical purposes. Internet-based information from conventional search engines such as Google or Yahoo cannot response to the need of useful, accurate, reliable and good-quality human musculoskeletal resources related to medical processes, pathological knowledge and practical expertise. In this present work, an advanced knowledge-based personalized search engine was developed. Our search engine was based on a client-server multi-layer multi-agent architecture and the principle of semantic web services to acquire dynamically accurate and reliable HMSR information by a semantic processing and visualization approach. A security-enhanced mechanism was applied to protect the medical information. A multi-agent crawler was implemented to develop a content-based database of HMSR information. A new semantic-based PageRank score with related mathematical formulas were also defined and implemented. As the results, semantic web service descriptions were presented in OWL, WSDL and OWL-S formats. Operational scenarios with related web-based interfaces for personal computers and mobile devices were presented and analyzed. Functional comparison between our knowledge-based search engine, a conventional search engine and a semantic search engine showed the originality and the robustness of our knowledge-based personalized search engine. In fact, our knowledge-based personalized search engine allows different users such as orthopedic patient and experts or healthcare system managers or medical students to access remotely into useful, accurate, reliable and good-quality HMSR information for their learning and medical purposes. Copyright © 2012 Elsevier Inc. All rights reserved.
Noesis: Ontology based Scoped Search Engine and Resource Aggregator for Atmospheric Science
NASA Astrophysics Data System (ADS)
Ramachandran, R.; Movva, S.; Li, X.; Cherukuri, P.; Graves, S.
2006-12-01
The goal for search engines is to return results that are both accurate and complete. The search engines should find only what you really want and find everything you really want. Search engines (even meta search engines) lack semantics. The basis for search is simply based on string matching between the user's query term and the resource database and the semantics associated with the search string is not captured. For example, if an atmospheric scientist is searching for "pressure" related web resources, most search engines return inaccurate results such as web resources related to blood pressure. In this presentation Noesis, which is a meta-search engine and a resource aggregator that uses domain ontologies to provide scoped search capabilities will be described. Noesis uses domain ontologies to help the user scope the search query to ensure that the search results are both accurate and complete. The domain ontologies guide the user to refine their search query and thereby reduce the user's burden of experimenting with different search strings. Semantics are captured by refining the query terms to cover synonyms, specializations, generalizations and related concepts. Noesis also serves as a resource aggregator. It categorizes the search results from different online resources such as education materials, publications, datasets, web search engines that might be of interest to the user.
Semantic Clustering of Search Engine Results
Soliman, Sara Saad; El-Sayed, Maged F.; Hassan, Yasser F.
2015-01-01
This paper presents a novel approach for search engine results clustering that relies on the semantics of the retrieved documents rather than the terms in those documents. The proposed approach takes into consideration both lexical and semantics similarities among documents and applies activation spreading technique in order to generate semantically meaningful clusters. This approach allows documents that are semantically similar to be clustered together rather than clustering documents based on similar terms. A prototype is implemented and several experiments are conducted to test the prospered solution. The result of the experiment confirmed that the proposed solution achieves remarkable results in terms of precision. PMID:26933673
A unified architecture for biomedical search engines based on semantic web technologies.
Jalali, Vahid; Matash Borujerdi, Mohammad Reza
2011-04-01
There is a huge growth in the volume of published biomedical research in recent years. Many medical search engines are designed and developed to address the over growing information needs of biomedical experts and curators. Significant progress has been made in utilizing the knowledge embedded in medical ontologies and controlled vocabularies to assist these engines. However, the lack of common architecture for utilized ontologies and overall retrieval process, hampers evaluating different search engines and interoperability between them under unified conditions. In this paper, a unified architecture for medical search engines is introduced. Proposed model contains standard schemas declared in semantic web languages for ontologies and documents used by search engines. Unified models for annotation and retrieval processes are other parts of introduced architecture. A sample search engine is also designed and implemented based on the proposed architecture in this paper. The search engine is evaluated using two test collections and results are reported in terms of precision vs. recall and mean average precision for different approaches used by this search engine.
Alor-Hernández, Giner; Pérez-Gallardo, Yuliana; Posada-Gómez, Rubén; Cortes-Robles, Guillermo; Rodríguez-González, Alejandro; Aguilar-Laserre, Alberto A
2012-09-01
Nowadays, traditional search engines such as Google, Yahoo and Bing facilitate the retrieval of information in the format of images, but the results are not always useful for the users. This is mainly due to two problems: (1) the semantic keywords are not taken into consideration and (2) it is not always possible to establish a query using the image features. This issue has been covered in different domains in order to develop content-based image retrieval (CBIR) systems. The expert community has focussed their attention on the healthcare domain, where a lot of visual information for medical analysis is available. This paper provides a solution called iPixel Visual Search Engine, which involves semantics and content issues in order to search for digitized mammograms. iPixel offers the possibility of retrieving mammogram features using collective intelligence and implementing a CBIR algorithm. Our proposal compares not only features with similar semantic meaning, but also visual features. In this sense, the comparisons are made in different ways: by the number of regions per image, by maximum and minimum size of regions per image and by average intensity level of each region. iPixel Visual Search Engine supports the medical community in differential diagnoses related to the diseases of the breast. The iPixel Visual Search Engine has been validated by experts in the healthcare domain, such as radiologists, in addition to experts in digital image analysis.
ERIC Educational Resources Information Center
Fast, Karl V.; Campbell, D. Grant
2001-01-01
Compares the implied ontological frameworks of the Open Archives Initiative Protocol for Metadata Harvesting and the World Wide Web Consortium's Semantic Web. Discusses current search engine technology, semantic markup, indexing principles of special libraries and online databases, and componentization and the distinction between data and…
NASA Astrophysics Data System (ADS)
Piasecki, M.; Beran, B.
2007-12-01
Search engines have changed the way we see the Internet. The ability to find the information by just typing in keywords was a big contribution to the overall web experience. While the conventional search engine methodology worked well for textual documents, locating scientific data remains a problem since they are stored in databases not readily accessible by search engine bots. Considering different temporal, spatial and thematic coverage of different databases, especially for interdisciplinary research it is typically necessary to work with multiple data sources. These sources can be federal agencies which generally offer national coverage or regional sources which cover a smaller area with higher detail. However for a given geographic area of interest there often exists more than one database with relevant data. Thus being able to query multiple databases simultaneously is a desirable feature that would be tremendously useful for scientists. Development of such a search engine requires dealing with various heterogeneity issues. In scientific databases, systems often impose controlled vocabularies which ensure that they are generally homogeneous within themselves but are semantically heterogeneous when moving between different databases. This defines the boundaries of possible semantic related problems making it easier to solve than with the conventional search engines that deal with free text. We have developed a search engine that enables querying multiple data sources simultaneously and returns data in a standardized output despite the aforementioned heterogeneity issues between the underlying systems. This application relies mainly on metadata catalogs or indexing databases, ontologies and webservices with virtual globe and AJAX technologies for the graphical user interface. Users can trigger a search of dozens of different parameters over hundreds of thousands of stations from multiple agencies by providing a keyword, a spatial extent, i.e. a bounding box, and a temporal bracket. As part of this development we have also added an environment that allows users to do some of the semantic tagging, i.e. the linkage of a variable name (which can be anything they desire) to defined concepts in the ontology structure which in turn provides the backbone of the search engine.
The Impact of Subject Indexes on Semantic Indeterminacy in Enterprise Document Retrieval
ERIC Educational Resources Information Center
Schymik, Gregory
2012-01-01
Ample evidence exists to support the conclusion that enterprise search is failing its users. This failure is costing corporate America billions of dollars every year. Most enterprise search engines are built using web search engines as their foundations. These search engines are optimized for web use and are inadequate when used inside the…
Development of Health Information Search Engine Based on Metadata and Ontology
Song, Tae-Min; Jin, Dal-Lae
2014-01-01
Objectives The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Methods Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. Results A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Conclusions Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers. PMID:24872907
Development of health information search engine based on metadata and ontology.
Song, Tae-Min; Park, Hyeoun-Ae; Jin, Dal-Lae
2014-04-01
The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers.
Semantic similarity measure in biomedical domain leverage web search engine.
Chen, Chi-Huang; Hsieh, Sheau-Ling; Weng, Yung-Ching; Chang, Wen-Yung; Lai, Feipei
2010-01-01
Semantic similarity measure plays an essential role in Information Retrieval and Natural Language Processing. In this paper we propose a page-count-based semantic similarity measure and apply it in biomedical domains. Previous researches in semantic web related applications have deployed various semantic similarity measures. Despite the usefulness of the measurements in those applications, measuring semantic similarity between two terms remains a challenge task. The proposed method exploits page counts returned by the Web Search Engine. We define various similarity scores for two given terms P and Q, using the page counts for querying P, Q and P AND Q. Moreover, we propose a novel approach to compute semantic similarity using lexico-syntactic patterns with page counts. These different similarity scores are integrated adapting support vector machines, to leverage the robustness of semantic similarity measures. Experimental results on two datasets achieve correlation coefficients of 0.798 on the dataset provided by A. Hliaoutakis, 0.705 on the dataset provide by T. Pedersen with physician scores and 0.496 on the dataset provided by T. Pedersen et al. with expert scores.
BioSearch: a semantic search engine for Bio2RDF
Qiu, Honglei; Huang, Jiacheng
2017-01-01
Abstract Biomedical data are growing at an incredible pace and require substantial expertise to organize data in a manner that makes them easily findable, accessible, interoperable and reusable. Massive effort has been devoted to using Semantic Web standards and technologies to create a network of Linked Data for the life sciences, among others. However, while these data are accessible through programmatic means, effective user interfaces for non-experts to SPARQL endpoints are few and far between. Contributing to user frustrations is that data are not necessarily described using common vocabularies, thereby making it difficult to aggregate results, especially when distributed across multiple SPARQL endpoints. We propose BioSearch — a semantic search engine that uses ontologies to enhance federated query construction and organize search results. BioSearch also features a simplified query interface that allows users to optionally filter their keywords according to classes, properties and datasets. User evaluation demonstrated that BioSearch is more effective and usable than two state of the art search and browsing solutions. Database URL: http://ws.nju.edu.cn/biosearch/ PMID:29220451
Social Networking on the Semantic Web
ERIC Educational Resources Information Center
Finin, Tim; Ding, Li; Zhou, Lina; Joshi, Anupam
2005-01-01
Purpose: Aims to investigate the way that the semantic web is being used to represent and process social network information. Design/methodology/approach: The Swoogle semantic web search engine was used to construct several large data sets of Resource Description Framework (RDF) documents with social network information that were encoded using the…
A novel architecture for information retrieval system based on semantic web
NASA Astrophysics Data System (ADS)
Zhang, Hui
2011-12-01
Nowadays, the web has enabled an explosive growth of information sharing (there are currently over 4 billion pages covering most areas of human endeavor) so that the web has faced a new challenge of information overhead. The challenge that is now before us is not only to help people locating relevant information precisely but also to access and aggregate a variety of information from different resources automatically. Current web document are in human-oriented formats and they are suitable for the presentation, but machines cannot understand the meaning of document. To address this issue, Berners-Lee proposed a concept of semantic web. With semantic web technology, web information can be understood and processed by machine. It provides new possibilities for automatic web information processing. A main problem of semantic web information retrieval is that when these is not enough knowledge to such information retrieval system, the system will return to a large of no sense result to uses due to a huge amount of information results. In this paper, we present the architecture of information based on semantic web. In addiction, our systems employ the inference Engine to check whether the query should pose to Keyword-based Search Engine or should pose to the Semantic Search Engine.
Chen, Xi; Chen, Huajun; Bi, Xuan; Gu, Peiqin; Chen, Jiaoyan; Wu, Zhaohui
2014-01-01
Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM.
Chen, Xi; Chen, Huajun; Bi, Xuan; Gu, Peiqin; Chen, Jiaoyan; Wu, Zhaohui
2014-01-01
Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM. PMID:24772189
GoWeb: a semantic search engine for the life science web.
Dietze, Heiko; Schroeder, Michael
2009-10-01
Current search engines are keyword-based. Semantic technologies promise a next generation of semantic search engines, which will be able to answer questions. Current approaches either apply natural language processing to unstructured text or they assume the existence of structured statements over which they can reason. Here, we introduce a third approach, GoWeb, which combines classical keyword-based Web search with text-mining and ontologies to navigate large results sets and facilitate question answering. We evaluate GoWeb on three benchmarks of questions on genes and functions, on symptoms and diseases, and on proteins and diseases. The first benchmark is based on the BioCreAtivE 1 Task 2 and links 457 gene names with 1352 functions. GoWeb finds 58% of the functional GeneOntology annotations. The second benchmark is based on 26 case reports and links symptoms with diseases. GoWeb achieves 77% success rate improving an existing approach by nearly 20%. The third benchmark is based on 28 questions in the TREC genomics challenge and links proteins to diseases. GoWeb achieves a success rate of 79%. GoWeb's combination of classical Web search with text-mining and ontologies is a first step towards answering questions in the biomedical domain. GoWeb is online at: http://www.gopubmed.org/goweb.
IntegromeDB: an integrated system and biological search engine.
Baitaluk, Michael; Kozhenkov, Sergey; Dubinina, Yulia; Ponomarenko, Julia
2012-01-19
With the growth of biological data in volume and heterogeneity, web search engines become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Here, we present an approach at developing a biological web search engine based on the Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback.
GOOSE: semantic search on internet connected sensors
NASA Astrophysics Data System (ADS)
Schutte, Klamer; Bomhof, Freek; Burghouts, Gertjan; van Diggelen, Jurriaan; Hiemstra, Peter; van't Hof, Jaap; Kraaij, Wessel; Pasman, Huib; Smith, Arthur; Versloot, Corne; de Wit, Joost
2013-05-01
More and more sensors are getting Internet connected. Examples are cameras on cell phones, CCTV cameras for traffic control as well as dedicated security and defense sensor systems. Due to the steadily increasing data volume, human exploitation of all this sensor data is impossible for effective mission execution. Smart access to all sensor data acts as enabler for questions such as "Is there a person behind this building" or "Alert me when a vehicle approaches". The GOOSE concept has the ambition to provide the capability to search semantically for any relevant information within "all" (including imaging) sensor streams in the entire Internet of sensors. This is similar to the capability provided by presently available Internet search engines which enable the retrieval of information on "all" web pages on the Internet. In line with current Internet search engines any indexing services shall be utilized cross-domain. The two main challenge for GOOSE is the Semantic Gap and Scalability. The GOOSE architecture consists of five elements: (1) an online extraction of primitives on each sensor stream; (2) an indexing and search mechanism for these primitives; (3) a ontology based semantic matching module; (4) a top-down hypothesis verification mechanism and (5) a controlling man-machine interface. This paper reports on the initial GOOSE demonstrator, which consists of the MES multimedia analysis platform and the CORTEX action recognition module. It also provides an outlook into future GOOSE development.
Web information retrieval based on ontology
NASA Astrophysics Data System (ADS)
Zhang, Jian
2013-03-01
The purpose of the Information Retrieval (IR) is to find a set of documents that are relevant for a specific information need of a user. Traditional Information Retrieval model commonly used in commercial search engine is based on keyword indexing system and Boolean logic queries. One big drawback of traditional information retrieval is that they typically retrieve information without an explicitly defined domain of interest to the users so that a lot of no relevance information returns to users, which burden the user to pick up useful answer from these no relevance results. In order to tackle this issue, many semantic web information retrieval models have been proposed recently. The main advantage of Semantic Web is to enhance search mechanisms with the use of Ontology's mechanisms. In this paper, we present our approach to personalize web search engine based on ontology. In addition, key techniques are also discussed in our paper. Compared to previous research, our works concentrate on the semantic similarity and the whole process including query submission and information annotation.
Hanauer, David A; Wu, Danny T Y; Yang, Lei; Mei, Qiaozhu; Murkowski-Steffy, Katherine B; Vydiswaran, V G Vinod; Zheng, Kai
2017-03-01
The utility of biomedical information retrieval environments can be severely limited when users lack expertise in constructing effective search queries. To address this issue, we developed a computer-based query recommendation algorithm that suggests semantically interchangeable terms based on an initial user-entered query. In this study, we assessed the value of this approach, which has broad applicability in biomedical information retrieval, by demonstrating its application as part of a search engine that facilitates retrieval of information from electronic health records (EHRs). The query recommendation algorithm utilizes MetaMap to identify medical concepts from search queries and indexed EHR documents. Synonym variants from UMLS are used to expand the concepts along with a synonym set curated from historical EHR search logs. The empirical study involved 33 clinicians and staff who evaluated the system through a set of simulated EHR search tasks. User acceptance was assessed using the widely used technology acceptance model. The search engine's performance was rated consistently higher with the query recommendation feature turned on vs. off. The relevance of computer-recommended search terms was also rated high, and in most cases the participants had not thought of these terms on their own. The questions on perceived usefulness and perceived ease of use received overwhelmingly positive responses. A vast majority of the participants wanted the query recommendation feature to be available to assist in their day-to-day EHR search tasks. Challenges persist for users to construct effective search queries when retrieving information from biomedical documents including those from EHRs. This study demonstrates that semantically-based query recommendation is a viable solution to addressing this challenge. Published by Elsevier Inc.
IntegromeDB: an integrated system and biological search engine
2012-01-01
Background With the growth of biological data in volume and heterogeneity, web search engines become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Description Here, we present an approach at developing a biological web search engine based on the Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. Conclusions The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback. PMID:22260095
EquiX-A Search and Query Language for XML.
ERIC Educational Resources Information Center
Cohen, Sara; Kanza, Yaron; Kogan, Yakov; Sagiv, Yehoshua; Nutt, Werner; Serebrenik, Alexander
2002-01-01
Describes EquiX, a search language for XML that combines querying with searching to query the data and the meta-data content of Web pages. Topics include search engines; a data model for XML documents; search query syntax; search query semantics; an algorithm for evaluating a query on a document; and indexing EquiX queries. (LRW)
Semantics-Based Intelligent Indexing and Retrieval of Digital Images - A Case Study
NASA Astrophysics Data System (ADS)
Osman, Taha; Thakker, Dhavalkumar; Schaefer, Gerald
The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches as they typically rely on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this chapter we present a semantically enabled image annotation and retrieval engine that is designed to satisfy the requirements of commercial image collections market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matchmaking the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries as well as presenting our initial thoughts on exploiting lexical databases for explicit semantic-based query expansion.
GeneView: a comprehensive semantic search engine for PubMed.
Thomas, Philippe; Starlinger, Johannes; Vowinkel, Alexander; Arzt, Sebastian; Leser, Ulf
2012-07-01
Research results are primarily published in scientific literature and curation efforts cannot keep up with the rapid growth of published literature. The plethora of knowledge remains hidden in large text repositories like MEDLINE. Consequently, life scientists have to spend a great amount of time searching for specific information. The enormous ambiguity among most names of biomedical objects such as genes, chemicals and diseases often produces too large and unspecific search results. We present GeneView, a semantic search engine for biomedical knowledge. GeneView is built upon a comprehensively annotated version of PubMed abstracts and openly available PubMed Central full texts. This semi-structured representation of biomedical texts enables a number of features extending classical search engines. For instance, users may search for entities using unique database identifiers or they may rank documents by the number of specific mentions they contain. Annotation is performed by a multitude of state-of-the-art text-mining tools for recognizing mentions from 10 entity classes and for identifying protein-protein interactions. GeneView currently contains annotations for >194 million entities from 10 classes for ∼21 million citations with 271,000 full text bodies. GeneView can be searched at http://bc3.informatik.hu-berlin.de/.
Developing a Domain Ontology: the Case of Water Cycle and Hydrology
NASA Astrophysics Data System (ADS)
Gupta, H.; Pozzi, W.; Piasecki, M.; Imam, B.; Houser, P.; Raskin, R.; Ramachandran, R.; Martinez Baquero, G.
2008-12-01
A semantic web ontology enables semantic data integration and semantic smart searching. Several organizations have attempted to implement smart registration and integration or searching using ontologies. These are the NOESIS (NSF project: LEAD) and HydroSeek (NSF project: CUAHS HIS) data discovery engines and the NSF project GEON. All three applications use ontologies to discover data from multiple sources and projects. The NASA WaterNet project was established to identify creative, innovative ways to bridge NASA research results to real world applications, linking decision support needs to available data, observations, and modeling capability. WaterNet (NASA project) utilized the smart query tool Noesis as a testbed to test whether different ontologies (and different catalog searches) could be combined to match resources with user needs. NOESIS contains the upper level SWEET ontology that accepts plug in domain ontologies to refine user search queries, reducing the burden of multiple keyword searches. Another smart search interface was that developed for CUAHSI, HydroSeek, that uses a multi-layered concept search ontology, tagging variables names from any number of data sources to specific leaf and higher level concepts on which the search is executed. This approach has proven to be quite successful in mitigating semantic heterogeneity as the user does not need to know the semantic specifics of each data source system but just uses a set of common keywords to discover the data for a specific temporal and geospatial domain. This presentation will show tests with Noesis and Hydroseek lead to the conclusion that the construction of a complex, and highly heterogeneous water cycle ontology requires multiple ontology modules. To illustrate the complexity and heterogeneity of a water cycle ontology, Hydroseek successfully utilizes WaterOneFlow to integrate data across multiple different data collections, such as USGS NWIS. However,different methodologies are employed by the Earth Science, the Hydrological, and Hydraulic Engineering Communities, and each community employs models that require different input data. If a sub-domain ontology is created for each of these,describing water balance calculations, then the resulting structure of the semantic network describing these various terms can be rather complex, heterogeneous, and overlapping, and will require "mapping" between equivalent terms in the ontologies, along with the development of an upper level conceptual or domain ontology to utilize and link to those already in existence.
Developing A Web-based User Interface for Semantic Information Retrieval
NASA Technical Reports Server (NTRS)
Berrios, Daniel C.; Keller, Richard M.
2003-01-01
While there are now a number of languages and frameworks that enable computer-based systems to search stored data semantically, the optimal design for effective user interfaces for such systems is still uncle ar. Such interfaces should mask unnecessary query detail from users, yet still allow them to build queries of arbitrary complexity without significant restrictions. We developed a user interface supporting s emantic query generation for Semanticorganizer, a tool used by scient ists and engineers at NASA to construct networks of knowledge and dat a. Through this interface users can select node types, node attribute s and node links to build ad-hoc semantic queries for searching the S emanticOrganizer network.
NASA Astrophysics Data System (ADS)
Murtazina, M. Sh; Avdeenko, T. V.
2018-05-01
The state of art and the progress in application of semantic technologies in the field of scientific and research activity have been analyzed. Even elementary empirical comparison has shown that the semantic search engines are superior in all respects to conventional search technologies. However, semantic information technologies are insufficiently used in the field of scientific and research activity in Russia. In present paper an approach to construction of ontological model of knowledge base is proposed. The ontological model is based on the upper-level ontology and the RDF mechanism for linking several domain ontologies. The ontological model is implemented in the Protégé environment.
Semantically optiMize the dAta seRvice operaTion (SMART) system for better data discovery and access
NASA Astrophysics Data System (ADS)
Yang, C.; Huang, T.; Armstrong, E. M.; Moroni, D. F.; Liu, K.; Gui, Z.
2013-12-01
Abstract: We present a Semantically optiMize the dAta seRvice operaTion (SMART) system for better data discovery and access across the NASA data systems, Global Earth Observation System of Systems (GEOSS) Clearinghouse and Data.gov to facilitate scientists to select Earth observation data that fit better their needs in four aspects: 1. Integrating and interfacing the SMART system to include the functionality of a) semantic reasoning based on Jena, an open source semantic reasoning engine, b) semantic similarity calculation, c) recommendation based on spatiotemporal, semantic, and user workflow patterns, and d) ranking results based on similarity between search terms and data ontology. 2. Collaborating with data user communities to a) capture science data ontology and record relevant ontology triple stores, b) analyze and mine user search and download patterns, c) integrate SMART into metadata-centric discovery system for community-wide usage and feedback, and d) customizing data discovery, search and access user interface to include the ranked results, recommendation components, and semantic based navigations. 3. Laying the groundwork to interface the SMART system with other data search and discovery systems as an open source data search and discovery solution. The SMART systems leverages NASA, GEO, FGDC data discovery, search and access for the Earth science community by enabling scientists to readily discover and access data appropriate to their endeavors, increasing the efficiency of data exploration and decreasing the time that scientists must spend on searching, downloading, and processing the datasets most applicable to their research. By incorporating the SMART system, it is a likely aim that the time being devoted to discovering the most applicable dataset will be substantially reduced, thereby reducing the number of user inquiries and likewise reducing the time and resources expended by a data center in addressing user inquiries. Keywords: EarthCube; ECHO, DAACs, GeoPlatform; Geospatial Cyberinfrastructure References: 1. Yang, P., Evans, J., Cole, M., Alameh, N., Marley, S., & Bambacus, M., (2007). The Emerging Concepts and Applications of the Spatial Web Portal. Photogrammetry Engineering &Remote Sensing,73(6):691-698. 2. Zhang, C, Zhao, T. and W. Li. (2010). The Framework of a Geospatial Semantic Web based Spatial Decision Support System for Digital Earth. International Journal of Digital Earth. 3(2):111-134. 3. Yang C., Raskin R., Goodchild M.F., Gahegan M., 2010, Geospatial Cyberinfrastructure: Past, Present and Future,Computers, Environment, and Urban Systems, 34(4):264-277. 4. Liu K., Yang C., Li W., Gui Z., Xu C., Xia J., 2013. Using ontology and similarity calculations to rank Earth science data searching results, International Journal of Geospatial Information Applications. (in press)
Jácome, Alberto G; Fdez-Riverola, Florentino; Lourenço, Anália
2016-07-01
Text mining and semantic analysis approaches can be applied to the construction of biomedical domain-specific search engines and provide an attractive alternative to create personalized and enhanced search experiences. Therefore, this work introduces the new open-source BIOMedical Search Engine Framework for the fast and lightweight development of domain-specific search engines. The rationale behind this framework is to incorporate core features typically available in search engine frameworks with flexible and extensible technologies to retrieve biomedical documents, annotate meaningful domain concepts, and develop highly customized Web search interfaces. The BIOMedical Search Engine Framework integrates taggers for major biomedical concepts, such as diseases, drugs, genes, proteins, compounds and organisms, and enables the use of domain-specific controlled vocabulary. Technologies from the Typesafe Reactive Platform, the AngularJS JavaScript framework and the Bootstrap HTML/CSS framework support the customization of the domain-oriented search application. Moreover, the RESTful API of the BIOMedical Search Engine Framework allows the integration of the search engine into existing systems or a complete web interface personalization. The construction of the Smart Drug Search is described as proof-of-concept of the BIOMedical Search Engine Framework. This public search engine catalogs scientific literature about antimicrobial resistance, microbial virulence and topics alike. The keyword-based queries of the users are transformed into concepts and search results are presented and ranked accordingly. The semantic graph view portraits all the concepts found in the results, and the researcher may look into the relevance of different concepts, the strength of direct relations, and non-trivial, indirect relations. The number of occurrences of the concept shows its importance to the query, and the frequency of concept co-occurrence is indicative of biological relations meaningful to that particular scope of research. Conversely, indirect concept associations, i.e. concepts related by other intermediary concepts, can be useful to integrate information from different studies and look into non-trivial relations. The BIOMedical Search Engine Framework supports the development of domain-specific search engines. The key strengths of the framework are modularity and extensibilityin terms of software design, the use of open-source consolidated Web technologies, and the ability to integrate any number of biomedical text mining tools and information resources. Currently, the Smart Drug Search keeps over 1,186,000 documents, containing more than 11,854,000 annotations for 77,200 different concepts. The Smart Drug Search is publicly accessible at http://sing.ei.uvigo.es/sds/. The BIOMedical Search Engine Framework is freely available for non-commercial use at https://github.com/agjacome/biomsef. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Improving Concept-Based Web Image Retrieval by Mixing Semantically Similar Greek Queries
ERIC Educational Resources Information Center
Lazarinis, Fotis
2008-01-01
Purpose: Image searching is a common activity for web users. Search engines offer image retrieval services based on textual queries. Previous studies have shown that web searching is more demanding when the search is not in English and does not use a Latin-based language. The aim of this paper is to explore the behaviour of the major search…
BioCarian: search engine for exploratory searches in heterogeneous biological databases.
Zaki, Nazar; Tennakoon, Chandana
2017-10-02
There are a large number of biological databases publicly available for scientists in the web. Also, there are many private databases generated in the course of research projects. These databases are in a wide variety of formats. Web standards have evolved in the recent times and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, integration and querying of biological databases can be facilitated by techniques used in semantic web. Heterogeneous databases can be converted into Resource Description Format (RDF) and queried using SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed, and have additional features like ranking of facet values based on several criteria, visually indicating the relevance of a facet value and presenting the most important facet values when a large number of choices are available. For the advanced users, SPARQL queries can be run directly on the databases. Using this feature, users will be able to incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search on previously published viral integration data and were able to deduce the main conclusions of the original publication. BioCarian is accessible via http://www.biocarian.com . We have developed a search engine to explore RDF databases that can be used by both novice and advanced users.
Semantic similarity measures in the biomedical domain by leveraging a web search engine.
Hsieh, Sheau-Ling; Chang, Wen-Yung; Chen, Chi-Huang; Weng, Yung-Ching
2013-07-01
Various researches in web related semantic similarity measures have been deployed. However, measuring semantic similarity between two terms remains a challenging task. The traditional ontology-based methodologies have a limitation that both concepts must be resided in the same ontology tree(s). Unfortunately, in practice, the assumption is not always applicable. On the other hand, if the corpus is sufficiently adequate, the corpus-based methodologies can overcome the limitation. Now, the web is a continuous and enormous growth corpus. Therefore, a method of estimating semantic similarity is proposed via exploiting the page counts of two biomedical concepts returned by Google AJAX web search engine. The features are extracted as the co-occurrence patterns of two given terms P and Q, by querying P, Q, as well as P AND Q, and the web search hit counts of the defined lexico-syntactic patterns. These similarity scores of different patterns are evaluated, by adapting support vector machines for classification, to leverage the robustness of semantic similarity measures. Experimental results validating against two datasets: dataset 1 provided by A. Hliaoutakis; dataset 2 provided by T. Pedersen, are presented and discussed. In dataset 1, the proposed approach achieves the best correlation coefficient (0.802) under SNOMED-CT. In dataset 2, the proposed method obtains the best correlation coefficient (SNOMED-CT: 0.705; MeSH: 0.723) with physician scores comparing with measures of other methods. However, the correlation coefficients (SNOMED-CT: 0.496; MeSH: 0.539) with coder scores received opposite outcomes. In conclusion, the semantic similarity findings of the proposed method are close to those of physicians' ratings. Furthermore, the study provides a cornerstone investigation for extracting fully relevant information from digitizing, free-text medical records in the National Taiwan University Hospital database.
SSWAP: A Simple Semantic Web Architecture and Protocol for semantic web services
Gessler, Damian DG; Schiltz, Gary S; May, Greg D; Avraham, Shulamit; Town, Christopher D; Grant, David; Nelson, Rex T
2009-01-01
Background SSWAP (Simple Semantic Web Architecture and Protocol; pronounced "swap") is an architecture, protocol, and platform for using reasoning to semantically integrate heterogeneous disparate data and services on the web. SSWAP was developed as a hybrid semantic web services technology to overcome limitations found in both pure web service technologies and pure semantic web technologies. Results There are currently over 2400 resources published in SSWAP. Approximately two dozen are custom-written services for QTL (Quantitative Trait Loci) and mapping data for legumes and grasses (grains). The remaining are wrappers to Nucleic Acids Research Database and Web Server entries. As an architecture, SSWAP establishes how clients (users of data, services, and ontologies), providers (suppliers of data, services, and ontologies), and discovery servers (semantic search engines) interact to allow for the description, querying, discovery, invocation, and response of semantic web services. As a protocol, SSWAP provides the vocabulary and semantics to allow clients, providers, and discovery servers to engage in semantic web services. The protocol is based on the W3C-sanctioned first-order description logic language OWL DL. As an open source platform, a discovery server running at (as in to "swap info") uses the description logic reasoner Pellet to integrate semantic resources. The platform hosts an interactive guide to the protocol at , developer tools at , and a portal to third-party ontologies at (a "swap meet"). Conclusion SSWAP addresses the three basic requirements of a semantic web services architecture (i.e., a common syntax, shared semantic, and semantic discovery) while addressing three technology limitations common in distributed service systems: i.e., i) the fatal mutability of traditional interfaces, ii) the rigidity and fragility of static subsumption hierarchies, and iii) the confounding of content, structure, and presentation. SSWAP is novel by establishing the concept of a canonical yet mutable OWL DL graph that allows data and service providers to describe their resources, to allow discovery servers to offer semantically rich search engines, to allow clients to discover and invoke those resources, and to allow providers to respond with semantically tagged data. SSWAP allows for a mix-and-match of terms from both new and legacy third-party ontologies in these graphs. PMID:19775460
Hsu, Yi-Yu; Chen, Hung-Yu; Kao, Hung-Yu
2013-01-01
Background Determining the semantic relatedness of two biomedical terms is an important task for many text-mining applications in the biomedical field. Previous studies, such as those using ontology-based and corpus-based approaches, measured semantic relatedness by using information from the structure of biomedical literature, but these methods are limited by the small size of training resources. To increase the size of training datasets, the outputs of search engines have been used extensively to analyze the lexical patterns of biomedical terms. Methodology/Principal Findings In this work, we propose the Mutually Reinforcing Lexical Pattern Ranking (ReLPR) algorithm for learning and exploring the lexical patterns of synonym pairs in biomedical text. ReLPR employs lexical patterns and their pattern containers to assess the semantic relatedness of biomedical terms. By combining sentence structures and the linking activities between containers and lexical patterns, our algorithm can explore the correlation between two biomedical terms. Conclusions/Significance The average correlation coefficient of the ReLPR algorithm was 0.82 for various datasets. The results of the ReLPR algorithm were significantly superior to those of previous methods. PMID:24348899
Sauer, Ursula G; Wächter, Thomas; Hareng, Lars; Wareing, Britta; Langsch, Angelika; Zschunke, Matthias; Alvers, Michael R; Landsiedel, Robert
2014-06-01
The knowledge-based search engine Go3R, www.Go3R.org, has been developed to assist scientists from industry and regulatory authorities in collecting comprehensive toxicological information with a special focus on identifying available alternatives to animal testing. The semantic search paradigm of Go3R makes use of expert knowledge on 3Rs methods and regulatory toxicology, laid down in the ontology, a network of concepts, terms, and synonyms, to recognize the contents of documents. Search results are automatically sorted into a dynamic table of contents presented alongside the list of documents retrieved. This table of contents allows the user to quickly filter the set of documents by topics of interest. Documents containing hazard information are automatically assigned to a user interface following the endpoint-specific IUCLID5 categorization scheme required, e.g. for REACH registration dossiers. For this purpose, complex endpoint-specific search queries were compiled and integrated into the search engine (based upon a gold standard of 310 references that had been assigned manually to the different endpoint categories). Go3R sorts 87% of the references concordantly into the respective IUCLID5 categories. Currently, Go3R searches in the 22 million documents available in the PubMed and TOXNET databases. However, it can be customized to search in other databases including in-house databanks. Copyright © 2013 Elsevier Ltd. All rights reserved.
Visual Exploratory Search of Relationship Graphs on Smartphones
Ouyang, Jianquan; Zheng, Hao; Kong, Fanbin; Liu, Tianming
2013-01-01
This paper presents a novel framework for Visual Exploratory Search of Relationship Graphs on Smartphones (VESRGS) that is composed of three major components: inference and representation of semantic relationship graphs on the Web via meta-search, visual exploratory search of relationship graphs through both querying and browsing strategies, and human-computer interactions via the multi-touch interface and mobile Internet on smartphones. In comparison with traditional lookup search methodologies, the proposed VESRGS system is characterized with the following perceived advantages. 1) It infers rich semantic relationships between the querying keywords and other related concepts from large-scale meta-search results from Google, Yahoo! and Bing search engines, and represents semantic relationships via graphs; 2) the exploratory search approach empowers users to naturally and effectively explore, adventure and discover knowledge in a rich information world of interlinked relationship graphs in a personalized fashion; 3) it effectively takes the advantages of smartphones’ user-friendly interfaces and ubiquitous Internet connection and portability. Our extensive experimental results have demonstrated that the VESRGS framework can significantly improve the users’ capability of seeking the most relevant relationship information to their own specific needs. We envision that the VESRGS framework can be a starting point for future exploration of novel, effective search strategies in the mobile Internet era. PMID:24223936
NASA Astrophysics Data System (ADS)
Seyed, P.; Ashby, B.; Khan, I.; Patton, E. W.; McGuinness, D. L.
2013-12-01
Recent efforts to create and leverage standards for geospatial data specification and inference include the GeoSPARQL standard, Geospatial OWL ontologies (e.g., GAZ, Geonames), and RDF triple stores that support GeoSPARQL (e.g., AllegroGraph, Parliament) that use RDF instance data for geospatial features of interest. However, there remains a gap on how best to fuse software engineering best practices and GeoSPARQL within semantic web applications to enable flexible search driven by geospatial reasoning. In this abstract we introduce the SemantGeo module for the SemantEco framework that helps fill this gap, enabling scientists find data using geospatial semantics and reasoning. SemantGeo provides multiple types of geospatial reasoning for SemantEco modules. The server side implementation uses the Parliament SPARQL Endpoint accessed via a Tomcat servlet. SemantGeo uses the Google Maps API for user-specified polygon construction and JsTree for providing containment and categorical hierarchies for search. SemantGeo uses GeoSPARQL for spatial reasoning alone and in concert with RDFS/OWL reasoning capabilities to determine, e.g., what geofeatures are within, partially overlap with, or within a certain distance from, a given polygon. We also leverage qualitative relationships defined by the Gazetteer ontology that are composites of spatial relationships as well as administrative designations or geophysical phenomena. We provide multiple mechanisms for exploring data, such as polygon (map-based) and named-feature (hierarchy-based) selection, that enable flexible search constraints using boolean combination of selections. JsTree-based hierarchical search facets present named features and include a 'part of' hierarchy (e.g., measurement-site-01, Lake George, Adirondack Region, NY State) and type hierarchies (e.g., nodes in the hierarchy for WaterBody, Park, MeasurementSite), depending on the ';axis of choice' option selected. Using GeoSPARQL and aforementioned ontology, these hierarchies are constrained based on polygon selection, where the corresponding polygons of the contained features are visually rendered to assist exploration. Once measurement sites are plotted based on initial search, subsequent searches using JsTree selections can extend the previous based on nearby waterbodies in some semantic relationship of interest. For example, ';tributary of' captures water bodies that flow into the current one, and extending the original search to include tributaries of the observed water body is useful to environmental scientists for isolating the source of characteristic levels, including pollutants. Ultimately any SemantEco module can leverage SemantGeo's underlying APIs, leveraged in a deployment of SemantEco that combines EPA and USGS water quality data, and one customized for searching data available from the Darrin Freshwater Institute. Future work will address generating RDF geometry data from shape files, aligning RDF data sources to better leverage qualitative and spatial relationships, and validating newly generated RDF data adhering to the GeoSPARQL standard.
SemanticOrganizer: A Customizable Semantic Repository for Distributed NASA Project Teams
NASA Technical Reports Server (NTRS)
Keller, Richard M.; Berrios, Daniel C.; Carvalho, Robert E.; Hall, David R.; Rich, Stephen J.; Sturken, Ian B.; Swanson, Keith J.; Wolfe, Shawn R.
2004-01-01
SemanticOrganizer is a collaborative knowledge management system designed to support distributed NASA projects, including diverse teams of scientists, engineers, and accident investigators. The system provides a customizable, semantically structured information repository that stores work products relevant to multiple projects of differing types. SemanticOrganizer is one of the earliest and largest semantic web applications deployed at NASA to date, and has been used in diverse contexts ranging from the investigation of Space Shuttle Columbia's accident to the search for life on other planets. Although the underlying repository employs a single unified ontology, access control and ontology customization mechanisms make the repository contents appear different for each project team. This paper describes SemanticOrganizer, its customization facilities, and a sampling of its applications. The paper also summarizes some key lessons learned from building and fielding a successful semantic web application across a wide-ranging set of domains with diverse users.
Mayer, Miguel A; Karampiperis, Pythagoras; Kukurikos, Antonis; Karkaletsis, Vangelis; Stamatakis, Kostas; Villarroel, Dagmar; Leis, Angela
2011-06-01
The number of health-related websites is increasing day-by-day; however, their quality is variable and difficult to assess. Various "trust marks" and filtering portals have been created in order to assist consumers in retrieving quality medical information. Consumers are using search engines as the main tool to get health information; however, the major problem is that the meaning of the web content is not machine-readable in the sense that computers cannot understand words and sentences as humans can. In addition, trust marks are invisible to search engines, thus limiting their usefulness in practice. During the last five years there have been different attempts to use Semantic Web tools to label health-related web resources to help internet users identify trustworthy resources. This paper discusses how Semantic Web technologies can be applied in practice to generate machine-readable labels and display their content, as well as to empower end-users by providing them with the infrastructure for expressing and sharing their opinions on the quality of health-related web resources.
SSWAP: A Simple Semantic Web Architecture and Protocol for semantic web services.
Gessler, Damian D G; Schiltz, Gary S; May, Greg D; Avraham, Shulamit; Town, Christopher D; Grant, David; Nelson, Rex T
2009-09-23
SSWAP (Simple Semantic Web Architecture and Protocol; pronounced "swap") is an architecture, protocol, and platform for using reasoning to semantically integrate heterogeneous disparate data and services on the web. SSWAP was developed as a hybrid semantic web services technology to overcome limitations found in both pure web service technologies and pure semantic web technologies. There are currently over 2400 resources published in SSWAP. Approximately two dozen are custom-written services for QTL (Quantitative Trait Loci) and mapping data for legumes and grasses (grains). The remaining are wrappers to Nucleic Acids Research Database and Web Server entries. As an architecture, SSWAP establishes how clients (users of data, services, and ontologies), providers (suppliers of data, services, and ontologies), and discovery servers (semantic search engines) interact to allow for the description, querying, discovery, invocation, and response of semantic web services. As a protocol, SSWAP provides the vocabulary and semantics to allow clients, providers, and discovery servers to engage in semantic web services. The protocol is based on the W3C-sanctioned first-order description logic language OWL DL. As an open source platform, a discovery server running at http://sswap.info (as in to "swap info") uses the description logic reasoner Pellet to integrate semantic resources. The platform hosts an interactive guide to the protocol at http://sswap.info/protocol.jsp, developer tools at http://sswap.info/developer.jsp, and a portal to third-party ontologies at http://sswapmeet.sswap.info (a "swap meet"). SSWAP addresses the three basic requirements of a semantic web services architecture (i.e., a common syntax, shared semantic, and semantic discovery) while addressing three technology limitations common in distributed service systems: i.e., i) the fatal mutability of traditional interfaces, ii) the rigidity and fragility of static subsumption hierarchies, and iii) the confounding of content, structure, and presentation. SSWAP is novel by establishing the concept of a canonical yet mutable OWL DL graph that allows data and service providers to describe their resources, to allow discovery servers to offer semantically rich search engines, to allow clients to discover and invoke those resources, and to allow providers to respond with semantically tagged data. SSWAP allows for a mix-and-match of terms from both new and legacy third-party ontologies in these graphs.
A concept ideation framework for medical device design.
Hagedorn, Thomas J; Grosse, Ian R; Krishnamurty, Sundar
2015-06-01
Medical device design is a challenging process, often requiring collaboration between medical and engineering domain experts. This collaboration can be best institutionalized through systematic knowledge transfer between the two domains coupled with effective knowledge management throughout the design innovation process. Toward this goal, we present the development of a semantic framework for medical device design that unifies a large medical ontology with detailed engineering functional models along with the repository of design innovation information contained in the US Patent Database. As part of our development, existing medical, engineering, and patent document ontologies were modified and interlinked to create a comprehensive medical device innovation and design tool with appropriate properties and semantic relations to facilitate knowledge capture, enrich existing knowledge, and enable effective knowledge reuse for different scenarios. The result is a Concept Ideation Framework for Medical Device Design (CIFMeDD). Key features of the resulting framework include function-based searching and automated inter-domain reasoning to uniquely enable identification of functionally similar procedures, tools, and inventions from multiple domains based on simple semantic searches. The significance and usefulness of the resulting framework for aiding in conceptual design and innovation in the medical realm are explored via two case studies examining medical device design problems. Copyright © 2015 Elsevier Inc. All rights reserved.
Semantic retrieval and navigation in clinical document collections.
Kreuzthaler, Markus; Daumke, Philipp; Schulz, Stefan
2015-01-01
Patients with chronic diseases undergo numerous in- and outpatient treatment periods, and therefore many documents accumulate in their electronic records. We report on an on-going project focussing on the semantic enrichment of medical texts, in order to support recall-oriented navigation across a patient's complete documentation. A document pool of 1,696 de-identified discharge summaries was used for prototyping. A natural language processing toolset for document annotation (based on the text-mining framework UIMA) and indexing (Solr) was used to support a browser-based platform for document import, search and navigation. The integrated search engine combines free text and concept-based querying, supported by dynamically generated facets (diagnoses, procedures, medications, lab values, and body parts). The prototype demonstrates the feasibility of semantic document enrichment within document collections of a single patient. Originally conceived as an add-on for the clinical workplace, this technology could also be adapted to support personalised health record platforms, as well as cross-patient search for cohort building and other secondary use scenarios.
Augmenting Oracle Text with the UMLS for enhanced searching of free-text medical reports.
Ding, Jing; Erdal, Selnur; Dhaval, Rakesh; Kamal, Jyoti
2007-10-11
The intrinsic complexity of free-text medical reports imposes great challenges for information retrieval systems. We have developed a prototype search engine for retrieving clinical reports that leverages the powerful indexing and querying capabilities of Oracle Text, and the rich biomedical domain knowledge and semantic structures that are captured in the UMLS Metathesaurus.
Raising the IQ in full-text searching via intelligent querying
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kero, R.; Russell, L.; Swietlik, C.
1994-11-01
Current Information Retrieval (IR) technologies allow for efficient access to relevant information, provided that user selected query terms coincide with the specific linguistical choices made by the authors whose works constitute the text-base. Therefore, the challenge is to enhance the limited searching capability of state-of-the-practice IR. This can be done either with augmented clients that overcome current server searching deficiencies, or with added capabilities that can augment searching algorithms on the servers. The technology being investigated is that of deductive databases, with a set of new techniques called cooperative answering. This technology utilizes semantic networks to allow for navigation betweenmore » possible query search term alternatives. The augmented search terms are passed to an IR engine and the results can be compared. The project utilizes the OSTI Environment, Safety and Health Thesaurus to populate the domain specific semantic network and the text base of ES&H related documents from the Facility Profile Information Management System as the domain specific search space.« less
Analyzing Medical Image Search Behavior: Semantics and Prediction of Query Results.
De-Arteaga, Maria; Eggel, Ivan; Kahn, Charles E; Müller, Henning
2015-10-01
Log files of information retrieval systems that record user behavior have been used to improve the outcomes of retrieval systems, understand user behavior, and predict events. In this article, a log file of the ARRS GoldMiner search engine containing 222,005 consecutive queries is analyzed. Time stamps are available for each query, as well as masked IP addresses, which enables to identify queries from the same person. This article describes the ways in which physicians (or Internet searchers interested in medical images) search and proposes potential improvements by suggesting query modifications. For example, many queries contain only few terms and therefore are not specific; others contain spelling mistakes or non-medical terms that likely lead to poor or empty results. One of the goals of this report is to predict the number of results a query will have since such a model allows search engines to automatically propose query modifications in order to avoid result lists that are empty or too large. This prediction is made based on characteristics of the query terms themselves. Prediction of empty results has an accuracy above 88%, and thus can be used to automatically modify the query to avoid empty result sets for a user. The semantic analysis and data of reformulations done by users in the past can aid the development of better search systems, particularly to improve results for novice users. Therefore, this paper gives important ideas to better understand how people search and how to use this knowledge to improve the performance of specialized medical search engines.
Semantic-Based Information Retrieval of Biomedical Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiao, Yu; Potok, Thomas E; Hurson, Ali R.
In this paper, we propose to improve the effectiveness of biomedical information retrieval via a medical thesaurus. We analyzed the deficiencies of the existing medical thesauri and reconstructed a new thesaurus, called MEDTHES, which follows the ANSI/NISO Z39.19-2003 standard. MEDTHES also endows the users with fine-grained control of information retrieval by providing functions to calculate the semantic similarity between words. We demonstrate the usage of MEDTHES through an existing data search engine.
PIRIA: a general tool for indexing, search, and retrieval of multimedia content
NASA Astrophysics Data System (ADS)
Joint, Magali; Moellic, Pierre-Alain; Hede, P.; Adam, P.
2004-05-01
The Internet is a continuously expanding source of multimedia content and information. There are many products in development to search, retrieve, and understand multimedia content. But most of the current image search/retrieval engines, rely on a image database manually pre-indexed with keywords. Computers are still powerless to understand the semantic meaning of still or animated image content. Piria (Program for the Indexing and Research of Images by Affinity), the search engine we have developed brings this possibility closer to reality. Piria is a novel search engine that uses the query by example method. A user query is submitted to the system, which then returns a list of images ranked by similarity, obtained by a metric distance that operates on every indexed image signature. These indexed images are compared according to several different classifiers, not only Keywords, but also Form, Color and Texture, taking into account geometric transformations and variance like rotation, symmetry, mirroring, etc. Form - Edges extracted by an efficient segmentation algorithm. Color - Histogram, semantic color segmentation and spatial color relationship. Texture - Texture wavelets and local edge patterns. If required, Piria is also able to fuse results from multiple classifiers with a new classification of index categories: Single Indexer Single Call (SISC), Single Indexer Multiple Call (SIMC), Multiple Indexers Single Call (MISC) or Multiple Indexers Multiple Call (MIMC). Commercial and industrial applications will be explored and discussed as well as current and future development.
Discovery in a World of Mashups
NASA Astrophysics Data System (ADS)
King, T. A.; Ritschel, B.; Hourcle, J. A.; Moon, I. S.
2014-12-01
When the first digital information was stored electronically, discovery of what existed was through file names and the organization of the file system. With the advent of networks, digital information was shared on a wider scale, but discovery remained based on file and folder names. With a growing number of information sources, named based discovery quickly became ineffective. The keyword based search engine was one of the first types of a mashup in the world of Web 1.0. Embedded links from one document to another with prescribed relationships between files and the world of Web 2.0 was formed. Search engines like Google used the links to improve search results and a worldwide mashup was formed. While a vast improvement, the need for semantic (meaning rich) discovery was clear, especially for the discovery of scientific data. In response, every science discipline defined schemas to describe their type of data. Some core schemas where shared, but most schemas are custom tailored even though they share many common concepts. As with the networking of information sources, science increasingly relies on data from multiple disciplines. So there is a need to bring together multiple sources of semantically rich information. We explore how harvesting, conceptual mapping, facet based search engines, search term promotion, and style sheets can be combined to create the next generation of mashups in the emerging world of Web 3.0. We use NASA's Planetary Data System and NASA's Heliophysics Data Environment to illustrate how to create a multi-discipline mash-up.
Semantic Features for Classifying Referring Search Terms
DOE Office of Scientific and Technical Information (OSTI.GOV)
May, Chandler J.; Henry, Michael J.; McGrath, Liam R.
2012-05-11
When an internet user clicks on a result in a search engine, a request is submitted to the destination web server that includes a referrer field containing the search terms given by the user. Using this information, website owners can analyze the search terms leading to their websites to better understand their visitors needs. This work explores some of the features that can be used for classification-based analysis of such referring search terms. We present initial results for the example task of classifying HTTP requests countries of origin. A system that can accurately predict the country of origin from querymore » text may be a valuable complement to IP lookup methods which are susceptible to the obfuscation of dereferrers or proxies. We suggest that the addition of semantic features improves classifier performance in this example application. We begin by looking at related work and presenting our approach. After describing initial experiments and results, we discuss paths forward for this work.« less
Federal Register 2010, 2011, 2012, 2013, 2014
2005-11-16
... Reference System (TRS) [see http://www.epa.gov/trs ] in order to better support future semantic Web needs... creation of glossaries for Web pages and documents, a common vocabulary for search engines, and in the...
Unitary Operators on the Document Space.
ERIC Educational Resources Information Center
Hoenkamp, Eduard
2003-01-01
Discusses latent semantic indexing (LSI) that would allow search engines to reduce the dimension of the document space by mapping it into a space spanned by conceptual indices. Topics include vector space models; singular value decomposition (SVD); unitary operators; the Haar transform; and new algorithms. (Author/LRW)
The Semantic eScience Framework
NASA Astrophysics Data System (ADS)
McGuinness, Deborah; Fox, Peter; Hendler, James
2010-05-01
The goal of this effort is to design and implement a configurable and extensible semantic eScience framework (SESF). Configuration requires research into accommodating different levels of semantic expressivity and user requirements from use cases. Extensibility is being achieved in a modular approach to the semantic encodings (i.e. ontologies) performed in community settings, i.e. an ontology framework into which specific applications all the way up to communities can extend the semantics for their needs.We report on how we are accommodating the rapid advances in semantic technologies and tools and the sustainable software path for the future (certain) technical advances. In addition to a generalization of the current data science interface, we will present plans for an upper-level interface suitable for use by clearinghouses, and/or educational portals, digital libraries, and other disciplines.SESF builds upon previous work in the Virtual Solar-Terrestrial Observatory. The VSTO utilizes leading edge knowledge representation, query and reasoning techniques to support knowledge-enhanced search, data access, integration, and manipulation. It encodes term meanings and their inter-relationships in ontologies anduses these ontologies and associated inference engines to semantically enable the data services. The Semantically-Enabled Science Data Integration (SESDI) project implemented data integration capabilities among three sub-disciplines; solar radiation, volcanic outgassing and atmospheric structure using extensions to existingmodular ontolgies and used the VSTO data framework, while adding smart faceted search and semantic data registrationtools. The Semantic Provenance Capture in Data Ingest Systems (SPCDIS) has added explanation provenance capabilities to an observational data ingest pipeline for images of the Sun providing a set of tools to answer diverseend user questions such as ``Why does this image look bad?. http://tw.rpi.edu/portal/SESF
The Semantic eScience Framework
NASA Astrophysics Data System (ADS)
Fox, P. A.; McGuinness, D. L.
2009-12-01
The goal of this effort is to design and implement a configurable and extensible semantic eScience framework (SESF). Configuration requires research into accommodating different levels of semantic expressivity and user requirements from use cases. Extensibility is being achieved in a modular approach to the semantic encodings (i.e. ontologies) performed in community settings, i.e. an ontology framework into which specific applications all the way up to communities can extend the semantics for their needs.We report on how we are accommodating the rapid advances in semantic technologies and tools and the sustainable software path for the future (certain) technical advances. In addition to a generalization of the current data science interface, we will present plans for an upper-level interface suitable for use by clearinghouses, and/or educational portals, digital libraries, and other disciplines.SESF builds upon previous work in the Virtual Solar-Terrestrial Observatory. The VSTO utilizes leading edge knowledge representation, query and reasoning techniques to support knowledge-enhanced search, data access, integration, and manipulation. It encodes term meanings and their inter-relationships in ontologies anduses these ontologies and associated inference engines to semantically enable the data services. The Semantically-Enabled Science Data Integration (SESDI) project implemented data integration capabilities among three sub-disciplines; solar radiation, volcanic outgassing and atmospheric structure using extensions to existingmodular ontolgies and used the VSTO data framework, while adding smart faceted search and semantic data registrationtools. The Semantic Provenance Capture in Data Ingest Systems (SPCDIS) has added explanation provenance capabilities to an observational data ingest pipeline for images of the Sun providing a set of tools to answer diverseend user questions such as ``Why does this image look bad?.
Orthographic versus semantic matching in visual search for words within lists.
Léger, Laure; Rouet, Jean-François; Ros, Christine; Vibert, Nicolas
2012-03-01
An eye-tracking experiment was performed to assess the influence of orthographic and semantic distractor words on visual search for words within lists. The target word (e.g., "raven") was either shown to participants before the search (literal search) or defined by its semantic category (e.g., "bird", categorical search). In both cases, the type of words included in the list affected visual search times and eye movement patterns. In the literal condition, the presence of orthographic distractors sharing initial and final letters with the target word strongly increased search times. Indeed, the orthographic distractors attracted participants' gaze and were fixated for longer times than other words in the list. The presence of semantic distractors related to the target word also increased search times, which suggests that significant automatic semantic processing of nontarget words took place. In the categorical condition, semantic distractors were expected to have a greater impact on the search task. As expected, the presence in the list of semantic associates of the target word led to target selection errors. However, semantic distractors did not significantly increase search times any more, whereas orthographic distractors still did. Hence, the visual characteristics of nontarget words can be strong predictors of the efficiency of visual search even when the exact target word is unknown. The respective impacts of orthographic and semantic distractors depended more on the characteristics of lists than on the nature of the search task.
Roogle: an information retrieval engine for clinical data warehouse.
Cuggia, Marc; Garcelon, Nicolas; Campillo-Gimenez, Boris; Bernicot, Thomas; Laurent, Jean-François; Garin, Etienne; Happe, André; Duvauferrier, Régis
2011-01-01
High amount of relevant information is contained in reports stored in the electronic patient records and associated metadata. R-oogle is a project aiming at developing information retrieval engines adapted to these reports and designed for clinicians. The system consists in a data warehouse (full-text reports and structured data) imported from two different hospital information systems. Information retrieval is performed using metadata-based semantic and full-text search methods (as Google). Applications may be biomarkers identification in a translational approach, search of specific cases, and constitution of cohorts, professional practice evaluation, and quality control assessment.
LIVIVO - the Vertical Search Engine for Life Sciences.
Müller, Bernd; Poley, Christoph; Pössel, Jana; Hagelstein, Alexandra; Gübitz, Thomas
2017-01-01
The explosive growth of literature and data in the life sciences challenges researchers to keep track of current advancements in their disciplines. Novel approaches in the life science like the One Health paradigm require integrated methodologies in order to link and connect heterogeneous information from databases and literature resources. Current publications in the life sciences are increasingly characterized by the employment of trans-disciplinary methodologies comprising molecular and cell biology, genetics, genomic, epigenomic, transcriptional and proteomic high throughput technologies with data from humans, plants, and animals. The literature search engine LIVIVO empowers retrieval functionality by incorporating various literature resources from medicine, health, environment, agriculture and nutrition. LIVIVO is developed in-house by ZB MED - Information Centre for Life Sciences. It provides a user-friendly and usability-tested search interface with a corpus of 55 Million citations derived from 50 databases. Standardized application programming interfaces are available for data export and high throughput retrieval. The search functions allow for semantic retrieval with filtering options based on life science entities. The service oriented architecture of LIVIVO uses four different implementation layers to deliver search services. A Knowledge Environment is developed by ZB MED to deal with the heterogeneity of data as an integrative approach to model, store, and link semantic concepts within literature resources and databases. Future work will focus on the exploitation of life science ontologies and on the employment of NLP technologies in order to improve query expansion, filters in faceted search, and concept based relevancy rankings in LIVIVO.
Ontology modularization to improve semantic medical image annotation.
Wennerberg, Pinar; Schulz, Klaus; Buitelaar, Paul
2011-02-01
Searching for medical images and patient reports is a significant challenge in a clinical setting. The contents of such documents are often not described in sufficient detail thus making it difficult to utilize the inherent wealth of information contained within them. Semantic image annotation addresses this problem by describing the contents of images and reports using medical ontologies. Medical images and patient reports are then linked to each other through common annotations. Subsequently, search algorithms can more effectively find related sets of documents on the basis of these semantic descriptions. A prerequisite to realizing such a semantic search engine is that the data contained within should have been previously annotated with concepts from medical ontologies. One major challenge in this regard is the size and complexity of medical ontologies as annotation sources. Manual annotation is particularly time consuming labor intensive in a clinical environment. In this article we propose an approach to reducing the size of clinical ontologies for more efficient manual image and text annotation. More precisely, our goal is to identify smaller fragments of a large anatomy ontology that are relevant for annotating medical images from patients suffering from lymphoma. Our work is in the area of ontology modularization, which is a recent and active field of research. We describe our approach, methods and data set in detail and we discuss our results. Copyright © 2010 Elsevier Inc. All rights reserved.
Semantic Search of Web Services
ERIC Educational Resources Information Center
Hao, Ke
2013-01-01
This dissertation addresses semantic search of Web services using natural language processing. We first survey various existing approaches, focusing on the fact that the expensive costs of current semantic annotation frameworks result in limited use of semantic search for large scale applications. We then propose a vector space model based service…
Strategies for Information Retrieval and Virtual Teaming to Mitigate Risk on NASA's Missions
NASA Technical Reports Server (NTRS)
Topousis, Daria; Williams, Gregory; Murphy, Keri
2007-01-01
Following the loss of NASA's Space Shuttle Columbia in 2003, it was determined that problems in the agency's organization created an environment that led to the accident. One component of the proposed solution resulted in the formation of the NASA Engineering Network (NEN), a suite of information retrieval and knowledge sharing tools. This paper describes the implementation of this set of search, portal, content management, and semantic technologies, including a unique meta search capability for data from distributed engineering resources. NEN's communities of practice are formed along engineering disciplines where users leverage their knowledge and best practices to collaborate and take informal learning back to their personal jobs and embed it into the procedures of the agency. These results offer insight into using traditional engineering disciplines for virtual teaming and problem solving.
GeoSearch: A lightweight broking middleware for geospatial resources discovery
NASA Astrophysics Data System (ADS)
Gui, Z.; Yang, C.; Liu, K.; Xia, J.
2012-12-01
With petabytes of geodata, thousands of geospatial web services available over the Internet, it is critical to support geoscience research and applications by finding the best-fit geospatial resources from the massive and heterogeneous resources. Past decades' developments witnessed the operation of many service components to facilitate geospatial resource management and discovery. However, efficient and accurate geospatial resource discovery is still a big challenge due to the following reasons: 1)The entry barriers (also called "learning curves") hinder the usability of discovery services to end users. Different portals and catalogues always adopt various access protocols, metadata formats and GUI styles to organize, present and publish metadata. It is hard for end users to learn all these technical details and differences. 2)The cost for federating heterogeneous services is high. To provide sufficient resources and facilitate data discovery, many registries adopt periodic harvesting mechanism to retrieve metadata from other federated catalogues. These time-consuming processes lead to network and storage burdens, data redundancy, and also the overhead of maintaining data consistency. 3)The heterogeneous semantics issues in data discovery. Since the keyword matching is still the primary search method in many operational discovery services, the search accuracy (precision and recall) is hard to guarantee. Semantic technologies (such as semantic reasoning and similarity evaluation) offer a solution to solve these issues. However, integrating semantic technologies with existing service is challenging due to the expandability limitations on the service frameworks and metadata templates. 4)The capabilities to help users make final selection are inadequate. Most of the existing search portals lack intuitive and diverse information visualization methods and functions (sort, filter) to present, explore and analyze search results. Furthermore, the presentation of the value-added additional information (such as, service quality and user feedback), which conveys important decision supporting information, is missing. To address these issues, we prototyped a distributed search engine, GeoSearch, based on brokering middleware framework to search, integrate and visualize heterogeneous geospatial resources. Specifically, 1) A lightweight discover broker is developed to conduct distributed search. The broker retrieves metadata records for geospatial resources and additional information from dispersed services (portals and catalogues) and other systems on the fly. 2) A quality monitoring and evaluation broker (i.e., QoS Checker) is developed and integrated to provide quality information for geospatial web services. 3) The semantic assisted search and relevance evaluation functions are implemented by loosely interoperating with ESIP Testbed component. 4) Sophisticated information and data visualization functionalities and tools are assembled to improve user experience and assist resource selection.
Querying archetype-based EHRs by search ontology-based XPath engineering.
Kropf, Stefan; Uciteli, Alexandr; Schierle, Katrin; Krücken, Peter; Denecke, Kerstin; Herre, Heinrich
2018-05-11
Legacy data and new structured data can be stored in a standardized format as XML-based EHRs on XML databases. Querying documents on these databases is crucial for answering research questions. Instead of using free text searches, that lead to false positive results, the precision can be increased by constraining the search to certain parts of documents. A search ontology-based specification of queries on XML documents defines search concepts and relates them to parts in the XML document structure. Such query specification method is practically introduced and evaluated by applying concrete research questions formulated in natural language on a data collection for information retrieval purposes. The search is performed by search ontology-based XPath engineering that reuses ontologies and XML-related W3C standards. The key result is that the specification of research questions can be supported by the usage of search ontology-based XPath engineering. A deeper recognition of entities and a semantic understanding of the content is necessary for a further improvement of precision and recall. Key limitation is that the application of the introduced process requires skills in ontology and software development. In future, the time consuming ontology development could be overcome by implementing a new clinical role: the clinical ontologist. The introduced Search Ontology XML extension connects Search Terms to certain parts in XML documents and enables an ontology-based definition of queries. Search ontology-based XPath engineering can support research question answering by the specification of complex XPath expressions without deep syntax knowledge about XPaths.
BOSS: context-enhanced search for biomedical objects
2012-01-01
Background There exist many academic search solutions and most of them can be put on either ends of spectrum: general-purpose search and domain-specific "deep" search systems. The general-purpose search systems, such as PubMed, offer flexible query interface, but churn out a list of matching documents that users have to go through the results in order to find the answers to their queries. On the other hand, the "deep" search systems, such as PPI Finder and iHOP, return the precompiled results in a structured way. Their results, however, are often found only within some predefined contexts. In order to alleviate these problems, we introduce a new search engine, BOSS, Biomedical Object Search System. Methods Unlike the conventional search systems, BOSS indexes segments, rather than documents. A segment refers to a Maximal Coherent Semantic Unit (MCSU) such as phrase, clause or sentence that is semantically coherent in the given context (e.g., biomedical objects or their relations). For a user query, BOSS finds all matching segments, identifies the objects appearing in those segments, and aggregates the segments for each object. Finally, it returns the ranked list of the objects along with their matching segments. Results The working prototype of BOSS is available at http://boss.korea.ac.kr. The current version of BOSS has indexed abstracts of more than 20 million articles published during last 16 years from 1996 to 2011 across all science disciplines. Conclusion BOSS fills the gap between either ends of the spectrum by allowing users to pose context-free queries and by returning a structured set of results. Furthermore, BOSS exhibits the characteristic of good scalability, just as with conventional document search engines, because it is designed to use a standard document-indexing model with minimal modifications. Considering the features, BOSS notches up the technological level of traditional solutions for search on biomedical information. PMID:22595092
Software for Studying and Enhancing Educational Uses of Geospatial Semantics and Data
ERIC Educational Resources Information Center
Nodenot, Thierry; Sallaberry, Christian; Gaio, Mauro
2010-01-01
Geographically related queries form nearly one-fifth of all queries submitted to the Excite search engine and the most frequently occurring terms are names of places. This paper focuses on digital libraries and extends the basic services of existing library management systems to include new ones that are dedicated to geographic information…
What Can Pictures Tell Us About Web Pages? Improving Document Search Using Images.
Rodriguez-Vaamonde, Sergio; Torresani, Lorenzo; Fitzgibbon, Andrew W
2015-06-01
Traditional Web search engines do not use the images in the HTML pages to find relevant documents for a given query. Instead, they typically operate by computing a measure of agreement between the keywords provided by the user and only the text portion of each page. In this paper we study whether the content of the pictures appearing in a Web page can be used to enrich the semantic description of an HTML document and consequently boost the performance of a keyword-based search engine. We present a Web-scalable system that exploits a pure text-based search engine to find an initial set of candidate documents for a given query. Then, the candidate set is reranked using visual information extracted from the images contained in the pages. The resulting system retains the computational efficiency of traditional text-based search engines with only a small additional storage cost needed to encode the visual information. We test our approach on one of the TREC Million Query Track benchmarks where we show that the exploitation of visual content yields improvement in accuracies for two distinct text-based search engines, including the system with the best reported performance on this benchmark. We further validate our approach by collecting document relevance judgements on our search results using Amazon Mechanical Turk. The results of this experiment confirm the improvement in accuracy produced by our image-based reranker over a pure text-based system.
Building a Semantic Framework for eScience
NASA Astrophysics Data System (ADS)
Movva, S.; Ramachandran, R.; Maskey, M.; Li, X.
2009-12-01
The e-Science vision focuses on the use of advanced computing technologies to support scientists. Recent research efforts in this area have focused primarily on “enabling” use of infrastructure resources for both data and computational access especially in Geosciences. One of the existing gaps in the existing e-Science efforts has been the failure to incorporate stable semantic technologies within the design process itself. In this presentation, we describe our effort in designing a framework for e-Science built using Service Oriented Architecture. Our framework provides users capabilities to create science workflows and mine distributed data. Our e-Science framework is being designed around a mass market tool to promote reusability across many projects. Semantics is an integral part of this framework and our design goal is to leverage the latest stable semantic technologies. The use of these stable semantic technologies will provide the users of our framework the useful features such as: allow search engines to find their content with RDFa tags; create RDF triple data store for their content; create RDF end points to share with others; and semantically mash their content with other online content available as RDF end point.
Semantic guidance of eye movements in real-world scenes
Hwang, Alex D.; Wang, Hsueh-Cheng; Pomplun, Marc
2011-01-01
The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying Latent Semantic Analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects’ gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects’ eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control. PMID:21426914
Semantic guidance of eye movements in real-world scenes.
Hwang, Alex D; Wang, Hsueh-Cheng; Pomplun, Marc
2011-05-25
The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying latent semantic analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects' gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects' eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control. Copyright © 2011 Elsevier Ltd. All rights reserved.
Semantics-Based Interoperability Framework for the Geosciences
NASA Astrophysics Data System (ADS)
Sinha, A.; Malik, Z.; Raskin, R.; Barnes, C.; Fox, P.; McGuinness, D.; Lin, K.
2008-12-01
Interoperability between heterogeneous data, tools and services is required to transform data to knowledge. To meet geoscience-oriented societal challenges such as forcing of climate change induced by volcanic eruptions, we suggest the need to develop semantic interoperability for data, services, and processes. Because such scientific endeavors require integration of multiple data bases associated with global enterprises, implicit semantic-based integration is impossible. Instead, explicit semantics are needed to facilitate interoperability and integration. Although different types of integration models are available (syntactic or semantic) we suggest that semantic interoperability is likely to be the most successful pathway. Clearly, the geoscience community would benefit from utilization of existing XML-based data models, such as GeoSciML, WaterML, etc to rapidly advance semantic interoperability and integration. We recognize that such integration will require a "meanings-based search, reasoning and information brokering", which will be facilitated through inter-ontology relationships (ontologies defined for each discipline). We suggest that Markup languages (MLs) and ontologies can be seen as "data integration facilitators", working at different abstraction levels. Therefore, we propose to use an ontology-based data registration and discovery approach to compliment mark-up languages through semantic data enrichment. Ontologies allow the use of formal and descriptive logic statements which permits expressive query capabilities for data integration through reasoning. We have developed domain ontologies (EPONT) to capture the concept behind data. EPONT ontologies are associated with existing ontologies such as SUMO, DOLCE and SWEET. Although significant efforts have gone into developing data (object) ontologies, we advance the idea of developing semantic frameworks for additional ontologies that deal with processes and services. This evolutionary step will facilitate the integrative capabilities of scientists as we examine the relationships between data and external factors such as processes that may influence our understanding of "why" certain events happen. We emphasize the need to go from analysis of data to concepts related to scientific principles of thermodynamics, kinetics, heat flow, mass transfer, etc. Towards meeting these objectives, we report on a pair of related service engines: DIA (Discovery, integration and analysis), and SEDRE (Semantically-Enabled Data Registration Engine) that utilize ontologies for semantic interoperability and integration.
A Semantic Approach for Knowledge Discovery to Help Mitigate Habitat Loss in the Gulf of Mexico
NASA Astrophysics Data System (ADS)
Ramachandran, R.; Maskey, M.; Graves, S.; Hardin, D.
2008-12-01
Noesis is a meta-search engine and a resource aggregator that uses domain ontologies to provide scoped search capabilities. Ontologies enable Noesis to help users refine their searches for information on the open web and in hidden web locations such as data catalogues with standardized, but discipline specific vocabularies. Through its ontologies Noesis provides a guided refinement of search queries which produces complete and accurate searches while reducing the user's burden to experiment with different search strings. All search results are organized by categories (e. g. all results from Google are grouped together) which may be selected or omitted according to the desire of the user. During the past two years ontologies were developed for sea grasses in the Gulf of Mexico and were used to support a habitat restoration demonstration project. Currently these ontologies are being augmented to address the special characteristics of mangroves. These new ontologies will extend the demonstration project to broader regions of the Gulf including protected mangrove locations in coastal Mexico. Noesis contributes to the decision making process by producing a comprehensive list of relevant resources based on the semantic information contained in the ontologies. Ontologies are organized in a tree like taxonomies, where the child nodes represent the Specializations and the parent nodes represent the Generalizations of a node or concept. Specializations can be used to provide more detailed search, while generalizations are used to make the search broader. Ontologies are also used to link two syntactically different terms to one semantic concept (synonyms). Appending a synonym to the query expands the search, thus providing better search coverage. Every concept has a set of properties that are neither in the same inheritance hierarchy (Specializations / Generalizations) nor equivalent (synonyms). These are called Related Concepts and they are captured in the ontology through property relationships. By using Related Concepts users can search for resources with respect to a particular property. Noesis automatically generates searches that include all of these capabilities, removing the burden from the user and producing broader and more accurate search results. This presentation will demonstrate the features of Noesis and describe its application to habitat studies in the Gulf of Mexico.
SPARK: Adapting Keyword Query to Semantic Search
NASA Astrophysics Data System (ADS)
Zhou, Qi; Wang, Chong; Xiong, Miao; Wang, Haofen; Yu, Yong
Semantic search promises to provide more accurate result than present-day keyword search. However, progress with semantic search has been delayed due to the complexity of its query languages. In this paper, we explore a novel approach of adapting keywords to querying the semantic web: the approach automatically translates keyword queries into formal logic queries so that end users can use familiar keywords to perform semantic search. A prototype system named 'SPARK' has been implemented in light of this approach. Given a keyword query, SPARK outputs a ranked list of SPARQL queries as the translation result. The translation in SPARK consists of three major steps: term mapping, query graph construction and query ranking. Specifically, a probabilistic query ranking model is proposed to select the most likely SPARQL query. In the experiment, SPARK achieved an encouraging translation result.
Boulos, Maged N; Roudsari, Abdul V; Carson, Ewart R
2002-07-01
HealthCyberMap (http://healthcybermap.semanticweb.org/) aims at mapping Internet health information resources in novel ways for enhanced retrieval and navigation. This is achieved by collecting appropriate resource metadata in an unambiguous form that preserves semantics. We modelled a qualified Dublin Core (DC) metadata set ontology with extra elements for resource quality and geographical provenance in Prot g -2000. A metadata collection form helps acquiring resource instance data within Prot g . The DC subject field is populated with UMLS terms directly imported from UMLS Knowledge Source Server using UMLS tab, a Prot g -2000 plug-in. The project is saved in RDFS/RDF. The ontology and associated form serve as a free tool for building and maintaining an RDF medical resource metadata base. The UMLS tab enables browsing and searching for concepts that best describe a resource, and importing them to DC subject fields. The resultant metadata base can be used with a search and inference engine, and have textual and/or visual navigation interface(s) applied to it, to ultimately build a medical Semantic Web portal. Different ways of exploiting Prot g -2000 RDF output are discussed. By making the context and semantics of resources, not merely their raw text and formatting, amenable to computer 'understanding,' we can build a Semantic Web that is more useful to humans than the current Web. This requires proper use of metadata and ontologies. Clinical codes can reliably describe the subjects of medical resources, establish the semantic relationships (as defined by underlying coding scheme) between related resources, and automate their topical categorisation.
SNOMED CT module-driven clinical archetype management.
Allones, J L; Taboada, M; Martinez, D; Lozano, R; Sobrido, M J
2013-06-01
To explore semantic search to improve management and user navigation in clinical archetype repositories. In order to support semantic searches across archetypes, an automated method based on SNOMED CT modularization is implemented to transform clinical archetypes into SNOMED CT extracts. Concurrently, query terms are converted into SNOMED CT concepts using the search engine Lucene. Retrieval is then carried out by matching query concepts with the corresponding SNOMED CT segments. A test collection of the 16 clinical archetypes, including over 250 terms, and a subset of 55 clinical terms from two medical dictionaries, MediLexicon and MedlinePlus, were used to test our method. The keyword-based service supported by the OpenEHR repository offered us a benchmark to evaluate the enhancement of performance. In total, our approach reached 97.4% precision and 69.1% recall, providing a substantial improvement of recall (more than 70%) compared to the benchmark. Exploiting medical domain knowledge from ontologies such as SNOMED CT may overcome some limitations of the keyword-based systems and thus improve the search experience of repository users. An automated approach based on ontology segmentation is an efficient and feasible way for supporting modeling, management and user navigation in clinical archetype repositories. Copyright © 2013 Elsevier Inc. All rights reserved.
A predictive framework for evaluating models of semantic organization in free recall
Morton, Neal W; Polyn, Sean M.
2016-01-01
Research in free recall has demonstrated that semantic associations reliably influence the organization of search through episodic memory. However, the specific structure of these associations and the mechanisms by which they influence memory search remain unclear. We introduce a likelihood-based model-comparison technique, which embeds a model of semantic structure within the context maintenance and retrieval (CMR) model of human memory search. Within this framework, model variants are evaluated in terms of their ability to predict the specific sequence in which items are recalled. We compare three models of semantic structure, latent semantic analysis (LSA), global vectors (GloVe), and word association spaces (WAS), and find that models using WAS have the greatest predictive power. Furthermore, we find evidence that semantic and temporal organization is driven by distinct item and context cues, rather than a single context cue. This finding provides important constraint for theories of memory search. PMID:28331243
XSemantic: An Extension of LCA Based XML Semantic Search
NASA Astrophysics Data System (ADS)
Supasitthimethee, Umaporn; Shimizu, Toshiyuki; Yoshikawa, Masatoshi; Porkaew, Kriengkrai
One of the most convenient ways to query XML data is a keyword search because it does not require any knowledge of XML structure or learning a new user interface. However, the keyword search is ambiguous. The users may use different terms to search for the same information. Furthermore, it is difficult for a system to decide which node is likely to be chosen as a return node and how much information should be included in the result. To address these challenges, we propose an XML semantic search based on keywords called XSemantic. On the one hand, we give three definitions to complete in terms of semantics. Firstly, the semantic term expansion, our system is robust from the ambiguous keywords by using the domain ontology. Secondly, to return semantic meaningful answers, we automatically infer the return information from the user queries and take advantage of the shortest path to return meaningful connections between keywords. Thirdly, we present the semantic ranking that reflects the degree of similarity as well as the semantic relationship so that the search results with the higher relevance are presented to the users first. On the other hand, in the LCA and the proximity search approaches, we investigated the problem of information included in the search results. Therefore, we introduce the notion of the Lowest Common Element Ancestor (LCEA) and define our simple rule without any requirement on the schema information such as the DTD or XML Schema. The first experiment indicated that XSemantic not only properly infers the return information but also generates compact meaningful results. Additionally, the benefits of our proposed semantics are demonstrated by the second experiment.
The interplay of episodic and semantic memory in guiding repeated search in scenes.
Võ, Melissa L-H; Wolfe, Jeremy M
2013-02-01
It seems intuitive to think that previous exposure or interaction with an environment should make it easier to search through it and, no doubt, this is true in many real-world situations. However, in a recent study, we demonstrated that previous exposure to a scene does not necessarily speed search within that scene. For instance, when observers performed as many as 15 searches for different objects in the same, unchanging scene, the speed of search did not decrease much over the course of these multiple searches (Võ & Wolfe, 2012). Only when observers were asked to search for the same object again did search become considerably faster. We argued that our naturalistic scenes provided such strong "semantic" guidance-e.g., knowing that a faucet is usually located near a sink-that guidance by incidental episodic memory-having seen that faucet previously-was rendered less useful. Here, we directly manipulated the availability of semantic information provided by a scene. By monitoring observers' eye movements, we found a tight coupling of semantic and episodic memory guidance: Decreasing the availability of semantic information increases the use of episodic memory to guide search. These findings have broad implications regarding the use of memory during search in general and particularly during search in naturalistic scenes. Copyright © 2012 Elsevier B.V. All rights reserved.
Multimedia explorer: image database, image proxy-server and search-engine.
Frankewitsch, T.; Prokosch, U.
1999-01-01
Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast of information provided by the WWW. Secondly, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Therefore establishing a more comfortable retrieval involves the use of a higher programming level like JAVA. With this platform independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high level navigation. We implemented a database using JAVA objects as the primary storage container which are then stored by a JAVA controlled ORACLE8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach multimedia objects can be encapsulated within a logical module for quick data retrieval. PMID:10566463
Multimedia explorer: image database, image proxy-server and search-engine.
Frankewitsch, T; Prokosch, U
1999-01-01
Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast of information provided by the WWW. Secondly, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Therefore establishing a more comfortable retrieval involves the use of a higher programming level like JAVA. With this platform independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high level navigation. We implemented a database using JAVA objects as the primary storage container which are then stored by a JAVA controlled ORACLE8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach multimedia objects can be encapsulated within a logical module for quick data retrieval.
A Semantic Analysis Method for Scientific and Engineering Code
NASA Technical Reports Server (NTRS)
Stewart, Mark E. M.
1998-01-01
This paper develops a procedure to statically analyze aspects of the meaning or semantics of scientific and engineering code. The analysis involves adding semantic declarations to a user's code and parsing this semantic knowledge with the original code using multiple expert parsers. These semantic parsers are designed to recognize formulae in different disciplines including physical and mathematical formulae and geometrical position in a numerical scheme. In practice, a user would submit code with semantic declarations of primitive variables to the analysis procedure, and its semantic parsers would automatically recognize and document some static, semantic concepts and locate some program semantic errors. A prototype implementation of this analysis procedure is demonstrated. Further, the relationship between the fundamental algebraic manipulations of equations and the parsing of expressions is explained. This ability to locate some semantic errors and document semantic concepts in scientific and engineering code should reduce the time, risk, and effort of developing and using these codes.
Improvements to the Ontology-based Metadata Portal for Unified Semantics (OlyMPUS)
NASA Astrophysics Data System (ADS)
Linsinbigler, M. A.; Gleason, J. L.; Huffer, E.
2016-12-01
The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support Earth Science data consumers and data providers, enabling the latter to register data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS complements the ODISEES' data discovery system with an intelligent tool to enable data producers to auto-generate semantically enhanced metadata and upload it to the metadata repository that drives ODISEES. Like ODISEES, the OlyMPUS metadata provisioning tool leverages robust semantics, a NoSQL database and query engine, an automated reasoning engine that performs first- and second-order deductive inferencing, and uses a controlled vocabulary to support data interoperability and automated analytics. The ODISEES data discovery portal leverages this metadata to provide a seamless data discovery and access experience for data consumers who are interested in comparing and contrasting the multiple Earth science data products available across NASA data centers. Olympus will support scientists' services and tools for performing complex analyses and identifying correlations and non-obvious relationships across all types of Earth System phenomena using the full spectrum of NASA Earth Science data available. By providing an intelligent discovery portal that supplies users - both human users and machines - with detailed information about data products, their contents and their structure, ODISEES will reduce the level of effort required to identify and prepare large volumes of data for analysis. This poster will explain how OlyMPUS leverages deductive reasoning and other technologies to create an integrated environment for generating and exploiting semantically rich metadata.
A suffix arrays based approach to semantic search in P2P systems
NASA Astrophysics Data System (ADS)
Shi, Qingwei; Zhao, Zheng; Bao, Hu
2007-09-01
Building a semantic search system on top of peer-to-peer (P2P) networks is becoming an attractive and promising alternative scheme for the reason of scalability, Data freshness and search cost. In this paper, we present a Suffix Arrays based algorithm for Semantic Search (SASS) in P2P systems, which generates a distributed Semantic Overlay Network (SONs) construction for full-text search in P2P networks. For each node through the P2P network, SASS distributes document indices based on a set of suffix arrays, by which clusters are created depending on words or phrases shared between documents, therefore, the search cost for a given query is decreased by only scanning semantically related documents. In contrast to recently announced SONs scheme designed by using metadata or predefined-class, SASS is an unsupervised approach for decentralized generation of SONs. SASS is also an incremental, linear time algorithm, which efficiently handle the problem of nodes update in P2P networks. Our simulation results demonstrate that SASS yields high search efficiency in dynamic environments.
Semantic extraction and processing of medical records for patient-oriented visual index
NASA Astrophysics Data System (ADS)
Zheng, Weilin; Dong, Wenjie; Chen, Xiangjiao; Zhang, Jianguo
2012-02-01
To have comprehensive and completed understanding healthcare status of a patient, doctors need to search patient medical records from different healthcare information systems, such as PACS, RIS, HIS, USIS, as a reference of diagnosis and treatment decisions for the patient. However, it is time-consuming and tedious to do these procedures. In order to solve this kind of problems, we developed a patient-oriented visual index system (VIS) to use the visual technology to show health status and to retrieve the patients' examination information stored in each system with a 3D human model. In this presentation, we present a new approach about how to extract the semantic and characteristic information from the medical record systems such as RIS/USIS to create the 3D Visual Index. This approach includes following steps: (1) Building a medical characteristic semantic knowledge base; (2) Developing natural language processing (NLP) engine to perform semantic analysis and logical judgment on text-based medical records; (3) Applying the knowledge base and NLP engine on medical records to extract medical characteristics (e.g., the positive focus information), and then mapping extracted information to related organ/parts of 3D human model to create the visual index. We performed the testing procedures on 559 samples of radiological reports which include 853 focuses, and achieved 828 focuses' information. The successful rate of focus extraction is about 97.1%.
Semantic Similarity between Web Documents Using Ontology
NASA Astrophysics Data System (ADS)
Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh
2018-06-01
The World Wide Web is the source of information available in the structure of interlinked web pages. However, the procedure of extracting significant information with the assistance of search engine is incredibly critical. This is for the reason that web information is written mainly by using natural language, and further available to individual human. Several efforts have been made in semantic similarity computation between documents using words, concepts and concepts relationship but still the outcome available are not as per the user requirements. This paper proposes a novel technique for computation of semantic similarity between documents that not only takes concepts available in documents but also relationships that are available between the concepts. In our approach documents are being processed by making ontology of the documents using base ontology and a dictionary containing concepts records. Each such record is made up of the probable words which represents a given concept. Finally, document ontology's are compared to find their semantic similarity by taking the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results as compared to the existing techniques.
Semantic Similarity between Web Documents Using Ontology
NASA Astrophysics Data System (ADS)
Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh
2018-03-01
The World Wide Web is the source of information available in the structure of interlinked web pages. However, the procedure of extracting significant information with the assistance of search engine is incredibly critical. This is for the reason that web information is written mainly by using natural language, and further available to individual human. Several efforts have been made in semantic similarity computation between documents using words, concepts and concepts relationship but still the outcome available are not as per the user requirements. This paper proposes a novel technique for computation of semantic similarity between documents that not only takes concepts available in documents but also relationships that are available between the concepts. In our approach documents are being processed by making ontology of the documents using base ontology and a dictionary containing concepts records. Each such record is made up of the probable words which represents a given concept. Finally, document ontology's are compared to find their semantic similarity by taking the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results as compared to the existing techniques.
NASA Astrophysics Data System (ADS)
Brambilla, Marco; Ceri, Stefano; Valle, Emanuele Della; Facca, Federico M.; Tziviskou, Christina
Although Semantic Web Services are expected to produce a revolution in the development of Web-based systems, very few enterprise-wide design experiences are available; one of the main reasons is the lack of sound Software Engineering methods and tools for the deployment of Semantic Web applications. In this chapter, we present an approach to software development for the Semantic Web based on classical Software Engineering methods (i.e., formal business process development, computer-aided and component-based software design, and automatic code generation) and on semantic methods and tools (i.e., ontology engineering, semantic service annotation and discovery).
Development of intelligent semantic search system for rubber research data in Thailand
NASA Astrophysics Data System (ADS)
Kaewboonma, Nattapong; Panawong, Jirapong; Pianhanuruk, Ekkawit; Buranarach, Marut
2017-10-01
The rubber production of Thailand increased not only by strong demand from the world market, but was also stimulated strongly through the replanting program of the Thai Government from 1961 onwards. With the continuous growth of rubber research data volume on the Web, the search for information has become a challenging task. Ontologies are used to improve the accuracy of information retrieval from the web by incorporating a degree of semantic analysis during the search. In this context, we propose an intelligent semantic search system for rubber research data in Thailand. The research methods included 1) analyzing domain knowledge, 2) ontologies development, and 3) intelligent semantic search system development to curate research data in trusted digital repositories may be shared among the wider Thailand rubber research community.
Semantic annotation of consumer health questions.
Kilicoglu, Halil; Ben Abacha, Asma; Mrabet, Yassine; Shooshan, Sonya E; Rodriguez, Laritza; Masterton, Kate; Demner-Fushman, Dina
2018-02-06
Consumers increasingly use online resources for their health information needs. While current search engines can address these needs to some extent, they generally do not take into account that most health information needs are complex and can only fully be expressed in natural language. Consumer health question answering (QA) systems aim to fill this gap. A major challenge in developing consumer health QA systems is extracting relevant semantic content from the natural language questions (question understanding). To develop effective question understanding tools, question corpora semantically annotated for relevant question elements are needed. In this paper, we present a two-part consumer health question corpus annotated with several semantic categories: named entities, question triggers/types, question frames, and question topic. The first part (CHQA-email) consists of relatively long email requests received by the U.S. National Library of Medicine (NLM) customer service, while the second part (CHQA-web) consists of shorter questions posed to MedlinePlus search engine as queries. Each question has been annotated by two annotators. The annotation methodology is largely the same between the two parts of the corpus; however, we also explain and justify the differences between them. Additionally, we provide information about corpus characteristics, inter-annotator agreement, and our attempts to measure annotation confidence in the absence of adjudication of annotations. The resulting corpus consists of 2614 questions (CHQA-email: 1740, CHQA-web: 874). Problems are the most frequent named entities, while treatment and general information questions are the most common question types. Inter-annotator agreement was generally modest: question types and topics yielded highest agreement, while the agreement for more complex frame annotations was lower. Agreement in CHQA-web was consistently higher than that in CHQA-email. Pairwise inter-annotator agreement proved most useful in estimating annotation confidence. To our knowledge, our corpus is the first focusing on annotation of uncurated consumer health questions. It is currently used to develop machine learning-based methods for question understanding. We make the corpus publicly available to stimulate further research on consumer health QA.
Hybrid Filtering in Semantic Query Processing
ERIC Educational Resources Information Center
Jeong, Hanjo
2011-01-01
This dissertation presents a hybrid filtering method and a case-based reasoning framework for enhancing the effectiveness of Web search. Web search may not reflect user needs, intent, context, and preferences, because today's keyword-based search is lacking semantic information to capture the user's context and intent in posing the search query.…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brost, Randolph C.; McLendon, William Clarence,
2013-01-01
Modeling geospatial information with semantic graphs enables search for sites of interest based on relationships between features, without requiring strong a priori models of feature shape or other intrinsic properties. Geospatial semantic graphs can be constructed from raw sensor data with suitable preprocessing to obtain a discretized representation. This report describes initial work toward extending geospatial semantic graphs to include temporal information, and initial results applying semantic graph techniques to SAR image data. We describe an efficient graph structure that includes geospatial and temporal information, which is designed to support simultaneous spatial and temporal search queries. We also report amore » preliminary implementation of feature recognition, semantic graph modeling, and graph search based on input SAR data. The report concludes with lessons learned and suggestions for future improvements.« less
A neotropical Miocene pollen database employing image-based search and semantic modeling.
Han, Jing Ginger; Cao, Hongfei; Barb, Adrian; Punyasena, Surangi W; Jaramillo, Carlos; Shyu, Chi-Ren
2014-08-01
Digital microscopic pollen images are being generated with increasing speed and volume, producing opportunities to develop new computational methods that increase the consistency and efficiency of pollen analysis and provide the palynological community a computational framework for information sharing and knowledge transfer. • Mathematical methods were used to assign trait semantics (abstract morphological representations) of the images of neotropical Miocene pollen and spores. Advanced database-indexing structures were built to compare and retrieve similar images based on their visual content. A Web-based system was developed to provide novel tools for automatic trait semantic annotation and image retrieval by trait semantics and visual content. • Mathematical models that map visual features to trait semantics can be used to annotate images with morphology semantics and to search image databases with improved reliability and productivity. Images can also be searched by visual content, providing users with customized emphases on traits such as color, shape, and texture. • Content- and semantic-based image searches provide a powerful computational platform for pollen and spore identification. The infrastructure outlined provides a framework for building a community-wide palynological resource, streamlining the process of manual identification, analysis, and species discovery.
Rodríguez-García, Miguel Ángel; Rodríguez-González, Alejandro; Valencia-García, Rafael; Gómez-Berbís, Juan Miguel
2014-01-01
Precise, reliable and real-time financial information is critical for added-value financial services after the economic turmoil from which markets are still struggling to recover. Since the Web has become the most significant data source, intelligent crawlers based on Semantic Technologies have become trailblazers in the search of knowledge combining natural language processing and ontology engineering techniques. In this paper, we present the SONAR extension approach, which will leverage the potential of knowledge representation by extracting, managing, and turning scarce and disperse financial information into well-classified, structured, and widely used XBRL format-oriented knowledge, strongly supported by a proof-of-concept implementation and a thorough evaluation of the benefits of the approach. PMID:24587726
Rodríguez-García, Miguel Ángel; Rodríguez-González, Alejandro; Colomo-Palacios, Ricardo; Valencia-García, Rafael; Gómez-Berbís, Juan Miguel; García-Sánchez, Francisco
2014-01-01
Precise, reliable and real-time financial information is critical for added-value financial services after the economic turmoil from which markets are still struggling to recover. Since the Web has become the most significant data source, intelligent crawlers based on Semantic Technologies have become trailblazers in the search of knowledge combining natural language processing and ontology engineering techniques. In this paper, we present the SONAR extension approach, which will leverage the potential of knowledge representation by extracting, managing, and turning scarce and disperse financial information into well-classified, structured, and widely used XBRL format-oriented knowledge, strongly supported by a proof-of-concept implementation and a thorough evaluation of the benefits of the approach.
Generating Personalized Web Search Using Semantic Context
Xu, Zheng; Chen, Hai-Yan; Yu, Jie
2015-01-01
The “one size fits the all” criticism of search engines is that when queries are submitted, the same results are returned to different users. In order to solve this problem, personalized search is proposed, since it can provide different search results based upon the preferences of users. However, existing methods concentrate more on the long-term and independent user profile, and thus reduce the effectiveness of personalized search. In this paper, the method captures the user context to provide accurate preferences of users for effectively personalized search. First, the short-term query context is generated to identify related concepts of the query. Second, the user context is generated based on the click through data of users. Finally, a forgetting factor is introduced to merge the independent user context in a user session, which maintains the evolution of user preferences. Experimental results fully confirm that our approach can successfully represent user context according to individual user information needs. PMID:26000335
ELE: An Ontology-Based System Integrating Semantic Search and E-Learning Technologies
ERIC Educational Resources Information Center
Barbagallo, A.; Formica, A.
2017-01-01
ELSE (E-Learning for the Semantic ECM) is an ontology-based system which integrates semantic search methodologies and e-learning technologies. It has been developed within a project of the CME (Continuing Medical Education) program--ECM (Educazione Continua nella Medicina) for Italian participants. ELSE allows the creation of e-learning courses…
Cognitive search model and a new query paradigm
NASA Astrophysics Data System (ADS)
Xu, Zhonghui
2001-06-01
This paper proposes a cognitive model in which people begin to search pictures by using semantic content and find a right picture by judging whether its visual content is a proper visualization of the semantics desired. It is essential that human search is not just a process of matching computation on visual feature but rather a process of visualization of the semantic content known. For people to search electronic images in the way as they manually do in the model, we suggest that querying be a semantic-driven process like design. A query-by-design paradigm is prosed in the sense that what you design is what you find. Unlike query-by-example, query-by-design allows users to specify the semantic content through an iterative and incremental interaction process so that a retrieval can start with association and identification of the given semantic content and get refined while further visual cues are available. An experimental image retrieval system, Kuafu, has been under development using the query-by-design paradigm and an iconic language is adopted.
Biomedical question answering using semantic relations.
Hristovski, Dimitar; Dinevski, Dejan; Kastrin, Andrej; Rindflesch, Thomas C
2015-01-16
The proliferation of the scientific literature in the field of biomedicine makes it difficult to keep abreast of current knowledge, even for domain experts. While general Web search engines and specialized information retrieval (IR) systems have made important strides in recent decades, the problem of accurate knowledge extraction from the biomedical literature is far from solved. Classical IR systems usually return a list of documents that have to be read by the user to extract relevant information. This tedious and time-consuming work can be lessened with automatic Question Answering (QA) systems, which aim to provide users with direct and precise answers to their questions. In this work we propose a novel methodology for QA based on semantic relations extracted from the biomedical literature. We extracted semantic relations with the SemRep natural language processing system from 122,421,765 sentences, which came from 21,014,382 MEDLINE citations (i.e., the complete MEDLINE distribution up to the end of 2012). A total of 58,879,300 semantic relation instances were extracted and organized in a relational database. The QA process is implemented as a search in this database, which is accessed through a Web-based application, called SemBT (available at http://sembt.mf.uni-lj.si ). We conducted an extensive evaluation of the proposed methodology in order to estimate the accuracy of extracting a particular semantic relation from a particular sentence. Evaluation was performed by 80 domain experts. In total 7,510 semantic relation instances belonging to 2,675 distinct relations were evaluated 12,083 times. The instances were evaluated as correct 8,228 times (68%). In this work we propose an innovative methodology for biomedical QA. The system is implemented as a Web-based application that is able to provide precise answers to a wide range of questions. A typical question is answered within a few seconds. The tool has some extensions that make it especially useful for interpretation of DNA microarray results.
A neotropical Miocene pollen database employing image-based search and semantic modeling1
Han, Jing Ginger; Cao, Hongfei; Barb, Adrian; Punyasena, Surangi W.; Jaramillo, Carlos; Shyu, Chi-Ren
2014-01-01
• Premise of the study: Digital microscopic pollen images are being generated with increasing speed and volume, producing opportunities to develop new computational methods that increase the consistency and efficiency of pollen analysis and provide the palynological community a computational framework for information sharing and knowledge transfer. • Methods: Mathematical methods were used to assign trait semantics (abstract morphological representations) of the images of neotropical Miocene pollen and spores. Advanced database-indexing structures were built to compare and retrieve similar images based on their visual content. A Web-based system was developed to provide novel tools for automatic trait semantic annotation and image retrieval by trait semantics and visual content. • Results: Mathematical models that map visual features to trait semantics can be used to annotate images with morphology semantics and to search image databases with improved reliability and productivity. Images can also be searched by visual content, providing users with customized emphases on traits such as color, shape, and texture. • Discussion: Content- and semantic-based image searches provide a powerful computational platform for pollen and spore identification. The infrastructure outlined provides a framework for building a community-wide palynological resource, streamlining the process of manual identification, analysis, and species discovery. PMID:25202648
The interplay of episodic and semantic memory in guiding repeated search in scenes
Võ, Melissa L.-H.; Wolfe, Jeremy M.
2012-01-01
It seems intuitive to think that previous exposure or interaction with an environment should make it easier to search through it and, no doubt, this is true in many real-world situations. However, in a recent study, we demonstrated that previous exposure to a scene does not necessarily speed search within that scene. For instance, when observers performed as many as fifteen searches for different objects in the same, unchanging scene, the speed of search did not decrease much over the course of these multiple searches (Võ & Wolfe, 2012). Only when observers were asked to search for the same object again did search become considerably faster. We argued that our naturalistic scenes provided such strong “semantic” guidance — e.g., knowing that a faucet is usually located near a sink — that guidance by incidental episodic memory — having seen that faucet previously — was rendered less useful. Here, we directly manipulated the availability of semantic information provided by a scene. By monitoring observers’ eye movements, we found a tight coupling of semantic and episodic memory guidance: Decreasing the availability of semantic information increases the use of episodic memory to guide search. These findings have broad implications regarding the use of memory during search in general and particularly during search in naturalistic scenes. PMID:23177141
Sihong Chen; Jing Qin; Xing Ji; Baiying Lei; Tianfu Wang; Dong Ni; Jie-Zhi Cheng
2017-03-01
The gap between the computational and semantic features is the one of major factors that bottlenecks the computer-aided diagnosis (CAD) performance from clinical usage. To bridge this gap, we exploit three multi-task learning (MTL) schemes to leverage heterogeneous computational features derived from deep learning models of stacked denoising autoencoder (SDAE) and convolutional neural network (CNN), as well as hand-crafted Haar-like and HoG features, for the description of 9 semantic features for lung nodules in CT images. We regard that there may exist relations among the semantic features of "spiculation", "texture", "margin", etc., that can be explored with the MTL. The Lung Image Database Consortium (LIDC) data is adopted in this study for the rich annotation resources. The LIDC nodules were quantitatively scored w.r.t. 9 semantic features from 12 radiologists of several institutes in U.S.A. By treating each semantic feature as an individual task, the MTL schemes select and map the heterogeneous computational features toward the radiologists' ratings with cross validation evaluation schemes on the randomly selected 2400 nodules from the LIDC dataset. The experimental results suggest that the predicted semantic scores from the three MTL schemes are closer to the radiologists' ratings than the scores from single-task LASSO and elastic net regression methods. The proposed semantic attribute scoring scheme may provide richer quantitative assessments of nodules for better support of diagnostic decision and management. Meanwhile, the capability of the automatic association of medical image contents with the clinical semantic terms by our method may also assist the development of medical search engine.
Quantifying the semantics of search behavior before stock market moves.
Curme, Chester; Preis, Tobias; Stanley, H Eugene; Moat, Helen Susannah
2014-08-12
Technology is becoming deeply interwoven into the fabric of society. The Internet has become a central source of information for many people when making day-to-day decisions. Here, we present a method to mine the vast data Internet users create when searching for information online, to identify topics of interest before stock market moves. In an analysis of historic data from 2004 until 2012, we draw on records from the search engine Google and online encyclopedia Wikipedia as well as judgments from the service Amazon Mechanical Turk. We find evidence of links between Internet searches relating to politics or business and subsequent stock market moves. In particular, we find that an increase in search volume for these topics tends to precede stock market falls. We suggest that extensions of these analyses could offer insight into large-scale information flow before a range of real-world events.
Quantifying the semantics of search behavior before stock market moves
Curme, Chester; Preis, Tobias; Stanley, H. Eugene; Moat, Helen Susannah
2014-01-01
Technology is becoming deeply interwoven into the fabric of society. The Internet has become a central source of information for many people when making day-to-day decisions. Here, we present a method to mine the vast data Internet users create when searching for information online, to identify topics of interest before stock market moves. In an analysis of historic data from 2004 until 2012, we draw on records from the search engine Google and online encyclopedia Wikipedia as well as judgments from the service Amazon Mechanical Turk. We find evidence of links between Internet searches relating to politics or business and subsequent stock market moves. In particular, we find that an increase in search volume for these topics tends to precede stock market falls. We suggest that extensions of these analyses could offer insight into large-scale information flow before a range of real-world events. PMID:25071193
From Science to e-Science to Semantic e-Science: A Heliosphysics Case Study
NASA Technical Reports Server (NTRS)
Narock, Thomas; Fox, Peter
2011-01-01
The past few years have witnessed unparalleled efforts to make scientific data web accessible. The Semantic Web has proven invaluable in this effort; however, much of the literature is devoted to system design, ontology creation, and trials and tribulations of current technologies. In order to fully develop the nascent field of Semantic e-Science we must also evaluate systems in real-world settings. We describe a case study within the field of Heliophysics and provide a comparison of the evolutionary stages of data discovery, from manual to semantically enable. We describe the socio-technical implications of moving toward automated and intelligent data discovery. In doing so, we highlight how this process enhances what is currently being done manually in various scientific disciplines. Our case study illustrates that Semantic e-Science is more than just semantic search. The integration of search with web services, relational databases, and other cyberinfrastructure is a central tenet of our case study and one that we believe has applicability as a generalized research area within Semantic e-Science. This case study illustrates a specific example of the benefits, and limitations, of semantically replicating data discovery. We show examples of significant reductions in time and effort enable by Semantic e-Science; yet, we argue that a "complete" solution requires integrating semantic search with other research areas such as data provenance and web services.
Search strategies of Wikipedia readers.
Rodi, Giovanna Chiara; Loreto, Vittorio; Tria, Francesca
2017-01-01
The quest for information is one of the most common activity of human beings. Despite the the impressive progress of search engines, not to miss the needed piece of information could be still very tough, as well as to acquire specific competences and knowledge by shaping and following the proper learning paths. Indeed, the need to find sensible paths in information networks is one of the biggest challenges of our societies and, to effectively address it, it is important to investigate the strategies adopted by human users to cope with the cognitive bottleneck of finding their way in a growing sea of information. Here we focus on the case of Wikipedia and investigate a recently released dataset about users' click on the English Wikipedia, namely the English Wikipedia Clickstream. We perform a semantically charged analysis to uncover the general patterns followed by information seekers in the multi-dimensional space of Wikipedia topics/categories. We discover the existence of well defined strategies in which users tend to start from very general, i.e., semantically broad, pages and progressively narrow down the scope of their navigation, while keeping a growing semantic coherence. This is unlike strategies associated to tasks with predefined search goals, namely the case of the Wikispeedia game. In this case users first move from the 'particular' to the 'universal' before focusing down again to the required target. The clear picture offered here represents a very important stepping stone towards a better design of information networks and recommendation strategies, as well as the construction of radically new learning paths.
Search strategies of Wikipedia readers
2017-01-01
The quest for information is one of the most common activity of human beings. Despite the the impressive progress of search engines, not to miss the needed piece of information could be still very tough, as well as to acquire specific competences and knowledge by shaping and following the proper learning paths. Indeed, the need to find sensible paths in information networks is one of the biggest challenges of our societies and, to effectively address it, it is important to investigate the strategies adopted by human users to cope with the cognitive bottleneck of finding their way in a growing sea of information. Here we focus on the case of Wikipedia and investigate a recently released dataset about users’ click on the English Wikipedia, namely the English Wikipedia Clickstream. We perform a semantically charged analysis to uncover the general patterns followed by information seekers in the multi-dimensional space of Wikipedia topics/categories. We discover the existence of well defined strategies in which users tend to start from very general, i.e., semantically broad, pages and progressively narrow down the scope of their navigation, while keeping a growing semantic coherence. This is unlike strategies associated to tasks with predefined search goals, namely the case of the Wikispeedia game. In this case users first move from the ‘particular’ to the ‘universal’ before focusing down again to the required target. The clear picture offered here represents a very important stepping stone towards a better design of information networks and recommendation strategies, as well as the construction of radically new learning paths. PMID:28152030
Huang, Chung-Chi; Lu, Zhiyong
2016-01-01
Identifying relevant papers from the literature is a common task in biocuration. Most current biomedical literature search systems primarily rely on matching user keywords. Semantic search, on the other hand, seeks to improve search accuracy by understanding the entities and contextual relations in user keywords. However, past research has mostly focused on semantically identifying biological entities (e.g. chemicals, diseases and genes) with little effort on discovering semantic relations. In this work, we aim to discover biomedical semantic relations in PubMed queries in an automated and unsupervised fashion. Specifically, we focus on extracting and understanding the contextual information (or context patterns) that is used by PubMed users to represent semantic relations between entities such as ‘CHEMICAL-1 compared to CHEMICAL-2.’ With the advances in automatic named entity recognition, we first tag entities in PubMed queries and then use tagged entities as knowledge to recognize pattern semantics. More specifically, we transform PubMed queries into context patterns involving participating entities, which are subsequently projected to latent topics via latent semantic analysis (LSA) to avoid the data sparseness and specificity issues. Finally, we mine semantically similar contextual patterns or semantic relations based on LSA topic distributions. Our two separate evaluation experiments of chemical-chemical (CC) and chemical–disease (CD) relations show that the proposed approach significantly outperforms a baseline method, which simply measures pattern semantics by similarity in participating entities. The highest performance achieved by our approach is nearly 0.9 and 0.85 respectively for the CC and CD task when compared against the ground truth in terms of normalized discounted cumulative gain (nDCG), a standard measure of ranking quality. These results suggest that our approach can effectively identify and return related semantic patterns in a ranked order covering diverse bio-entity relations. To assess the potential utility of our automated top-ranked patterns of a given relation in semantic search, we performed a pilot study on frequently sought semantic relations in PubMed and observed improved literature retrieval effectiveness based on post-hoc human relevance evaluation. Further investigation in larger tests and in real-world scenarios is warranted. PMID:27016698
Combined semantic and similarity search in medical image databases
NASA Astrophysics Data System (ADS)
Seifert, Sascha; Thoma, Marisa; Stegmaier, Florian; Hammon, Matthias; Kramer, Martin; Huber, Martin; Kriegel, Hans-Peter; Cavallaro, Alexander; Comaniciu, Dorin
2011-03-01
The current diagnostic process at hospitals is mainly based on reviewing and comparing images coming from multiple time points and modalities in order to monitor disease progression over a period of time. However, for ambiguous cases the radiologist deeply relies on reference literature or second opinion. Although there is a vast amount of acquired images stored in PACS systems which could be reused for decision support, these data sets suffer from weak search capabilities. Thus, we present a search methodology which enables the physician to fulfill intelligent search scenarios on medical image databases combining ontology-based semantic and appearance-based similarity search. It enabled the elimination of 12% of the top ten hits which would arise without taking the semantic context into account.
ERIC Educational Resources Information Center
Tse, Chi-Shing; Neely, James H.
2007-01-01
Letter-search (LS) within a prime often eliminates semantic priming. In 2 lexical decision experiments, the authors found that priming from LS primes occurred for low-frequency (LF) but not high-frequency (HF) targets whether the target's word frequency was manipulated between or within participants and whether the prime-target pairs were…
Do, Bao H; Wu, Andrew; Biswal, Sandip; Kamaya, Aya; Rubin, Daniel L
2010-11-01
Storing and retrieving radiology cases is an important activity for education and clinical research, but this process can be time-consuming. In the process of structuring reports and images into organized teaching files, incidental pathologic conditions not pertinent to the primary teaching point can be omitted, as when a user saves images of an aortic dissection case but disregards the incidental osteoid osteoma. An alternate strategy for identifying teaching cases is text search of reports in radiology information systems (RIS), but retrieved reports are unstructured, teaching-related content is not highlighted, and patient identifying information is not removed. Furthermore, searching unstructured reports requires sophisticated retrieval methods to achieve useful results. An open-source, RadLex(®)-compatible teaching file solution called RADTF, which uses natural language processing (NLP) methods to process radiology reports, was developed to create a searchable teaching resource from the RIS and the picture archiving and communication system (PACS). The NLP system extracts and de-identifies teaching-relevant statements from full reports to generate a stand-alone database, thus converting existing RIS archives into an on-demand source of teaching material. Using RADTF, the authors generated a semantic search-enabled, Web-based radiology archive containing over 700,000 cases with millions of images. RADTF combines a compact representation of the teaching-relevant content in radiology reports and a versatile search engine with the scale of the entire RIS-PACS collection of case material. ©RSNA, 2010
An Experiment in Scientific Code Semantic Analysis
NASA Technical Reports Server (NTRS)
Stewart, Mark E. M.
1998-01-01
This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, distributed expert parsers. These semantic parser are designed to recognize formulae in different disciplines including physical and mathematical formulae and geometrical position in a numerical scheme. The parsers will automatically recognize and document some static, semantic concepts and locate some program semantic errors. Results are shown for a subroutine test case and a collection of combustion code routines. This ability to locate some semantic errors and document semantic concepts in scientific and engineering code should reduce the time, risk, and effort of developing and using these codes.
An Information Infrastructure for Coastal Models and Data
NASA Astrophysics Data System (ADS)
Hardin, D.; Keiser, K.; Conover, H.; Graves, S.
2007-12-01
Advances in semantics and visualization have given rise to new capabilities for the location, manipulation, integration, management and display of data and information in and across domains. An example of these capabilities is illustrated by a coastal restoration project that utilizes satellite, in-situ data and hydrodynamic model output to address seagrass habitat restoration in the Northern Gulf of Mexico. In this project a standard stressor conceptual model was implemented as an ontology in addition to the typical CMAP diagram. The ontology captures the elements of the seagrass conceptual model as well as the relationships between them. Noesis, developed by the University of Alabama in Huntsville, is an application that provides a simple but powerful way to search and organize data and information represented by ontologies. Noesis uses domain ontologies to help scope search queries to ensure that search results are both accurate and complete. Semantics are captured by refining the query terms to cover synonyms, specializations, generalizations and related concepts. As a resource aggregator Noesis categorizes search results returned from multiple, concurrent search engines such as Google, Yahoo, and Ask.com. Search results are further directed by accessing domain specific catalogs that include outputs from hydrodynamic and other models. Embedded within the search results are links that invoke applications such as web map displays, animation tools and virtual globe applications such as Google Earth. In the seagrass prioritization project Noesis is used to locate information that is vital to understanding the impact of stressors on the habitat. This presentation will show how the intelligent search capabilities of Noesis are coupled with visualization tools and model output to investigate the restoration of seagrass habitat.
Semantically Enriching the Search System of a Music Digital Library
NASA Astrophysics Data System (ADS)
de Juan, Paloma; Iglesias, Carlos
Traditional search systems are usually based on keywords, a very simple and convenient mechanism to express a need for information. This is the most popular way of searching the Web, although it is not always an easy task to accurately summarize a natural language query in a few keywords. Working with keywords means losing the context, which is the only thing that can help us deal with ambiguity. This is the biggest problem of keyword-based systems. Semantic Web technologies seem a perfect solution to this problem, since they make it possible to represent the semantics of a given domain. In this chapter, we present three projects, Harmos, Semusici and Cantiga, whose aim is to provide access to a music digital library. We will describe two search systems, a traditional one and a semantic one, developed in the context of these projects and compare them in terms of usability and effectiveness.
Semantic search during divergent thinking.
Hass, Richard W
2017-09-01
Divergent thinking, as a method of examining creative cognition, has not been adequately analyzed in the context of modern cognitive theories. This article casts divergent thinking responding in the context of theories of memory search. First, it was argued that divergent thinking tasks are similar to semantic fluency tasks, but are more constrained, and less well structured. Next, response time distributions from 54 participants were analyzed for temporal and semantic clustering. Participants responded to two prompts from the alternative uses test: uses for a brick and uses for a bottle, for two minutes each. Participants' cumulative response curves were negatively accelerating, in line with theories of search of associative memory. However, results of analyses of semantic and temporal clustering suggested that clustering is less evident in alternative uses responding compared to semantic fluency tasks. This suggests either that divergent thinking responding does not involve an exhaustive search through a clustered memory trace, but rather that the process is more exploratory, yielding fewer overall responses that tend to drift away from close associates of the divergent thinking prompt. Copyright © 2017 Elsevier B.V. All rights reserved.
Semantic Memory in the Clinical Progression of Alzheimer Disease.
Tchakoute, Christophe T; Sainani, Kristin L; Henderson, Victor W
2017-09-01
Semantic memory measures may be useful in tracking and predicting progression of Alzheimer disease. We investigated relationships among semantic memory tasks and their 1-year predictive value in women with Alzheimer disease. We conducted secondary analyses of a randomized clinical trial of raloxifene in 42 women with late-onset mild-to-moderate Alzheimer disease. We assessed semantic memory with tests of oral confrontation naming, category fluency, semantic recognition and semantic naming, and semantic density in written narrative discourse. We measured global cognition (Alzheimer Disease Assessment Scale, cognitive subscale), dementia severity (Clinical Dementia Rating sum of boxes), and daily function (Activities of Daily Living Inventory) at baseline and 1 year. At baseline and 1 year, most semantic memory scores correlated highly or moderately with each other and with global cognition, dementia severity, and daily function. Semantic memory task performance at 1 year had worsened one-third to one-half standard deviation. Factor analysis of baseline test scores distinguished processes in semantic and lexical retrieval (semantic recognition, semantic naming, confrontation naming) from processes in lexical search (semantic density, category fluency). The semantic-lexical retrieval factor predicted global cognition at 1 year. Considered separately, baseline confrontation naming and category fluency predicted dementia severity, while semantic recognition and a composite of semantic recognition and semantic naming predicted global cognition. No individual semantic memory test predicted daily function. Semantic-lexical retrieval and lexical search may represent distinct aspects of semantic memory. Semantic memory processes are sensitive to cognitive decline and dementia severity in Alzheimer disease.
NASA Astrophysics Data System (ADS)
Johns, E. M.; Mayernik, M. S.; Boler, F. M.; Corson-Rikert, J.; Daniels, M. D.; Gross, M. B.; Khan, H.; Maull, K. E.; Rowan, L. R.; Stott, D.; Williams, S.; Krafft, D. B.
2015-12-01
Researchers seek information and data through a variety of avenues: published literature, search engines, repositories, colleagues, etc. In order to build a web application that leverages linked open data to enable multiple paths for information discovery, the EarthCollab project has surveyed two geoscience user communities to consider how researchers find and share scholarly output. EarthCollab, a cross-institutional, EarthCube funded project partnering UCAR, Cornell University, and UNAVCO, is employing the open-source semantic web software, VIVO, as the underlying technology to connect the people and resources of virtual research communities. This study will present an analysis of survey responses from members of the two case study communities: (1) the Bering Sea Project, an interdisciplinary field program whose data archive is hosted by NCAR's Earth Observing Laboratory (EOL), and (2) UNAVCO, a geodetic facility and consortium that supports diverse research projects informed by geodesy. The survey results illustrate the types of research products that respondents indicate should be discoverable within a digital platform and the current methods used to find publications, data, personnel, tools, and instrumentation. The responses showed that scientists rely heavily on general purpose search engines, such as Google, to find information, but that data center websites and the published literature were also critical sources for finding collaborators, data, and research tools.The survey participants also identify additional features of interest for an information platform such as search engine indexing, connection to institutional web pages, generation of bibliographies and CVs, and outward linking to social media. Through the survey, the user communities prioritized the type of information that is most important to display and describe their work within a research profile. The analysis of this survey will inform our further development of a platform that will facilitate different types of information discovery strategies, and help researchers to find and use the associated resources of a research project.
Jadhav, Ashutosh; Sheth, Amit; Pathak, Jyotishman
2014-01-01
Since the early 2000’s, Internet usage for health information searching has increased significantly. Studying search queries can help us to understand users “information need” and how do they formulate search queries (“expression of information need”). Although cardiovascular diseases (CVD) affect a large percentage of the population, few studies have investigated how and what users search for CVD. We address this knowledge gap in the community by analyzing a large corpus of 10 million CVD related search queries from MayoClinic.com. Using UMLS MetaMap and UMLS semantic types/concepts, we developed a rule-based approach to categorize the queries into 14 health categories. We analyzed structural properties, types (keyword-based/Wh-questions/Yes-No questions) and linguistic structure of the queries. Our results show that the most searched health categories are ‘Diseases/Conditions’, ‘Vital-Sings’, ‘Symptoms’ and ‘Living-with’. CVD queries are longer and are predominantly keyword-based. This study extends our knowledge about online health information searching and provides useful insights for Web search engines and health websites. PMID:25954380
Search and Graph Database Technologies for Biomedical Semantic Indexing: Experimental Analysis.
Segura Bedmar, Isabel; Martínez, Paloma; Carruana Martín, Adrián
2017-12-01
Biomedical semantic indexing is a very useful support tool for human curators in their efforts for indexing and cataloging the biomedical literature. The aim of this study was to describe a system to automatically assign Medical Subject Headings (MeSH) to biomedical articles from MEDLINE. Our approach relies on the assumption that similar documents should be classified by similar MeSH terms. Although previous work has already exploited the document similarity by using a k-nearest neighbors algorithm, we represent documents as document vectors by search engine indexing and then compute the similarity between documents using cosine similarity. Once the most similar documents for a given input document are retrieved, we rank their MeSH terms to choose the most suitable set for the input document. To do this, we define a scoring function that takes into account the frequency of the term into the set of retrieved documents and the similarity between the input document and each retrieved document. In addition, we implement guidelines proposed by human curators to annotate MEDLINE articles; in particular, the heuristic that says if 3 MeSH terms are proposed to classify an article and they share the same ancestor, they should be replaced by this ancestor. The representation of the MeSH thesaurus as a graph database allows us to employ graph search algorithms to quickly and easily capture hierarchical relationships such as the lowest common ancestor between terms. Our experiments show promising results with an F1 of 69% on the test dataset. To the best of our knowledge, this is the first work that combines search and graph database technologies for the task of biomedical semantic indexing. Due to its horizontal scalability, ElasticSearch becomes a real solution to index large collections of documents (such as the bibliographic database MEDLINE). Moreover, the use of graph search algorithms for accessing MeSH information could provide a support tool for cataloging MEDLINE abstracts in real time. ©Isabel Segura Bedmar, Paloma Martínez, Adrián Carruana Martín. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 01.12.2017.
F-OWL: An Inference Engine for Semantic Web
NASA Technical Reports Server (NTRS)
Zou, Youyong; Finin, Tim; Chen, Harry
2004-01-01
Understanding and using the data and knowledge encoded in semantic web documents requires an inference engine. F-OWL is an inference engine for the semantic web language OWL language based on F-logic, an approach to defining frame-based systems in logic. F-OWL is implemented using XSB and Flora-2 and takes full advantage of their features. We describe how F-OWL computes ontology entailment and compare it with other description logic based approaches. We also describe TAGA, a trading agent environment that we have used as a test bed for F-OWL and to explore how multiagent systems can use semantic web concepts and technology.
Huang, Chung-Chi; Lu, Zhiyong
2016-01-01
Identifying relevant papers from the literature is a common task in biocuration. Most current biomedical literature search systems primarily rely on matching user keywords. Semantic search, on the other hand, seeks to improve search accuracy by understanding the entities and contextual relations in user keywords. However, past research has mostly focused on semantically identifying biological entities (e.g. chemicals, diseases and genes) with little effort on discovering semantic relations. In this work, we aim to discover biomedical semantic relations in PubMed queries in an automated and unsupervised fashion. Specifically, we focus on extracting and understanding the contextual information (or context patterns) that is used by PubMed users to represent semantic relations between entities such as 'CHEMICAL-1 compared to CHEMICAL-2' With the advances in automatic named entity recognition, we first tag entities in PubMed queries and then use tagged entities as knowledge to recognize pattern semantics. More specifically, we transform PubMed queries into context patterns involving participating entities, which are subsequently projected to latent topics via latent semantic analysis (LSA) to avoid the data sparseness and specificity issues. Finally, we mine semantically similar contextual patterns or semantic relations based on LSA topic distributions. Our two separate evaluation experiments of chemical-chemical (CC) and chemical-disease (CD) relations show that the proposed approach significantly outperforms a baseline method, which simply measures pattern semantics by similarity in participating entities. The highest performance achieved by our approach is nearly 0.9 and 0.85 respectively for the CC and CD task when compared against the ground truth in terms of normalized discounted cumulative gain (nDCG), a standard measure of ranking quality. These results suggest that our approach can effectively identify and return related semantic patterns in a ranked order covering diverse bio-entity relations. To assess the potential utility of our automated top-ranked patterns of a given relation in semantic search, we performed a pilot study on frequently sought semantic relations in PubMed and observed improved literature retrieval effectiveness based on post-hoc human relevance evaluation. Further investigation in larger tests and in real-world scenarios is warranted. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.
Mining large heterogeneous data sets in drug discovery.
Wild, David J
2009-10-01
Increasingly, effective drug discovery involves the searching and data mining of large volumes of information from many sources covering the domains of chemistry, biology and pharmacology amongst others. This has led to a proliferation of databases and data sources relevant to drug discovery. This paper provides a review of the publicly-available large-scale databases relevant to drug discovery, describes the kinds of data mining approaches that can be applied to them and discusses recent work in integrative data mining that looks for associations that pan multiple sources, including the use of Semantic Web techniques. The future of mining large data sets for drug discovery requires intelligent, semantic aggregation of information from all of the data sources described in this review, along with the application of advanced methods such as intelligent agents and inference engines in client applications.
SciFlo: Semantically-Enabled Grid Workflow for Collaborative Science
NASA Astrophysics Data System (ADS)
Yunck, T.; Wilson, B. D.; Raskin, R.; Manipon, G.
2005-12-01
SciFlo is a system for Scientific Knowledge Creation on the Grid using a Semantically-Enabled Dataflow Execution Environment. SciFlo leverages Simple Object Access Protocol (SOAP) Web Services and the Grid Computing standards (WS-* standards and the Globus Alliance toolkits), and enables scientists to do multi-instrument Earth Science by assembling reusable SOAP Services, native executables, local command-line scripts, and python codes into a distributed computing flow (a graph of operators). SciFlo's XML dataflow documents can be a mixture of concrete operators (fully bound operations) and abstract template operators (late binding via semantic lookup). All data objects and operators can be both simply typed (simple and complex types in XML schema) and semantically typed using controlled vocabularies (linked to OWL ontologies such as SWEET). By exploiting ontology-enhanced search and inference, one can discover (and automatically invoke) Web Services and operators that have been semantically labeled as performing the desired transformation, and adapt a particular invocation to the proper interface (number, types, and meaning of inputs and outputs). The SciFlo client & server engines optimize the execution of such distributed data flows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. The scientist injects a distributed computation into the Grid by simply filling out an HTML form or directly authoring the underlying XML dataflow document, and results are returned directly to the scientist's desktop. A Visual Programming tool is also being developed, but it is not required. Once an analysis has been specified for a granule or day of data, it can be easily repeated with different control parameters and over months or years of data. SciFlo uses and preserves semantics, and also generates and infers new semantic annotations. Specifically, the SciFlo engine uses semantic metadata to understand (infer) what it is doing and potentially improve the data flow; preserves semantics by saving links to the semantics of (metadata describing) the input datasets, related datasets, and the data transformations (algorithms) used to generate downstream products; generates new metadata by allowing the user to add semantic annotations to the generated data products (or simply accept automatically generated provenance annotations); and infers new semantic metadata by understanding and applying logic to the semantics of the data and the transformations performed. Much ontology development still needs to be done but, nevertheless, SciFlo documents provide a substrate for using and preserving more semantics as ontologies develop. We will give a live demonstration of the growing SciFlo network using an example dataflow in which atmospheric temperature and water vapor profiles from three Earth Observing System (EOS) instruments are retrieved using SOAP (geo-location query & data access) services, co-registered, and visually & statistically compared on demand (see http://sciflo.jpl.nasa.gov for more information).
Foraging in Semantic Fields: How We Search Through Memory.
Hills, Thomas T; Todd, Peter M; Jones, Michael N
2015-07-01
When searching for concepts in memory--as in the verbal fluency task of naming all the animals one can think of--people appear to explore internal mental representations in much the same way that animals forage in physical space: searching locally within patches of information before transitioning globally between patches. However, the definition of the patches being searched in mental space is not well specified. Do we search by activating explicit predefined categories (e.g., pets) and recall items from within that category (categorical search), or do we activate and recall a connected sequence of individual items without using categorical information, with each item recalled leading to the retrieval of an associated item in a stream (associative search), or both? Using semantic representations in a search of associative memory framework and data from the animal fluency task, we tested competing hypotheses based on associative and categorical search models. Associative, but not categorical, patch transitions took longer to make than position-matched productions, suggesting that categorical transitions were not true transitions. There was also clear evidence of associative search even within categorical patch boundaries. Furthermore, most individuals' behavior was best explained by an associative search model without the addition of categorical information. Thus, our results support a search process that does not use categorical information, but for which patch boundaries shift with each recall and local search is well described by a random walk in semantic space, with switches to new regions of the semantic space when the current region is depleted. Copyright © 2015 Cognitive Science Society, Inc.
Semantic Modeling of Requirements: Leveraging Ontologies in Systems Engineering
ERIC Educational Resources Information Center
Mir, Masood Saleem
2012-01-01
The interdisciplinary nature of "Systems Engineering" (SE), having "stakeholders" from diverse domains with orthogonal facets, and need to consider all stages of "lifecycle" of system during conception, can benefit tremendously by employing "Knowledge Engineering" (KE) to achieve semantic agreement among all…
The effect of semantic context on prospective memory performance.
Thomas, Brandon J; McBride, Dawn M
2016-01-01
The current study provides evidence for spontaneous processing in prospective memory (PM) or memory for intentions. Discrepancy-plus-search is the spontaneous processing of PM cues via disruptions in processing fluency of ongoing task items. We tested whether this mechanism can be demonstrated in an ongoing rating task with a dominant semantic context. Ongoing task items were manipulated such that the PM cues were members of a semantic category (i.e., Body Parts) that was congruent or discrepant with the dominant semantic category in the ongoing task. Results showed that participants correctly responded to more PM cues when there was a category discrepancy between the PM cues and ongoing task items. Moreover, participants' identification of PM cues was accompanied by faster ongoing task reaction times when PM cues were discrepant with ongoing task items than when they were congruent. These results suggest that a discrepancy-plus-search process supports PM retrieval in certain contexts, and that some discrepancy-plus-search mechanisms may result from the violation of processing expectations within a semantic context.
Semantic memory retrieval circuit: role of pre-SMA, caudate, and thalamus.
Hart, John; Maguire, Mandy J; Motes, Michael; Mudar, Raksha Anand; Chiang, Hsueh-Sheng; Womack, Kyle B; Kraut, Michael A
2013-07-01
We propose that pre-supplementary motor area (pre-SMA)-thalamic interactions govern processes fundamental to semantic retrieval of an integrated object memory. At the onset of semantic retrieval, pre-SMA initiates electrical interactions between multiple cortical regions associated with semantic memory subsystems encodings as indexed by an increase in theta-band EEG power. This starts between 100-150 ms after stimulus presentation and is sustained throughout the task. We posit that this activity represents initiation of the object memory search, which continues in searching for an object memory. When the correct memory is retrieved, there is a high beta-band EEG power increase, which reflects communication between pre-SMA and thalamus, designates the end of the search process and resultant in object retrieval from multiple semantic memory subsystems. This high beta signal is also detected in cortical regions. This circuit is modulated by the caudate nuclei to facilitate correct and suppress incorrect target memories. Copyright © 2012 Elsevier Inc. All rights reserved.
The Role of Semantic Clustering in Optimal Memory Foraging.
Montez, Priscilla; Thompson, Graham; Kello, Christopher T
2015-11-01
Recent studies of semantic memory have investigated two theories of optimal search adopted from the animal foraging literature: Lévy flights and marginal value theorem. Each theory makes different simplifying assumptions and addresses different findings in search behaviors. In this study, an experiment is conducted to test whether clustering in semantic memory may play a role in evidence for both theories. Labeled magnets and a whiteboard were used to elicit spatial representations of semantic knowledge about animals. Category recall sequences from a separate experiment were used to trace search paths over the spatial representations of animal knowledge. Results showed that spatial distances between animal names arranged on the whiteboard were correlated with inter-response intervals (IRIs) during category recall, and distributions of both dependent measures approximated inverse power laws associated with Lévy flights. In addition, IRIs were relatively shorter when paths first entered animal clusters, and longer when they exited clusters, which is consistent with marginal value theorem. In conclusion, area-restricted searches over clustered semantic spaces may account for two different patterns of results interpreted as supporting two different theories of optimal memory foraging. Copyright © 2015 Cognitive Science Society, Inc.
Vonberg, Isabelle; Ehlen, Felicitas; Fromm, Ortwin; Klostermann, Fabian
2014-01-01
For word production, we may consciously pursue semantic or phonological search strategies, but it is uncertain whether we can retrieve the different aspects of lexical information independently from each other. We therefore studied the spread of semantic information into words produced under exclusively phonemic task demands. 42 subjects participated in a letter verbal fluency task, demanding the production of as many s-words as possible in two minutes. Based on curve fittings for the time courses of word production, output spurts (temporal clusters) considered to reflect rapid lexical retrieval based on automatic activation spread, were identified. Semantic and phonemic word relatedness within versus between these clusters was assessed by respective scores (0 meaning no relation, 4 maximum relation). Subjects produced 27.5 (±9.4) words belonging to 6.7 (±2.4) clusters. Both phonemically and semantically words were more related within clusters than between clusters (phon: 0.33±0.22 vs. 0.19±0.17, p<.01; sem: 0.65±0.29 vs. 0.37±0.29, p<.01). Whereas the extent of phonemic relatedness correlated with high task performance, the contrary was the case for the extent of semantic relatedness. The results indicate that semantic information spread occurs, even if the consciously pursued word search strategy is purely phonological. This, together with the negative correlation between semantic relatedness and verbal output suits the idea of a semantic default mode of lexical search, acting against rapid task performance in the given scenario of phonemic verbal fluency. The simultaneity of enhanced semantic and phonemic word relatedness within the same temporal cluster boundaries suggests an interaction between content and sound-related information whenever a new semantic field has been opened.
JournalMap: Geo-semantic searching for relevant knowledge
USDA-ARS?s Scientific Manuscript database
Ecologists struggling to understand rapidly changing environments and evolving ecosystem threats need quick access to relevant research and documentation of natural systems. The advent of semantic and aggregation searching (e.g., Google Scholar, Web of Science) has made it easier to find useful lite...
Huang, Jingshan; Gutierrez, Fernando; Strachan, Harrison J; Dou, Dejing; Huang, Weili; Smith, Barry; Blake, Judith A; Eilbeck, Karen; Natale, Darren A; Lin, Yu; Wu, Bin; Silva, Nisansa de; Wang, Xiaowei; Liu, Zixing; Borchert, Glen M; Tan, Ming; Ruttenberg, Alan
2016-01-01
As a special class of non-coding RNAs (ncRNAs), microRNAs (miRNAs) perform important roles in numerous biological and pathological processes. The realization of miRNA functions depends largely on how miRNAs regulate specific target genes. It is therefore critical to identify, analyze, and cross-reference miRNA-target interactions to better explore and delineate miRNA functions. Semantic technologies can help in this regard. We previously developed a miRNA domain-specific application ontology, Ontology for MIcroRNA Target (OMIT), whose goal was to serve as a foundation for semantic annotation, data integration, and semantic search in the miRNA field. In this paper we describe our continuing effort to develop the OMIT, and demonstrate its use within a semantic search system, OmniSearch, designed to facilitate knowledge capture of miRNA-target interaction data. Important changes in the current version OMIT are summarized as: (1) following a modularized ontology design (with 2559 terms imported from the NCRO ontology); (2) encoding all 1884 human miRNAs (vs. 300 in previous versions); and (3) setting up a GitHub project site along with an issue tracker for more effective community collaboration on the ontology development. The OMIT ontology is free and open to all users, accessible at: http://purl.obolibrary.org/obo/omit.owl. The OmniSearch system is also free and open to all users, accessible at: http://omnisearch.soc.southalabama.edu/index.php/Software.
Generic-distributed framework for cloud services marketplace based on unified ontology.
Hasan, Samer; Valli Kumari, V
2017-11-01
Cloud computing is a pattern for delivering ubiquitous and on demand computing resources based on pay-as-you-use financial model. Typically, cloud providers advertise cloud service descriptions in various formats on the Internet. On the other hand, cloud consumers use available search engines (Google and Yahoo) to explore cloud service descriptions and find the adequate service. Unfortunately, general purpose search engines are not designed to provide a small and complete set of results, which makes the process a big challenge. This paper presents a generic-distrusted framework for cloud services marketplace to automate cloud services discovery and selection process, and remove the barriers between service providers and consumers. Additionally, this work implements two instances of generic framework by adopting two different matching algorithms; namely dominant and recessive attributes algorithm borrowed from gene science and semantic similarity algorithm based on unified cloud service ontology. Finally, this paper presents unified cloud services ontology and models the real-life cloud services according to the proposed ontology. To the best of the authors' knowledge, this is the first attempt to build a cloud services marketplace where cloud providers and cloud consumers can trend cloud services as utilities. In comparison with existing work, semantic approach reduced the execution time by 20% and maintained the same values for all other parameters. On the other hand, dominant and recessive attributes approach reduced the execution time by 57% but showed lower value for recall.
SemanticFind: Locating What You Want in a Patient Record, Not Just What You Ask For
Prager, John M.; Liang, Jennifer J.; Devarakonda, Murthy V.
2017-01-01
We present a new model of patient record search, called SemanticFind, which goes beyond traditional textual and medical synonym matches by locating patient data that a clinician would want to see rather than just what they ask for. The new model is implemented by making extensive use of the UMLS semantic network, distributional semantics, and NLP, to match query terms along several dimensions in a patient record with the returned matches organized accordingly. The new approach finds all clinically related concepts without the user having to ask for them. An evaluation of the accuracy of SemanticFind shows that it found twice as many relevant matches compared to those found by literal (traditional) search alone, along with very high precision and recall. These results suggest potential uses for SemanticFind in clinical practice, retrospective chart reviews, and in automated extraction of quality metrics. PMID:28815139
The Role of Semantic Clustering in Optimal Memory Foraging
ERIC Educational Resources Information Center
Montez, Priscilla; Thompson, Graham; Kello, Christopher T.
2015-01-01
Recent studies of semantic memory have investigated two theories of optimal search adopted from the animal foraging literature: Lévy flights and marginal value theorem. Each theory makes different simplifying assumptions and addresses different findings in search behaviors. In this study, an experiment is conducted to test whether clustering in…
ERIC Educational Resources Information Center
Nesic, Sasa; Gasevic, Dragan; Jazayeri, Mehdi; Landoni, Monica
2011-01-01
Semantic web technologies have been applied to many aspects of learning content authoring including semantic annotation, semantic search, dynamic assembly, and personalization of learning content. At the same time, social networking services have started to play an important role in the authoring process by supporting authors' collaborative…
Electrophysiology reveals semantic priming at a short SOA irrespective of depth of prime processing.
Küper, Kristina; Heil, Martin
2009-04-03
The otherwise robust behavioral semantic priming effect is reduced to the point of being absent when a letter search has to be performed on the prime word. As a result the automaticity of semantic activation has been called into question. It is unclear, however, in how far automatic processes are even measurable in the letter search priming paradigm as the prime task necessitates a long prime-probe stimulus-onset asynchrony (SOA). In a modified procedure, a short SOA can be realized by delaying the prime task response until after participants have made a lexical decision on the probe. While the absence of lexical decision priming has already been demonstrated in this design it seems premature to draw any definite conclusions from this purely behavioral result since event related potential (ERP) measures have been shown to be a more sensitive index of semantic activation. Using the modified paradigm we thus recorded ERP in addition to lexical decision times. Stimuli were presented at two different SOAs (240 ms vs. 840 ms) and participants performed either a grammatical discrimination (Experiment 1) or a letter search (Experiment 2) on the prime. Irrespective of prime task, the modulation of the N400, the ERP correlate of semantic activation, provided clear-cut evidence of semantic processing at the short SOA. Implications for theories of semantic activation as well as the constraints of the delayed prime task procedure are discussed.
Semantic Priming Revisited: In Search of Structural Age Change in Semantic Knowledge.
ERIC Educational Resources Information Center
Mergler, Nancy L.; And Others
Contradictory previous research results showing that (1) language knowledge does not decrease with age; and (2) age differences exist in semantic strategies for memory recall provide the impetus for a study of semantic priming in young adults (mean age = 20) and older adults (mean age = 70) by providing target and prime words with six different…
NASA Astrophysics Data System (ADS)
Li, Y.; Jiang, Y.; Yang, C. P.; Armstrong, E. M.; Huang, T.; Moroni, D. F.; McGibbney, L. J.
2016-12-01
Big oceanographic data have been produced, archived and made available online, but finding the right data for scientific research and application development is still a significant challenge. A long-standing problem in data discovery is how to find the interrelationships between keywords and data, as well as the intrarelationships of the two individually. Most previous research attempted to solve this problem by building domain-specific ontology either manually or through automatic machine learning techniques. The former is costly, labor intensive and hard to keep up-to-date, while the latter is prone to noise and may be difficult for human to understand. Large-scale user behavior data modelling represents a largely untapped, unique, and valuable source for discovering semantic relationships among domain-specific vocabulary. In this article, we propose a search engine framework for mining and utilizing dataset relevancy from oceanographic dataset metadata, user behaviors, and existing ontology. The objective is to improve discovery accuracy of oceanographic data and reduce time for scientist to discover, download and reformat data for their projects. Experiments and a search example show that the proposed search engine helps both scientists and general users search with better ranking results, recommendation, and ontology navigation.
Heuristics for Relevancy Ranking of Earth Dataset Search Results
NASA Astrophysics Data System (ADS)
Lynnes, C.; Quinn, P.; Norton, J.
2016-12-01
As the Variety of Earth science datasets increases, science researchers find it more challenging to discover and select the datasets that best fit their needs. The most common way of search providers to address this problem is to rank the datasets returned for a query by their likely relevance to the user. Large web page search engines typically use text matching supplemented with reverse link counts, semantic annotations and user intent modeling. However, this produces uneven results when applied to dataset metadata records simply externalized as a web page. Fortunately, data and search provides have decades of experience in serving data user communities, allowing them to form heuristics that leverage the structure in the metadata together with knowledge about the user community. Some of these heuristics include specific ways of matching the user input to the essential measurements in the dataset and determining overlaps of time range and spatial areas. Heuristics based on the novelty of the datasets can prioritize later, better versions of data over similar predecessors. And knowledge of how different user types and communities use data can be brought to bear in cases where characteristics of the user (discipline, expertise) or their intent (applications, research) can be divined. The Earth Observing System Data and Information System has begun implementing some of these heuristics in the relevancy algorithm of its Common Metadata Repository search engine.
Heuristics for Relevancy Ranking of Earth Dataset Search Results
NASA Technical Reports Server (NTRS)
Lynnes, Christopher; Quinn, Patrick; Norton, James
2016-01-01
As the Variety of Earth science datasets increases, science researchers find it more challenging to discover and select the datasets that best fit their needs. The most common way of search providers to address this problem is to rank the datasets returned for a query by their likely relevance to the user. Large web page search engines typically use text matching supplemented with reverse link counts, semantic annotations and user intent modeling. However, this produces uneven results when applied to dataset metadata records simply externalized as a web page. Fortunately, data and search provides have decades of experience in serving data user communities, allowing them to form heuristics that leverage the structure in the metadata together with knowledge about the user community. Some of these heuristics include specific ways of matching the user input to the essential measurements in the dataset and determining overlaps of time range and spatial areas. Heuristics based on the novelty of the datasets can prioritize later, better versions of data over similar predecessors. And knowledge of how different user types and communities use data can be brought to bear in cases where characteristics of the user (discipline, expertise) or their intent (applications, research) can be divined. The Earth Observing System Data and Information System has begun implementing some of these heuristics in the relevancy algorithm of its Common Metadata Repository search engine.
Relevancy Ranking of Satellite Dataset Search Results
NASA Technical Reports Server (NTRS)
Lynnes, Christopher; Quinn, Patrick; Norton, James
2017-01-01
As the Variety of Earth science datasets increases, science researchers find it more challenging to discover and select the datasets that best fit their needs. The most common way of search providers to address this problem is to rank the datasets returned for a query by their likely relevance to the user. Large web page search engines typically use text matching supplemented with reverse link counts, semantic annotations and user intent modeling. However, this produces uneven results when applied to dataset metadata records simply externalized as a web page. Fortunately, data and search provides have decades of experience in serving data user communities, allowing them to form heuristics that leverage the structure in the metadata together with knowledge about the user community. Some of these heuristics include specific ways of matching the user input to the essential measurements in the dataset and determining overlaps of time range and spatial areas. Heuristics based on the novelty of the datasets can prioritize later, better versions of data over similar predecessors. And knowledge of how different user types and communities use data can be brought to bear in cases where characteristics of the user (discipline, expertise) or their intent (applications, research) can be divined. The Earth Observing System Data and Information System has begun implementing some of these heuristics in the relevancy algorithm of its Common Metadata Repository search engine.
A Semantic Web-based System for Managing Clinical Archetypes.
Fernandez-Breis, Jesualdo Tomas; Menarguez-Tortosa, Marcos; Martinez-Costa, Catalina; Fernandez-Breis, Eneko; Herrero-Sempere, Jose; Moner, David; Sanchez, Jesus; Valencia-Garcia, Rafael; Robles, Montserrat
2008-01-01
Archetypes facilitate the sharing of clinical knowledge and therefore are a basic tool for achieving interoperability between healthcare information systems. In this paper, a Semantic Web System for Managing Archetypes is presented. This system allows for the semantic annotation of archetypes, as well for performing semantic searches. The current system is capable of working with both ISO13606 and OpenEHR archetypes.
Jafarpour, Borna; Abidi, Samina Raza; Abidi, Syed Sibte Raza
2016-01-01
Computerizing paper-based CPG and then executing them can provide evidence-informed decision support to physicians at the point of care. Semantic web technologies especially web ontology language (OWL) ontologies have been profusely used to represent computerized CPG. Using semantic web reasoning capabilities to execute OWL-based computerized CPG unties them from a specific custom-built CPG execution engine and increases their shareability as any OWL reasoner and triple store can be utilized for CPG execution. However, existing semantic web reasoning-based CPG execution engines suffer from lack of ability to execute CPG with high levels of expressivity, high cognitive load of computerization of paper-based CPG and updating their computerized versions. In order to address these limitations, we have developed three CPG execution engines based on OWL 1 DL, OWL 2 DL and OWL 2 DL + semantic web rule language (SWRL). OWL 1 DL serves as the base execution engine capable of executing a wide range of CPG constructs, however for executing highly complex CPG the OWL 2 DL and OWL 2 DL + SWRL offer additional executional capabilities. We evaluated the technical performance and medical correctness of our execution engines using a range of CPG. Technical evaluations show the efficiency of our CPG execution engines in terms of CPU time and validity of the generated recommendation in comparison to existing CPG execution engines. Medical evaluations by domain experts show the validity of the CPG-mediated therapy plans in terms of relevance, safety, and ordering for a wide range of patient scenarios.
Ontology-Based Search of Genomic Metadata.
Fernandez, Javier D; Lenzerini, Maurizio; Masseroli, Marco; Venco, Francesco; Ceri, Stefano
2016-01-01
The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted from these huge and largely unexplored data, leading to data-driven genomic, transcriptomic, and epigenomic discoveries. Yet, search of relevant datasets for knowledge discovery is limitedly supported: metadata describing ENCODE datasets are quite simple and incomplete, and not described by a coherent underlying ontology. Here, we show how to overcome this limitation, by adopting an ENCODE metadata searching approach which uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base by starting with concepts extracted from ENCODE metadata, matched to and expanded on biomedical ontologies integrated in the well-established Unified Medical Language System. We prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data from arbitrary biologists' queries. This allows correctly finding more datasets than those extracted by a purely syntactic search, as supported by the other available systems. We empirically show the relevance of found datasets to the biologists' queries.
Distracted by Relatives: Effects of Frontal Lobe Damage on Semantic Distraction
ERIC Educational Resources Information Center
Telling, Anna L.; Meyer, Antje S.; Humphreys, Glyn W.
2010-01-01
When young adults carry out visual search, distractors that are semantically related, rather than unrelated, to targets can disrupt target selection (see [Belke et al., 2008] and [Moores et al., 2003]). This effect is apparent on the first eye movements in search, suggesting that attention is sometimes captured by related distractors. Here we…
Spatial information semantic query based on SPARQL
NASA Astrophysics Data System (ADS)
Xiao, Zhifeng; Huang, Lei; Zhai, Xiaofang
2009-10-01
How can the efficiency of spatial information inquiries be enhanced in today's fast-growing information age? We are rich in geospatial data but poor in up-to-date geospatial information and knowledge that are ready to be accessed by public users. This paper adopts an approach for querying spatial semantic by building an Web Ontology language(OWL) format ontology and introducing SPARQL Protocol and RDF Query Language(SPARQL) to search spatial semantic relations. It is important to establish spatial semantics that support for effective spatial reasoning for performing semantic query. Compared to earlier keyword-based and information retrieval techniques that rely on syntax, we use semantic approaches in our spatial queries system. Semantic approaches need to be developed by ontology, so we use OWL to describe spatial information extracted by the large-scale map of Wuhan. Spatial information expressed by ontology with formal semantics is available to machines for processing and to people for understanding. The approach is illustrated by introducing a case study for using SPARQL to query geo-spatial ontology instances of Wuhan. The paper shows that making use of SPARQL to search OWL ontology instances can ensure the result's accuracy and applicability. The result also indicates constructing a geo-spatial semantic query system has positive efforts on forming spatial query and retrieval.
Knowledge-based understanding of aerial surveillance video
NASA Astrophysics Data System (ADS)
Cheng, Hui; Butler, Darren
2006-05-01
Aerial surveillance has long been used by the military to locate, monitor and track the enemy. Recently, its scope has expanded to include law enforcement activities, disaster management and commercial applications. With the ever-growing amount of aerial surveillance video acquired daily, there is an urgent need for extracting actionable intelligence in a timely manner. Furthermore, to support high-level video understanding, this analysis needs to go beyond current approaches and consider the relationships, motivations and intentions of the objects in the scene. In this paper we propose a system for interpreting aerial surveillance videos that automatically generates a succinct but meaningful description of the observed regions, objects and events. For a given video, the semantics of important regions and objects, and the relationships between them, are summarised into a semantic concept graph. From this, a textual description is derived that provides new search and indexing options for aerial video and enables the fusion of aerial video with other information modalities, such as human intelligence, reports and signal intelligence. Using a Mixture-of-Experts video segmentation algorithm an aerial video is first decomposed into regions and objects with predefined semantic meanings. The objects are then tracked and coerced into a semantic concept graph and the graph is summarized spatially, temporally and semantically using ontology guided sub-graph matching and re-writing. The system exploits domain specific knowledge and uses a reasoning engine to verify and correct the classes, identities and semantic relationships between the objects. This approach is advantageous because misclassifications lead to knowledge contradictions and hence they can be easily detected and intelligently corrected. In addition, the graph representation highlights events and anomalies that a low-level analysis would overlook.
Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Lee, Sungyoung; Chung, Tae Choong
2016-01-01
Privacy-aware search of outsourced data ensures relevant data access in the untrusted domain of a public cloud service provider. Subscriber of a public cloud storage service can determine the presence or absence of a particular keyword by submitting search query in the form of a trapdoor. However, these trapdoor-based search queries are limited in functionality and cannot be used to identify secure outsourced data which contains semantically equivalent information. In addition, trapdoor-based methodologies are confined to pre-defined trapdoors and prevent subscribers from searching outsourced data with arbitrarily defined search criteria. To solve the problem of relevant data access, we have proposed an index-based privacy-aware search methodology that ensures semantic retrieval of data from an untrusted domain. This method ensures oblivious execution of a search query and leverages authorized subscribers to model conjunctive search queries without relying on predefined trapdoors. A security analysis of our proposed methodology shows that, in a conspired attack, unauthorized subscribers and untrusted cloud service providers cannot deduce any information that can lead to the potential loss of data privacy. A computational time analysis on commodity hardware demonstrates that our proposed methodology requires moderate computational resources to model a privacy-aware search query and for its oblivious evaluation on a cloud service provider.
Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Lee, Sungyoung; Chung, Tae Choong
2016-01-01
Privacy-aware search of outsourced data ensures relevant data access in the untrusted domain of a public cloud service provider. Subscriber of a public cloud storage service can determine the presence or absence of a particular keyword by submitting search query in the form of a trapdoor. However, these trapdoor-based search queries are limited in functionality and cannot be used to identify secure outsourced data which contains semantically equivalent information. In addition, trapdoor-based methodologies are confined to pre-defined trapdoors and prevent subscribers from searching outsourced data with arbitrarily defined search criteria. To solve the problem of relevant data access, we have proposed an index-based privacy-aware search methodology that ensures semantic retrieval of data from an untrusted domain. This method ensures oblivious execution of a search query and leverages authorized subscribers to model conjunctive search queries without relying on predefined trapdoors. A security analysis of our proposed methodology shows that, in a conspired attack, unauthorized subscribers and untrusted cloud service providers cannot deduce any information that can lead to the potential loss of data privacy. A computational time analysis on commodity hardware demonstrates that our proposed methodology requires moderate computational resources to model a privacy-aware search query and for its oblivious evaluation on a cloud service provider. PMID:27571421
Moen, Hans; Ginter, Filip; Marsi, Erwin; Peltonen, Laura-Maria; Salakoski, Tapio; Salanterä, Sanna
2015-01-01
Patients' health related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and patients, primarily for clinical purposes, but also for secondary purposes such as decision support and research. The vast amounts of information in EHR systems complicate information management and increase the risk of information overload. Therefore, clinicians and researchers need new tools to manage the information stored in the EHRs. A common use case is, given a--possibly unfinished--care episode, to retrieve the most similar care episodes among the records. This paper presents several methods for information retrieval, focusing on care episode retrieval, based on textual similarity, where similarity is measured through domain-specific modelling of the distributional semantics of words. Models include variants of random indexing and the semantic neural network model word2vec. Two novel methods are introduced that utilize the ICD-10 codes attached to care episodes to better induce domain-specificity in the semantic model. We report on experimental evaluation of care episode retrieval that circumvents the lack of human judgements regarding episode relevance. Results suggest that several of the methods proposed outperform a state-of-the art search engine (Lucene) on the retrieval task.
2015-01-01
Patients' health related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and patients, primarily for clinical purposes, but also for secondary purposes such as decision support and research. The vast amounts of information in EHR systems complicate information management and increase the risk of information overload. Therefore, clinicians and researchers need new tools to manage the information stored in the EHRs. A common use case is, given a - possibly unfinished - care episode, to retrieve the most similar care episodes among the records. This paper presents several methods for information retrieval, focusing on care episode retrieval, based on textual similarity, where similarity is measured through domain-specific modelling of the distributional semantics of words. Models include variants of random indexing and the semantic neural network model word2vec. Two novel methods are introduced that utilize the ICD-10 codes attached to care episodes to better induce domain-specificity in the semantic model. We report on experimental evaluation of care episode retrieval that circumvents the lack of human judgements regarding episode relevance. Results suggest that several of the methods proposed outperform a state-of-the art search engine (Lucene) on the retrieval task. PMID:26099735
Reasoning with Vectors: A Continuous Model for Fast Robust Inference.
Widdows, Dominic; Cohen, Trevor
2015-10-01
This paper describes the use of continuous vector space models for reasoning with a formal knowledge base. The practical significance of these models is that they support fast, approximate but robust inference and hypothesis generation, which is complementary to the slow, exact, but sometimes brittle behavior of more traditional deduction engines such as theorem provers. The paper explains the way logical connectives can be used in semantic vector models, and summarizes the development of Predication-based Semantic Indexing, which involves the use of Vector Symbolic Architectures to represent the concepts and relationships from a knowledge base of subject-predicate-object triples. Experiments show that the use of continuous models for formal reasoning is not only possible, but already demonstrably effective for some recognized informatics tasks, and showing promise in other traditional problem areas. Examples described in this paper include: predicting new uses for existing drugs in biomedical informatics; removing unwanted meanings from search results in information retrieval and concept navigation; type-inference from attributes; comparing words based on their orthography; and representing tabular data, including modelling numerical values. The algorithms and techniques described in this paper are all publicly released and freely available in the Semantic Vectors open-source software package.
Reasoning with Vectors: A Continuous Model for Fast Robust Inference
Widdows, Dominic; Cohen, Trevor
2015-01-01
This paper describes the use of continuous vector space models for reasoning with a formal knowledge base. The practical significance of these models is that they support fast, approximate but robust inference and hypothesis generation, which is complementary to the slow, exact, but sometimes brittle behavior of more traditional deduction engines such as theorem provers. The paper explains the way logical connectives can be used in semantic vector models, and summarizes the development of Predication-based Semantic Indexing, which involves the use of Vector Symbolic Architectures to represent the concepts and relationships from a knowledge base of subject-predicate-object triples. Experiments show that the use of continuous models for formal reasoning is not only possible, but already demonstrably effective for some recognized informatics tasks, and showing promise in other traditional problem areas. Examples described in this paper include: predicting new uses for existing drugs in biomedical informatics; removing unwanted meanings from search results in information retrieval and concept navigation; type-inference from attributes; comparing words based on their orthography; and representing tabular data, including modelling numerical values. The algorithms and techniques described in this paper are all publicly released and freely available in the Semantic Vectors open-source software package.1 PMID:26582967
Determining Semantically Related Significant Genes.
Taha, Kamal
2014-01-01
GO relation embodies some aspects of existence dependency. If GO term xis existence-dependent on GO term y, the presence of y implies the presence of x. Therefore, the genes annotated with the function of the GO term y are usually functionally and semantically related to the genes annotated with the function of the GO term x. A large number of gene set enrichment analysis methods have been developed in recent years for analyzing gene sets enrichment. However, most of these methods overlook the structural dependencies between GO terms in GO graph by not considering the concept of existence dependency. We propose in this paper a biological search engine called RSGSearch that identifies enriched sets of genes annotated with different functions using the concept of existence dependency. We observe that GO term xcannot be existence-dependent on GO term y, if x- and y- have the same specificity (biological characteristics). After encoding into a numeric format the contributions of GO terms annotating target genes to the semantics of their lowest common ancestors (LCAs), RSGSearch uses microarray experiment to identify the most significant LCA that annotates the result genes. We evaluated RSGSearch experimentally and compared it with five gene set enrichment systems. Results showed marked improvement.
NASA Technical Reports Server (NTRS)
Driscoll, James N.
1994-01-01
The high-speed data search system developed for KSC incorporates existing and emerging information retrieval technology to help a user intelligently and rapidly locate information found in large textual databases. This technology includes: natural language input; statistical ranking of retrieved information; an artificial intelligence concept called semantics, where 'surface level' knowledge found in text is used to improve the ranking of retrieved information; and relevance feedback, where user judgements about viewed information are used to automatically modify the search for further information. Semantics and relevance feedback are features of the system which are not available commercially. The system further demonstrates focus on paragraphs of information to decide relevance; and it can be used (without modification) to intelligently search all kinds of document collections, such as collections of legal documents medical documents, news stories, patents, and so forth. The purpose of this paper is to demonstrate the usefulness of statistical ranking, our semantic improvement, and relevance feedback.
Semantic e-Learning: Next Generation of e-Learning?
NASA Astrophysics Data System (ADS)
Konstantinos, Markellos; Penelope, Markellou; Giannis, Koutsonikos; Aglaia, Liopa-Tsakalidi
Semantic e-learning aspires to be the next generation of e-learning, since the understanding of learning materials and knowledge semantics allows their advanced representation, manipulation, sharing, exchange and reuse and ultimately promote efficient online experiences for users. In this context, the paper firstly explores some fundamental Semantic Web technologies and then discusses current and potential applications of these technologies in e-learning domain, namely, Semantic portals, Semantic search, personalization, recommendation systems, social software and Web 2.0 tools. Finally, it highlights future research directions and open issues of the field.
Multisensory brand search: How the meaning of sounds guides consumers' visual attention.
Knoeferle, Klemens M; Knoeferle, Pia; Velasco, Carlos; Spence, Charles
2016-06-01
Building on models of crossmodal attention, the present research proposes that brand search is inherently multisensory, in that the consumers' visual search for a specific brand can be facilitated by semantically related stimuli that are presented in another sensory modality. A series of 5 experiments demonstrates that the presentation of spatially nonpredictive auditory stimuli associated with products (e.g., usage sounds or product-related jingles) can crossmodally facilitate consumers' visual search for, and selection of, products. Eye-tracking data (Experiment 2) revealed that the crossmodal effect of auditory cues on visual search manifested itself not only in RTs, but also in the earliest stages of visual attentional processing, thus suggesting that the semantic information embedded within sounds can modulate the perceptual saliency of the target products' visual representations. Crossmodal facilitation was even observed for newly learnt associations between unfamiliar brands and sonic logos, implicating multisensory short-term learning in establishing audiovisual semantic associations. The facilitation effect was stronger when searching complex rather than simple visual displays, thus suggesting a modulatory role of perceptual load. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Seek and you shall remember: Scene semantics interact with visual search to build better memories
Draschkow, Dejan; Wolfe, Jeremy M.; Võ, Melissa L.-H.
2014-01-01
Memorizing critical objects and their locations is an essential part of everyday life. In the present study, incidental encoding of objects in naturalistic scenes during search was compared to explicit memorization of those scenes. To investigate if prior knowledge of scene structure influences these two types of encoding differently, we used meaningless arrays of objects as well as objects in real-world, semantically meaningful images. Surprisingly, when participants were asked to recall scenes, their memory performance was markedly better for searched objects than for objects they had explicitly tried to memorize, even though participants in the search condition were not explicitly asked to memorize objects. This finding held true even when objects were observed for an equal amount of time in both conditions. Critically, the recall benefit for searched over memorized objects in scenes was eliminated when objects were presented on uniform, non-scene backgrounds rather than in a full scene context. Thus, scene semantics not only help us search for objects in naturalistic scenes, but appear to produce a representation that supports our memory for those objects beyond intentional memorization. PMID:25015385
Energy Consumption Forecasting Using Semantic-Based Genetic Programming with Local Search Optimizer.
Castelli, Mauro; Trujillo, Leonardo; Vanneschi, Leonardo
2015-01-01
Energy consumption forecasting (ECF) is an important policy issue in today's economies. An accurate ECF has great benefits for electric utilities and both negative and positive errors lead to increased operating costs. The paper proposes a semantic based genetic programming framework to address the ECF problem. In particular, we propose a system that finds (quasi-)perfect solutions with high probability and that generates models able to produce near optimal predictions also on unseen data. The framework blends a recently developed version of genetic programming that integrates semantic genetic operators with a local search method. The main idea in combining semantic genetic programming and a local searcher is to couple the exploration ability of the former with the exploitation ability of the latter. Experimental results confirm the suitability of the proposed method in predicting the energy consumption. In particular, the system produces a lower error with respect to the existing state-of-the art techniques used on the same dataset. More importantly, this case study has shown that including a local searcher in the geometric semantic genetic programming system can speed up the search process and can result in fitter models that are able to produce an accurate forecasting also on unseen data.
Simultenious binary hash and features learning for image retrieval
NASA Astrophysics Data System (ADS)
Frantc, V. A.; Makov, S. V.; Voronin, V. V.; Marchuk, V. I.; Semenishchev, E. A.; Egiazarian, K. O.; Agaian, S.
2016-05-01
Content-based image retrieval systems have plenty of applications in modern world. The most important one is the image search by query image or by semantic description. Approaches to this problem are employed in personal photo-collection management systems, web-scale image search engines, medical systems, etc. Automatic analysis of large unlabeled image datasets is virtually impossible without satisfactory image-retrieval technique. It's the main reason why this kind of automatic image processing has attracted so much attention during recent years. Despite rather huge progress in the field, semantically meaningful image retrieval still remains a challenging task. The main issue here is the demand to provide reliable results in short amount of time. This paper addresses the problem by novel technique for simultaneous learning of global image features and binary hash codes. Our approach provide mapping of pixel-based image representation to hash-value space simultaneously trying to save as much of semantic image content as possible. We use deep learning methodology to generate image description with properties of similarity preservation and statistical independence. The main advantage of our approach in contrast to existing is ability to fine-tune retrieval procedure for very specific application which allow us to provide better results in comparison to general techniques. Presented in the paper framework for data- dependent image hashing is based on use two different kinds of neural networks: convolutional neural networks for image description and autoencoder for feature to hash space mapping. Experimental results confirmed that our approach has shown promising results in compare to other state-of-the-art methods.
Measurement of tag confidence in user generated contents retrieval
NASA Astrophysics Data System (ADS)
Lee, Sihyoung; Min, Hyun-Seok; Lee, Young Bok; Ro, Yong Man
2009-01-01
As online image sharing services are becoming popular, the importance of correctly annotated tags is being emphasized for precise search and retrieval. Tags created by user along with user-generated contents (UGC) are often ambiguous due to the fact that some tags are highly subjective and visually unrelated to the image. They cause unwanted results to users when image search engines rely on tags. In this paper, we propose a method of measuring tag confidence so that one can differentiate confidence tags from noisy tags. The proposed tag confidence is measured from visual semantics of the image. To verify the usefulness of the proposed method, experiments were performed with UGC database from social network sites. Experimental results showed that the image retrieval performance with confidence tags was increased.
NASA Astrophysics Data System (ADS)
Yakovlev, A. A.; Sorokin, V. S.; Mishustina, S. N.; Proidakova, N. V.; Postupaeva, S. G.
2017-01-01
The article describes a new method of search design of refrigerating systems, the basis of which is represented by a graph model of the physical operating principle based on thermodynamical description of physical processes. The mathematical model of the physical operating principle has been substantiated, and the basic abstract theorems relatively semantic load applied to nodes and edges of the graph have been represented. The necessity and the physical operating principle, sufficient for the given model and intended for the considered device class, were demonstrated by the example of a vapour-compression refrigerating plant. The example of obtaining a multitude of engineering solutions of a vapour-compression refrigerating plant has been considered.
The semantic planetary data system
NASA Technical Reports Server (NTRS)
Hughes, J. Steven; Crichton, Daniel; Kelly, Sean; Mattmann, Chris
2005-01-01
This paper will provide a brief overview of the PDS data model and the PDS catalog. It will then describe the implentation of the Semantic PDS including the development of the formal ontology, the generation of RDFS/XML and RDF/XML data sets, and the buiding of the semantic search application.
ERIC Educational Resources Information Center
McCarthy, Matthew T.
2017-01-01
Artificial intelligence (AI) that is based upon semantic search has become one of the dominant means for accessing information in recent years. This is particularly the case in mobile contexts, as search-based AI are embedded in each of the major mobile operating systems. The implications are such that information is becoming less a matter of…
NASA Astrophysics Data System (ADS)
Colomo-Palacios, Ricardo; Jiménez-López, Diego; García-Crespo, Ángel; Blanco-Iglesias, Borja
eLearning educative processes are a challenge for educative institutions and education professionals. In an environment in which learning resources are being produced, catalogued and stored using innovative ways, SOLE provides a platform in which exam questions can be produced supported by Web 2.0 tools, catalogued and labeled via semantic web and stored and distributed using eLearning standards. This paper presents, SOLE, a social network of exam questions sharing particularized for Software Engineering domain, based on semantics and built using semantic web and eLearning standards, such as IMS Question and Test Interoperability specification 2.1.
When fruits lose to animals: Disorganized search of semantic memory in Parkinson's disease.
Tagini, Sofia; Seyed-Allaei, Shima; Scarpina, Federica; Toraldo, Alessio; Mauro, Alessandro; Cherubini, Paolo; Reverberi, Carlo
2018-04-16
The semantic fluency task is widely used in both clinical and research settings to assess both the integrity of the semantic store and the effectiveness of the search through it. Our aim was to investigate whether nondemented Parkinson's disease (PD) patients show an impairment in the strategic exploration of the semantic store and whether the tested semantic category has an impact on multiple measures of performance. We compared 74 nondemented PD patients with 254 healthy subjects in a semantic fluency test using relatively small (fruits) and large (animals) semantic categories. Number of words produced, number of explored semantic subcategories, and degree of order in the produced sequences were computed as dependent variables. PD patients produced fewer words than healthy subjects did, regardless of the category. Number of subcategories was also lower in PD patients than in healthy subjects, without a significant difference between categories. Critically, PD patients' sequences were less semantically organized than were those of controls, but this effect appeared in only the smaller category (fruits), thus pointing to a lack of strategy in exploring the semantic store. Our results show that the semantic fluency deficit in PD patients has a strategic component, even though that may not be the only cause of the impaired performance. Furthermore, our evidence suggests that the semantic category used in the test influences performance, hence providing an explanation for the failure by previous studies, which often used large categories such as animals, to detect strategy deficits in PD. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Concept similarity and related categories in information retrieval using formal concept analysis
NASA Astrophysics Data System (ADS)
Eklund, P.; Ducrou, J.; Dau, F.
2012-11-01
The application of formal concept analysis to the problem of information retrieval has been shown useful but has lacked any real analysis of the idea of relevance ranking of search results. SearchSleuth is a program developed to experiment with the automated local analysis of Web search using formal concept analysis. SearchSleuth extends a standard search interface to include a conceptual neighbourhood centred on a formal concept derived from the initial query. This neighbourhood of the concept derived from the search terms is decorated with its upper and lower neighbours representing more general and special concepts, respectively. SearchSleuth is in many ways an archetype of search engines based on formal concept analysis with some novel features. In SearchSleuth, the notion of related categories - which are themselves formal concepts - is also introduced. This allows the retrieval focus to shift to a new formal concept called a sibling. This movement across the concept lattice needs to relate one formal concept to another in a principled way. This paper presents the issues concerning exploring, searching, and ordering the space of related categories. The focus is on understanding the use and meaning of proximity and semantic distance in the context of information retrieval using formal concept analysis.
Choi, Okkyung; Han, SangYong
2007-01-01
Ubiquitous Computing makes it possible to determine in real time the location and situations of service requesters in a web service environment as it enables access to computers at any time and in any place. Though research on various aspects of ubiquitous commerce is progressing at enterprises and research centers, both domestically and overseas, analysis of a customer's personal preferences based on semantic web and rule based services using semantics is not currently being conducted. This paper proposes a Ubiquitous Computing Services System that enables a rule based search as well as semantics based search to support the fact that the electronic space and the physical space can be combined into one and the real time search for web services and the construction of efficient web services thus become possible.
NASA Astrophysics Data System (ADS)
Manzella, Giuseppe M. R.; Bartolini, Andrea; Bustaffa, Franco; D'Angelo, Paolo; De Mattei, Maurizio; Frontini, Francesca; Maltese, Maurizio; Medone, Daniele; Monachini, Monica; Novellino, Antonio; Spada, Andrea
2016-04-01
The MAPS (Marine Planning and Service Platform) project is aiming at building a computer platform supporting a Marine Information and Knowledge System. One of the main objective of the project is to develop a repository that should gather, classify and structure marine scientific literature and data thus guaranteeing their accessibility to researchers and institutions by means of standard protocols. In oceanography the cost related to data collection is very high and the new paradigm is based on the concept to collect once and re-use many times (for re-analysis, marine environment assessment, studies on trends, etc). This concept requires the access to quality controlled data and to information that is provided in reports (grey literature) and/or in relevant scientific literature. Hence, creation of new technology is needed by integrating several disciplines such as data management, information systems, knowledge management. In one of the most important EC projects on data management, namely SeaDataNet (www.seadatanet.org), an initial example of knowledge management is provided through the Common Data Index, that is providing links to data and (eventually) to papers. There are efforts to develop search engines to find author's contributions to scientific literature or publications. This implies the use of persistent identifiers (such as DOI), as is done in ORCID. However very few efforts are dedicated to link publications to the data cited or used or that can be of importance for the published studies. This is the objective of MAPS. Full-text technologies are often unsuccessful since they assume the presence of specific keywords in the text; in order to fix this problem, the MAPS project suggests to use different semantic technologies for retrieving the text and data and thus getting much more complying results. The main parts of our design of the search engine are: • Syntactic parser - This module is responsible for the extraction of "rich words" from the text: the whole document gets parsed to extract the words which are more meaningful for the main argument of the document, and applies the extraction in the form of N-grams (mono-grams, bi-grams, tri-grams). • MAPS database - This module is a simple database which contains all the N-grams used by MAPS (physical parameters from SeaDataNet vocabularies) to define our marine "ontology". • Relation identifier - This module performs the most important task of identifying relationships between the N-gram extracted from the text by the parser and the provided oceanographic terminology. It checks N-grams supplied by the Syntactic parser and then matches them with the terms stored in the MAPS database. Found matches are returned back to the parser with flexed form appearing in the source text. • A "relaxed" extractor - This option can be activated when the search engine is launched. It was introduced to give the user a chance to create new N-grams combining existing mono-grams and bi-grams in the database with rich-words found within the source text. The innovation of a semantic engine lies in the fact that the process is not just about the retrieval of already known documents by means of a simple term query but rather the retrieval of a population of documents whose existence was unknown. The system answers by showing a screenshot of results ordered according to the following criteria: • Relevance - of the document with respect to the concept that is searched • Date - of publication of the paper • Source - data provider as defined in the SeaDataNet Common Data Index • Matrix - environmental matrices as defined in the oceanographic field • Geographic area - area specified in the text • Clustering - the process of organizing objects into groups whose members are similar The clustering returns as the output the related documents. For each document the MAPS visualization provides: • Title, author, source/provider of data, web address • Tagging of key terms or concepts • Summary of the document • Visualization of the whole document The possibility of inserting the number of citations for each document among the criteria of the advanced search is currently undergoing; in this case the engine should be able to connect to any of the existing bibliographic citation systems (such as Google Scholar, Scopus, etc.).
ERIC Educational Resources Information Center
Zhu, Lijuan
2011-01-01
Along with the greater productivity that CAD automation provides nowadays, the product data of engineering applications needs to be shared and managed efficiently to gain a competitive edge for the engineering product design. However, exchanging and sharing the heterogeneous product data is still challenging. This dissertation first presents a…
Using Linked Open Data and Semantic Integration to Search Across Geoscience Repositories
NASA Astrophysics Data System (ADS)
Mickle, A.; Raymond, L. M.; Shepherd, A.; Arko, R. A.; Carbotte, S. M.; Chandler, C. L.; Cheatham, M.; Fils, D.; Hitzler, P.; Janowicz, K.; Jones, M.; Krisnadhi, A.; Lehnert, K. A.; Narock, T.; Schildhauer, M.; Wiebe, P. H.
2014-12-01
The MBLWHOI Library is a partner in the OceanLink project, an NSF EarthCube Building Block, applying semantic technologies to enable knowledge discovery, sharing and integration. OceanLink is testing ontology design patterns that link together: two data repositories, Rolling Deck to Repository (R2R), Biological and Chemical Oceanography Data Management Office (BCO-DMO); the MBLWHOI Library Institutional Repository (IR) Woods Hole Open Access Server (WHOAS); National Science Foundation (NSF) funded awards; and American Geophysical Union (AGU) conference presentations. The Library is collaborating with scientific users, data managers, DSpace engineers, experts in ontology design patterns, and user interface developers to make WHOAS, a DSpace repository, linked open data enabled. The goal is to allow searching across repositories without any of the information providers having to change how they manage their collections. The tools developed for DSpace will be made available to the community of users. There are 257 registered DSpace repositories in the United Stated and over 1700 worldwide. Outcomes include: Integration of DSpace with OpenRDF Sesame triple store to provide SPARQL endpoint for the storage and query of RDF representation of DSpace resources, Mapping of DSpace resources to OceanLink ontology, and DSpace "data" add on to provide resolvable linked open data representation of DSpace resources.
A search engine to access PubMed monolingual subsets: proof of concept and evaluation in French.
Griffon, Nicolas; Schuers, Matthieu; Soualmia, Lina Fatima; Grosjean, Julien; Kerdelhué, Gaétan; Kergourlay, Ivan; Dahamna, Badisse; Darmoni, Stéfan Jacques
2014-12-01
PubMed contains numerous articles in languages other than English. However, existing solutions to access these articles in the language in which they were written remain unconvincing. The aim of this study was to propose a practical search engine, called Multilingual PubMed, which will permit access to a PubMed subset in 1 language and to evaluate the precision and coverage for the French version (Multilingual PubMed-French). To create this tool, translations of MeSH were enriched (eg, adding synonyms and translations in French) and integrated into a terminology portal. PubMed subsets in several European languages were also added to our database using a dedicated parser. The response time for the generic semantic search engine was evaluated for simple queries. BabelMeSH, Multilingual PubMed-French, and 3 different PubMed strategies were compared by searching for literature in French. Precision and coverage were measured for 20 randomly selected queries. The results were evaluated as relevant to title and abstract, the evaluator being blind to search strategy. More than 650,000 PubMed citations in French were integrated into the Multilingual PubMed-French information system. The response times were all below the threshold defined for usability (2 seconds). Two search strategies (Multilingual PubMed-French and 1 PubMed strategy) showed high precision (0.93 and 0.97, respectively), but coverage was 4 times higher for Multilingual PubMed-French. It is now possible to freely access biomedical literature using a practical search tool in French. This tool will be of particular interest for health professionals and other end users who do not read or query sufficiently in English. The information system is theoretically well suited to expand the approach to other European languages, such as German, Spanish, Norwegian, and Portuguese.
A Search Engine to Access PubMed Monolingual Subsets: Proof of Concept and Evaluation in French
Schuers, Matthieu; Soualmia, Lina Fatima; Grosjean, Julien; Kerdelhué, Gaétan; Kergourlay, Ivan; Dahamna, Badisse; Darmoni, Stéfan Jacques
2014-01-01
Background PubMed contains numerous articles in languages other than English. However, existing solutions to access these articles in the language in which they were written remain unconvincing. Objective The aim of this study was to propose a practical search engine, called Multilingual PubMed, which will permit access to a PubMed subset in 1 language and to evaluate the precision and coverage for the French version (Multilingual PubMed-French). Methods To create this tool, translations of MeSH were enriched (eg, adding synonyms and translations in French) and integrated into a terminology portal. PubMed subsets in several European languages were also added to our database using a dedicated parser. The response time for the generic semantic search engine was evaluated for simple queries. BabelMeSH, Multilingual PubMed-French, and 3 different PubMed strategies were compared by searching for literature in French. Precision and coverage were measured for 20 randomly selected queries. The results were evaluated as relevant to title and abstract, the evaluator being blind to search strategy. Results More than 650,000 PubMed citations in French were integrated into the Multilingual PubMed-French information system. The response times were all below the threshold defined for usability (2 seconds). Two search strategies (Multilingual PubMed-French and 1 PubMed strategy) showed high precision (0.93 and 0.97, respectively), but coverage was 4 times higher for Multilingual PubMed-French. Conclusions It is now possible to freely access biomedical literature using a practical search tool in French. This tool will be of particular interest for health professionals and other end users who do not read or query sufficiently in English. The information system is theoretically well suited to expand the approach to other European languages, such as German, Spanish, Norwegian, and Portuguese. PMID:25448528
Zhou, Wei; Mo, Fei; Zhang, Yunhong; Ding, Jinhong
2017-01-01
Two experiments were conducted to investigate how linguistic information influences attention allocation in visual search and memory for words. In Experiment 1, participants searched for the synonym of a cue word among five words. The distractors included one antonym and three unrelated words. In Experiment 2, participants were asked to judge whether the five words presented on the screen comprise a valid sentence. The relationships among words were sentential, semantically related or unrelated. A memory recognition task followed. Results in both experiments showed that linguistically related words produced better memory performance. We also found that there were significant interactions between linguistic relation conditions and memorization on eye-movement measures, indicating that good memory for words relied on frequent and long fixations during search in the unrelated condition but to a much lesser extent in linguistically related conditions. We conclude that semantic and syntactic associations attenuate the link between overt attention allocation and subsequent memory performance, suggesting that linguistic relatedness can somewhat compensate for a relative lack of attention during word search.
On search guide phrase compilation for recommending home medical products.
Luo, Gang
2010-01-01
To help people find desired home medical products (HMPs), we developed an intelligent personal health record (iPHR) system that can automatically recommend HMPs based on users' health issues. Using nursing knowledge, we pre-compile a set of "search guide" phrases that provides semantic translation from words describing health issues to their underlying medical meanings. Then iPHR automatically generates queries from those phrases and uses them and a search engine to retrieve HMPs. To avoid missing relevant HMPs during retrieval, the compiled search guide phrases need to be comprehensive. Such compilation is a challenging task because nursing knowledge updates frequently and contains numerous details scattered in many sources. This paper presents a semi-automatic tool facilitating such compilation. Our idea is to formulate the phrase compilation task as a multi-label classification problem. For each newly obtained search guide phrase, we first use nursing knowledge and information retrieval techniques to identify a small set of potentially relevant classes with corresponding hints. Then a nurse makes the final decision on assigning this phrase to proper classes based on those hints. We demonstrate the effectiveness of our techniques by compiling search guide phrases from an occupational therapy textbook.
Multi-lingual search engine to access PubMed monolingual subsets: a feasibility study.
Darmoni, Stéfan J; Soualmia, Lina F; Griffon, Nicolas; Grosjean, Julien; Kerdelhué, Gaétan; Kergourlay, Ivan; Dahamna, Badisse
2013-01-01
PubMed contains many articles in languages other than English but it is difficult to find them using the English version of the Medical Subject Headings (MeSH) Thesaurus. The aim of this work is to propose a tool allowing access to a PubMed subset in one language, and to evaluate its performance. Translations of MeSH were enriched and gathered in the information system. PubMed subsets in main European languages were also added in our database, using a dedicated parser. The CISMeF generic semantic search engine was evaluated on the response time for simple queries. MeSH descriptors are currently available in 11 languages in the information system. All the 654,000 PubMed citations in French were integrated into CISMeF database. None of the response times exceed the threshold defined for usability (2 seconds). It is now possible to freely access biomedical literature in French using a tool in French; health professionals and lay people with a low English language may find it useful. It will be expended to several European languages: German, Spanish, Norwegian and Portuguese.
Cairelli, Michael J.; Miller, Christopher M.; Fiszman, Marcelo; Workman, T. Elizabeth; Rindflesch, Thomas C.
2013-01-01
Applying the principles of literature-based discovery (LBD), we elucidate the paradox that obesity is beneficial in critical care despite contributing to disease generally. Our approach enhances a previous extension to LBD, called “discovery browsing,” and is implemented using Semantic MEDLINE, which summarizes the results of a PubMed search into an interactive graph of semantic predications. The methodology allows a user to construct argumentation underpinning an answer to a biomedical question by engaging the user in an iterative process between system output and user knowledge. Components of the Semantic MEDLINE output graph identified as “interesting” by the user both contribute to subsequent searches and are constructed into a logical chain of relationships constituting an explanatory network in answer to the initial question. Based on this methodology we suggest that phthalates leached from plastic in critical care interventions activate PPAR gamma, which is anti-inflammatory and abundant in obese patients. PMID:24551329
Ontology construction and application in practice case study of health tourism in Thailand.
Chantrapornchai, Chantana; Choksuchat, Chidchanok
2016-01-01
Ontology is one of the key components in semantic webs. It contains the core knowledge for an effective search. However, building ontology requires the carefully-collected knowledge which is very domain-sensitive. In this work, we present the practice of ontology construction for a case study of health tourism in Thailand. The whole process follows the METHONTOLOGY approach, which consists of phases: information gathering, corpus study, ontology engineering, evaluation, publishing, and the application construction. Different sources of data such as structure web documents like HTML and other documents are acquired in the information gathering process. The tourism corpora from various tourism texts and standards are explored. The ontology is evaluated in two aspects: automatic reasoning using Pellet, and RacerPro, and the questionnaires, used to evaluate by experts of the domains: tourism domain experts and ontology experts. The ontology usability is demonstrated via the semantic web application and via example axioms. The developed ontology is actually the first health tourism ontology in Thailand with the published application.
Tool for Constructing Data Albums for Significant Weather Events
NASA Astrophysics Data System (ADS)
Kulkarni, A.; Ramachandran, R.; Conover, H.; McEniry, M.; Goodman, H.; Zavodsky, B. T.; Braun, S. A.; Wilson, B. D.
2012-12-01
Case study analysis and climatology studies are common approaches used in Atmospheric Science research. Research based on case studies involves a detailed description of specific weather events using data from different sources, to characterize physical processes in play for a given event. Climatology-based research tends to focus on the representativeness of a given event, by studying the characteristics and distribution of a large number of events. To gather relevant data and information for case studies and climatology analysis is tedious and time consuming; current Earth Science data systems are not suited to assemble multi-instrument, multi mission datasets around specific events. For example, in hurricane science, finding airborne or satellite data relevant to a given storm requires searching through web pages and data archives. Background information related to damages, deaths, and injuries requires extensive online searches for news reports and official storm summaries. We will present a knowledge synthesis engine to create curated "Data Albums" to support case study analysis and climatology studies. The technological challenges in building such a reusable and scalable knowledge synthesis engine are several. First, how to encode domain knowledge in a machine usable form? This knowledge must capture what information and data resources are relevant and the semantic relationships between the various fragments of information and data. Second, how to extract semantic information from various heterogeneous sources including unstructured texts using the encoded knowledge? Finally, how to design a structured database from the encoded knowledge to store all information and to support querying? The structured database must allow both knowledge overviews of an event as well as drill down capability needed for detailed analysis. An application ontology driven framework is being used to design the knowledge synthesis engine. The knowledge synthesis engine is being applied to build a portal for hurricane case studies at the Global Hydrology and Resource Center (GHRC), a NASA Data Center. This portal will auto-generate Data Albums for specific hurricane events, compiling information from distributed resources such as NASA field campaign collections, relevant data sets, storm reports, pictures, videos and other useful sources.
Social Semantics for an Effective Enterprise
NASA Technical Reports Server (NTRS)
Berndt, Sarah; Doane, Mike
2012-01-01
An evolution of the Semantic Web, the Social Semantic Web (s2w), facilitates knowledge sharing with "useful information based on human contributions, which gets better as more people participate." The s2w reaches beyond the search box to move us from a collection of hyperlinked facts, to meaningful, real time context. When focused through the lens of Enterprise Search, the Social Semantic Web facilitates the fluid transition of meaningful business information from the source to the user. It is the confluence of human thought and computer processing structured with the iterative application of taxonomies, folksonomies, ontologies, and metadata schemas. The importance and nuances of human interaction are often deemphasized when focusing on automatic generation of semantic markup, which results in dissatisfied users and unrealized return on investment. Users consistently qualify the value of information sets through the act of selection, making them the de facto stakeholders of the Social Semantic Web. Employers are the ultimate beneficiaries of s2w utilization with a better informed, more decisive workforce; one not achieved with an IT miracle technology, but by improved human-computer interactions. Johnson Space Center Taxonomist Sarah Berndt and Mike Doane, principal owner of Term Management, LLC discuss the planning, development, and maintenance stages for components of a semantic system while emphasizing the necessity of a Social Semantic Web for the Enterprise. Identification of risks and variables associated with layering the successful implementation of a semantic system are also modeled.
Semantic Metadata for Heterogeneous Spatial Planning Documents
NASA Astrophysics Data System (ADS)
Iwaniak, A.; Kaczmarek, I.; Łukowicz, J.; Strzelecki, M.; Coetzee, S.; Paluszyński, W.
2016-09-01
Spatial planning documents contain information about the principles and rights of land use in different zones of a local authority. They are the basis for administrative decision making in support of sustainable development. In Poland these documents are published on the Web according to a prescribed non-extendable XML schema, designed for optimum presentation to humans in HTML web pages. There is no document standard, and limited functionality exists for adding references to external resources. The text in these documents is discoverable and searchable by general-purpose web search engines, but the semantics of the content cannot be discovered or queried. The spatial information in these documents is geographically referenced but not machine-readable. Major manual efforts are required to integrate such heterogeneous spatial planning documents from various local authorities for analysis, scenario planning and decision support. This article presents results of an implementation using machine-readable semantic metadata to identify relationships among regulations in the text, spatial objects in the drawings and links to external resources. A spatial planning ontology was used to annotate different sections of spatial planning documents with semantic metadata in the Resource Description Framework in Attributes (RDFa). The semantic interpretation of the content, links between document elements and links to external resources were embedded in XHTML pages. An example and use case from the spatial planning domain in Poland is presented to evaluate its efficiency and applicability. The solution enables the automated integration of spatial planning documents from multiple local authorities to assist decision makers with understanding and interpreting spatial planning information. The approach is equally applicable to legal documents from other countries and domains, such as cultural heritage and environmental management.
NASA Technical Reports Server (NTRS)
Ashish, Naveen
2005-01-01
We provide an overview of several ongoing NASA endeavors based on concepts, systems, and technology from the Semantic Web arena. Indeed NASA has been one of the early adopters of Semantic Web Technology and we describe ongoing and completed R&D efforts for several applications ranging from collaborative systems to airspace information management to enterprise search to scientific information gathering and discovery systems at NASA.
Cornelissen, Tim H W; Võ, Melissa L-H
2017-01-01
People have an amazing ability to identify objects and scenes with only a glimpse. How automatic is this scene and object identification? Are scene and object semantics-let alone their semantic congruity-processed to a degree that modulates ongoing gaze behavior even if they are irrelevant to the task at hand? Objects that do not fit the semantics of the scene (e.g., a toothbrush in an office) are typically fixated longer and more often than objects that are congruent with the scene context. In this study, we overlaid a letter T onto photographs of indoor scenes and instructed participants to search for it. Some of these background images contained scene-incongruent objects. Despite their lack of relevance to the search, we found that participants spent more time in total looking at semantically incongruent compared to congruent objects in the same position of the scene. Subsequent tests of explicit and implicit memory showed that participants did not remember many of the inconsistent objects and no more of the consistent objects. We argue that when we view natural environments, scene and object relationships are processed obligatorily, such that irrelevant semantic mismatches between scene and object identity can modulate ongoing eye-movement behavior.
Lorence, Daniel; Abraham, Joanna
2006-01-01
Medical and health-related searches pose a special case of risk when using the web as an information resource. Uninsured consumers, lacking access to a trained provider, will often rely on information from the internet for self-diagnosis and treatment. In areas where treatments are uncertain or controversial, most consumers lack the knowledge to make an informed decision. This exploratory technology assessment examines the use of Keyword Effectiveness Indexing (KEI) analysis as a potential tool for profiling information search and keyword retrieval patterns. Results demonstrate that the KEI methodology can be useful in identifying e-health search patterns, but is limited by semantic or text-based web environments.
A semantic medical multimedia retrieval approach using ontology information hiding.
Guo, Kehua; Zhang, Shigeng
2013-01-01
Searching useful information from unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach which can reflect the users' query intent. Firstly, semantic annotations will be given to the multimedia documents in the medical multimedia database. Secondly, the ontology that represented semantic information will be hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate a good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ostlund, Neil
This research showed the feasibility of applying the concepts of the Semantic Web to Computation Chemistry. We have created the first web portal (www.chemsem.com) that allows data created in the calculations of quantum chemistry, and other such chemistry calculations to be placed on the web in a way that makes the data accessible to scientists in a semantic form never before possible. The semantic web nature of the portal allows data to be searched, found, and used as an advance over the usual approach of a relational database. The semantic data on our portal has the nature of a Giantmore » Global Graph (GGG) that can be easily merged with related data and searched globally via a SPARQL Protocol and RDF Query Language (SPARQL) that makes global searches for data easier than with traditional methods. Our Semantic Web Portal requires that the data be understood by a computer and hence defined by an ontology (vocabulary). This ontology is used by the computer in understanding the data. We have created such an ontology for computational chemistry (purl.org/gc) that encapsulates a broad knowledge of the field of computational chemistry. We refer to this ontology as the Gainesville Core. While it is perhaps the first ontology for computational chemistry and is used by our portal, it is only a start of what must be a long multi-partner effort to define computational chemistry. In conjunction with the above efforts we have defined a new potential file standard (Common Standard for eXchange – CSX for computational chemistry data). This CSX file is the precursor of data in the Resource Description Framework (RDF) form that the semantic web requires. Our portal translates CSX files (as well as other computational chemistry data files) into RDF files that are part of the graph database that the semantic web employs. We propose a CSX file as a convenient way to encapsulate computational chemistry data.« less
Linked Data: what does it offer Earth Sciences?
NASA Astrophysics Data System (ADS)
Cox, Simon; Schade, Sven
2010-05-01
'Linked Data' is a current buzz-phrase promoting access to various forms of data on the internet. It starts from the two principles that have underpinned the architecture and scalability of the World Wide Web: 1. Universal Resource Identifiers - using the http protocol which is supported by the DNS system. 2. Hypertext - in which URIs of related resources are embedded within a document. Browsing is the key mode of interaction, with traversal of links between resources under control of the client. Linked Data also adds, or re-emphasizes: • Content negotiation - whereby the client uses http headers to tell the service what representation of a resource is acceptable, • Semantic Web principles - formal semantics for links, following the RDF data model and encoding, and • The 'mashup' effect - in which original and unexpected value may emerge from reuse of data, even if published in raw or unpolished form. Linked Data promotes typed links to all kinds of data, so is where the semantic web meets the 'deep web', i.e. resources which may be accessed using web protocols, but are in representations not indexed by search engines. Earth sciences are data rich, but with a strong legacy of specialized formats managed and processed by disconnected applications. However, most contemporary research problems require a cross-disciplinary approach, in which the heterogeneity resulting from that legacy is a significant challenge. In this context, Linked Data clearly has much to offer the earth sciences. But, there are some important questions to answer. What is a resource? Most earth science data is organized in arrays and databases. A subset useful for a particular study is usually identified by a parameterized query. The Linked Data paradigm emerged from the world of documents, and will often only resolve data-sets. It is impractical to create even nested navigation resources containing links to all potentially useful objects or subsets. From the viewpoint of human user interfaces, the browse metaphor, which has been such an important part of the success of the web, must be augmented with other interaction mechanisms, including query. What are the impacts on search and metadata? Hypertext provides links selected by the page provider. However, science should endeavor to be exhaustive in its use of data. Resource discovery through links must be supplemented by more systematic data discovery through search. Conversely, the crawlers that generate search indexes must be fed by resource providers (a) serving navigation pages with links to every dataset (b) adding enough 'metadata' (semantics) on each link to effectively populate the indexes. Linked Data makes this easier due to its integration with semantic web technologies, including structured vocabularies. What is the relation between structured data and Linked Data? Linked Data has focused on web-pages (primarily HTML) for human browsing, and RDF for semantics, assuming that other representations are opaque. However, this overlooks the wealth of XML data on the web, some of which is structured according to XML Schemas that provide semantics. Technical applications can use content-negotiation to get a structured representation, and exploit its semantics. Particularly relevant for earth sciences are data representations based on OGC Geography Markup Language (GML), such as GeoSciML, O&M and MOLES. GML was strongly influenced by RDF, and typed links are intrinsic: xlink:href plays the role that rdf:resource does in RDF representations. Services which expose GML-formatted resources (such as OGC Web Feature Service) are a prototype of Linked Data. Giving credit where it is due. Organizations investing in data collection may be reluctant to publish the raw data prior to completing an initial analysis. To encourage early data publication the system must provide suitable incentives, and citation analysis must recognize the increasing diversity of publication routes and forms. Linked Data makes it easier to include rich citation information when data is both published and used.
Deep Multimodal Distance Metric Learning Using Click Constraints for Image Ranking.
Yu, Jun; Yang, Xiaokang; Gao, Fei; Tao, Dacheng
2017-12-01
How do we retrieve images accurately? Also, how do we rank a group of images precisely and efficiently for specific queries? These problems are critical for researchers and engineers to generate a novel image searching engine. First, it is important to obtain an appropriate description that effectively represent the images. In this paper, multimodal features are considered for describing images. The images unique properties are reflected by visual features, which are correlated to each other. However, semantic gaps always exist between images visual features and semantics. Therefore, we utilize click feature to reduce the semantic gap. The second key issue is learning an appropriate distance metric to combine these multimodal features. This paper develops a novel deep multimodal distance metric learning (Deep-MDML) method. A structured ranking model is adopted to utilize both visual and click features in distance metric learning (DML). Specifically, images and their related ranking results are first collected to form the training set. Multimodal features, including click and visual features, are collected with these images. Next, a group of autoencoders is applied to obtain initially a distance metric in different visual spaces, and an MDML method is used to assign optimal weights for different modalities. Next, we conduct alternating optimization to train the ranking model, which is used for the ranking of new queries with click features. Compared with existing image ranking methods, the proposed method adopts a new ranking model to use multimodal features, including click features and visual features in DML. We operated experiments to analyze the proposed Deep-MDML in two benchmark data sets, and the results validate the effects of the method.
Data-driven Ontology Development: A Case Study at NASA's Atmospheric Science Data Center
NASA Astrophysics Data System (ADS)
Hertz, J.; Huffer, E.; Kusterer, J.
2012-12-01
Well-founded ontologies are key to enabling transformative semantic technologies and accelerating scientific research. One example is semantically enabled search and discovery, making scientific data accessible and more understandable by accurately modeling a complex domain. The ontology creation process remains a challenge for many anxious to pursue semantic technologies. The key may be that the creation process -- whether formal, community-based, automated or semi-automated -- should encompass not only a foundational core and supplemental resources but also a focus on the purpose or mission the ontology is created to support. Are there tools or processes to de-mystify, assess or enhance the resulting ontology? We suggest that comparison and analysis of a domain-focused ontology can be made using text engineering tools for information extraction, tokenizers, named entity transducers and others. The results are analyzed to ensure the ontology reflects the core purpose of the domain's mission and that the ontology integrates and describes the supporting data in the language of the domain - how the science is analyzed and discussed among all users of the data. Commonalities and relationships among domain resources describing the Clouds and Earth's Radiant Energy (CERES) Bi-Directional Scan (BDS) datasets from NASA's Atmospheric Science Data Center are compared. The domain resources include: a formal ontology created for CERES; scientific works such as papers, conference proceedings and notes; information extracted from the datasets (i.e., header metadata); and BDS scientific documentation (Algorithm Theoretical Basis Documents, collection guides, data quality summaries and others). These resources are analyzed using the open source software General Architecture for Text Engineering, a mature framework for computational tasks involving human language.
Gist in time: scene semantics and structure enhance recall of searched objects
Wolfe, Jeremy M.; Võ, Melissa L.-H.
2016-01-01
Previous work has shown that recall of objects that are incidentally encountered as targets in visual search is better than recall of objects that have been intentionally memorized (Draschkow, Wolfe & Võ, 2014). However, this counter-intuitive result is not seen when these tasks are performed with non-scene stimuli. The goal of the current paper is to determine what features of search in a scene contribute to higher recall rates when compared to a memorization task. In each of four experiments, we compare the free recall rate for target objects following a search to the rate following a memorization task. Across the experiments, the stimuli include progressively more scene-related information. Experiment 1 provides the spatial relations between objects. Experiment 2 adds relative size and depth of objects. Experiments 3 and 4 include scene layout and semantic information. We find that search leads to better recall than explicit memorization in cases where scene layout and semantic information are present, as long as the participant has ample time (2500ms) to integrate this information with knowledge about the target object (Exp. 4). These results suggest that the integration of scene and target information not only leads to more efficient search, but can also contribute to stronger memory representations than intentional memorization. PMID:27270227
Gist in time: Scene semantics and structure enhance recall of searched objects.
Josephs, Emilie L; Draschkow, Dejan; Wolfe, Jeremy M; Võ, Melissa L-H
2016-09-01
Previous work has shown that recall of objects that are incidentally encountered as targets in visual search is better than recall of objects that have been intentionally memorized (Draschkow, Wolfe, & Võ, 2014). However, this counter-intuitive result is not seen when these tasks are performed with non-scene stimuli. The goal of the current paper is to determine what features of search in a scene contribute to higher recall rates when compared to a memorization task. In each of four experiments, we compare the free recall rate for target objects following a search to the rate following a memorization task. Across the experiments, the stimuli include progressively more scene-related information. Experiment 1 provides the spatial relations between objects. Experiment 2 adds relative size and depth of objects. Experiments 3 and 4 include scene layout and semantic information. We find that search leads to better recall than explicit memorization in cases where scene layout and semantic information are present, as long as the participant has ample time (2500ms) to integrate this information with knowledge about the target object (Exp. 4). These results suggest that the integration of scene and target information not only leads to more efficient search, but can also contribute to stronger memory representations than intentional memorization. Copyright © 2016 Elsevier B.V. All rights reserved.
Path Network Recovery Using Remote Sensing Data and Geospatial-Temporal Semantic Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
William C. McLendon III; Brost, Randy C.
Remote sensing systems produce large volumes of high-resolution images that are difficult to search. The GeoGraphy (pronounced Geo-Graph-y) framework [2, 20] encodes remote sensing imagery into a geospatial-temporal semantic graph representation to enable high level semantic searches to be performed. Typically scene objects such as buildings and trees tend to be shaped like blocks with few holes, but other shapes generated from path networks tend to have a large number of holes and can span a large geographic region due to their connectedness. For example, we have a dataset covering the city of Philadelphia in which there is a singlemore » road network node spanning a 6 mile x 8 mile region. Even a simple question such as "find two houses near the same street" might give unexpected results. More generally, nodes arising from networks of paths (roads, sidewalks, trails, etc.) require additional processing to make them useful for searches in GeoGraphy. We have assigned the term Path Network Recovery to this process. Path Network Recovery is a three-step process involving (1) partitioning the network node into segments, (2) repairing broken path segments interrupted by occlusions or sensor noise, and (3) adding path-aware search semantics into GeoQuestions. This report covers the path network recovery process, how it is used, and some example use cases of the current capabilities.« less
Strategic search from long-term memory: an examination of semantic and autobiographical recall.
Unsworth, Nash; Brewer, Gene A; Spillers, Gregory J
2014-01-01
Searching long-term memory is theoretically driven by both directed (search strategies) and random components. In the current study we conducted four experiments evaluating strategic search in semantic and autobiographical memory. Participants were required to generate either exemplars from the category of animals or the names of their friends for several minutes. Self-reported strategies suggested that participants typically relied on visualization strategies for both tasks and were less likely to rely on ordered strategies (e.g., alphabetic search). When participants were instructed to use particular strategies, the visualization strategy resulted in the highest levels of performance and the most efficient search, whereas ordered strategies resulted in the lowest levels of performance and fairly inefficient search. These results are consistent with the notion that retrieval from long-term memory is driven, in part, by search strategies employed by the individual, and that one particularly efficient strategy is to visualize various situational contexts that one has experienced in the past in order to constrain the search and generate the desired information.
Johns, Brendan T; Taler, Vanessa; Pisoni, David B; Farlow, Martin R; Hake, Ann Marie; Kareken, David A; Unverzagt, Frederick W; Jones, Michael N
2018-06-01
Mild cognitive impairment (MCI) is characterised by subjective and objective memory impairment in the absence of dementia. MCI is a strong predictor for the development of Alzheimer's disease, and may represent an early stage in the disease course in many cases. A standard task used in the diagnosis of MCI is verbal fluency, where participants produce as many items from a specific category (e.g., animals) as possible. Verbal fluency performance is typically analysed by counting the number of items produced. However, analysis of the semantic path of the items produced can provide valuable additional information. We introduce a cognitive model that uses multiple types of lexical information in conjunction with a standard memory search process. The model used a semantic representation derived from a standard semantic space model in conjunction with a memory searching mechanism derived from the Luce choice rule (Luce, 1977). The model was able to detect differences in the memory searching process of patients who were developing MCI, suggesting that the formal analysis of verbal fluency data is a promising avenue to examine the underlying changes occurring in the development of cognitive impairment. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Eye Movements and Visual Memory for Scenes
2005-01-01
Scene memory research has demonstrated that the memory representation of a semantically inconsistent object in a scene is more detailed and/or complete... memory during scene viewing, then changes to semantically inconsistent objects (which should be represented more com- pletely) should be detected more... semantic description. Due to the surprise nature of the visual memory test, any learning that occurred during the search portion of the experiment was
Problem Solving with General Semantics.
ERIC Educational Resources Information Center
Hewson, David
1996-01-01
Discusses how to use general semantics formulations to improve problem solving at home or at work--methods come from the areas of artificial intelligence/computer science, engineering, operations research, and psychology. (PA)
Message passing with parallel queue traversal
Underwood, Keith D [Albuquerque, NM; Brightwell, Ronald B [Albuquerque, NM; Hemmert, K Scott [Albuquerque, NM
2012-05-01
In message passing implementations, associative matching structures are used to permit list entries to be searched in parallel fashion, thereby avoiding the delay of linear list traversal. List management capabilities are provided to support list entry turnover semantics and priority ordering semantics.
Effects of paired-object affordance in search tasks across the adult lifespan.
Wulff, Melanie; Stainton, Alexandra; Rotshtein, Pia
2016-06-01
The study investigated the processes underlying the retrieval of action information about functional object pairs, focusing on the contribution of procedural and semantic knowledge. We further assessed whether the retrieval of action knowledge is affected by task demands and age. The contribution of procedural knowledge was examined by the way objects were selected, specifically whether active objects were selected before passive objects. The contribution of semantic knowledge was examined by manipulating the relation between targets and distracters. A touchscreen-based search task was used testing young, middle-aged, and elderly participants. Participants had to select by touching two targets among distracters using two search tasks. In an explicit action search task, participants had to select two objects which afforded a mutual action (e.g., functional pair: hammer-nail). Implicit affordance perception was tested using a visual color-matching search task; participants had to select two objects with the same colored frame. In both tasks, half of the colored targets also afforded an action. Overall, middle-aged participants performed better than young and elderly participants, specifically in the action task. Across participants in the action task, accuracy was increased when the distracters were semantically unrelated to the functional pair, while the opposite pattern was observed in the color task. This effect was enhanced with increased age. In the action task all participants utilized procedural knowledge, i.e., selected the active object before the passive object. This result supports the dual-route account from vision to action. Semantic knowledge contributed to both the action and the color task, but procedural knowledge associated with the direct route was primarily retrieved when the task was action-relevant. Across the adulthood lifespan, the data show inverted U-shaped effects of age on the retrieval of action knowledge. Age also linearly increased the involvement of the indirect (semantic) route and the integration of information of the direct and the indirect routes in selection processes. Copyright © 2016 Elsevier Inc. All rights reserved.
A Semantic Medical Multimedia Retrieval Approach Using Ontology Information Hiding
Guo, Kehua; Zhang, Shigeng
2013-01-01
Searching useful information from unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach which can reflect the users' query intent. Firstly, semantic annotations will be given to the multimedia documents in the medical multimedia database. Secondly, the ontology that represented semantic information will be hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate a good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches. PMID:24082915
Hybrid ontology for semantic information retrieval model using keyword matching indexing system.
Uthayan, K R; Mala, G S Anandha
2015-01-01
Ontology is the process of growth and elucidation of concepts of an information domain being common for a group of users. Establishing ontology into information retrieval is a normal method to develop searching effects of relevant information users require. Keywords matching process with historical or information domain is significant in recent calculations for assisting the best match for specific input queries. This research presents a better querying mechanism for information retrieval which integrates the ontology queries with keyword search. The ontology-based query is changed into a primary order to predicate logic uncertainty which is used for routing the query to the appropriate servers. Matching algorithms characterize warm area of researches in computer science and artificial intelligence. In text matching, it is more dependable to study semantics model and query for conditions of semantic matching. This research develops the semantic matching results between input queries and information in ontology field. The contributed algorithm is a hybrid method that is based on matching extracted instances from the queries and information field. The queries and information domain is focused on semantic matching, to discover the best match and to progress the executive process. In conclusion, the hybrid ontology in semantic web is sufficient to retrieve the documents when compared to standard ontology.
Hybrid Ontology for Semantic Information Retrieval Model Using Keyword Matching Indexing System
Uthayan, K. R.; Anandha Mala, G. S.
2015-01-01
Ontology is the process of growth and elucidation of concepts of an information domain being common for a group of users. Establishing ontology into information retrieval is a normal method to develop searching effects of relevant information users require. Keywords matching process with historical or information domain is significant in recent calculations for assisting the best match for specific input queries. This research presents a better querying mechanism for information retrieval which integrates the ontology queries with keyword search. The ontology-based query is changed into a primary order to predicate logic uncertainty which is used for routing the query to the appropriate servers. Matching algorithms characterize warm area of researches in computer science and artificial intelligence. In text matching, it is more dependable to study semantics model and query for conditions of semantic matching. This research develops the semantic matching results between input queries and information in ontology field. The contributed algorithm is a hybrid method that is based on matching extracted instances from the queries and information field. The queries and information domain is focused on semantic matching, to discover the best match and to progress the executive process. In conclusion, the hybrid ontology in semantic web is sufficient to retrieve the documents when compared to standard ontology. PMID:25922851
Individual Differences in a Spatial-Semantic Virtual Environment.
ERIC Educational Resources Information Center
Chen, Chaomei
2000-01-01
Presents two empirical case studies concerning the role of individual differences in searching through a spatial-semantic virtual environment. Discusses information visualization in information systems; cognitive factors, including associative memory, spatial ability, and visual memory; user satisfaction; and cognitive abilities and search…
Memory Scanning, Introversion-Extraversion, and Levels of Processing.
ERIC Educational Resources Information Center
Eysenck, Michael W.; Eysenck, M. Christine
1979-01-01
Investigated was the hypothesis that high arousal increases processing of physical characteristics and reduces processing of semantic characteristics. While introverts and extroverts had equivalent scanning rates for physical features, introverts were significantly slower in searching for semantic features of category membership, indicating…
ESDORA: A Data Archive Infrastructure Using Digital Object Model and Open Source Frameworks
NASA Astrophysics Data System (ADS)
Shrestha, Biva; Pan, Jerry; Green, Jim; Palanisamy, Giriprakash; Wei, Yaxing; Lenhardt, W.; Cook, R. Bob; Wilson, B. E.; Leggott, M.
2011-12-01
There are an array of challenges associated with preserving, managing, and using contemporary scientific data. Large volume, multiple formats and data services, and the lack of a coherent mechanism for metadata/data management are some of the common issues across data centers. It is often difficult to preserve the data history and lineage information, along with other descriptive metadata, hindering the true science value for the archived data products. In this project, we use digital object abstraction architecture as the information/knowledge framework to address these challenges. We have used the following open-source frameworks: Fedora-Commons Repository, Drupal Content Management System, Islandora (Drupal Module) and Apache Solr Search Engine. The system is an active archive infrastructure for Earth Science data resources, which include ingestion, archiving, distribution, and discovery functionalities. We use an ingestion workflow to ingest the data and metadata, where many different aspects of data descriptions (including structured and non-structured metadata) are reviewed. The data and metadata are published after reviewing multiple times. They are staged during the reviewing phase. Each digital object is encoded in XML for long-term preservation of the content and relations among the digital items. The software architecture provides a flexible, modularized framework for adding pluggable user-oriented functionality. Solr is used to enable word search as well as faceted search. A home grown spatial search module is plugged in to allow user to make a spatial selection in a map view. A RDF semantic store within the Fedora-Commons Repository is used for storing information on data lineage, dissemination services, and text-based metadata. We use the semantic notion "isViewerFor" to register internally or externally referenced URLs, which are rendered within the same web browser when possible. With appropriate mapping of content into digital objects, many different data descriptions, including structured metadata, data history, auditing trails, are captured and coupled with the data content. The semantic store provides a foundation for possible further utilizations, including provide full-fledged Earth Science ontology for data interpretation or lineage tracking. Datasets from the NASA-sponsored Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) as well as from the Synthesis Thematic Data Center (MAST-DC) are used in a testing deployment with the system. The testing deployment allows us to validate the features and values described here for the integrated system, which will be presented here. Overall, we believe that the integrated system is valid, reusable data archive software that provides digital stewardship for Earth Sciences data content, now and in the future. References: [1] Devarakonda, Ranjeet, and Harold Shanafield. "Drupal: Collaborative framework for science research." Collaboration Technologies and Systems (CTS), 2011 International Conference on. IEEE, 2011. [2] Devarakonda, Ranjeet, et al. "Semantic search integration to climate data." Collaboration Technologies and Systems (CTS), 2014 International Conference on. IEEE, 2014.
Progress in The Semantic Analysis of Scientific Code
NASA Technical Reports Server (NTRS)
Stewart, Mark
2000-01-01
This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, independent expert parsers. These semantic parsers encode domain knowledge and recognize formulae in different disciplines including physics, numerical methods, mathematics, and geometry. The parsers will automatically recognize and document some static, semantic concepts and help locate some program semantic errors. These techniques may apply to a wider range of scientific codes. If so, the techniques could reduce the time, risk, and effort required to develop and modify scientific codes.
Textpresso: An Ontology-Based Information Retrieval and Extraction System for Biological Literature
Müller, Hans-Michael; Kenny, Eimear E
2004-01-01
We have developed Textpresso, a new text-mining system for scientific literature whose capabilities go far beyond those of a simple keyword search engine. Textpresso's two major elements are a collection of the full text of scientific articles split into individual sentences, and the implementation of categories of terms for which a database of articles and individual sentences can be searched. The categories are classes of biological concepts (e.g., gene, allele, cell or cell group, phenotype, etc.) and classes that relate two objects (e.g., association, regulation, etc.) or describe one (e.g., biological process, etc.). Together they form a catalog of types of objects and concepts called an ontology. After this ontology is populated with terms, the whole corpus of articles and abstracts is marked up to identify terms of these categories. The current ontology comprises 33 categories of terms. A search engine enables the user to search for one or a combination of these tags and/or keywords within a sentence or document, and as the ontology allows word meaning to be queried, it is possible to formulate semantic queries. Full text access increases recall of biological data types from 45% to 95%. Extraction of particular biological facts, such as gene-gene interactions, can be accelerated significantly by ontologies, with Textpresso automatically performing nearly as well as expert curators to identify sentences; in searches for two uniquely named genes and an interaction term, the ontology confers a 3-fold increase of search efficiency. Textpresso currently focuses on Caenorhabditis elegans literature, with 3,800 full text articles and 16,000 abstracts. The lexicon of the ontology contains 14,500 entries, each of which includes all versions of a specific word or phrase, and it includes all categories of the Gene Ontology database. Textpresso is a useful curation tool, as well as search engine for researchers, and can readily be extended to other organism-specific corpora of text. Textpresso can be accessed at http://www.textpresso.org or via WormBase at http://www.wormbase.org. PMID:15383839
An architecture for diversity-aware search for medical web content.
Denecke, K
2012-01-01
The Web provides a huge source of information, also on medical and health-related issues. In particular the content of medical social media data can be diverse due to the background of an author, the source or the topic. Diversity in this context means that a document covers different aspects of a topic or a topic is described in different ways. In this paper, we introduce an approach that allows to consider the diverse aspects of a search query when providing retrieval results to a user. We introduce a system architecture for a diversity-aware search engine that allows retrieving medical information from the web. The diversity of retrieval results is assessed by calculating diversity measures that rely upon semantic information derived from a mapping to concepts of a medical terminology. Considering these measures, the result set is diversified by ranking more diverse texts higher. The methods and system architecture are implemented in a retrieval engine for medical web content. The diversity measures reflect the diversity of aspects considered in a text and its type of information content. They are used for result presentation, filtering and ranking. In a user evaluation we assess the user satisfaction with an ordering of retrieval results that considers the diversity measures. It is shown through the evaluation that diversity-aware retrieval considering diversity measures in ranking could increase the user satisfaction with retrieval results.
Towards a semantic PACS: Using Semantic Web technology to represent imaging data.
Van Soest, Johan; Lustberg, Tim; Grittner, Detlef; Marshall, M Scott; Persoon, Lucas; Nijsten, Bas; Feltens, Peter; Dekker, Andre
2014-01-01
The DICOM standard is ubiquitous within medicine. However, improved DICOM semantics would significantly enhance search operations. Furthermore, databases of current PACS systems are not flexible enough for the demands within image analysis research. In this paper, we investigated if we can use Semantic Web technology, to store and represent metadata of DICOM image files, as well as linking additional computational results to image metadata. Therefore, we developed a proof of concept containing two applications: one to store commonly used DICOM metadata in an RDF repository, and one to calculate imaging biomarkers based on DICOM images, and store the biomarker values in an RDF repository. This enabled us to search for all patients with a gross tumor volume calculated to be larger than 50 cc. We have shown that we can successfully store the DICOM metadata in an RDF repository and are refining our proof of concept with regards to volume naming, value representation, and the applications themselves.
Sanjuán, Ana; Hope, Thomas M.H.; Parker Jones, 'Ōiwi; Prejawa, Susan; Oberhuber, Marion; Guerin, Julie; Seghier, Mohamed L.; Green, David W.; Price, Cathy J.
2015-01-01
We used fMRI in 35 healthy participants to investigate how two neighbouring subregions in the lateral anterior temporal lobe (LATL) contribute to semantic matching and object naming. Four different levels of processing were considered: (A) recognition of the object concepts; (B) search for semantic associations related to object stimuli; (C) retrieval of semantic concepts of interest; and (D) retrieval of stimulus specific concepts as required for naming. During semantic association matching on picture stimuli or heard object names, we found that activation in both subregions was higher when the objects were semantically related (mug–kettle) than unrelated (car–teapot). This is consistent with both LATL subregions playing a role in (C), the successful retrieval of amodal semantic concepts. In addition, one subregion was more activated for object naming than matching semantically related objects, consistent with (D), the retrieval of a specific concept for naming. We discuss the implications of these novel findings for cognitive models of semantic processing and left anterior temporal lobe function. PMID:25496810
Comparing image search behaviour in the ARRS GoldMiner search engine and a clinical PACS/RIS.
De-Arteaga, Maria; Eggel, Ivan; Do, Bao; Rubin, Daniel; Kahn, Charles E; Müller, Henning
2015-08-01
Information search has changed the way we manage knowledge and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or increasingly via mobile devices. Medical information search is in this respect no different and much research has been devoted to analyzing the way in which physicians aim to access information. Medical image search is a much smaller domain but has gained much attention as it has different characteristics than search for text documents. While web search log files have been analysed many times to better understand user behaviour, the log files of hospital internal systems for search in a PACS/RIS (Picture Archival and Communication System, Radiology Information System) have rarely been analysed. Such a comparison between a hospital PACS/RIS search and a web system for searching images of the biomedical literature is the goal of this paper. Objectives are to identify similarities and differences in search behaviour of the two systems, which could then be used to optimize existing systems and build new search engines. Log files of the ARRS GoldMiner medical image search engine (freely accessible on the Internet) containing 222,005 queries, and log files of Stanford's internal PACS/RIS search called radTF containing 18,068 queries were analysed. Each query was preprocessed and all query terms were mapped to the RadLex (Radiology Lexicon) terminology, a comprehensive lexicon of radiology terms created and maintained by the Radiological Society of North America, so the semantic content in the queries and the links between terms could be analysed, and synonyms for the same concept could be detected. RadLex was mainly created for the use in radiology reports, to aid structured reporting and the preparation of educational material (Lanlotz, 2006) [1]. In standard medical vocabularies such as MeSH (Medical Subject Headings) and UMLS (Unified Medical Language System) specific terms of radiology are often underrepresented, therefore RadLex was considered to be the best option for this task. The results show a surprising similarity between the usage behaviour in the two systems, but several subtle differences can also be noted. The average number of terms per query is 2.21 for GoldMiner and 2.07 for radTF, the used axes of RadLex (anatomy, pathology, findings, …) have almost the same distribution with clinical findings being the most frequent and the anatomical entity the second; also, combinations of RadLex axes are extremely similar between the two systems. Differences include a longer length of the sessions in radTF than in GoldMiner (3.4 and 1.9 queries per session on average). Several frequent search terms overlap but some strong differences exist in the details. In radTF the term "normal" is frequent, whereas in GoldMiner it is not. This makes intuitive sense, as in the literature normal cases are rarely described whereas in clinical work the comparison with normal cases is often a first step. The general similarity in many points is likely due to the fact that users of the two systems are influenced by their daily behaviour in using standard web search engines and follow this behaviour in their professional search. This means that many results and insights gained from standard web search can likely be transferred to more specialized search systems. Still, specialized log files can be used to find out more on reformulations and detailed strategies of users to find the right content. Copyright © 2015 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yue, Peng; Gong, Jianya; Di, Liping
Abstract A geospatial catalogue service provides a network-based meta-information repository and interface for advertising and discovering shared geospatial data and services. Descriptive information (i.e., metadata) for geospatial data and services is structured and organized in catalogue services. The approaches currently available for searching and using that information are often inadequate. Semantic Web technologies show promise for better discovery methods by exploiting the underlying semantics. Such development needs special attention from the Cyberinfrastructure perspective, so that the traditional focus on discovery of and access to geospatial data can be expanded to support the increased demand for processing of geospatial information andmore » discovery of knowledge. Semantic descriptions for geospatial data, services, and geoprocessing service chains are structured, organized, and registered through extending elements in the ebXML Registry Information Model (ebRIM) of a geospatial catalogue service, which follows the interface specifications of the Open Geospatial Consortium (OGC) Catalogue Services for the Web (CSW). The process models for geoprocessing service chains, as a type of geospatial knowledge, are captured, registered, and discoverable. Semantics-enhanced discovery for geospatial data, services/service chains, and process models is described. Semantic search middleware that can support virtual data product materialization is developed for the geospatial catalogue service. The creation of such a semantics-enhanced geospatial catalogue service is important in meeting the demands for geospatial information discovery and analysis in Cyberinfrastructure.« less
SPECTRa-T: machine-based data extraction and semantic searching of chemistry e-theses.
Downing, Jim; Harvey, Matt J; Morgan, Peter B; Murray-Rust, Peter; Rzepa, Henry S; Stewart, Diana C; Tonge, Alan P; Townsend, Joe A
2010-02-22
The SPECTRa-T project has developed text-mining tools to extract named chemical entities (NCEs), such as chemical names and terms, and chemical objects (COs), e.g., experimental spectral assignments and physical chemistry properties, from electronic theses (e-theses). Although NCEs were readily identified within the two major document formats studied, only the use of structured documents enabled identification of chemical objects and their association with the relevant chemical entity (e.g., systematic chemical name). A corpus of theses was analyzed and it is shown that a high degree of semantic information can be extracted from structured documents. This integrated information has been deposited in a persistent Resource Description Framework (RDF) triple-store that allows users to conduct semantic searches. The strength and weaknesses of several document formats are reviewed.
NASA Astrophysics Data System (ADS)
Zhang, Wenyu; Zhang, Shuai; Cai, Ming; Jian, Wu
2015-04-01
With the development of virtual enterprise (VE) paradigm, the usage of serviceoriented architecture (SOA) is increasingly being considered for facilitating the integration and utilisation of distributed manufacturing resources. However, due to the heterogeneous nature among VEs, the dynamic nature of a VE and the autonomous nature of each VE member, the lack of both sophisticated coordination mechanism in the popular centralised infrastructure and semantic expressivity in the existing SOA standards make the current centralised, syntactic service discovery method undesirable. This motivates the proposed agent-based peer-to-peer (P2P) architecture for semantic discovery of manufacturing services across VEs. Multi-agent technology provides autonomous and flexible problemsolving capabilities in dynamic and adaptive VE environments. Peer-to-peer overlay provides highly scalable coupling across decentralised VEs, each of which exhibiting as a peer composed of multiple agents dealing with manufacturing services. The proposed architecture utilises a novel, efficient, two-stage search strategy - semantic peer discovery and semantic service discovery - to handle the complex searches of manufacturing services across VEs through fast peer filtering. The operation and experimental evaluation of the prototype system are presented to validate the implementation of the proposed approach.
Knowledge Provenance in Semantic Wikis
NASA Astrophysics Data System (ADS)
Ding, L.; Bao, J.; McGuinness, D. L.
2008-12-01
Collaborative online environments with a technical Wiki infrastructure are becoming more widespread. One of the strengths of a Wiki environment is that it is relatively easy for numerous users to contribute original content and modify existing content (potentially originally generated by others). As more users begin to depend on informational content that is evolving by Wiki communities, it becomes more important to track the provenance of the information. Semantic Wikis expand upon traditional Wiki environments by adding some computationally understandable encodings of some of the terms and relationships in Wikis. We have developed a semantic Wiki environment that expands a semantic Wiki with provenance markup. Provenance of original contributions as well as modifications is encoded using the provenance markup component of the Proof Markup Language. The Wiki environment provides the provenance markup automatically, thus users are not required to make specific encodings of author, contribution date, and modification trail. Further, our Wiki environment includes a search component that understands the provenance primitives and thus can be used to provide a provenance-aware search facility. We will describe the knowledge provenance infrastructure of our Semantic Wiki and show how it is being used as the foundation of our group web site as well as a number of project web sites.
Focused search of semantic cases: the effects of question form and case status.
Singer, M; Jakobson, L S
1989-05-01
The present study was designed to identify and examine some of the variables that influence the focused search of semantic cases in question answering. Singer, Parbery, and Jakobson (1988) have previously reported that people can focus on the case interrogated by a question and can largely disregard irrelevant cases. In the present study, people learned facts, such as the pilot painted the garage with the roller, the spraygun, and the brush. One day later, they answered questions that focused on a particular case. For example, the question did the pilot paint with a spraygun? focuses on the instrument case. Experiment 1 revealed that people can focus on a particular case in response both to complete questions and to comparable word probes, such as "pilot spraygun." Therefore, the given-new structure of questions is not essential to focused search. Experiment 2 revealed that people have a difficult time ignoring the agent case, even when it is irrelevant to the question. This corroborates proposals that agent and action information are closely interrelated in the representation of a fact. These results help to delineate the phenomenon of the focused search of semantic cases.
Framework and prototype for a secure XML-based electronic health records system.
Steele, Robert; Gardner, William; Chandra, Darius; Dillon, Tharam S
2007-01-01
Security of personal medical information has always been a challenge for the advancement of Electronic Health Records (EHRs) initiatives. eXtensible Markup Language (XML), is rapidly becoming the key standard for data representation and transportation. The widespread use of XML and the prospect of its use in the Electronic Health (e-health) domain highlights the need for flexible access control models for XML data and documents. This paper presents a declarative access control model for XML data repositories that utilises an expressive XML role control model. The operational semantics of this model are illustrated by Xplorer, a user interface generation engine which supports search-browse-navigate activities on XML repositories.
Perkins, David Nikolaus; Brost, Randolph; Ray, Lawrence P.
2017-08-08
Various technologies for facilitating analysis of large remote sensing and geolocation datasets to identify features of interest are described herein. A search query can be submitted to a computing system that executes searches over a geospatial temporal semantic (GTS) graph to identify features of interest. The GTS graph comprises nodes corresponding to objects described in the remote sensing and geolocation datasets, and edges that indicate geospatial or temporal relationships between pairs of nodes in the nodes. Trajectory information is encoded in the GTS graph by the inclusion of movable nodes to facilitate searches for features of interest in the datasets relative to moving objects such as vehicles.
Semantic-Web Technology: Applications at NASA
NASA Technical Reports Server (NTRS)
Ashish, Naveen
2004-01-01
We provide a description of work at the National Aeronautics and Space Administration (NASA) on building system based on semantic-web concepts and technologies. NASA has been one of the early adopters of semantic-web technologies for practical applications. Indeed there are several ongoing 0 endeavors on building semantics based systems for use in diverse NASA domains ranging from collaborative scientific activity to accident and mishap investigation to enterprise search to scientific information gathering and integration to aviation safety decision support We provide a brief overview of many applications and ongoing work with the goal of informing the external community of these NASA endeavors.
Linking Disparate Datasets of the Earth Sciences with the SemantEco Annotator
NASA Astrophysics Data System (ADS)
Seyed, P.; Chastain, K.; McGuinness, D. L.
2013-12-01
Use of Semantic Web technologies for data management in the Earth sciences (and beyond) has great potential but is still in its early stages, since the challenges of translating data into a more explicit or semantic form for immediate use within applications has not been fully addressed. In this abstract we help address this challenge by introducing the SemantEco Annotator, which enables anyone, regardless of expertise, to semantically annotate tabular Earth Science data and translate it into linked data format, while applying the logic inherent in community-standard vocabularies to guide the process. The Annotator was conceived under a desire to unify dataset content from a variety of sources under common vocabularies, for use in semantically-enabled web applications. Our current use case employs linked data generated by the Annotator for use in the SemantEco environment, which utilizes semantics to help users explore, search, and visualize water or air quality measurement and species occurrence data through a map-based interface. The generated data can also be used immediately to facilitate discovery and search capabilities within 'big data' environments. The Annotator provides a method for taking information about a dataset, that may only be known to its maintainers, and making it explicit, in a uniform and machine-readable fashion, such that a person or information system can more easily interpret the underlying structure and meaning. Its primary mechanism is to enable a user to formally describe how columns of a tabular dataset relate and/or describe entities. For example, if a user identifies columns for latitude and longitude coordinates, we can infer the data refers to a point that can be plotted on a map. Further, it can be made explicit that measurements of 'nitrate' and 'NO3-' are of the same entity through vocabulary assignments, thus more easily utilizing data sets that use different nomenclatures. The Annotator provides an extensive and searchable library of vocabularies to assist the user in locating terms to describe observed entities, their properties, and relationships. The Annotator leverages vocabulary definitions of these concepts to guide the user in describing data in a logically consistent manner. The vocabularies made available through the Annotator are open, as is the Annotator itself. We have taken a step towards making semantic annotation/translation of data more accessible. Our vision for the Annotator is as a tool that can be integrated into a semantic data 'workbench' environment, which would allow semantic annotation of a variety of data formats, using standard vocabularies. These vocabularies involved enable search for similar datasets, and integration with any semantically-enabled applications for analysis and visualization.
Query Results Clustering by Extending SPARQL with CLUSTER BY
NASA Astrophysics Data System (ADS)
Ławrynowicz, Agnieszka
The task of dynamic clustering of the search results proved to be useful in the Web context, where the user often does not know the granularity of the search results in advance. The goal of this paper is to provide a declarative way for invoking dynamic clustering of the results of queries submitted over Semantic Web data. To achieve this goal the paper proposes an approach that extends SPARQL by clustering abilities. The approach introduces a new statement, CLUSTER BY, into the SPARQL grammar and proposes semantics for such extension.
Souvignet, Julien; Declerck, Gunnar; Asfari, Hadyl; Jaulent, Marie-Christine; Bousquet, Cédric
2016-10-01
Efficient searching and coding in databases that use terminological resources requires that they support efficient data retrieval. The Medical Dictionary for Regulatory Activities (MedDRA) is a reference terminology for several countries and organizations to code adverse drug reactions (ADRs) for pharmacovigilance. Ontologies that are available in the medical domain provide several advantages such as reasoning to improve data retrieval. The field of pharmacovigilance does not yet benefit from a fully operational ontology to formally represent the MedDRA terms. Our objective was to build a semantic resource based on formal description logic to improve MedDRA term retrieval and aid the generation of on-demand custom groupings by appropriately and efficiently selecting terms: OntoADR. The method consists of the following steps: (1) mapping between MedDRA terms and SNOMED-CT, (2) generation of semantic definitions using semi-automatic methods, (3) storage of the resource and (4) manual curation by pharmacovigilance experts. We built a semantic resource for ADRs enabling a new type of semantics-based term search. OntoADR adds new search capabilities relative to previous approaches, overcoming the usual limitations of computation using lightweight description logic, such as the intractability of unions or negation queries, bringing it closer to user needs. Our automated approach for defining MedDRA terms enabled the association of at least one defining relationship with 67% of preferred terms. The curation work performed on our sample showed an error level of 14% for this automated approach. We tested OntoADR in practice, which allowed us to build custom groupings for several medical topics of interest. The methods we describe in this article could be adapted and extended to other terminologies which do not benefit from a formal semantic representation, thus enabling better data retrieval performance. Our custom groupings of MedDRA terms were used while performing signal detection, which suggests that the graphical user interface we are currently implementing to process OntoADR could be usefully integrated into specialized pharmacovigilance software that rely on MedDRA. Copyright © 2016 Elsevier Inc. All rights reserved.
Automated semantic indexing of figure captions to improve radiology image retrieval.
Kahn, Charles E; Rubin, Daniel L
2009-01-01
We explored automated concept-based indexing of unstructured figure captions to improve retrieval of images from radiology journals. The MetaMap Transfer program (MMTx) was used to map the text of 84,846 figure captions from 9,004 peer-reviewed, English-language articles to concepts in three controlled vocabularies from the UMLS Metathesaurus, version 2006AA. Sampling procedures were used to estimate the standard information-retrieval metrics of precision and recall, and to evaluate the degree to which concept-based retrieval improved image retrieval. Precision was estimated based on a sample of 250 concepts. Recall was estimated based on a sample of 40 concepts. The authors measured the impact of concept-based retrieval to improve upon keyword-based retrieval in a random sample of 10,000 search queries issued by users of a radiology image search engine. Estimated precision was 0.897 (95% confidence interval, 0.857-0.937). Estimated recall was 0.930 (95% confidence interval, 0.838-1.000). In 5,535 of 10,000 search queries (55%), concept-based retrieval found results not identified by simple keyword matching; in 2,086 searches (21%), more than 75% of the results were found by concept-based search alone. Concept-based indexing of radiology journal figure captions achieved very high precision and recall, and significantly improved image retrieval.
Clinical Diagnostics in Human Genetics with Semantic Similarity Searches in Ontologies
Köhler, Sebastian; Schulz, Marcel H.; Krawitz, Peter; Bauer, Sebastian; Dölken, Sandra; Ott, Claus E.; Mundlos, Christine; Horn, Denise; Mundlos, Stefan; Robinson, Peter N.
2009-01-01
The differential diagnostic process attempts to identify candidate diseases that best explain a set of clinical features. This process can be complicated by the fact that the features can have varying degrees of specificity, as well as by the presence of features unrelated to the disease itself. Depending on the experience of the physician and the availability of laboratory tests, clinical abnormalities may be described in greater or lesser detail. We have adapted semantic similarity metrics to measure phenotypic similarity between queries and hereditary diseases annotated with the use of the Human Phenotype Ontology (HPO) and have developed a statistical model to assign p values to the resulting similarity scores, which can be used to rank the candidate diseases. We show that our approach outperforms simpler term-matching approaches that do not take the semantic interrelationships between terms into account. The advantage of our approach was greater for queries containing phenotypic noise or imprecise clinical descriptions. The semantic network defined by the HPO can be used to refine the differential diagnosis by suggesting clinical features that, if present, best differentiate among the candidate diagnoses. Thus, semantic similarity searches in ontologies represent a useful way of harnessing the semantic structure of human phenotypic abnormalities to help with the differential diagnosis. We have implemented our methods in a freely available web application for the field of human Mendelian disorders. PMID:19800049
An ’Active Vision’ Computational Model of Visual Search for Human-Computer Interaction
2009-01-01
semantically related (e.g. cashew , peanut, almond) or randomly grouped (e.g. elm, eraser, potato). Groups were either labeled or not. In some...colored region were further semantically related (e.g. nuts with candy, and clothing with cosmetics). Layouts always contained 28 eight groups with
Dynamic Search and Working Memory in Social Recall
ERIC Educational Resources Information Center
Hills, Thomas T.; Pachur, Thorsten
2012-01-01
What are the mechanisms underlying search in social memory (e.g., remembering the people one knows)? Do the search mechanisms involve dynamic local-to-global transitions similar to semantic search, and are these transitions governed by the general control of attention, associated with working memory span? To find out, we asked participants to…
COEUS: “semantic web in a box” for biomedical applications
2012-01-01
Background As the “omics” revolution unfolds, the growth in data quantity and diversity is bringing about the need for pioneering bioinformatics software, capable of significantly improving the research workflow. To cope with these computer science demands, biomedical software engineers are adopting emerging semantic web technologies that better suit the life sciences domain. The latter’s complex relationships are easily mapped into semantic web graphs, enabling a superior understanding of collected knowledge. Despite increased awareness of semantic web technologies in bioinformatics, their use is still limited. Results COEUS is a new semantic web framework, aiming at a streamlined application development cycle and following a “semantic web in a box” approach. The framework provides a single package including advanced data integration and triplification tools, base ontologies, a web-oriented engine and a flexible exploration API. Resources can be integrated from heterogeneous sources, including CSV and XML files or SQL and SPARQL query results, and mapped directly to one or more ontologies. Advanced interoperability features include REST services, a SPARQL endpoint and LinkedData publication. These enable the creation of multiple applications for web, desktop or mobile environments, and empower a new knowledge federation layer. Conclusions The platform, targeted at biomedical application developers, provides a complete skeleton ready for rapid application deployment, enhancing the creation of new semantic information systems. COEUS is available as open source at http://bioinformatics.ua.pt/coeus/. PMID:23244467
COEUS: "semantic web in a box" for biomedical applications.
Lopes, Pedro; Oliveira, José Luís
2012-12-17
As the "omics" revolution unfolds, the growth in data quantity and diversity is bringing about the need for pioneering bioinformatics software, capable of significantly improving the research workflow. To cope with these computer science demands, biomedical software engineers are adopting emerging semantic web technologies that better suit the life sciences domain. The latter's complex relationships are easily mapped into semantic web graphs, enabling a superior understanding of collected knowledge. Despite increased awareness of semantic web technologies in bioinformatics, their use is still limited. COEUS is a new semantic web framework, aiming at a streamlined application development cycle and following a "semantic web in a box" approach. The framework provides a single package including advanced data integration and triplification tools, base ontologies, a web-oriented engine and a flexible exploration API. Resources can be integrated from heterogeneous sources, including CSV and XML files or SQL and SPARQL query results, and mapped directly to one or more ontologies. Advanced interoperability features include REST services, a SPARQL endpoint and LinkedData publication. These enable the creation of multiple applications for web, desktop or mobile environments, and empower a new knowledge federation layer. The platform, targeted at biomedical application developers, provides a complete skeleton ready for rapid application deployment, enhancing the creation of new semantic information systems. COEUS is available as open source at http://bioinformatics.ua.pt/coeus/.
Navigation as a New Form of Search for Agricultural Learning Resources in Semantic Repositories
NASA Astrophysics Data System (ADS)
Cano, Ramiro; Abián, Alberto; Mena, Elena
Education is essential when it comes to raise public awareness on the environmental and economic benefits of organic agriculture and agroecology (OA & AE). Organic.Edunet, an EU funded project, aims at providing a freely-available portal where learning contents on OA & AE can be published and accessed through specialized technologies. This paper describes a novel mechanism for providing semantic capabilities (such as semantic navigational queries) to an arbitrary set of agricultural learning resources, in the context of the Organic.Edunet initiative.
FoodWiki: Ontology-Driven Mobile Safe Food Consumption System.
Çelik, Duygu
2015-01-01
An ontology-driven safe food consumption mobile system is considered. Over 3,000 compounds are being added to processed food, with numerous effects on the food: to add color, stabilize, texturize, preserve, sweeten, thicken, add flavor, soften, emulsify, and so forth. According to World Health Organization, governments have lately focused on legislation to reduce such ingredients or compounds in manufactured foods as they may have side effects causing health risks such as heart disease, cancer, diabetes, allergens, and obesity. By supervising what and how much to eat as well as what not to eat, we can maximize a patient's life quality through avoidance of unhealthy ingredients. Smart e-health systems with powerful knowledge bases can provide suggestions of appropriate foods to individuals. Next-generation smart knowledgebase systems will not only include traditional syntactic-based search, which limits the utility of the search results, but will also provide semantics for rich searching. In this paper, performance of concept matching of food ingredients is semantic-based, meaning that it runs its own semantic based rule set to infer meaningful results through the proposed Ontology-Driven Mobile Safe Food Consumption System (FoodWiki).
A Knowledge Discovery framework for Planetary Defense
NASA Astrophysics Data System (ADS)
Jiang, Y.; Yang, C. P.; Li, Y.; Yu, M.; Bambacus, M.; Seery, B.; Barbee, B.
2016-12-01
Planetary Defense, a project funded by NASA Goddard and the NSF, is a multi-faceted effort focused on the mitigation of Near Earth Object (NEO) threats to our planet. Currently, there exists a dispersion of information concerning NEO's amongst different organizations and scientists, leading to a lack of a coherent system of information to be used for efficient NEO mitigation. In this paper, a planetary defense knowledge discovery engine is proposed to better assist the development and integration of a NEO responding system. Specifically, we have implemented an organized information framework by two means: 1) the development of a semantic knowledge base, which provides a structure for relevant information. It has been developed by the implementation of web crawling and natural language processing techniques, which allows us to collect and store the most relevant structured information on a regular basis. 2) the development of a knowledge discovery engine, which allows for the efficient retrieval of information from our knowledge base. The knowledge discovery engine has been built on the top of Elasticsearch, an open source full-text search engine, as well as cutting-edge machine learning ranking and recommendation algorithms. This proposed framework is expected to advance the knowledge discovery and innovation in planetary science domain.
A. E. van Vogt: In Search of Meaning.
ERIC Educational Resources Information Center
Drake, H. L.
A general semantics perspective of science fiction writer A. E. van Vogt is presented in this paper. The first major section of the paper contains a biographical sketch of van Vogt and traces the influence of A. Korzybski's work on general semantics, "Science and Sanity," on his writing, while the second major section provides an…
Perspective: Semantic Data Management for the Home
2009-02-01
stored. For example, one view might be “all files with type=music and artist= Beatles stored on Liz’s iPod” and another “all files with owner=Liz...semantic naming structures and search tech- niques from a rich history of previous work. The Seman- tic Filesystem [12] proposed the use of attribute
ERIC Educational Resources Information Center
Huettig, Falk; McQueen, James M.
2007-01-01
Experiments 1 and 2 examined the time-course of retrieval of phonological, visual-shape and semantic knowledge as Dutch participants listened to sentences and looked at displays of four pictures. Given a sentence with "beker," "beaker," for example, the display contained phonological (a beaver, "bever"), shape (a…
NASA Astrophysics Data System (ADS)
Chmiel, P.; Ganzha, M.; Jaworska, T.; Paprzycki, M.
2017-10-01
Nowadays, as a part of systematic growth of volume, and variety, of information that can be found on the Internet, we observe also dramatic increase in sizes of available image collections. There are many ways to help users browsing / selecting images of interest. One of popular approaches are Content-Based Image Retrieval (CBIR) systems, which allow users to search for images that match their interests, expressed in the form of images (query by example). However, we believe that image search and retrieval could take advantage of semantic technologies. We have decided to test this hypothesis. Specifically, on the basis of knowledge captured in the CBIR, we have developed a domain ontology of residential real estate (detached houses, in particular). This allows us to semantically represent each image (and its constitutive architectural elements) represented within the CBIR. The proposed ontology was extended to capture not only the elements resulting from image segmentation, but also "spatial relations" between them. As a result, a new approach to querying the image database (semantic querying) has materialized, thus extending capabilities of the developed system.
A Geospatial Semantic Enrichment and Query Service for Geotagged Photographs
Ennis, Andrew; Nugent, Chris; Morrow, Philip; Chen, Liming; Ioannidis, George; Stan, Alexandru; Rachev, Preslav
2015-01-01
With the increasing abundance of technologies and smart devices, equipped with a multitude of sensors for sensing the environment around them, information creation and consumption has now become effortless. This, in particular, is the case for photographs with vast amounts being created and shared every day. For example, at the time of this writing, Instagram users upload 70 million photographs a day. Nevertheless, it still remains a challenge to discover the “right” information for the appropriate purpose. This paper describes an approach to create semantic geospatial metadata for photographs, which can facilitate photograph search and discovery. To achieve this we have developed and implemented a semantic geospatial data model by which a photograph can be enrich with geospatial metadata extracted from several geospatial data sources based on the raw low-level geo-metadata from a smartphone photograph. We present the details of our method and implementation for searching and querying the semantic geospatial metadata repository to enable a user or third party system to find the information they are looking for. PMID:26205265
A Statistical Ontology-Based Approach to Ranking for Multiword Search
ERIC Educational Resources Information Center
Kim, Jinwoo
2013-01-01
Keyword search is a prominent data retrieval method for the Web, largely because the simple and efficient nature of keyword processing allows a large amount of information to be searched with fast response. However, keyword search approaches do not formally capture the clear meaning of a keyword query and fail to address the semantic relationships…
Transformation of an uncertain video search pipeline to a sketch-based visual analytics loop.
Legg, Philip A; Chung, David H S; Parry, Matthew L; Bown, Rhodri; Jones, Mark W; Griffiths, Iwan W; Chen, Min
2013-12-01
Traditional sketch-based image or video search systems rely on machine learning concepts as their core technology. However, in many applications, machine learning alone is impractical since videos may not be semantically annotated sufficiently, there may be a lack of suitable training data, and the search requirements of the user may frequently change for different tasks. In this work, we develop a visual analytics systems that overcomes the shortcomings of the traditional approach. We make use of a sketch-based interface to enable users to specify search requirement in a flexible manner without depending on semantic annotation. We employ active machine learning to train different analytical models for different types of search requirements. We use visualization to facilitate knowledge discovery at the different stages of visual analytics. This includes visualizing the parameter space of the trained model, visualizing the search space to support interactive browsing, visualizing candidature search results to support rapid interaction for active learning while minimizing watching videos, and visualizing aggregated information of the search results. We demonstrate the system for searching spatiotemporal attributes from sports video to identify key instances of the team and player performance.
Optimal Foraging in Semantic Memory
ERIC Educational Resources Information Center
Hills, Thomas T.; Jones, Michael N.; Todd, Peter M.
2012-01-01
Do humans search in memory using dynamic local-to-global search strategies similar to those that animals use to forage between patches in space? If so, do their dynamic memory search policies correspond to optimal foraging strategies seen for spatial foraging? Results from a number of fields suggest these possibilities, including the shared…
Smarter Earth Science Data System
NASA Technical Reports Server (NTRS)
Huang, Thomas
2013-01-01
The explosive growth in Earth observational data in the recent decade demands a better method of interoperability across heterogeneous systems. The Earth science data system community has mastered the art in storing large volume of observational data, but it is still unclear how this traditional method scale over time as we are entering the age of Big Data. Indexed search solutions such as Apache Solr (Smiley and Pugh, 2011) provides fast, scalable search via keyword or phases without any reasoning or inference. The modern search solutions such as Googles Knowledge Graph (Singhal, 2012) and Microsoft Bing, all utilize semantic reasoning to improve its accuracy in searches. The Earth science user community is demanding for an intelligent solution to help them finding the right data for their researches. The Ontological System for Context Artifacts and Resources (OSCAR) (Huang et al., 2012), was created in response to the DARPA Adaptive Vehicle Make (AVM) programs need for an intelligent context models management system to empower its terrain simulation subsystem. The core component of OSCAR is the Environmental Context Ontology (ECO) is built using the Semantic Web for Earth and Environmental Terminology (SWEET) (Raskin and Pan, 2005). This paper presents the current data archival methodology within a NASA Earth science data centers and discuss using semantic web to improve the way we capture and serve data to our users.
NASA Astrophysics Data System (ADS)
Petrie, C.; Margaria, T.; Lausen, H.; Zaremba, M.
Explores trade-offs among existing approaches. Reveals strengths and weaknesses of proposed approaches, as well as which aspects of the problem are not yet covered. Introduces software engineering approach to evaluating semantic web services. Service-Oriented Computing is one of the most promising software engineering trends because of the potential to reduce the programming effort for future distributed industrial systems. However, only a small part of this potential rests on the standardization of tools offered by the web services stack. The larger part of this potential rests upon the development of sufficient semantics to automate service orchestration. Currently there are many different approaches to semantic web service descriptions and many frameworks built around them. A common understanding, evaluation scheme, and test bed to compare and classify these frameworks in terms of their capabilities and shortcomings, is necessary to make progress in developing the full potential of Service-Oriented Computing. The Semantic Web Services Challenge is an open source initiative that provides a public evaluation and certification of multiple frameworks on common industrially-relevant problem sets. This edited volume reports on the first results in developing common understanding of the various technologies intended to facilitate the automation of mediation, choreography and discovery for Web Services using semantic annotations. Semantic Web Services Challenge: Results from the First Year is designed for a professional audience composed of practitioners and researchers in industry. Professionals can use this book to evaluate SWS technology for their potential practical use. The book is also suitable for advanced-level students in computer science.
Designing learning management system interoperability in semantic web
NASA Astrophysics Data System (ADS)
Anistyasari, Y.; Sarno, R.; Rochmawati, N.
2018-01-01
The extensive adoption of learning management system (LMS) has set the focus on the interoperability requirement. Interoperability is the ability of different computer systems, applications or services to communicate, share and exchange data, information, and knowledge in a precise, effective and consistent way. Semantic web technology and the use of ontologies are able to provide the required computational semantics and interoperability for the automation of tasks in LMS. The purpose of this study is to design learning management system interoperability in the semantic web which currently has not been investigated deeply. Moodle is utilized to design the interoperability. Several database tables of Moodle are enhanced and some features are added. The semantic web interoperability is provided by exploited ontology in content materials. The ontology is further utilized as a searching tool to match user’s queries and available courses. It is concluded that LMS interoperability in Semantic Web is possible to be performed.
Investigating the capabilities of semantic enrichment of 3D CityEngine data
NASA Astrophysics Data System (ADS)
Solou, Dimitra; Dimopoulou, Efi
2016-08-01
In recent years the development of technology and the lifting of several technical limitations, has brought the third dimension to the fore. The complexity of urban environments and the strong need for land administration, intensify the need of using a three-dimensional cadastral system. Despite the progress in the field of geographic information systems and 3D modeling techniques, there is no fully digital 3D cadastre. The existing geographic information systems and the different methods of three-dimensional modeling allow for better management, visualization and dissemination of information. Nevertheless, these opportunities cannot be totally exploited because of deficiencies in standardization and interoperability in these systems. Within this context, CityGML was developed as an international standard of the Open Geospatial Consortium (OGC) for 3D city models' representation and exchange. CityGML defines geometry and topology for city modeling, also focusing on semantic aspects of 3D city information. The scope of CityGML is to reach common terminology, also addressing the imperative need for interoperability and data integration, taking into account the number of available geographic information systems and modeling techniques. The aim of this paper is to develop an application for managing semantic information of a model generated based on procedural modeling. The model was initially implemented in CityEngine ESRI's software, and then imported to ArcGIS environment. Final goal was the original model's semantic enrichment and then its conversion to CityGML format. Semantic information management and interoperability seemed to be feasible by the use of the 3DCities Project ESRI tools, since its database structure ensures adding semantic information to the CityEngine model and therefore automatically convert to CityGML for advanced analysis and visualization in different application areas.
Next-Generation Search Engines for Information Retrieval
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, Ranjeet; Hook, Leslie A; Palanisamy, Giri
In the recent years, there have been significant advancements in the areas of scientific data management and retrieval techniques, particularly in terms of standards and protocols for archiving data and metadata. Scientific data is rich, and spread across different places. In order to integrate these pieces together, a data archive and associated metadata should be generated. Data should be stored in a format that can be retrievable and more importantly it should be in a format that will continue to be accessible as technology changes, such as XML. While general-purpose search engines (such as Google or Bing) are useful formore » finding many things on the Internet, they are often of limited usefulness for locating Earth Science data relevant (for example) to a specific spatiotemporal extent. By contrast, tools that search repositories of structured metadata can locate relevant datasets with fairly high precision, but the search is limited to that particular repository. Federated searches (such as Z39.50) have been used, but can be slow and the comprehensiveness can be limited by downtime in any search partner. An alternative approach to improve comprehensiveness is for a repository to harvest metadata from other repositories, possibly with limits based on subject matter or access permissions. Searches through harvested metadata can be extremely responsive, and the search tool can be customized with semantic augmentation appropriate to the community of practice being served. One such system, Mercury, a metadata harvesting, data discovery, and access system, built for researchers to search to, share and obtain spatiotemporal data used across a range of climate and ecological sciences. Mercury is open-source toolset, backend built on Java and search capability is supported by the some popular open source search libraries such as SOLR and LUCENE. Mercury harvests the structured metadata and key data from several data providing servers around the world and builds a centralized index. The harvested files are indexed against SOLR search API consistently, so that it can render search capabilities such as simple, fielded, spatial and temporal searches across a span of projects ranging from land, atmosphere, and ocean ecology. Mercury also provides data sharing capabilities using Open Archive Initiatives Protocol for Metadata Handling (OAI-PMH). In this paper we will discuss about the best practices for archiving data and metadata, new searching techniques, efficient ways of data retrieval and information display.« less
A novel visualization model for web search results.
Nguyen, Tien N; Zhang, Jin
2006-01-01
This paper presents an interactive visualization system, named WebSearchViz, for visualizing the Web search results and acilitating users' navigation and exploration. The metaphor in our model is the solar system with its planets and asteroids revolving around the sun. Location, color, movement, and spatial distance of objects in the visual space are used to represent the semantic relationships between a query and relevant Web pages. Especially, the movement of objects and their speeds add a new dimension to the visual space, illustrating the degree of relevance among a query and Web search results in the context of users' subjects of interest. By interacting with the visual space, users are able to observe the semantic relevance between a query and a resulting Web page with respect to their subjects of interest, context information, or concern. Users' subjects of interest can be dynamically changed, redefined, added, or deleted from the visual space.
A novel adaptive Cuckoo search for optimal query plan generation.
Gomathi, Ramalingam; Sharmila, Dhandapani
2014-01-01
The emergence of multiple web pages day by day leads to the development of the semantic web technology. A World Wide Web Consortium (W3C) standard for storing semantic web data is the resource description framework (RDF). To enhance the efficiency in the execution time for querying large RDF graphs, the evolving metaheuristic algorithms become an alternate to the traditional query optimization methods. This paper focuses on the problem of query optimization of semantic web data. An efficient algorithm called adaptive Cuckoo search (ACS) for querying and generating optimal query plan for large RDF graphs is designed in this research. Experiments were conducted on different datasets with varying number of predicates. The experimental results have exposed that the proposed approach has provided significant results in terms of query execution time. The extent to which the algorithm is efficient is tested and the results are documented.
NASA Astrophysics Data System (ADS)
Baumann, Peter
2013-04-01
There is a traditional saying that metadata are understandable, semantic-rich, and searchable. Data, on the other hand, are big, with no accessible semantics, and just downloadable. Not only has this led to an imbalance of search support form a user perspective, but also underneath to a deep technology divide often using relational databases for metadata and bespoke archive solutions for data. Our vision is that this barrier will be overcome, and data and metadata become searchable likewise, leveraging the potential of semantic technologies in combination with scalability technologies. Ultimately, in this vision ad-hoc processing and filtering will not distinguish any longer, forming a uniformly accessible data universe. In the European EarthServer initiative, we work towards this vision by federating database-style raster query languages with metadata search and geo broker technology. We present our approach taken, how it can leverage OGC standards, the benefits envisaged, and first results.
Constructive Ontology Engineering
ERIC Educational Resources Information Center
Sousan, William L.
2010-01-01
The proliferation of the Semantic Web depends on ontologies for knowledge sharing, semantic annotation, data fusion, and descriptions of data for machine interpretation. However, ontologies are difficult to create and maintain. In addition, their structure and content may vary depending on the application and domain. Several methods described in…
Progress toward a Semantic eScience Framework; building on advanced cyberinfrastructure
NASA Astrophysics Data System (ADS)
McGuinness, D. L.; Fox, P. A.; West, P.; Rozell, E.; Zednik, S.; Chang, C.
2010-12-01
The configurable and extensible semantic eScience framework (SESF) has begun development and implementation of several semantic application components. Extensions and improvements to several ontologies have been made based on distinct interdisciplinary use cases ranging from solar physics, to biologicl and chemical oceanography. Importantly, these semantic representations mediate access to a diverse set of existing and emerging cyberinfrastructure. Among the advances are the population of triple stores with web accessible query services. A triple store is akin to a relational data store where the basic stored unit is a subject-predicate-object tuple. Access via a query is provided by the W3 Recommendation language specification SPARQL. Upon this middle tier of semantic cyberinfrastructure, we have developed several forms of semantic faceted search, including provenance-awareness. We report on the rapid advances in semantic technologies and tools and how we are sustaining the software path for the required technical advances as well as the ontology improvements and increased functionality of the semantic applications including how they are integrated into web-based portals (e.g. Drupal) and web services. Lastly, we indicate future work direction and opportunities for collaboration.
Annotations of Mexican bullfighting videos for semantic index
NASA Astrophysics Data System (ADS)
Montoya Obeso, Abraham; Oropesa Morales, Lester Arturo; Fernando Vázquez, Luis; Cocolán Almeda, Sara Ivonne; Stoian, Andrei; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Montiel Perez, Jesús Yalja; de la O Torres, Saul; Ramírez Acosta, Alejandro Alvaro
2015-09-01
The video annotation is important for web indexing and browsing systems. Indeed, in order to evaluate the performance of video query and mining techniques, databases with concept annotations are required. Therefore, it is necessary generate a database with a semantic indexing that represents the digital content of the Mexican bullfighting atmosphere. This paper proposes a scheme to make complex annotations in a video in the frame of multimedia search engine project. Each video is partitioned using our segmentation algorithm that creates shots of different length and different number of frames. In order to make complex annotations about the video, we use ELAN software. The annotations are done in two steps: First, we take note about the whole content in each shot. Second, we describe the actions as parameters of the camera like direction, position and deepness. As a consequence, we obtain a more complete descriptor of every action. In both cases we use the concepts of the TRECVid 2014 dataset. We also propose new concepts. This methodology allows to generate a database with the necessary information to create descriptors and algorithms capable to detect actions to automatically index and classify new bullfighting multimedia content.
Computing Information Value from RDF Graph Properties
DOE Office of Scientific and Technical Information (OSTI.GOV)
al-Saffar, Sinan; Heileman, Gregory
2010-11-08
Information value has been implicitly utilized and mostly non-subjectively computed in information retrieval (IR) systems. We explicitly define and compute the value of an information piece as a function of two parameters, the first is the potential semantic impact the target information can subjectively have on its recipient's world-knowledge, and the second parameter is trust in the information source. We model these two parameters as properties of RDF graphs. Two graphs are constructed, a target graph representing the semantics of the target body of information and a context graph representing the context of the consumer of that information. We computemore » information value subjectively as a function of both potential change to the context graph (impact) and the overlap between the two graphs (trust). Graph change is computed as a graph edit distance measuring the dissimilarity between the context graph before and after the learning of the target graph. A particular application of this subjective information valuation is in the construction of a personalized ranking component in Web search engines. Based on our method, we construct a Web re-ranking system that personalizes the information experience for the information-consumer.« less
Issues and solutions for storage, retrieval, and searching of MPEG-7 documents
NASA Astrophysics Data System (ADS)
Chang, Yuan-Chi; Lo, Ming-Ling; Smith, John R.
2000-10-01
The ongoing MPEG-7 standardization activity aims at creating a standard for describing multimedia content in order to facilitate the interpretation of the associated information content. Attempting to address a broad range of applications, MPEG-7 has defined a flexible framework consisting of Descriptors, Description Schemes, and Description Definition Language. Descriptors and Description Schemes describe features, structure and semantics of multimedia objects. They are written in the Description Definition Language (DDL). In the most recent revision, DDL applies XML (Extensible Markup Language) Schema with MPEG-7 extensions. DDL has constructs that support inclusion, inheritance, reference, enumeration, choice, sequence, and abstract type of Description Schemes and Descriptors. In order to enable multimedia systems to use MPEG-7, a number of important problems in storing, retrieving and searching MPEG-7 documents need to be solved. This paper reports on initial finding on issues and solutions of storing and accessing MPEG-7 documents. In particular, we discuss the benefits of using a virtual document management framework based on XML Access Server (XAS) in order to bridge the MPEG-7 multimedia applications and database systems. The need arises partly because MPEG-7 descriptions need customized storage schema, indexing and search engines. We also discuss issues arising in managing dependence and cross-description scheme search.
Automated Semantic Indexing of Figure Captions to Improve Radiology Image Retrieval
Kahn, Charles E.; Rubin, Daniel L.
2009-01-01
Objective We explored automated concept-based indexing of unstructured figure captions to improve retrieval of images from radiology journals. Design The MetaMap Transfer program (MMTx) was used to map the text of 84,846 figure captions from 9,004 peer-reviewed, English-language articles to concepts in three controlled vocabularies from the UMLS Metathesaurus, version 2006AA. Sampling procedures were used to estimate the standard information-retrieval metrics of precision and recall, and to evaluate the degree to which concept-based retrieval improved image retrieval. Measurements Precision was estimated based on a sample of 250 concepts. Recall was estimated based on a sample of 40 concepts. The authors measured the impact of concept-based retrieval to improve upon keyword-based retrieval in a random sample of 10,000 search queries issued by users of a radiology image search engine. Results Estimated precision was 0.897 (95% confidence interval, 0.857–0.937). Estimated recall was 0.930 (95% confidence interval, 0.838–1.000). In 5,535 of 10,000 search queries (55%), concept-based retrieval found results not identified by simple keyword matching; in 2,086 searches (21%), more than 75% of the results were found by concept-based search alone. Conclusion Concept-based indexing of radiology journal figure captions achieved very high precision and recall, and significantly improved image retrieval. PMID:19261938
Semantic Context Detection Using Audio Event Fusion
NASA Astrophysics Data System (ADS)
Chu, Wei-Ta; Cheng, Wen-Huang; Wu, Ja-Ling
2006-12-01
Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events, that is, gunshot, explosion, engine, and car braking, in action movies. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining by using audio characteristics.
A Story of a Crashed Plane in US-Mexican border
NASA Astrophysics Data System (ADS)
Bermudez, Luis; Hobona, Gobe; Vretanos, Peter; Peterson, Perry
2013-04-01
A plane has crashed on the US-Mexican border. The search and rescue command center planner needs to find information about the crash site, a mountain, nearby mountains for the establishment of a communications tower, as well as ranches for setting up a local incident center. Events like this one occur all over the world and exchanging information seamlessly is key to save lives and prevent further disasters. This abstract describes an interoperability testbed that applied this scenario using technologies based on Open Geospatial Consortium (OGC) standards. The OGC, which has about 500 members, serves as a global forum for the collaboration of developers and users of spatial data products and services, and to advance the development of international standards for geospatial interoperability. The OGC Interoperability Program conducts international interoperability testbeds, such as the OGC Web Services Phase 9 (OWS-9), that encourages rapid development, testing, validation, demonstration and adoption of open, consensus based standards and best practices. The Cross-Community Interoperability (CCI) thread in OWS-9 advanced the Web Feature Service for Gazetteers (WFS-G) by providing a Single Point of Entry Global Gazetteer (SPEGG), where a user can submit a single query and access global geographic names data across multiple Federal names databases. Currently users must make two queries with differing input parameters against two separate databases to obtain authoritative cross border geographic names data. The gazetteers in this scenario included: GNIS and GNS. GNIS or Geographic Names Information System is managed by USGS. It was first developed in 1964 and contains information about domestic and Antarctic names. GNS or GeoNET Names Server provides the Geographic Names Data Base (GNDB) and it is managed by National Geospatial Intelligence Agency (NGA). GNS has been in service since 1994, and serves names for areas outside the United States and its dependent areas, as well as names for undersea features. The following challenges were advanced: Cascaded WFS-G servers (allowing to query multiple WFSs with a "parent" WFS), implemented query names filters (e.g. fuzzy search, text search), implemented dealing with multilingualism and diacritics, implemented advanced spatial constraints (e.g. search by radial search and nearest neighbor) and semantically mediated feature types (e.g. mountain vs. hill). To enable semantic mediation, a series of semantic mappings were defined between the NGA GNS, USGS GNIS and the Alexandria Digital Library (ADL) Gazetteer. The mappings were encoded in the Web Ontology Language (OWL) to enable them to be used by semantic web technologies. The semantic mappings were then published for ingestion into a semantic mediator that used the mappings to associate location types from one gazetteer with location types in another. The semantic mediator was then able to transform requests on the fly, providing a single point of entry WFS-G to multiple gazetteers. The presentation will provide a live presentation of the work performed, highlight main developments, and discuss future development.
Wu, Honghan; Toti, Giulia; Morley, Katherine I; Ibrahim, Zina M; Folarin, Amos; Jackson, Richard; Kartoglu, Ismail; Agrawal, Asha; Stringer, Clive; Gale, Darren; Gorrell, Genevieve; Roberts, Angus; Broadbent, Matthew; Stewart, Robert; Dobson, Richard J B
2018-05-01
Unlocking the data contained within both structured and unstructured components of electronic health records (EHRs) has the potential to provide a step change in data available for secondary research use, generation of actionable medical insights, hospital management, and trial recruitment. To achieve this, we implemented SemEHR, an open source semantic search and analytics tool for EHRs. SemEHR implements a generic information extraction (IE) and retrieval infrastructure by identifying contextualized mentions of a wide range of biomedical concepts within EHRs. Natural language processing annotations are further assembled at the patient level and extended with EHR-specific knowledge to generate a timeline for each patient. The semantic data are serviced via ontology-based search and analytics interfaces. SemEHR has been deployed at a number of UK hospitals, including the Clinical Record Interactive Search, an anonymized replica of the EHR of the UK South London and Maudsley National Health Service Foundation Trust, one of Europe's largest providers of mental health services. In 2 Clinical Record Interactive Search-based studies, SemEHR achieved 93% (hepatitis C) and 99% (HIV) F-measure results in identifying true positive patients. At King's College Hospital in London, as part of the CogStack program (github.com/cogstack), SemEHR is being used to recruit patients into the UK Department of Health 100 000 Genomes Project (genomicsengland.co.uk). The validation study suggests that the tool can validate previously recruited cases and is very fast at searching phenotypes; time for recruitment criteria checking was reduced from days to minutes. Validated on open intensive care EHR data, Medical Information Mart for Intensive Care III, the vital signs extracted by SemEHR can achieve around 97% accuracy. Results from the multiple case studies demonstrate SemEHR's efficiency: weeks or months of work can be done within hours or minutes in some cases. SemEHR provides a more comprehensive view of patients, bringing in more and unexpected insight compared to study-oriented bespoke IE systems. SemEHR is open source, available at https://github.com/CogStack/SemEHR.
LitVar: a semantic search engine for linking genomic variant data in PubMed and PMC.
Allot, Alexis; Peng, Yifan; Wei, Chih-Hsuan; Lee, Kyubum; Phan, Lon; Lu, Zhiyong
2018-05-14
The identification and interpretation of genomic variants play a key role in the diagnosis of genetic diseases and related research. These tasks increasingly rely on accessing relevant manually curated information from domain databases (e.g. SwissProt or ClinVar). However, due to the sheer volume of medical literature and high cost of expert curation, curated variant information in existing databases are often incomplete and out-of-date. In addition, the same genetic variant can be mentioned in publications with various names (e.g. 'A146T' versus 'c.436G>A' versus 'rs121913527'). A search in PubMed using only one name usually cannot retrieve all relevant articles for the variant of interest. Hence, to help scientists, healthcare professionals, and database curators find the most up-to-date published variant research, we have developed LitVar for the search and retrieval of standardized variant information. In addition, LitVar uses advanced text mining techniques to compute and extract relationships between variants and other associated entities such as diseases and chemicals/drugs. LitVar is publicly available at https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/LitVar.
FoodWiki: Ontology-Driven Mobile Safe Food Consumption System
Çelik, Duygu
2015-01-01
An ontology-driven safe food consumption mobile system is considered. Over 3,000 compounds are being added to processed food, with numerous effects on the food: to add color, stabilize, texturize, preserve, sweeten, thicken, add flavor, soften, emulsify, and so forth. According to World Health Organization, governments have lately focused on legislation to reduce such ingredients or compounds in manufactured foods as they may have side effects causing health risks such as heart disease, cancer, diabetes, allergens, and obesity. By supervising what and how much to eat as well as what not to eat, we can maximize a patient's life quality through avoidance of unhealthy ingredients. Smart e-health systems with powerful knowledge bases can provide suggestions of appropriate foods to individuals. Next-generation smart knowledgebase systems will not only include traditional syntactic-based search, which limits the utility of the search results, but will also provide semantics for rich searching. In this paper, performance of concept matching of food ingredients is semantic-based, meaning that it runs its own semantic based rule set to infer meaningful results through the proposed Ontology-Driven Mobile Safe Food Consumption System (FoodWiki). PMID:26221624
A Digital Knowledge Preservation Platform for Environmental Sciences
NASA Astrophysics Data System (ADS)
Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Pertinez, Esther; Palacio, Aida; Perez, David
2017-04-01
The Digital Knowledge Preservation Platform is the evolution of a pilot project for Open Data supporting the full research data life cycle. It is currently being evolved at IFCA (Instituto de Física de Cantabria) as a combination of different open tools that have been extended: DMPTool (https://dmptool.org/) with pilot semantics features (RDF export, parameters definition), INVENIO (http://invenio-software.org/ ) customized version to integrate the entire research data life cycle and Jupyter (http://jupyter.org/) as processing tool and reproducibility environment. This complete platform aims to provide an integrated environment for research data management following the FAIR+R principles: -Findable: The Web portal based on Invenio provides a search engine and all elements including metadata to make them easily findable. -Accessible: Both data and software are available online with internal PIDs and DOIs (provided by Datacite). -Interoperable: Datasets can be combined to perform new analysis. The OAI-PMH standard is also integrated. -Re-usable: different licenses types and embargo periods can be defined. -+Reproducible: directly integrated with cloud computing resources. The deployment of the entire system over a Cloud framework helps to build a dynamic and scalable solution, not only for managing open datasets but also as a useful tool for the final user, who is able to directly process and analyse the open data. In parallel, the direct use of semantics and metadata is being explored and integrated in the framework. Ontologies, being a knowledge representation, can contribute to define the elements and relationships of the research data life cycle, including DMP, datasets, software, etc. The first advantage of developing an ontology of a knowledge domain is that they provide a common vocabulary hierarchy (i.e. a conceptual schema) that can be used and standardized by all the agents interested in the domain (either humans or machines). This way of using ontologies is one of the basis of the Semantic Web, where ontologies are set to play a key role in establishing a common terminology between agents. To develop the ontology we are using a graphical tool called Protégé. Protégé is a graphical ontology-development tool which supports a rich knowledge model and it is open-source and freely available. However in order to process and manage the ontology from the web framework, we are using Semantic MediaWiki, which is able to process queries. Semantic MediaWiki is an extension of MediaWiki where we can do semantic search and export data in RDF and CSV format. This system is used as a testbed for the potential use of semantics in a more general environment. This Digital Knowledge Preservation Platform is very closed related to INDIGO-DataCloud project (https://www.indigo-datacloud.eu) since the same data life cycle approach is taking into account (Planning, Collect, Curate, Analyze, Publish, Preserve). INDIGO-DataCloud solutions will be able to support all the different elements in the system, as we showed in the last Research Data Alliance Plenary. This presentation will show the different elements on the system and how they work, as well as the roadmap of their continuous integration.
Issues in Semantic Memory: A Response to Glass and Holyoak. Technical Report No. 101.
ERIC Educational Resources Information Center
Shoben, Edward J.; And Others
Glass and Holyoak (1975) have raised two issues related to the distinction between set-theoretic and network theories of semantic memory, contending that: (a) their version of a network theory, the Marker Search model, is conceptually and empirically superior to the Feature Comparison model version of a set-theoretic theory; and (b) the contrast…
Effects of Spaced Retrieval Training on Semantic Memory in Alzheimer's Disease: A Systematic Review
ERIC Educational Resources Information Center
Oren, Shiri; Willerton, Charlene; Small, Jeff
2014-01-01
Purpose: This article reports on a systematic review and meta-analysis of the effects of spaced retrieval training (SRT) on semantic memory in people with Alzheimer's disease (AD) or related disorder. Method: An initial systematic database search identified 454 potential studies. After screening and de-duplication, 35 studies that used SRT…
Qualitative features of semantic fluency performance in mesial and lateral frontal patients.
Reverberi, Carlo; Laiacona, Marcella; Capitani, Erminio
2006-01-01
Semantic verbal fluency is widely used in clinical and experimental studies. This task is highly sensitive to the presence of brain pathology and is frequently impaired after frontal lesions. Besides the total number of words generated, a qualitative analysis of their sequence can add valuable information about the impaired cognitive components. Thirty-four frontal patients and a group of matched controls were examined. Besides the number of words and subcategories retrieved by each group, we analysed two distinct aspects of the word sequence: the search strategy through a semantically organised store and the ability to switch from one subcategory to another. We checked whether the pattern of impairment changed according to the lesion site within the frontal lobe. Overall, patients produced fewer words than controls. However, only lateral frontal patients presented a reduced semantic relatedness between contiguously produced words and a specifically increased proportion of switches to different subcategories. The performance of lateral frontal patients was in line with the hypothesis of a search strategy impairment and cannot be attributed to a switching deficit. The performance of mesial frontal patients could be ascribed to a general deficit of activation.
Transforming Systems Engineering through Model Centric Engineering
2017-08-08
12 Figure 5. Semantic Web Technologies related to Layers of Abstraction ................................. 23 Figure 6. NASA /JPL Instantiation...of OpenMBEE (circa 2014) ................................................. 24 Figure 7. NASA /JPL Foundational Ontology for Systems Engineering...Engineering (DE) Transformation initiative, and our relationship that we have fostered with National Aeronautics and Space Administration ( NASA ) Jet
A model-driven approach for representing clinical archetypes for Semantic Web environments.
Martínez-Costa, Catalina; Menárguez-Tortosa, Marcos; Fernández-Breis, Jesualdo Tomás; Maldonado, José Alberto
2009-02-01
The life-long clinical information of any person supported by electronic means configures his Electronic Health Record (EHR). This information is usually distributed among several independent and heterogeneous systems that may be syntactically or semantically incompatible. There are currently different standards for representing and exchanging EHR information among different systems. In advanced EHR approaches, clinical information is represented by means of archetypes. Most of these approaches use the Archetype Definition Language (ADL) to specify archetypes. However, ADL has some drawbacks when attempting to perform semantic activities in Semantic Web environments. In this work, Semantic Web technologies are used to specify clinical archetypes for advanced EHR architectures. The advantages of using the Ontology Web Language (OWL) instead of ADL are described and discussed in this work. Moreover, a solution combining Semantic Web and Model-driven Engineering technologies is proposed to transform ADL into OWL for the CEN EN13606 EHR architecture.
Rewriting Logic Semantics of a Plan Execution Language
NASA Technical Reports Server (NTRS)
Dowek, Gilles; Munoz, Cesar A.; Rocha, Camilo
2009-01-01
The Plan Execution Interchange Language (PLEXIL) is a synchronous language developed by NASA to support autonomous spacecraft operations. In this paper, we propose a rewriting logic semantics of PLEXIL in Maude, a high-performance logical engine. The rewriting logic semantics is by itself a formal interpreter of the language and can be used as a semantic benchmark for the implementation of PLEXIL executives. The implementation in Maude has the additional benefit of making available to PLEXIL designers and developers all the formal analysis and verification tools provided by Maude. The formalization of the PLEXIL semantics in rewriting logic poses an interesting challenge due to the synchronous nature of the language and the prioritized rules defining its semantics. To overcome this difficulty, we propose a general procedure for simulating synchronous set relations in rewriting logic that is sound and, for deterministic relations, complete. We also report on the finding of two issues at the design level of the original PLEXIL semantics that were identified with the help of the executable specification in Maude.
Category and Word Search: Generalizing Search Principles to Complex Processing.
1982-03-01
complex processing (e.g., LaBerge & Samels, 1974; Shiffrin & Schneider, 1977). In the present paper we examine how well the major phenomena in simple visual...subjects are searching for novel characters ( LaBerge , 1973). The relatively large and rapid CH practice effects for word and category search are analogous...1974) demonstrated interference effects of irrelevant flanking letters. Shaffer and Laberge (1979) showed a similar effect with words and semantic
Semantic Document Model to Enhance Data and Knowledge Interoperability
NASA Astrophysics Data System (ADS)
Nešić, Saša
To enable document data and knowledge to be efficiently shared and reused across application, enterprise, and community boundaries, desktop documents should be completely open and queryable resources, whose data and knowledge are represented in a form understandable to both humans and machines. At the same time, these are the requirements that desktop documents need to satisfy in order to contribute to the visions of the Semantic Web. With the aim of achieving this goal, we have developed the Semantic Document Model (SDM), which turns desktop documents into Semantic Documents as uniquely identified and semantically annotated composite resources, that can be instantiated into human-readable (HR) and machine-processable (MP) forms. In this paper, we present the SDM along with an RDF and ontology-based solution for the MP document instance. Moreover, on top of the proposed model, we have built the Semantic Document Management System (SDMS), which provides a set of services that exploit the model. As an application example that takes advantage of SDMS services, we have extended MS Office with a set of tools that enables users to transform MS Office documents (e.g., MS Word and MS PowerPoint) into Semantic Documents, and to search local and distant semantic document repositories for document content units (CUs) over Semantic Web protocols.
Wu, Zhenyu; Xu, Yuan; Yang, Yunong; Zhang, Chunhong; Zhu, Xinning; Ji, Yang
2017-02-20
Web of Things (WoT) facilitates the discovery and interoperability of Internet of Things (IoT) devices in a cyber-physical system (CPS). Moreover, a uniform knowledge representation of physical resources is quite necessary for further composition, collaboration, and decision-making process in CPS. Though several efforts have integrated semantics with WoT, such as knowledge engineering methods based on semantic sensor networks (SSN), it still could not represent the complex relationships between devices when dynamic composition and collaboration occur, and it totally depends on manual construction of a knowledge base with low scalability. In this paper, to addresses these limitations, we propose the semantic Web of Things (SWoT) framework for CPS (SWoT4CPS). SWoT4CPS provides a hybrid solution with both ontological engineering methods by extending SSN and machine learning methods based on an entity linking (EL) model. To testify to the feasibility and performance, we demonstrate the framework by implementing a temperature anomaly diagnosis and automatic control use case in a building automation system. Evaluation results on the EL method show that linking domain knowledge to DBpedia has a relative high accuracy and the time complexity is at a tolerant level. Advantages and disadvantages of SWoT4CPS with future work are also discussed.
An Experiment in Scientific Program Understanding
NASA Technical Reports Server (NTRS)
Stewart, Mark E. M.; Owen, Karl (Technical Monitor)
2000-01-01
This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, independent expert parsers. These semantic parsers encode domain knowledge and recognize formulae in different disciplines including physics, numerical methods, mathematics, and geometry. The parsers will automatically recognize and document some static, semantic concepts and help locate some program semantic errors. Results are shown for three intensively studied codes and seven blind test cases; all test cases are state of the art scientific codes. These techniques may apply to a wider range of scientific codes. If so, the techniques could reduce the time, risk, and effort required to develop and modify scientific codes.
iSMART: Ontology-based Semantic Query of CDA Documents
Liu, Shengping; Ni, Yuan; Mei, Jing; Li, Hanyu; Xie, Guotong; Hu, Gang; Liu, Haifeng; Hou, Xueqiao; Pan, Yue
2009-01-01
The Health Level 7 Clinical Document Architecture (CDA) is widely accepted as the format for electronic clinical document. With the rich ontological references in CDA documents, the ontology-based semantic query could be performed to retrieve CDA documents. In this paper, we present iSMART (interactive Semantic MedicAl Record reTrieval), a prototype system designed for ontology-based semantic query of CDA documents. The clinical information in CDA documents will be extracted into RDF triples by a declarative XML to RDF transformer. An ontology reasoner is developed to infer additional information by combining the background knowledge from SNOMED CT ontology. Then an RDF query engine is leveraged to enable the semantic queries. This system has been evaluated using the real clinical documents collected from a large hospital in southern China. PMID:20351883
SATORI: a system for ontology-guided visual exploration of biomedical data repositories.
Lekschas, Fritz; Gehlenborg, Nils
2018-04-01
The ever-increasing number of biomedical datasets provides tremendous opportunities for re-use but current data repositories provide limited means of exploration apart from text-based search. Ontological metadata annotations provide context by semantically relating datasets. Visualizing this rich network of relationships can improve the explorability of large data repositories and help researchers find datasets of interest. We developed SATORI-an integrative search and visual exploration interface for the exploration of biomedical data repositories. The design is informed by a requirements analysis through a series of semi-structured interviews. We evaluated the implementation of SATORI in a field study on a real-world data collection. SATORI enables researchers to seamlessly search, browse and semantically query data repositories via two visualizations that are highly interconnected with a powerful search interface. SATORI is an open-source web application, which is freely available at http://satori.refinery-platform.org and integrated into the Refinery Platform. nils@hms.harvard.edu. Supplementary data are available at Bioinformatics online.
A path-oriented knowledge representation system: Defusing the combinatorial system
NASA Technical Reports Server (NTRS)
Karamouzis, Stamos T.; Barry, John S.; Smith, Steven L.; Feyock, Stefan
1995-01-01
LIMAP is a programming system oriented toward efficient information manipulation over fixed finite domains, and quantification over paths and predicates. A generalization of Warshall's Algorithm to precompute paths in a sparse matrix representation of semantic nets is employed to allow questions involving paths between components to be posed and answered easily. LIMAP's ability to cache all paths between two components in a matrix cell proved to be a computational obstacle, however, when the semantic net grew to realistic size. The present paper describes a means of mitigating this combinatorial explosion to an extent that makes the use of the LIMAP representation feasible for problems of significant size. The technique we describe radically reduces the size of the search space in which LIMAP must operate; semantic nets of more than 500 nodes have been attacked successfully. Furthermore, it appears that the procedure described is applicable not only to LIMAP, but to a number of other combinatorially explosive search space problems found in AI as well.
The Interplay of Episodic and Semantic Memory in Guiding Repeated Search in Scenes
ERIC Educational Resources Information Center
Vo, Melissa L.-H.; Wolfe, Jeremy M.
2013-01-01
It seems intuitive to think that previous exposure or interaction with an environment should make it easier to search through it and, no doubt, this is true in many real-world situations. However, in a recent study, we demonstrated that previous exposure to a scene does not necessarily speed search within that scene. For instance, when observers…
Sinaci, A Anil; Laleci Erturkmen, Gokce B
2013-10-01
In order to enable secondary use of Electronic Health Records (EHRs) by bridging the interoperability gap between clinical care and research domains, in this paper, a unified methodology and the supporting framework is introduced which brings together the power of metadata registries (MDR) and semantic web technologies. We introduce a federated semantic metadata registry framework by extending the ISO/IEC 11179 standard, and enable integration of data element registries through Linked Open Data (LOD) principles where each Common Data Element (CDE) can be uniquely referenced, queried and processed to enable the syntactic and semantic interoperability. Each CDE and their components are maintained as LOD resources enabling semantic links with other CDEs, terminology systems and with implementation dependent content models; hence facilitating semantic search, much effective reuse and semantic interoperability across different application domains. There are several important efforts addressing the semantic interoperability in healthcare domain such as IHE DEX profile proposal, CDISC SHARE and CDISC2RDF. Our architecture complements these by providing a framework to interlink existing data element registries and repositories for multiplying their potential for semantic interoperability to a greater extent. Open source implementation of the federated semantic MDR framework presented in this paper is the core of the semantic interoperability layer of the SALUS project which enables the execution of the post marketing safety analysis studies on top of existing EHR systems. Copyright © 2013 Elsevier Inc. All rights reserved.
Semantic Indexing of Medical Learning Objects: Medical Students' Usage of a Semantic Network
Gießler, Paul; Ohnesorge-Radtke, Ursula; Spreckelsen, Cord
2015-01-01
Background The Semantically Annotated Media (SAM) project aims to provide a flexible platform for searching, browsing, and indexing medical learning objects (MLOs) based on a semantic network derived from established classification systems. Primarily, SAM supports the Aachen emedia skills lab, but SAM is ready for indexing distributed content and the Simple Knowledge Organizing System standard provides a means for easily upgrading or even exchanging SAM’s semantic network. There is a lack of research addressing the usability of MLO indexes or search portals like SAM and the user behavior with such platforms. Objective The purpose of this study was to assess the usability of SAM by investigating characteristic user behavior of medical students accessing MLOs via SAM. Methods In this study, we chose a mixed-methods approach. Lean usability testing was combined with usability inspection by having the participants complete four typical usage scenarios before filling out a questionnaire. The questionnaire was based on the IsoMetrics usability inventory. Direct user interaction with SAM (mouse clicks and pages accessed) was logged. Results The study analyzed the typical usage patterns and habits of students using a semantic network for accessing MLOs. Four scenarios capturing characteristics of typical tasks to be solved by using SAM yielded high ratings of usability items and showed good results concerning the consistency of indexing by different users. Long-tail phenomena emerge as they are typical for a collaborative Web 2.0 platform. Suitable but nonetheless rarely used keywords were assigned to MLOs by some users. Conclusions It is possible to develop a Web-based tool with high usability and acceptance for indexing and retrieval of MLOs. SAM can be applied to indexing multicentered repositories of MLOs collaboratively. PMID:27731860
Semantic Indexing of Medical Learning Objects: Medical Students' Usage of a Semantic Network.
Tix, Nadine; Gießler, Paul; Ohnesorge-Radtke, Ursula; Spreckelsen, Cord
2015-11-11
The Semantically Annotated Media (SAM) project aims to provide a flexible platform for searching, browsing, and indexing medical learning objects (MLOs) based on a semantic network derived from established classification systems. Primarily, SAM supports the Aachen emedia skills lab, but SAM is ready for indexing distributed content and the Simple Knowledge Organizing System standard provides a means for easily upgrading or even exchanging SAM's semantic network. There is a lack of research addressing the usability of MLO indexes or search portals like SAM and the user behavior with such platforms. The purpose of this study was to assess the usability of SAM by investigating characteristic user behavior of medical students accessing MLOs via SAM. In this study, we chose a mixed-methods approach. Lean usability testing was combined with usability inspection by having the participants complete four typical usage scenarios before filling out a questionnaire. The questionnaire was based on the IsoMetrics usability inventory. Direct user interaction with SAM (mouse clicks and pages accessed) was logged. The study analyzed the typical usage patterns and habits of students using a semantic network for accessing MLOs. Four scenarios capturing characteristics of typical tasks to be solved by using SAM yielded high ratings of usability items and showed good results concerning the consistency of indexing by different users. Long-tail phenomena emerge as they are typical for a collaborative Web 2.0 platform. Suitable but nonetheless rarely used keywords were assigned to MLOs by some users. It is possible to develop a Web-based tool with high usability and acceptance for indexing and retrieval of MLOs. SAM can be applied to indexing multicentered repositories of MLOs collaboratively.
Fan, Jung-Wei; Friedman, Carol
2011-01-01
Biomedical natural language processing (BioNLP) is a useful technique that unlocks valuable information stored in textual data for practice and/or research. Syntactic parsing is a critical component of BioNLP applications that rely on correctly determining the sentence and phrase structure of free text. In addition to dealing with the vast amount of domain-specific terms, a robust biomedical parser needs to model the semantic grammar to obtain viable syntactic structures. With either a rule-based or corpus-based approach, the grammar engineering process requires substantial time and knowledge from experts, and does not always yield a semantically transferable grammar. To reduce the human effort and to promote semantic transferability, we propose an automated method for deriving a probabilistic grammar based on a training corpus consisting of concept strings and semantic classes from the Unified Medical Language System (UMLS), a comprehensive terminology resource widely used by the community. The grammar is designed to specify noun phrases only due to the nominal nature of the majority of biomedical terminological concepts. Evaluated on manually parsed clinical notes, the derived grammar achieved a recall of 0.644, precision of 0.737, and average cross-bracketing of 0.61, which demonstrated better performance than a control grammar with the semantic information removed. Error analysis revealed shortcomings that could be addressed to improve performance. The results indicated the feasibility of an approach which automatically incorporates terminology semantics in the building of an operational grammar. Although the current performance of the unsupervised solution does not adequately replace manual engineering, we believe once the performance issues are addressed, it could serve as an aide in a semi-supervised solution. PMID:21549857
OMOGENIA: A Semantically Driven Collaborative Environment
NASA Astrophysics Data System (ADS)
Liapis, Aggelos
Ontology creation can be thought of as a social procedure. Indeed the concepts involved in general need to be elicited from communities of domain experts and end-users by teams of knowledge engineers. Many problems in ontology creation appear to resemble certain problems in software design, particularly with respect to the setup of collaborative systems. For instance, the resolution of conceptual conflicts between formalized ontologies is a major engineering problem as ontologies move into widespread use on the semantic web. Such conflict resolution often requires human collaboration and cannot be achieved by automated methods with the exception of simple cases. In this chapter we discuss research in the field of computer-supported cooperative work (CSCW) that focuses on classification and which throws light on ontology building. Furthermore, we present a semantically driven collaborative environment called OMOGENIA as a natural way to display and examine the structure of an evolving ontology in a collaborative setting.
On2broker: Semantic-Based Access to Information Sources at the WWW.
ERIC Educational Resources Information Center
Fensel, Dieter; Angele, Jurgen; Decker, Stefan; Erdmann, Michael; Schnurr, Hans-Peter; Staab, Steffen; Studer, Rudi; Witt, Andreas
On2broker provides brokering services to improve access to heterogeneous, distributed, and semistructured information sources as they are presented in the World Wide Web. It relies on the use of ontologies to make explicit the semantics of Web pages. This paper discusses the general architecture and main components (i.e., query engine, information…
Concept Map Engineering: Methods and Tools Based on the Semantic Relation Approach
ERIC Educational Resources Information Center
Kim, Minkyu
2013-01-01
The purpose of this study is to develop a better understanding of technologies that use natural language as the basis for concept map construction. In particular, this study focuses on the semantic relation (SR) approach to drawing rich and authentic concept maps that reflect students' internal representations of a problem situation. The…
Extracting similar terms from multiple EMR-based semantic embeddings to support chart reviews.
Cheng Ye, M S; Fabbri, Daniel
2018-05-21
Word embeddings project semantically similar terms into nearby points in a vector space. When trained on clinical text, these embeddings can be leveraged to improve keyword search and text highlighting. In this paper, we present methods to refine the selection process of similar terms from multiple EMR-based word embeddings, and evaluate their performance quantitatively and qualitatively across multiple chart review tasks. Word embeddings were trained on each clinical note type in an EMR. These embeddings were then combined, weighted, and truncated to select a refined set of similar terms to be used in keyword search and text highlighting. To evaluate their quality, we measured the similar terms' information retrieval (IR) performance using precision-at-K (P@5, P@10). Additionally a user study evaluated users' search term preferences, while a timing study measured the time to answer a question from a clinical chart. The refined terms outperformed the baseline method's information retrieval performance (e.g., increasing the average P@5 from 0.48 to 0.60). Additionally, the refined terms were preferred by most users, and reduced the average time to answer a question. Clinical information can be more quickly retrieved and synthesized when using semantically similar term from multiple embeddings. Copyright © 2018. Published by Elsevier Inc.
OlyMPUS - The Ontology-based Metadata Portal for Unified Semantics
NASA Astrophysics Data System (ADS)
Huffer, E.; Gleason, J. L.
2015-12-01
The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support data consumers and data providers, enabling the latter to register their data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS leverages the semantics and reasoning capabilities of ODISEES to provide data producers with a semi-automated interface for producing the semantically rich metadata needed to support ODISEES' data discovery and access services. It integrates the ODISEES metadata search system with multiple NASA data delivery tools to enable data consumers to create customized data sets for download to their computers, or for NASA Advanced Supercomputing (NAS) facility registered users, directly to NAS storage resources for access by applications running on NAS supercomputers. A core function of NASA's Earth Science Division is research and analysis that uses the full spectrum of data products available in NASA archives. Scientists need to perform complex analyses that identify correlations and non-obvious relationships across all types of Earth System phenomena. Comprehensive analytics are hindered, however, by the fact that many Earth science data products are disparate and hard to synthesize. Variations in how data are collected, processed, gridded, and stored, create challenges for data interoperability and synthesis, which are exacerbated by the sheer volume of available data. Robust, semantically rich metadata can support tools for data discovery and facilitate machine-to-machine transactions with services such as data subsetting, regridding, and reformatting. Such capabilities are critical to enabling the research activities integral to NASA's strategic plans. However, as metadata requirements increase and competing standards emerge, metadata provisioning becomes increasingly burdensome to data producers. The OlyMPUS system helps data providers produce semantically rich metadata, making their data more accessible to data consumers, and helps data consumers quickly discover and download the right data for their research.
Semantic Approaches Applied to Scientific Ocean Drilling Data
NASA Astrophysics Data System (ADS)
Fils, D.; Jenkins, C. J.; Arko, R. A.
2012-12-01
The application of Linked Open Data methods to 40 years of data from scientific ocean drilling is providing users with several new methods for rich-content data search and discovery. Data from the Deep Sea Drilling Project (DSDP), Ocean Drilling Program (ODP) and Integrated Ocean Drilling Program (IODP) have been translated and placed in RDF triple stores to provide access via SPARQL, linked open data patterns, and by embedded structured data through schema.org / RDFa. Existing search services have been re-encoded in this environment which allows the new and established architectures to be contrasted. Vocabularies including computed semantic relations between concepts, allow separate but related data sets to be connected on their concepts and resources even when they are expressed somewhat differently. Scientific ocean drilling produces a wide range of data types and data sets: borehole logging file-based data, images, measurements, visual observations and the physical sample data. The steps involved in connecting these data to concepts using vocabularies will be presented, including the connection of data sets through Vocabulary of Interlinked Datasets (VoID) and open entity collections such as Freebase and dbPedia. Demonstrated examples will include: (i) using RDF Schema for inferencing and in federated searches across NGDC and IODP data, (ii) using structured data in the data.oceandrilling.org web site, (iii) association through semantic methods of age models and depth recorded data to facilitate age based searches for data recorded by depth only.
Cameron, Delroy; Sheth, Amit P; Jaykumar, Nishita; Thirunarayan, Krishnaprasad; Anand, Gaurish; Smith, Gary A
2014-12-01
While contemporary semantic search systems offer to improve classical keyword-based search, they are not always adequate for complex domain specific information needs. The domain of prescription drug abuse, for example, requires knowledge of both ontological concepts and "intelligible constructs" not typically modeled in ontologies. These intelligible constructs convey essential information that include notions of intensity, frequency, interval, dosage and sentiments, which could be important to the holistic needs of the information seeker. In this paper, we present a hybrid approach to domain specific information retrieval that integrates ontology-driven query interpretation with synonym-based query expansion and domain specific rules, to facilitate search in social media on prescription drug abuse. Our framework is based on a context-free grammar (CFG) that defines the query language of constructs interpretable by the search system. The grammar provides two levels of semantic interpretation: 1) a top-level CFG that facilitates retrieval of diverse textual patterns, which belong to broad templates and 2) a low-level CFG that enables interpretation of specific expressions belonging to such textual patterns. These low-level expressions occur as concepts from four different categories of data: 1) ontological concepts, 2) concepts in lexicons (such as emotions and sentiments), 3) concepts in lexicons with only partial ontology representation, called lexico-ontology concepts (such as side effects and routes of administration (ROA)), and 4) domain specific expressions (such as date, time, interval, frequency and dosage) derived solely through rules. Our approach is embodied in a novel Semantic Web platform called PREDOSE, which provides search support for complex domain specific information needs in prescription drug abuse epidemiology. When applied to a corpus of over 1 million drug abuse-related web forum posts, our search framework proved effective in retrieving relevant documents when compared with three existing search systems.
Cameron, Delroy; Sheth, Amit P.; Jaykumar, Nishita; Thirunarayan, Krishnaprasad; Anand, Gaurish; Smith, Gary A.
2015-01-01
While contemporary semantic search systems offer to improve classical keyword-based search, they are not always adequate for complex domain specific information needs. The domain of prescription drug abuse, for example, requires knowledge of both ontological concepts and “intelligible constructs” not typically modeled in ontologies. These intelligible constructs convey essential information that include notions of intensity, frequency, interval, dosage and sentiments, which could be important to the holistic needs of the information seeker. In this paper, we present a hybrid approach to domain specific information retrieval that integrates ontology-driven query interpretation with synonym-based query expansion and domain specific rules, to facilitate search in social media on prescription drug abuse. Our framework is based on a context-free grammar (CFG) that defines the query language of constructs interpretable by the search system. The grammar provides two levels of semantic interpretation: 1) a top-level CFG that facilitates retrieval of diverse textual patterns, which belong to broad templates and 2) a low-level CFG that enables interpretation of specific expressions belonging to such textual patterns. These low-level expressions occur as concepts from four different categories of data: 1) ontological concepts, 2) concepts in lexicons (such as emotions and sentiments), 3) concepts in lexicons with only partial ontology representation, called lexico-ontology concepts (such as side effects and routes of administration (ROA)), and 4) domain specific expressions (such as date, time, interval, frequency and dosage) derived solely through rules. Our approach is embodied in a novel Semantic Web platform called PREDOSE, which provides search support for complex domain specific information needs in prescription drug abuse epidemiology. When applied to a corpus of over 1 million drug abuse-related web forum posts, our search framework proved effective in retrieving relevant documents when compared with three existing search systems. PMID:25814917
Searching for extra-terrestrial civilizations
NASA Technical Reports Server (NTRS)
Gindilis, L. M.
1974-01-01
The probability of radio interchange with extraterrestrial civilizations is discussed. Difficulties constitute absorption, scattering, and dispersion of signals by the rarified interstellar medium as well as the deciphering of received signals and convergence of semantic concept. A cybernetic approach considers searching for signals that develop from astroengineering activities of extraterrestrial civilizations.
Text Mining the History of Medicine.
Thompson, Paul; Batista-Navarro, Riza Theresa; Kontonatsios, Georgios; Carter, Jacob; Toon, Elizabeth; McNaught, John; Timmermann, Carsten; Worboys, Michael; Ananiadou, Sophia
2016-01-01
Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, according to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid 19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while the processing pipeline and its modules may be used and configured within the Argo TM platform.
Text Mining the History of Medicine
Thompson, Paul; Batista-Navarro, Riza Theresa; Kontonatsios, Georgios; Carter, Jacob; Toon, Elizabeth; McNaught, John; Timmermann, Carsten; Worboys, Michael; Ananiadou, Sophia
2016-01-01
Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, according to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid 19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while the processing pipeline and its modules may be used and configured within the Argo TM platform. PMID:26734936
Mining and integration of pathway diagrams from imaging data.
Kozhenkov, Sergey; Baitaluk, Michael
2012-03-01
Pathway diagrams from PubMed and World Wide Web (WWW) contain valuable highly curated information difficult to reach without tools specifically designed and customized for the biological semantics and high-content density of the images. There is currently no search engine or tool that can analyze pathway images, extract their pathway components (molecules, genes, proteins, organelles, cells, organs, etc.) and indicate their relationships. Here, we describe a resource of pathway diagrams retrieved from article and web-page images through optical character recognition, in conjunction with data mining and data integration methods. The recognized pathways are integrated into the BiologicalNetworks research environment linking them to a wealth of data available in the BiologicalNetworks' knowledgebase, which integrates data from >100 public data sources and the biomedical literature. Multiple search and analytical tools are available that allow the recognized cellular pathways, molecular networks and cell/tissue/organ diagrams to be studied in the context of integrated knowledge, experimental data and the literature. BiologicalNetworks software and the pathway repository are freely available at www.biologicalnetworks.org. Supplementary data are available at Bioinformatics online.
Breast Histopathological Image Retrieval Based on Latent Dirichlet Allocation.
Ma, Yibing; Jiang, Zhiguo; Zhang, Haopeng; Xie, Fengying; Zheng, Yushan; Shi, Huaqiang; Zhao, Yu
2017-07-01
In the field of pathology, whole slide image (WSI) has become the major carrier of visual and diagnostic information. Content-based image retrieval among WSIs can aid the diagnosis of an unknown pathological image by finding its similar regions in WSIs with diagnostic information. However, the huge size and complex content of WSI pose several challenges for retrieval. In this paper, we propose an unsupervised, accurate, and fast retrieval method for a breast histopathological image. Specifically, the method presents a local statistical feature of nuclei for morphology and distribution of nuclei, and employs the Gabor feature to describe the texture information. The latent Dirichlet allocation model is utilized for high-level semantic mining. Locality-sensitive hashing is used to speed up the search. Experiments on a WSI database with more than 8000 images from 15 types of breast histopathology demonstrate that our method achieves about 0.9 retrieval precision as well as promising efficiency. Based on the proposed framework, we are developing a search engine for an online digital slide browsing and retrieval platform, which can be applied in computer-aided diagnosis, pathology education, and WSI archiving and management.
Li, Yuanqing; Wang, Guangyi; Long, Jinyi; Yu, Zhuliang; Huang, Biao; Li, Xiaojian; Yu, Tianyou; Liang, Changhong; Li, Zheng; Sun, Pei
2011-01-01
One of the central questions in cognitive neuroscience is the precise neural representation, or brain pattern, associated with a semantic category. In this study, we explored the influence of audiovisual stimuli on the brain patterns of concepts or semantic categories through a functional magnetic resonance imaging (fMRI) experiment. We used a pattern search method to extract brain patterns corresponding to two semantic categories: "old people" and "young people." These brain patterns were elicited by semantically congruent audiovisual, semantically incongruent audiovisual, unimodal visual, and unimodal auditory stimuli belonging to the two semantic categories. We calculated the reproducibility index, which measures the similarity of the patterns within the same category. We also decoded the semantic categories from these brain patterns. The decoding accuracy reflects the discriminability of the brain patterns between two categories. The results showed that both the reproducibility index of brain patterns and the decoding accuracy were significantly higher for semantically congruent audiovisual stimuli than for unimodal visual and unimodal auditory stimuli, while the semantically incongruent stimuli did not elicit brain patterns with significantly higher reproducibility index or decoding accuracy. Thus, the semantically congruent audiovisual stimuli enhanced the within-class reproducibility of brain patterns and the between-class discriminability of brain patterns, and facilitate neural representations of semantic categories or concepts. Furthermore, we analyzed the brain activity in superior temporal sulcus and middle temporal gyrus (STS/MTG). The strength of the fMRI signal and the reproducibility index were enhanced by the semantically congruent audiovisual stimuli. Our results support the use of the reproducibility index as a potential tool to supplement the fMRI signal amplitude for evaluating multimodal integration.
Long, Jinyi; Yu, Zhuliang; Huang, Biao; Li, Xiaojian; Yu, Tianyou; Liang, Changhong; Li, Zheng; Sun, Pei
2011-01-01
One of the central questions in cognitive neuroscience is the precise neural representation, or brain pattern, associated with a semantic category. In this study, we explored the influence of audiovisual stimuli on the brain patterns of concepts or semantic categories through a functional magnetic resonance imaging (fMRI) experiment. We used a pattern search method to extract brain patterns corresponding to two semantic categories: “old people” and “young people.” These brain patterns were elicited by semantically congruent audiovisual, semantically incongruent audiovisual, unimodal visual, and unimodal auditory stimuli belonging to the two semantic categories. We calculated the reproducibility index, which measures the similarity of the patterns within the same category. We also decoded the semantic categories from these brain patterns. The decoding accuracy reflects the discriminability of the brain patterns between two categories. The results showed that both the reproducibility index of brain patterns and the decoding accuracy were significantly higher for semantically congruent audiovisual stimuli than for unimodal visual and unimodal auditory stimuli, while the semantically incongruent stimuli did not elicit brain patterns with significantly higher reproducibility index or decoding accuracy. Thus, the semantically congruent audiovisual stimuli enhanced the within-class reproducibility of brain patterns and the between-class discriminability of brain patterns, and facilitate neural representations of semantic categories or concepts. Furthermore, we analyzed the brain activity in superior temporal sulcus and middle temporal gyrus (STS/MTG). The strength of the fMRI signal and the reproducibility index were enhanced by the semantically congruent audiovisual stimuli. Our results support the use of the reproducibility index as a potential tool to supplement the fMRI signal amplitude for evaluating multimodal integration. PMID:21750692
LIS Professionals as Knowledge Engineers.
ERIC Educational Resources Information Center
Poulter, Alan; And Others
1994-01-01
Considers the role of library and information science professionals as knowledge engineers. Highlights include knowledge acquisition, including personal experience, interviews, protocol analysis, observation, multidimensional sorting, printed sources, and machine learning; knowledge representation, including production rules and semantic nets;…
Semantic annotation in biomedicine: the current landscape.
Jovanović, Jelena; Bagheri, Ebrahim
2017-09-22
The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators.Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state of the art biomedical semantic annotators, focusing particularly on general purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.
Taxonomy, Ontology and Semantics at Johnson Space Center
NASA Technical Reports Server (NTRS)
Berndt, Sarah Ann
2011-01-01
At NASA Johnson Space Center (JSC), the Chief Knowledge Officer has been developing the JSC Taxonomy to capitalize on the accomplishments of yesterday while maintaining the flexibility needed for the evolving information environment of today. A clear vision and scope for the semantic system is integral to its success. The vision for the JSC Taxonomy is to connect information stovepipes to present a unified view for information and knowledge across the Center, across organizations, and across decades. Semantic search at JSC means seemless integration of disparate information sets into a single interface. Ever increasing use, interest, and organizational participation mark successful integration and provide the framework for future application.
Semantic Agent-Based Service Middleware and Simulation for Smart Cities
Liu, Ming; Xu, Yang; Hu, Haixiao; Mohammed, Abdul-Wahid
2016-01-01
With the development of Machine-to-Machine (M2M) technology, a variety of embedded and mobile devices is integrated to interact via the platform of the Internet of Things, especially in the domain of smart cities. One of the primary challenges is that selecting the appropriate services or service combination for upper layer applications is hard, which is due to the absence of a unified semantical service description pattern, as well as the service selection mechanism. In this paper, we define a semantic service representation model from four key properties: Capability (C), Deployment (D), Resource (R) and IOData (IO). Based on this model, an agent-based middleware is built to support semantic service enablement. In this middleware, we present an efficient semantic service discovery and matching approach for a service combination process, which calculates the semantic similarity between services, and a heuristic algorithm to search the service candidates for a specific service request. Based on this design, we propose a simulation of virtual urban fire fighting, and the experimental results manifest the feasibility and efficiency of our design. PMID:28009818
Semantic Agent-Based Service Middleware and Simulation for Smart Cities.
Liu, Ming; Xu, Yang; Hu, Haixiao; Mohammed, Abdul-Wahid
2016-12-21
With the development of Machine-to-Machine (M2M) technology, a variety of embedded and mobile devices is integrated to interact via the platform of the Internet of Things, especially in the domain of smart cities. One of the primary challenges is that selecting the appropriate services or service combination for upper layer applications is hard, which is due to the absence of a unified semantical service description pattern, as well as the service selection mechanism. In this paper, we define a semantic service representation model from four key properties: Capability (C), Deployment (D), Resource (R) and IOData (IO). Based on this model, an agent-based middleware is built to support semantic service enablement. In this middleware, we present an efficient semantic service discovery and matching approach for a service combination process, which calculates the semantic similarity between services, and a heuristic algorithm to search the service candidates for a specific service request. Based on this design, we propose a simulation of virtual urban fire fighting, and the experimental results manifest the feasibility and efficiency of our design.
Wu, Zhenyu; Xu, Yuan; Yang, Yunong; Zhang, Chunhong; Zhu, Xinning; Ji, Yang
2017-01-01
Web of Things (WoT) facilitates the discovery and interoperability of Internet of Things (IoT) devices in a cyber-physical system (CPS). Moreover, a uniform knowledge representation of physical resources is quite necessary for further composition, collaboration, and decision-making process in CPS. Though several efforts have integrated semantics with WoT, such as knowledge engineering methods based on semantic sensor networks (SSN), it still could not represent the complex relationships between devices when dynamic composition and collaboration occur, and it totally depends on manual construction of a knowledge base with low scalability. In this paper, to addresses these limitations, we propose the semantic Web of Things (SWoT) framework for CPS (SWoT4CPS). SWoT4CPS provides a hybrid solution with both ontological engineering methods by extending SSN and machine learning methods based on an entity linking (EL) model. To testify to the feasibility and performance, we demonstrate the framework by implementing a temperature anomaly diagnosis and automatic control use case in a building automation system. Evaluation results on the EL method show that linking domain knowledge to DBpedia has a relative high accuracy and the time complexity is at a tolerant level. Advantages and disadvantages of SWoT4CPS with future work are also discussed. PMID:28230725
Transcranial Direct Current Stimulation Effects on Semantic Processing in Healthy Individuals.
Joyal, Marilyne; Fecteau, Shirley
2016-01-01
Semantic processing allows us to use conceptual knowledge about the world. It has been associated with a large distributed neural network that includes the frontal, temporal and parietal cortices. Recent studies using transcranial direct current stimulation (tDCS) also contributed at investigating semantic processing. The goal of this article was to review studies investigating semantic processing in healthy individuals with tDCS and discuss findings from these studies in line with neuroimaging results. Based on functional magnetic resonance imaging studies assessing semantic processing, we predicted that tDCS applied over the inferior frontal gyrus, middle temporal gyrus, and posterior parietal cortex will impact semantic processing. We conducted a search on Pubmed and selected 27 articles in which tDCS was used to modulate semantic processing in healthy subjects. We analysed each article according to these criteria: demographic information, experimental outcomes assessing semantic processing, study design, and effects of tDCS on semantic processes. From the 27 reviewed studies, 8 found main effects of stimulation. In addition to these 8 studies, 17 studies reported an interaction between stimulus types and stimulation conditions (e.g. incoherent functional, but not instrumental, actions were processed faster when anodal tDCS was applied over the posterior parietal cortex as compared to sham tDCS). Results suggest that regions in the frontal, temporal, and parietal cortices are involved in semantic processing. tDCS can modulate some aspects of semantic processing and provide information on the functional roles of brain regions involved in this cognitive process. Copyright © 2016 Elsevier Inc. All rights reserved.
Stracuzzi, David John; Brost, Randolph C.; Phillips, Cynthia A.; ...
2015-09-26
Geospatial semantic graphs provide a robust foundation for representing and analyzing remote sensor data. In particular, they support a variety of pattern search operations that capture the spatial and temporal relationships among the objects and events in the data. However, in the presence of large data corpora, even a carefully constructed search query may return a large number of unintended matches. This work considers the problem of calculating a quality score for each match to the query, given that the underlying data are uncertain. As a result, we present a preliminary evaluation of three methods for determining both match qualitymore » scores and associated uncertainty bounds, illustrated in the context of an example based on overhead imagery data.« less
A Hybrid P2P Overlay Network for Non-strictly Hierarchically Categorized Content
NASA Astrophysics Data System (ADS)
Wan, Yi; Asaka, Takuya; Takahashi, Tatsuro
In P2P content distribution systems, there are many cases in which the content can be classified into hierarchically organized categories. In this paper, we propose a hybrid overlay network design suitable for such content called Pastry/NSHCC (Pastry for Non-Strictly Hierarchically Categorized Content). The semantic information of classification hierarchies of the content can be utilized regardless of whether they are in a strict tree structure or not. By doing so, the search scope can be restrained to any granularity, and the number of query messages also decreases while maintaining keyword searching availability. Through simulation, we showed that the proposed method provides better performance and lower overhead than unstructured overlays exploiting the same semantic information.
Versioning System for Distributed Ontology Development
2016-02-02
Semantic Web community. For example, the distributed and isolated development requirement may apply to non‐cyber range communities of public ontology... semantic web .” However, we observe that the maintenance of an ontology and its reuse is not a high priority for the majority of the publicly available... Semantic ) Web . AAAI Spring Symposium: Symbiotic Relationships between Semantic Web and Knowledge Engineering. 2008. [LHK09] Matthias Loskyll
Pulvermüller, Friedemann
2013-10-01
"Embodied" proposals claim that the meaning of at least some words, concepts and constructions is grounded in knowledge about actions and objects. An alternative "disembodied" position locates semantics in a symbolic system functionally detached from sensorimotor modules. This latter view is not tenable theoretically and has been empirically falsified by neuroscience research. A minimally-embodied approach now claims that action-perception systems may "color", but not represent, meaning; however, such minimal embodiment (misembodiment?) still fails to explain why action and perception systems exert causal effects on the processing of symbols from specific semantic classes. Action perception theory (APT) offers neurobiological mechanisms for "embodied" referential, affective and action semantics along with "disembodied" mechanisms of semantic abstraction, generalization and symbol combination, which draw upon multimodal brain systems. In this sense, APT suggests integrative-neuromechanistic explanations of why both sensorimotor and multimodal areas of the human brain differentially contribute to specific facets of meaning and concepts. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Attention during natural vision warps semantic representation across the human brain.
Çukur, Tolga; Nishimoto, Shinji; Huth, Alexander G; Gallant, Jack L
2013-06-01
Little is known about how attention changes the cortical representation of sensory information in humans. On the basis of neurophysiological evidence, we hypothesized that attention causes tuning changes to expand the representation of attended stimuli at the cost of unattended stimuli. To investigate this issue, we used functional magnetic resonance imaging to measure how semantic representation changed during visual search for different object categories in natural movies. We found that many voxels across occipito-temporal and fronto-parietal cortex shifted their tuning toward the attended category. These tuning shifts expanded the representation of the attended category and of semantically related, but unattended, categories, and compressed the representation of categories that were semantically dissimilar to the target. Attentional warping of semantic representation occurred even when the attended category was not present in the movie; thus, the effect was not a target-detection artifact. These results suggest that attention dynamically alters visual representation to optimize processing of behaviorally relevant objects during natural vision.
Attention During Natural Vision Warps Semantic Representation Across the Human Brain
Çukur, Tolga; Nishimoto, Shinji; Huth, Alexander G.; Gallant, Jack L.
2013-01-01
Little is known about how attention changes the cortical representation of sensory information in humans. Based on neurophysiological evidence, we hypothesized that attention causes tuning changes to expand the representation of attended stimuli at the cost of unattended stimuli. To investigate this issue we used functional MRI (fMRI) to measure how semantic representation changes when searching for different object categories in natural movies. We find that many voxels across occipito-temporal and fronto-parietal cortex shift their tuning toward the attended category. These tuning shifts expand the representation of the attended category and of semantically-related but unattended categories, and compress the representation of categories semantically-dissimilar to the target. Attentional warping of semantic representation occurs even when the attended category is not present in the movie, thus the effect is not a target-detection artifact. These results suggest that attention dynamically alters visual representation to optimize processing of behaviorally relevant objects during natural vision. PMID:23603707
Holderbaum, Candice Steffen; Mansur, Letícia Lessa; de Salles, Jerusa Fumagalli
2016-01-01
ABSTRACT Investigations on the semantic priming effect (SPE) in patients after left hemisphere (LH) lesions have shown disparities that may be explained by the variability in performance found among patients. The aim of the present study was to verify the existence of subgroups of patients after LH stroke by searching for dissociations between performance on the lexical decision task based on the semantic priming paradigm and performance on direct memory, semantic association and language tasks. All 17 patients with LH lesions after stroke (ten non-fluent aphasics and seven non aphasics) were analyzed individually. Results indicated the presence of three groups of patients according to SPE: one exhibiting SPE at both stimulus onset asynchronies (SOAs), one with SPE only at long SOA, and another, larger group with no SPE. PMID:29213439
A Method for Search Engine Selection using Thesaurus for Selective Meta-Search Engine
NASA Astrophysics Data System (ADS)
Goto, Shoji; Ozono, Tadachika; Shintani, Toramatsu
In this paper, we propose a new method for selecting search engines on WWW for selective meta-search engine. In selective meta-search engine, a method is needed that would enable selecting appropriate search engines for users' queries. Most existing methods use statistical data such as document frequency. These methods may select inappropriate search engines if a query contains polysemous words. In this paper, we describe an search engine selection method based on thesaurus. In our method, a thesaurus is constructed from documents in a search engine and is used as a source description of the search engine. The form of a particular thesaurus depends on the documents used for its construction. Our method enables search engine selection by considering relationship between terms and overcomes the problems caused by polysemous words. Further, our method does not have a centralized broker maintaining data, such as document frequency for all search engines. As a result, it is easy to add a new search engine, and meta-search engines become more scalable with our method compared to other existing methods.
NASA Astrophysics Data System (ADS)
Nieland, Simon; Kleinschmit, Birgit; Förster, Michael
2015-05-01
Ontology-based applications hold promise in improving spatial data interoperability. In this work we use remote sensing-based biodiversity information and apply semantic formalisation and ontological inference to show improvements in data interoperability/comparability. The proposed methodology includes an observation-based, "bottom-up" engineering approach for remote sensing applications and gives a practical example of semantic mediation of geospatial products. We apply the methodology to three different nomenclatures used for remote sensing-based classification of two heathland nature conservation areas in Belgium and Germany. We analysed sensor nomenclatures with respect to their semantic formalisation and their bio-geographical differences. The results indicate that a hierarchical and transparent nomenclature is far more important for transferability than the sensor or study area. The inclusion of additional information, not necessarily belonging to a vegetation class description, is a key factor for the future success of using semantics for interoperability in remote sensing.
John Sinclair (1933-2007): The Search for Units of Meaning--Sinclair on Empirical Semantics
ERIC Educational Resources Information Center
Stubbs, Michael
2009-01-01
John McHardy Sinclair has made major contributions to applied linguistics in three related areas: language in education, discourse analysis, and corpus-assisted lexicography. This article discusses the far-reaching implications for language description of this third area. The corpus-assisted search methodology provides empirical evidence for an…
2007-03-01
features Federated Search (providing services to find and aggregate information across GIG enterprise data sources); Enterprise Catalog (providing...Content Discovery Federated Search Portlet Users Guide v0.4.3 M16 25-Apr-05 NCES Mediation Core Enterprise Services SDK v0.5.0 M17 25-Apr-05 NCES
Web Feet Guide to Search Engines: Finding It on the Net.
ERIC Educational Resources Information Center
Web Feet, 2001
2001-01-01
This guide to search engines for the World Wide Web discusses selecting the right search engine; interpreting search results; major search engines; online tutorials and guides; search engines for kids; specialized search tools for various subjects; and other specialized engines and gateways. (LRW)
Semantically Enabling Knowledge Representation of Metamorphic Petrology Data
NASA Astrophysics Data System (ADS)
West, P.; Fox, P. A.; Spear, F. S.; Adali, S.; Nguyen, C.; Hallett, B. W.; Horkley, L. K.
2012-12-01
More and more metamorphic petrology data is being collected around the world, and is now being organized together into different virtual data portals by means of virtual organizations. For example, there is the virtual data portal Petrological Database (PetDB, http://www.petdb.org) of the Ocean Floor that is organizing scientific information about geochemical data of ocean floor igneous and metamorphic rocks; and also The Metamorphic Petrology Database (MetPetDB, http://metpetdb.rpi.edu) that is being created by a global community of metamorphic petrologists in collaboration with software engineers and data managers at Rensselaer Polytechnic Institute. The current focus is to provide the ability for scientists and researchers to register their data and search the databases for information regarding sample collections. What we present here is the next step in evolution of the MetPetDB portal, utilizing semantically enabled features such as discovery, data casting, faceted search, knowledge representation, and linked data as well as organizing information about the community and collaboration within the virtual community itself. We take the information that is currently represented in a relational database and make it available through web services, SPARQL endpoints, semantic and triple-stores where inferencing is enabled. We will be leveraging research that has taken place in virtual observatories, such as the Virtual Solar Terrestrial Observatory (VSTO) and the Biological and Chemical Oceanography Data Management Office (BCO-DMO); vocabulary work done in various communities such as Observations and Measurements (ISO 19156), FOAF (Friend of a Friend), Bibo (Bibliography Ontology), and domain specific ontologies; enabling provenance traces of samples and subsamples using the different provenance ontologies; and providing the much needed linking of data from the various research organizations into a common, collaborative virtual observatory. In addition to better representing and presenting the actual data, we also look to organize and represent the knowledge information and expertise behind the data. Domain experts hold a lot of knowledge in their minds, in their presentations and publications, and elsewhere. Not only is this a technical issue, this is also a social issue in that we need to be able to encourage the domain experts to share their knowledge in a way that can be searched and queried over. With this additional focus in MetPetDB the site can be used more efficiently by other domain experts, but can also be utilized by non-specialists as well in order to educate people of the importance of the work being done as well as enable future domain experts.
NASA Taxonomies for Searching Problem Reports and FMEAs
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Throop, David R.
2006-01-01
Many types of hazard and risk analyses are used during the life cycle of complex systems, including Failure Modes and Effects Analysis (FMEA), Hazard Analysis, Fault Tree and Event Tree Analysis, Probabilistic Risk Assessment, Reliability Analysis and analysis of Problem Reporting and Corrective Action (PRACA) databases. The success of these methods depends on the availability of input data and the analysts knowledge. Standard nomenclature can increase the reusability of hazard, risk and problem data. When nomenclature in the source texts is not standard, taxonomies with mapping words (sets of rough synonyms) can be combined with semantic search to identify items and tag them with metadata based on a rich standard nomenclature. Semantic search uses word meanings in the context of parsed phrases to find matches. The NASA taxonomies provide the word meanings. Spacecraft taxonomies and ontologies (generalization hierarchies with attributes and relationships, based on terms meanings) are being developed for types of subsystems, functions, entities, hazards and failures. The ontologies are broad and general, covering hardware, software and human systems. Semantic search of Space Station texts was used to validate and extend the taxonomies. The taxonomies have also been used to extract system connectivity (interaction) models and functions from requirements text. Now the Reconciler semantic search tool and the taxonomies are being applied to improve search in the Space Shuttle PRACA database, to discover recurring patterns of failure. Usual methods of string search and keyword search fall short because the entries are terse and have numerous shortcuts (irregular abbreviations, nonstandard acronyms, cryptic codes) and modifier words cannot be used in sentence context to refine the search. The limited and fixed FMEA categories associated with the entries do not make the fine distinctions needed in the search. The approach assigns PRACA report titles to problem classes in the taxonomy. Each ontology class includes mapping words - near-synonyms naming different manifestations of that problem class. The mapping words for Problems, Entities and Functions are converted to a canonical form plus any of a small set of modifier words (e.g. non-uniformity NOT + UNIFORM.) The report titles are parsed as sentences if possible, or treated as a flat sequence of word tokens if parsing fails. When canonical forms in the title match mapping words, the PRACA entry is associated with the corresponding Problem, Entity or Function in the ontology. The user can search for types of failures associated with types of equipment, clustering by type of problem (e.g., all bearings found with problems of being uneven: rough, irregular, gritty ). The results could also be used for tagging PRACA report entries with rich metadata. This approach could also be applied to searching and tagging failure modes, failure effects and mitigations in FMEAs. In the pilot work, parsing 52K+ truncated titles (the test cases that were available), has resulted in identification of both a type of equipment and type of problem in about 75% of the cases. The results are displayed in a manner analogous to Google search results. The effort has also led to the enrichment of the taxonomy, adding some new categories and many new mapping words. Further work would make enhancements that have been identified for improving the clustering and further reducing the false alarm rate. (In searching for recurring problems, good clustering is more important than reducing false alarms). Searching complete PRACA reports should lead to immediate improvement.
DOORS to the semantic web and grid with a PORTAL for biomedical computing.
Taswell, Carl
2008-03-01
The semantic web remains in the early stages of development. It has not yet achieved the goals envisioned by its founders as a pervasive web of distributed knowledge and intelligence. Success will be attained when a dynamic synergism can be created between people and a sufficient number of infrastructure systems and tools for the semantic web in analogy with those for the original web. The domain name system (DNS), web browsers, and the benefits of publishing web pages motivated many people to register domain names and publish web sites on the original web. An analogous resource label system, semantic search applications, and the benefits of collaborative semantic networks will motivate people to register resource labels and publish resource descriptions on the semantic web. The Domain Ontology Oriented Resource System (DOORS) and Problem Oriented Registry of Tags and Labels (PORTAL) are proposed as infrastructure systems for resource metadata within a paradigm that can serve as a bridge between the original web and the semantic web. The Internet Registry Information Service (IRIS) registers [corrected] domain names while DNS publishes domain addresses with mapping of names to addresses for the original web. Analogously, PORTAL registers resource labels and tags while DOORS publishes resource locations and descriptions with mapping of labels to locations for the semantic web. BioPORT is proposed as a prototype PORTAL registry specific for the problem domain of biomedical computing.
Semantic representations in the temporal pole predict false memories
Chadwick, Martin J.; Anjum, Raeesa S.; Kumaran, Dharshan; Schacter, Daniel L.; Spiers, Hugo J.; Hassabis, Demis
2016-01-01
Recent advances in neuroscience have given us unprecedented insight into the neural mechanisms of false memory, showing that artificial memories can be inserted into the memory cells of the hippocampus in a way that is indistinguishable from true memories. However, this alone is not enough to explain how false memories can arise naturally in the course of our daily lives. Cognitive psychology has demonstrated that many instances of false memory, both in the laboratory and the real world, can be attributed to semantic interference. Whereas previous studies have found that a diverse set of regions show some involvement in semantic false memory, none have revealed the nature of the semantic representations underpinning the phenomenon. Here we use fMRI with representational similarity analysis to search for a neural code consistent with semantic false memory. We find clear evidence that false memories emerge from a similarity-based neural code in the temporal pole, a region that has been called the “semantic hub” of the brain. We further show that each individual has a partially unique semantic code within the temporal pole, and this unique code can predict idiosyncratic patterns of memory errors. Finally, we show that the same neural code can also predict variation in true-memory performance, consistent with an adaptive perspective on false memory. Taken together, our findings reveal the underlying structure of neural representations of semantic knowledge, and how this semantic structure can both enhance and distort our memories. PMID:27551087
Towards Semantic e-Science for Traditional Chinese Medicine
Chen, Huajun; Mao, Yuxin; Zheng, Xiaoqing; Cui, Meng; Feng, Yi; Deng, Shuiguang; Yin, Aining; Zhou, Chunying; Tang, Jinming; Jiang, Xiaohong; Wu, Zhaohui
2007-01-01
Background Recent advances in Web and information technologies with the increasing decentralization of organizational structures have resulted in massive amounts of information resources and domain-specific services in Traditional Chinese Medicine. The massive volume and diversity of information and services available have made it difficult to achieve seamless and interoperable e-Science for knowledge-intensive disciplines like TCM. Therefore, information integration and service coordination are two major challenges in e-Science for TCM. We still lack sophisticated approaches to integrate scientific data and services for TCM e-Science. Results We present a comprehensive approach to build dynamic and extendable e-Science applications for knowledge-intensive disciplines like TCM based on semantic and knowledge-based techniques. The semantic e-Science infrastructure for TCM supports large-scale database integration and service coordination in a virtual organization. We use domain ontologies to integrate TCM database resources and services in a semantic cyberspace and deliver a semantically superior experience including browsing, searching, querying and knowledge discovering to users. We have developed a collection of semantic-based toolkits to facilitate TCM scientists and researchers in information sharing and collaborative research. Conclusion Semantic and knowledge-based techniques are suitable to knowledge-intensive disciplines like TCM. It's possible to build on-demand e-Science system for TCM based on existing semantic and knowledge-based techniques. The presented approach in the paper integrates heterogeneous distributed TCM databases and services, and provides scientists with semantically superior experience to support collaborative research in TCM discipline. PMID:17493289
Semantic representations in the temporal pole predict false memories.
Chadwick, Martin J; Anjum, Raeesa S; Kumaran, Dharshan; Schacter, Daniel L; Spiers, Hugo J; Hassabis, Demis
2016-09-06
Recent advances in neuroscience have given us unprecedented insight into the neural mechanisms of false memory, showing that artificial memories can be inserted into the memory cells of the hippocampus in a way that is indistinguishable from true memories. However, this alone is not enough to explain how false memories can arise naturally in the course of our daily lives. Cognitive psychology has demonstrated that many instances of false memory, both in the laboratory and the real world, can be attributed to semantic interference. Whereas previous studies have found that a diverse set of regions show some involvement in semantic false memory, none have revealed the nature of the semantic representations underpinning the phenomenon. Here we use fMRI with representational similarity analysis to search for a neural code consistent with semantic false memory. We find clear evidence that false memories emerge from a similarity-based neural code in the temporal pole, a region that has been called the "semantic hub" of the brain. We further show that each individual has a partially unique semantic code within the temporal pole, and this unique code can predict idiosyncratic patterns of memory errors. Finally, we show that the same neural code can also predict variation in true-memory performance, consistent with an adaptive perspective on false memory. Taken together, our findings reveal the underlying structure of neural representations of semantic knowledge, and how this semantic structure can both enhance and distort our memories.
Guidance of visual attention by semantic information in real-world scenes
Wu, Chia-Chien; Wick, Farahnaz Ahmed; Pomplun, Marc
2014-01-01
Recent research on attentional guidance in real-world scenes has focused on object recognition within the context of a scene. This approach has been valuable for determining some factors that drive the allocation of visual attention and determine visual selection. This article provides a review of experimental work on how different components of context, especially semantic information, affect attentional deployment. We review work from the areas of object recognition, scene perception, and visual search, highlighting recent studies examining semantic structure in real-world scenes. A better understanding on how humans parse scene representations will not only improve current models of visual attention but also advance next-generation computer vision systems and human-computer interfaces. PMID:24567724
Shahar, Yuval; Young, Ohad; Shalom, Erez; Mayaffit, Alon; Moskovitch, Robert; Hessing, Alon; Galperin, Maya
2004-01-01
We propose to present a poster (and potentially also a demonstration of the implemented system) summarizing the current state of our work on a hybrid, multiple-format representation of clinical guidelines that facilitates conversion of guidelines from free text to a formal representation. We describe a distributed Web-based architecture (DeGeL) and a set of tools using the hybrid representation. The tools enable performing tasks such as guideline specification, semantic markup, search, retrieval, visualization, eligibility determination, runtime application and retrospective quality assessment. The representation includes four parallel formats: Free text (one or more original sources); semistructured text (labeled by the target guideline-ontology semantic labels); semiformal text (which includes some control specification); and a formal, machine-executable representation. The specification, indexing, search, retrieval, and browsing tools are essentially independent of the ontology chosen for guideline representation, but editing the semi-formal and formal formats requires ontology-specific tools, which we have developed in the case of the Asbru guideline-specification language. The four formats support increasingly sophisticated computational tasks. The hybrid guidelines are stored in a Web-based library. All tools, such as for runtime guideline application or retrospective quality assessment, are designed to operate on all representations. We demonstrate the hybrid framework by providing examples from the semantic markup and search tools.
Is semantic verbal fluency impairment explained by executive function deficits in schizophrenia?
Berberian, Arthur A; Moraes, Giovanna V; Gadelha, Ary; Brietzke, Elisa; Fonseca, Ana O; Scarpato, Bruno S; Vicente, Marcella O; Seabra, Alessandra G; Bressan, Rodrigo A; Lacerda, Acioly L
2016-04-19
To investigate if verbal fluency impairment in schizophrenia reflects executive function deficits or results from degraded semantic store or inefficient search and retrieval strategies. Two groups were compared: 141 individuals with schizophrenia and 119 healthy age and education-matched controls. Both groups performed semantic and phonetic verbal fluency tasks. Performance was evaluated using three scores, based on 1) number of words generated; 2) number of clustered/related words; and 3) switching score. A fourth performance score based on the number of clusters was also measured. SZ individuals produced fewer words than controls. After controlling for the total number of words produced, a difference was observed between the groups in the number of cluster-related words generated in the semantic task. In both groups, the number of words generated in the semantic task was higher than that generated in the phonemic task, although a significant group vs. fluency type interaction showed that subjects with schizophrenia had disproportionate semantic fluency impairment. Working memory was positively associated with increased production of words within clusters and inversely correlated with switching. Semantic fluency impairment may be attributed to an inability (resulting from reduced cognitive control) to distinguish target signal from competing noise and to maintain cues for production of memory probes.
Building the interspace: Digital library infrastructure for a University Engineering Community
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schatz, B.
A large-scale digital library is being constructed and evaluated at the University of Illinois, with the goal of bringing professional search and display to Internet information services. A testbed planned to grow to 10K documents and 100K users is being constructed in the Grainger Engineering Library Information Center, as a joint effort of the University Library and the National Center for Supercomputing Applications (NCSA), with evaluation and research by the Graduate School of Library and Information Science and the Department of Computer Science. The electronic collection will be articles from engineering and science journals and magazines, obtained directly from publishersmore » in SGML format and displayed containing all text, figures, tables, and equations. The publisher partners include IEEE Computer Society, AIAA (Aerospace Engineering), American Physical Society, and Wiley & Sons. The software will be based upon NCSA Mosaic as a network engine connected to commercial SGML displayers and full-text searchers. The users will include faculty/students across the midwestern universities in the Big Ten, with evaluations via interviews, surveys, and transaction logs. Concurrently, research into scaling the testbed is being conducted. This includes efforts in computer science, information science, library science, and information systems. These efforts will evaluate different semantic retrieval technologies, including automatic thesaurus and subject classification graphs. New architectures will be designed and implemented for a next generation digital library infrastructure, the Interspace, which supports interaction with information spread across information spaces within the Net.« less
Semantic framework for mapping object-oriented model to semantic web languages
Ježek, Petr; Mouček, Roman
2015-01-01
The article deals with and discusses two main approaches in building semantic structures for electrophysiological metadata. It is the use of conventional data structures, repositories, and programming languages on one hand and the use of formal representations of ontologies, known from knowledge representation, such as description logics or semantic web languages on the other hand. Although knowledge engineering offers languages supporting richer semantic means of expression and technological advanced approaches, conventional data structures and repositories are still popular among developers, administrators and users because of their simplicity, overall intelligibility, and lower demands on technical equipment. The choice of conventional data resources and repositories, however, raises the question of how and where to add semantics that cannot be naturally expressed using them. As one of the possible solutions, this semantics can be added into the structures of the programming language that accesses and processes the underlying data. To support this idea we introduced a software prototype that enables its users to add semantically richer expressions into a Java object-oriented code. This approach does not burden users with additional demands on programming environment since reflective Java annotations were used as an entry for these expressions. Moreover, additional semantics need not to be written by the programmer directly to the code, but it can be collected from non-programmers using a graphic user interface. The mapping that allows the transformation of the semantically enriched Java code into the Semantic Web language OWL was proposed and implemented in a library named the Semantic Framework. This approach was validated by the integration of the Semantic Framework in the EEG/ERP Portal and by the subsequent registration of the EEG/ERP Portal in the Neuroscience Information Framework. PMID:25762923
Semantic framework for mapping object-oriented model to semantic web languages.
Ježek, Petr; Mouček, Roman
2015-01-01
The article deals with and discusses two main approaches in building semantic structures for electrophysiological metadata. It is the use of conventional data structures, repositories, and programming languages on one hand and the use of formal representations of ontologies, known from knowledge representation, such as description logics or semantic web languages on the other hand. Although knowledge engineering offers languages supporting richer semantic means of expression and technological advanced approaches, conventional data structures and repositories are still popular among developers, administrators and users because of their simplicity, overall intelligibility, and lower demands on technical equipment. The choice of conventional data resources and repositories, however, raises the question of how and where to add semantics that cannot be naturally expressed using them. As one of the possible solutions, this semantics can be added into the structures of the programming language that accesses and processes the underlying data. To support this idea we introduced a software prototype that enables its users to add semantically richer expressions into a Java object-oriented code. This approach does not burden users with additional demands on programming environment since reflective Java annotations were used as an entry for these expressions. Moreover, additional semantics need not to be written by the programmer directly to the code, but it can be collected from non-programmers using a graphic user interface. The mapping that allows the transformation of the semantically enriched Java code into the Semantic Web language OWL was proposed and implemented in a library named the Semantic Framework. This approach was validated by the integration of the Semantic Framework in the EEG/ERP Portal and by the subsequent registration of the EEG/ERP Portal in the Neuroscience Information Framework.
ERIC Educational Resources Information Center
Bolton, B.; Adderley, K. J.
1978-01-01
After viewing videotaped case studies indicating the relevance of electrical laboratory work to professional engineers, student attitudes showed a positive improvement toward laboratory work. Semantic differential tests, questionnaires, and interviews were used. (Author/MH)
Query Auto-Completion Based on Word2vec Semantic Similarity
NASA Astrophysics Data System (ADS)
Shao, Taihua; Chen, Honghui; Chen, Wanyu
2018-04-01
Query auto-completion (QAC) is the first step of information retrieval, which helps users formulate the entire query after inputting only a few prefixes. Regarding the models of QAC, the traditional method ignores the contribution from the semantic relevance between queries. However, similar queries always express extremely similar search intention. In this paper, we propose a hybrid model FS-QAC based on query semantic similarity as well as the query frequency. We choose word2vec method to measure the semantic similarity between intended queries and pre-submitted queries. By combining both features, our experiments show that FS-QAC model improves the performance when predicting the user’s query intention and helping formulate the right query. Our experimental results show that the optimal hybrid model contributes to a 7.54% improvement in terms of MRR against a state-of-the-art baseline using the public AOL query logs.
A Generic Evaluation Model for Semantic Web Services
NASA Astrophysics Data System (ADS)
Shafiq, Omair
Semantic Web Services research has gained momentum over the last few Years and by now several realizations exist. They are being used in a number of industrial use-cases. Soon software developers will be expected to use this infrastructure to build their B2B applications requiring dynamic integration. However, there is still a lack of guidelines for the evaluation of tools developed to realize Semantic Web Services and applications built on top of them. In normal software engineering practice such guidelines can already be found for traditional component-based systems. Also some efforts are being made to build performance models for servicebased systems. Drawing on these related efforts in component-oriented and servicebased systems, we identified the need for a generic evaluation model for Semantic Web Services applicable to any realization. The generic evaluation model will help users and customers to orient their systems and solutions towards using Semantic Web Services. In this chapter, we have presented the requirements for the generic evaluation model for Semantic Web Services and further discussed the initial steps that we took to sketch such a model. Finally, we discuss related activities for evaluating semantic technologies.
Categorizing words through semantic memory navigation
NASA Astrophysics Data System (ADS)
Borge-Holthoefer, J.; Arenas, A.
2010-03-01
Semantic memory is the cognitive system devoted to storage and retrieval of conceptual knowledge. Empirical data indicate that semantic memory is organized in a network structure. Everyday experience shows that word search and retrieval processes provide fluent and coherent speech, i.e. are efficient. This implies either that semantic memory encodes, besides thousands of words, different kind of links for different relationships (introducing greater complexity and storage costs), or that the structure evolves facilitating the differentiation between long-lasting semantic relations from incidental, phenomenological ones. Assuming the latter possibility, we explore a mechanism to disentangle the underlying semantic backbone which comprises conceptual structure (extraction of categorical relations between pairs of words), from the rest of information present in the structure. To this end, we first present and characterize an empirical data set modeled as a network, then we simulate a stochastic cognitive navigation on this topology. We schematize this latter process as uncorrelated random walks from node to node, which converge to a feature vectors network. By doing so we both introduce a novel mechanism for information retrieval, and point at the problem of category formation in close connection to linguistic and non-linguistic experience.
SCALEUS: Semantic Web Services Integration for Biomedical Applications.
Sernadela, Pedro; González-Castro, Lorena; Oliveira, José Luís
2017-04-01
In recent years, we have witnessed an explosion of biological data resulting largely from the demands of life science research. The vast majority of these data are freely available via diverse bioinformatics platforms, including relational databases and conventional keyword search applications. This type of approach has achieved great results in the last few years, but proved to be unfeasible when information needs to be combined or shared among different and scattered sources. During recent years, many of these data distribution challenges have been solved with the adoption of semantic web. Despite the evident benefits of this technology, its adoption introduced new challenges related with the migration process, from existent systems to the semantic level. To facilitate this transition, we have developed Scaleus, a semantic web migration tool that can be deployed on top of traditional systems in order to bring knowledge, inference rules, and query federation to the existent data. Targeted at the biomedical domain, this web-based platform offers, in a single package, straightforward data integration and semantic web services that help developers and researchers in the creation process of new semantically enhanced information systems. SCALEUS is available as open source at http://bioinformatics-ua.github.io/scaleus/ .
Tag-Based Social Image Search: Toward Relevant and Diverse Results
NASA Astrophysics Data System (ADS)
Yang, Kuiyuan; Wang, Meng; Hua, Xian-Sheng; Zhang, Hong-Jiang
Recent years have witnessed a great success of social media websites. Tag-based image search is an important approach to access the image content of interest on these websites. However, the existing ranking methods for tag-based image search frequently return results that are irrelevant or lack of diversity. This chapter presents a diverse relevance ranking scheme which simultaneously takes relevance and diversity into account by exploring the content of images and their associated tags. First, it estimates the relevance scores of images with respect to the query term based on both visual information of images and semantic information of associated tags. Then semantic similarities of social images are estimated based on their tags. Based on the relevance scores and the similarities, the ranking list is generated by a greedy ordering algorithm which optimizes Average Diverse Precision (ADP), a novel measure that is extended from the conventional Average Precision (AP). Comprehensive experiments and user studies demonstrate the effectiveness of the approach.
Cui, Licong; Xu, Rong; Luo, Zhihui; Wentz, Susan; Scarberry, Kyle; Zhang, Guo-Qiang
2014-08-03
Finding quality consumer health information online can effectively bring important public health benefits to the general population. It can empower people with timely and current knowledge for managing their health and promoting wellbeing. Despite a popular belief that search engines such as Google can solve all information access problems, recent studies show that using search engines and simple search terms is not sufficient. Our objective is to provide an approach to organizing consumer health information for navigational exploration, complementing keyword-based direct search. Multi-topic assignment to health information, such as online questions, is a fundamental step for navigational exploration. We introduce a new multi-topic assignment method combining semantic annotation using UMLS concepts (CUIs) and Formal Concept Analysis (FCA). Each question was tagged with CUIs identified by MetaMap. The CUIs were filtered with term-frequency and a new term-strength index to construct a CUI-question context. The CUI-question context and a topic-subject context were used for multi-topic assignment, resulting in a topic-question context. The topic-question context was then directly used for constructing a prototype navigational exploration interface. Experimental evaluation was performed on the task of automatic multi-topic assignment of 99 predefined topics for about 60,000 consumer health questions from NetWellness. Using example-based metrics, suitable for multi-topic assignment problems, our method achieved a precision of 0.849, recall of 0.774, and F₁ measure of 0.782, using a reference standard of 278 questions with manually assigned topics. Compared to NetWellness' original topic assignment, a 36.5% increase in recall is achieved with virtually no sacrifice in precision. Enhancing the recall of multi-topic assignment without sacrificing precision is a prerequisite for achieving the benefits of navigational exploration. Our new multi-topic assignment method, combining term-strength, FCA, and information retrieval techniques, significantly improved recall and performed well according to example-based metrics.
2014-01-01
Background Finding quality consumer health information online can effectively bring important public health benefits to the general population. It can empower people with timely and current knowledge for managing their health and promoting wellbeing. Despite a popular belief that search engines such as Google can solve all information access problems, recent studies show that using search engines and simple search terms is not sufficient. Our objective is to provide an approach to organizing consumer health information for navigational exploration, complementing keyword-based direct search. Multi-topic assignment to health information, such as online questions, is a fundamental step for navigational exploration. Methods We introduce a new multi-topic assignment method combining semantic annotation using UMLS concepts (CUIs) and Formal Concept Analysis (FCA). Each question was tagged with CUIs identified by MetaMap. The CUIs were filtered with term-frequency and a new term-strength index to construct a CUI-question context. The CUI-question context and a topic-subject context were used for multi-topic assignment, resulting in a topic-question context. The topic-question context was then directly used for constructing a prototype navigational exploration interface. Results Experimental evaluation was performed on the task of automatic multi-topic assignment of 99 predefined topics for about 60,000 consumer health questions from NetWellness. Using example-based metrics, suitable for multi-topic assignment problems, our method achieved a precision of 0.849, recall of 0.774, and F1 measure of 0.782, using a reference standard of 278 questions with manually assigned topics. Compared to NetWellness’ original topic assignment, a 36.5% increase in recall is achieved with virtually no sacrifice in precision. Conclusion Enhancing the recall of multi-topic assignment without sacrificing precision is a prerequisite for achieving the benefits of navigational exploration. Our new multi-topic assignment method, combining term-strength, FCA, and information retrieval techniques, significantly improved recall and performed well according to example-based metrics. PMID:25086916
ERIC Educational Resources Information Center
Rutherford, Barbara J.; Mathesius, Jeffrey R.
2012-01-01
Difference between the brain's hemispheres in efficiency of intentional search of the mental lexicon with phonological, orthographic, and semantic strategies was investigated. Letter strings for lexical decision were presented at fixation, with a lateralized distractor to the LVF or RVF. Word results revealed that both hemispheres were capable of…
Google Scholar is not enough to be used alone for systematic reviews.
Giustini, Dean; Boulos, Maged N Kamel
2013-01-01
Google Scholar (GS) has been noted for its ability to search broadly for important references in the literature. Gehanno et al. recently examined GS in their study: 'Is Google scholar enough to be used alone for systematic reviews?' In this paper, we revisit this important question, and some of Gehanno et al.'s other findings in evaluating the academic search engine. The authors searched for a recent systematic review (SR) of comparable size to run search tests similar to those in Gehanno et al. We selected Chou et al. (2013) contacting the authors for a list of publications they found in their SR on social media in health. We queried GS for each of those 506 titles (in quotes "), one by one. When GS failed to retrieve a paper, or produced too many results, we used the allintitle: command to find papers with the same title. Google Scholar produced records for ~95% of the papers cited by Chou et al. (n=476/506). A few of the 30 papers that were not in GS were later retrieved via PubMed and even regular Google Search. But due to its different structure, we could not run searches in GS that were originally performed by Chou et al. in PubMed, Web of Science, Scopus and PsycINFO®. Identifying 506 papers in GS was an inefficient process, especially for papers using similar search terms. Has Google Scholar improved enough to be used alone in searching for systematic reviews? No. GS' constantly-changing content, algorithms and database structure make it a poor choice for systematic reviews. Looking for papers when you know their titles is a far different issue from discovering them initially. Further research is needed to determine when and how (and for what purposes) GS can be used alone. Google should provide details about GS' database coverage and improve its interface (e.g., with semantic search filters, stored searching, etc.). Perhaps then it will be an appropriate choice for systematic reviews.
Druzinsky, Robert E; Balhoff, James P; Crompton, Alfred W; Done, James; German, Rebecca Z; Haendel, Melissa A; Herrel, Anthony; Herring, Susan W; Lapp, Hilmar; Mabee, Paula M; Muller, Hans-Michael; Mungall, Christopher J; Sternberg, Paul W; Van Auken, Kimberly; Vinyard, Christopher J; Williams, Susan H; Wall, Christine E
2016-01-01
In recent years large bibliographic databases have made much of the published literature of biology available for searches. However, the capabilities of the search engines integrated into these databases for text-based bibliographic searches are limited. To enable searches that deliver the results expected by comparative anatomists, an underlying logical structure known as an ontology is required. Here we present the Mammalian Feeding Muscle Ontology (MFMO), a multi-species ontology focused on anatomical structures that participate in feeding and other oral/pharyngeal behaviors. A unique feature of the MFMO is that a simple, computable, definition of each muscle, which includes its attachments and innervation, is true across mammals. This construction mirrors the logical foundation of comparative anatomy and permits searches using language familiar to biologists. Further, it provides a template for muscles that will be useful in extending any anatomy ontology. The MFMO is developed to support the Feeding Experiments End-User Database Project (FEED, https://feedexp.org/), a publicly-available, online repository for physiological data collected from in vivo studies of feeding (e.g., mastication, biting, swallowing) in mammals. Currently the MFMO is integrated into FEED and also into two literature-specific implementations of Textpresso, a text-mining system that facilitates powerful searches of a corpus of scientific publications. We evaluate the MFMO by asking questions that test the ability of the ontology to return appropriate answers (competency questions). We compare the results of queries of the MFMO to results from similar searches in PubMed and Google Scholar. Our tests demonstrate that the MFMO is competent to answer queries formed in the common language of comparative anatomy, but PubMed and Google Scholar are not. Overall, our results show that by incorporating anatomical ontologies into searches, an expanded and anatomically comprehensive set of results can be obtained. The broader scientific and publishing communities should consider taking up the challenge of semantically enabled search capabilities.
Combined use of semantics and metadata to manage Research Data Life Cycle in Environmental Sciences
NASA Astrophysics Data System (ADS)
Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Pertinez, Esther; Palacio, Aida
2017-04-01
The use of metadata to contextualize datasets is quite extended in Earth System Sciences. There are some initiatives and available tools to help data managers to choose the best metadata standard that fit their use cases, like the DCC Metadata Directory (http://www.dcc.ac.uk/resources/metadata-standards). In our use case, we have been gathering physical, chemical and biological data from a water reservoir since 2010. A well metadata definition is crucial not only to contextualize our own data but also to integrate datasets from other sources like satellites or meteorological agencies. That is why we have chosen EML (Ecological Metadata Language), which integrates many different elements to define a dataset, including the project context, instrumentation and parameters definition, and the software used to process, provide quality controls and include the publication details. Those metadata elements can contribute to help both human and machines to understand and process the dataset. However, the use of metadata is not enough to fully support the data life cycle, from the Data Management Plan definition to the Publication and Re-use. To do so, we need to define not only metadata and attributes but also the relationships between them, so semantics are needed. Ontologies, being a knowledge representation, can contribute to define the elements of a research data life cycle, including DMP, datasets, software, etc. They also can define how the different elements are related between them and how they interact. The first advantage of developing an ontology of a knowledge domain is that they provide a common vocabulary hierarchy (i.e. a conceptual schema) that can be used and standardized by all the agents interested in the domain (either humans or machines). This way of using ontologies is one of the basis of the Semantic Web, where ontologies are set to play a key role in establishing a common terminology between agents. To develop an ontology we are using a graphical tool Protégé, which is a graphical ontology-development tool that supports a rich knowledge model and it is open-source and freely available. To process and manage the ontology, we are using Semantic MediaWiki, which is able to process queries. Semantic MediaWiki is an extension of MediaWiki where we can do semantic search and export data in RDF. Our final goal is integrating our data repository portal and semantic processing engine in order to have a complete system to manage the data life cycle stages and their relationships, including machine-actionable DMP solution, datasets and software management, computing resources for processing and analysis and publication features (DOI mint). This way we will be able to reproduce the full data life cycle chain warranting the FAIR+R principles.
Semantic Integration for Marine Science Interoperability Using Web Technologies
NASA Astrophysics Data System (ADS)
Rueda, C.; Bermudez, L.; Graybeal, J.; Isenor, A. W.
2008-12-01
The Marine Metadata Interoperability Project, MMI (http://marinemetadata.org) promotes the exchange, integration, and use of marine data through enhanced data publishing, discovery, documentation, and accessibility. A key effort is the definition of an Architectural Framework and Operational Concept for Semantic Interoperability (http://marinemetadata.org/sfc), which is complemented with the development of tools that realize critical use cases in semantic interoperability. In this presentation, we describe a set of such Semantic Web tools that allow performing important interoperability tasks, ranging from the creation of controlled vocabularies and the mapping of terms across multiple ontologies, to the online registration, storage, and search services needed to work with the ontologies (http://mmisw.org). This set of services uses Web standards and technologies, including Resource Description Framework (RDF), Web Ontology language (OWL), Web services, and toolkits for Rich Internet Application development. We will describe the following components: MMI Ontology Registry: The MMI Ontology Registry and Repository provides registry and storage services for ontologies. Entries in the registry are associated with projects defined by the registered users. Also, sophisticated search functions, for example according to metadata items and vocabulary terms, are provided. Client applications can submit search requests using the WC3 SPARQL Query Language for RDF. Voc2RDF: This component converts an ASCII comma-delimited set of terms and definitions into an RDF file. Voc2RDF facilitates the creation of controlled vocabularies by using a simple form-based user interface. Created vocabularies and their descriptive metadata can be submitted to the MMI Ontology Registry for versioning and community access. VINE: The Vocabulary Integration Environment component allows the user to map vocabulary terms across multiple ontologies. Various relationships can be established, for example exactMatch, narrowerThan, and subClassOf. VINE can compute inferred mappings based on the given associations. Attributes about each mapping, like comments and a confidence level, can also be included. VINE also supports registering and storing resulting mapping files in the Ontology Registry. The presentation will describe the application of semantic technologies in general, and our planned applications in particular, to solve data management problems in the marine and environmental sciences.
Ilic, D; Bessell, T L; Silagy, C A; Green, S
2003-03-01
The Internet provides consumers with access to online health information; however, identifying relevant and valid information can be problematic. Our objectives were firstly to investigate the efficiency of search-engines, and then to assess the quality of online information pertaining to androgen deficiency in the ageing male (ADAM). Keyword searches were performed on nine search-engines (four general and five medical) to identify website information regarding ADAM. Search-engine efficiency was compared by percentage of relevant websites obtained via each search-engine. The quality of information published on each website was assessed using the DISCERN rating tool. Of 4927 websites searched, 47 (1.44%) and 10 (0.60%) relevant websites were identified by general and medical search-engines respectively. The overall quality of online information on ADAM was poor. The quality of websites retrieved using medical search-engines did not differ significantly from those retrieved by general search-engines. Despite the poor quality of online information relating to ADAM, it is evident that medical search-engines are no better than general search-engines in sourcing consumer information relevant to ADAM.
Semantic Interpretation of An Artificial Neural Network
1995-12-01
ARTIFICIAL NEURAL NETWORK .7,’ THESIS Stanley Dale Kinderknecht Captain, USAF 770 DEAT7ET77,’H IR O C 7... ARTIFICIAL NEURAL NETWORK THESIS Stanley Dale Kinderknecht Captain, USAF AFIT/GCS/ENG/95D-07 Approved for public release; distribution unlimited The views...Government. AFIT/GCS/ENG/95D-07 SEMANTIC INTERPRETATION OF AN ARTIFICIAL NEURAL NETWORK THESIS Presented to the Faculty of the School of Engineering of
2017-03-01
Reports an error in "Multisensory brand search: How the meaning of sounds guides consumers' visual attention" by Klemens M. Knoeferle, Pia Knoeferle, Carlos Velasco and Charles Spence ( Journal of Experimental Psychology: Applied , 2016[Jun], Vol 22[2], 196-210). In the article, under Experiment 2, Design and Stimuli, the set number of target products and visual distractors reported in the second paragraph should be 20 and 13, respectively: "On each trial, the 16 products shown in the display were randomly selected from a set of 20 products belonging to different categories. Out of the set of 20 products, seven were potential targets, whereas the other 13 were used as visual distractors only throughout the experiment (since they were not linked to specific usage or consumption sounds)." Consequently, Appendix A in the supplemental materials has been updated. (The following abstract of the original article appeared in record 2016-28876-002.) Building on models of crossmodal attention, the present research proposes that brand search is inherently multisensory, in that the consumers' visual search for a specific brand can be facilitated by semantically related stimuli that are presented in another sensory modality. A series of 5 experiments demonstrates that the presentation of spatially nonpredictive auditory stimuli associated with products (e.g., usage sounds or product-related jingles) can crossmodally facilitate consumers' visual search for, and selection of, products. Eye-tracking data (Experiment 2) revealed that the crossmodal effect of auditory cues on visual search manifested itself not only in RTs, but also in the earliest stages of visual attentional processing, thus suggesting that the semantic information embedded within sounds can modulate the perceptual saliency of the target products' visual representations. Crossmodal facilitation was even observed for newly learnt associations between unfamiliar brands and sonic logos, implicating multisensory short-term learning in establishing audiovisual semantic associations. The facilitation effect was stronger when searching complex rather than simple visual displays, thus suggesting a modulatory role of perceptual load. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
NASA Astrophysics Data System (ADS)
Graham, Matthew; Gray, N.; Burke, D.
2010-01-01
Many activities in the era of data-intensive astronomy are predicated upon some transference of domain knowledge and expertise from human to machine. The semantic infrastructure required to support this is no longer a pipe dream of computer science but a set of practical engineering challenges, more concerned with deployment and performance details than AI abstractions. The application of such ideas promises to help in such areas as contextual data access, exploiting distributed annotation and heterogeneous sources, and intelligent data dissemination and discovery. In this talk, we will review the status and use of semantic technologies in astronomy, particularly to address current problems in astroinformatics, with such projects as SKUA and AstroCollation.
Social and Personal Factors in Semantic Infusion Projects
NASA Astrophysics Data System (ADS)
West, P.; Fox, P. A.; McGuinness, D. L.
2009-12-01
As part of our semantic data framework activities across multiple, diverse disciplines we required the involvement of domain scientists, computer scientists, software engineers, data managers, and often, social scientists. This involvement from a cross-section of disciplines turns out to be a social exercise as much as it is a technical and methodical activity. Each member of the team is used to different modes of working, expectations, vocabularies, levels of participation, and incentive and reward systems. We will examine how both roles and personal responsibilities play in the development of semantic infusion projects, and how an iterative development cycle can contribute to the successful completion of such a project.
Adjacency and Proximity Searching in the Science Citation Index and Google
2005-01-01
major database search engines , including commercial S&T database search engines (e.g., Science Citation Index (SCI), Engineering Compendex (EC...PubMed, OVID), Federal agency award database search engines (e.g., NSF, NIH, DOE, EPA, as accessed in Federal R&D Project Summaries), Web search Engines (e.g...searching. Some database search engines allow strict constrained co- occurrence searching as a user option (e.g., OVID, EC), while others do not (e.g., SCI
Visual search for verbal material in patients with obsessive-compulsive disorder.
Botta, Fabiano; Vibert, Nicolas; Harika-Germaneau, Ghina; Frasca, Mickaël; Rigalleau, François; Fakra, Eric; Ros, Christine; Rouet, Jean-François; Ferreri, Florian; Jaafari, Nematollah
2018-06-01
This study aimed at investigating attentional mechanisms in obsessive-compulsive disorder (OCD) by analysing how visual search processes are modulated by normal and obsession-related distracting information in OCD patients and whether these modulations differ from those observed in healthy people. OCD patients were asked to search for a target word within distractor words that could be orthographically similar to the target, semantically related to the target, semantically related to the most typical obsessions/compulsions observed in OCD patients, or unrelated to the target. Patients' performance and eye movements were compared with those of individually matched healthy controls. In controls, the distractors that were visually similar to the target mostly captured attention. Conversely, patients' attention was captured equally by all kinds of distractor words, whatever their similarity with the target, except obsession-related distractors that attracted patients' attention less than the other distractors. OCD had a major impact on the mostly subliminal mechanisms that guide attention within the search display, but had much less impact on the distractor rejection processes that take place when a distractor is fixated. Hence, visual search in OCD is characterized by abnormal subliminal, but not supraliminal, processing of obsession-related information and by an impaired ability to inhibit task-irrelevant inputs. Copyright © 2018 Elsevier B.V. All rights reserved.
Content-based image retrieval with ontological ranking
NASA Astrophysics Data System (ADS)
Tsai, Shen-Fu; Tsai, Min-Hsuan; Huang, Thomas S.
2010-02-01
Images are a much more powerful medium of expression than text, as the adage says: "One picture is worth a thousand words." It is because compared with text consisting of an array of words, an image has more degrees of freedom and therefore a more complicated structure. However, the less limited structure of images presents researchers in the computer vision community a tough task of teaching machines to understand and organize images, especially when a limit number of learning examples and background knowledge are given. The advance of internet and web technology in the past decade has changed the way human gain knowledge. People, hence, can exchange knowledge with others by discussing and contributing information on the web. As a result, the web pages in the internet have become a living and growing source of information. One is therefore tempted to wonder whether machines can learn from the web knowledge base as well. Indeed, it is possible to make computer learn from the internet and provide human with more meaningful knowledge. In this work, we explore this novel possibility on image understanding applied to semantic image search. We exploit web resources to obtain links from images to keywords and a semantic ontology constituting human's general knowledge. The former maps visual content to related text in contrast to the traditional way of associating images with surrounding text; the latter provides relations between concepts for machines to understand to what extent and in what sense an image is close to the image search query. With the aid of these two tools, the resulting image search system is thus content-based and moreover, organized. The returned images are ranked and organized such that semantically similar images are grouped together and given a rank based on the semantic closeness to the input query. The novelty of the system is twofold: first, images are retrieved not only based on text cues but their actual contents as well; second, the grouping is different from pure visual similarity clustering. More specifically, the inferred concepts of each image in the group are examined in the context of a huge concept ontology to determine their true relations with what people have in mind when doing image search.
Informatics in radiology: radiology gamuts ontology: differential diagnosis for the Semantic Web.
Budovec, Joseph J; Lam, Cesar A; Kahn, Charles E
2014-01-01
The Semantic Web is an effort to add semantics, or "meaning," to empower automated searching and processing of Web-based information. The overarching goal of the Semantic Web is to enable users to more easily find, share, and combine information. Critical to this vision are knowledge models called ontologies, which define a set of concepts and formalize the relations between them. Ontologies have been developed to manage and exploit the large and rapidly growing volume of information in biomedical domains. In diagnostic radiology, lists of differential diagnoses of imaging observations, called gamuts, provide an important source of knowledge. The Radiology Gamuts Ontology (RGO) is a formal knowledge model of differential diagnoses in radiology that includes 1674 differential diagnoses, 19,017 terms, and 52,976 links between terms. Its knowledge is used to provide an interactive, freely available online reference of radiology gamuts ( www.gamuts.net ). A Web service allows its content to be discovered and consumed by other information systems. The RGO integrates radiologic knowledge with other biomedical ontologies as part of the Semantic Web. © RSNA, 2014.
Wollbrett, Julien; Larmande, Pierre; de Lamotte, Frédéric; Ruiz, Manuel
2013-04-15
In recent years, a large amount of "-omics" data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic.
2013-01-01
Background In recent years, a large amount of “-omics” data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. Results We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. Conclusions BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic. PMID:23586394
Large-scale Cross-modality Search via Collective Matrix Factorization Hashing.
Ding, Guiguang; Guo, Yuchen; Zhou, Jile; Gao, Yue
2016-09-08
By transforming data into binary representation, i.e., Hashing, we can perform high-speed search with low storage cost, and thus Hashing has collected increasing research interest in the recent years. Recently, how to generate Hashcode for multimodal data (e.g., images with textual tags, documents with photos, etc) for large-scale cross-modality search (e.g., searching semantically related images in database for a document query) is an important research issue because of the fast growth of multimodal data in the Web. To address this issue, a novel framework for multimodal Hashing is proposed, termed as Collective Matrix Factorization Hashing (CMFH). The key idea of CMFH is to learn unified Hashcodes for different modalities of one multimodal instance in the shared latent semantic space in which different modalities can be effectively connected. Therefore, accurate cross-modality search is supported. Based on the general framework, we extend it in the unsupervised scenario where it tries to preserve the Euclidean structure, and in the supervised scenario where it fully exploits the label information of data. The corresponding theoretical analysis and the optimization algorithms are given. We conducted comprehensive experiments on three benchmark datasets for cross-modality search. The experimental results demonstrate that CMFH can significantly outperform several state-of-the-art cross-modality Hashing methods, which validates the effectiveness of the proposed CMFH.
Start Your Engines: Surfing with Search Engines for Kids.
ERIC Educational Resources Information Center
Byerly, Greg; Brodie, Carolyn S.
1999-01-01
Suggests that to be an effective educator and user of the Web it is essential to know the basics about search engines. Presents tips for using search engines. Describes several search engines for children and young adults, as well as some general filtered search engines for children. (AEF)
Semantics and technologies in modern design of interior stairs
NASA Astrophysics Data System (ADS)
Kukhta, M.; Sokolov, A.; Pelevin, E.
2015-10-01
Use of metal in the design of interior stairs presents new features for shaping, and can be implemented using different technologies. The article discusses the features of design and production technologies of forged metal spiral staircase considering the image semantics based on the historical and cultural heritage. To achieve the objective was applied structural- semantic method (to identify the organization of structure and semantic features of the artistic image), engineering methods (to justify the construction of the object), anthropometry method and ergonomics (to provide usability), methods of comparative analysis (to reveale the features of the way the ladder in different periods of culture). According to the research results are as follows. Was revealed the semantics influence on the design of interior staircase that is based on the World Tree image. Also was suggested rational calculation of steps to ensure the required strength. And finally was presented technology, providing the realization of the artistic image. In the practical part of the work is presented version of forged staircase.
ER2OWL: Generating OWL Ontology from ER Diagram
NASA Astrophysics Data System (ADS)
Fahad, Muhammad
Ontology is the fundamental part of Semantic Web. The goal of W3C is to bring the web into (its full potential) a semantic web with reusing previous systems and artifacts. Most legacy systems have been documented in structural analysis and structured design (SASD), especially in simple or Extended ER Diagram (ERD). Such systems need up-gradation to become the part of semantic web. In this paper, we present ERD to OWL-DL ontology transformation rules at concrete level. These rules facilitate an easy and understandable transformation from ERD to OWL. The set of rules for transformation is tested on a structured analysis and design example. The framework provides OWL ontology for semantic web fundamental. This framework helps software engineers in upgrading the structured analysis and design artifact ERD, to components of semantic web. Moreover our transformation tool, ER2OWL, reduces the cost and time for building OWL ontologies with the reuse of existing entity relationship models.
NASA Astrophysics Data System (ADS)
Rosen, Charles; Siegel, Edward Carl-Ludwig; Feynman, Richard; Wunderman, Irwin; Smith, Adolph; Marinov, Vesco; Goldman, Jacob; Brine, Sergey; Poge, Larry; Schmidt, Erich; Young, Frederic; Goates-Bulmer, William-Steven; Lewis-Tsurakov-Altshuler, Thomas-Valerie-Genot; Ibm/Exxon Collaboration; Google/Uw Collaboration; Microsoft/Amazon Collaboration; Oracle/Sun Collaboration; Ostp/Dod/Dia/Nsa/W.-F./Boa/Ubs/Ub Collaboration
2013-03-01
Belew[Finding Out About, Cambridge(2000)] and separately full-decade pre-Page/Brin/Google FIRST Siegel-Rosen(Machine-Intelligence/Atherton)-Feynman-Smith-Marinov(Guzik Enterprises/Exxon-Enterprises/A.-I./Santa Clara)-Wunderman(H.-P.) [IBM Conf. on Computers and Mathematics, Stanford(1986); APS Mtgs.(1980s): Palo Alto/Santa Clara/San Francisco/...(1980s) MRS Spring-Mtgs.(1980s): Palo Alto/San Jose/San Francisco/...(1980-1992) FIRST quantum-computing via Bose-Einstein quantum-statistics(BEQS) Bose-Einstein CONDENSATION (BEC) in artificial-intelligence(A-I) artificial neural-networks(A-N-N) and biological neural-networks(B-N-N) and Siegel[J. Noncrystalline-Solids 40, 453(1980); Symp. on Fractals..., MRS Fall-Mtg., Boston(1989)-5-papers; Symp. on Scaling..., (1990); Symp. on Transport in Geometric-Constraint (1990)
Response time effects of alerting tone and semantic context for synthesized voice cockpit warnings
NASA Technical Reports Server (NTRS)
Simpson, C. A.; Williams, D. H.
1980-01-01
Some handbooks and human factors design guides have recommended that a voice warning should be preceded by a tone to attract attention to the warning. As far as can be determined from a search of the literature, no experimental evidence supporting this exists. A fixed-base simulator flown by airline pilots was used to test the hypothesis that the total 'system-time' to respond to a synthesized voice cockpit warning would be longer when the message was preceded by a tone because the voice itself was expected to perform both the alerting and the information transfer functions. The simulation included realistic ATC radio voice communications, synthesized engine noise, cockpit conversation, and realistic flight routes. The effect of a tone before a voice warning was to lengthen response time; that is, responses were slower with an alerting tone. Lengthening the voice warning with another work, however, did not increase response time.
An Infrastructure for Indexing and Organizing Best Practices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhu, Liming; Staples, Mark; Gorton, Ian
Industry best practices are widely held but not necessarily empirically verified software engineering beliefs. Best practices can be documented in distributed web-based public repositories as pattern catalogues or practice libraries. There is a need to systematically index and organize these practices to enable their better practical use and scientific evaluation. In this paper, we propose a semi-automatic approach to index and organise best practices. A central repository acts as an information overlay on top of other pre-existing resources to facilitate organization, navigation, annotation and meta-analysis while maintaining synchronization with those resources. An initial population of the central repository is automatedmore » using Yahoo! contextual search services. The collected data is organized using semantic web technologies so that the data can be more easily shared and used for innovative analyses. A prototype has demonstrated the capability of the approach.« less
A user-centred evaluation framework for the Sealife semantic web browsers
Oliver, Helen; Diallo, Gayo; de Quincey, Ed; Alexopoulou, Dimitra; Habermann, Bianca; Kostkova, Patty; Schroeder, Michael; Jupp, Simon; Khelif, Khaled; Stevens, Robert; Jawaheer, Gawesh; Madle, Gemma
2009-01-01
Background Semantically-enriched browsing has enhanced the browsing experience by providing contextualised dynamically generated Web content, and quicker access to searched-for information. However, adoption of Semantic Web technologies is limited and user perception from the non-IT domain sceptical. Furthermore, little attention has been given to evaluating semantic browsers with real users to demonstrate the enhancements and obtain valuable feedback. The Sealife project investigates semantic browsing and its application to the life science domain. Sealife's main objective is to develop the notion of context-based information integration by extending three existing Semantic Web browsers (SWBs) to link the existing Web to the eScience infrastructure. Methods This paper describes a user-centred evaluation framework that was developed to evaluate the Sealife SWBs that elicited feedback on users' perceptions on ease of use and information findability. Three sources of data: i) web server logs; ii) user questionnaires; and iii) semi-structured interviews were analysed and comparisons made between each browser and a control system. Results It was found that the evaluation framework used successfully elicited users' perceptions of the three distinct SWBs. The results indicate that the browser with the most mature and polished interface was rated higher for usability, and semantic links were used by the users of all three browsers. Conclusion Confirmation or contradiction of our original hypotheses with relation to SWBs is detailed along with observations of implementation issues. PMID:19796398
A user-centred evaluation framework for the Sealife semantic web browsers.
Oliver, Helen; Diallo, Gayo; de Quincey, Ed; Alexopoulou, Dimitra; Habermann, Bianca; Kostkova, Patty; Schroeder, Michael; Jupp, Simon; Khelif, Khaled; Stevens, Robert; Jawaheer, Gawesh; Madle, Gemma
2009-10-01
Semantically-enriched browsing has enhanced the browsing experience by providing contextualized dynamically generated Web content, and quicker access to searched-for information. However, adoption of Semantic Web technologies is limited and user perception from the non-IT domain sceptical. Furthermore, little attention has been given to evaluating semantic browsers with real users to demonstrate the enhancements and obtain valuable feedback. The Sealife project investigates semantic browsing and its application to the life science domain. Sealife's main objective is to develop the notion of context-based information integration by extending three existing Semantic Web browsers (SWBs) to link the existing Web to the eScience infrastructure. This paper describes a user-centred evaluation framework that was developed to evaluate the Sealife SWBs that elicited feedback on users' perceptions on ease of use and information findability. Three sources of data: i) web server logs; ii) user questionnaires; and iii) semi-structured interviews were analysed and comparisons made between each browser and a control system. It was found that the evaluation framework used successfully elicited users' perceptions of the three distinct SWBs. The results indicate that the browser with the most mature and polished interface was rated higher for usability, and semantic links were used by the users of all three browsers. Confirmation or contradiction of our original hypotheses with relation to SWBs is detailed along with observations of implementation issues.
Dynamic memory searches: Selective output interference for the memory of facts.
Aue, William R; Criss, Amy H; Prince, Melissa A
2015-12-01
The benefits of testing on later memory performance are well documented; however, the manner in which testing harms memory performance is less well understood. This research is concerned with the finding that accuracy decreases over the course of testing, a phenomena termed "output interference" (OI). OI has primarily been investigated with episodic memory, but there is limited research investigating OI in measures of semantic memory (i.e., knowledge). In the current study, participants were twice tested for their knowledge of factual questions; they received corrective feedback during the first test. No OI was observed during the first test, when participants presumably searched semantic memory to answer the general-knowledge questions. During the second test, OI was observed. Conditional analyses of Test 2 performance revealed that OI was largely isolated to questions answered incorrectly during Test 1. These were questions for which participants needed to rely on recent experience (i.e., the feedback in episodic memory) to respond correctly. One possible explanation is that episodic memory is more susceptible to the sort of interference generated during testing (e.g., gradual changes in context, encoding/updating of items) relative to semantic memory. Alternative explanations are considered.
2013-01-01
Background Clinical Intelligence, as a research and engineering discipline, is dedicated to the development of tools for data analysis for the purposes of clinical research, surveillance, and effective health care management. Self-service ad hoc querying of clinical data is one desirable type of functionality. Since most of the data are currently stored in relational or similar form, ad hoc querying is problematic as it requires specialised technical skills and the knowledge of particular data schemas. Results A possible solution is semantic querying where the user formulates queries in terms of domain ontologies that are much easier to navigate and comprehend than data schemas. In this article, we are exploring the possibility of using SADI Semantic Web services for semantic querying of clinical data. We have developed a prototype of a semantic querying infrastructure for the surveillance of, and research on, hospital-acquired infections. Conclusions Our results suggest that SADI can support ad-hoc, self-service, semantic queries of relational data in a Clinical Intelligence context. The use of SADI compares favourably with approaches based on declarative semantic mappings from data schemas to ontologies, such as query rewriting and RDFizing by materialisation, because it can easily cope with situations when (i) some computation is required to turn relational data into RDF or OWL, e.g., to implement temporal reasoning, or (ii) integration with external data sources is necessary. PMID:23497556
SemMat: Federated Semantic Services Platform for Open materials Science and Engineering
2017-01-01
identified the following two important tasks to remedy the data heterogeneity challenge to promote data integration: (1) creating the semantic...sourced from the structural and bio -materials domains. For structural materials data, we reviewed and used MIL-HDBK-5J [11] and MIL-HDBK-17. Furthermore...documents about composite materials provided by our domain expert. Based on the suggestions given by domain experts in bio -materials, the following
Setting semantics: conceptual set can determine the physical properties that capture attention.
Goodhew, Stephanie C; Kendall, William; Ferber, Susanne; Pratt, Jay
2014-08-01
The ability of a stimulus to capture visuospatial attention depends on the interplay between its bottom-up saliency and its relationship to an observer's top-down control set, such that stimuli capture attention if they match the predefined properties that distinguish a searched-for target from distractors (Folk, Remington, & Johnston, Journal of Experimental Psychology: Human Perception & Performance, 18, 1030-1044 1992). Despite decades of research on this phenomenon, however, the vast majority has focused exclusively on matches based on low-level physical properties. Yet if contingent capture is indeed a "top-down" influence on attention, then semantic content should be accessible and able to determine which physical features capture attention. Here we tested this prediction by examining whether a semantically defined target could create a control set for particular features. To do this, we had participants search to identify a target that was differentiated from distractors by its meaning (e.g., the word "red" among color words all written in black). Before the target array, a cue was presented, and it was varied whether the cue appeared in the physical color implied by the target word. Across three experiments, we found that cues that embodied the meaning of the word produced greater cuing than cues that did not. This suggests that top-down control sets activate content that is semantically associated with the target-defining property, and this content in turn has the ability to exogenously orient attention.
Meeting medical terminology needs--the Ontology-Enhanced Medical Concept Mapper.
Leroy, G; Chen, H
2001-12-01
This paper describes the development and testing of the Medical Concept Mapper, a tool designed to facilitate access to online medical information sources by providing users with appropriate medical search terms for their personal queries. Our system is valuable for patients whose knowledge of medical vocabularies is inadequate to find the desired information, and for medical experts who search for information outside their field of expertise. The Medical Concept Mapper maps synonyms and semantically related concepts to a user's query. The system is unique because it integrates our natural language processing tool, i.e., the Arizona (AZ) Noun Phraser, with human-created ontologies, the Unified Medical Language System (UMLS) and WordNet, and our computer generated Concept Space, into one system. Our unique contribution results from combining the UMLS Semantic Net with Concept Space in our deep semantic parsing (DSP) algorithm. This algorithm establishes a medical query context based on the UMLS Semantic Net, which allows Concept Space terms to be filtered so as to isolate related terms relevant to the query. We performed two user studies in which Medical Concept Mapper terms were compared against human experts' terms. We conclude that the AZ Noun Phraser is well suited to extract medical phrases from user queries, that WordNet is not well suited to provide strictly medical synonyms, that the UMLS Metathesaurus is well suited to provide medical synonyms, and that Concept Space is well suited to provide related medical terms, especially when these terms are limited by our DSP algorithm.
Centrality-based Selection of Semantic Resources for Geosciences
NASA Astrophysics Data System (ADS)
Cerba, Otakar; Jedlicka, Karel
2017-04-01
Semantical questions intervene almost in all disciplines dealing with geographic data and information, because relevant semantics is crucial for any way of communication and interaction among humans as well as among machines. But the existence of such a large number of different semantic resources (such as various thesauri, controlled vocabularies, knowledge bases or ontologies) makes the process of semantics implementation much more difficult and complicates the use of the advantages of semantics. This is because in many cases users are not able to find the most suitable resource for their purposes. The research presented in this paper introduces a methodology consisting of an analysis of identical relations in Linked Data space, which covers a majority of semantic resources, to find a suitable resource of semantic information. Identical links interconnect representations of an object or a concept in various semantic resources. Therefore this type of relations is considered to be crucial from the view of Linked Data, because these links provide new additional information, including various views on one concept based on different cultural or regional aspects (so-called social role of Linked Data). For these reasons it is possible to declare that one reasonable criterion for feasible semantic resources for almost all domains, including geosciences, is their position in a network of interconnected semantic resources and level of linking to other knowledge bases and similar products. The presented methodology is based on searching of mutual connections between various instances of one concept using "follow your nose" approach. The extracted data on interconnections between semantic resources are arranged to directed graphs and processed by various metrics patterned on centrality computing (degree, closeness or betweenness centrality). Semantic resources recommended by the research could be used for providing semantically described keywords for metadata records or as names of items in data models. Such an approach enables much more efficient data harmonization, integration, sharing and exploitation. * * * * This publication was supported by the project LO1506 of the Czech Ministry of Education, Youth and Sports. This publication was supported by project Data-Driven Bioeconomy (DataBio) from the ICT-15-2016-2017, Big Data PPP call.
Custom Search Engines: Tools & Tips
ERIC Educational Resources Information Center
Notess, Greg R.
2008-01-01
Few have the resources to build a Google or Yahoo! from scratch. Yet anyone can build a search engine based on a subset of the large search engines' databases. Use Google Custom Search Engine or Yahoo! Search Builder or any of the other similar programs to create a vertical search engine targeting sites of interest to users. The basic steps to…
Adding a visualization feature to web search engines: it's time.
Wong, Pak Chung
2008-01-01
It's widely recognized that all Web search engines today are almost identical in presentation layout and behavior. In fact, the same presentation approach has been applied to depicting search engine results pages (SERPs) since the first Web search engine launched in 1993. In this Visualization Viewpoints article, I propose to add a visualization feature to Web search engines and suggest that the new addition can improve search engines' performance and capabilities, which in turn lead to better Web search technology.
ERIC Educational Resources Information Center
Garman, Nancy
1999-01-01
Describes common options and features to consider in evaluating which meta search engine will best meet a searcher's needs. Discusses number and names of engines searched; other sources and specialty engines; search queries; other search options; and results options. (AEF)
NASA Astrophysics Data System (ADS)
Titov, A.; Gordov, E.; Okladnikov, I.
2009-04-01
In this report the results of the work devoted to the development of working model of the software system for storage, semantically-enabled search and retrieval along with processing and visualization of environmental datasets containing results of meteorological and air pollution observations and mathematical climate modeling are presented. Specially designed metadata standard for machine-readable description of datasets related to meteorology, climate and atmospheric pollution transport domains is introduced as one of the key system components. To provide semantic interoperability the Resource Description Framework (RDF, http://www.w3.org/RDF/) technology means have been chosen for metadata description model realization in the form of RDF Schema. The final version of the RDF Schema is implemented on the base of widely used standards, such as Dublin Core Metadata Element Set (http://dublincore.org/), Directory Interchange Format (DIF, http://gcmd.gsfc.nasa.gov/User/difguide/difman.html), ISO 19139, etc. At present the system is available as a Web server (http://climate.risks.scert.ru/metadatabase/) based on the web-portal ATMOS engine [1] and is implementing dataset management functionality including SeRQL-based semantic search as well as statistical analysis and visualization of selected data archives [2,3]. The core of the system is Apache web server in conjunction with Tomcat Java Servlet Container (http://jakarta.apache.org/tomcat/) and Sesame Server (http://www.openrdf.org/) used as a database for RDF and RDF Schema. At present statistical analysis of meteorological and climatic data with subsequent visualization of results is implemented for such datasets as NCEP/NCAR Reanalysis, Reanalysis NCEP/DOE AMIP II, JMA/CRIEPI JRA-25, ECMWF ERA-40 and local measurements obtained from meteorological stations on the territory of Russia. This functionality is aimed primarily at finding of main characteristics of regional climate dynamics. The proposed system represents a step in the process of development of a distributed collaborative information-computational environment to support multidisciplinary investigations of Earth regional environment [4]. Partial support of this work by SB RAS Integration Project 34, SB RAS Basic Program Project 4.5.2.2, APN Project CBA2007-08NSY and FP6 Enviro-RISKS project (INCO-CT-2004-013427) is acknowledged. References 1. E.P. Gordov, V.N. Lykosov, and A.Z. Fazliev. Web portal on environmental sciences "ATMOS" // Advances in Geosciences. 2006. Vol. 8. p. 33 - 38. 2. Gordov E.P., Okladnikov I.G., Titov A.G. Development of elements of web based information-computational system supporting regional environment processes investigations // Journal of Computational Technologies, Vol. 12, Special Issue #3, 2007, pp. 20 - 28. 3. Okladnikov I.G., Titov A.G. Melnikova V.N., Shulgina T.M. Web-system for processing and visualization of meteorological and climatic data // Journal of Computational Technologies, Vol. 13, Special Issue #3, 2008, pp. 64 - 69. 4. Gordov E.P., Lykosov V.N. Development of information-computational infrastructure for integrated study of Siberia environment // Journal of Computational Technologies, Vol. 12, Special Issue #2, 2007, pp. 19 - 30.
Organizing Diverse, Distributed Project Information
NASA Technical Reports Server (NTRS)
Keller, Richard M.
2003-01-01
SemanticOrganizer is a software application designed to organize and integrate information generated within a distributed organization or as part of a project that involves multiple, geographically dispersed collaborators. SemanticOrganizer incorporates the capabilities of database storage, document sharing, hypermedia navigation, and semantic-interlinking into a system that can be customized to satisfy the specific information-management needs of different user communities. The program provides a centralized repository of information that is both secure and accessible to project collaborators via the World Wide Web. SemanticOrganizer's repository can be used to collect diverse information (including forms, documents, notes, data, spreadsheets, images, and sounds) from computers at collaborators work sites. The program organizes the information using a unique network-structured conceptual framework, wherein each node represents a data record that contains not only the original information but also metadata (in effect, standardized data that characterize the information). Links among nodes express semantic relationships among the data records. The program features a Web interface through which users enter, interlink, and/or search for information in the repository. By use of this repository, the collaborators have immediate access to the most recent project information, as well as to archived information. A key advantage to SemanticOrganizer is its ability to interlink information together in a natural fashion using customized terminology and concepts that are familiar to a user community.
Kobayashi, Norio; Ishii, Manabu; Takahashi, Satoshi; Mochizuki, Yoshiki; Matsushima, Akihiro; Toyoda, Tetsuro
2011-07-01
Global cloud frameworks for bioinformatics research databases become huge and heterogeneous; solutions face various diametric challenges comprising cross-integration, retrieval, security and openness. To address this, as of March 2011 organizations including RIKEN published 192 mammalian, plant and protein life sciences databases having 8.2 million data records, integrated as Linked Open or Private Data (LOD/LPD) using SciNetS.org, the Scientists' Networking System. The huge quantity of linked data this database integration framework covers is based on the Semantic Web, where researchers collaborate by managing metadata across public and private databases in a secured data space. This outstripped the data query capacity of existing interface tools like SPARQL. Actual research also requires specialized tools for data analysis using raw original data. To solve these challenges, in December 2009 we developed the lightweight Semantic-JSON interface to access each fragment of linked and raw life sciences data securely under the control of programming languages popularly used by bioinformaticians such as Perl and Ruby. Researchers successfully used the interface across 28 million semantic relationships for biological applications including genome design, sequence processing, inference over phenotype databases, full-text search indexing and human-readable contents like ontology and LOD tree viewers. Semantic-JSON services of SciNetS.org are provided at http://semanticjson.org.
Research Prototype: Automated Analysis of Scientific and Engineering Semantics
NASA Technical Reports Server (NTRS)
Stewart, Mark E. M.; Follen, Greg (Technical Monitor)
2001-01-01
Physical and mathematical formulae and concepts are fundamental elements of scientific and engineering software. These classical equations and methods are time tested, universally accepted, and relatively unambiguous. The existence of this classical ontology suggests an ideal problem for automated comprehension. This problem is further motivated by the pervasive use of scientific code and high code development costs. To investigate code comprehension in this classical knowledge domain, a research prototype has been developed. The prototype incorporates scientific domain knowledge to recognize code properties (including units, physical, and mathematical quantity). Also, the procedure implements programming language semantics to propagate these properties through the code. This prototype's ability to elucidate code and detect errors will be demonstrated with state of the art scientific codes.
Discovery of novel biomarkers and phenotypes by semantic technologies
2013-01-01
Background Biomarkers and target-specific phenotypes are important to targeted drug design and individualized medicine, thus constituting an important aspect of modern pharmaceutical research and development. More and more, the discovery of relevant biomarkers is aided by in silico techniques based on applying data mining and computational chemistry on large molecular databases. However, there is an even larger source of valuable information available that can potentially be tapped for such discoveries: repositories constituted by research documents. Results This paper reports on a pilot experiment to discover potential novel biomarkers and phenotypes for diabetes and obesity by self-organized text mining of about 120,000 PubMed abstracts, public clinical trial summaries, and internal Merck research documents. These documents were directly analyzed by the InfoCodex semantic engine, without prior human manipulations such as parsing. Recall and precision against established, but different benchmarks lie in ranges up to 30% and 50% respectively. Retrieval of known entities missed by other traditional approaches could be demonstrated. Finally, the InfoCodex semantic engine was shown to discover new diabetes and obesity biomarkers and phenotypes. Amongst these were many interesting candidates with a high potential, although noticeable noise (uninteresting or obvious terms) was generated. Conclusions The reported approach of employing autonomous self-organising semantic engines to aid biomarker discovery, supplemented by appropriate manual curation processes, shows promise and has potential to impact, conservatively, a faster alternative to vocabulary processes dependent on humans having to read and analyze all the texts. More optimistically, it could impact pharmaceutical research, for example to shorten time-to-market of novel drugs, or speed up early recognition of dead ends and adverse reactions. PMID:23402646
Helping Students Choose Tools To Search the Web.
ERIC Educational Resources Information Center
Cohen, Laura B.; Jacobson, Trudi E.
2000-01-01
Describes areas where faculty members can aid students in making intelligent use of the Web in their research. Differentiates between subject directories and search engines. Describes an engine's three components: spider, index, and search engine. Outlines two misconceptions: that Yahoo! is a search engine and that search engines contain all the…
Grooker, KartOO, Addict-o-Matic and More: Really Different Search Engines
ERIC Educational Resources Information Center
Descy, Don E.
2009-01-01
There are hundreds of unique search engines in the United States and thousands of unique search engines around the world. If people get into search engines designed just to search particular web sites, the number is in the hundreds of thousands. This article looks at: (1) clustering search engines, such as KartOO (www.kartoo.com) and Grokker…
ODISEES: A New Paradigm in Data Access
NASA Astrophysics Data System (ADS)
Huffer, E.; Little, M. M.; Kusterer, J.
2013-12-01
As part of its ongoing efforts to improve access to data, the Atmospheric Science Data Center has developed a high-precision Earth Science domain ontology (the 'ES Ontology') implemented in a graph database ('the Semantic Metadata Repository') that is used to store detailed, semantically-enhanced, parameter-level metadata for ASDC data products. The ES Ontology provides the semantic infrastructure needed to drive the ASDC's Ontology-Driven Interactive Search Environment for Earth Science ('ODISEES'), a data discovery and access tool, and will support additional data services such as analytics and visualization. The ES ontology is designed on the premise that naming conventions alone are not adequate to provide the information needed by prospective data consumers to assess the suitability of a given dataset for their research requirements; nor are current metadata conventions adequate to support seamless machine-to-machine interactions between file servers and end-user applications. Data consumers need information not only about what two data elements have in common, but also about how they are different. End-user applications need consistent, detailed metadata to support real-time data interoperability. The ES ontology is a highly precise, bottom-up, queriable model of the Earth Science domain that focuses on critical details about the measurable phenomena, instrument techniques, data processing methods, and data file structures. Earth Science parameters are described in detail in the ES Ontology and mapped to the corresponding variables that occur in ASDC datasets. Variables are in turn mapped to well-annotated representations of the datasets that they occur in, the instrument(s) used to create them, the instrument platforms, the processing methods, etc., creating a linked-data structure that allows both human and machine users to access a wealth of information critical to understanding and manipulating the data. The mappings are recorded in the Semantic Metadata Repository as RDF-triples. An off-the-shelf Ontology Development Environment and a custom Metadata Conversion Tool comprise a human-machine/machine-machine hybrid tool that partially automates the creation of metadata as RDF-triples by interfacing with existing metadata repositories and providing a user interface that solicits input from a human user, when needed. RDF-triples are pushed to the Ontology Development Environment, where a reasoning engine executes a series of inference rules whose antecedent conditions can be satisfied by the initial set of RDF-triples, thereby generating the additional detailed metadata that is missing in existing repositories. A SPARQL Endpoint, a web-based query service and a Graphical User Interface allow prospective data consumers - even those with no familiarity with NASA data products - to search the metadata repository to find and order data products that meet their exact specifications. A web-based API will provide an interface for machine-to-machine transactions.
MetaSEEk: a content-based metasearch engine for images
NASA Astrophysics Data System (ADS)
Beigi, Mandis; Benitez, Ana B.; Chang, Shih-Fu
1997-12-01
Search engines are the most powerful resources for finding information on the rapidly expanding World Wide Web (WWW). Finding the desired search engines and learning how to use them, however, can be very time consuming. The integration of such search tools enables the users to access information across the world in a transparent and efficient manner. These systems are called meta-search engines. The recent emergence of visual information retrieval (VIR) search engines on the web is leading to the same efficiency problem. This paper describes and evaluates MetaSEEk, a content-based meta-search engine used for finding images on the Web based on their visual information. MetaSEEk is designed to intelligently select and interface with multiple on-line image search engines by ranking their performance for different classes of user queries. User feedback is also integrated in the ranking refinement. We compare MetaSEEk with a base line version of meta-search engine, which does not use the past performance of the different search engines in recommending target search engines for future queries.
[Advanced online search techniques and dedicated search engines for physicians].
Nahum, Yoav
2008-02-01
In recent years search engines have become an essential tool in the work of physicians. This article will review advanced search techniques from the world of information specialists, as well as some advanced search engine operators that may help physicians improve their online search capabilities, and maximize the yield of their searches. This article also reviews popular dedicated scientific and biomedical literature search engines.
An approach for the semantic interoperability of ISO EN 13606 and OpenEHR archetypes.
Martínez-Costa, Catalina; Menárguez-Tortosa, Marcos; Fernández-Breis, Jesualdo Tomás
2010-10-01
The communication between health information systems of hospitals and primary care organizations is currently an important challenge to improve the quality of clinical practice and patient safety. However, clinical information is usually distributed among several independent systems that may be syntactically or semantically incompatible. This fact prevents healthcare professionals from accessing clinical information of patients in an understandable and normalized way. In this work, we address the semantic interoperability of two EHR standards: OpenEHR and ISO EN 13606. Both standards follow the dual model approach which distinguishes information and knowledge, this being represented through archetypes. The solution presented here is capable of transforming OpenEHR archetypes into ISO EN 13606 and vice versa by combining Semantic Web and Model-driven Engineering technologies. The resulting software implementation has been tested using publicly available collections of archetypes for both standards.
Graph-Based Semantic Web Service Composition for Healthcare Data Integration.
Arch-Int, Ngamnij; Arch-Int, Somjit; Sonsilphong, Suphachoke; Wanchai, Paweena
2017-01-01
Within the numerous and heterogeneous web services offered through different sources, automatic web services composition is the most convenient method for building complex business processes that permit invocation of multiple existing atomic services. The current solutions in functional web services composition lack autonomous queries of semantic matches within the parameters of web services, which are necessary in the composition of large-scale related services. In this paper, we propose a graph-based Semantic Web Services composition system consisting of two subsystems: management time and run time. The management-time subsystem is responsible for dependency graph preparation in which a dependency graph of related services is generated automatically according to the proposed semantic matchmaking rules. The run-time subsystem is responsible for discovering the potential web services and nonredundant web services composition of a user's query using a graph-based searching algorithm. The proposed approach was applied to healthcare data integration in different health organizations and was evaluated according to two aspects: execution time measurement and correctness measurement.
Graph-Based Semantic Web Service Composition for Healthcare Data Integration
2017-01-01
Within the numerous and heterogeneous web services offered through different sources, automatic web services composition is the most convenient method for building complex business processes that permit invocation of multiple existing atomic services. The current solutions in functional web services composition lack autonomous queries of semantic matches within the parameters of web services, which are necessary in the composition of large-scale related services. In this paper, we propose a graph-based Semantic Web Services composition system consisting of two subsystems: management time and run time. The management-time subsystem is responsible for dependency graph preparation in which a dependency graph of related services is generated automatically according to the proposed semantic matchmaking rules. The run-time subsystem is responsible for discovering the potential web services and nonredundant web services composition of a user's query using a graph-based searching algorithm. The proposed approach was applied to healthcare data integration in different health organizations and was evaluated according to two aspects: execution time measurement and correctness measurement. PMID:29065602
Atmosphere-based image classification through luminance and hue
NASA Astrophysics Data System (ADS)
Xu, Feng; Zhang, Yujin
2005-07-01
In this paper a novel image classification system is proposed. Atmosphere serves an important role in generating the scene"s topic or in conveying the message behind the scene"s story, which belongs to abstract attribute level in semantic levels. At first, five atmosphere semantic categories are defined according to rules of photo and film grammar, followed by global luminance and hue features. Then the hierarchical SVM classifiers are applied. In each classification stage, corresponding features are extracted and the trained linear SVM is implemented, resulting in two classes. After three stages of classification, five atmosphere categories are obtained. At last, the text annotation of the atmosphere semantics and the corresponding features by Extensible Markup Language (XML) in MPEG-7 is defined, which can be integrated into more multimedia applications (such as searching, indexing and accessing of multimedia content). The experiment is performed on Corel images and film frames. The classification results prove the effectiveness of the definition of atmosphere semantic classes and the corresponding features.
Semantic Interaction for Sensemaking: Inferring Analytical Reasoning for Model Steering.
Endert, A; Fiaux, P; North, C
2012-12-01
Visual analytic tools aim to support the cognitively demanding task of sensemaking. Their success often depends on the ability to leverage capabilities of mathematical models, visualization, and human intuition through flexible, usable, and expressive interactions. Spatially clustering data is one effective metaphor for users to explore similarity and relationships between information, adjusting the weighting of dimensions or characteristics of the dataset to observe the change in the spatial layout. Semantic interaction is an approach to user interaction in such spatializations that couples these parametric modifications of the clustering model with users' analytic operations on the data (e.g., direct document movement in the spatialization, highlighting text, search, etc.). In this paper, we present results of a user study exploring the ability of semantic interaction in a visual analytic prototype, ForceSPIRE, to support sensemaking. We found that semantic interaction captures the analytical reasoning of the user through keyword weighting, and aids the user in co-creating a spatialization based on the user's reasoning and intuition.
Search Engines: Gateway to a New ``Panopticon''?
NASA Astrophysics Data System (ADS)
Kosta, Eleni; Kalloniatis, Christos; Mitrou, Lilian; Kavakli, Evangelia
Nowadays, Internet users are depending on various search engines in order to be able to find requested information on the Web. Although most users feel that they are and remain anonymous when they place their search queries, reality proves otherwise. The increasing importance of search engines for the location of the desired information on the Internet usually leads to considerable inroads into the privacy of users. The scope of this paper is to study the main privacy issues with regard to search engines, such as the anonymisation of search logs and their retention period, and to examine the applicability of the European data protection legislation to non-EU search engine providers. Ixquick, a privacy-friendly meta search engine will be presented as an alternative to privacy intrusive existing practices of search engines.
NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources
Jonquet, Clement; LePendu, Paea; Falconer, Sean; Coulet, Adrien; Noy, Natalya F.; Musen, Mark A.; Shah, Nigam H.
2011-01-01
The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will enable us to facilitate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index—a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics “under the hood.” PMID:21918645
NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources.
Jonquet, Clement; Lependu, Paea; Falconer, Sean; Coulet, Adrien; Noy, Natalya F; Musen, Mark A; Shah, Nigam H
2011-09-01
The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will enable us to facilitate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index-a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics "under the hood."
Organizational Knowledge Transfer Using Ontologies and a Rule-Based System
NASA Astrophysics Data System (ADS)
Okabe, Masao; Yoshioka, Akiko; Kobayashi, Keido; Yamaguchi, Takahira
In recent automated and integrated manufacturing, so-called intelligence skill is becoming more and more important and its efficient transfer to next-generation engineers is one of the urgent issues. In this paper, we propose a new approach without costly OJT (on-the-job training), that is, combinational usage of a domain ontology, a rule ontology and a rule-based system. Intelligence skill can be decomposed into pieces of simple engineering rules. A rule ontology consists of these engineering rules as primitives and the semantic relations among them. A domain ontology consists of technical terms in the engineering rules and the semantic relations among them. A rule ontology helps novices get the total picture of the intelligence skill and a domain ontology helps them understand the exact meanings of the engineering rules. A rule-based system helps domain experts externalize their tacit intelligence skill to ontologies and also helps novices internalize them. As a case study, we applied our proposal to some actual job at a remote control and maintenance office of hydroelectric power stations in Tokyo Electric Power Co., Inc. We also did an evaluation experiment for this case study and the result supports our proposal.
NASA Astrophysics Data System (ADS)
Stekolschik, Alexander, Prof.
2017-10-01
The bill of materials (BOM), which involves all parts and assemblies of the product, is the core of any mechanical or electronic product. The flexible and integrated management of engineering (Engineering Bill of Materials [eBOM]) and manufacturing (Manufacturing Bill of Materials [mBOM]) structures is the key to the creation of modern products in mechanical engineering companies. This paper presents a method framework for the creation and control of e- and, especially, mBOM. The requirements, resulting from the process of differentiation between companies that produce serialized or engineered-to-order products, are considered in the analysis phase. The main part of the paper describes different approaches to fully or partly automated creation of mBOM. The first approach is the definition of part selection rules in the generic mBOM templates. The mBOM can be derived from the eBOM for partly standardized products by using this method. Another approach is the simultaneous use of semantic rules, options, and parameters in both structures. The implementation of the method framework (selection of use cases) in a standard product lifecycle management (PLM) system is part of the research.
Semantic technologies improving the recall and precision of the Mercury metadata search engine
NASA Astrophysics Data System (ADS)
Pouchard, L. C.; Cook, R. B.; Green, J.; Palanisamy, G.; Noy, N.
2011-12-01
The Mercury federated metadata system [1] was developed at the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC), a NASA-sponsored effort holding datasets about biogeochemical dynamics, ecological data, and environmental processes. Mercury currently indexes over 100,000 records from several data providers conforming to community standards, e.g. EML, FGDC, FGDC Biological Profile, ISO 19115 and DIF. With the breadth of sciences represented in Mercury, the potential exists to address some key interdisciplinary scientific challenges related to climate change, its environmental and ecological impacts, and mitigation of these impacts. However, this wealth of metadata also hinders pinpointing datasets relevant to a particular inquiry. We implemented a semantic solution after concluding that traditional search approaches cannot improve the accuracy of the search results in this domain because: a) unlike everyday queries, scientific queries seek to return specific datasets with numerous parameters that may or may not be exposed to search (Deep Web queries); b) the relevance of a dataset cannot be judged by its popularity, as each scientific inquiry tends to be unique; and c)each domain science has its own terminology, more or less curated, consensual, and standardized depending on the domain. The same terms may refer to different concepts across domains (homonyms), but different terms mean the same thing (synonyms). Interdisciplinary research is arduous because an expert in a domain must become fluent in the language of another, just to find relevant datasets. Thus, we decided to use scientific ontologies because they can provide a context for a free-text search, in a way that string-based keywords never will. With added context, relevant datasets are more easily discoverable. To enable search and programmatic access to ontology entities in Mercury, we are using an instance of the BioPortal ontology repository. Mercury accesses ontology entities using the BioPortal REST API by passing a search parameter to BioPortal that may return domain context, parameter attribute, or entity annotations depending on the entity's associated ontological relationships. As Mercury's facetted search is popular with users, the results are displayed as facets. Unlike a facetted search however, the ontology-based solution implements both restrictions (improving precision) and expansions (improving recall) on the results of the initial search. For instance, "carbon" acquires a scientific context and additional key terms or phrases for discovering domain-specific datasets. A limitation of our solution is that the user must perform an additional step. Another limitation is that the quality of the newly discovered metadata is contingent upon the quality of the ontologies we use. Our solution leverages Mercury's federated capabilities to collect records from heterogeneous domains, and BioPortal's storage, curation and access capabilities for ontology entities. With minimal additional development, our approach builds on two mature systems for finding relevant datasets for interdisciplinary inquiries. We thus indicate a path forward for linking environmental, ecological and biological sciences. References: [1] Devarakonda, R., Palanisamy, G., Wilson, B. E., & Green, J. M. (2010). Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics, 3(1-2), 87-94.
Jones, Andrew R; Siepen, Jennifer A; Hubbard, Simon J; Paton, Norman W
2009-03-01
LC-MS experiments can generate large quantities of data, for which a variety of database search engines are available to make peptide and protein identifications. Decoy databases are becoming widely used to place statistical confidence in result sets, allowing the false discovery rate (FDR) to be estimated. Different search engines produce different identification sets so employing more than one search engine could result in an increased number of peptides (and proteins) being identified, if an appropriate mechanism for combining data can be defined. We have developed a search engine independent score, based on FDR, which allows peptide identifications from different search engines to be combined, called the FDR Score. The results demonstrate that the observed FDR is significantly different when analysing the set of identifications made by all three search engines, by each pair of search engines or by a single search engine. Our algorithm assigns identifications to groups according to the set of search engines that have made the identification, and re-assigns the score (combined FDR Score). The combined FDR Score can differentiate between correct and incorrect peptide identifications with high accuracy, allowing on average 35% more peptide identifications to be made at a fixed FDR than using a single search engine.
The Theory of Planned Behaviour Applied to Search Engines as a Learning Tool
ERIC Educational Resources Information Center
Liaw, Shu-Sheng
2004-01-01
Search engines have been developed for helping learners to seek online information. Based on theory of planned behaviour approach, this research intends to investigate the behaviour of using search engines as a learning tool. After factor analysis, the results suggest that perceived satisfaction of search engine, search engines as an information…
Rice, Grace E; Caswell, Helen; Moore, Perry; Lambon Ralph, Matthew A; Hoffman, Paul
2018-06-06
One critical feature of any well-engineered system is its resilience to perturbation and minor damage. The purpose of the current study was to investigate how resilience is achieved in higher cognitive systems, which we explored through the domain of semantic cognition. Convergent evidence implicates the bilateral anterior temporal lobes (ATLs) as a conceptual knowledge hub. While bilateral damage to this region produces profound semantic impairment, unilateral atrophy/resection or transient perturbation has a limited effect. Two neural mechanisms might underpin this resilience to unilateral ATL damage: 1) the undamaged ATL upregulates its activation in order to compensate; and/or 2) prefrontal regions involved in control of semantic retrieval upregulate to compensate for the impoverished semantic representations that follow from ATL damage. To test these possibilities, 34 postsurgical temporal lobe epilepsy patients and 20 age-matched controls were scanned whilst completing semantic tasks. Pictorial tasks, which produced bilateral frontal and temporal activation, showed few activation differences between patients and control participants. Written word tasks, however, produced a left-lateralized activation pattern and greater differences between the groups. Patients with right ATL resection increased activation in left inferior frontal gyrus (IFG). Patients with left ATL resection upregulated both the right ATL and right IFG. Consistent with recent computational models, these results indicate that 1) written word semantic processing in patients with ATL resection is supported by upregulation of semantic knowledge and control regions, principally in the undamaged hemisphere, and 2) pictorial semantic processing is less affected, presumably because it draws on a more bilateral network.
Guidance of attention by information held in working memory.
Calleja, Marissa Ortiz; Rich, Anina N
2013-05-01
Information held in working memory (WM) can guide attention during visual search. The authors of recent studies have interpreted the effect of holding verbal labels in WM as guidance of visual attention by semantic information. In a series of experiments, we tested how attention is influenced by visual features versus category-level information about complex objects held in WM. Participants either memorized an object's image or its category. While holding this information in memory, they searched for a target in a four-object search display. On exact-match trials, the memorized item reappeared as a distractor in the search display. On category-match trials, another exemplar of the memorized item appeared as a distractor. On neutral trials, none of the distractors were related to the memorized object. We found attentional guidance in visual search on both exact-match and category-match trials in Experiment 1, in which the exemplars were visually similar. When we controlled for visual similarity among the exemplars by using four possible exemplars (Exp. 2) or by using two exemplars rated as being visually dissimilar (Exp. 3), we found attentional guidance only on exact-match trials when participants memorized the object's image. The same pattern of results held when the target was invariant (Exps. 2-3) and when the target was defined semantically and varied in visual features (Exp. 4). The findings of these experiments suggest that attentional guidance by WM requires active visual information.
Choi, Okkyung; Jung, Hanyoung; Moon, Seungbin
2014-01-01
With smartphone distribution becoming common and robotic applications on the rise, social tagging services for various applications including robotic domains have advanced significantly. Though social tagging plays an important role when users are finding the exact information through web search, reliability and semantic relation between web contents and tags are not considered. Spams are making ill use of this aspect and put irrelevant tags deliberately on contents and induce users to advertise contents when they click items of search results. Therefore, this study proposes a detection method for tag-ranking manipulation to solve the problem of the existing methods which cannot guarantee the reliability of tagging. Similarity is measured for ranking the grade of registered tag on the contents, and weighted values of each tag are measured by means of synonym relevance, frequency, and semantic distances between tags. Lastly, experimental evaluation results are provided and its efficiency and accuracy are verified through them.
Neuhaus, Philipp; Doods, Justin; Dugas, Martin
2015-01-01
Automatic coding of medical terms is an important, but highly complicated and laborious task. To compare and evaluate different strategies a framework with a standardized web-interface was created. Two UMLS mapping strategies are compared to demonstrate the interface. The framework is a Java Spring application running on a Tomcat application server. It accepts different parameters and returns results in JSON format. To demonstrate the framework, a list of medical data items was mapped by two different methods: similarity search in a large table of terminology codes versus search in a manually curated repository. These mappings were reviewed by a specialist. The evaluation shows that the framework is flexible (due to standardized interfaces like HTTP and JSON), performant and reliable. Accuracy of automatically assigned codes is limited (up to 40%). Combining different semantic mappers into a standardized Web-API is feasible. This framework can be easily enhanced due to its modular design.
Spiders and Worms and Crawlers, Oh My: Searching on the World Wide Web.
ERIC Educational Resources Information Center
Eagan, Ann; Bender, Laura
Searching on the world wide web can be confusing. A myriad of search engines exist, often with little or no documentation, and many of these search engines work differently from the standard search engines people are accustomed to using. Intended for librarians, this paper defines search engines, directories, spiders, and robots, and covers basics…
Ziaimatin, Hasti; Groza, Tudor; Tudorache, Tania; Hunter, Jane
2016-12-01
Collaboration platforms provide a dynamic environment where the content is subject to ongoing evolution through expert contributions. The knowledge embedded in such platforms is not static as it evolves through incremental refinements - or micro-contributions. Such refinements provide vast resources of tacit knowledge and experience. In our previous work, we proposed and evaluated a Semantic and Time-dependent Expertise Profiling (STEP) approach for capturing expertise from micro-contributions. In this paper we extend our investigation to structured micro-contributions that emerge from an ontology engineering environment, such as the one built for developing the International Classification of Diseases (ICD) revision 11. We take advantage of the semantically related nature of these structured micro-contributions to showcase two major aspects: (i) a novel semantic similarity metric, in addition to an approach for creating bottom-up baseline expertise profiles using expertise centroids; and (ii) the application of STEP in this new environment combined with the use of the same semantic similarity measure to both compare STEP against baseline profiles, as well as to investigate the coverage of these baseline profiles by STEP.
Semantically transparent fingerprinting for right protection of digital cinema
NASA Astrophysics Data System (ADS)
Wu, Xiaolin
2003-06-01
Digital cinema, a new frontier and crown jewel of digital multimedia, has the potential of revolutionizing the science, engineering and business of movie production and distribution. The advantages of digital cinema technology over traditional analog technology are numerous and profound. But without effective and enforceable copyright protection measures, digital cinema can be more susceptible to widespread piracy, which can dampen or even prevent the commercial deployment of digital cinema. In this paper we propose a novel approach of fingerprinting each individual distribution copy of a digital movie for the purpose of tracing pirated copies back to their source. The proposed fingerprinting technique presents a fundamental departure from the traditional digital watermarking/fingerprinting techniques. Its novelty and uniqueness lie in a so-called semantic or subjective transparency property. The fingerprints are created by editing those visual and audio attributes that can be modified with semantic and subjective transparency to the audience. Semantically-transparent fingerprinting or watermarking is the most robust kind among all existing watermarking techniques, because it is content-based not sample-based, and semantically-recoverable not statistically-recoverable.
Dynamics of a macroscopic model characterizing mutualism of search engines and web sites
NASA Astrophysics Data System (ADS)
Wang, Yuanshi; Wu, Hong
2006-05-01
We present a model to describe the mutualism relationship between search engines and web sites. In the model, search engines and web sites benefit from each other while the search engines are derived products of the web sites and cannot survive independently. Our goal is to show strategies for the search engines to survive in the internet market. From mathematical analysis of the model, we show that mutualism does not always result in survival. We show various conditions under which the search engines would tend to extinction, persist or grow explosively. Then by the conditions, we deduce a series of strategies for the search engines to survive in the internet market. We present conditions under which the initial number of consumers of the search engines has little contribution to their persistence, which is in agreement with the results in previous works. Furthermore, we show novel conditions under which the initial value plays an important role in the persistence of the search engines and deduce new strategies. We also give suggestions for the web sites to cooperate with the search engines in order to form a win-win situation.
The use and limits of scientific names in biological informatics.
Remsen, David
2016-01-01
Scientific names serve to label biodiversity information: information related to species. Names, and their underlying taxonomic definitions, however, are unstable and ambiguous. This negatively impacts the utility of names as identifiers and as effective indexing tools in biological informatics where names are commonly utilized for searching, retrieving and integrating information about species. Semiotics provides a general model for describing the relationship between taxon names and taxon concepts. It distinguishes syntactics, which governs relationships among names, from semantics, which represents the relations between those labels and the taxa to which they refer. In the semiotic context, changes in semantics (i.e., taxonomic circumscription) do not consistently result in a corresponding and reflective change in syntax. Further, when syntactic changes do occur, they may be in response to semantic changes or in response to syntactic rules. This lack of consistency in the cardinal relationship between names and taxa places limits on how scientific names may be used in biological informatics in initially anchoring, and in the subsequent retrieval and integration, of relevant biodiversity information. Precision and recall are two measures of relevance. In biological taxonomy, recall is negatively impacted by changes or ambiguity in syntax while precision is negatively impacted when there are changes or ambiguity in semantics. Because changes in syntax are not correlated with changes in semantics, scientific names may be used, singly or conflated into synonymous sets, to improve recall in pattern recognition or search and retrieval. Names cannot be used, however, to improve precision. This is because changes in syntax do not uniquely identify changes in circumscription. These observations place limits on the utility of scientific names within biological informatics applications that rely on names as identifiers for taxa. Taxonomic systems and services used to organize and integrate information about taxa must accommodate the inherent semantic ambiguity of scientific names. The capture and articulation of circumscription differences (i.e., multiple taxon concepts) within such systems must be accompanied with distinct concept identifiers that can be employed in association with, or in replacement of, traditional scientific names.
Utilization of a radiology-centric search engine.
Sharpe, Richard E; Sharpe, Megan; Siegel, Eliot; Siddiqui, Khan
2010-04-01
Internet-based search engines have become a significant component of medical practice. Physicians increasingly rely on information available from search engines as a means to improve patient care, provide better education, and enhance research. Specialized search engines have emerged to more efficiently meet the needs of physicians. Details about the ways in which radiologists utilize search engines have not been documented. The authors categorized every 25th search query in a radiology-centric vertical search engine by radiologic subspecialty, imaging modality, geographic location of access, time of day, use of abbreviations, misspellings, and search language. Musculoskeletal and neurologic imagings were the most frequently searched subspecialties. The least frequently searched were breast imaging, pediatric imaging, and nuclear medicine. Magnetic resonance imaging and computed tomography were the most frequently searched modalities. A majority of searches were initiated in North America, but all continents were represented. Searches occurred 24 h/day in converted local times, with a majority occurring during the normal business day. Misspellings and abbreviations were common. Almost all searches were performed in English. Search engine utilization trends are likely to mirror trends in diagnostic imaging in the region from which searches originate. Internet searching appears to function as a real-time clinical decision-making tool, a research tool, and an educational resource. A more thorough understanding of search utilization patterns can be obtained by analyzing phrases as actually entered as well as the geographic location and time of origination. This knowledge may contribute to the development of more efficient and personalized search engines.
An Exploratory Survey of Student Perspectives Regarding Search Engines
ERIC Educational Resources Information Center
Alshare, Khaled; Miller, Don; Wenger, James
2005-01-01
This study explored college students' perceptions regarding their use of search engines. The main objective was to determine how frequently students used various search engines, whether advanced search features were used, and how many search engines were used. Various factors that might influence student responses were examined. Results showed…
The Use of Web Search Engines in Information Science Research.
ERIC Educational Resources Information Center
Bar-Ilan, Judit
2004-01-01
Reviews the literature on the use of Web search engines in information science research, including: ways users interact with Web search engines; social aspects of searching; structure and dynamic nature of the Web; link analysis; other bibliometric applications; characterizing information on the Web; search engine evaluation and improvement; and…
Using Internet Search Engines to Obtain Medical Information: A Comparative Study
Wang, Liupu; Wang, Juexin; Wang, Michael; Li, Yong; Liang, Yanchun
2012-01-01
Background The Internet has become one of the most important means to obtain health and medical information. It is often the first step in checking for basic information about a disease and its treatment. The search results are often useful to general users. Various search engines such as Google, Yahoo!, Bing, and Ask.com can play an important role in obtaining medical information for both medical professionals and lay people. However, the usability and effectiveness of various search engines for medical information have not been comprehensively compared and evaluated. Objective To compare major Internet search engines in their usability of obtaining medical and health information. Methods We applied usability testing as a software engineering technique and a standard industry practice to compare the four major search engines (Google, Yahoo!, Bing, and Ask.com) in obtaining health and medical information. For this purpose, we searched the keyword breast cancer in Google, Yahoo!, Bing, and Ask.com and saved the results of the top 200 links from each search engine. We combined nonredundant links from the four search engines and gave them to volunteer users in an alphabetical order. The volunteer users evaluated the websites and scored each website from 0 to 10 (lowest to highest) based on the usefulness of the content relevant to breast cancer. A medical expert identified six well-known websites related to breast cancer in advance as standards. We also used five keywords associated with breast cancer defined in the latest release of Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) and analyzed their occurrence in the websites. Results Each search engine provided rich information related to breast cancer in the search results. All six standard websites were among the top 30 in search results of all four search engines. Google had the best search validity (in terms of whether a website could be opened), followed by Bing, Ask.com, and Yahoo!. The search results highly overlapped between the search engines, and the overlap between any two search engines was about half or more. On the other hand, each search engine emphasized various types of content differently. In terms of user satisfaction analysis, volunteer users scored Bing the highest for its usefulness, followed by Yahoo!, Google, and Ask.com. Conclusions Google, Yahoo!, Bing, and Ask.com are by and large effective search engines for helping lay users get health and medical information. Nevertheless, the current ranking methods have some pitfalls and there is room for improvement to help users get more accurate and useful information. We suggest that search engine users explore multiple search engines to search different types of health information and medical knowledge for their own needs and get a professional consultation if necessary. PMID:22672889
Using Internet search engines to obtain medical information: a comparative study.
Wang, Liupu; Wang, Juexin; Wang, Michael; Li, Yong; Liang, Yanchun; Xu, Dong
2012-05-16
The Internet has become one of the most important means to obtain health and medical information. It is often the first step in checking for basic information about a disease and its treatment. The search results are often useful to general users. Various search engines such as Google, Yahoo!, Bing, and Ask.com can play an important role in obtaining medical information for both medical professionals and lay people. However, the usability and effectiveness of various search engines for medical information have not been comprehensively compared and evaluated. To compare major Internet search engines in their usability of obtaining medical and health information. We applied usability testing as a software engineering technique and a standard industry practice to compare the four major search engines (Google, Yahoo!, Bing, and Ask.com) in obtaining health and medical information. For this purpose, we searched the keyword breast cancer in Google, Yahoo!, Bing, and Ask.com and saved the results of the top 200 links from each search engine. We combined nonredundant links from the four search engines and gave them to volunteer users in an alphabetical order. The volunteer users evaluated the websites and scored each website from 0 to 10 (lowest to highest) based on the usefulness of the content relevant to breast cancer. A medical expert identified six well-known websites related to breast cancer in advance as standards. We also used five keywords associated with breast cancer defined in the latest release of Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) and analyzed their occurrence in the websites. Each search engine provided rich information related to breast cancer in the search results. All six standard websites were among the top 30 in search results of all four search engines. Google had the best search validity (in terms of whether a website could be opened), followed by Bing, Ask.com, and Yahoo!. The search results highly overlapped between the search engines, and the overlap between any two search engines was about half or more. On the other hand, each search engine emphasized various types of content differently. In terms of user satisfaction analysis, volunteer users scored Bing the highest for its usefulness, followed by Yahoo!, Google, and Ask.com. Google, Yahoo!, Bing, and Ask.com are by and large effective search engines for helping lay users get health and medical information. Nevertheless, the current ranking methods have some pitfalls and there is room for improvement to help users get more accurate and useful information. We suggest that search engine users explore multiple search engines to search different types of health information and medical knowledge for their own needs and get a professional consultation if necessary.
NASA Indexing Benchmarks: Evaluating Text Search Engines
NASA Technical Reports Server (NTRS)
Esler, Sandra L.; Nelson, Michael L.
1997-01-01
The current proliferation of on-line information resources underscores the requirement for the ability to index collections of information and search and retrieve them in a convenient manner. This study develops criteria for analytically comparing the index and search engines and presents results for a number of freely available search engines. A product of this research is a toolkit capable of automatically indexing, searching, and extracting performance statistics from each of the focused search engines. This toolkit is highly configurable and has the ability to run these benchmark tests against other engines as well. Results demonstrate that the tested search engines can be grouped into two levels. Level one engines are efficient on small to medium sized data collections, but show weaknesses when used for collections 100MB or larger. Level two search engines are recommended for data collections up to and beyond 100MB.
Borlawsky, Tara B.; Dhaval, Rakesh; Hastings, Shannon L.; Payne, Philip R. O.
2009-01-01
In October 2006, the National Institutes of Health launched a new national consortium, funded through Clinical and Translational Science Awards (CTSA), with the primary objective of improving the conduct and efficiency of the inherently multi-disciplinary field of translational research. To help meet this goal, the Ohio State University Center for Clinical and Translational Science has launched a knowledge management initiative that is focused on facilitating widespread semantic interoperability among administrative, basic science, clinical and research computing systems, both internally and among the translational research community at-large, through the integration of domain-specific standard terminologies and ontologies with local annotations. This manuscript describes an agile framework that builds upon prevailing knowledge engineering and semantic interoperability methods, and will be implemented as part this initiative. PMID:21347164
Borlawsky, Tara B; Dhaval, Rakesh; Hastings, Shannon L; Payne, Philip R O
2009-03-01
In October 2006, the National Institutes of Health launched a new national consortium, funded through Clinical and Translational Science Awards (CTSA), with the primary objective of improving the conduct and efficiency of the inherently multi-disciplinary field of translational research. To help meet this goal, the Ohio State University Center for Clinical and Translational Science has launched a knowledge management initiative that is focused on facilitating widespread semantic interoperability among administrative, basic science, clinical and research computing systems, both internally and among the translational research community at-large, through the integration of domain-specific standard terminologies and ontologies with local annotations. This manuscript describes an agile framework that builds upon prevailing knowledge engineering and semantic interoperability methods, and will be implemented as part this initiative.
Conceptual mapping of user's queries to medical subject headings.
Zieman, Y. L.; Bleich, H. L.
1997-01-01
This paper describes a way to map users' queries to relevant Medical Subject Headings (MeSH terms) used by the National Library of Medicine to index the biomedical literature. The method, called SENSE (SEarch with New SEmantics), transforms words and phrases in the users' queries into primary conceptual components and compares these components with those of the MeSH vocabulary. Similar to the way in which most numbers can be split into numerical factors and expressed as their product--for example, 42 can be expressed as 2*21, 6*7, 3*14, 2*3*7,--so most medical concepts can be split into "semantic factors" and expressed as their juxtaposition. Note that if we split 42 into its primary factors, the breakdown is unique: 2*3*7. Similarly, when we split medical concepts into their "primary semantic factors" the breakdown is also unique. For example, the MeSH term 'renovascular hypertension' can be split morphologically into reno, vascular, hyper, and tension--morphemes that can then be translated into their primary semantic factors--kidney, blood vessel, high, and pressure. By "factoring" each MeSH term in this way, and by similarly factoring the user's query, we can match query to MeSH term by searching for combinations of common factors. Unlike UMLS and other methods that match at the level of words or phrases, SENSE matches at the level of concepts; in this way, a wide variety of words and phrases that have the same meaning produce the same match. Now used in PaperChase, the method is surprisingly powerful in matching users' queries to Medical Subject Headings. PMID:9357680
How activation, entanglement, and searching a semantic network contribute to event memory.
Nelson, Douglas L; Kitto, Kirsty; Galea, David; McEvoy, Cathy L; Bruza, Peter D
2013-08-01
Free-association norms indicate that words are organized into semantic/associative neighborhoods within a larger network of words and links that bind the net together. We present evidence indicating that memory for a recent word event can depend on implicitly and simultaneously activating related words in its neighborhood. Processing a word during encoding primes its network representation as a function of the density of the links in its neighborhood. Such priming increases recall and recognition and can have long-lasting effects when the word is processed in working memory. Evidence for this phenomenon is reviewed in extralist-cuing, primed free-association, intralist-cuing, and single-item recognition tasks. The findings also show that when a related word is presented in order to cue the recall of a studied word, the cue activates the target in an array of related words that distract and reduce the probability of the target's selection. The activation of the semantic network produces priming benefits during encoding, and search costs during retrieval. In extralist cuing, recall is a negative function of cue-to-distractor strength, and a positive function of neighborhood density, cue-to-target strength, and target-to-cue strength. We show how these four measures derived from the network can be combined and used to predict memory performance. These measures play different roles in different tasks, indicating that the contribution of the semantic network varies with the context provided by the task. Finally, we evaluate spreading-activation and quantum-like entanglement explanations for the priming effects produced by neighborhood density.
Decision theory for computing variable and value ordering decisions for scheduling problems
NASA Technical Reports Server (NTRS)
Linden, Theodore A.
1993-01-01
Heuristics that guide search are critical when solving large planning and scheduling problems, but most variable and value ordering heuristics are sensitive to only one feature of the search state. One wants to combine evidence from all features of the search state into a subjective probability that a value choice is best, but there has been no solid semantics for merging evidence when it is conceived in these terms. Instead, variable and value ordering decisions should be viewed as problems in decision theory. This led to two key insights: (1) The fundamental concept that allows heuristic evidence to be merged is the net incremental utility that will be achieved by assigning a value to a variable. Probability distributions about net incremental utility can merge evidence from the utility function, binary constraints, resource constraints, and other problem features. The subjective probability that a value is the best choice is then derived from probability distributions about net incremental utility. (2) The methods used for rumor control in Bayesian Networks are the primary way to prevent cycling in the computation of probable net incremental utility. These insights lead to semantically justifiable ways to compute heuristic variable and value ordering decisions that merge evidence from all available features of the search state.
ERIC Educational Resources Information Center
Rushton, Erin E.; Kelehan, Martha Daisy; Strong, Marcy A.
2008-01-01
Search engine use is one of the most popular online activities. According to a recent OCLC report, nearly all students start their electronic research using a search engine instead of the library Web site. Instead of viewing search engines as competition, however, librarians at Binghamton University Libraries decided to employ search engine…
A new visual navigation system for exploring biomedical Open Educational Resource (OER) videos
Zhao, Baoquan; Xu, Songhua; Lin, Shujin; Luo, Xiaonan; Duan, Lian
2016-01-01
Objective Biomedical videos as open educational resources (OERs) are increasingly proliferating on the Internet. Unfortunately, seeking personally valuable content from among the vast corpus of quality yet diverse OER videos is nontrivial due to limitations of today’s keyword- and content-based video retrieval techniques. To address this need, this study introduces a novel visual navigation system that facilitates users’ information seeking from biomedical OER videos in mass quantity by interactively offering visual and textual navigational clues that are both semantically revealing and user-friendly. Materials and Methods The authors collected and processed around 25 000 YouTube videos, which collectively last for a total length of about 4000 h, in the broad field of biomedical sciences for our experiment. For each video, its semantic clues are first extracted automatically through computationally analyzing audio and visual signals, as well as text either accompanying or embedded in the video. These extracted clues are subsequently stored in a metadata database and indexed by a high-performance text search engine. During the online retrieval stage, the system renders video search results as dynamic web pages using a JavaScript library that allows users to interactively and intuitively explore video content both efficiently and effectively. Results The authors produced a prototype implementation of the proposed system, which is publicly accessible at https://patentq.njit.edu/oer. To examine the overall advantage of the proposed system for exploring biomedical OER videos, the authors further conducted a user study of a modest scale. The study results encouragingly demonstrate the functional effectiveness and user-friendliness of the new system for facilitating information seeking from and content exploration among massive biomedical OER videos. Conclusion Using the proposed tool, users can efficiently and effectively find videos of interest, precisely locate video segments delivering personally valuable information, as well as intuitively and conveniently preview essential content of a single or a collection of videos. PMID:26335986
Teen smoking cessation help via the Internet: a survey of search engines.
Edwards, Christine C; Elliott, Sean P; Conway, Terry L; Woodruff, Susan I
2003-07-01
The objective of this study was to assess Web sites related to teen smoking cessation on the Internet. Seven Internet search engines were searched using the keywords teen quit smoking. The top 20 hits from each search engine were reviewed and categorized. The keywords teen quit smoking produced between 35 and 400,000 hits depending on the search engine. Of 140 potential hits, 62% were active, unique sites; 85% were listed by only one search engine; and 40% focused on cessation. Findings suggest that legitimate on-line smoking cessation help for teens is constrained by search engine choice and the amount of time teens spend looking through potential sites. Resource listings should be updated regularly. Smoking cessation Web sites need to be picked up on multiple search engine searches. Further evaluation of smoking cessation Web sites need to be conducted to identify the most effective help for teens.
Discovering discovery patterns with Predication-based Semantic Indexing.
Cohen, Trevor; Widdows, Dominic; Schvaneveldt, Roger W; Davies, Peter; Rindflesch, Thomas C
2012-12-01
In this paper we utilize methods of hyperdimensional computing to mediate the identification of therapeutically useful connections for the purpose of literature-based discovery. Our approach, named Predication-based Semantic Indexing, is utilized to identify empirically sequences of relationships known as "discovery patterns", such as "drug x INHIBITS substance y, substance y CAUSES disease z" that link pharmaceutical substances to diseases they are known to treat. These sequences are derived from semantic predications extracted from the biomedical literature by the SemRep system, and subsequently utilized to direct the search for known treatments for a held out set of diseases. Rapid and efficient inference is accomplished through the application of geometric operators in PSI space, allowing for both the derivation of discovery patterns from a large set of known TREATS relationships, and the application of these discovered patterns to constrain search for therapeutic relationships at scale. Our results include the rediscovery of discovery patterns that have been constructed manually by other authors in previous research, as well as the discovery of a set of previously unrecognized patterns. The application of these patterns to direct search through PSI space results in better recovery of therapeutic relationships than is accomplished with models based on distributional statistics alone. These results demonstrate the utility of efficient approximate inference in geometric space as a means to identify therapeutic relationships, suggesting a role of these methods in drug repurposing efforts. In addition, the results provide strong support for the utility of the discovery pattern approach pioneered by Hristovski and his colleagues. Copyright © 2012 Elsevier Inc. All rights reserved.
Discovering discovery patterns with predication-based Semantic Indexing
Cohen, Trevor; Widdows, Dominic; Schvaneveldt, Roger W.; Davies, Peter; Rindflesch, Thomas C.
2012-01-01
In this paper we utilize methods of hyperdimensional computing to mediate the identification of therapeutically useful connections for the purpose of literature-based discovery. Our approach, named Predication-based Semantic Indexing, is utilized to identify empirically sequences of relationships known as “discovery patterns”, such as “drug x INHIBITS substance y, substance y CAUSES disease z” that link pharmaceutical substances to diseases they are known to treat. These sequences are derived from semantic predications extracted from the biomedical literature by the SemRep system, and subsequently utilized to direct the search for known treatments for a held out set of diseases. Rapid and efficient inference is accomplished through the application of geometric operators in PSI space, allowing for both the derivation of discovery patterns from a large set of known TREATS relationships, and the application of these discovered patterns to constrain search for therapeutic relationships at scale. Our results include the rediscovery of discovery patterns that have been constructed manually by other authors in previous research, as well as the discovery of a set of previously unrecognized patterns. The application of these patterns to direct search through PSI space results in better recovery of therapeutic relationships than is accomplished with models based on distributional statistics alone. These results demonstrate the utility of efficient approximate inference in geometric space as a means to identify therapeutic relationships, suggesting a role of these methods in drug repurposing efforts. In addition, the results provide strong support for the utility of the discovery pattern approach pioneered by Hristovski and his colleagues. PMID:22841748
[Development of domain specific search engines].
Takai, T; Tokunaga, M; Maeda, K; Kaminuma, T
2000-01-01
As cyber space exploding in a pace that nobody has ever imagined, it becomes very important to search cyber space efficiently and effectively. One solution to this problem is search engines. Already a lot of commercial search engines have been put on the market. However these search engines respond with such cumbersome results that domain specific experts can not tolerate. Using a dedicate hardware and a commercial software called OpenText, we have tried to develop several domain specific search engines. These engines are for our institute's Web contents, drugs, chemical safety, endocrine disruptors, and emergent response for chemical hazard. These engines have been on our Web site for testing.
Jones, Andrew R.; Siepen, Jennifer A.; Hubbard, Simon J.; Paton, Norman W.
2010-01-01
Tandem mass spectrometry, run in combination with liquid chromatography (LC-MS/MS), can generate large numbers of peptide and protein identifications, for which a variety of database search engines are available. Distinguishing correct identifications from false positives is far from trivial because all data sets are noisy, and tend to be too large for manual inspection, therefore probabilistic methods must be employed to balance the trade-off between sensitivity and specificity. Decoy databases are becoming widely used to place statistical confidence in results sets, allowing the false discovery rate (FDR) to be estimated. It has previously been demonstrated that different MS search engines produce different peptide identification sets, and as such, employing more than one search engine could result in an increased number of peptides being identified. However, such efforts are hindered by the lack of a single scoring framework employed by all search engines. We have developed a search engine independent scoring framework based on FDR which allows peptide identifications from different search engines to be combined, called the FDRScore. We observe that peptide identifications made by three search engines are infrequently false positives, and identifications made by only a single search engine, even with a strong score from the source search engine, are significantly more likely to be false positives. We have developed a second score based on the FDR within peptide identifications grouped according to the set of search engines that have made the identification, called the combined FDRScore. We demonstrate by searching large publicly available data sets that the combined FDRScore can differentiate between between correct and incorrect peptide identifications with high accuracy, allowing on average 35% more peptide identifications to be made at a fixed FDR than using a single search engine. PMID:19253293
Kobayashi, Norio; Ishii, Manabu; Takahashi, Satoshi; Mochizuki, Yoshiki; Matsushima, Akihiro; Toyoda, Tetsuro
2011-01-01
Global cloud frameworks for bioinformatics research databases become huge and heterogeneous; solutions face various diametric challenges comprising cross-integration, retrieval, security and openness. To address this, as of March 2011 organizations including RIKEN published 192 mammalian, plant and protein life sciences databases having 8.2 million data records, integrated as Linked Open or Private Data (LOD/LPD) using SciNetS.org, the Scientists' Networking System. The huge quantity of linked data this database integration framework covers is based on the Semantic Web, where researchers collaborate by managing metadata across public and private databases in a secured data space. This outstripped the data query capacity of existing interface tools like SPARQL. Actual research also requires specialized tools for data analysis using raw original data. To solve these challenges, in December 2009 we developed the lightweight Semantic-JSON interface to access each fragment of linked and raw life sciences data securely under the control of programming languages popularly used by bioinformaticians such as Perl and Ruby. Researchers successfully used the interface across 28 million semantic relationships for biological applications including genome design, sequence processing, inference over phenotype databases, full-text search indexing and human-readable contents like ontology and LOD tree viewers. Semantic-JSON services of SciNetS.org are provided at http://semanticjson.org. PMID:21632604
Psychophysical studies of the performance of an image database retrieval system
NASA Astrophysics Data System (ADS)
Papathomas, Thomas V.; Conway, Tiffany E.; Cox, Ingemar J.; Ghosn, Joumana; Miller, Matt L.; Minka, Thomas P.; Yianilos, Peter N.
1998-07-01
We describe psychophysical experiments conducted to study PicHunter, a content-based image retrieval (CBIR) system. Experiment 1 studies the importance of using (a) semantic information, (2) memory of earlier input and (3) relative, rather than absolute, judgements of image similarity. The target testing paradigm is used in which a user must search for an image identical to a target. We find that the best performance comes from a version of PicHunter that uses only semantic cues, with memory and relative similarity judgements. Second best is use of both pictorial and semantic cues, with memory and relative similarity judgements. Most reports of CBIR systems provide only qualitative measures of performance based on how similar retrieved images are to a target. Experiment 2 puts PicHunter into this context with a more rigorous test. We first establish a baseline for our database by measuring the time required to find an image that is similar to a target when the images are presented in random order. Although PicHunter's performance is measurably better than this, the test is weak because even random presentation of images yields reasonably short search times. This casts doubt on the strength of results given in other reports where no baseline is established.
Brockmole, James R; Le-Hoa Võ, Melissa
2010-10-01
When encountering familiar scenes, observers can use item-specific memory to facilitate the guidance of attention to objects appearing in known locations or configurations. Here, we investigated how memory for relational contingencies that emerge across different scenes can be exploited to guide attention. Participants searched for letter targets embedded in pictures of bedrooms. In a between-subjects manipulation, targets were either always on a bed pillow or randomly positioned. When targets were systematically located within scenes, search for targets became more efficient. Importantly, this learning transferred to bedrooms without pillows, ruling out learning that is based on perceptual contingencies. Learning also transferred to living room scenes, but it did not transfer to kitchen scenes, even though both scene types contained pillows. These results suggest that statistical regularities abstracted across a range of stimuli are governed by semantic expectations regarding the presence of target-predicting local landmarks. Moreover, explicit awareness of these contingencies led to a central tendency bias in recall memory for precise target positions that is similar to the spatial category effects observed in landmark memory. These results broaden the scope of conditions under which contextual cuing operates and demonstrate how semantic memory plays a causal and independent role in the learning of associations between objects in real-world scenes.
Processable English: The Theory Behind the PENG System
2009-06-01
implicit - is often buried amongst masses of irrelevant data. Heralding from unstructured sources such as natural language documents, email, audio ...estimation and prediction, data-mining, social network analysis, and semantic search and visualisation . This report describes the theoretical
ERIC Educational Resources Information Center
El Guemmat, Kamal; Ouahabi, Sara
2018-01-01
The objective of this article is to analyze the searching and indexing techniques of educational search engines' implementation while treating future challenges. Educational search engines could greatly help in the effectiveness of e-learning if used correctly. However, these engines have several gaps which influence the performance of e-learning…
Drexel at TREC 2014 Federated Web Search Track
2014-11-01
of its input RS results. 1. INTRODUCTION Federated Web Search is the task of searching multiple search engines simultaneously and combining their...or distributed properly[5]. The goal of RS is then, for a given query, to select only the most promising search engines from all those available. Most...result pages of 149 search engines . 4000 queries are used in building the sample set. As a part of the Vertical Selection task, search engines are
ERIC Educational Resources Information Center
Michael, Kurt Y.; Alsup, Philip R.
2016-01-01
Research focusing on science, technology, engineering, and math (STEM) education among conservative Protestant Christian school students is scarce. Crenshaw's intersectionality theory is examined as it pertains to religion as a group identifier. The STEM Semantic Survey was completed by 157 middle school students attending six different private…
A Search Relevance Algorithm for Weather Effects Products
2006-12-29
accessed) are often search engines [4] [5]. This suggests that people are navigating the internet by searching and not through the traditional...geographic location. Unlike traditional search engines a Federated Search Engine does not scour all the data available and return matches. Instead...gold standard in search engines . However, its ranking system is based, largely, on a measure of interconnectedness. A page that is referenced more
Semantic Edge Based Disparity Estimation Using Adaptive Dynamic Programming for Binocular Sensors
Zhu, Dongchen; Li, Jiamao; Wang, Xianshun; Peng, Jingquan; Shi, Wenjun; Zhang, Xiaolin
2018-01-01
Disparity calculation is crucial for binocular sensor ranging. The disparity estimation based on edges is an important branch in the research of sparse stereo matching and plays an important role in visual navigation. In this paper, we propose a robust sparse stereo matching method based on the semantic edges. Some simple matching costs are used first, and then a novel adaptive dynamic programming algorithm is proposed to obtain optimal solutions. This algorithm makes use of the disparity or semantic consistency constraint between the stereo images to adaptively search parameters, which can improve the robustness of our method. The proposed method is compared quantitatively and qualitatively with the traditional dynamic programming method, some dense stereo matching methods, and the advanced edge-based method respectively. Experiments show that our method can provide superior performance on the above comparison. PMID:29614028
Semantic Edge Based Disparity Estimation Using Adaptive Dynamic Programming for Binocular Sensors.
Zhu, Dongchen; Li, Jiamao; Wang, Xianshun; Peng, Jingquan; Shi, Wenjun; Zhang, Xiaolin
2018-04-03
Disparity calculation is crucial for binocular sensor ranging. The disparity estimation based on edges is an important branch in the research of sparse stereo matching and plays an important role in visual navigation. In this paper, we propose a robust sparse stereo matching method based on the semantic edges. Some simple matching costs are used first, and then a novel adaptive dynamic programming algorithm is proposed to obtain optimal solutions. This algorithm makes use of the disparity or semantic consistency constraint between the stereo images to adaptively search parameters, which can improve the robustness of our method. The proposed method is compared quantitatively and qualitatively with the traditional dynamic programming method, some dense stereo matching methods, and the advanced edge-based method respectively. Experiments show that our method can provide superior performance on the above comparison.
SAS- Semantic Annotation Service for Geoscience resources on the web
NASA Astrophysics Data System (ADS)
Elag, M.; Kumar, P.; Marini, L.; Li, R.; Jiang, P.
2015-12-01
There is a growing need for increased integration across the data and model resources that are disseminated on the web to advance their reuse across different earth science applications. Meaningful reuse of resources requires semantic metadata to realize the semantic web vision for allowing pragmatic linkage and integration among resources. Semantic metadata associates standard metadata with resources to turn them into semantically-enabled resources on the web. However, the lack of a common standardized metadata framework as well as the uncoordinated use of metadata fields across different geo-information systems, has led to a situation in which standards and related Standard Names abound. To address this need, we have designed SAS to provide a bridge between the core ontologies required to annotate resources and information systems in order to enable queries and analysis over annotation from a single environment (web). SAS is one of the services that are provided by the Geosematnic framework, which is a decentralized semantic framework to support the integration between models and data and allow semantically heterogeneous to interact with minimum human intervention. Here we present the design of SAS and demonstrate its application for annotating data and models. First we describe how predicates and their attributes are extracted from standards and ingested in the knowledge-base of the Geosemantic framework. Then we illustrate the application of SAS in annotating data managed by SEAD and annotating simulation models that have web interface. SAS is a step in a broader approach to raise the quality of geoscience data and models that are published on the web and allow users to better search, access, and use of the existing resources based on standard vocabularies that are encoded and published using semantic technologies.
E-Government Goes Semantic Web: How Administrations Can Transform Their Information Processes
NASA Astrophysics Data System (ADS)
Klischewski, Ralf; Ukena, Stefan
E-government applications and services are built mainly on access to, retrieval of, integration of, and delivery of relevant information to citizens, businesses, and administrative users. In order to perform such information processing automatically through the Semantic Web,1 machine-readable2 enhancements of web resources are needed, based on the understanding of the content and context of the information in focus. While these enhancements are far from trivial to produce, administrations in their role of information and service providers so far find little guidance on how to migrate their web resources and enable a new quality of information processing; even research is still seeking best practices. Therefore, the underlying research question of this chapter is: what are the appropriate approaches which guide administrations in transforming their information processes toward the Semantic Web? In search for answers, this chapter analyzes the challenges and possible solutions from the perspective of administrations: (a) the reconstruction of the information processing in the e-government in terms of how semantic technologies must be employed to support information provision and consumption through the Semantic Web; (b) the required contribution to the transformation is compared to the capabilities and expectations of administrations; and (c) available experience with the steps of transformation are reviewed and discussed as to what extent they can be expected to successfully drive the e-government to the Semantic Web. This research builds on studying the case of Schleswig-Holstein, Germany, where semantic technologies have been used within the frame of the Access-eGov3 project in order to semantically enhance electronic service interfaces with the aim of providing a new way of accessing and combining e-government services.
Druzinsky, Robert E.; Balhoff, James P.; Crompton, Alfred W.; Done, James; German, Rebecca Z.; Haendel, Melissa A.; Herrel, Anthony; Herring, Susan W.; Lapp, Hilmar; Mabee, Paula M.; Muller, Hans-Michael; Mungall, Christopher J.; Sternberg, Paul W.; Van Auken, Kimberly; Vinyard, Christopher J.; Williams, Susan H.; Wall, Christine E.
2016-01-01
Background In recent years large bibliographic databases have made much of the published literature of biology available for searches. However, the capabilities of the search engines integrated into these databases for text-based bibliographic searches are limited. To enable searches that deliver the results expected by comparative anatomists, an underlying logical structure known as an ontology is required. Development and Testing of the Ontology Here we present the Mammalian Feeding Muscle Ontology (MFMO), a multi-species ontology focused on anatomical structures that participate in feeding and other oral/pharyngeal behaviors. A unique feature of the MFMO is that a simple, computable, definition of each muscle, which includes its attachments and innervation, is true across mammals. This construction mirrors the logical foundation of comparative anatomy and permits searches using language familiar to biologists. Further, it provides a template for muscles that will be useful in extending any anatomy ontology. The MFMO is developed to support the Feeding Experiments End-User Database Project (FEED, https://feedexp.org/), a publicly-available, online repository for physiological data collected from in vivo studies of feeding (e.g., mastication, biting, swallowing) in mammals. Currently the MFMO is integrated into FEED and also into two literature-specific implementations of Textpresso, a text-mining system that facilitates powerful searches of a corpus of scientific publications. We evaluate the MFMO by asking questions that test the ability of the ontology to return appropriate answers (competency questions). We compare the results of queries of the MFMO to results from similar searches in PubMed and Google Scholar. Results and Significance Our tests demonstrate that the MFMO is competent to answer queries formed in the common language of comparative anatomy, but PubMed and Google Scholar are not. Overall, our results show that by incorporating anatomical ontologies into searches, an expanded and anatomically comprehensive set of results can be obtained. The broader scientific and publishing communities should consider taking up the challenge of semantically enabled search capabilities. PMID:26870952
Current Searching Methodology and Retrieval Issues: An Assessment
2008-03-01
searching that are used by search engines are discussed. They are: full text searching, i.e., the searching of unstructured data, and metadata searching...also found among search engines ; however, it is the popularity of full text searching that has changed the road map to information access. The...other hand, information seekers’ willingness, or lack of, to learn the multiple search engines ’ capabilities may diminish their search results
ERIC Educational Resources Information Center
Kurtz, Michael J.; Eichorn, Guenther; Accomazzi, Alberto; Grant, Carolyn S.; Demleitner, Markus; Murray, Stephen S.; Jones, Michael L. W.; Gay, Geri K.; Rieger, Robert H.; Millman, David; Bruggemann-Klein, Anne; Klein, Rolf; Landgraf, Britta; Wang, James Ze; Li, Jia; Chan, Desmond; Wiederhold, Gio; Pitti, Daniel V.
1999-01-01
Includes six articles that discuss a digital library for astronomy; comparing evaluations of digital collection efforts; cross-organizational access management of Web-based resources; searching scientific bibliographic databases based on content-based relations between documents; semantics-sensitive retrieval for digital picture libraries; and…
User centered and ontology based information retrieval system for life sciences.
Sy, Mohameth-François; Ranwez, Sylvie; Montmain, Jacky; Regnault, Armelle; Crampes, Michel; Ranwez, Vincent
2012-01-25
Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Some improvements have been offered by semantic Web technologies and applications based on domain ontologies. In life science, for instance, the Gene Ontology is widely exploited in genomic applications and the Medical Subject Headings is the basis of biomedical publications indexation and information retrieval process proposed by PubMed. However current search engines suffer from two main drawbacks: there is limited user interaction with the list of retrieved resources and no explanation for their adequacy to the query is provided. Users may thus be confused by the selection and have no idea on how to adapt their queries so that the results match their expectations. This paper describes an information retrieval system that relies on domain ontology to widen the set of relevant documents that is retrieved and that uses a graphical rendering of query results to favor user interactions. Semantic proximities between ontology concepts and aggregating models are used to assess documents adequacy with respect to a query. The selection of documents is displayed in a semantic map to provide graphical indications that make explicit to what extent they match the user's query; this man/machine interface favors a more interactive and iterative exploration of data corpus, by facilitating query concepts weighting and visual explanation. We illustrate the benefit of using this information retrieval system on two case studies one of which aiming at collecting human genes related to transcription factors involved in hemopoiesis pathway. The ontology based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/. This environment is a first step towards a user centred application in which the system enlightens relevant information to provide decision help.
User centered and ontology based information retrieval system for life sciences
2012-01-01
Background Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Some improvements have been offered by semantic Web technologies and applications based on domain ontologies. In life science, for instance, the Gene Ontology is widely exploited in genomic applications and the Medical Subject Headings is the basis of biomedical publications indexation and information retrieval process proposed by PubMed. However current search engines suffer from two main drawbacks: there is limited user interaction with the list of retrieved resources and no explanation for their adequacy to the query is provided. Users may thus be confused by the selection and have no idea on how to adapt their queries so that the results match their expectations. Results This paper describes an information retrieval system that relies on domain ontology to widen the set of relevant documents that is retrieved and that uses a graphical rendering of query results to favor user interactions. Semantic proximities between ontology concepts and aggregating models are used to assess documents adequacy with respect to a query. The selection of documents is displayed in a semantic map to provide graphical indications that make explicit to what extent they match the user's query; this man/machine interface favors a more interactive and iterative exploration of data corpus, by facilitating query concepts weighting and visual explanation. We illustrate the benefit of using this information retrieval system on two case studies one of which aiming at collecting human genes related to transcription factors involved in hemopoiesis pathway. Conclusions The ontology based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/. This environment is a first step towards a user centred application in which the system enlightens relevant information to provide decision help. PMID:22373375
Foraging patterns in online searches.
Wang, Xiangwen; Pleimling, Michel
2017-03-01
Nowadays online searches are undeniably the most common form of information gathering, as witnessed by billions of clicks generated each day on search engines. In this work we describe online searches as foraging processes that take place on the semi-infinite line. Using a variety of quantities like probability distributions and complementary cumulative distribution functions of step length and waiting time as well as mean square displacements and entropies, we analyze three different click-through logs that contain the detailed information of millions of queries submitted to search engines. Notable differences between the different logs reveal an increased efficiency of the search engines. In the language of foraging, the newer logs indicate that online searches overwhelmingly yield local searches (i.e., on one page of links provided by the search engines), whereas for the older logs the foraging processes are a combination of local searches and relocation phases that are power law distributed. Our investigation of click logs of search engines therefore highlights the presence of intermittent search processes (where phases of local explorations are separated by power law distributed relocation jumps) in online searches. It follows that good search engines enable the users to find the information they are looking for through a local exploration of a single page with search results, whereas for poor search engine users are often forced to do a broader exploration of different pages.
Foraging patterns in online searches
NASA Astrophysics Data System (ADS)
Wang, Xiangwen; Pleimling, Michel
2017-03-01
Nowadays online searches are undeniably the most common form of information gathering, as witnessed by billions of clicks generated each day on search engines. In this work we describe online searches as foraging processes that take place on the semi-infinite line. Using a variety of quantities like probability distributions and complementary cumulative distribution functions of step length and waiting time as well as mean square displacements and entropies, we analyze three different click-through logs that contain the detailed information of millions of queries submitted to search engines. Notable differences between the different logs reveal an increased efficiency of the search engines. In the language of foraging, the newer logs indicate that online searches overwhelmingly yield local searches (i.e., on one page of links provided by the search engines), whereas for the older logs the foraging processes are a combination of local searches and relocation phases that are power law distributed. Our investigation of click logs of search engines therefore highlights the presence of intermittent search processes (where phases of local explorations are separated by power law distributed relocation jumps) in online searches. It follows that good search engines enable the users to find the information they are looking for through a local exploration of a single page with search results, whereas for poor search engine users are often forced to do a broader exploration of different pages.
NASA Astrophysics Data System (ADS)
Albeke, S. E.; Perkins, D. G.; Ewers, S. L.; Ewers, B. E.; Holbrook, W. S.; Miller, S. N.
2015-12-01
The sharing of data and results is paramount for advancing scientific research. The Wyoming Center for Environmental Hydrology and Geophysics (WyCEHG) is a multidisciplinary group that is driving scientific breakthroughs to help manage water resources in the Western United States. WyCEHG is mandated by the National Science Foundation (NSF) to share their data. However, the infrastructure from which to share such diverse, complex and massive amounts of data did not exist within the University of Wyoming. We developed an innovative framework to meet the data organization, sharing, and discovery requirements of WyCEHG by integrating both open and closed source software, embedded metadata tags, semantic web technologies, and a web-mapping application. The infrastructure uses a Relational Database Management System as the foundation, providing a versatile platform to store, organize, and query myriad datasets, taking advantage of both structured and unstructured formats. Detailed metadata are fundamental to the utility of datasets. We tag data with Uniform Resource Identifiers (URI's) to specify concepts with formal descriptions (i.e. semantic ontologies), thus allowing users the ability to search metadata based on the intended context rather than conventional keyword searches. Additionally, WyCEHG data are geographically referenced. Using the ArcGIS API for Javascript, we developed a web mapping application leveraging database-linked spatial data services, providing a means to visualize and spatially query available data in an intuitive map environment. Using server-side scripting (PHP), the mapping application, in conjunction with semantic search modules, dynamically communicates with the database and file system, providing access to available datasets. Our approach provides a flexible, comprehensive infrastructure from which to store and serve WyCEHG's highly diverse research-based data. This framework has not only allowed WyCEHG to meet its data stewardship requirements, but can provide a template for others to follow.
Semantically-Rigorous Systems Engineering Modeling Using Sysml and OWL
NASA Technical Reports Server (NTRS)
Jenkins, J. Steven; Rouquette, Nicolas F.
2012-01-01
The Systems Modeling Language (SysML) has found wide acceptance as a standard graphical notation for the domain of systems engineering. SysML subsets and extends the Unified Modeling Language (UML) to define conventions for expressing structural, behavioral, and analytical elements, and relationships among them. SysML-enabled modeling tools are available from multiple providers, and have been used for diverse projects in military aerospace, scientific exploration, and civil engineering. The Web Ontology Language (OWL) has found wide acceptance as a standard notation for knowledge representation. OWL-enabled modeling tools are available from multiple providers, as well as auxiliary assets such as reasoners and application programming interface libraries, etc. OWL has been applied to diverse projects in a wide array of fields. While the emphasis in SysML is on notation, SysML inherits (from UML) a semantic foundation that provides for limited reasoning and analysis. UML's partial formalization (FUML), however, does not cover the full semantics of SysML, which is a substantial impediment to developing high confidence in the soundness of any conclusions drawn therefrom. OWL, by contrast, was developed from the beginning on formal logical principles, and consequently provides strong support for verification of consistency and satisfiability, extraction of entailments, conjunctive query answering, etc. This emphasis on formal logic is counterbalanced by the absence of any graphical notation conventions in the OWL standards. Consequently, OWL has had only limited adoption in systems engineering. The complementary strengths and weaknesses of SysML and OWL motivate an interest in combining them in such a way that we can benefit from the attractive graphical notation of SysML and the formal reasoning of OWL. This paper describes an approach to achieving that combination.
Standard biological parts knowledgebase.
Galdzicki, Michal; Rodriguez, Cesar; Chandran, Deepak; Sauro, Herbert M; Gennari, John H
2011-02-24
We have created the Knowledgebase of Standard Biological Parts (SBPkb) as a publically accessible Semantic Web resource for synthetic biology (sbolstandard.org). The SBPkb allows researchers to query and retrieve standard biological parts for research and use in synthetic biology. Its initial version includes all of the information about parts stored in the Registry of Standard Biological Parts (partsregistry.org). SBPkb transforms this information so that it is computable, using our semantic framework for synthetic biology parts. This framework, known as SBOL-semantic, was built as part of the Synthetic Biology Open Language (SBOL), a project of the Synthetic Biology Data Exchange Group. SBOL-semantic represents commonly used synthetic biology entities, and its purpose is to improve the distribution and exchange of descriptions of biological parts. In this paper, we describe the data, our methods for transformation to SBPkb, and finally, we demonstrate the value of our knowledgebase with a set of sample queries. We use RDF technology and SPARQL queries to retrieve candidate "promoter" parts that are known to be both negatively and positively regulated. This method provides new web based data access to perform searches for parts that are not currently possible.
A fuzzy-match search engine for physician directories.
Rastegar-Mojarad, Majid; Kadolph, Christopher; Ye, Zhan; Wall, Daniel; Murali, Narayana; Lin, Simon
2014-11-04
A search engine to find physicians' information is a basic but crucial function of a health care provider's website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspellings and spelling variations of names is needed, as the United States (US) has culturally, racially, and ethnically diverse names. The Marshfield Clinic website provides a search engine for users to search for physicians' names. The current search engine provides an auto-completion function, but it requires an exact match. We observed that 26% of all searches yielded no results. The goal was to design a fuzzy-match algorithm to aid users in finding physicians easier and faster. Instead of an exact match search, we used a fuzzy algorithm to find similar matches for searched terms. In the algorithm, we solved three types of search engine failures: "Typographic", "Phonetic spelling variation", and "Nickname". To solve these mismatches, we used a customized Levenshtein distance calculation that incorporated Soundex coding and a lookup table of nicknames derived from US census data. Using the "Challenge Data Set of Marshfield Physician Names," we evaluated the accuracy of fuzzy-match engine-top ten (90%) and compared it with exact match (0%), Soundex (24%), Levenshtein distance (59%), and fuzzy-match engine-top one (71%). We designed, created a reference implementation, and evaluated a fuzzy-match search engine for physician directories. The open-source code is available at the codeplex website and a reference implementation is available for demonstration at the datamarsh website.
Locality in Search Engine Queries and Its Implications for Caching
2001-05-01
in the question of whether caching might be effective for search engines as well. They study two real search engine traces by examining query...locality and its implications for caching. The two search engines studied are Vivisimo and Excite. Their trace analysis results show that queries have
A topic clustering approach to finding similar questions from large question and answer archives.
Zhang, Wei-Nan; Liu, Ting; Yang, Yang; Cao, Liujuan; Zhang, Yu; Ji, Rongrong
2014-01-01
With the blooming of Web 2.0, Community Question Answering (CQA) services such as Yahoo! Answers (http://answers.yahoo.com), WikiAnswer (http://wiki.answers.com), and Baidu Zhidao (http://zhidao.baidu.com), etc., have emerged as alternatives for knowledge and information acquisition. Over time, a large number of question and answer (Q&A) pairs with high quality devoted by human intelligence have been accumulated as a comprehensive knowledge base. Unlike the search engines, which return long lists of results, searching in the CQA services can obtain the correct answers to the question queries by automatically finding similar questions that have already been answered by other users. Hence, it greatly improves the efficiency of the online information retrieval. However, given a question query, finding the similar and well-answered questions is a non-trivial task. The main challenge is the word mismatch between question query (query) and candidate question for retrieval (question). To investigate this problem, in this study, we capture the word semantic similarity between query and question by introducing the topic modeling approach. We then propose an unsupervised machine-learning approach to finding similar questions on CQA Q&A archives. The experimental results show that our proposed approach significantly outperforms the state-of-the-art methods.
Variability of patient spine education by Internet search engine.
Ghobrial, George M; Mehdi, Angud; Maltenfort, Mitchell; Sharan, Ashwini D; Harrop, James S
2014-03-01
Patients are increasingly reliant upon the Internet as a primary source of medical information. The educational experience varies by search engine, search term, and changes daily. There are no tools for critical evaluation of spinal surgery websites. To highlight the variability between common search engines for the same search terms. To detect bias, by prevalence of specific kinds of websites for certain spinal disorders. Demonstrate a simple scoring system of spinal disorder website for patient use, to maximize the quality of information exposed to the patient. Ten common search terms were used to query three of the most common search engines. The top fifty results of each query were tabulated. A negative binomial regression was performed to highlight the variation across each search engine. Google was more likely than Bing and Yahoo search engines to return hospital ads (P=0.002) and more likely to return scholarly sites of peer-reviewed lite (P=0.003). Educational web sites, surgical group sites, and online web communities had a significantly higher likelihood of returning on any search, regardless of search engine, or search string (P=0.007). Likewise, professional websites, including hospital run, industry sponsored, legal, and peer-reviewed web pages were less likely to be found on a search overall, regardless of engine and search string (P=0.078). The Internet is a rapidly growing body of medical information which can serve as a useful tool for patient education. High quality information is readily available, provided that the patient uses a consistent, focused metric for evaluating online spine surgery information, as there is a clear variability in the way search engines present information to the patient. Published by Elsevier B.V.
A rank-based Prediction Algorithm of Learning User's Intention
NASA Astrophysics Data System (ADS)
Shen, Jie; Gao, Ying; Chen, Cang; Gong, HaiPing
Internet search has become an important part in people's daily life. People can find many types of information to meet different needs through search engines on the Internet. There are two issues for the current search engines: first, the users should predetermine the types of information they want and then change to the appropriate types of search engine interfaces. Second, most search engines can support multiple kinds of search functions, each function has its own separate search interface. While users need different types of information, they must switch between different interfaces. In practice, most queries are corresponding to various types of information results. These queries can search the relevant results in various search engines, such as query "Palace" contains the websites about the introduction of the National Palace Museum, blog, Wikipedia, some pictures and video information. This paper presents a new aggregative algorithm for all kinds of search results. It can filter and sort the search results by learning three aspects about the query words, search results and search history logs to achieve the purpose of detecting user's intention. Experiments demonstrate that this rank-based method for multi-types of search results is effective. It can meet the user's search needs well, enhance user's satisfaction, provide an effective and rational model for optimizing search engines and improve user's search experience.
Mechanisms of Age-Related Decline in Memory Search Across the Adult Life Span
Hills, Thomas T.; Mata, Rui; Wilke, Andreas; Samanez-Larkin, Gregory R.
2013-01-01
Three alternative mechanisms for age-related decline in memory search have been proposed, which result from either reduced processing speed (global slowing hypothesis), overpersistence on categories (cluster-switching hypothesis), or the inability to maintain focus on local cues related to a decline in working memory (cue-maintenance hypothesis). We investigated these 3 hypotheses by formally modeling the semantic recall patterns of 185 adults between 27 to 99 years of age in the animal fluency task (Thurstone, 1938). The results indicate that people switch between global frequency-based retrieval cues and local item-based retrieval cues to navigate their semantic memory. Contrary to the global slowing hypothesis that predicts no qualitative differences in dynamic search processes and the cluster-switching hypothesis that predicts reduced switching between retrieval cues, the results indicate that as people age, they tend to switch more often between local and global cues per item recalled, supporting the cue-maintenance hypothesis. Additional support for the cue-maintenance hypothesis is provided by a negative correlation between switching and digit span scores and between switching and total items recalled, which suggests that cognitive control may be involved in cue maintenance and the effective search of memory. Overall, the results are consistent with age-related decline in memory search being a consequence of reduced cognitive control, consistent with models suggesting that working memory is related to goal perseveration and the ability to inhibit distracting information. PMID:23586941
Leveraging Pattern Semantics for Extracting Entities in Enterprises
Tao, Fangbo; Zhao, Bo; Fuxman, Ariel; Li, Yang; Han, Jiawei
2015-01-01
Entity Extraction is a process of identifying meaningful entities from text documents. In enterprises, extracting entities improves enterprise efficiency by facilitating numerous applications, including search, recommendation, etc. However, the problem is particularly challenging on enterprise domains due to several reasons. First, the lack of redundancy of enterprise entities makes previous web-based systems like NELL and OpenIE not effective, since using only high-precision/low-recall patterns like those systems would miss the majority of sparse enterprise entities, while using more low-precision patterns in sparse setting also introduces noise drastically. Second, semantic drift is common in enterprises (“Blue” refers to “Windows Blue”), such that public signals from the web cannot be directly applied on entities. Moreover, many internal entities never appear on the web. Sparse internal signals are the only source for discovering them. To address these challenges, we propose an end-to-end framework for extracting entities in enterprises, taking the input of enterprise corpus and limited seeds to generate a high-quality entity collection as output. We introduce the novel concept of Semantic Pattern Graph to leverage public signals to understand the underlying semantics of lexical patterns, reinforce pattern evaluation using mined semantics, and yield more accurate and complete entities. Experiments on Microsoft enterprise data show the effectiveness of our approach. PMID:26705540
Leveraging Pattern Semantics for Extracting Entities in Enterprises.
Tao, Fangbo; Zhao, Bo; Fuxman, Ariel; Li, Yang; Han, Jiawei
2015-05-01
Entity Extraction is a process of identifying meaningful entities from text documents. In enterprises, extracting entities improves enterprise efficiency by facilitating numerous applications, including search, recommendation, etc. However, the problem is particularly challenging on enterprise domains due to several reasons. First, the lack of redundancy of enterprise entities makes previous web-based systems like NELL and OpenIE not effective, since using only high-precision/low-recall patterns like those systems would miss the majority of sparse enterprise entities, while using more low-precision patterns in sparse setting also introduces noise drastically. Second, semantic drift is common in enterprises ("Blue" refers to "Windows Blue"), such that public signals from the web cannot be directly applied on entities. Moreover, many internal entities never appear on the web. Sparse internal signals are the only source for discovering them. To address these challenges, we propose an end-to-end framework for extracting entities in enterprises, taking the input of enterprise corpus and limited seeds to generate a high-quality entity collection as output. We introduce the novel concept of Semantic Pattern Graph to leverage public signals to understand the underlying semantics of lexical patterns, reinforce pattern evaluation using mined semantics, and yield more accurate and complete entities. Experiments on Microsoft enterprise data show the effectiveness of our approach.
Picture grammars in classification and semantic interpretation of 3D coronary vessels visualisations
NASA Astrophysics Data System (ADS)
Ogiela, M. R.; Tadeusiewicz, R.; Trzupek, M.
2009-09-01
The work presents the new opportunity for making semantic descriptions and analysis of medical structures, especially coronary vessels CT spatial reconstructions, with the use of AI graph-based linguistic formalisms. In the paper there will be discussed the manners of applying methods of computational intelligence to the development of a syntactic semantic description of spatial visualisations of the heart's coronary vessels. Such descriptions may be used for both smart ordering of images while archiving them and for their semantic searches in medical multimedia databases. Presented methodology of analysis can furthermore be used for attaining other goals related performance of computer-assisted semantic interpretation of selected elements and/or the entire 3D structure of the coronary vascular tree. These goals are achieved through the use of graph-based image formalisms based on IE graphs generating grammars that allow discovering and automatic semantic interpretation of irregularities visualised on the images obtained during diagnostic examinations of the heart muscle. The basis for the construction of 3D reconstructions of biological objects used in this work are visualisations obtained from helical CT scans, yet the method itself may be applied also for other methods of medical 3D images acquisition. The obtained semantic information makes it possible to make a description of the structure focused on the semantics of various morphological forms of the visualised vessels from the point of view of the operation of coronary circulation and the blood supply of the heart muscle. Thanks to these, the analysis conducted allows fast and — to a great degree — automated interpretation of the semantics of various morphological changes in the coronary vascular tree, and especially makes it possible to detect these stenoses in the lumen of the vessels that can cause critical decrease in blood supply to extensive or especially important fragments of the heart muscle.
Evaluation of Proteomic Search Engines for the Analysis of Histone Modifications
2015-01-01
Identification of histone post-translational modifications (PTMs) is challenging for proteomics search engines. Including many histone PTMs in one search increases the number of candidate peptides dramatically, leading to low search speed and fewer identified spectra. To evaluate database search engines on identifying histone PTMs, we present a method in which one kind of modification is searched each time, for example, unmodified, individually modified, and multimodified, each search result is filtered with false discovery rate less than 1%, and the identifications of multiple search engines are combined to obtain confident results. We apply this method for eight search engines on histone data sets. We find that two search engines, pFind and Mascot, identify most of the confident results at a reasonable speed, so we recommend using them to identify histone modifications. During the evaluation, we also find some important aspects for the analysis of histone modifications. Our evaluation of different search engines on identifying histone modifications will hopefully help those who are hoping to enter the histone proteomics field. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001118. PMID:25167464
Evaluation of proteomic search engines for the analysis of histone modifications.
Yuan, Zuo-Fei; Lin, Shu; Molden, Rosalynn C; Garcia, Benjamin A
2014-10-03
Identification of histone post-translational modifications (PTMs) is challenging for proteomics search engines. Including many histone PTMs in one search increases the number of candidate peptides dramatically, leading to low search speed and fewer identified spectra. To evaluate database search engines on identifying histone PTMs, we present a method in which one kind of modification is searched each time, for example, unmodified, individually modified, and multimodified, each search result is filtered with false discovery rate less than 1%, and the identifications of multiple search engines are combined to obtain confident results. We apply this method for eight search engines on histone data sets. We find that two search engines, pFind and Mascot, identify most of the confident results at a reasonable speed, so we recommend using them to identify histone modifications. During the evaluation, we also find some important aspects for the analysis of histone modifications. Our evaluation of different search engines on identifying histone modifications will hopefully help those who are hoping to enter the histone proteomics field. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001118.
Groza, Tudor; Tudorache, Tania; Hunter, Jane
2015-01-01
Collaboration platforms provide a dynamic environment where the content is subject to ongoing evolution through expert contributions. The knowledge embedded in such platforms is not static as it evolves through incremental refinements – or micro-contributions. Such refinements provide vast resources of tacit knowledge and experience. In our previous work, we proposed and evaluated a Semantic and Time-dependent Expertise Profiling (STEP) approach for capturing expertise from micro-contributions. In this paper we extend our investigation to structured micro-contributions that emerge from an ontology engineering environment, such as the one built for developing the International Classification of Diseases (ICD) revision 11. We take advantage of the semantically related nature of these structured micro-contributions to showcase two major aspects: (i) a novel semantic similarity metric, in addition to an approach for creating bottom-up baseline expertise profiles using expertise centroids; and (ii) the application of STEP in this new environment combined with the use of the same semantic similarity measure to both compare STEP against baseline profiles, as well as to investigate the coverage of these baseline profiles by STEP. PMID:28077914
NASA Astrophysics Data System (ADS)
D'Agostino, Gregorio; De Nicola, Antonio
2016-10-01
Exploiting the information about members of a Social Network (SN) represents one of the most attractive and dwelling subjects for both academic and applied scientists. The community of Complexity Science and especially those researchers working on multiplex social systems are devoting increasing efforts to outline general laws, models, and theories, to the purpose of predicting emergent phenomena in SN's (e.g. success of a product). On the other side the semantic web community aims at engineering a new generation of advanced services tailored to specific people needs. This implies defining constructs, models and methods for handling the semantic layer of SNs. We combined models and techniques from both the former fields to provide a hybrid approach to understand a basic (yet complex) phenomenon: the propagation of individual interests along the social networks. Since information may move along different social networks, one should take into account a multiplex structure. Therefore we introduced the notion of "Semantic Multiplex". In this paper we analyse two different semantic social networks represented by authors publishing in the Computer Science and those in the American Physical Society Journals. The comparison allows to outline common and specific features.
A development framework for semantically interoperable health information systems.
Lopez, Diego M; Blobel, Bernd G M E
2009-02-01
Semantic interoperability is a basic challenge to be met for new generations of distributed, communicating and co-operating health information systems (HIS) enabling shared care and e-Health. Analysis, design, implementation and maintenance of such systems and intrinsic architectures have to follow a unified development methodology. The Generic Component Model (GCM) is used as a framework for modeling any system to evaluate and harmonize state of the art architecture development approaches and standards for health information systems as well as to derive a coherent architecture development framework for sustainable, semantically interoperable HIS and their components. The proposed methodology is based on the Rational Unified Process (RUP), taking advantage of its flexibility to be configured for integrating other architectural approaches such as Service-Oriented Architecture (SOA), Model-Driven Architecture (MDA), ISO 10746, and HL7 Development Framework (HDF). Existing architectural approaches have been analyzed, compared and finally harmonized towards an architecture development framework for advanced health information systems. Starting with the requirements for semantic interoperability derived from paradigm changes for health information systems, and supported in formal software process engineering methods, an appropriate development framework for semantically interoperable HIS has been provided. The usability of the framework has been exemplified in a public health scenario.
Modelling and Simulation of Search Engine
NASA Astrophysics Data System (ADS)
Nasution, Mahyuddin K. M.
2017-01-01
The best tool currently used to access information is a search engine. Meanwhile, the information space has its own behaviour. Systematically, an information space needs to be familiarized with mathematics so easily we identify the characteristics associated with it. This paper reveal some characteristics of search engine based on a model of document collection, which are then estimated the impact on the feasibility of information. We reveal some of characteristics of search engine on the lemma and theorem about singleton and doubleton, then computes statistically characteristic as simulating the possibility of using search engine. In this case, Google and Yahoo. There are differences in the behaviour of both search engines, although in theory based on the concept of documents collection.
Survey of Knowledge Representation and Reasoning Systems
2009-07-01
processing large volumes of unstructured information such as natural language documents, email, audio , images and video [Ferrucci et al. 2006]. Using this...information we hope to obtain improved es- timation and prediction, data-mining, social network analysis, and semantic search and visualisation . Knowledge
New generation of the multimedia search engines
NASA Astrophysics Data System (ADS)
Mijes Cruz, Mario Humberto; Soto Aldaco, Andrea; Maldonado Cano, Luis Alejandro; López Rodríguez, Mario; Rodríguez Vázqueza, Manuel Antonio; Amaya Reyes, Laura Mariel; Cano Martínez, Elizabeth; Pérez Rosas, Osvaldo Gerardo; Rodríguez Espejo, Luis; Flores Secundino, Jesús Abimelek; Rivera Martínez, José Luis; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Sánchez Valenzuela, Juan Carlos; Montoya Obeso, Abraham; Ramírez Acosta, Alejandro Álvaro
2016-09-01
Current search engines are based upon search methods that involve the combination of words (text-based search); which has been efficient until now. However, the Internet's growing demand indicates that there's more diversity on it with each passing day. Text-based searches are becoming limited, as most of the information on the Internet can be found in different types of content denominated multimedia content (images, audio files, video files). Indeed, what needs to be improved in current search engines is: search content, and precision; as well as an accurate display of expected search results by the user. Any search can be more precise if it uses more text parameters, but it doesn't help improve the content or speed of the search itself. One solution is to improve them through the characterization of the content for the search in multimedia files. In this article, an analysis of the new generation multimedia search engines is presented, focusing the needs according to new technologies. Multimedia content has become a central part of the flow of information in our daily life. This reflects the necessity of having multimedia search engines, as well as knowing the real tasks that it must comply. Through this analysis, it is shown that there are not many search engines that can perform content searches. The area of research of multimedia search engines of new generation is a multidisciplinary area that's in constant growth, generating tools that satisfy the different needs of new generation systems.
NASA Astrophysics Data System (ADS)
Maffei, A. R.; Chandler, C. L.; Work, T.; Allen, J.; Groman, R. C.; Fox, P. A.
2009-12-01
Content Management Systems (CMSs) provide powerful features that can be of use to oceanographic (and other geo-science) data managers. However, in many instances, geo-science data management offices have previously designed customized schemas for their metadata. The WHOI Ocean Informatics initiative and the NSF funded Biological Chemical and Biological Data Management Office (BCO-DMO) have jointly sponsored a project to port an existing, relational database containing oceanographic metadata, along with an existing interface coded in Cold Fusion middleware, to a Drupal6 Content Management System. The goal was to translate all the existing database tables, input forms, website reports, and other features present in the existing system to employ Drupal CMS features. The replacement features include Drupal content types, CCK node-reference fields, themes, RDB, SPARQL, workflow, and a number of other supporting modules. Strategic use of some Drupal6 CMS features enables three separate but complementary interfaces that provide access to oceanographic research metadata via the MySQL database: 1) a Drupal6-powered front-end; 2) a standard SQL port (used to provide a Mapserver interface to the metadata and data; and 3) a SPARQL port (feeding a new faceted search capability being developed). Future plans include the creation of science ontologies, by scientist/technologist teams, that will drive semantically-enabled faceted search capabilities planned for the site. Incorporation of semantic technologies included in the future Drupal 7 core release is also anticipated. Using a public domain CMS as opposed to proprietary middleware, and taking advantage of the many features of Drupal 6 that are designed to support semantically-enabled interfaces will help prepare the BCO-DMO database for interoperability with other ecosystem databases.
Starr, Ariel; Vendetti, Michael S; Bunge, Silvia A
2018-05-01
Analogical reasoning is considered a key driver of cognitive development and is a strong predictor of academic achievement. However, it is difficult for young children, who are prone to focusing on perceptual and semantic similarities among items rather than relational commonalities. For example, in a classic A:B::C:? propositional analogy task, children must inhibit attention towards items that are visually or semantically similar to C, and instead focus on finding a relational match to the A:B pair. Competing theories of reasoning development attribute improvements in children's performance to gains in either executive functioning or semantic knowledge. Here, we sought to identify key drivers of the development of analogical reasoning ability by using eye gaze patterns to infer problem-solving strategies used by six-year-old children and adults. Children had a greater tendency than adults to focus on the immediate task goal and constrain their search based on the C item. However, large individual differences existed within children, and more successful reasoners were able to maintain the broader goal in mind and constrain their search by initially focusing on the A:B pair before turning to C and the response choices. When children adopted this strategy, their attention was drawn more readily to the correct response option. Individual differences in children's reasoning ability were also related to rule-guided behavior but not to semantic knowledge. These findings suggest that both developmental improvements and individual differences in performance are driven by the use of more efficient reasoning strategies regarding which information is prioritized from the start, rather than the ability to disengage from attractive lure items. Copyright © 2018 Elsevier B.V. All rights reserved.
Strategic retrieval and the frontal lobes: evidence from confabulation and amnesia.
Moscovitch, M; Melo, B
1997-07-01
Confabulation and amnesia are considered disorders of episodic but not of semantic memory. To test the limits of this view, retrieval from episodic and semantic memory was investigated in eight confabulating and nine non-confabulating amnesic subjects, and in 17 matched control subjects, by using a personal and an historical version of the Crovitz [Unconstrained search in long-term memory, Paper presented at the meeting of the Psychonomic Society, St Louis, MO, 1973] cue-word test. In response to cue words such as letter or battle subjects had to describe in detail, respectively, a related event from their personal lives or from history before their birth. We found that a subset of amnesic subjects, those with presumed damage or dysfunction in the region of the ventromedial frontal cortex, would confabulate in response to these cues. Their confabulations involved semantic, historical memories, as much as episodic, personal ones: and that distortions of content were at least as common as those of time. Even when not confabulating, they had much more difficulty than other amnesic subjects in recovering memories related to these cues. In confabulating, as compared to non-confabulating, amnesic subjects, prompting led to an increase in confabulations but also to greater recovery of veridical memories. By comparison, non-confabulating amnesic subjects whose memory loss was as severe as that of the confabulators, had a milder deficit on the personal as well as the historical cue-word test. They benefited from prompting somewhat more than matched control subjects. These results suggest that confabulation is associated with impaired strategic retrieval processes resulting from damage in the region of the ventromedial frontal cortex. These strategic retrieval processes help initiate and guide search in episodic and in semantic memory and they help monitor and organize the output from those systems.
Semantics in NETMAR (open service NETwork for MARine environmental data)
NASA Astrophysics Data System (ADS)
Leadbetter, Adam; Lowry, Roy; Clements, Oliver
2010-05-01
Over recent years, there has been a proliferation of environmental data portals utilising a wide range of systems and services, many of which cannot interoperate. The European Union Framework 7 project NETMAR (that commenced February 2010) aims to provide a toolkit for building such portals in a coherent manner through the use of chained Open Geospatial Consortium Web Services (WxS), OPeNDAP file access and W3C standards controlled by a Business Process Execution Language workflow. As such, the end product will be configurable by user communities interested in developing a portal for marine environmental data, and will offer search, download and integration tools for a range of satellite, model and observed data from open ocean and coastal areas. Further processing of these data will also be available in order to provide statistics and derived products suitable for decision making in the chosen environmental domain. In order to make the resulting portals truly interoperable, the NETMAR programme requires a detailed definition of the semantics of the services being called and the data which are being requested. A key goal of the NETMAR programme is, therefore, to develop a multi-domain and multilingual ontology of marine data and services. This will allow searches across both human languages and across scientific domains. The approach taken will be to analyse existing semantic resources and provide mappings between them, gluing together the definitions, semantics and workflows of the WxS services. The mappings between terms aim to be more general than the standard "narrower than", "broader than" type seen in the thesauri or simple ontologies implemented by previous programmes. Tools for the development and population of ontologoies will also be provided by NETMAR as there will be instances in which existing resources cannot sufficiently describe newly encountered data or services.
The Effectiveness of Web Search Engines to Index New Sites from Different Countries
ERIC Educational Resources Information Center
Pirkola, Ari
2009-01-01
Introduction: Investigates how effectively Web search engines index new sites from different countries. The primary interest is whether new sites are indexed equally or whether search engines are biased towards certain countries. If major search engines show biased coverage it can be considered a significant economic and political problem because…
Taming the Information Jungle with WWW Search Engines.
ERIC Educational Resources Information Center
Repman, Judi; And Others
1997-01-01
Because searching the Web with different engines often produces different results, the best strategy is to learn how each engine works. Discusses comparing search engines; qualities to consider (ease of use, relevance of hits, and speed); and six of the most popular search tools (Yahoo, Magellan. InfoSeek, Alta Vista, Lycos, and Excite). Lists…
MIRASS: medical informatics research activity support system using information mashup network.
Kiah, M L M; Zaidan, B B; Zaidan, A A; Nabi, Mohamed; Ibraheem, Rabiu
2014-04-01
The advancement of information technology has facilitated the automation and feasibility of online information sharing. The second generation of the World Wide Web (Web 2.0) enables the collaboration and sharing of online information through Web-serving applications. Data mashup, which is considered a Web 2.0 platform, plays an important role in information and communication technology applications. However, few ideas have been transformed into education and research domains, particularly in medical informatics. The creation of a friendly environment for medical informatics research requires the removal of certain obstacles in terms of search time, resource credibility, and search result accuracy. This paper considers three glitches that researchers encounter in medical informatics research; these glitches include the quality of papers obtained from scientific search engines (particularly, Web of Science and Science Direct), the quality of articles from the indices of these search engines, and the customizability and flexibility of these search engines. A customizable search engine for trusted resources of medical informatics was developed and implemented through data mashup. Results show that the proposed search engine improves the usability of scientific search engines for medical informatics. Pipe search engine was found to be more efficient than other engines.
Sharing Epigraphic Information as Linked Data
NASA Astrophysics Data System (ADS)
Álvarez, Fernando-Luis; García-Barriocanal, Elena; Gómez-Pantoja, Joaquín-L.
The diffusion of epigraphic data has evolved in the last years from printed catalogues to indexed digital databases shared through the Web. Recently, the open EpiDoc specifications have resulted in an XML-based schema for the interchange of ancient texts that uses XSLT to render typographic representations. However, these schemas and representation systems are still not providing a way to encode computational semantics and semantic relations between pieces of epigraphic data. This paper sketches an approach to bring these semantics into an EpiDoc based schema using the Ontology Web Language (OWL) and following the principles and methods of information sharing known as "linked data". The paper describes the general principles of the OWL mapping of the EpiDoc schema and how epigraphic data can be shared in RDF format via dereferenceable URIs that can be used to build advanced search, visualization and analysis systems.
Serial position effects in semantic memory: reconstructing the order of verses of hymns.
Maylor, Elizabeth A
2002-12-01
Serial position effects (primacy and recency) have been consistently demonstrated in both short- and long-term episodic memory tasks. The search for corresponding effects in semantic memory tasks (e.g., reconstructing the order of U.S. presidents) has been confounded by factors such as differential exposure to stimuli. In the present study, the stimuli were six-verse hymns that would have been sung from the first to the last verse by churchgoers on numerous occasions. Participants were presented with the verses of each hymn in random order and were required to reconstruct the correct order. Primacy and recency effects were significantly more evident for churchgoers than for nonchurchgoers. Moreover, error gradients were steeper than chance for churchgoers but not for nonchurchgoers; in other words, churchgoers' errors were more likely to be close to the correct position than further away. These findings provide the first unequivocal demonstration of serial position effects in semantic memory.
Chemical Information in Scirus and BASE (Bielefeld Academic Search Engine)
ERIC Educational Resources Information Center
Bendig, Regina B.
2009-01-01
The author sought to determine to what extent the two search engines, Scirus and BASE (Bielefeld Academic Search Engines), would be useful to first-year university students as the first point of searching for chemical information. Five topics were searched and the first ten records of each search result were evaluated with regard to the type of…
Database Search Engines: Paradigms, Challenges and Solutions.
Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc
2016-01-01
The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.
Wu, G; Li, J
1999-01-01
Identifying and accessing reliable, relevant consumer health information rapidly on the Internet may challenge the health sciences librarian and layperson alike. In this study, seven search engines are compared using representative consumer health topics for their content relevancy, system features, and attributes. The paper discusses evaluation criteria; systematically compares relevant results; analyzes performance in terms of the strengths and weaknesses of the search engines; and illustrates effective search engine selection, search formulation, and strategies. PMID:10550031
Islamic Extremists Love the Internet
2009-04-03
down on the West. Terrorists’ Use of Search Engines In order to find a particular blog, extremists use search engines such as Bloglines...BlogScope, and Technorati to search blog contents. Technorati, which is among the most popular blog search engines , provides current information on...of mid- January 2009 is tracking over 31.78 million blogs with 579.86 million posts.49 Other ways the terrorists use Web search engines are to
Combinatorial Fusion Analysis for Meta Search Information Retrieval
NASA Astrophysics Data System (ADS)
Hsu, D. Frank; Taksa, Isak
Leading commercial search engines are built as single event systems. In response to a particular search query, the search engine returns a single list of ranked search results. To find more relevant results the user must frequently try several other search engines. A meta search engine was developed to enhance the process of multi-engine querying. The meta search engine queries several engines at the same time and fuses individual engine results into a single search results list. The fusion of multiple search results has been shown (mostly experimentally) to be highly effective. However, the question of why and how the fusion should be done still remains largely unanswered. In this chapter, we utilize the combinatorial fusion analysis proposed by Hsu et al. to analyze combination and fusion of multiple sources of information. A rank/score function is used in the design and analysis of our framework. The framework provides a better understanding of the fusion phenomenon in information retrieval. For example, to improve the performance of the combined multiple scoring systems, it is necessary that each of the individual scoring systems has relatively high performance and the individual scoring systems are diverse. Additionally, we illustrate various applications of the framework using two examples from the information retrieval domain.
Search Engines on the World Wide Web.
ERIC Educational Resources Information Center
Walster, Dian
1997-01-01
Discusses search engines and provides methods for determining what resources are searched, the quality of the information, and the algorithms used that will improve the use of search engines on the World Wide Web, online public access catalogs, and electronic encyclopedias. Lists strategies for conducting searches and for learning about the latest…
The Mercury System: Embedding Computation into Disk Drives
2004-08-20
enabling technologies to build extremely fast data search engines . We do this by moving the search closer to the data, and performing it in hardware...engine searches in parallel across a disk or disk surface 2. System Parallelism: Searching is off-loaded to search engines and main processor can
Leveraging search and content exploration by exploiting context in folksonomy systems
NASA Astrophysics Data System (ADS)
Abel, Fabian; Baldoni, Matteo; Baroglio, Cristina; Henze, Nicola; Kawase, Ricardo; Krause, Daniel; Patti, Viviana
2010-04-01
With the advent of Web 2.0 tagging became a popular feature in social media systems. People tag diverse kinds of content, e.g. products at Amazon, music at Last.fm, images at Flickr, etc. In the last years several researchers analyzed the impact of tags on information retrieval. Most works focused on tags only and ignored context information. In this article we present context-aware approaches for learning semantics and improve personalized information retrieval in tagging systems. We investigate how explorative search, initialized by clicking on tags, can be enhanced with automatically produced context information so that search results better fit to the actual information needs of the users. We introduce the SocialHITS algorithm and present an experiment where we compare different algorithms for ranking users, tags, and resources in a contextualized way. We showcase our approaches in the domain of images and present the TagMe! system that enables users to explore and tag Flickr pictures. In TagMe! we further demonstrate how advanced context information can easily be generated: TagMe! allows users to attach tag assignments to a specific area within an image and to categorize tag assignments. In our corresponding evaluation we show that those additional facets of tag assignments gain valuable semantics, which can be applied to improve existing search and ranking algorithms significantly.
[Biomedical information on the internet using search engines. A one-year trial].
Corrao, Salvatore; Leone, Francesco; Arnone, Sabrina
2004-01-01
The internet is a communication medium and content distributor that provide information in the general sense but it could be of great utility regarding as the search and retrieval of biomedical information. Search engines represent a great deal to rapidly find information on the net. However, we do not know whether general search engines and meta-search ones are reliable in order to find useful and validated biomedical information. The aim of our study was to verify the reproducibility of a search by key-words (pediatric or evidence) using 9 international search engines and 1 meta-search engine at the baseline and after a one year period. We analysed the first 20 citations as output of each searching. We evaluated the formal quality of Web-sites and their domain extensions. Moreover, we compared the output of each search at the start of this study and after a one year period and we considered as a criterion of reliability the number of Web-sites cited again. We found some interesting results that are reported throughout the text. Our findings point out an extreme dynamicity of the information on the Web and, for this reason, we advice a great caution when someone want to use search and meta-search engines as a tool for searching and retrieve reliable biomedical information. On the other hand, some search and meta-search engines could be very useful as a first step searching for defining better a search and, moreover, for finding institutional Web-sites too. This paper allows to know a more conscious approach to the internet biomedical information universe.
Alternative Fuels Data Center: Vehicle Search
ZeroTruck Search Engines and Hybrid Systems For medium- and heavy-duty vehicles: Engine & Power Sources Hydraulic hybrid Hybrid - CNG Hybrid - Diesel Electric Hybrid - LNG Hybrid Search x Pick Engine Fuel Natural Gas Propane Electric Plug-in Hybrid Electric Hydraulic hybrid Hybrid Search x Pick Engine Fuel
Getting to the top of Google: search engine optimization.
Maley, Catherine; Baum, Neil
2010-01-01
Search engine optimization is the process of making your Web site appear at or near the top of popular search engines such as Google, Yahoo, and MSN. This is not done by luck or knowing someone working for the search engines but by understanding the process of how search engines select Web sites for placement on top or on the first page. This article will review the process and provide methods and techniques to use to have your site rated at the top or very near the top.
2014-01-01
With smartphone distribution becoming common and robotic applications on the rise, social tagging services for various applications including robotic domains have advanced significantly. Though social tagging plays an important role when users are finding the exact information through web search, reliability and semantic relation between web contents and tags are not considered. Spams are making ill use of this aspect and put irrelevant tags deliberately on contents and induce users to advertise contents when they click items of search results. Therefore, this study proposes a detection method for tag-ranking manipulation to solve the problem of the existing methods which cannot guarantee the reliability of tagging. Similarity is measured for ranking the grade of registered tag on the contents, and weighted values of each tag are measured by means of synonym relevance, frequency, and semantic distances between tags. Lastly, experimental evaluation results are provided and its efficiency and accuracy are verified through them. PMID:25114975
FunSimMat: a comprehensive functional similarity database
Schlicker, Andreas; Albrecht, Mario
2008-01-01
Functional similarity based on Gene Ontology (GO) annotation is used in diverse applications like gene clustering, gene expression data analysis, protein interaction prediction and evaluation. However, there exists no comprehensive resource of functional similarity values although such a database would facilitate the use of functional similarity measures in different applications. Here, we describe FunSimMat (Functional Similarity Matrix, http://funsimmat.bioinf.mpi-inf.mpg.de/), a large new database that provides several different semantic similarity measures for GO terms. It offers various precomputed functional similarity values for proteins contained in UniProtKB and for protein families in Pfam and SMART. The web interface allows users to efficiently perform both semantic similarity searches with GO terms and functional similarity searches with proteins or protein families. All results can be downloaded in tab-delimited files for use with other tools. An additional XML–RPC interface gives automatic online access to FunSimMat for programs and remote services. PMID:17932054
Combining results of multiple search engines in proteomics.
Shteynberg, David; Nesvizhskii, Alexey I; Moritz, Robert L; Deutsch, Eric W
2013-09-01
A crucial component of the analysis of shotgun proteomics datasets is the search engine, an algorithm that attempts to identify the peptide sequence from the parent molecular ion that produced each fragment ion spectrum in the dataset. There are many different search engines, both commercial and open source, each employing a somewhat different technique for spectrum identification. The set of high-scoring peptide-spectrum matches for a defined set of input spectra differs markedly among the various search engine results; individual engines each provide unique correct identifications among a core set of correlative identifications. This has led to the approach of combining the results from multiple search engines to achieve improved analysis of each dataset. Here we review the techniques and available software for combining the results of multiple search engines and briefly compare the relative performance of these techniques.
Combining Results of Multiple Search Engines in Proteomics*
Shteynberg, David; Nesvizhskii, Alexey I.; Moritz, Robert L.; Deutsch, Eric W.
2013-01-01
A crucial component of the analysis of shotgun proteomics datasets is the search engine, an algorithm that attempts to identify the peptide sequence from the parent molecular ion that produced each fragment ion spectrum in the dataset. There are many different search engines, both commercial and open source, each employing a somewhat different technique for spectrum identification. The set of high-scoring peptide-spectrum matches for a defined set of input spectra differs markedly among the various search engine results; individual engines each provide unique correct identifications among a core set of correlative identifications. This has led to the approach of combining the results from multiple search engines to achieve improved analysis of each dataset. Here we review the techniques and available software for combining the results of multiple search engines and briefly compare the relative performance of these techniques. PMID:23720762
Gazetteer Brokering through Semantic Mediation
NASA Astrophysics Data System (ADS)
Hobona, G.; Bermudez, L. E.; Brackin, R.
2013-12-01
A gazetteer is a geographical directory containing some information regarding places. It provides names, location and other attributes for places which may include points of interest (e.g. buildings, oilfields and boreholes), and other features. These features can be published via web services conforming to the Gazetteer Application Profile of the Web Feature Service (WFS) standard of the Open Geospatial Consortium (OGC). Against the backdrop of advances in geophysical surveys, there has been a significant increase in the amount of data referenced to locations. Gazetteers services have played a significant role in facilitating access to such data, including through provision of specialized queries such as text, spatial and fuzzy search. Recent developments in the OGC have led to advances in gazetteers such as support for multilingualism, diacritics, and querying via advanced spatial constraints (e.g. search by radial search and nearest neighbor). A challenge remaining however, is that gazetteers produced by different organizations have typically been modeled differently. Inconsistencies from gazetteers produced by different organizations may include naming the same feature in a different way, naming the attributes differently, locating the feature in a different location, and providing fewer or more attributes than the other services. The Gazetteer application profile of the WFS is a starting point to address such inconsistencies by providing a standardized interface based on rules specified in ISO 19112, the international standard for spatial referencing by geographic identifiers. The profile, however, does not provide rules to deal with semantic inconsistencies. The USGS and NGA commissioned research into the potential for a Single Point of Entry Global Gazetteer (SPEGG). The research was conducted by the Cross Community Interoperability thread of the OGC testbed, referenced OWS-9. The testbed prototyped approaches for brokering gazetteers through use of semantic web technologies, including ontologies and a semantic mediator. The semantically-enhanced SPEGG allowed a client to submit a single query (e.g. ';hills') and to retrieve data from two separate gazetteers with different vocabularies (e.g. where one refers to ';summits' another refers to ';hills'). Supporting the SPEGG was a SPARQL server that held the ontologies and processed queries on them. Earth Science surveys and forecast always have a place on Earth. Being able to share the information about a place and solve inconsistencies about that place from different sources will enable geoscientists to better do their research. In the advent of mobile geo computing and location based services (LBS), brokering gazetteers will provide geoscientists with access to gazetteer services rich with information and functionality beyond that offered by current generic gazetteers.
Practical solutions to implementing "Born Semantic" data systems
NASA Astrophysics Data System (ADS)
Leadbetter, A.; Buck, J. J. H.; Stacey, P.
2015-12-01
The concept of data being "Born Semantic" has been proposed in recent years as a Semantic Web analogue to the idea of data being "born digital"[1], [2]. Within the "Born Semantic" concept, data are captured digitally and at a point close to the time of creation are annotated with markup terms from semantic web resources (controlled vocabularies, thesauri or ontologies). This allows heterogeneous data to be more easily ingested and amalgamated in near real-time due to the standards compliant annotation of the data. In taking the "Born Semantic" proposal from concept to operation, a number of difficulties have been encountered. For example, although there are recognised methods such as Header, Dictionary, Triples [3] for the compression, publication and dissemination of large volumes of triples these systems are not practical to deploy in the field on low-powered (both electrically and computationally) devices. Similarly, it is not practical for instruments to output fully formed semantically annotated data files if they are designed to be plugged into a modular system and the data to be centrally logged in the field as is the case on Argo floats and oceanographic gliders where internal bandwidth becomes an issue [2]. In light of these issues, this presentation will concentrate on pragmatic solutions being developed to the problem of generating Linked Data in near real-time systems. Specific examples from the European Commission SenseOCEAN project where Linked Data systems are being developed for autonomous underwater platforms, and from work being undertaken in the streaming of data from the Irish Galway Bay Cable Observatory initiative will be highlighted. Further, developments of a set of tools for the LogStash-ElasticSearch software ecosystem to allow the storing and retrieval of Linked Data will be introduced. References[1] A. Leadbetter & J. Fredericks, We have "born digital" - now what about "born semantic"?, European Geophysical Union General Assembly, 2014.[2] J. Buck & A. Leadbetter, Born semantic: linking data from sensors to users and balancing hardware limitations with data standards, European Geophysical Union General Assembly, 2015.[3] J. Fernandez et al., Binary RDF Representation for Publication and Exchange (HDT), Web Semantics 19:22-41, 2013.
Griffon, Nicolas; Kerdelhué, Gaétan; Hamek, Saliha; Hassler, Sylvain; Boog, César; Lamy, Jean-Baptiste; Duclos, Catherine; Venot, Alain; Darmoni, Stéfan J
2014-10-01
Doc'CISMeF (DC) is a semantic search engine used to find resources in CISMeF-BP, a quality controlled health gateway, which gathers guidelines available on the internet in French. Visualization of Concepts in Medicine (VCM) is an iconic language that may ease information retrieval tasks. This study aimed to describe the creation and evaluation of an interface integrating VCM in DC in order to make this search engine much easier to use. Focus groups were organized to suggest ways to enhance information retrieval tasks using VCM in DC. A VCM interface was created and improved using the ergonomic evaluation approach. 20 physicians were recruited to compare the VCM interface with the non-VCM one. Each evaluator answered two different clinical scenarios in each interface. The ability and time taken to select a relevant resource were recorded and compared. A usability analysis was performed using the System Usability Scale (SUS). The VCM interface contains a filter based on icons, and icons describing each resource according to focus group recommendations. Some ergonomic issues were resolved before evaluation. Use of VCM significantly increased the success of information retrieval tasks (OR=11; 95% CI 1.4 to 507). Nonetheless, it took significantly more time to find a relevant resource with VCM interface (101 vs 65 s; p=0.02). SUS revealed 'good' usability with an average score of 74/100. VCM was successfully implemented in DC as an option. It increased the success rate of information retrieval tasks, despite requiring slightly more time, and was well accepted by end-users. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
The Semantic Web: From Representation to Realization
NASA Astrophysics Data System (ADS)
Thórisson, Kristinn R.; Spivack, Nova; Wissner, James M.
A semantically-linked web of electronic information - the Semantic Web - promises numerous benefits including increased precision in automated information sorting, searching, organizing and summarizing. Realizing this requires significantly more reliable meta-information than is readily available today. It also requires a better way to represent information that supports unified management of diverse data and diverse Manipulation methods: from basic keywords to various types of artificial intelligence, to the highest level of intelligent manipulation - the human mind. How this is best done is far from obvious. Relying solely on hand-crafted annotation and ontologies, or solely on artificial intelligence techniques, seems less likely for success than a combination of the two. In this paper describe an integrated, complete solution to these challenges that has already been implemented and tested with hundreds of thousands of users. It is based on an ontological representational level we call SemCards that combines ontological rigour with flexible user interface constructs. SemCards are machine- and human-readable digital entities that allow non-experts to create and use semantic content, while empowering machines to better assist and participate in the process. SemCards enable users to easily create semantically-grounded data that in turn acts as examples for automation processes, creating a positive iterative feedback loop of metadata creation and refinement between user and machine. They provide a holistic solution to the Semantic Web, supporting powerful management of the full lifecycle of data, including its creation, retrieval, classification, sorting and sharing. We have implemented the SemCard technology on the semantic Web site Twine.com, showing that the technology is indeed versatile and scalable. Here we present the key ideas behind SemCards and describe the initial implementation of the technology.
A Practitioner's Perspective on Taxonomy, Ontology and Findability
NASA Technical Reports Server (NTRS)
Berndt, Sarah
2011-01-01
This slide presentation reviews the presenters perspective on developing a taxonomy for JSC to capitalize on the accomplishments of yesterday, while maintaining the flexibility needed for the evolving information of today. A clear vision and scope for the semantic system is integral to its success. The vision for the JSC Taxonomy is to connect information stovepipes to present a unified view for information and knowledge across the Center, across organizations, and across decades. Semantic search at JSC means seamless integration of disparate information sets into a single interface. Ever increasing use, interest, and organizational participation mark successful integration and provide the framework for future application.
ERIC Educational Resources Information Center
Hock, Randolph
This book aims to facilitate more effective and efficient use of World Wide Web search engines by helping the reader: know the basic structure of the major search engines; become acquainted with those attributes (features, benefits, options, content, etc.) that search engines have in common and where they differ; know the main strengths and…
Research on Agriculture Domain Meta-Search Engine System
NASA Astrophysics Data System (ADS)
Xie, Nengfu; Wang, Wensheng
The rapid growth of agriculture web information brings a fact that search engine can not return a satisfied result for users’ queries. In this paper, we propose an agriculture domain search engine system, called ADSE, that can obtains results by an advance interface to several searches and aggregates them. We also discuss two key technologies: agriculture information determination and engine.
Using Internet search engines to estimate word frequency.
Blair, Irene V; Urland, Geoffrey R; Ma, Jennifer E
2002-05-01
The present research investigated Internet search engines as a rapid, cost-effective alternative for estimating word frequencies. Frequency estimates for 382 words were obtained and compared across four methods: (1) Internet search engines, (2) the Kucera and Francis (1967) analysis of a traditional linguistic corpus, (3) the CELEX English linguistic database (Baayen, Piepenbrock, & Gulikers, 1995), and (4) participant ratings of familiarity. The results showed that Internet search engines produced frequency estimates that were highly consistent with those reported by Kucera and Francis and those calculated from CELEX, highly consistent across search engines, and very reliable over a 6-month period of time. Additional results suggested that Internet search engines are an excellent option when traditional word frequency analyses do not contain the necessary data (e.g., estimates for forenames and slang). In contrast, participants' familiarity judgments did not correspond well with the more objective estimates of word frequency. Researchers are advised to use search engines with large databases (e.g., AltaVista) to ensure the greatest representativeness of the frequency estimates.
Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I; Marcotte, Edward M
2011-07-01
Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for every possible PSM and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for most proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improve sensitivity in differential expression analyses.
Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I.; Marcotte, Edward M.
2011-01-01
Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for all possible PSMs and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for all detected proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improve sensitivity in differential expression analyses. PMID:21488652
The Gaze of the Perfect Search Engine: Google as an Infrastructure of Dataveillance
NASA Astrophysics Data System (ADS)
Zimmer, M.
Web search engines have emerged as a ubiquitous and vital tool for the successful navigation of the growing online informational sphere. The goal of the world's largest search engine, Google, is to "organize the world's information and make it universally accessible and useful" and to create the "perfect search engine" that provides only intuitive, personalized, and relevant results. While intended to enhance intellectual mobility in the online sphere, this chapter reveals that the quest for the perfect search engine requires the widespread monitoring and aggregation of a users' online personal and intellectual activities, threatening the values the perfect search engines were designed to sustain. It argues that these search-based infrastructures of dataveillance contribute to a rapidly emerging "soft cage" of everyday digital surveillance, where they, like other dataveillance technologies before them, contribute to the curtailing of individual freedom, affect users' sense of self, and present issues of deep discrimination and social justice.
Searching the Internet for information on prostate cancer screening: an assessment of quality.
Ilic, Dragan; Risbridger, Gail; Green, Sally
2004-07-01
To identify how on-line information relating to prostate cancer screening (PCS) is best sourced, whether through general, medical, or meta-search engines, and to assess the quality of that information. Websites providing information about PCS were searched across 15 search engines representing three distinct types: general, medical, and meta-search engines. The quality of on-line information was assessed using the DISCERN quality assessment tool. Quality performance characteristics were analyzed by performing Mann-Whitney U tests. Search engine efficiency was measured by each search query as a percentage of the relevant websites included for analysis from the total returned and analyzed by performing Kruskal-Wallis analysis of variance. Of 6690 websites reviewed, 84 unique websites were identified as providing information relevant to PCS. General and meta-search engines were significantly more efficient at retrieving relevant information on PCS compared with medical search engines. The quality of information was variable, with most of a poor standard. Websites that provided referral links to other resources and a citation of evidence provided a significantly better quality of information. In contrast, websites offering a direct service were more likely to provide a significantly poorer quality of information. The current lack of a clear consensus on guidelines and recommendation in published data is also reflected by the variable quality of information found on-line. Specialized medical search engines were no more likely to retrieve relevant, high-quality information than general or meta-search engines.
NASA Technical Reports Server (NTRS)
Williams, D. H.; Simpson, C. A.
1976-01-01
Line pilots (fifty captains, first officers, and flight engineers) from 8 different airlines were administered a structured questionnaire relating to future warning system design and solutions to current warning system problems. This was followed by a semantic differential to obtain a factor analysis of 18 different cockpit warning signals on scales such as informative/distracting, annoying/soothing. Half the pilots received a demonstration of the experimental text and voice synthesizer warning systems before answering the questionnaire and the semantic differential. A control group answered the questionnaire and the semantic differential first, thus providing a check for the stability of pilot preferences with and without actual exposure to experimental systems. Generally, the preference data obtained revealed much consistency and strong agreement among line pilots concerning advance cockpit warning system design.
UBioLab: a web-LABoratory for Ubiquitous in-silico experiments.
Bartocci, E; Di Berardini, M R; Merelli, E; Vito, L
2012-03-01
The huge and dynamic amount of bioinformatic resources (e.g., data and tools) available nowadays in Internet represents a big challenge for biologists -for what concerns their management and visualization- and for bioinformaticians -for what concerns the possibility of rapidly creating and executing in-silico experiments involving resources and activities spread over the WWW hyperspace. Any framework aiming at integrating such resources as in a physical laboratory has imperatively to tackle -and possibly to handle in a transparent and uniform way- aspects concerning physical distribution, semantic heterogeneity, co-existence of different computational paradigms and, as a consequence, of different invocation interfaces (i.e., OGSA for Grid nodes, SOAP for Web Services, Java RMI for Java objects, etc.). The framework UBioLab has been just designed and developed as a prototype following the above objective. Several architectural features -as those ones of being fully Web-based and of combining domain ontologies, Semantic Web and workflow techniques- give evidence of an effort in such a direction. The integration of a semantic knowledge management system for distributed (bioinformatic) resources, a semantic-driven graphic environment for defining and monitoring ubiquitous workflows and an intelligent agent-based technology for their distributed execution allows UBioLab to be a semantic guide for bioinformaticians and biologists providing (i) a flexible environment for visualizing, organizing and inferring any (semantics and computational) "type" of domain knowledge (e.g., resources and activities, expressed in a declarative form), (ii) a powerful engine for defining and storing semantic-driven ubiquitous in-silico experiments on the domain hyperspace, as well as (iii) a transparent, automatic and distributed environment for correct experiment executions.
Samadzadeh, Gholam Reza; Rigi, Tahereh; Ganjali, Ali Reza
2013-01-01
Surveying valuable and most recent information from internet, has become vital for researchers and scholars, because every day, thousands and perhaps millions of scientific works are brought out as digital resources which represented by internet and researchers can't ignore this great resource to find related documents for their literature search, which may not be found in any library. With regard to variety of documents presented on the internet, search engines are one of the most effective search tools for finding information. The aim of this study is to evaluate the three criteria, recall, preciseness and importance of the four search engines which are PubMed, Science Direct, Google Scholar and federated search of Iranian National Medical Digital Library in addiction (prevention and treatment) to select the most effective search engine for offering the best literature research. This research was a cross-sectional study by which four popular search engines in medical sciences were evaluated. To select keywords, medical subject heading (Mesh) was used. We entered given keywords in the search engines and after searching, 10 first entries were evaluated. Direct observation was used as a mean for data collection and they were analyzed by descriptive statistics (number, percent number and mean) and inferential statistics, One way analysis of variance (ANOVA) and post hoc Tukey in Spss. 15 statistical software. P Value < 0.05 was considered statistically significant. Results have shown that the search engines had different operations with regard to the evaluated criteria. Since P Value was 0.004 < 0.05 for preciseness and was 0.002 < 0.05 for importance, it shows significant difference among search engines. PubMed, Science Direct and Google Scholar were the best in recall, preciseness and importance respectively. As literature research is one of the most important stages of research, it's better for researchers, especially Substance-Related Disorders scholars to use different search engines with the best recall, preciseness and importance in that subject field to reach desirable results while searching and they don't depend on just one search engine.
Samadzadeh, Gholam Reza; Rigi, Tahereh; Ganjali, Ali Reza
2013-01-01
Background Surveying valuable and most recent information from internet, has become vital for researchers and scholars, because every day, thousands and perhaps millions of scientific works are brought out as digital resources which represented by internet and researchers can’t ignore this great resource to find related documents for their literature search, which may not be found in any library. With regard to variety of documents presented on the internet, search engines are one of the most effective search tools for finding information. Objectives The aim of this study is to evaluate the three criteria, recall, preciseness and importance of the four search engines which are PubMed, Science Direct, Google Scholar and federated search of Iranian National Medical Digital Library in addiction (prevention and treatment) to select the most effective search engine for offering the best literature research. Materials and Methods This research was a cross-sectional study by which four popular search engines in medical sciences were evaluated. To select keywords, medical subject heading (Mesh) was used. We entered given keywords in the search engines and after searching, 10 first entries were evaluated. Direct observation was used as a mean for data collection and they were analyzed by descriptive statistics (number, percent number and mean) and inferential statistics, One way analysis of variance (ANOVA) and post hoc Tukey in Spss. 15 statistical software. P Value < 0.05 was considered statistically significant. Results Results have shown that the search engines had different operations with regard to the evaluated criteria. Since P Value was 0.004 < 0.05 for preciseness and was 0.002 < 0.05 for importance, it shows significant difference among search engines. PubMed, Science Direct and Google Scholar were the best in recall, preciseness and importance respectively. Conclusions As literature research is one of the most important stages of research, it's better for researchers, especially Substance-Related Disorders scholars to use different search engines with the best recall, preciseness and importance in that subject field to reach desirable results while searching and they don’t depend on just one search engine. PMID:24971257
Search Engine Liability for Copyright Infringement
NASA Astrophysics Data System (ADS)
Fitzgerald, B.; O'Brien, D.; Fitzgerald, A.
The chapter provides a broad overview to the topic of search engine liability for copyright infringement. In doing so, the chapter examines some of the key copyright law principles and their application to search engines. The chapter also provides a discussion of some of the most important cases to be decided within the courts of the United States, Australia, China and Europe regarding the liability of search engines for copyright infringement. Finally, the chapter will conclude with some thoughts for reform, including how copyright law can be amended in order to accommodate and realise the great informative power which search engines have to offer society.
Automating Expertise in Collaborative Learning Environments
ERIC Educational Resources Information Center
LaVoie, Noelle; Streeter, Lynn; Lochbaum, Karen; Wroblewski, David; Boyce, Lisa; Krupnick, Charles; Psotka, Joseph
2010-01-01
We have developed a set of tools for improving online collaborative learning including an automated expert that monitors and moderates discussions, and additional tools to evaluate contributions, semantically search all posted comments, access a library of hundreds of digital books and provide reports to instructors. The technology behind these…
ERIC Educational Resources Information Center
Blommaert, Jan
2009-01-01
This paper describes the cultural semantics of internet courses in American accent. Such courses are offered by corporate providers to specific groups of customers: people in search of success in the globalized business environment. The core of such courses is an order of indexicality which stresses uniformity and homogeneity, producing an…
Caro-Rojas, Rosa Angela; Eslava-Schmalbach, Javier H
2005-01-01
To compare the information obtained from the Medline database using Internet commercial search engines with that obtained from a compact disc (Medline-CD). An agreement study was carried out based on 101 clinical scenarios provided by specialists in internal medicine, pharmacy, gynaecology-obstetrics, surgery and paediatrics. 175 search strategies were employed using the connector AND plus text within quotation marks. The search was limited to 1991-1999. Internet search-engines were selected by common criteria. Identical search strategies were independently applied to and masked from Internet search engines, as well as the Medline-CD. 3,488 articles were obtained using 129 search strategies. Agreement with the Medline-CD was 54% for PubMed, 57% for Gateway, 54% for Medscape and 65% for BioMedNet. The highest agreement rate for a given speciality (paediatrics) was 78.1% for BioMedNet, having greater -/- than +/+ agreement. Even though free access to Medline has encouraged the boom and growth of evidence-based medicine, these results must be considered within the context of which search engine was selected for doing the searches. The Internet search engines studied showed a poor agreement with the Medline-CD, the rate of agreement differing according to speciality, thus significantly affecting searches and their reproducibility. Software designed for conducting Medline database searches, including the Medline-CD, must be standardised and validated.
Defining and Exposing Privacy Issues with Social Media
2012-06-11
Twitter, and Linked In[ I 0). VI. SEARCH ENGINES In addition to social networking sites, search engines pose new issues to privacy. As...networking, search engines , and storing personal information online in general have been accepted worldwide due to the benefits they provide. Social...networking provides even more communication in an information-demanding age, allowing users to interact across great distances. Search engines allow
2011-09-01
search engines to find information. Most commercial search engines (Google, Yahoo, Bing, etc.) provide their indexing and search services...at no cost. The DoD can achieve large gains at a small cost by making public documents available to search engines . This can be achieved through the...were organized on the website dodreports.com. The results of this research revealed improvement gains of 8-20% for finding reports through commercial search engines during the first six months of
Can people find patient decision aids on the Internet?
Morris, Debra; Drake, Elizabeth; Saarimaki, Anton; Bennett, Carol; O'Connor, Annette
2008-12-01
To determine if people could find patient decision aids (PtDAs) on the Internet using the most popular general search engines. We chose five medical conditions for which English language PtDAs were available from at least three different developers. The search engines used were: Google (www.google.com), Yahoo! (www.yahoo.com), and MSN (www.msn.com). For each condition and search engine we ran six searches using a combination of search terms. We coded all non-sponsored Web pages that were linked from the first page of the search results. Most first page results linked to informational Web pages about the condition, only 16% linked to PtDAs. PtDAs were more readily found for the breast cancer surgery decision (our searches found seven of the nine developers). The searches using Yahoo and Google search engines were more likely to find PtDAs. The following combination of search terms: condition, treatment, decision (e.g. breast cancer surgery decision) was most successful across all search engines (29%). While some terms and search engines were more successful, few resulted in direct links to PtDAs. Finding PtDAs would be improved with use of standardized labelling, providing patients with specific Web site addresses or access to an independent PtDA clearinghouse.
NASA Astrophysics Data System (ADS)
Gaševic, Dragan; Djuric, Dragan; Devedžic, Vladan
A relevant initiative from the software engineering community called Model Driven Engineering (MDE) is being developed in parallel with the Semantic Web (Mellor et al. 2003a). The MDE approach to software development suggests that one should first develop a model of the system under study, which is then transformed into the real thing (i.e., an executable software entity). The most important research initiative in this area is the Model Driven Architecture (MDA), which is Model Driven Architecture being developed under the umbrella of the Object Management Group (OMG). This chapter describes the basic concepts of this software engineering effort.