Sample records for semantic information extraction

  1. Extracting Useful Semantic Information from Large Scale Corpora of Text

    ERIC Educational Resources Information Center

    Mendoza, Ray Padilla, Jr.

    2012-01-01

    Extracting and representing semantic information from large scale corpora is at the crux of computer-assisted knowledge generation. Semantic information depends on collocation extraction methods, mathematical models used to represent distributional information, and weighting functions which transform the space. This dissertation provides a…

  2. Semantic Preview Benefit in English: Individual Differences in the Extraction and Use of Parafoveal Semantic Information

    ERIC Educational Resources Information Center

    Veldre, Aaron; Andrews, Sally

    2016-01-01

    Although there is robust evidence that skilled readers of English extract and use orthographic and phonological information from the parafovea to facilitate word identification, semantic preview benefits have been elusive. We sought to establish whether individual differences in the extraction and/or use of parafoveal semantic information could…

  3. Semantic Information Extraction of Lanes Based on Onboard Camera Videos

    NASA Astrophysics Data System (ADS)

    Tang, L.; Deng, T.; Ren, C.

    2018-04-01

    In the field of autonomous driving, semantic information of lanes is very important. This paper proposes a method of automatic detection of lanes and extraction of semantic information from onboard camera videos. The proposed method firstly detects the edges of lanes by the grayscale gradient direction, and improves the Probabilistic Hough transform to fit them; then, it uses the vanishing point principle to calculate the lane geometrical position, and uses lane characteristics to extract lane semantic information by the classification of decision trees. In the experiment, 216 road video images captured by a camera mounted onboard a moving vehicle were used to detect lanes and extract lane semantic information. The results show that the proposed method can accurately identify lane semantics from video images.

  4. Lexical and sublexical semantic preview benefits in Chinese reading.

    PubMed

    Yan, Ming; Zhou, Wei; Shu, Hua; Kliegl, Reinhold

    2012-07-01

    Semantic processing from parafoveal words is an elusive phenomenon in alphabetic languages, but it has been demonstrated only for a restricted set of noncompound Chinese characters. Using the gaze-contingent boundary paradigm, this experiment examined whether parafoveal lexical and sublexical semantic information was extracted from compound preview characters. Results generalized parafoveal semantic processing to this representative set of Chinese characters and extended the parafoveal processing to radical (sublexical) level semantic information extraction. Implications for notions of parafoveal information extraction during Chinese reading are discussed. 2012 APA, all rights reserved

  5. Integrating semantic information into multiple kernels for protein-protein interaction extraction from biomedical literatures.

    PubMed

    Li, Lishuang; Zhang, Panpan; Zheng, Tianfu; Zhang, Hongying; Jiang, Zhenchao; Huang, Degen

    2014-01-01

    Protein-Protein Interaction (PPI) extraction is an important task in the biomedical information extraction. Presently, many machine learning methods for PPI extraction have achieved promising results. However, the performance is still not satisfactory. One reason is that the semantic resources were basically ignored. In this paper, we propose a multiple-kernel learning-based approach to extract PPIs, combining the feature-based kernel, tree kernel and semantic kernel. Particularly, we extend the shortest path-enclosed tree kernel (SPT) by a dynamic extended strategy to retrieve the richer syntactic information. Our semantic kernel calculates the protein-protein pair similarity and the context similarity based on two semantic resources: WordNet and Medical Subject Heading (MeSH). We evaluate our method with Support Vector Machine (SVM) and achieve an F-score of 69.40% and an AUC of 92.00%, which show that our method outperforms most of the state-of-the-art systems by integrating semantic information.

  6. A extract method of mountainous area settlement place information from GF-1 high resolution optical remote sensing image under semantic constraints

    NASA Astrophysics Data System (ADS)

    Guo, H., II

    2016-12-01

    Spatial distribution information of mountainous area settlement place is of great significance to the earthquake emergency work because most of the key earthquake hazardous areas of china are located in the mountainous area. Remote sensing has the advantages of large coverage and low cost, it is an important way to obtain the spatial distribution information of mountainous area settlement place. At present, fully considering the geometric information, spectral information and texture information, most studies have applied object-oriented methods to extract settlement place information, In this article, semantic constraints is to be added on the basis of object-oriented methods. The experimental data is one scene remote sensing image of domestic high resolution satellite (simply as GF-1), with a resolution of 2 meters. The main processing consists of 3 steps, the first is pretreatment, including ortho rectification and image fusion, the second is Object oriented information extraction, including Image segmentation and information extraction, the last step is removing the error elements under semantic constraints, in order to formulate these semantic constraints, the distribution characteristics of mountainous area settlement place must be analyzed and the spatial logic relation between settlement place and other objects must be considered. The extraction accuracy calculation result shows that the extraction accuracy of object oriented method is 49% and rise up to 86% after the use of semantic constraints. As can be seen from the extraction accuracy, the extract method under semantic constraints can effectively improve the accuracy of mountainous area settlement place information extraction. The result shows that it is feasible to extract mountainous area settlement place information form GF-1 image, so the article proves that it has a certain practicality to use domestic high resolution optical remote sensing image in earthquake emergency preparedness.

  7. A Semantic Graph Query Language

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kaplan, I L

    2006-10-16

    Semantic graphs can be used to organize large amounts of information from a number of sources into one unified structure. A semantic query language provides a foundation for extracting information from the semantic graph. The graph query language described here provides a simple, powerful method for querying semantic graphs.

  8. The methodology of semantic analysis for extracting physical effects

    NASA Astrophysics Data System (ADS)

    Fomenkova, M. A.; Kamaev, V. A.; Korobkin, D. M.; Fomenkov, S. A.

    2017-01-01

    The paper represents new methodology of semantic analysis for physical effects extracting. This methodology is based on the Tuzov ontology that formally describes the Russian language. In this paper, semantic patterns were described to extract structural physical information in the form of physical effects. A new algorithm of text analysis was described.

  9. Enhancing biomedical text summarization using semantic relation extraction.

    PubMed

    Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

    2011-01-01

    Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from large amount of biomedical literature efficiently. In this paper, we present a method for generating text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and our results are better than the MEAD system, a well-known tool for text summarization.

  10. Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature

    PubMed Central

    Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar

    2017-01-01

    Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel based approaches such as linear kernel, tree kernel, graph kernel and combination of multiple kernels has achieved promising results in PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction which exploits both syntactic (structural) and semantic vectors information known as Distributed Smoothed Tree kernel (DSTK). DSTK comprises of distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate robust machine learning model composition of feature based kernel and DSTK were combined using ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves better f-score with five different corpora compared to other state-of-the-art systems. PMID:29099838

  11. Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature.

    PubMed

    Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar

    2017-01-01

    Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel based approaches such as linear kernel, tree kernel, graph kernel and combination of multiple kernels has achieved promising results in PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction which exploits both syntactic (structural) and semantic vectors information known as Distributed Smoothed Tree kernel (DSTK). DSTK comprises of distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate robust machine learning model composition of feature based kernel and DSTK were combined using ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves better f-score with five different corpora compared to other state-of-the-art systems.

  12. Enhancing Biomedical Text Summarization Using Semantic Relation Extraction

    PubMed Central

    Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

    2011-01-01

    Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from large amount of biomedical literature efficiently. In this paper, we present a method for generating text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and our results are better than the MEAD system, a well-known tool for text summarization. PMID:21887336

  13. Neuroanatomic organization of sound memory in humans.

    PubMed

    Kraut, Michael A; Pitcock, Jeffery A; Calhoun, Vince; Li, Juan; Freeman, Thomas; Hart, John

    2006-11-01

    The neural interface between sensory perception and memory is a central issue in neuroscience, particularly initial memory organization following perceptual analyses. We used functional magnetic resonance imaging to identify anatomic regions extracting initial auditory semantic memory information related to environmental sounds. Two distinct anatomic foci were detected in the right superior temporal gyrus when subjects identified sounds representing either animals or threatening items. Threatening animal stimuli elicited signal changes in both foci, suggesting a distributed neural representation. Our results demonstrate both category- and feature-specific responses to nonverbal sounds in early stages of extracting semantic memory information from these sounds. This organization allows for these category-feature detection nodes to extract early, semantic memory information for efficient processing of transient sound stimuli. Neural regions selective for threatening sounds are similar to those of nonhuman primates, demonstrating semantic memory organization for basic biological/survival primitives are present across species.

  14. Semantic extraction and processing of medical records for patient-oriented visual index

    NASA Astrophysics Data System (ADS)

    Zheng, Weilin; Dong, Wenjie; Chen, Xiangjiao; Zhang, Jianguo

    2012-02-01

    To have comprehensive and completed understanding healthcare status of a patient, doctors need to search patient medical records from different healthcare information systems, such as PACS, RIS, HIS, USIS, as a reference of diagnosis and treatment decisions for the patient. However, it is time-consuming and tedious to do these procedures. In order to solve this kind of problems, we developed a patient-oriented visual index system (VIS) to use the visual technology to show health status and to retrieve the patients' examination information stored in each system with a 3D human model. In this presentation, we present a new approach about how to extract the semantic and characteristic information from the medical record systems such as RIS/USIS to create the 3D Visual Index. This approach includes following steps: (1) Building a medical characteristic semantic knowledge base; (2) Developing natural language processing (NLP) engine to perform semantic analysis and logical judgment on text-based medical records; (3) Applying the knowledge base and NLP engine on medical records to extract medical characteristics (e.g., the positive focus information), and then mapping extracted information to related organ/parts of 3D human model to create the visual index. We performed the testing procedures on 559 samples of radiological reports which include 853 focuses, and achieved 828 focuses' information. The successful rate of focus extraction is about 97.1%.

  15. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media

    PubMed Central

    Cameron, Delroy; Smith, Gary A.; Daniulaityte, Raminta; Sheth, Amit P.; Dave, Drashti; Chen, Lu; Anand, Gaurish; Carlson, Robert; Watkins, Kera Z.; Falck, Russel

    2013-01-01

    Objectives The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel Semantic Web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO) (pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC). A combination of lexical, pattern-based and semantics-based techniques is used together with the domain knowledge to extract fine-grained semantic information from UGC. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks. Methods Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, routes of administration, etc. The DAO is also used to help recognize three types of data, namely: 1) entities, 2) relationships and 3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information from UGC, and querying, search, trend analysis and overall content analysis of social media related to prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in evaluation and refinements in information extraction techniques. Results A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. Extracted semantic information is currently in use in an online discovery support system, by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University. Conclusion A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never been developed for drug abuse research. PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in future. PMID:23892295

  16. Information Extraction from Unstructured Text for the Biodefense Knowledge Center

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Samatova, N F; Park, B; Krishnamurthy, R

    2005-04-29

    The Bio-Encyclopedia at the Biodefense Knowledge Center (BKC) is being constructed to allow an early detection of emerging biological threats to homeland security. It requires highly structured information extracted from variety of data sources. However, the quantity of new and vital information available from every day sources cannot be assimilated by hand, and therefore reliable high-throughput information extraction techniques are much anticipated. In support of the BKC, Lawrence Livermore National Laboratory and Oak Ridge National Laboratory, together with the University of Utah, are developing an information extraction system built around the bioterrorism domain. This paper reports two important pieces ofmore » our effort integrated in the system: key phrase extraction and semantic tagging. Whereas two key phrase extraction technologies developed during the course of project help identify relevant texts, our state-of-the-art semantic tagging system can pinpoint phrases related to emerging biological threats. Also we are enhancing and tailoring the Bio-Encyclopedia by augmenting semantic dictionaries and extracting details of important events, such as suspected disease outbreaks. Some of these technologies have already been applied to large corpora of free text sources vital to the BKC mission, including ProMED-mail, PubMed abstracts, and the DHS's Information Analysis and Infrastructure Protection (IAIP) news clippings. In order to address the challenges involved in incorporating such large amounts of unstructured text, the overall system is focused on precise extraction of the most relevant information for inclusion in the BKC.« less

  17. Populating the Semantic Web by Macro-reading Internet Text

    NASA Astrophysics Data System (ADS)

    Mitchell, Tom M.; Betteridge, Justin; Carlson, Andrew; Hruschka, Estevam; Wang, Richard

    A key question regarding the future of the semantic web is "how will we acquire structured information to populate the semantic web on a vast scale?" One approach is to enter this information manually. A second approach is to take advantage of pre-existing databases, and to develop common ontologies, publishing standards, and reward systems to make this data widely accessible. We consider here a third approach: developing software that automatically extracts structured information from unstructured text present on the web. We also describe preliminary results demonstrating that machine learning algorithms can learn to extract tens of thousands of facts to populate a diverse ontology, with imperfect but reasonably good accuracy.

  18. Semantic Location Extraction from Crowdsourced Data

    NASA Astrophysics Data System (ADS)

    Koswatte, S.; Mcdougall, K.; Liu, X.

    2016-06-01

    Crowdsourced Data (CSD) has recently received increased attention in many application areas including disaster management. Convenience of production and use, data currency and abundancy are some of the key reasons for attracting this high interest. Conversely, quality issues like incompleteness, credibility and relevancy prevent the direct use of such data in important applications like disaster management. Moreover, location information availability of CSD is problematic as it remains very low in many crowd sourced platforms such as Twitter. Also, this recorded location is mostly related to the mobile device or user location and often does not represent the event location. In CSD, event location is discussed descriptively in the comments in addition to the recorded location (which is generated by means of mobile device's GPS or mobile communication network). This study attempts to semantically extract the CSD location information with the help of an ontological Gazetteer and other available resources. 2011 Queensland flood tweets and Ushahidi Crowd Map data were semantically analysed to extract the location information with the support of Queensland Gazetteer which is converted to an ontological gazetteer and a global gazetteer. Some preliminary results show that the use of ontologies and semantics can improve the accuracy of place name identification of CSD and the process of location information extraction.

  19. SPECTRa-T: machine-based data extraction and semantic searching of chemistry e-theses.

    PubMed

    Downing, Jim; Harvey, Matt J; Morgan, Peter B; Murray-Rust, Peter; Rzepa, Henry S; Stewart, Diana C; Tonge, Alan P; Townsend, Joe A

    2010-02-22

    The SPECTRa-T project has developed text-mining tools to extract named chemical entities (NCEs), such as chemical names and terms, and chemical objects (COs), e.g., experimental spectral assignments and physical chemistry properties, from electronic theses (e-theses). Although NCEs were readily identified within the two major document formats studied, only the use of structured documents enabled identification of chemical objects and their association with the relevant chemical entity (e.g., systematic chemical name). A corpus of theses was analyzed and it is shown that a high degree of semantic information can be extracted from structured documents. This integrated information has been deposited in a persistent Resource Description Framework (RDF) triple-store that allows users to conduct semantic searches. The strength and weaknesses of several document formats are reviewed.

  20. Lexical and Sublexical Semantic Preview Benefits in Chinese Reading

    ERIC Educational Resources Information Center

    Yan, Ming; Zhou, Wei; Shu, Hua; Kliegl, Reinhold

    2012-01-01

    Semantic processing from parafoveal words is an elusive phenomenon in alphabetic languages, but it has been demonstrated only for a restricted set of noncompound Chinese characters. Using the gaze-contingent boundary paradigm, this experiment examined whether parafoveal lexical and sublexical semantic information was extracted from compound…

  1. Point Cloud Classification of Tesserae from Terrestrial Laser Data Combined with Dense Image Matching for Archaeological Information Extraction

    NASA Astrophysics Data System (ADS)

    Poux, F.; Neuville, R.; Billen, R.

    2017-08-01

    Reasoning from information extraction given by point cloud data mining allows contextual adaptation and fast decision making. However, to achieve this perceptive level, a point cloud must be semantically rich, retaining relevant information for the end user. This paper presents an automatic knowledge-based method for pre-processing multi-sensory data and classifying a hybrid point cloud from both terrestrial laser scanning and dense image matching. Using 18 features including sensor's biased data, each tessera in the high-density point cloud from the 3D captured complex mosaics of Germigny-des-prés (France) is segmented via a colour multi-scale abstraction-based featuring extracting connectivity. A 2D surface and outline polygon of each tessera is generated by a RANSAC plane extraction and convex hull fitting. Knowledge is then used to classify every tesserae based on their size, surface, shape, material properties and their neighbour's class. The detection and semantic enrichment method shows promising results of 94% correct semantization, a first step toward the creation of an archaeological smart point cloud.

  2. A semantic model for multimodal data mining in healthcare information systems.

    PubMed

    Iakovidis, Dimitris; Smailis, Christos

    2012-01-01

    Electronic health records (EHRs) are representative examples of multimodal/multisource data collections; including measurements, images and free texts. The diversity of such information sources and the increasing amounts of medical data produced by healthcare institutes annually, pose significant challenges in data mining. In this paper we present a novel semantic model that describes knowledge extracted from the lowest-level of a data mining process, where information is represented by multiple features i.e. measurements or numerical descriptors extracted from measurements, images, texts or other medical data, forming multidimensional feature spaces. Knowledge collected by manual annotation or extracted by unsupervised data mining from one or more feature spaces is modeled through generalized qualitative spatial semantics. This model enables a unified representation of knowledge across multimodal data repositories. It contributes to bridging the semantic gap, by enabling direct links between low-level features and higher-level concepts e.g. describing body parts, anatomies and pathological findings. The proposed model has been developed in web ontology language based on description logics (OWL-DL) and can be applied to a variety of data mining tasks in medical informatics. It utility is demonstrated for automatic annotation of medical data.

  3. Spatial information semantic query based on SPARQL

    NASA Astrophysics Data System (ADS)

    Xiao, Zhifeng; Huang, Lei; Zhai, Xiaofang

    2009-10-01

    How can the efficiency of spatial information inquiries be enhanced in today's fast-growing information age? We are rich in geospatial data but poor in up-to-date geospatial information and knowledge that are ready to be accessed by public users. This paper adopts an approach for querying spatial semantic by building an Web Ontology language(OWL) format ontology and introducing SPARQL Protocol and RDF Query Language(SPARQL) to search spatial semantic relations. It is important to establish spatial semantics that support for effective spatial reasoning for performing semantic query. Compared to earlier keyword-based and information retrieval techniques that rely on syntax, we use semantic approaches in our spatial queries system. Semantic approaches need to be developed by ontology, so we use OWL to describe spatial information extracted by the large-scale map of Wuhan. Spatial information expressed by ontology with formal semantics is available to machines for processing and to people for understanding. The approach is illustrated by introducing a case study for using SPARQL to query geo-spatial ontology instances of Wuhan. The paper shows that making use of SPARQL to search OWL ontology instances can ensure the result's accuracy and applicability. The result also indicates constructing a geo-spatial semantic query system has positive efforts on forming spatial query and retrieval.

  4. PREDOSE: a semantic web platform for drug abuse epidemiology using social media.

    PubMed

    Cameron, Delroy; Smith, Gary A; Daniulaityte, Raminta; Sheth, Amit P; Dave, Drashti; Chen, Lu; Anand, Gaurish; Carlson, Robert; Watkins, Kera Z; Falck, Russel

    2013-12-01

    The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel semantic web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO--pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC), through combination of lexical, pattern-based and semantics-based techniques. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks. Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, and routes of administration. The DAO is also used to help recognize three types of data, namely: (1) entities, (2) relationships and (3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information, which facilitate search, trend analysis and overall content analysis using social media on prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in evaluation and refinements in information extraction techniques. A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. Extracted semantic information is currently in use in an online discovery support system, by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University. A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never been developed for drug abuse research. PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in future. Copyright © 2013 Elsevier Inc. All rights reserved.

  5. Semantic Preview Benefit in Eye Movements during Reading: A Parafoveal Fast-Priming Study

    ERIC Educational Resources Information Center

    Hohenstein, Sven; Laubrock, Jochen; Kliegl, Reinhold

    2010-01-01

    Eye movements in reading are sensitive to foveal and parafoveal word features. Whereas the influence of orthographic or phonological parafoveal information on gaze control is undisputed, there has been no reliable evidence for early parafoveal extraction of semantic information in alphabetic script. Using a novel combination of the gaze-contingent…

  6. Structuring and extracting knowledge for the support of hypothesis generation in molecular biology

    PubMed Central

    Roos, Marco; Marshall, M Scott; Gibson, Andrew P; Schuemie, Martijn; Meij, Edgar; Katrenko, Sophia; van Hage, Willem Robert; Krommydas, Konstantinos; Adriaans, Pieter W

    2009-01-01

    Background Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit the control over the modeling and extraction processes, we seek a methodology that supports control by the experimenter over these critical processes. Results We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence. Conclusion We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation. PMID:19796406

  7. Semantic information extracting system for classification of radiological reports in radiology information system (RIS)

    NASA Astrophysics Data System (ADS)

    Shi, Liehang; Ling, Tonghui; Zhang, Jianguo

    2016-03-01

    Radiologists currently use a variety of terminologies and standards in most hospitals in China, and even there are multiple terminologies being used for different sections in one department. In this presentation, we introduce a medical semantic comprehension system (MedSCS) to extract semantic information about clinical findings and conclusion from free text radiology reports so that the reports can be classified correctly based on medical terms indexing standards such as Radlex or SONMED-CT. Our system (MedSCS) is based on both rule-based methods and statistics-based methods which improve the performance and the scalability of MedSCS. In order to evaluate the over all of the system and measure the accuracy of the outcomes, we developed computation methods to calculate the parameters of precision rate, recall rate, F-score and exact confidence interval.

  8. Improving life sciences information retrieval using semantic web technology.

    PubMed

    Quan, Dennis

    2007-05-01

    The ability to retrieve relevant information is at the heart of every aspect of research and development in the life sciences industry. Information is often distributed across multiple systems and recorded in a way that makes it difficult to piece together the complete picture. Differences in data formats, naming schemes and network protocols amongst information sources, both public and private, must be overcome, and user interfaces not only need to be able to tap into these diverse information sources but must also assist users in filtering out extraneous information and highlighting the key relationships hidden within an aggregated set of information. The Semantic Web community has made great strides in proposing solutions to these problems, and many efforts are underway to apply Semantic Web techniques to the problem of information retrieval in the life sciences space. This article gives an overview of the principles underlying a Semantic Web-enabled information retrieval system: creating a unified abstraction for knowledge using the RDF semantic network model; designing semantic lenses that extract contextually relevant subsets of information; and assembling semantic lenses into powerful information displays. Furthermore, concrete examples of how these principles can be applied to life science problems including a scenario involving a drug discovery dashboard prototype called BioDash are provided.

  9. Biomedical question answering using semantic relations.

    PubMed

    Hristovski, Dimitar; Dinevski, Dejan; Kastrin, Andrej; Rindflesch, Thomas C

    2015-01-16

    The proliferation of the scientific literature in the field of biomedicine makes it difficult to keep abreast of current knowledge, even for domain experts. While general Web search engines and specialized information retrieval (IR) systems have made important strides in recent decades, the problem of accurate knowledge extraction from the biomedical literature is far from solved. Classical IR systems usually return a list of documents that have to be read by the user to extract relevant information. This tedious and time-consuming work can be lessened with automatic Question Answering (QA) systems, which aim to provide users with direct and precise answers to their questions. In this work we propose a novel methodology for QA based on semantic relations extracted from the biomedical literature. We extracted semantic relations with the SemRep natural language processing system from 122,421,765 sentences, which came from 21,014,382 MEDLINE citations (i.e., the complete MEDLINE distribution up to the end of 2012). A total of 58,879,300 semantic relation instances were extracted and organized in a relational database. The QA process is implemented as a search in this database, which is accessed through a Web-based application, called SemBT (available at http://sembt.mf.uni-lj.si ). We conducted an extensive evaluation of the proposed methodology in order to estimate the accuracy of extracting a particular semantic relation from a particular sentence. Evaluation was performed by 80 domain experts. In total 7,510 semantic relation instances belonging to 2,675 distinct relations were evaluated 12,083 times. The instances were evaluated as correct 8,228 times (68%). In this work we propose an innovative methodology for biomedical QA. The system is implemented as a Web-based application that is able to provide precise answers to a wide range of questions. A typical question is answered within a few seconds. The tool has some extensions that make it especially useful for interpretation of DNA microarray results.

  10. Making Semantic Information Work Effectively for Degraded Environments

    DTIC Science & Technology

    2013-06-01

    Control Research & Technology Symposium (ICCRTS) held 19-21 June, 2013 in Alexandria, VA. 14. ABSTRACT The challenges of effectively managing semantic...technologies over disadvantaged or degraded environments are numerous and complex. One of the greatest challenges is the size of raw data. Large...approach mitigates this challenge by performing data reduction through the adoption of format recognition technologies, semantic data extractions, and the

  11. Enhancing acronym/abbreviation knowledge bases with semantic information.

    PubMed

    Torii, Manabu; Liu, Hongfang

    2007-10-11

    In the biomedical domain, a terminology knowledge base that associates acronyms/abbreviations (denoted as SFs) with the definitions (denoted as LFs) is highly needed. For the construction such terminology knowledge base, we investigate the feasibility to build a system automatically assigning semantic categories to LFs extracted from text. Given a collection of pairs (SF,LF) derived from text, we i) assess the coverage of LFs and pairs (SF,LF) in the UMLS and justify the need of a semantic category assignment system; and ii) automatically derive name phrases annotated with semantic category and construct a system using machine learning. Utilizing ADAM, an existing collection of (SF,LF) pairs extracted from MEDLINE, our system achieved an f-measure of 87% when assigning eight UMLS-based semantic groups to LFs. The system has been incorporated into a web interface which integrates SF knowledge from multiple SF knowledge bases. Web site: http://gauss.dbb.georgetown.edu/liblab/SFThesurus.

  12. Extracting semantic representations from word co-occurrence statistics: stop-lists, stemming, and SVD.

    PubMed

    Bullinaria, John A; Levy, Joseph P

    2012-09-01

    In a previous article, we presented a systematic computational study of the extraction of semantic representations from the word-word co-occurrence statistics of large text corpora. The conclusion was that semantic vectors of pointwise mutual information values from very small co-occurrence windows, together with a cosine distance measure, consistently resulted in the best representations across a range of psychologically relevant semantic tasks. This article extends that study by investigating the use of three further factors--namely, the application of stop-lists, word stemming, and dimensionality reduction using singular value decomposition (SVD)--that have been used to provide improved performance elsewhere. It also introduces an additional semantic task and explores the advantages of using a much larger corpus. This leads to the discovery and analysis of improved SVD-based methods for generating semantic representations (that provide new state-of-the-art performance on a standard TOEFL task) and the identification and discussion of problems and misleading results that can arise without a full systematic study.

  13. Hybrid ontology for semantic information retrieval model using keyword matching indexing system.

    PubMed

    Uthayan, K R; Mala, G S Anandha

    2015-01-01

    Ontology is the process of growth and elucidation of concepts of an information domain being common for a group of users. Establishing ontology into information retrieval is a normal method to develop searching effects of relevant information users require. Keywords matching process with historical or information domain is significant in recent calculations for assisting the best match for specific input queries. This research presents a better querying mechanism for information retrieval which integrates the ontology queries with keyword search. The ontology-based query is changed into a primary order to predicate logic uncertainty which is used for routing the query to the appropriate servers. Matching algorithms characterize warm area of researches in computer science and artificial intelligence. In text matching, it is more dependable to study semantics model and query for conditions of semantic matching. This research develops the semantic matching results between input queries and information in ontology field. The contributed algorithm is a hybrid method that is based on matching extracted instances from the queries and information field. The queries and information domain is focused on semantic matching, to discover the best match and to progress the executive process. In conclusion, the hybrid ontology in semantic web is sufficient to retrieve the documents when compared to standard ontology.

  14. Hybrid Ontology for Semantic Information Retrieval Model Using Keyword Matching Indexing System

    PubMed Central

    Uthayan, K. R.; Anandha Mala, G. S.

    2015-01-01

    Ontology is the process of growth and elucidation of concepts of an information domain being common for a group of users. Establishing ontology into information retrieval is a normal method to develop searching effects of relevant information users require. Keywords matching process with historical or information domain is significant in recent calculations for assisting the best match for specific input queries. This research presents a better querying mechanism for information retrieval which integrates the ontology queries with keyword search. The ontology-based query is changed into a primary order to predicate logic uncertainty which is used for routing the query to the appropriate servers. Matching algorithms characterize warm area of researches in computer science and artificial intelligence. In text matching, it is more dependable to study semantics model and query for conditions of semantic matching. This research develops the semantic matching results between input queries and information in ontology field. The contributed algorithm is a hybrid method that is based on matching extracted instances from the queries and information field. The queries and information domain is focused on semantic matching, to discover the best match and to progress the executive process. In conclusion, the hybrid ontology in semantic web is sufficient to retrieve the documents when compared to standard ontology. PMID:25922851

  15. An Ontology-Based Approach to Incorporate User-Generated Geo-Content Into Sdi

    NASA Astrophysics Data System (ADS)

    Deng, D.-P.; Lemmens, R.

    2011-08-01

    The Web is changing the way people share and communicate information because of emergence of various Web technologies, which enable people to contribute information on the Web. User-Generated Geo-Content (UGGC) is a potential resource of geographic information. Due to the different production methods, UGGC often cannot fit in geographic information model. There is a semantic gap between UGGC and formal geographic information. To integrate UGGC into geographic information, this study conducts an ontology-based process to bridge this semantic gap. This ontology-based process includes five steps: Collection, Extraction, Formalization, Mapping, and Deployment. In addition, this study implements this process on Twitter messages, which is relevant to Japan Earthquake disaster. By using this process, we extract disaster relief information from Twitter messages, and develop a knowledge base for GeoSPARQL queries in disaster relief information.

  16. Automatic information extraction from unstructured mammography reports using distributed semantics.

    PubMed

    Gupta, Anupama; Banerjee, Imon; Rubin, Daniel L

    2018-02-01

    To date, the methods developed for automated extraction of information from radiology reports are mainly rule-based or dictionary-based, and, therefore, require substantial manual effort to build these systems. Recent efforts to develop automated systems for entity detection have been undertaken, but little work has been done to automatically extract relations and their associated named entities in narrative radiology reports that have comparable accuracy to rule-based methods. Our goal is to extract relations in a unsupervised way from radiology reports without specifying prior domain knowledge. We propose a hybrid approach for information extraction that combines dependency-based parse tree with distributed semantics for generating structured information frames about particular findings/abnormalities from the free-text mammography reports. The proposed IE system obtains a F 1 -score of 0.94 in terms of completeness of the content in the information frames, which outperforms a state-of-the-art rule-based system in this domain by a significant margin. The proposed system can be leveraged in a variety of applications, such as decision support and information retrieval, and may also easily scale to other radiology domains, since there is no need to tune the system with hand-crafted information extraction rules. Copyright © 2018 Elsevier Inc. All rights reserved.

  17. Medical document anonymization with a semantic lexicon.

    PubMed Central

    Ruch, P.; Baud, R. H.; Rassinoux, A. M.; Bouillon, P.; Robert, G.

    2000-01-01

    We present an original system for locating and removing personally-identifying information in patient records. In this experiment, anonymization is seen as a particular case of knowledge extraction. We use natural language processing tools provided by the MEDTAG framework: a semantic lexicon specialized in medicine, and a toolkit for word-sense and morpho-syntactic tagging. The system finds 98-99% of all personally-identifying information. PMID:11079980

  18. First Steps to Automated Interior Reconstruction from Semantically Enriched Point Clouds and Imagery

    NASA Astrophysics Data System (ADS)

    Obrock, L. S.; Gülch, E.

    2018-05-01

    The automated generation of a BIM-Model from sensor data is a huge challenge for the modeling of existing buildings. Currently the measurements and analyses are time consuming, allow little automation and require expensive equipment. We do lack an automated acquisition of semantical information of objects in a building. We are presenting first results of our approach based on imagery and derived products aiming at a more automated modeling of interior for a BIM building model. We examine the building parts and objects visible in the collected images using Deep Learning Methods based on Convolutional Neural Networks. For localization and classification of building parts we apply the FCN8s-Model for pixel-wise Semantic Segmentation. We, so far, reach a Pixel Accuracy of 77.2 % and a mean Intersection over Union of 44.2 %. We finally use the network for further reasoning on the images of the interior room. We combine the segmented images with the original images and use photogrammetric methods to produce a three-dimensional point cloud. We code the extracted object types as colours of the 3D-points. We thus are able to uniquely classify the points in three-dimensional space. We preliminary investigate a simple extraction method for colour and material of building parts. It is shown, that the combined images are very well suited to further extract more semantic information for the BIM-Model. With the presented methods we see a sound basis for further automation of acquisition and modeling of semantic and geometric information of interior rooms for a BIM-Model.

  19. iSMART: Ontology-based Semantic Query of CDA Documents

    PubMed Central

    Liu, Shengping; Ni, Yuan; Mei, Jing; Li, Hanyu; Xie, Guotong; Hu, Gang; Liu, Haifeng; Hou, Xueqiao; Pan, Yue

    2009-01-01

    The Health Level 7 Clinical Document Architecture (CDA) is widely accepted as the format for electronic clinical document. With the rich ontological references in CDA documents, the ontology-based semantic query could be performed to retrieve CDA documents. In this paper, we present iSMART (interactive Semantic MedicAl Record reTrieval), a prototype system designed for ontology-based semantic query of CDA documents. The clinical information in CDA documents will be extracted into RDF triples by a declarative XML to RDF transformer. An ontology reasoner is developed to infer additional information by combining the background knowledge from SNOMED CT ontology. Then an RDF query engine is leveraged to enable the semantic queries. This system has been evaluated using the real clinical documents collected from a large hospital in southern China. PMID:20351883

  20. Gstruct: a system for extracting schemas from GML documents

    NASA Astrophysics Data System (ADS)

    Chen, Hui; Zhu, Fubao; Guan, Jihong; Zhou, Shuigeng

    2008-10-01

    Geography Markup Language (GML) becomes the de facto standard for geographic information representation on the internet. GML schema provides a way to define the structure, content, and semantic of GML documents. It contains useful structural information of GML documents and plays an important role in storing, querying and analyzing GML data. However, GML schema is not mandatory, and it is common that a GML document contains no schema. In this paper, we present Gstruct, a tool for GML schema extraction. Gstruct finds the features in the input GML documents, identifies geometry datatypes as well as simple datatypes, then integrates all these features and eliminates improper components to output the optimal schema. Experiments demonstrate that Gstruct is effective in extracting semantically meaningful schemas from GML documents.

  1. A Graph-Based Recovery and Decomposition of Swanson’s Hypothesis using Semantic Predications

    PubMed Central

    Cameron, Delroy; Bodenreider, Olivier; Yalamanchili, Hima; Danh, Tu; Vallabhaneni, Sreeram; Thirunarayan, Krishnaprasad; Sheth, Amit P.; Rindflesch, Thomas C.

    2014-01-01

    Objectives This paper presents a methodology for recovering and decomposing Swanson’s Raynaud Syndrome–Fish Oil Hypothesis semi-automatically. The methodology leverages the semantics of assertions extracted from biomedical literature (called semantic predications) along with structured background knowledge and graph-based algorithms to semi-automatically capture the informative associations originally discovered manually by Swanson. Demonstrating that Swanson’s manually intensive techniques can be undertaken semi-automatically, paves the way for fully automatic semantics-based hypothesis generation from scientific literature. Methods Semantic predications obtained from biomedical literature allow the construction of labeled directed graphs which contain various associations among concepts from the literature. By aggregating such associations into informative subgraphs, some of the relevant details originally articulated by Swanson has been uncovered. However, by leveraging background knowledge to bridge important knowledge gaps in the literature, a methodology for semi-automatically capturing the detailed associations originally explicated in natural language by Swanson has been developed. Results Our methodology not only recovered the 3 associations commonly recognized as Swanson’s Hypothesis, but also decomposed them into an additional 16 detailed associations, formulated as chains of semantic predications. Altogether, 14 out of the 19 associations that can be attributed to Swanson were retrieved using our approach. To the best of our knowledge, such an in-depth recovery and decomposition of Swanson’s Hypothesis has never been attempted. Conclusion In this work therefore, we presented a methodology for semi- automatically recovering and decomposing Swanson’s RS-DFO Hypothesis using semantic representations and graph algorithms. Our methodology provides new insights into potential prerequisites for semantics-driven Literature-Based Discovery (LBD). These suggest that three critical aspects of LBD include: 1) the need for more expressive representations beyond Swanson’s ABC model; 2) an ability to accurately extract semantic information from text; and 3) the semantic integration of scientific literature with structured background knowledge. PMID:23026233

  2. Towards Text Copyright Detection Using Metadata in Web Applications

    ERIC Educational Resources Information Center

    Poulos, Marios; Korfiatis, Nikolaos; Bokos, George

    2011-01-01

    Purpose: This paper aims to present the semantic content identifier (SCI), a permanent identifier, computed through a linear-time onion-peeling algorithm that enables the extraction of semantic features from a text, and the integration of this information within the permanent identifier. Design/methodology/approach: The authors employ SCI to…

  3. Semantic Preview Benefit during Reading

    ERIC Educational Resources Information Center

    Hohenstein, Sven; Kliegl, Reinhold

    2014-01-01

    Word features in parafoveal vision influence eye movements during reading. The question of whether readers extract semantic information from parafoveal words was studied in 3 experiments by using a gaze-contingent display change technique. Subjects read German sentences containing 1 of several preview words that were replaced by a target word…

  4. Semantic and syntactic interoperability in online processing of big Earth observation data.

    PubMed

    Sudmanns, Martin; Tiede, Dirk; Lang, Stefan; Baraldi, Andrea

    2018-01-01

    The challenge of enabling syntactic and semantic interoperability for comprehensive and reproducible online processing of big Earth observation (EO) data is still unsolved. Supporting both types of interoperability is one of the requirements to efficiently extract valuable information from the large amount of available multi-temporal gridded data sets. The proposed system wraps world models, (semantic interoperability) into OGC Web Processing Services (syntactic interoperability) for semantic online analyses. World models describe spatio-temporal entities and their relationships in a formal way. The proposed system serves as enabler for (1) technical interoperability using a standardised interface to be used by all types of clients and (2) allowing experts from different domains to develop complex analyses together as collaborative effort. Users are connecting the world models online to the data, which are maintained in a centralised storage as 3D spatio-temporal data cubes. It allows also non-experts to extract valuable information from EO data because data management, low-level interactions or specific software issues can be ignored. We discuss the concept of the proposed system, provide a technical implementation example and describe three use cases for extracting changes from EO images and demonstrate the usability also for non-EO, gridded, multi-temporal data sets (CORINE land cover).

  5. Semantic and syntactic interoperability in online processing of big Earth observation data

    PubMed Central

    Sudmanns, Martin; Tiede, Dirk; Lang, Stefan; Baraldi, Andrea

    2018-01-01

    ABSTRACT The challenge of enabling syntactic and semantic interoperability for comprehensive and reproducible online processing of big Earth observation (EO) data is still unsolved. Supporting both types of interoperability is one of the requirements to efficiently extract valuable information from the large amount of available multi-temporal gridded data sets. The proposed system wraps world models, (semantic interoperability) into OGC Web Processing Services (syntactic interoperability) for semantic online analyses. World models describe spatio-temporal entities and their relationships in a formal way. The proposed system serves as enabler for (1) technical interoperability using a standardised interface to be used by all types of clients and (2) allowing experts from different domains to develop complex analyses together as collaborative effort. Users are connecting the world models online to the data, which are maintained in a centralised storage as 3D spatio-temporal data cubes. It allows also non-experts to extract valuable information from EO data because data management, low-level interactions or specific software issues can be ignored. We discuss the concept of the proposed system, provide a technical implementation example and describe three use cases for extracting changes from EO images and demonstrate the usability also for non-EO, gridded, multi-temporal data sets (CORINE land cover). PMID:29387171

  6. Vigi4Med Scraper: A Framework for Web Forum Structured Data Extraction and Semantic Representation

    PubMed Central

    Audeh, Bissan; Beigbeder, Michel; Zimmermann, Antoine; Jaillon, Philippe; Bousquet, Cédric

    2017-01-01

    The extraction of information from social media is an essential yet complicated step for data analysis in multiple domains. In this paper, we present Vigi4Med Scraper, a generic open source framework for extracting structured data from web forums. Our framework is highly configurable; using a configuration file, the user can freely choose the data to extract from any web forum. The extracted data are anonymized and represented in a semantic structure using Resource Description Framework (RDF) graphs. This representation enables efficient manipulation by data analysis algorithms and allows the collected data to be directly linked to any existing semantic resource. To avoid server overload, an integrated proxy with caching functionality imposes a minimal delay between sequential requests. Vigi4Med Scraper represents the first step of Vigi4Med, a project to detect adverse drug reactions (ADRs) from social networks founded by the French drug safety agency Agence Nationale de Sécurité du Médicament (ANSM). Vigi4Med Scraper has successfully extracted greater than 200 gigabytes of data from the web forums of over 20 different websites. PMID:28122056

  7. Speed Limits: Orientation and Semantic Context Interactions Constrain Natural Scene Discrimination Dynamics

    ERIC Educational Resources Information Center

    Rieger, Jochem W.; Kochy, Nick; Schalk, Franziska; Gruschow, Marcus; Heinze, Hans-Jochen

    2008-01-01

    The visual system rapidly extracts information about objects from the cluttered natural environment. In 5 experiments, the authors quantified the influence of orientation and semantics on the classification speed of objects in natural scenes, particularly with regard to object-context interactions. Natural scene photographs were presented in an…

  8. A Validation of Parafoveal Semantic Information Extraction in Reading Chinese

    ERIC Educational Resources Information Center

    Zhou, Wei; Kliegl, Reinhold; Yan, Ming

    2013-01-01

    Parafoveal semantic processing has recently been well documented in reading Chinese sentences, presumably because of language-specific features. However, because of a large variation of fixation landing positions on pretarget words, some preview words actually were located in foveal vision when readers' eyes landed close to the end of the…

  9. Processing of Crawled Urban Imagery for Building Use Classification

    NASA Astrophysics Data System (ADS)

    Tutzauer, P.; Haala, N.

    2017-05-01

    Recent years have shown a shift from pure geometric 3D city models to data with semantics. This is induced by new applications (e.g. Virtual/Augmented Reality) and also a requirement for concepts like Smart Cities. However, essential urban semantic data like building use categories is often not available. We present a first step in bridging this gap by proposing a pipeline to use crawled urban imagery and link it with ground truth cadastral data as an input for automatic building use classification. We aim to extract this city-relevant semantic information automatically from Street View (SV) imagery. Convolutional Neural Networks (CNNs) proved to be extremely successful for image interpretation, however, require a huge amount of training data. Main contribution of the paper is the automatic provision of such training datasets by linking semantic information as already available from databases provided from national mapping agencies or city administrations to the corresponding façade images extracted from SV. Finally, we present first investigations with a CNN and an alternative classifier as a proof of concept.

  10. Closed-Loop Lifecycle Management of Service and Product in the Internet of Things: Semantic Framework for Knowledge Integration.

    PubMed

    Yoo, Min-Jung; Grozel, Clément; Kiritsis, Dimitris

    2016-07-08

    This paper describes our conceptual framework of closed-loop lifecycle information sharing for product-service in the Internet of Things (IoT). The framework is based on the ontology model of product-service and a type of IoT message standard, Open Messaging Interface (O-MI) and Open Data Format (O-DF), which ensures data communication. (1) BACKGROUND: Based on an existing product lifecycle management (PLM) methodology, we enhanced the ontology model for the purpose of integrating efficiently the product-service ontology model that was newly developed; (2) METHODS: The IoT message transfer layer is vertically integrated into a semantic knowledge framework inside which a Semantic Info-Node Agent (SINA) uses the message format as a common protocol of product-service lifecycle data transfer; (3) RESULTS: The product-service ontology model facilitates information retrieval and knowledge extraction during the product lifecycle, while making more information available for the sake of service business creation. The vertical integration of IoT message transfer, encompassing all semantic layers, helps achieve a more flexible and modular approach to knowledge sharing in an IoT environment; (4) Contribution: A semantic data annotation applied to IoT can contribute to enhancing collected data types, which entails a richer knowledge extraction. The ontology-based PLM model enables as well the horizontal integration of heterogeneous PLM data while breaking traditional vertical information silos; (5) CONCLUSION: The framework was applied to a fictive case study with an electric car service for the purpose of demonstration. For the purpose of demonstrating the feasibility of the approach, the semantic model is implemented in Sesame APIs, which play the role of an Internet-connected Resource Description Framework (RDF) database.

  11. Closed-Loop Lifecycle Management of Service and Product in the Internet of Things: Semantic Framework for Knowledge Integration

    PubMed Central

    Yoo, Min-Jung; Grozel, Clément; Kiritsis, Dimitris

    2016-01-01

    This paper describes our conceptual framework of closed-loop lifecycle information sharing for product-service in the Internet of Things (IoT). The framework is based on the ontology model of product-service and a type of IoT message standard, Open Messaging Interface (O-MI) and Open Data Format (O-DF), which ensures data communication. (1) Background: Based on an existing product lifecycle management (PLM) methodology, we enhanced the ontology model for the purpose of integrating efficiently the product-service ontology model that was newly developed; (2) Methods: The IoT message transfer layer is vertically integrated into a semantic knowledge framework inside which a Semantic Info-Node Agent (SINA) uses the message format as a common protocol of product-service lifecycle data transfer; (3) Results: The product-service ontology model facilitates information retrieval and knowledge extraction during the product lifecycle, while making more information available for the sake of service business creation. The vertical integration of IoT message transfer, encompassing all semantic layers, helps achieve a more flexible and modular approach to knowledge sharing in an IoT environment; (4) Contribution: A semantic data annotation applied to IoT can contribute to enhancing collected data types, which entails a richer knowledge extraction. The ontology-based PLM model enables as well the horizontal integration of heterogeneous PLM data while breaking traditional vertical information silos; (5) Conclusion: The framework was applied to a fictive case study with an electric car service for the purpose of demonstration. For the purpose of demonstrating the feasibility of the approach, the semantic model is implemented in Sesame APIs, which play the role of an Internet-connected Resource Description Framework (RDF) database. PMID:27399717

  12. QTLTableMiner++: semantic mining of QTL tables in scientific articles.

    PubMed

    Singh, Gurnoor; Kuzniar, Arnold; van Mulligen, Erik M; Gavai, Anand; Bachem, Christian W; Visser, Richard G F; Finkers, Richard

    2018-05-25

    A quantitative trait locus (QTL) is a genomic region that correlates with a phenotype. Most of the experimental information about QTL mapping studies is described in tables of scientific publications. Traditional text mining techniques aim to extract information from unstructured text rather than from tables. We present QTLTableMiner ++ (QTM), a table mining tool that extracts and semantically annotates QTL information buried in (heterogeneous) tables of plant science literature. QTM is a command line tool written in the Java programming language. This tool takes scientific articles from the Europe PMC repository as input, extracts QTL tables using keyword matching and ontology-based concept identification. The tables are further normalized using rules derived from table properties such as captions, column headers and table footers. Furthermore, table columns are classified into three categories namely column descriptors, properties and values based on column headers and data types of cell entries. Abbreviations found in the tables are expanded using the Schwartz and Hearst algorithm. Finally, the content of QTL tables is semantically enriched with domain-specific ontologies (e.g. Crop Ontology, Plant Ontology and Trait Ontology) using the Apache Solr search platform and the results are stored in a relational database and a text file. The performance of the QTM tool was assessed by precision and recall based on the information retrieved from two manually annotated corpora of open access articles, i.e. QTL mapping studies in tomato (Solanum lycopersicum) and in potato (S. tuberosum). In summary, QTM detected QTL statements in tomato with 74.53% precision and 92.56% recall and in potato with 82.82% precision and 98.94% recall. QTM is a unique tool that aids in providing QTL information in machine-readable and semantically interoperable formats.

  13. Morphometric information to reduce the semantic gap in the characterization of microscopic images of thyroid nodules.

    PubMed

    Macedo, Alessandra A; Pessotti, Hugo C; Almansa, Luciana F; Felipe, Joaquim C; Kimura, Edna T

    2016-07-01

    The analyses of several systems for medical-imaging processing typically support the extraction of image attributes, but do not comprise some information that characterizes images. For example, morphometry can be applied to find new information about the visual content of an image. The extension of information may result in knowledge. Subsequently, results of mappings can be applied to recognize exam patterns, thus improving the accuracy of image retrieval and allowing a better interpretation of exam results. Although successfully applied in breast lesion images, the morphometric approach is still poorly explored in thyroid lesions due to the high subjectivity thyroid examinations. This paper presents a theoretical-practical study, considering Computer Aided Diagnosis (CAD) and Morphometry, to reduce the semantic discontinuity between medical image features and human interpretation of image content. The proposed method aggregates the content of microscopic images characterized by morphometric information and other image attributes extracted by traditional object extraction algorithms. This method carries out segmentation, feature extraction, image labeling and classification. Morphometric analysis was included as an object extraction method in order to verify the improvement of its accuracy for automatic classification of microscopic images. To validate this proposal and verify the utility of morphometric information to characterize thyroid images, a CAD system was created to classify real thyroid image-exams into Papillary Cancer, Goiter and Non-Cancer. Results showed that morphometric information can improve the accuracy and precision of image retrieval and the interpretation of results in computer-aided diagnosis. For example, in the scenario where all the extractors are combined with the morphometric information, the CAD system had its best performance (70% of precision in Papillary cases). Results signalized a positive use of morphometric information from images to reduce semantic discontinuity between human interpretation and image characterization. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  14. Knowledge representation and management: benefits and challenges of the semantic web for the fields of KRM and NLP.

    PubMed

    Rassinoux, A-M

    2011-01-01

    To summarize excellent current research in the field of knowledge representation and management (KRM). A synopsis of the articles selected for the IMIA Yearbook 2011 is provided and an attempt to highlight the current trends in the field is sketched. This last decade, with the extension of the text-based web towards a semantic-structured web, NLP techniques have experienced a renewed interest in knowledge extraction. This trend is corroborated through the five papers selected for the KRM section of the Yearbook 2011. They all depict outstanding studies that exploit NLP technologies whenever possible in order to accurately extract meaningful information from various biomedical textual sources. Bringing semantic structure to the meaningful content of textual web pages affords the user with cooperative sharing and intelligent finding of electronic data. As exemplified by the best paper selection, more and more advanced biomedical applications aim at exploiting the meaningful richness of free-text documents in order to generate semantic metadata and recently to learn and populate domain ontologies. These later are becoming a key piece as they allow portraying the semantics of the Semantic Web content. Maintaining their consistency with documents and semantic annotations that refer to them is a crucial challenge of the Semantic Web for the coming years.

  15. Using latent semantic analysis and the predication algorithm to improve extraction of meanings from a diagnostic corpus.

    PubMed

    Jorge-Botana, Guillermo; Olmos, Ricardo; León, José Antonio

    2009-11-01

    There is currently a widespread interest in indexing and extracting taxonomic information from large text collections. An example is the automatic categorization of informally written medical or psychological diagnoses, followed by the extraction of epidemiological information or even terms and structures needed to formulate guiding questions as an heuristic tool for helping doctors. Vector space models have been successfully used to this end (Lee, Cimino, Zhu, Sable, Shanker, Ely & Yu, 2006; Pakhomov, Buntrock & Chute, 2006). In this study we use a computational model known as Latent Semantic Analysis (LSA) on a diagnostic corpus with the aim of retrieving definitions (in the form of lists of semantic neighbors) of common structures it contains (e.g. "storm phobia", "dog phobia") or less common structures that might be formed by logical combinations of categories and diagnostic symptoms (e.g. "gun personality" or "germ personality"). In the quest to bring definitions into line with the meaning of structures and make them in some way representative, various problems commonly arise while recovering content using vector space models. We propose some approaches which bypass these problems, such as Kintsch's (2001) predication algorithm and some corrections to the way lists of neighbors are obtained, which have already been tested on semantic spaces in a non-specific domain (Jorge-Botana, León, Olmos & Hassan-Montero, under review). The results support the idea that the predication algorithm may also be useful for extracting more precise meanings of certain structures from scientific corpora, and that the introduction of some corrections based on vector length may increases its efficiency on non-representative terms.

  16. Extracting and Comparing Places Using Geo-Social Media

    NASA Astrophysics Data System (ADS)

    Ostermann, F. O.; Huang, H.; Andrienko, G.; Andrienko, N.; Capineri, C.; Farkas, K.; Purves, R. S.

    2015-08-01

    Increasing availability of Geo-Social Media (e.g. Facebook, Foursquare and Flickr) has led to the accumulation of large volumes of social media data. These data, especially geotagged ones, contain information about perception of and experiences in various environments. Harnessing these data can be used to provide a better understanding of the semantics of places. We are interested in the similarities or differences between different Geo-Social Media in the description of places. This extended abstract presents the results of a first step towards a more in-depth study of semantic similarity of places. Particularly, we took places extracted through spatio-temporal clustering from one data source (Twitter) and examined whether their structure is reflected semantically in another data set (Flickr). Based on that, we analyse how the semantic similarity between places varies over space and scale, and how Tobler's first law of geography holds with regards to scale and places.

  17. The roles of scene gist and spatial dependency among objects in the semantic guidance of attention in real-world scenes.

    PubMed

    Wu, Chia-Chien; Wang, Hsueh-Cheng; Pomplun, Marc

    2014-12-01

    A previous study (Vision Research 51 (2011) 1192-1205) found evidence for semantic guidance of visual attention during the inspection of real-world scenes, i.e., an influence of semantic relationships among scene objects on overt shifts of attention. In particular, the results revealed an observer bias toward gaze transitions between semantically similar objects. However, this effect is not necessarily indicative of semantic processing of individual objects but may be mediated by knowledge of the scene gist, which does not require object recognition, or by known spatial dependency among objects. To examine the mechanisms underlying semantic guidance, in the present study, participants were asked to view a series of displays with the scene gist excluded and spatial dependency varied. Our results show that spatial dependency among objects seems to be sufficient to induce semantic guidance. Scene gist, on the other hand, does not seem to affect how observers use semantic information to guide attention while viewing natural scenes. Extracting semantic information mainly based on spatial dependency may be an efficient strategy of the visual system that only adds little cognitive load to the viewing task. Copyright © 2014 Elsevier Ltd. All rights reserved.

  18. Semantic Analysis of Email Using Domain Ontologies and WordNet

    NASA Technical Reports Server (NTRS)

    Berrios, Daniel C.; Keller, Richard M.

    2005-01-01

    The problem of capturing and accessing knowledge in paper form has been supplanted by a problem of providing structure to vast amounts of electronic information. Systems that can construct semantic links for natural language documents like email messages automatically will be a crucial element of semantic email tools. We have designed an information extraction process that can leverage the knowledge already contained in an existing semantic web, recognizing references in email to existing nodes in a network of ontology instances by using linguistic knowledge and knowledge of the structure of the semantic web. We developed a heuristic score that uses several forms of evidence to detect references in email to existing nodes in the Semanticorganizer repository's network. While these scores cannot directly support automated probabilistic inference, they can be used to rank nodes by relevance and link those deemed most relevant to email messages.

  19. Leveraging Semantic Knowledge in IRB Databases to Improve Translation Science

    PubMed Central

    Hurdle, John F.; Botkin, Jeffery; Rindflesch, Thomas C.

    2007-01-01

    We introduce the notion that research administrative databases (RADs), such as those increasingly used to manage information flow in the Institutional Review Board (IRB), offer a novel, useful, and mine-able data source overlooked by informaticists. As a proof of concept, using an IRB database we extracted all titles and abstracts from system startup through January 2007 (n=1,876); formatted these in a pseudo-MEDLINE format; and processed them through the SemRep semantic knowledge extraction system. Even though SemRep is tuned to find semantic relations in MEDLINE citations, we found that it performed comparably well on the IRB texts. When adjusted to eliminate non-healthcare IRB submissions (e.g., economic and education studies), SemRep extracted an average of 7.3 semantic relations per IRB abstract (compared to an average of 11.1 for MEDLINE citations) with a precision of 70% (compared to 78% for MEDLINE). We conclude that RADs, as represented by IRB data, are mine-able with existing tools, but that performance will improve as these tools are tuned for RAD structures. PMID:18693856

  20. Semantator: semantic annotator for converting biomedical text to linked data.

    PubMed

    Tao, Cui; Song, Dezhao; Sharma, Deepak; Chute, Christopher G

    2013-10-01

    More than 80% of biomedical data is embedded in plain text. The unstructured nature of these text-based documents makes it challenging to easily browse and query the data of interest in them. One approach to facilitate browsing and querying biomedical text is to convert the plain text to a linked web of data, i.e., converting data originally in free text to structured formats with defined meta-level semantics. In this paper, we introduce Semantator (Semantic Annotator), a semantic-web-based environment for annotating data of interest in biomedical documents, browsing and querying the annotated data, and interactively refining annotation results if needed. Through Semantator, information of interest can be either annotated manually or semi-automatically using plug-in information extraction tools. The annotated results will be stored in RDF and can be queried using the SPARQL query language. In addition, semantic reasoners can be directly applied to the annotated data for consistency checking and knowledge inference. Semantator has been released online and was used by the biomedical ontology community who provided positive feedbacks. Our evaluation results indicated that (1) Semantator can perform the annotation functionalities as designed; (2) Semantator can be adopted in real applications in clinical and transactional research; and (3) the annotated results using Semantator can be easily used in Semantic-web-based reasoning tools for further inference. Copyright © 2013 Elsevier Inc. All rights reserved.

  1. Semantic web data warehousing for caGrid.

    PubMed

    McCusker, James P; Phillips, Joshua A; González Beltrán, Alejandra; Finkelstein, Anthony; Krauthammer, Michael

    2009-10-01

    The National Cancer Institute (NCI) is developing caGrid as a means for sharing cancer-related data and services. As more data sets become available on caGrid, we need effective ways of accessing and integrating this information. Although the data models exposed on caGrid are semantically well annotated, it is currently up to the caGrid client to infer relationships between the different models and their classes. In this paper, we present a Semantic Web-based data warehouse (Corvus) for creating relationships among caGrid models. This is accomplished through the transformation of semantically-annotated caBIG Unified Modeling Language (UML) information models into Web Ontology Language (OWL) ontologies that preserve those semantics. We demonstrate the validity of the approach by Semantic Extraction, Transformation and Loading (SETL) of data from two caGrid data sources, caTissue and caArray, as well as alignment and query of those sources in Corvus. We argue that semantic integration is necessary for integration of data from distributed web services and that Corvus is a useful way of accomplishing this. Our approach is generalizable and of broad utility to researchers facing similar integration challenges.

  2. A Semantic Approach for Geospatial Information Extraction from Unstructured Documents

    NASA Astrophysics Data System (ADS)

    Sallaberry, Christian; Gaio, Mauro; Lesbegueries, Julien; Loustau, Pierre

    Local cultural heritage document collections are characterized by their content, which is strongly attached to a territory and its land history (i.e., geographical references). Our contribution aims at making the content retrieval process more efficient whenever a query includes geographic criteria. We propose a core model for a formal representation of geographic information. It takes into account characteristics of different modes of expression, such as written language, captures of drawings, maps, photographs, etc. We have developed a prototype that fully implements geographic information extraction (IE) and geographic information retrieval (IR) processes. All PIV prototype processing resources are designed as Web Services. We propose a geographic IE process based on semantic treatment as a supplement to classical IE approaches. We implement geographic IR by using intersection computing algorithms that seek out any intersection between formal geocoded representations of geographic information in a user query and similar representations in document collection indexes.

  3. Synonym extraction and abbreviation expansion with ensembles of semantic spaces.

    PubMed

    Henriksson, Aron; Moen, Hans; Skeppstedt, Maria; Daudaravičius, Vidas; Duneld, Martin

    2014-02-05

    Terminologies that account for variation in language use by linking synonyms and abbreviations to their corresponding concept are important enablers of high-quality information extraction from medical texts. Due to the use of specialized sub-languages in the medical domain, manual construction of semantic resources that accurately reflect language use is both costly and challenging, often resulting in low coverage. Although models of distributional semantics applied to large corpora provide a potential means of supporting development of such resources, their ability to isolate synonymy from other semantic relations is limited. Their application in the clinical domain has also only recently begun to be explored. Combining distributional models and applying them to different types of corpora may lead to enhanced performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. A combination of two distributional models - Random Indexing and Random Permutation - employed in conjunction with a single corpus outperforms using either of the models in isolation. Furthermore, combining semantic spaces induced from different types of corpora - a corpus of clinical text and a corpus of medical journal articles - further improves results, outperforming a combination of semantic spaces induced from a single source, as well as a single semantic space induced from the conjoint corpus. A combination strategy that simply sums the cosine similarity scores of candidate terms is generally the most profitable out of the ones explored. Finally, applying simple post-processing filtering rules yields substantial performance gains on the tasks of extracting abbreviation-expansion pairs, but not synonyms. The best results, measured as recall in a list of ten candidate terms, for the three tasks are: 0.39 for abbreviations to long forms, 0.33 for long forms to abbreviations, and 0.47 for synonyms. This study demonstrates that ensembles of semantic spaces can yield improved performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. This notion, which merits further exploration, allows different distributional models - with different model parameters - and different types of corpora to be combined, potentially allowing enhanced performance to be obtained on a wide range of natural language processing tasks.

  4. Synonym extraction and abbreviation expansion with ensembles of semantic spaces

    PubMed Central

    2014-01-01

    Background Terminologies that account for variation in language use by linking synonyms and abbreviations to their corresponding concept are important enablers of high-quality information extraction from medical texts. Due to the use of specialized sub-languages in the medical domain, manual construction of semantic resources that accurately reflect language use is both costly and challenging, often resulting in low coverage. Although models of distributional semantics applied to large corpora provide a potential means of supporting development of such resources, their ability to isolate synonymy from other semantic relations is limited. Their application in the clinical domain has also only recently begun to be explored. Combining distributional models and applying them to different types of corpora may lead to enhanced performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. Results A combination of two distributional models – Random Indexing and Random Permutation – employed in conjunction with a single corpus outperforms using either of the models in isolation. Furthermore, combining semantic spaces induced from different types of corpora – a corpus of clinical text and a corpus of medical journal articles – further improves results, outperforming a combination of semantic spaces induced from a single source, as well as a single semantic space induced from the conjoint corpus. A combination strategy that simply sums the cosine similarity scores of candidate terms is generally the most profitable out of the ones explored. Finally, applying simple post-processing filtering rules yields substantial performance gains on the tasks of extracting abbreviation-expansion pairs, but not synonyms. The best results, measured as recall in a list of ten candidate terms, for the three tasks are: 0.39 for abbreviations to long forms, 0.33 for long forms to abbreviations, and 0.47 for synonyms. Conclusions This study demonstrates that ensembles of semantic spaces can yield improved performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. This notion, which merits further exploration, allows different distributional models – with different model parameters – and different types of corpora to be combined, potentially allowing enhanced performance to be obtained on a wide range of natural language processing tasks. PMID:24499679

  5. Classification of clinically useful sentences in clinical evidence resources.

    PubMed

    Morid, Mohammad Amin; Fiszman, Marcelo; Raja, Kalpana; Jonnalagadda, Siddhartha R; Del Fiol, Guilherme

    2016-04-01

    Most patient care questions raised by clinicians can be answered by online clinical knowledge resources. However, important barriers still challenge the use of these resources at the point of care. To design and assess a method for extracting clinically useful sentences from synthesized online clinical resources that represent the most clinically useful information for directly answering clinicians' information needs. We developed a Kernel-based Bayesian Network classification model based on different domain-specific feature types extracted from sentences in a gold standard composed of 18 UpToDate documents. These features included UMLS concepts and their semantic groups, semantic predications extracted by SemRep, patient population identified by a pattern-based natural language processing (NLP) algorithm, and cue words extracted by a feature selection technique. Algorithm performance was measured in terms of precision, recall, and F-measure. The feature-rich approach yielded an F-measure of 74% versus 37% for a feature co-occurrence method (p<0.001). Excluding predication, population, semantic concept or text-based features reduced the F-measure to 62%, 66%, 58% and 69% respectively (p<0.01). The classifier applied to Medline sentences reached an F-measure of 73%, which is equivalent to the performance of the classifier on UpToDate sentences (p=0.62). The feature-rich approach significantly outperformed general baseline methods. This approach significantly outperformed classifiers based on a single type of feature. Different types of semantic features provided a unique contribution to overall classification performance. The classifier's model and features used for UpToDate generalized well to Medline abstracts. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. Semantic Segmentation of Indoor Point Clouds Using Convolutional Neural Network

    NASA Astrophysics Data System (ADS)

    Babacan, K.; Chen, L.; Sohn, G.

    2017-11-01

    As Building Information Modelling (BIM) thrives, geometry becomes no longer sufficient; an ever increasing variety of semantic information is needed to express an indoor model adequately. On the other hand, for the existing buildings, automatically generating semantically enriched BIM from point cloud data is in its infancy. The previous research to enhance the semantic content rely on frameworks in which some specific rules and/or features that are hand coded by specialists. These methods immanently lack generalization and easily break in different circumstances. On this account, a generalized framework is urgently needed to automatically and accurately generate semantic information. Therefore we propose to employ deep learning techniques for the semantic segmentation of point clouds into meaningful parts. More specifically, we build a volumetric data representation in order to efficiently generate the high number of training samples needed to initiate a convolutional neural network architecture. The feedforward propagation is used in such a way to perform the classification in voxel level for achieving semantic segmentation. The method is tested both for a mobile laser scanner point cloud, and a larger scale synthetically generated data. We also demonstrate a case study, in which our method can be effectively used to leverage the extraction of planar surfaces in challenging cluttered indoor environments.

  7. An object-oriented design for automated navigation of semantic networks inside a medical data dictionary.

    PubMed

    Ruan, W; Bürkle, T; Dudeck, J

    2000-01-01

    In this paper we present a data dictionary server for the automated navigation of information sources. The underlying knowledge is represented within a medical data dictionary. The mapping between medical terms and information sources is based on a semantic network. The key aspect of implementing the dictionary server is how to represent the semantic network in a way that is easier to navigate and to operate, i.e. how to abstract the semantic network and to represent it in memory for various operations. This paper describes an object-oriented design based on Java that represents the semantic network in terms of a group of objects. A node and its relationships to its neighbors are encapsulated in one object. Based on such a representation model, several operations have been implemented. They comprise the extraction of parts of the semantic network which can be reached from a given node as well as finding all paths between a start node and a predefined destination node. This solution is independent of any given layout of the semantic structure. Therefore the module, called Giessen Data Dictionary Server can act independent of a specific clinical information system. The dictionary server will be used to present clinical information, e.g. treatment guidelines or drug information sources to the clinician in an appropriate working context. The server is invoked from clinical documentation applications which contain an infobutton. Automated navigation will guide the user to all the information relevant to her/his topic, which is currently available inside our closed clinical network.

  8. Supervised guiding long-short term memory for image caption generation based on object classes

    NASA Astrophysics Data System (ADS)

    Wang, Jian; Cao, Zhiguo; Xiao, Yang; Qi, Xinyuan

    2018-03-01

    The present models of image caption generation have the problems of image visual semantic information attenuation and errors in guidance information. In order to solve these problems, we propose a supervised guiding Long Short Term Memory model based on object classes, named S-gLSTM for short. It uses the object detection results from R-FCN as supervisory information with high confidence, and updates the guidance word set by judging whether the last output matches the supervisory information. S-gLSTM learns how to extract the current interested information from the image visual se-mantic information based on guidance word set. The interested information is fed into the S-gLSTM at each iteration as guidance information, to guide the caption generation. To acquire the text-related visual semantic information, the S-gLSTM fine-tunes the weights of the network through the back-propagation of the guiding loss. Complementing guidance information at each iteration solves the problem of visual semantic information attenuation in the traditional LSTM model. Besides, the supervised guidance information in our model can reduce the impact of the mismatched words on the caption generation. We test our model on MSCOCO2014 dataset, and obtain better performance than the state-of-the- art models.

  9. Simulating Expert Clinical Comprehension: Adapting Latent Semantic Analysis to Accurately Extract Clinical Concepts from Psychiatric Narrative

    PubMed Central

    Cohen, Trevor; Blatter, Brett; Patel, Vimla

    2008-01-01

    Cognitive studies reveal that less-than-expert clinicians are less able to recognize meaningful patterns of data in clinical narratives. Accordingly, psychiatric residents early in training fail to attend to information that is relevant to diagnosis and the assessment of dangerousness. This manuscript presents cognitively motivated methodology for the simulation of expert ability to organize relevant findings supporting intermediate diagnostic hypotheses. Latent Semantic Analysis is used to generate a semantic space from which meaningful associations between psychiatric terms are derived. Diagnostically meaningful clusters are modeled as geometric structures within this space and compared to elements of psychiatric narrative text using semantic distance measures. A learning algorithm is defined that alters components of these geometric structures in response to labeled training data. Extraction and classification of relevant text segments is evaluated against expert annotation, with system-rater agreement approximating rater-rater agreement. A range of biomedical informatics applications for these methods are suggested. PMID:18455483

  10. EliXR-TIME: A Temporal Knowledge Representation for Clinical Research Eligibility Criteria.

    PubMed

    Boland, Mary Regina; Tu, Samson W; Carini, Simona; Sim, Ida; Weng, Chunhua

    2012-01-01

    Effective clinical text processing requires accurate extraction and representation of temporal expressions. Multiple temporal information extraction models were developed but a similar need for extracting temporal expressions in eligibility criteria (e.g., for eligibility determination) remains. We identified the temporal knowledge representation requirements of eligibility criteria by reviewing 100 temporal criteria. We developed EliXR-TIME, a frame-based representation designed to support semantic annotation for temporal expressions in eligibility criteria by reusing applicable classes from well-known clinical temporal knowledge representations. We used EliXR-TIME to analyze a training set of 50 new temporal eligibility criteria. We evaluated EliXR-TIME using an additional random sample of 20 eligibility criteria with temporal expressions that have no overlap with the training data, yielding 92.7% (76 / 82) inter-coder agreement on sentence chunking and 72% (72 / 100) agreement on semantic annotation. We conclude that this knowledge representation can facilitate semantic annotation of the temporal expressions in eligibility criteria.

  11. Semantic web data warehousing for caGrid

    PubMed Central

    McCusker, James P; Phillips, Joshua A; Beltrán, Alejandra González; Finkelstein, Anthony; Krauthammer, Michael

    2009-01-01

    The National Cancer Institute (NCI) is developing caGrid as a means for sharing cancer-related data and services. As more data sets become available on caGrid, we need effective ways of accessing and integrating this information. Although the data models exposed on caGrid are semantically well annotated, it is currently up to the caGrid client to infer relationships between the different models and their classes. In this paper, we present a Semantic Web-based data warehouse (Corvus) for creating relationships among caGrid models. This is accomplished through the transformation of semantically-annotated caBIG® Unified Modeling Language (UML) information models into Web Ontology Language (OWL) ontologies that preserve those semantics. We demonstrate the validity of the approach by Semantic Extraction, Transformation and Loading (SETL) of data from two caGrid data sources, caTissue and caArray, as well as alignment and query of those sources in Corvus. We argue that semantic integration is necessary for integration of data from distributed web services and that Corvus is a useful way of accomplishing this. Our approach is generalizable and of broad utility to researchers facing similar integration challenges. PMID:19796399

  12. Dynamic information processing states revealed through neurocognitive models of object semantics

    PubMed Central

    Clarke, Alex

    2015-01-01

    Recognising objects relies on highly dynamic, interactive brain networks to process multiple aspects of object information. To fully understand how different forms of information about objects are represented and processed in the brain requires a neurocognitive account of visual object recognition that combines a detailed cognitive model of semantic knowledge with a neurobiological model of visual object processing. Here we ask how specific cognitive factors are instantiated in our mental processes and how they dynamically evolve over time. We suggest that coarse semantic information, based on generic shared semantic knowledge, is rapidly extracted from visual inputs and is sufficient to drive rapid category decisions. Subsequent recurrent neural activity between the anterior temporal lobe and posterior fusiform supports the formation of object-specific semantic representations – a conjunctive process primarily driven by the perirhinal cortex. These object-specific representations require the integration of shared and distinguishing object properties and support the unique recognition of objects. We conclude that a valuable way of understanding the cognitive activity of the brain is though testing the relationship between specific cognitive measures and dynamic neural activity. This kind of approach allows us to move towards uncovering the information processing states of the brain and how they evolve over time. PMID:25745632

  13. Multiple Semantic Matching on Augmented N-partite Graph for Object Co-segmentation.

    PubMed

    Wang, Chuan; Zhang, Hua; Yang, Liang; Cao, Xiaochun; Xiong, Hongkai

    2017-09-08

    Recent methods for object co-segmentation focus on discovering single co-occurring relation of candidate regions representing the foreground of multiple images. However, region extraction based only on low and middle level information often occupies a large area of background without the help of semantic context. In addition, seeking single matching solution very likely leads to discover local parts of common objects. To cope with these deficiencies, we present a new object cosegmentation framework, which takes advantages of semantic information and globally explores multiple co-occurring matching cliques based on an N-partite graph structure. To this end, we first propose to incorporate candidate generation with semantic context. Based on the regions extracted from semantic segmentation of each image, we design a merging mechanism to hierarchically generate candidates with high semantic responses. Secondly, all candidates are taken into consideration to globally formulate multiple maximum weighted matching cliques, which complements the discovery of part of the common objects induced by a single clique. To facilitate the discovery of multiple matching cliques, an N-partite graph, which inherently excludes intralinks between candidates from the same image, is constructed to separate multiple cliques without additional constraints. Further, we augment the graph with an additional virtual node in each part to handle irrelevant matches when the similarity between two candidates is too small. Finally, with the explored multiple cliques, we statistically compute pixel-wise co-occurrence map for each image. Experimental results on two benchmark datasets, i.e., iCoseg and MSRC datasets, achieve desirable performance and demonstrate the effectiveness of our proposed framework.

  14. A Semantic Web-based System for Mining Genetic Mutations in Cancer Clinical Trials.

    PubMed

    Priya, Sambhawa; Jiang, Guoqian; Dasari, Surendra; Zimmermann, Michael T; Wang, Chen; Heflin, Jeff; Chute, Christopher G

    2015-01-01

    Textual eligibility criteria in clinical trial protocols contain important information about potential clinically relevant pharmacogenomic events. Manual curation for harvesting this evidence is intractable as it is error prone and time consuming. In this paper, we develop and evaluate a Semantic Web-based system that captures and manages mutation evidences and related contextual information from cancer clinical trials. The system has 2 main components: an NLP-based annotator and a Semantic Web ontology-based annotation manager. We evaluated the performance of the annotator in terms of precision and recall. We demonstrated the usefulness of the system by conducting case studies in retrieving relevant clinical trials using a collection of mutations identified from TCGA Leukemia patients and Atlas of Genetics and Cytogenetics in Oncology and Haematology. In conclusion, our system using Semantic Web technologies provides an effective framework for extraction, annotation, standardization and management of genetic mutations in cancer clinical trials.

  15. Model for Semantically Rich Point Cloud Data

    NASA Astrophysics Data System (ADS)

    Poux, F.; Neuville, R.; Hallot, P.; Billen, R.

    2017-10-01

    This paper proposes an interoperable model for managing high dimensional point clouds while integrating semantics. Point clouds from sensors are a direct source of information physically describing a 3D state of the recorded environment. As such, they are an exhaustive representation of the real world at every scale: 3D reality-based spatial data. Their generation is increasingly fast but processing routines and data models lack of knowledge to reason from information extraction rather than interpretation. The enhanced smart point cloud developed model allows to bring intelligence to point clouds via 3 connected meta-models while linking available knowledge and classification procedures that permits semantic injection. Interoperability drives the model adaptation to potentially many applications through specialized domain ontologies. A first prototype is implemented in Python and PostgreSQL database and allows to combine semantic and spatial concepts for basic hybrid queries on different point clouds.

  16. Multi-talker background and semantic priming effect

    PubMed Central

    Dekerle, Marie; Boulenger, Véronique; Hoen, Michel; Meunier, Fanny

    2014-01-01

    The reported studies have aimed to investigate whether informational masking in a multi-talker background relies on semantic interference between the background and target using an adapted semantic priming paradigm. In 3 experiments, participants were required to perform a lexical decision task on a target item embedded in backgrounds composed of 1–4 voices. These voices were Semantically Consistent (SC) voices (i.e., pronouncing words sharing semantic features with the target) or Semantically Inconsistent (SI) voices (i.e., pronouncing words semantically unrelated to each other and to the target). In the first experiment, backgrounds consisted of 1 or 2 SC voices. One and 2 SI voices were added in Experiments 2 and 3, respectively. The results showed a semantic priming effect only in the conditions where the number of SC voices was greater than the number of SI voices, suggesting that semantic priming depended on prime intelligibility and strategic processes. However, even if backgrounds were composed of 3 or 4 voices, reducing intelligibility, participants were able to recognize words from these backgrounds, although no semantic priming effect on the targets was observed. Overall this finding suggests that informational masking can occur at a semantic level if intelligibility is sufficient. Based on the Effortfulness Hypothesis, we also suggest that when there is an increased difficulty in extracting target signals (caused by a relatively high number of voices in the background), more cognitive resources were allocated to formal processes (i.e., acoustic and phonological), leading to a decrease in available resources for deeper semantic processing of background words, therefore preventing semantic priming from occurring. PMID:25400572

  17. Semantic Technologies for Re-Use of Clinical Routine Data.

    PubMed

    Kreuzthaler, Markus; Martínez-Costa, Catalina; Kaiser, Peter; Schulz, Stefan

    2017-01-01

    Routine patient data in electronic patient records are only partly structured, and an even smaller segment is coded, mainly for administrative purposes. Large parts are only available as free text. Transforming this content into a structured and semantically explicit form is a prerequisite for querying and information extraction. The core of the system architecture presented in this paper is based on SAP HANA in-memory database technology using the SAP Connected Health platform for data integration as well as for clinical data warehousing. A natural language processing pipeline analyses unstructured content and maps it to a standardized vocabulary within a well-defined information model. The resulting semantically standardized patient profiles are used for a broad range of clinical and research application scenarios.

  18. Hybrid Semantic Analysis for Mapping Adverse Drug Reaction Mentions in Tweets to Medical Terminology.

    PubMed

    Emadzadeh, Ehsan; Sarker, Abeed; Nikfarjam, Azadeh; Gonzalez, Graciela

    2017-01-01

    Social networks, such as Twitter, have become important sources for active monitoring of user-reported adverse drug reactions (ADRs). Automatic extraction of ADR information can be crucial for healthcare providers, drug manufacturers, and consumers. However, because of the non-standard nature of social media language, automatically extracted ADR mentions need to be mapped to standard forms before they can be used by operational pharmacovigilance systems. We propose a modular natural language processing pipeline for mapping (normalizing) colloquial mentions of ADRs to their corresponding standardized identifiers. We seek to accomplish this task and enable customization of the pipeline so that distinct unlabeled free text resources can be incorporated to use the system for other normalization tasks. Our approach, which we call Hybrid Semantic Analysis (HSA), sequentially employs rule-based and semantic matching algorithms for mapping user-generated mentions to concept IDs in the Unified Medical Language System vocabulary. The semantic matching component of HSA is adaptive in nature and uses a regression model to combine various measures of semantic relatedness and resources to optimize normalization performance on the selected data source. On a publicly available corpus, our normalization method achieves 0.502 recall and 0.823 precision (F-measure: 0.624). Our proposed method outperforms a baseline based on latent semantic analysis and another that uses MetaMap.

  19. Using architectures for semantic interoperability to create journal clubs for emergency response

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Powell, James E; Collins, Linn M; Martinez, Mark L B

    2009-01-01

    In certain types of 'slow burn' emergencies, careful accumulation and evaluation of information can offer a crucial advantage. The SARS outbreak in the first decade of the 21st century was such an event, and ad hoc journal clubs played a critical role in assisting scientific and technical responders in identifying and developing various strategies for halting what could have become a dangerous pandemic. This research-in-progress paper describes a process for leveraging emerging semantic web and digital library architectures and standards to (1) create a focused collection of bibliographic metadata, (2) extract semantic information, (3) convert it to the Resource Descriptionmore » Framework /Extensible Markup Language (RDF/XML), and (4) integrate it so that scientific and technical responders can share and explore critical information in the collections.« less

  20. Semantic characteristics of NLP-extracted concepts in clinical notes vs. biomedical literature.

    PubMed

    Wu, Stephen; Liu, Hongfang

    2011-01-01

    Natural language processing (NLP) has become crucial in unlocking information stored in free text, from both clinical notes and biomedical literature. Clinical notes convey clinical information related to individual patient health care, while biomedical literature communicates scientific findings. This work focuses on semantic characterization of texts at an enterprise scale, comparing and contrasting the two domains and their NLP approaches. We analyzed the empirical distributional characteristics of NLP-discovered named entities in Mayo Clinic clinical notes from 2001-2010, and in the 2011 MetaMapped Medline Baseline. We give qualitative and quantitative measures of domain similarity and point to the feasibility of transferring resources and techniques. An important by-product for this study is the development of a weighted ontology for each domain, which gives distributional semantic information that may be used to improve NLP applications.

  1. Semantic Similarity between Web Documents Using Ontology

    NASA Astrophysics Data System (ADS)

    Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh

    2018-06-01

    The World Wide Web is the source of information available in the structure of interlinked web pages. However, the procedure of extracting significant information with the assistance of search engine is incredibly critical. This is for the reason that web information is written mainly by using natural language, and further available to individual human. Several efforts have been made in semantic similarity computation between documents using words, concepts and concepts relationship but still the outcome available are not as per the user requirements. This paper proposes a novel technique for computation of semantic similarity between documents that not only takes concepts available in documents but also relationships that are available between the concepts. In our approach documents are being processed by making ontology of the documents using base ontology and a dictionary containing concepts records. Each such record is made up of the probable words which represents a given concept. Finally, document ontology's are compared to find their semantic similarity by taking the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results as compared to the existing techniques.

  2. Semantic Similarity between Web Documents Using Ontology

    NASA Astrophysics Data System (ADS)

    Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh

    2018-03-01

    The World Wide Web is the source of information available in the structure of interlinked web pages. However, the procedure of extracting significant information with the assistance of search engine is incredibly critical. This is for the reason that web information is written mainly by using natural language, and further available to individual human. Several efforts have been made in semantic similarity computation between documents using words, concepts and concepts relationship but still the outcome available are not as per the user requirements. This paper proposes a novel technique for computation of semantic similarity between documents that not only takes concepts available in documents but also relationships that are available between the concepts. In our approach documents are being processed by making ontology of the documents using base ontology and a dictionary containing concepts records. Each such record is made up of the probable words which represents a given concept. Finally, document ontology's are compared to find their semantic similarity by taking the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results as compared to the existing techniques.

  3. Wide coverage biomedical event extraction using multiple partially overlapping corpora

    PubMed Central

    2013-01-01

    Background Biomedical events are key to understanding physiological processes and disease, and wide coverage extraction is required for comprehensive automatic analysis of statements describing biomedical systems in the literature. In turn, the training and evaluation of extraction methods requires manually annotated corpora. However, as manual annotation is time-consuming and expensive, any single event-annotated corpus can only cover a limited number of semantic types. Although combined use of several such corpora could potentially allow an extraction system to achieve broad semantic coverage, there has been little research into learning from multiple corpora with partially overlapping semantic annotation scopes. Results We propose a method for learning from multiple corpora with partial semantic annotation overlap, and implement this method to improve our existing event extraction system, EventMine. An evaluation using seven event annotated corpora, including 65 event types in total, shows that learning from overlapping corpora can produce a single, corpus-independent, wide coverage extraction system that outperforms systems trained on single corpora and exceeds previously reported results on two established event extraction tasks from the BioNLP Shared Task 2011. Conclusions The proposed method allows the training of a wide-coverage, state-of-the-art event extraction system from multiple corpora with partial semantic annotation overlap. The resulting single model makes broad-coverage extraction straightforward in practice by removing the need to either select a subset of compatible corpora or semantic types, or to merge results from several models trained on different individual corpora. Multi-corpus learning also allows annotation efforts to focus on covering additional semantic types, rather than aiming for exhaustive coverage in any single annotation effort, or extending the coverage of semantic types annotated in existing corpora. PMID:23731785

  4. Semantic World Modelling and Data Management in a 4d Forest Simulation and Information System

    NASA Astrophysics Data System (ADS)

    Roßmann, J.; Hoppen, M.; Bücken, A.

    2013-08-01

    Various types of 3D simulation applications benefit from realistic forest models. They range from flight simulators for entertainment to harvester simulators for training and tree growth simulations for research and planning. Our 4D forest simulation and information system integrates the necessary methods for data extraction, modelling and management. Using modern methods of semantic world modelling, tree data can efficiently be extracted from remote sensing data. The derived forest models contain position, height, crown volume, type and diameter of each tree. This data is modelled using GML-based data models to assure compatibility and exchangeability. A flexible approach for database synchronization is used to manage the data and provide caching, persistence, a central communication hub for change distribution, and a versioning mechanism. Combining various simulation techniques and data versioning, the 4D forest simulation and information system can provide applications with "both directions" of the fourth dimension. Our paper outlines the current state, new developments, and integration of tree extraction, data modelling, and data management. It also shows several applications realized with the system.

  5. Categorizing words through semantic memory navigation

    NASA Astrophysics Data System (ADS)

    Borge-Holthoefer, J.; Arenas, A.

    2010-03-01

    Semantic memory is the cognitive system devoted to storage and retrieval of conceptual knowledge. Empirical data indicate that semantic memory is organized in a network structure. Everyday experience shows that word search and retrieval processes provide fluent and coherent speech, i.e. are efficient. This implies either that semantic memory encodes, besides thousands of words, different kind of links for different relationships (introducing greater complexity and storage costs), or that the structure evolves facilitating the differentiation between long-lasting semantic relations from incidental, phenomenological ones. Assuming the latter possibility, we explore a mechanism to disentangle the underlying semantic backbone which comprises conceptual structure (extraction of categorical relations between pairs of words), from the rest of information present in the structure. To this end, we first present and characterize an empirical data set modeled as a network, then we simulate a stochastic cognitive navigation on this topology. We schematize this latter process as uncorrelated random walks from node to node, which converge to a feature vectors network. By doing so we both introduce a novel mechanism for information retrieval, and point at the problem of category formation in close connection to linguistic and non-linguistic experience.

  6. Using a high-dimensional graph of semantic space to model relationships among words

    PubMed Central

    Jackson, Alice F.; Bolger, Donald J.

    2014-01-01

    The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions the GOLD model that differed in terms of the co-occurence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggest that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD). PMID:24860525

  7. Using a high-dimensional graph of semantic space to model relationships among words.

    PubMed

    Jackson, Alice F; Bolger, Donald J

    2014-01-01

    The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions the GOLD model that differed in terms of the co-occurence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggest that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD).

  8. A fusion network for semantic segmentation using RGB-D data

    NASA Astrophysics Data System (ADS)

    Yuan, Jiahui; Zhang, Kun; Xia, Yifan; Qi, Lin; Dong, Junyu

    2018-04-01

    Semantic scene parsing is considerable in many intelligent field, including perceptual robotics. For the past few years, pixel-wise prediction tasks like semantic segmentation with RGB images has been extensively studied and has reached very remarkable parsing levels, thanks to convolutional neural networks (CNNs) and large scene datasets. With the development of stereo cameras and RGBD sensors, it is expected that additional depth information will help improving accuracy. In this paper, we propose a semantic segmentation framework incorporating RGB and complementary depth information. Motivated by the success of fully convolutional networks (FCN) in semantic segmentation field, we design a fully convolutional networks consists of two branches which extract features from both RGB and depth data simultaneously and fuse them as the network goes deeper. Instead of aggregating multiple model, our goal is to utilize RGB data and depth data more effectively in a single model. We evaluate our approach on the NYU-Depth V2 dataset, which consists of 1449 cluttered indoor scenes, and achieve competitive results with the state-of-the-art methods.

  9. A Geospatial Semantic Enrichment and Query Service for Geotagged Photographs

    PubMed Central

    Ennis, Andrew; Nugent, Chris; Morrow, Philip; Chen, Liming; Ioannidis, George; Stan, Alexandru; Rachev, Preslav

    2015-01-01

    With the increasing abundance of technologies and smart devices, equipped with a multitude of sensors for sensing the environment around them, information creation and consumption has now become effortless. This, in particular, is the case for photographs with vast amounts being created and shared every day. For example, at the time of this writing, Instagram users upload 70 million photographs a day. Nevertheless, it still remains a challenge to discover the “right” information for the appropriate purpose. This paper describes an approach to create semantic geospatial metadata for photographs, which can facilitate photograph search and discovery. To achieve this we have developed and implemented a semantic geospatial data model by which a photograph can be enrich with geospatial metadata extracted from several geospatial data sources based on the raw low-level geo-metadata from a smartphone photograph. We present the details of our method and implementation for searching and querying the semantic geospatial metadata repository to enable a user or third party system to find the information they are looking for. PMID:26205265

  10. KneeTex: an ontology-driven system for information extraction from MRI reports.

    PubMed

    Spasić, Irena; Zhao, Bo; Jones, Christopher B; Button, Kate

    2015-01-01

    In the realm of knee pathology, magnetic resonance imaging (MRI) has the advantage of visualising all structures within the knee joint, which makes it a valuable tool for increasing diagnostic accuracy and planning surgical treatments. Therefore, clinical narratives found in MRI reports convey valuable diagnostic information. A range of studies have proven the feasibility of natural language processing for information extraction from clinical narratives. However, no study focused specifically on MRI reports in relation to knee pathology, possibly due to the complexity of knee anatomy and a wide range of conditions that may be associated with different anatomical entities. In this paper we describe KneeTex, an information extraction system that operates in this domain. As an ontology-driven information extraction system, KneeTex makes active use of an ontology to strongly guide and constrain text analysis. We used automatic term recognition to facilitate the development of a domain-specific ontology with sufficient detail and coverage for text mining applications. In combination with the ontology, high regularity of the sublanguage used in knee MRI reports allowed us to model its processing by a set of sophisticated lexico-semantic rules with minimal syntactic analysis. The main processing steps involve named entity recognition combined with coordination, enumeration, ambiguity and co-reference resolution, followed by text segmentation. Ontology-based semantic typing is then used to drive the template filling process. We adopted an existing ontology, TRAK (Taxonomy for RehAbilitation of Knee conditions), for use within KneeTex. The original TRAK ontology expanded from 1,292 concepts, 1,720 synonyms and 518 relationship instances to 1,621 concepts, 2,550 synonyms and 560 relationship instances. This provided KneeTex with a very fine-grained lexico-semantic knowledge base, which is highly attuned to the given sublanguage. Information extraction results were evaluated on a test set of 100 MRI reports. A gold standard consisted of 1,259 filled template records with the following slots: finding, finding qualifier, negation, certainty, anatomy and anatomy qualifier. KneeTex extracted information with precision of 98.00 %, recall of 97.63 % and F-measure of 97.81 %, the values of which are in line with human-like performance. KneeTex is an open-source, stand-alone application for information extraction from narrative reports that describe an MRI scan of the knee. Given an MRI report as input, the system outputs the corresponding clinical findings in the form of JavaScript Object Notation objects. The extracted information is mapped onto TRAK, an ontology that formally models knowledge relevant for the rehabilitation of knee conditions. As a result, formally structured and coded information allows for complex searches to be conducted efficiently over the original MRI reports, thereby effectively supporting epidemiologic studies of knee conditions.

  11. Amatchmethod Based on Latent Semantic Analysis for Earthquakehazard Emergency Plan

    NASA Astrophysics Data System (ADS)

    Sun, D.; Zhao, S.; Zhang, Z.; Shi, X.

    2017-09-01

    The structure of the emergency plan on earthquake is complex, and it's difficult for decision maker to make a decision in a short time. To solve the problem, this paper presents a match method based on Latent Semantic Analysis (LSA). After the word segmentation preprocessing of emergency plan, we carry out keywords extraction according to the part-of-speech and the frequency of words. Then through LSA, we map the documents and query information to the semantic space, and calculate the correlation of documents and queries by the relation between vectors. The experiments results indicate that the LSA can improve the accuracy of emergency plan retrieval efficiently.

  12. A case study of data integration for aquatic resources using semantic web technologies

    USGS Publications Warehouse

    Gordon, Janice M.; Chkhenkeli, Nina; Govoni, David L.; Lightsom, Frances L.; Ostroff, Andrea C.; Schweitzer, Peter N.; Thongsavanh, Phethala; Varanka, Dalia E.; Zednik, Stephan

    2015-01-01

    Use cases, information modeling, and linked data techniques are Semantic Web technologies used to develop a prototype system that integrates scientific observations from four independent USGS and cooperator data systems. The techniques were tested with a use case goal of creating a data set for use in exploring potential relationships among freshwater fish populations and environmental factors. The resulting prototype extracts data from the BioData Retrieval System, the Multistate Aquatic Resource Information System, the National Geochemical Survey, and the National Hydrography Dataset. A prototype user interface allows a scientist to select observations from these data systems and combine them into a single data set in RDF format that includes explicitly defined relationships and data definitions. The project was funded by the USGS Community for Data Integration and undertaken by the Community for Data Integration Semantic Web Working Group in order to demonstrate use of Semantic Web technologies by scientists. This allows scientists to simultaneously explore data that are available in multiple, disparate systems beyond those they traditionally have used.

  13. Centrality-based Selection of Semantic Resources for Geosciences

    NASA Astrophysics Data System (ADS)

    Cerba, Otakar; Jedlicka, Karel

    2017-04-01

    Semantical questions intervene almost in all disciplines dealing with geographic data and information, because relevant semantics is crucial for any way of communication and interaction among humans as well as among machines. But the existence of such a large number of different semantic resources (such as various thesauri, controlled vocabularies, knowledge bases or ontologies) makes the process of semantics implementation much more difficult and complicates the use of the advantages of semantics. This is because in many cases users are not able to find the most suitable resource for their purposes. The research presented in this paper introduces a methodology consisting of an analysis of identical relations in Linked Data space, which covers a majority of semantic resources, to find a suitable resource of semantic information. Identical links interconnect representations of an object or a concept in various semantic resources. Therefore this type of relations is considered to be crucial from the view of Linked Data, because these links provide new additional information, including various views on one concept based on different cultural or regional aspects (so-called social role of Linked Data). For these reasons it is possible to declare that one reasonable criterion for feasible semantic resources for almost all domains, including geosciences, is their position in a network of interconnected semantic resources and level of linking to other knowledge bases and similar products. The presented methodology is based on searching of mutual connections between various instances of one concept using "follow your nose" approach. The extracted data on interconnections between semantic resources are arranged to directed graphs and processed by various metrics patterned on centrality computing (degree, closeness or betweenness centrality). Semantic resources recommended by the research could be used for providing semantically described keywords for metadata records or as names of items in data models. Such an approach enables much more efficient data harmonization, integration, sharing and exploitation. * * * * This publication was supported by the project LO1506 of the Czech Ministry of Education, Youth and Sports. This publication was supported by project Data-Driven Bioeconomy (DataBio) from the ICT-15-2016-2017, Big Data PPP call.

  14. Longitudinal Analysis of New Information Types in Clinical Notes

    PubMed Central

    Zhang, Rui; Pakhomov, Serguei; Melton, Genevieve B.

    2014-01-01

    It is increasingly recognized that redundant information in clinical notes within electronic health record (EHR) systems is ubiquitous, significant, and may negatively impact the secondary use of these notes for research and patient care. We investigated several automated methods to identify redundant versus relevant new information in clinical reports. These methods may provide a valuable approach to extract clinically pertinent information and further improve the accuracy of clinical information extraction systems. In this study, we used UMLS semantic types to extract several types of new information, including problems, medications, and laboratory information. Automatically identified new information highly correlated with manual reference standard annotations. Methods to identify different types of new information can potentially help to build up more robust information extraction systems for clinical researchers as well as aid clinicians and researchers in navigating clinical notes more effectively and quickly identify information pertaining to changes in health states. PMID:25717418

  15. Semantic Segmentation and Difference Extraction via Time Series Aerial Video Camera and its Application

    NASA Astrophysics Data System (ADS)

    Amit, S. N. K.; Saito, S.; Sasaki, S.; Kiyoki, Y.; Aoki, Y.

    2015-04-01

    Google earth with high-resolution imagery basically takes months to process new images before online updates. It is a time consuming and slow process especially for post-disaster application. The objective of this research is to develop a fast and effective method of updating maps by detecting local differences occurred over different time series; where only region with differences will be updated. In our system, aerial images from Massachusetts's road and building open datasets, Saitama district datasets are used as input images. Semantic segmentation is then applied to input images. Semantic segmentation is a pixel-wise classification of images by implementing deep neural network technique. Deep neural network technique is implemented due to being not only efficient in learning highly discriminative image features such as road, buildings etc., but also partially robust to incomplete and poorly registered target maps. Then, aerial images which contain semantic information are stored as database in 5D world map is set as ground truth images. This system is developed to visualise multimedia data in 5 dimensions; 3 dimensions as spatial dimensions, 1 dimension as temporal dimension, and 1 dimension as degenerated dimensions of semantic and colour combination dimension. Next, ground truth images chosen from database in 5D world map and a new aerial image with same spatial information but different time series are compared via difference extraction method. The map will only update where local changes had occurred. Hence, map updating will be cheaper, faster and more effective especially post-disaster application, by leaving unchanged region and only update changed region.

  16. Situation Tracking in Large Data Streams

    DTIC Science & Technology

    2015-02-01

    Pike, R. 1989: Different Ways to Cue a Coherent Memory System: A Theory for Episodic , Semantic , and Procedural Tasks. Psychological Review, 96(2), 208...to extract general concept models, such as Latent Semantic Indexing. This technique generally extracts a single topic model, and does not extract a...2001. Semantic leaps: frame-shifting and conceptual blending in meaning construction. Cambridge University Press. Humphreys, M. S, Bain, J. D

  17. Semantic classification of urban buildings combining VHR image and GIS data: An improved random forest approach

    NASA Astrophysics Data System (ADS)

    Du, Shihong; Zhang, Fangli; Zhang, Xiuyuan

    2015-07-01

    While most existing studies have focused on extracting geometric information on buildings, only a few have concentrated on semantic information. The lack of semantic information cannot satisfy many demands on resolving environmental and social issues. This study presents an approach to semantically classify buildings into much finer categories than those of existing studies by learning random forest (RF) classifier from a large number of imbalanced samples with high-dimensional features. First, a two-level segmentation mechanism combining GIS and VHR image produces single image objects at a large scale and intra-object components at a small scale. Second, a semi-supervised method chooses a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. Third, two important improvements in RF classifier are made: a voting-distribution ranked rule for reducing the influences of imbalanced samples on classification accuracy and a feature importance measurement for evaluating each feature's contribution to the recognition of each category. Fourth, the semantic classification of urban buildings is practically conducted in Beijing city, and the results demonstrate that the proposed approach is effective and accurate. The seven categories used in the study are finer than those in existing work and more helpful to studying many environmental and social problems.

  18. A common type system for clinical natural language processing

    PubMed Central

    2013-01-01

    Background One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. Results We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. Conclusions We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types. PMID:23286462

  19. A common type system for clinical natural language processing.

    PubMed

    Wu, Stephen T; Kaggal, Vinod C; Dligach, Dmitriy; Masanz, James J; Chen, Pei; Becker, Lee; Chapman, Wendy W; Savova, Guergana K; Liu, Hongfang; Chute, Christopher G

    2013-01-03

    One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types.

  20. Enhancing clinical concept extraction with distributional semantics

    PubMed Central

    Cohen, Trevor; Wu, Stephen; Gonzalez, Graciela

    2011-01-01

    Extracting concepts (such as drugs, symptoms, and diagnoses) from clinical narratives constitutes a basic enabling technology to unlock the knowledge within and support more advanced reasoning applications such as diagnosis explanation, disease progression modeling, and intelligent analysis of the effectiveness of treatment. The recent release of annotated training sets of de-identified clinical narratives has contributed to the development and refinement of concept extraction methods. However, as the annotation process is labor-intensive, training data are necessarily limited in the concepts and concept patterns covered, which impacts the performance of supervised machine learning applications trained with these data. This paper proposes an approach to minimize this limitation by combining supervised machine learning with empirical learning of semantic relatedness from the distribution of the relevant words in additional unannotated text. The approach uses a sequential discriminative classifier (Conditional Random Fields) to extract the mentions of medical problems, treatments and tests from clinical narratives. It takes advantage of all Medline abstracts indexed as being of the publication type “clinical trials” to estimate the relatedness between words in the i2b2/VA training and testing corpora. In addition to the traditional features such as dictionary matching, pattern matching and part-of-speech tags, we also used as a feature words that appear in similar contexts to the word in question (that is, words that have a similar vector representation measured with the commonly used cosine metric, where vector representations are derived using methods of distributional semantics). To the best of our knowledge, this is the first effort exploring the use of distributional semantics, the semantics derived empirically from unannotated text often using vector space models, for a sequence classification task such as concept extraction. Therefore, we first experimented with different sliding window models and found the model with parameters that led to best performance in a preliminary sequence labeling task. The evaluation of this approach, performed against the i2b2/VA concept extraction corpus, showed that incorporating features based on the distribution of words across a large unannotated corpus significantly aids concept extraction. Compared to a supervised-only approach as a baseline, the micro-averaged f-measure for exact match increased from 80.3% to 82.3% and the micro-averaged f-measure based on inexact match increased from 89.7% to 91.3%. These improvements are highly significant according to the bootstrap resampling method and also considering the performance of other systems. Thus, distributional semantic features significantly improve the performance of concept extraction from clinical narratives by taking advantage of word distribution information obtained from unannotated data. PMID:22085698

  1. Learning the Language of Healthcare Enabling Semantic Web Technology in CHCS

    DTIC Science & Technology

    2013-09-01

    tuples”, (subject, predicate, object), to relate data and achieve semantic interoperability . Other similar technologies exist, but their... Semantic Healthcare repository [5]. Ultimately, both of our data approaches were successful. However, our current test system is based on the CPRS demo...to extract system dependencies and workflows; to extract semantically related patient data ; and to browse patient- centric views into the system . We

  2. Ontology-Based Information Extraction for Business Intelligence

    NASA Astrophysics Data System (ADS)

    Saggion, Horacio; Funk, Adam; Maynard, Diana; Bontcheva, Kalina

    Business Intelligence (BI) requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers or feed statistical BI models and tools. The massive amount of information available to business analysts makes information extraction and other natural language processing tools key enablers for the acquisition and use of that semantic information. We describe the application of ontology-based extraction and merging in the context of a practical e-business application for the EU MUSING Project where the goal is to gather international company intelligence and country/region information. The results of our experiments so far are very promising and we are now in the process of building a complete end-to-end solution.

  3. Clinical Assistant Diagnosis for Electronic Medical Record Based on Convolutional Neural Network.

    PubMed

    Yang, Zhongliang; Huang, Yongfeng; Jiang, Yiran; Sun, Yuxi; Zhang, Yu-Jin; Luo, Pengcheng

    2018-04-20

    Automatically extracting useful information from electronic medical records along with conducting disease diagnoses is a promising task for both clinical decision support(CDS) and neural language processing(NLP). Most of the existing systems are based on artificially constructed knowledge bases, and then auxiliary diagnosis is done by rule matching. In this study, we present a clinical intelligent decision approach based on Convolutional Neural Networks(CNN), which can automatically extract high-level semantic information of electronic medical records and then perform automatic diagnosis without artificial construction of rules or knowledge bases. We use collected 18,590 copies of the real-world clinical electronic medical records to train and test the proposed model. Experimental results show that the proposed model can achieve 98.67% accuracy and 96.02% recall, which strongly supports that using convolutional neural network to automatically learn high-level semantic features of electronic medical records and then conduct assist diagnosis is feasible and effective.

  4. A semantic-based method for extracting concept definitions from scientific publications: evaluation in the autism phenotype domain.

    PubMed

    Hassanpour, Saeed; O'Connor, Martin J; Das, Amar K

    2013-08-12

    A variety of informatics approaches have been developed that use information retrieval, NLP and text-mining techniques to identify biomedical concepts and relations within scientific publications or their sentences. These approaches have not typically addressed the challenge of extracting more complex knowledge such as biomedical definitions. In our efforts to facilitate knowledge acquisition of rule-based definitions of autism phenotypes, we have developed a novel semantic-based text-mining approach that can automatically identify such definitions within text. Using an existing knowledge base of 156 autism phenotype definitions and an annotated corpus of 26 source articles containing such definitions, we evaluated and compared the average rank of correctly identified rule definition or corresponding rule template using both our semantic-based approach and a standard term-based approach. We examined three separate scenarios: (1) the snippet of text contained a definition already in the knowledge base; (2) the snippet contained an alternative definition for a concept in the knowledge base; and (3) the snippet contained a definition not in the knowledge base. Our semantic-based approach had a higher average rank than the term-based approach for each of the three scenarios (scenario 1: 3.8 vs. 5.0; scenario 2: 2.8 vs. 4.9; and scenario 3: 4.5 vs. 6.2), with each comparison significant at the p-value of 0.05 using the Wilcoxon signed-rank test. Our work shows that leveraging existing domain knowledge in the information extraction of biomedical definitions significantly improves the correct identification of such knowledge within sentences. Our method can thus help researchers rapidly acquire knowledge about biomedical definitions that are specified and evolving within an ever-growing corpus of scientific publications.

  5. Getting ahead of yourself: Parafoveal word expectancy modulates the N400 during sentence reading

    DOE PAGES

    Stites, Mallory C.; Payne, Brennan R.; Federmeier, Kara D.

    2017-01-18

    An important question in the reading literature regards the nature of the semantic information readers can extract from the parafovea (i.e., the next word in a sentence). Recent eye-tracking findings have found a semantic parafoveal preview benefit under many circumstances, and findings from event-related brain potentials (ERPs) also suggest that readers can at least detect semantic anomalies parafoveally. We use ERPs to ask whether fine-grained aspects of semantic expectancy can affect the N400 elicited by a word appearing in the parafovea. In an RSVP-with-flankers paradigm, sentences were presented word by word, flanked 2° bilaterally by the previous and upcoming words.more » Stimuli consisted of high constraint sentences that were identical up to the target word, which could be expected, unexpected but plausible, or anomalous, as well as low constraint sentences that were always completed with the most expected ending. Findings revealed an N400 effect to the target word when it appeared in the parafovea, which was graded with respect to the target’s expectancy and congruency within the sentence context. Moreover, when targets appeared at central fixation, this graded congruency effect was mitigated, suggesting that the semantic information gleaned from parafoveal vision functionally changes the semantic processing of those words when foveated.« less

  6. Getting ahead of yourself: Parafoveal word expectancy modulates the N400 during sentence reading

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stites, Mallory C.; Payne, Brennan R.; Federmeier, Kara D.

    An important question in the reading literature regards the nature of the semantic information readers can extract from the parafovea (i.e., the next word in a sentence). Recent eye-tracking findings have found a semantic parafoveal preview benefit under many circumstances, and findings from event-related brain potentials (ERPs) also suggest that readers can at least detect semantic anomalies parafoveally. We use ERPs to ask whether fine-grained aspects of semantic expectancy can affect the N400 elicited by a word appearing in the parafovea. In an RSVP-with-flankers paradigm, sentences were presented word by word, flanked 2° bilaterally by the previous and upcoming words.more » Stimuli consisted of high constraint sentences that were identical up to the target word, which could be expected, unexpected but plausible, or anomalous, as well as low constraint sentences that were always completed with the most expected ending. Findings revealed an N400 effect to the target word when it appeared in the parafovea, which was graded with respect to the target’s expectancy and congruency within the sentence context. Moreover, when targets appeared at central fixation, this graded congruency effect was mitigated, suggesting that the semantic information gleaned from parafoveal vision functionally changes the semantic processing of those words when foveated.« less

  7. Sortal anaphora resolution to enhance relation extraction from biomedical literature.

    PubMed

    Kilicoglu, Halil; Rosemblat, Graciela; Fiszman, Marcelo; Rindflesch, Thomas C

    2016-04-14

    Entity coreference is common in biomedical literature and it can affect text understanding systems that rely on accurate identification of named entities, such as relation extraction and automatic summarization. Coreference resolution is a foundational yet challenging natural language processing task which, if performed successfully, is likely to enhance such systems significantly. In this paper, we propose a semantically oriented, rule-based method to resolve sortal anaphora, a specific type of coreference that forms the majority of coreference instances in biomedical literature. The method addresses all entity types and relies on linguistic components of SemRep, a broad-coverage biomedical relation extraction system. It has been incorporated into SemRep, extending its core semantic interpretation capability from sentence level to discourse level. We evaluated our sortal anaphora resolution method in several ways. The first evaluation specifically focused on sortal anaphora relations. Our methodology achieved a F1 score of 59.6 on the test portion of a manually annotated corpus of 320 Medline abstracts, a 4-fold improvement over the baseline method. Investigating the impact of sortal anaphora resolution on relation extraction, we found that the overall effect was positive, with 50 % of the changes involving uninformative relations being replaced by more specific and informative ones, while 35 % of the changes had no effect, and only 15 % were negative. We estimate that anaphora resolution results in changes in about 1.5 % of approximately 82 million semantic relations extracted from the entire PubMed. Our results demonstrate that a heavily semantic approach to sortal anaphora resolution is largely effective for biomedical literature. Our evaluation and error analysis highlight some areas for further improvements, such as coordination processing and intra-sentential antecedent selection.

  8. Semantic Information Processing of Physical Simulation Based on Scientific Concept Vocabulary Model

    NASA Astrophysics Data System (ADS)

    Kino, Chiaki; Suzuki, Yoshio; Takemiya, Hiroshi

    Scientific Concept Vocabulary (SCV) has been developed to actualize Cognitive methodology based Data Analysis System: CDAS which supports researchers to analyze large scale data efficiently and comprehensively. SCV is an information model for processing semantic information for physics and engineering. In the model of SCV, all semantic information is related to substantial data and algorisms. Consequently, SCV enables a data analysis system to recognize the meaning of execution results output from a numerical simulation. This method has allowed a data analysis system to extract important information from a scientific view point. Previous research has shown that SCV is able to describe simple scientific indices and scientific perceptions. However, it is difficult to describe complex scientific perceptions by currently-proposed SCV. In this paper, a new data structure for SCV has been proposed in order to describe scientific perceptions in more detail. Additionally, the prototype of the new model has been constructed and applied to actual data of numerical simulation. The result means that the new SCV is able to describe more complex scientific perceptions.

  9. Industrial application of semantic process mining

    NASA Astrophysics Data System (ADS)

    Espen Ingvaldsen, Jon; Atle Gulla, Jon

    2012-05-01

    Process mining relates to the extraction of non-trivial and useful information from information system event logs. It is a new research discipline that has evolved significantly since the early work on idealistic process logs. Over the last years, process mining prototypes have incorporated elements from semantics and data mining and targeted visualisation techniques that are more user-friendly to business experts and process owners. In this article, we present a framework for evaluating different aspects of enterprise process flows and address practical challenges of state-of-the-art industrial process mining. We also explore the inherent strengths of the technology for more efficient process optimisation.

  10. Semantic Storyboard of Judicial Debates: A Novel Multimedia Summarization Environment

    ERIC Educational Resources Information Center

    Fersini, E.; Sartori, F.

    2012-01-01

    Purpose: The need of tools for content analysis, information extraction and retrieval of multimedia objects in their native form is strongly emphasized into the judicial domain: digital videos represent a fundamental informative source of events occurring during judicial proceedings that should be stored, organized and retrieved in short time and…

  11. Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach

    PubMed Central

    2012-01-01

    Background Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field. Methods We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors. Results We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points. Conclusions We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data. PMID:22759462

  12. Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach.

    PubMed

    Ratkovic, Zorana; Golik, Wiktoria; Warnier, Pierre

    2012-06-26

    Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field. We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors. We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points. We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data.

  13. Relation Extraction with Weak Supervision and Distributional Semantics

    DTIC Science & Technology

    2013-05-01

    DATES COVERED 00-00-2013 to 00-00-2013 4 . TITLE AND SUBTITLE Relation Extraction with Weak Supervision and Distributional Semantics 5a...ix List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x 1 Introduction 1 2 Prior Work 4 ...2.1 Supervised relation extraction . . . . . . . . . . . . . . . . . . . . . 4 2.2 Distant supervision for relation extraction

  14. Latent semantic analysis.

    PubMed

    Evangelopoulos, Nicholas E

    2013-11-01

    This article reviews latent semantic analysis (LSA), a theory of meaning as well as a method for extracting that meaning from passages of text, based on statistical computations over a collection of documents. LSA as a theory of meaning defines a latent semantic space where documents and individual words are represented as vectors. LSA as a computational technique uses linear algebra to extract dimensions that represent that space. This representation enables the computation of similarity among terms and documents, categorization of terms and documents, and summarization of large collections of documents using automated procedures that mimic the way humans perform similar cognitive tasks. We present some technical details, various illustrative examples, and discuss a number of applications from linguistics, psychology, cognitive science, education, information science, and analysis of textual data in general. WIREs Cogn Sci 2013, 4:683-692. doi: 10.1002/wcs.1254 CONFLICT OF INTEREST: The author has declared no conflicts of interest for this article. For further resources related to this article, please visit the WIREs website. © 2013 John Wiley & Sons, Ltd.

  15. Modelling Metamorphism by Abstract Interpretation

    NASA Astrophysics Data System (ADS)

    Dalla Preda, Mila; Giacobazzi, Roberto; Debray, Saumya; Coogan, Kevin; Townsend, Gregg M.

    Metamorphic malware apply semantics-preserving transformations to their own code in order to foil detection systems based on signature matching. In this paper we consider the problem of automatically extract metamorphic signatures from these malware. We introduce a semantics for self-modifying code, later called phase semantics, and prove its correctness by showing that it is an abstract interpretation of the standard trace semantics. Phase semantics precisely models the metamorphic code behavior by providing a set of traces of programs which correspond to the possible evolutions of the metamorphic code during execution. We show that metamorphic signatures can be automatically extracted by abstract interpretation of the phase semantics, and that regular metamorphism can be modelled as finite state automata abstraction of the phase semantics.

  16. Semantic Interoperability of Health Risk Assessments

    PubMed Central

    Rajda, Jay; Vreeman, Daniel J.; Wei, Henry G.

    2011-01-01

    The health insurance and benefits industry has administered Health Risk Assessments (HRAs) at an increasing rate. These are used to collect data on modifiable health risk factors for wellness and disease management programs. However, there is significant variability in the semantics of these assessments, making it difficult to compare data sets from the output of 2 different HRAs. There is also an increasing need to exchange this data with Health Information Exchanges and Electronic Medical Records. To standardize the data and concepts from these tools, we outline a process to determine presence of certain common elements of modifiable health risk extracted from these surveys. This information is coded using concept identifiers, which allows cross-survey comparison and analysis. We propose that using LOINC codes or other universal coding schema may allow semantic interoperability of a variety of HRA tools across the industry, research, and clinical settings. PMID:22195174

  17. Medical Image Databases

    PubMed Central

    Tagare, Hemant D.; Jaffe, C. Carl; Duncan, James

    1997-01-01

    Abstract Information contained in medical images differs considerably from that residing in alphanumeric format. The difference can be attributed to four characteristics: (1) the semantics of medical knowledge extractable from images is imprecise; (2) image information contains form and spatial data, which are not expressible in conventional language; (3) a large part of image information is geometric; (4) diagnostic inferences derived from images rest on an incomplete, continuously evolving model of normality. This paper explores the differentiating characteristics of text versus images and their impact on design of a medical image database intended to allow content-based indexing and retrieval. One strategy for implementing medical image databases is presented, which employs object-oriented iconic queries, semantics by association with prototypes, and a generic schema. PMID:9147338

  18. Supporting Better Treatments for Meeting Health Consumers' Needs: Extracting Semantics in Social Data for Representing a Consumer Health Ontology

    ERIC Educational Resources Information Center

    Choi, Yunseon

    2016-01-01

    Introduction: The purpose of this paper is to provide a framework for building a consumer health ontology using social tags. This would assist health users when they are accessing health information and increase the number of documents relevant to their needs. Methods: In order to extract concepts from social tags, this study conducted an…

  19. Biological event composition

    PubMed Central

    2012-01-01

    Background In recent years, biological event extraction has emerged as a key natural language processing task, aiming to address the information overload problem in accessing the molecular biology literature. The BioNLP shared task competitions have contributed to this recent interest considerably. The first competition (BioNLP'09) focused on extracting biological events from Medline abstracts from a narrow domain, while the theme of the latest competition (BioNLP-ST'11) was generalization and a wider range of text types, event types, and subject domains were considered. We view event extraction as a building block in larger discourse interpretation and propose a two-phase, linguistically-grounded, rule-based methodology. In the first phase, a general, underspecified semantic interpretation is composed from syntactic dependency relations in a bottom-up manner. The notion of embedding underpins this phase and it is informed by a trigger dictionary and argument identification rules. Coreference resolution is also performed at this step, allowing extraction of inter-sentential relations. The second phase is concerned with constraining the resulting semantic interpretation by shared task specifications. We evaluated our general methodology on core biological event extraction and speculation/negation tasks in three main tracks of BioNLP-ST'11 (GENIA, EPI, and ID). Results We achieved competitive results in GENIA and ID tracks, while our results in the EPI track leave room for improvement. One notable feature of our system is that its performance across abstracts and articles bodies is stable. Coreference resolution results in minor improvement in system performance. Due to our interest in discourse-level elements, such as speculation/negation and coreference, we provide a more detailed analysis of our system performance in these subtasks. Conclusions The results demonstrate the viability of a robust, linguistically-oriented methodology, which clearly distinguishes general semantic interpretation from shared task specific aspects, for biological event extraction. Our error analysis pinpoints some shortcomings, which we plan to address in future work within our incremental system development methodology. PMID:22759461

  20. UltiMatch-NL: A Web Service Matchmaker Based on Multiple Semantic Filters

    PubMed Central

    Mohebbi, Keyvan; Ibrahim, Suhaimi; Zamani, Mazdak; Khezrian, Mojtaba

    2014-01-01

    In this paper, a Semantic Web service matchmaker called UltiMatch-NL is presented. UltiMatch-NL applies two filters namely Signature-based and Description-based on different abstraction levels of a service profile to achieve more accurate results. More specifically, the proposed filters rely on semantic knowledge to extract the similarity between a given pair of service descriptions. Thus it is a further step towards fully automated Web service discovery via making this process more semantic-aware. In addition, a new technique is proposed to weight and combine the results of different filters of UltiMatch-NL, automatically. Moreover, an innovative approach is introduced to predict the relevance of requests and Web services and eliminate the need for setting a threshold value of similarity. In order to evaluate UltiMatch-NL, the repository of OWLS-TC is used. The performance evaluation based on standard measures from the information retrieval field shows that semantic matching of OWL-S services can be significantly improved by incorporating designed matching filters. PMID:25157872

  1. Assessing semantic similarity of texts - Methods and algorithms

    NASA Astrophysics Data System (ADS)

    Rozeva, Anna; Zerkova, Silvia

    2017-12-01

    Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis, which implements text-mining techniques. Text mining involves several pre-processing steps, which provide for obtaining structured representative model of the documents in a corpus by means of extracting and selecting the features, characterizing their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at syntactical and semantic level. An important text-mining method and similarity measure is latent semantic analysis (LSA). It provides for reducing the dimensionality of the document vector space and better capturing the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space as well as similarity calculation are presented.

  2. UltiMatch-NL: a Web service matchmaker based on multiple semantic filters.

    PubMed

    Mohebbi, Keyvan; Ibrahim, Suhaimi; Zamani, Mazdak; Khezrian, Mojtaba

    2014-01-01

    In this paper, a Semantic Web service matchmaker called UltiMatch-NL is presented. UltiMatch-NL applies two filters namely Signature-based and Description-based on different abstraction levels of a service profile to achieve more accurate results. More specifically, the proposed filters rely on semantic knowledge to extract the similarity between a given pair of service descriptions. Thus it is a further step towards fully automated Web service discovery via making this process more semantic-aware. In addition, a new technique is proposed to weight and combine the results of different filters of UltiMatch-NL, automatically. Moreover, an innovative approach is introduced to predict the relevance of requests and Web services and eliminate the need for setting a threshold value of similarity. In order to evaluate UltiMatch-NL, the repository of OWLS-TC is used. The performance evaluation based on standard measures from the information retrieval field shows that semantic matching of OWL-S services can be significantly improved by incorporating designed matching filters.

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rios Velazquez, E; Parmar, C; Narayan, V

    Purpose: To compare the complementary value of quantitative radiomic features to that of radiologist-annotated semantic features in predicting EGFR mutations in lung adenocarcinomas. Methods: Pre-operative CT images of 258 lung adenocarcinoma patients were available. Tumors were segmented using the sing-click ensemble segmentation algorithm. A set of radiomic features was extracted using 3D-Slicer. Test-retest reproducibility and unsupervised dimensionality reduction were applied to select a subset of reproducible and independent radiomic features. Twenty semantic annotations were scored by an expert radiologist, describing the tumor, surrounding tissue and associated findings. Minimum-redundancy-maximum-relevance (MRMR) was used to identify the most informative radiomic and semantic featuresmore » in 172 patients (training-set, temporal split). Radiomic, semantic and combined radiomic-semantic logistic regression models to predict EGFR mutations were evaluated in and independent validation dataset of 86 patients using the area under the receiver operating curve (AUC). Results: EGFR mutations were found in 77/172 (45%) and 39/86 (45%) of the training and validation sets, respectively. Univariate AUCs showed a similar range for both feature types: radiomics median AUC = 0.57 (range: 0.50 – 0.62); semantic median AUC = 0.53 (range: 0.50 – 0.64, Wilcoxon p = 0.55). After MRMR feature selection, the best-performing radiomic, semantic, and radiomic-semantic logistic regression models, for EGFR mutations, showed a validation AUC of 0.56 (p = 0.29), 0.63 (p = 0.063) and 0.67 (p = 0.004), respectively. Conclusion: Quantitative volumetric and textural Radiomic features complement the qualitative and semi-quantitative radiologist annotations. The prognostic value of informative qualitative semantic features such as cavitation and lobulation is increased with the addition of quantitative textural features from the tumor region.« less

  4. Intelligent Semantic Query of Notices to Airmen (NOTAMs)

    DTIC Science & Technology

    2006-07-01

    definition of the airspace is constantly changing, new vocabulary is added and old words retired on a monthly basis, and the information specifying this is...NOTAMs are notices containing information on the conditions, or changes to, aeronautical facilities, services, procedures, or hazards, which are...develop a new parsing system, employing and extending ideas developed by the information-extraction community, rather than on classical computational

  5. Deep visual-semantic for crowded video understanding

    NASA Astrophysics Data System (ADS)

    Deng, Chunhua; Zhang, Junwen

    2018-03-01

    Visual-semantic features play a vital role for crowded video understanding. Convolutional Neural Networks (CNNs) have experienced a significant breakthrough in learning representations from images. However, the learning of visualsemantic features, and how it can be effectively extracted for video analysis, still remains a challenging task. In this study, we propose a novel visual-semantic method to capture both appearance and dynamic representations. In particular, we propose a spatial context method, based on the fractional Fisher vector (FV) encoding on CNN features, which can be regarded as our main contribution. In addition, to capture temporal context information, we also applied fractional encoding method on dynamic images. Experimental results on the WWW crowed video dataset demonstrate that the proposed method outperform the state of the art.

  6. Medical knowledge discovery and management.

    PubMed

    Prior, Fred

    2009-05-01

    Although the volume of medical information is growing rapidly, the ability to rapidly convert this data into "actionable insights" and new medical knowledge is lagging far behind. The first step in the knowledge discovery process is data management and integration, which logically can be accomplished through the application of data warehouse technologies. A key insight that arises from efforts in biosurveillance and the global scope of military medicine is that information must be integrated over both time (longitudinal health records) and space (spatial localization of health-related events). Once data are compiled and integrated it is essential to encode the semantics and relationships among data elements through the use of ontologies and semantic web technologies to convert data into knowledge. Medical images form a special class of health-related information. Traditionally knowledge has been extracted from images by human observation and encoded via controlled terminologies. This approach is rapidly being replaced by quantitative analyses that more reliably support knowledge extraction. The goals of knowledge discovery are the improvement of both the timeliness and accuracy of medical decision making and the identification of new procedures and therapies.

  7. Discovering gene annotations in biomedical text databases

    PubMed Central

    Cakmak, Ali; Ozsoyoglu, Gultekin

    2008-01-01

    Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need forautomated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discoveredknowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns", and a semantic matching framework to locate phrases matching to a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN has reached to the precision level of 78% at therecall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieve high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exactmatching" with the advantage of locating approximate pattern occurrences with similar semantics. Relatively low recall performance of our pattern-based approach may be enhanced either by employing a probabilistic annotation framework based on the annotation neighbourhoods in textual data, or, alternatively, the statistical enrichment threshold may be adjusted to lower values for applications that put more value on achieving higher recall values. PMID:18325104

  8. Discovering gene annotations in biomedical text databases.

    PubMed

    Cakmak, Ali; Ozsoyoglu, Gultekin

    2008-03-06

    Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need forautomated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discoveredknowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns", and a semantic matching framework to locate phrases matching to a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN has reached to the precision level of 78% at therecall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieve high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exactmatching" with the advantage of locating approximate pattern occurrences with similar semantics. Relatively low recall performance of our pattern-based approach may be enhanced either by employing a probabilistic annotation framework based on the annotation neighbourhoods in textual data, or, alternatively, the statistical enrichment threshold may be adjusted to lower values for applications that put more value on achieving higher recall values.

  9. Discovering biomedical semantic relations in PubMed queries for information retrieval and database curation

    PubMed Central

    Huang, Chung-Chi; Lu, Zhiyong

    2016-01-01

    Identifying relevant papers from the literature is a common task in biocuration. Most current biomedical literature search systems primarily rely on matching user keywords. Semantic search, on the other hand, seeks to improve search accuracy by understanding the entities and contextual relations in user keywords. However, past research has mostly focused on semantically identifying biological entities (e.g. chemicals, diseases and genes) with little effort on discovering semantic relations. In this work, we aim to discover biomedical semantic relations in PubMed queries in an automated and unsupervised fashion. Specifically, we focus on extracting and understanding the contextual information (or context patterns) that is used by PubMed users to represent semantic relations between entities such as ‘CHEMICAL-1 compared to CHEMICAL-2.’ With the advances in automatic named entity recognition, we first tag entities in PubMed queries and then use tagged entities as knowledge to recognize pattern semantics. More specifically, we transform PubMed queries into context patterns involving participating entities, which are subsequently projected to latent topics via latent semantic analysis (LSA) to avoid the data sparseness and specificity issues. Finally, we mine semantically similar contextual patterns or semantic relations based on LSA topic distributions. Our two separate evaluation experiments of chemical-chemical (CC) and chemical–disease (CD) relations show that the proposed approach significantly outperforms a baseline method, which simply measures pattern semantics by similarity in participating entities. The highest performance achieved by our approach is nearly 0.9 and 0.85 respectively for the CC and CD task when compared against the ground truth in terms of normalized discounted cumulative gain (nDCG), a standard measure of ranking quality. These results suggest that our approach can effectively identify and return related semantic patterns in a ranked order covering diverse bio-entity relations. To assess the potential utility of our automated top-ranked patterns of a given relation in semantic search, we performed a pilot study on frequently sought semantic relations in PubMed and observed improved literature retrieval effectiveness based on post-hoc human relevance evaluation. Further investigation in larger tests and in real-world scenarios is warranted. PMID:27016698

  10. Bridging the semantic gap in sports

    NASA Astrophysics Data System (ADS)

    Li, Baoxin; Errico, James; Pan, Hao; Sezan, M. Ibrahim

    2003-01-01

    One of the major challenges facing current media management systems and the related applications is the so-called "semantic gap" between the rich meaning that a user desires and the shallowness of the content descriptions that are automatically extracted from the media. In this paper, we address the problem of bridging this gap in the sports domain. We propose a general framework for indexing and summarizing sports broadcast programs. The framework is based on a high-level model of sports broadcast video using the concept of an event, defined according to domain-specific knowledge for different types of sports. Within this general framework, we develop automatic event detection algorithms that are based on automatic analysis of the visual and aural signals in the media. We have successfully applied the event detection algorithms to different types of sports including American football, baseball, Japanese sumo wrestling, and soccer. Event modeling and detection contribute to the reduction of the semantic gap by providing rudimentary semantic information obtained through media analysis. We further propose a novel approach, which makes use of independently generated rich textual metadata, to fill the gap completely through synchronization of the information-laden textual data with the basic event segments. An MPEG-7 compliant prototype browsing system has been implemented to demonstrate semantic retrieval and summarization of sports video.

  11. Semantic segmentation of mFISH images using convolutional networks.

    PubMed

    Pardo, Esteban; Morgado, José Mário T; Malpica, Norberto

    2018-04-30

    Multicolor in situ hybridization (mFISH) is a karyotyping technique used to detect major chromosomal alterations using fluorescent probes and imaging techniques. Manual interpretation of mFISH images is a time consuming step that can be automated using machine learning; in previous works, pixel or patch wise classification was employed, overlooking spatial information which can help identify chromosomes. In this work, we propose a fully convolutional semantic segmentation network for the interpretation of mFISH images, which uses both spatial and spectral information to classify each pixel in an end-to-end fashion. The semantic segmentation network developed was tested on samples extracted from a public dataset using cross validation. Despite having no labeling information of the image it was tested on, our algorithm yielded an average correct classification ratio (CCR) of 87.41%. Previously, this level of accuracy was only achieved with state of the art algorithms when classifying pixels from the same image in which the classifier has been trained. These results provide evidence that fully convolutional semantic segmentation networks may be employed in the computer aided diagnosis of genetic diseases with improved performance over the current image analysis methods. © 2018 International Society for Advancement of Cytometry. © 2018 International Society for Advancement of Cytometry.

  12. The processing of phonological, orthographical, and lexical information of Chinese characters in sentence contexts: an ERP study.

    PubMed

    Liu, Baolin; Jin, Zhixing; Qing, Zhao; Wang, Zhongning

    2011-02-04

    In the current work, we aimed to study the processing of phonological, orthographical, and lexical information of Chinese characters in sentence contexts, as well as to provide further evidence for psychological models. In the experiment, we designed sentences with expected, homophonic, orthographically similar, synonymous, and control characters as endings, respectively. The results indicated that P200 might be related to the early extraction of phonological information. Moreover, it might also represent immediate semantic and orthographic lexical access. This suggested that there might be a dual-route in cognitive processing, where the direct access route and the phonologically mediated access route both exist and interact with each other. The increased N400 under the control condition suggested that both phonological and orthographical information would influence semantic integration in Chinese sentence comprehension. The two positive peaks of the late positive shift might represent the semantic monitoring, and orthographical retrieval and reanalysis processing, respectively. Under the orthographically similar condition, orthographical retrieval and reanalysis processing was more difficult in comparison with the other conditions, which suggested that there might be direct access from orthography to semantic representation in cognitive processing. In conclusion, it was shown that the direct access hypothesis or the dual-route hypothesis could better explain cognitive processing in the brain. Copyright © 2010 Elsevier B.V. All rights reserved.

  13. Automatic extraction of road features in urban environments using dense ALS data

    NASA Astrophysics Data System (ADS)

    Soilán, Mario; Truong-Hong, Linh; Riveiro, Belén; Laefer, Debra

    2018-02-01

    This paper describes a methodology that automatically extracts semantic information from urban ALS data for urban parameterization and road network definition. First, building façades are segmented from the ground surface by combining knowledge-based information with both voxel and raster data. Next, heuristic rules and unsupervised learning are applied to the ground surface data to distinguish sidewalk and pavement points as a means for curb detection. Then radiometric information was employed for road marking extraction. Using high-density ALS data from Dublin, Ireland, this fully automatic workflow was able to generate a F-score close to 95% for pavement and sidewalk identification with a resolution of 20 cm and better than 80% for road marking detection.

  14. Bladder cancer treatment response assessment with radiomic, clinical, and radiologist semantic features

    NASA Astrophysics Data System (ADS)

    Gordon, Marshall N.; Cha, Kenny H.; Hadjiiski, Lubomir M.; Chan, Heang-Ping; Cohan, Richard H.; Caoili, Elaine M.; Paramagul, Chintana; Alva, Ajjai; Weizer, Alon Z.

    2018-02-01

    We are developing a decision support system for assisting clinicians in assessment of response to neoadjuvant chemotherapy for bladder cancer. Accurate treatment response assessment is crucial for identifying responders and improving quality of life for non-responders. An objective machine learning decision support system may help reduce variability and inaccuracy in treatment response assessment. We developed a predictive model to assess the likelihood that a patient will respond based on image and clinical features. With IRB approval, we retrospectively collected a data set of pre- and post- treatment CT scans along with clinical information from surgical pathology from 98 patients. A linear discriminant analysis (LDA) classifier was used to predict the likelihood that a patient would respond to treatment based on radiomic features extracted from CT urography (CTU), a radiologist's semantic feature, and a clinical feature extracted from surgical and pathology reports. The classification accuracy was evaluated using the area under the ROC curve (AUC) with a leave-one-case-out cross validation. The classification accuracy was compared for the systems based on radiomic features, clinical feature, and radiologist's semantic feature. For the system based on only radiomic features the AUC was 0.75. With the addition of clinical information from examination under anesthesia (EUA) the AUC was improved to 0.78. Our study demonstrated the potential of designing a decision support system to assist in treatment response assessment. The combination of clinical features, radiologist semantic features and CTU radiomic features improved the performance of the classifier and the accuracy of treatment response assessment.

  15. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications

    PubMed Central

    Masanz, James J; Ogren, Philip V; Zheng, Jiaping; Sohn, Sunghwan; Kipper-Schuler, Karin C; Chute, Christopher G

    2010-01-01

    We aim to build and evaluate an open-source natural language processing system for information extraction from electronic medical record clinical free-text. We describe and evaluate our system, the clinical Text Analysis and Knowledge Extraction System (cTAKES), released open-source at http://www.ohnlp.org. The cTAKES builds on existing open-source technologies—the Unstructured Information Management Architecture framework and OpenNLP natural language processing toolkit. Its components, specifically trained for the clinical domain, create rich linguistic and semantic annotations. Performance of individual components: sentence boundary detector accuracy=0.949; tokenizer accuracy=0.949; part-of-speech tagger accuracy=0.936; shallow parser F-score=0.924; named entity recognizer and system-level evaluation F-score=0.715 for exact and 0.824 for overlapping spans, and accuracy for concept mapping, negation, and status attributes for exact and overlapping spans of 0.957, 0.943, 0.859, and 0.580, 0.939, and 0.839, respectively. Overall performance is discussed against five applications. The cTAKES annotations are the foundation for methods and modules for higher-level semantic processing of clinical free-text. PMID:20819853

  16. Identifying biological concepts from a protein-related corpus with a probabilistic topic model

    PubMed Central

    Zheng, Bin; McLean, David C; Lu, Xinghua

    2006-01-01

    Background Biomedical literature, e.g., MEDLINE, contains a wealth of knowledge regarding functions of proteins. Major recurring biological concepts within such text corpora represent the domains of this body of knowledge. The goal of this research is to identify the major biological topics/concepts from a corpus of protein-related MEDLINE© titles and abstracts by applying a probabilistic topic model. Results The latent Dirichlet allocation (LDA) model was applied to the corpus. Based on the Bayesian model selection, 300 major topics were extracted from the corpus. The majority of identified topics/concepts was found to be semantically coherent and most represented biological objects or concepts. The identified topics/concepts were further mapped to the controlled vocabulary of the Gene Ontology (GO) terms based on mutual information. Conclusion The major and recurring biological concepts within a collection of MEDLINE documents can be extracted by the LDA model. The identified topics/concepts provide parsimonious and semantically-enriched representation of the texts in a semantic space with reduced dimensionality and can be used to index text. PMID:16466569

  17. SAS- Semantic Annotation Service for Geoscience resources on the web

    NASA Astrophysics Data System (ADS)

    Elag, M.; Kumar, P.; Marini, L.; Li, R.; Jiang, P.

    2015-12-01

    There is a growing need for increased integration across the data and model resources that are disseminated on the web to advance their reuse across different earth science applications. Meaningful reuse of resources requires semantic metadata to realize the semantic web vision for allowing pragmatic linkage and integration among resources. Semantic metadata associates standard metadata with resources to turn them into semantically-enabled resources on the web. However, the lack of a common standardized metadata framework as well as the uncoordinated use of metadata fields across different geo-information systems, has led to a situation in which standards and related Standard Names abound. To address this need, we have designed SAS to provide a bridge between the core ontologies required to annotate resources and information systems in order to enable queries and analysis over annotation from a single environment (web). SAS is one of the services that are provided by the Geosematnic framework, which is a decentralized semantic framework to support the integration between models and data and allow semantically heterogeneous to interact with minimum human intervention. Here we present the design of SAS and demonstrate its application for annotating data and models. First we describe how predicates and their attributes are extracted from standards and ingested in the knowledge-base of the Geosemantic framework. Then we illustrate the application of SAS in annotating data managed by SEAD and annotating simulation models that have web interface. SAS is a step in a broader approach to raise the quality of geoscience data and models that are published on the web and allow users to better search, access, and use of the existing resources based on standard vocabularies that are encoded and published using semantic technologies.

  18. Rule Extracting based on MCG with its Application in Helicopter Power Train Fault Diagnosis

    NASA Astrophysics Data System (ADS)

    Wang, M.; Hu, N. Q.; Qin, G. J.

    2011-07-01

    In order to extract decision rules for fault diagnosis from incomplete historical test records for knowledge-based damage assessment of helicopter power train structure. A method that can directly extract the optimal generalized decision rules from incomplete information based on GrC was proposed. Based on semantic analysis of unknown attribute value, the granule was extended to handle incomplete information. Maximum characteristic granule (MCG) was defined based on characteristic relation, and MCG was used to construct the resolution function matrix. The optimal general decision rule was introduced, with the basic equivalent forms of propositional logic, the rules were extracted and reduction from incomplete information table. Combined with a fault diagnosis example of power train, the application approach of the method was present, and the validity of this method in knowledge acquisition was proved.

  19. Joint modality fusion and temporal context exploitation for semantic video analysis

    NASA Astrophysics Data System (ADS)

    Papadopoulos, Georgios Th; Mezaris, Vasileios; Kompatsiaris, Ioannis; Strintzis, Michael G.

    2011-12-01

    In this paper, a multi-modal context-aware approach to semantic video analysis is presented. Overall, the examined video sequence is initially segmented into shots and for every resulting shot appropriate color, motion and audio features are extracted. Then, Hidden Markov Models (HMMs) are employed for performing an initial association of each shot with the semantic classes that are of interest separately for each modality. Subsequently, a graphical modeling-based approach is proposed for jointly performing modality fusion and temporal context exploitation. Novelties of this work include the combined use of contextual information and multi-modal fusion, and the development of a new representation for providing motion distribution information to HMMs. Specifically, an integrated Bayesian Network is introduced for simultaneously performing information fusion of the individual modality analysis results and exploitation of temporal context, contrary to the usual practice of performing each task separately. Contextual information is in the form of temporal relations among the supported classes. Additionally, a new computationally efficient method for providing motion energy distribution-related information to HMMs, which supports the incorporation of motion characteristics from previous frames to the currently examined one, is presented. The final outcome of this overall video analysis framework is the association of a semantic class with every shot. Experimental results as well as comparative evaluation from the application of the proposed approach to four datasets belonging to the domains of tennis, news and volleyball broadcast video are presented.

  20. Concept indexing and expansion for social multimedia websites based on semantic processing and graph analysis

    NASA Astrophysics Data System (ADS)

    Lin, Po-Chuan; Chen, Bo-Wei; Chang, Hangbae

    2016-07-01

    This study presents a human-centric technique for social video expansion based on semantic processing and graph analysis. The objective is to increase metadata of an online video and to explore related information, thereby facilitating user browsing activities. To analyze the semantic meaning of a video, shots and scenes are firstly extracted from the video on the server side. Subsequently, this study uses annotations along with ConceptNet to establish the underlying framework. Detailed metadata, including visual objects and audio events among the predefined categories, are indexed by using the proposed method. Furthermore, relevant online media associated with each category are also analyzed to enrich the existing content. With the above-mentioned information, users can easily browse and search the content according to the link analysis and its complementary knowledge. Experiments on a video dataset are conducted for evaluation. The results show that our system can achieve satisfactory performance, thereby demonstrating the feasibility of the proposed idea.

  1. A novel key-frame extraction approach for both video summary and video index.

    PubMed

    Lei, Shaoshuai; Xie, Gang; Yan, Gaowei

    2014-01-01

    Existing key-frame extraction methods are basically video summary oriented; yet the index task of key-frames is ignored. This paper presents a novel key-frame extraction approach which can be available for both video summary and video index. First a dynamic distance separability algorithm is advanced to divide a shot into subshots based on semantic structure, and then appropriate key-frames are extracted in each subshot by SVD decomposition. Finally, three evaluation indicators are proposed to evaluate the performance of the new approach. Experimental results show that the proposed approach achieves good semantic structure for semantics-based video index and meanwhile produces video summary consistent with human perception.

  2. Talk this way: the effect of prosodically conveyed semantic information on memory for novel words.

    PubMed

    Shintel, Hadas; Anderson, Nathan L; Fenn, Kimberly M

    2014-08-01

    Speakers modulate their prosody to express not only emotional information but also semantic information (e.g., raising pitch for upward motion). Moreover, this information can help listeners infer meaning. Work investigating the communicative role of prosodically conveyed meaning has focused on reference resolution, and potential mnemonic benefits remain unexplored. We investigated the effect of prosody on memory for the meaning of novel words, even when it conveys superfluous information. Participants heard novel words, produced with congruent or incongruent prosody, and viewed image pairs representing the intended meaning and its antonym (e.g., a small and a large dog). Importantly, an arrow indicated the image representing the intended meaning, resolving the ambiguity. Participants then completed 2 memory tests, either immediately after learning or after a 24-hr delay, on which they chose an image (out of a new image pair) and a definition that best represented the word. On the image test, memory was similar on the immediate test, but incongruent prosody led to greater loss over time. On the definition test, memory was better for congruent prosody at both times. Results suggest that listeners extract semantic information from prosody even when it is redundant and that prosody can enhance memory, beyond its role in comprehension. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  3. Using Data Crawlers and Semantic Web to Build Financial XBRL Data Generators: The SONAR Extension Approach

    PubMed Central

    Rodríguez-García, Miguel Ángel; Rodríguez-González, Alejandro; Valencia-García, Rafael; Gómez-Berbís, Juan Miguel

    2014-01-01

    Precise, reliable and real-time financial information is critical for added-value financial services after the economic turmoil from which markets are still struggling to recover. Since the Web has become the most significant data source, intelligent crawlers based on Semantic Technologies have become trailblazers in the search of knowledge combining natural language processing and ontology engineering techniques. In this paper, we present the SONAR extension approach, which will leverage the potential of knowledge representation by extracting, managing, and turning scarce and disperse financial information into well-classified, structured, and widely used XBRL format-oriented knowledge, strongly supported by a proof-of-concept implementation and a thorough evaluation of the benefits of the approach. PMID:24587726

  4. Tweets clustering using latent semantic analysis

    NASA Astrophysics Data System (ADS)

    Rasidi, Norsuhaili Mahamed; Bakar, Sakhinah Abu; Razak, Fatimah Abdul

    2017-04-01

    Social media are becoming overloaded with information due to the increasing number of information feeds. Unlike other social media, Twitter users are allowed to broadcast a short message called as `tweet". In this study, we extract tweets related to MH370 for certain of time. In this paper, we present overview of our approach for tweets clustering to analyze the users' responses toward tragedy of MH370. The tweets were clustered based on the frequency of terms obtained from the classification process. The method we used for the text classification is Latent Semantic Analysis. As a result, there are two types of tweets that response to MH370 tragedy which is emotional and non-emotional. We show some of our initial results to demonstrate the effectiveness of our approach.

  5. Using data crawlers and semantic Web to build financial XBRL data generators: the SONAR extension approach.

    PubMed

    Rodríguez-García, Miguel Ángel; Rodríguez-González, Alejandro; Colomo-Palacios, Ricardo; Valencia-García, Rafael; Gómez-Berbís, Juan Miguel; García-Sánchez, Francisco

    2014-01-01

    Precise, reliable and real-time financial information is critical for added-value financial services after the economic turmoil from which markets are still struggling to recover. Since the Web has become the most significant data source, intelligent crawlers based on Semantic Technologies have become trailblazers in the search of knowledge combining natural language processing and ontology engineering techniques. In this paper, we present the SONAR extension approach, which will leverage the potential of knowledge representation by extracting, managing, and turning scarce and disperse financial information into well-classified, structured, and widely used XBRL format-oriented knowledge, strongly supported by a proof-of-concept implementation and a thorough evaluation of the benefits of the approach.

  6. EXACT2: the semantics of biomedical protocols

    PubMed Central

    2014-01-01

    Background The reliability and reproducibility of experimental procedures is a cornerstone of scientific practice. There is a pressing technological need for the better representation of biomedical protocols to enable other agents (human or machine) to better reproduce results. A framework that ensures that all information required for the replication of experimental protocols is essential to achieve reproducibility. Methods We have developed the ontology EXACT2 (EXperimental ACTions) that is designed to capture the full semantics of biomedical protocols required for their reproducibility. To construct EXACT2 we manually inspected hundreds of published and commercial biomedical protocols from several areas of biomedicine. After establishing a clear pattern for extracting the required information we utilized text-mining tools to translate the protocols into a machine amenable format. We have verified the utility of EXACT2 through the successful processing of previously 'unseen' (not used for the construction of EXACT2) protocols. Results The paper reports on a fundamentally new version EXACT2 that supports the semantically-defined representation of biomedical protocols. The ability of EXACT2 to capture the semantics of biomedical procedures was verified through a text mining use case. In this EXACT2 is used as a reference model for text mining tools to identify terms pertinent to experimental actions, and their properties, in biomedical protocols expressed in natural language. An EXACT2-based framework for the translation of biomedical protocols to a machine amenable format is proposed. Conclusions The EXACT2 ontology is sufficient to record, in a machine processable form, the essential information about biomedical protocols. EXACT2 defines explicit semantics of experimental actions, and can be used by various computer applications. It can serve as a reference model for for the translation of biomedical protocols in natural language into a semantically-defined format. PMID:25472549

  7. Ontology Research and Development. Part 1-A Review of Ontology Generation.

    ERIC Educational Resources Information Center

    Ding, Ying; Foo, Schubert

    2002-01-01

    Discusses the role of ontology in knowledge representation, including enabling content-based access, interoperability, communications, and new levels of service on the Semantic Web; reviews current ontology generation studies and projects as well as problems facing such research; and discusses ontology mapping, information extraction, natural…

  8. Semantic similarity measures in the biomedical domain by leveraging a web search engine.

    PubMed

    Hsieh, Sheau-Ling; Chang, Wen-Yung; Chen, Chi-Huang; Weng, Yung-Ching

    2013-07-01

    Various researches in web related semantic similarity measures have been deployed. However, measuring semantic similarity between two terms remains a challenging task. The traditional ontology-based methodologies have a limitation that both concepts must be resided in the same ontology tree(s). Unfortunately, in practice, the assumption is not always applicable. On the other hand, if the corpus is sufficiently adequate, the corpus-based methodologies can overcome the limitation. Now, the web is a continuous and enormous growth corpus. Therefore, a method of estimating semantic similarity is proposed via exploiting the page counts of two biomedical concepts returned by Google AJAX web search engine. The features are extracted as the co-occurrence patterns of two given terms P and Q, by querying P, Q, as well as P AND Q, and the web search hit counts of the defined lexico-syntactic patterns. These similarity scores of different patterns are evaluated, by adapting support vector machines for classification, to leverage the robustness of semantic similarity measures. Experimental results validating against two datasets: dataset 1 provided by A. Hliaoutakis; dataset 2 provided by T. Pedersen, are presented and discussed. In dataset 1, the proposed approach achieves the best correlation coefficient (0.802) under SNOMED-CT. In dataset 2, the proposed method obtains the best correlation coefficient (SNOMED-CT: 0.705; MeSH: 0.723) with physician scores comparing with measures of other methods. However, the correlation coefficients (SNOMED-CT: 0.496; MeSH: 0.539) with coder scores received opposite outcomes. In conclusion, the semantic similarity findings of the proposed method are close to those of physicians' ratings. Furthermore, the study provides a cornerstone investigation for extracting fully relevant information from digitizing, free-text medical records in the National Taiwan University Hospital database.

  9. A hierarchical knowledge-based approach for retrieving similar medical images described with semantic annotations

    PubMed Central

    Kurtz, Camille; Beaulieu, Christopher F.; Napel, Sandy; Rubin, Daniel L.

    2014-01-01

    Computer-assisted image retrieval applications could assist radiologist interpretations by identifying similar images in large archives as a means to providing decision support. However, the semantic gap between low-level image features and their high level semantics may impair the system performances. Indeed, it can be challenging to comprehensively characterize the images using low-level imaging features to fully capture the visual appearance of diseases on images, and recently the use of semantic terms has been advocated to provide semantic descriptions of the visual contents of images. However, most of the existing image retrieval strategies do not consider the intrinsic properties of these terms during the comparison of the images beyond treating them as simple binary (presence/absence) features. We propose a new framework that includes semantic features in images and that enables retrieval of similar images in large databases based on their semantic relations. It is based on two main steps: (1) annotation of the images with semantic terms extracted from an ontology, and (2) evaluation of the similarity of image pairs by computing the similarity between the terms using the Hierarchical Semantic-Based Distance (HSBD) coupled to an ontological measure. The combination of these two steps provides a means of capturing the semantic correlations among the terms used to characterize the images that can be considered as a potential solution to deal with the semantic gap problem. We validate this approach in the context of the retrieval and the classification of 2D regions of interest (ROIs) extracted from computed tomographic (CT) images of the liver. Under this framework, retrieval accuracy of more than 0.96 was obtained on a 30-images dataset using the Normalized Discounted Cumulative Gain (NDCG) index that is a standard technique used to measure the effectiveness of information retrieval algorithms when a separate reference standard is available. Classification results of more than 95% were obtained on a 77-images dataset. For comparison purpose, the use of the Earth Mover's Distance (EMD), which is an alternative distance metric that considers all the existing relations among the terms, led to results retrieval accuracy of 0.95 and classification results of 93% with a higher computational cost. The results provided by the presented framework are competitive with the state-of-the-art and emphasize the usefulness of the proposed methodology for radiology image retrieval and classification. PMID:24632078

  10. Augmented Robotics Dialog System for Enhancing Human–Robot Interaction

    PubMed Central

    Alonso-Martín, Fernando; Castro-González, Aívaro; de Gorostiza Luengo, Francisco Javier Fernandez; Salichs, Miguel Ángel

    2015-01-01

    Augmented reality, augmented television and second screen are cutting edge technologies that provide end users extra and enhanced information related to certain events in real time. This enriched information helps users better understand such events, at the same time providing a more satisfactory experience. In the present paper, we apply this main idea to human–robot interaction (HRI), to how users and robots interchange information. The ultimate goal of this paper is to improve the quality of HRI, developing a new dialog manager system that incorporates enriched information from the semantic web. This work presents the augmented robotic dialog system (ARDS), which uses natural language understanding mechanisms to provide two features: (i) a non-grammar multimodal input (verbal and/or written) text; and (ii) a contextualization of the information conveyed in the interaction. This contextualization is achieved by information enrichment techniques that link the extracted information from the dialog with extra information about the world available in semantic knowledge bases. This enriched or contextualized information (information enrichment, semantic enhancement or contextualized information are used interchangeably in the rest of this paper) offers many possibilities in terms of HRI. For instance, it can enhance the robot's pro-activeness during a human–robot dialog (the enriched information can be used to propose new topics during the dialog, while ensuring a coherent interaction). Another possibility is to display additional multimedia content related to the enriched information on a visual device. This paper describes the ARDS and shows a proof of concept of its applications. PMID:26151202

  11. Augmented Robotics Dialog System for Enhancing Human-Robot Interaction.

    PubMed

    Alonso-Martín, Fernando; Castro-González, Aĺvaro; Luengo, Francisco Javier Fernandez de Gorostiza; Salichs, Miguel Ángel

    2015-07-03

    Augmented reality, augmented television and second screen are cutting edge technologies that provide end users extra and enhanced information related to certain events in real time. This enriched information helps users better understand such events, at the same time providing a more satisfactory experience. In the present paper, we apply this main idea to human-robot interaction (HRI), to how users and robots interchange information. The ultimate goal of this paper is to improve the quality of HRI, developing a new dialog manager system that incorporates enriched information from the semantic web. This work presents the augmented robotic dialog system (ARDS), which uses natural language understanding mechanisms to provide two features: (i) a non-grammar multimodal input (verbal and/or written) text; and (ii) a contextualization of the information conveyed in the interaction. This contextualization is achieved by information enrichment techniques that link the extracted information from the dialog with extra information about the world available in semantic knowledge bases. This enriched or contextualized information (information enrichment, semantic enhancement or contextualized information are used interchangeably in the rest of this paper) offers many possibilities in terms of HRI. For instance, it can enhance the robot's pro-activeness during a human-robot dialog (the enriched information can be used to propose new topics during the dialog, while ensuring a coherent interaction). Another possibility is to display additional multimedia content related to the enriched information on a visual device. This paper describes the ARDS and shows a proof of concept of its applications.

  12. Discovering biomedical semantic relations in PubMed queries for information retrieval and database curation.

    PubMed

    Huang, Chung-Chi; Lu, Zhiyong

    2016-01-01

    Identifying relevant papers from the literature is a common task in biocuration. Most current biomedical literature search systems primarily rely on matching user keywords. Semantic search, on the other hand, seeks to improve search accuracy by understanding the entities and contextual relations in user keywords. However, past research has mostly focused on semantically identifying biological entities (e.g. chemicals, diseases and genes) with little effort on discovering semantic relations. In this work, we aim to discover biomedical semantic relations in PubMed queries in an automated and unsupervised fashion. Specifically, we focus on extracting and understanding the contextual information (or context patterns) that is used by PubMed users to represent semantic relations between entities such as 'CHEMICAL-1 compared to CHEMICAL-2' With the advances in automatic named entity recognition, we first tag entities in PubMed queries and then use tagged entities as knowledge to recognize pattern semantics. More specifically, we transform PubMed queries into context patterns involving participating entities, which are subsequently projected to latent topics via latent semantic analysis (LSA) to avoid the data sparseness and specificity issues. Finally, we mine semantically similar contextual patterns or semantic relations based on LSA topic distributions. Our two separate evaluation experiments of chemical-chemical (CC) and chemical-disease (CD) relations show that the proposed approach significantly outperforms a baseline method, which simply measures pattern semantics by similarity in participating entities. The highest performance achieved by our approach is nearly 0.9 and 0.85 respectively for the CC and CD task when compared against the ground truth in terms of normalized discounted cumulative gain (nDCG), a standard measure of ranking quality. These results suggest that our approach can effectively identify and return related semantic patterns in a ranked order covering diverse bio-entity relations. To assess the potential utility of our automated top-ranked patterns of a given relation in semantic search, we performed a pilot study on frequently sought semantic relations in PubMed and observed improved literature retrieval effectiveness based on post-hoc human relevance evaluation. Further investigation in larger tests and in real-world scenarios is warranted. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.

  13. Distant supervision for neural relation extraction integrated with word attention and property features.

    PubMed

    Qu, Jianfeng; Ouyang, Dantong; Hua, Wen; Ye, Yuxin; Li, Ximing

    2018-04-01

    Distant supervision for neural relation extraction is an efficient approach to extracting massive relations with reference to plain texts. However, the existing neural methods fail to capture the critical words in sentence encoding and meanwhile lack useful sentence information for some positive training instances. To address the above issues, we propose a novel neural relation extraction model. First, we develop a word-level attention mechanism to distinguish the importance of each individual word in a sentence, increasing the attention weights for those critical words. Second, we investigate the semantic information from word embeddings of target entities, which can be developed as a supplementary feature for the extractor. Experimental results show that our model outperforms previous state-of-the-art baselines. Copyright © 2018 Elsevier Ltd. All rights reserved.

  14. From Data to Semantic Information

    NASA Astrophysics Data System (ADS)

    Floridi, Luciano

    2003-06-01

    There is no consensus yet on the definition of semantic information. This paper contributes to the current debate by criticising and revising the Standard Definition of semantic Information (SDI) as meaningful data, in favour of the Dretske-Grice approach: meaningful and well-formed data constitute semantic information only if they also qualify as contingently truthful. After a brief introduction, SDI is criticised for providing necessary but insufficient conditions for the definition of semantic information. SDI is incorrect because truth-values do not supervene on semantic information, and misinformation (that is, false semantic information) is not a type of semantic information, but pseudo-information, that is not semantic information at all. This is shown by arguing that none of the reasons for interpreting misinformation as a type of semantic information is convincing, whilst there are compelling reasons to treat it as pseudo-information. As a consequence, SDI is revised to include a necessary truth-condition. The last section summarises the main results of the paper and indicates the important implications of the revised definition for the analysis of the deflationary theories of truth, the standard definition of knowledge and the classic, quantitative theory of semantic information.

  15. CRL/NMSU and Brandeis: Description of the MucBruce System as Used for MUC-4

    DTIC Science & Technology

    1992-01-01

    developing a method fo r identifying articles of interest and extracting and storing specific kinds of information from large volumes o f Japanese and...performance . Most of the information produced in our MUC template s is arrived at by probing the text which surrounds `significant’ words for the...strings with semantic information . The other two, the Relevant Template Filter and the Relevant Paragraph Filter, perform word frequency analysis to

  16. Construction of an annotated corpus to support biomedical information extraction

    PubMed Central

    Thompson, Paul; Iqbal, Syed A; McNaught, John; Ananiadou, Sophia

    2009-01-01

    Background Information Extraction (IE) is a component of text mining that facilitates knowledge discovery by automatically locating instances of interesting biomedical events from huge document collections. As events are usually centred on verbs and nominalised verbs, understanding the syntactic and semantic behaviour of these words is highly important. Corpora annotated with information concerning this behaviour can constitute a valuable resource in the training of IE components and resources. Results We have defined a new scheme for annotating sentence-bound gene regulation events, centred on both verbs and nominalised verbs. For each event instance, all participants (arguments) in the same sentence are identified and assigned a semantic role from a rich set of 13 roles tailored to biomedical research articles, together with a biological concept type linked to the Gene Regulation Ontology. To our knowledge, our scheme is unique within the biomedical field in terms of the range of event arguments identified. Using the scheme, we have created the Gene Regulation Event Corpus (GREC), consisting of 240 MEDLINE abstracts, in which events relating to gene regulation and expression have been annotated by biologists. A novel method of evaluating various different facets of the annotation task showed that average inter-annotator agreement rates fall within the range of 66% - 90%. Conclusion The GREC is a unique resource within the biomedical field, in that it annotates not only core relationships between entities, but also a range of other important details about these relationships, e.g., location, temporal, manner and environmental conditions. As such, it is specifically designed to support bio-specific tool and resource development. It has already been used to acquire semantic frames for inclusion within the BioLexicon (a lexical, terminological resource to aid biomedical text mining). Initial experiments have also shown that the corpus may viably be used to train IE components, such as semantic role labellers. The corpus and annotation guidelines are freely available for academic purposes. PMID:19852798

  17. Extracting Semantic Building Models from Aerial Stereo Images and Conversion to Citygml

    NASA Astrophysics Data System (ADS)

    Sengul, A.

    2012-07-01

    The collection of geographic data is of primary importance for the creation and maintenance of a GIS. Traditionally the acquisition of 3D information has been the task of photogrammetry using aerial stereo images. Digital photogrammetric systems employ sophisticated software to extract digital terrain models or to plot 3D objects. The demand for 3D city models leads to new applications and new standards. City Geography Mark-up Language (CityGML), a concept for modelling and exchange of 3D city and landscape models, defines the classes and relations for the most relevant topographic objects in cities and regional models with respect to their geometrical, topological, semantically and topological properties. It now is increasingly accepted, since it fulfils the prerequisites required e.g. for risk analysis, urban planning, and simulations. There is a need to include existing 3D information derived from photogrammetric processes in CityGML databases. In order to filling the gap, this paper reports on a framework transferring data plotted by Erdas LPS and Stereo Analyst for ArcGIS software to CityGML using Safe Software's Feature Manupulate Engine (FME)

  18. Classifying patents based on their semantic content.

    PubMed

    Bergeaud, Antonin; Potiron, Yoann; Raimbault, Juste

    2017-01-01

    In this paper, we extend some usual techniques of classification resulting from a large-scale data-mining and network approach. This new technology, which in particular is designed to be suitable to big data, is used to construct an open consolidated database from raw data on 4 million patents taken from the US patent office from 1976 onward. To build the pattern network, not only do we look at each patent title, but we also examine their full abstract and extract the relevant keywords accordingly. We refer to this classification as semantic approach in contrast with the more common technological approach which consists in taking the topology when considering US Patent office technological classes. Moreover, we document that both approaches have highly different topological measures and strong statistical evidence that they feature a different model. This suggests that our method is a useful tool to extract endogenous information.

  19. Classifying patents based on their semantic content

    PubMed Central

    2017-01-01

    In this paper, we extend some usual techniques of classification resulting from a large-scale data-mining and network approach. This new technology, which in particular is designed to be suitable to big data, is used to construct an open consolidated database from raw data on 4 million patents taken from the US patent office from 1976 onward. To build the pattern network, not only do we look at each patent title, but we also examine their full abstract and extract the relevant keywords accordingly. We refer to this classification as semantic approach in contrast with the more common technological approach which consists in taking the topology when considering US Patent office technological classes. Moreover, we document that both approaches have highly different topological measures and strong statistical evidence that they feature a different model. This suggests that our method is a useful tool to extract endogenous information. PMID:28445550

  20. A study of actions in operative notes.

    PubMed

    Wang, Yan; Pakhomov, Serguei; Burkart, Nora E; Ryan, James O; Melton, Genevieve B

    2012-01-01

    Operative notes contain rich information about techniques, instruments, and materials used in procedures. To assist development of effective information extraction (IE) techniques for operative notes, we investigated the sublanguage used to describe actions within the operative report 'procedure description' section. Deep parsing results of 362,310 operative notes with an expanded Stanford parser using the SPECIALIST Lexicon resulted in 200 verbs (92% coverage) including 147 action verbs. Nominal action predicates for each action verb were gathered from WordNet, SPECIALIST Lexicon, New Oxford American Dictionary and Stedman's Medical Dictionary. Coverage gaps were seen in existing lexical, domain, and semantic resources (Unified Medical Language System (UMLS) Metathesaurus, SPECIALIST Lexicon, WordNet and FrameNet). Our findings demonstrate the need to construct surgical domain-specific semantic resources for IE from operative notes.

  1. Semantic preview benefit during reading.

    PubMed

    Hohenstein, Sven; Kliegl, Reinhold

    2014-01-01

    Word features in parafoveal vision influence eye movements during reading. The question of whether readers extract semantic information from parafoveal words was studied in 3 experiments by using a gaze-contingent display change technique. Subjects read German sentences containing 1 of several preview words that were replaced by a target word during the saccade to the preview (boundary paradigm). In the 1st experiment the preview word was semantically related or unrelated to the target. Fixation durations on the target were shorter for semantically related than unrelated previews, consistent with a semantic preview benefit. In the 2nd experiment, half the sentences were presented following the rules of German spelling (i.e., previews and targets were printed with an initial capital letter), and the other half were presented completely in lowercase. A semantic preview benefit was obtained under both conditions. In the 3rd experiment, we introduced 2 further preview conditions, an identical word and a pronounceable nonword, while also manipulating the text contrast. Whereas the contrast had negligible effects, fixation durations on the target were reliably different for all 4 types of preview. Semantic preview benefits were greater for pretarget fixations closer to the boundary (large preview space) and, although not as consistently, for long pretarget fixation durations (long preview time). The results constrain theoretical proposals about eye movement control in reading. (PsycINFO Database Record (c) 2013 APA, all rights reserved).

  2. Convergence of Health Level Seven Version 2 Messages to Semantic Web Technologies for Software-Intensive Systems in Telemedicine Trauma Care.

    PubMed

    Menezes, Pedro Monteiro; Cook, Timothy Wayne; Cavalini, Luciana Tricai

    2016-01-01

    To present the technical background and the development of a procedure that enriches the semantics of Health Level Seven version 2 (HL7v2) messages for software-intensive systems in telemedicine trauma care. This study followed a multilevel model-driven approach for the development of semantically interoperable health information systems. The Pre-Hospital Trauma Life Support (PHTLS) ABCDE protocol was adopted as the use case. A prototype application embedded the semantics into an HL7v2 message as an eXtensible Markup Language (XML) file, which was validated against an XML schema that defines constraints on a common reference model. This message was exchanged with a second prototype application, developed on the Mirth middleware, which was also used to parse and validate both the original and the hybrid messages. Both versions of the data instance (one pure XML, one embedded in the HL7v2 message) were equally validated and the RDF-based semantics recovered by the receiving side of the prototype from the shared XML schema. This study demonstrated the semantic enrichment of HL7v2 messages for intensive-software telemedicine systems for trauma care, by validating components of extracts generated in various computing environments. The adoption of the method proposed in this study ensures the compliance of the HL7v2 standard in Semantic Web technologies.

  3. Ontology design patterns to disambiguate relations between genes and gene products in GENIA

    PubMed Central

    2011-01-01

    Motivation Annotated reference corpora play an important role in biomedical information extraction. A semantic annotation of the natural language texts in these reference corpora using formal ontologies is challenging due to the inherent ambiguity of natural language. The provision of formal definitions and axioms for semantic annotations offers the means for ensuring consistency as well as enables the development of verifiable annotation guidelines. Consistent semantic annotations facilitate the automatic discovery of new information through deductive inferences. Results We provide a formal characterization of the relations used in the recent GENIA corpus annotations. For this purpose, we both select existing axiom systems based on the desired properties of the relations within the domain and develop new axioms for several relations. To apply this ontology of relations to the semantic annotation of text corpora, we implement two ontology design patterns. In addition, we provide a software application to convert annotated GENIA abstracts into OWL ontologies by combining both the ontology of relations and the design patterns. As a result, the GENIA abstracts become available as OWL ontologies and are amenable for automated verification, deductive inferences and other knowledge-based applications. Availability Documentation, implementation and examples are available from http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/. PMID:22166341

  4. Joint classification and contour extraction of large 3D point clouds

    NASA Astrophysics Data System (ADS)

    Hackel, Timo; Wegner, Jan D.; Schindler, Konrad

    2017-08-01

    We present an effective and efficient method for point-wise semantic classification and extraction of object contours of large-scale 3D point clouds. What makes point cloud interpretation challenging is the sheer size of several millions of points per scan and the non-grid, sparse, and uneven distribution of points. Standard image processing tools like texture filters, for example, cannot handle such data efficiently, which calls for dedicated point cloud labeling methods. It turns out that one of the major drivers for efficient computation and handling of strong variations in point density, is a careful formulation of per-point neighborhoods at multiple scales. This allows, both, to define an expressive feature set and to extract topologically meaningful object contours. Semantic classification and contour extraction are interlaced problems. Point-wise semantic classification enables extracting a meaningful candidate set of contour points while contours help generating a rich feature representation that benefits point-wise classification. These methods are tailored to have fast run time and small memory footprint for processing large-scale, unstructured, and inhomogeneous point clouds, while still achieving high classification accuracy. We evaluate our methods on the semantic3d.net benchmark for terrestrial laser scans with >109 points.

  5. Leveraging Pattern Semantics for Extracting Entities in Enterprises

    PubMed Central

    Tao, Fangbo; Zhao, Bo; Fuxman, Ariel; Li, Yang; Han, Jiawei

    2015-01-01

    Entity Extraction is a process of identifying meaningful entities from text documents. In enterprises, extracting entities improves enterprise efficiency by facilitating numerous applications, including search, recommendation, etc. However, the problem is particularly challenging on enterprise domains due to several reasons. First, the lack of redundancy of enterprise entities makes previous web-based systems like NELL and OpenIE not effective, since using only high-precision/low-recall patterns like those systems would miss the majority of sparse enterprise entities, while using more low-precision patterns in sparse setting also introduces noise drastically. Second, semantic drift is common in enterprises (“Blue” refers to “Windows Blue”), such that public signals from the web cannot be directly applied on entities. Moreover, many internal entities never appear on the web. Sparse internal signals are the only source for discovering them. To address these challenges, we propose an end-to-end framework for extracting entities in enterprises, taking the input of enterprise corpus and limited seeds to generate a high-quality entity collection as output. We introduce the novel concept of Semantic Pattern Graph to leverage public signals to understand the underlying semantics of lexical patterns, reinforce pattern evaluation using mined semantics, and yield more accurate and complete entities. Experiments on Microsoft enterprise data show the effectiveness of our approach. PMID:26705540

  6. Leveraging Pattern Semantics for Extracting Entities in Enterprises.

    PubMed

    Tao, Fangbo; Zhao, Bo; Fuxman, Ariel; Li, Yang; Han, Jiawei

    2015-05-01

    Entity Extraction is a process of identifying meaningful entities from text documents. In enterprises, extracting entities improves enterprise efficiency by facilitating numerous applications, including search, recommendation, etc. However, the problem is particularly challenging on enterprise domains due to several reasons. First, the lack of redundancy of enterprise entities makes previous web-based systems like NELL and OpenIE not effective, since using only high-precision/low-recall patterns like those systems would miss the majority of sparse enterprise entities, while using more low-precision patterns in sparse setting also introduces noise drastically. Second, semantic drift is common in enterprises ("Blue" refers to "Windows Blue"), such that public signals from the web cannot be directly applied on entities. Moreover, many internal entities never appear on the web. Sparse internal signals are the only source for discovering them. To address these challenges, we propose an end-to-end framework for extracting entities in enterprises, taking the input of enterprise corpus and limited seeds to generate a high-quality entity collection as output. We introduce the novel concept of Semantic Pattern Graph to leverage public signals to understand the underlying semantics of lexical patterns, reinforce pattern evaluation using mined semantics, and yield more accurate and complete entities. Experiments on Microsoft enterprise data show the effectiveness of our approach.

  7. 5W1H Information Extraction with CNN-Bidirectional LSTM

    NASA Astrophysics Data System (ADS)

    Nurdin, A.; Maulidevi, N. U.

    2018-03-01

    In this work, information about who, did what, when, where, why, and how on Indonesian news articles were extracted by combining Convolutional Neural Network and Bidirectional Long Short-Term Memory. Convolutional Neural Network can learn semantically meaningful representations of sentences. Bidirectional LSTM can analyze the relations among words in the sequence. We also use word embedding word2vec for word representation. By combining these algorithms, we obtained F-measure 0.808. Our experiments show that CNN-BLSTM outperforms other shallow methods, namely IBk, C4.5, and Naïve Bayes with the F-measure 0.655, 0.645, and 0.595, respectively.

  8. Extracting business vocabularies from business process models: SBVR and BPMN standards-based approach

    NASA Astrophysics Data System (ADS)

    Skersys, Tomas; Butleris, Rimantas; Kapocius, Kestutis

    2013-10-01

    Approaches for the analysis and specification of business vocabularies and rules are very relevant topics in both Business Process Management and Information Systems Development disciplines. However, in common practice of Information Systems Development, the Business modeling activities still are of mostly empiric nature. In this paper, basic aspects of the approach for business vocabularies' semi-automated extraction from business process models are presented. The approach is based on novel business modeling-level OMG standards "Business Process Model and Notation" (BPMN) and "Semantics for Business Vocabularies and Business Rules" (SBVR), thus contributing to OMG's vision about Model-Driven Architecture (MDA) and to model-driven development in general.

  9. On Propagating Interpersonal Trust in Social Networks

    NASA Astrophysics Data System (ADS)

    Ziegler, Cai-Nicolas

    The age of information glut has fostered the proliferation of data and documents on the Web, created by man and machine alike. Hence, there is an enormous wealth of minable knowledge that is yet to be extracted, in particular, on the Semantic Web. However, besides understanding information stated by subjects, knowing about their credibility becomes equally crucial. Hence, trust and trust metrics, conceived as computational means to evaluate trust relationships between individuals, come into play. Our major contribution to Semantic Web trust management through this work is twofold. First, we introduce a classification scheme for trust metrics along various axes and discuss advantages and drawbacks of existing approaches for Semantic Web scenarios. Hereby, we devise an advocacy for local group trust metrics, guiding us to the second part, which presents Appleseed, our novel proposal for local group trust computation. Compelling in its simplicity, Appleseed borrows many ideas from spreading activation models in psychology and relates their concepts to trust evaluation in an intuitive fashion. Moreover, we provide extensions for the Appleseed nucleus that make our trust metric handle distrust statements.

  10. Building a knowledge base of severe adverse drug events based on AERS reporting data using semantic web technologies.

    PubMed

    Jiang, Guoqian; Wang, Liwei; Liu, Hongfang; Solbrig, Harold R; Chute, Christopher G

    2013-01-01

    A semantically coded knowledge base of adverse drug events (ADEs) with severity information is critical for clinical decision support systems and translational research applications. However it remains challenging to measure and identify the severity information of ADEs. The objective of the study is to develop and evaluate a semantic web based approach for building a knowledge base of severe ADEs based on the FDA Adverse Event Reporting System (AERS) reporting data. We utilized a normalized AERS reporting dataset and extracted putative drug-ADE pairs and their associated outcome codes in the domain of cardiac disorders. We validated the drug-ADE associations using ADE datasets from SIDe Effect Resource (SIDER) and the UMLS. We leveraged the Common Terminology Criteria for Adverse Event (CTCAE) grading system and classified the ADEs into the CTCAE in the Web Ontology Language (OWL). We identified and validated 2,444 unique Drug-ADE pairs in the domain of cardiac disorders, of which 760 pairs are in Grade 5, 775 pairs in Grade 4 and 2,196 pairs in Grade 3.

  11. Meaning in the avian auditory cortex: Neural representation of communication calls

    PubMed Central

    Elie, Julie E; Theunissen, Frédéric E

    2014-01-01

    Understanding how the brain extracts the behavioral meaning carried by specific vocalization types that can be emitted by various vocalizers and in different conditions is a central question in auditory research. This semantic categorization is a fundamental process required for acoustic communication and presupposes discriminative and invariance properties of the auditory system for conspecific vocalizations. Songbirds have been used extensively to study vocal learning, but the communicative function of all their vocalizations and their neural representation has yet to be examined. In our research, we first generated a library containing almost the entire zebra finch vocal repertoire and organized communication calls along 9 different categories based on their behavioral meaning. We then investigated the neural representations of these semantic categories in the primary and secondary auditory areas of 6 anesthetized zebra finches. To analyze how single units encode these call categories, we described neural responses in terms of their discrimination, selectivity and invariance properties. Quantitative measures for these neural properties were obtained using an optimal decoder based both on spike counts and spike patterns. Information theoretic metrics show that almost half of the single units encode semantic information. Neurons achieve higher discrimination of these semantic categories by being more selective and more invariant. These results demonstrate that computations necessary for semantic categorization of meaningful vocalizations are already present in the auditory cortex and emphasize the value of a neuro-ethological approach to understand vocal communication. PMID:25728175

  12. A health analytics semantic ETL service for obesity surveillance.

    PubMed

    Poulymenopoulou, M; Papakonstantinou, D; Malamateniou, F; Vassilacopoulos, G

    2015-01-01

    The increasingly large amount of data produced in healthcare (e.g. collected through health information systems such as electronic medical records - EMRs or collected through novel data sources such as personal health records - PHRs, social media, web resources) enable the creation of detailed records about people's health, sentiments and activities (e.g. physical activity, diet, sleep quality) that can be used in the public health area among others. However, despite the transformative potential of big data in public health surveillance there are several challenges in integrating big data. In this paper, the interoperability challenge is tackled and a semantic Extract Transform Load (ETL) service is proposed that seeks to semantically annotate big data to result into valuable data for analysis. This service is considered as part of a health analytics engine on the cloud that interacts with existing healthcare information exchange networks, like the Integrating the Healthcare Enterprise (IHE), PHRs, sensors, mobile applications, and other web resources to retrieve patient health, behavioral and daily activity data. The semantic ETL service aims at semantically integrating big data for use by analytic mechanisms. An illustrative implementation of the service on big data which is potentially relevant to human obesity, enables using appropriate analytic techniques (e.g. machine learning, text mining) that are expected to assist in identifying patterns and contributing factors (e.g. genetic background, social, environmental) for this social phenomenon and, hence, drive health policy changes and promote healthy behaviors where residents live, work, learn, shop and play.

  13. Establishing semantic interoperability of biomedical metadata registries using extended semantic relationships.

    PubMed

    Park, Yu Rang; Yoon, Young Jo; Kim, Hye Hyeon; Kim, Ju Han

    2013-01-01

    Achieving semantic interoperability is critical for biomedical data sharing between individuals, organizations and systems. The ISO/IEC 11179 MetaData Registry (MDR) standard has been recognized as one of the solutions for this purpose. The standard model, however, is limited. Representing concepts consist of two or more values, for instance, are not allowed including blood pressure with systolic and diastolic values. We addressed the structural limitations of ISO/IEC 11179 by an integrated metadata object model in our previous research. In the present study, we introduce semantic extensions for the model by defining three new types of semantic relationships; dependency, composite and variable relationships. To evaluate our extensions in a real world setting, we measured the efficiency of metadata reduction by means of mapping to existing others. We extracted metadata from the College of American Pathologist Cancer Protocols and then evaluated our extensions. With no semantic loss, one third of the extracted metadata could be successfully eliminated, suggesting better strategy for implementing clinical MDRs with improved efficiency and utility.

  14. Activation of semantic information at the sublexical level during handwriting production: Evidence from inhibition effects of Chinese semantic radicals in the picture-word interference paradigm.

    PubMed

    Chen, Xuqian; Liao, Yuanlan; Chen, Xianzhe

    2017-08-01

    Using a non-alphabetic language (e.g., Chinese), the present study tested a novel view that semantic information at the sublexical level should be activated during handwriting production. Over 80% of Chinese characters are phonograms, in which semantic radicals represent category information (e.g., 'chair,' 'peach,' 'orange' are related to plants) while phonetic radicals represent phonetic information (e.g., 'wolf,' 'brightness,' 'male,' are all pronounced /lang/). Under different semantic category conditions at the lexical level (semantically related in Experiment 1; semantically unrelated in Experiment 2), the orthographic relatedness and semantic relatedness of semantic radicals in the picture name and its distractor were manipulated under different SOAs (i.e., stimulus onset asynchrony, the interval between the onset of the picture and the onset of the interference word). Two questions were addressed: (1) Is it possible that semantic information could be activated in the sublexical level conditions? (2) How are semantic and orthographic information dynamically accessed in word production? Results showed that both orthographic and semantic information were activated under the present picture-word interference paradigm, dynamically under different SOAs, which supported our view that discussions on semantic processes in the writing modality should be extended to the sublexical level. The current findings provide possibility for building new orthography-phonology-semantics models in writing. © 2017 Scandinavian Psychological Associations and John Wiley & Sons Ltd.

  15. Meeting medical terminology needs--the Ontology-Enhanced Medical Concept Mapper.

    PubMed

    Leroy, G; Chen, H

    2001-12-01

    This paper describes the development and testing of the Medical Concept Mapper, a tool designed to facilitate access to online medical information sources by providing users with appropriate medical search terms for their personal queries. Our system is valuable for patients whose knowledge of medical vocabularies is inadequate to find the desired information, and for medical experts who search for information outside their field of expertise. The Medical Concept Mapper maps synonyms and semantically related concepts to a user's query. The system is unique because it integrates our natural language processing tool, i.e., the Arizona (AZ) Noun Phraser, with human-created ontologies, the Unified Medical Language System (UMLS) and WordNet, and our computer generated Concept Space, into one system. Our unique contribution results from combining the UMLS Semantic Net with Concept Space in our deep semantic parsing (DSP) algorithm. This algorithm establishes a medical query context based on the UMLS Semantic Net, which allows Concept Space terms to be filtered so as to isolate related terms relevant to the query. We performed two user studies in which Medical Concept Mapper terms were compared against human experts' terms. We conclude that the AZ Noun Phraser is well suited to extract medical phrases from user queries, that WordNet is not well suited to provide strictly medical synonyms, that the UMLS Metathesaurus is well suited to provide medical synonyms, and that Concept Space is well suited to provide related medical terms, especially when these terms are limited by our DSP algorithm.

  16. Enriching semantic knowledge bases for opinion mining in big data applications.

    PubMed

    Weichselbraun, A; Gindl, S; Scharl, A

    2014-10-01

    This paper presents a novel method for contextualizing and enriching large semantic knowledge bases for opinion mining with a focus on Web intelligence platforms and other high-throughput big data applications. The method is not only applicable to traditional sentiment lexicons, but also to more comprehensive, multi-dimensional affective resources such as SenticNet. It comprises the following steps: (i) identify ambiguous sentiment terms, (ii) provide context information extracted from a domain-specific training corpus, and (iii) ground this contextual information to structured background knowledge sources such as ConceptNet and WordNet. A quantitative evaluation shows a significant improvement when using an enriched version of SenticNet for polarity classification. Crowdsourced gold standard data in conjunction with a qualitative evaluation sheds light on the strengths and weaknesses of the concept grounding, and on the quality of the enrichment process.

  17. ECO: A Framework for Entity Co-Occurrence Exploration with Faceted Navigation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Halliday, K. D.

    2010-08-20

    Even as highly structured databases and semantic knowledge bases become more prevalent, a substantial amount of human knowledge is reported as written prose. Typical textual reports, such as news articles, contain information about entities (people, organizations, and locations) and their relationships. Automatically extracting such relationships from large text corpora is a key component of corporate and government knowledge bases. The primary goal of the ECO project is to develop a scalable framework for extracting and presenting these relationships for exploration using an easily navigable faceted user interface. ECO uses entity co-occurrence relationships to identify related entities. The system aggregates andmore » indexes information on each entity pair, allowing the user to rapidly discover and mine relational information.« less

  18. Principal semantic components of language and the measurement of meaning.

    PubMed

    Samsonovich, Alexei V; Samsonovic, Alexei V; Ascoli, Giorgio A

    2010-06-11

    Metric systems for semantics, or semantic cognitive maps, are allocations of words or other representations in a metric space based on their meaning. Existing methods for semantic mapping, such as Latent Semantic Analysis and Latent Dirichlet Allocation, are based on paradigms involving dissimilarity metrics. They typically do not take into account relations of antonymy and yield a large number of domain-specific semantic dimensions. Here, using a novel self-organization approach, we construct a low-dimensional, context-independent semantic map of natural language that represents simultaneously synonymy and antonymy. Emergent semantics of the map principal components are clearly identifiable: the first three correspond to the meanings of "good/bad" (valence), "calm/excited" (arousal), and "open/closed" (freedom), respectively. The semantic map is sufficiently robust to allow the automated extraction of synonyms and antonyms not originally in the dictionaries used to construct the map and to predict connotation from their coordinates. The map geometric characteristics include a limited number ( approximately 4) of statistically significant dimensions, a bimodal distribution of the first component, increasing kurtosis of subsequent (unimodal) components, and a U-shaped maximum-spread planar projection. Both the semantic content and the main geometric features of the map are consistent between dictionaries (Microsoft Word and Princeton's WordNet), among Western languages (English, French, German, and Spanish), and with previously established psychometric measures. By defining the semantics of its dimensions, the constructed map provides a foundational metric system for the quantitative analysis of word meaning. Language can be viewed as a cumulative product of human experiences. Therefore, the extracted principal semantic dimensions may be useful to characterize the general semantic dimensions of the content of mental states. This is a fundamental step toward a universal metric system for semantics of human experiences, which is necessary for developing a rigorous science of the mind.

  19. Processing changes when listening to foreign-accented speech

    PubMed Central

    Romero-Rivas, Carlos; Martin, Clara D.; Costa, Albert

    2015-01-01

    This study investigates the mechanisms responsible for fast changes in processing foreign-accented speech. Event Related brain Potentials (ERPs) were obtained while native speakers of Spanish listened to native and foreign-accented speakers of Spanish. We observed a less positive P200 component for foreign-accented speech relative to native speech comprehension. This suggests that the extraction of spectral information and other important acoustic features was hampered during foreign-accented speech comprehension. However, the amplitude of the N400 component for foreign-accented speech comprehension decreased across the experiment, suggesting the use of a higher level, lexical mechanism. Furthermore, during native speech comprehension, semantic violations in the critical words elicited an N400 effect followed by a late positivity. During foreign-accented speech comprehension, semantic violations only elicited an N400 effect. Overall, our results suggest that, despite a lack of improvement in phonetic discrimination, native listeners experience changes at lexical-semantic levels of processing after brief exposure to foreign-accented speech. Moreover, these results suggest that lexical access, semantic integration and linguistic re-analysis processes are permeable to external factors, such as the accent of the speaker. PMID:25859209

  20. Convergence of Health Level Seven Version 2 Messages to Semantic Web Technologies for Software-Intensive Systems in Telemedicine Trauma Care

    PubMed Central

    Cook, Timothy Wayne; Cavalini, Luciana Tricai

    2016-01-01

    Objectives To present the technical background and the development of a procedure that enriches the semantics of Health Level Seven version 2 (HL7v2) messages for software-intensive systems in telemedicine trauma care. Methods This study followed a multilevel model-driven approach for the development of semantically interoperable health information systems. The Pre-Hospital Trauma Life Support (PHTLS) ABCDE protocol was adopted as the use case. A prototype application embedded the semantics into an HL7v2 message as an eXtensible Markup Language (XML) file, which was validated against an XML schema that defines constraints on a common reference model. This message was exchanged with a second prototype application, developed on the Mirth middleware, which was also used to parse and validate both the original and the hybrid messages. Results Both versions of the data instance (one pure XML, one embedded in the HL7v2 message) were equally validated and the RDF-based semantics recovered by the receiving side of the prototype from the shared XML schema. Conclusions This study demonstrated the semantic enrichment of HL7v2 messages for intensive-software telemedicine systems for trauma care, by validating components of extracts generated in various computing environments. The adoption of the method proposed in this study ensures the compliance of the HL7v2 standard in Semantic Web technologies. PMID:26893947

  1. The Role of Simple Semantics in the Process of Artificial Grammar Learning.

    PubMed

    Öttl, Birgit; Jäger, Gerhard; Kaup, Barbara

    2017-10-01

    This study investigated the effect of semantic information on artificial grammar learning (AGL). Recursive grammars of different complexity levels (regular language, mirror language, copy language) were investigated in a series of AGL experiments. In the with-semantics condition, participants acquired semantic information prior to the AGL experiment; in the without-semantics control condition, participants did not receive semantic information. It was hypothesized that semantics would generally facilitate grammar acquisition and that the learning benefit in the with-semantics conditions would increase with increasing grammar complexity. Experiment 1 showed learning effects for all grammars but no performance difference between conditions. Experiment 2 replicated the absence of a semantic benefit for all grammars even though semantic information was more prominent during grammar acquisition as compared to Experiment 1. Thus, we did not find evidence for the idea that semantics facilitates grammar acquisition, which seems to support the view of an independent syntactic processing component.

  2. Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications.

    PubMed

    Chen, Hongyu; Martin, Bronwen; Daimon, Caitlin M; Maudsley, Stuart

    2013-01-01

    Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses, generate novel connections from existing data, and validate empirical data.

  3. The Use of a Context-Based Information Retrieval Technique

    DTIC Science & Technology

    2009-07-01

    provided in context. Latent Semantic Analysis (LSA) is a statistical technique for inferring contextual and structural information, and previous studies...WAIS). 10 DSTO-TR-2322 1.4.4 Latent Semantic Analysis LSA, which is also known as latent semantic indexing (LSI), uses a statistical and...1.4.6 Language Models In contrast, natural language models apply algorithms that combine statistical information with semantic information. Semantic

  4. Extracting similar terms from multiple EMR-based semantic embeddings to support chart reviews.

    PubMed

    Cheng Ye, M S; Fabbri, Daniel

    2018-05-21

    Word embeddings project semantically similar terms into nearby points in a vector space. When trained on clinical text, these embeddings can be leveraged to improve keyword search and text highlighting. In this paper, we present methods to refine the selection process of similar terms from multiple EMR-based word embeddings, and evaluate their performance quantitatively and qualitatively across multiple chart review tasks. Word embeddings were trained on each clinical note type in an EMR. These embeddings were then combined, weighted, and truncated to select a refined set of similar terms to be used in keyword search and text highlighting. To evaluate their quality, we measured the similar terms' information retrieval (IR) performance using precision-at-K (P@5, P@10). Additionally a user study evaluated users' search term preferences, while a timing study measured the time to answer a question from a clinical chart. The refined terms outperformed the baseline method's information retrieval performance (e.g., increasing the average P@5 from 0.48 to 0.60). Additionally, the refined terms were preferred by most users, and reduced the average time to answer a question. Clinical information can be more quickly retrieved and synthesized when using semantically similar term from multiple embeddings. Copyright © 2018. Published by Elsevier Inc.

  5. GOOSE: semantic search on internet connected sensors

    NASA Astrophysics Data System (ADS)

    Schutte, Klamer; Bomhof, Freek; Burghouts, Gertjan; van Diggelen, Jurriaan; Hiemstra, Peter; van't Hof, Jaap; Kraaij, Wessel; Pasman, Huib; Smith, Arthur; Versloot, Corne; de Wit, Joost

    2013-05-01

    More and more sensors are getting Internet connected. Examples are cameras on cell phones, CCTV cameras for traffic control as well as dedicated security and defense sensor systems. Due to the steadily increasing data volume, human exploitation of all this sensor data is impossible for effective mission execution. Smart access to all sensor data acts as enabler for questions such as "Is there a person behind this building" or "Alert me when a vehicle approaches". The GOOSE concept has the ambition to provide the capability to search semantically for any relevant information within "all" (including imaging) sensor streams in the entire Internet of sensors. This is similar to the capability provided by presently available Internet search engines which enable the retrieval of information on "all" web pages on the Internet. In line with current Internet search engines any indexing services shall be utilized cross-domain. The two main challenge for GOOSE is the Semantic Gap and Scalability. The GOOSE architecture consists of five elements: (1) an online extraction of primitives on each sensor stream; (2) an indexing and search mechanism for these primitives; (3) a ontology based semantic matching module; (4) a top-down hypothesis verification mechanism and (5) a controlling man-machine interface. This paper reports on the initial GOOSE demonstrator, which consists of the MES multimedia analysis platform and the CORTEX action recognition module. It also provides an outlook into future GOOSE development.

  6. Towards an intelligent framework for multimodal affective data analysis.

    PubMed

    Poria, Soujanya; Cambria, Erik; Hussain, Amir; Huang, Guang-Bin

    2015-03-01

    An increasingly large amount of multimodal content is posted on social media websites such as YouTube and Facebook everyday. In order to cope with the growth of such so much multimodal data, there is an urgent need to develop an intelligent multi-modal analysis framework that can effectively extract information from multiple modalities. In this paper, we propose a novel multimodal information extraction agent, which infers and aggregates the semantic and affective information associated with user-generated multimodal data in contexts such as e-learning, e-health, automatic video content tagging and human-computer interaction. In particular, the developed intelligent agent adopts an ensemble feature extraction approach by exploiting the joint use of tri-modal (text, audio and video) features to enhance the multimodal information extraction process. In preliminary experiments using the eNTERFACE dataset, our proposed multi-modal system is shown to achieve an accuracy of 87.95%, outperforming the best state-of-the-art system by more than 10%, or in relative terms, a 56% reduction in error rate. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Familiar Person Recognition: Is Autonoetic Consciousness More Likely to Accompany Face Recognition Than Voice Recognition?

    NASA Astrophysics Data System (ADS)

    Barsics, Catherine; Brédart, Serge

    2010-11-01

    Autonoetic consciousness is a fundamental property of human memory, enabling us to experience mental time travel, to recollect past events with a feeling of self-involvement, and to project ourselves in the future. Autonoetic consciousness is a characteristic of episodic memory. By contrast, awareness of the past associated with a mere feeling of familiarity or knowing relies on noetic consciousness, depending on semantic memory integrity. Present research was aimed at evaluating whether conscious recollection of episodic memories is more likely to occur following the recognition of a familiar face than following the recognition of a familiar voice. Recall of semantic information (biographical information) was also assessed. Previous studies that investigated the recall of biographical information following person recognition used faces and voices of famous people as stimuli. In this study, the participants were presented with personally familiar people's voices and faces, thus avoiding the presence of identity cues in the spoken extracts and allowing a stricter control of frequency exposure with both types of stimuli (voices and faces). In the present study, the rate of retrieved episodic memories, associated with autonoetic awareness, was significantly higher from familiar faces than familiar voices even though the level of overall recognition was similar for both these stimuli domains. The same pattern was observed regarding semantic information retrieval. These results and their implications for current Interactive Activation and Competition person recognition models are discussed.

  8. Using Graph Components Derived from an Associative Concept Dictionary to Predict fMRI Neural Activation Patterns that Represent the Meaning of Nouns.

    PubMed

    Akama, Hiroyuki; Miyake, Maki; Jung, Jaeyoung; Murphy, Brian

    2015-01-01

    In this study, we introduce an original distance definition for graphs, called the Markov-inverse-F measure (MiF). This measure enables the integration of classical graph theory indices with new knowledge pertaining to structural feature extraction from semantic networks. MiF improves the conventional Jaccard and/or Simpson indices, and reconciles both the geodesic information (random walk) and co-occurrence adjustment (degree balance and distribution). We measure the effectiveness of graph-based coefficients through the application of linguistic graph information for a neural activity recorded during conceptual processing in the human brain. Specifically, the MiF distance is computed between each of the nouns used in a previous neural experiment and each of the in-between words in a subgraph derived from the Edinburgh Word Association Thesaurus of English. From the MiF-based information matrix, a machine learning model can accurately obtain a scalar parameter that specifies the degree to which each voxel in (the MRI image of) the brain is activated by each word or each principal component of the intermediate semantic features. Furthermore, correlating the voxel information with the MiF-based principal components, a new computational neurolinguistics model with a network connectivity paradigm is created. This allows two dimensions of context space to be incorporated with both semantic and neural distributional representations.

  9. Model-based document categorization employing semantic pattern analysis and local structure clustering

    NASA Astrophysics Data System (ADS)

    Fume, Kosei; Ishitani, Yasuto

    2008-01-01

    We propose a document categorization method based on a document model that can be defined externally for each task and that categorizes Web content or business documents into a target category in accordance with the similarity of the model. The main feature of the proposed method consists of two aspects of semantics extraction from an input document. The semantics of terms are extracted by the semantic pattern analysis and implicit meanings of document substructure are specified by a bottom-up text clustering technique focusing on the similarity of text line attributes. We have constructed a system based on the proposed method for trial purposes. The experimental results show that the system achieves more than 80% classification accuracy in categorizing Web content and business documents into 15 or 70 categories.

  10. Integrated Japanese Dependency Analysis Using a Dialog Context

    NASA Astrophysics Data System (ADS)

    Ikegaya, Yuki; Noguchi, Yasuhiro; Kogure, Satoru; Itoh, Toshihiko; Konishi, Tatsuhiro; Kondo, Makoto; Asoh, Hideki; Takagi, Akira; Itoh, Yukihiro

    This paper describes how to perform syntactic parsing and semantic analysis in a dialog system. The paper especially deals with how to disambiguate potentially ambiguous sentences using the contextual information. Although syntactic parsing and semantic analysis are often studied independently of each other, correct parsing of a sentence often requires the semantic information on the input and/or the contextual information prior to the input. Accordingly, we merge syntactic parsing with semantic analysis, which enables syntactic parsing taking advantage of the semantic content of an input and its context. One of the biggest problems of semantic analysis is how to interpret dependency structures. We employ a framework for semantic representations that circumvents the problem. Within the framework, the meaning of any predicate is converted into a semantic representation which only permits a single type of predicate: an identifying predicate "aru". The semantic representations are expressed as sets of "attribute-value" pairs, and those semantic representations are stored in the context information. Our system disambiguates syntactic/semantic ambiguities of inputs referring to the attribute-value pairs in the context information. We have experimentally confirmed the effectiveness of our approach; specifically, the experiment confirmed high accuracy of parsing and correctness of generated semantic representations.

  11. TRENCADIS--a WSRF grid MiddleWare for managing DICOM structured reporting objects.

    PubMed

    Blanquer, Ignacio; Hernandez, Vicente; Segrelles, Damià

    2006-01-01

    The adoption of the digital processing of medical data, especially on radiology, has leaded to the availability of millions of records (images and reports). However, this information is mainly used at patient level, being the extraction of information, organised according to administrative criteria, which make the extraction of knowledge difficult. Moreover, legal constraints make the direct integration of information systems complex or even impossible. On the other side, the widespread of the DICOM format has leaded to the inclusion of other information different from just radiological images. The possibility of coding radiology reports in a structured form, adding semantic information about the data contained in the DICOM objects, eases the process of structuring images according to content. DICOM Structured Reporting (DICOM-SR) is a specification of tags and sections to code and integrate radiology reports, with seamless references to findings and regions of interests of the associated images, movies, waveforms, signals, etc. The work presented in this paper aims at developing of a framework to efficiently and securely share medical images and radiology reports, as well as to provide high throughput processing services. This system is based on a previously developed architecture in the framework of the TRENCADIS project, and uses other components such as the security system and the Grid processing service developed in previous activities. The work presented here introduces a semantic structuring and an ontology framework, to organise medical images considering standard terminology and disease coding formats (SNOMED, ICD9, LOINC..).

  12. Automatic extraction of property norm-like data from large text corpora.

    PubMed

    Kelly, Colin; Devereux, Barry; Korhonen, Anna

    2014-01-01

    Traditional methods for deriving property-based representations of concepts from text have focused on either extracting only a subset of possible relation types, such as hyponymy/hypernymy (e.g., car is-a vehicle) or meronymy/metonymy (e.g., car has wheels), or unspecified relations (e.g., car--petrol). We propose a system for the challenging task of automatic, large-scale acquisition of unconstrained, human-like property norms from large text corpora, and discuss the theoretical implications of such a system. We employ syntactic, semantic, and encyclopedic information to guide our extraction, yielding concept-relation-feature triples (e.g., car be fast, car require petrol, car cause pollution), which approximate property-based conceptual representations. Our novel method extracts candidate triples from parsed corpora (Wikipedia and the British National Corpus) using syntactically and grammatically motivated rules, then reweights triples with a linear combination of their frequency and four statistical metrics. We assess our system output in three ways: lexical comparison with norms derived from human-generated property norm data, direct evaluation by four human judges, and a semantic distance comparison with both WordNet similarity data and human-judged concept similarity ratings. Our system offers a viable and performant method of plausible triple extraction: Our lexical comparison shows comparable performance to the current state-of-the-art, while subsequent evaluations exhibit the human-like character of our generated properties.

  13. Integrating Semantic Information in Metadata Descriptions for a Geoscience-wide Resource Inventory.

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Richard, S. M.; Gupta, A.; Valentine, D.; Whitenack, T.; Ozyurt, I. B.; Grethe, J. S.; Schachne, A.

    2016-12-01

    Integrating semantic information into legacy metadata catalogs is a challenging issue and so far has been mostly done on a limited scale. We present experience of CINERGI (Community Inventory of Earthcube Resources for Geoscience Interoperability), an NSF Earthcube Building Block project, in creating a large cross-disciplinary catalog of geoscience information resources to enable cross-domain discovery. The project developed a pipeline for automatically augmenting resource metadata, in particular generating keywords that describe metadata documents harvested from multiple geoscience information repositories or contributed by geoscientists through various channels including surveys and domain resource inventories. The pipeline examines available metadata descriptions using text parsing, vocabulary management and semantic annotation and graph navigation services of GeoSciGraph. GeoSciGraph, in turn, relies on a large cross-domain ontology of geoscience terms, which bridges several independently developed ontologies or taxonomies including SWEET, ENVO, YAGO, GeoSciML, GCMD, SWO, and CHEBI. The ontology content enables automatic extraction of keywords reflecting science domains, equipment used, geospatial features, measured properties, methods, processes, etc. We specifically focus on issues of cross-domain geoscience ontology creation, resolving several types of semantic conflicts among component ontologies or vocabularies, and constructing and managing facets for improved data discovery and navigation. The ontology and keyword generation rules are iteratively improved as pipeline results are presented to data managers for selective manual curation via a CINERGI Annotator user interface. We present lessons learned from applying CINERGI metadata augmentation pipeline to a number of federal agency and academic data registries, in the context of several use cases that require data discovery and integration across multiple earth science data catalogs of varying quality and completeness. The inventory is accessible at http://cinergi.sdsc.edu, and the CINERGI project web page is http://earthcube.org/group/cinergi

  14. Two-dimensional hidden semantic information model for target saliency detection and eyetracking identification

    NASA Astrophysics Data System (ADS)

    Wan, Weibing; Yuan, Lingfeng; Zhao, Qunfei; Fang, Tao

    2018-01-01

    Saliency detection has been applied to the target acquisition case. This paper proposes a two-dimensional hidden Markov model (2D-HMM) that exploits the hidden semantic information of an image to detect its salient regions. A spatial pyramid histogram of oriented gradient descriptors is used to extract features. After encoding the image by a learned dictionary, the 2D-Viterbi algorithm is applied to infer the saliency map. This model can predict fixation of the targets and further creates robust and effective depictions of the targets' change in posture and viewpoint. To validate the model with a human visual search mechanism, two eyetrack experiments are employed to train our model directly from eye movement data. The results show that our model achieves better performance than visual attention. Moreover, it indicates the plausibility of utilizing visual track data to identify targets.

  15. Enriching semantic knowledge bases for opinion mining in big data applications

    PubMed Central

    Weichselbraun, A.; Gindl, S.; Scharl, A.

    2014-01-01

    This paper presents a novel method for contextualizing and enriching large semantic knowledge bases for opinion mining with a focus on Web intelligence platforms and other high-throughput big data applications. The method is not only applicable to traditional sentiment lexicons, but also to more comprehensive, multi-dimensional affective resources such as SenticNet. It comprises the following steps: (i) identify ambiguous sentiment terms, (ii) provide context information extracted from a domain-specific training corpus, and (iii) ground this contextual information to structured background knowledge sources such as ConceptNet and WordNet. A quantitative evaluation shows a significant improvement when using an enriched version of SenticNet for polarity classification. Crowdsourced gold standard data in conjunction with a qualitative evaluation sheds light on the strengths and weaknesses of the concept grounding, and on the quality of the enrichment process. PMID:25431524

  16. Concept recognition for extracting protein interaction relations from biomedical text

    PubMed Central

    Baumgartner, William A; Lu, Zhiyong; Johnson, Helen L; Caporaso, J Gregory; Paquette, Jesse; Lindemann, Anna; White, Elizabeth K; Medvedeva, Olga; Cohen, K Bretonnel; Hunter, Lawrence

    2008-01-01

    Background: Reliable information extraction applications have been a long sought goal of the biomedical text mining community, a goal that if reached would provide valuable tools to benchside biologists in their increasingly difficult task of assimilating the knowledge contained in the biomedical literature. We present an integrated approach to concept recognition in biomedical text. Concept recognition provides key information that has been largely missing from previous biomedical information extraction efforts, namely direct links to well defined knowledge resources that explicitly cement the concept's semantics. The BioCreative II tasks discussed in this special issue have provided a unique opportunity to demonstrate the effectiveness of concept recognition in the field of biomedical language processing. Results: Through the modular construction of a protein interaction relation extraction system, we present several use cases of concept recognition in biomedical text, and relate these use cases to potential uses by the benchside biologist. Conclusion: Current information extraction technologies are approaching performance standards at which concept recognition can begin to deliver high quality data to the benchside biologist. Our system is available as part of the BioCreative Meta-Server project and on the internet . PMID:18834500

  17. Selective Short-Term Memory Deficits Arise from Impaired Domain-General Semantic Control Mechanisms

    ERIC Educational Resources Information Center

    Hoffman, Paul; Jefferies, Elizabeth; Ehsan, Sheeba; Hopper, Samantha; Lambon Ralph, Matthew A.

    2009-01-01

    Semantic short-term memory (STM) patients have a reduced ability to retain semantic information over brief delays but perform well on other semantic tasks; this pattern suggests damage to a dedicated buffer for semantic information. Alternatively, these difficulties may arise from mild disruption to domain-general semantic processes that have…

  18. Developing Visualization Techniques for Semantics-based Information Networks

    NASA Technical Reports Server (NTRS)

    Keller, Richard M.; Hall, David R.

    2003-01-01

    Information systems incorporating complex network structured information spaces with a semantic underpinning - such as hypermedia networks, semantic networks, topic maps, and concept maps - are being deployed to solve some of NASA s critical information management problems. This paper describes some of the human interaction and navigation problems associated with complex semantic information spaces and describes a set of new visual interface approaches to address these problems. A key strategy is to leverage semantic knowledge represented within these information spaces to construct abstractions and views that will be meaningful to the human user. Human-computer interaction methodologies will guide the development and evaluation of these approaches, which will benefit deployed NASA systems and also apply to information systems based on the emerging Semantic Web.

  19. Instantaneous Coastline Extraction from LIDAR Point Cloud and High Resolution Remote Sensing Imagery

    NASA Astrophysics Data System (ADS)

    Li, Y.; Zhoing, L.; Lai, Z.; Gan, Z.

    2018-04-01

    A new method was proposed for instantaneous waterline extraction in this paper, which combines point cloud geometry features and image spectral characteristics of the coastal zone. The proposed method consists of follow steps: Mean Shift algorithm is used to segment the coastal zone of high resolution remote sensing images into small regions containing semantic information;Region features are extracted by integrating LiDAR data and the surface area of the image; initial waterlines are extracted by α-shape algorithm; a region growing algorithm with is taking into coastline refinement, with a growth rule integrating the intensity and topography of LiDAR data; moothing the coastline. Experiments are conducted to demonstrate the efficiency of the proposed method.

  20. Computational approaches for predicting biomedical research collaborations.

    PubMed

    Zhang, Qing; Yu, Hong

    2014-01-01

    Biomedical research is increasingly collaborative, and successful collaborations often produce high impact work. Computational approaches can be developed for automatically predicting biomedical research collaborations. Previous works of collaboration prediction mainly explored the topological structures of research collaboration networks, leaving out rich semantic information from the publications themselves. In this paper, we propose supervised machine learning approaches to predict research collaborations in the biomedical field. We explored both the semantic features extracted from author research interest profile and the author network topological features. We found that the most informative semantic features for author collaborations are related to research interest, including similarity of out-citing citations, similarity of abstracts. Of the four supervised machine learning models (naïve Bayes, naïve Bayes multinomial, SVMs, and logistic regression), the best performing model is logistic regression with an ROC ranging from 0.766 to 0.980 on different datasets. To our knowledge we are the first to study in depth how research interest and productivities can be used for collaboration prediction. Our approach is computationally efficient, scalable and yet simple to implement. The datasets of this study are available at https://github.com/qingzhanggithub/medline-collaboration-datasets.

  1. Algorithms and semantic infrastructure for mutation impact extraction and grounding.

    PubMed

    Laurila, Jonas B; Naderi, Nona; Witte, René; Riazanov, Alexandre; Kouznetsov, Alexandre; Baker, Christopher J O

    2010-12-02

    Mutation impact extraction is a hitherto unaccomplished task in state of the art mutation extraction systems. Protein mutations and their impacts on protein properties are hidden in scientific literature, making them poorly accessible for protein engineers and inaccessible for phenotype-prediction systems that currently depend on manually curated genomic variation databases. We present the first rule-based approach for the extraction of mutation impacts on protein properties, categorizing their directionality as positive, negative or neutral. Furthermore protein and mutation mentions are grounded to their respective UniProtKB IDs and selected protein properties, namely protein functions to concepts found in the Gene Ontology. The extracted entities are populated to an OWL-DL Mutation Impact ontology facilitating complex querying for mutation impacts using SPARQL. We illustrate retrieval of proteins and mutant sequences for a given direction of impact on specific protein properties. Moreover we provide programmatic access to the data through semantic web services using the SADI (Semantic Automated Discovery and Integration) framework. We address the problem of access to legacy mutation data in unstructured form through the creation of novel mutation impact extraction methods which are evaluated on a corpus of full-text articles on haloalkane dehalogenases, tagged by domain experts. Our approaches show state of the art levels of precision and recall for Mutation Grounding and respectable level of precision but lower recall for the task of Mutant-Impact relation extraction. The system is deployed using text mining and semantic web technologies with the goal of publishing to a broad spectrum of consumers.

  2. Unlocking echocardiogram measurements for heart disease research through natural language processing.

    PubMed

    Patterson, Olga V; Freiberg, Matthew S; Skanderson, Melissa; J Fodeh, Samah; Brandt, Cynthia A; DuVall, Scott L

    2017-06-12

    In order to investigate the mechanisms of cardiovascular disease in HIV infected and uninfected patients, an analysis of echocardiogram reports is required for a large longitudinal multi-center study. A natural language processing system using a dictionary lookup, rules, and patterns was developed to extract heart function measurements that are typically recorded in echocardiogram reports as measurement-value pairs. Curated semantic bootstrapping was used to create a custom dictionary that extends existing terminologies based on terms that actually appear in the medical record. A novel disambiguation method based on semantic constraints was created to identify and discard erroneous alternative definitions of the measurement terms. The system was built utilizing a scalable framework, making it available for processing large datasets. The system was developed for and validated on notes from three sources: general clinic notes, echocardiogram reports, and radiology reports. The system achieved F-scores of 0.872, 0.844, and 0.877 with precision of 0.936, 0.982, and 0.969 for each dataset respectively averaged across all extracted values. Left ventricular ejection fraction (LVEF) is the most frequently extracted measurement. The precision of extraction of the LVEF measure ranged from 0.968 to 1.0 across different document types. This system illustrates the feasibility and effectiveness of a large-scale information extraction on clinical data. New clinical questions can be addressed in the domain of heart failure using retrospective clinical data analysis because key heart function measurements can be successfully extracted using natural language processing.

  3. PANTHER. Pattern ANalytics To support High-performance Exploitation and Reasoning.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Czuchlewski, Kristina Rodriguez; Hart, William E.

    Sandia has approached the analysis of big datasets with an integrated methodology that uses computer science, image processing, and human factors to exploit critical patterns and relationships in large datasets despite the variety and rapidity of information. The work is part of a three-year LDRD Grand Challenge called PANTHER (Pattern ANalytics To support High-performance Exploitation and Reasoning). To maximize data analysis capability, Sandia pursued scientific advances across three key technical domains: (1) geospatial-temporal feature extraction via image segmentation and classification; (2) geospatial-temporal analysis capabilities tailored to identify and process new signatures more efficiently; and (3) domain- relevant models of humanmore » perception and cognition informing the design of analytic systems. Our integrated results include advances in geographical information systems (GIS) in which we discover activity patterns in noisy, spatial-temporal datasets using geospatial-temporal semantic graphs. We employed computational geometry and machine learning to allow us to extract and predict spatial-temporal patterns and outliers from large aircraft and maritime trajectory datasets. We automatically extracted static and ephemeral features from real, noisy synthetic aperture radar imagery for ingestion into a geospatial-temporal semantic graph. We worked with analysts and investigated analytic workflows to (1) determine how experiential knowledge evolves and is deployed in high-demand, high-throughput visual search workflows, and (2) better understand visual search performance and attention. Through PANTHER, Sandia's fundamental rethinking of key aspects of geospatial data analysis permits the extraction of much richer information from large amounts of data. The project results enable analysts to examine mountains of historical and current data that would otherwise go untouched, while also gaining meaningful, measurable, and defensible insights into overlooked relationships and patterns. The capability is directly relevant to the nation's nonproliferation remote-sensing activities and has broad national security applications for military and intelligence- gathering organizations.« less

  4. Semantic attributes for people's appearance description: an appearance modality for video surveillance applications

    NASA Astrophysics Data System (ADS)

    Frikha, Mayssa; Fendri, Emna; Hammami, Mohamed

    2017-09-01

    Using semantic attributes such as gender, clothes, and accessories to describe people's appearance is an appealing modeling method for video surveillance applications. We proposed a midlevel appearance signature based on extracting a list of nameable semantic attributes describing the body in uncontrolled acquisition conditions. Conventional approaches extract the same set of low-level features to learn the semantic classifiers uniformly. Their critical limitation is the inability to capture the dominant visual characteristics for each trait separately. The proposed approach consists of extracting low-level features in an attribute-adaptive way by automatically selecting the most relevant features for each attribute separately. Furthermore, relying on a small training-dataset would easily lead to poor performance due to the large intraclass and interclass variations. We annotated large scale people images collected from different person reidentification benchmarks covering a large attribute sample and reflecting the challenges of uncontrolled acquisition conditions. These annotations were gathered into an appearance semantic attribute dataset that contains 3590 images annotated with 14 attributes. Various experiments prove that carefully designed features for learning the visual characteristics for an attribute provide an improvement of the correct classification accuracy and a reduction of both spatial and temporal complexities against state-of-the-art approaches.

  5. Automatic Short Essay Scoring Using Natural Language Processing to Extract Semantic Information in the Form of Propositions. CRESST Report 831

    ERIC Educational Resources Information Center

    Kerr, Deirdre; Mousavi, Hamid; Iseli, Markus R.

    2013-01-01

    The Common Core assessments emphasize short essay constructed-response items over multiple-choice items because they are more precise measures of understanding. However, such items are too costly and time consuming to be used in national assessments unless a way to score them automatically can be found. Current automatic essay-scoring techniques…

  6. Positive and negative emotion enhances the processing of famous faces in a semantic judgment task.

    PubMed

    Bate, Sarah; Haslam, Catherine; Hodgson, Timothy L; Jansari, Ashok; Gregory, Nicola; Kay, Janice

    2010-01-01

    Previous work has consistently reported a facilitatory influence of positive emotion in face recognition (e.g., D'Argembeau, Van der Linden, Comblain, & Etienne, 2003). However, these reports asked participants to make recognition judgments in response to faces, and it is unknown whether emotional valence may influence other stages of processing, such as at the level of semantics. Furthermore, other evidence suggests that negative rather than positive emotion facilitates higher level judgments when processing nonfacial stimuli (e.g., Mickley & Kensinger, 2008), and it is possible that negative emotion also influences latter stages of face processing. The present study addressed this issue, examining the influence of emotional valence while participants made semantic judgments in response to a set of famous faces. Eye movements were monitored while participants performed this task, and analyses revealed a reduction in information extraction for the faces of liked and disliked celebrities compared with those of emotionally neutral celebrities. Thus, in contrast to work using familiarity judgments, both positive and negative emotion facilitated processing in this semantic-based task. This pattern of findings is discussed in relation to current models of face processing. Copyright 2009 APA, all rights reserved.

  7. Getting connected: Both associative and semantic links structure semantic memory for newly learned persons.

    PubMed

    Wiese, Holger; Schweinberger, Stefan R

    2015-01-01

    The present study examined whether semantic memory for newly learned people is structured by visual co-occurrence, shared semantics, or both. Participants were trained with pairs of simultaneously presented (i.e., co-occurring) preexperimentally unfamiliar faces, which either did or did not share additionally provided semantic information (occupation, place of living, etc.). Semantic information could also be shared between faces that did not co-occur. A subsequent priming experiment revealed faster responses for both co-occurrence/no shared semantics and no co-occurrence/shared semantics conditions, than for an unrelated condition. Strikingly, priming was strongest in the co-occurrence/shared semantics condition, suggesting additive effects of these factors. Additional analysis of event-related brain potentials yielded priming in the N400 component only for combined effects of visual co-occurrence and shared semantics, with more positive amplitudes in this than in the unrelated condition. Overall, these findings suggest that both semantic relatedness and visual co-occurrence are important when novel information is integrated into person-related semantic memory.

  8. Learning semantic and visual similarity for endomicroscopy video retrieval.

    PubMed

    Andre, Barbara; Vercauteren, Tom; Buchner, Anna M; Wallace, Michael B; Ayache, Nicholas

    2012-06-01

    Content-based image retrieval (CBIR) is a valuable computer vision technique which is increasingly being applied in the medical community for diagnosis support. However, traditional CBIR systems only deliver visual outputs, i.e., images having a similar appearance to the query, which is not directly interpretable by the physicians. Our objective is to provide a system for endomicroscopy video retrieval which delivers both visual and semantic outputs that are consistent with each other. In a previous study, we developed an adapted bag-of-visual-words method for endomicroscopy retrieval, called "Dense-Sift," that computes a visual signature for each video. In this paper, we present a novel approach to complement visual similarity learning with semantic knowledge extraction, in the field of in vivo endomicroscopy. We first leverage a semantic ground truth based on eight binary concepts, in order to transform these visual signatures into semantic signatures that reflect how much the presence of each semantic concept is expressed by the visual words describing the videos. Using cross-validation, we demonstrate that, in terms of semantic detection, our intuitive Fisher-based method transforming visual-word histograms into semantic estimations outperforms support vector machine (SVM) methods with statistical significance. In a second step, we propose to improve retrieval relevance by learning an adjusted similarity distance from a perceived similarity ground truth. As a result, our distance learning method allows to statistically improve the correlation with the perceived similarity. We also demonstrate that, in terms of perceived similarity, the recall performance of the semantic signatures is close to that of visual signatures and significantly better than those of several state-of-the-art CBIR methods. The semantic signatures are thus able to communicate high-level medical knowledge while being consistent with the low-level visual signatures and much shorter than them. In our resulting retrieval system, we decide to use visual signatures for perceived similarity learning and retrieval, and semantic signatures for the output of an additional information, expressed in the endoscopist own language, which provides a relevant semantic translation of the visual retrieval outputs.

  9. Intelligent services for discovery of complex geospatial features from remote sensing imagery

    NASA Astrophysics Data System (ADS)

    Yue, Peng; Di, Liping; Wei, Yaxing; Han, Weiguo

    2013-09-01

    Remote sensing imagery has been commonly used by intelligence analysts to discover geospatial features, including complex ones. The overwhelming volume of routine image acquisition requires automated methods or systems for feature discovery instead of manual image interpretation. The methods of extraction of elementary ground features such as buildings and roads from remote sensing imagery have been studied extensively. The discovery of complex geospatial features, however, is still rather understudied. A complex feature, such as a Weapon of Mass Destruction (WMD) proliferation facility, is spatially composed of elementary features (e.g., buildings for hosting fuel concentration machines, cooling towers, transportation roads, and fences). Such spatial semantics, together with thematic semantics of feature types, can be used to discover complex geospatial features. This paper proposes a workflow-based approach for discovery of complex geospatial features that uses geospatial semantics and services. The elementary features extracted from imagery are archived in distributed Web Feature Services (WFSs) and discoverable from a catalogue service. Using spatial semantics among elementary features and thematic semantics among feature types, workflow-based service chains can be constructed to locate semantically-related complex features in imagery. The workflows are reusable and can provide on-demand discovery of complex features in a distributed environment.

  10. A semantic medical multimedia retrieval approach using ontology information hiding.

    PubMed

    Guo, Kehua; Zhang, Shigeng

    2013-01-01

    Searching useful information from unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach which can reflect the users' query intent. Firstly, semantic annotations will be given to the multimedia documents in the medical multimedia database. Secondly, the ontology that represented semantic information will be hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate a good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches.

  11. FIR: An Effective Scheme for Extracting Useful Metadata from Social Media.

    PubMed

    Chen, Long-Sheng; Lin, Zue-Cheng; Chang, Jing-Rong

    2015-11-01

    Recently, the use of social media for health information exchange is expanding among patients, physicians, and other health care professionals. In medical areas, social media allows non-experts to access, interpret, and generate medical information for their own care and the care of others. Researchers paid much attention on social media in medical educations, patient-pharmacist communications, adverse drug reactions detection, impacts of social media on medicine and healthcare, and so on. However, relatively few papers discuss how to extract useful knowledge from a huge amount of textual comments in social media effectively. Therefore, this study aims to propose a Fuzzy adaptive resonance theory network based Information Retrieval (FIR) scheme by combining Fuzzy adaptive resonance theory (ART) network, Latent Semantic Indexing (LSI), and association rules (AR) discovery to extract knowledge from social media. In our FIR scheme, Fuzzy ART network firstly has been employed to segment comments. Next, for each customer segment, we use LSI technique to retrieve important keywords. Then, in order to make the extracted keywords understandable, association rules mining is presented to organize these extracted keywords to build metadata. These extracted useful voices of customers will be transformed into design needs by using Quality Function Deployment (QFD) for further decision making. Unlike conventional information retrieval techniques which acquire too many keywords to get key points, our FIR scheme can extract understandable metadata from social media.

  12. Methods and apparatus for capture and storage of semantic information with sub-files in a parallel computing system

    DOEpatents

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Torres, Aaron

    2015-02-03

    Techniques are provided for storing files in a parallel computing system using sub-files with semantically meaningful boundaries. A method is provided for storing at least one file generated by a distributed application in a parallel computing system. The file comprises one or more of a complete file and a plurality of sub-files. The method comprises the steps of obtaining a user specification of semantic information related to the file; providing the semantic information as a data structure description to a data formatting library write function; and storing the semantic information related to the file with one or more of the sub-files in one or more storage nodes of the parallel computing system. The semantic information provides a description of data in the file. The sub-files can be replicated based on semantically meaningful boundaries.

  13. Simultaneous face and voice processing in schizophrenia.

    PubMed

    Liu, Taosheng; Pinheiro, Ana P; Zhao, Zhongxin; Nestor, Paul G; McCarley, Robert W; Niznikiewicz, Margaret

    2016-05-15

    While several studies have consistently demonstrated abnormalities in the unisensory processing of face and voice in schizophrenia (SZ), the extent of abnormalities in the simultaneous processing of both types of information remains unclear. To address this issue, we used event-related potentials (ERP) methodology to probe the multisensory integration of face and non-semantic sounds in schizophrenia. EEG was recorded from 18 schizophrenia patients and 19 healthy control (HC) subjects in three conditions: neutral faces (visual condition-VIS); neutral non-semantic sounds (auditory condition-AUD); neutral faces presented simultaneously with neutral non-semantic sounds (audiovisual condition-AUDVIS). When compared with HC, the schizophrenia group showed less negative N170 to both face and face-voice stimuli; later P270 peak latency in the multimodal condition of face-voice relative to unimodal condition of face (the reverse was true in HC); reduced P400 amplitude and earlier P400 peak latency in the face but not in the voice-face condition. Thus, the analysis of ERP components suggests that deficits in the encoding of facial information extend to multimodal face-voice stimuli and that delays exist in feature extraction from multimodal face-voice stimuli in schizophrenia. In contrast, categorization processes seem to benefit from the presentation of simultaneous face-voice information. Timepoint by timepoint tests of multimodal integration did not suggest impairment in the initial stages of processing in schizophrenia. Published by Elsevier B.V.

  14. Semantic Theme Analysis of Pilot Incident Reports

    NASA Technical Reports Server (NTRS)

    Thirumalainambi, Rajkumar

    2009-01-01

    Pilots report accidents or incidents during take-off, on flight and landing to airline authorities and Federal aviation authority as well. The description of pilot reports for an incident contains technical terms related to Flight instruments and operations. Normal text mining approaches collect keywords from text documents and relate them among documents that are stored in database. Present approach will extract specific theme analysis of incident reports and semantically relate hierarchy of terms assigning weights of themes. Once the theme extraction has been performed for a given document, a unique key can be assigned to that document to cross linking the documents. Semantic linking will be used to categorize the documents based on specific rules that can help an end-user to analyze certain types of accidents. This presentation outlines the architecture of text mining for pilot incident reports for autonomous categorization of pilot incident reports using semantic theme analysis.

  15. An ensemble method for extracting adverse drug events from social media.

    PubMed

    Liu, Jing; Zhao, Songzheng; Zhang, Xiaodi

    2016-06-01

    Because adverse drug events (ADEs) are a serious health problem and a leading cause of death, it is of vital importance to identify them correctly and in a timely manner. With the development of Web 2.0, social media has become a large data source for information on ADEs. The objective of this study is to develop a relation extraction system that uses natural language processing techniques to effectively distinguish between ADEs and non-ADEs in informal text on social media. We develop a feature-based approach that utilizes various lexical, syntactic, and semantic features. Information-gain-based feature selection is performed to address high-dimensional features. Then, we evaluate the effectiveness of four well-known kernel-based approaches (i.e., subset tree kernel, tree kernel, shortest dependency path kernel, and all-paths graph kernel) and several ensembles that are generated by adopting different combination methods (i.e., majority voting, weighted averaging, and stacked generalization). All of the approaches are tested using three data sets: two health-related discussion forums and one general social networking site (i.e., Twitter). When investigating the contribution of each feature subset, the feature-based approach attains the best area under the receiver operating characteristics curve (AUC) values, which are 78.6%, 72.2%, and 79.2% on the three data sets. When individual methods are used, we attain the best AUC values of 82.1%, 73.2%, and 77.0% using the subset tree kernel, shortest dependency path kernel, and feature-based approach on the three data sets, respectively. When using classifier ensembles, we achieve the best AUC values of 84.5%, 77.3%, and 84.5% on the three data sets, outperforming the baselines. Our experimental results indicate that ADE extraction from social media can benefit from feature selection. With respect to the effectiveness of different feature subsets, lexical features and semantic features can enhance the ADE extraction capability. Kernel-based approaches, which can stay away from the feature sparsity issue, are qualified to address the ADE extraction problem. Combining different individual classifiers using suitable combination methods can further enhance the ADE extraction effectiveness. Copyright © 2016 Elsevier B.V. All rights reserved.

  16. A Case Study on Sepsis Using PubMed and Deep Learning for Ontology Learning.

    PubMed

    Arguello Casteleiro, Mercedes; Maseda Fernandez, Diego; Demetriou, George; Read, Warren; Fernandez Prieto, Maria Jesus; Des Diz, Julio; Nenadic, Goran; Keane, John; Stevens, Robert

    2017-01-01

    We investigate the application of distributional semantics models for facilitating unsupervised extraction of biomedical terms from unannotated corpora. Term extraction is used as the first step of an ontology learning process that aims to (semi-)automatic annotation of biomedical concepts and relations from more than 300K PubMed titles and abstracts. We experimented with both traditional distributional semantics methods such as Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) as well as the neural language models CBOW and Skip-gram from Deep Learning. The evaluation conducted concentrates on sepsis, a major life-threatening condition, and shows that Deep Learning models outperform LSA and LDA with much higher precision.

  17. Developing a Satellite Based Automatic System for Crop Monitoring: Kenya's Great Rift Valley, A Case Study

    NASA Astrophysics Data System (ADS)

    Lucciani, Roberto; Laneve, Giovanni; Jahjah, Munzer; Mito, Collins

    2016-08-01

    The crop growth stage represents essential information for agricultural areas management. In this study we investigate the feasibility of a tool based on remotely sensed satellite (Landsat 8) imagery, capable of automatically classify crop fields and how much resolution enhancement based on pan-sharpening techniques and phenological information extraction, useful to create decision rules that allow to identify semantic class to assign to an object, can effectively support the classification process. Moreover we investigate the opportunity to extract vegetation health status information from remotely sensed assessment of the equivalent water thickness (EWT). Our case study is the Kenya's Great Rift valley, in this area a ground truth campaign was conducted during August 2015 in order to collect crop fields GPS measurements, leaf area index (LAI) and chlorophyll samples.

  18. Semantic data association for planar features in outdoor 6D-SLAM using lidar

    NASA Astrophysics Data System (ADS)

    Ulas, C.; Temeltas, H.

    2013-05-01

    Simultaneous Localization and Mapping (SLAM) is a fundamental problem of the autonomous systems in GPS (Global Navigation System) denied environments. The traditional probabilistic SLAM methods uses point features as landmarks and hold all the feature positions in their state vector in addition to the robot pose. The bottleneck of the point-feature based SLAM methods is the data association problem, which are mostly based on a statistical measure. The data association performance is very critical for a robust SLAM method since all the filtering strategies are applied after a known correspondence. For point-features, two different but very close landmarks in the same scene might be confused while giving the correspondence decision when their positions and error covariance matrix are solely taking into account. Instead of using the point features, planar features can be considered as an alternative landmark model in the SLAM problem to be able to provide a more consistent data association. Planes contain rich information for the solution of the data association problem and can be distinguished easily with respect to point features. In addition, planar maps are very compact since an environment has only very limited number of planar structures. The planar features does not have to be large structures like building wall or roofs; the small plane segments can also be used as landmarks like billboards, traffic posts and some part of the bridges in urban areas. In this paper, a probabilistic plane-feature extraction method from 3DLiDAR data and the data association based on the extracted semantic information of the planar features is introduced. The experimental results show that the semantic data association provides very satisfactory result in outdoor 6D-SLAM.

  19. Data-driven Ontology Development: A Case Study at NASA's Atmospheric Science Data Center

    NASA Astrophysics Data System (ADS)

    Hertz, J.; Huffer, E.; Kusterer, J.

    2012-12-01

    Well-founded ontologies are key to enabling transformative semantic technologies and accelerating scientific research. One example is semantically enabled search and discovery, making scientific data accessible and more understandable by accurately modeling a complex domain. The ontology creation process remains a challenge for many anxious to pursue semantic technologies. The key may be that the creation process -- whether formal, community-based, automated or semi-automated -- should encompass not only a foundational core and supplemental resources but also a focus on the purpose or mission the ontology is created to support. Are there tools or processes to de-mystify, assess or enhance the resulting ontology? We suggest that comparison and analysis of a domain-focused ontology can be made using text engineering tools for information extraction, tokenizers, named entity transducers and others. The results are analyzed to ensure the ontology reflects the core purpose of the domain's mission and that the ontology integrates and describes the supporting data in the language of the domain - how the science is analyzed and discussed among all users of the data. Commonalities and relationships among domain resources describing the Clouds and Earth's Radiant Energy (CERES) Bi-Directional Scan (BDS) datasets from NASA's Atmospheric Science Data Center are compared. The domain resources include: a formal ontology created for CERES; scientific works such as papers, conference proceedings and notes; information extracted from the datasets (i.e., header metadata); and BDS scientific documentation (Algorithm Theoretical Basis Documents, collection guides, data quality summaries and others). These resources are analyzed using the open source software General Architecture for Text Engineering, a mature framework for computational tasks involving human language.

  20. Hippocampal activation during retrieval of spatial context from episodic and semantic memory.

    PubMed

    Hoscheidt, Siobhan M; Nadel, Lynn; Payne, Jessica; Ryan, Lee

    2010-10-15

    The hippocampus, a region implicated in the processing of spatial information and episodic memory, is central to the debate concerning the relationship between episodic and semantic memory. Studies of medial temporal lobe amnesic patients provide evidence that the hippocampus is critical for the retrieval of episodic but not semantic memory. On the other hand, recent neuroimaging studies of intact individuals report hippocampal activation during retrieval of both autobiographical memories and semantic information that includes historical facts, famous faces, and categorical information, suggesting that episodic and semantic memory may engage the hippocampus during memory retrieval in similar ways. Few studies have matched episodic and semantic tasks for the degree to which they include spatial content, even though spatial content may be what drives hippocampal activation during semantic retrieval. To examine this issue, we conducted a functional magnetic resonance imaging (fMRI) study in which retrieval of spatial and nonspatial information was compared during an episodic and semantic recognition task. Results show that the hippocampus (1) participates preferentially in the retrieval of episodic memories; (2) is also engaged by retrieval of semantic memories, particularly those that include spatial information. These data suggest that sharp dissociations between episodic and semantic memory may be overly simplistic and that the hippocampus plays a role in the retrieval of spatial content whether drawn from a memory of one's own life experiences or real-world semantic knowledge. Published by Elsevier B.V.

  1. The Fusion Model of Intelligent Transportation Systems Based on the Urban Traffic Ontology

    NASA Astrophysics Data System (ADS)

    Yang, Wang-Dong; Wang, Tao

    On these issues unified representation of urban transport information using urban transport ontology, it defines the statute and the algebraic operations of semantic fusion in ontology level in order to achieve the fusion of urban traffic information in the semantic completeness and consistency. Thus this paper takes advantage of the semantic completeness of the ontology to build urban traffic ontology model with which we resolve the problems as ontology mergence and equivalence verification in semantic fusion of traffic information integration. Information integration in urban transport can increase the function of semantic fusion, and reduce the amount of data integration of urban traffic information as well enhance the efficiency and integrity of traffic information query for the help, through the practical application of intelligent traffic information integration platform of Changde city, the paper has practically proved that the semantic fusion based on ontology increases the effect and efficiency of the urban traffic information integration, reduces the storage quantity, and improve query efficiency and information completeness.

  2. A Semantic Medical Multimedia Retrieval Approach Using Ontology Information Hiding

    PubMed Central

    Guo, Kehua; Zhang, Shigeng

    2013-01-01

    Searching useful information from unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach which can reflect the users' query intent. Firstly, semantic annotations will be given to the multimedia documents in the medical multimedia database. Secondly, the ontology that represented semantic information will be hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate a good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches. PMID:24082915

  3. Spatial and temporal features of superordinate semantic processing studied with fMRI and EEG.

    PubMed

    Costanzo, Michelle E; McArdle, Joseph J; Swett, Bruce; Nechaev, Vladimir; Kemeny, Stefan; Xu, Jiang; Braun, Allen R

    2013-01-01

    The relationships between the anatomical representation of semantic knowledge in the human brain and the timing of neurophysiological mechanisms involved in manipulating such information remain unclear. This is the case for superordinate semantic categorization-the extraction of general features shared by broad classes of exemplars (e.g., living vs. non-living semantic categories). We proposed that, because of the abstract nature of this information, input from diverse input modalities (visual or auditory, lexical or non-lexical) should converge and be processed in the same regions of the brain, at similar time scales during superordinate categorization-specifically in a network of heteromodal regions, and late in the course of the categorization process. In order to test this hypothesis, we utilized electroencephalography and event related potentials (EEG/ERP) with functional magnetic resonance imaging (fMRI) to characterize subjects' responses as they made superordinate categorical decisions (living vs. non-living) about objects presented as visual pictures or auditory words. Our results reveal that, consistent with our hypothesis, during the course of superordinate categorization, information provided by these diverse inputs appears to converge in both time and space: fMRI showed that heteromodal areas of the parietal and temporal cortices are active during categorization of both classes of stimuli. The ERP results suggest that superordinate categorization is reflected as a late positive component (LPC) with a parietal distribution and long latencies for both stimulus types. Within the areas and times in which modality independent responses were identified, some differences between living and non-living categories were observed, with a more widespread spatial extent and longer latency responses for categorization of non-living items.

  4. Spatial and temporal features of superordinate semantic processing studied with fMRI and EEG

    PubMed Central

    Costanzo, Michelle E.; McArdle, Joseph J.; Swett, Bruce; Nechaev, Vladimir; Kemeny, Stefan; Xu, Jiang; Braun, Allen R.

    2013-01-01

    The relationships between the anatomical representation of semantic knowledge in the human brain and the timing of neurophysiological mechanisms involved in manipulating such information remain unclear. This is the case for superordinate semantic categorization—the extraction of general features shared by broad classes of exemplars (e.g., living vs. non-living semantic categories). We proposed that, because of the abstract nature of this information, input from diverse input modalities (visual or auditory, lexical or non-lexical) should converge and be processed in the same regions of the brain, at similar time scales during superordinate categorization—specifically in a network of heteromodal regions, and late in the course of the categorization process. In order to test this hypothesis, we utilized electroencephalography and event related potentials (EEG/ERP) with functional magnetic resonance imaging (fMRI) to characterize subjects' responses as they made superordinate categorical decisions (living vs. non-living) about objects presented as visual pictures or auditory words. Our results reveal that, consistent with our hypothesis, during the course of superordinate categorization, information provided by these diverse inputs appears to converge in both time and space: fMRI showed that heteromodal areas of the parietal and temporal cortices are active during categorization of both classes of stimuli. The ERP results suggest that superordinate categorization is reflected as a late positive component (LPC) with a parietal distribution and long latencies for both stimulus types. Within the areas and times in which modality independent responses were identified, some differences between living and non-living categories were observed, with a more widespread spatial extent and longer latency responses for categorization of non-living items. PMID:23847490

  5. Explaining semantic short-term memory deficits: Evidence for the critical role of semantic control

    PubMed Central

    Hoffman, Paul; Jefferies, Elizabeth; Lambon Ralph, Matthew A.

    2011-01-01

    Patients with apparently selective short-term memory (STM) deficits for semantic information have played an important role in developing multi-store theories of STM and challenge the idea that verbal STM is supported by maintaining activation in the language system. We propose that semantic STM deficits are not as selective as previously thought and can occur as a result of mild disruption to semantic control processes, i.e., mechanisms that bias semantic processing towards task-relevant aspects of knowledge and away from irrelevant information. We tested three semantic STM patients with tasks that tapped four aspects of semantic control: (i) resolving ambiguity between word meanings, (ii) sensitivity to cues, (iii) ignoring irrelevant information and (iv) detecting weak semantic associations. All were impaired in conditions requiring more semantic control, irrespective of the STM demands of the task, suggesting a mild, but task-general, deficit in regulating semantic knowledge. This mild deficit has a disproportionate effect on STM tasks because they have high intrinsic control demands: in STM tasks, control is required to keep information active when it is no longer available in the environment and to manage competition between items held in memory simultaneously. By re-interpreting the core deficit in semantic STM patients in this way, we are able to explain their apparently selective impairment without the need for a specialised STM store. Instead, we argue that semantic STM patients occupy the mildest end of spectrum of semantic control disorders. PMID:21195105

  6. Data mart construction based on semantic annotation of scientific articles: A case study for the prioritization of drug targets.

    PubMed

    Teixeira, Marlon Amaro Coelho; Belloze, Kele Teixeira; Cavalcanti, Maria Cláudia; Silva-Junior, Floriano P

    2018-04-01

    Semantic text annotation enables the association of semantic information (ontology concepts) to text expressions (terms), which are readable by software agents. In the scientific scenario, this is particularly useful because it reveals a lot of scientific discoveries that are hidden within academic articles. The Biomedical area has more than 300 ontologies, most of them composed of over 500 concepts. These ontologies can be used to annotate scientific papers and thus, facilitate data extraction. However, in the context of a scientific research, a simple keyword-based query using the interface of a digital scientific texts library can return more than a thousand hits. The analysis of such a large set of texts, annotated with such numerous and large ontologies, is not an easy task. Therefore, the main objective of this work is to provide a method that could facilitate this task. This work describes a method called Text and Ontology ETL (TOETL), to build an analytical view over such texts. First, a corpus of selected papers is semantically annotated using distinct ontologies. Then, the annotation data is extracted, organized and aggregated into the dimensional schema of a data mart. Besides the TOETL method, this work illustrates its application through the development of the TaP DM (Target Prioritization data mart). This data mart has focus on the research of gene essentiality, a key concept to be considered when searching for genes showing potential as anti-infective drug targets. This work reveals that the proposed approach is a relevant tool to support decision making in the prioritization of new drug targets, being more efficient than the keyword-based traditional tools. Copyright © 2018 Elsevier B.V. All rights reserved.

  7. Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis.

    PubMed

    Velupillai, S; Mowery, D; South, B R; Kvist, M; Dalianis, H

    2015-08-13

    We present a review of recent advances in clinical Natural Language Processing (NLP), with a focus on semantic analysis and key subtasks that support such analysis. We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant referenced publications from the included papers. Significant articles published within this time-span were included and are discussed from the perspective of semantic analysis. Three key clinical NLP subtasks that enable such analysis were identified: 1) developing more efficient methods for corpus creation (annotation and de-identification), 2) generating building blocks for extracting meaning (morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical utility (NLP applications and infrastructure for clinical use cases). Finally, we provide a reflection upon most recent developments and potential areas of future NLP development and applications. There has been an increase of advances within key NLP subtasks that support semantic analysis. Performance of NLP semantic analysis is, in many cases, close to that of agreement between humans. The creation and release of corpora annotated with complex semantic information models has greatly supported the development of new tools and approaches. Research on non-English languages is continuously growing. NLP methods have sometimes been successfully employed in real-world clinical tasks. However, there is still a gap between the development of advanced resources and their utilization in clinical settings. A plethora of new clinical use cases are emerging due to established health care initiatives and additional patient-generated sources through the extensive use of social media and other devices.

  8. Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis

    PubMed Central

    Mowery, D.; South, B. R.; Kvist, M.; Dalianis, H.

    2015-01-01

    Summary Objectives We present a review of recent advances in clinical Natural Language Processing (NLP), with a focus on semantic analysis and key subtasks that support such analysis. Methods We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant referenced publications from the included papers. Results Significant articles published within this time-span were included and are discussed from the perspective of semantic analysis. Three key clinical NLP subtasks that enable such analysis were identified: 1) developing more efficient methods for corpus creation (annotation and de-identification), 2) generating building blocks for extracting meaning (morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical utility (NLP applications and infrastructure for clinical use cases). Finally, we provide a reflection upon most recent developments and potential areas of future NLP development and applications. Conclusions There has been an increase of advances within key NLP subtasks that support semantic analysis. Performance of NLP semantic analysis is, in many cases, close to that of agreement between humans. The creation and release of corpora annotated with complex semantic information models has greatly supported the development of new tools and approaches. Research on non-English languages is continuously growing. NLP methods have sometimes been successfully employed in real-world clinical tasks. However, there is still a gap between the development of advanced resources and their utilization in clinical settings. A plethora of new clinical use cases are emerging due to established health care initiatives and additional patient-generated sources through the extensive use of social media and other devices. PMID:26293867

  9. LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUEPRVISED WSD

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    VERSPOOR, KARIN; LIN, SHOU-DE

    An N-gram language model aims at capturing statistical syntactic word order information from corpora. Although the concept of language models has been applied extensively to handle a variety of NLP problems with reasonable success, the standard model does not incorporate semantic information, and consequently limits its applicability to semantic problems such as word sense disambiguation. We propose a framework that integrates semantic information into the language model schema, allowing a system to exploit both syntactic and semantic information to address NLP problems. Furthermore, acknowledging the limited availability of semantically annotated data, we discuss how the proposed model can be learnedmore » without annotated training examples. Finally, we report on a case study showing how the semantics-enhanced language model can be applied to unsupervised word sense disambiguation with promising results.« less

  10. Introduction to geospatial semantics and technology workshop handbook

    USGS Publications Warehouse

    Varanka, Dalia E.

    2012-01-01

    The workshop is a tutorial on introductory geospatial semantics with hands-on exercises using standard Web browsers. The workshop is divided into two sections, general semantics on the Web and specific examples of geospatial semantics using data from The National Map of the U.S. Geological Survey and the Open Ontology Repository. The general semantics section includes information and access to publicly available semantic archives. The specific session includes information on geospatial semantics with access to semantically enhanced data for hydrography, transportation, boundaries, and names. The Open Ontology Repository offers open-source ontologies for public use.

  11. Temporal data representation, normalization, extraction, and reasoning: A review from clinical domain

    PubMed Central

    Madkour, Mohcine; Benhaddou, Driss; Tao, Cui

    2016-01-01

    Background and Objective We live our lives by the calendar and the clock, but time is also an abstraction, even an illusion. The sense of time can be both domain-specific and complex, and is often left implicit, requiring significant domain knowledge to accurately recognize and harness. In the clinical domain, the momentum gained from recent advances in infrastructure and governance practices has enabled the collection of tremendous amount of data at each moment in time. Electronic Health Records (EHRs) have paved the way to making these data available for practitioners and researchers. However, temporal data representation, normalization, extraction and reasoning are very important in order to mine such massive data and therefore for constructing the clinical timeline. The objective of this work is to provide an overview of the problem of constructing a timeline at the clinical point of care and to summarize the state-of-the-art in processing temporal information of clinical narratives. Methods This review surveys the methods used in three important area: modeling and representing of time, Medical NLP methods for extracting time, and methods of time reasoning and processing. The review emphasis on the current existing gap between present methods and the semantic web technologies and catch up with the possible combinations. Results the main findings of this review is revealing the importance of time processing not only in constructing timelines and clinical decision support systems but also as a vital component of EHR data models and operations. Conclusions Extracting temporal information in clinical narratives is a challenging task. The inclusion of ontologies and semantic web will lead to better assessment of the annotation task and, together with medical NLP techniques, will help resolving granularity and co-reference resolution problems. PMID:27040831

  12. Classification of Aerial Photogrammetric 3d Point Clouds

    NASA Astrophysics Data System (ADS)

    Becker, C.; Häni, N.; Rosinskaya, E.; d'Angelo, E.; Strecha, C.

    2017-05-01

    We present a powerful method to extract per-point semantic class labels from aerial photogrammetry data. Labelling this kind of data is important for tasks such as environmental modelling, object classification and scene understanding. Unlike previous point cloud classification methods that rely exclusively on geometric features, we show that incorporating color information yields a significant increase in accuracy in detecting semantic classes. We test our classification method on three real-world photogrammetry datasets that were generated with Pix4Dmapper Pro, and with varying point densities. We show that off-the-shelf machine learning techniques coupled with our new features allow us to train highly accurate classifiers that generalize well to unseen data, processing point clouds containing 10 million points in less than 3 minutes on a desktop computer.

  13. Integrating semantic dimension into openEHR archetypes for the management of cerebral palsy electronic medical records.

    PubMed

    Ellouze, Afef Samet; Bouaziz, Rafik; Ghorbel, Hanen

    2016-10-01

    Integrating semantic dimension into clinical archetypes is necessary once modeling medical records. First, it enables semantic interoperability and, it offers applying semantic activities on clinical data and provides a higher design quality of Electronic Medical Record (EMR) systems. However, to obtain these advantages, designers need to use archetypes that cover semantic features of clinical concepts involved in their specific applications. In fact, most of archetypes filed within open repositories are expressed in the Archetype Definition Language (ALD) which allows defining only the syntactic structure of clinical concepts weakening semantic activities on the EMR content in the semantic web environment. This paper focuses on the modeling of an EMR prototype for infants affected by Cerebral Palsy (CP), using the dual model approach and integrating semantic web technologies. Such a modeling provides a better delivery of quality of care and ensures semantic interoperability between all involved therapies' information systems. First, data to be documented are identified and collected from the involved therapies. Subsequently, data are analyzed and arranged into archetypes expressed in accordance of ADL. During this step, open archetype repositories are explored, in order to find the suitable archetypes. Then, ADL archetypes are transformed into archetypes expressed in OWL-DL (Ontology Web Language - Description Language). Finally, we construct an ontological source related to these archetypes enabling hence their annotation to facilitate data extraction and providing possibility to exercise semantic activities on such archetypes. Semantic dimension integration into EMR modeled in accordance to the archetype approach. The feasibility of our solution is shown through the development of a prototype, baptized "CP-SMS", which ensures semantic exploitation of CP EMR. This prototype provides the following features: (i) creation of CP EMR instances and their checking by using a knowledge base which we have constructed by interviews with domain experts, (ii) translation of initially CP ADL archetypes into CP OWL-DL archetypes, (iii) creation of an ontological source which we can use to annotate obtained archetypes and (vi) enrichment and supply of the ontological source and integration of semantic relations by providing hence fueling the ontology with new concepts, ensuring consistency and eliminating ambiguity between concepts. The degree of semantic interoperability that could be reached between EMR systems depends strongly on the quality of the used archetypes. Thus, the integration of semantic dimension in archetypes modeling process is crucial. By creating an ontological source and annotating archetypes, we create a supportive platform ensuring semantic interoperability between archetypes-based EMR-systems. Copyright © 2016. Published by Elsevier Inc.

  14. OntoCR: A CEN/ISO-13606 clinical repository based on ontologies.

    PubMed

    Lozano-Rubí, Raimundo; Muñoz Carrero, Adolfo; Serrano Balazote, Pablo; Pastor, Xavier

    2016-04-01

    To design a new semantically interoperable clinical repository, based on ontologies, conforming to CEN/ISO 13606 standard. The approach followed is to extend OntoCRF, a framework for the development of clinical repositories based on ontologies. The meta-model of OntoCRF has been extended by incorporating an OWL model integrating CEN/ISO 13606, ISO 21090 and SNOMED CT structure. This approach has demonstrated a complete evaluation cycle involving the creation of the meta-model in OWL format, the creation of a simple test application, and the communication of standardized extracts to another organization. Using a CEN/ISO 13606 based system, an indefinite number of archetypes can be merged (and reused) to build new applications. Our approach, based on the use of ontologies, maintains data storage independent of content specification. With this approach, relational technology can be used for storage, maintaining extensibility capabilities. The present work demonstrates that it is possible to build a native CEN/ISO 13606 repository for the storage of clinical data. We have demonstrated semantic interoperability of clinical information using CEN/ISO 13606 extracts. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. Intrusive effects of semantic information on visual selective attention.

    PubMed

    Malcolm, George L; Rattinger, Michelle; Shomstein, Sarah

    2016-10-01

    Every object is represented by semantic information in extension to its low-level properties. It is well documented that such information biases attention when it is necessary for an ongoing task. However, whether semantic relationships influence attentional selection when they are irrelevant to the ongoing task remains an open question. The ubiquitous nature of semantic information suggests that it could bias attention even when these properties are irrelevant. In the present study, three objects appeared on screen, two of which were semantically related. After a varying time interval, a target or distractor appeared on top of each object. The objects' semantic relationships never predicted the target location. Despite this, a semantic bias on attentional allocation was observed, with an initial, transient bias to semantically related objects. Further experiments demonstrated that this effect was contingent on the objects being attended: if an object never contained the target, it no longer exerted a semantic influence. In a final set of experiments, we demonstrated that the semantic bias is robust and appears even in the presence of more predictive cues (spatial probability). These results suggest that as long as an object is attended, its semantic properties bias attention, even if it is irrelevant to an ongoing task and if more predictive factors are available.

  16. Drug knowledge expressed as computable semantic triples.

    PubMed

    Elkin, Peter L; Carter, John S; Nabar, Manasi; Tuttle, Mark; Lincoln, Michael; Brown, Steven H

    2011-01-01

    The majority of questions that arise in the practice of medicine relate to drug information. Additionally, adverse reactions account for as many as 98,000 deaths per year in the United States. Adverse drug reactions account for a significant portion of those errors. Many authors believe that clinical decision support associated with computerized physician order entry has the potential to decrease this adverse drug event rate. This decision support requires knowledge to drive the process. One important and rich source of drug knowledge is the DailyMed product labels. In this project we used computationally extracted SNOMED CT™ codified data associated with each section of each product label as input to a rules engine that created computable assertional knowledge in the form of semantic triples. These are expressed in the form of "Drug" HasIndication "SNOMED CT™". The information density of drug labels is deep, broad and quite substantial. By providing a computable form of this information content from drug labels we make these important axioms (facts) more accessible to computer programs designed to support improved care.

  17. Bridging semantics and syntax with graph algorithms—state-of-the-art of extracting biomedical relations

    PubMed Central

    Uzuner, Özlem; Szolovits, Peter

    2017-01-01

    Research on extracting biomedical relations has received growing attention recently, with numerous biological and clinical applications including those in pharmacogenomics, clinical trial screening and adverse drug reaction detection. The ability to accurately capture both semantic and syntactic structures in text expressing these relations becomes increasingly critical to enable deep understanding of scientific papers and clinical narratives. Shared task challenges have been organized by both bioinformatics and clinical informatics communities to assess and advance the state-of-the-art research. Significant progress has been made in algorithm development and resource construction. In particular, graph-based approaches bridge semantics and syntax, often achieving the best performance in shared tasks. However, a number of problems at the frontiers of biomedical relation extraction continue to pose interesting challenges and present opportunities for great improvement and fruitful research. In this article, we place biomedical relation extraction against the backdrop of its versatile applications, present a gentle introduction to its general pipeline and shared resources, review the current state-of-the-art in methodology advancement, discuss limitations and point out several promising future directions. PMID:26851224

  18. The Role of Shape in Semantic Memory Organization of Objects: An Experimental Study Using PI-Release.

    PubMed

    van Weelden, Lisanne; Schilperoord, Joost; Swerts, Marc; Pecher, Diane

    2015-01-01

    Visual information contributes fundamentally to the process of object categorization. The present study investigated whether the degree of activation of visual information in this process is dependent on the contextual relevance of this information. We used the Proactive Interference (PI-release) paradigm. In four experiments, we manipulated the information by which objects could be categorized and subsequently be retrieved from memory. The pattern of PI-release showed that if objects could be stored and retrieved both by (non-perceptual) semantic and (perceptual) shape information, then shape information was overruled by semantic information. If, however, semantic information could not be (satisfactorily) used to store and retrieve objects, then objects were stored in memory in terms of their shape. The latter effect was found to be strongest for objects from identical semantic categories.

  19. Bim-Based Indoor Path Planning Considering Obstacles

    NASA Astrophysics Data System (ADS)

    Xu, M.; Wei, S.; Zlatanova, S.; Zhang, R.

    2017-09-01

    At present, 87 % of people's activities are in indoor environment; indoor navigation has become a research issue. As the building structures for people's daily life are more and more complex, many obstacles influence humans' moving. Therefore it is essential to provide an accurate and efficient indoor path planning. Nowadays there are many challenges and problems in indoor navigation. Most existing path planning approaches are based on 2D plans, pay more attention to the geometric configuration of indoor space, often ignore rich semantic information of building components, and mostly consider simple indoor layout without taking into account the furniture. Addressing the above shortcomings, this paper uses BIM (IFC) as the input data and concentrates on indoor navigation considering obstacles in the multi-floor buildings. After geometric and semantic information are extracted, 2D and 3D space subdivision methods are adopted to build the indoor navigation network and to realize a path planning that avoids obstacles. The 3D space subdivision is based on triangular prism. The two approaches are verified by the experiments.

  20. Clinical information modeling processes for semantic interoperability of electronic health records: systematic review and inductive analysis.

    PubMed

    Moreno-Conde, Alberto; Moner, David; Cruz, Wellington Dimas da; Santos, Marcelo R; Maldonado, José Alberto; Robles, Montserrat; Kalra, Dipak

    2015-07-01

    This systematic review aims to identify and compare the existing processes and methodologies that have been published in the literature for defining clinical information models (CIMs) that support the semantic interoperability of electronic health record (EHR) systems. Following the preferred reporting items for systematic reviews and meta-analyses systematic review methodology, the authors reviewed published papers between 2000 and 2013 that covered that semantic interoperability of EHRs, found by searching the PubMed, IEEE Xplore, and ScienceDirect databases. Additionally, after selection of a final group of articles, an inductive content analysis was done to summarize the steps and methodologies followed in order to build CIMs described in those articles. Three hundred and seventy-eight articles were screened and thirty six were selected for full review. The articles selected for full review were analyzed to extract relevant information for the analysis and characterized according to the steps the authors had followed for clinical information modeling. Most of the reviewed papers lack a detailed description of the modeling methodologies used to create CIMs. A representative example is the lack of description related to the definition of terminology bindings and the publication of the generated models. However, this systematic review confirms that most clinical information modeling activities follow very similar steps for the definition of CIMs. Having a robust and shared methodology could improve their correctness, reliability, and quality. Independently of implementation technologies and standards, it is possible to find common patterns in methods for developing CIMs, suggesting the viability of defining a unified good practice methodology to be used by any clinical information modeler. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  1. The Role of Simple Semantics in the Process of Artificial Grammar Learning

    ERIC Educational Resources Information Center

    Öttl, Birgit; Jäger, Gerhard; Kaup, Barbara

    2017-01-01

    This study investigated the effect of semantic information on artificial grammar learning (AGL). Recursive grammars of different complexity levels (regular language, mirror language, copy language) were investigated in a series of AGL experiments. In the with-semantics condition, participants acquired semantic information prior to the AGL…

  2. Adapting Semantic Natural Language Processing Technology to Address Information Overload in Influenza Epidemic Management

    PubMed Central

    Keselman, Alla; Rosemblat, Graciela; Kilicoglu, Halil; Fiszman, Marcelo; Jin, Honglan; Shin, Dongwook; Rindflesch, Thomas C.

    2013-01-01

    Explosion of disaster health information results in information overload among response professionals. The objective of this project was to determine the feasibility of applying semantic natural language processing (NLP) technology to addressing this overload. The project characterizes concepts and relationships commonly used in disaster health-related documents on influenza pandemics, as the basis for adapting an existing semantic summarizer to the domain. Methods include human review and semantic NLP analysis of a set of relevant documents. This is followed by a pilot-test in which two information specialists use the adapted application for a realistic information seeking task. According to the results, the ontology of influenza epidemics management can be described via a manageable number of semantic relationships that involve concepts from a limited number of semantic types. Test users demonstrate several ways to engage with the application to obtain useful information. This suggests that existing semantic NLP algorithms can be adapted to support information summarization and visualization in influenza epidemics and other disaster health areas. However, additional research is needed in the areas of terminology development (as many relevant relationships and terms are not part of existing standardized vocabularies), NLP, and user interface design. PMID:24311971

  3. Semantic and phonological information in sentence recall: converging psycholinguistic and neuropsychological evidence.

    PubMed

    Schweppe, Judith; Rummer, Ralf; Bormann, Tobias; Martin, Randi C

    2011-12-01

    We present one experiment and a neuropsychological case study to investigate to what extent phonological and semantic representations contribute to short-term sentence recall. We modified Potter and Lombardi's (1990) intrusion paradigm, in which retention of a list interferes with sentence recall such that on the list a semantically related lure is presented, which is expected to intrude into sentence recall. In our version, lure words are either semantically related to target words in the sentence or semantically plus phonologically related. With healthy participants, intrusions are more frequent when lure and target overlap phonologically in addition to semantically than when they solely overlap semantically. When this paradigm is applied to a patient with a phonological short-term memory impairment, both lure types induce the same amount of intrusions. These findings indicate that usually phonological information is retained in sentence recall in addition to semantic information.

  4. New approach for cognitive analysis and understanding of medical patterns and visualizations

    NASA Astrophysics Data System (ADS)

    Ogiela, Marek R.; Tadeusiewicz, Ryszard

    2003-11-01

    This paper presents new opportunities for applying linguistic description of the picture merit content and AI methods to undertake tasks of the automatic understanding of images semantics in intelligent medical information systems. A successful obtaining of the crucial semantic content of the medical image may contribute considerably to the creation of new intelligent multimedia cognitive medical systems. Thanks to the new idea of cognitive resonance between stream of the data extracted from the image using linguistic methods and expectations taken from the representaion of the medical knowledge, it is possible to understand the merit content of the image even if teh form of the image is very different from any known pattern. This article proves that structural techniques of artificial intelligence may be applied in the case of tasks related to automatic classification and machine perception based on semantic pattern content in order to determine the semantic meaning of the patterns. In the paper are described some examples presenting ways of applying such techniques in the creation of cognitive vision systems for selected classes of medical images. On the base of scientific research described in the paper we try to build some new systems for collecting, storing, retrieving and intelligent interpreting selected medical images especially obtained in radiological and MRI examinations.

  5. Knowledge representation and management: towards an integration of a semantic web in daily health practice.

    PubMed

    Griffon, N; Charlet, J; Darmoni, Sj

    2013-01-01

    To summarize the best papers in the field of Knowledge Representation and Management (KRM). A synopsis of the four selected articles for the IMIA Yearbook 2013 KRM section is provided, as well as highlights of current KRM trends, in particular, of the semantic web in daily health practice. The manual selection was performed in three stages: first a set of 3,106 articles, then a second set of 86 articles followed by a third set of 15 articles, and finally the last set of four chosen articles. Among the four selected articles (see Table 1), one focuses on knowledge engineering to prevent adverse drug events; the objective of the second is to propose mappings between clinical archetypes and SNOMED CT in the context of clinical practice; the third presents an ontology to create a question-answering system; the fourth describes a biomonitoring network based on semantic web technologies. These four articles clearly indicate that the health semantic web has become a part of daily practice of health professionals since 2012. In the review of the second set of 86 articles, the same topics included in the previous IMIA yearbook remain active research fields: Knowledge extraction, automatic indexing, information retrieval, natural language processing, management of health terminologies and ontologies.

  6. Unsupervised user similarity mining in GSM sensor networks.

    PubMed

    Shad, Shafqat Ali; Chen, Enhong

    2013-01-01

    Mobility data has attracted the researchers for the past few years because of its rich context and spatiotemporal nature, where this information can be used for potential applications like early warning system, route prediction, traffic management, advertisement, social networking, and community finding. All the mentioned applications are based on mobility profile building and user trend analysis, where mobility profile building is done through significant places extraction, user's actual movement prediction, and context awareness. However, significant places extraction and user's actual movement prediction for mobility profile building are a trivial task. In this paper, we present the user similarity mining-based methodology through user mobility profile building by using the semantic tagging information provided by user and basic GSM network architecture properties based on unsupervised clustering approach. As the mobility information is in low-level raw form, our proposed methodology successfully converts it to a high-level meaningful information by using the cell-Id location information rather than previously used location capturing methods like GPS, Infrared, and Wifi for profile mining and user similarity mining.

  7. Overlaid caption extraction in news video based on SVM

    NASA Astrophysics Data System (ADS)

    Liu, Manman; Su, Yuting; Ji, Zhong

    2007-11-01

    Overlaid caption in news video often carries condensed semantic information which is key cues for content-based video indexing and retrieval. However, it is still a challenging work to extract caption from video because of its complex background and low resolution. In this paper, we propose an effective overlaid caption extraction approach for news video. We first scan the video key frames using a small window, and then classify the blocks into the text and non-text ones via support vector machine (SVM), with statistical features extracted from the gray level co-occurrence matrices, the LH and HL sub-bands wavelet coefficients and the orientated edge intensity ratios. Finally morphological filtering and projection profile analysis are employed to localize and refine the candidate caption regions. Experiments show its high performance on four 30-minute news video programs.

  8. SemEHR: A general-purpose semantic search system to surface semantic data from clinical notes for tailored care, trial recruitment, and clinical research.

    PubMed

    Wu, Honghan; Toti, Giulia; Morley, Katherine I; Ibrahim, Zina M; Folarin, Amos; Jackson, Richard; Kartoglu, Ismail; Agrawal, Asha; Stringer, Clive; Gale, Darren; Gorrell, Genevieve; Roberts, Angus; Broadbent, Matthew; Stewart, Robert; Dobson, Richard J B

    2018-05-01

    Unlocking the data contained within both structured and unstructured components of electronic health records (EHRs) has the potential to provide a step change in data available for secondary research use, generation of actionable medical insights, hospital management, and trial recruitment. To achieve this, we implemented SemEHR, an open source semantic search and analytics tool for EHRs. SemEHR implements a generic information extraction (IE) and retrieval infrastructure by identifying contextualized mentions of a wide range of biomedical concepts within EHRs. Natural language processing annotations are further assembled at the patient level and extended with EHR-specific knowledge to generate a timeline for each patient. The semantic data are serviced via ontology-based search and analytics interfaces. SemEHR has been deployed at a number of UK hospitals, including the Clinical Record Interactive Search, an anonymized replica of the EHR of the UK South London and Maudsley National Health Service Foundation Trust, one of Europe's largest providers of mental health services. In 2 Clinical Record Interactive Search-based studies, SemEHR achieved 93% (hepatitis C) and 99% (HIV) F-measure results in identifying true positive patients. At King's College Hospital in London, as part of the CogStack program (github.com/cogstack), SemEHR is being used to recruit patients into the UK Department of Health 100 000 Genomes Project (genomicsengland.co.uk). The validation study suggests that the tool can validate previously recruited cases and is very fast at searching phenotypes; time for recruitment criteria checking was reduced from days to minutes. Validated on open intensive care EHR data, Medical Information Mart for Intensive Care III, the vital signs extracted by SemEHR can achieve around 97% accuracy. Results from the multiple case studies demonstrate SemEHR's efficiency: weeks or months of work can be done within hours or minutes in some cases. SemEHR provides a more comprehensive view of patients, bringing in more and unexpected insight compared to study-oriented bespoke IE systems. SemEHR is open source, available at https://github.com/CogStack/SemEHR.

  9. Sentiment Analysis Using Common-Sense and Context Information

    PubMed Central

    Mittal, Namita; Bansal, Pooja; Garg, Sonal

    2015-01-01

    Sentiment analysis research has been increasing tremendously in recent times due to the wide range of business and social applications. Sentiment analysis from unstructured natural language text has recently received considerable attention from the research community. In this paper, we propose a novel sentiment analysis model based on common-sense knowledge extracted from ConceptNet based ontology and context information. ConceptNet based ontology is used to determine the domain specific concepts which in turn produced the domain specific important features. Further, the polarities of the extracted concepts are determined using the contextual polarity lexicon which we developed by considering the context information of a word. Finally, semantic orientations of domain specific features of the review document are aggregated based on the importance of a feature with respect to the domain. The importance of the feature is determined by the depth of the feature in the ontology. Experimental results show the effectiveness of the proposed methods. PMID:25866505

  10. Sentiment analysis using common-sense and context information.

    PubMed

    Agarwal, Basant; Mittal, Namita; Bansal, Pooja; Garg, Sonal

    2015-01-01

    Sentiment analysis research has been increasing tremendously in recent times due to the wide range of business and social applications. Sentiment analysis from unstructured natural language text has recently received considerable attention from the research community. In this paper, we propose a novel sentiment analysis model based on common-sense knowledge extracted from ConceptNet based ontology and context information. ConceptNet based ontology is used to determine the domain specific concepts which in turn produced the domain specific important features. Further, the polarities of the extracted concepts are determined using the contextual polarity lexicon which we developed by considering the context information of a word. Finally, semantic orientations of domain specific features of the review document are aggregated based on the importance of a feature with respect to the domain. The importance of the feature is determined by the depth of the feature in the ontology. Experimental results show the effectiveness of the proposed methods.

  11. Semantic annotation of consumer health questions.

    PubMed

    Kilicoglu, Halil; Ben Abacha, Asma; Mrabet, Yassine; Shooshan, Sonya E; Rodriguez, Laritza; Masterton, Kate; Demner-Fushman, Dina

    2018-02-06

    Consumers increasingly use online resources for their health information needs. While current search engines can address these needs to some extent, they generally do not take into account that most health information needs are complex and can only fully be expressed in natural language. Consumer health question answering (QA) systems aim to fill this gap. A major challenge in developing consumer health QA systems is extracting relevant semantic content from the natural language questions (question understanding). To develop effective question understanding tools, question corpora semantically annotated for relevant question elements are needed. In this paper, we present a two-part consumer health question corpus annotated with several semantic categories: named entities, question triggers/types, question frames, and question topic. The first part (CHQA-email) consists of relatively long email requests received by the U.S. National Library of Medicine (NLM) customer service, while the second part (CHQA-web) consists of shorter questions posed to MedlinePlus search engine as queries. Each question has been annotated by two annotators. The annotation methodology is largely the same between the two parts of the corpus; however, we also explain and justify the differences between them. Additionally, we provide information about corpus characteristics, inter-annotator agreement, and our attempts to measure annotation confidence in the absence of adjudication of annotations. The resulting corpus consists of 2614 questions (CHQA-email: 1740, CHQA-web: 874). Problems are the most frequent named entities, while treatment and general information questions are the most common question types. Inter-annotator agreement was generally modest: question types and topics yielded highest agreement, while the agreement for more complex frame annotations was lower. Agreement in CHQA-web was consistently higher than that in CHQA-email. Pairwise inter-annotator agreement proved most useful in estimating annotation confidence. To our knowledge, our corpus is the first focusing on annotation of uncurated consumer health questions. It is currently used to develop machine learning-based methods for question understanding. We make the corpus publicly available to stimulate further research on consumer health QA.

  12. Automated Classification of Heritage Buildings for As-Built Bim Using Machine Learning Techniques

    NASA Astrophysics Data System (ADS)

    Bassier, M.; Vergauwen, M.; Van Genechten, B.

    2017-08-01

    Semantically rich three dimensional models such as Building Information Models (BIMs) are increasingly used in digital heritage. They provide the required information to varying stakeholders during the different stages of the historic buildings life cyle which is crucial in the conservation process. The creation of as-built BIM models is based on point cloud data. However, manually interpreting this data is labour intensive and often leads to misinterpretations. By automatically classifying the point cloud, the information can be proccesed more effeciently. A key aspect in this automated scan-to-BIM process is the classification of building objects. In this research we look to automatically recognise elements in existing buildings to create compact semantic information models. Our algorithm efficiently extracts the main structural components such as floors, ceilings, roofs, walls and beams despite the presence of significant clutter and occlusions. More specifically, Support Vector Machines (SVM) are proposed for the classification. The algorithm is evaluated using real data of a variety of existing buildings. The results prove that the used classifier recognizes the objects with both high precision and recall. As a result, entire data sets are reliably labelled at once. The approach enables experts to better document and process heritage assets.

  13. Using Eye Tracking to Investigate Semantic and Spatial Representations of Scientific Diagrams During Text-Diagram Integration

    NASA Astrophysics Data System (ADS)

    Jian, Yu-Cin; Wu, Chao-Jung

    2015-02-01

    We investigated strategies used by readers when reading a science article with a diagram and assessed whether semantic and spatial representations were constructed while reading the diagram. Seventy-one undergraduate participants read a scientific article while tracking their eye movements and then completed a reading comprehension test. Our results showed that the text-diagram referencing strategy was commonly used. However, some readers adopted other reading strategies, such as reading the diagram or text first. We found all readers who had referred to the diagram spent roughly the same amount of time reading and performed equally well. However, some participants who ignored the diagram performed more poorly on questions that tested understanding of basic facts. This result indicates that dual coding theory may be a possible theory to explain the phenomenon. Eye movement patterns indicated that at least some readers had extracted semantic information of the scientific terms when first looking at the diagram. Readers who read the scientific terms on the diagram first tended to spend less time looking at the same terms in the text, which they read after. Besides, presented clear diagrams can help readers process both semantic and spatial information, thereby facilitating an overall understanding of the article. In addition, although text-first and diagram-first readers spent similar total reading time on the text and diagram parts of the article, respectively, text-first readers had significantly less number of saccades of text and diagram than diagram-first readers. This result might be explained as text-directed reading.

  14. Digital Investigations of AN Archaeological Smart Point Cloud: a Real Time Web-Based Platform to Manage the Visualisation of Semantical Queries

    NASA Astrophysics Data System (ADS)

    Poux, F.; Neuville, R.; Hallot, P.; Van Wersch, L.; Luczfalvy Jancsó, A.; Billen, R.

    2017-05-01

    While virtual copies of the real world tend to be created faster than ever through point clouds and derivatives, their working proficiency by all professionals' demands adapted tools to facilitate knowledge dissemination. Digital investigations are changing the way cultural heritage researchers, archaeologists, and curators work and collaborate to progressively aggregate expertise through one common platform. In this paper, we present a web application in a WebGL framework accessible on any HTML5-compatible browser. It allows real time point cloud exploration of the mosaics in the Oratory of Germigny-des-Prés, and emphasises the ease of use as well as performances. Our reasoning engine is constructed over a semantically rich point cloud data structure, where metadata has been injected a priori. We developed a tool that directly allows semantic extraction and visualisation of pertinent information for the end users. It leads to efficient communication between actors by proposing optimal 3D viewpoints as a basis on which interactions can grow.

  15. Preservation of Person-Specific Semantic Knowledge in Semantic Dementia: Does Direct Personal Experience Have a Specific Role?

    PubMed Central

    Péron, Julie A.; Piolino, Pascale; Moal-Boursiquot, Sandrine Le; Biseul, Isabelle; Leray, Emmanuelle; Bon, Laetitia; Desgranges, Béatrice; Eustache, Francis; Belliard, Serge

    2015-01-01

    Semantic dementia patients seem to have better knowledge of information linked to the self. More specifically, despite having severe semantic impairment, these patients show that they have more general information about the people they know personally by direct experience than they do about other individuals they know indirectly. However, the role of direct personal experience remains debated because of confounding factors such as frequency, recency of exposure, and affective relevance. We performed an exploratory study comparing the performance of five semantic dementia patients with that of 10 matched healthy controls on the recognition (familiarity judgment) and identification (biographic information recall) of personally familiar names vs. famous names. As expected, intergroup comparisons indicated a semantic breakdown in semantic dementia patients as compared with healthy controls. Moreover, unlike healthy controls, the semantic dementia patients recognized and identified personally familiar names better than they did famous names. This pattern of results suggests that direct personal experience indeed plays a specific role in the relative preservation of person-specific semantic meaning in semantic dementia. We discuss the role of direct personal experience on the preservation of semantic knowledge and the potential neurophysiological mechanisms underlying these processes. PMID:26635578

  16. Semantic and episodic memory of music are subserved by distinct neural networks.

    PubMed

    Platel, Hervé; Baron, Jean-Claude; Desgranges, Béatrice; Bernard, Frédéric; Eustache, Francis

    2003-09-01

    Numerous functional imaging studies have shown that retrieval from semantic and episodic memory is subserved by distinct neural networks. However, these results were essentially obtained with verbal and visuospatial material. The aim of this work was to determine the neural substrates underlying the semantic and episodic components of music using familiar and nonfamiliar melodic tunes. To study musical semantic memory, we designed a task in which the instruction was to judge whether or not the musical extract was felt as "familiar." To study musical episodic memory, we constructed two delayed recognition tasks, one containing only familiar and the other only nonfamiliar items. For each recognition task, half of the extracts (targets) were presented in the prior semantic task. The episodic and semantic tasks were to be contrasted by a comparison to two perceptive control tasks and to one another. Cerebral blood flow was assessed by means of the oxygen-15-labeled water injection method, using high-resolution PET. Distinct patterns of activations were found. First, regarding the episodic memory condition, bilateral activations of the middle and superior frontal gyri and precuneus (more prominent on the right side) were observed. Second, the semantic memory condition disclosed extensive activations in the medial and orbital frontal cortex bilaterally, the left angular gyrus, and predominantly the left anterior part of the middle temporal gyri. The findings from this study are discussed in light of the available neuropsychological data obtained in brain-damaged subjects and functional neuroimaging studies.

  17. Contextually guided very-high-resolution imagery classification with semantic segments

    NASA Astrophysics Data System (ADS)

    Zhao, Wenzhi; Du, Shihong; Wang, Qiao; Emery, William J.

    2017-10-01

    Contextual information, revealing relationships and dependencies between image objects, is one of the most important information for the successful interpretation of very-high-resolution (VHR) remote sensing imagery. Over the last decade, geographic object-based image analysis (GEOBIA) technique has been widely used to first divide images into homogeneous parts, and then to assign semantic labels according to the properties of image segments. However, due to the complexity and heterogeneity of VHR images, segments without semantic labels (i.e., semantic-free segments) generated with low-level features often fail to represent geographic entities (such as building roofs usually be partitioned into chimney/antenna/shadow parts). As a result, it is hard to capture contextual information across geographic entities when using semantic-free segments. In contrast to low-level features, "deep" features can be used to build robust segments with accurate labels (i.e., semantic segments) in order to represent geographic entities at higher levels. Based on these semantic segments, semantic graphs can be constructed to capture contextual information in VHR images. In this paper, semantic segments were first explored with convolutional neural networks (CNN) and a conditional random field (CRF) model was then applied to model the contextual information between semantic segments. Experimental results on two challenging VHR datasets (i.e., the Vaihingen and Beijing scenes) indicate that the proposed method is an improvement over existing image classification techniques in classification performance (overall accuracy ranges from 82% to 96%).

  18. Alpha Oscillations during Incidental Encoding Predict Subsequent Memory for New "Foil" Information.

    PubMed

    Vogelsang, David A; Gruber, Matthias; Bergström, Zara M; Ranganath, Charan; Simons, Jon S

    2018-05-01

    People can employ adaptive strategies to increase the likelihood that previously encoded information will be successfully retrieved. One such strategy is to constrain retrieval toward relevant information by reimplementing the neurocognitive processes that were engaged during encoding. Using EEG, we examined the temporal dynamics with which constraining retrieval toward semantic versus nonsemantic information affects the processing of new "foil" information encountered during a memory test. Time-frequency analysis of EEG data acquired during an initial study phase revealed that semantic compared with nonsemantic processing was associated with alpha decreases in a left frontal electrode cluster from around 600 msec after stimulus onset. Successful encoding of semantic versus nonsemantic foils during a subsequent memory test was related to decreases in alpha oscillatory activity in the same left frontal electrode cluster, which emerged relatively late in the trial at around 1000-1600 msec after stimulus onset. Across participants, left frontal alpha power elicited by semantic processing during the study phase correlated significantly with left frontal alpha power associated with semantic foil encoding during the memory test. Furthermore, larger left frontal alpha power decreases elicited by semantic foil encoding during the memory test predicted better subsequent semantic foil recognition in an additional surprise foil memory test, although this effect did not reach significance. These findings indicate that constraining retrieval toward semantic information involves reimplementing semantic encoding operations that are mediated by alpha oscillations and that such reimplementation occurs at a late stage of memory retrieval, perhaps reflecting additional monitoring processes.

  19. Residual Shuffling Convolutional Neural Networks for Deep Semantic Image Segmentation Using Multi-Modal Data

    NASA Astrophysics Data System (ADS)

    Chen, K.; Weinmann, M.; Gao, X.; Yan, M.; Hinz, S.; Jutzi, B.; Weinmann, M.

    2018-05-01

    In this paper, we address the deep semantic segmentation of aerial imagery based on multi-modal data. Given multi-modal data composed of true orthophotos and the corresponding Digital Surface Models (DSMs), we extract a variety of hand-crafted radiometric and geometric features which are provided separately and in different combinations as input to a modern deep learning framework. The latter is represented by a Residual Shuffling Convolutional Neural Network (RSCNN) combining the characteristics of a Residual Network with the advantages of atrous convolution and a shuffling operator to achieve a dense semantic labeling. Via performance evaluation on a benchmark dataset, we analyze the value of different feature sets for the semantic segmentation task. The derived results reveal that the use of radiometric features yields better classification results than the use of geometric features for the considered dataset. Furthermore, the consideration of data on both modalities leads to an improvement of the classification results. However, the derived results also indicate that the use of all defined features is less favorable than the use of selected features. Consequently, data representations derived via feature extraction and feature selection techniques still provide a gain if used as the basis for deep semantic segmentation.

  20. Vaccine adverse event text mining system for extracting features from vaccine safety reports.

    PubMed

    Botsis, Taxiarchis; Buttolph, Thomas; Nguyen, Michael D; Winiecki, Scott; Woo, Emily Jane; Ball, Robert

    2012-01-01

    To develop and evaluate a text mining system for extracting key clinical features from vaccine adverse event reporting system (VAERS) narratives to aid in the automated review of adverse event reports. Based upon clinical significance to VAERS reviewing physicians, we defined the primary (diagnosis and cause of death) and secondary features (eg, symptoms) for extraction. We built a novel vaccine adverse event text mining (VaeTM) system based on a semantic text mining strategy. The performance of VaeTM was evaluated using a total of 300 VAERS reports in three sequential evaluations of 100 reports each. Moreover, we evaluated the VaeTM contribution to case classification; an information retrieval-based approach was used for the identification of anaphylaxis cases in a set of reports and was compared with two other methods: a dedicated text classifier and an online tool. The performance metrics of VaeTM were text mining metrics: recall, precision and F-measure. We also conducted a qualitative difference analysis and calculated sensitivity and specificity for classification of anaphylaxis cases based on the above three approaches. VaeTM performed best in extracting diagnosis, second level diagnosis, drug, vaccine, and lot number features (lenient F-measure in the third evaluation: 0.897, 0.817, 0.858, 0.874, and 0.914, respectively). In terms of case classification, high sensitivity was achieved (83.1%); this was equal and better compared to the text classifier (83.1%) and the online tool (40.7%), respectively. Our VaeTM implementation of a semantic text mining strategy shows promise in providing accurate and efficient extraction of key features from VAERS narratives.

  1. OpenDMAP: An open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression

    PubMed Central

    Hunter, Lawrence; Lu, Zhiyong; Firby, James; Baumgartner, William A; Johnson, Helen L; Ogren, Philip V; Cohen, K Bretonnel

    2008-01-01

    Background Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering. Results OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from .26 – .72 (precision .39 – .85, recall .16 – .85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances. Conclusion OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction. The open source OpenDMAP code library is freely available at PMID:18237434

  2. Context-Aware Adaptive Hybrid Semantic Relatedness in Biomedical Science

    NASA Astrophysics Data System (ADS)

    Emadzadeh, Ehsan

    Text mining of biomedical literature and clinical notes is a very active field of research in biomedical science. Semantic analysis is one of the core modules for different Natural Language Processing (NLP) solutions. Methods for calculating semantic relatedness of two concepts can be very useful in solutions solving different problems such as relationship extraction, ontology creation and question / answering [1--6]. Several techniques exist in calculating semantic relatedness of two concepts. These techniques utilize different knowledge sources and corpora. So far, researchers attempted to find the best hybrid method for each domain by combining semantic relatedness techniques and data sources manually. In this work, attempts were made to eliminate the needs for manually combining semantic relatedness methods targeting any new contexts or resources through proposing an automated method, which attempted to find the best combination of semantic relatedness techniques and resources to achieve the best semantic relatedness score in every context. This may help the research community find the best hybrid method for each context considering the available algorithms and resources.

  3. Organizing Diverse, Distributed Project Information

    NASA Technical Reports Server (NTRS)

    Keller, Richard M.

    2003-01-01

    SemanticOrganizer is a software application designed to organize and integrate information generated within a distributed organization or as part of a project that involves multiple, geographically dispersed collaborators. SemanticOrganizer incorporates the capabilities of database storage, document sharing, hypermedia navigation, and semantic-interlinking into a system that can be customized to satisfy the specific information-management needs of different user communities. The program provides a centralized repository of information that is both secure and accessible to project collaborators via the World Wide Web. SemanticOrganizer's repository can be used to collect diverse information (including forms, documents, notes, data, spreadsheets, images, and sounds) from computers at collaborators work sites. The program organizes the information using a unique network-structured conceptual framework, wherein each node represents a data record that contains not only the original information but also metadata (in effect, standardized data that characterize the information). Links among nodes express semantic relationships among the data records. The program features a Web interface through which users enter, interlink, and/or search for information in the repository. By use of this repository, the collaborators have immediate access to the most recent project information, as well as to archived information. A key advantage to SemanticOrganizer is its ability to interlink information together in a natural fashion using customized terminology and concepts that are familiar to a user community.

  4. Examining the Hemispheric Distribution of Semantic Information Using Lateralised Priming of Familiar Faces

    ERIC Educational Resources Information Center

    Vladeanu, Matei; Bourne, Victoria J.

    2009-01-01

    The way in which the semantic information associated with people is organised in the brain is still unclear. Most evidence suggests either bilateral or left hemisphere lateralisation. In this paper we use a lateralised semantic priming paradigm to further examine this neuropsychological organisation. A clear semantic priming effect was found with…

  5. BeeSpace Navigator: exploratory analysis of gene function using semantic indexing of biological literature.

    PubMed

    Sen Sarma, Moushumi; Arcoleo, David; Khetani, Radhika S; Chee, Brant; Ling, Xu; He, Xin; Jiang, Jing; Mei, Qiaozhu; Zhai, ChengXiang; Schatz, Bruce

    2011-07-01

    With the rapid decrease in cost of genome sequencing, the classification of gene function is becoming a primary problem. Such classification has been performed by human curators who read biological literature to extract evidence. BeeSpace Navigator is a prototype software for exploratory analysis of gene function using biological literature. The software supports an automatic analogue of the curator process to extract functions, with a simple interface intended for all biologists. Since extraction is done on selected collections that are semantically indexed into conceptual spaces, the curation can be task specific. Biological literature containing references to gene lists from expression experiments can be analyzed to extract concepts that are computational equivalents of a classification such as Gene Ontology, yielding discriminating concepts that differentiate gene mentions from other mentions. The functions of individual genes can be summarized from sentences in biological literature, to produce results resembling a model organism database entry that is automatically computed. Statistical frequency analysis based on literature phrase extraction generates offline semantic indexes to support these gene function services. The website with BeeSpace Navigator is free and open to all; there is no login requirement at www.beespace.illinois.edu for version 4. Materials from the 2010 BeeSpace Software Training Workshop are available at www.beespace.illinois.edu/bstwmaterials.php.

  6. Real Time Filtering of Tweets Using Wikipedia Concepts and Google Tri-gram Semantic Relatedness

    DTIC Science & Technology

    2015-11-20

    Real Time Filtering of Tweets Using Wikipedia Concepts and Google Tri-gram Semantic Relatedness Anh Dang1, Raheleh Makki1, Abidalrahman Moh’d1...of a topic that the user is interested in receiving relevant posts in real-time. Our proposed approach extracts Wikipedia concepts for profiles and...group name “DALTREC”. Our proposed approach for this year’s filtering task is based on using Wikipedia and Google Trigram for calculating the semantic

  7. A novel architecture for information retrieval system based on semantic web

    NASA Astrophysics Data System (ADS)

    Zhang, Hui

    2011-12-01

    Nowadays, the web has enabled an explosive growth of information sharing (there are currently over 4 billion pages covering most areas of human endeavor) so that the web has faced a new challenge of information overhead. The challenge that is now before us is not only to help people locating relevant information precisely but also to access and aggregate a variety of information from different resources automatically. Current web document are in human-oriented formats and they are suitable for the presentation, but machines cannot understand the meaning of document. To address this issue, Berners-Lee proposed a concept of semantic web. With semantic web technology, web information can be understood and processed by machine. It provides new possibilities for automatic web information processing. A main problem of semantic web information retrieval is that when these is not enough knowledge to such information retrieval system, the system will return to a large of no sense result to uses due to a huge amount of information results. In this paper, we present the architecture of information based on semantic web. In addiction, our systems employ the inference Engine to check whether the query should pose to Keyword-based Search Engine or should pose to the Semantic Search Engine.

  8. Insights from child development on the relationship between episodic and semantic memory.

    PubMed

    Robertson, Erin K; Köhler, Stefan

    2007-11-05

    The present study was motivated by a recent controversy in the neuropsychological literature on semantic dementia as to whether episodic encoding requires semantic processing or whether it can proceed solely based on perceptual processing. We addressed this issue by examining the effect of age-related limitations in semantic competency on episodic memory in 4-6-year-old children (n=67). We administered three different forced-choice recognition memory tests for pictures previously encountered in a single study episode. The tests varied in the degree to which access to semantically encoded information was required at retrieval. Semantic competency predicted recognition performance regardless of whether access to semantic information was required. A direct relation between picture naming at encoding and subsequent recognition was also found for all tests. Our findings emphasize the importance of semantic encoding processes even in retrieval situations that purportedly do not require access to semantic information. They also highlight the importance of testing neuropsychological models of memory in different populations, healthy and brain damaged, at both ends of the developmental continuum.

  9. Phase synchronization of delta and theta oscillations increase during the detection of relevant lexical information.

    PubMed

    Brunetti, Enzo; Maldonado, Pedro E; Aboitiz, Francisco

    2013-01-01

    During monitoring of the discourse, the detection of the relevance of incoming lexical information could be critical for its incorporation to update mental representations in memory. Because, in these situations, the relevance for lexical information is defined by abstract rules that are maintained in memory, a central aspect to elucidate is how an abstract level of knowledge maintained in mind mediates the detection of the lower-level semantic information. In the present study, we propose that neuronal oscillations participate in the detection of relevant lexical information, based on "kept in mind" rules deriving from more abstract semantic information. We tested our hypothesis using an experimental paradigm that restricted the detection of relevance to inferences based on explicit information, thus controlling for ambiguities derived from implicit aspects. We used a categorization task, in which the semantic relevance was previously defined based on the congruency between a kept in mind category (abstract knowledge), and the lexical semantic information presented. Our results show that during the detection of the relevant lexical information, phase synchronization of neuronal oscillations selectively increases in delta and theta frequency bands during the interval of semantic analysis. These increments occurred irrespective of the semantic category maintained in memory, had a temporal profile specific for each subject, and were mainly induced, as they had no effect on the evoked mean global field power. Also, recruitment of an increased number of pairs of electrodes was a robust observation during the detection of semantic contingent words. These results are consistent with the notion that the detection of relevant lexical information based on a particular semantic rule, could be mediated by increasing the global phase synchronization of neuronal oscillations, which may contribute to the recruitment of an extended number of cortical regions.

  10. Extracting the Textual and Temporal Structure of Supercomputing Logs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jain, S; Singh, I; Chandra, A

    2009-05-26

    Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource of information about their operational status and health. However, their massive size, complexity, and lack of standard format makes it difficult to automatically extract information that can be used to improve system management. In this work we propose a novel method to succinctly represent the contents of supercomputing logs, by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an onlinemore » clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique features nearly perfect accuracy for online log-classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.« less

  11. A model-driven approach for representing clinical archetypes for Semantic Web environments.

    PubMed

    Martínez-Costa, Catalina; Menárguez-Tortosa, Marcos; Fernández-Breis, Jesualdo Tomás; Maldonado, José Alberto

    2009-02-01

    The life-long clinical information of any person supported by electronic means configures his Electronic Health Record (EHR). This information is usually distributed among several independent and heterogeneous systems that may be syntactically or semantically incompatible. There are currently different standards for representing and exchanging EHR information among different systems. In advanced EHR approaches, clinical information is represented by means of archetypes. Most of these approaches use the Archetype Definition Language (ADL) to specify archetypes. However, ADL has some drawbacks when attempting to perform semantic activities in Semantic Web environments. In this work, Semantic Web technologies are used to specify clinical archetypes for advanced EHR architectures. The advantages of using the Ontology Web Language (OWL) instead of ADL are described and discussed in this work. Moreover, a solution combining Semantic Web and Model-driven Engineering technologies is proposed to transform ADL into OWL for the CEN EN13606 EHR architecture.

  12. You shall know an object by the company it keeps: An investigation of semantic representations derived from object co-occurrence in visual scenes.

    PubMed

    Sadeghi, Zahra; McClelland, James L; Hoffman, Paul

    2015-09-01

    An influential position in lexical semantics holds that semantic representations for words can be derived through analysis of patterns of lexical co-occurrence in large language corpora. Firth (1957) famously summarised this principle as "you shall know a word by the company it keeps". We explored whether the same principle could be applied to non-verbal patterns of object co-occurrence in natural scenes. We performed latent semantic analysis (LSA) on a set of photographed scenes in which all of the objects present had been manually labelled. This resulted in a representation of objects in a high-dimensional space in which similarity between two objects indicated the degree to which they appeared in similar scenes. These representations revealed similarities among objects belonging to the same taxonomic category (e.g., items of clothing) as well as cross-category associations (e.g., between fruits and kitchen utensils). We also compared representations generated from this scene dataset with two established methods for elucidating semantic representations: (a) a published database of semantic features generated verbally by participants and (b) LSA applied to a linguistic corpus in the usual fashion. Statistical comparisons of the three methods indicated significant association between the structures revealed by each method, with the scene dataset displaying greater convergence with feature-based representations than did LSA applied to linguistic data. The results indicate that information about the conceptual significance of objects can be extracted from their patterns of co-occurrence in natural environments, opening the possibility for such data to be incorporated into existing models of conceptual representation. Copyright © 2014 The Authors. Published by Elsevier Ltd.. All rights reserved.

  13. Knowledge-based understanding of aerial surveillance video

    NASA Astrophysics Data System (ADS)

    Cheng, Hui; Butler, Darren

    2006-05-01

    Aerial surveillance has long been used by the military to locate, monitor and track the enemy. Recently, its scope has expanded to include law enforcement activities, disaster management and commercial applications. With the ever-growing amount of aerial surveillance video acquired daily, there is an urgent need for extracting actionable intelligence in a timely manner. Furthermore, to support high-level video understanding, this analysis needs to go beyond current approaches and consider the relationships, motivations and intentions of the objects in the scene. In this paper we propose a system for interpreting aerial surveillance videos that automatically generates a succinct but meaningful description of the observed regions, objects and events. For a given video, the semantics of important regions and objects, and the relationships between them, are summarised into a semantic concept graph. From this, a textual description is derived that provides new search and indexing options for aerial video and enables the fusion of aerial video with other information modalities, such as human intelligence, reports and signal intelligence. Using a Mixture-of-Experts video segmentation algorithm an aerial video is first decomposed into regions and objects with predefined semantic meanings. The objects are then tracked and coerced into a semantic concept graph and the graph is summarized spatially, temporally and semantically using ontology guided sub-graph matching and re-writing. The system exploits domain specific knowledge and uses a reasoning engine to verify and correct the classes, identities and semantic relationships between the objects. This approach is advantageous because misclassifications lead to knowledge contradictions and hence they can be easily detected and intelligently corrected. In addition, the graph representation highlights events and anomalies that a low-level analysis would overlook.

  14. Supervised Outlier Detection in Large-Scale Mvs Point Clouds for 3d City Modeling Applications

    NASA Astrophysics Data System (ADS)

    Stucker, C.; Richard, A.; Wegner, J. D.; Schindler, K.

    2018-05-01

    We propose to use a discriminative classifier for outlier detection in large-scale point clouds of cities generated via multi-view stereo (MVS) from densely acquired images. What makes outlier removal hard are varying distributions of inliers and outliers across a scene. Heuristic outlier removal using a specific feature that encodes point distribution often delivers unsatisfying results. Although most outliers can be identified correctly (high recall), many inliers are erroneously removed (low precision), too. This aggravates object 3D reconstruction due to missing data. We thus propose to discriminatively learn class-specific distributions directly from the data to achieve high precision. We apply a standard Random Forest classifier that infers a binary label (inlier or outlier) for each 3D point in the raw, unfiltered point cloud and test two approaches for training. In the first, non-semantic approach, features are extracted without considering the semantic interpretation of the 3D points. The trained model approximates the average distribution of inliers and outliers across all semantic classes. Second, semantic interpretation is incorporated into the learning process, i.e. we train separate inlieroutlier classifiers per semantic class (building facades, roof, ground, vegetation, fields, and water). Performance of learned filtering is evaluated on several large SfM point clouds of cities. We find that results confirm our underlying assumption that discriminatively learning inlier-outlier distributions does improve precision over global heuristics by up to ≍ 12 percent points. Moreover, semantically informed filtering that models class-specific distributions further improves precision by up to ≍ 10 percent points, being able to remove very isolated building, roof, and water points while preserving inliers on building facades and vegetation.

  15. Unconscious semantic activation depends on feature-specific attention allocation.

    PubMed

    Spruyt, Adriaan; De Houwer, Jan; Everaert, Tom; Hermans, Dirk

    2012-01-01

    We examined whether semantic activation by subliminally presented stimuli is dependent upon the extent to which participants assign attention to specific semantic stimulus features and stimulus dimensions. Participants pronounced visible target words that were preceded by briefly presented, masked prime words. Both affective and non-affective semantic congruence of the prime-target pairs were manipulated under conditions that either promoted selective attention for affective stimulus information or selective attention for non-affective semantic stimulus information. In line with our predictions, results showed that affective congruence had a clear impact on word pronunciation latencies only if participants were encouraged to assign attention to the affective stimulus dimension. In contrast, non-affective semantic relatedness of the prime-target pairs produced no priming at all. Our findings are consistent with the hypothesis that unconscious activation of (affective) semantic information is modulated by feature-specific attention allocation. Copyright © 2011 Elsevier B.V. All rights reserved.

  16. A Large-Scale Analysis of Variance in Written Language

    ERIC Educational Resources Information Center

    Johns, Brendan T.; Jamieson, Randall K.

    2018-01-01

    The collection of very large text sources has revolutionized the study of natural language, leading to the development of several models of language learning and distributional semantics that extract sophisticated semantic representations of words based on the statistical redundancies contained within natural language (e.g., Griffiths, Steyvers,…

  17. Intelligent resource discovery using ontology-based resource profiles

    NASA Technical Reports Server (NTRS)

    Hughes, J. Steven; Crichton, Dan; Kelly, Sean; Crichton, Jerry; Tran, Thuy

    2004-01-01

    Successful resource discovery across heterogeneous repositories is strongly dependent on the semantic and syntactic homogeneity of the associated resource descriptions. Ideally, resource descriptions are easily extracted from pre-existing standardized sources, expressed using standard syntactic and semantic structures, and managed and accessed within a distributed, flexible, and scaleable software framework.

  18. Interconnected growing self-organizing maps for auditory and semantic acquisition modeling.

    PubMed

    Cao, Mengxue; Li, Aijun; Fang, Qiang; Kaufmann, Emily; Kröger, Bernd J

    2014-01-01

    Based on the incremental nature of knowledge acquisition, in this study we propose a growing self-organizing neural network approach for modeling the acquisition of auditory and semantic categories. We introduce an Interconnected Growing Self-Organizing Maps (I-GSOM) algorithm, which takes associations between auditory information and semantic information into consideration, in this paper. Direct phonetic-semantic association is simulated in order to model the language acquisition in early phases, such as the babbling and imitation stages, in which no phonological representations exist. Based on the I-GSOM algorithm, we conducted experiments using paired acoustic and semantic training data. We use a cyclical reinforcing and reviewing training procedure to model the teaching and learning process between children and their communication partners. A reinforcing-by-link training procedure and a link-forgetting procedure are introduced to model the acquisition of associative relations between auditory and semantic information. Experimental results indicate that (1) I-GSOM has good ability to learn auditory and semantic categories presented within the training data; (2) clear auditory and semantic boundaries can be found in the network representation; (3) cyclical reinforcing and reviewing training leads to a detailed categorization as well as to a detailed clustering, while keeping the clusters that have already been learned and the network structure that has already been developed stable; and (4) reinforcing-by-link training leads to well-perceived auditory-semantic associations. Our I-GSOM model suggests that it is important to associate auditory information with semantic information during language acquisition. Despite its high level of abstraction, our I-GSOM approach can be interpreted as a biologically-inspired neurocomputational model.

  19. Challenges Facing the Semantic Web and Social Software as Communication Technology Agents in E-Learning Environments

    ERIC Educational Resources Information Center

    Olaniran, Bolanle A.

    2010-01-01

    The semantic web describes the process whereby information content is made available for machine consumption. With increased reliance on information communication technologies, the semantic web promises effective and efficient information acquisition and dissemination of products and services in the global economy, in particular, e-learning.…

  20. EIIS: An Educational Information Intelligent Search Engine Supported by Semantic Services

    ERIC Educational Resources Information Center

    Huang, Chang-Qin; Duan, Ru-Lin; Tang, Yong; Zhu, Zhi-Ting; Yan, Yong-Jian; Guo, Yu-Qing

    2011-01-01

    The semantic web brings a new opportunity for efficient information organization and search. To meet the special requirements of the educational field, this paper proposes an intelligent search engine enabled by educational semantic support service, where three kinds of searches are integrated into Educational Information Intelligent Search (EIIS)…

  1. Relations between Short-term Memory Deficits, Semantic Processing, and Executive Function

    PubMed Central

    Allen, Corinne M.; Martin, Randi C.; Martin, Nadine

    2012-01-01

    Background Previous research has suggested separable short-term memory (STM) buffers for the maintenance of phonological and lexical-semantic information, as some patients with aphasia show better ability to retain semantic than phonological information and others show the reverse. Recently, researchers have proposed that deficits to the maintenance of semantic information in STM are related to executive control abilities. Aims The present study investigated the relationship of executive function abilities with semantic and phonological short-term memory (STM) and semantic processing in such patients, as some previous research has suggested that semantic STM deficits and semantic processing abilities are critically related to specific or general executive function deficits. Method and Procedures 20 patients with aphasia and STM deficits were tested on measures of short-term retention, semantic processing, and both complex and simple executive function tasks. Outcome and Results In correlational analyses, we found no relation between semantic STM and performance on simple or complex executive function tasks. In contrast, phonological STM was related to executive function performance in tasks that had a verbal component, suggesting that performance in some executive function tasks depends on maintaining or rehearsing phonological codes. Although semantic STM was not related to executive function ability, performance on semantic processing tasks was related to executive function, perhaps due to similar executive task requirements in both semantic processing and executive function tasks. Conclusions Implications for treatment and interpretations of executive deficits are discussed. PMID:22736889

  2. ccML, a new mark-up language to improve ISO/EN 13606-based electronic health record extracts practical edition.

    PubMed

    Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Cáceres, Jesús; Somolinos, Roberto; Pascual, Mario; Martínez, Ignacio; Salvador, Carlos H; Monteagudo, José Luis

    2013-01-01

    The objective of this paper is to introduce a new language called ccML, designed to provide convenient pragmatic information to applications using the ISO/EN13606 reference model (RM), such as electronic health record (EHR) extracts editors. EHR extracts are presently built using the syntactic and semantic information provided in the RM and constrained by archetypes. The ccML extra information enables the automation of the medico-legal context information edition, which is over 70% of the total in an extract, without modifying the RM information. ccML is defined using a W3C XML schema file. Valid ccML files complement the RM with additional pragmatics information. The ccML language grammar is defined using formal language theory as a single-type tree grammar. The new language is tested using an EHR extracts editor application as proof-of-concept system. Seven ccML PVCodes (predefined value codes) are introduced in this grammar to cope with different realistic EHR edition situations. These seven PVCodes have different interpretation strategies, from direct look up in the ccML file itself, to more complex searches in archetypes or system precomputation. The possibility to declare generic types in ccML gives rise to ambiguity during interpretation. The criterion used to overcome ambiguity is that specificity should prevail over generality. The opposite would make the individual specific element declarations useless. A new mark-up language ccML is introduced that opens up the possibility of providing applications using the ISO/EN13606 RM with the necessary pragmatics information to be practical and realistic.

  3. E-Government Goes Semantic Web: How Administrations Can Transform Their Information Processes

    NASA Astrophysics Data System (ADS)

    Klischewski, Ralf; Ukena, Stefan

    E-government applications and services are built mainly on access to, retrieval of, integration of, and delivery of relevant information to citizens, businesses, and administrative users. In order to perform such information processing automatically through the Semantic Web,1 machine-readable2 enhancements of web resources are needed, based on the understanding of the content and context of the information in focus. While these enhancements are far from trivial to produce, administrations in their role of information and service providers so far find little guidance on how to migrate their web resources and enable a new quality of information processing; even research is still seeking best practices. Therefore, the underlying research question of this chapter is: what are the appropriate approaches which guide administrations in transforming their information processes toward the Semantic Web? In search for answers, this chapter analyzes the challenges and possible solutions from the perspective of administrations: (a) the reconstruction of the information processing in the e-government in terms of how semantic technologies must be employed to support information provision and consumption through the Semantic Web; (b) the required contribution to the transformation is compared to the capabilities and expectations of administrations; and (c) available experience with the steps of transformation are reviewed and discussed as to what extent they can be expected to successfully drive the e-government to the Semantic Web. This research builds on studying the case of Schleswig-Holstein, Germany, where semantic technologies have been used within the frame of the Access-eGov3 project in order to semantically enhance electronic service interfaces with the aim of providing a new way of accessing and combining e-government services.

  4. Scientific Datasets: Discovery and Aggregation for Semantic Interpretation.

    NASA Astrophysics Data System (ADS)

    Lopez, L. A.; Scott, S.; Khalsa, S. J. S.; Duerr, R.

    2015-12-01

    One of the biggest challenges that interdisciplinary researchers face is finding suitable datasets in order to advance their science; this problem remains consistent across multiple disciplines. A surprising number of scientists, when asked what tool they use for data discovery, reply "Google", which is an acceptable solution in some cases but not even Google can find -or cares to compile- all the data that's relevant for science and particularly geo sciences. If a dataset is not discoverable through a well known search provider it will remain dark data to the scientific world.For the past year, BCube, an EarthCube Building Block project, has been developing, testing and deploying a technology stack capable of data discovery at web-scale using the ultimate dataset: The Internet. This stack has 2 principal components, a web-scale crawling infrastructure and a semantic aggregator. The web-crawler is a modified version of Apache Nutch (the originator of Hadoop and other big data technologies) that has been improved and tailored for data and data service discovery. The second component is semantic aggregation, carried out by a python-based workflow that extracts valuable metadata and stores it in the form of triples through the use semantic technologies.While implementing the BCube stack we have run into several challenges such as a) scaling the project to cover big portions of the Internet at a reasonable cost, b) making sense of very diverse and non-homogeneous data, and lastly, c) extracting facts about these datasets using semantic technologies in order to make them usable for the geosciences community. Despite all these challenges we have proven that we can discover and characterize data that otherwise would have remained in the dark corners of the Internet. Having all this data indexed and 'triplelized' will enable scientists to access a trove of information relevant to their work in a more natural way. An important characteristic of the BCube stack is that all the code we have developed is open sourced and available to anyone who wants to experiment and collaborate with the project at: http://github.com/b-cube/

  5. Fine-grained information extraction from German transthoracic echocardiography reports.

    PubMed

    Toepfer, Martin; Corovic, Hamo; Fette, Georg; Klügl, Peter; Störk, Stefan; Puppe, Frank

    2015-11-12

    Information extraction techniques that get structured representations out of unstructured data make a large amount of clinically relevant information about patients accessible for semantic applications. These methods typically rely on standardized terminologies that guide this process. Many languages and clinical domains, however, lack appropriate resources and tools, as well as evaluations of their applications, especially if detailed conceptualizations of the domain are required. For instance, German transthoracic echocardiography reports have not been targeted sufficiently before, despite of their importance for clinical trials. This work therefore aimed at development and evaluation of an information extraction component with a fine-grained terminology that enables to recognize almost all relevant information stated in German transthoracic echocardiography reports at the University Hospital of Würzburg. A domain expert validated and iteratively refined an automatically inferred base terminology. The terminology was used by an ontology-driven information extraction system that outputs attribute value pairs. The final component has been mapped to the central elements of a standardized terminology, and it has been evaluated according to documents with different layouts. The final system achieved state-of-the-art precision (micro average.996) and recall (micro average.961) on 100 test documents that represent more than 90 % of all reports. In particular, principal aspects as defined in a standardized external terminology were recognized with f 1=.989 (micro average) and f 1=.963 (macro average). As a result of keyword matching and restraint concept extraction, the system obtained high precision also on unstructured or exceptionally short documents, and documents with uncommon layout. The developed terminology and the proposed information extraction system allow to extract fine-grained information from German semi-structured transthoracic echocardiography reports with very high precision and high recall on the majority of documents at the University Hospital of Würzburg. Extracted results populate a clinical data warehouse which supports clinical research.

  6. Towards Semantic e-Science for Traditional Chinese Medicine

    PubMed Central

    Chen, Huajun; Mao, Yuxin; Zheng, Xiaoqing; Cui, Meng; Feng, Yi; Deng, Shuiguang; Yin, Aining; Zhou, Chunying; Tang, Jinming; Jiang, Xiaohong; Wu, Zhaohui

    2007-01-01

    Background Recent advances in Web and information technologies with the increasing decentralization of organizational structures have resulted in massive amounts of information resources and domain-specific services in Traditional Chinese Medicine. The massive volume and diversity of information and services available have made it difficult to achieve seamless and interoperable e-Science for knowledge-intensive disciplines like TCM. Therefore, information integration and service coordination are two major challenges in e-Science for TCM. We still lack sophisticated approaches to integrate scientific data and services for TCM e-Science. Results We present a comprehensive approach to build dynamic and extendable e-Science applications for knowledge-intensive disciplines like TCM based on semantic and knowledge-based techniques. The semantic e-Science infrastructure for TCM supports large-scale database integration and service coordination in a virtual organization. We use domain ontologies to integrate TCM database resources and services in a semantic cyberspace and deliver a semantically superior experience including browsing, searching, querying and knowledge discovering to users. We have developed a collection of semantic-based toolkits to facilitate TCM scientists and researchers in information sharing and collaborative research. Conclusion Semantic and knowledge-based techniques are suitable to knowledge-intensive disciplines like TCM. It's possible to build on-demand e-Science system for TCM based on existing semantic and knowledge-based techniques. The presented approach in the paper integrates heterogeneous distributed TCM databases and services, and provides scientists with semantically superior experience to support collaborative research in TCM discipline. PMID:17493289

  7. The semantic pathfinder: using an authoring metaphor for generic multimedia indexing.

    PubMed

    Snoek, Cees G M; Worring, Marcel; Geusebroek, Jan-Mark; Koelma, Dennis C; Seinstra, Frank J; Smeulders, Arnold W M

    2006-10-01

    This paper presents the semantic pathfinder architecture for generic indexing of multimedia archives. The semantic pathfinder extracts semantic concepts from video by exploring different paths through three consecutive analysis steps, which we derive from the observation that produced video is the result of an authoring-driven process. We exploit this authoring metaphor for machine-driven understanding. The pathfinder starts with the content analysis step. In this analysis step, we follow a data-driven approach of indexing semantics. The style analysis step is the second analysis step. Here, we tackle the indexing problem by viewing a video from the perspective of production. Finally, in the context analysis step, we view semantics in context. The virtue of the semantic pathfinder is its ability to learn the best path of analysis steps on a per-concept basis. To show the generality of this novel indexing approach, we develop detectors for a lexicon of 32 concepts and we evaluate the semantic pathfinder against the 2004 NIST TRECVID video retrieval benchmark, using a news archive of 64 hours. Top ranking performance in the semantic concept detection task indicates the merit of the semantic pathfinder for generic indexing of multimedia archives.

  8. Generic Information Can Retrieve Known Biological Associations: Implications for Biomedical Knowledge Discovery

    PubMed Central

    van Haagen, Herman H. H. B. M.; 't Hoen, Peter A. C.; Mons, Barend; Schultes, Erik A.

    2013-01-01

    Motivation Weighted semantic networks built from text-mined literature can be used to retrieve known protein-protein or gene-disease associations, and have been shown to anticipate associations years before they are explicitly stated in the literature. Our text-mining system recognizes over 640,000 biomedical concepts: some are specific (i.e., names of genes or proteins) others generic (e.g., ‘Homo sapiens’). Generic concepts may play important roles in automated information retrieval, extraction, and inference but may also result in concept overload and confound retrieval and reasoning with low-relevance or even spurious links. Here, we attempted to optimize the retrieval performance for protein-protein interactions (PPI) by filtering generic concepts (node filtering) or links to generic concepts (edge filtering) from a weighted semantic network. First, we defined metrics based on network properties that quantify the specificity of concepts. Then using these metrics, we systematically filtered generic information from the network while monitoring retrieval performance of known protein-protein interactions. We also systematically filtered specific information from the network (inverse filtering), and assessed the retrieval performance of networks composed of generic information alone. Results Filtering generic or specific information induced a two-phase response in retrieval performance: initially the effects of filtering were minimal but beyond a critical threshold network performance suddenly drops. Contrary to expectations, networks composed exclusively of generic information demonstrated retrieval performance comparable to unfiltered networks that also contain specific concepts. Furthermore, an analysis using individual generic concepts demonstrated that they can effectively support the retrieval of known protein-protein interactions. For instance the concept “binding” is indicative for PPI retrieval and the concept “mutation abnormality” is indicative for gene-disease associations. Conclusion Generic concepts are important for information retrieval and cannot be removed from semantic networks without negative impact on retrieval performance. PMID:24260124

  9. Event-based text mining for biology and functional genomics

    PubMed Central

    Thompson, Paul; Nawaz, Raheel; McNaught, John; Kell, Douglas B.

    2015-01-01

    The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of ‘events’, i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research. PMID:24907365

  10. The semantic representation of event information depends on the cue modality: an instance of meaning-based retrieval.

    PubMed

    Karlsson, Kristina; Sikström, Sverker; Willander, Johan

    2013-01-01

    The semantic content, or the meaning, is the essence of autobiographical memories. In comparison to previous research, which has mainly focused on the phenomenological experience and the age distribution of retrieved events, the present study provides a novel view on the retrieval of event information by quantifying the information as semantic representations. We investigated the semantic representation of sensory cued autobiographical events and studied the modality hierarchy within the multimodal retrieval cues. The experiment comprised a cued recall task, where the participants were presented with visual, auditory, olfactory or multimodal retrieval cues and asked to recall autobiographical events. The results indicated that the three different unimodal retrieval cues generate significantly different semantic representations. Further, the auditory and the visual modalities contributed the most to the semantic representation of the multimodally retrieved events. Finally, the semantic representation of the multimodal condition could be described as a combination of the three unimodal conditions. In conclusion, these results suggest that the meaning of the retrieved event information depends on the modality of the retrieval cues.

  11. The Semantic Representation of Event Information Depends on the Cue Modality: An Instance of Meaning-Based Retrieval

    PubMed Central

    Karlsson, Kristina; Sikström, Sverker; Willander, Johan

    2013-01-01

    The semantic content, or the meaning, is the essence of autobiographical memories. In comparison to previous research, which has mainly focused on the phenomenological experience and the age distribution of retrieved events, the present study provides a novel view on the retrieval of event information by quantifying the information as semantic representations. We investigated the semantic representation of sensory cued autobiographical events and studied the modality hierarchy within the multimodal retrieval cues. The experiment comprised a cued recall task, where the participants were presented with visual, auditory, olfactory or multimodal retrieval cues and asked to recall autobiographical events. The results indicated that the three different unimodal retrieval cues generate significantly different semantic representations. Further, the auditory and the visual modalities contributed the most to the semantic representation of the multimodally retrieved events. Finally, the semantic representation of the multimodal condition could be described as a combination of the three unimodal conditions. In conclusion, these results suggest that the meaning of the retrieved event information depends on the modality of the retrieval cues. PMID:24204561

  12. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features.

    PubMed

    Nikfarjam, Azadeh; Sarker, Abeed; O'Connor, Karen; Ginn, Rachel; Gonzalez, Graciela

    2015-05-01

    Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks, particularly for pharmacovigilance, via the use of natural language processing (NLP) techniques. However, the language in social media is highly informal, and user-expressed medical concepts are often nontechnical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and thus far, advanced machine learning-based NLP techniques have been underutilized. Our objective is to design a machine learning-based approach to extract mentions of adverse drug reactions (ADRs) from highly informal text in social media. We introduce ADRMine, a machine learning-based concept extraction system that uses conditional random fields (CRFs). ADRMine utilizes a variety of features, including a novel feature for modeling words' semantic similarities. The similarities are modeled by clustering words based on unsupervised, pretrained word representation vectors (embeddings) generated from unlabeled user posts in social media using a deep learning technique. ADRMine outperforms several strong baseline systems in the ADR extraction task by achieving an F-measure of 0.82. Feature analysis demonstrates that the proposed word cluster features significantly improve extraction performance. It is possible to extract complex medical concepts, with relatively high performance, from informal, user-generated content. Our approach is particularly scalable, suitable for social media mining, as it relies on large volumes of unlabeled data, thus diminishing the need for large, annotated training data sets. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.

  13. Integrated use of spatial and semantic relationships for extracting road networks from floating car data

    NASA Astrophysics Data System (ADS)

    Li, Jun; Qin, Qiming; Xie, Chao; Zhao, Yue

    2012-10-01

    The update frequency of digital road maps influences the quality of road-dependent services. However, digital road maps surveyed by probe vehicles or extracted from remotely sensed images still have a long updating circle and their cost remain high. With GPS technology and wireless communication technology maturing and their cost decreasing, floating car technology has been used in traffic monitoring and management, and the dynamic positioning data from floating cars become a new data source for updating road maps. In this paper, we aim to update digital road maps using the floating car data from China's National Commercial Vehicle Monitoring Platform, and present an incremental road network extraction method suitable for the platform's GPS data whose sampling frequency is low and which cover a large area. Based on both spatial and semantic relationships between a trajectory point and its associated road segment, the method classifies each trajectory point, and then merges every trajectory point into the candidate road network through the adding or modifying process according to its type. The road network is gradually updated until all trajectories have been processed. Finally, this method is applied in the updating process of major roads in North China and the experimental results reveal that it can accurately derive geometric information of roads under various scenes. This paper provides a highly-efficient, low-cost approach to update digital road maps.

  14. Open semantic annotation of scientific publications using DOMEO.

    PubMed

    Ciccarese, Paolo; Ocana, Marco; Clark, Tim

    2012-04-24

    Our group has developed a useful shared software framework for performing, versioning, sharing and viewing Web annotations of a number of kinds, using an open representation model. The Domeo Annotation Tool was developed in tandem with this open model, the Annotation Ontology (AO). Development of both the Annotation Framework and the open model was driven by requirements of several different types of alpha users, including bench scientists and biomedical curators from university research labs, online scientific communities, publishing and pharmaceutical companies.Several use cases were incrementally implemented by the toolkit. These use cases in biomedical communications include personal note-taking, group document annotation, semantic tagging, claim-evidence-context extraction, reagent tagging, and curation of textmining results from entity extraction algorithms. We report on the Domeo user interface here. Domeo has been deployed in beta release as part of the NIH Neuroscience Information Framework (NIF, http://www.neuinfo.org) and is scheduled for production deployment in the NIF's next full release.Future papers will describe other aspects of this work in detail, including Annotation Framework Services and components for integrating with external textmining services, such as the NCBO Annotator web service, and with other textmining applications using the Apache UIMA framework.

  15. Open semantic annotation of scientific publications using DOMEO

    PubMed Central

    2012-01-01

    Background Our group has developed a useful shared software framework for performing, versioning, sharing and viewing Web annotations of a number of kinds, using an open representation model. Methods The Domeo Annotation Tool was developed in tandem with this open model, the Annotation Ontology (AO). Development of both the Annotation Framework and the open model was driven by requirements of several different types of alpha users, including bench scientists and biomedical curators from university research labs, online scientific communities, publishing and pharmaceutical companies. Several use cases were incrementally implemented by the toolkit. These use cases in biomedical communications include personal note-taking, group document annotation, semantic tagging, claim-evidence-context extraction, reagent tagging, and curation of textmining results from entity extraction algorithms. Results We report on the Domeo user interface here. Domeo has been deployed in beta release as part of the NIH Neuroscience Information Framework (NIF, http://www.neuinfo.org) and is scheduled for production deployment in the NIF’s next full release. Future papers will describe other aspects of this work in detail, including Annotation Framework Services and components for integrating with external textmining services, such as the NCBO Annotator web service, and with other textmining applications using the Apache UIMA framework. PMID:22541592

  16. CNTRO: A Semantic Web Ontology for Temporal Relation Inferencing in Clinical Narratives.

    PubMed

    Tao, Cui; Wei, Wei-Qi; Solbrig, Harold R; Savova, Guergana; Chute, Christopher G

    2010-11-13

    Using Semantic-Web specifications to represent temporal information in clinical narratives is an important step for temporal reasoning and answering time-oriented queries. Existing temporal models are either not compatible with the powerful reasoning tools developed for the Semantic Web, or designed only for structured clinical data and therefore are not ready to be applied on natural-language-based clinical narrative reports directly. We have developed a Semantic-Web ontology which is called Clinical Narrative Temporal Relation ontology. Using this ontology, temporal information in clinical narratives can be represented as RDF (Resource Description Framework) triples. More temporal information and relations can then be inferred by Semantic-Web based reasoning tools. Experimental results show that this ontology can represent temporal information in real clinical narratives successfully.

  17. RelFinder: Revealing Relationships in RDF Knowledge Bases

    NASA Astrophysics Data System (ADS)

    Heim, Philipp; Hellmann, Sebastian; Lehmann, Jens; Lohmann, Steffen; Stegemann, Timo

    The Semantic Web has recently seen a rise of large knowledge bases (such as DBpedia) that are freely accessible via SPARQL endpoints. The structured representation of the contained information opens up new possibilities in the way it can be accessed and queried. In this paper, we present an approach that extracts a graph covering relationships between two objects of interest. We show an interactive visualization of this graph that supports the systematic analysis of the found relationships by providing highlighting, previewing, and filtering features.

  18. SemanticFind: Locating What You Want in a Patient Record, Not Just What You Ask For

    PubMed Central

    Prager, John M.; Liang, Jennifer J.; Devarakonda, Murthy V.

    2017-01-01

    We present a new model of patient record search, called SemanticFind, which goes beyond traditional textual and medical synonym matches by locating patient data that a clinician would want to see rather than just what they ask for. The new model is implemented by making extensive use of the UMLS semantic network, distributional semantics, and NLP, to match query terms along several dimensions in a patient record with the returned matches organized accordingly. The new approach finds all clinically related concepts without the user having to ask for them. An evaluation of the accuracy of SemanticFind shows that it found twice as many relevant matches compared to those found by literal (traditional) search alone, along with very high precision and recall. These results suggest potential uses for SemanticFind in clinical practice, retrospective chart reviews, and in automated extraction of quality metrics. PMID:28815139

  19. Ontology based heterogeneous materials database integration and semantic query

    NASA Astrophysics Data System (ADS)

    Zhao, Shuai; Qian, Quan

    2017-10-01

    Materials digital data, high throughput experiments and high throughput computations are regarded as three key pillars of materials genome initiatives. With the fast growth of materials data, the integration and sharing of data is very urgent, that has gradually become a hot topic of materials informatics. Due to the lack of semantic description, it is difficult to integrate data deeply in semantic level when adopting the conventional heterogeneous database integration approaches such as federal database or data warehouse. In this paper, a semantic integration method is proposed to create the semantic ontology by extracting the database schema semi-automatically. Other heterogeneous databases are integrated to the ontology by means of relational algebra and the rooted graph. Based on integrated ontology, semantic query can be done using SPARQL. During the experiments, two world famous First Principle Computational databases, OQMD and Materials Project are used as the integration targets, which show the availability and effectiveness of our method.

  20. Semi-Supervised Learning to Identify UMLS Semantic Relations.

    PubMed

    Luo, Yuan; Uzuner, Ozlem

    2014-01-01

    The UMLS Semantic Network is constructed by experts and requires periodic expert review to update. We propose and implement a semi-supervised approach for automatically identifying UMLS semantic relations from narrative text in PubMed. Our method analyzes biomedical narrative text to collect semantic entity pairs, and extracts multiple semantic, syntactic and orthographic features for the collected pairs. We experiment with seeded k-means clustering with various distance metrics. We create and annotate a ground truth corpus according to the top two levels of the UMLS semantic relation hierarchy. We evaluate our system on this corpus and characterize the learning curves of different clustering configuration. Using KL divergence consistently performs the best on the held-out test data. With full seeding, we obtain macro-averaged F-measures above 70% for clustering the top level UMLS relations (2-way), and above 50% for clustering the second level relations (7-way).

  1. The role of sleep spindles and slow-wave activity in integrating new information in semantic memory.

    PubMed

    Tamminen, Jakke; Lambon Ralph, Matthew A; Lewis, Penelope A

    2013-09-25

    Assimilating new information into existing knowledge is a fundamental part of consolidating new memories and allowing them to guide behavior optimally and is vital for conceptual knowledge (semantic memory), which is accrued over many years. Sleep is important for memory consolidation, but its impact upon assimilation of new information into existing semantic knowledge has received minimal examination. Here, we examined the integration process by training human participants on novel words with meanings that fell into densely or sparsely populated areas of semantic memory in two separate sessions. Overnight sleep was polysomnographically monitored after each training session and recall was tested immediately after training, after a night of sleep, and 1 week later. Results showed that participants learned equal numbers of both word types, thus equating amount and difficulty of learning across the conditions. Measures of word recognition speed showed a disadvantage for novel words in dense semantic neighborhoods, presumably due to interference from many semantically related concepts, suggesting that the novel words had been successfully integrated into semantic memory. Most critically, semantic neighborhood density influenced sleep architecture, with participants exhibiting more sleep spindles and slow-wave activity after learning the sparse compared with the dense neighborhood words. These findings provide the first evidence that spindles and slow-wave activity mediate integration of new information into existing semantic networks.

  2. Interconnected growing self-organizing maps for auditory and semantic acquisition modeling

    PubMed Central

    Cao, Mengxue; Li, Aijun; Fang, Qiang; Kaufmann, Emily; Kröger, Bernd J.

    2014-01-01

    Based on the incremental nature of knowledge acquisition, in this study we propose a growing self-organizing neural network approach for modeling the acquisition of auditory and semantic categories. We introduce an Interconnected Growing Self-Organizing Maps (I-GSOM) algorithm, which takes associations between auditory information and semantic information into consideration, in this paper. Direct phonetic–semantic association is simulated in order to model the language acquisition in early phases, such as the babbling and imitation stages, in which no phonological representations exist. Based on the I-GSOM algorithm, we conducted experiments using paired acoustic and semantic training data. We use a cyclical reinforcing and reviewing training procedure to model the teaching and learning process between children and their communication partners. A reinforcing-by-link training procedure and a link-forgetting procedure are introduced to model the acquisition of associative relations between auditory and semantic information. Experimental results indicate that (1) I-GSOM has good ability to learn auditory and semantic categories presented within the training data; (2) clear auditory and semantic boundaries can be found in the network representation; (3) cyclical reinforcing and reviewing training leads to a detailed categorization as well as to a detailed clustering, while keeping the clusters that have already been learned and the network structure that has already been developed stable; and (4) reinforcing-by-link training leads to well-perceived auditory–semantic associations. Our I-GSOM model suggests that it is important to associate auditory information with semantic information during language acquisition. Despite its high level of abstraction, our I-GSOM approach can be interpreted as a biologically-inspired neurocomputational model. PMID:24688478

  3. How Semantic Radicals in Chinese characters Facilitate Hierarchical Category-Based Induction.

    PubMed

    Wang, Xiaoxi; Ma, Xie; Tao, Yun; Tao, Yachen; Li, Hong

    2018-04-03

    Prior studies indicate that the semantic radical in Chinese characters contains category information that can support the independent retrieval of category information through the lexical network to the conceptual network. Inductive reasoning relies on category information; thus, semantic radicals may influence inductive reasoning. As most natural concepts are hierarchically structured in the human brain, this study examined how semantic radicals impact inductive reasoning for hierarchical concepts. The study used animal and plant nouns, organized in basic, superordinate, and subordinate levels; half had a semantic radical and half did not. Eighteen participants completed an inductive reasoning task. Behavioural and event-related potential (ERP) data were collected. The behavioural results showed that participants reacted faster and more accurately in the with-semantic-radical condition than in the without-semantic-radical condition. For the ERPs, differences between the conditions were found, and these differences lasted from the very early cognitive processing stage (i.e., the N1 time window) to the relatively late processing stages (i.e., the N400 and LPC time windows). Semantic radicals can help to distinguish the hierarchies earlier (in the N400 period) than characters without a semantic radical (in the LPC period). These results provide electrophysiological evidence that semantic radicals may improve sensitivity to distinguish between hierarchical concepts.

  4. The semantic and episodic subcomponents of famous person knowledge: dissociation in healthy subjects.

    PubMed

    Piolino, Pascale; Lamidey, Virginie; Desgranges, Béatrice; Eustache, Francis

    2007-01-01

    Fifty-two subjects between ages 40 and 79 years were administered a questionnaire assessing their ability to recall semantic information about famous people from 4 different decades and to recollect its episodic source of acquisition together with autonoetic consciousness via the remember-know paradigm. In addition, they underwent a battery of standardized neuropsychological tests to assess episodic and semantic memory and executive functions. The analyses of age reveal differences for the episodic source score but no differences between age groups for the semantic scores within each decade. Regardless of the age of people, the analyses also show that semantic memory subcomponents of the famous person test are highly associated with each other as well as with the source component. The recall of semantic information on the famous person test relies on participants' semantic abilities, whereas the recall of its episodic source depends on their executive functions. The present findings confirm the existence of an episodic-semantic distinction in knowledge about famous people. They provide further evidence that personal source and semantic information are at once distinct and highly interactive within the framework of remote memory. (c) 2007 APA, all rights reserved.

  5. Semantic information mediates visual attention during spoken word recognition in Chinese: Evidence from the printed-word version of the visual-world paradigm.

    PubMed

    Shen, Wei; Qu, Qingqing; Li, Xingshan

    2016-07-01

    In the present study, we investigated whether the activation of semantic information during spoken word recognition can mediate visual attention's deployment to printed Chinese words. We used a visual-world paradigm with printed words, in which participants listened to a spoken target word embedded in a neutral spoken sentence while looking at a visual display of printed words. We examined whether a semantic competitor effect could be observed in the printed-word version of the visual-world paradigm. In Experiment 1, the relationship between the spoken target words and the printed words was manipulated so that they were semantically related (a semantic competitor), phonologically related (a phonological competitor), or unrelated (distractors). We found that the probability of fixations on semantic competitors was significantly higher than that of fixations on the distractors. In Experiment 2, the orthographic similarity between the spoken target words and their semantic competitors was manipulated to further examine whether the semantic competitor effect was modulated by orthographic similarity. We found significant semantic competitor effects regardless of orthographic similarity. Our study not only reveals that semantic information can affect visual attention, it also provides important new insights into the methodology employed to investigate the semantic processing of spoken words during spoken word recognition using the printed-word version of the visual-world paradigm.

  6. Discovering semantic features in the literature: a foundation for building functional associations

    PubMed Central

    Chagoyen, Monica; Carmona-Saez, Pedro; Shatkay, Hagit; Carazo, Jose M; Pascual-Montano, Alberto

    2006-01-01

    Background Experimental techniques such as DNA microarray, serial analysis of gene expression (SAGE) and mass spectrometry proteomics, among others, are generating large amounts of data related to genes and proteins at different levels. As in any other experimental approach, it is necessary to analyze these data in the context of previously known information about the biological entities under study. The literature is a particularly valuable source of information for experiment validation and interpretation. Therefore, the development of automated text mining tools to assist in such interpretation is one of the main challenges in current bioinformatics research. Results We present a method to create literature profiles for large sets of genes or proteins based on common semantic features extracted from a corpus of relevant documents. These profiles can be used to establish pair-wise similarities among genes, utilized in gene/protein classification or can be even combined with experimental measurements. Semantic features can be used by researchers to facilitate the understanding of the commonalities indicated by experimental results. Our approach is based on non-negative matrix factorization (NMF), a machine-learning algorithm for data analysis, capable of identifying local patterns that characterize a subset of the data. The literature is thus used to establish putative relationships among subsets of genes or proteins and to provide coherent justification for this clustering into subsets. We demonstrate the utility of the method by applying it to two independent and vastly different sets of genes. Conclusion The presented method can create literature profiles from documents relevant to sets of genes. The representation of genes as additive linear combinations of semantic features allows for the exploration of functional associations as well as for clustering, suggesting a valuable methodology for the validation and interpretation of high-throughput experimental data. PMID:16438716

  7. Spicy Adjectives and Nominal Donkeys: Capturing Semantic Deviance Using Compositionality in Distributional Spaces.

    PubMed

    Vecchi, Eva M; Marelli, Marco; Zamparelli, Roberto; Baroni, Marco

    2017-01-01

    Sophisticated senator and legislative onion. Whether or not you have ever heard of these things, we all have some intuition that one of them makes much less sense than the other. In this paper, we introduce a large dataset of human judgments about novel adjective-noun phrases. We use these data to test an approach to semantic deviance based on phrase representations derived with compositional distributional semantic methods, that is, methods that derive word meanings from contextual information, and approximate phrase meanings by combining word meanings. We present several simple measures extracted from distributional representations of words and phrases, and we show that they have a significant impact on predicting the acceptability of novel adjective-noun phrases even when a number of alternative measures classically employed in studies of compound processing and bigram plausibility are taken into account. Our results show that the extent to which an attributive adjective alters the distributional representation of the noun is the most significant factor in modeling the distinction between acceptable and deviant phrases. Our study extends current applications of compositional distributional semantic methods to linguistically and cognitively interesting problems, and it offers a new, quantitatively precise approach to the challenge of predicting when humans will find novel linguistic expressions acceptable and when they will not. Copyright © 2016 Cognitive Science Society, Inc.

  8. Smart Shop Assistant - Using Semantic Technologies to Improve Online Shopping

    NASA Astrophysics Data System (ADS)

    Niemann, Magnus; Mochol, Malgorzata; Tolksdorf, Robert

    Internet commerce experiences a rising complexity: Not only more and more products become available online but also the amount of information available on a single product has been constantly increasing. Thanks to the Web 2.0 development it is, in the meantime, quite common to involve customers in the creation of product description and extraction of additional product information by offering customers feedback forms and product review sites, users' weblogs and other social web services. To face this situation, one of the main tasks in a future internet will be to aggregate, sort and evaluate this huge amount of information to aid the customers in choosing the "perfect" product for their needs.

  9. [Semantic Network Analysis of Online News and Social Media Text Related to Comprehensive Nursing Care Service].

    PubMed

    Kim, Minji; Choi, Mona; Youm, Yoosik

    2017-12-01

    As comprehensive nursing care service has gradually expanded, it has become necessary to explore the various opinions about it. The purpose of this study is to explore the large amount of text data regarding comprehensive nursing care service extracted from online news and social media by applying a semantic network analysis. The web pages of the Korean Nurses Association (KNA) News, major daily newspapers, and Twitter were crawled by searching the keyword 'comprehensive nursing care service' using Python. A morphological analysis was performed using KoNLPy. Nodes on a 'comprehensive nursing care service' cluster were selected, and frequency, edge weight, and degree centrality were calculated and visualized with Gephi for the semantic network. A total of 536 news pages and 464 tweets were analyzed. In the KNA News and major daily newspapers, 'nursing workforce' and 'nursing service' were highly rated in frequency, edge weight, and degree centrality. On Twitter, the most frequent nodes were 'National Health Insurance Service' and 'comprehensive nursing care service hospital.' The nodes with the highest edge weight were 'national health insurance,' 'wards without caregiver presence,' and 'caregiving costs.' 'National Health Insurance Service' was highest in degree centrality. This study provides an example of how to use atypical big data for a nursing issue through semantic network analysis to explore diverse perspectives surrounding the nursing community through various media sources. Applying semantic network analysis to online big data to gather information regarding various nursing issues would help to explore opinions for formulating and implementing nursing policies. © 2017 Korean Society of Nursing Science

  10. Information integration from heterogeneous data sources: a Semantic Web approach.

    PubMed

    Kunapareddy, Narendra; Mirhaji, Parsa; Richards, David; Casscells, S Ward

    2006-01-01

    Although the decentralized and autonomous implementation of health information systems has made it possible to extend the reach of surveillance systems to a variety of contextually disparate domains, public health use of data from these systems is not primarily anticipated. The Semantic Web has been proposed to address both representational and semantic heterogeneity in distributed and collaborative environments. We introduce a semantic approach for the integration of health data using the Resource Definition Framework (RDF) and the Simple Knowledge Organization System (SKOS) developed by the Semantic Web community.

  11. Visuospatial working memory in children with autism: the effect of a semantic global organization.

    PubMed

    Mammarella, Irene C; Giofrè, David; Caviola, Sara; Cornoldi, Cesare; Hamilton, Colin

    2014-06-01

    It has been reported that individuals with Autism Spectrum Disorders (ASD) perceive visual scenes as a sparse set of details rather than as a congruent and meaningful unit, failing in the extraction of the global configuration of the scene. In the present study, children with ASD were compared with typically developing (TD) children, in a visuospatial working memory task, the Visual Patterns Test (VPT). The VPT array was manipulated to vary the semantic affordance of the pattern, high semantic (global) vs. low semantic; temporal parameters were also manipulated within the change detection protocol. Overall, there was no main effect associated with Group, however there was a significant effect associated with Semantics, which was further qualified by an interaction between the Group and Semantic factors; there was only a significant effect of semantics in the TD group. The findings are discussed in light of the weak central coherence theory where the ASD group are unable to make use of long term memory semantics in order to construct global representations of the array. Copyright © 2014 Elsevier Ltd. All rights reserved.

  12. SSWAP: A Simple Semantic Web Architecture and Protocol for Semantic Web Services

    USDA-ARS?s Scientific Manuscript database

    SSWAP (Simple Semantic Web Architecture and Protocol) is an architecture, protocol, and platform for using reasoning to semantically integrate heterogeneous disparate data and services on the web. SSWAP is the driving technology behind the Virtual Plant Information Network, an NSF-funded semantic w...

  13. Ontology-Based Search of Genomic Metadata.

    PubMed

    Fernandez, Javier D; Lenzerini, Maurizio; Masseroli, Marco; Venco, Francesco; Ceri, Stefano

    2016-01-01

    The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted from these huge and largely unexplored data, leading to data-driven genomic, transcriptomic, and epigenomic discoveries. Yet, search of relevant datasets for knowledge discovery is limitedly supported: metadata describing ENCODE datasets are quite simple and incomplete, and not described by a coherent underlying ontology. Here, we show how to overcome this limitation, by adopting an ENCODE metadata searching approach which uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base by starting with concepts extracted from ENCODE metadata, matched to and expanded on biomedical ontologies integrated in the well-established Unified Medical Language System. We prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data from arbitrary biologists' queries. This allows correctly finding more datasets than those extracted by a purely syntactic search, as supported by the other available systems. We empirically show the relevance of found datasets to the biologists' queries.

  14. Semantic enrichment of clinical models towards semantic interoperability. The heart failure summary use case.

    PubMed

    Martínez-Costa, Catalina; Cornet, Ronald; Karlsson, Daniel; Schulz, Stefan; Kalra, Dipak

    2015-05-01

    To improve semantic interoperability of electronic health records (EHRs) by ontology-based mediation across syntactically heterogeneous representations of the same or similar clinical information. Our approach is based on a semantic layer that consists of: (1) a set of ontologies supported by (2) a set of semantic patterns. The first aspect of the semantic layer helps standardize the clinical information modeling task and the second shields modelers from the complexity of ontology modeling. We applied this approach to heterogeneous representations of an excerpt of a heart failure summary. Using a set of finite top-level patterns to derive semantic patterns, we demonstrate that those patterns, or compositions thereof, can be used to represent information from clinical models. Homogeneous querying of the same or similar information, when represented according to heterogeneous clinical models, is feasible. Our approach focuses on the meaning embedded in EHRs, regardless of their structure. This complex task requires a clear ontological commitment (ie, agreement to consistently use the shared vocabulary within some context), together with formalization rules. These requirements are supported by semantic patterns. Other potential uses of this approach, such as clinical models validation, require further investigation. We show how an ontology-based representation of a clinical summary, guided by semantic patterns, allows homogeneous querying of heterogeneous information structures. Whether there are a finite number of top-level patterns is an open question. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  15. A familiar pattern? Semantic memory contributes to the enhancement of visuo-spatial memories.

    PubMed

    Riby, Leigh M; Orme, Elizabeth

    2013-03-01

    In this study we quantify for the first time electrophysiological components associated with incorporating long-term semantic knowledge with visuo-spatial information using two variants of a traditional matrix patterns task. Results indicated that the matrix task with greater semantic content was associated with enhanced accuracy and RTs in a change-detection paradigm; this was also associated with increased P300 and N400 components as well as a sustained negative slow wave (NSW). In contrast, processing of the low semantic stimuli was associated with an increased N200 and a reduction in the P300. These findings suggest that semantic content can aid in reducing early visual processing of information and subsequent memory load by unitizing complex patterns into familiar forms. The N400/NSW may be associated with the requirements for maintaining visuo-spatial information about semantic forms such as orientation and relative location. Evidence for individual differences in semantic elaboration strategies used by participants is also discussed. Copyright © 2012 Elsevier Inc. All rights reserved.

  16. Transient medial prefrontal perturbation reduces false memory formation.

    PubMed

    Berkers, Ruud M W J; van der Linden, Marieke; de Almeida, Rafael F; Müller, Nils C J; Bovy, Leonore; Dresler, Martin; Morris, Richard G M; Fernández, Guillén

    2017-03-01

    Knowledge extracted across previous experiences, or schemas, benefit encoding and retention of congruent information. However, they can also reduce specificity and augment memory for semantically related, but false information. A demonstration of the latter is given by the Deese-Roediger-McDermott (DRM) paradigm, where the studying of words that fit a common semantic schema are found to induce false memories for words that are congruent with the given schema, but were not studied. The medial prefrontal cortex (mPFC) has been ascribed the function of leveraging prior knowledge to influence encoding and retrieval, based on imaging and patient studies. Here, we used transcranial magnetic stimulation (TMS) to transiently perturb ongoing mPFC processing immediately before participants performed the DRM-task. We observed the predicted reduction in false recall of critical lures after mPFC perturbation, compared to two control groups, whereas veridical recall and recognition memory performance remained similar across groups. These data provide initial causal evidence for a role of the mPFC in biasing the assimilation of new memories and their consolidation as a function of prior knowledge. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. "What is relevant in a text document?": An interpretable machine learning approach

    PubMed Central

    Arras, Leila; Horn, Franziska; Montavon, Grégoire; Müller, Klaus-Robert

    2017-01-01

    Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, allowing to annotate very large text collections, more than could be processed by a human in a lifetime. Besides predicting the text’s category very accurately, it is also highly desirable to understand how and why the categorization process takes place. In this paper, we demonstrate that such understanding can be achieved by tracing the classification decision back to individual words using layer-wise relevance propagation (LRP), a recently developed technique for explaining predictions of complex non-linear classifiers. We train two word-based ML models, a convolutional neural network (CNN) and a bag-of-words SVM classifier, on a topic categorization task and adapt the LRP method to decompose the predictions of these models onto words. Resulting scores indicate how much individual words contribute to the overall classification decision. This enables one to distill relevant information from text documents without an explicit semantic information extraction step. We further use the word-wise relevance scores for generating novel vector-based document representations which capture semantic information. Based on these document vectors, we introduce a measure of model explanatory power and show that, although the SVM and CNN models perform similarly in terms of classification accuracy, the latter exhibits a higher level of explainability which makes it more comprehensible for humans and potentially more useful for other applications. PMID:28800619

  18. The absoluteness of semantic processing: lessons from the analysis of temporal clusters in phonemic verbal fluency.

    PubMed

    Vonberg, Isabelle; Ehlen, Felicitas; Fromm, Ortwin; Klostermann, Fabian

    2014-01-01

    For word production, we may consciously pursue semantic or phonological search strategies, but it is uncertain whether we can retrieve the different aspects of lexical information independently from each other. We therefore studied the spread of semantic information into words produced under exclusively phonemic task demands. 42 subjects participated in a letter verbal fluency task, demanding the production of as many s-words as possible in two minutes. Based on curve fittings for the time courses of word production, output spurts (temporal clusters) considered to reflect rapid lexical retrieval based on automatic activation spread, were identified. Semantic and phonemic word relatedness within versus between these clusters was assessed by respective scores (0 meaning no relation, 4 maximum relation). Subjects produced 27.5 (±9.4) words belonging to 6.7 (±2.4) clusters. Both phonemically and semantically words were more related within clusters than between clusters (phon: 0.33±0.22 vs. 0.19±0.17, p<.01; sem: 0.65±0.29 vs. 0.37±0.29, p<.01). Whereas the extent of phonemic relatedness correlated with high task performance, the contrary was the case for the extent of semantic relatedness. The results indicate that semantic information spread occurs, even if the consciously pursued word search strategy is purely phonological. This, together with the negative correlation between semantic relatedness and verbal output suits the idea of a semantic default mode of lexical search, acting against rapid task performance in the given scenario of phonemic verbal fluency. The simultaneity of enhanced semantic and phonemic word relatedness within the same temporal cluster boundaries suggests an interaction between content and sound-related information whenever a new semantic field has been opened.

  19. Implicit and explicit forgetting: when is gist remembered?

    PubMed

    Dorfman, J; Mandler, G

    1994-08-01

    Recognition (YES/NO) and stem completion (cued: complete with a word from the list; and uncued: complete with the first word that comes to mind) were tested following either semantic or non-semantic processing of a categorized input list. Item/instance information was tested by contrasting target items from the input list with new items that were categorically related to them; gist/categorical information was tested by comparing target items semantically related to the input items with unrelated new items. For both recognition and stem completion, regardless of initial processing condition, item information decayed rapidly over a period of one week. Gist information was maintained over the same period when initial processing was semantic but only in the cued condition for completion. These results are discussed in terms of dual process theory, which postulates activation/integration of a representation as primarily relevant to implicit item information and elaboration of a representation as mainly relevant to semantic (i.e. categorical) information.

  20. a Framework of Change Detection Based on Combined Morphologica Features and Multi-Index Classification

    NASA Astrophysics Data System (ADS)

    Li, S.; Zhang, S.; Yang, D.

    2017-09-01

    Remote sensing images are particularly well suited for analysis of land cover change. In this paper, we present a new framework for detection of changing land cover using satellite imagery. Morphological features and a multi-index are used to extract typical objects from the imagery, including vegetation, water, bare land, buildings, and roads. Our method, based on connected domains, is different from traditional methods; it uses image segmentation to extract morphological features, while the enhanced vegetation index (EVI), the differential water index (NDWI) are used to extract vegetation and water, and a fragmentation index is used to the correct extraction results of water. HSV transformation and threshold segmentation extract and remove the effects of shadows on extraction results. Change detection is performed on these results. One of the advantages of the proposed framework is that semantic information is extracted automatically using low-level morphological features and indexes. Another advantage is that the proposed method detects specific types of change without any training samples. A test on ZY-3 images demonstrates that our framework has a promising capability to detect change.

  1. Unsupervised User Similarity Mining in GSM Sensor Networks

    PubMed Central

    Shad, Shafqat Ali; Chen, Enhong

    2013-01-01

    Mobility data has attracted the researchers for the past few years because of its rich context and spatiotemporal nature, where this information can be used for potential applications like early warning system, route prediction, traffic management, advertisement, social networking, and community finding. All the mentioned applications are based on mobility profile building and user trend analysis, where mobility profile building is done through significant places extraction, user's actual movement prediction, and context awareness. However, significant places extraction and user's actual movement prediction for mobility profile building are a trivial task. In this paper, we present the user similarity mining-based methodology through user mobility profile building by using the semantic tagging information provided by user and basic GSM network architecture properties based on unsupervised clustering approach. As the mobility information is in low-level raw form, our proposed methodology successfully converts it to a high-level meaningful information by using the cell-Id location information rather than previously used location capturing methods like GPS, Infrared, and Wifi for profile mining and user similarity mining. PMID:23576905

  2. Integrating semantic web technologies and geospatial catalog services for geospatial information discovery and processing in cyberinfrastructure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yue, Peng; Gong, Jianya; Di, Liping

    Abstract A geospatial catalogue service provides a network-based meta-information repository and interface for advertising and discovering shared geospatial data and services. Descriptive information (i.e., metadata) for geospatial data and services is structured and organized in catalogue services. The approaches currently available for searching and using that information are often inadequate. Semantic Web technologies show promise for better discovery methods by exploiting the underlying semantics. Such development needs special attention from the Cyberinfrastructure perspective, so that the traditional focus on discovery of and access to geospatial data can be expanded to support the increased demand for processing of geospatial information andmore » discovery of knowledge. Semantic descriptions for geospatial data, services, and geoprocessing service chains are structured, organized, and registered through extending elements in the ebXML Registry Information Model (ebRIM) of a geospatial catalogue service, which follows the interface specifications of the Open Geospatial Consortium (OGC) Catalogue Services for the Web (CSW). The process models for geoprocessing service chains, as a type of geospatial knowledge, are captured, registered, and discoverable. Semantics-enhanced discovery for geospatial data, services/service chains, and process models is described. Semantic search middleware that can support virtual data product materialization is developed for the geospatial catalogue service. The creation of such a semantics-enhanced geospatial catalogue service is important in meeting the demands for geospatial information discovery and analysis in Cyberinfrastructure.« less

  3. Context-rich semantic framework for effective data-to-decisions in coalition networks

    NASA Astrophysics Data System (ADS)

    Grueneberg, Keith; de Mel, Geeth; Braines, Dave; Wang, Xiping; Calo, Seraphin; Pham, Tien

    2013-05-01

    In a coalition context, data fusion involves combining of soft (e.g., field reports, intelligence reports) and hard (e.g., acoustic, imagery) sensory data such that the resulting output is better than what it would have been if the data are taken individually. However, due to the lack of explicit semantics attached with such data, it is difficult to automatically disseminate and put the right contextual data in the hands of the decision makers. In order to understand the data, explicit meaning needs to be added by means of categorizing and/or classifying the data in relationship to each other from base reference sources. In this paper, we present a semantic framework that provides automated mechanisms to expose real-time raw data effectively by presenting appropriate information needed for a given situation so that an informed decision could be made effectively. The system utilizes controlled natural language capabilities provided by the ITA (International Technology Alliance) Controlled English (CE) toolkit to provide a human-friendly semantic representation of messages so that the messages can be directly processed in human/machine hybrid environments. The Real-time Semantic Enrichment (RTSE) service adds relevant contextual information to raw data streams from domain knowledge bases using declarative rules. The rules define how the added semantics and context information are derived and stored in a semantic knowledge base. The software framework exposes contextual information from a variety of hard and soft data sources in a fast, reliable manner so that an informed decision can be made using semantic queries in intelligent software systems.

  4. Medical image classification based on multi-scale non-negative sparse coding.

    PubMed

    Zhang, Ruijie; Shen, Jian; Wei, Fushan; Li, Xiong; Sangaiah, Arun Kumar

    2017-11-01

    With the rapid development of modern medical imaging technology, medical image classification has become more and more important in medical diagnosis and clinical practice. Conventional medical image classification algorithms usually neglect the semantic gap problem between low-level features and high-level image semantic, which will largely degrade the classification performance. To solve this problem, we propose a multi-scale non-negative sparse coding based medical image classification algorithm. Firstly, Medical images are decomposed into multiple scale layers, thus diverse visual details can be extracted from different scale layers. Secondly, for each scale layer, the non-negative sparse coding model with fisher discriminative analysis is constructed to obtain the discriminative sparse representation of medical images. Then, the obtained multi-scale non-negative sparse coding features are combined to form a multi-scale feature histogram as the final representation for a medical image. Finally, SVM classifier is combined to conduct medical image classification. The experimental results demonstrate that our proposed algorithm can effectively utilize multi-scale and contextual spatial information of medical images, reduce the semantic gap in a large degree and improve medical image classification performance. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Lack of semantic priming effects in famous person recognition in Mild Cognitive Impairment.

    PubMed

    Brambati, Simona M; Peters, Frédéric; Belleville, Sylvie; Joubert, Sven

    2012-04-01

    Growing evidence indicates that individuals with Mild Cognitive Impairment (MCI) manifest semantic deficits that are often more severe for items that are characterized by a unique semantic and lexical association, such as famous people and famous buildings, than common concepts, such as objects. However, it is still controversial whether the semantic deficits observed in MCI are determined by a degradation of semantic information or by a deficit in intentional access to semantic knowledge. Here we used a semantic priming task in order to assess the integrity of the semantic system without requiring explicit access to this system. This paradigm may provide new insights in clarifying the nature of the semantic deficits in MCI. We assessed the semantic and repetition priming effect in 13 individuals with MCI and 13 age-matched controls who engaged in a familiarity judgment task of famous names. In the semantic priming condition, the prime was the name of a member of the same occupation category as the target (Tom Cruise-Brad Pitt), while in the repetition priming condition the prime was the same name as the target (Charlie Chaplin-Charlie Chaplin). The results showed a defective priming effect in MCI in the semantic but not in the repetition priming condition. Specifically, when compared to controls, MCI patients did not show a facilitation effect in responding to the same occupation prime-target pairs, but they showed an equivalent facilitation effect when the target was the same name as the prime. The present results provide support to the hypothesis that the semantic impairments observed in MCI cannot be uniquely ascribed to a deficit in intentional access to semantic information. Instead, these findings point to the semantic nature of these deficits and, in particular, to a degraded representation of semantic information concerning famous people. Copyright © 2011 Elsevier Srl. All rights reserved.

  6. An individual differences approach to semantic cognition: Divergent effects of age on representation, retrieval and selection.

    PubMed

    Hoffman, Paul

    2018-05-25

    Semantic cognition refers to the appropriate use of acquired knowledge about the world. This requires representation of knowledge as well as control processes which ensure that currently-relevant aspects of knowledge are retrieved and selected. Although these abilities can be impaired selectively following brain damage, the relationship between them in healthy individuals is unclear. It is also commonly assumed that semantic cognition is preserved in later life, because older people have greater reserves of knowledge. However, this claim overlooks the possibility of decline in semantic control processes. Here, semantic cognition was assessed in 100 young and older adults. Despite having a broader knowledge base, older people showed specific impairments in semantic control, performing more poorly than young people when selecting among competing semantic representations. Conversely, they showed preserved controlled retrieval of less salient information from the semantic store. Breadth of semantic knowledge was positively correlated with controlled retrieval but was unrelated to semantic selection ability, which was instead correlated with non-semantic executive function. These findings indicate that three distinct elements contribute to semantic cognition: semantic representations that accumulate throughout the lifespan, processes for controlled retrieval of less salient semantic information, which appear age-invariant, and mechanisms for selecting task-relevant aspects of semantic knowledge, which decline with age and may relate more closely to domain-general executive control.

  7. LDRD final report :

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brost, Randolph C.; McLendon, William Clarence,

    2013-01-01

    Modeling geospatial information with semantic graphs enables search for sites of interest based on relationships between features, without requiring strong a priori models of feature shape or other intrinsic properties. Geospatial semantic graphs can be constructed from raw sensor data with suitable preprocessing to obtain a discretized representation. This report describes initial work toward extending geospatial semantic graphs to include temporal information, and initial results applying semantic graph techniques to SAR image data. We describe an efficient graph structure that includes geospatial and temporal information, which is designed to support simultaneous spatial and temporal search queries. We also report amore » preliminary implementation of feature recognition, semantic graph modeling, and graph search based on input SAR data. The report concludes with lessons learned and suggestions for future improvements.« less

  8. Operationalizing Semantic Medline for meeting the information needs at point of care.

    PubMed

    Rastegar-Mojarad, Majid; Li, Dingcheng; Liu, Hongfang

    2015-01-01

    Scientific literature is one of the popular resources for providing decision support at point of care. It is highly desirable to bring the most relevant literature to support the evidence-based clinical decision making process. Motivated by the recent advance in semantically enhanced information retrieval, we have developed a system, which aims to bring semantically enriched literature, Semantic Medline, to meet the information needs at point of care. This study reports our work towards operationalizing the system for real time use. We demonstrate that the migration of a relational database implementation to a NoSQL (Not only SQL) implementation significantly improves the performance and makes the use of Semantic Medline at point of care decision support possible.

  9. Preserved semantic access in global amnesia and hippocampal damage.

    PubMed

    Giovagnoli, A R; Erbetta, A; Bugiani, O

    2001-12-01

    C.B., a right-handed 33-year-old man, presented with anterograde amnesia after acute heart block. Cognitive abilities were normal except for serious impairment of long-term episodic memory. The access to semantic information was fully preserved. Magnetic resonance showed high signal intensity and marked volume loss in the hippocampus bilaterally; the left and right parahippocampal gyrus, lateral occipito-temporal gyrus, inferior temporal gyrus, and lateral temporal cortex were normal. This case underlines that global amnesia associated with hippocampal damage does not affect semantic memory. Although the hippocampus is important in retrieving context-linked information, its role is not so crucial in retrieving semantic contents. Cortical areas surrounding the hippocampus and lateral temporal areas might guide the recall of semantic information.

  10. Operationalizing Semantic Medline for meeting the information needs at point of care

    PubMed Central

    Rastegar-Mojarad, Majid; Li, Dingcheng; Liu, Hongfang

    2015-01-01

    Scientific literature is one of the popular resources for providing decision support at point of care. It is highly desirable to bring the most relevant literature to support the evidence-based clinical decision making process. Motivated by the recent advance in semantically enhanced information retrieval, we have developed a system, which aims to bring semantically enriched literature, Semantic Medline, to meet the information needs at point of care. This study reports our work towards operationalizing the system for real time use. We demonstrate that the migration of a relational database implementation to a NoSQL (Not only SQL) implementation significantly improves the performance and makes the use of Semantic Medline at point of care decision support possible. PMID:26306259

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crain, Steven P.; Yang, Shuang-Hong; Zha, Hongyuan

    Access to health information by consumers is ham- pered by a fundamental language gap. Current attempts to close the gap leverage consumer oriented health information, which does not, however, have good coverage of slang medical terminology. In this paper, we present a Bayesian model to automatically align documents with different dialects (slang, com- mon and technical) while extracting their semantic topics. The proposed diaTM model enables effective information retrieval, even when the query contains slang words, by explicitly modeling the mixtures of dialects in documents and the joint influence of dialects and topics on word selection. Simulations us- ing consumermore » questions to retrieve medical information from a corpus of medical documents show that diaTM achieves a 25% improvement in information retrieval relevance by nDCG@5 over an LDA baseline.« less

  12. Social Networking on the Semantic Web

    ERIC Educational Resources Information Center

    Finin, Tim; Ding, Li; Zhou, Lina; Joshi, Anupam

    2005-01-01

    Purpose: Aims to investigate the way that the semantic web is being used to represent and process social network information. Design/methodology/approach: The Swoogle semantic web search engine was used to construct several large data sets of Resource Description Framework (RDF) documents with social network information that were encoded using the…

  13. Knowledge Extraction and Semantic Annotation of Text from the Encyclopedia of Life

    PubMed Central

    Thessen, Anne E.; Parr, Cynthia Sims

    2014-01-01

    Numerous digitization and ontological initiatives have focused on translating biological knowledge from narrative text to machine-readable formats. In this paper, we describe two workflows for knowledge extraction and semantic annotation of text data objects featured in an online biodiversity aggregator, the Encyclopedia of Life. One workflow tags text with DBpedia URIs based on keywords. Another workflow finds taxon names in text using GNRD for the purpose of building a species association network. Both workflows work well: the annotation workflow has an F1 Score of 0.941 and the association algorithm has an F1 Score of 0.885. Existing text annotators such as Terminizer and DBpedia Spotlight performed well, but require some optimization to be useful in the ecology and evolution domain. Important future work includes scaling up and improving accuracy through the use of distributional semantics. PMID:24594988

  14. Social Semantics for an Effective Enterprise

    NASA Technical Reports Server (NTRS)

    Berndt, Sarah; Doane, Mike

    2012-01-01

    An evolution of the Semantic Web, the Social Semantic Web (s2w), facilitates knowledge sharing with "useful information based on human contributions, which gets better as more people participate." The s2w reaches beyond the search box to move us from a collection of hyperlinked facts, to meaningful, real time context. When focused through the lens of Enterprise Search, the Social Semantic Web facilitates the fluid transition of meaningful business information from the source to the user. It is the confluence of human thought and computer processing structured with the iterative application of taxonomies, folksonomies, ontologies, and metadata schemas. The importance and nuances of human interaction are often deemphasized when focusing on automatic generation of semantic markup, which results in dissatisfied users and unrealized return on investment. Users consistently qualify the value of information sets through the act of selection, making them the de facto stakeholders of the Social Semantic Web. Employers are the ultimate beneficiaries of s2w utilization with a better informed, more decisive workforce; one not achieved with an IT miracle technology, but by improved human-computer interactions. Johnson Space Center Taxonomist Sarah Berndt and Mike Doane, principal owner of Term Management, LLC discuss the planning, development, and maintenance stages for components of a semantic system while emphasizing the necessity of a Social Semantic Web for the Enterprise. Identification of risks and variables associated with layering the successful implementation of a semantic system are also modeled.

  15. Dissociation of lexical syntax and semantics: evidence from focal cortical degeneration.

    PubMed

    Garrard, P; Carroll, E; Vinson, D; Vigliocco, G

    2004-10-01

    The question of whether information relevant to meaning (semantics) and structure (syntax) relies on a common language processor or on separate subsystems has proved difficult to address definitively because of the confounds involved in comparing the two types of information. At the sentence level syntactic and semantic judgments make different cognitive demands, while at the single word level, the most commonly used syntactic distinction (between nouns and verbs) is confounded with a fundamental semantic difference (between objects and actions). The present study employs a different syntactic contrast (between count nouns and mass nouns), which is crossed with a semantic difference (between naturally occurring and man-made substances) applying to words within a circumscribed semantic field (foodstuffs). We show, first, that grammaticality judgments of a patient with semantic dementia are indistinguishable from those of a group of age-matched controls, and are similar regardless of the status of his semantic knowledge about the item. In a second experiment we use the triadic task in a group of age-matched controls to show that similarity judgments are influenced not only by meaning (natural vs. manmade), but also implicitly by syntactic information (count vs. mass). Using the same task in a patient with semantic dementia we show that the semantic influences on the syntactic dimension are unlikely to account for this pattern in normals. These data are discussed in relation to modular vs. nonmodular models of language processing, and in particular to the semantic-syntactic distinction.

  16. Information Warfare: Evaluation of Operator Information Processing Models

    DTIC Science & Technology

    1997-10-01

    that people can describe or report, including both episodic and semantic information. Declarative memory contains a network of knowledge represented...second dimension corresponds roughly to the distinction between episodic and semantic memory that is commonly made in cognitive psychology. Episodic ...3 is long-term memory for the discourse, a subset of episodic memory . Partition 4 is long-term semantic memory , or the knowledge-base. According to

  17. Using Medical Text Extraction, Reasoning and Mapping System (MTERMS) to Process Medication Information in Outpatient Clinical Notes

    PubMed Central

    Zhou, Li; Plasek, Joseph M; Mahoney, Lisa M; Karipineni, Neelima; Chang, Frank; Yan, Xuemin; Chang, Fenny; Dimaggio, Dana; Goldman, Debora S.; Rocha, Roberto A.

    2011-01-01

    Clinical information is often coded using different terminologies, and therefore is not interoperable. Our goal is to develop a general natural language processing (NLP) system, called Medical Text Extraction, Reasoning and Mapping System (MTERMS), which encodes clinical text using different terminologies and simultaneously establishes dynamic mappings between them. MTERMS applies a modular, pipeline approach flowing from a preprocessor, semantic tagger, terminology mapper, context analyzer, and parser to structure inputted clinical notes. Evaluators manually reviewed 30 free-text and 10 structured outpatient clinical notes compared to MTERMS output. MTERMS achieved an overall F-measure of 90.6 and 94.0 for free-text and structured notes respectively for medication and temporal information. The local medication terminology had 83.0% coverage compared to RxNorm’s 98.0% coverage for free-text notes. 61.6% of mappings between the terminologies are exact match. Capture of duration was significantly improved (91.7% vs. 52.5%) from systems in the third i2b2 challenge. PMID:22195230

  18. Robust Deep Semantics for Language Understanding

    DTIC Science & Technology

    focus on five areas: deep learning, textual inferential relations, relation and event extraction by distant supervision , semantic parsing and...ontology expansion, and coreference resolution. As time went by, the program focus converged towards emphasizing technologies for knowledge base...natural logic methods for text understanding, improved mention coreference algorithms, and the further development of multilingual tools in CoreNLP.

  19. Information extraction from multi-institutional radiology reports.

    PubMed

    Hassanpour, Saeed; Langlotz, Curtis P

    2016-01-01

    The radiology report is the most important source of clinical imaging information. It documents critical information about the patient's health and the radiologist's interpretation of medical findings. It also communicates information to the referring physicians and records that information for future clinical and research use. Although efforts to structure some radiology report information through predefined templates are beginning to bear fruit, a large portion of radiology report information is entered in free text. The free text format is a major obstacle for rapid extraction and subsequent use of information by clinicians, researchers, and healthcare information systems. This difficulty is due to the ambiguity and subtlety of natural language, complexity of described images, and variations among different radiologists and healthcare organizations. As a result, radiology reports are used only once by the clinician who ordered the study and rarely are used again for research and data mining. In this work, machine learning techniques and a large multi-institutional radiology report repository are used to extract the semantics of the radiology report and overcome the barriers to the re-use of radiology report information in clinical research and other healthcare applications. We describe a machine learning system to annotate radiology reports and extract report contents according to an information model. This information model covers the majority of clinically significant contents in radiology reports and is applicable to a wide variety of radiology study types. Our automated approach uses discriminative sequence classifiers for named-entity recognition to extract and organize clinically significant terms and phrases consistent with the information model. We evaluated our information extraction system on 150 radiology reports from three major healthcare organizations and compared its results to a commonly used non-machine learning information extraction method. We also evaluated the generalizability of our approach across different organizations by training and testing our system on data from different organizations. Our results show the efficacy of our machine learning approach in extracting the information model's elements (10-fold cross-validation average performance: precision: 87%, recall: 84%, F1 score: 85%) and its superiority and generalizability compared to the common non-machine learning approach (p-value<0.05). Our machine learning information extraction approach provides an effective automatic method to annotate and extract clinically significant information from a large collection of free text radiology reports. This information extraction system can help clinicians better understand the radiology reports and prioritize their review process. In addition, the extracted information can be used by researchers to link radiology reports to information from other data sources such as electronic health records and the patient's genome. Extracted information also can facilitate disease surveillance, real-time clinical decision support for the radiologist, and content-based image retrieval. Copyright © 2015 Elsevier B.V. All rights reserved.

  20. Mapping a Nursing Terminology Subset to openEHR Archetypes. A Case Study of the International Classification for Nursing Practice.

    PubMed

    Nogueira, J R M; Cook, T W; Cavalini, L T

    2015-01-01

    Healthcare information technologies have the potential to transform nursing care. However, healthcare information systems based on conventional software architecture are not semantically interoperable and have high maintenance costs. Health informatics standards, such as controlled terminologies, have been proposed to improve healthcare information systems, but their implementation in conventional software has not been enough to overcome the current challenge. Such obstacles could be removed by adopting a multilevel model-driven approach, such as the openEHR specifications, in nursing information systems. To create an openEHR archetype model for the Functional Status concepts as published in Nursing Outcome Indicators Catalog of the International Classification for Nursing Practice (NOIC-ICNP). Four methodological steps were followed: 1) extraction of terms from the NOIC-ICNP terminology; 2) identification of previously published openEHR archetypes; 3) assessment of the adequacy of those openEHR archetypes to represent the terms; and 4) development of new openEHR archetypes when required. The "Barthel Index" archetype was retrieved and mapped to the 68 NOIC-ICNP Functional Status terms. There were 19 exact matches between a term and the correspondent archetype node and 23 archetype nodes that matched to one or more NOIC-INCP. No matches were found between the archetype and 14 of the NOIC-ICNP terms, and nine archetype nodes did not match any of the NOIC-ICNP terms. The openEHR model was sufficient to represent the semantics of the Functional Status concept according to the NOIC-ICNP, but there were differences in data granularity between the terminology and the archetype, thus producing a significantly complex mapping, which could be difficult to implement in real healthcare information systems. However, despite the technological complexity, the present study demonstrated the feasibility of mapping nursing terminologies to openEHR archetypes, which emphasizes the importance of adopting the multilevel model-driven approach for the achievement of semantic interoperability between healthcare information systems.

  1. ccML, a new mark-up language to improve ISO/EN 13606-based electronic health record extracts practical edition

    PubMed Central

    Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Cáceres, Jesús; Somolinos, Roberto; Pascual, Mario; Martínez, Ignacio; Salvador, Carlos H; Monteagudo, José Luis

    2013-01-01

    Objective The objective of this paper is to introduce a new language called ccML, designed to provide convenient pragmatic information to applications using the ISO/EN13606 reference model (RM), such as electronic health record (EHR) extracts editors. EHR extracts are presently built using the syntactic and semantic information provided in the RM and constrained by archetypes. The ccML extra information enables the automation of the medico-legal context information edition, which is over 70% of the total in an extract, without modifying the RM information. Materials and Methods ccML is defined using a W3C XML schema file. Valid ccML files complement the RM with additional pragmatics information. The ccML language grammar is defined using formal language theory as a single-type tree grammar. The new language is tested using an EHR extracts editor application as proof-of-concept system. Results Seven ccML PVCodes (predefined value codes) are introduced in this grammar to cope with different realistic EHR edition situations. These seven PVCodes have different interpretation strategies, from direct look up in the ccML file itself, to more complex searches in archetypes or system precomputation. Discussion The possibility to declare generic types in ccML gives rise to ambiguity during interpretation. The criterion used to overcome ambiguity is that specificity should prevail over generality. The opposite would make the individual specific element declarations useless. Conclusion A new mark-up language ccML is introduced that opens up the possibility of providing applications using the ISO/EN13606 RM with the necessary pragmatics information to be practical and realistic. PMID:23019241

  2. Perceptual Versus Semantic Information Processing in Semantic Category Decisions.

    ERIC Educational Resources Information Center

    Kamil, Michael L.; Hanson, Raymond H.

    This study examined the ability of junior high school students to use advance information when making semantic category decisions. The subjects, eight good readers and eight poor readers, identified paired words as "same" or "different" in category, with some words more highly associated with the category than others--in the "fruit" category, for…

  3. Comprehensive Analysis of Semantic Web Reasoners and Tools: A Survey

    ERIC Educational Resources Information Center

    Khamparia, Aditya; Pandey, Babita

    2017-01-01

    Ontologies are emerging as best representation techniques for knowledge based context domains. The continuing need for interoperation, collaboration and effective information retrieval has lead to the creation of semantic web with the help of tools and reasoners which manages personalized information. The future of semantic web lies in an ontology…

  4. A trade-off between local and distributed information processing associated with remote episodic versus semantic memory.

    PubMed

    Heisz, Jennifer J; Vakorin, Vasily; Ross, Bernhard; Levine, Brian; McIntosh, Anthony R

    2014-01-01

    Episodic memory and semantic memory produce very different subjective experiences yet rely on overlapping networks of brain regions for processing. Traditional approaches for characterizing functional brain networks emphasize static states of function and thus are blind to the dynamic information processing within and across brain regions. This study used information theoretic measures of entropy to quantify changes in the complexity of the brain's response as measured by magnetoencephalography while participants listened to audio recordings describing past personal episodic and general semantic events. Personal episodic recordings evoked richer subjective mnemonic experiences and more complex brain responses than general semantic recordings. Critically, we observed a trade-off between the relative contribution of local versus distributed entropy, such that personal episodic recordings produced relatively more local entropy whereas general semantic recordings produced relatively more distributed entropy. Changes in the relative contributions of local and distributed entropy to the total complexity of the system provides a potential mechanism that allows the same network of brain regions to represent cognitive information as either specific episodes or more general semantic knowledge.

  5. Performance impact of stop lists and morphological decomposition on word-word corpus-based semantic space models.

    PubMed

    Keith, Jeff; Westbury, Chris; Goldman, James

    2015-09-01

    Corpus-based semantic space models, which primarily rely on lexical co-occurrence statistics, have proven effective in modeling and predicting human behavior in a number of experimental paradigms that explore semantic memory representation. The most widely studied extant models, however, are strongly influenced by orthographic word frequency (e.g., Shaoul & Westbury, Behavior Research Methods, 38, 190-195, 2006). This has the implication that high-frequency closed-class words can potentially bias co-occurrence statistics. Because these closed-class words are purported to carry primarily syntactic, rather than semantic, information, the performance of corpus-based semantic space models may be improved by excluding closed-class words (using stop lists) from co-occurrence statistics, while retaining their syntactic information through other means (e.g., part-of-speech tagging and/or affixes from inflected word forms). Additionally, very little work has been done to explore the effect of employing morphological decomposition on the inflected forms of words in corpora prior to compiling co-occurrence statistics, despite (controversial) evidence that humans perform early morphological decomposition in semantic processing. In this study, we explored the impact of these factors on corpus-based semantic space models. From this study, morphological decomposition appears to significantly improve performance in word-word co-occurrence semantic space models, providing some support for the claim that sublexical information-specifically, word morphology-plays a role in lexical semantic processing. An overall decrease in performance was observed in models employing stop lists (e.g., excluding closed-class words). Furthermore, we found some evidence that weakens the claim that closed-class words supply primarily syntactic information in word-word co-occurrence semantic space models.

  6. XSemantic: An Extension of LCA Based XML Semantic Search

    NASA Astrophysics Data System (ADS)

    Supasitthimethee, Umaporn; Shimizu, Toshiyuki; Yoshikawa, Masatoshi; Porkaew, Kriengkrai

    One of the most convenient ways to query XML data is a keyword search because it does not require any knowledge of XML structure or learning a new user interface. However, the keyword search is ambiguous. The users may use different terms to search for the same information. Furthermore, it is difficult for a system to decide which node is likely to be chosen as a return node and how much information should be included in the result. To address these challenges, we propose an XML semantic search based on keywords called XSemantic. On the one hand, we give three definitions to complete in terms of semantics. Firstly, the semantic term expansion, our system is robust from the ambiguous keywords by using the domain ontology. Secondly, to return semantic meaningful answers, we automatically infer the return information from the user queries and take advantage of the shortest path to return meaningful connections between keywords. Thirdly, we present the semantic ranking that reflects the degree of similarity as well as the semantic relationship so that the search results with the higher relevance are presented to the users first. On the other hand, in the LCA and the proximity search approaches, we investigated the problem of information included in the search results. Therefore, we introduce the notion of the Lowest Common Element Ancestor (LCEA) and define our simple rule without any requirement on the schema information such as the DTD or XML Schema. The first experiment indicated that XSemantic not only properly infers the return information but also generates compact meaningful results. Additionally, the benefits of our proposed semantics are demonstrated by the second experiment.

  7. Selective Audiovisual Semantic Integration Enabled by Feature-Selective Attention.

    PubMed

    Li, Yuanqing; Long, Jinyi; Huang, Biao; Yu, Tianyou; Wu, Wei; Li, Peijun; Fang, Fang; Sun, Pei

    2016-01-13

    An audiovisual object may contain multiple semantic features, such as the gender and emotional features of the speaker. Feature-selective attention and audiovisual semantic integration are two brain functions involved in the recognition of audiovisual objects. Humans often selectively attend to one or several features while ignoring the other features of an audiovisual object. Meanwhile, the human brain integrates semantic information from the visual and auditory modalities. However, how these two brain functions correlate with each other remains to be elucidated. In this functional magnetic resonance imaging (fMRI) study, we explored the neural mechanism by which feature-selective attention modulates audiovisual semantic integration. During the fMRI experiment, the subjects were presented with visual-only, auditory-only, or audiovisual dynamical facial stimuli and performed several feature-selective attention tasks. Our results revealed that a distribution of areas, including heteromodal areas and brain areas encoding attended features, may be involved in audiovisual semantic integration. Through feature-selective attention, the human brain may selectively integrate audiovisual semantic information from attended features by enhancing functional connectivity and thus regulating information flows from heteromodal areas to brain areas encoding the attended features.

  8. [Effects of punctuation on the processing of syntactically ambiguous Japanese sentences with a semantic bias].

    PubMed

    Niikuni, Keiyu; Muramoto, Toshiaki

    2014-06-01

    This study explored the effects of a comma on the processing of structurally ambiguous Japanese sentences with a semantic bias. A previous study has shown that a comma which is incompatible with an ambiguous sentence's semantic bias affects the processing of the sentence, but the effects of a comma that is compatible with the bias are unclear. In the present study, we examined the role of a comma compatible with the sentence's semantic bias using the self-paced reading method, which enabled us to determine the reading times for the region of the sentence where readers would be expected to solve the ambiguity using semantic information (the "target region"). The results show that a comma significantly increases the reading time of the punctuated word but decreases the reading time in the target region. We concluded that even if the semantic information provided might be sufficient for disambiguation, the insertion of a comma would affect the processing cost of the ambiguity, indicating that readers use both the comma and semantic information in parallel for sentence processing.

  9. CNN-based ranking for biomedical entity normalization.

    PubMed

    Li, Haodi; Chen, Qingcai; Tang, Buzhou; Wang, Xiaolong; Xu, Hua; Wang, Baohua; Huang, Dong

    2017-10-03

    Most state-of-the-art biomedical entity normalization systems, such as rule-based systems, merely rely on morphological information of entity mentions, but rarely consider their semantic information. In this paper, we introduce a novel convolutional neural network (CNN) architecture that regards biomedical entity normalization as a ranking problem and benefits from semantic information of biomedical entities. The CNN-based ranking method first generates candidates using handcrafted rules, and then ranks the candidates according to their semantic information modeled by CNN as well as their morphological information. Experiments on two benchmark datasets for biomedical entity normalization show that our proposed CNN-based ranking method outperforms traditional rule-based method with state-of-the-art performance. We propose a CNN architecture that regards biomedical entity normalization as a ranking problem. Comparison results show that semantic information is beneficial to biomedical entity normalization and can be well combined with morphological information in our CNN architecture for further improvement.

  10. Rapid extraction of gist from visual text and its influence on word recognition.

    PubMed

    Asano, Michiko; Yokosawa, Kazuhiko

    2011-01-01

    Two experiments explored rapid extraction of gist from a visual text and its influence on word recognition. In both, a short text (sentence) containing a target word was presented for 200 ms and was followed by a target recognition task. Results showed that participants recognized contextually anomalous word targets less frequently than contextually consistent counterparts (Experiment 1). This context effect was obtained when sentences contained the same semantic content but with disrupted syntactic structure (Experiment 2). Results demonstrate that words in a briefly presented visual sentence are processed in parallel and that rapid extraction of sentence gist relies on a primitive representation of sentence context (termed protocontext) that is semantically activated by the simultaneous presentation of multiple words (i.e., a sentence) before syntactic processing.

  11. Developmental changes in the neural influence of sublexical information on semantic processing.

    PubMed

    Lee, Shu-Hui; Booth, James R; Chou, Tai-Li

    2015-07-01

    Functional magnetic resonance imaging (fMRI) was used to examine the developmental changes in a group of normally developing children (aged 8-12) and adolescents (aged 13-16) during semantic processing. We manipulated association strength (i.e. a global reading unit) and semantic radical (i.e. a local reading unit) to explore the interaction of lexical and sublexical semantic information in making semantic judgments. In the semantic judgment task, two types of stimuli were used: visually-similar (i.e. shared a semantic radical) versus visually-dissimilar (i.e. did not share a semantic radical) character pairs. Participants were asked to indicate if two Chinese characters, arranged according to association strength, were related in meaning. The results showed greater developmental increases in activation in left angular gyrus (BA 39) in the visually-similar compared to the visually-dissimilar pairs for the strong association. There were also greater age-related increases in angular gyrus for the strong compared to weak association in the visually-similar pairs. Both of these results suggest that shared semantics at the sublexical level facilitates the integration of overlapping features at the lexical level in older children. In addition, there was a larger developmental increase in left posterior middle temporal gyrus (BA 21) for the weak compared to strong association in the visually-dissimilar pairs, suggesting conflicting sublexical information placed greater demands on access to lexical representations in the older children. All together, these results suggest that older children are more sensitive to sublexical information when processing lexical representations. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Automatic identification of comparative effectiveness research from Medline citations to support clinicians’ treatment information needs

    PubMed Central

    Zhang, Mingyuan; Fiol, Guilherme Del; Grout, Randall W.; Jonnalagadda, Siddhartha; Medlin, Richard; Mishra, Rashmi; Weir, Charlene; Liu, Hongfang; Mostafa, Javed; Fiszman, Marcelo

    2014-01-01

    Online knowledge resources such as Medline can address most clinicians’ patient care information needs. Yet, significant barriers, notably lack of time, limit the use of these sources at the point of care. The most common information needs raised by clinicians are treatment-related. Comparative effectiveness studies allow clinicians to consider multiple treatment alternatives for a particular problem. Still, solutions are needed to enable efficient and effective consumption of comparative effectiveness research at the point of care. Objective Design and assess an algorithm for automatically identifying comparative effectiveness studies and extracting the interventions investigated in these studies. Methods The algorithm combines semantic natural language processing, Medline citation metadata, and machine learning techniques. We assessed the algorithm in a case study of treatment alternatives for depression. Results Both precision and recall for identifying comparative studies was 0.83. A total of 86% of the interventions extracted perfectly or partially matched the gold standard. Conclusion Overall, the algorithm achieved reasonable performance. The method provides building blocks for the automatic summarization of comparative effectiveness research to inform point of care decision-making. PMID:23920677

  13. Supporting Semantic Annotation of Educational Content by Automatic Extraction of Hierarchical Domain Relationships

    ERIC Educational Resources Information Center

    Vrablecová, Petra; Šimko, Marián

    2016-01-01

    The domain model is an essential part of an adaptive learning system. For each educational course, it involves educational content and semantics, which is also viewed as a form of conceptual metadata about educational content. Due to the size of a domain model, manual domain model creation is a challenging and demanding task for teachers or…

  14. Impact of Machine-Translated Text on Entity and Relationship Extraction

    DTIC Science & Technology

    2014-12-01

    20 1 1. Introduction Using social network analysis tools is an important asset in...semantic modeling software to automatically build detailed network models from unstructured text. Contour imports unstructured text and then maps the text...onto an existing ontology of frames at the sentence level, using FrameNet, a structured language model, and through Semantic Role Labeling ( SRL

  15. Semantic Web and Contextual Information: Semantic Network Analysis of Online Journalistic Texts

    NASA Astrophysics Data System (ADS)

    Lim, Yon Soo

    This study examines why contextual information is important to actualize the idea of semantic web, based on a case study of a socio-political issue in South Korea. For this study, semantic network analyses were conducted regarding English-language based 62 blog posts and 101 news stories on the web. The results indicated the differences of the meaning structures between blog posts and professional journalism as well as between conservative journalism and progressive journalism. From the results, this study ascertains empirical validity of current concerns about the practical application of the new web technology, and discusses how the semantic web should be developed.

  16. Remote semantic memory for public figures in HIV infection, alcoholism, and their comorbidity.

    PubMed

    Fama, Rosemary; Rosenbloom, Margaret J; Sassoon, Stephanie A; Thompson, Megan A; Pfefferbaum, Adolf; Sullivan, Edith V

    2011-02-01

    Impairments in component processes of working and episodic memory mark both HIV infection and chronic alcoholism, with compounded deficits often observed in individuals comorbid for these conditions. Remote semantic memory processes, however, have only seldom been studied in these diagnostic groups. Examination of remote semantic memory could provide insight into the underlying processes associated with storage and retrieval of learned information over extended time periods while elucidating spared and impaired cognitive functions in these clinical groups. We examined component processes of remote semantic memory in HIV infection and chronic alcoholism in 4 subject groups (HIV, ALC, HIV + ALC, and age-matched healthy adults) using a modified version of the Presidents Test. Free recall, recognition, and sequencing of presidential candidates and election dates were assessed. In addition, component processes of working, episodic, and semantic memory were assessed with ancillary cognitive tests. The comorbid group (HIV + ALC) was significantly impaired on sequencing of remote semantic information compared with age-matched healthy adults. Free recall of remote semantic information was also modestly impaired in the HIV + ALC group, but normal performance for recognition of this information was observed. Few differences were observed between the single diagnosis groups (HIV, ALC) and healthy adults, although examination of the component processes underlying remote semantic memory scores elicited differences between the HIV and ALC groups. Selective remote memory processes were related to lifetime alcohol consumption in the ALC group and to viral load and depression level in the HIV group. Hepatitis C diagnosis was associated with lower remote semantic memory scores in all 3 clinical groups. Education level did not account for group differences reported. This study provides behavioral support for the existence of adverse effects associated with the comorbidity of HIV infection and chronic alcoholism on selective component processes of memory function, with untoward effects exacerbated by Hepatitis C infection. The pattern of remote semantic memory function in HIV + ALC is consistent with those observed in neurological conditions primarily affecting frontostriatal pathways and suggests that remote memory dysfunction in HIV + ALC may be a result of impaired retrieval processes rather than loss of remote semantic information per se. Copyright © 2010 by the Research Society on Alcoholism.

  17. Examining lateralized semantic access using pictures.

    PubMed

    Lovseth, Kyle; Atchley, Ruth Ann

    2010-03-01

    A divided visual field (DVF) experiment examined the semantic processing strategies employed by the cerebral hemispheres to determine if strategies observed with written word stimuli generalize to other media for communicating semantic information. We employed picture stimuli and vary the degree of semantic relatedness between the picture pairs. Participants made an on-line semantic relatedness judgment in response to sequentially presented pictures. We found that when pictures are presented to the right hemisphere responses are generally more accurate than the left hemisphere for semantic relatedness judgments for picture pairs. Furthermore, consistent with earlier DVF studies employing words, we conclude that the RH is better at accessing or maintaining access to information that has a weak or more remote semantic relationship. We also found evidence of faster access for pictures presented to the LH in the strongly-related condition. Overall, these results are consistent with earlier DVF word studies that argue that the cerebral hemispheres each play an important and separable role during semantic retrieval. Copyright 2009 Elsevier Inc. All rights reserved.

  18. Biomedical semantics in the Semantic Web

    PubMed Central

    2011-01-01

    The Semantic Web offers an ideal platform for representing and linking biomedical information, which is a prerequisite for the development and application of analytical tools to address problems in data-intensive areas such as systems biology and translational medicine. As for any new paradigm, the adoption of the Semantic Web offers opportunities and poses questions and challenges to the life sciences scientific community: which technologies in the Semantic Web stack will be more beneficial for the life sciences? Is biomedical information too complex to benefit from simple interlinked representations? What are the implications of adopting a new paradigm for knowledge representation? What are the incentives for the adoption of the Semantic Web, and who are the facilitators? Is there going to be a Semantic Web revolution in the life sciences? We report here a few reflections on these questions, following discussions at the SWAT4LS (Semantic Web Applications and Tools for Life Sciences) workshop series, of which this Journal of Biomedical Semantics special issue presents selected papers from the 2009 edition, held in Amsterdam on November 20th. PMID:21388570

  19. Biomedical semantics in the Semantic Web.

    PubMed

    Splendiani, Andrea; Burger, Albert; Paschke, Adrian; Romano, Paolo; Marshall, M Scott

    2011-03-07

    The Semantic Web offers an ideal platform for representing and linking biomedical information, which is a prerequisite for the development and application of analytical tools to address problems in data-intensive areas such as systems biology and translational medicine. As for any new paradigm, the adoption of the Semantic Web offers opportunities and poses questions and challenges to the life sciences scientific community: which technologies in the Semantic Web stack will be more beneficial for the life sciences? Is biomedical information too complex to benefit from simple interlinked representations? What are the implications of adopting a new paradigm for knowledge representation? What are the incentives for the adoption of the Semantic Web, and who are the facilitators? Is there going to be a Semantic Web revolution in the life sciences?We report here a few reflections on these questions, following discussions at the SWAT4LS (Semantic Web Applications and Tools for Life Sciences) workshop series, of which this Journal of Biomedical Semantics special issue presents selected papers from the 2009 edition, held in Amsterdam on November 20th.

  20. I See What You Mean: Theta Power Increases Are Involved in the Retrieval of Lexical Semantic Information

    ERIC Educational Resources Information Center

    Bastiaansen, Marcel C. M.; Oostenveld, Robert; Jensen, Ole; Hagoort, Peter

    2008-01-01

    An influential hypothesis regarding the neural basis of the mental lexicon is that semantic representations are neurally implemented as distributed networks carrying sensory, motor and/or more abstract functional information. This work investigates whether the semantic properties of words partly determine the topography of such networks. Subjects…

  1. Intersection Detection Based on Qualitative Spatial Reasoning on Stopping Point Clusters

    NASA Astrophysics Data System (ADS)

    Zourlidou, S.; Sester, M.

    2016-06-01

    The purpose of this research is to propose and test a method for detecting intersections by analysing collectively acquired trajectories of moving vehicles. Instead of solely relying on the geometric features of the trajectories, such as heading changes, which may indicate turning points and consequently intersections, we extract semantic features of the trajectories in form of sequences of stops and moves. Under this spatiotemporal prism, the extracted semantic information which indicates where vehicles stop can reveal important locations, such as junctions. The advantage of the proposed approach in comparison with existing turning-points oriented approaches is that it can detect intersections even when not all the crossing road segments are sampled and therefore no turning points are observed in the trajectories. The challenge with this approach is that first of all, not all vehicles stop at the same location - thus, the stop-location is blurred along the direction of the road; this, secondly, leads to the effect that nearby junctions can induce similar stop-locations. As a first step, a density-based clustering is applied on the layer of stop observations and clusters of stop events are found. Representative points of the clusters are determined (one per cluster) and in a last step the existence of an intersection is clarified based on spatial relational cluster reasoning, with which less informative geospatial clusters, in terms of whether a junction exists and where its centre lies, are transformed in more informative ones. Relational reasoning criteria, based on the relative orientation of the clusters with their adjacent ones are discussed for making sense of the relation that connects them, and finally for forming groups of stop events that belong to the same junction.

  2. A model for indexing medical documents combining statistical and symbolic knowledge.

    PubMed

    Avillach, Paul; Joubert, Michel; Fieschi, Marius

    2007-10-11

    To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context. We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs). The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases. The use of several terminologies leads to more precise indexing. The improvement achieved in the models implementation performances as a result of using semantic relationships is encouraging.

  3. Extraction of UMLS® Concepts Using Apache cTAKES™ for German Language.

    PubMed

    Becker, Matthias; Böckmann, Britta

    2016-01-01

    Automatic information extraction of medical concepts and classification with semantic standards from medical reports is useful for standardization and for clinical research. This paper presents an approach for an UMLS concept extraction with a customized natural language processing pipeline for German clinical notes using Apache cTAKES. The objectives are, to test the natural language processing tool for German language if it is suitable to identify UMLS concepts and map these with SNOMED-CT. The German UMLS database and German OpenNLP models extended the natural language processing pipeline, so the pipeline can normalize to domain ontologies such as SNOMED-CT using the German concepts. For testing, the ShARe/CLEF eHealth 2013 training dataset translated into German was used. The implemented algorithms are tested with a set of 199 German reports, obtaining a result of average 0.36 F1 measure without German stemming, pre- and post-processing of the reports.

  4. A Model for Indexing Medical Documents Combining Statistical and Symbolic Knowledge.

    PubMed Central

    Avillach, Paul; Joubert, Michel; Fieschi, Marius

    2007-01-01

    OBJECTIVES: To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context. METHODS: We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs). RESULTS: The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases. CONCLUSIONS: The use of several terminologies leads to more precise indexing. The improvement achieved in the model’s implementation performances as a result of using semantic relationships is encouraging. PMID:18693792

  5. Knowledge of the human body: a distinct semantic domain.

    PubMed

    Coslett, H Branch; Saffran, Eleanor M; Schwoebel, John

    2002-08-13

    Patients with selective deficits in the naming and comprehension of animals, plants, and artifacts have been reported. These descriptions of specific semantic category deficits have contributed substantially to the understanding of the architecture of semantic representations. This study sought to further understanding of the organization of the semantic system by demonstrating that another semantic category, knowledge of the human body, may be selectively preserved. The performance of a patient with semantic dementia was compared with the performance of healthy controls on a variety of tasks assessing distinct types of body representations, including the body schema, body image, and body structural description. Despite substantial deficits on tasks involving language and knowledge of the world generally, the patient performed normally on all tests of body knowledge except body part naming; even in this naming task, however, her performance with body parts was significantly better than on artifacts. The demonstration that body knowledge may be preserved despite substantial semantic deficits involving other types of semantic information argues that body knowledge is a distinct and dissociable semantic category. These data are interpreted as support for a model of semantics that proposes that knowledge is distributed across different cortical regions reflecting the manner in which the information was acquired.

  6. A review of EO image information mining

    NASA Astrophysics Data System (ADS)

    Quartulli, Marco; Olaizola, Igor G.

    2013-01-01

    We analyze the state of the art of content-based retrieval in Earth observation image archives focusing on complete systems showing promise for operational implementation. The different paradigms at the basis of the main system families are introduced. The approaches taken are considered, focusing in particular on the phases after primitive feature extraction. The solutions envisaged for the issues related to feature simplification and synthesis, indexing, semantic labeling are reviewed. The methodologies for query specification and execution are evaluated. Conclusions are drawn on the state of published research in Earth observation (EO) mining.

  7. Remembering 'zeal' but not 'thing': reverse frequency effects as a consequence of deregulated semantic processing.

    PubMed

    Hoffman, Paul; Jefferies, Elizabeth; Ralph, Matthew A Lambon

    2011-02-01

    More efficient processing of high frequency (HF) words is a ubiquitous finding in healthy individuals, yet frequency effects are often small or absent in stroke aphasia. We propose that some patients fail to show the expected frequency effect because processing of HF words places strong demands on semantic control and regulation processes, counteracting the usual effect. This may occur because HF words appear in a wide range of linguistic contexts, each associated with distinct semantic information. This theory predicts that in extreme circumstances, patients with impaired semantic control should show an outright reversal of the normal frequency effect. To test this prediction, we tested two patients with impaired semantic control with a delayed repetition task that emphasised activation of semantic representations. By alternating HF and low frequency (LF) trials, we demonstrated a significant repetition advantage for LF words, principally because of perseverative errors in which patients produced the previous LF response in place of the HF target. These errors indicated that HF words were more weakly activated than LF words. We suggest that when presented with no contextual information, patients generate a weak and unstable pattern of semantic activation for HF words because information relating to many possible contexts and interpretations is activated. In contrast, LF words are associated with more stable patterns of activation because similar semantic information is activated whenever they are encountered. Copyright © 2011 Elsevier Ltd. All rights reserved.

  8. Examining Lateralized Semantic Access Using Pictures

    ERIC Educational Resources Information Center

    Lovseth, Kyle; Atchley, Ruth Ann

    2010-01-01

    A divided visual field (DVF) experiment examined the semantic processing strategies employed by the cerebral hemispheres to determine if strategies observed with written word stimuli generalize to other media for communicating semantic information. We employed picture stimuli and vary the degree of semantic relatedness between the picture pairs.…

  9. Exploiting Recurring Structure in a Semantic Network

    NASA Technical Reports Server (NTRS)

    Wolfe, Shawn R.; Keller, Richard M.

    2004-01-01

    With the growing popularity of the Semantic Web, an increasing amount of information is becoming available in machine interpretable, semantically structured networks. Within these semantic networks are recurring structures that could be mined by existing or novel knowledge discovery methods. The mining of these semantic structures represents an interesting area that focuses on mining both for and from the Semantic Web, with surprising applicability to problems confronting the developers of Semantic Web applications. In this paper, we present representative examples of recurring structures and show how these structures could be used to increase the utility of a semantic repository deployed at NASA.

  10. The structure of semantic person memory: evidence from semantic priming in person recognition.

    PubMed

    Wiese, Holger

    2011-11-01

    This paper reviews research on the structure of semantic person memory as examined with semantic priming. In this experimental paradigm, a familiarity decision on a target face or written name is usually faster when it is preceded by a related as compared to an unrelated prime. This effect has been shown to be relatively short lived and susceptible to interfering items. Moreover, semantic priming can cross stimulus domains, such that a written name can prime a target face and vice versa. However, it remains controversial whether representations of people are stored in associative networks based on co-occurrence, or in more abstract semantic categories. In line with prominent cognitive models of face recognition, which explain semantic priming by shared semantic information between prime and target, recent research demonstrated that priming could be obtained from purely categorically related, non-associated prime/target pairs. Although strategic processes, such as expectancy and retrospective matching likely contribute, there is also evidence for a non-strategic contribution to priming, presumably related to spreading activation. Finally, a semantic priming effect has been demonstrated in the N400 event-related potential (ERP) component, which may reflect facilitated access to semantic information. It is concluded that categorical relatedness is one organizing principle of semantic person memory. ©2011 The British Psychological Society.

  11. Semantic attributes are encoded in human electrocorticographic signals during visual object recognition.

    PubMed

    Rupp, Kyle; Roos, Matthew; Milsap, Griffin; Caceres, Carlos; Ratto, Christopher; Chevillet, Mark; Crone, Nathan E; Wolmetz, Michael

    2017-03-01

    Non-invasive neuroimaging studies have shown that semantic category and attribute information are encoded in neural population activity. Electrocorticography (ECoG) offers several advantages over non-invasive approaches, but the degree to which semantic attribute information is encoded in ECoG responses is not known. We recorded ECoG while patients named objects from 12 semantic categories and then trained high-dimensional encoding models to map semantic attributes to spectral-temporal features of the task-related neural responses. Using these semantic attribute encoding models, untrained objects were decoded with accuracies comparable to whole-brain functional Magnetic Resonance Imaging (fMRI), and we observed that high-gamma activity (70-110Hz) at basal occipitotemporal electrodes was associated with specific semantic dimensions (manmade-animate, canonically large-small, and places-tools). Individual patient results were in close agreement with reports from other imaging modalities on the time course and functional organization of semantic processing along the ventral visual pathway during object recognition. The semantic attribute encoding model approach is critical for decoding objects absent from a training set, as well as for studying complex semantic encodings without artificially restricting stimuli to a small number of semantic categories. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  12. LexValueSets: An Approach for Context-Driven Value Sets Extraction

    PubMed Central

    Pathak, Jyotishman; Jiang, Guoqian; Dwarkanath, Sridhar O.; Buntrock, James D.; Chute, Christopher G.

    2008-01-01

    The ability to model, share and re-use value sets across multiple medical information systems is an important requirement. However, generating value sets semi-automatically from a terminology service is still an unresolved issue, in part due to the lack of linkage to clinical context patterns that provide the constraints in defining a concept domain and invocation of value sets extraction. Towards this goal, we develop and evaluate an approach for context-driven automatic value sets extraction based on a formal terminology model. The crux of the technique is to identify and define the context patterns from various domains of discourse and leverage them for value set extraction using two complementary ideas based on (i) local terms provided by the Subject Matter Experts (extensional) and (ii) semantic definition of the concepts in coding schemes (intensional). A prototype was implemented based on SNOMED CT rendered in the LexGrid terminology model and a preliminary evaluation is presented. PMID:18998955

  13. The Long Road to Semantic Interoperability in Support of Public Health: Experiences from Two States

    PubMed Central

    Vreeman, Daniel J.; Grannis, Shaun J.

    2014-01-01

    Proliferation of health information technologies creates opportunities to improve clinical and public health, including high quality, safer care and lower costs. To maximize such potential benefits, health information technologies must readily and reliably exchange information with other systems. However, evidence from public health surveillance programs in two states suggests that operational clinical information systems often fail to use available standards, a barrier to semantic interoperability. Furthermore, analysis of existing policies incentivizing semantic interoperability suggests they have limited impact and are fragmented. In this essay, we discuss three approaches for increasing semantic interoperability to support national goals for using health information technologies. A clear, comprehensive strategy requiring collaborative efforts by clinical and public health stakeholders is suggested as a guide for the long road towards better population health data and outcomes. PMID:24680985

  14. DeepText2GO: Improving large-scale protein function prediction with deep semantic text representation.

    PubMed

    You, Ronghui; Huang, Xiaodi; Zhu, Shanfeng

    2018-06-06

    As of April 2018, UniProtKB has collected more than 115 million protein sequences. Less than 0.15% of these proteins, however, have been associated with experimental GO annotations. As such, the use of automatic protein function prediction (AFP) to reduce this huge gap becomes increasingly important. The previous studies conclude that sequence homology based methods are highly effective in AFP. In addition, mining motif, domain, and functional information from protein sequences has been found very helpful for AFP. Other than sequences, alternative information sources such as text, however, may be useful for AFP as well. Instead of using BOW (bag of words) representation in traditional text-based AFP, we propose a new method called DeepText2GO that relies on deep semantic text representation, together with different kinds of available protein information such as sequence homology, families, domains, and motifs, to improve large-scale AFP. Furthermore, DeepText2GO integrates text-based methods with sequence-based ones by means of a consensus approach. Extensive experiments on the benchmark dataset extracted from UniProt/SwissProt have demonstrated that DeepText2GO significantly outperformed both text-based and sequence-based methods, validating its superiority. Copyright © 2018 Elsevier Inc. All rights reserved.

  15. Detecting Anomalous Insiders in Collaborative Information Systems

    PubMed Central

    Chen, You; Nyemba, Steve; Malin, Bradley

    2012-01-01

    Collaborative information systems (CISs) are deployed within a diverse array of environments that manage sensitive information. Current security mechanisms detect insider threats, but they are ill-suited to monitor systems in which users function in dynamic teams. In this paper, we introduce the community anomaly detection system (CADS), an unsupervised learning framework to detect insider threats based on the access logs of collaborative environments. The framework is based on the observation that typical CIS users tend to form community structures based on the subjects accessed (e.g., patients’ records viewed by healthcare providers). CADS consists of two components: 1) relational pattern extraction, which derives community structures and 2) anomaly prediction, which leverages a statistical model to determine when users have sufficiently deviated from communities. We further extend CADS into MetaCADS to account for the semantics of subjects (e.g., patients’ diagnoses). To empirically evaluate the framework, we perform an assessment with three months of access logs from a real electronic health record (EHR) system in a large medical center. The results illustrate our models exhibit significant performance gains over state-of-the-art competitors. When the number of illicit users is low, MetaCADS is the best model, but as the number grows, commonly accessed semantics lead to hiding in a crowd, such that CADS is more prudent. PMID:24489520

  16. Sex differences in the use of delayed semantic context when listening to disrupted speech.

    PubMed

    Liederman, Jacqueline; Fisher, Janet McGraw; Coty, Alexis; Matthews, Geetha; Frye, Richard E; Lincoln, Alexis; Alexander, Rebecca

    2013-02-01

    Female as opposed to male listeners were better able to use a delayed informative cue at the end of a long sentence to report an earlier word which was disrupted by noise. Informative (semantically related) or uninformative (semantically unrelated) word cues were presented 2, 6, or 10 words after a target word whose initial phoneme had been replaced with noise. A total of 84 young adults (45 males) listened to each sentence and then repeated it after its offset. The semantic benefit effect (SBE) was the difference in the accuracy of report of the disrupted target word during informative vs. uninformative sentences. Women had significantly higher SBEs than men even though there were no significant sex differences in terms of number of non-target words reported, the effect of distance between the disrupted target word and the informative cue, or kinds of errors generated. We suggest that the superior ability of women to use delayed semantic information to decode an earlier ambiguous speech signal may be linked to women's tendency to engage the hemispheres more bilaterally than men during word processing. Since the maintenance of semantic context under ambiguous conditions demands more right than left hemispheric resources, this may give women an advantage.

  17. Architecture for WSN Nodes Integration in Context Aware Systems Using Semantic Messages

    NASA Astrophysics Data System (ADS)

    Larizgoitia, Iker; Muguira, Leire; Vazquez, Juan Ignacio

    Wireless sensor networks (WSN) are becoming extremely popular in the development of context aware systems. Traditionally WSN have been focused on capturing data, which was later analyzed and interpreted in a server with more computational power. In this kind of scenario the problem of representing the sensor information needs to be addressed. Every node in the network might have different sensors attached; therefore their correspondent packet structures will be different. The server has to be aware of the meaning of every single structure and data in order to be able to interpret them. Multiple sensors, multiple nodes, multiple packet structures (and not following a standard format) is neither scalable nor interoperable. Context aware systems have solved this problem with the use of semantic technologies. They provide a common framework to achieve a standard definition of any domain. Nevertheless, these representations are computationally expensive, so a WSN cannot afford them. The work presented in this paper tries to bridge the gap between the sensor information and its semantic representation, by defining a simple architecture that enables the definition of this information natively in a semantic way, achieving the integration of the semantic information in the network packets. This will have several benefits, the most important being the possibility of promoting every WSN node to a real semantic information source.

  18. Eye-Tracking and Corpus-Based Analyses of Syntax-Semantics Interactions in Complement Coercion

    PubMed Central

    Lowder, Matthew W.; Gordon, Peter C.

    2016-01-01

    Previous work has shown that the difficulty associated with processing complex semantic expressions is reduced when the critical constituents appear in separate clauses as opposed to when they appear together in the same clause. We investigated this effect further, focusing in particular on complement coercion, in which an event-selecting verb (e.g., began) combines with a complement that represents an entity (e.g., began the memo). Experiment 1 compared reading times for coercion versus control expressions when the critical verb and complement appeared together in a subject-extracted relative clause (SRC) (e.g., The secretary that began/wrote the memo) compared to when they appeared together in a simple sentence. Readers spent more time processing coercion expressions than control expressions, replicating the typical coercion cost. In addition, readers spent less time processing the verb and complement in SRCs than in simple sentences; however, the magnitude of the coercion cost did not depend on sentence structure. In contrast, Experiment 2 showed that the coercion cost was reduced when the complement appeared as the head of an object-extracted relative clause (ORC) (e.g., The memo that the secretary began/wrote) compared to when the constituents appeared together in an SRC. Consistent with the eye-tracking results of Experiment 2, a corpus analysis showed that expressions requiring complement coercion are more frequent when the constituents are separated by the clause boundary of an ORC compared to when they are embedded together within an SRC. The results provide important information about the types of structural configurations that contribute to reduced difficulty with complex semantic expressions, as well as how these processing patterns are reflected in naturally occurring language. PMID:28529960

  19. Data Mining and Knowledge Discovery tools for exploiting big Earth-Observation data

    NASA Astrophysics Data System (ADS)

    Espinoza Molina, D.; Datcu, M.

    2015-04-01

    The continuous increase in the size of the archives and in the variety and complexity of Earth-Observation (EO) sensors require new methodologies and tools that allow the end-user to access a large image repository, to extract and to infer knowledge about the patterns hidden in the images, to retrieve dynamically a collection of relevant images, and to support the creation of emerging applications (e.g.: change detection, global monitoring, disaster and risk management, image time series, etc.). In this context, we are concerned with providing a platform for data mining and knowledge discovery content from EO archives. The platform's goal is to implement a communication channel between Payload Ground Segments and the end-user who receives the content of the data coded in an understandable format associated with semantics that is ready for immediate exploitation. It will provide the user with automated tools to explore and understand the content of highly complex images archives. The challenge lies in the extraction of meaningful information and understanding observations of large extended areas, over long periods of time, with a broad variety of EO imaging sensors in synergy with other related measurements and data. The platform is composed of several components such as 1.) ingestion of EO images and related data providing basic features for image analysis, 2.) query engine based on metadata, semantics and image content, 3.) data mining and knowledge discovery tools for supporting the interpretation and understanding of image content, 4.) semantic definition of the image content via machine learning methods. All these components are integrated and supported by a relational database management system, ensuring the integrity and consistency of Terabytes of Earth Observation data.

  20. Methods and apparatus for distributed resource discovery using examples

    NASA Technical Reports Server (NTRS)

    Chang, Yuan-Chi (Inventor); Li, Chung-Sheng (Inventor); Smith, John Richard (Inventor); Hill, Matthew L. (Inventor); Bergman, Lawrence David (Inventor); Castelli, Vittorio (Inventor)

    2005-01-01

    Distributed resource discovery is an essential step for information retrieval and/or providing information services. This step is usually used for determining the location of an information or data repository which has relevant information. The most fundamental challenge is the usual lack of semantic interoperability of the requested resource. In accordance with the invention, a method is disclosed where distributed repositories achieve semantic interoperability through the exchange of examples and, optionally, classifiers. The outcome of the inventive method can be used to determine whether common labels are referring to the same semantic meaning.

  1. NASA and The Semantic Web

    NASA Technical Reports Server (NTRS)

    Ashish, Naveen

    2005-01-01

    We provide an overview of several ongoing NASA endeavors based on concepts, systems, and technology from the Semantic Web arena. Indeed NASA has been one of the early adopters of Semantic Web Technology and we describe ongoing and completed R&D efforts for several applications ranging from collaborative systems to airspace information management to enterprise search to scientific information gathering and discovery systems at NASA.

  2. Automatic Semantic Facilitation in Anterior Temporal Cortex Revealed through Multimodal Neuroimaging

    PubMed Central

    Gramfort, Alexandre; Hämäläinen, Matti S.; Kuperberg, Gina R.

    2013-01-01

    A core property of human semantic processing is the rapid, facilitatory influence of prior input on extracting the meaning of what comes next, even under conditions of minimal awareness. Previous work has shown a number of neurophysiological indices of this facilitation, but the mapping between time course and localization—critical for separating automatic semantic facilitation from other mechanisms—has thus far been unclear. In the current study, we used a multimodal imaging approach to isolate early, bottom-up effects of context on semantic memory, acquiring a combination of electroencephalography (EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI) measurements in the same individuals with a masked semantic priming paradigm. Across techniques, the results provide a strikingly convergent picture of early automatic semantic facilitation. Event-related potentials demonstrated early sensitivity to semantic association between 300 and 500 ms; MEG localized the differential neural response within this time window to the left anterior temporal cortex, and fMRI localized the effect more precisely to the left anterior superior temporal gyrus, a region previously implicated in semantic associative processing. However, fMRI diverged from early EEG/MEG measures in revealing semantic enhancement effects within frontal and parietal regions, perhaps reflecting downstream attempts to consciously access the semantic features of the masked prime. Together, these results provide strong evidence that automatic associative semantic facilitation is realized as reduced activity within the left anterior superior temporal cortex between 300 and 500 ms after a word is presented, and emphasize the importance of multimodal neuroimaging approaches in distinguishing the contributions of multiple regions to semantic processing. PMID:24155321

  3. Selective Audiovisual Semantic Integration Enabled by Feature-Selective Attention

    PubMed Central

    Li, Yuanqing; Long, Jinyi; Huang, Biao; Yu, Tianyou; Wu, Wei; Li, Peijun; Fang, Fang; Sun, Pei

    2016-01-01

    An audiovisual object may contain multiple semantic features, such as the gender and emotional features of the speaker. Feature-selective attention and audiovisual semantic integration are two brain functions involved in the recognition of audiovisual objects. Humans often selectively attend to one or several features while ignoring the other features of an audiovisual object. Meanwhile, the human brain integrates semantic information from the visual and auditory modalities. However, how these two brain functions correlate with each other remains to be elucidated. In this functional magnetic resonance imaging (fMRI) study, we explored the neural mechanism by which feature-selective attention modulates audiovisual semantic integration. During the fMRI experiment, the subjects were presented with visual-only, auditory-only, or audiovisual dynamical facial stimuli and performed several feature-selective attention tasks. Our results revealed that a distribution of areas, including heteromodal areas and brain areas encoding attended features, may be involved in audiovisual semantic integration. Through feature-selective attention, the human brain may selectively integrate audiovisual semantic information from attended features by enhancing functional connectivity and thus regulating information flows from heteromodal areas to brain areas encoding the attended features. PMID:26759193

  4. Ontology-based knowledge representation for resolution of semantic heterogeneity in GIS

    NASA Astrophysics Data System (ADS)

    Liu, Ying; Xiao, Han; Wang, Limin; Han, Jialing

    2017-07-01

    Lack of semantic interoperability in geographical information systems has been identified as the main obstacle for data sharing and database integration. The new method should be found to overcome the problems of semantic heterogeneity. Ontologies are considered to be one approach to support geographic information sharing. This paper presents an ontology-driven integration approach to help in detecting and possibly resolving semantic conflicts. Its originality is that each data source participating in the integration process contains an ontology that defines the meaning of its own data. This approach ensures the automation of the integration through regulation of semantic integration algorithm. Finally, land classification in field GIS is described as the example.

  5. Knowledge Based Text Generation

    DTIC Science & Technology

    1989-08-01

    Number 4, October-December, 1985, pp. 219-242. de Joia , A. and Stenton, A., Terms in Linguistics: A Guide to Halliday, London: Batsford Academic and...extraction of text schemata and their corresponding rhetorical predicates; design of a system motivated by the desire for domain and language independence...semantics and semantics effects syntax. Functional Linguistic Framework Page 19 The design of GENNY was guided by the functional paradigm. Provided a

  6. A multiple distributed representation method based on neural network for biomedical event extraction.

    PubMed

    Wang, Anran; Wang, Jian; Lin, Hongfei; Zhang, Jianhai; Yang, Zhihao; Xu, Kan

    2017-12-20

    Biomedical event extraction is one of the most frontier domains in biomedical research. The two main subtasks of biomedical event extraction are trigger identification and arguments detection which can both be considered as classification problems. However, traditional state-of-the-art methods are based on support vector machine (SVM) with massive manually designed one-hot represented features, which require enormous work but lack semantic relation among words. In this paper, we propose a multiple distributed representation method for biomedical event extraction. The method combines context consisting of dependency-based word embedding, and task-based features represented in a distributed way as the input of deep learning models to train deep learning models. Finally, we used softmax classifier to label the example candidates. The experimental results on Multi-Level Event Extraction (MLEE) corpus show higher F-scores of 77.97% in trigger identification and 58.31% in overall compared to the state-of-the-art SVM method. Our distributed representation method for biomedical event extraction avoids the problems of semantic gap and dimension disaster from traditional one-hot representation methods. The promising results demonstrate that our proposed method is effective for biomedical event extraction.

  7. Visual analytics for semantic queries of TerraSAR-X image content

    NASA Astrophysics Data System (ADS)

    Espinoza-Molina, Daniela; Alonso, Kevin; Datcu, Mihai

    2015-10-01

    With the continuous image product acquisition of satellite missions, the size of the image archives is considerably increasing every day as well as the variety and complexity of their content, surpassing the end-user capacity to analyse and exploit them. Advances in the image retrieval field have contributed to the development of tools for interactive exploration and extraction of the images from huge archives using different parameters like metadata, key-words, and basic image descriptors. Even though we count on more powerful tools for automated image retrieval and data analysis, we still face the problem of understanding and analyzing the results. Thus, a systematic computational analysis of these results is required in order to provide to the end-user a summary of the archive content in comprehensible terms. In this context, visual analytics combines automated analysis with interactive visualizations analysis techniques for an effective understanding, reasoning and decision making on the basis of very large and complex datasets. Moreover, currently several researches are focused on associating the content of the images with semantic definitions for describing the data in a format to be easily understood by the end-user. In this paper, we present our approach for computing visual analytics and semantically querying the TerraSAR-X archive. Our approach is mainly composed of four steps: 1) the generation of a data model that explains the information contained in a TerraSAR-X product. The model is formed by primitive descriptors and metadata entries, 2) the storage of this model in a database system, 3) the semantic definition of the image content based on machine learning algorithms and relevance feedback, and 4) querying the image archive using semantic descriptors as query parameters and computing the statistical analysis of the query results. The experimental results shows that with the help of visual analytics and semantic definitions we are able to explain the image content using semantic terms and the relations between them answering questions such as what is the percentage of urban area in a region? or what is the distribution of water bodies in a city?

  8. Use of Co-occurrences for Temporal Expressions Annotation

    NASA Astrophysics Data System (ADS)

    Craveiro, Olga; Macedo, Joaquim; Madeira, Henrique

    The annotation or extraction of temporal information from text documents is becoming increasingly important in many natural language processing applications such as text summarization, information retrieval, question answering, etc.. This paper presents an original method for easy recognition of temporal expressions in text documents. The method creates semantically classified temporal patterns, using word co-occurrences obtained from training corpora and a pre-defined seed keywords set, derived from the used language temporal references. A participation on a Portuguese named entity evaluation contest showed promising effectiveness and efficiency results. This approach can be adapted to recognize other type of expressions or languages, within other contexts, by defining the suitable word sets and training corpora.

  9. Causal Evidence for a Mechanism of Semantic Integration in the Angular Gyrus as Revealed by High-Definition Transcranial Direct Current Stimulation

    PubMed Central

    Peelle, Jonathan E.; Bonner, Michael F.; Grossman, Murray

    2016-01-01

    A defining aspect of human cognition is the ability to integrate conceptual information into complex semantic combinations. For example, we can comprehend “plaid” and “jacket” as individual concepts, but we can also effortlessly combine these concepts to form the semantic representation of “plaid jacket.” Many neuroanatomic models of semantic memory propose that heteromodal cortical hubs integrate distributed semantic features into coherent representations. However, little work has specifically examined these proposed integrative mechanisms and the causal role of these regions in semantic integration. Here, we test the hypothesis that the angular gyrus (AG) is critical for integrating semantic information by applying high-definition transcranial direct current stimulation (tDCS) to an fMRI-guided region-of-interest in the left AG. We found that anodal stimulation to the left AG modulated semantic integration but had no effect on a letter-string control task. Specifically, anodal stimulation to the left AG resulted in faster comprehension of semantically meaningful combinations like “tiny radish” relative to non-meaningful combinations, such as “fast blueberry,” when compared to the effects observed during sham stimulation and stimulation to a right-hemisphere control brain region. Moreover, the size of the effect from brain stimulation correlated with the degree of semantic coherence between the word pairs. These findings demonstrate that the left AG plays a causal role in the integration of lexical-semantic information, and that high-definition tDCS to an associative cortical hub can selectively modulate integrative processes in semantic memory. SIGNIFICANCE STATEMENT A major goal of neuroscience is to understand the neural basis of behaviors that are fundamental to human intelligence. One essential behavior is the ability to integrate conceptual knowledge from semantic memory, allowing us to construct an almost unlimited number of complex concepts from a limited set of basic constituents (e.g., “leaf” and “wet” can be combined into the more complex representation “wet leaf”). Here, we present a novel approach to studying integrative processes in semantic memory by applying focal brain stimulation to a heteromodal cortical hub implicated in semantic processing. Our findings demonstrate a causal role of the left angular gyrus in lexical-semantic integration and provide motivation for novel therapeutic applications in patients with lexical-semantic deficits. PMID:27030767

  10. Causal Evidence for a Mechanism of Semantic Integration in the Angular Gyrus as Revealed by High-Definition Transcranial Direct Current Stimulation.

    PubMed

    Price, Amy Rose; Peelle, Jonathan E; Bonner, Michael F; Grossman, Murray; Hamilton, Roy H

    2016-03-30

    A defining aspect of human cognition is the ability to integrate conceptual information into complex semantic combinations. For example, we can comprehend "plaid" and "jacket" as individual concepts, but we can also effortlessly combine these concepts to form the semantic representation of "plaid jacket." Many neuroanatomic models of semantic memory propose that heteromodal cortical hubs integrate distributed semantic features into coherent representations. However, little work has specifically examined these proposed integrative mechanisms and the causal role of these regions in semantic integration. Here, we test the hypothesis that the angular gyrus (AG) is critical for integrating semantic information by applying high-definition transcranial direct current stimulation (tDCS) to an fMRI-guided region-of-interest in the left AG. We found that anodal stimulation to the left AG modulated semantic integration but had no effect on a letter-string control task. Specifically, anodal stimulation to the left AG resulted in faster comprehension of semantically meaningful combinations like "tiny radish" relative to non-meaningful combinations, such as "fast blueberry," when compared to the effects observed during sham stimulation and stimulation to a right-hemisphere control brain region. Moreover, the size of the effect from brain stimulation correlated with the degree of semantic coherence between the word pairs. These findings demonstrate that the left AG plays a causal role in the integration of lexical-semantic information, and that high-definition tDCS to an associative cortical hub can selectively modulate integrative processes in semantic memory. A major goal of neuroscience is to understand the neural basis of behaviors that are fundamental to human intelligence. One essential behavior is the ability to integrate conceptual knowledge from semantic memory, allowing us to construct an almost unlimited number of complex concepts from a limited set of basic constituents (e.g., "leaf" and "wet" can be combined into the more complex representation "wet leaf"). Here, we present a novel approach to studying integrative processes in semantic memory by applying focal brain stimulation to a heteromodal cortical hub implicated in semantic processing. Our findings demonstrate a causal role of the left angular gyrus in lexical-semantic integration and provide motivation for novel therapeutic applications in patients with lexical-semantic deficits. Copyright © 2016 the authors 0270-6474/16/363829-10$15.00/0.

  11. Access to Biomedical Information: The Unified Medical Language System.

    ERIC Educational Resources Information Center

    Squires, Steven J.

    1993-01-01

    Describes the development of a Unified Medical Language System (UMLS) by the National Library of Medicine that will retrieve and integrate information from a variety of information resources. Highlights include the metathesaurus; the UMLS semantic network; semantic locality; information sources map; evaluation of the metathesaurus; future…

  12. Semantic Likelihood Models for Bayesian Inference in Human-Robot Interaction

    NASA Astrophysics Data System (ADS)

    Sweet, Nicholas

    Autonomous systems, particularly unmanned aerial systems (UAS), remain limited in au- tonomous capabilities largely due to a poor understanding of their environment. Current sensors simply do not match human perceptive capabilities, impeding progress towards full autonomy. Recent work has shown the value of humans as sources of information within a human-robot team; in target applications, communicating human-generated 'soft data' to autonomous systems enables higher levels of autonomy through large, efficient information gains. This requires development of a 'human sensor model' that allows soft data fusion through Bayesian inference to update the probabilistic belief representations maintained by autonomous systems. Current human sensor models that capture linguistic inputs as semantic information are limited in their ability to generalize likelihood functions for semantic statements: they may be learned from dense data; they do not exploit the contextual information embedded within groundings; and they often limit human input to restrictive and simplistic interfaces. This work provides mechanisms to synthesize human sensor models from constraints based on easily attainable a priori knowledge, develops compression techniques to capture information-dense semantics, and investigates the problem of capturing and fusing semantic information contained within unstructured natural language. A robotic experimental testbed is also developed to validate the above contributions.

  13. Remote semantic memory is impoverished in hippocampal amnesia

    PubMed Central

    Klooster, Nathaniel B.; Duff, Melissa C.

    2015-01-01

    The necessity of the hippocampus for acquiring new semantic concepts is a topic of considerable debate. However, it is generally accepted that any role the hippocampus plays in semantic memory is time limited and that previously acquired information becomes independent of the hippocampus over time. This view, along with intact naming and word-definition matching performance in amnesia, has led to the notion that remote semantic memory is intact in patients with hippocampal amnesia. Motivated by perspectives of word learning as a protracted process where additional features and senses of a word are added over time, and by recent discoveries about the time course of hippocampal contributions to on-line relational processing, reconsolidation, and the flexible integration of information, we revisit the notion that remote semantic memory is intact in amnesia. Using measures of semantic richness and vocabulary depth from psycholinguistics and first and second language-learning studies, we examined how much information is associated with previously acquired, highly familiar words in a group of patients with bilateral hippocampal damage and amnesia. Relative to healthy demographically matched comparison participants and a group of brain-damaged comparison participants, the patients with hippocampal amnesia performed significantly worse on both productive and receptive measures of vocabulary depth and semantic richness. These findings suggest that remote semantic memory is impoverished in patients with hippocampal amnesia and that the hippocampus may play a role in the maintenance and updating of semantic memory beyond its initial acquisition. PMID:26474741

  14. Remote semantic memory is impoverished in hippocampal amnesia.

    PubMed

    Klooster, Nathaniel B; Duff, Melissa C

    2015-12-01

    The necessity of the hippocampus for acquiring new semantic concepts is a topic of considerable debate. However, it is generally accepted that any role the hippocampus plays in semantic memory is time limited and that previously acquired information becomes independent of the hippocampus over time. This view, along with intact naming and word-definition matching performance in amnesia, has led to the notion that remote semantic memory is intact in patients with hippocampal amnesia. Motivated by perspectives of word learning as a protracted process where additional features and senses of a word are added over time, and by recent discoveries about the time course of hippocampal contributions to on-line relational processing, reconsolidation, and the flexible integration of information, we revisit the notion that remote semantic memory is intact in amnesia. Using measures of semantic richness and vocabulary depth from psycholinguistics and first and second language-learning studies, we examined how much information is associated with previously acquired, highly familiar words in a group of patients with bilateral hippocampal damage and amnesia. Relative to healthy demographically matched comparison participants and a group of brain-damaged comparison participants, the patients with hippocampal amnesia performed significantly worse on both productive and receptive measures of vocabulary depth and semantic richness. These findings suggest that remote semantic memory is impoverished in patients with hippocampal amnesia and that the hippocampus may play a role in the maintenance and updating of semantic memory beyond its initial acquisition. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Mandarin-Speaking Children's Speech Recognition: Developmental Changes in the Influences of Semantic Context and F0 Contours.

    PubMed

    Zhou, Hong; Li, Yu; Liang, Meng; Guan, Connie Qun; Zhang, Linjun; Shu, Hua; Zhang, Yang

    2017-01-01

    The goal of this developmental speech perception study was to assess whether and how age group modulated the influences of high-level semantic context and low-level fundamental frequency ( F 0 ) contours on the recognition of Mandarin speech by elementary and middle-school-aged children in quiet and interference backgrounds. The results revealed different patterns for semantic and F 0 information. One the one hand, age group modulated significantly the use of F 0 contours, indicating that elementary school children relied more on natural F 0 contours than middle school children during Mandarin speech recognition. On the other hand, there was no significant modulation effect of age group on semantic context, indicating that children of both age groups used semantic context to assist speech recognition to a similar extent. Furthermore, the significant modulation effect of age group on the interaction between F 0 contours and semantic context revealed that younger children could not make better use of semantic context in recognizing speech with flat F 0 contours compared with natural F 0 contours, while older children could benefit from semantic context even when natural F 0 contours were altered, thus confirming the important role of F 0 contours in Mandarin speech recognition by elementary school children. The developmental changes in the effects of high-level semantic and low-level F 0 information on speech recognition might reflect the differences in auditory and cognitive resources associated with processing of the two types of information in speech perception.

  16. Reproducibility and discriminability of brain patterns of semantic categories enhanced by congruent audiovisual stimuli.

    PubMed

    Li, Yuanqing; Wang, Guangyi; Long, Jinyi; Yu, Zhuliang; Huang, Biao; Li, Xiaojian; Yu, Tianyou; Liang, Changhong; Li, Zheng; Sun, Pei

    2011-01-01

    One of the central questions in cognitive neuroscience is the precise neural representation, or brain pattern, associated with a semantic category. In this study, we explored the influence of audiovisual stimuli on the brain patterns of concepts or semantic categories through a functional magnetic resonance imaging (fMRI) experiment. We used a pattern search method to extract brain patterns corresponding to two semantic categories: "old people" and "young people." These brain patterns were elicited by semantically congruent audiovisual, semantically incongruent audiovisual, unimodal visual, and unimodal auditory stimuli belonging to the two semantic categories. We calculated the reproducibility index, which measures the similarity of the patterns within the same category. We also decoded the semantic categories from these brain patterns. The decoding accuracy reflects the discriminability of the brain patterns between two categories. The results showed that both the reproducibility index of brain patterns and the decoding accuracy were significantly higher for semantically congruent audiovisual stimuli than for unimodal visual and unimodal auditory stimuli, while the semantically incongruent stimuli did not elicit brain patterns with significantly higher reproducibility index or decoding accuracy. Thus, the semantically congruent audiovisual stimuli enhanced the within-class reproducibility of brain patterns and the between-class discriminability of brain patterns, and facilitate neural representations of semantic categories or concepts. Furthermore, we analyzed the brain activity in superior temporal sulcus and middle temporal gyrus (STS/MTG). The strength of the fMRI signal and the reproducibility index were enhanced by the semantically congruent audiovisual stimuli. Our results support the use of the reproducibility index as a potential tool to supplement the fMRI signal amplitude for evaluating multimodal integration.

  17. Reproducibility and Discriminability of Brain Patterns of Semantic Categories Enhanced by Congruent Audiovisual Stimuli

    PubMed Central

    Long, Jinyi; Yu, Zhuliang; Huang, Biao; Li, Xiaojian; Yu, Tianyou; Liang, Changhong; Li, Zheng; Sun, Pei

    2011-01-01

    One of the central questions in cognitive neuroscience is the precise neural representation, or brain pattern, associated with a semantic category. In this study, we explored the influence of audiovisual stimuli on the brain patterns of concepts or semantic categories through a functional magnetic resonance imaging (fMRI) experiment. We used a pattern search method to extract brain patterns corresponding to two semantic categories: “old people” and “young people.” These brain patterns were elicited by semantically congruent audiovisual, semantically incongruent audiovisual, unimodal visual, and unimodal auditory stimuli belonging to the two semantic categories. We calculated the reproducibility index, which measures the similarity of the patterns within the same category. We also decoded the semantic categories from these brain patterns. The decoding accuracy reflects the discriminability of the brain patterns between two categories. The results showed that both the reproducibility index of brain patterns and the decoding accuracy were significantly higher for semantically congruent audiovisual stimuli than for unimodal visual and unimodal auditory stimuli, while the semantically incongruent stimuli did not elicit brain patterns with significantly higher reproducibility index or decoding accuracy. Thus, the semantically congruent audiovisual stimuli enhanced the within-class reproducibility of brain patterns and the between-class discriminability of brain patterns, and facilitate neural representations of semantic categories or concepts. Furthermore, we analyzed the brain activity in superior temporal sulcus and middle temporal gyrus (STS/MTG). The strength of the fMRI signal and the reproducibility index were enhanced by the semantically congruent audiovisual stimuli. Our results support the use of the reproducibility index as a potential tool to supplement the fMRI signal amplitude for evaluating multimodal integration. PMID:21750692

  18. A Big-Data-based platform of workers' behavior: Observations from the field.

    PubMed

    Guo, S Y; Ding, L Y; Luo, H B; Jiang, X Y

    2016-08-01

    Behavior-Based Safety (BBS) has been used in construction to observe, analyze and modify workers' behavior. However, studies have identified that BBS has several limitations, which have hindered its effective implementation. To mitigate the negative impact of BBS, this paper uses a case study approach to develop a Big-Data-based platform to classify, collect and store data about workers' unsafe behavior that is derived from a metro construction project. In developing the platform, three processes were undertaken: (1) a behavioral risk knowledge base was established; (2) images reflecting workers' unsafe behavior were collected from intelligent video surveillance and mobile application; and (3) images with semantic information were stored via a Hadoop Distributed File System (HDFS). The platform was implemented during the construction of the metro-system and it is demonstrated that it can effectively analyze semantic information contained in images, automatically extract workers' unsafe behavior and quickly retrieve on HDFS as well. The research presented in this paper can enable construction organizations with the ability to visualize unsafe acts in real-time and further identify patterns of behavior that can jeopardize safety outcomes. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. Does "a picture is worth 1000 words" apply to iconic Chinese words? Relationship of Chinese words and pictures.

    PubMed

    Lo, Shih-Yu; Yeh, Su-Ling

    2018-05-29

    The meaning of a picture can be extracted rapidly, but the form-to-meaning relationship is less obvious for printed words. In contrast to English words that follow grapheme-to-phoneme correspondence rule, the iconic nature of Chinese words might predispose them to activate their semantic representations more directly from their orthographies. By using the paradigm of repetition blindness (RB) that taps into the early level of word processing, we examined whether Chinese words activate their semantic representations as directly as pictures do. RB refers to the failure to detect the second occurrence of an item when it is presented twice in temporal proximity. Previous studies showed RB for semantically related pictures, suggesting that pictures activate their semantic representations directly from their shapes and thus two semantically related pictures are represented as repeated. However, this does not apply to English words since no RB was found for English synonyms. In this study, we replicated the semantic RB effect for pictures, and further showed the absence of semantic RB for Chinese synonyms. Based on our findings, it is suggested that Chinese words are processed like English words, which do not activate their semantic representations as directly as pictures do.

  20. Fully Automatic Speech-Based Analysis of the Semantic Verbal Fluency Task.

    PubMed

    König, Alexandra; Linz, Nicklas; Tröger, Johannes; Wolters, Maria; Alexandersson, Jan; Robert, Phillipe

    2018-06-08

    Semantic verbal fluency (SVF) tests are routinely used in screening for mild cognitive impairment (MCI). In this task, participants name as many items as possible of a semantic category under a time constraint. Clinicians measure task performance manually by summing the number of correct words and errors. More fine-grained variables add valuable information to clinical assessment, but are time-consuming. Therefore, the aim of this study is to investigate whether automatic analysis of the SVF could provide these as accurate as manual and thus, support qualitative screening of neurocognitive impairment. SVF data were collected from 95 older people with MCI (n = 47), Alzheimer's or related dementias (ADRD; n = 24), and healthy controls (HC; n = 24). All data were annotated manually and automatically with clusters and switches. The obtained metrics were validated using a classifier to distinguish HC, MCI, and ADRD. Automatically extracted clusters and switches were highly correlated (r = 0.9) with manually established values, and performed as well on the classification task separating HC from persons with ADRD (area under curve [AUC] = 0.939) and MCI (AUC = 0.758). The results show that it is possible to automate fine-grained analyses of SVF data for the assessment of cognitive decline. © 2018 S. Karger AG, Basel.

  1. NELasso: Group-Sparse Modeling for Characterizing Relations Among Named Entities in News Articles.

    PubMed

    Tariq, Amara; Karim, Asim; Foroosh, Hassan

    2017-10-01

    Named entities such as people, locations, and organizations play a vital role in characterizing online content. They often reflect information of interest and are frequently used in search queries. Although named entities can be detected reliably from textual content, extracting relations among them is more challenging, yet useful in various applications (e.g., news recommending systems). In this paper, we present a novel model and system for learning semantic relations among named entities from collections of news articles. We model each named entity occurrence with sparse structured logistic regression, and consider the words (predictors) to be grouped based on background semantics. This sparse group LASSO approach forces the weights of word groups that do not influence the prediction towards zero. The resulting sparse structure is utilized for defining the type and strength of relations. Our unsupervised system yields a named entities' network where each relation is typed, quantified, and characterized in context. These relations are the key to understanding news material over time and customizing newsfeeds for readers. Extensive evaluation of our system on articles from TIME magazine and BBC News shows that the learned relations correlate with static semantic relatedness measures like WLM, and capture the evolving relationships among named entities over time.

  2. PASTE: patient-centered SMS text tagging in a medication management system.

    PubMed

    Stenner, Shane P; Johnson, Kevin B; Denny, Joshua C

    2012-01-01

    To evaluate the performance of a system that extracts medication information and administration-related actions from patient short message service (SMS) messages. Mobile technologies provide a platform for electronic patient-centered medication management. MyMediHealth (MMH) is a medication management system that includes a medication scheduler, a medication administration record, and a reminder engine that sends text messages to cell phones. The object of this work was to extend MMH to allow two-way interaction using mobile phone-based SMS technology. Unprompted text-message communication with patients using natural language could engage patients in their healthcare, but presents unique natural language processing challenges. The authors developed a new functional component of MMH, the Patient-centered Automated SMS Tagging Engine (PASTE). The PASTE web service uses natural language processing methods, custom lexicons, and existing knowledge sources to extract and tag medication information from patient text messages. A pilot evaluation of PASTE was completed using 130 medication messages anonymously submitted by 16 volunteers via a website. System output was compared with manually tagged messages. Verified medication names, medication terms, and action terms reached high F-measures of 91.3%, 94.7%, and 90.4%, respectively. The overall medication name F-measure was 79.8%, and the medication action term F-measure was 90%. Other studies have demonstrated systems that successfully extract medication information from clinical documents using semantic tagging, regular expression-based approaches, or a combination of both approaches. This evaluation demonstrates the feasibility of extracting medication information from patient-generated medication messages.

  3. Speaker information affects false recognition of unstudied lexical-semantic associates.

    PubMed

    Luthra, Sahil; Fox, Neal P; Blumstein, Sheila E

    2018-05-01

    Recognition of and memory for a spoken word can be facilitated by a prior presentation of that word spoken by the same talker. However, it is less clear whether this speaker congruency advantage generalizes to facilitate recognition of unheard related words. The present investigation employed a false memory paradigm to examine whether information about a speaker's identity in items heard by listeners could influence the recognition of novel items (critical intruders) phonologically or semantically related to the studied items. In Experiment 1, false recognition of semantically associated critical intruders was sensitive to speaker information, though only when subjects attended to talker identity during encoding. Results from Experiment 2 also provide some evidence that talker information affects the false recognition of critical intruders. Taken together, the present findings indicate that indexical information is able to contact the lexical-semantic network to affect the processing of unheard words.

  4. Shared Semantics and the Use of Organizational Memories for E-Mail Communications.

    ERIC Educational Resources Information Center

    Schwartz, David G.

    1998-01-01

    Examines the use of shared semantics information to link concepts in an organizational memory to e-mail communications. Presents a framework for determining shared semantics based on organizational and personal user profiles. Illustrates how shared semantics are used by the HyperMail system to help link organizational memories (OM) content to…

  5. Rule-based support system for multiple UMLS semantic type assignments

    PubMed Central

    Geller, James; He, Zhe; Perl, Yehoshua; Morrey, C. Paul; Xu, Julia

    2012-01-01

    Background When new concepts are inserted into the UMLS, they are assigned one or several semantic types from the UMLS Semantic Network by the UMLS editors. However, not every combination of semantic types is permissible. It was observed that many concepts with rare combinations of semantic types have erroneous semantic type assignments or prohibited combinations of semantic types. The correction of such errors is resource-intensive. Objective We design a computational system to inform UMLS editors as to whether a specific combination of two, three, four, or five semantic types is permissible or prohibited or questionable. Methods We identify a set of inclusion and exclusion instructions in the UMLS Semantic Network documentation and derive corresponding rule-categories as well as rule-categories from the UMLS concept content. We then design an algorithm adviseEditor based on these rule-categories. The algorithm specifies rules for an editor how to proceed when considering a tuple (pair, triple, quadruple, quintuple) of semantic types to be assigned to a concept. Results Eight rule-categories were identified. A Web-based system was developed to implement the adviseEditor algorithm, which returns for an input combination of semantic types whether it is permitted, prohibited or (in a few cases) requires more research. The numbers of semantic type pairs assigned to each rule-category are reported. Interesting examples for each rule-category are illustrated. Cases of semantic type assignments that contradict rules are listed, including recently introduced ones. Conclusion The adviseEditor system implements explicit and implicit knowledge available in the UMLS in a system that informs UMLS editors about the permissibility of a desired combination of semantic types. Using adviseEditor might help accelerate the work of the UMLS editors and prevent erroneous semantic type assignments. PMID:23041716

  6. The Ins and Outs of Meaning: Behavioral and Neuroanatomical Dissociation of Semantically-Driven Word Retrieval and Multimodal Semantic Recognition in Aphasia

    PubMed Central

    Mirman, Daniel; Zhang, Yongsheng; Wang, Ze; Coslett, H. Branch; Schwartz, Myrna F.

    2015-01-01

    Theories about the architecture of language processing differ with regard to whether verbal and nonverbal comprehension share a functional and neural substrate and how meaning extraction in comprehension relates to the ability to use meaning to drive verbal production. We (re-)evaluate data from 17 cognitive-linguistic performance measures of 99 participants with chronic aphasia using factor analysis to establish functional components and support vector regression-based lesion-symptom mapping to determine the neural correlates of deficits on these functional components. The results are highly consistent with our previous findings: production of semantic errors is behaviorally and neuroanatomically distinct from verbal and nonverbal comprehension. Semantic errors were most strongly associated with left ATL damage whereas deficits on tests of verbal and non-verbal semantic recognition were most strongly associated with damage to deep white matter underlying the frontal lobe at the confluence of multiple tracts, including the inferior fronto-occipital fasciculus, the uncinate fasciculus, and the anterior thalamic radiations. These results suggest that traditional views based on grey matter hub(s) for semantic processing are incomplete and that the role of white matter in semantic cognition has been underappreciated. PMID:25681739

  7. Semantic attributes based texture generation

    NASA Astrophysics Data System (ADS)

    Chi, Huifang; Gan, Yanhai; Qi, Lin; Dong, Junyu; Madessa, Amanuel Hirpa

    2018-04-01

    Semantic attributes are commonly used for texture description. They can be used to describe the information of a texture, such as patterns, textons, distributions, brightness, and so on. Generally speaking, semantic attributes are more concrete descriptors than perceptual features. Therefore, it is practical to generate texture images from semantic attributes. In this paper, we propose to generate high-quality texture images from semantic attributes. Over the last two decades, several works have been done on texture synthesis and generation. Most of them focusing on example-based texture synthesis and procedural texture generation. Semantic attributes based texture generation still deserves more devotion. Gan et al. proposed a useful joint model for perception driven texture generation. However, perceptual features are nonobjective spatial statistics used by humans to distinguish different textures in pre-attentive situations. To give more describing information about texture appearance, semantic attributes which are more in line with human description habits are desired. In this paper, we use sigmoid cross entropy loss in an auxiliary model to provide enough information for a generator. Consequently, the discriminator is released from the relatively intractable mission of figuring out the joint distribution of condition vectors and samples. To demonstrate the validity of our method, we compare our method to Gan et al.'s method on generating textures by designing experiments on PTD and DTD. All experimental results show that our model can generate textures from semantic attributes.

  8. Depth of treatment sensitive noise resistant dynamic artificial neural networks model of recall in people with prosopagnosia.

    PubMed

    Morissette, Laurence; Chartier, Sylvain; Vandermeulen, Robyn; Watier, Nicholas

    2012-08-01

    The Fusiform Face Area (FFA) is the brain region considered to be responsible for face recognition. Prosopagnosia is a brain disorder causing the inability to a recognise faces that is said to mainly affect the FFA. We put forward a model that simulates the capacity to retrieve label associated with faces and objects depending on the depth of treatment of the information. Akin to prosopagnosia, various localised "lesions" were inserted into the network in order to evaluate the degradation of performance. The network is first composed of a Feature Extracting Bidirectional Associative Memory (FEBAM-SOM) to represent the topological maps allowing the categorisation of all faces. The second component of the network is a Bidirectional Heteroassociative Memory (BHM) that links those representations to their semantic label. For the latter, specific semantic labels were used as well as more general ones. The inputs were images representing faces and various objects. Just like in the visual perceptual system, the images were pre-processed using a low-pass filter. Results showed that the network is able to associate the extracted map with the correct label information. The network is able to generalise and is robust to noise. Moreover, results showed that the recall performance of names associated with faces decrease with the size of lesion without affecting the performance of the objects. Finally, results obtained with the network are also consistent with human ones in that higher level, more general labels are more robust to lesion compared to low level, specific labels. Copyright © 2012 Elsevier Ltd. All rights reserved.

  9. Exploring semantic and phonological picture-word priming in adults who stutter using event-related potentials

    PubMed Central

    Maxfield, Nathan D.; Pizon-Moore, Angela A.; Frisch, Stefan A.; Constantine, Joseph L.

    2011-01-01

    Objective Our aim was to investigate how semantic and phonological information is processed in adults who stutter (AWS) preparing to name pictures, following-up a report that event-related potentials (ERPs) in AWS evidenced atypical semantic picture-word priming (Maxfield et al., 2010). Methods Fourteen AWS and 14 typically-fluent adults (TFA) participated. Pictures, named at a delay, were followed by probe words. Design elements not used in Maxfield et al. (2010) let us evaluate both phonological and semantic picture-word priming. Results TFA evidenced typical priming effects in probe-elicited ERPs. AWS evidenced diminished Semantic priming, and reverse Phonological N400 priming. Conclusions Results point to atypical processing of semantic and phonological information in AWS. Discussion considers whether AWS ERP effects reflect unstable activation of target label semantic and phonological representations, strategic inhibition of target label phonological neighbors, and/or phonological label-probe competition. Significance Results raise questions about how mechanisms that regulate activation spreading operate in AWS. PMID:22055837

  10. Semantic Knowledge for Famous Names in Mild Cognitive Impairment

    PubMed Central

    Seidenberg, Michael; Guidotti, Leslie; Nielson, Kristy A.; Woodard, John L.; Durgerian, Sally; Zhang, Qi; Gander, Amelia; Antuono, Piero; Rao, Stephen M.

    2008-01-01

    Person identification represents a unique category of semantic knowledge that is commonly impaired in Alzheimer's Disease (AD), but has received relatively little investigation in patients with Mild Cognitive Impairment (MCI). The current study examined the retrieval of semantic knowledge for famous names from three time epochs (recent, remote, and enduring) in two participant groups; 23 aMCI patients and 23 healthy elderly controls. The aMCI group was less accurate and produced less semantic knowledge than controls for famous names. Names from the enduring period were recognized faster than both recent and remote names in both groups, and remote names were recognized more quickly than recent names. Episodic memory performance was correlated with greater semantic knowledge particularly for recent names. We suggest that the anterograde memory deficits in the aMCI group interferes with learning of recent famous names and as a result produces difficulties with updating and integrating new semantic information with previously stored information. The implications of these findings for characterizing semantic memory deficits in MCI are discussed. PMID:19128524

  11. Decoding semantic information from human electrocorticographic (ECoG) signals.

    PubMed

    Wang, Wei; Degenhart, Alan D; Sudre, Gustavo P; Pomerleau, Dean A; Tyler-Kabara, Elizabeth C

    2011-01-01

    This study examined the feasibility of decoding semantic information from human cortical activity. Four human subjects undergoing presurgical brain mapping and seizure foci localization participated in this study. Electrocorticographic (ECoG) signals were recorded while the subjects performed simple language tasks involving semantic information processing, such as a picture naming task where subjects named pictures of objects belonging to different semantic categories. Robust high-gamma band (60-120 Hz) activation was observed at the left inferior frontal gyrus (LIFG) and the posterior portion of the superior temporal gyrus (pSTG) with a temporal sequence corresponding to speech production and perception. Furthermore, Gaussian Naïve Bayes and Support Vector Machine classifiers, two commonly used machine learning algorithms for pattern recognition, were able to predict the semantic category of an object using cortical activity captured by ECoG electrodes covering the frontal, temporal and parietal cortices. These findings have implications for both basic neuroscience research and development of semantic-based brain-computer interface systems (BCI) that can help individuals with severe motor or communication disorders to express their intention and thoughts.

  12. Recognising discourse causality triggers in the biomedical domain.

    PubMed

    Mihăilă, Claudiu; Ananiadou, Sophia

    2013-12-01

    Current domain-specific information extraction systems represent an important resource for biomedical researchers, who need to process vast amounts of knowledge in a short time. Automatic discourse causality recognition can further reduce their workload by suggesting possible causal connections and aiding in the curation of pathway models. We describe here an approach to the automatic identification of discourse causality triggers in the biomedical domain using machine learning. We create several baselines and experiment with and compare various parameter settings for three algorithms, i.e. Conditional Random Fields (CRF), Support Vector Machines (SVM) and Random Forests (RF). We also evaluate the impact of lexical, syntactic, and semantic features on each of the algorithms, showing that semantics improves the performance in all cases. We test our comprehensive feature set on two corpora containing gold standard annotations of causal relations, and demonstrate the need for more gold standard data. The best performance of 79.35% F-score is achieved by CRFs when using all three feature types.

  13. ADESSA: A Real-Time Decision Support Service for Delivery of Semantically Coded Adverse Drug Event Data

    PubMed Central

    Duke, Jon D.; Friedlin, Jeff

    2010-01-01

    Evaluating medications for potential adverse events is a time-consuming process, typically involving manual lookup of information by physicians. This process can be expedited by CDS systems that support dynamic retrieval and filtering of adverse drug events (ADE’s), but such systems require a source of semantically-coded ADE data. We created a two-component system that addresses this need. First we created a natural language processing application which extracts adverse events from Structured Product Labels and generates a standardized ADE knowledge base. We then built a decision support service that consumes a Continuity of Care Document and returns a list of patient-specific ADE’s. Our database currently contains 534,125 ADE’s from 5602 product labels. An NLP evaluation of 9529 ADE’s showed recall of 93% and precision of 95%. On a trial set of 30 CCD’s, the system provided adverse event data for 88% of drugs and returned these results in an average of 620ms. PMID:21346964

  14. Synergetic computer and holonics - information dynamics of a semantic computer

    NASA Astrophysics Data System (ADS)

    Shimizu, H.; Yamaguchi, Y.

    1987-12-01

    The dynamics of semantic information in biosystem is studied based on holons, generators of mutual relations. Any biosystem has an internal world, a so-called "self", which has an intrinsic purpose rendering the system continuously alive and developed as much as possible against a fluctuating external world. External signals to the system through sensory organs are classified by the self into two basic categories, semantic information with some meaning and value for the purpose and inputs from background and noise sources. Due to this breaking of semantic symmetry, any input signals are transformed into a figure and background, respectively. As a typical example, the visual perception of vertebrates is studied. For such semantic transformation the external signal is first decomposed and converted into a number of elementary signs named "syntons" which are then transmitted into a sensory area of cortex corresponding to an image synthesizer. The synthesizer is a sort of autonomic parallel processor composed of autonomic units, "holons", which are characterized by many internal modes. Syntons are fed into the holons one by one. A set of the elementary meanings, the so-called "semons", provided to the synton are encoded in the internal modes of the holon; that is, each internal mode encodes a semon. A dynamic information theory for the transformation of external signals to semantic information is developed based on our model which we call holovision. Holovision is a dynamic model of visual perception that processes an autonomic ability to self-organize visual images. Autonomous oscillators are utilized as the line processors to encode line elements with specific orientations in their phases as semons. An information space is defined according to the assembly of holons; the spatial plane on which holons are arranged is a syntactic subspace while the internal modes of the holons span a semantic subspace in the orthogonal direction. In this information space, the image of a figure is self-organized - as a sort of spatiotemporal pattern - by autonomic coordinations of the holons that select relevant internal modes, accompanied with compression of irrelevant syntons that correspond to the background. Holons coded by a synton are relevantly connected by means of coherent relations, i.e., dynamic connections with time-coherence, in order to represent the image that varies in time depending on the instantaneous state of the external object. These connections depend on the internal modes that are cooperatively selectively selected by the holons. The image is regarded as a bridge between the external and internal world that has both external and internal consistency. The meaning of the image, i.e., transformed semantic information, is spontaneously transferred from semantic items that have a coherent relation with the image, and the external signal is perceived by the self through the image. We demonstrate that images are indeed self-organized in holovision in the previously described sense. Simulated processes of the creation of semantic information in holovision are shown to display typical features of the forgoing steps of information compression. Based on these results, we propose quantitative indices that represent the value of semantic information in the image processor as well as in the memory.

  15. Semantic integration of audio-visual information of polyphonic characters in a sentence context: an event-related potential study.

    PubMed

    Liu, Hong; Zhang, Gaoyan; Liu, Baolin

    2017-04-01

    In the Chinese language, a polyphone is a kind of special character that has more than one pronunciation, with each pronunciation corresponding to a different meaning. Here, we aimed to reveal the cognitive processing of audio-visual information integration of polyphones in a sentence context using the event-related potential (ERP) method. Sentences ending with polyphones were presented to subjects simultaneously in both an auditory and a visual modality. Four experimental conditions were set in which the visual presentations were the same, but the pronunciations of the polyphones were: the correct pronunciation; another pronunciation of the polyphone; a semantically appropriate pronunciation but not the pronunciation of the polyphone; or a semantically inappropriate pronunciation but also not the pronunciation of the polyphone. The behavioral results demonstrated significant differences in response accuracies when judging the semantic meanings of the audio-visual sentences, which reflected the different demands on cognitive resources. The ERP results showed that in the early stage, abnormal pronunciations were represented by the amplitude of the P200 component. Interestingly, because the phonological information mediated access to the lexical semantics, the amplitude and latency of the N400 component changed linearly across conditions, which may reflect the gradually increased semantic mismatch in the four conditions when integrating the auditory pronunciation with the visual information. Moreover, the amplitude of the late positive shift (LPS) showed a significant correlation with the behavioral response accuracies, demonstrating that the LPS component reveals the demand of cognitive resources for monitoring and resolving semantic conflicts when integrating the audio-visual information.

  16. The influence of speech rate and accent on access and use of semantic information.

    PubMed

    Sajin, Stanislav M; Connine, Cynthia M

    2017-04-01

    Circumstances in which the speech input is presented in sub-optimal conditions generally lead to processing costs affecting spoken word recognition. The current study indicates that some processing demands imposed by listening to difficult speech can be mitigated by feedback from semantic knowledge. A set of lexical decision experiments examined how foreign accented speech and word duration impact access to semantic knowledge in spoken word recognition. Results indicate that when listeners process accented speech, the reliance on semantic information increases. Speech rate was not observed to influence semantic access, except in the setting in which unusually slow accented speech was presented. These findings support interactive activation models of spoken word recognition in which attention is modulated based on speech demands.

  17. Less Is More: Semantic Information Survives Interocular Suppression When Attention Is Diverted.

    PubMed

    Eo, Kangyong; Cha, Oakyoon; Chong, Sang Chul; Kang, Min-Suk

    2016-05-18

    The extent of unconscious semantic processing has been debated. It is well established that semantic information is registered in the absence of awareness induced by inattention. However, it has been debated whether semantic information of invisible stimuli is processed during interocular suppression, a procedure that renders one eye's view invisible by presenting a dissimilar stimulus to the other eye. Inspired by recent evidence demonstrating that reduced attention attenuates interocular suppression, we tested a counterintuitive hypothesis that attention withdrawn from the suppressed target location facilitates semantic processing in the absence of awareness induced by interocular suppression. We obtained an electrophysiological marker of semantic processing (N400 component) while human participants' spatial attention was being manipulated with a cueing paradigm during interocular suppression. We found that N400 modulation was absent when participants' attention was directed to the target location, but present when diverted elsewhere. In addition, the correlation analysis across participants indicated that the N400 amplitude was reduced with more attention being directed to the target location. Together, these results indicate that inattention attenuates interocular suppression and thereby makes semantic processing available unconsciously, reconciling conflicting evidence in the literature. We discuss a tight link among interocular suppression, attention, and conscious awareness. Interocular suppression offers a powerful means of studying the extent of unconscious processing by rendering a salient stimulus presented to one eye invisible. Here, we provide evidence that attention is a determining factor for unconscious semantic processing. An electrophysiological marker for semantic processing (N400 component) was present when attention was diverted away from the suppressed stimulus but absent when attention was directed to that stimulus, indicating that inattention facilitates unconscious semantic processing during the interocular suppression. Although contrary to the common sense assumption that attention facilitates information processing, this result is in accordance with recent studies showing that attention modulates interocular suppression but is not necessary for semantic processing. Our finding reconciles the conflicting evidence and advances theories of consciousness. Copyright © 2016 the authors 0270-6474/16/365489-09$15.00/0.

  18. Structured reports of videofluoroscopic swallowing studies have the potential to improve overall report quality compared to free text reports.

    PubMed

    Schoeppe, Franziska; Sommer, Wieland H; Haack, Mareike; Havel, Miriam; Rheinwald, Marika; Wechtenbruch, Juliane; Fischer, Martin R; Meinel, Felix G; Sabel, Bastian O; Sommer, Nora N

    2018-01-01

    To compare free text (FTR) and structured reports (SR) of videofluoroscopic swallowing studies (VFSS) and evaluate satisfaction of referring otolaryngologists and speech therapists. Both standard FTR and SR of 26 patients with VFSS were acquired. A dedicated template focusing on oropharyngeal phases was created for SR using online software with clickable decision-trees and concomitant generation of semantically structured reports. All reports were evaluated regarding overall quality and content, information extraction and clinical decision support (10-point Likert scale (0 = I completely disagree, 10 = I completely agree)). Two otorhinolaryngologists and two speech therapists evaluated FTR and SR. SR received better ratings than FTR in all items. SR were perceived to contain more details on the swallowing phases (median rating: 10 vs. 5; P < 0.001), penetration and aspiration (10 vs. 5; P < 0.001) and facilitated information extraction compared to FTR (10 vs. 4; P < 0.001). Overall quality was rated significantly higher in SR than FTR (P < 0.001). SR of VFSS provide more detailed information and facilitate information extraction. SR better assist in clinical decision-making, might enhance the quality of the report and, thus, are recommended for the evaluation of VFSS. • Structured reports on videofluoroscopic exams of deglutition lead to improved report quality. • Information extraction is facilitated when using structured reports based on decision trees. • Template-based reports add more value to clinical decision-making than free text reports. • Structured reports receive better ratings by speech therapists and otolaryngologists. • Structured reports on videofluoroscopic exams may improve the comparability between exams.

  19. Edge-Based Image Compression with Homogeneous Diffusion

    NASA Astrophysics Data System (ADS)

    Mainberger, Markus; Weickert, Joachim

    It is well-known that edges contain semantically important image information. In this paper we present a lossy compression method for cartoon-like images that exploits information at image edges. These edges are extracted with the Marr-Hildreth operator followed by hysteresis thresholding. Their locations are stored in a lossless way using JBIG. Moreover, we encode the grey or colour values at both sides of each edge by applying quantisation, subsampling and PAQ coding. In the decoding step, information outside these encoded data is recovered by solving the Laplace equation, i.e. we inpaint with the steady state of a homogeneous diffusion process. Our experiments show that the suggested method outperforms the widely-used JPEG standard and can even beat the advanced JPEG2000 standard for cartoon-like images.

  20. Temporal Representation in Semantic Graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Levandoski, J J; Abdulla, G M

    2007-08-07

    A wide range of knowledge discovery and analysis applications, ranging from business to biological, make use of semantic graphs when modeling relationships and concepts. Most of the semantic graphs used in these applications are assumed to be static pieces of information, meaning temporal evolution of concepts and relationships are not taken into account. Guided by the need for more advanced semantic graph queries involving temporal concepts, this paper surveys the existing work involving temporal representations in semantic graphs.

  1. Semantic-Web Technology: Applications at NASA

    NASA Technical Reports Server (NTRS)

    Ashish, Naveen

    2004-01-01

    We provide a description of work at the National Aeronautics and Space Administration (NASA) on building system based on semantic-web concepts and technologies. NASA has been one of the early adopters of semantic-web technologies for practical applications. Indeed there are several ongoing 0 endeavors on building semantics based systems for use in diverse NASA domains ranging from collaborative scientific activity to accident and mishap investigation to enterprise search to scientific information gathering and integration to aviation safety decision support We provide a brief overview of many applications and ongoing work with the goal of informing the external community of these NASA endeavors.

  2. Workspaces in the Semantic Web

    NASA Technical Reports Server (NTRS)

    Wolfe, Shawn R.; Keller, RIchard M.

    2005-01-01

    Due to the recency and relatively limited adoption of Semantic Web technologies. practical issues related to technology scaling have received less attention than foundational issues. Nonetheless, these issues must be addressed if the Semantic Web is to realize its full potential. In particular, we concentrate on the lack of scoping methods that reduce the size of semantic information spaces so they are more efficient to work with and more relevant to an agent's needs. We provide some intuition to motivate the need for such reduced information spaces, called workspaces, give a formal definition, and suggest possible methods of deriving them.

  3. A logical approach to semantic interoperability in healthcare.

    PubMed

    Bird, Linda; Brooks, Colleen; Cheong, Yu Chye; Tun, Nwe Ni

    2011-01-01

    Singapore is in the process of rolling out a number of national e-health initiatives, including the National Electronic Health Record (NEHR). A critical enabler in the journey towards semantic interoperability is a Logical Information Model (LIM) that harmonises the semantics of the information structure with the terminology. The Singapore LIM uses a combination of international standards, including ISO 13606-1 (a reference model for electronic health record communication), ISO 21090 (healthcare datatypes), and SNOMED CT (healthcare terminology). The LIM is accompanied by a logical design approach, used to generate interoperability artifacts, and incorporates mechanisms for achieving unidirectional and bidirectional semantic interoperability.

  4. A Visual Analytics Framework for Identifying Topic Drivers in Media Events.

    PubMed

    Lu, Yafeng; Wang, Hong; Landis, Steven; Maciejewski, Ross

    2017-09-14

    Media data has been the subject of large scale analysis with applications of text mining being used to provide overviews of media themes and information flows. Such information extracted from media articles has also shown its contextual value of being integrated with other data, such as criminal records and stock market pricing. In this work, we explore linking textual media data with curated secondary textual data sources through user-guided semantic lexical matching for identifying relationships and data links. In this manner, critical information can be identified and used to annotate media timelines in order to provide a more detailed overview of events that may be driving media topics and frames. These linked events are further analyzed through an application of causality modeling to model temporal drivers between the data series. Such causal links are then annotated through automatic entity extraction which enables the analyst to explore persons, locations, and organizations that may be pertinent to the media topic of interest. To demonstrate the proposed framework, two media datasets and an armed conflict event dataset are explored.

  5. Eye Movements to Pictures Reveal Transient Semantic Activation during Spoken Word Recognition

    ERIC Educational Resources Information Center

    Yee, Eiling; Sedivy, Julie C.

    2006-01-01

    Two experiments explore the activation of semantic information during spoken word recognition. Experiment 1 shows that as the name of an object unfolds (e.g., lock), eye movements are drawn to pictorial representations of both the named object and semantically related objects (e.g., key). Experiment 2 shows that objects semantically related to an…

  6. Explaining Semantic Short-Term Memory Deficits: Evidence for the Critical Role of Semantic Control

    ERIC Educational Resources Information Center

    Hoffman, Paul; Jefferies, Elizabeth; Lambon Ralph, Matthew A.

    2011-01-01

    Patients with apparently selective short-term memory (STM) deficits for semantic information have played an important role in developing multi-store theories of STM and challenge the idea that verbal STM is supported by maintaining activation in the language system. We propose that semantic STM deficits are not as selective as previously thought…

  7. Building a semantic web-based metadata repository for facilitating detailed clinical modeling in cancer genome studies.

    PubMed

    Sharma, Deepak K; Solbrig, Harold R; Tao, Cui; Weng, Chunhua; Chute, Christopher G; Jiang, Guoqian

    2017-06-05

    Detailed Clinical Models (DCMs) have been regarded as the basis for retaining computable meaning when data are exchanged between heterogeneous computer systems. To better support clinical cancer data capturing and reporting, there is an emerging need to develop informatics solutions for standards-based clinical models in cancer study domains. The objective of the study is to develop and evaluate a cancer genome study metadata management system that serves as a key infrastructure in supporting clinical information modeling in cancer genome study domains. We leveraged a Semantic Web-based metadata repository enhanced with both ISO11179 metadata standard and Clinical Information Modeling Initiative (CIMI) Reference Model. We used the common data elements (CDEs) defined in The Cancer Genome Atlas (TCGA) data dictionary, and extracted the metadata of the CDEs using the NCI Cancer Data Standards Repository (caDSR) CDE dataset rendered in the Resource Description Framework (RDF). The ITEM/ITEM_GROUP pattern defined in the latest CIMI Reference Model is used to represent reusable model elements (mini-Archetypes). We produced a metadata repository with 38 clinical cancer genome study domains, comprising a rich collection of mini-Archetype pattern instances. We performed a case study of the domain "clinical pharmaceutical" in the TCGA data dictionary and demonstrated enriched data elements in the metadata repository are very useful in support of building detailed clinical models. Our informatics approach leveraging Semantic Web technologies provides an effective way to build a CIMI-compliant metadata repository that would facilitate the detailed clinical modeling to support use cases beyond TCGA in clinical cancer study domains.

  8. The elephant in the room: Inconsistency in scene viewing and representation.

    PubMed

    Spotorno, Sara; Tatler, Benjamin W

    2017-10-01

    We examined the extent to which semantic informativeness, consistency with expectations and perceptual salience contribute to object prioritization in scene viewing and representation. In scene viewing (Experiments 1-2), semantic guidance overshadowed perceptual guidance in determining fixation order, with the greatest prioritization for objects that were diagnostic of the scene's depicted event. Perceptual properties affected selection of consistent objects (regardless of their informativeness) but not of inconsistent objects. Semantic and perceptual properties also interacted in influencing foveal inspection, as inconsistent objects were fixated longer than low but not high salience diagnostic objects. While not studied in direct competition with each other (each studied in competition with diagnostic objects), we found that inconsistent objects were fixated earlier and for longer than consistent but marginally informative objects. In change detection (Experiment 3), perceptual guidance overshadowed semantic guidance, promoting detection of highly salient changes. A residual advantage for diagnosticity over inconsistency emerged only when selection prioritization could not be based on low-level features. Overall these findings show that semantic inconsistency is not prioritized within a scene when competing with other relevant information that is essential to scene understanding and respects observers' expectations. Moreover, they reveal that the relative dominance of semantic or perceptual properties during selection depends on ongoing task requirements. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  9. A novel software architecture for the provision of context-aware semantic transport information.

    PubMed

    Moreno, Asier; Perallos, Asier; López-de-Ipiña, Diego; Onieva, Enrique; Salaberria, Itziar; Masegosa, Antonio D

    2015-05-26

    The effectiveness of Intelligent Transportation Systems depends largely on the ability to integrate information from diverse sources and the suitability of this information for the specific user. This paper describes a new approach for the management and exchange of this information, related to multimodal transportation. A novel software architecture is presented, with particular emphasis on the design of the data model and the enablement of services for information retrieval, thereby obtaining a semantic model for the representation of transport information. The publication of transport data as semantic information is established through the development of a Multimodal Transport Ontology (MTO) and the design of a distributed architecture allowing dynamic integration of transport data. The advantages afforded by the proposed system due to the use of Linked Open Data and a distributed architecture are stated, comparing it with other existing solutions. The adequacy of the information generated in regard to the specific user's context is also addressed. Finally, a working solution of a semantic trip planner using actual transport data and running on the proposed architecture is presented, as a demonstration and validation of the system.

  10. HealthCyberMap: a semantic visual browser of medical Internet resources based on clinical codes and the human body metaphor.

    PubMed

    Kamel Boulos, Maged N; Roudsari, Abdul V; Carso N, Ewart R

    2002-12-01

    HealthCyberMap (HCM-http://healthcybermap.semanticweb.org) is a web-based service for healthcare professionals and librarians, patients and the public in general that aims at mapping parts of the health information resources in cyberspace in novel ways to improve their retrieval and navigation. HCM adopts a clinical metadata framework built upon a clinical coding ontology for the semantic indexing, classification and browsing of Internet health information resources. A resource metadata base holds information about selected resources. HCM then uses GIS (Geographic Information Systems) spatialization methods to generate interactive navigational cybermaps from the metadata base. These visual cybermaps are based on familiar medical metaphors. HCM cybermaps can be considered as semantically spatialized, ontology-based browsing views of the underlying resource metadata base. Using a clinical coding scheme as a metric for spatialization ('semantic distance') is unique to HCM and is very much suited for the semantic categorization and navigation of Internet health information resources. Clinical codes ensure reliable and unambiguous topical indexing of these resources. HCM also introduces a useful form of cyberspatial analysis for the detection of topical coverage gaps in the resource metadata base using choropleth (shaded) maps of human body systems.

  11. A development framework for semantically interoperable health information systems.

    PubMed

    Lopez, Diego M; Blobel, Bernd G M E

    2009-02-01

    Semantic interoperability is a basic challenge to be met for new generations of distributed, communicating and co-operating health information systems (HIS) enabling shared care and e-Health. Analysis, design, implementation and maintenance of such systems and intrinsic architectures have to follow a unified development methodology. The Generic Component Model (GCM) is used as a framework for modeling any system to evaluate and harmonize state of the art architecture development approaches and standards for health information systems as well as to derive a coherent architecture development framework for sustainable, semantically interoperable HIS and their components. The proposed methodology is based on the Rational Unified Process (RUP), taking advantage of its flexibility to be configured for integrating other architectural approaches such as Service-Oriented Architecture (SOA), Model-Driven Architecture (MDA), ISO 10746, and HL7 Development Framework (HDF). Existing architectural approaches have been analyzed, compared and finally harmonized towards an architecture development framework for advanced health information systems. Starting with the requirements for semantic interoperability derived from paradigm changes for health information systems, and supported in formal software process engineering methods, an appropriate development framework for semantically interoperable HIS has been provided. The usability of the framework has been exemplified in a public health scenario.

  12. Parafoveal processing in silent and oral reading: Reading mode influences the relative weighting of phonological and semantic information in Chinese.

    PubMed

    Pan, Jinger; Laubrock, Jochen; Yan, Ming

    2016-08-01

    We examined how reading mode (i.e., silent vs. oral reading) influences parafoveal semantic and phonological processing during the reading of Chinese sentences, using the gaze-contingent boundary paradigm. In silent reading, we found in 2 experiments that reading times on target words were shortened with semantic previews in early and late processing, whereas phonological preview effects mainly occurred in gaze duration or second-pass reading. In contrast, results showed that phonological preview information is obtained early on in oral reading. Strikingly, in oral reading, we observed a semantic preview cost on the target word in Experiment 1 and a decrease in the effect size of preview benefit from first- to second-pass measures in Experiment 2, which we hypothesize to result from increased preview duration. Taken together, our results indicate that parafoveal semantic information can be obtained irrespective of reading mode, whereas readers more efficiently process parafoveal phonological information in oral reading. We discuss implications for notions of information processing priority and saccade generation during silent and oral reading. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  13. Informatics in radiology: radiology gamuts ontology: differential diagnosis for the Semantic Web.

    PubMed

    Budovec, Joseph J; Lam, Cesar A; Kahn, Charles E

    2014-01-01

    The Semantic Web is an effort to add semantics, or "meaning," to empower automated searching and processing of Web-based information. The overarching goal of the Semantic Web is to enable users to more easily find, share, and combine information. Critical to this vision are knowledge models called ontologies, which define a set of concepts and formalize the relations between them. Ontologies have been developed to manage and exploit the large and rapidly growing volume of information in biomedical domains. In diagnostic radiology, lists of differential diagnoses of imaging observations, called gamuts, provide an important source of knowledge. The Radiology Gamuts Ontology (RGO) is a formal knowledge model of differential diagnoses in radiology that includes 1674 differential diagnoses, 19,017 terms, and 52,976 links between terms. Its knowledge is used to provide an interactive, freely available online reference of radiology gamuts ( www.gamuts.net ). A Web service allows its content to be discovered and consumed by other information systems. The RGO integrates radiologic knowledge with other biomedical ontologies as part of the Semantic Web. © RSNA, 2014.

  14. Creating personalised clinical pathways by semantic interoperability with electronic health records.

    PubMed

    Wang, Hua-Qiong; Li, Jing-Song; Zhang, Yi-Fan; Suzuki, Muneou; Araki, Kenji

    2013-06-01

    There is a growing realisation that clinical pathways (CPs) are vital for improving the treatment quality of healthcare organisations. However, treatment personalisation is one of the main challenges when implementing CPs, and the inadequate dynamic adaptability restricts the practicality of CPs. The purpose of this study is to improve the practicality of CPs using semantic interoperability between knowledge-based CPs and semantic electronic health records (EHRs). Simple protocol and resource description framework query language is used to gather patient information from semantic EHRs. The gathered patient information is entered into the CP ontology represented by web ontology language. Then, after reasoning over rules described by semantic web rule language in the Jena semantic framework, we adjust the standardised CPs to meet different patients' practical needs. A CP for acute appendicitis is used as an example to illustrate how to achieve CP customisation based on the semantic interoperability between knowledge-based CPs and semantic EHRs. A personalised care plan is generated by comprehensively analysing the patient's personal allergy history and past medical history, which are stored in semantic EHRs. Additionally, by monitoring the patient's clinical information, an exception is recorded and handled during CP execution. According to execution results of the actual example, the solutions we present are shown to be technically feasible. This study contributes towards improving the clinical personalised practicality of standardised CPs. In addition, this study establishes the foundation for future work on the research and development of an independent CP system. Copyright © 2013 Elsevier B.V. All rights reserved.

  15. Ontologies in medicinal chemistry: current status and future challenges.

    PubMed

    Gómez-Pérez, Asunción; Martínez-Romero, Marcos; Rodríguez-González, Alejandro; Vázquez, Guillermo; Vázquez-Naya, José M

    2013-01-01

    Recent years have seen a dramatic increase in the amount and availability of data in the diverse areas of medicinal chemistry, making it possible to achieve significant advances in fields such as the design, synthesis and biological evaluation of compounds. However, with this data explosion, the storage, management and analysis of available data to extract relevant information has become even a more complex task that offers challenging research issues to Artificial Intelligence (AI) scientists. Ontologies have emerged in AI as a key tool to formally represent and semantically organize aspects of the real world. Beyond glossaries or thesauri, ontologies facilitate communication between experts and allow the application of computational techniques to extract useful information from available data. In medicinal chemistry, multiple ontologies have been developed during the last years which contain knowledge about chemical compounds and processes of synthesis of pharmaceutical products. This article reviews the principal standards and ontologies in medicinal chemistry, analyzes their main applications and suggests future directions.

  16. Domain-independent information extraction in unstructured text

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Irwin, N.H.

    Extracting information from unstructured text has become an important research area in recent years due to the large amount of text now electronically available. This status report describes the findings and work done during the second year of a two-year Laboratory Directed Research and Development Project. Building on the first-year`s work of identifying important entities, this report details techniques used to group words into semantic categories and to output templates containing selective document content. Using word profiles and category clustering derived during a training run, the time-consuming knowledge-building task can be avoided. Though the output still lacks in completeness whenmore » compared to systems with domain-specific knowledge bases, the results do look promising. The two approaches are compatible and could complement each other within the same system. Domain-independent approaches retain appeal as a system that adapts and learns will soon outpace a system with any amount of a priori knowledge.« less

  17. Investigating the capabilities of semantic enrichment of 3D CityEngine data

    NASA Astrophysics Data System (ADS)

    Solou, Dimitra; Dimopoulou, Efi

    2016-08-01

    In recent years the development of technology and the lifting of several technical limitations, has brought the third dimension to the fore. The complexity of urban environments and the strong need for land administration, intensify the need of using a three-dimensional cadastral system. Despite the progress in the field of geographic information systems and 3D modeling techniques, there is no fully digital 3D cadastre. The existing geographic information systems and the different methods of three-dimensional modeling allow for better management, visualization and dissemination of information. Nevertheless, these opportunities cannot be totally exploited because of deficiencies in standardization and interoperability in these systems. Within this context, CityGML was developed as an international standard of the Open Geospatial Consortium (OGC) for 3D city models' representation and exchange. CityGML defines geometry and topology for city modeling, also focusing on semantic aspects of 3D city information. The scope of CityGML is to reach common terminology, also addressing the imperative need for interoperability and data integration, taking into account the number of available geographic information systems and modeling techniques. The aim of this paper is to develop an application for managing semantic information of a model generated based on procedural modeling. The model was initially implemented in CityEngine ESRI's software, and then imported to ArcGIS environment. Final goal was the original model's semantic enrichment and then its conversion to CityGML format. Semantic information management and interoperability seemed to be feasible by the use of the 3DCities Project ESRI tools, since its database structure ensures adding semantic information to the CityEngine model and therefore automatically convert to CityGML for advanced analysis and visualization in different application areas.

  18. Document cards: a top trumps visualization for documents.

    PubMed

    Strobelt, Hendrik; Oelke, Daniela; Rohrdantz, Christian; Stoffel, Andreas; Keim, Daniel A; Deussen, Oliver

    2009-01-01

    Finding suitable, less space consuming views for a document's main content is crucial to provide convenient access to large document collections on display devices of different size. We present a novel compact visualization which represents the document's key semantic as a mixture of images and important key terms, similar to cards in a top trumps game. The key terms are extracted using an advanced text mining approach based on a fully automatic document structure extraction. The images and their captions are extracted using a graphical heuristic and the captions are used for a semi-semantic image weighting. Furthermore, we use the image color histogram for classification and show at least one representative from each non-empty image class. The approach is demonstrated for the IEEE InfoVis publications of a complete year. The method can easily be applied to other publication collections and sets of documents which contain images.

  19. Computational methods to extract meaning from text and advance theories of human cognition.

    PubMed

    McNamara, Danielle S

    2011-01-01

    Over the past two decades, researchers have made great advances in the area of computational methods for extracting meaning from text. This research has to a large extent been spurred by the development of latent semantic analysis (LSA), a method for extracting and representing the meaning of words using statistical computations applied to large corpora of text. Since the advent of LSA, researchers have developed and tested alternative statistical methods designed to detect and analyze meaning in text corpora. This research exemplifies how statistical models of semantics play an important role in our understanding of cognition and contribute to the field of cognitive science. Importantly, these models afford large-scale representations of human knowledge and allow researchers to explore various questions regarding knowledge, discourse processing, text comprehension, and language. This topic includes the latest progress by the leading researchers in the endeavor to go beyond LSA. Copyright © 2010 Cognitive Science Society, Inc.

  20. Semantic Segmentation of Building Elements Using Point Cloud Hashing

    NASA Astrophysics Data System (ADS)

    Chizhova, M.; Gurianov, A.; Hess, M.; Luhmann, T.; Brunn, A.; Stilla, U.

    2018-05-01

    For the interpretation of point clouds, the semantic definition of extracted segments from point clouds or images is a common problem. Usually, the semantic of geometrical pre-segmented point cloud elements are determined using probabilistic networks and scene databases. The proposed semantic segmentation method is based on the psychological human interpretation of geometric objects, especially on fundamental rules of primary comprehension. Starting from these rules the buildings could be quite well and simply classified by a human operator (e.g. architect) into different building types and structural elements (dome, nave, transept etc.), including particular building parts which are visually detected. The key part of the procedure is a novel method based on hashing where point cloud projections are transformed into binary pixel representations. A segmentation approach released on the example of classical Orthodox churches is suitable for other buildings and objects characterized through a particular typology in its construction (e.g. industrial objects in standardized enviroments with strict component design allowing clear semantic modelling).

  1. Taxonomic and Thematic Semantic Systems

    PubMed Central

    Mirman, Daniel; Landrigan, Jon-Frederick; Britt, Allison E.

    2017-01-01

    Object concepts are critical for nearly all aspects of human cognition, from perception tasks like object recognition, to understanding and producing language, to making meaningful actions. Concepts can have two very different kinds of relations: similarity relations based on shared features (e.g., dog – bear), which are called “taxonomic” relations, and contiguity relations based on co-occurrence in events or scenarios (e.g., dog – leash), which are called “thematic” relations. Here we report a systematic review of experimental psychology and cognitive neuroscience evidence of this distinction in the structure of semantic memory. We propose two principles that may drive the development of distinct taxonomic and thematic semantic systems: (1) differences between which features determine taxonomic vs. thematic relations and (2) differences in the processing required to extract taxonomic vs. thematic relations. This review brings together distinct threads of behavioral, computational, and neuroscience research on semantic memory in support of a functional and neural dissociation, and defines a framework for future studies of semantic memory. PMID:28333494

  2. Resolving anaphoras for the extraction of drug-drug interactions in pharmacological documents

    PubMed Central

    2010-01-01

    Background Drug-drug interactions are frequently reported in the increasing amount of biomedical literature. Information Extraction (IE) techniques have been devised as a useful instrument to manage this knowledge. Nevertheless, IE at the sentence level has a limited effect because of the frequent references to previous entities in the discourse, a phenomenon known as 'anaphora'. DrugNerAR, a drug anaphora resolution system is presented to address the problem of co-referring expressions in pharmacological literature. This development is part of a larger and innovative study about automatic drug-drug interaction extraction. Methods The system uses a set of linguistic rules drawn by Centering Theory over the analysis provided by a biomedical syntactic parser. Semantic information provided by the Unified Medical Language System (UMLS) is also integrated in order to improve the recognition and the resolution of nominal drug anaphors. Besides, a corpus has been developed in order to analyze the phenomena and evaluate the current approach. Each possible case of anaphoric expression was looked into to determine the most effective way of resolution. Results An F-score of 0.76 in anaphora resolution was achieved, outperforming significantly the baseline by almost 73%. This ad-hoc reference line was developed to check the results as there is no previous work on anaphora resolution in pharmalogical documents. The obtained results resemble those found in related-semantic domains. Conclusions The present approach shows very promising results in the challenge of accounting for anaphoric expressions in pharmacological texts. DrugNerAr obtains similar results to other approaches dealing with anaphora resolution in the biomedical domain, but, unlike these approaches, it focuses on documents reflecting drug interactions. The Centering Theory has proved being effective at the selection of antecedents in anaphora resolution. A key component in the success of this framework is the analysis provided by the MMTx program and the DrugNer system that allows to deal with the complexity of the pharmacological language. It is expected that the positive results of the resolver increases performance of our future drug-drug interaction extraction system. PMID:20406499

  3. Semantic relatedness between words in each individual brain: an event-related potential study.

    PubMed

    Hata, Masahiro; Homae, Fumitaka; Hagiwara, Hiroko

    2011-08-26

    The relationship between 2 words is judged by the meanings of words. Here, we examined how the semantic relatedness of words is structured in each individual brain. During measurements of event-related potentials (ERPs), participants performed semantic-relatedness judgments of word pairs. For each participant, we divided word pairs into 2 groups--related and unrelated pairs--and compared their ERPs. All of the participants showed a significant N400 effect. However, when we applied an identical grouping of pairs, this effect was observed only in half the number of the participants. These results show that our single-subject analysis of N400 extracted semantic relatedness of words in the individual brain. Future studies using this analysis will clarify the organization of the mental lexicon. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  4. A DNA-based semantic fusion model for remote sensing data.

    PubMed

    Sun, Heng; Weng, Jian; Yu, Guangchuang; Massawe, Richard H

    2013-01-01

    Semantic technology plays a key role in various domains, from conversation understanding to algorithm analysis. As the most efficient semantic tool, ontology can represent, process and manage the widespread knowledge. Nowadays, many researchers use ontology to collect and organize data's semantic information in order to maximize research productivity. In this paper, we firstly describe our work on the development of a remote sensing data ontology, with a primary focus on semantic fusion-driven research for big data. Our ontology is made up of 1,264 concepts and 2,030 semantic relationships. However, the growth of big data is straining the capacities of current semantic fusion and reasoning practices. Considering the massive parallelism of DNA strands, we propose a novel DNA-based semantic fusion model. In this model, a parallel strategy is developed to encode the semantic information in DNA for a large volume of remote sensing data. The semantic information is read in a parallel and bit-wise manner and an individual bit is converted to a base. By doing so, a considerable amount of conversion time can be saved, i.e., the cluster-based multi-processes program can reduce the conversion time from 81,536 seconds to 4,937 seconds for 4.34 GB source data files. Moreover, the size of result file recording DNA sequences is 54.51 GB for parallel C program compared with 57.89 GB for sequential Perl. This shows that our parallel method can also reduce the DNA synthesis cost. In addition, data types are encoded in our model, which is a basis for building type system in our future DNA computer. Finally, we describe theoretically an algorithm for DNA-based semantic fusion. This algorithm enables the process of integration of the knowledge from disparate remote sensing data sources into a consistent, accurate, and complete representation. This process depends solely on ligation reaction and screening operations instead of the ontology.

  5. A DNA-Based Semantic Fusion Model for Remote Sensing Data

    PubMed Central

    Sun, Heng; Weng, Jian; Yu, Guangchuang; Massawe, Richard H.

    2013-01-01

    Semantic technology plays a key role in various domains, from conversation understanding to algorithm analysis. As the most efficient semantic tool, ontology can represent, process and manage the widespread knowledge. Nowadays, many researchers use ontology to collect and organize data's semantic information in order to maximize research productivity. In this paper, we firstly describe our work on the development of a remote sensing data ontology, with a primary focus on semantic fusion-driven research for big data. Our ontology is made up of 1,264 concepts and 2,030 semantic relationships. However, the growth of big data is straining the capacities of current semantic fusion and reasoning practices. Considering the massive parallelism of DNA strands, we propose a novel DNA-based semantic fusion model. In this model, a parallel strategy is developed to encode the semantic information in DNA for a large volume of remote sensing data. The semantic information is read in a parallel and bit-wise manner and an individual bit is converted to a base. By doing so, a considerable amount of conversion time can be saved, i.e., the cluster-based multi-processes program can reduce the conversion time from 81,536 seconds to 4,937 seconds for 4.34 GB source data files. Moreover, the size of result file recording DNA sequences is 54.51 GB for parallel C program compared with 57.89 GB for sequential Perl. This shows that our parallel method can also reduce the DNA synthesis cost. In addition, data types are encoded in our model, which is a basis for building type system in our future DNA computer. Finally, we describe theoretically an algorithm for DNA-based semantic fusion. This algorithm enables the process of integration of the knowledge from disparate remote sensing data sources into a consistent, accurate, and complete representation. This process depends solely on ligation reaction and screening operations instead of the ontology. PMID:24116207

  6. An approach for the semantic interoperability of ISO EN 13606 and OpenEHR archetypes.

    PubMed

    Martínez-Costa, Catalina; Menárguez-Tortosa, Marcos; Fernández-Breis, Jesualdo Tomás

    2010-10-01

    The communication between health information systems of hospitals and primary care organizations is currently an important challenge to improve the quality of clinical practice and patient safety. However, clinical information is usually distributed among several independent systems that may be syntactically or semantically incompatible. This fact prevents healthcare professionals from accessing clinical information of patients in an understandable and normalized way. In this work, we address the semantic interoperability of two EHR standards: OpenEHR and ISO EN 13606. Both standards follow the dual model approach which distinguishes information and knowledge, this being represented through archetypes. The solution presented here is capable of transforming OpenEHR archetypes into ISO EN 13606 and vice versa by combining Semantic Web and Model-driven Engineering technologies. The resulting software implementation has been tested using publicly available collections of archetypes for both standards.

  7. An Educational Tool for Browsing the Semantic Web

    ERIC Educational Resources Information Center

    Yoo, Sujin; Kim, Younghwan; Park, Seongbin

    2013-01-01

    The Semantic Web is an extension of the current Web where information is represented in a machine processable way. It is not separate from the current Web and one of the confusions that novice users might have is where the Semantic Web is. In fact, users can easily encounter RDF documents that are components of the Semantic Web while they navigate…

  8. A Semantic Web-based System for Managing Clinical Archetypes.

    PubMed

    Fernandez-Breis, Jesualdo Tomas; Menarguez-Tortosa, Marcos; Martinez-Costa, Catalina; Fernandez-Breis, Eneko; Herrero-Sempere, Jose; Moner, David; Sanchez, Jesus; Valencia-Garcia, Rafael; Robles, Montserrat

    2008-01-01

    Archetypes facilitate the sharing of clinical knowledge and therefore are a basic tool for achieving interoperability between healthcare information systems. In this paper, a Semantic Web System for Managing Archetypes is presented. This system allows for the semantic annotation of archetypes, as well for performing semantic searches. The current system is capable of working with both ISO13606 and OpenEHR archetypes.

  9. Coherent concepts are computed in the anterior temporal lobes.

    PubMed

    Lambon Ralph, Matthew A; Sage, Karen; Jones, Roy W; Mayberry, Emily J

    2010-02-09

    In his Philosophical Investigations, Wittgenstein famously noted that the formation of semantic representations requires more than a simple combination of verbal and nonverbal features to generate conceptually based similarities and differences. Classical and contemporary neuroscience has tended to focus upon how different neocortical regions contribute to conceptualization through the summation of modality-specific information. The additional yet critical step of computing coherent concepts has received little attention. Some computational models of semantic memory are able to generate such concepts by the addition of modality-invariant information coded in a multidimensional semantic space. By studying patients with semantic dementia, we demonstrate that this aspect of semantic memory becomes compromised following atrophy of the anterior temporal lobes and, as a result, the patients become increasingly influenced by superficial rather than conceptual similarities.

  10. The STP (Solar-Terrestrial Physics) Semantic Web based on the RSS1.0 and the RDF

    NASA Astrophysics Data System (ADS)

    Kubo, T.; Murata, K. T.; Kimura, E.; Ishikura, S.; Shinohara, I.; Kasaba, Y.; Watari, S.; Matsuoka, D.

    2006-12-01

    In the Solar-Terrestrial Physics (STP), it is pointed out that circulation and utilization of observation data among researchers are insufficient. To archive interdisciplinary researches, we need to overcome this circulation and utilization problems. Under such a background, authors' group has developed a world-wide database that manages meta-data of satellite and ground-based observation data files. It is noted that retrieving meta-data from the observation data and registering them to database have been carried out by hand so far. Our goal is to establish the STP Semantic Web. The Semantic Web provides a common framework that allows a variety of data shared and reused across applications, enterprises, and communities. We also expect that the secondary information related with observations, such as event information and associated news, are also shared over the networks. The most fundamental issue on the establishment is who generates, manages and provides meta-data in the Semantic Web. We developed an automatic meta-data collection system for the observation data using the RSS (RDF Site Summary) 1.0. The RSS1.0 is one of the XML-based markup languages based on the RDF (Resource Description Framework), which is designed for syndicating news and contents of news-like sites. The RSS1.0 is used to describe the STP meta-data, such as data file name, file server address and observation date. To describe the meta-data of the STP beyond RSS1.0 vocabulary, we defined original vocabularies for the STP resources using the RDF Schema. The RDF describes technical terms on the STP along with the Dublin Core Metadata Element Set, which is standard for cross-domain information resource descriptions. Researchers' information on the STP by FOAF, which is known as an RDF/XML vocabulary, creates a machine-readable metadata describing people. Using the RSS1.0 as a meta-data distribution method, the workflow from retrieving meta-data to registering them into the database is automated. This technique is applied for several database systems, such as the DARTS database system and NICT Space Weather Report Service. The DARTS is a science database managed by ISAS/JAXA in Japan. We succeeded in generating and collecting the meta-data automatically for the CDF (Common data Format) data, such as Reimei satellite data, provided by the DARTS. We also create an RDF service for space weather report and real-time global MHD simulation 3D data provided by the NICT. Our Semantic Web system works as follows: The RSS1.0 documents generated on the data sites (ISAS and NICT) are automatically collected by a meta-data collection agent. The RDF documents are registered and the agent extracts meta-data to store them in the Sesame, which is an open source RDF database with support for RDF Schema inferencing and querying. The RDF database provides advanced retrieval processing that has considered property and relation. Finally, the STP Semantic Web provides automatic processing or high level search for the data which are not only for observation data but for space weather news, physical events, technical terms and researches information related to the STP.

  11. The Analysis of RDF Semantic Data Storage Optimization in Large Data Era

    NASA Astrophysics Data System (ADS)

    He, Dandan; Wang, Lijuan; Wang, Can

    2018-03-01

    With the continuous development of information technology and network technology in China, the Internet has also ushered in the era of large data. In order to obtain the effective acquisition of information in the era of large data, it is necessary to optimize the existing RDF semantic data storage and realize the effective query of various data. This paper discusses the storage optimization of RDF semantic data under large data.

  12. Medical Image Analysis by Cognitive Information Systems - a Review.

    PubMed

    Ogiela, Lidia; Takizawa, Makoto

    2016-10-01

    This publication presents a review of medical image analysis systems. The paradigms of cognitive information systems will be presented by examples of medical image analysis systems. The semantic processes present as it is applied to different types of medical images. Cognitive information systems were defined on the basis of methods for the semantic analysis and interpretation of information - medical images - applied to cognitive meaning of medical images contained in analyzed data sets. Semantic analysis was proposed to analyzed the meaning of data. Meaning is included in information, for example in medical images. Medical image analysis will be presented and discussed as they are applied to various types of medical images, presented selected human organs, with different pathologies. Those images were analyzed using different classes of cognitive information systems. Cognitive information systems dedicated to medical image analysis was also defined for the decision supporting tasks. This process is very important for example in diagnostic and therapy processes, in the selection of semantic aspects/features, from analyzed data sets. Those features allow to create a new way of analysis.

  13. Calculating semantic relatedness for biomedical use in a knowledge-poor environment.

    PubMed

    Rybinski, Maciej; Aldana-Montes, José

    2014-01-01

    Computing semantic relatedness between textual labels representing biological and medical concepts is a crucial task in many automated knowledge extraction and processing applications relevant to the biomedical domain, specifically due to the huge amount of new findings being published each year. Most methods benefit from making use of highly specific resources, thus reducing their usability in many real world scenarios that differ from the original assumptions. In this paper we present a simple resource-efficient method for calculating semantic relatedness in a knowledge-poor environment. The method obtains results comparable to state-of-the-art methods, while being more generic and flexible. The solution being presented here was designed to use only a relatively generic and small document corpus and its statistics, without referring to a previously defined knowledge base, thus it does not assume a 'closed' problem. We propose a method in which computation for two input texts is based on the idea of comparing the vocabulary associated with the best-fit documents related to those texts. As keyterm extraction is a costly process, it is done in a preprocessing step on a 'per-document' basis in order to limit the on-line processing. The actual computations are executed in a compact vector space, limited by the most informative extraction results. The method has been evaluated on five direct benchmarks by calculating correlation coefficients w.r.t. average human answers. It also has been used on Gene - Disease and Disease- Disease data pairs to highlight its potential use as a data analysis tool. Apart from comparisons with reported results, some interesting features of the method have been studied, i.e. the relationship between result quality, efficiency and applicable trimming threshold for size reduction. Experimental evaluation shows that the presented method obtains results that are comparable with current state of the art methods, even surpassing them on a majority of the benchmarks. Additionally, a possible usage scenario for the method is showcased with a real-world data experiment. Our method improves flexibility of the existing methods without a notable loss of quality. It is a legitimate alternative to the costly construction of specialized knowledge-rich resources.

  14. Archetype-based semantic integration and standardization of clinical data.

    PubMed

    Moner, David; Maldonado, Jose A; Bosca, Diego; Fernandez, Jesualdo T; Angulo, Carlos; Crespo, Pere; Vivancos, Pedro J; Robles, Montserrat

    2006-01-01

    One of the basic needs for any healthcare professional is to be able to access to clinical information of patients in an understandable and normalized way. The lifelong clinical information of any person supported by electronic means configures his/her Electronic Health Record (EHR). This information is usually distributed among several independent and heterogeneous systems that may be syntactically or semantically incompatible. The Dual Model architecture has appeared as a new proposal for maintaining a homogeneous representation of the EHR with a clear separation between information and knowledge. Information is represented by a Reference Model which describes common data structures with minimal semantics. Knowledge is specified by archetypes, which are formal representations of clinical concepts built upon a particular Reference Model. This kind of architecture is originally thought for implantation of new clinical information systems, but archetypes can be also used for integrating data of existing and not normalized systems, adding at the same time a semantic meaning to the integrated data. In this paper we explain the possible use of a Dual Model approach for semantic integration and standardization of heterogeneous clinical data sources and present LinkEHR-Ed, a tool for developing archetypes as elements for integration purposes. LinkEHR-Ed has been designed to be easily used by the two main participants of the creation process of archetypes for clinical data integration: the Health domain expert and the Information Technologies domain expert.

  15. Towards an Obesity-Cancer Knowledge Base: Biomedical Entity Identification and Relation Detection

    PubMed Central

    Lossio-Ventura, Juan Antonio; Hogan, William; Modave, François; Hicks, Amanda; Hanna, Josh; Guo, Yi; He, Zhe; Bian, Jiang

    2017-01-01

    Obesity is associated with increased risks of various types of cancer, as well as a wide range of other chronic diseases. On the other hand, access to health information activates patient participation, and improve their health outcomes. However, existing online information on obesity and its relationship to cancer is heterogeneous ranging from pre-clinical models and case studies to mere hypothesis-based scientific arguments. A formal knowledge representation (i.e., a semantic knowledge base) would help better organizing and delivering quality health information related to obesity and cancer that consumers need. Nevertheless, current ontologies describing obesity, cancer and related entities are not designed to guide automatic knowledge base construction from heterogeneous information sources. Thus, in this paper, we present methods for named-entity recognition (NER) to extract biomedical entities from scholarly articles and for detecting if two biomedical entities are related, with the long term goal of building a obesity-cancer knowledge base. We leverage both linguistic and statistical approaches in the NER task, which supersedes the state-of-the-art results. Further, based on statistical features extracted from the sentences, our method for relation detection obtains an accuracy of 99.3% and a f-measure of 0.993. PMID:28503356

  16. Taxonomy, Ontology and Semantics at Johnson Space Center

    NASA Technical Reports Server (NTRS)

    Berndt, Sarah Ann

    2011-01-01

    At NASA Johnson Space Center (JSC), the Chief Knowledge Officer has been developing the JSC Taxonomy to capitalize on the accomplishments of yesterday while maintaining the flexibility needed for the evolving information environment of today. A clear vision and scope for the semantic system is integral to its success. The vision for the JSC Taxonomy is to connect information stovepipes to present a unified view for information and knowledge across the Center, across organizations, and across decades. Semantic search at JSC means seemless integration of disparate information sets into a single interface. Ever increasing use, interest, and organizational participation mark successful integration and provide the framework for future application.

  17. Classification with an edge: Improving semantic image segmentation with boundary detection

    NASA Astrophysics Data System (ADS)

    Marmanis, D.; Schindler, K.; Wegner, J. D.; Galliani, S.; Datcu, M.; Stilla, U.

    2018-01-01

    We present an end-to-end trainable deep convolutional neural network (DCNN) for semantic segmentation with built-in awareness of semantically meaningful boundaries. Semantic segmentation is a fundamental remote sensing task, and most state-of-the-art methods rely on DCNNs as their workhorse. A major reason for their success is that deep networks learn to accumulate contextual information over very large receptive fields. However, this success comes at a cost, since the associated loss of effective spatial resolution washes out high-frequency details and leads to blurry object boundaries. Here, we propose to counter this effect by combining semantic segmentation with semantically informed edge detection, thus making class boundaries explicit in the model. First, we construct a comparatively simple, memory-efficient model by adding boundary detection to the SEGNET encoder-decoder architecture. Second, we also include boundary detection in FCN-type models and set up a high-end classifier ensemble. We show that boundary detection significantly improves semantic segmentation with CNNs in an end-to-end training scheme. Our best model achieves >90% overall accuracy on the ISPRS Vaihingen benchmark.

  18. Supporting the self-concept with memory: insight from amnesia

    PubMed Central

    Verfaellie, Mieke

    2015-01-01

    We investigated the extent to which personal semantic memory supports the self-concept in individuals with medial temporal lobe amnesia and healthy adults. Participants completed eight ‘I Am’ self-statements. For each of the four highest ranked self-statements, participants completed an open-ended narrative task, during which they provided supporting information indicating why the I Am statement was considered self-descriptive. Participants then completed an episodic probe task, during which they attempted to retrieve six episodic memories for each of these self-statements. Supporting information was scored as episodic, personal semantic or general semantic. In the narrative task, personal semantic memory predominated as self-supporting information in both groups. The amnesic participants generated fewer personal semantic memories than controls to support their self-statements, a deficit that was more pronounced for trait relative to role self-statements. In the episodic probe task, the controls primarily generated unique event memories, but the amnesic participants did not. These findings demonstrate that personal semantic memory, in particular autobiographical fact knowledge, plays a critical role in supporting the self-concept, regardless of the accessibility of episodic memories, and they highlight potential differences in the way traits and roles are supported by personal memory. PMID:25964501

  19. Type-specific proactive interference in patients with semantic and phonological STM deficits.

    PubMed

    Harris, Lara; Olson, Andrew; Humphreys, Glyn

    2014-01-01

    Prior neuropsychological evidence suggests that semantic and phonological components of short-term memory (STM) are functionally and neurologically distinct. The current paper examines proactive interference (PI) from semantic and phonological information in two STM-impaired patients, DS (semantic STM deficit) and AK (phonological STM deficit). In Experiment 1 probe recognition tasks with open and closed sets of stimuli were used. Phonological PI was assessed using nonword items, and semantic and phonological PI was assessed using words. In Experiment 2 phonological and semantic PI was elicited by an item recognition probe test with stimuli that bore phonological and semantic relations to the probes. The data suggested heightened phonological PI for the semantic STM patient, and exaggerated effects of semantic PI in the phonological STM case. The findings are consistent with an account of extremely rapid decay of activated type-specific representations in cases of severely impaired phonological and semantic STM.

  20. Identifying city PV roof resource based on Gabor filter

    NASA Astrophysics Data System (ADS)

    Ruhang, Xu; Zhilin, Liu; Yong, Huang; Xiaoyu, Zhang

    2017-06-01

    To identify a city’s PV roof resources, the area and ownership distribution of residential buildings in an urban district should be assessed. To achieve this assessment, remote sensing data analysing is a promising approach. Urban building roof area estimation is a major topic for remote sensing image information extraction. There are normally three ways to solve this problem. The first way is pixel-based analysis, which is based on mathematical morphology or statistical methods; the second way is object-based analysis, which is able to combine semantic information and expert knowledge; the third way is signal-processing view method. This paper presented a Gabor filter based method. This result shows that the method is fast and with proper accuracy.

  1. Benchmarking natural-language parsers for biological applications using dependency graphs.

    PubMed

    Clegg, Andrew B; Shepherd, Adrian J

    2007-01-25

    Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques.

  2. Benchmarking natural-language parsers for biological applications using dependency graphs

    PubMed Central

    Clegg, Andrew B; Shepherd, Adrian J

    2007-01-01

    Background Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. Results Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. Conclusion Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques. PMID:17254351

  3. Out-of-Sample Extrapolation utilizing Semi-Supervised Manifold Learning (OSE-SSL): Content Based Image Retrieval for Histopathology Images

    PubMed Central

    Sparks, Rachel; Madabhushi, Anant

    2016-01-01

    Content-based image retrieval (CBIR) retrieves database images most similar to the query image by (1) extracting quantitative image descriptors and (2) calculating similarity between database and query image descriptors. Recently, manifold learning (ML) has been used to perform CBIR in a low dimensional representation of the high dimensional image descriptor space to avoid the curse of dimensionality. ML schemes are computationally expensive, requiring an eigenvalue decomposition (EVD) for every new query image to learn its low dimensional representation. We present out-of-sample extrapolation utilizing semi-supervised ML (OSE-SSL) to learn the low dimensional representation without recomputing the EVD for each query image. OSE-SSL incorporates semantic information, partial class label, into a ML scheme such that the low dimensional representation co-localizes semantically similar images. In the context of prostate histopathology, gland morphology is an integral component of the Gleason score which enables discrimination between prostate cancer aggressiveness. Images are represented by shape features extracted from the prostate gland. CBIR with OSE-SSL for prostate histology obtained from 58 patient studies, yielded an area under the precision recall curve (AUPRC) of 0.53 ± 0.03 comparatively a CBIR with Principal Component Analysis (PCA) to learn a low dimensional space yielded an AUPRC of 0.44 ± 0.01. PMID:27264985

  4. Ontology Reuse in Geoscience Semantic Applications

    NASA Astrophysics Data System (ADS)

    Mayernik, M. S.; Gross, M. B.; Daniels, M. D.; Rowan, L. R.; Stott, D.; Maull, K. E.; Khan, H.; Corson-Rikert, J.

    2015-12-01

    The tension between local ontology development and wider ontology connections is fundamental to the Semantic web. It is often unclear, however, what the key decision points should be for new semantic web applications in deciding when to reuse existing ontologies and when to develop original ontologies. In addition, with the growth of semantic web ontologies and applications, new semantic web applications can struggle to efficiently and effectively identify and select ontologies to reuse. This presentation will describe the ontology comparison, selection, and consolidation effort within the EarthCollab project. UCAR, Cornell University, and UNAVCO are collaborating on the EarthCollab project to use semantic web technologies to enable the discovery of the research output from a diverse array of projects. The EarthCollab project is using the VIVO Semantic web software suite to increase discoverability of research information and data related to the following two geoscience-based communities: (1) the Bering Sea Project, an interdisciplinary field program whose data archive is hosted by NCAR's Earth Observing Laboratory (EOL), and (2) diverse research projects informed by geodesy through the UNAVCO geodetic facility and consortium. This presentation will outline of EarthCollab use cases, and provide an overview of key ontologies being used, including the VIVO-Integrated Semantic Framework (VIVO-ISF), Global Change Information System (GCIS), and Data Catalog (DCAT) ontologies. We will discuss issues related to bringing these ontologies together to provide a robust ontological structure to support the EarthCollab use cases. It is rare that a single pre-existing ontology meets all of a new application's needs. New projects need to stitch ontologies together in ways that fit into the broader semantic web ecosystem.

  5. A user-centred evaluation framework for the Sealife semantic web browsers

    PubMed Central

    Oliver, Helen; Diallo, Gayo; de Quincey, Ed; Alexopoulou, Dimitra; Habermann, Bianca; Kostkova, Patty; Schroeder, Michael; Jupp, Simon; Khelif, Khaled; Stevens, Robert; Jawaheer, Gawesh; Madle, Gemma

    2009-01-01

    Background Semantically-enriched browsing has enhanced the browsing experience by providing contextualised dynamically generated Web content, and quicker access to searched-for information. However, adoption of Semantic Web technologies is limited and user perception from the non-IT domain sceptical. Furthermore, little attention has been given to evaluating semantic browsers with real users to demonstrate the enhancements and obtain valuable feedback. The Sealife project investigates semantic browsing and its application to the life science domain. Sealife's main objective is to develop the notion of context-based information integration by extending three existing Semantic Web browsers (SWBs) to link the existing Web to the eScience infrastructure. Methods This paper describes a user-centred evaluation framework that was developed to evaluate the Sealife SWBs that elicited feedback on users' perceptions on ease of use and information findability. Three sources of data: i) web server logs; ii) user questionnaires; and iii) semi-structured interviews were analysed and comparisons made between each browser and a control system. Results It was found that the evaluation framework used successfully elicited users' perceptions of the three distinct SWBs. The results indicate that the browser with the most mature and polished interface was rated higher for usability, and semantic links were used by the users of all three browsers. Conclusion Confirmation or contradiction of our original hypotheses with relation to SWBs is detailed along with observations of implementation issues. PMID:19796398

  6. Memory as discrimination: what distraction reveals.

    PubMed

    Beaman, C Philip; Hanczakowski, Maciej; Hodgetts, Helen M; Marsh, John E; Jones, Dylan M

    2013-11-01

    Recalling information involves the process of discriminating between relevant and irrelevant information stored in memory. Not infrequently, the relevant information needs to be selected from among a series of related possibilities. This is likely to be particularly problematic when the irrelevant possibilities not only are temporally or contextually appropriate, but also overlap semantically with the target or targets. Here, we investigate the extent to which purely perceptual features that discriminate between irrelevant and target material can be used to overcome the negative impact of contextual and semantic relatedness. Adopting a distraction paradigm, it is demonstrated that when distractors are interleaved with targets presented either visually (Experiment 1) or auditorily (Experiment 2), a within-modality semantic distraction effect occurs; semantically related distractors impact upon recall more than do unrelated distractors. In the semantically related condition, the number of intrusions in recall is reduced, while the number of correctly recalled targets is simultaneously increased by the presence of perceptual cues to relevance (color features in Experiment 1 or speaker's gender in Experiment 2). However, as is demonstrated in Experiment 3, even presenting semantically related distractors in a language and a sensory modality (spoken Welsh) distinct from that of the targets (visual English) is insufficient to eliminate false recalls completely or to restore correct recall to levels seen with unrelated distractors . Together, the study shows how semantic and nonsemantic discriminability shape patterns of both erroneous and correct recall.

  7. Semantics of the visual environment encoded in parahippocampal cortex

    PubMed Central

    Bonner, Michael F.; Price, Amy Rose; Peelle, Jonathan E.; Grossman, Murray

    2016-01-01

    Semantic representations capture the statistics of experience and store this information in memory. A fundamental component of this memory system is knowledge of the visual environment, including knowledge of objects and their associations. Visual semantic information underlies a range of behaviors, from perceptual categorization to cognitive processes such as language and reasoning. Here we examine the neuroanatomic system that encodes visual semantics. Across three experiments, we found converging evidence indicating that knowledge of verbally mediated visual concepts relies on information encoded in a region of the ventral-medial temporal lobe centered on parahippocampal cortex. In an fMRI study, this region was strongly engaged by the processing of concepts relying on visual knowledge but not by concepts relying on other sensory modalities. In a study of patients with the semantic variant of primary progressive aphasia (semantic dementia), atrophy that encompassed this region was associated with a specific impairment in verbally mediated visual semantic knowledge. Finally, in a structural study of healthy adults from the fMRI experiment, gray matter density in this region related to individual variability in the processing of visual concepts. The anatomic location of these findings aligns with recent work linking the ventral-medial temporal lobe with high-level visual representation, contextual associations, and reasoning through imagination. Together this work suggests a critical role for parahippocampal cortex in linking the visual environment with knowledge systems in the human brain. PMID:26679216

  8. Semantics of the Visual Environment Encoded in Parahippocampal Cortex.

    PubMed

    Bonner, Michael F; Price, Amy Rose; Peelle, Jonathan E; Grossman, Murray

    2016-03-01

    Semantic representations capture the statistics of experience and store this information in memory. A fundamental component of this memory system is knowledge of the visual environment, including knowledge of objects and their associations. Visual semantic information underlies a range of behaviors, from perceptual categorization to cognitive processes such as language and reasoning. Here we examine the neuroanatomic system that encodes visual semantics. Across three experiments, we found converging evidence indicating that knowledge of verbally mediated visual concepts relies on information encoded in a region of the ventral-medial temporal lobe centered on parahippocampal cortex. In an fMRI study, this region was strongly engaged by the processing of concepts relying on visual knowledge but not by concepts relying on other sensory modalities. In a study of patients with the semantic variant of primary progressive aphasia (semantic dementia), atrophy that encompassed this region was associated with a specific impairment in verbally mediated visual semantic knowledge. Finally, in a structural study of healthy adults from the fMRI experiment, gray matter density in this region related to individual variability in the processing of visual concepts. The anatomic location of these findings aligns with recent work linking the ventral-medial temporal lobe with high-level visual representation, contextual associations, and reasoning through imagination. Together, this work suggests a critical role for parahippocampal cortex in linking the visual environment with knowledge systems in the human brain.

  9. A user-centred evaluation framework for the Sealife semantic web browsers.

    PubMed

    Oliver, Helen; Diallo, Gayo; de Quincey, Ed; Alexopoulou, Dimitra; Habermann, Bianca; Kostkova, Patty; Schroeder, Michael; Jupp, Simon; Khelif, Khaled; Stevens, Robert; Jawaheer, Gawesh; Madle, Gemma

    2009-10-01

    Semantically-enriched browsing has enhanced the browsing experience by providing contextualized dynamically generated Web content, and quicker access to searched-for information. However, adoption of Semantic Web technologies is limited and user perception from the non-IT domain sceptical. Furthermore, little attention has been given to evaluating semantic browsers with real users to demonstrate the enhancements and obtain valuable feedback. The Sealife project investigates semantic browsing and its application to the life science domain. Sealife's main objective is to develop the notion of context-based information integration by extending three existing Semantic Web browsers (SWBs) to link the existing Web to the eScience infrastructure. This paper describes a user-centred evaluation framework that was developed to evaluate the Sealife SWBs that elicited feedback on users' perceptions on ease of use and information findability. Three sources of data: i) web server logs; ii) user questionnaires; and iii) semi-structured interviews were analysed and comparisons made between each browser and a control system. It was found that the evaluation framework used successfully elicited users' perceptions of the three distinct SWBs. The results indicate that the browser with the most mature and polished interface was rated higher for usability, and semantic links were used by the users of all three browsers. Confirmation or contradiction of our original hypotheses with relation to SWBs is detailed along with observations of implementation issues.

  10. The failing measurement of attitudes: How semantic determinants of individual survey responses come to replace measures of attitude strength.

    PubMed

    Arnulf, Jan Ketil; Larsen, Kai Rune; Martinsen, Øyvind Lund; Egeland, Thore

    2018-01-12

    The traditional understanding of data from Likert scales is that the quantifications involved result from measures of attitude strength. Applying a recently proposed semantic theory of survey response, we claim that survey responses tap two different sources: a mixture of attitudes plus the semantic structure of the survey. Exploring the degree to which individual responses are influenced by semantics, we hypothesized that in many cases, information about attitude strength is actually filtered out as noise in the commonly used correlation matrix. We developed a procedure to separate the semantic influence from attitude strength in individual response patterns, and compared these results to, respectively, the observed sample correlation matrices and the semantic similarity structures arising from text analysis algorithms. This was done with four datasets, comprising a total of 7,787 subjects and 27,461,502 observed item pair responses. As we argued, attitude strength seemed to account for much information about the individual respondents. However, this information did not seem to carry over into the observed sample correlation matrices, which instead converged around the semantic structures offered by the survey items. This is potentially disturbing for the traditional understanding of what survey data represent. We argue that this approach contributes to a better understanding of the cognitive processes involved in survey responses. In turn, this could help us make better use of the data that such methods provide.

  11. Semantic concept-enriched dependence model for medical information retrieval.

    PubMed

    Choi, Sungbin; Choi, Jinwook; Yoo, Sooyoung; Kim, Heechun; Lee, Youngho

    2014-02-01

    In medical information retrieval research, semantic resources have been mostly used by expanding the original query terms or estimating the concept importance weight. However, implicit term-dependency information contained in semantic concept terms has been overlooked or at least underused in most previous studies. In this study, we incorporate a semantic concept-based term-dependence feature into a formal retrieval model to improve its ranking performance. Standardized medical concept terms used by medical professionals were assumed to have implicit dependency within the same concept. We hypothesized that, by elaborately revising the ranking algorithms to favor documents that preserve those implicit dependencies, the ranking performance could be improved. The implicit dependence features are harvested from the original query using MetaMap. These semantic concept-based dependence features were incorporated into a semantic concept-enriched dependence model (SCDM). We designed four different variants of the model, with each variant having distinct characteristics in the feature formulation method. We performed leave-one-out cross validations on both a clinical document corpus (TREC Medical records track) and a medical literature corpus (OHSUMED), which are representative test collections in medical information retrieval research. Our semantic concept-enriched dependence model consistently outperformed other state-of-the-art retrieval methods. Analysis shows that the performance gain has occurred independently of the concept's explicit importance in the query. By capturing implicit knowledge with regard to the query term relationships and incorporating them into a ranking model, we could build a more robust and effective retrieval model, independent of the concept importance. Copyright © 2013 Elsevier Inc. All rights reserved.

  12. Semantic interoperability--HL7 Version 3 compared to advanced architecture standards.

    PubMed

    Blobel, B G M E; Engel, K; Pharow, P

    2006-01-01

    To meet the challenge for high quality and efficient care, highly specialized and distributed healthcare establishments have to communicate and co-operate in a semantically interoperable way. Information and communication technology must be open, flexible, scalable, knowledge-based and service-oriented as well as secure and safe. For enabling semantic interoperability, a unified process for defining and implementing the architecture, i.e. structure and functions of the cooperating systems' components, as well as the approach for knowledge representation, i.e. the used information and its interpretation, algorithms, etc. have to be defined in a harmonized way. Deploying the Generic Component Model, systems and their components, underlying concepts and applied constraints must be formally modeled, strictly separating platform-independent from platform-specific models. As HL7 Version 3 claims to represent the most successful standard for semantic interoperability, HL7 has been analyzed regarding the requirements for model-driven, service-oriented design of semantic interoperable information systems, thereby moving from a communication to an architecture paradigm. The approach is compared with advanced architectural approaches for information systems such as OMG's CORBA 3 or EHR systems such as GEHR/openEHR and CEN EN 13606 Electronic Health Record Communication. HL7 Version 3 is maturing towards an architectural approach for semantic interoperability. Despite current differences, there is a close collaboration between the teams involved guaranteeing a convergence between competing approaches.

  13. [Analysis of health terminologies for use as ontologies in healthcare information systems].

    PubMed

    Romá-Ferri, Maria Teresa; Palomar, Manuel

    2008-01-01

    Ontologies are a resource that allow the concept of meaning to be represented informatically, thus avoiding the limitations imposed by standardized terms. The objective of this study was to establish the extent to which terminologies could be used for the design of ontologies, which could be serve as an aid to resolve problems such as semantic interoperability and knowledge reusability in healthcare information systems. To determine the extent to which terminologies could be used as ontologies, six of the most important terminologies in clinical, epidemiologic, documentation and administrative-economic contexts were analyzed. The following characteristics were verified: conceptual coverage, hierarchical structure, conceptual granularity of the categories, conceptual relations, and the language used for conceptual representation. MeSH, DeCS and UMLS ontologies were considered lightweight. The main differences among these ontologies concern conceptual specification, the types of relation and the restrictions among the associated concepts. SNOMED and GALEN ontologies have declaratory formalism, based on logical descriptions. These ontologies include explicit qualities and show greater restrictions among associated concepts and rule combinations and were consequently considered as heavyweight. Analysis of the declared representation of the terminologies shows the extent to which they could be reused as ontologies. Their degree of usability depends on whether the aim is for healthcare information systems to solve problems of semantic interoperability (lightweight ontologies) or to reuse the systems' knowledge as an aid to decision making (heavyweight ontologies) and for non-structured information retrieval, extraction, and classification.

  14. Structure before Meaning: Sentence Processing, Plausibility, and Subcategorization

    PubMed Central

    Kizach, Johannes; Nyvad, Anne Mette; Christensen, Ken Ramshøj

    2013-01-01

    Natural language processing is a fast and automatized process. A crucial part of this process is parsing, the online incremental construction of a syntactic structure. The aim of this study was to test whether a wh-filler extracted from an embedded clause is initially attached as the object of the matrix verb with subsequent reanalysis, and if so, whether the plausibility of such an attachment has an effect on reaction time. Finally, we wanted to examine whether subcategorization plays a role. We used a method called G-Maze to measure response time in a self-paced reading design. The experiments confirmed that there is early attachment of fillers to the matrix verb. When this attachment is implausible, the off-line acceptability of the whole sentence is significantly reduced. The on-line results showed that G-Maze was highly suited for this type of experiment. In accordance with our predictions, the results suggest that the parser ignores (or has no access to information about) implausibility and attaches fillers as soon as possible to the matrix verb. However, the results also show that the parser uses the subcategorization frame of the matrix verb. In short, the parser ignores semantic information and allows implausible attachments but adheres to information about which type of object a verb can take, ensuring that the parser does not make impossible attachments. We argue that the evidence supports a syntactic parser informed by syntactic cues, rather than one guided by semantic cues or one that is blind, or completely autonomous. PMID:24116101

  15. Structure before meaning: sentence processing, plausibility, and subcategorization.

    PubMed

    Kizach, Johannes; Nyvad, Anne Mette; Christensen, Ken Ramshøj

    2013-01-01

    Natural language processing is a fast and automatized process. A crucial part of this process is parsing, the online incremental construction of a syntactic structure. The aim of this study was to test whether a wh-filler extracted from an embedded clause is initially attached as the object of the matrix verb with subsequent reanalysis, and if so, whether the plausibility of such an attachment has an effect on reaction time. Finally, we wanted to examine whether subcategorization plays a role. We used a method called G-Maze to measure response time in a self-paced reading design. The experiments confirmed that there is early attachment of fillers to the matrix verb. When this attachment is implausible, the off-line acceptability of the whole sentence is significantly reduced. The on-line results showed that G-Maze was highly suited for this type of experiment. In accordance with our predictions, the results suggest that the parser ignores (or has no access to information about) implausibility and attaches fillers as soon as possible to the matrix verb. However, the results also show that the parser uses the subcategorization frame of the matrix verb. In short, the parser ignores semantic information and allows implausible attachments but adheres to information about which type of object a verb can take, ensuring that the parser does not make impossible attachments. We argue that the evidence supports a syntactic parser informed by syntactic cues, rather than one guided by semantic cues or one that is blind, or completely autonomous.

  16. A hierarchical SVG image abstraction layer for medical imaging

    NASA Astrophysics Data System (ADS)

    Kim, Edward; Huang, Xiaolei; Tan, Gang; Long, L. Rodney; Antani, Sameer

    2010-03-01

    As medical imaging rapidly expands, there is an increasing need to structure and organize image data for efficient analysis, storage and retrieval. In response, a large fraction of research in the areas of content-based image retrieval (CBIR) and picture archiving and communication systems (PACS) has focused on structuring information to bridge the "semantic gap", a disparity between machine and human image understanding. An additional consideration in medical images is the organization and integration of clinical diagnostic information. As a step towards bridging the semantic gap, we design and implement a hierarchical image abstraction layer using an XML based language, Scalable Vector Graphics (SVG). Our method encodes features from the raw image and clinical information into an extensible "layer" that can be stored in a SVG document and efficiently searched. Any feature extracted from the raw image including, color, texture, orientation, size, neighbor information, etc., can be combined in our abstraction with high level descriptions or classifications. And our representation can natively characterize an image in a hierarchical tree structure to support multiple levels of segmentation. Furthermore, being a world wide web consortium (W3C) standard, SVG is able to be displayed by most web browsers, interacted with by ECMAScript (standardized scripting language, e.g. JavaScript, JScript), and indexed and retrieved by XML databases and XQuery. Using these open source technologies enables straightforward integration into existing systems. From our results, we show that the flexibility and extensibility of our abstraction facilitates effective storage and retrieval of medical images.

  17. A research framework for pharmacovigilance in health social media: Identification and evaluation of patient adverse drug event reports.

    PubMed

    Liu, Xiao; Chen, Hsinchun

    2015-12-01

    Social media offer insights of patients' medical problems such as drug side effects and treatment failures. Patient reports of adverse drug events from social media have great potential to improve current practice of pharmacovigilance. However, extracting patient adverse drug event reports from social media continues to be an important challenge for health informatics research. In this study, we develop a research framework with advanced natural language processing techniques for integrated and high-performance patient reported adverse drug event extraction. The framework consists of medical entity extraction for recognizing patient discussions of drug and events, adverse drug event extraction with shortest dependency path kernel based statistical learning method and semantic filtering with information from medical knowledge bases, and report source classification to tease out noise. To evaluate the proposed framework, a series of experiments were conducted on a test bed encompassing about postings from major diabetes and heart disease forums in the United States. The results reveal that each component of the framework significantly contributes to its overall effectiveness. Our framework significantly outperforms prior work. Published by Elsevier Inc.

  18. Supervised Semantic Classification for Nuclear Proliferation Monitoring

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vatsavai, Raju; Cheriyadat, Anil M; Gleason, Shaun Scott

    2010-01-01

    Existing feature extraction and classification approaches are not suitable for monitoring proliferation activity using high-resolution multi-temporal remote sensing imagery. In this paper we present a supervised semantic labeling framework based on the Latent Dirichlet Allocation method. This framework is used to analyze over 120 images collected under different spatial and temporal settings over the globe representing three major semantic categories: airports, nuclear, and coal power plants. Initial experimental results show a reasonable discrimination of these three categories even though coal and nuclear images share highly common and overlapping objects. This research also identified several research challenges associated with nuclear proliferationmore » monitoring using high resolution remote sensing images.« less

  19. A Left Cerebral Hemisphere's Superiority in Processing Spatial-Categorical Information in a Non-Verbal Semantic Format

    ERIC Educational Resources Information Center

    Suegami, Takashi; Laeng, Bruno

    2013-01-01

    It has been shown that the left and right cerebral hemispheres (LH and RH) respectively process qualitative or "categorical" spatial relations and metric or "coordinate" spatial relations. However, categorical spatial information could be thought as divided into two types: semantically-coded and visuospatially-coded categorical information. We…

  20. Facial and semantic emotional interference: A pilot study on the behavioral and cortical responses to the dual valence association task

    PubMed Central

    2011-01-01

    Background Integration of compatible or incompatible emotional valence and semantic information is an essential aspect of complex social interactions. A modified version of the Implicit Association Test (IAT) called Dual Valence Association Task (DVAT) was designed in order to measure conflict resolution processing from compatibility/incompatibly of semantic and facial valence. The DVAT involves two emotional valence evaluative tasks which elicits two forms of emotional compatible/incompatible associations (facial and semantic). Methods Behavioural measures and Event Related Potentials were recorded while participants performed the DVAT. Results Behavioural data showed a robust effect that distinguished compatible/incompatible tasks. The effects of valence and contextual association (between facial and semantic stimuli) showed early discrimination in N170 of faces. The LPP component was modulated by the compatibility of the DVAT. Conclusions Results suggest that DVAT is a robust paradigm for studying the emotional interference effect in the processing of simultaneous information from semantic and facial stimuli. PMID:21489277

  1. A Novel Software Architecture for the Provision of Context-Aware Semantic Transport Information

    PubMed Central

    Moreno, Asier; Perallos, Asier; López-de-Ipiña, Diego; Onieva, Enrique; Salaberria, Itziar; Masegosa, Antonio D.

    2015-01-01

    The effectiveness of Intelligent Transportation Systems depends largely on the ability to integrate information from diverse sources and the suitability of this information for the specific user. This paper describes a new approach for the management and exchange of this information, related to multimodal transportation. A novel software architecture is presented, with particular emphasis on the design of the data model and the enablement of services for information retrieval, thereby obtaining a semantic model for the representation of transport information. The publication of transport data as semantic information is established through the development of a Multimodal Transport Ontology (MTO) and the design of a distributed architecture allowing dynamic integration of transport data. The advantages afforded by the proposed system due to the use of Linked Open Data and a distributed architecture are stated, comparing it with other existing solutions. The adequacy of the information generated in regard to the specific user’s context is also addressed. Finally, a working solution of a semantic trip planner using actual transport data and running on the proposed architecture is presented, as a demonstration and validation of the system. PMID:26016915

  2. Semantic-episodic interactions in the neuropsychology of disbelief.

    PubMed

    Ladowsky-Brooks, Ricki; Alcock, James E

    2007-03-01

    The purpose of this paper is to outline ways in which characteristics of memory functioning determine truth judgements regarding verbally transmitted information. Findings on belief formation from several areas of psychology were reviewed in order to identify general principles that appear to underlie the designation of information in memory as "true" or "false". Studies on belief formation have demonstrated that individuals have a tendency to encode information as "true" and that an additional encoding step is required to tag information as "false". This additional step can involve acquisition and later recall of semantic-episodic associations between message content and contextual cues that signal that information is "false". Semantic-episodic interactions also appear to prevent new information from being accepted as "true" through encoding bias or the assignment of a "false" tag to data that is incompatible with prior knowledge. It is proposed that truth judgements are made through a combined weighting of the reliability of the information source and the compatibility of this information with already stored data. This requires interactions in memory. Failure to integrate different types of memories, such as semantic and episodic memories, can arise from mild hippocampal dysfunction and might result in delusions.

  3. Behavioural and magnetoencephalographic evidence for the interaction between semantic and episodic memory in healthy elderly subjects.

    PubMed

    La Corte, Valentina; Dalla Barba, Gianfranco; Lemaréchal, Jean-Didier; Garnero, Line; George, Nathalie

    2012-10-01

    The relationship between episodic and semantic memory systems has long been debated. Some authors argue that episodic memory is contingent on semantic memory (Tulving 1984), while others postulate that both systems are independent since they can be selectively damaged (Squire 1987). The interaction between these memory systems is particularly important in the elderly, since the dissociation of episodic and semantic memory defects characterize different aging-related pathologies. Here, we investigated the interaction between semantic knowledge and episodic memory processes associated with faces in elderly subjects using an experimental paradigm where the semantic encoding of famous and unknown faces was compared to their episodic recognition. Results showed that the level of semantic awareness of items affected the recognition of those items in the episodic memory task. Event-related magnetic fields confirmed this interaction between episodic and semantic memory: ERFs related to the old/new effect during the episodic task were markedly different for famous and unknown faces. The old/new effect for famous faces involved sustained activities maximal over right temporal sensors, showing a spatio-temporal pattern partly similar to that found for famous versus unknown faces during the semantic task. By contrast, an old/new effect for unknown faces was observed on left parieto-occipital sensors. These findings suggest that the episodic memory for famous faces activated the retrieval of stored semantic information, whereas it was based on items' perceptual features for unknown faces. Overall, our results show that semantic information interfered markedly with episodic memory processes and suggested that the neural substrates of these two memory systems overlap.

  4. Transcranial Direct Current Stimulation Effects on Semantic Processing in Healthy Individuals.

    PubMed

    Joyal, Marilyne; Fecteau, Shirley

    2016-01-01

    Semantic processing allows us to use conceptual knowledge about the world. It has been associated with a large distributed neural network that includes the frontal, temporal and parietal cortices. Recent studies using transcranial direct current stimulation (tDCS) also contributed at investigating semantic processing. The goal of this article was to review studies investigating semantic processing in healthy individuals with tDCS and discuss findings from these studies in line with neuroimaging results. Based on functional magnetic resonance imaging studies assessing semantic processing, we predicted that tDCS applied over the inferior frontal gyrus, middle temporal gyrus, and posterior parietal cortex will impact semantic processing. We conducted a search on Pubmed and selected 27 articles in which tDCS was used to modulate semantic processing in healthy subjects. We analysed each article according to these criteria: demographic information, experimental outcomes assessing semantic processing, study design, and effects of tDCS on semantic processes. From the 27 reviewed studies, 8 found main effects of stimulation. In addition to these 8 studies, 17 studies reported an interaction between stimulus types and stimulation conditions (e.g. incoherent functional, but not instrumental, actions were processed faster when anodal tDCS was applied over the posterior parietal cortex as compared to sham tDCS). Results suggest that regions in the frontal, temporal, and parietal cortices are involved in semantic processing. tDCS can modulate some aspects of semantic processing and provide information on the functional roles of brain regions involved in this cognitive process. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. A semantic web framework to integrate cancer omics data with biological knowledge.

    PubMed

    Holford, Matthew E; McCusker, James P; Cheung, Kei-Hoi; Krauthammer, Michael

    2012-01-25

    The RDF triple provides a simple linguistic means of describing limitless types of information. Triples can be flexibly combined into a unified data source we call a semantic model. Semantic models open new possibilities for the integration of variegated biological data. We use Semantic Web technology to explicate high throughput clinical data in the context of fundamental biological knowledge. We have extended Corvus, a data warehouse which provides a uniform interface to various forms of Omics data, by providing a SPARQL endpoint. With the querying and reasoning tools made possible by the Semantic Web, we were able to explore quantitative semantic models retrieved from Corvus in the light of systematic biological knowledge. For this paper, we merged semantic models containing genomic, transcriptomic and epigenomic data from melanoma samples with two semantic models of functional data - one containing Gene Ontology (GO) data, the other, regulatory networks constructed from transcription factor binding information. These two semantic models were created in an ad hoc manner but support a common interface for integration with the quantitative semantic models. Such combined semantic models allow us to pose significant translational medicine questions. Here, we study the interplay between a cell's molecular state and its response to anti-cancer therapy by exploring the resistance of cancer cells to Decitabine, a demethylating agent. We were able to generate a testable hypothesis to explain how Decitabine fights cancer - namely, that it targets apoptosis-related gene promoters predominantly in Decitabine-sensitive cell lines, thus conveying its cytotoxic effect by activating the apoptosis pathway. Our research provides a framework whereby similar hypotheses can be developed easily.

  6. PASTE: patient-centered SMS text tagging in a medication management system

    PubMed Central

    Johnson, Kevin B; Denny, Joshua C

    2011-01-01

    Objective To evaluate the performance of a system that extracts medication information and administration-related actions from patient short message service (SMS) messages. Design Mobile technologies provide a platform for electronic patient-centered medication management. MyMediHealth (MMH) is a medication management system that includes a medication scheduler, a medication administration record, and a reminder engine that sends text messages to cell phones. The object of this work was to extend MMH to allow two-way interaction using mobile phone-based SMS technology. Unprompted text-message communication with patients using natural language could engage patients in their healthcare, but presents unique natural language processing challenges. The authors developed a new functional component of MMH, the Patient-centered Automated SMS Tagging Engine (PASTE). The PASTE web service uses natural language processing methods, custom lexicons, and existing knowledge sources to extract and tag medication information from patient text messages. Measurements A pilot evaluation of PASTE was completed using 130 medication messages anonymously submitted by 16 volunteers via a website. System output was compared with manually tagged messages. Results Verified medication names, medication terms, and action terms reached high F-measures of 91.3%, 94.7%, and 90.4%, respectively. The overall medication name F-measure was 79.8%, and the medication action term F-measure was 90%. Conclusion Other studies have demonstrated systems that successfully extract medication information from clinical documents using semantic tagging, regular expression-based approaches, or a combination of both approaches. This evaluation demonstrates the feasibility of extracting medication information from patient-generated medication messages. PMID:21984605

  7. Categorizing with gender: does implicit grammatical gender affect semantic processing in 24-month-old toddlers?

    PubMed

    Bobb, Susan C; Mani, Nivedita

    2013-06-01

    The current study investigated the interaction of implicit grammatical gender and semantic category knowledge during object identification. German-learning toddlers (24-month-olds) were presented with picture pairs and heard a noun (without a preceding article) labeling one of the pictures. Labels for target and distracter images either matched or mismatched in grammatical gender and either matched or mismatched in semantic category. When target and distracter overlapped in both semantic and gender information, target recognition was impaired compared with when target and distracter overlapped on only one dimension. Results suggest that by 24 months of age, German-learning toddlers are already forming not only semantic but also grammatical gender categories and that these sources of information are activated, and interact, during object identification. Copyright © 2013 Elsevier Inc. All rights reserved.

  8. Automated software system for checking the structure and format of ACM SIG documents

    NASA Astrophysics Data System (ADS)

    Mirza, Arsalan Rahman; Sah, Melike

    2017-04-01

    Microsoft (MS) Office Word is one of the most commonly used software tools for creating documents. MS Word 2007 and above uses XML to represent the structure of MS Word documents. Metadata about the documents are automatically created using Office Open XML (OOXML) syntax. We develop a new framework, which is called ADFCS (Automated Document Format Checking System) that takes the advantage of the OOXML metadata, in order to extract semantic information from MS Office Word documents. In particular, we develop a new ontology for Association for Computing Machinery (ACM) Special Interested Group (SIG) documents for representing the structure and format of these documents by using OWL (Web Ontology Language). Then, the metadata is extracted automatically in RDF (Resource Description Framework) according to this ontology using the developed software. Finally, we generate extensive rules in order to infer whether the documents are formatted according to ACM SIG standards. This paper, introduces ACM SIG ontology, metadata extraction process, inference engine, ADFCS online user interface, system evaluation and user study evaluations.

  9. Automatic definition of the oncologic EHR data elements from NCIT in OWL.

    PubMed

    Cuggia, Marc; Bourdé, Annabel; Turlin, Bruno; Vincendeau, Sebastien; Bertaud, Valerie; Bohec, Catherine; Duvauferrier, Régis

    2011-01-01

    Semantic interoperability based on ontologies allows systems to combine their information and process them automatically. The ability to extract meaningful fragments from ontology is a key for the ontology re-use and the construction of a subset will help to structure clinical data entries. The aim of this work is to provide a method for extracting a set of concepts for a specific domain, in order to help to define data elements of an oncologic EHR. a generic extraction algorithm was developed to extract, from the NCIT and for a specific disease (i.e. prostate neoplasm), all the concepts of interest into a sub-ontology. We compared all the concepts extracted to the concepts encoded manually contained into the multi-disciplinary meeting report form (MDMRF). We extracted two sub-ontologies: sub-ontology 1 by using a single key concept and sub-ontology 2 by using 5 additional keywords. The coverage of sub-ontology 2 to the MDMRF concepts was 51%. The low rate of coverage is due to the lack of definition or mis-classification of the NCIT concepts. By providing a subset of concepts focused on a particular domain, this extraction method helps at optimizing the binding process of data elements and at maintaining and enriching a domain ontology.

  10. Automatic textual annotation of video news based on semantic visual object extraction

    NASA Astrophysics Data System (ADS)

    Boujemaa, Nozha; Fleuret, Francois; Gouet, Valerie; Sahbi, Hichem

    2003-12-01

    In this paper, we present our work for automatic generation of textual metadata based on visual content analysis of video news. We present two methods for semantic object detection and recognition from a cross modal image-text thesaurus. These thesaurus represent a supervised association between models and semantic labels. This paper is concerned with two semantic objects: faces and Tv logos. In the first part, we present our work for efficient face detection and recogniton with automatic name generation. This method allows us also to suggest the textual annotation of shots close-up estimation. On the other hand, we were interested to automatically detect and recognize different Tv logos present on incoming different news from different Tv Channels. This work was done jointly with the French Tv Channel TF1 within the "MediaWorks" project that consists on an hybrid text-image indexing and retrieval plateform for video news.

  11. Development and psychometric testing of a semantic differential scale of sexual attitude for the older person.

    PubMed

    Park, Hyojung; Shin, Sunhwa

    2015-12-01

    The purpose of this study was to develop and test a semantic differential scale of sexual attitudes for older people in Korea. The scale was based on items derived from a literature review and focus group interviews. A methodological study was used to test the reliability and validity of the instrument. A total of 368 older men and women were recruited to complete the semantic differential scale. Fifteen pairs of adjective ratings were extracted through factor analysis. Total variance explained was 63.40%. To test for construct validity, group comparisons were implemented. The total score of sexual attitudes showed significant differences depending on gender and availability of sexual activity. Cronbach's alpha coefficient for internal consistency was 0.96. The findings of this study demonstrate that the semantic differential scale of sexual attitude is a reliable and valid instrument. © 2015 Wiley Publishing Asia Pty Ltd.

  12. The BioLexicon: a large-scale terminological resource for biomedical text mining

    PubMed Central

    2011-01-01

    Background Due to the rapidly expanding body of biomedical literature, biologists require increasingly sophisticated and efficient systems to help them to search for relevant information. Such systems should account for the multiple written variants used to represent biomedical concepts, and allow the user to search for specific pieces of knowledge (or events) involving these concepts, e.g., protein-protein interactions. Such functionality requires access to detailed information about words used in the biomedical literature. Existing databases and ontologies often have a specific focus and are oriented towards human use. Consequently, biological knowledge is dispersed amongst many resources, which often do not attempt to account for the large and frequently changing set of variants that appear in the literature. Additionally, such resources typically do not provide information about how terms relate to each other in texts to describe events. Results This article provides an overview of the design, construction and evaluation of a large-scale lexical and conceptual resource for the biomedical domain, the BioLexicon. The resource can be exploited by text mining tools at several levels, e.g., part-of-speech tagging, recognition of biomedical entities, and the extraction of events in which they are involved. As such, the BioLexicon must account for real usage of words in biomedical texts. In particular, the BioLexicon gathers together different types of terms from several existing data resources into a single, unified repository, and augments them with new term variants automatically extracted from biomedical literature. Extraction of events is facilitated through the inclusion of biologically pertinent verbs (around which events are typically organized) together with information about typical patterns of grammatical and semantic behaviour, which are acquired from domain-specific texts. In order to foster interoperability, the BioLexicon is modelled using the Lexical Markup Framework, an ISO standard. Conclusions The BioLexicon contains over 2.2 M lexical entries and over 1.8 M terminological variants, as well as over 3.3 M semantic relations, including over 2 M synonymy relations. Its exploitation can benefit both application developers and users. We demonstrate some such benefits by describing integration of the resource into a number of different tools, and evaluating improvements in performance that this can bring. PMID:21992002

  13. The BioLexicon: a large-scale terminological resource for biomedical text mining.

    PubMed

    Thompson, Paul; McNaught, John; Montemagni, Simonetta; Calzolari, Nicoletta; del Gratta, Riccardo; Lee, Vivian; Marchi, Simone; Monachini, Monica; Pezik, Piotr; Quochi, Valeria; Rupp, C J; Sasaki, Yutaka; Venturi, Giulia; Rebholz-Schuhmann, Dietrich; Ananiadou, Sophia

    2011-10-12

    Due to the rapidly expanding body of biomedical literature, biologists require increasingly sophisticated and efficient systems to help them to search for relevant information. Such systems should account for the multiple written variants used to represent biomedical concepts, and allow the user to search for specific pieces of knowledge (or events) involving these concepts, e.g., protein-protein interactions. Such functionality requires access to detailed information about words used in the biomedical literature. Existing databases and ontologies often have a specific focus and are oriented towards human use. Consequently, biological knowledge is dispersed amongst many resources, which often do not attempt to account for the large and frequently changing set of variants that appear in the literature. Additionally, such resources typically do not provide information about how terms relate to each other in texts to describe events. This article provides an overview of the design, construction and evaluation of a large-scale lexical and conceptual resource for the biomedical domain, the BioLexicon. The resource can be exploited by text mining tools at several levels, e.g., part-of-speech tagging, recognition of biomedical entities, and the extraction of events in which they are involved. As such, the BioLexicon must account for real usage of words in biomedical texts. In particular, the BioLexicon gathers together different types of terms from several existing data resources into a single, unified repository, and augments them with new term variants automatically extracted from biomedical literature. Extraction of events is facilitated through the inclusion of biologically pertinent verbs (around which events are typically organized) together with information about typical patterns of grammatical and semantic behaviour, which are acquired from domain-specific texts. In order to foster interoperability, the BioLexicon is modelled using the Lexical Markup Framework, an ISO standard. The BioLexicon contains over 2.2 M lexical entries and over 1.8 M terminological variants, as well as over 3.3 M semantic relations, including over 2 M synonymy relations. Its exploitation can benefit both application developers and users. We demonstrate some such benefits by describing integration of the resource into a number of different tools, and evaluating improvements in performance that this can bring.

  14. Semantic and topological classification of images in magnetically guided capsule endoscopy

    NASA Astrophysics Data System (ADS)

    Mewes, P. W.; Rennert, P.; Juloski, A. L.; Lalande, A.; Angelopoulou, E.; Kuth, R.; Hornegger, J.

    2012-03-01

    Magnetically-guided capsule endoscopy (MGCE) is a nascent technology with the goal to allow the steering of a capsule endoscope inside a water filled stomach through an external magnetic field. We developed a classification cascade for MGCE images with groups images in semantic and topological categories. Results can be used in a post-procedure review or as a starting point for algorithms classifying pathologies. The first semantic classification step discards over-/under-exposed images as well as images with a large amount of debris. The second topological classification step groups images with respect to their position in the upper gastrointestinal tract (mouth, esophagus, stomach, duodenum). In the third stage two parallel classifications steps distinguish topologically different regions inside the stomach (cardia, fundus, pylorus, antrum, peristaltic view). For image classification, global image features and local texture features were applied and their performance was evaluated. We show that the third classification step can be improved by a bubble and debris segmentation because it limits feature extraction to discriminative areas only. We also investigated the impact of segmenting intestinal folds on the identification of different semantic camera positions. The results of classifications with a support-vector-machine show the significance of color histogram features for the classification of corrupted images (97%). Features extracted from intestinal fold segmentation lead only to a minor improvement (3%) in discriminating different camera positions.

  15. End-User Evaluations of Semantic Web Technologies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McCool, Rob; Cowell, Andrew J.; Thurman, David A.

    Stanford University's Knowledge Systems Laboratory (KSL) is working in partnership with Battelle Memorial Institute and IBM Watson Research Center to develop a suite of technologies for information extraction, knowledge representation & reasoning, and human-information interaction, in unison entitled 'Knowledge Associates for Novel Intelligence' (KANI). We have developed an integrated analytic environment composed of a collection of analyst associates, software components that aid the user at different stages of the information analysis process. An important part of our participatory design process has been to ensure our technologies and designs are tightly integrate with the needs and requirements of our end users,more » To this end, we perform a sequence of evaluations towards the end of the development process that ensure the technologies are both functional and usable. This paper reports on that process.« less

  16. Extracting 3d Semantic Information from Video Surveillance System Using Deep Learning

    NASA Astrophysics Data System (ADS)

    Zhang, J. S.; Cao, J.; Mao, B.; Shen, D. Q.

    2018-04-01

    At present, intelligent video analysis technology has been widely used in various fields. Object tracking is one of the important part of intelligent video surveillance, but the traditional target tracking technology based on the pixel coordinate system in images still exists some unavoidable problems. Target tracking based on pixel can't reflect the real position information of targets, and it is difficult to track objects across scenes. Based on the analysis of Zhengyou Zhang's camera calibration method, this paper presents a method of target tracking based on the target's space coordinate system after converting the 2-D coordinate of the target into 3-D coordinate. It can be seen from the experimental results: Our method can restore the real position change information of targets well, and can also accurately get the trajectory of the target in space.

  17. Goal-directed mechanisms that constrain retrieval predict subsequent memory for new "foil" information.

    PubMed

    Vogelsang, David A; Bonnici, Heidi M; Bergström, Zara M; Ranganath, Charan; Simons, Jon S

    2016-08-01

    To remember a previous event, it is often helpful to use goal-directed control processes to constrain what comes to mind during retrieval. Behavioral studies have demonstrated that incidental learning of new "foil" words in a recognition test is superior if the participant is trying to remember studied items that were semantically encoded compared to items that were non-semantically encoded. Here, we applied subsequent memory analysis to fMRI data to understand the neural mechanisms underlying the "foil effect". Participants encoded information during deep semantic and shallow non-semantic tasks and were tested in a subsequent blocked memory task to examine how orienting retrieval towards different types of information influences the incidental encoding of new words presented as foils during the memory test phase. To assess memory for foils, participants performed a further surprise old/new recognition test involving foil words that were encountered during the previous memory test blocks as well as completely new words. Subsequent memory effects, distinguishing successful versus unsuccessful incidental encoding of foils, were observed in regions that included the left inferior frontal gyrus and posterior parietal cortex. The left inferior frontal gyrus exhibited disproportionately larger subsequent memory effects for semantic than non-semantic foils, and significant overlap in activity during semantic, but not non-semantic, initial encoding and foil encoding. The results suggest that orienting retrieval towards different types of foils involves re-implementing the neurocognitive processes that were involved during initial encoding. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

  18. Measuring content overlap during handoff communication using distributional semantics: An exploratory study.

    PubMed

    Abraham, Joanna; Kannampallil, Thomas G; Srinivasan, Vignesh; Galanter, William L; Tagney, Gail; Cohen, Trevor

    2017-01-01

    We develop and evaluate a methodological approach to measure the degree and nature of overlap in handoff communication content within and across clinical professions. This extensible, exploratory approach relies on combining techniques from conversational analysis and distributional semantics. We audio-recorded handoff communication of residents and nurses on the General Medicine floor of a large academic hospital (n=120 resident and n=120 nurse handoffs). We measured semantic similarity, a proxy for content overlap, between resident-resident and nurse-nurse communication using multiple steps: a qualitative conversational content analysis; an automated semantic similarity analysis using Reflective Random Indexing (RRI); and comparing semantic similarity generated by RRI analysis with human ratings of semantic similarity. There was significant association between the semantic similarity as computed by the RRI method and human rating (ρ=0.88). Based on the semantic similarity scores, content overlap was relatively higher for content related to patient active problems, assessment of active problems, patient-identifying information, past medical history, and medications/treatments. In contrast, content overlap was limited on content related to allergies, family-related information, code status, and anticipatory guidance. Our approach using RRI analysis provides new opportunities for characterizing the nature and degree of overlap in handoff communication. Although exploratory, this method provides a basis for identifying content that can be used for determining shared understanding across clinical professions. Additionally, this approach can inform the development of flexibly standardized handoff tools that reflect clinical content that are most appropriate for fostering shared understanding during transitions of care. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. Semantic integration of gene expression analysis tools and data sources using software connectors

    PubMed Central

    2013-01-01

    Background The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data is available, biologists are faced with the task of extracting (new) knowledge associated to the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heteregeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. Results We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data. Conclusions The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus assuring accurate data exchange and information interpretation from exchanged data. PMID:24341380

  20. Semantic integration of gene expression analysis tools and data sources using software connectors.

    PubMed

    Miyazaki, Flávia A; Guardia, Gabriela D A; Vêncio, Ricardo Z N; de Farias, Cléver R G

    2013-10-25

    The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data is available, biologists are faced with the task of extracting (new) knowledge associated to the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data. The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus assuring accurate data exchange and information interpretation from exchanged data.

Top