Sample records for efficient semantic parsing

  1. Applying Semantic-based Probabilistic Context-Free Grammar to Medical Language Processing – A Preliminary Study on Parsing Medication Sentences

    PubMed Central

    Xu, Hua; AbdelRahman, Samir; Lu, Yanxin; Denny, Joshua C.; Doan, Son

    2011-01-01

    Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently result in two or more parse trees. One possible solution, which has not been extensively explored previously, is to augment productions in medical sublanguage grammars with probabilities to resolve the ambiguity. In this study, we associated probabilities with production rules in a semantic-based grammar for medication findings and evaluated its performance in reducing parsing ambiguity. Using the existing data set from the 2009 i2b2 NLP (Natural Language Processing) challenge for medication extraction, we developed a semantic-based CFG (Context-Free Grammar) for parsing medication sentences and manually created a Treebank of 4,564 medication sentences from discharge summaries. Using the Treebank, we derived a semantic-based PCFG (Probabilistic Context-Free Grammar) for parsing medication sentences. Our evaluation using 10-fold cross-validation showed that the PCFG parser dramatically improved parsing performance when compared to the CFG parser. PMID:21856440
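
    As a minimal illustration of the technique described above (the toy grammar below is invented, not the paper's medication grammar), NLTK can attach probabilities to production rules and return the most likely parse with a Viterbi parser:

      import nltk

      # Toy semantic-style PCFG; probabilities for each left-hand side sum to 1.
      grammar = nltk.PCFG.fromstring("""
          MED  -> DRUG DOSE FREQ [0.7]
          MED  -> DRUG DOSE      [0.3]
          DRUG -> 'aspirin'      [1.0]
          DOSE -> '81' 'mg'      [1.0]
          FREQ -> 'daily'        [1.0]
      """)

      parser = nltk.ViterbiParser(grammar)
      for tree in parser.parse(['aspirin', '81', 'mg', 'daily']):
          print(tree, tree.prob())   # highest-probability parse and its probability

    In the paper's setting, the rule probabilities are estimated from the 4,564-sentence Treebank rather than written by hand.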

  2. Integrated Japanese Dependency Analysis Using a Dialog Context

    NASA Astrophysics Data System (ADS)

    Ikegaya, Yuki; Noguchi, Yasuhiro; Kogure, Satoru; Itoh, Toshihiko; Konishi, Tatsuhiro; Kondo, Makoto; Asoh, Hideki; Takagi, Akira; Itoh, Yukihiro

    This paper describes how to perform syntactic parsing and semantic analysis in a dialog system. The paper especially deals with how to disambiguate potentially ambiguous sentences using contextual information. Although syntactic parsing and semantic analysis are often studied independently of each other, correct parsing of a sentence often requires the semantic information of the input and/or the contextual information prior to the input. Accordingly, we merge syntactic parsing with semantic analysis, which enables syntactic parsing to take advantage of the semantic content of an input and its context. One of the biggest problems of semantic analysis is how to interpret dependency structures. We employ a framework for semantic representations that circumvents the problem. Within the framework, the meaning of any predicate is converted into a semantic representation which only permits a single type of predicate: an identifying predicate "aru". The semantic representations are expressed as sets of attribute-value pairs, and those semantic representations are stored in the context information. Our system disambiguates syntactic/semantic ambiguities of inputs by referring to the attribute-value pairs in the context information. We have experimentally confirmed the effectiveness of our approach; specifically, the experiment confirmed high parsing accuracy and the correctness of the generated semantic representations.
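
    A minimal, hypothetical sketch of the disambiguation idea (the data and helper below are invented): the dialog context is a store of attribute-value pairs, and the reading whose asserted pairs are best supported by that store is preferred.

      # Context accumulated from earlier dialog turns, as attribute-value pairs.
      context = [
          {"attribute": "color", "object": "car-1", "value": "red"},
          {"attribute": "owner", "object": "car-1", "value": "Ken"},
      ]

      def support(pair):
          """Count 1 if the context already contains a matching pair."""
          return any(p["attribute"] == pair["attribute"] and p["value"] == pair["value"]
                     for p in context)

      # Two candidate readings of an ambiguous input, as the pairs they assert.
      readings = {
          "A": [{"attribute": "color", "value": "red"}],
          "B": [{"attribute": "size", "value": "red"}],   # implausible reading
      }
      best = max(readings, key=lambda r: sum(support(p) for p in readings[r]))
      print(best)   # "A": the context-supported reading wins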

  3. A Semantic Constraint on Syntactic Parsing.

    ERIC Educational Resources Information Center

    Crain, Stephen; Coker, Pamela L.

    This research examines how semantic information influences syntactic parsing decisions during sentence processing. In the first experiment, subjects were presented lexical strings having syntactically identical surface structures but with two possible underlying structures: "The children taught by the Berlitz method," and "The…

  4. Parsing GML data based on integrative GML syntactic and semantic schemas database

    NASA Astrophysics Data System (ADS)

    Miao, Lizhi; Zhang, Shuliang; Lu, Guonian; Gao, Xiaoli; Jiao, Donglai; Gan, Jiayan

    2007-06-01

    This paper proposes a new method to parse various application schemas of Geography Markup Language (GML) in order to understand the syntax and semantics of their elements and types, so that the same GML instance data can be interpreted uniformly by diverse users. The proposed method generates an Integrative GML Syntactic and Semantic Schemas Database (IGSSSDB) from the GML 3.1 core schemas and the corresponding application schema. GML data are then parsed against the IGSSSDB, which holds the syntactic and semantic information, nesting information, and mapping rules of the GML core and application schemas. Three kinds of relational tables are designed for storing schema information when constructing the IGSSSDB: info tables for the schemas included and the namespaces imported in application schemas, tables for information related to the schemas themselves, and catalog tables for the core schemas. Within these tables, we propose using homologous regular expressions to describe the content models of elements and complex types, which keeps the models complete and readable. On top of the IGSSSDB we design and develop a set of APIs for GML data parsing that can process the syntactic and semantic information of GML data from diverse fields and users. The latter part of the paper presents a test study showing that the proposed method is feasible and appropriate for parsing GML data, and that it lays a good basis for future GML data studies such as storage, indexing, and querying.
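
    As a rough sketch of one piece of such a pipeline (the file name and table layout are hypothetical), lxml can read an XML Schema and the element declarations can be stored in a relational table, in the spirit of the IGSSSDB info tables:

      import sqlite3
      from lxml import etree

      XS = "{http://www.w3.org/2001/XMLSchema}"

      db = sqlite3.connect(":memory:")
      db.execute("CREATE TABLE schema_elements (name TEXT, type TEXT, source TEXT)")

      def load_schema(path):
          """Record every xs:element declaration found in the schema file."""
          tree = etree.parse(path)
          for el in tree.iter(XS + "element"):
              db.execute("INSERT INTO schema_elements VALUES (?, ?, ?)",
                         (el.get("name"), el.get("type"), path))

      load_schema("CityModel.xsd")   # hypothetical GML application schema
      for row in db.execute("SELECT * FROM schema_elements"):
          print(row)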

  5. Learning for Semantic Parsing Using Statistical Syntactic Parsing Techniques

    DTIC Science & Technology

    2010-05-01

    Workshop on Supervisory Control of Learning and Adaptive Systems. San Jose, CA. Roland Kuhn and Renato De Mori (1995). The application of semantic...Processing (EMNLP-09), pp. 1–10. Suntec, Singapore. Ana-Maria Popescu, Alex Armanasu, Oren Etzioni, David Ko and Alexander Yates (2004). Modern natural

  6. owlcpp: a C++ library for working with OWL ontologies.

    PubMed

    Levin, Mikhail K; Cowell, Lindsay G

    2015-01-01

    The increasing use of ontologies highlights the need for a library for working with ontologies that is efficient, accessible from various programming languages, and compatible with common computational platforms. We developed owlcpp, a library for storing and searching RDF triples, parsing RDF/XML documents, converting triples into OWL axioms, and reasoning. The library is written in ISO-compliant C++ to facilitate efficiency, portability, and accessibility from other programming languages. Internally, owlcpp uses the Raptor RDF Syntax library for parsing RDF/XML and the FaCT++ library for reasoning. The current version of owlcpp is supported under Linux, OSX, and Windows platforms and provides an API for Python. The results of our evaluation show that, compared to other commonly used libraries, owlcpp is significantly more efficient in terms of memory usage and searching RDF triple stores. owlcpp performs strict parsing and detects errors ignored by other libraries, thus reducing the possibility of incorrect semantic interpretation of ontologies. owlcpp is available at http://owl-cpp.sf.net/ under the Boost Software License, Version 1.0.
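
    owlcpp's own C++ and Python APIs are not reproduced here; as a minimal illustration of the kind of work the library does (loading an RDF/XML document and searching the resulting triple store), here is the equivalent using Python's rdflib, with the file name assumed:

      from rdflib import Graph, RDF
      from rdflib.namespace import OWL

      g = Graph()
      g.parse("ontology.owl", format="xml")   # parse an RDF/XML document

      # Search the triple store: list every owl:Class declared in the ontology.
      for cls in g.subjects(RDF.type, OWL.Class):
          print(cls)
      print(len(g), "triples loaded")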

  7. Learning for Semantic Parsing with Kernels under Various Forms of Supervision

    DTIC Science & Technology

    2007-08-01

    natural language sentences to their formal executable meaning representations. This is a challenging problem and is critical for developing computing...sentences are semantically tractable. This indicates that Geoquery is a more challenging domain for semantic parsing than ATIS. In the past, there have been a...Combining parsers. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/VLC-99), pp. 187–194

  8. A Semantic Parsing Method for Mapping Clinical Questions to Logical Forms

    PubMed Central

    Roberts, Kirk; Patra, Braja Gopal

    2017-01-01

    This paper presents a method for converting natural language questions about structured data in the electronic health record (EHR) into logical forms. The logical forms can then subsequently be converted to EHR-dependent structured queries. The natural language processing task, known as semantic parsing, has the potential to convert questions to logical forms with extremely high precision, resulting in a system that is usable and trusted by clinicians for real-time use in clinical settings. We propose a hybrid semantic parsing method, combining rule-based methods with a machine learning-based classifier. The overall semantic parsing precision on a set of 212 questions is 95.6%. The parser’s rules furthermore allow it to “know what it does not know”, enabling the system to indicate when unknown terms prevent it from understanding the question’s full logical structure. When combined with a module for converting a logical form into an EHR-dependent query, this high-precision approach allows for a question answering system to provide a user with a single, verifiably correct answer. PMID:29854217
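
    A toy sketch of the hybrid idea (the patterns below are invented, not the paper's rules): pattern rules map question fragments to logical-form predicates, and questions matching no rule are surfaced as out of coverage rather than guessed at, which is how a parser can "know what it does not know".

      import re

      RULES = [
          (re.compile(r"latest (\w+)"),  lambda m: f"latest({m.group(1)})"),
          (re.compile(r"dose of (\w+)"), lambda m: f"dose({m.group(1)})"),
      ]

      def parse_question(question):
          for pattern, build in RULES:
              m = pattern.search(question)
              if m:
                  return build(m)
          return None   # unknown construction: flag it rather than guess

      print(parse_question("what was the latest creatinine"))   # latest(creatinine)
      print(parse_question("show the MRI images"))              # None -> out of coverage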

  9. Learning for Semantic Parsing and Natural Language Generation Using Statistical Machine Translation Techniques

    DTIC Science & Technology

    2007-08-01

    In this domain, queries typically show a deeply nested structure, which makes the semantic parsing task rather challenging, e.g.: What states border...only 80% of the GEOQUERY queries are semantically tractable, which shows that GEOQUERY is indeed a more challenging domain than ATIS. Note that none...a particularly challenging task, because of the inherent ambiguity of natural languages on both sides. It has inspired a large body of research. In

  10. A Semantic Analysis Method for Scientific and Engineering Code

    NASA Technical Reports Server (NTRS)

    Stewart, Mark E. M.

    1998-01-01

    This paper develops a procedure to statically analyze aspects of the meaning or semantics of scientific and engineering code. The analysis involves adding semantic declarations to a user's code and parsing this semantic knowledge with the original code using multiple expert parsers. These semantic parsers are designed to recognize formulae in different disciplines including physical and mathematical formulae and geometrical position in a numerical scheme. In practice, a user would submit code with semantic declarations of primitive variables to the analysis procedure, and its semantic parsers would automatically recognize and document some static, semantic concepts and locate some program semantic errors. A prototype implementation of this analysis procedure is demonstrated. Further, the relationship between the fundamental algebraic manipulations of equations and the parsing of expressions is explained. This ability to locate some semantic errors and document semantic concepts in scientific and engineering code should reduce the time, risk, and effort of developing and using these codes.
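
    A minimal sketch of the semantic-declaration idea (the variables and units are invented): the user declares physical dimensions for primitive variables, and a checker flags dimensionally inconsistent operations as semantic errors.

      # User-supplied semantic declarations for primitive variables.
      DECLARED = {
          "pressure": "Pa",
          "velocity": "m/s",
          "density":  "kg/m^3",
      }

      def check_addition(lhs, rhs):
          """Adding quantities of different dimensions is a semantic error."""
          if DECLARED[lhs] != DECLARED[rhs]:
              raise TypeError(f"semantic error: {lhs} [{DECLARED[lhs]}] "
                              f"+ {rhs} [{DECLARED[rhs]}]")

      check_addition("pressure", "pressure")       # consistent
      try:
          check_addition("pressure", "velocity")   # dimension mismatch
      except TypeError as err:
          print(err)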

  11. FRED: a program development tool

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shilling, J.

    1985-09-01

    The structured, screen-based editor FRED is introduced. FRED provides incremental parsing and semantic analysis. The parsing is based on an LL(1) top-down algorithm which has been modified to provide follow-the-cursor parsing and soft templates. The languages accepted by the editor are LL(1) languages with the addition of the Unknown and preferred production non-terminal classes. The semantic analysis is based on the incremental update of attribute grammar equations. We briefly describe the interface between FRED and an automated reference librarian system that is under development.
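
    FRED's follow-the-cursor parsing and soft templates are not reproduced here, but the LL(1) core, deciding what to parse next from a single token of lookahead, can be sketched for a tiny expression language:

      def parse_expr(tokens):
          """expr -> term ('+' term)* ; each decision uses one lookahead token."""
          value, rest = parse_term(tokens)
          while rest and rest[0] == "+":
              rhs, rest = parse_term(rest[1:])
              value = ("+", value, rhs)
          return value, rest

      def parse_term(tokens):
          if not tokens:
              raise SyntaxError("unexpected end of input")
          if tokens[0].isdigit():
              return ("num", int(tokens[0])), tokens[1:]
          raise SyntaxError(f"unexpected token {tokens[0]!r}")

      tree, _ = parse_expr(["1", "+", "2", "+", "3"])
      print(tree)   # ('+', ('+', ('num', 1), ('num', 2)), ('num', 3))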

  12. Hierarchical parsing and semantic navigation of full body CT data

    NASA Astrophysics Data System (ADS)

    Seifert, Sascha; Barbu, Adrian; Zhou, S. Kevin; Liu, David; Feulner, Johannes; Huber, Martin; Suehling, Michael; Cavallaro, Alexander; Comaniciu, Dorin

    2009-02-01

    Whole body CT scanning is a common diagnosis technique for discovering early signs of metastasis or for differential diagnosis. Automatic parsing and segmentation of multiple organs and semantic navigation inside the body can help the clinician in efficiently obtaining accurate diagnosis. However, dealing with the large amount of data of a full body scan is challenging and techniques are needed for the fast detection and segmentation of organs, e.g., heart, liver, kidneys, bladder, prostate, and spleen, and body landmarks, e.g., bronchial bifurcation, coccyx tip, sternum, lung tips. Solving the problem becomes even more challenging if partial body scans are used, where not all organs are present. We propose a new approach to this problem, in which a network of 1D and 3D landmarks is trained to quickly parse the 3D CT data and estimate which organs and landmarks are present as well as their most probable locations and boundaries. Using this approach, the segmentation of seven organs and detection of 19 body landmarks can be obtained in about 20 seconds with state-of-the-art accuracy and has been validated on 80 CT full or partial body scans.

  13. Flexible Parsing.

    DTIC Science & Technology

    1986-06-30

    Machine Studies... 14. Minton, S. N., Hayes, P. J., and Fain, J. E. Controlling Search in Flexible Parsing. Proc. Ninth Int. Jt. Conf. on Artificial...interaction through the COUSIN command interface", International Journal of Man-Machine Studies, Vol. 19, No. 3, September 1983, pp. 285-305. 8...in a gracefully interacting user interface," "Dynamic strategy selection in flexible parsing," and "Parsing spoken language: a semantic case frame

  14. Parsing English. Course Notes for a Tutorial on Computational Semantics, March 17-22, 1975.

    ERIC Educational Resources Information Center

    Wilks, Yorick

    The course in parsing English is essentially a survey and comparison of several of the principal systems used for understanding natural language. The basic procedure of parsing is described. The discussion of the principal systems is based on the idea that "meaning is procedures," that is, that the procedures of application give a parsed…

  15. Semantics Boosts Syntax in Artificial Grammar Learning Tasks with Recursion

    ERIC Educational Resources Information Center

    Fedor, Anna; Varga, Mate; Szathmary, Eors

    2012-01-01

    Center-embedded recursion (CER) in natural language is exemplified by sentences such as "The malt that the rat ate lay in the house." Parsing center-embedded structures is a focus of attention because this could be one of the cognitive capacities that make humans distinct from all other animals. The ability to parse CER is usually…

  16. A fusion network for semantic segmentation using RGB-D data

    NASA Astrophysics Data System (ADS)

    Yuan, Jiahui; Zhang, Kun; Xia, Yifan; Qi, Lin; Dong, Junyu

    2018-04-01

    Semantic scene parsing is important in many intelligent applications, including perceptual robotics. For the past few years, pixel-wise prediction tasks like semantic segmentation with RGB images have been extensively studied and have reached remarkable parsing levels, thanks to convolutional neural networks (CNNs) and large scene datasets. With the development of stereo cameras and RGB-D sensors, it is expected that additional depth information will help improve accuracy. In this paper, we propose a semantic segmentation framework incorporating RGB and complementary depth information. Motivated by the success of fully convolutional networks (FCNs) in the semantic segmentation field, we design a fully convolutional network consisting of two branches, which extract features from RGB and depth data simultaneously and fuse them as the network goes deeper. Instead of aggregating multiple models, our goal is to utilize RGB data and depth data more effectively in a single model. We evaluate our approach on the NYU-Depth V2 dataset, which consists of 1449 cluttered indoor scenes, and achieve results competitive with the state-of-the-art methods.
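
    A much-reduced sketch of the two-branch idea in PyTorch (layer sizes invented; the paper's architecture is far deeper): separate convolutional streams encode RGB and depth, and their feature maps are fused by element-wise summation before per-pixel classification.

      import torch
      import torch.nn as nn

      class FusionNet(nn.Module):
          def __init__(self, num_classes=40):
              super().__init__()
              self.rgb   = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
              self.depth = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
              self.head  = nn.Conv2d(32, num_classes, 1)   # per-pixel class scores

          def forward(self, rgb, depth):
              fused = self.rgb(rgb) + self.depth(depth)    # element-wise fusion
              return self.head(fused)

      net = FusionNet()
      scores = net(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))
      print(scores.shape)   # torch.Size([1, 40, 64, 64])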

  17. A resource-saving collective approach to biomedical semantic role labeling

    PubMed Central

    2014-01-01

    Background Biomedical semantic role labeling (BioSRL) is a natural language processing technique that identifies the semantic roles of the words or phrases in sentences describing biological processes and expresses them as predicate-argument structures (PAS’s). Currently, a major problem of BioSRL is that most systems label every node in a full parse tree independently; however, some nodes always exhibit dependency. In general SRL, collective approaches based on the Markov logic network (MLN) model have been successful in dealing with this problem. However, in BioSRL such an approach has not been attempted because it would require more training data to recognize the more specialized and diverse terms found in biomedical literature, increasing training time and computational complexity. Results We first constructed a collective BioSRL system based on MLN. This system, called collective BIOSMILE (CBIOSMILE), is trained on the BioProp corpus. To reduce the resources used in BioSRL training, we employ a tree-pruning filter to remove unlikely nodes from the parse tree and four argument candidate identifiers to retain candidate nodes in the tree. Nodes not recognized by any candidate identifier are discarded. The pruned annotated parse trees are used to train a resource-saving MLN-based system, which is referred to as resource-saving collective BIOSMILE (RCBIOSMILE). Our experimental results show that our proposed CBIOSMILE system outperforms BIOSMILE, which is the top BioSRL system. Furthermore, our proposed RCBIOSMILE maintains the same level of accuracy as CBIOSMILE using 92% less memory and 57% less training time. Conclusions This greatly improved efficiency makes RCBIOSMILE potentially suitable for training on much larger BioSRL corpora over more biomedical domains. Compared to real-world biomedical corpora, BioProp is relatively small, containing only 445 MEDLINE abstracts and 30 event triggers. It is not large enough for practical applications, such as pathway construction. We consider it of primary importance to pursue SRL training on large corpora in the future. PMID:24884358

  18. E-Learning for Depth in the Semantic Web

    ERIC Educational Resources Information Center

    Shafrir, Uri; Etkind, Masha

    2006-01-01

    In this paper, we describe concept parsing algorithms, a novel semantic analysis methodology at the core of a new pedagogy that focuses learners' attention on deep comprehension of the conceptual content of learned material. Two new e-learning tools are described in some detail: interactive concept discovery learning and meaning equivalence…

  19. A Development System for Augmented Transition Network Grammars and a Large Grammar for Technical Prose. Technical Report No. 25.

    ERIC Educational Resources Information Center

    Mayer, John; Kieras, David E.

    This report describes a technique for the rapid development of natural language parsers, based on the standard augmented transition network (ATN) parsing approach and called the High-Level Grammar Specification Language (HGSL). The first part of the report describes the syntax and semantics of HGSL and the network implementation of each of its…

  20. ANTLR Tree Grammar Generator and Extensions

    NASA Technical Reports Server (NTRS)

    Craymer, Loring

    2005-01-01

    A computer program implements two extensions of ANTLR (Another Tool for Language Recognition), which is a set of software tools for translating source codes between different computing languages. ANTLR supports predicated-LL(k) lexer and parser grammars, a notation for annotating parser grammars to direct tree construction, and predicated tree grammars. [LL(k) signifies left-right, leftmost derivation with k tokens of look-ahead, referring to certain characteristics of a grammar.] One of the extensions is a syntax for tree transformations. The other extension is the generation of tree grammars from annotated parser or input tree grammars. These extensions can simplify the process of generating source-to-source language translators and they make possible an approach, called "polyphase parsing," to translation between computing languages. The typical approach to translator development is to identify high-level semantic constructs such as "expressions," "declarations," and "definitions" as fundamental building blocks in the grammar specification used for language recognition. The polyphase approach is to lump ambiguous syntactic constructs during parsing and then disambiguate the alternatives in subsequent tree transformation passes. Polyphase parsing is believed to be useful for generating efficient recognizers for C++ and other languages that, like C++, have significant ambiguities.
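
    A toy illustration of the polyphase idea (the example is invented): the first pass lumps an ambiguous construct into a deliberately underspecified tree node, and a later tree pass disambiguates it once symbol information is available, as with the classic C ambiguity between "T * x" as a declaration and as a multiplication.

      def parse_phase(tokens):
          # Lump "a * b" into an AMBIG node rather than deciding now.
          return ("AMBIG", tokens)

      def disambiguate(node, declared_types):
          _kind, (lhs, _star, rhs) = node
          if lhs in declared_types:
              return ("ptr_decl", lhs, rhs)   # "T * x" declares a pointer x
          return ("mul", lhs, rhs)            # otherwise it is arithmetic

      tree = parse_phase(("T", "*", "x"))
      print(disambiguate(tree, declared_types={"T"}))   # ('ptr_decl', 'T', 'x')
      print(disambiguate(parse_phase(("a", "*", "b")), declared_types={"T"}))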

  1. Progress in The Semantic Analysis of Scientific Code

    NASA Technical Reports Server (NTRS)

    Stewart, Mark

    2000-01-01

    This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, independent expert parsers. These semantic parsers encode domain knowledge and recognize formulae in different disciplines including physics, numerical methods, mathematics, and geometry. The parsers will automatically recognize and document some static, semantic concepts and help locate some program semantic errors. These techniques may apply to a wider range of scientific codes. If so, the techniques could reduce the time, risk, and effort required to develop and modify scientific codes.

  2. Robust Deep Semantics for Language Understanding

    DTIC Science & Technology

    focus on five areas: deep learning, textual inferential relations, relation and event extraction by distant supervision, semantic parsing and...ontology expansion, and coreference resolution. As time went by, the program focus converged towards emphasizing technologies for knowledge base...natural logic methods for text understanding, improved mention coreference algorithms, and the further development of multilingual tools in CoreNLP.

  3. User Evaluation of Automatically Generated Semantic Hypertext Links in a Heavily Used Procedural Manual.

    ERIC Educational Resources Information Center

    Tebbutt, John

    1999-01-01

    Discusses efforts at National Institute of Standards and Technology (NIST) to construct an information discovery tool through the fusion of hypertext and information retrieval that works by parsing a contiguous document base into smaller documents and inserting semantic links between them. Also presents a case study that evaluated user reactions.…

  4. An Experiment in Scientific Code Semantic Analysis

    NASA Technical Reports Server (NTRS)

    Stewart, Mark E. M.

    1998-01-01

    This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, distributed expert parsers. These semantic parsers are designed to recognize formulae in different disciplines including physical and mathematical formulae and geometrical position in a numerical scheme. The parsers will automatically recognize and document some static, semantic concepts and locate some program semantic errors. Results are shown for a subroutine test case and a collection of combustion code routines. This ability to locate some semantic errors and document semantic concepts in scientific and engineering code should reduce the time, risk, and effort of developing and using these codes.

  5. Integrating Syntax, Semantics, and Discourse DARPA Natural Language Understanding Program. Volume 1. Technical Report

    DTIC Science & Technology

    1989-09-30

    9.1 Overview of SPQR...domain. The ISR is the input to the selection component SPQR, whose function is to block semantically anomalous parses before they are sent to the...frequently occurring pairs of words, which is useful for identifying fixed multi-word expressions. 9. SELECTION: The SPQR module (Selectional Pattern

  6. Research in Knowledge Representation for Natural Language Understanding

    DTIC Science & Technology

    1980-11-01

    artificial intelligence, natural language understanding, parsing, syntax, semantics, speaker meaning, knowledge representation, semantic networks...Report No. 4513: Research in Knowledge Representation for Natural Language Understanding, Annual Report, 1 September 1979 to 31...understanding, knowledge representation, and knowledge-based inference. The work that we have been doing falls into three classes, successively motivated by

  7. Integrating Syntax, Semantics, and Discourse DARPA (Defense Advanced Research Projects Agency) Natural Language Understanding Program

    DTIC Science & Technology

    1988-08-01

    heavily on the original SPQR component, and uses the same context free grammar to analyze the ISR. The main difference is that, where before SPQR...ISR is semantically coherent. This has been tested thoroughly on the CASREPS domain, and selects the same parses that SPQR did, in less time. There...were a few SPQR patterns that reflected semantic information that could only be provided by time analysis, such as the fact that [pressure during

  8. Integrating Syntax, Semantics, and Discourse DARPA Natural Language Understanding Program. Volume 3. Papers

    DTIC Science & Technology

    1989-09-30

    parses, in a second experiment. This procedure used PUNDIT’s Selection Pattern Query and Response (SPQR) component [Lang1988]. We first used SPQR in...messages from each domain, and the resulting output is...pattern. SPQR continues the analysis of the ISR, and the parsing of the sentence is allowed to...UNISYS, P.O. Box 517, Paoli, PA 19301. ABSTRACT: This paper presents SPQR (Selectional Pattern Queries...); one obvious benefit of acquiring domain knowledge...

  9. HDM/PASCAL Verification System User's Manual

    NASA Technical Reports Server (NTRS)

    Hare, D.

    1983-01-01

    The HDM/Pascal verification system is a tool for proving the correctness of programs written in PASCAL and specified in the Hierarchical Development Methodology (HDM). This document assumes an understanding of PASCAL, HDM, program verification, and the STP system. The steps toward verification which this tool provides are parsing programs and specifications, checking the static semantics, and generating verification conditions. Some support functions are provided such as maintaining a data base, status management, and editing. The system runs under the TOPS-20 and TENEX operating systems and is written in INTERLISP. However, no knowledge is assumed of these operating systems or of INTERLISP. The system requires three executable files, HDMVCG, PARSE, and STP. Optionally, the editor EMACS should be on the system in order for the editor to work. The file HDMVCG is invoked to run the system. The files PARSE and STP are used as lower forks to perform the functions of parsing and proving.

  10. Deriving a probabilistic syntacto-semantic grammar for biomedicine based on domain-specific terminologies

    PubMed Central

    Fan, Jung-Wei; Friedman, Carol

    2011-01-01

    Biomedical natural language processing (BioNLP) is a useful technique that unlocks valuable information stored in textual data for practice and/or research. Syntactic parsing is a critical component of BioNLP applications that rely on correctly determining the sentence and phrase structure of free text. In addition to dealing with the vast amount of domain-specific terms, a robust biomedical parser needs to model the semantic grammar to obtain viable syntactic structures. With either a rule-based or corpus-based approach, the grammar engineering process requires substantial time and knowledge from experts, and does not always yield a semantically transferable grammar. To reduce the human effort and to promote semantic transferability, we propose an automated method for deriving a probabilistic grammar based on a training corpus consisting of concept strings and semantic classes from the Unified Medical Language System (UMLS), a comprehensive terminology resource widely used by the community. The grammar is designed to specify noun phrases only due to the nominal nature of the majority of biomedical terminological concepts. Evaluated on manually parsed clinical notes, the derived grammar achieved a recall of 0.644, precision of 0.737, and average cross-bracketing of 0.61, which demonstrated better performance than a control grammar with the semantic information removed. Error analysis revealed shortcomings that could be addressed to improve performance. The results indicated the feasibility of an approach which automatically incorporates terminology semantics in the building of an operational grammar. Although the current performance of the unsupervised solution does not adequately replace manual engineering, we believe that once the performance issues are addressed, it could serve as an aid in a semi-supervised solution. PMID:21549857

  11. Deriving pathway maps from automated text analysis using a grammar-based approach.

    PubMed

    Olsson, Björn; Gawronska, Barbara; Erlendsson, Björn

    2006-04-01

    We demonstrate how automated text analysis can be used to support the large-scale analysis of metabolic and regulatory pathways by deriving pathway maps from textual descriptions found in the scientific literature. The main assumption is that correct syntactic analysis combined with domain-specific heuristics provides a good basis for relation extraction. Our method uses an algorithm that searches through the syntactic trees produced by a parser based on a Referent Grammar formalism, identifies relations mentioned in the sentence, and classifies them with respect to their semantic class and epistemic status (facts, counterfactuals, hypotheses). The semantic categories used in the classification are based on the relation set used in KEGG (Kyoto Encyclopedia of Genes and Genomes), so that pathway maps using KEGG notation can be automatically generated. We present the current version of the relation extraction algorithm and an evaluation based on a corpus of abstracts obtained from PubMed. The results indicate that the method is able to combine a reasonable coverage with high accuracy. We found that 61% of all sentences were parsed, and 97% of the parse trees were judged to be correct. The extraction algorithm was tested on a sample of 300 parse trees and was found to produce correct extractions in 90.5% of the cases.
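
    A toy fragment of the grammar-based approach (the verb table is invented, and the real system uses a Referent Grammar parser rather than a hand-written tree): walk a syntactic tree, pull out a subject-verb-object triple, and map the verb onto a KEGG-style relation class.

      from nltk import Tree

      VERB_CLASS = {"activates": "activation", "inhibits": "inhibition"}

      def extract_relation(sent_tree):
          subj = " ".join(sent_tree[0].leaves())     # subject NP
          verb = sent_tree[1][0].leaves()[0]         # head verb of the VP
          obj  = " ".join(sent_tree[1][1].leaves())  # object NP
          return (subj, VERB_CLASS.get(verb, "unknown"), obj)

      t = Tree.fromstring("(S (NP (NN RasGRP)) (VP (VBZ activates) (NP (NN Ras))))")
      print(extract_relation(t))   # ('RasGRP', 'activation', 'Ras')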

  12. A Tandem Semantic Interpreter for Incremental Parse Selection

    DTIC Science & Technology

    1990-09-28

    syntactic role to semantic role. An example from Fillmore [10] is the sentence I copied that letter, which can be uttered when pointing either to...person. We want the word fiddle to have the sort predicate violin as its lexical interpretation, however, not thing. Thus, for Ross went for his fiddle...to receive an interpretation, a sort hierarchy is needed to establish that all violins are things. A well-structured sort hierarchy allows newly added

  13. An Experiment in Scientific Program Understanding

    NASA Technical Reports Server (NTRS)

    Stewart, Mark E. M.; Owen, Karl (Technical Monitor)

    2000-01-01

    This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, independent expert parsers. These semantic parsers encode domain knowledge and recognize formulae in different disciplines including physics, numerical methods, mathematics, and geometry. The parsers will automatically recognize and document some static, semantic concepts and help locate some program semantic errors. Results are shown for three intensively studied codes and seven blind test cases; all test cases are state of the art scientific codes. These techniques may apply to a wider range of scientific codes. If so, the techniques could reduce the time, risk, and effort required to develop and modify scientific codes.

  14. Syntactic and semantic processing of Chinese middle sentences: evidence from event-related potentials.

    PubMed

    Zeng, Tao; Mao, Wen; Lu, Qing

    2016-05-25

    Scalp-recorded event-related potentials are known to be sensitive to particular aspects of sentence processing. The N400 component is widely recognized as an effect closely related to lexical-semantic processing. The absence of an N400 effect in participants performing tasks in Indo-European languages has been considered evidence that failed syntactic category processing appears to block lexical-semantic integration and that syntactic structure building is a prerequisite of semantic analysis. An event-related potential experiment was designed to investigate whether such syntactic primacy can be considered to apply equally to Chinese sentence processing. Besides correct middles, sentences with either single semantic or single syntactic violation as well as double syntactic and semantic anomaly were used in the present research. Results showed that both purely semantic and combined violation induced a broad negativity in the time window 300-500 ms, indicating the independence of lexical-semantic integration. These findings provided solid evidence that lexical-semantic parsing plays a crucial role in Chinese sentence comprehension.

  15. A hierarchical methodology for urban facade parsing from TLS point clouds

    NASA Astrophysics Data System (ADS)

    Li, Zhuqiang; Zhang, Liqiang; Mathiopoulos, P. Takis; Liu, Fangyu; Zhang, Liang; Li, Shuaipeng; Liu, Hao

    2017-01-01

    The effective and automated parsing of building facades from terrestrial laser scanning (TLS) point clouds of urban environments is an important research topic in the GIS and remote sensing fields. It is also challenging because of the complexity and great variety of the available 3D building facade layouts as well as the noise and missing data in the input TLS point clouds. In this paper, we introduce a novel methodology for the accurate and computationally efficient parsing of urban building facades from TLS point clouds. The main novelty of the proposed methodology is that it is a systematic and hierarchical approach that considers, in an adaptive way, the semantic and underlying structures of the urban facades for segmentation and subsequent accurate modeling. Firstly, the available input point cloud is decomposed into depth planes based on a data-driven method; such layer decomposition enables similarity detection in each depth plane layer. Secondly, the labeling of the facade elements is performed using the SVM classifier in combination with our proposed BieS-ScSPM algorithm. The labeling outcome is then augmented with weak architectural knowledge. Thirdly, least-squares-fitted normalized gray accumulative curves are applied to detect regular structures, and a binarization dilation extraction algorithm is used to partition facade elements. A dynamic line-by-line division is further applied to extract the boundaries of the elements. The 3D geometrical facade models are then reconstructed by optimizing facade elements across depth plane layers. We have evaluated the performance of the proposed method using several TLS facade datasets. Qualitative and quantitative performance comparisons with several other state-of-the-art methods dealing with the same facade parsing problem have demonstrated its superiority in performance and its effectiveness in improving segmentation accuracy.

  16. Semantic Support and Parallel Parsing in Chinese

    ERIC Educational Resources Information Center

    Hsieh, Yufen; Boland, Julie E.

    2015-01-01

    Two eye-tracking experiments were conducted using written Chinese sentences that contained a multi-word ambiguous region. The goal was to determine whether readers maintained multiple interpretations throughout the ambiguous region or selected a single interpretation at the point of ambiguity. Within the ambiguous region, we manipulated the…

  17. Null Element Restoration

    ERIC Educational Resources Information Center

    Gabbard, Ryan

    2010-01-01

    Understanding the syntactic structure of a sentence is a necessary preliminary to understanding its semantics and therefore for many practical applications. The field of natural language processing has achieved a high degree of accuracy in parsing, at least in English. However, the syntactic structures produced by the most commonly used parsers…

  18. Revise and resubmit: How real-time parsing limitations influence grammar acquisition

    PubMed Central

    Pozzan, Lucia; Trueswell, John C.

    2015-01-01

    We present the results from a three-day artificial language learning study on adults. The study examined whether sentence-parsing limitations, in particular, difficulties revising initial syntactic/semantic commitments during comprehension, shape learners’ ability to acquire a language. Findings show that both comprehension and production of morphology pertaining to sentence argument structure are delayed when this morphology consistently appears at the end, rather than at the beginning, of sentences in otherwise identical grammatical systems. This suggests that real-time processing constraints impact acquisition; morphological cues that tend to guide linguistic analyses are easier to learn than cues that revise these analyses. Parallel performance in production and comprehension indicates that parsing constraints affect grammatical acquisition, not just real-time commitments. Properties of the linguistic system (e.g., ordering of cues within a sentence) interact with the properties of the cognitive system (cognitive control and conflict-resolution abilities) and together affect language acquisition. PMID:26026607

  19. Interference Effects from Grammatically Unavailable Constituents during Sentence Processing

    ERIC Educational Resources Information Center

    Van Dyke, Julie A.

    2007-01-01

    Evidence from 3 experiments reveals interference effects from structural relationships that are inconsistent with any grammatical parse of the perceived input. Processing disruption was observed when items occurring between a head and a dependent overlapped with either (or both) syntactic or semantic features of the dependent. Effects of syntactic…

  20. Modelling Parsing Constraints with High-Dimensional Context Space.

    ERIC Educational Resources Information Center

    Burgess, Curt; Lund, Kevin

    1997-01-01

    Presents a model of high-dimensional context space, the Hyperspace Analogue to Language (HAL), with a series of simulations modelling human empirical results. Proposes that HAL's context space can be used to provide a basic categorization of semantic and grammatical concepts; model certain aspects of morphological ambiguity in verbs; and provide…

  1. Semantically-based priors and nuanced knowledge core for Big Data, Social AI, and language understanding.

    PubMed

    Olsher, Daniel

    2014-10-01

    Noise-resistant and nuanced, COGBASE makes 10 million pieces of commonsense data and a host of novel reasoning algorithms available via a family of semantically-driven prior probability distributions. Machine learning, Big Data, natural language understanding/processing, and social AI can draw on COGBASE to determine lexical semantics, infer goals and interests, simulate emotion and affect, calculate document gists and topic models, and link commonsense knowledge to domain models and social, spatial, cultural, and psychological data. COGBASE is especially well suited to social Big Data, which tends to involve highly implicit contexts, cognitive artifacts, difficult-to-parse texts, and deep domain knowledge dependencies. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Guidance of visual attention by semantic information in real-world scenes

    PubMed Central

    Wu, Chia-Chien; Wick, Farahnaz Ahmed; Pomplun, Marc

    2014-01-01

    Recent research on attentional guidance in real-world scenes has focused on object recognition within the context of a scene. This approach has been valuable for determining some factors that drive the allocation of visual attention and determine visual selection. This article provides a review of experimental work on how different components of context, especially semantic information, affect attentional deployment. We review work from the areas of object recognition, scene perception, and visual search, highlighting recent studies examining semantic structure in real-world scenes. A better understanding on how humans parse scene representations will not only improve current models of visual attention but also advance next-generation computer vision systems and human-computer interfaces. PMID:24567724

  3. Neural Basis of Semantic and Syntactic Interference in Sentence Comprehension

    PubMed Central

    Glaser, Yi G.; Martin, Randi C.; Van Dyke, Julie A.; Hamilton, A. Cris; Tan, Yingying

    2013-01-01

    According to the cue-based parsing approach (Lewis, Vasishth, & Van Dyke, 2006), sentence comprehension difficulty derives from interference from material that partially matches syntactic and semantic retrieval cues. In a 2 (low vs. high semantic interference) × 2 (low vs. high syntactic interference) fMRI study, greater activation was observed in left BA 44/45 for high versus low syntactic interference conditions following sentences and in BA 45/47 for high versus low semantic interference following comprehension questions. A conjunction analysis showed BA45 associated with both types of interference, while BA47 was associated with only semantic interference. Greater activation was also observed in the left STG in the high interference conditions. Importantly, the results for the LIFG could not be attributed to greater working memory capacity demands for high interference conditions. The results favor a fractionation of LIFG wherein BA45 is associated with post-retrieval selection and BA47 with controlled retrieval of semantic information. PMID:23933471

  4. What is it that lingers? Garden-path (mis)interpretations in younger and older adults.

    PubMed

    Malyutina, Svetlana; den Ouden, Dirk-Bart

    2016-01-01

    Previous research has shown that comprehenders do not always conduct a full (re)analysis of temporarily ambiguous "garden-path" sentences. The present study used a sentence-picture matching task to investigate what kind of representations are formed when full reanalysis is not performed: Do comprehenders "blend" two incompatible representations as a result of shallow syntactic processing or do they erroneously maintain the initial incorrect parsing without incorporating new information, and does this vary with age? Twenty-five younger and 15 older adults performed a multiple-choice sentence-picture matching task with stimuli including early-closure garden-path sentences. The results suggest that the type of erroneous representation is affected by linguistic variables, such as sentence structure, verb type, and semantic plausibility, as well as by age. Older adults' response patterns indicate an increased reliance on inferencing based on lexical and semantic cues, with a lower bar for accepting an initial parse and with a weaker drive to reanalyse a syntactic representation. Among younger adults, there was a tendency to blend two representations into a single interpretation, even if this was not licensed by the syntax.

  5. Representing sentence information

    NASA Astrophysics Data System (ADS)

    Perkins, Walton A., III

    1991-03-01

    This paper describes a computer-oriented representation for sentence information. Whereas many Artificial Intelligence (AI) natural language systems start with a syntactic parse of a sentence into the linguist's components: noun, verb, adjective, preposition, etc., we argue that it is better to parse the input sentence into 'meaning' components: attribute, attribute value, object class, object instance, and relation. AI systems need a representation that will allow rapid storage and retrieval of information and convenient reasoning with that information. The attribute-of-object representation has proven useful for handling information in relational databases (which are well known for their efficiency in storage and retrieval) and for reasoning in knowledge-based systems. On the other hand, the linguist's syntactic representation of the words in sentences has not been shown to be useful for information handling and reasoning. We think it is an unnecessary and misleading intermediate form. Our sentence representation is semantic-based, in terms of attribute, attribute value, object class, object instance, and relation. Every sentence is segmented into one or more components with the form: 'attribute' of 'object' 'relation' 'attribute value'. Using only one format for all information gives the system simplicity and good performance, as a RISC architecture does for hardware. The attribute-of-object representation is not new; it is used extensively in relational databases and knowledge-based systems. However, we will show that it can be used as a meaning representation for natural language sentences with minor extensions. In this paper we describe how a computer system can parse English sentences into this representation and generate English sentences from this representation. Much of this has been tested with a computer implementation.
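
    A minimal rendering of the proposed uniform format (the field names are mine): every sentence component is an "attribute of object, relation, value" record, so "The car is red" becomes color of car-1 = red, and retrieval is a simple match over the records.

      from dataclasses import dataclass

      @dataclass
      class Component:
          attribute: str   # e.g. "color"
          obj: str         # object class or instance, e.g. "car-1"
          relation: str    # e.g. "=", ">", "member-of"
          value: str       # attribute value, e.g. "red"

      facts = [Component("color", "car-1", "=", "red")]

      def query(attribute, obj):
          """Retrieve attribute values by matching the uniform records."""
          return [c.value for c in facts if c.attribute == attribute and c.obj == obj]

      print(query("color", "car-1"))   # ['red']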

  6. Parsing Citations in Biomedical Articles Using Conditional Random Fields

    PubMed Central

    Zhang, Qing; Cao, Yong-Gang; Yu, Hong

    2011-01-01

    Citations are used ubiquitously in biomedical full-text articles and play an important role in representing both the rhetorical structure and the semantic content of the articles. As a result, text mining systems will significantly benefit from a tool that automatically extracts the content of a citation. In this study, we applied the supervised machine-learning algorithm Conditional Random Fields (CRFs) to automatically parse a citation into its fields (e.g., Author, Title, Journal, and Year). With a subset of HTML-format open-access PubMed Central articles, we report an overall 97.95% F1-score. The citation parser can be accessed at: http://www.cs.uwm.edu/~qing/projects/cithit/index.html. PMID:21419403
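
    A skeletal sketch of that setup (features and labels simplified; real training uses thousands of annotated citations): each citation token gets a feature dictionary, and a linear-chain CRF labels it with its field. This assumes the sklearn-crfsuite package.

      import sklearn_crfsuite

      def token_features(tokens, i):
          return {
              "lower": tokens[i].lower(),
              "is_digit": tokens[i].isdigit(),
              "is_title": tokens[i].istitle(),
              "position": i,
          }

      tokens = ["Smith", "J", "Parsing", "citations", "2011"]
      labels = ["AUTHOR", "AUTHOR", "TITLE", "TITLE", "YEAR"]
      X = [[token_features(tokens, i) for i in range(len(tokens))]]
      y = [labels]

      crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
      crf.fit(X, y)               # toy corpus of one citation
      print(crf.predict(X)[0])    # predicted field label per token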

  7. Multiscale characterization and analysis of shapes

    DOEpatents

    Prasad, Lakshman; Rao, Ramana

    2002-01-01

    An adaptive multiscale method approximates shapes with continuous or uniformly and densely sampled contours, with the purpose of sparsely and nonuniformly discretizing the boundaries of shapes at any prescribed resolution, while at the same time retaining the salient shape features at that resolution. In another aspect, a fundamental geometric filtering scheme using the Constrained Delaunay Triangulation (CDT) of polygonized shapes creates an efficient parsing of shapes into components that have semantic significance dependent only on the shapes' structure and not on their representations per se. A shape skeletonization process generalizes to sparsely discretized shapes, with the additional benefit of prunability to filter out irrelevant and morphologically insignificant features. The skeletal representation of characters of varying thickness and the elimination of insignificant and noisy spurs and branches from the skeleton greatly increases the robustness, reliability and recognition rates of character recognition algorithms.

  8. Intelligent Semantic Query of Notices to Airmen (NOTAMs)

    DTIC Science & Technology

    2006-07-01

    definition of the airspace is constantly changing, new vocabulary is added and old words retired on a monthly basis, and the information specifying this is...NOTAMs are notices containing information on the conditions, or changes to, aeronautical facilities, services, procedures, or hazards, which are...develop a new parsing system, employing and extending ideas developed by the information-extraction community, rather than on classical computational

  9. Natural Language Semantics using Probabilistic Logic

    DTIC Science & Technology

    2014-10-01

    standard scores. Here are some examples: • “A man is playing a guitar.” / “A woman is playing the guitar.”, score: 2.75 • “A woman is cutting broccoli.” / “A woman is slicing broccoli.”, score: 5.00 • “A car is parking.” / “A cat is playing.”, score: 0.00 [Figure: Sent1 and Sent2 are parsed by Boxer into logical forms LF1 and LF2, which are combined with a KB to produce the result.]

  10. Harmony Search Algorithm for Word Sense Disambiguation.

    PubMed

    Abed, Saad Adnan; Tiun, Sabrina; Omar, Nazlia

    2015-01-01

    Word Sense Disambiguation (WSD) is the task of determining which sense of an ambiguous word (a word with multiple meanings) is intended in a particular use of that word, by considering its context. A sentence is considered ambiguous if it contains ambiguous word(s). Practically, any sentence that has been classified as ambiguous usually has multiple interpretations, but just one of them presents the correct interpretation. We propose an unsupervised method that exploits knowledge-based approaches for word sense disambiguation using the Harmony Search Algorithm (HSA) based on a Stanford dependencies generator (HSDG). The role of the dependency generator is to parse sentences to obtain their dependency relations, whereas the goal of using the HSA is to maximize the overall semantic similarity of the set of parsed words. HSA invokes a combination of semantic similarity and relatedness measurements, i.e., Jiang and Conrath (jcn) and an adapted Lesk algorithm, to perform the HSA fitness function. Our proposed method was evaluated on benchmark datasets, yielding results comparable to the state-of-the-art WSD methods. In order to evaluate the effectiveness of the dependency generator, we applied the same methodology without the parser, using a window of words instead. The empirical results demonstrate that the proposed method is able to produce effective solutions for most instances of the datasets used.

  11. Harmony Search Algorithm for Word Sense Disambiguation

    PubMed Central

    Abed, Saad Adnan; Tiun, Sabrina; Omar, Nazlia

    2015-01-01

    Word Sense Disambiguation (WSD) is the task of determining which sense of an ambiguous word (a word with multiple meanings) is intended in a particular use of that word, by considering its context. A sentence is considered ambiguous if it contains ambiguous word(s). Practically, any sentence that has been classified as ambiguous usually has multiple interpretations, but just one of them presents the correct interpretation. We propose an unsupervised method that exploits knowledge-based approaches for word sense disambiguation using the Harmony Search Algorithm (HSA) based on a Stanford dependencies generator (HSDG). The role of the dependency generator is to parse sentences to obtain their dependency relations, whereas the goal of using the HSA is to maximize the overall semantic similarity of the set of parsed words. HSA invokes a combination of semantic similarity and relatedness measurements, i.e., Jiang and Conrath (jcn) and an adapted Lesk algorithm, to perform the HSA fitness function. Our proposed method was evaluated on benchmark datasets, yielding results comparable to the state-of-the-art WSD methods. In order to evaluate the effectiveness of the dependency generator, we applied the same methodology without the parser, using a window of words instead. The empirical results demonstrate that the proposed method is able to produce effective solutions for most instances of the datasets used. PMID:26422368
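
    A highly simplified sketch of the search loop (the sense inventory and similarity table below are invented; the paper scores candidates with jcn and an adapted Lesk measure over the parsed words): harmony memory holds candidate sense assignments, and new harmonies are drawn from memory or at random, replacing the worst member when they score better.

      import random

      SENSES = {"bank": ["bank.n.01", "bank.n.09"], "deposit": ["deposit.v.02"]}
      SIM = {("bank.n.01", "deposit.v.02"): 0.9, ("bank.n.09", "deposit.v.02"): 0.2}

      def fitness(assignment):
          """Sum pairwise similarity over the chosen senses."""
          vals = list(assignment.values())
          return sum(SIM.get((a, b), SIM.get((b, a), 0.0))
                     for i, a in enumerate(vals) for b in vals[i + 1:])

      def random_harmony():
          return {w: random.choice(s) for w, s in SENSES.items()}

      memory = [random_harmony() for _ in range(5)]   # harmony memory
      for _ in range(100):
          new = {w: (random.choice(memory)[w] if random.random() < 0.9
                     else random.choice(SENSES[w])) for w in SENSES}
          worst = min(memory, key=fitness)
          if fitness(new) > fitness(worst):
              memory[memory.index(worst)] = new

      print(max(memory, key=fitness))   # best sense assignment found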

  12. Auditory conflict and congruence in frontotemporal dementia.

    PubMed

    Clark, Camilla N; Nicholas, Jennifer M; Agustus, Jennifer L; Hardy, Christopher J D; Russell, Lucy L; Brotherhood, Emilie V; Dick, Katrina M; Marshall, Charles R; Mummery, Catherine J; Rohrer, Jonathan D; Warren, Jason D

    2017-09-01

    Impaired analysis of signal conflict and congruence may contribute to diverse socio-emotional symptoms in frontotemporal dementias; however, the underlying mechanisms have not been defined. Here we addressed this issue in patients with behavioural variant frontotemporal dementia (bvFTD; n = 19) and semantic dementia (SD; n = 10) relative to healthy older individuals (n = 20). We created auditory scenes in which the semantic and emotional congruity of constituent sounds were independently probed; associated tasks controlled for auditory perceptual similarity, scene parsing and semantic competence. Neuroanatomical correlates of auditory congruity processing were assessed using voxel-based morphometry. Relative to healthy controls, both the bvFTD and SD groups had impaired semantic and emotional congruity processing (after taking auditory control task performance into account) and reduced affective integration of sounds into scenes. Grey matter correlates of auditory semantic congruity processing were identified in distributed regions encompassing prefrontal, parieto-temporal and insular areas and correlates of auditory emotional congruity in partly overlapping temporal, insular and striatal regions. Our findings suggest that decoding of auditory signal relatedness may probe a generic cognitive mechanism and neural architecture underpinning frontotemporal dementia syndromes. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.

  13. An expert system for natural language processing

    NASA Technical Reports Server (NTRS)

    Hennessy, John F.

    1988-01-01

    A solution to the natural language processing problem is proposed that uses a rule-based system, written in OPS5, to replace the traditional parsing method. The advantages of using a rule-based system are explored. Specifically, the extensibility of a rule-based solution is discussed, as well as the value of maintaining rules that function independently. Finally, the power of using semantics to supplement the syntactic analysis of a sentence is considered.

  14. Functional segregation of the inferior frontal gyrus for syntactic processes: a functional magnetic-resonance imaging study.

    PubMed

    Uchiyama, Yuji; Toyoda, Hiroshi; Honda, Manabu; Yoshida, Haruyo; Kochiyama, Takanori; Ebe, Kazutoshi; Sadato, Norihiro

    2008-07-01

    We used functional magnetic resonance imaging in 18 normal volunteers to determine whether there is separate representation of syntactic, semantic, and verbal working memory processing in the left inferior frontal gyrus (GFi). We compared a sentence comprehension task with a short-term memory maintenance task to identify syntactic and semantic processing regions. To investigate the effects of syntactic and verbal working memory load while minimizing the differences in semantic processes, we used comprehension tasks with garden-path (GP) sentences, which require re-parsing, and non-garden-path (NGP) sentences. Compared with the short-term memory task, sentence comprehension activated the left GFi, including Brodmann areas (BAs) 44, 45, and 47, and the left superior temporal gyrus. In GP versus NGP sentences, there was greater activity in the left BAs 44, 45, and 46 extending to the left anterior insula, the pre-supplementary motor area, and the right cerebellum. In the left GFi, verbal working memory activity was located more dorsally (BA 44/45), semantic processing was located more ventrally (BA 47), and syntactic processing was located in between (BA 45). These findings indicate a close relationship between semantic and syntactic processes, and suggest that BA 45 might link verbal working memory and semantic processing via syntactic unification processes.

  15. Convergence of Health Level Seven Version 2 Messages to Semantic Web Technologies for Software-Intensive Systems in Telemedicine Trauma Care.

    PubMed

    Menezes, Pedro Monteiro; Cook, Timothy Wayne; Cavalini, Luciana Tricai

    2016-01-01

    To present the technical background and the development of a procedure that enriches the semantics of Health Level Seven version 2 (HL7v2) messages for software-intensive systems in telemedicine trauma care. This study followed a multilevel model-driven approach for the development of semantically interoperable health information systems. The Pre-Hospital Trauma Life Support (PHTLS) ABCDE protocol was adopted as the use case. A prototype application embedded the semantics into an HL7v2 message as an eXtensible Markup Language (XML) file, which was validated against an XML schema that defines constraints on a common reference model. This message was exchanged with a second prototype application, developed on the Mirth middleware, which was also used to parse and validate both the original and the hybrid messages. Both versions of the data instance (one pure XML, one embedded in the HL7v2 message) were equally validated and the RDF-based semantics recovered by the receiving side of the prototype from the shared XML schema. This study demonstrated the semantic enrichment of HL7v2 messages for software-intensive telemedicine systems for trauma care, by validating components of extracts generated in various computing environments. The adoption of the method proposed in this study ensures the compliance of the HL7v2 standard in Semantic Web technologies.
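
    A minimal sketch of the validation step described (file names assumed): the XML extract carried in the HL7v2 message is checked against the XML Schema derived from the common reference model, here using Python's lxml.

      from lxml import etree

      schema = etree.XMLSchema(etree.parse("reference_model.xsd"))
      doc = etree.parse("phtls_abcde_extract.xml")   # payload from the HL7v2 message

      if schema.validate(doc):
          print("extract conforms to the reference model")
      else:
          for error in schema.error_log:
              print(error.message)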

  16. Managing biomedical image metadata for search and retrieval of similar images.

    PubMed

    Korenblum, Daniel; Rubin, Daniel; Napel, Sandy; Rodriguez, Cesar; Beaulieu, Chris

    2011-08-01

    Radiology images are generally disconnected from the metadata describing their contents, such as imaging observations ("semantic" metadata), which are usually described in text reports that are not directly linked to the images. We developed a system, the Biomedical Image Metadata Manager (BIMM), to (1) address the problem of managing biomedical image metadata and (2) facilitate the retrieval of similar images using semantic feature metadata. Our approach allows radiologists, researchers, and students to take advantage of the vast and growing repositories of medical image data by explicitly linking images to their associated metadata in a relational database that is globally accessible through a Web application. BIMM receives input in the form of standards-based metadata files via a Web service and parses and stores the metadata in a relational database, allowing efficient data query and maintenance. Upon querying BIMM for images, 2D regions of interest (ROIs) stored as metadata are automatically rendered onto preview images included in search results. The system's "match observations" function retrieves images with similar ROIs based on specific semantic features describing imaging observation characteristics (IOCs). We demonstrate that the system, using IOCs alone, can accurately retrieve images with diagnoses matching the query images, and we evaluate its performance on a set of annotated liver lesion images. BIMM has several potential applications, e.g., computer-aided detection and diagnosis, content-based image retrieval, automating medical analysis protocols, and gathering population statistics such as disease prevalence. The system provides a framework for decision support systems, potentially improving their diagnostic accuracy and selection of appropriate therapies.
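
    A "match observations" style lookup over a relational store can be sketched as below: images are ranked by the number of semantic IOC features they share with a query ROI. The schema and feature names are hypothetical, not BIMM's actual tables.

        import sqlite3

        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE roi (image_id TEXT, ioc TEXT)")
        db.executemany("INSERT INTO roi VALUES (?, ?)", [
            ("img1", "hypodense"), ("img1", "smooth_margin"),
            ("img2", "hypodense"), ("img2", "irregular_margin"),
        ])

        query_iocs = ["hypodense", "smooth_margin"]   # features of the query ROI
        placeholders = ",".join("?" for _ in query_iocs)
        rows = db.execute(
            f"SELECT image_id, COUNT(*) AS shared FROM roi "
            f"WHERE ioc IN ({placeholders}) "
            f"GROUP BY image_id ORDER BY shared DESC",
            query_iocs).fetchall()
        print(rows)   # images ranked by number of shared IOCs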

  17. Building a comprehensive syntactic and semantic corpus of Chinese clinical texts.

    PubMed

    He, Bin; Dong, Bin; Guan, Yi; Yang, Jinfeng; Jiang, Zhipeng; Yu, Qiubin; Cheng, Jianyi; Qu, Chunyan

    2017-05-01

    To build a comprehensive corpus covering syntactic and semantic annotations of Chinese clinical texts, with corresponding annotation guidelines and methods, and to develop tools trained on the annotated corpus, which supply baselines for research on Chinese texts in the clinical domain. An iterative annotation method was proposed to train annotators and to develop annotation guidelines. Then, by using annotation quality assurance measures, a comprehensive corpus was built, containing annotations of part-of-speech (POS) tags, syntactic tags, entities, assertions, and relations. Inter-annotator agreement (IAA) was calculated to evaluate the annotation quality, and a Chinese clinical text processing and information extraction system (CCTPIES) was developed based on our annotated corpus. The syntactic corpus consists of 138 Chinese clinical documents with 47,426 tokens and 2,612 full parse trees, while the semantic corpus includes 992 documents annotated with 39,511 entities (with their assertions) and 7,693 relations. IAA evaluation shows that this comprehensive corpus is of good quality and that the system modules are effective. The annotated corpus makes a considerable contribution to natural language processing (NLP) research into Chinese texts in the clinical domain. However, this corpus has a number of limitations. Some additional types of clinical text should be introduced to improve corpus coverage, and active learning methods should be utilized to improve annotation efficiency. In this study, several annotation guidelines and an annotation method for Chinese clinical texts were proposed, and a comprehensive corpus and its NLP modules were constructed, providing a foundation for further study of applying NLP techniques to Chinese texts in the clinical domain.
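
    The abstract does not state which IAA metric was used; Cohen's kappa is one common chance-corrected choice for token-level labels such as POS tags, and it is small enough to sketch in full:

        from collections import Counter

        def cohens_kappa(a, b):
            """Kappa = (p_o - p_e) / (1 - p_e) for two annotators' label lists."""
            assert len(a) == len(b)
            n = len(a)
            po = sum(x == y for x, y in zip(a, b)) / n          # observed agreement
            ca, cb = Counter(a), Counter(b)
            pe = sum(ca[k] * cb[k] for k in ca) / (n * n)       # chance agreement
            return (po - pe) / (1 - pe)

        # toy example: two annotators agree on 3 of 4 tags -> kappa = 0.5
        print(cohens_kappa(["NN", "VV", "NN", "NN"],
                           ["NN", "VV", "NN", "VV"]))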

  18. NASA Taxonomies for Searching Problem Reports and FMEAs

    NASA Technical Reports Server (NTRS)

    Malin, Jane T.; Throop, David R.

    2006-01-01

    Many types of hazard and risk analyses are used during the life cycle of complex systems, including Failure Modes and Effects Analysis (FMEA), Hazard Analysis, Fault Tree and Event Tree Analysis, Probabilistic Risk Assessment, Reliability Analysis, and analysis of Problem Reporting and Corrective Action (PRACA) databases. The success of these methods depends on the availability of input data and the analysts' knowledge. Standard nomenclature can increase the reusability of hazard, risk, and problem data. When nomenclature in the source texts is not standard, taxonomies with mapping words (sets of rough synonyms) can be combined with semantic search to identify items and tag them with metadata based on a rich standard nomenclature. Semantic search uses word meanings in the context of parsed phrases to find matches. The NASA taxonomies provide the word meanings. Spacecraft taxonomies and ontologies (generalization hierarchies with attributes and relationships, based on terms' meanings) are being developed for types of subsystems, functions, entities, hazards, and failures. The ontologies are broad and general, covering hardware, software, and human systems. Semantic search of Space Station texts was used to validate and extend the taxonomies. The taxonomies have also been used to extract system connectivity (interaction) models and functions from requirements text. Now the Reconciler semantic search tool and the taxonomies are being applied to improve search in the Space Shuttle PRACA database, to discover recurring patterns of failure. The usual methods of string search and keyword search fall short because the entries are terse and have numerous shortcuts (irregular abbreviations, nonstandard acronyms, cryptic codes), and modifier words cannot be used in sentence context to refine the search. The limited and fixed FMEA categories associated with the entries do not make the fine distinctions needed in the search. The approach assigns PRACA report titles to problem classes in the taxonomy. Each ontology class includes mapping words: near-synonyms naming different manifestations of that problem class. The mapping words for Problems, Entities, and Functions are converted to a canonical form plus any of a small set of modifier words (e.g., "non-uniformity" becomes NOT + UNIFORM). The report titles are parsed as sentences if possible, or treated as a flat sequence of word tokens if parsing fails. When canonical forms in the title match mapping words, the PRACA entry is associated with the corresponding Problem, Entity, or Function in the ontology. The user can search for types of failures associated with types of equipment, clustering by type of problem (e.g., all bearings found with problems of being uneven: "rough", "irregular", "gritty"). The results could also be used for tagging PRACA report entries with rich metadata. This approach could also be applied to searching and tagging failure modes, failure effects, and mitigations in FMEAs. In the pilot work, parsing 52K+ truncated titles (the test cases that were available) resulted in identification of both a type of equipment and a type of problem in about 75% of the cases. The results are displayed in a manner analogous to Google search results. The effort has also led to the enrichment of the taxonomy, adding some new categories and many new mapping words. Further work would make enhancements that have been identified for improving the clustering and further reducing the false alarm rate (in searching for recurring problems, good clustering is more important than reducing false alarms). Searching complete PRACA reports should lead to immediate improvement.
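
    A rough Python sketch of the mapping-word idea: reduce title tokens to canonical forms and look them up in each ontology class's near-synonym list. The classes, mapping words, and the toy canonicalizer are invented for illustration, not taken from the NASA ontologies.

        MAPPING_WORDS = {
            "UNEVEN_SURFACE": {"rough", "irregular", "gritty", "uneven"},
            "BEARING":        {"bearing", "bearings"},
        }

        def canonical(token):
            # toy canonicalizer: lowercase, strip punctuation and a plural 's'
            t = token.lower().strip(".,;:")
            return t[:-1] if t.endswith("s") and len(t) > 3 else t

        def classify(title):
            """Return the ontology classes whose mapping words appear in a title."""
            tokens = {canonical(t) for t in title.split()}
            return [cls for cls, words in MAPPING_WORDS.items()
                    if tokens & {canonical(w) for w in words}]

        print(classify("GRITTY NOISE IN AFT BEARINGS"))
        # -> ['UNEVEN_SURFACE', 'BEARING']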

  19. The power and limits of a rule-based morpho-semantic parser.

    PubMed Central

    Baud, R. H.; Rassinoux, A. M.; Ruch, P.; Lovis, C.; Scherrer, J. R.

    1999-01-01

    The advent of the Electronic Patient Record (EPR) implies an increasing amount of medical text readily available for processing, as soon as convenient tools are made available. The chief application is text analysis, from which one can derive other applications such as indexing for retrieval, knowledge representation, translation, and inferencing for intelligent medical systems. Prerequisites for a convenient analyzer of medical texts are: building the lexicon, developing a semantic representation of the domain, having a large corpus of texts available for statistical analysis, and finally mastering robust and powerful parsing techniques in order to satisfy the constraints of the medical domain. This article aims at presenting an easy-to-use parser ready to be adapted to different settings. It describes its power together with its practical limitations as experienced by the authors. PMID:10566313

  20. The power and limits of a rule-based morpho-semantic parser.

    PubMed

    Baud, R H; Rassinoux, A M; Ruch, P; Lovis, C; Scherrer, J R

    1999-01-01

    The advent of the Electronic Patient Record (EPR) implies an increasing amount of medical text readily available for processing, as soon as convenient tools are made available. The chief application is text analysis, from which one can derive other applications such as indexing for retrieval, knowledge representation, translation, and inferencing for intelligent medical systems. Prerequisites for a convenient analyzer of medical texts are: building the lexicon, developing a semantic representation of the domain, having a large corpus of texts available for statistical analysis, and finally mastering robust and powerful parsing techniques in order to satisfy the constraints of the medical domain. This article aims at presenting an easy-to-use parser ready to be adapted to different settings. It describes its power together with its practical limitations as experienced by the authors.

  1. A study of actions in operative notes.

    PubMed

    Wang, Yan; Pakhomov, Serguei; Burkart, Nora E; Ryan, James O; Melton, Genevieve B

    2012-01-01

    Operative notes contain rich information about techniques, instruments, and materials used in procedures. To assist development of effective information extraction (IE) techniques for operative notes, we investigated the sublanguage used to describe actions within the operative report 'procedure description' section. Deep parsing of 362,310 operative notes with an expanded Stanford parser using the SPECIALIST Lexicon yielded 200 verbs (92% coverage), including 147 action verbs. Nominal action predicates for each action verb were gathered from WordNet, the SPECIALIST Lexicon, the New Oxford American Dictionary, and Stedman's Medical Dictionary. Coverage gaps were seen in existing lexical, domain, and semantic resources (Unified Medical Language System (UMLS) Metathesaurus, SPECIALIST Lexicon, WordNet, and FrameNet). Our findings demonstrate the need to construct surgical domain-specific semantic resources for IE from operative notes.

  2. GBParsy: a GenBank flatfile parser library with high speed.

    PubMed

    Lee, Tae-Ho; Kim, Yeon-Ki; Nahm, Baek Hie

    2008-07-25

    The GenBank flatfile (GBF) format is one of the most popular sequence file formats because of its detailed sequence features and ease of readability. To use the data in the file by a computer, a parsing process is required, performed according to a given grammar for the sequence and the description in a GBF. Several parser libraries for the GBF have already been developed. However, with the accumulation of DNA sequence information from eukaryotic chromosomes, parsing a eukaryotic genome sequence with these libraries inevitably takes a long time, due to the large GBF file and its correspondingly large genomic nucleotide sequence and related feature information. Thus, there is a significant need for a parsing program with high speed and efficient use of system memory. We developed GBParsy, a C-based library that parses GBF files. Parsing speed was maximized by using content-specific functions in place of regular expressions, which are flexible but slow. In addition, we optimized an algorithm related to memory usage, which also increased parsing performance and memory efficiency. GBParsy is 5-100x faster than current parsers in benchmark tests and is estimated to extract annotated information from almost 100 Mb of a GenBank flatfile of chromosomal sequence information within a second. Thus, it should be useful for a variety of applications, such as on-the-fly visualization of a genome at a web site.
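
    GBParsy itself is a C library; purely to illustrate the record structure being parsed, here is a minimal line-oriented scan of a GBF file in Python. It assumes the standard column layout (feature keys in columns 6-21) and collects only the LOCUS name and feature keys.

        def scan_gbf(path):
            """Return (locus_name, [feature keys]) from a GenBank flatfile."""
            locus, features = None, []
            in_features = False
            with open(path) as fh:
                for line in fh:
                    if line.startswith("LOCUS"):
                        locus = line.split()[1]
                    elif line.startswith("FEATURES"):
                        in_features = True
                    elif line.startswith("ORIGIN"):
                        break                          # sequence block follows
                    elif in_features and line[5:21].strip():
                        features.append(line[5:21].strip())   # feature key column
            return locus, features

    A full parser must additionally handle qualifiers, joined locations, and the ORIGIN sequence block, which is where content-specific routines like GBParsy's pay off over regular expressions.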

  3. Convergence of Health Level Seven Version 2 Messages to Semantic Web Technologies for Software-Intensive Systems in Telemedicine Trauma Care

    PubMed Central

    Cook, Timothy Wayne; Cavalini, Luciana Tricai

    2016-01-01

    Objectives To present the technical background and the development of a procedure that enriches the semantics of Health Level Seven version 2 (HL7v2) messages for software-intensive systems in telemedicine trauma care. Methods This study followed a multilevel model-driven approach for the development of semantically interoperable health information systems. The Pre-Hospital Trauma Life Support (PHTLS) ABCDE protocol was adopted as the use case. A prototype application embedded the semantics into an HL7v2 message as an eXtensible Markup Language (XML) file, which was validated against an XML schema that defines constraints on a common reference model. This message was exchanged with a second prototype application, developed on the Mirth middleware, which was also used to parse and validate both the original and the hybrid messages. Results Both versions of the data instance (one pure XML, one embedded in the HL7v2 message) were equally validated, and the RDF-based semantics were recovered by the receiving side of the prototype from the shared XML schema. Conclusions This study demonstrated the semantic enrichment of HL7v2 messages for software-intensive telemedicine systems for trauma care, by validating components of extracts generated in various computing environments. The adoption of the method proposed in this study ensures compliance of the HL7v2 standard with Semantic Web technologies. PMID:26893947

  4. BBN: Description of the PLUM System as Used for MUC-4

    DTIC Science & Technology

    1992-01-01

    in the MUC-4 corpus. Here are the 8 parse fragments generated by FPP for the first sentence of TST2-MUC4-0048: ("SALVADORAN PRESIDENT-ELECT ALFREDO... extensive patterns for fragment combination. Figure 2 shows a graphical version of the semantics generated for the first fragment of S1 in TST2-MUC4... trigger. Following is the discourse event structure for the first event in TST2-MUC4-0048: Event MURDER Trigger fragments: "SALVADORAN PRESIDENT

  5. Integrating Syntax, Semantics, and Discourse DARPA Natural Language Understanding Program. Volume 2. Documentation

    DTIC Science & Technology

    1989-09-30

    Please choose a list of switches, or type "ok." -- [3,5,7]. Changed the switch: parse-tree --> ON. Changed the switch... argument of the verb, especially in the passive (The car was found parked on Elm Street). Other verbs are clearer: They reported the car stolen doesn't... object slot in the passive object passobj, as in the tree above. Strings, LXiRs and Disjunctive Rules. In general, there are three basic types of rules in

  6. Parser Combinators: a Practical Application for Generating Parsers for NMR Data

    PubMed Central

    Fenwick, Matthew; Weatherby, Gerard; Ellis, Heidi JC; Gryk, Michael R.

    2013-01-01

    Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of large data sets to the BMRB (Bio-Magnetic Resonance Data Bank). This data is stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex, making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. Translation of the grammatical model to an executable parser may be accomplished automatically. This paper shows how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional-programming world known as "parser combinators". This paper demonstrates the effectiveness of a principled approach to file specification and parsing. It also builds upon our previous work [1], in that 1) it applies concepts from functional programming (which is relevant even though the implementation language, Java, is more mainstream than functional programming), and 2) all work and accomplishments from this project will be made available under standard open-source licenses to provide the community with the opportunity to learn from our techniques and methods. PMID:24352525
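
    The core idea of parser combinators can be shown in miniature: a parser is a function from an input position to a (value, new position) pair or None, and combinators build bigger parsers from smaller ones. This generic Python sketch is not the NMR-Star grammar itself; the toy fragment at the end is invented.

        def literal(s):
            def p(text, i):
                return (s, i + len(s)) if text.startswith(s, i) else None
            return p

        def seq(*parsers):
            def p(text, i):
                values = []
                for q in parsers:
                    r = q(text, i)
                    if r is None:
                        return None
                    v, i = r
                    values.append(v)
                return values, i
            return p

        def alt(*parsers):
            def p(text, i):
                for q in parsers:
                    r = q(text, i)
                    if r is not None:
                        return r
                return None
            return p

        # a toy fragment: 'save_' followed by one of two frame names
        save_frame = seq(literal("save_"), alt(literal("entry"), literal("citations")))
        print(save_frame("save_entry", 0))   # -> (['save_', 'entry'], 10)

    Because each combinator mirrors one grammar production, the executable parser stays close to the BNF specification, which is the maintainability argument the paper makes.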

  7. "gnparser": a powerful parser for scientific names based on Parsing Expression Grammar.

    PubMed

    Mozzherin, Dmitry Y; Myltsev, Alexander A; Patterson, David J

    2017-05-26

    Scientific names in biology act as universal links. They allow us to cross-reference information about organisms globally. However, variations in the spelling of scientific names greatly diminish their ability to interconnect data. Such variations may include abbreviations, annotations, misspellings, etc. Authorship is a part of a scientific name and may also differ significantly. To match all possible variations of a name we need to divide them into their elements and classify each element according to its role. We refer to this as 'parsing' the name. Parsing categorizes a name's elements into those that are stable and those that are prone to change. Names are matched first by combining them according to their stable elements. Matches are then refined by examining their varying elements. This two-stage process dramatically improves the number and quality of matches. It is especially useful for automatic data exchange within the context of "Big Data" in biology. We introduce the Global Names Parser (gnparser). It is a tool for parsing scientific names, written in Scala (a language for the Java Virtual Machine) and based on a Parsing Expression Grammar. The parser can be applied to scientific names of any complexity. It assigns a semantic meaning (such as genus name, species epithet, rank, year of publication, authorship, annotations, etc.) to all elements of a name. It is able to work with nested structures, as in the names of hybrids. gnparser performs with ≈99% accuracy and processes 30 million name-strings/hour per CPU thread. The gnparser library is compatible with Scala, Java, R, Jython, and JRuby. The parser can be used as a command-line application, a socket server, a web app, or a RESTful HTTP service. It is released under an open-source MIT license. Global Names Parser (gnparser) is a fast, high-precision tool for biodiversity informaticians and biologists working with large numbers of scientific names. It can replace expensive and error-prone manual parsing and standardization of scientific names in many situations, and can quickly enhance the interoperability of distributed biological information.
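
    As a heavily simplified, regex-based stand-in for what gnparser's PEG does, the sketch below splits a name-string into stable elements (genus, epithet) and varying ones (authorship, year). Real names need the full grammar; this handles only the simplest binomial shape.

        import re

        NAME = re.compile(
            r"^(?P<genus>[A-Z][a-z]+)\s+(?P<epithet>[a-z-]+)"
            r"(?:\s+(?P<authorship>[A-Z][^,]*?))?(?:,?\s*(?P<year>\d{4}))?$")

        m = NAME.match("Homo sapiens Linnaeus, 1758")
        if m:
            print({k: v for k, v in m.groupdict().items() if v})
        # -> {'genus': 'Homo', 'epithet': 'sapiens',
        #     'authorship': 'Linnaeus', 'year': '1758'}

    Matching two names first on genus and epithet, then refining by authorship and year, is the two-stage process the abstract describes.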

  8. Reanalysis and semantic persistence in native and non-native garden-path recovery.

    PubMed

    Jacob, Gunnar; Felser, Claudia

    2016-01-01

    We report the results from an eye-movement monitoring study investigating how native and non-native speakers of English process temporarily ambiguous sentences such as While the gentleman was eating the burgers were still being reheated in the microwave, in which an initially plausible direct-object analysis is first ruled out by a syntactic disambiguation (were) and also later on by semantic information (being reheated). Both participant groups showed garden-path effects at the syntactic disambiguation, with native speakers showing significantly stronger effects of ambiguity than non-native speakers in later eye-movement measures but equally strong effects in first-pass reading times. Ambiguity effects at the semantic disambiguation and in participants' end-of-trial responses revealed that for both participant groups, the incorrect direct-object analysis was frequently maintained beyond the syntactic disambiguation. The non-native group showed weaker reanalysis effects at the syntactic disambiguation and was more likely to misinterpret the experimental sentences than the native group. Our results suggest that native language (L1) and non-native language (L2) parsing are similar with regard to sensitivity to syntactic and semantic error signals, but different with regard to processes of reanalysis.

  9. Lexicality Effects in Word and Nonword Recall of Semantic Dementia and Progressive Nonfluent Aphasia

    PubMed Central

    Reilly, Jamie; Troche, Joshua; Chatel, Alison; Park, Hyejin; Kalinyak-Fliszar, Michelene; Antonucci, Sharon M.; Martin, Nadine

    2012-01-01

    Background Verbal working memory is an essential component of many language functions, including sentence comprehension and word learning. As such, working memory has emerged as a domain of intense research interest both in aphasiology and in the broader field of cognitive neuroscience. The integrity of verbal working memory encoding relies on a fluid interaction between semantic and phonological processes. That is, we encode verbal detail using many cues related to both the sound and meaning of words. Lesion models can provide an effective means of parsing the contributions of phonological or semantic impairment to recall performance. Methods and Procedures We employed the lesion model approach here by contrasting the nature of lexicality errors incurred during recall of word and nonword sequences by three individuals with progressive nonfluent aphasia (PNFA; a phonologically dominant impairment) with that of two individuals with semantic dementia (a semantically dominant impairment). We focused on psycholinguistic attributes of correctly recalled stimuli relative to those that elicited a lexicality error (i.e., nonword → word OR word → nonword). Outcomes and Results Patients with semantic dementia showed greater sensitivity to phonological attributes (e.g., phoneme length, wordlikeness) of the target items relative to semantic attributes (e.g., familiarity). Patients with PNFA showed the opposite pattern, marked by sensitivity to word frequency, age of acquisition, familiarity, and imageability. Conclusions We interpret these results in favor of a processing strategy such that, in the context of a focal phonological impairment, patients revert to an over-reliance on preserved semantic processing abilities. In contrast, a focal semantic impairment forces both reliance upon and hypersensitivity to phonological attributes of target words. We relate this interpretation to previous hypotheses about the nature of verbal short-term memory in progressive aphasia. PMID:23486736

  10. Effective user guidance in online interactive semantic segmentation

    NASA Astrophysics Data System (ADS)

    Petersen, Jens; Bendszus, Martin; Debus, Jürgen; Heiland, Sabine; Maier-Hein, Klaus H.

    2017-03-01

    With the recent success of machine learning based solutions for automatic image parsing, the availability of reference image annotations for algorithm training is one of the major bottlenecks in medical image segmentation. We are interested in interactive semantic segmentation methods that can be used in an online fashion to generate expert segmentations. These can be used to train automated segmentation techniques or, from an application perspective, for quick and accurate tumor progression monitoring. Using simulated user interactions in an MRI glioblastoma segmentation task, we show that if the user possesses knowledge of the correct segmentation, it is significantly (p ≤ 0.009) better to present the data and current segmentation to the user in such a manner that they can easily identify falsely classified regions, rather than guiding the user to regions where the classifier exhibits high uncertainty; this results in differences of mean Dice scores between +0.070 (whole tumor) and +0.136 (tumor core) after 20 iterations. The annotation process should cover all classes equally, which results in a significant (p ≤ 0.002) improvement compared to completely random annotations anywhere in falsely classified regions for small tumor regions such as the necrotic tumor core (mean Dice +0.151 after 20 iterations) and non-enhancing abnormalities (mean Dice +0.069 after 20 iterations). These findings provide important insights for the development of efficient interactive segmentation systems and user interfaces.

  11. Extraction of CYP chemical interactions from biomedical literature using natural language processing methods.

    PubMed

    Jiao, Dazhi; Wild, David J

    2009-02-01

    This paper proposes a system that automatically extracts CYP protein and chemical interactions from journal article abstracts, using natural language processing (NLP) and text mining methods. In our system, we employ a maximum entropy based learning method, using results from syntactic, semantic, and lexical analysis of texts. We first present our system architecture and then discuss the data set for training our machine learning based models and the methods in building components in our system, such as part of speech (POS) tagging, Named Entity Recognition (NER), dependency parsing, and relation extraction. An evaluation of the system is conducted at the end, yielding very promising results: The POS, dependency parsing, and NER components in our system have achieved a very high level of accuracy as measured by precision, ranging from 85.9% to 98.5%, and the precision and the recall of the interaction extraction component are 76.0% and 82.6%, and for the overall system are 68.4% and 72.2%, respectively.
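
    A maximum-entropy classifier over sparse NLP features is equivalent to multinomial logistic regression, so the interaction-extraction step can be sketched with scikit-learn (an assumption; the paper does not name its toolkit). The feature names and dependency paths below are invented toy examples.

        from sklearn.feature_extraction import DictVectorizer
        from sklearn.linear_model import LogisticRegression

        train_feats = [
            {"verb": "inhibit", "path": "CYP<-nsubj<-inhibit->dobj->chem"},
            {"verb": "metabolize", "path": "CYP<-nsubj<-metabolize->dobj->chem"},
            {"verb": "mention", "path": "CYP<-conj->chem"},
        ]
        labels = ["interaction", "interaction", "no_interaction"]

        vec = DictVectorizer()
        X = vec.fit_transform(train_feats)
        clf = LogisticRegression().fit(X, labels)   # MaxEnt with the default L2 prior

        test = vec.transform([{"verb": "inhibit", "path": "CYP<-conj->chem"}])
        print(clf.predict(test), clf.predict_proba(test))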

  12. Form and function: Optional complementizers reduce causal inferences

    PubMed Central

    Rohde, Hannah; Tyler, Joseph; Carlson, Katy

    2017-01-01

    Many factors are known to influence the inference of the discourse coherence relationship between two sentences. Here, we examine the relationship between two conjoined embedded clauses in sentences like The professor noted that the student teacher did not look confident and (that) the students were poorly behaved. In two studies, we find that the presence of that before the second embedded clause in such sentences reduces the possibility of a forward causal relationship between the clauses, i.e., the inference that the student teacher’s confidence was what affected student behavior. Three further studies tested the possibility of a backward causal relationship between clauses in the same structure, and found that the complementizer’s presence aids that relationship, especially in a forced-choice paradigm. The empirical finding that a complementizer, a linguistic element associated primarily with structure rather than event-level semantics, can affect discourse coherence is novel and illustrates an interdependence between syntactic parsing and discourse parsing. PMID:28804781

  13. Shaving Bridges and Tuning Kitaraa: The Effect of Language Switching on Semantic Processing

    PubMed Central

    Hut, Suzanne C. A.; Leminen, Alina

    2017-01-01

    Language switching has been repeatedly found to be costly. Yet, there are reasons to believe that switches in language might benefit language comprehension in some groups of people, such as less proficient language learners. This study therefore investigated the interplay between language switching and semantic processing in groups with varying language proficiency. EEG was recorded while L2 learners of English with intermediate and high proficiency levels read semantically congruent or incongruent sentences in L2. Translations of congruent and incongruent target words were additionally presented in L1 to create intrasentential language switches. A control group of English native speakers was tested in order to compare responses to non-switched stimuli with those of L2 learners. An omnibus ANOVA including all groups revealed larger N400 responses for non-switched incongruent stimuli compared to congruent stimuli. Additionally, despite switches to L1 at target word position, semantic N400 responses were still elicited in both L2 learner groups. Further switching effects were reflected by an N400-like effect and a late positivity complex, pointing to possible parsing efforts after language switches. Our results therefore show that although language switches are associated with increased mental effort, switches may not necessarily be costly on the semantic level. This finding contributes to the ongoing discussion on language inhibition processes, and shows that, in these intermediate and high proficient L2 learners, semantic processes look similar to those of native speakers of English. PMID:28900402

  14. Integrating Syntax, Semantics, and Discourse DARPA Natural Language Understanding Program. Volume 2. Appendices.

    DTIC Science & Technology

    1987-05-14

    such as "and", "but", or the conjunction comma, as in "apples, oranges and bananas". Once such a word was recognized, normal parsing was suspended; a portion... Antecedents interpreted with respect to the "reaching the Stadium" event, as happening sometime after that. A new node would be created in e/s structure ordered sometime after the "reaching the Stadium" event ("I picked up a banana. Up close, I noticed the banana was too green to..."). On the other

  15. A rapid place name locating algorithm based on ontology qualitative retrieval, ranking and recommendation

    NASA Astrophysics Data System (ADS)

    Fan, Hong; Zhu, Anfeng; Zhang, Weixia

    2015-12-01

    To support the rapid locating of 12315 consumer complaints, which are expressed in natural language over the telephone, a semantic retrieval framework is proposed based on natural language parsing and geographical-name ontology reasoning. Within this framework, a search-result ranking and recommendation algorithm is proposed that considers both geo-name conceptual similarity and spatial geometric-relation similarity. The experiments show that this method can assist the operator in quickly locating 12315 complaints, increasing industry and commerce customer satisfaction.

  16. Detecting modification of biomedical events using a deep parsing approach.

    PubMed

    Mackinlay, Andrew; Martinez, David; Baldwin, Timothy

    2012-04-30

    This work describes a system for identifying event mentions in bio-molecular research abstracts that are either speculative (e.g. analysis of IkappaBalpha phosphorylation, where it is not specified whether phosphorylation did or did not occur) or negated (e.g. inhibition of IkappaBalpha phosphorylation, where phosphorylation did not occur). The data comes from a standard dataset created for the BioNLP 2009 Shared Task. The system uses a machine-learning approach, where the features used for classification are a combination of shallow features derived from the words of the sentences and more complex features based on the semantic outputs produced by a deep parser. To detect event modification, we use a Maximum Entropy learner with features extracted from the data relative to the trigger words of the events. The shallow features are bag-of-words features based on a small sliding context window of 3-4 tokens on either side of the trigger word. The deep parser features are derived from parses produced by the English Resource Grammar and the RASP parser. The outputs of these parsers are converted into the Minimal Recursion Semantics formalism, and from this, we extract features motivated by linguistics and the data itself. All of these features are combined to create training or test data for the machine learning algorithm. Over the test data, our methods produce approximately a 4% absolute increase in F-score for detection of event modification compared to a baseline based only on the shallow bag-of-words features. Our results indicate that grammar-based techniques can enhance the accuracy of methods for detecting event modification.
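
    The shallow features described above are bag-of-words features over a small window around the event trigger; this sketch extracts them. The window width follows the 3-4 token range mentioned in the abstract; the function and feature names are illustrative.

        def window_features(tokens, trigger_index, width=3):
            """Bag-of-words features from a sliding window around the trigger."""
            feats = {}
            lo = max(0, trigger_index - width)
            hi = min(len(tokens), trigger_index + width + 1)
            for i in range(lo, hi):
                if i != trigger_index:
                    side = "L" if i < trigger_index else "R"
                    feats[f"bow_{side}_{tokens[i].lower()}"] = 1
            return feats

        toks = "analysis of IkappaBalpha phosphorylation in resting cells".split()
        print(window_features(toks, toks.index("phosphorylation")))

    In the full system these shallow features are concatenated with features derived from the deep parsers' Minimal Recursion Semantics output before training the Maximum Entropy learner.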

  17. Incremental Refinement of FAÇADE Models with Attribute Grammar from 3d Point Clouds

    NASA Astrophysics Data System (ADS)

    Dehbi, Y.; Staat, C.; Mandtler, L.; Plümer, L.

    2016-06-01

    Data acquisition using unmanned aerial vehicles (UAVs) has received increasing attention over the last years. Especially in the field of building reconstruction, the incremental interpretation of such data is a demanding task. In this context, formal grammars play an important role in the top-down identification and reconstruction of building objects. Up to now, the available approaches expect offline data in order to parse an a priori known grammar. For mapping on demand, an on-the-fly reconstruction based on UAV data is required, and an incremental interpretation of the data stream is inevitable. This paper presents an incremental parser of grammar rules for automatic 3D building reconstruction. The parser enables model refinement based on new observations with respect to a weighted attribute context-free grammar (WACFG). The falsification or rejection of hypotheses is supported as well. The parser can deal with, and adapt, available parse trees acquired from previous interpretations or predictions. Parse trees derived so far are updated iteratively using transformation rules. A diagnostic step searches for mismatches between current and new nodes. Prior knowledge on façades is incorporated, given by probability densities as well as architectural patterns. Since normal distributions cannot always be assumed, the derivation of location and shape parameters of building objects is based on kernel density estimation (KDE). While the level of detail is continuously improved, geometrical, semantic, and topological consistency is ensured.

  18. The Construction of Semantic Memory: Grammar-Based Representations Learned from Relational Episodic Information

    PubMed Central

    Battaglia, Francesco P.; Pennartz, Cyriel M. A.

    2011-01-01

    After acquisition, memories undergo a process of consolidation, making them more resistant to interference and brain injury. Memory consolidation involves systems-level interactions, most importantly between the hippocampus and associated structures, which take part in the initial encoding of memory, and the neocortex, which supports long-term storage. This dichotomy parallels the contrast between episodic memory (tied to the hippocampal formation), collecting an autobiographical stream of experiences, and semantic memory, a repertoire of facts and statistical regularities about the world, involving the neocortex at large. Experimental evidence points to a gradual transformation of memories, following encoding, from an episodic to a semantic character. This may require an exchange of information between different memory modules during inactive periods. We propose a theory for such interactions and for the formation of semantic memory, in which episodic memory is encoded as relational data. Semantic memory is modeled as a modified stochastic grammar, which learns to parse episodic configurations expressed as an association matrix. The grammar produces tree-like representations of episodes, describing the relationships between its main constituents at multiple levels of categorization, based on its current knowledge of world regularities. These regularities are learned by the grammar from episodic memory information, through an expectation-maximization procedure, analogous to the inside-outside algorithm for stochastic context-free grammars. We propose that a Monte-Carlo sampling version of this algorithm can be mapped onto the dynamics of "sleep replay" of previously acquired information in the hippocampus and neocortex. We propose that the model can reproduce several properties of semantic memory such as decontextualization, top-down processing, and creation of schemata. PMID:21887143

  19. Semantic and Syntactic Interference in Sentence Comprehension: A Comparison of Working Memory Models

    PubMed Central

    Tan, Yingying; Martin, Randi C.; Van Dyke, Julie A.

    2017-01-01

    This study investigated the nature of the underlying working memory system supporting sentence processing through examining individual differences in sensitivity to retrieval interference effects during sentence comprehension. Interference effects occur when readers incorrectly retrieve sentence constituents which are similar to those required during integrative processes. We examined interference arising from a partial match between distracting constituents and syntactic and semantic cues, and related these interference effects to performance on working memory, short-term memory (STM), vocabulary, and executive function tasks. For online sentence comprehension, as measured by self-paced reading, the magnitude of individuals' syntactic interference effects was predicted by general WM capacity and the relation remained significant when partialling out vocabulary, indicating that the effects were not due to verbal knowledge. For offline sentence comprehension, as measured by responses to comprehension questions, both general WM capacity and vocabulary knowledge interacted with semantic interference for comprehension accuracy, suggesting that both general WM capacity and the quality of semantic representations played a role in determining how well interference was resolved offline. For comprehension question reaction times, a measure of semantic STM capacity interacted with semantic but not syntactic interference. However, a measure of phonological capacity (digit span) and a general measure of resistance to response interference (Stroop effect) did not predict individuals' interference resolution abilities in either online or offline sentence comprehension. The results are discussed in relation to the multiple capacities account of working memory (e.g., Martin and Romani, 1994; Martin and He, 2004), and the cue-based retrieval parsing approach (e.g., Lewis et al., 2006; Van Dyke et al., 2014). While neither approach was fully supported, a possible means of reconciling the two approaches and directions for future research are proposed. PMID:28261133

  20. Detecting modification of biomedical events using a deep parsing approach

    PubMed Central

    2012-01-01

    Background This work describes a system for identifying event mentions in bio-molecular research abstracts that are either speculative (e.g. analysis of IkappaBalpha phosphorylation, where it is not specified whether phosphorylation did or did not occur) or negated (e.g. inhibition of IkappaBalpha phosphorylation, where phosphorylation did not occur). The data comes from a standard dataset created for the BioNLP 2009 Shared Task. The system uses a machine-learning approach, where the features used for classification are a combination of shallow features derived from the words of the sentences and more complex features based on the semantic outputs produced by a deep parser. Method To detect event modification, we use a Maximum Entropy learner with features extracted from the data relative to the trigger words of the events. The shallow features are bag-of-words features based on a small sliding context window of 3-4 tokens on either side of the trigger word. The deep parser features are derived from parses produced by the English Resource Grammar and the RASP parser. The outputs of these parsers are converted into the Minimal Recursion Semantics formalism, and from this, we extract features motivated by linguistics and the data itself. All of these features are combined to create training or test data for the machine learning algorithm. Results Over the test data, our methods produce approximately a 4% absolute increase in F-score for detection of event modification compared to a baseline based only on the shallow bag-of-words features. Conclusions Our results indicate that grammar-based techniques can enhance the accuracy of methods for detecting event modification. PMID:22595089

  1. The "Globularization Hypothesis" of the Language-ready Brain as a Developmental Frame for Prosodic Bootstrapping Theories of Language Acquisition.

    PubMed

    Irurtzun, Aritz

    2015-01-01

    In recent research, Boeckx and Benítez-Burraco (2014a,b) have advanced the hypothesis that our species-specific language-ready brain should be understood as the outcome of developmental changes that occurred in our species after the split from Neanderthals-Denisovans, which resulted in a more globular braincase configuration in comparison to our closest relatives, who had elongated endocasts. According to these authors, the development of a globular brain is an essential ingredient for the language faculty, and in particular, it is the centrality occupied by the thalamus in a globular brain that allows its modulatory or regulatory role, essential for syntactico-semantic computations. Their hypothesis is that the syntactico-semantic capacities arise in humans as a consequence of a process of globularization, which significantly takes place postnatally (cf. Neubauer et al., 2010). In this paper, I show that Boeckx and Benítez-Burraco's hypothesis makes an interesting developmental prediction regarding the path of language acquisition: it teases apart the onset of phonological acquisition and the onset of syntactic acquisition (the latter starting significantly later, after globularization). I argue that this hypothesis provides a developmental rationale for the prosodic bootstrapping hypothesis of language acquisition (cf. i.a. Gleitman and Wanner, 1982; Mehler et al., 1988, et seq.; Gervain and Werker, 2013), which claims that prosodic cues are employed for syntactic parsing. The literature converges on the observation that a large number of such prosodic cues (in particular, rhythmic cues) are already acquired before the completion of the globularization phase, which paves the way for the premises of the prosodic bootstrapping hypothesis, allowing babies to have a rich knowledge of the prosody of their target language before they can start parsing the primary linguistic data syntactically.

  2. External details revisited - A new taxonomy for coding 'non-episodic' content during autobiographical memory retrieval.

    PubMed

    Strikwerda-Brown, Cherie; Mothakunnel, Annu; Hodges, John R; Piguet, Olivier; Irish, Muireann

    2018-04-24

    Autobiographical memory (ABM) is typically held to comprise episodic and semantic elements, with the vast majority of studies to date focusing on profiles of episodic details in health and disease. In this context, 'non-episodic' elements are often considered to reflect semantic processing or are discounted from analyses entirely. Mounting evidence suggests that rather than reflecting one unitary entity, semantic autobiographical information may contain discrete subcomponents, which vary in their relative degree of semantic or episodic content. This study aimed to (1) review the existing literature to formally characterize the variability in analysis of 'non-episodic' content (i.e., external details) on the Autobiographical Interview and (2) use these findings to create a theoretically grounded framework for coding external details. Our review exposed discrepancies in the reporting and interpretation of external details across studies, reinforcing the need for a new, consistent approach. We validated our new external-details scoring protocol (the 'NExt' taxonomy) in patients with Alzheimer's disease (n = 18) and semantic dementia (n = 13), and in 20 healthy older control participants, and compared profiles of the NExt subcategories across groups and time periods. Our results revealed increased sensitivity of the NExt taxonomy in discriminating between the ABM profiles of patient groups, when compared to traditionally used internal and external detail metrics. Further, remote and recent autobiographical memories displayed distinct compositions of the NExt detail types. This study is the first to provide a fine-grained and comprehensive taxonomy to parse external details into intuitive subcategories and to validate this protocol in neurodegenerative disorders.

  3. Meeting medical terminology needs--the Ontology-Enhanced Medical Concept Mapper.

    PubMed

    Leroy, G; Chen, H

    2001-12-01

    This paper describes the development and testing of the Medical Concept Mapper, a tool designed to facilitate access to online medical information sources by providing users with appropriate medical search terms for their personal queries. Our system is valuable for patients whose knowledge of medical vocabularies is inadequate to find the desired information, and for medical experts who search for information outside their field of expertise. The Medical Concept Mapper maps synonyms and semantically related concepts to a user's query. The system is unique because it integrates our natural language processing tool, i.e., the Arizona (AZ) Noun Phraser, with human-created ontologies, the Unified Medical Language System (UMLS) and WordNet, and our computer generated Concept Space, into one system. Our unique contribution results from combining the UMLS Semantic Net with Concept Space in our deep semantic parsing (DSP) algorithm. This algorithm establishes a medical query context based on the UMLS Semantic Net, which allows Concept Space terms to be filtered so as to isolate related terms relevant to the query. We performed two user studies in which Medical Concept Mapper terms were compared against human experts' terms. We conclude that the AZ Noun Phraser is well suited to extract medical phrases from user queries, that WordNet is not well suited to provide strictly medical synonyms, that the UMLS Metathesaurus is well suited to provide medical synonyms, and that Concept Space is well suited to provide related medical terms, especially when these terms are limited by our DSP algorithm.

  4. Experimental Evaluation of Processing Time for the Synchronization of XML-Based Business Objects

    NASA Astrophysics Data System (ADS)

    Ameling, Michael; Wolf, Bernhard; Springer, Thomas; Schill, Alexander

    Business objects (BOs) are data containers for complex data structures used in business applications such as Supply Chain Management and Customer Relationship Management. Due to the replication of application logic, multiple copies of BOs are created, which have to be synchronized and updated. This is a complex and time-consuming task because BOs vary widely in their structure according to the distribution, number, and size of elements. Since BOs are internally represented as XML documents, the parsing of XML is one major cost factor which has to be considered when minimizing the processing time during synchronization. Predicting the parsing time of BOs is a significant asset for the selection of an efficient synchronization mechanism. In this paper, we present a method to evaluate the influence of the structure of BOs on their parsing time. The results of our experimental evaluation, incorporating four different XML parsers, examine the dependencies between the distribution of elements and the parsing time. Finally, a general cost model is validated and simplified according to the results of the experimental setup.
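
    The cost factor discussed above can be measured directly: time the parse of the same BO-like XML document with two standard-library parsers. The document shape, element count, and repetition count below are invented for the benchmark, not taken from the paper's setup.

        import timeit
        import xml.etree.ElementTree as ET
        from xml.dom import minidom

        # build a BO-like document with many small elements
        items = "".join(f"<item id='{i}'>value</item>" for i in range(10000))
        doc = f"<bo>{items}</bo>"

        for name, parse in [("ElementTree", lambda: ET.fromstring(doc)),
                            ("minidom", lambda: minidom.parseString(doc))]:
            t = timeit.timeit(parse, number=20) / 20
            print(f"{name}: {t * 1000:.1f} ms per parse")

    Varying the number, nesting depth, and size of elements while holding the total document size constant is the kind of experiment the paper uses to relate BO structure to parsing time.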

  5. Human behavior recognition using a context-free grammar

    NASA Astrophysics Data System (ADS)

    Rosani, Andrea; Conci, Nicola; De Natale, Francesco G. B.

    2014-05-01

    Automatic recognition of human activities and behaviors is still a challenging problem for many reasons, including the limited accuracy of the data acquired by sensing devices, the high variability of human behaviors, and the gap between visual appearance and scene semantics. Symbolic approaches can significantly simplify the analysis and turn raw data into chains of meaningful patterns. This allows getting rid of most of the clutter produced by low-level processing operations, embedding significant contextual information into the data, and using simple syntactic approaches to perform the matching between incoming sequences and models. We propose a symbolic approach to learn and detect complex activities through sequences of atomic actions. Compared to previous methods based on context-free grammars, we introduce several important novelties, such as the capability to learn actions from both positive and negative samples, the possibility of efficiently retraining the system in the presence of misclassified or unrecognized events, and the use of a parsing procedure that allows correct detection of the activities even when they are concatenated and/or nested within each other. An experimental validation on three datasets with different characteristics demonstrates the robustness of the approach in classifying complex human behaviors.
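
    The symbolic idea can be sketched with NLTK: atomic actions become terminals, and an activity is recognized when the observed sequence parses. The grammar and action names below are invented; the paper's learned grammar is more elaborate.

        import nltk

        grammar = nltk.CFG.fromstring("""
          ACTIVITY -> PREPARE DRINK LEAVE
          PREPARE  -> 'enter_kitchen' 'open_cupboard'
          DRINK    -> 'fill_cup' 'drink' | 'fill_cup' 'drink' DRINK
          LEAVE    -> 'leave_kitchen'
        """)
        parser = nltk.ChartParser(grammar)

        seq = ["enter_kitchen", "open_cupboard", "fill_cup", "drink",
               "fill_cup", "drink", "leave_kitchen"]
        trees = list(parser.parse(seq))
        print("recognized" if trees else "rejected")

    The recursive DRINK production shows how a grammar naturally accepts repeated or nested sub-activities, which is where syntactic matching outperforms fixed template matching.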

  6. Deeply learnt hashing forests for content based image retrieval in prostate MR images

    NASA Astrophysics Data System (ADS)

    Shah, Amit; Conjeti, Sailesh; Navab, Nassir; Katouzian, Amin

    2016-03-01

    The deluge in the size and heterogeneity of medical image databases necessitates content-based retrieval systems for their efficient organization. In this paper, we propose such a system to retrieve prostate MR images which share similarities in appearance and content with a query image. We introduce deeply learnt hashing forests (DL-HF) for this image retrieval task. DL-HF effectively leverages the semantic descriptiveness of deeply learnt Convolutional Neural Networks, used in conjunction with hashing forests, which are unsupervised random forests. DL-HF hierarchically parses the deep-learnt feature space to encode subspaces with compact binary code words. We propose a similarity-preserving feature descriptor called the Parts Histogram, which is derived from DL-HF. Correlation defined on this descriptor is used as a similarity metric for retrieval from the database. Validation on a publicly available multi-center prostate MR image database established the validity of the proposed approach. The proposed method is fully automated without any user interaction and is not dependent on external image standardization such as normalization and registration. This image retrieval method is generalizable and well suited for retrieval in heterogeneous databases, other imaging modalities, and other anatomies.

  7. Content-based TV sports video retrieval using multimodal analysis

    NASA Astrophysics Data System (ADS)

    Yu, Yiqing; Liu, Huayong; Wang, Hongbin; Zhou, Dongru

    2003-09-01

    In this paper, we propose content-based video retrieval, which is retrieval by semantic content. Because video data is composed of multimodal information streams such as visual, auditory, and textual streams, we describe a strategy that uses multimodal analysis for automatically parsing sports video. The paper first defines the basic structure of a sports video database system, and then introduces a new approach that integrates visual stream analysis, speech recognition, speech signal processing, and text extraction to realize video retrieval. The experimental results for TV sports video of football games indicate that the multimodal analysis is effective for video retrieval, enabling quick browsing of tree-like video clips or keyword queries within a predefined domain.

  8. XPI: The Xanadu Parameter Interface

    NASA Technical Reports Server (NTRS)

    White, N.; Barrett, P.; Oneel, B.; Jacobs, P.

    1992-01-01

    XPI is a table-driven parameter interface which greatly simplifies both command-driven programs, such as BROWSE and XIMAGE, and stand-alone single-task programs. It moves all of the syntactic and semantic parsing of commands and parameters out of the user's code into common code and externally defined tables. This allows the programmer to concentrate on writing the code unique to the application rather than reinventing the user interface, and allows external graphical interfaces to be attached with no changes to the command-driven program. XPI also includes a compatibility library which allows programs written using the IRAF host interface (Mandel and Roll) to use XPI in place of the IRAF host interface.
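
    The table-driven idea in miniature: parameter syntax and semantics live in an external table, and one generic routine parses any command against it. This Python sketch is not XPI's actual API; the parameter names and table format are invented.

        PARAM_TABLE = {
            "energy_band": {"type": float, "default": 2.0, "min": 0.1, "max": 10.0},
            "smooth":      {"type": bool,  "default": False},
        }

        def parse_params(args):
            """Parse name=value arguments against the external parameter table."""
            values = {name: spec["default"] for name, spec in PARAM_TABLE.items()}
            for arg in args:                       # e.g. "energy_band=6.4"
                name, _, raw = arg.partition("=")
                spec = PARAM_TABLE[name]           # unknown names raise KeyError
                val = spec["type"](raw) if spec["type"] is not bool else raw == "yes"
                if "min" in spec and not spec["min"] <= val <= spec["max"]:
                    raise ValueError(f"{name} out of range")
                values[name] = val
            return values

        print(parse_params(["energy_band=6.4", "smooth=yes"]))

    Because validation lives in the table rather than the application, adding a parameter or attaching a graphical front end requires no change to the application code, which is the design point the abstract makes.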

  9. TopFed: TCGA tailored federated query processing and linking to LOD.

    PubMed

    Saleem, Muhammad; Padmanabhuni, Shanmukha S; Ngomo, Axel-Cyrille Ngonga; Iqbal, Aftab; Almeida, Jonas S; Decker, Stefan; Deus, Helena F

    2014-01-01

    The Cancer Genome Atlas (TCGA) is a multidisciplinary, multi-institutional effort to catalogue genetic mutations responsible for cancer using genome analysis techniques. One of the aims of this project is to create a comprehensive and open repository of cancer-related molecular analysis, to be exploited by bioinformaticians towards advancing cancer knowledge. However, devising bioinformatics applications to analyse such a large dataset is still challenging, as it often requires downloading large archives and parsing the relevant text files, making it difficult to enable the virtual data integration needed to collect the critical co-variates necessary for analysis. We address these issues by transforming the TCGA data into the Semantic Web standard Resource Description Framework (RDF), linking it to relevant datasets in the Linked Open Data (LOD) cloud, and further proposing an efficient data distribution strategy to host the resulting 20.4 billion triples via several SPARQL endpoints. Having the TCGA data distributed across multiple SPARQL endpoints, we enable biomedical scientists to query and retrieve information from these endpoints through a TCGA-tailored federated SPARQL query processing engine named TopFed. We compare TopFed with the well-established federation engine FedX in terms of source selection and query execution time, using 10 different federated SPARQL queries with varying requirements. Our evaluation results show that TopFed selects on average less than half of the sources (with 100% recall), with query execution time equal to one third of that of FedX. With TopFed, we aim to offer biomedical scientists a single point of access through which distributed TCGA data can be accessed in unison. We believe the proposed system can greatly help researchers in the biomedical domain to carry out their research effectively, as the amount and diversity of the data exceeds the ability of local resources to handle its retrieval and parsing.
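
    The shape of a query against one of the endpoints described above can be sketched with the SPARQLWrapper package. The endpoint URL and the graph vocabulary are placeholders, and TopFed's actual source selection happens inside the engine, not in client code like this.

        from SPARQLWrapper import SPARQLWrapper, JSON

        sparql = SPARQLWrapper("http://example.org/tcga/sparql")  # hypothetical endpoint
        sparql.setQuery("""
            PREFIX tcga: <http://example.org/tcga/schema#>
            SELECT ?patient ?value WHERE {
                ?patient tcga:result ?r .
                ?r tcga:geneSymbol "TP53" ;
                   tcga:expressionValue ?value .
            } LIMIT 10
        """)
        sparql.setReturnFormat(JSON)
        for row in sparql.query().convert()["results"]["bindings"]:
            print(row["patient"]["value"], row["value"]["value"])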

  10. Intelligence related upper alpha desynchronization in a semantic memory task.

    PubMed

    Doppelmayr, M; Klimesch, W; Hödlmoser, K; Sauseng, P; Gruber, W

    2005-07-30

    Recent evidence shows that event-related (upper) alpha desynchronization (ERD) is related to cognitive performance. Several studies observed a positive, some a negative relationship. The latter finding, interpreted in terms of the neural efficiency hypothesis, suggests that good performance is associated with a more 'efficient', smaller extent of cortical activation. Other studies found that ERD increases with semantic processing demands and that this increase is larger for good performers. Studies supporting the neural efficiency hypothesis used tasks that do not specifically require semantic processing. Thus, we assume that the lack of semantic processing demands may at least in part be responsible for the reduced ERD. In the present study we measured ERD during a difficult verbal-semantic task. The findings demonstrate that during semantic processing, more intelligent (as compared to less intelligent) subjects exhibited a significantly larger upper alpha ERD over the left hemisphere. We conclude that more intelligent subjects exhibit a more extensive activation in a semantic processing system and suggest that divergent findings regarding the neural efficiency hypotheses are due to task specific differences in semantic processing demands.

  11. MMTF-An efficient file format for the transmission, visualization, and analysis of macromolecular structures.

    PubMed

    Bradley, Anthony R; Rose, Alexander S; Pavelka, Antonín; Valasatava, Yana; Duarte, Jose M; Prlić, Andreas; Rose, Peter W

    2017-06-01

    Recent advances in experimental techniques have led to a rapid growth in complexity, size, and number of macromolecular structures that are made available through the Protein Data Bank. This creates a challenge for macromolecular visualization and analysis. Macromolecular structure files, such as PDB or PDBx/mmCIF files, can be slow to transfer and parse, and hard to incorporate into third-party software tools. Here, we present a new binary and compressed data representation, the MacroMolecular Transmission Format, MMTF, as well as software implementations in several languages that have been developed around it, which address these issues. We describe the new format and its APIs and demonstrate that it is several times faster to parse, and about a quarter of the file size of the current standard format, PDBx/mmCIF. As a consequence of the new data representation, it is now possible to visualize structures with millions of atoms in a web browser, and to keep the whole PDB archive in memory or parse it within a few minutes on average computers, which opens up a new way of thinking about how to design and implement efficient algorithms in structural bioinformatics. The PDB archive is available in the MMTF file format through web services, with data updated on a weekly basis.
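
    For readers who want to try the format, a minimal sketch using the mmtf-python reference decoder (one of the implementations mentioned above) follows; the attribute names are taken from that package, and the example PDB entry is arbitrary.

        # Requires `pip install mmtf-python`; fetches a structure as MMTF over HTTP.
        from mmtf import fetch

        structure = fetch("4HHB")            # hemoglobin, decoded from binary MMTF
        print(structure.structure_id)        # PDB identifier
        print(structure.num_atoms)           # atom count
        print(structure.x_coord_list[:5])    # coordinates are stored as flat arrays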

  12. MMTF—An efficient file format for the transmission, visualization, and analysis of macromolecular structures

    PubMed Central

    Pavelka, Antonín; Valasatava, Yana; Prlić, Andreas

    2017-01-01

    Recent advances in experimental techniques have led to a rapid growth in complexity, size, and number of macromolecular structures that are made available through the Protein Data Bank. This creates a challenge for macromolecular visualization and analysis. Macromolecular structure files, such as PDB or PDBx/mmCIF files, can be slow to transfer and parse, and hard to incorporate into third-party software tools. Here, we present a new binary and compressed data representation, the MacroMolecular Transmission Format, MMTF, as well as software implementations in several languages that have been developed around it, which address these issues. We describe the new format and its APIs and demonstrate that it is several times faster to parse, and about a quarter of the file size of the current standard format, PDBx/mmCIF. As a consequence of the new data representation, it is now possible to visualize structures with millions of atoms in a web browser, and to keep the whole PDB archive in memory or parse it within a few minutes on average computers, which opens up a new way of thinking about how to design and implement efficient algorithms in structural bioinformatics. The PDB archive is available in the MMTF file format through web services, with data updated on a weekly basis. PMID:28574982

  13. FastaValidator: an open-source Java library to parse and validate FASTA formatted sequences.

    PubMed

    Waldmann, Jost; Gerken, Jan; Hankeln, Wolfgang; Schweer, Timmy; Glöckner, Frank Oliver

    2014-06-14

    Advances in sequencing technologies challenge the efficient importing and validation of FASTA-formatted sequence data, which is still a prerequisite for most bioinformatic tools and pipelines. Comparative analysis of commonly used Bio*-frameworks (BioPerl, BioJava and Biopython) shows that their scalability and accuracy are hampered. FastaValidator is a platform-independent, standardized, lightweight software library written in the Java programming language. It targets computer scientists and bioinformaticians writing software that needs to parse large amounts of sequence data quickly and accurately. For end-users, FastaValidator includes interactive out-of-the-box validation of FASTA-formatted files, as well as a non-interactive mode designed for high-throughput validation in software pipelines. The accuracy and performance of the FastaValidator library qualify it for large data sets such as those commonly produced by massively parallel (NGS) technologies. It offers scientists a fast, accurate and standardized method for parsing and validating FASTA-formatted sequence data.
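
    FastaValidator itself is a Java library; the following Python fragment is only an illustrative sketch of the kind of streaming check it performs: every record must begin with a '>' header line, and sequence lines may contain only legal residue characters.

        import re

        # IUPAC nucleotide codes plus gap/stop symbols (illustrative alphabet).
        VALID_SEQ = re.compile(r"^[ACGTUNRYSWKMBDHV\-\*]+$", re.IGNORECASE)

        def validate_fasta(path: str) -> bool:
            """Stream a FASTA file, returning False at the first malformed line."""
            seen_header = False
            with open(path) as handle:
                for line in handle:
                    line = line.strip()
                    if not line:
                        continue
                    if line.startswith(">"):
                        seen_header = True
                    elif not seen_header or not VALID_SEQ.match(line):
                        return False   # sequence before any header, or bad characters
            return seen_header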

  14. Integrating Semantic Information in Metadata Descriptions for a Geoscience-wide Resource Inventory.

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Richard, S. M.; Gupta, A.; Valentine, D.; Whitenack, T.; Ozyurt, I. B.; Grethe, J. S.; Schachne, A.

    2016-12-01

    Integrating semantic information into legacy metadata catalogs is a challenging issue and has so far been done mostly on a limited scale. We present the experience of CINERGI (Community Inventory of Earthcube Resources for Geoscience Interoperability), an NSF Earthcube Building Block project, in creating a large cross-disciplinary catalog of geoscience information resources to enable cross-domain discovery. The project developed a pipeline for automatically augmenting resource metadata, in particular generating keywords that describe metadata documents harvested from multiple geoscience information repositories or contributed by geoscientists through various channels, including surveys and domain resource inventories. The pipeline examines available metadata descriptions using the text parsing, vocabulary management, semantic annotation and graph navigation services of GeoSciGraph. GeoSciGraph, in turn, relies on a large cross-domain ontology of geoscience terms, which bridges several independently developed ontologies or taxonomies, including SWEET, ENVO, YAGO, GeoSciML, GCMD, SWO, and CHEBI. The ontology content enables automatic extraction of keywords reflecting science domains, equipment used, geospatial features, measured properties, methods, processes, etc. We specifically focus on issues of cross-domain geoscience ontology creation, resolving several types of semantic conflicts among component ontologies or vocabularies, and constructing and managing facets for improved data discovery and navigation. The ontology and keyword generation rules are iteratively improved as pipeline results are presented to data managers for selective manual curation via the CINERGI Annotator user interface. We present lessons learned from applying the CINERGI metadata augmentation pipeline to a number of federal agency and academic data registries, in the context of several use cases that require data discovery and integration across multiple earth science data catalogs of varying quality and completeness. The inventory is accessible at http://cinergi.sdsc.edu, and the CINERGI project web page is http://earthcube.org/group/cinergi.

  15. Discovery of novel biomarkers and phenotypes by semantic technologies

    PubMed Central

    2013-01-01

    Background Biomarkers and target-specific phenotypes are important to targeted drug design and individualized medicine, thus constituting an important aspect of modern pharmaceutical research and development. More and more, the discovery of relevant biomarkers is aided by in silico techniques based on applying data mining and computational chemistry to large molecular databases. However, there is an even larger source of valuable information available that can potentially be tapped for such discoveries: repositories constituted by research documents. Results This paper reports on a pilot experiment to discover potential novel biomarkers and phenotypes for diabetes and obesity by self-organized text mining of about 120,000 PubMed abstracts, public clinical trial summaries, and internal Merck research documents. These documents were analyzed directly by the InfoCodex semantic engine, without prior human manipulation such as parsing. Recall and precision against established, but different, benchmarks lie in ranges up to 30% and 50%, respectively. Retrieval of known entities missed by other, traditional approaches could be demonstrated. Finally, the InfoCodex semantic engine was shown to discover new diabetes and obesity biomarkers and phenotypes. Amongst these were many interesting candidates with high potential, although noticeable noise (uninteresting or obvious terms) was generated. Conclusions The reported approach of employing autonomous self-organising semantic engines to aid biomarker discovery, supplemented by appropriate manual curation processes, shows promise: conservatively, it offers a faster alternative to vocabulary-based processes that depend on humans having to read and analyze all the texts. More optimistically, it could impact pharmaceutical research, for example by shortening the time-to-market of novel drugs or by speeding up early recognition of dead ends and adverse reactions. PMID:23402646

  16. A reusability and efficiency oriented software design method for mobile land inspection

    NASA Astrophysics Data System (ADS)

    Cai, Wenwen; He, Jun; Wang, Qing

    2008-10-01

    To meet requirements from the real-time land inspection domain, a land inspection handset system is presented in this paper. To increase the reusability of the system, a design-pattern-based framework was developed. Encapsulation of command-like actions through the COMMAND pattern was proposed to handle complex UI interactions. Integrating several GPS-log parsing engines into a general parsing framework was achieved by introducing the STRATEGY pattern, as sketched below. A network transmission module was constructed on top of network middleware, and the FACTORY pattern was applied to mitigate the high coupling typical of complex network communication programs. Moreover, to manipulate huge GIS datasets efficiently, a multi-scale representation method based on the VISITOR pattern and quad-trees was presented. Practical use showed that these design patterns reduced the coupling between the subsystems and improved extensibility.
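
    The following Python sketch illustrates the STRATEGY idea referenced above: interchangeable GPS-log parsing engines behind one framework interface. The class names, formats, and field layouts are hypothetical, and the original system was not written in Python.

        from abc import ABC, abstractmethod

        class GpsLogParser(ABC):
            """Strategy interface: every engine parses one line into (lat, lon)."""
            @abstractmethod
            def parse_line(self, line: str) -> tuple:
                ...

        class CommaParser(GpsLogParser):
            def parse_line(self, line: str) -> tuple:
                fields = line.split(",")
                return float(fields[0]), float(fields[1])   # hypothetical layout

        class SemicolonParser(GpsLogParser):
            def parse_line(self, line: str) -> tuple:
                lat, lon = line.split(";")
                return float(lat), float(lon)

        class ParsingFramework:
            """The general framework selects an engine at runtime."""
            def __init__(self, engine: GpsLogParser):
                self.engine = engine

            def parse(self, lines):
                return [self.engine.parse_line(l) for l in lines]

        # Swapping engines requires no change to the framework itself:
        track = ParsingFramework(SemicolonParser()).parse(["59.33;18.06", "59.34;18.07"])
        print(track)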

  17. ChemicalTagger: A tool for semantic text-mining in chemistry.

    PubMed

    Hawizy, Lezan; Jessop, David M; Adams, Nico; Murray-Rust, Peter

    2011-05-16

    The primary method for scientific communication is the publication of scientific articles and theses, which use natural language combined with domain-specific terminology. As such, they contain free-flowing unstructured text. Given the usefulness of data extraction from unstructured literature, we aim to show how this can be achieved for the discipline of chemistry. The highly formulaic style of writing most chemists adopt makes their contributions well suited to high-throughput Natural Language Processing (NLP) approaches. We have developed the ChemicalTagger parser as a medium-depth, phrase-based semantic NLP tool for the language of chemical experiments. Tagging is based on a modular architecture and uses a combination of OSCAR, domain-specific regex and English taggers to identify parts of speech. An ANTLR grammar is used to structure this into tree-based phrases. Using a metric that allows for overlapping annotations, we achieved machine-annotator agreements of 88.9% for phrase recognition and 91.9% for phrase-type identification (Action names). It is possible to parse chemical experimental text using rule-based techniques in conjunction with a formal grammar parser. ChemicalTagger has been deployed for over 10,000 patents and has identified solvents from their linguistic context with >99.5% precision.
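
    ChemicalTagger itself is a Java tool; the fragment below is only a toy Python illustration of the layering it describes, combining domain-specific regular-expression taggers with a fallback tag. The tag names and patterns are simplified assumptions.

        import re

        AMOUNT = re.compile(r"^\d+(\.\d+)?$")
        UNIT = re.compile(r"^(ml|mL|g|mg|mmol)$")
        ACTION = {"added", "stirred", "dissolved", "heated", "washed"}

        def tag_tokens(tokens):
            """Assign a coarse tag to each token; unknowns fall through to NN."""
            tagged = []
            for tok in tokens:
                if AMOUNT.match(tok):
                    tagged.append((tok, "CD"))         # cardinal number
                elif UNIT.match(tok):
                    tagged.append((tok, "NN-UNIT"))    # domain-specific unit tag
                elif tok.lower() in ACTION:
                    tagged.append((tok, "VB-ACTION"))  # experimental action verb
                else:
                    tagged.append((tok, "NN"))
            return tagged

        print(tag_tokens("The mixture was stirred in 20 mL of water".split()))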

  18. Accuracy and Tuning of Flow Parsing for Visual Perception of Object Motion During Self-Motion

    PubMed Central

    Niehorster, Diederick C.

    2017-01-01

    How do we perceive object motion during self-motion using visual information alone? Previous studies have reported that the visual system can use optic flow to identify and globally subtract the retinal motion component resulting from self-motion to recover scene-relative object motion, a process called flow parsing. In this article, we developed a retinal motion nulling method to directly measure and quantify the magnitude of flow parsing (i.e., the flow parsing gain) in various scenarios to examine the accuracy and tuning of flow parsing for the visual perception of object motion during self-motion. We found that flow parsing gains were below unity for all displays in all experiments, and that increasing self-motion and object motion speed did not alter the flow parsing gain. We conclude that visual information alone is not sufficient for the accurate perception of scene-relative motion during self-motion. Although flow parsing performs global subtraction, its accuracy also depends on local motion information in the retinal vicinity of the moving object. Furthermore, the flow parsing gain was constant across common self-motion and object motion speeds. These results can be used to inform and validate computational models of flow parsing. PMID:28567272

  19. The Fusion Model of Intelligent Transportation Systems Based on the Urban Traffic Ontology

    NASA Astrophysics Data System (ADS)

    Yang, Wang-Dong; Wang, Tao

    To unify the representation of urban transport information, this paper builds an urban traffic ontology and defines the rules and algebraic operations of semantic fusion at the ontology level, so that urban traffic information can be fused with semantic completeness and consistency. The paper thus takes advantage of the semantic completeness of the ontology to resolve problems such as ontology merging and equivalence verification in the semantic fusion of traffic information integration. Adding semantic fusion to urban transport information integration reduces the amount of data that must be integrated and enhances the efficiency and completeness of traffic information queries. Through the practical application of the intelligent traffic information integration platform of Changde city, the paper shows that ontology-based semantic fusion increases the effectiveness and efficiency of urban traffic information integration, reduces storage requirements, and improves query efficiency and information completeness.

  20. Structural and functional correlates for language efficiency in auditory word processing.

    PubMed

    Jung, JeYoung; Kim, Sunmi; Cho, Hyesuk; Nam, Kichun

    2017-01-01

    This study aims to provide a convergent understanding of the neural basis of auditory word processing efficiency using multimodal imaging. We investigated the structural and functional correlates of word processing efficiency in healthy individuals. We acquired two structural imaging modalities (T1-weighted imaging and diffusion tensor imaging) and functional magnetic resonance imaging (fMRI) during auditory word processing (phonological and semantic tasks). Our results showed that better phonological performance was predicted by greater thalamus activity. In contrast, better semantic performance was associated with less activation in the left posterior middle temporal gyrus (pMTG), supporting the neural efficiency hypothesis that better task performance requires less brain activation. Furthermore, our network analysis revealed that a semantic network including the left anterior temporal lobe (ATL), dorsolateral prefrontal cortex (DLPFC) and pMTG was correlated with semantic efficiency. Notably, this network operated in a neurally efficient manner during auditory word processing. Structurally, the DLPFC and cingulum contributed to word processing efficiency. The parietal cortex also showed a significant association with word processing efficiency. Our results demonstrated that two features of word processing efficiency, phonology and semantics, are supported by different brain regions and, importantly, that the way each region serves them differs according to the feature of word processing. Our findings suggest that word processing efficiency is achieved through the collaboration of multiple brain regions involved in language and general cognitive function, both structurally and functionally.

  1. Structural and functional correlates for language efficiency in auditory word processing

    PubMed Central

    Kim, Sunmi; Cho, Hyesuk; Nam, Kichun

    2017-01-01

    This study aims to provide a convergent understanding of the neural basis of auditory word processing efficiency using multimodal imaging. We investigated the structural and functional correlates of word processing efficiency in healthy individuals. We acquired two structural imaging modalities (T1-weighted imaging and diffusion tensor imaging) and functional magnetic resonance imaging (fMRI) during auditory word processing (phonological and semantic tasks). Our results showed that better phonological performance was predicted by greater thalamus activity. In contrast, better semantic performance was associated with less activation in the left posterior middle temporal gyrus (pMTG), supporting the neural efficiency hypothesis that better task performance requires less brain activation. Furthermore, our network analysis revealed that a semantic network including the left anterior temporal lobe (ATL), dorsolateral prefrontal cortex (DLPFC) and pMTG was correlated with semantic efficiency. Notably, this network operated in a neurally efficient manner during auditory word processing. Structurally, the DLPFC and cingulum contributed to word processing efficiency. The parietal cortex also showed a significant association with word processing efficiency. Our results demonstrated that two features of word processing efficiency, phonology and semantics, are supported by different brain regions and, importantly, that the way each region serves them differs according to the feature of word processing. Our findings suggest that word processing efficiency is achieved through the collaboration of multiple brain regions involved in language and general cognitive function, both structurally and functionally. PMID:28892503

  2. Hierarchical layered and semantic-based image segmentation using ergodicity map

    NASA Astrophysics Data System (ADS)

    Yadegar, Jacob; Liu, Xiaoqing

    2010-04-01

    Image segmentation plays a foundational role in image understanding and computer vision. Although great strides have been made and progress achieved on automatic/semi-automatic image segmentation algorithms, designing a generic, robust, and efficient image segmentation algorithm is still challenging. Human vision is still far superior compared to computer vision, especially in interpreting semantic meanings/objects in images. We present a hierarchical/layered semantic image segmentation algorithm that can automatically and efficiently segment images into hierarchical layered/multi-scaled semantic regions/objects with contextual topological relationships. The proposed algorithm bridges the gap between high-level semantics and low-level visual features/cues (such as color, intensity, edge, etc.) through utilizing a layered/hierarchical ergodicity map, where ergodicity is computed based on a space filling fractal concept and used as a region dissimilarity measurement. The algorithm applies a highly scalable, efficient, and adaptive Peano-Cesaro triangulation/tiling technique to decompose the given image into a set of similar/homogenous regions based on low-level visual cues in a top-down manner. The layered/hierarchical ergodicity map is built through a bottom-up region dissimilarity analysis. The recursive fractal sweep associated with the Peano-Cesaro triangulation provides efficient local multi-resolution refinement to any level of detail. The generated binary decomposition tree also provides efficient neighbor retrieval mechanisms for contextual topological object/region relationship generation. Experiments have been conducted within the maritime image environment where the segmented layered semantic objects include the basic level objects (i.e. sky/land/water) and deeper level objects in the sky/land/water surfaces. Experimental results demonstrate the proposed algorithm has the capability to robustly and efficiently segment images into layered semantic objects/regions with contextual topological relationships.

  3. Marginal Space Deep Learning: Efficient Architecture for Volumetric Image Parsing.

    PubMed

    Ghesu, Florin C; Krubasik, Edward; Georgescu, Bogdan; Singh, Vivek; Yefeng Zheng; Hornegger, Joachim; Comaniciu, Dorin

    2016-05-01

    Robust and fast solutions for anatomical object detection and segmentation support the entire clinical workflow, from diagnosis and patient stratification to therapy planning, intervention, and follow-up. Current state-of-the-art techniques for parsing volumetric medical image data are typically based on machine learning methods that exploit large annotated image databases. Two main challenges need to be addressed: the efficiency of scanning high-dimensional parametric spaces, and the need for representative image features, which currently require significant manual engineering effort. We propose a pipeline for object detection and segmentation in the context of volumetric image parsing, solving a two-step learning problem: anatomical pose estimation and boundary delineation. For this task we introduce Marginal Space Deep Learning (MSDL), a novel framework exploiting both the strengths of efficient object parametrization in hierarchical marginal spaces and the automated feature design of Deep Learning (DL) network architectures. In the 3D context, the application of deep learning systems is limited by the very high complexity of the parametrization. More specifically, 9 parameters are necessary to describe a restricted affine transformation in 3D, resulting in a prohibitive number (billions) of scanning hypotheses. The mechanism of marginal space learning provides excellent run-time performance by learning classifiers in clustered, high-probability regions in spaces of gradually increasing dimensionality. To further increase computational efficiency and robustness, in our system we learn sparse adaptive data sampling patterns that automatically capture the structure of the input. Given the object localization, we propose a DL-based active shape model to estimate the non-rigid object boundary. Experimental results are presented on the aortic valve in ultrasound using an extensive dataset of 2891 volumes from 869 patients, showing significant improvements of up to 45.2% over the state-of-the-art. To our knowledge, this is the first successful demonstration of the potential of DL for detection and segmentation in full 3D data with parametrized representations.
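
    As a toy illustration of the marginal space idea (not the authors' system), the sketch below searches a parameter space one stage at a time, keeping only the highest-scoring candidates per stage; the classifier is a stub and the grids are arbitrary stand-ins for translation, orientation, and scale axes.

        def classifier_score(candidate):
            """Stub standing in for a learned classifier's probability output."""
            return -sum(abs(v - 5) for v in candidate)   # toy objective, peaks at 5s

        def marginal_space_search(stage_grids, top_k=10):
            """Extend candidates stage by stage, pruning to top_k each time,
            instead of scanning the full joint parameter space."""
            candidates = [()]
            for grid in stage_grids:
                expanded = [c + (v,) for c in candidates for v in grid]
                candidates = sorted(expanded, key=classifier_score, reverse=True)[:top_k]
            return candidates

        # Three one-dimensional stages; the joint grid would have 1000 hypotheses,
        # but staged search scores far fewer:
        print(marginal_space_search([range(10), range(10), range(10)])[0])  # (5, 5, 5)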

  4. Automated extraction of Biomarker information from pathology reports.

    PubMed

    Lee, Jeongeun; Song, Hyun-Je; Yoon, Eunsil; Park, Seong-Bae; Park, Sung-Hye; Seo, Jeong-Wook; Park, Peom; Choi, Jinwook

    2018-05-21

    Pathology reports are written in free-text form, which precludes efficient data gathering. We aimed to overcome this limitation and design an automated system for extracting biomarker profiles from accumulated pathology reports. We designed a new data model for representing biomarker knowledge. The automated system parses immunohistochemistry reports based on a "slide paragraph" unit defined as a set of immunohistochemistry findings obtained for the same tissue slide. Pathology reports are parsed using context-free grammar for immunohistochemistry, and using a tree-like structure for surgical pathology. The performance of the approach was validated on manually annotated pathology reports of 100 randomly selected patients managed at Seoul National University Hospital. High F-scores were obtained for parsing biomarker name and corresponding test results (0.999 and 0.998, respectively) from the immunohistochemistry reports, compared to relatively poor performance for parsing surgical pathology findings. However, applying the proposed approach to our single-center dataset revealed information on 221 unique biomarkers, which represents a richer result than biomarker profiles obtained based on the published literature. Owing to the data representation model, the proposed approach can associate biomarker profiles extracted from an immunohistochemistry report with corresponding pathology findings listed in one or more surgical pathology reports. Term variations are resolved by normalization to corresponding preferred terms determined by expanded dictionary look-up and text similarity-based search. Our proposed approach for biomarker data extraction addresses key limitations regarding data representation and can handle reports prepared in the clinical setting, which often contain incomplete sentences, typographical errors, and inconsistent formatting.
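
    To make the context-free-grammar step concrete, here is a toy NLTK sketch (not the authors' actual grammar) that parses a finding such as "CD20 positive"; the production rules and vocabulary are illustrative assumptions.

        import nltk

        grammar = nltk.CFG.fromstring("""
          FINDING -> BIOMARKER RESULT
          BIOMARKER -> 'CD20' | 'Ki-67' | 'EGFR'
          RESULT -> 'positive' | 'negative' | 'equivocal'
        """)

        parser = nltk.ChartParser(grammar)
        for tree in parser.parse(["CD20", "positive"]):
            print(tree)   # (FINDING (BIOMARKER CD20) (RESULT positive))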

  5. voevent-parse: Parse, manipulate, and generate VOEvent XML packets

    NASA Astrophysics Data System (ADS)

    Staley, Tim D.

    2014-11-01

    voevent-parse, written in Python, parses, manipulates, and generates VOEvent XML packets; it is built atop lxml.objectify. Details of transients detected by many projects, including Fermi, Swift, and the Catalina Sky Survey, are currently made available as VOEvents, which will also be the standard alert format for future facilities such as LSST and SKA. However, working with XML and adhering to the sometimes lengthy VOEvent schema can be a tricky process. voevent-parse provides convenience routines for common tasks, while allowing the user to utilise the full power of the lxml library when required. An earlier version of voevent-parse was part of the pysovo (ascl:1411.002) library.
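
    A short usage sketch follows, based on the package's documented load-and-inspect pattern; the packet file name is a placeholder, and the exact element paths depend on the packet's contents.

        import voeventparse as vp

        with open("example_voevent.xml", "rb") as f:   # placeholder packet file
            packet = vp.load(f)                        # an lxml.objectify tree

        print(packet.attrib["ivorn"])                  # every VOEvent carries an IVORN id
        print(vp.prettystr(packet.Who))                # pretty-print one sub-element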

  6. Technique for information retrieval using enhanced latent semantic analysis generating rank approximation matrix by factorizing the weighted morpheme-by-document matrix

    DOEpatents

    Chew, Peter A; Bader, Brett W

    2012-10-16

    A technique for information retrieval includes parsing a corpus to identify a number of wordform instances within each document of the corpus. A weighted morpheme-by-document matrix is generated based at least in part on the number of wordform instances within each document of the corpus and based at least in part on a weighting function. The weighted morpheme-by-document matrix separately enumerates instances of stems and affixes. Additionally or alternatively, a term-by-term alignment matrix may be generated based at least in part on the number of wordform instances within each document of the corpus. At least one lower rank approximation matrix is generated by factorizing the weighted morpheme-by-document matrix and/or the term-by-term alignment matrix.
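
    In outline, the pipeline resembles standard latent semantic analysis: weight a term-by-document count matrix, then factorize it into a lower-rank approximation. The NumPy sketch below uses a common log-entropy weighting as an assumption; the patent's specific weighting function and morpheme-level bookkeeping are not reproduced.

        import numpy as np

        def lsa(counts: np.ndarray, rank: int) -> np.ndarray:
            """counts: term-by-document raw frequency matrix (terms x docs)."""
            row_sums = np.maximum(counts.sum(axis=1, keepdims=True), 1.0)
            p = counts / row_sums
            safe_p = np.where(p > 0, p, 1.0)               # avoid log(0)
            n_docs = counts.shape[1]
            entropy = 1.0 + (p * np.log(safe_p)).sum(axis=1) / np.log(n_docs)
            weighted = np.log1p(counts) * entropy[:, None]   # log-entropy weighting
            u, s, vt = np.linalg.svd(weighted, full_matrices=False)
            return (u[:, :rank] * s[:rank]) @ vt[:rank]      # rank-k approximation

        X = np.array([[2.0, 0.0, 1.0], [0.0, 3.0, 1.0], [1.0, 1.0, 0.0]])
        print(lsa(X, rank=2))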

  7. Collaborative E-Learning Using Semantic Course Blog

    ERIC Educational Resources Information Center

    Lu, Lai-Chen; Yeh, Ching-Long

    2008-01-01

    Collaborative e-learning delivers many enhancements to e-learning technology; it enables students to collaborate with each other and improves their learning efficiency. A semantic blog combines semantic Web and blog technology so that users can import, export, view, navigate, and query the blog. We developed a semantic course blog for collaborative…

  8. ChemicalTagger: A tool for semantic text-mining in chemistry

    PubMed Central

    2011-01-01

    Background The primary method for scientific communication is the publication of scientific articles and theses, which use natural language combined with domain-specific terminology. As such, they contain free-flowing unstructured text. Given the usefulness of data extraction from unstructured literature, we aim to show how this can be achieved for the discipline of chemistry. The highly formulaic style of writing most chemists adopt makes their contributions well suited to high-throughput Natural Language Processing (NLP) approaches. Results We have developed the ChemicalTagger parser as a medium-depth, phrase-based semantic NLP tool for the language of chemical experiments. Tagging is based on a modular architecture and uses a combination of OSCAR, domain-specific regex and English taggers to identify parts of speech. An ANTLR grammar is used to structure this into tree-based phrases. Using a metric that allows for overlapping annotations, we achieved machine-annotator agreements of 88.9% for phrase recognition and 91.9% for phrase-type identification (Action names). Conclusions It is possible to parse chemical experimental text using rule-based techniques in conjunction with a formal grammar parser. ChemicalTagger has been deployed for over 10,000 patents and has identified solvents from their linguistic context with >99.5% precision. PMID:21575201

  9. Benchmark eye movement effects during natural reading in autism spectrum disorder.

    PubMed

    Howard, Philippa L; Liversedge, Simon P; Benson, Valerie

    2017-01-01

    In 2 experiments, eye tracking methodology was used to assess on-line lexical, syntactic and semantic processing in autism spectrum disorder (ASD). In Experiment 1, lexical identification was examined by manipulating the frequency of target words. Both typically developed (TD) and ASD readers showed normal frequency effects, suggesting that the processes TD and ASD readers engage in to identify words are comparable. In Experiment 2, syntactic parsing and semantic interpretation requiring the on-line use of world knowledge were examined, by having participants read garden path sentences containing an ambiguous prepositional phrase. Both groups showed normal garden path effects when reading low-attached sentences and the time course of reading disruption was comparable between groups. This suggests that not only do ASD readers hold similar syntactic preferences to TD readers, but also that they use world knowledge on-line during reading. Together, these experiments demonstrate that the initial construction of sentence interpretation appears to be intact in ASD. However, the finding that ASD readers skip target words less often in Experiment 2, and take longer to read sentences during second pass for both experiments, suggests that they adopt a more cautious reading strategy and take longer to evaluate their sentence interpretation prior to making a manual response.

  10. Automatic information extraction from unstructured mammography reports using distributed semantics.

    PubMed

    Gupta, Anupama; Banerjee, Imon; Rubin, Daniel L

    2018-02-01

    To date, the methods developed for automated extraction of information from radiology reports are mainly rule-based or dictionary-based, and, therefore, require substantial manual effort to build these systems. Recent efforts to develop automated systems for entity detection have been undertaken, but little work has been done to automatically extract relations and their associated named entities in narrative radiology reports with accuracy comparable to rule-based methods. Our goal is to extract relations in an unsupervised way from radiology reports without specifying prior domain knowledge. We propose a hybrid approach for information extraction that combines a dependency-based parse tree with distributed semantics for generating structured information frames about particular findings/abnormalities from free-text mammography reports. The proposed IE system obtains an F1-score of 0.94 in terms of completeness of the content in the information frames, which outperforms a state-of-the-art rule-based system in this domain by a significant margin. The proposed system can be leveraged in a variety of applications, such as decision support and information retrieval, and may also easily scale to other radiology domains, since there is no need to tune the system with hand-crafted information extraction rules.
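
    A condensed sketch of the hybrid idea, combining a dependency parse with distributed word vectors to assign the modifiers of a finding to frame slots, is shown below using spaCy. The slot anchor words and the example sentence are illustrative assumptions, not the authors' lexicon.

        import spacy

        nlp = spacy.load("en_core_web_md")   # the md model ships word vectors
        SLOT_ANCHORS = {"size": nlp("measurement"), "location": nlp("region")}

        doc = nlp("An oval mass in the upper outer quadrant measuring 8 mm.")
        for token in doc:
            if token.head.text == "mass":    # modifiers attached to the finding term
                slot = max(SLOT_ANCHORS,
                           key=lambda s: SLOT_ANCHORS[s].similarity(nlp(token.text)))
                print(token.text, "->", slot)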

  11. Emotion regulation during threat: Parsing the time course and consequences of safety signal processing

    PubMed Central

    Hefner, Kathryn R.; Verona, Edelyn; Curtin, John J.

    2017-01-01

    Improved understanding of fear inhibition processes can inform the etiology and treatment of anxiety disorders. Safety signals can reduce fear to threat, but precise mechanisms remain unclear. Safety signals may acquire attentional salience and affective properties (e.g., relief) independent of the threat; alternatively, safety signals may only hold affective value in the presence of simultaneous threat. To clarify such mechanisms, an experimental paradigm assessed independent processing of threat and safety cues. Participants viewed a series of red and green words from two semantic categories. Shocks were administered following red words (cue+). No shocks followed green words (cue−). Words from one category were defined as safety signals (SS); no shocks were administered on cue+ trials. Words from the other (control) category did not provide information regarding shock administration. Threat (cue+ vs. cue−) and safety (SS+ vs. SS−) were fully crossed. Startle response and ERPs were recorded. Startle response was increased during cue+ versus cue−. Safety signals reduced startle response during cue+, but had no effect on startle response during cue−. ERP analyses (PD130 and P3) suggested that participants parsed threat and safety signal information in parallel. Motivated attention was not associated with safety signals in the absence of threat. Overall, these results confirm that fear can be reduced by safety signals. Furthermore, safety signals do not appear to hold inherent hedonic salience independent of their effect during threat. Instead, safety signals appear to enable participants to engage in effective top-down emotion regulatory processes. PMID:27088643

  12. Progress in recognizing typeset mathematics

    NASA Astrophysics Data System (ADS)

    Fateman, Richard J.; Tokuyasu, Taku A.

    1996-03-01

    Printed mathematics has a number of features which distinguish it from conventional text. These include structure in two dimensions (fractions, exponents, limits), frequent font changes, symbols with variable shape (quotient bars), and substantially differing notational conventions from source to source. When compounded with more generic problems such as noise and merged or broken characters, printed mathematics offers a challenging arena for recognition. Our project was initially driven by the goal of scanning and parsing some 5,000 pages of elaborate mathematics (tables of definite integrals). While our prototype system demonstrates success in translating noise-free typeset equations into Lisp expressions appropriate for further processing, a more semantic, top-down approach appears necessary for higher levels of performance. Such an approach may also ease the incorporation of these programs into a more general document-processing context. We intend to release our somewhat refined prototypes to the public as utility programs, in the hope that they will be of general use in the construction of custom OCR packages. These utilities are quite fast even as originally prototyped in Lisp, where they may be of particular interest to those working on 'intelligent' optical processing. Some routines have been rewritten in C++ as well. Additional programs providing formula recognition and parsing also form a part of this system. It is important, however, to realize that distinct, conflicting grammars are needed to cover variations in contemporary and historical typesetting, and thus a single simple solution is not possible.

  13. A natural language interface plug-in for cooperative query answering in biological databases.

    PubMed

    Jamil, Hasan M

    2012-06-11

    One of the many unique features of biological databases is that the mere existence of a ground data item is not always a precondition for a query response. It may be argued that from a biologist's standpoint, queries are not always best posed using a structured language. By this we mean that approximate and flexible responses to natural-language-like queries are well suited to this domain. This is partly due to biologists' tendency to seek simpler interfaces and partly due to the fact that questions in biology involve high-level concepts that are open to interpretations computed using sophisticated tools. In such highly interpretive environments, rigidly structured databases do not always perform well. In this paper, our goal is to propose a semantic correspondence plug-in to aid natural language query processing over arbitrary biological database schemas, with the aim of providing cooperative responses to queries tailored to users' interpretations. Natural language interfaces for databases are generally effective when they are tuned to the underlying database schema and its semantics. Therefore, changes in database schema become impossible to support, or a substantial reorganization cost must be absorbed to reflect any change. We leverage developments in natural language parsing, rule languages and ontologies, and data integration technologies to assemble a prototype query processor that is able to transform a natural language query into a semantically equivalent structured query over the database. We allow knowledge rules and their frequent modifications as part of the underlying database schema. The approach we adopt in our plug-in overcomes some of the serious limitations of many contemporary natural language interfaces, including support for schema modifications and independence from the underlying database schema. The plug-in introduced in this paper is generic and facilitates connecting user-selected natural language interfaces to arbitrary databases using a semantic description of the intended application. We demonstrate the feasibility of our approach with a practical example.

  14. Parsing and Quantification of Raw Orbitrap Mass Spectrometer Data Using RawQuant.

    PubMed

    Kovalchik, Kevin A; Moggridge, Sophie; Chen, David D Y; Morin, Gregg B; Hughes, Christopher S

    2018-06-01

    Effective analysis of protein samples by mass spectrometry (MS) requires careful selection and optimization of a range of experimental parameters. As the output from the primary detection device, the "raw" MS data file can be used to gauge the success of a given sample analysis. However, the closed-source nature of the standard raw MS file can complicate effective parsing of the data contained within. To ease and increase the range of analyses possible, the RawQuant tool was developed to enable parsing of raw MS files derived from Thermo Orbitrap instruments to yield meta and scan data in an openly readable text format. RawQuant can be commanded to export user-friendly files containing MS1, MS2, and MS3 metadata as well as matrices of quantification values based on isobaric tagging approaches. In this study, the utility of RawQuant is demonstrated in several scenarios: (1) reanalysis of shotgun proteomics data for the identification of the human proteome, (2) reanalysis of experiments utilizing isobaric tagging for whole-proteome quantification, and (3) analysis of a novel bacterial proteome and synthetic peptide mixture for assessing quantification accuracy when using isobaric tags. Together, these analyses successfully demonstrate RawQuant for the efficient parsing and quantification of data from raw Thermo Orbitrap MS files acquired in a range of common proteomics experiments. In addition, the individual analyses using RawQuant highlights parametric considerations in the different experimental sets and suggests targetable areas to improve depth of coverage in identification-focused studies and quantification accuracy when using isobaric tags.
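
    Since RawQuant writes its metadata and quantification matrices as plain-text tables, downstream analysis reduces to ordinary table handling; the pandas sketch below uses a hypothetical file name and hypothetical column naming.

        import pandas as pd

        # Hypothetical RawQuant export: one row per scan, tab-delimited.
        quant = pd.read_csv("sample_QuantMatrix.txt", sep="\t")
        print(quant.head())                           # inspect the exported scan rows
        print(quant.filter(like="Intensity").sum())   # total signal per reporter column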

  15. Statistical learning using real-world scenes: extracting categorical regularities without conscious intent.

    PubMed

    Brady, Timothy F; Oliva, Aude

    2008-07-01

    Recent work has shown that observers can parse streams of syllables, tones, or visual shapes and learn statistical regularities in them without conscious intent (e.g., learn that A is always followed by B). Here, we demonstrate that these statistical-learning mechanisms can operate at an abstract, conceptual level. In Experiments 1 and 2, observers incidentally learned which semantic categories of natural scenes covaried (e.g., kitchen scenes were always followed by forest scenes). In Experiments 3 and 4, category learning with images of scenes transferred to words that represented the categories. In each experiment, the category of the scenes was irrelevant to the task. Together, these results suggest that statistical-learning mechanisms can operate at a categorical level, enabling generalization of learned regularities using existing conceptual knowledge. Such mechanisms may guide learning in domains as disparate as the acquisition of causal knowledge and the development of cognitive maps from environmental exploration.

  16. Semantic e-Learning: Next Generation of e-Learning?

    NASA Astrophysics Data System (ADS)

    Konstantinos, Markellos; Penelope, Markellou; Giannis, Koutsonikos; Aglaia, Liopa-Tsakalidi

    Semantic e-learning aspires to be the next generation of e-learning, since the understanding of learning materials and knowledge semantics allows their advanced representation, manipulation, sharing, exchange and reuse, and ultimately promotes efficient online experiences for users. In this context, the paper first explores some fundamental Semantic Web technologies and then discusses current and potential applications of these technologies in the e-learning domain, namely Semantic portals, Semantic search, personalization, recommendation systems, social software and Web 2.0 tools. Finally, it highlights future research directions and open issues of the field.

  17. Developing a modular architecture for creation of rule-based clinical diagnostic criteria.

    PubMed

    Hong, Na; Pathak, Jyotishman; Chute, Christopher G; Jiang, Guoqian

    2016-01-01

    With recent advances in computerized patient record systems, there is an urgent need for producing computable and standards-based clinical diagnostic criteria. Notably, constructing rule-based clinical diagnosis criteria has become one of the goals of the International Classification of Diseases (ICD)-11 revision. However, few studies have been done on building a unified architecture to support the need for diagnostic criteria computerization. In this study, we present a modular architecture for enabling the creation of rule-based clinical diagnostic criteria leveraging Semantic Web technologies. The architecture consists of two modules: an authoring module that utilizes a standards-based information model and a translation module that leverages the Semantic Web Rule Language (SWRL). In a prototype implementation, we created a diagnostic criteria upper ontology (DCUO) that integrates the ICD-11 content model with the Quality Data Model (QDM). Using the DCUO, we developed a transformation tool that converts QDM-based diagnostic criteria into SWRL representation. We evaluated the domain coverage of the upper ontology model using randomly selected diagnostic criteria from broad domains (n = 20). We also tested the transformation algorithms using 6 QDM templates for ontology population and 15 QDM-based criteria for rule generation. In the results, the first draft of the DCUO contains 14 root classes, 21 subclasses, 6 object properties and 1 data property. Investigation Findings, and Signs and Symptoms are the two most commonly used element types. All 6 HQMF templates were successfully parsed and populated into their corresponding domain-specific ontologies, and 14 of the 15 rules (93.3%) passed rule validation. Our efforts in developing and prototyping a modular architecture provide useful insight into how to build a scalable solution to support diagnostic criteria representation and computerization.

  18. Establishing semantic interoperability of biomedical metadata registries using extended semantic relationships.

    PubMed

    Park, Yu Rang; Yoon, Young Jo; Kim, Hye Hyeon; Kim, Ju Han

    2013-01-01

    Achieving semantic interoperability is critical for biomedical data sharing between individuals, organizations and systems. The ISO/IEC 11179 MetaData Registry (MDR) standard has been recognized as one of the solutions for this purpose. The standard model, however, is limited: representing concepts that consist of two or more values, for instance blood pressure with systolic and diastolic values, is not allowed. We addressed the structural limitations of ISO/IEC 11179 with an integrated metadata object model in our previous research. In the present study, we introduce semantic extensions for the model by defining three new types of semantic relationships: dependency, composite and variable relationships. To evaluate our extensions in a real-world setting, we measured the efficiency of metadata reduction by mapping to existing metadata. We extracted metadata from the College of American Pathologists Cancer Protocols and then evaluated our extensions. With no semantic loss, one third of the extracted metadata could be successfully eliminated, suggesting a better strategy for implementing clinical MDRs with improved efficiency and utility.

  19. Automatic extraction of property norm-like data from large text corpora.

    PubMed

    Kelly, Colin; Devereux, Barry; Korhonen, Anna

    2014-01-01

    Traditional methods for deriving property-based representations of concepts from text have focused on either extracting only a subset of possible relation types, such as hyponymy/hypernymy (e.g., car is-a vehicle) or meronymy/metonymy (e.g., car has wheels), or unspecified relations (e.g., car--petrol). We propose a system for the challenging task of automatic, large-scale acquisition of unconstrained, human-like property norms from large text corpora, and discuss the theoretical implications of such a system. We employ syntactic, semantic, and encyclopedic information to guide our extraction, yielding concept-relation-feature triples (e.g., car be fast, car require petrol, car cause pollution), which approximate property-based conceptual representations. Our novel method extracts candidate triples from parsed corpora (Wikipedia and the British National Corpus) using syntactically and grammatically motivated rules, then reweights triples with a linear combination of their frequency and four statistical metrics. We assess our system output in three ways: lexical comparison with norms derived from human-generated property norm data, direct evaluation by four human judges, and a semantic distance comparison with both WordNet similarity data and human-judged concept similarity ratings. Our system offers a viable and performant method of plausible triple extraction: Our lexical comparison shows comparable performance to the current state-of-the-art, while subsequent evaluations exhibit the human-like character of our generated properties.
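
    Schematically, the reweighting step combines a triple's corpus frequency with association metrics in a linear combination; in the sketch below, the weights and the single PMI metric are illustrative assumptions, not the authors' exact four statistical metrics.

        import math

        def pmi(joint: float, p_a: float, p_b: float) -> float:
            """Pointwise mutual information between two events."""
            return math.log2(joint / (p_a * p_b))

        def score_triple(freq: int, metrics: list, weights: list,
                         w_freq: float = 0.5) -> float:
            """Linear combination of log-frequency and metric values."""
            return w_freq * math.log1p(freq) + sum(w * m for w, m in zip(weights, metrics))

        # Hypothetical numbers for the triple ("car", "require", "petrol"):
        m = [pmi(joint=0.001, p_a=0.01, p_b=0.02)]
        print(score_triple(freq=42, metrics=m, weights=[1.0]))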

  20. Efficient Inference for Trees and Alignments: Modeling Monolingual and Bilingual Syntax with Hard and Soft Constraints and Latent Variables

    ERIC Educational Resources Information Center

    Smith, David Arthur

    2010-01-01

    Much recent work in natural language processing treats linguistic analysis as an inference problem over graphs. This development opens up useful connections between machine learning, graph theory, and linguistics. The first part of this dissertation formulates syntactic dependency parsing as a dynamic Markov random field with the novel…

  1. Semantic and Phonological Encoding in Adults Who Stutter: Silent Responses to Pictorial Stimuli

    ERIC Educational Resources Information Center

    Vincent, Irena

    2017-01-01

    Purpose: Research on language planning in adult stuttering is relatively sparse and offers diverging arguments about a potential causative relationship between semantic and phonological encoding and fluency breakdowns. This study further investigated semantic and phonological encoding efficiency in adults who stutter (AWS) by means of silent…

  2. Image portion identification methods, image parsing methods, image parsing systems, and articles of manufacture

    DOEpatents

    Lassahn, Gordon D.; Lancaster, Gregory D.; Apel, William A.; Thompson, Vicki S.

    2013-01-08

    Image portion identification methods, image parsing methods, image parsing systems, and articles of manufacture are described. According to one embodiment, an image portion identification method includes accessing data regarding an image depicting a plurality of biological substrates corresponding to at least one biological sample and indicating presence of at least one biological indicator within the biological sample and, using processing circuitry, automatically identifying a portion of the image depicting one of the biological substrates but not others of the biological substrates.

  3. Towards a semantic lexicon for biological language processing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verspoor, K.

    It is well understood that natural language processing (NLP) applications require sophisticated lexical resources to support their processing goals. In the biomedical domain, we are privileged to have access to extensive terminological resources in the form of controlled vocabularies and ontologies, which have been integrated into the framework of the National Library of Medicine's Unified Medical Language System's (UMLS) Metathesaurus. However, the existence of such terminological resources does not guarantee their utility for NLP. In particular, we have two core requirements for lexical resources for NLP in addition to the basic enumeration of important domain terms: representation of morphosyntactic information about those terms, specifically part-of-speech information and inflectional patterns to support parsing and lemma assignment, and representation of semantic information indicating general categorical information about terms, and significant relations between terms to support text understanding and inference (Hahn et al., 1999). Biomedical vocabularies by and large leave out morphosyntactic information, and where they address semantic considerations, they often do so in an unprincipled manner, for instance by indicating a relation between two concepts without indicating the type of that relation. But all is not lost. The UMLS knowledge sources include two additional resources which are relevant: the SPECIALIST lexicon, a lexicon addressing our morphosyntactic requirements, and the Semantic Network, a representation of core conceptual categories in the biomedical domain. The coverage of these two knowledge sources with respect to the full coverage of the Metathesaurus is, however, not entirely clear. Furthermore, when our goals are specifically to process biological text - and often more specifically, text in the molecular biology domain - it is difficult to say whether the coverage of these resources is meaningful. The utility of the UMLS knowledge sources for medical language processing (MLP) has been explored (Johnson, 1999; Friedman et al., 2001); the time has now come to repeat these experiments with respect to biological language processing (BLP). To that end, this paper presents an analysis of the UMLS resources, specifically with an eye towards constructing lexical resources suitable for BLP. We follow the paradigm presented in Johnson (1999) for medical language, exploring overlap between the UMLS Metathesaurus and SPECIALIST lexicon to construct a morphosyntactic and semantically-specified lexicon, and then further explore the overlap with a relevant domain corpus for molecular biology.

  4. Challenges Facing the Semantic Web and Social Software as Communication Technology Agents in E-Learning Environments

    ERIC Educational Resources Information Center

    Olaniran, Bolanle A.

    2010-01-01

    The semantic web describes the process whereby information content is made available for machine consumption. With increased reliance on information communication technologies, the semantic web promises effective and efficient information acquisition and dissemination of products and services in the global economy, in particular, e-learning.…

  5. EIIS: An Educational Information Intelligent Search Engine Supported by Semantic Services

    ERIC Educational Resources Information Center

    Huang, Chang-Qin; Duan, Ru-Lin; Tang, Yong; Zhu, Zhi-Ting; Yan, Yong-Jian; Guo, Yu-Qing

    2011-01-01

    The semantic web brings a new opportunity for efficient information organization and search. To meet the special requirements of the educational field, this paper proposes an intelligent search engine enabled by educational semantic support service, where three kinds of searches are integrated into Educational Information Intelligent Search (EIIS)…

  6. Semantic Agent-Based Service Middleware and Simulation for Smart Cities

    PubMed Central

    Liu, Ming; Xu, Yang; Hu, Haixiao; Mohammed, Abdul-Wahid

    2016-01-01

    With the development of Machine-to-Machine (M2M) technology, a variety of embedded and mobile devices are integrated to interact via the platform of the Internet of Things, especially in the domain of smart cities. One of the primary challenges is that selecting the appropriate services or service combination for upper-layer applications is hard, due to the absence of both a unified semantic service description pattern and a service selection mechanism. In this paper, we define a semantic service representation model with four key properties: Capability (C), Deployment (D), Resource (R) and IOData (IO). Based on this model, an agent-based middleware is built to support semantic service enablement. In this middleware, we present an efficient semantic service discovery and matching approach for the service combination process, which calculates the semantic similarity between services, and a heuristic algorithm to search the service candidates for a specific service request. Based on this design, we propose a simulation of virtual urban fire fighting, and the experimental results demonstrate the feasibility and efficiency of our design. PMID:28009818
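
    As a rough illustration of matching a service request against candidates over the four property sets (C, D, R, IO), the sketch below uses Jaccard overlap as a stand-in for the paper's semantic similarity measure; all names and property values are hypothetical.

        def jaccard(a: set, b: set) -> float:
            """Set overlap as a crude similarity proxy."""
            return len(a & b) / len(a | b) if a | b else 0.0

        def service_similarity(request: dict, service: dict) -> float:
            """Average the per-property overlaps across C, D, R and IO."""
            props = ("C", "D", "R", "IO")
            return sum(jaccard(request[p], service[p]) for p in props) / len(props)

        request = {"C": {"detect_fire"}, "D": {"district_3"},
                   "R": {"camera"}, "IO": {"alarm"}}
        candidate = {"C": {"detect_fire", "detect_smoke"}, "D": {"district_3"},
                     "R": {"camera", "gpu"}, "IO": {"alarm"}}
        print(service_similarity(request, candidate))   # rank candidates by this score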

  7. Semantic Agent-Based Service Middleware and Simulation for Smart Cities.

    PubMed

    Liu, Ming; Xu, Yang; Hu, Haixiao; Mohammed, Abdul-Wahid

    2016-12-21

    With the development of Machine-to-Machine (M2M) technology, a variety of embedded and mobile devices are integrated to interact via the platform of the Internet of Things, especially in the domain of smart cities. One of the primary challenges is that selecting the appropriate services or service combination for upper-layer applications is hard, due to the absence of both a unified semantic service description pattern and a service selection mechanism. In this paper, we define a semantic service representation model with four key properties: Capability (C), Deployment (D), Resource (R) and IOData (IO). Based on this model, an agent-based middleware is built to support semantic service enablement. In this middleware, we present an efficient semantic service discovery and matching approach for the service combination process, which calculates the semantic similarity between services, and a heuristic algorithm to search the service candidates for a specific service request. Based on this design, we propose a simulation of virtual urban fire fighting, and the experimental results demonstrate the feasibility and efficiency of our design.

  8. Lexical Familiarity and Processing Efficiency: Individual Differences in Naming, Lexical Decision, and Semantic Categorization

    PubMed Central

    Lewellen, Mary Jo; Goldinger, Stephen D.; Pisoni, David B.; Greene, Beth G.

    2012-01-01

    College students were separated into 2 groups (high and low) on the basis of 3 measures: subjective familiarity ratings of words, self-reported language experiences, and a test of vocabulary knowledge. Three experiments were conducted to determine if the groups also differed in visual word naming, lexical decision, and semantic categorization. High Ss were consistently faster than low Ss in naming visually presented words. They were also faster and more accurate in making difficult lexical decisions and in rejecting homophone foils in semantic categorization. Taken together, the results demonstrate that Ss who differ in lexical familiarity also differ in processing efficiency. The relationship between processing efficiency and working memory accounts of individual differences in language processing is also discussed. PMID:8371087

  9. Perceptual learning improves visual performance in juvenile amblyopia.

    PubMed

    Li, Roger W; Young, Karen G; Hoenig, Pia; Levi, Dennis M

    2005-09-01

    To determine whether practicing a position-discrimination task improves visual performance in children with amblyopia and to determine the mechanism(s) of improvement. Five children (age range, 7-10 years) with amblyopia practiced a positional acuity task in which they had to judge which of three pairs of lines was misaligned. Positional noise was produced by distributing the individual patches of each line segment according to a Gaussian probability function. Observers were trained at three noise levels (including 0), with each observer performing between 3000 and 4000 responses in 7 to 10 sessions. Trial-by-trial feedback was provided. Four of the five observers showed significant improvement in positional acuity. In those four observers, on average, positional acuity with no noise improved by approximately 32% and with high noise by approximately 26%. A position-averaging model was used to parse the improvement into an increase in efficiency or a decrease in equivalent input noise. Two observers showed increased efficiency (51% and 117% improvements) with no significant change in equivalent input noise across sessions. The other two observers showed both a decrease in equivalent input noise (18% and 29%) and an increase in efficiency (17% and 71%). All five observers showed substantial improvement in Snellen acuity (approximately 26%) after practice. Perceptual learning can improve visual performance in amblyopic children. The improvement can be parsed into two important factors: decreased equivalent input noise and increased efficiency. Perceptual learning techniques may add an effective new method to the armamentarium of amblyopia treatments.

  10. MEG Evidence for Incremental Sentence Composition in the Anterior Temporal Lobe

    ERIC Educational Resources Information Center

    Brennan, Jonathan R.; Pylkkänen, Liina

    2017-01-01

    Research investigating the brain basis of language comprehension has associated the left anterior temporal lobe (ATL) with sentence-level combinatorics. Using magnetoencephalography (MEG), we test the parsing strategy implemented in this brain region. The number of incremental parse steps from a predictive left-corner parsing strategy that is…

  11. Natural speech reveals the semantic maps that tile human cerebral cortex

    PubMed Central

    Huth, Alexander G.; de Heer, Wendy A.; Griffiths, Thomas L.; Theunissen, Frédéric E.; Gallant, Jack L.

    2016-01-01

    The meaning of language is represented in regions of the cerebral cortex collectively known as the “semantic system”. However, little of the semantic system has been mapped comprehensively, and the semantic selectivity of most regions is unknown. Here we systematically map semantic selectivity across the cortex using voxel-wise modeling of fMRI data collected while subjects listened to hours of narrative stories. We show that the semantic system is organized into intricate patterns that appear consistent across individuals. We then use a novel generative model to create a detailed semantic atlas. Our results suggest that most areas within the semantic system represent information about specific semantic domains, or groups of related concepts, and our atlas shows which domains are represented in each area. This study demonstrates that data-driven methods—commonplace in studies of human neuroanatomy and functional connectivity—provide a powerful and efficient means for mapping functional representations in the brain. PMID:27121839

  12. A semantic medical multimedia retrieval approach using ontology information hiding.

    PubMed

    Guo, Kehua; Zhang, Shigeng

    2013-01-01

    Searching for useful information in unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach that can reflect users' query intent. First, semantic annotations are given to the multimedia documents in the medical multimedia database. Second, the ontology representing the semantic information is hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches.

  13. Research on polarization imaging information parsing method

    NASA Astrophysics Data System (ADS)

    Yuan, Hongwu; Zhou, Pucheng; Wang, Xiaolong

    2016-11-01

    Polarization information parsing plays an important role in polarization imaging detection. This paper focuses on methods for parsing polarization information. First, the general process of polarization information parsing is given, comprising polarization image preprocessing, calculation of multiple polarization parameters, polarization image fusion and polarization image tracking. The research achievements for each stage are then presented. For polarization image preprocessing, a polarization image registration method based on maximum mutual information is designed; experiments show that this method improves registration precision and satisfies the needs of polarization information parsing. For the calculation of multiple polarization parameters, an omnidirectional polarization inversion model is built, from which a variety of polarization parameter images are obtained with markedly improved inversion precision. For polarization image fusion, an adaptive optimal fusion method for multiple polarization parameters is given using fuzzy integrals and sparse representation, and target detection in complex scenes is accomplished with a clustering-based image segmentation algorithm built on fractal characteristics. For polarization image tracking, a mean-shift tracking algorithm fused with auxiliary particle filtering over polarization image characteristics is put forward to achieve smooth tracking of moving targets. Finally, the polarization information parsing method is applied to the polarization imaging detection of typical targets such as camouflaged targets, fog and latent fingerprints.
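
    For concreteness, the standard calculation of two common polarization parameter images, degree of linear polarization (DoLP) and angle of polarization (AoP), from four polarizer-angle intensity images is sketched below; this is textbook Stokes algebra, not the paper's inversion model.

      import numpy as np

      def polarization_parameters(i0, i45, i90, i135):
          # Linear Stokes components from intensities at 0/45/90/135 degrees.
          s0 = (i0 + i45 + i90 + i135) / 2.0      # total intensity
          s1 = i0 - i90                           # horizontal vs. vertical
          s2 = i45 - i135                         # +45 vs. -45 degrees
          dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-12)
          aop = 0.5 * np.arctan2(s2, s1)          # radians
          return dolp, aop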

  14. Comparing different kinds of words and word-word relations to test an habituation model of priming.

    PubMed

    Rieth, Cory A; Huber, David E

    2017-06-01

    Huber and O'Reilly (2003) proposed that neural habituation exists to solve a temporal parsing problem, minimizing blending between one word and the next when words are visually presented in rapid succession. They developed a neural dynamics habituation model, explaining the finding that short duration primes produce positive priming whereas long duration primes produce negative repetition priming. The model contains three layers of processing, including a visual input layer, an orthographic layer, and a lexical-semantic layer. The predicted effect of prime duration depends both on this assumed representational hierarchy and the assumption that synaptic depression underlies habituation. The current study tested these assumptions by comparing different kinds of words (e.g., words versus non-words) and different kinds of word-word relations (e.g., associative versus repetition). For each experiment, the predictions of the original model were compared to an alternative model with different representational assumptions. Experiment 1 confirmed the prediction that non-words and inverted words require longer prime durations to eliminate positive repetition priming (i.e., a slower transition from positive to negative priming). Experiment 2 confirmed the prediction that associative priming increases and then decreases with increasing prime duration, but remains positive even with long duration primes. Experiment 3 replicated the effects of repetition and associative priming using a within-subjects design and combined these effects by examining target words that were expected to repeat (e.g., viewing the target word 'BACK' after the prime phrase 'back to'). These results support the originally assumed representational hierarchy and more generally the role of habituation in temporal parsing and priming. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. The taxonomic name resolution service: an online tool for automated standardization of plant names

    PubMed Central

    2013-01-01

    Background The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science. Results The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets. Conclusions We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/. PMID:23324024
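
    A minimal sketch of the fuzzy-matching step (the TNRS itself layers name parsing, authority handling and multiple reference taxonomies on top of this): match a possibly misspelled binomial against a reference list. The toy reference list and the cutoff are assumptions.

      import difflib

      REFERENCE = ["Quercus alba", "Quercus rubra", "Acer saccharum",
                   "Pinus strobus"]

      def standardize(name, reference=REFERENCE, cutoff=0.8):
          # Best fuzzy match above the cutoff, else None (no correction).
          hits = difflib.get_close_matches(name, reference, n=1, cutoff=cutoff)
          return hits[0] if hits else None

      print(standardize("Quercus albaa"))   # -> 'Quercus alba'
      print(standardize("Fagus grandif"))   # -> None (absent from toy list)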

  16. Using domain knowledge and domain-inspired discourse model for coreference resolution for clinical narratives

    PubMed Central

    Roth, Dan

    2013-01-01

    Objective This paper presents a coreference resolution system for clinical narratives. Coreference resolution aims at clustering all mentions in a single document into coherent entities. Materials and methods A knowledge-intensive approach to coreference resolution is employed. The domain knowledge used includes several domain-specific lists, knowledge-intensive mention parsing, and a task-informed discourse model. Mention parsing allows us to abstract over the surface form of the mention and represent each mention using a higher-level representation, which we call the mention's semantic representation (SR). The SR reduces the mention to a standard form and hence provides better support for comparing and matching. Existing coreference resolution systems tend to ignore discourse aspects and rely heavily on lexical and structural cues in the text. The authors break from this tradition and present a discourse model for "person" type mentions in clinical narratives, which greatly simplifies the coreference resolution. Results This system was evaluated on four different datasets which were made available in the 2011 i2b2/VA coreference challenge. The unweighted average of F1 scores (over B-cubed, MUC and CEAF) varied from 84.2% to 88.1%. These experiments show that domain knowledge is effective for different mention types across all the datasets. Discussion Error analysis shows that most of the recall errors made by the system can be handled by further addition of domain knowledge. The precision errors, on the other hand, are more subtle and indicate the need to understand the relations in which mentions participate in order to build a robust coreference system. Conclusion This paper presents an approach that makes extensive use of domain knowledge to significantly improve coreference resolution. The authors state that their system and the knowledge sources developed will be made publicly available. PMID:22781192

  17. Automated Program Recognition by Graph Parsing

    DTIC Science & Technology

    1992-07-01

    The recognition of standard computational structures (clichés) in a program can help an experienced programmer understand the program. Based on the known relationships between the clichés, a… [excerpt truncated; the remainder of the record is table-of-contents residue covering Structure-Sharing (3.4.1), Aggregation (3.4.2), and Chart Parsing Flow (3.5)]

  18. Aligning Event Logs to Task-Time Matrix Clinical Pathways in BPMN for Variance Analysis.

    PubMed

    Yan, Hui; Van Gorp, Pieter; Kaymak, Uzay; Lu, Xudong; Ji, Lei; Chiau, Choo Chiap; Korsten, Hendrikus H M; Duan, Huilong

    2018-03-01

    Clinical pathways (CPs) are popular healthcare management tools to standardize care and ensure quality. Analyzing CP compliance levels and variances is known to be useful for training and CP redesign purposes. The flexible semantics of the Business Process Model and Notation (BPMN) language has been shown to be useful for the modeling and analysis of complex protocols. However, in practical cases one may want to exploit the fact that CPs often have the form of task-time matrices. This paper presents a new method that parses complex BPMN models and aligns traces to the models heuristically. A case study on variance analysis is undertaken, using a CP from practice and two large sets of patient data from an electronic medical record (EMR) database. The results demonstrate that automated variance analysis between BPMN task-time models and real-life EMR data is feasible, which was not the case for existing analysis techniques. We also provide meaningful insights for further improvement.
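
    As a sketch of the alignment idea (the paper's heuristic handles full BPMN semantics, gateways and timing), the snippet below aligns an observed event trace to a pathway's ordered task list with a unit-cost global alignment; the task names are hypothetical.

      def deviation_count(trace, pathway):
          # Needleman-Wunsch-style edit distance between an event trace
          # and the ordered task list of a clinical pathway.
          n, m = len(trace), len(pathway)
          d = [[0] * (m + 1) for _ in range(n + 1)]
          for i in range(n + 1):
              d[i][0] = i
          for j in range(m + 1):
              d[0][j] = j
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  sub = 0 if trace[i - 1] == pathway[j - 1] else 1
                  d[i][j] = min(d[i - 1][j] + 1,        # extra logged event
                                d[i][j - 1] + 1,        # skipped pathway task
                                d[i - 1][j - 1] + sub)  # match / substitution
          return d[n][m]

      print(deviation_count(["admit", "xray", "surgery"],
                            ["admit", "lab", "surgery"]))   # -> 1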

  19. Structure before Meaning: Sentence Processing, Plausibility, and Subcategorization

    PubMed Central

    Kizach, Johannes; Nyvad, Anne Mette; Christensen, Ken Ramshøj

    2013-01-01

    Natural language processing is a fast and automatized process. A crucial part of this process is parsing, the online incremental construction of a syntactic structure. The aim of this study was to test whether a wh-filler extracted from an embedded clause is initially attached as the object of the matrix verb with subsequent reanalysis, and if so, whether the plausibility of such an attachment has an effect on reaction time. Finally, we wanted to examine whether subcategorization plays a role. We used a method called G-Maze to measure response time in a self-paced reading design. The experiments confirmed that there is early attachment of fillers to the matrix verb. When this attachment is implausible, the off-line acceptability of the whole sentence is significantly reduced. The on-line results showed that G-Maze was highly suited for this type of experiment. In accordance with our predictions, the results suggest that the parser ignores (or has no access to information about) implausibility and attaches fillers as soon as possible to the matrix verb. However, the results also show that the parser uses the subcategorization frame of the matrix verb. In short, the parser ignores semantic information and allows implausible attachments but adheres to information about which type of object a verb can take, ensuring that the parser does not make impossible attachments. We argue that the evidence supports a syntactic parser informed by syntactic cues, rather than one guided by semantic cues or one that is blind, or completely autonomous. PMID:24116101

  20. Structure before meaning: sentence processing, plausibility, and subcategorization.

    PubMed

    Kizach, Johannes; Nyvad, Anne Mette; Christensen, Ken Ramshøj

    2013-01-01

    Natural language processing is a fast and automatized process. A crucial part of this process is parsing, the online incremental construction of a syntactic structure. The aim of this study was to test whether a wh-filler extracted from an embedded clause is initially attached as the object of the matrix verb with subsequent reanalysis, and if so, whether the plausibility of such an attachment has an effect on reaction time. Finally, we wanted to examine whether subcategorization plays a role. We used a method called G-Maze to measure response time in a self-paced reading design. The experiments confirmed that there is early attachment of fillers to the matrix verb. When this attachment is implausible, the off-line acceptability of the whole sentence is significantly reduced. The on-line results showed that G-Maze was highly suited for this type of experiment. In accordance with our predictions, the results suggest that the parser ignores (or has no access to information about) implausibility and attaches fillers as soon as possible to the matrix verb. However, the results also show that the parser uses the subcategorization frame of the matrix verb. In short, the parser ignores semantic information and allows implausible attachments but adheres to information about which type of object a verb can take, ensuring that the parser does not make impossible attachments. We argue that the evidence supports a syntactic parser informed by syntactic cues, rather than one guided by semantic cues or one that is blind, or completely autonomous.

  1. A Semantic Medical Multimedia Retrieval Approach Using Ontology Information Hiding

    PubMed Central

    Guo, Kehua; Zhang, Shigeng

    2013-01-01

    Searching for useful information in unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach that can reflect users' query intent. First, semantic annotations are given to the multimedia documents in the medical multimedia database. Second, the ontology representing the semantic information is hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches. PMID:24082915
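
    A generic sketch of "hiding" an ontology annotation in a document head; real media containers have their own metadata segments, and the magic marker and layout here are invented purely for illustration.

      import json, struct

      MAGIC = b"ONT0"   # hypothetical marker, not a real container format

      def embed(payload: bytes, annotation: dict) -> bytes:
          # Prepend a length-prefixed JSON ontology block to the payload.
          blob = json.dumps(annotation).encode("utf-8")
          return MAGIC + struct.pack(">I", len(blob)) + blob + payload

      def extract(doc: bytes):
          # Recover (annotation, payload) from an annotated document.
          assert doc[:4] == MAGIC
          (size,) = struct.unpack(">I", doc[4:8])
          return json.loads(doc[8:8 + size]), doc[8 + size:]

      doc = embed(b"<binary image data>", {"modality": "MRI", "region": "liver"})
      meta, _ = extract(doc)
      print(meta["region"])   # -> liver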

  2. Using semantic data modeling techniques to organize an object-oriented database for extending the mass storage model

    NASA Technical Reports Server (NTRS)

    Campbell, William J.; Short, Nicholas M., Jr.; Roelofs, Larry H.; Dorfman, Erik

    1991-01-01

    A methodology for optimizing organization of data obtained by NASA earth and space missions is discussed. The methodology uses a concept based on semantic data modeling techniques implemented in a hierarchical storage model. The modeling is used to organize objects in mass storage devices, relational database systems, and object-oriented databases. The semantic data modeling at the metadata record level is examined, including the simulation of a knowledge base and semantic metadata storage issues. The semantic data model hierarchy and its application for efficient data storage is addressed, as is the mapping of the application structure to the mass storage.

  3. High-frequency neural activity predicts word parsing in ambiguous speech streams.

    PubMed

    Kösem, Anne; Basirat, Anahita; Azizi, Leila; van Wassenhove, Virginie

    2016-12-01

    During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept. Copyright © 2016 the American Physiological Society.

  4. High-frequency neural activity predicts word parsing in ambiguous speech streams

    PubMed Central

    Basirat, Anahita; Azizi, Leila; van Wassenhove, Virginie

    2016-01-01

    During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept. PMID:27605528

  5. A suffix arrays based approach to semantic search in P2P systems

    NASA Astrophysics Data System (ADS)

    Shi, Qingwei; Zhao, Zheng; Bao, Hu

    2007-09-01

    Building a semantic search system on top of peer-to-peer (P2P) networks is becoming an attractive and promising alternative for reasons of scalability, data freshness and search cost. In this paper, we present a Suffix Arrays based algorithm for Semantic Search (SASS) in P2P systems, which generates a distributed Semantic Overlay Network (SON) construction for full-text search in P2P networks. For each node in the P2P network, SASS distributes document indices based on a set of suffix arrays, by which clusters are created depending on words or phrases shared between documents; therefore, the search cost for a given query is decreased by scanning only semantically related documents. In contrast to recently announced SON schemes designed using metadata or predefined classes, SASS is an unsupervised approach for decentralized generation of SONs. SASS is also an incremental, linear-time algorithm, which efficiently handles the problem of node updates in P2P networks. Our simulation results demonstrate that SASS yields high search efficiency in dynamic environments.
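
    The indexing primitive is easy to sketch: a naive suffix array plus binary search finds every occurrence of a phrase, the kind of shared-substring evidence SASS clusters on. The construction below is the simple O(n^2 log n) form, not the paper's incremental algorithm.

      def suffix_array(text):
          # Sort suffix start positions lexicographically (naive build).
          return sorted(range(len(text)), key=lambda i: text[i:])

      def occurrences(text, sa, pattern):
          # Binary search for the leftmost suffix >= pattern, then scan.
          m, lo, hi = len(pattern), 0, len(sa)
          while lo < hi:
              mid = (lo + hi) // 2
              if text[sa[mid]:sa[mid] + m] < pattern:
                  lo = mid + 1
              else:
                  hi = mid
          out = []
          while lo < len(sa) and text[sa[lo]:sa[lo] + m] == pattern:
              out.append(sa[lo])
              lo += 1
          return sorted(out)

      text = "semantic search in p2p systems"
      sa = suffix_array(text)
      print(occurrences(text, sa, "se"))   # -> [0, 9]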

  6. SU-E-T-473: A Patient-Specific QC Paradigm Based On Trajectory Log Files and DICOM Plan Files

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DeMarco, J; McCloskey, S; Low, D

    Purpose: To evaluate a remote QC tool for monitoring treatment machine parameters and treatment workflow. Methods: The Varian TrueBeam linear accelerator is a digital machine that records machine axis parameters and MLC leaf positions as a function of delivered monitor units or control points. This information is saved to a binary trajectory log file for every treatment or imaging field in the patient treatment session. A MATLAB analysis routine was developed to parse the trajectory log files for a given patient, compare the expected versus actual machine and MLC positions, and perform a cross-comparison with the DICOM-RT plan file exported from the treatment planning system. The parsing routine sorts the trajectory log files based on the time and date stamp and generates a sequential report file listing treatment parameters, providing a match relative to the DICOM-RT plan file. Results: The trajectory log parsing routine was compared against a standard record-and-verify listing for patients undergoing initial IMRT dosimetry verification and weekly and final chart QC. The complete treatment course was independently verified for 10 patients of varying treatment site, and a total of 1267 treatment fields were evaluated, including pre-treatment imaging fields where applicable. In the context of IMRT plan verification, eight prostate SBRT plans with 4 arcs per plan were evaluated based on expected versus actual machine axis parameters. The average value of the maximum RMS MLC error was 0.067±0.001 mm and 0.066±0.002 mm for leaf banks A and B, respectively. Conclusion: A real-time QC analysis program was tested using trajectory log files and DICOM-RT plan files. The parsing routine is efficient and able to evaluate all relevant machine axis parameters during a patient treatment course, including MLC leaf positions and table positions at the time of image acquisition and during treatment.
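
    The core expected-versus-actual check is simple once the log is parsed; a sketch of the RMS leaf-error statistic quoted above, where the array shapes and synthetic data are assumptions:

      import numpy as np

      def max_rms_leaf_error(expected, actual):
          # expected/actual: (n_control_points, n_leaves) MLC positions in mm.
          err = np.asarray(expected) - np.asarray(actual)
          rms_per_leaf = np.sqrt((err ** 2).mean(axis=0))
          return rms_per_leaf.max()   # worst leaf in the bank

      expected = np.zeros((100, 60))                    # one 60-leaf bank
      actual = expected + np.random.normal(0, 0.05, expected.shape)
      print(f"max RMS leaf error: {max_rms_leaf_error(expected, actual):.3f} mm")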

  7. An All-Fragments Grammar for Simple and Accurate Parsing

    DTIC Science & Technology

    2012-03-21

    Matsuzaki, Miyao, and Tsujii. Probabilistic CFG with Latent Annotations. In Proceedings of ACL, 2005. Slav Petrov and Dan Klein. Improved Inference for Unlexicalized Parsing. In Proceedings of NAACL-HLT, 2007. Slav Petrov and Dan Klein. Sparse Multi-Scale Grammars for Discriminative Latent Variable Parsing. In Proceedings of EMNLP, 2008. Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein. Learning Accurate, Compact, and Interpretable Tree Annotation. In Proceedings of COLING-ACL, 2006. [reference-list excerpt from the report]

  8. GeoSciGraph: An Ontological Framework for EarthCube Semantic Infrastructure

    NASA Astrophysics Data System (ADS)

    Gupta, A.; Schachne, A.; Condit, C.; Valentine, D.; Richard, S.; Zaslavsky, I.

    2015-12-01

    The CINERGI (Community Inventory of EarthCube Resources for Geosciences Interoperability) project compiles an inventory of a wide variety of earth science resources, including documents, catalogs, vocabularies, data models, data services, process models, information repositories and domain-specific ontologies developed by research groups and data practitioners. We have developed a multidisciplinary semantic framework called GeoSciGraph for semantic integration of earth science resources. An integrated ontology is constructed with Basic Formal Ontology (BFO) as its upper ontology; it currently ingests multiple component ontologies, including the SWEET ontology, GeoSciML's lithology ontology, the Tematres controlled vocabulary server, GeoNames, GCMD vocabularies on equipment, platforms and institutions, a software ontology, the CUAHSI hydrology vocabulary, the Environment Ontology (ENVO) and several more. These ontologies are connected through bridging axioms; GeoSciGraph identifies lexically close terms and creates equivalence-class or subclass relationships between them after human verification. GeoSciGraph allows a community to create community-specific customizations of the integrated ontology. GeoSciGraph uses Neo4j, a graph database that can hold several billion concepts and relationships. GeoSciGraph provides a number of REST services that can be called by other software modules such as the CINERGI information augmentation pipeline. 1) Vocabulary services are used to find exact and approximate terms, term categories (community-provided clusters of terms, e.g., measurement-related or environmental-material-related terms), synonyms, term definitions and annotations. 2) Lexical services are used for text parsing to find entities, which can then be included in the ontology by a domain expert. 3) Graph services provide traversal-centric operations, e.g., finding paths and neighborhoods, which can be used to perform ontological operations like computing transitive closure (e.g., finding all subclasses of rocks). 4) Annotation services are used to adorn an arbitrary block of text (e.g., from a NOAA catalog record) with ontology terms. The system has been used to ontologically integrate diverse sources such as ScienceBase, NOAA records and PETDB.

  9. Electrophysiology of prosodic and lexical-semantic processing during sentence comprehension in aphasia.

    PubMed

    Sheppard, Shannon M; Love, Tracy; Midgley, Katherine J; Holcomb, Phillip J; Shapiro, Lewis P

    2017-12-01

    Event-related potentials (ERPs) were used to examine how individuals with aphasia and a group of age-matched controls use prosody and thematic fit information in sentences containing temporary syntactic ambiguities. Two groups of individuals with aphasia were investigated: those demonstrating relatively good sentence comprehension whose primary language difficulty is anomia (individuals with anomic aphasia (IWAA)), and those who demonstrate impaired sentence comprehension whose primary diagnosis is Broca's aphasia (individuals with Broca's aphasia (IWBA)). The stimuli had early-closure syntactic structure and contained a temporary early closure (correct)/late closure (incorrect) syntactic ambiguity. The prosody was manipulated to be either congruent or incongruent, and the temporarily ambiguous NP was also manipulated to be either a plausible or an implausible continuation of the subordinate verb (e.g., "While the band played the song/the beer pleased all the customers."). It was hypothesized that an implausible NP in sentences with incongruent prosody may provide the parser with a plausibility cue that could be used to predict syntactic structure. The results revealed that incongruent prosody paired with a plausibility cue resulted in an N400-P600 complex at the implausible NP (the beer) in both the controls and the IWAAs, yet incongruent prosody without a plausibility cue resulted in an N400-P600 at the critical verb (pleased) only in healthy controls. IWBAs did not show evidence of N400 or P600 effects at the ambiguous NP or critical verb, although they did show evidence of a delayed N400 effect at the sentence-final word in sentences with incongruent prosody. These results suggest that IWAAs have difficulty integrating prosodic cues with underlying syntactic structure when lexical-semantic information is not available to aid their parse. IWBAs have difficulty integrating both prosodic and lexical-semantic cues with syntactic structure, likely due to a processing delay. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. A Formal Model of Ambiguity and its Applications in Machine Translation

    DTIC Science & Technology

    2010-01-01

    …structure indicates linguistically implausible segmentation that might be generated using dictionary-driven approaches… derivation. As was done in the monolingual case, the functions LHS, RHSi, RHSo and υ can be extended to a derivation δ. D(q), where q ∈ V, denotes the… monolingual parses. My algorithm runs more efficiently than O(n^6) with many grammars (including those that required using heuristic search with other parsers).

  11. Exploiting graph kernels for high performance biomedical relation extraction.

    PubMed

    Panyam, Nagesh C; Verspoor, Karin; Cohn, Trevor; Ramamohanarao, Kotagiri

    2018-01-30

    Relation extraction from biomedical publications is an important task in the area of semantic mining of text. Kernel methods for supervised relation extraction are often preferred over manual feature engineering methods when classifying highly ordered structures such as trees and graphs obtained from syntactic parsing of a sentence. Tree kernels such as the Subset Tree Kernel and Partial Tree Kernel have been shown to be effective for classifying constituency parse trees and basic dependency parse graphs of a sentence. Graph kernels such as the All Path Graph (APG) kernel and the Approximate Subgraph Matching (ASM) kernel have been shown to be suitable for classifying general graphs with cycles, such as the enhanced dependency parse graph of a sentence. In this work, we present a high-performance Chemical-Induced Disease (CID) relation extraction system. We present a comparative study of kernel methods for the CID task and also extend our study to the Protein-Protein Interaction (PPI) extraction task, an important biomedical relation extraction task. We discuss novel modifications to the ASM kernel to boost its performance and a method to apply graph kernels for extracting relations expressed in multiple sentences. Our system for CID relation extraction attains an F-score of 60% without using external knowledge sources or task-specific heuristics or rules. In comparison, the state-of-the-art Chemical-Disease Relation Extraction system achieves an F-score of 56% using an ensemble of multiple machine learning methods, which is then boosted to 61% with a rule-based system employing task-specific post-processing rules. For the CID task, graph kernels outperform tree kernels substantially, and the best performance is obtained with the APG kernel, which attains an F-score of 60%, followed by the ASM kernel at 57%. The performance difference between the ASM and APG kernels for CID sentence-level relation extraction is not significant. In our evaluation of ASM for the PPI task, ASM performed better than the APG kernel on the BioInfer dataset in the Area Under Curve (AUC) measure (74% vs. 69%). However, for all the other PPI datasets, namely AIMed, HPRD50, IEPA and LLL, ASM is substantially outperformed by the APG kernel in F-score and AUC measures. We demonstrate high-performance Chemical-Induced Disease relation extraction without employing external knowledge sources or task-specific heuristics. Our work shows that graph kernels are effective in extracting relations that are expressed in multiple sentences. We also show that the graph kernels, namely the ASM and APG kernels, substantially outperform the tree kernels. Among the graph kernels, we showed that the ASM kernel is effective for biomedical relation extraction, with performance comparable to the APG kernel on datasets such as CID sentence-level relation extraction and BioInfer in PPI. Overall, the APG kernel is shown to be significantly more accurate than the ASM kernel, achieving better performance on most datasets.
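
    To make the graph-kernel idea concrete, here is a minimal shortest-path kernel that counts matching (label, label, distance) triples between two labeled dependency graphs; it is a generic kernel for illustration, not the APG or ASM kernel evaluated in the paper.

      from collections import deque

      def bfs_dists(adj, src):
          # Unweighted shortest-path lengths from src.
          dist, q = {src: 0}, deque([src])
          while q:
              u = q.popleft()
              for v in adj[u]:
                  if v not in dist:
                      dist[v] = dist[u] + 1
                      q.append(v)
          return dist

      def sp_kernel(g1, g2):
          # Each graph: (labels: node -> label, adj: node -> set of nodes).
          def triples(labels, adj):
              t = {}
              for u in adj:
                  for v, d in bfs_dists(adj, u).items():
                      if d:
                          key = (labels[u], labels[v], d)
                          t[key] = t.get(key, 0) + 1
              return t
          t1, t2 = triples(*g1), triples(*g2)
          return sum(c * t2.get(k, 0) for k, c in t1.items())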

  12. Solving LR Conflicts Through Context Aware Scanning

    NASA Astrophysics Data System (ADS)

    Leon, C. Rodriguez; Forte, L. Garcia

    2011-09-01

    This paper presents a new algorithm to compute the exact list of tokens expected by an LR syntax analyzer at any point of the scanning process. The lexer can thus, at any time, restrict itself to returning only tokens in this valid set. In the case that more than one matching token is in the valid set, the lexer can resort to a nested LR parser to disambiguate. Allowing nested LR parsing requires some slight modifications when building the LR parsing tables. We also show how LR parsers can parse conflictive and inherently ambiguous languages using a combination of nested parsing and context-aware scanning. These expanded lexical analyzers can be generated from high-level specifications.
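
    The central loop is easy to sketch: the parser hands the lexer the set of token kinds acceptable in its current LR state, and the lexer tries only those patterns. The names and the longest-match tie-break below are assumptions, not the paper's generator.

      import re

      def scan(source, pos, expected, patterns):
          # Try only the token kinds the current LR state can accept.
          found = [(kind, m) for kind, pat in patterns.items()
                   if kind in expected and (m := pat.match(source, pos))]
          if not found:
              raise SyntaxError(f"no expected token at position {pos}")
          # Longest match wins; ties broken by declaration order.
          kind, m = max(found, key=lambda km: len(km[1].group()))
          return kind, m.group(), m.end()

      patterns = {"KEYWORD_IF": re.compile(r"if\b"), "IDENT": re.compile(r"\w+")}
      print(scan("if x", 0, {"KEYWORD_IF", "IDENT"}, patterns))
      # -> ('KEYWORD_IF', 'if', 2); with expected={'IDENT'} the same
      #    characters would instead lex as an identifier.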

  13. Parsing recursive sentences with a connectionist model including a neural stack and synaptic gating.

    PubMed

    Fedor, Anna; Ittzés, Péter; Szathmáry, Eörs

    2011-02-21

    It is supposed that humans are genetically predisposed to be able to recognize sequences of context-free grammars with centre-embedded recursion while other primates are restricted to the recognition of finite state grammars with tail-recursion. Our aim was to construct a minimalist neural network that is able to parse artificial sentences of both grammars in an efficient way without using the biologically unrealistic backpropagation algorithm. The core of this network is a neural stack-like memory where the push and pop operations are regulated by synaptic gating on the connections between the layers of the stack. The network correctly categorizes novel sentences of both grammars after training. We suggest that the introduction of the neural stack memory will turn out to be substantial for any biological 'hierarchical processor' and the minimalist design of the model suggests a quest for similar, realistic neural architectures. Copyright © 2010 Elsevier Ltd. All rights reserved.
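
    A toy version of the gated stack memory, with continuous gates scaling write and read strength; this is a simplification for illustration, not the authors' synaptic-gating network.

      import numpy as np

      class GatedStack:
          # Fixed-depth stack of vectors; push/pop strengths are gates in [0, 1].
          def __init__(self, depth, width):
              self.mem = np.zeros((depth, width))

          def push(self, gate, vec):
              self.mem = np.roll(self.mem, 1, axis=0)   # deepest item falls off
              self.mem[0] = gate * np.asarray(vec)

          def pop(self, gate):
              top = gate * self.mem[0]
              self.mem = np.roll(self.mem, -1, axis=0)
              self.mem[-1] = 0.0
              return top

      s = GatedStack(depth=8, width=3)
      s.push(1.0, [1, 0, 0])        # fully open push
      print(s.pop(0.5))             # half-open pop -> [0.5 0.  0. ]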

  14. Performance evaluation of continuity of care records (CCRs): parsing models in a mobile health management system.

    PubMed

    Chen, Hung-Ming; Liou, Yong-Zan

    2014-10-01

    In a mobile health management system, mobile devices act as the application hosting devices for personal health records (PHRs), and healthcare servers are constructed to exchange and analyze PHRs. One of the most popular PHR standards is the continuity of care record (CCR). The CCR is expressed in XML format. However, parsing is an expensive operation that can degrade XML processing performance. Hence, the objective of this study was to identify the different operational and performance characteristics of several CCR parsing models: the XML DOM parser, the SAX parser, the PULL parser, and the JSON parser applied to JSON data converted from XML-based CCRs. Thus, developers can make sensible choices for their target PHR applications when parsing CCRs on mobile devices or servers with different system resources. Furthermore, simulation experiments over four case studies were conducted to compare parsing performance on Android mobile devices and on a server with large quantities of CCR data.
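
    The performance contrast between the parsing models follows from their shapes: a tree parser materializes the whole CCR in memory, while an event (SAX) parser streams it in constant memory, which matters on constrained mobile devices. A minimal illustration with a toy CCR fragment (element names abbreviated):

      import xml.etree.ElementTree as ET
      import xml.sax

      CCR = b"""<ContinuityOfCareRecord>
        <Body><Medications><Medication>aspirin</Medication></Medications></Body>
      </ContinuityOfCareRecord>"""

      # Tree-style parsing: whole document in memory, random access afterwards.
      root = ET.fromstring(CCR)
      print([m.text for m in root.iter("Medication")])      # ['aspirin']

      # Event-style (SAX) parsing: constant memory, single forward pass.
      class MedHandler(xml.sax.ContentHandler):
          def __init__(self):
              super().__init__()
              self.in_med, self.meds = False, []
          def startElement(self, name, attrs):
              self.in_med = (name == "Medication")
          def characters(self, content):
              if self.in_med and content.strip():
                  self.meds.append(content.strip())
          def endElement(self, name):
              self.in_med = False

      handler = MedHandler()
      xml.sax.parseString(CCR, handler)
      print(handler.meds)                                   # ['aspirin']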

  15. A novel adaptive Cuckoo search for optimal query plan generation.

    PubMed

    Gomathi, Ramalingam; Sharmila, Dhandapani

    2014-01-01

    The day-by-day emergence of new web pages has led to the development of semantic web technology. A World Wide Web Consortium (W3C) standard for storing semantic web data is the Resource Description Framework (RDF). To improve execution time when querying large RDF graphs, evolving metaheuristic algorithms have become an alternative to traditional query optimization methods. This paper focuses on the problem of query optimization over semantic web data. An efficient algorithm called adaptive Cuckoo search (ACS) for querying and generating optimal query plans for large RDF graphs is designed in this research. Experiments were conducted on different datasets with varying numbers of predicates. The experimental results show that the proposed approach provides significant improvements in query execution time. The extent to which the algorithm is efficient was tested and the results are documented.
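
    A compact sketch of Cuckoo search over join orders (permutations of triple patterns); the toy cost model and the swap-based "Levy flight" are stand-ins for the paper's adaptive variant.

      import random

      def cost(plan, selectivity):
          # Toy cost: cumulative estimated intermediate-result sizes.
          size, total = 1.0, 0.0
          for p in plan:
              size *= selectivity[p] * 1000
              total += size
          return total

      def cuckoo_search(preds, selectivity, n_nests=15, iters=300, abandon=0.25):
          nests = [random.sample(preds, len(preds)) for _ in range(n_nests)]
          for _ in range(iters):
              cuckoo = random.choice(nests)[:]
              for _ in range(random.randint(1, 3)):      # crude Levy step: swaps
                  i, j = random.sample(range(len(cuckoo)), 2)
                  cuckoo[i], cuckoo[j] = cuckoo[j], cuckoo[i]
              victim = random.randrange(n_nests)
              if cost(cuckoo, selectivity) < cost(nests[victim], selectivity):
                  nests[victim] = cuckoo                 # better egg replaces host
              nests.sort(key=lambda p: cost(p, selectivity))
              worst = int(n_nests * (1 - abandon))       # abandon worst nests
              nests[worst:] = [random.sample(preds, len(preds))
                               for _ in range(n_nests - worst)]
          return nests[0]

      sel = {"type": 0.5, "author": 0.01, "label": 0.2, "knows": 0.05}
      print(cuckoo_search(list(sel), sel))   # low-selectivity patterns first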

  16. Is semantic fluency differentially impaired in schizophrenic patients with delusions?

    PubMed

    Rossell, S L; Rabe-Hesketh, S S; Shapleske, J S; David, A S

    1999-10-01

    The study of cognitive deficits in schizophrenia has recently focused upon semantics: the study of meaning. Delusions are a plausible manifestation of abnormal semantics because by definition they involve changes in personal meaning and belief. A symptom-based approach was used to investigate semantic and phonological fluency in a group of schizophrenic patients subdivided into those with delusions and those with no current delusions. The results demonstrated that deluded patients only were differentially impaired on a test of semantic fluency in comparison to phonological fluency. All subjects showed the same decline in performance over the time course of both tests indicating that retrieval speed in schizophrenia is no different from that of normal controls. Further analysis of word associations in two semantic categories (animals and body parts), revealed that deluded subjects have a more idiosyncratic organisation for animals. The findings of reduced semantic fluency production and poor logical word associations may represent a disorganised storage of semantic information in deluded patients, which in turn affects efficient access.

  17. Parsing Flowcharts and Series-Parallel Graphs

    DTIC Science & Technology

    1978-11-01

    …descriptions of the graph. This possible multiplicity is undesirable in most practical applications, a fact that makes particularly useful reduction… to parse TT networks, some of the features that make this parsing method useful in other cases are more naturally introduced in the context of this… as Figure 4.5 shows. This multiplicity is due to the associativity of consecutive Two Terminal Series and Two Terminal Parallel compositions. In spite…

  18. Mining protein phosphorylation information from biomedical literature using NLP parsing and Support Vector Machines.

    PubMed

    Raja, Kalpana; Natarajan, Jeyakumar

    2018-07-01

    Extraction of protein phosphorylation information from biomedical literature has gained much attention because of its importance in numerous biological processes. In this study, we propose a text-mining methodology consisting of two phases, NLP parsing and SVM classification, to extract phosphorylation information from the literature. First, using NLP parsing, we divide the data into three base-forms depending on the biomedical entities related to phosphorylation, and further classify them into ten sub-forms based on their distribution with respect to the phosphorylation keyword. Next, we extract the phosphorylation entity singles/pairs/triplets and apply SVM to classify the extracted singles/pairs/triplets using a set of features applicable to each sub-form. The performance of our methodology was evaluated on three corpora, namely the PLC, iProLink and hPP corpora. We obtained promising results of >85% F-score on the ten sub-forms of the training datasets in cross-validation tests. Our system achieved overall F-scores of 93.0% on the iProLink and 96.3% on the hPP corpus test datasets. Furthermore, our proposed system achieved the best performance on cross-corpus evaluation and outperformed the existing system with a recall of 90.1%. The performance analysis of our system on the three corpora reveals that it extracts protein phosphorylation information efficiently from both non-organism-specific general datasets such as PLC and iProLink, and a human-specific dataset such as the hPP corpus. Copyright © 2018 Elsevier B.V. All rights reserved.
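
    The second phase is a standard supervised text-classification setup; below is a minimal stand-in using bag-of-words features (the paper's sub-form-specific feature set is richer) over toy sentences.

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.pipeline import make_pipeline
      from sklearn.svm import LinearSVC

      X = ["MAPK phosphorylates Elk-1 at Ser383",
           "expression of p53 was measured in cells",
           "Akt phosphorylated GSK-3beta on serine 9",
           "the protein was purified by chromatography"]
      y = [1, 0, 1, 0]    # 1 = sentence describes a phosphorylation event

      clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
      clf.fit(X, y)
      print(clf.predict(["CK2 phosphorylates the substrate at Thr58"]))  # expect [1]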

  19. Joint classification and contour extraction of large 3D point clouds

    NASA Astrophysics Data System (ADS)

    Hackel, Timo; Wegner, Jan D.; Schindler, Konrad

    2017-08-01

    We present an effective and efficient method for point-wise semantic classification and extraction of object contours of large-scale 3D point clouds. What makes point cloud interpretation challenging is the sheer size of several millions of points per scan and the non-grid, sparse, and uneven distribution of points. Standard image processing tools like texture filters, for example, cannot handle such data efficiently, which calls for dedicated point cloud labeling methods. It turns out that one of the major drivers for efficient computation and handling of strong variations in point density is a careful formulation of per-point neighborhoods at multiple scales. This allows, both, to define an expressive feature set and to extract topologically meaningful object contours. Semantic classification and contour extraction are interlaced problems. Point-wise semantic classification enables extracting a meaningful candidate set of contour points while contours help generating a rich feature representation that benefits point-wise classification. These methods are tailored to have fast run time and small memory footprint for processing large-scale, unstructured, and inhomogeneous point clouds, while still achieving high classification accuracy. We evaluate our methods on the semantic3d.net benchmark for terrestrial laser scans with >10^9 points.
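
    The multi-scale neighborhood idea can be sketched with a k-d tree and covariance eigenvalues per neighborhood; the scales and the eigenvalue contrasts below are illustrative, not the paper's exact feature set.

      import numpy as np
      from scipy.spatial import cKDTree

      def multiscale_features(points, scales=(10, 30, 90)):
          # points: (n, 3). For each scale k, per-point eigenvalue features
          # of the k-nearest-neighbor covariance (linearity/planarity cues).
          tree = cKDTree(points)
          blocks = []
          for k in scales:
              _, idx = tree.query(points, k=k)
              rows = []
              for nbrs in idx:
                  ev = np.sort(np.linalg.eigvalsh(np.cov(points[nbrs].T)))[::-1]
                  ev = ev / max(ev.sum(), 1e-12)
                  rows.append([ev[0] - ev[1], ev[1] - ev[2], ev[2]])
              blocks.append(np.asarray(rows))
          return np.concatenate(blocks, axis=1)   # (n, 3 * len(scales))

      pts = np.random.rand(200, 3)
      print(multiscale_features(pts, scales=(8, 16)).shape)   # (200, 6)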

  20. Workspaces in the Semantic Web

    NASA Technical Reports Server (NTRS)

    Wolfe, Shawn R.; Keller, RIchard M.

    2005-01-01

    Due to the recency and relatively limited adoption of Semantic Web technologies, practical issues related to technology scaling have received less attention than foundational issues. Nonetheless, these issues must be addressed if the Semantic Web is to realize its full potential. In particular, we concentrate on the lack of scoping methods that reduce the size of semantic information spaces so they are more efficient to work with and more relevant to an agent's needs. We provide some intuition to motivate the need for such reduced information spaces, called workspaces, give a formal definition, and suggest possible methods of deriving them.

  1. Context-free parsing with connectionist networks

    NASA Astrophysics Data System (ADS)

    Fanty, M. A.

    1986-08-01

    This paper presents a simple algorithm which converts any context-free grammar into a connectionist network that parses strings (of arbitrary but fixed maximum length) in the language defined by that grammar. The network is fast, O(n), and deterministic. It consists of binary units which compute a simple function of their input. When the grammar is put in Chomsky normal form, O(n^3) units are needed to parse inputs of length up to n.
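
    The O(n^3) figure is the familiar CYK bound: a chart of O(n^2) spans, each combining O(n) split points. A tiny recognizer over a CNF grammar makes the unit count concrete; the grammar and sentence are toy inputs, not taken from the paper.

      def cyk(words, lexicon, rules):
          # lexicon: nonterminal -> set of words; rules: list of (A, B, C)
          # meaning A -> B C. Returns True iff words is derivable from "S".
          n = len(words)
          chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
          for i, w in enumerate(words):
              chart[i][i + 1] = {A for A, ws in lexicon.items() if w in ws}
          for span in range(2, n + 1):                 # O(n) span lengths
              for i in range(n - span + 1):            # O(n) start positions
                  for k in range(i + 1, i + span):     # O(n) split points
                      for A, B, C in rules:
                          if B in chart[i][k] and C in chart[k][i + span]:
                              chart[i][i + span].add(A)
          return "S" in chart[0][n]

      lexicon = {"D": {"the"}, "N": {"cat"}, "V": {"ran"}}
      rules = [("NP", "D", "N"), ("S", "NP", "V")]
      print(cyk("the cat ran".split(), lexicon, rules))   # -> True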

  2. Semantic Context Detection Using Audio Event Fusion

    NASA Astrophysics Data System (ADS)

    Chu, Wei-Ta; Cheng, Wen-Huang; Wu, Ja-Ling

    2006-12-01

    Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events, that is, gunshot, explosion, engine, and car braking, in action movies. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining by using audio characteristics.

  3. LDRD final report:

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brost, Randolph C.; McLendon, William Clarence

    2013-01-01

    Modeling geospatial information with semantic graphs enables search for sites of interest based on relationships between features, without requiring strong a priori models of feature shape or other intrinsic properties. Geospatial semantic graphs can be constructed from raw sensor data with suitable preprocessing to obtain a discretized representation. This report describes initial work toward extending geospatial semantic graphs to include temporal information, and initial results applying semantic graph techniques to SAR image data. We describe an efficient graph structure that includes geospatial and temporal information, which is designed to support simultaneous spatial and temporal search queries. We also report a preliminary implementation of feature recognition, semantic graph modeling, and graph search based on input SAR data. The report concludes with lessons learned and suggestions for future improvements.

  4. Interoperability in Personalized Adaptive Learning

    ERIC Educational Resources Information Center

    Aroyo, Lora; Dolog, Peter; Houben, Geert-Jan; Kravcik, Milos; Naeve, Ambjorn; Nilsson, Mikael; Wild, Fridolin

    2006-01-01

    Personalized adaptive learning requires semantic-based and context-aware systems to manage the Web knowledge efficiently as well as to achieve semantic interoperability between heterogeneous information resources and services. The technological and conceptual differences can be bridged either by means of standards or via approaches based on the…

  5. BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains

    PubMed Central

    2014-01-01

    The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed. PMID:24495517

  6. Using Parse's humanbecoming theory in Japan.

    PubMed

    Tanaka, Junko; Katsuno, Towako; Takahashi, Teruko

    2012-01-01

    In this paper the authors discuss the use of Parse's humanbecoming theory in Japan. Elements of the theory are used in the nursing approach to an 88 year-old Japanese man who had complications following surgery. Process recordings of the dialogues between the patient, the patient's wife, and the nurse were made and considered in light of the three methodologies of Parse's theory; illuminating meaning, synchronizing rhythms, and mobilizing transcendence. The theory is seen as useful in Japan.

  7. Modeling Syntax for Parsing and Translation

    DTIC Science & Technology

    2003-12-15

    [Figure 2.1 residue: "Part of a dictionary"] …along with their training algorithms: a monolingual generative model of sentence structure, and a model of the relationship between the structure of a… tasks of monolingual parsing and word-level bilingual corpus alignment, they are demonstrated in two additional applications. First, a new statistical…

  8. The Humanbecoming theory as a reinterpretation of the symbolic interactionism: a critique of its specific nature and scientific underpinnings.

    PubMed

    Tapp, Diane; Lavoie, Mireille

    2017-04-01

    Discussions about the real knowledge contained in grand theories and models remain an active quest in the academic sphere. Among the most fervent of their defenders is Rosemarie Parse with her Humanbecoming School of Thought (1981, 1998). This article first highlights the similarities between Parse's theory and Blumer's symbolic interactionism (1969). This comparison acts as a counterargument to Parse's assertion that her theory is original 'nursing' material. Drawing on contemporary philosophy of science, the very possibility of discovering specifically nursing knowledge is questioned. Second, Parse's scientific assumptions are thoroughly addressed and contrasted with Blumer's more moderate view of knowledge. This leads to the recognition that valuing the social nature of existence and reality does not necessarily entail requirements and methods such as those proposed by Parse. From Blumer's point of view, her perspective may not even be desirable. Recommendations are raised about the necessity of a distanced relationship to knowledge, this being the key to the pursuit of its improvement, not its circular contemplation. © 2016 John Wiley & Sons Ltd.

  9. Atypical temporal activation pattern and central-right brain compensation during semantic judgment task in children with early left brain damage.

    PubMed

    Chang, Yi-Tzu; Lin, Shih-Che; Meng, Ling-Fu; Fan, Yang-Teng

    In this study we investigated event-related potentials (ERPs) during a semantic judgment task (deciding whether two Chinese characters were semantically related or unrelated) to identify the timing of neural activation in children with early left brain damage (ELBD). The results demonstrated that, compared with the controls, children with ELBD had (1) comparable accuracy and reaction time in the semantic judgment task, (2) weak operation of the N400, and (3) stronger, earlier and later compensational positivities (reflected in enhanced P200, P250, and P600 amplitudes) in the central and right regions of the brain that allowed them to successfully engage in semantic judgment. Our preliminary findings indicate that the temporal pattern of postlesional reorganization is in accordance with the proposed right-hemispheric organization of speech after early left-sided brain lesions. During semantic processing, orthography has a greater effect on children with ELBD, and a later semantic reanalysis (P600) is required due to the less efficient N400 at the earlier stage of semantic integration. Copyright © 2018 Elsevier Inc. All rights reserved.

  10. Bootstrapping language acquisition.

    PubMed

    Abend, Omri; Kwiatkowski, Tom; Smith, Nathaniel J; Goldwater, Sharon; Steedman, Mark

    2017-07-01

    The semantic bootstrapping hypothesis proposes that children acquire their native language through exposure to sentences of the language paired with structured representations of their meaning, whose component substructures can be associated with words and syntactic structures used to express these concepts. The child's task is then to learn a language-specific grammar and lexicon based on (probably contextually ambiguous, possibly somewhat noisy) pairs of sentences and their meaning representations (logical forms). Starting from these assumptions, we develop a Bayesian probabilistic account of semantically bootstrapped first-language acquisition in the child, based on techniques from computational parsing and interpretation of unrestricted text. Our learner jointly models (a) word learning: the mapping between components of the given sentential meaning and lexical words (or phrases) of the language, and (b) syntax learning: the projection of lexical elements onto sentences by universal construction-free syntactic rules. Using an incremental learning algorithm, we apply the model to a dataset of real syntactically complex child-directed utterances and (pseudo) logical forms, the latter including contextually plausible but irrelevant distractors. Taking the Eve section of the CHILDES corpus as input, the model simulates several well-documented phenomena from the developmental literature. In particular, the model exhibits syntactic bootstrapping effects (in which previously learned constructions facilitate the learning of novel words), sudden jumps in learning without explicit parameter setting, acceleration of word-learning (the "vocabulary spurt"), an initial bias favoring the learning of nouns over verbs, and one-shot learning of words and their meanings. The learner thus demonstrates how statistical learning over structured representations can provide a unified account for these seemingly disparate phenomena. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. Exploring relation types for literature-based discovery.

    PubMed

    Preiss, Judita; Stevenson, Mark; Gaizauskas, Robert

    2015-09-01

    Literature-based discovery (LBD) aims to identify "hidden knowledge" in the medical literature by: (1) analyzing documents to identify pairs of explicitly related concepts (terms), then (2) hypothesizing novel relations between pairs of unrelated concepts that are implicitly related via a shared concept to which both are explicitly related. Many LBD approaches use simple techniques to identify semantically weak relations between concepts, for example, document co-occurrence. These generate huge numbers of hypotheses, difficult for humans to assess. More complex techniques rely on linguistic analysis, for example, shallow parsing, to identify semantically stronger relations. Such approaches generate fewer hypotheses, but may miss hidden knowledge. The authors investigate this trade-off in detail, comparing techniques for identifying related concepts to discover which are most suitable for LBD. A generic LBD system that can utilize a range of relation types was developed. Experiments were carried out comparing a number of techniques for identifying relations. Two approaches were used for evaluation: replication of existing discoveries and the "time slicing" approach. Previous LBD discoveries could be replicated using relations based either on document co-occurrence or linguistic analysis. Using relations based on linguistic analysis generated many fewer hypotheses, but a significantly greater proportion of them were candidates for hidden knowledge. The use of linguistic analysis-based relations improves accuracy of LBD without overly damaging coverage. LBD systems often generate huge numbers of hypotheses, which are infeasible to manually review. Improving their accuracy has the potential to make these systems significantly more usable. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
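
    The A-B-C pattern in step (2) is easy to state concretely: propose a link between concepts A and C whenever they are not explicitly related but share an explicitly related bridging concept B. The concept names below allude to Swanson's classic fish-oil/Raynaud's discovery but are purely illustrative.

      from itertools import combinations

      explicit = {                  # concept -> concepts it explicitly co-occurs with
          "fish_oil": {"blood_viscosity", "platelet_aggregation"},
          "raynauds": {"blood_viscosity", "vasoconstriction"},
          "aspirin": {"platelet_aggregation"},
      }

      def hypotheses(explicit):
          found = []
          for a, c in combinations(explicit, 2):
              shared = explicit[a] & explicit[c]          # bridging B concepts
              directly_related = c in explicit[a] or a in explicit[c]
              if shared and not directly_related:
                  found.append((a, c, shared))
          return found

      for a, c, bs in hypotheses(explicit):
          print(f"{a} -?- {c} via {sorted(bs)}")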

  12. The roles of scene gist and spatial dependency among objects in the semantic guidance of attention in real-world scenes.

    PubMed

    Wu, Chia-Chien; Wang, Hsueh-Cheng; Pomplun, Marc

    2014-12-01

    A previous study (Vision Research 51 (2011) 1192-1205) found evidence for semantic guidance of visual attention during the inspection of real-world scenes, i.e., an influence of semantic relationships among scene objects on overt shifts of attention. In particular, the results revealed an observer bias toward gaze transitions between semantically similar objects. However, this effect is not necessarily indicative of semantic processing of individual objects but may be mediated by knowledge of the scene gist, which does not require object recognition, or by known spatial dependency among objects. To examine the mechanisms underlying semantic guidance, in the present study, participants were asked to view a series of displays with the scene gist excluded and spatial dependency varied. Our results show that spatial dependency among objects seems to be sufficient to induce semantic guidance. Scene gist, on the other hand, does not seem to affect how observers use semantic information to guide attention while viewing natural scenes. Extracting semantic information mainly based on spatial dependency may be an efficient strategy of the visual system that only adds little cognitive load to the viewing task. Copyright © 2014 Elsevier Ltd. All rights reserved.

  13. Integrating high dimensional bi-directional parsing models for gene mention tagging.

    PubMed

    Hsu, Chun-Nan; Chang, Yu-Ming; Kuo, Cheng-Ju; Lin, Yu-Shi; Huang, Han-Shen; Chung, I-Fang

    2008-07-01

    Tagging gene and gene product mentions in scientific text is an important initial step of literature mining. In this article, we describe in detail our gene mention tagger, which participated in the BioCreative 2 challenge, and analyze what contributes to its good performance. Our tagger is based on the conditional random fields (CRF) model, the most prevalent method for the gene mention tagging task in BioCreative 2. Our tagger is interesting because it accomplished the highest F-score among CRF-based methods and the second highest overall. Moreover, we obtained our results mostly by applying open source packages, making it easy to duplicate them. We first describe in detail how we developed our CRF-based tagger. We designed a very high dimensional feature set that includes most of the information that may be relevant. We trained bi-directional CRF models with the same set of features, one applying forward parsing and the other backward, and integrated the two models based on their output scores and dictionary filtering. One of the most prominent factors contributing to the good performance of our tagger is the integration of the additional backward parsing model. However, from the definition of CRF, it appears that a CRF model is symmetric and that bi-directional parsing models should produce the same results. We show that, due to different feature settings, a CRF model can be asymmetric, and that the feature setting of our tagger in BioCreative 2 not only produces different results but also gives the backward parsing model a slight but consistent advantage over the forward parsing model. To fully explore the potential of integrating bi-directional parsing models, we applied different asymmetric feature settings to generate many bi-directional parsing models and integrated them based on their output scores. Experimental results show that this integrated model can achieve an even higher F-score based solely on the training corpus for gene mention tagging. Data sets, programs and an on-line service of our gene mention tagger can be accessed at http://aiia.iis.sinica.edu.tw/biocreative2.htm.
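
    The score-based integration step can be sketched as follows; the stub models, label scheme, and the single-token dictionary filter are invented placeholders standing in for the paper's trained forward and backward CRFs.

      def forward_model(tokens):   # stand-in: returns (labels, model score)
          return (["B-GENE" if t == "BRCA1" else "O" for t in tokens], 0.8)

      def backward_model(tokens):  # stand-in for the reversed-direction model
          return (["O"] * len(tokens), 0.6)

      def integrate(tokens, models, gene_dict):
          """Keep the candidate tagging with the higher model score, then
          drop gene labels whose tokens never occur in the dictionary."""
          labels, _ = max((m(tokens) for m in models), key=lambda c: c[1])
          return ["O" if lab != "O" and tok.lower() not in gene_dict else lab
                  for tok, lab in zip(tokens, labels)]

      tokens = "Mutations in BRCA1 increase risk .".split()
      print(integrate(tokens, [forward_model, backward_model], {"brca1"}))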

  14. Leveraging the Semantic Web for Adaptive Education

    ERIC Educational Resources Information Center

    Kravcik, Milos; Gasevic, Dragan

    2007-01-01

    In the area of technology-enhanced learning, reusability and interoperability issues essentially influence the productivity and efficiency of learning and authoring solutions. There are two basic approaches to overcoming these problems--one attempts to do so via standards and the other by means of the Semantic Web. In practice, these approaches…

  15. Object-oriented parsing of biological databases with Python.

    PubMed

    Ramu, C; Gemünd, C; Gibson, T J

    2000-07-01

    While database activities in the biological area are increasing rapidly, rather little has been done on parsing them in a simple and object-oriented way. We present here an elegant, simple yet powerful way of parsing biological flat-file databases. We have taken EMBL, SWISS-PROT and GENBANK as examples. EMBL and SWISS-PROT do not differ much in format structure. GENBANK has a very different format structure from EMBL and SWISS-PROT. Extracting the desired fields in an entry (for example a sub-sequence with an associated feature) for later analysis is a constant need in the biological sequence-analysis community: this is illustrated with tools for making new splice-site databases. The interface to the parser is abstract in the sense that access to all the databases is independent of their different formats, since the parsing instructions are hidden.
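
    The format-independent interface described can be sketched as follows: per-format tables map line codes (e.g., EMBL/SWISS-PROT "ID", GenBank "LOCUS") to common field names, so callers iterate over records the same way for every format. The field maps and sample entry are simplified stand-ins; real entries carry many more line types.

      FORMATS = {
          "embl":    {"id": "ID", "description": "DE"},
          "genbank": {"id": "LOCUS", "description": "DEFINITION"},
      }

      def parse(lines, fmt):
          codes = {v: k for k, v in FORMATS[fmt].items()}
          record = {}
          for line in lines:
              if line.startswith("//"):        # entry terminator in both formats
                  yield record
                  record = {}
                  continue
              tag, _, rest = line.partition(" ")
              if tag in codes:
                  record[codes[tag]] = rest.strip()

      sample = [
          "ID   AB000001",
          "DE   Example flat-file entry",
          "//",
      ]
      for rec in parse(sample, "embl"):        # same call shape for GenBank
          print(rec["id"], rec.get("description", ""))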

  16. Comparison of Classical and Lazy Approach in SCG Compiler

    NASA Astrophysics Data System (ADS)

    Jirák, Ota; Kolář, Dušan

    2011-09-01

    Existing parsing methods for scattered context grammars usually expand nonterminals deep in the pushdown. This expansion is implemented using either a linked list or some kind of auxiliary pushdown. This paper describes a parsing algorithm for an LL(1) scattered context grammar. The algorithm merges two principles. The first is the table-driven parsing method commonly used for parsing context-free grammars. The second is delayed execution, as used in functional programming. The main part of this paper is a proof of equivalence between the common principle (the whole rule is applied at once) and our approach (execution of the rules is delayed), which therefore works with the pushdown top only. In most cases, the second approach is faster than the first. Finally, future work is discussed.
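
    The delayed-execution idea can be sketched like this: when a scattered context rule rewrites several nonterminals at once, only the topmost expansion is performed immediately, and the rewrites owed to deeper nonterminals are queued and paid off when those symbols surface at the pushdown top. The encoding below is invented for illustration and ignores the LL(1) table and input matching.

      pending = {}                    # nonterminal -> queued right-hand sides

      def apply_scattered_rule(stack, rule):
          """rule is a list of (nonterminal, rhs) pairs, topmost first,
          e.g. [("A", "a"), ("B", "b")] for the rule (A, B) -> (a, b)."""
          (top_nt, top_rhs), *deeper = rule
          assert stack[-1] == top_nt
          stack[-1:] = list(reversed(top_rhs))    # expand the top immediately
          for nt, rhs in deeper:                  # defer the deeper rewrites
              pending.setdefault(nt, []).append(rhs)

      def expose(stack):
          """Run before matching input: pay off any rewrite owed to the
          symbol that has just become the pushdown top."""
          while stack and pending.get(stack[-1]):
              rhs = pending[stack[-1]].pop(0)
              stack[-1:] = list(reversed(rhs))

      stack = ["B", "A"]                          # list end = pushdown top
      apply_scattered_rule(stack, [("A", "a"), ("B", "b")])
      print(stack, pending)                       # ['B', 'a'] {'B': ['b']}
      stack.pop()                                 # match terminal 'a'
      expose(stack)                               # now rewrite B lazily
      print(stack)                                # ['b']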

  17. Automatic Parsing of Parental Verbal Input

    PubMed Central

    Sagae, Kenji; MacWhinney, Brian; Lavie, Alon

    2006-01-01

    To evaluate theoretical proposals regarding the course of child language acquisition, researchers often need to rely on the processing of large numbers of syntactically parsed utterances, both from children and their parents. Because it is so difficult to do this by hand, there are currently no parsed corpora of child language input data. To automate this process, we developed a system that combined the MOR tagger, a rule-based parser, and statistical disambiguation techniques. The resultant system obtained nearly 80% correct parses for the sentences spoken to children. To achieve this level, we had to construct a particular processing sequence that minimizes problems caused by the coverage/ambiguity trade-off in parser design. These procedures are particularly appropriate for use with the CHILDES database, an international corpus of transcripts. The data and programs are now freely available over the Internet. PMID:15190707

  18. RysannMD: A biomedical semantic annotator balancing speed and accuracy.

    PubMed

    Cuzzola, John; Jovanović, Jelena; Bagheri, Ebrahim

    2017-07-01

    Recently, both researchers and practitioners have explored the possibility of semantically annotating large and continuously evolving collections of biomedical texts such as research papers, medical reports, and physician notes in order to enable their efficient and effective management and use in clinical practice or research laboratories. Such annotations can be automatically generated by biomedical semantic annotators - tools that are specifically designed for detecting and disambiguating biomedical concepts mentioned in text. The biomedical community has already presented several solid automated semantic annotators. However, the existing tools are either strong in their disambiguation capacity, i.e., the ability to identify the correct biomedical concept for a given piece of text among several candidate concepts, or they excel in their processing time, i.e., work very efficiently, but none of the semantic annotation tools reported in the literature has both of these qualities. In this paper, we present RysannMD (Ryerson Semantic Annotator for Medical Domain), a biomedical semantic annotation tool that strikes a balance between processing time and performance while disambiguating biomedical terms. In other words, RysannMD provides reasonable disambiguation performance when choosing the right sense for a biomedical term in a given context, and does that in a reasonable time. To examine how RysannMD stands with respect to state-of-the-art biomedical semantic annotators, we have conducted a series of experiments using standard benchmarking corpora, including both gold and silver standards, and four modern biomedical semantic annotators, namely cTAKES, MetaMap, NOBLE Coder, and Neji. The annotators were compared with respect to the quality of the produced annotations, measured against gold and silver standards using precision, recall, and F1 measure, as well as speed, i.e., processing time. In the experiments, RysannMD achieved the best median F1 measure across the benchmarking corpora, independent of the standard used (silver/gold), biomedical subdomain, and document size. In terms of annotation speed, RysannMD scored the second best median processing time across all the experiments. The obtained results indicate that RysannMD offers the best performance among the examined semantic annotators when both quality of annotation and speed are considered simultaneously. Copyright © 2017 Elsevier Inc. All rights reserved.
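
    The headline comparison can be reproduced mechanically: compute precision, recall, and F1 per corpus from true positive, false positive, and false negative counts, then take the median F1 across corpora. The counts below are made-up placeholders, not RysannMD's results.

      from statistics import median

      def f1(tp, fp, fn):
          p = tp / (tp + fp)              # precision
          r = tp / (tp + fn)              # recall
          return 2 * p * r / (p + r)

      per_corpus = {"corpus_a": (90, 10, 20), "corpus_b": (75, 25, 15),
                    "corpus_c": (60, 15, 30)}
      print(median(f1(*counts) for counts in per_corpus.values()))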

  19. Natural language processing, pragmatics, and verbal behavior

    PubMed Central

    Cherpas, Chris

    1992-01-01

    Natural Language Processing (NLP) is that part of Artificial Intelligence (AI) concerned with endowing computers with verbal and listener repertoires, so that people can interact with them more easily. Most attention has been given to accurately parsing and generating syntactic structures, although NLP researchers are finding ways of handling the semantic content of language as well. It is increasingly apparent that understanding the pragmatic (contextual and consequential) dimension of natural language is critical for producing effective NLP systems. While there are some techniques for applying pragmatics in computer systems, they are piecemeal, crude, and lack an integrated theoretical foundation. Unfortunately, there is little awareness that Skinner's (1957) Verbal Behavior provides an extensive, principled pragmatic analysis of language. The implications of Skinner's functional analysis for NLP and for verbal aspects of epistemology lead to a proposal for a “user expert”—a computer system whose area of expertise is the long-term computer user. The evolutionary nature of behavior suggests an AI technology known as genetic algorithms/programming for implementing such a system. PMID:22477052

  20. VenomKB, a new knowledge base for facilitating the validation of putative venom therapies

    PubMed Central

    Romano, Joseph D.; Tatonetti, Nicholas P.

    2015-01-01

    Animal venoms have been used for therapeutic purposes since the dawn of recorded history. Only a small fraction, however, have been tested for pharmaceutical utility. Modern computational methods enable the systematic exploration of novel therapeutic uses for venom compounds. Unfortunately, there is currently no comprehensive resource describing the clinical effects of venoms to support this computational analysis. We present VenomKB, a new publicly accessible knowledge base and website that aims to act as a repository for emerging and putative venom therapies. Presently, it consists of three database tables: (1) Manually curated records of putative venom therapies supported by scientific literature, (2) automatically parsed MEDLINE articles describing compounds that may be venom derived, and their effects on the human body, and (3) automatically retrieved records from the new Semantic Medline resource that describe the effects of venom compounds on mammalian anatomy. Data from VenomKB may be selectively retrieved in a variety of popular data formats, are open-source, and will be continually updated as venom therapies become better understood. PMID:26601758

  1. An Intelligent Polar Cyberinfrastructure to Support Spatiotemporal Decision Making

    NASA Astrophysics Data System (ADS)

    Song, M.; Li, W.; Zhou, X.

    2014-12-01

    In the era of big data, polar science faces an urgent demand for intelligent approaches to support precise and effective spatiotemporal decision-making. Service-oriented cyberinfrastructure has the advantages of seamlessly integrating distributed computing resources and aggregating a variety of geospatial data derived from Earth observation networks. This paper focuses on building a smart service-oriented cyberinfrastructure to support intelligent question answering over polar datasets. The innovations of this polar cyberinfrastructure include: (1) a problem-solving environment that parses geospatial questions posed in natural language, builds geoprocessing rules, composes atomic processing services and executes the entire workflow; (2) a self-adaptive spatiotemporal filter that is capable of refining query constraints through semantic analysis; (3) a dynamic visualization strategy to support result animation and statistics in multiple spatial reference systems; and (4) a user-friendly online portal to support collaborative decision-making. By means of this polar cyberinfrastructure, we intend to facilitate the integration of distributed and heterogeneous Arctic datasets and the comprehensive analysis of multiple environmental elements (e.g. snow, ice, permafrost) to provide a better understanding of environmental variation in circumpolar regions.

  2. Action Algebras and Model Algebras in Denotational Semantics

    NASA Astrophysics Data System (ADS)

    Guedes, Luiz Carlos Castro; Haeusler, Edward Hermann

    This article describes some results concerning the conceptual separation of model dependent and language inherent aspects in a denotational semantics of a programming language. Before going into the technical explanation, the authors wish to relate a story that illustrates how correctly and precisely posed questions can influence the direction of research. By means of his questions, Professor Mosses aided the PhD research of one of the authors of this article and taught the other, who at the time was a novice supervisor, the real meaning of careful PhD supervision. The student’s research had been partially developed towards the implementation of programming languages through denotational semantics specification, and the student had developed a prototype [12] that compared relatively well to some industrial compilers of the PASCAL language. During a visit to the BRICS lab in Aarhus, the student’s supervisor gave Professor Mosses a draft of an article describing the prototype and its implementation experiments. The next day, Professor Mosses asked the supervisor, “Why is the generated code so efficient when compared to that generated by an industrial compiler?” and “You claim that the efficiency is simply a consequence of the Object-Orientation mechanisms used by the prototype programming language (C++); this should be better investigated. Pay more attention to the class of programs that might have this good comparison profile.” As a result of these aptly chosen questions and comments, the student and supervisor made great strides in the subsequent research; the advice provided by Professor Mosses made them perceive that the code generated for certain semantic domains was efficient because it mapped to the “right aspect” of the language semantics. (Certain functional types, used to represent mappings such as Stores and Environments, were pushed to the level of the object language (as in gcc). This had the side-effect of generating code for arrays in the same way as that for functional denotational types. For example, PASCAL arrays belong to the “language inherent” aspect, while the Store domain seems to belong to the “model dependent” aspect. This distinction was important because it focussed attention on optimizing the model dependent semantic domains to obtain a more efficient implementation.) The research led to a nice conclusion: The guidelines of Action Semantics induce a clear separation of the model and language inherent aspects of a language’s semantics. A good implementation of facets, particularly the model dependent ones, leads to generation of an efficient compiler. In this article we discuss the separation of the language inherent and model-inherent domains at the theoretical and conceptual level. In doing so, the authors hope to show how Professor Mosses’s influence extended beyond his technical advice to his professional and personal examples on the supervision of PhD research.

  3. Dynamic semantic cognition: Characterising coherent and controlled conceptual retrieval through time using magnetoencephalography and chronometric transcranial magnetic stimulation.

    PubMed

    Teige, Catarina; Mollo, Giovanna; Millman, Rebecca; Savill, Nicola; Smallwood, Jonathan; Cornelissen, Piers L; Jefferies, Elizabeth

    2018-06-01

    Distinct neural processes are thought to support the retrieval of semantic information that is (i) coherent with strongly-encoded aspects of knowledge, and (ii) non-dominant yet relevant for the current task or context. While the brain regions that support readily coherent and more controlled patterns of semantic retrieval are relatively well-characterised, the temporal dynamics of these processes are not well-understood. This study used magnetoencephalography (MEG) and dual-pulse chronometric transcranial magnetic stimulation (cTMS) in two separate experiments to examine temporal dynamics during the retrieval of strong and weak associations. MEG results revealed a dissociation within left temporal cortex: anterior temporal lobe (ATL) showed greater oscillatory response for strong than weak associations, while posterior middle temporal gyrus (pMTG) showed the reverse pattern. Left inferior frontal gyrus (IFG), a site associated with semantic control and retrieval, showed both patterns at different time points. In the cTMS experiment, stimulation of ATL at ∼150 msec disrupted the efficient retrieval of strong associations, indicating a necessary role for ATL in coherent conceptual activations. Stimulation of pMTG at the onset of the second word disrupted the retrieval of weak associations, suggesting this site may maintain information about semantic context from the first word, allowing efficient engagement of semantic control. Together these studies provide converging evidence for a functional dissociation within the temporal lobe, across both tasks and time. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  4. E-Learning System Overview Based on Semantic Web

    ERIC Educational Resources Information Center

    Alsultanny, Yas A.

    2006-01-01

    The challenge of the semantic web is the provision of distributed information with well-defined meaning, understandable to different parties. e-Learning is efficient, task-relevant and just-in-time learning grown from the learning requirements of the new dynamically changing, distributed business world. In this paper we design an e-Learning system…

  5. An Effective Semantic Event Matching System in the Internet of Things (IoT) Environment.

    PubMed

    Alhakbani, Noura; Hassan, Mohammed Mehedi; Ykhlef, Mourad

    2017-09-02

    IoT sensors use the publish/subscribe model for communication to benefit from its decoupled nature with respect to space, time, and synchronization. Because of the heterogeneity of communicating parties, semantic decoupling is added as a fourth dimension. The added semantic decoupling complicates the matching process and reduces its efficiency. Our proposed algorithm clusters subscriptions and events according to topic and performs the matching process within these clusters, which increases the throughput by reducing the matching time from the range of 16-18 ms to 2-4 ms. Moreover, the accuracy of matching is improved when subscriptions must be fully approximated, as demonstrated by an over 40% increase in F-score results. This work shows the benefit of clustering, as well as the improvement in the matching accuracy and efficiency achieved using this approach.
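
    A minimal sketch of the clustering idea, with predicate matching reduced to attribute equality for illustration (the paper's matcher and its approximation machinery are more elaborate): subscriptions are bucketed by topic, so an incoming event is matched only against the subscriptions in its own topic cluster instead of the full subscription set.

      from collections import defaultdict

      subs_by_topic = defaultdict(list)

      def subscribe(topic, predicate, callback):
          subs_by_topic[topic].append((predicate, callback))

      def publish(topic, event):
          for predicate, callback in subs_by_topic[topic]:   # cluster-local scan
              if all(event.get(k) == v for k, v in predicate.items()):
                  callback(event)

      subscribe("temperature", {"room": "lab"}, lambda e: print("alert:", e))
      publish("temperature", {"room": "lab", "value": 31})     # matches
      publish("temperature", {"room": "office", "value": 22})  # no match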

  6. Spatial information semantic query based on SPARQL

    NASA Astrophysics Data System (ADS)

    Xiao, Zhifeng; Huang, Lei; Zhai, Xiaofang

    2009-10-01

    How can the efficiency of spatial information queries be enhanced in today's fast-growing information age? We are rich in geospatial data but poor in up-to-date geospatial information and knowledge that is ready to be accessed by public users. This paper adopts an approach to querying spatial semantics by building an ontology in Web Ontology Language (OWL) format and introducing the SPARQL Protocol and RDF Query Language (SPARQL) to search spatial semantic relations. It is important to establish spatial semantics that support effective spatial reasoning for performing semantic queries. Compared to earlier keyword-based information retrieval techniques that rely on syntax, we use semantic approaches in our spatial query system. Semantic approaches need to be developed through ontology, so we use OWL to describe spatial information extracted from the large-scale map of Wuhan. Spatial information expressed by ontology with formal semantics is available to machines for processing and to people for understanding. The approach is illustrated by a case study using SPARQL to query geo-spatial ontology instances of Wuhan. The paper shows that using SPARQL to search OWL ontology instances can ensure the accuracy and applicability of the results. The results also indicate that constructing a geo-spatial semantic query system has a positive effect on spatial query formulation and retrieval.
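
    On the query side, the pattern looks roughly as follows, assuming the rdflib library; the ontology file name, namespace, and ex:adjacentTo property are invented stand-ins for the Wuhan ontology, but the rdflib/SPARQL usage itself is standard.

      from rdflib import Graph

      g = Graph()
      g.parse("wuhan_spatial.owl", format="xml")   # hypothetical OWL ontology file

      query = """
      PREFIX ex: <http://example.org/spatial#>
      SELECT ?district ?neighbour
      WHERE {
          ?district ex:adjacentTo ?neighbour .
      }
      """
      for district, neighbour in g.query(query):
          print(district, "adjacent to", neighbour)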

  7. Semantic Document Model to Enhance Data and Knowledge Interoperability

    NASA Astrophysics Data System (ADS)

    Nešić, Saša

    To enable document data and knowledge to be efficiently shared and reused across application, enterprise, and community boundaries, desktop documents should be completely open and queryable resources, whose data and knowledge are represented in a form understandable to both humans and machines. At the same time, these are the requirements that desktop documents need to satisfy in order to contribute to the visions of the Semantic Web. With the aim of achieving this goal, we have developed the Semantic Document Model (SDM), which turns desktop documents into Semantic Documents as uniquely identified and semantically annotated composite resources that can be instantiated into human-readable (HR) and machine-processable (MP) forms. In this paper, we present the SDM along with an RDF and ontology-based solution for the MP document instance. Moreover, on top of the proposed model, we have built the Semantic Document Management System (SDMS), which provides a set of services that exploit the model. As an application example that takes advantage of SDMS services, we have extended MS Office with a set of tools that enables users to transform MS Office documents (e.g., MS Word and MS PowerPoint) into Semantic Documents, and to search local and distant semantic document repositories for document content units (CUs) over Semantic Web protocols.

  8. Direct current stimulation over the anterior temporal areas boosts semantic processing in primary progressive aphasia.

    PubMed

    Teichmann, Marc; Lesoil, Constance; Godard, Juliette; Vernet, Marine; Bertrand, Anne; Levy, Richard; Dubois, Bruno; Lemoine, Laurie; Truong, Dennis Q; Bikson, Marom; Kas, Aurélie; Valero-Cabré, Antoni

    2016-11-01

    Noninvasive brain stimulation is a promising approach in primary progressive aphasia (PPA). Yet, applied only to single cases or in insufficiently controlled small-cohort studies, it has not yet clarified its therapeutic value. We here address the effectiveness of transcranial direct current stimulation (tDCS) in the semantic PPA variant (sv-PPA), applying a rigorous study design to a large, homogeneous sv-PPA cohort. Using a double-blind, sham-controlled, counterbalanced cross-over design, we applied three tDCS conditions targeting the temporal poles of 12 sv-PPA patients. Efficacy was assessed by a semantic matching task orthogonally manipulating "living"/"nonliving" categories and verbal/visual modalities. Conforming to the predominantly left-lateralized damage in sv-PPA and accounts of interhemispheric inhibition, we applied left-hemisphere anodal-excitatory and right-hemisphere cathodal-inhibitory tDCS, compared to sham stimulation. Prestimulation data, compared to 15 healthy controls, showed that patients had semantic disorders predominating for living categories in the verbal modality. Stimulation selectively impacted these most impaired domains: left-excitatory and right-inhibitory tDCS improved semantic accuracy in the verbal modality, and right-inhibitory tDCS improved processing speed with living categories, as well as accuracy and processing speed in the combined verbal × living condition. Our findings demonstrate the efficacy of tDCS in sv-PPA by generating highly specific intrasemantic effects. They provide "proof of concept" for future applications of tDCS in therapeutic multiday regimes, potentially driving sustained improvement of semantic processing. Our data also support the hotly debated existence of a left temporal-pole network for verbal semantics, selectively modulated through both left-excitatory and right-inhibitory brain stimulation. Ann Neurol 2016;80:693-707. © 2016 American Neurological Association.

  9. Classification with an edge: Improving semantic image segmentation with boundary detection

    NASA Astrophysics Data System (ADS)

    Marmanis, D.; Schindler, K.; Wegner, J. D.; Galliani, S.; Datcu, M.; Stilla, U.

    2018-01-01

    We present an end-to-end trainable deep convolutional neural network (DCNN) for semantic segmentation with built-in awareness of semantically meaningful boundaries. Semantic segmentation is a fundamental remote sensing task, and most state-of-the-art methods rely on DCNNs as their workhorse. A major reason for their success is that deep networks learn to accumulate contextual information over very large receptive fields. However, this success comes at a cost, since the associated loss of effective spatial resolution washes out high-frequency details and leads to blurry object boundaries. Here, we propose to counter this effect by combining semantic segmentation with semantically informed edge detection, thus making class boundaries explicit in the model. First, we construct a comparatively simple, memory-efficient model by adding boundary detection to the SEGNET encoder-decoder architecture. Second, we also include boundary detection in FCN-type models and set up a high-end classifier ensemble. We show that boundary detection significantly improves semantic segmentation with CNNs in an end-to-end training scheme. Our best model achieves >90% overall accuracy on the ISPRS Vaihingen benchmark.
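
    One way to make class boundaries explicit in training, consistent with the general recipe described (though not the authors' exact architecture), is a joint objective that adds a boundary-detection term to the per-pixel segmentation cross-entropy. A minimal numpy sketch, assuming softmax segmentation outputs and sigmoid edge outputs:

      import numpy as np

      def joint_loss(seg_probs, seg_labels, edge_probs, edge_labels, lam=0.5):
          """seg_probs: (H, W, C) softmax outputs; seg_labels: (H, W) class ids;
          edge_probs/edge_labels: (H, W) sigmoid outputs and 0/1 edge maps."""
          h, w = seg_labels.shape
          eps = 1e-7
          # Per-pixel cross-entropy on the probability of the true class.
          true_class_probs = seg_probs[np.arange(h)[:, None], np.arange(w), seg_labels]
          ce = -np.log(true_class_probs + eps).mean()
          # Binary cross-entropy on the semantically informed edge map.
          bce = -(edge_labels * np.log(edge_probs + eps)
                  + (1 - edge_labels) * np.log(1 - edge_probs + eps)).mean()
          return ce + lam * bce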

  10. Recognition of Equations Using a Two-Dimensional Stochastic Context-Free Grammar

    NASA Astrophysics Data System (ADS)

    Chou, Philip A.

    1989-11-01

    We propose using two-dimensional stochastic context-free grammars for image recognition, in a manner analogous to using hidden Markov models for speech recognition. The value of the approach is demonstrated in a system that recognizes printed, noisy equations. The system uses a two-dimensional probabilistic version of the Cocke-Younger-Kasami parsing algorithm to find the most likely parse of the observed image, and then traverses the corresponding parse tree in accordance with translation formats associated with each production rule, to produce eqn | troff commands for the imaged equation. In addition, it uses two-dimensional versions of the Inside/Outside and Baum re-estimation algorithms for learning the parameters of the grammar from a training set of examples. Parsing the image of a simple noisy equation currently takes about one second of cpu time on an Alliant FX/80.
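
    The probabilistic CYK chart-filling that the system generalizes to two dimensions is compact in its standard one-dimensional form: for a grammar in Chomsky normal form, record the best probability of deriving each input span from each nonterminal. The toy grammar below is invented for illustration.

      from collections import defaultdict

      lexical = {("Det", "the"): 1.0, ("N", "dog"): 1.0, ("N", "cat"): 1.0,
                 ("N", "saw"): 0.3, ("V", "saw"): 0.7}
      binary = {("S", ("NP", "VP")): 1.0, ("NP", ("Det", "N")): 1.0,
                ("VP", ("V", "NP")): 1.0}

      def cyk(words):
          n = len(words)
          best = defaultdict(float)            # (i, j, A) -> best probability
          for i, w in enumerate(words):        # fill in length-1 spans
              for (a, word), p in lexical.items():
                  if word == w:
                      best[i, i + 1, a] = max(best[i, i + 1, a], p)
          for span in range(2, n + 1):         # combine shorter spans
              for i in range(n - span + 1):
                  j = i + span
                  for k in range(i + 1, j):
                      for (a, (b, c)), p in binary.items():
                          q = p * best[i, k, b] * best[k, j, c]
                          best[i, j, a] = max(best[i, j, a], q)
          return best[0, n, "S"]

      print(cyk("the dog saw the cat".split()))  # 0.7: 'saw' resolved as a verb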

  11. Prosody and parsing in coordination structures.

    PubMed

    Schepman, A; Rodway, P

    2000-05-01

    The effect of prosodic boundary cues on the off-line disambiguation and on-line parsing of coordination structures was examined. It was found that relative clauses were attached to coordinated object noun phrases in preference to second conjuncts in sentences like: The lawyer greeted the powerful barrister and the wise judge who was/were walking to the courtroom. Naive speakers signalled the syntactic contrast between the two structures by a prosodic break between the conjuncts when the relative clause was attached to the second conjunct. Listeners were able to use this prosodic information in both off-line syntactic disambiguation and on-line syntactic parsing. The findings are compatible with a model in which prosody has a strong immediate effect on parsing. It is argued that the current experimental design has avoided confounds present in earlier studies on the on-line integration of prosodic and syntactic information.

  12. A structural SVM approach for reference parsing.

    PubMed

    Zhang, Xiaoli; Zou, Jie; Le, Daniel X; Thoma, George R

    2011-06-09

    Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references, is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can also provide valuable information for extracting other bibliographic data. Therefore, parsing individual references to extract author, title, journal, year, etc. is sometimes a necessary preprocessing step in building citation-indexing systems. The regular structure of references enables us to treat reference parsing as a sequence learning problem and to study the structural Support Vector Machine (structural SVM), a newly developed structured learning algorithm, on parsing references. In this study, we implemented structural SVM and used two types of contextual features to compare structural SVM with conventional SVM. Both methods achieve above 98% token classification accuracy and above 95% overall chunk-level accuracy for reference parsing. We also compared SVM and structural SVM to the Conditional Random Field (CRF). The experimental results show that structural SVM and CRF achieve similar accuracies at the token and chunk levels. When only basic observation features are used for each token, structural SVM achieves higher performance than SVM, since it utilizes the contextual label features. However, when the contextual observation features from neighboring tokens are combined, SVM performance improves greatly and is close to that of structural SVM after adding the second-order contextual observation features. The comparison of these two methods with CRF using the same set of binary features shows that both structural SVM and CRF perform better than SVM, indicating their stronger sequence learning ability in reference parsing.
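
    The contextual observation features discussed above are straightforward to sketch: compute basic per-token observations, then copy the neighbours' observations in as first- and second-order context. The feature inventory below is illustrative and much smaller than the paper's.

      import re

      def basic(tok):
          return {
              "lower": tok.lower(),
              "is_capitalized": tok[:1].isupper(),
              "is_year": bool(re.fullmatch(r"(19|20)\d{2}[.,;]?", tok)),
              "has_digit": any(ch.isdigit() for ch in tok),
              "punct": tok in {",", ".", ";", ":", "(", ")"},
          }

      def features(tokens, i):
          feats = {f"0:{k}": v for k, v in basic(tokens[i]).items()}
          for off in (-2, -1, 1, 2):             # contextual observations
              j = i + off
              if 0 <= j < len(tokens):
                  feats.update({f"{off}:{k}": v for k, v in basic(tokens[j]).items()})
          return feats

      ref = "Zhang X , Zou J . A structural SVM approach . 2011 .".split()
      print(features(ref, ref.index("2011"))["0:is_year"])   # True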

  13. Perception of scene-relative object movement: Optic flow parsing and the contribution of monocular depth cues.

    PubMed

    Warren, Paul A; Rushton, Simon K

    2009-05-01

    We have recently suggested that the brain uses its sensitivity to optic flow in order to parse retinal motion into components arising due to self and object movement (e.g. Rushton, S. K., & Warren, P. A. (2005). Moving observers, 3D relative motion and the detection of object movement. Current Biology, 15, R542-R543). Here, we explore whether stereo disparity is necessary for flow parsing or whether other sources of depth information, which could theoretically constrain flow-field interpretation, are sufficient. Stationary observers viewed large field of view stimuli containing textured cubes, moving in a manner that was consistent with a complex observer movement through a stationary scene. Observers made speeded responses to report the perceived direction of movement of a probe object presented at different depths in the scene. Across conditions we varied the presence or absence of different binocular and monocular cues to depth order. In line with previous studies, results consistent with flow parsing (in terms of both perceived direction and response time) were found in the condition in which motion parallax and stereoscopic disparity were present. Observers were poorer at judging object movement when depth order was specified by parallax alone. However, as more monocular depth cues were added to the stimulus the results approached those found when the scene contained stereoscopic cues. We conclude that both monocular and binocular static depth information contribute to flow parsing. These findings are discussed in the context of potential architectures for a model of the flow parsing mechanism.

  14. A Metadata based Knowledge Discovery Methodology for Seeding Translational Research.

    PubMed

    Kothari, Cartik R; Payne, Philip R O

    2015-01-01

    In this paper, we present a semantic, metadata based knowledge discovery methodology for identifying teams of researchers from diverse backgrounds who can collaborate on interdisciplinary research projects: projects in areas that have been identified as high-impact areas at The Ohio State University. This methodology involves the semantic annotation of keywords and the postulation of semantic metrics to improve the efficiency of the path exploration algorithm as well as to rank the results. Results indicate that our methodology can discover groups of experts from diverse areas who can collaborate on translational research projects.

  15. Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration

    PubMed Central

    2011-01-01

    Background: Over the past several centuries, chemistry has permeated virtually every facet of human lifestyle, enriching fields as diverse as medicine, agriculture, manufacturing, warfare, and electronics, among numerous others. Unfortunately, application-specific, incompatible chemical information formats and representation strategies have emerged as a result of such diverse adoption of chemistry. Although a number of efforts have been dedicated to unifying the computational representation of chemical information, disparities between the various chemical databases still persist and stand in the way of cross-domain, interdisciplinary investigations. Through a common syntax and formal semantics, Semantic Web technology offers the ability to accurately represent, integrate, reason about and query across diverse chemical information. Results: Here we specify and implement the Chemical Entity Semantic Specification (CHESS) for the representation of polyatomic chemical entities, their substructures, bonds, atoms, and reactions using Semantic Web technologies. CHESS provides means to capture aspects of their corresponding chemical descriptors, connectivity, functional composition, and geometric structure while specifying mechanisms for data provenance. We demonstrate that using our readily extensible specification, it is possible to efficiently integrate multiple disparate chemical data sources, while retaining appropriate correspondence of chemical descriptors, with very little additional effort. We demonstrate the impact of some of our representational decisions on the performance of chemically-aware knowledgebase searching and rudimentary reaction candidate selection. Finally, we provide access to the tools necessary to carry out chemical entity encoding in CHESS, along with a sample knowledgebase. Conclusions: By harnessing the power of Semantic Web technologies with CHESS, it is possible to provide a means of facile cross-domain chemical knowledge integration with full preservation of data correspondence and provenance. Our representation builds on existing cheminformatics technologies and, by the virtue of RDF specification, remains flexible and amenable to application- and domain-specific annotations without compromising chemical data integration. We conclude that the adoption of a consistent and semantically-enabled chemical specification is imperative for surviving the coming chemical data deluge and supporting systems science research. PMID:21595881

  16. Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration.

    PubMed

    Chepelev, Leonid L; Dumontier, Michel

    2011-05-19

    Over the past several centuries, chemistry has permeated virtually every facet of human lifestyle, enriching fields as diverse as medicine, agriculture, manufacturing, warfare, and electronics, among numerous others. Unfortunately, application-specific, incompatible chemical information formats and representation strategies have emerged as a result of such diverse adoption of chemistry. Although a number of efforts have been dedicated to unifying the computational representation of chemical information, disparities between the various chemical databases still persist and stand in the way of cross-domain, interdisciplinary investigations. Through a common syntax and formal semantics, Semantic Web technology offers the ability to accurately represent, integrate, reason about and query across diverse chemical information. Here we specify and implement the Chemical Entity Semantic Specification (CHESS) for the representation of polyatomic chemical entities, their substructures, bonds, atoms, and reactions using Semantic Web technologies. CHESS provides means to capture aspects of their corresponding chemical descriptors, connectivity, functional composition, and geometric structure while specifying mechanisms for data provenance. We demonstrate that using our readily extensible specification, it is possible to efficiently integrate multiple disparate chemical data sources, while retaining appropriate correspondence of chemical descriptors, with very little additional effort. We demonstrate the impact of some of our representational decisions on the performance of chemically-aware knowledgebase searching and rudimentary reaction candidate selection. Finally, we provide access to the tools necessary to carry out chemical entity encoding in CHESS, along with a sample knowledgebase. By harnessing the power of Semantic Web technologies with CHESS, it is possible to provide a means of facile cross-domain chemical knowledge integration with full preservation of data correspondence and provenance. Our representation builds on existing cheminformatics technologies and, by the virtue of RDF specification, remains flexible and amenable to application- and domain-specific annotations without compromising chemical data integration. We conclude that the adoption of a consistent and semantically-enabled chemical specification is imperative for surviving the coming chemical data deluge and supporting systems science research.

  17. Influence of First Language Orthographic Experience on Second Language Decoding and Word Learning

    ERIC Educational Resources Information Center

    Hamada, Megumi; Koda, Keiko

    2008-01-01

    This study examined the influence of first language (L1) orthographic experiences on decoding and semantic information retention of new words in a second language (L2). Hypotheses were that congruity in L1 and L2 orthographic experiences determines L2 decoding efficiency, which, in turn, affects semantic information encoding and retention.…

  18. A Semantic-Oriented Approach for Organizing and Developing Annotation for E-Learning

    ERIC Educational Resources Information Center

    Brut, Mihaela M.; Sedes, Florence; Dumitrescu, Stefan D.

    2011-01-01

    This paper presents a solution to extend the IEEE LOM standard with ontology-based semantic annotations for efficient use of learning objects outside Learning Management Systems. The data model corresponding to this approach is first presented. The proposed indexing technique for this model development in order to acquire a better annotation of…

  19. Hardware independence checkout software

    NASA Technical Reports Server (NTRS)

    Cameron, Barry W.; Helbig, H. R.

    1990-01-01

    ACSI has developed a program utilizing CLIPS to assess compliance with various programming standards. Essentially, the program parses C code to extract the names of all function calls. These are asserted as CLIPS facts, which also include information about line numbers, source file names, and called functions. Rules have been devised to identify called functions that are not defined in any of the parsed source. These are compared against lists of standards (represented as facts) using rules that check intersections and/or unions of these lists. By piping the output into other processes, the source is appropriately commented by generating and executing scripts.
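
    A rough Python analogue of the extraction step: scan C source for identifiers followed by an opening parenthesis, record call sites with line numbers, and subtract locally defined functions to obtain the undefined (external) calls that the rules compare against standards lists. A regex is only an approximation of real C parsing (comments, strings, and macros are ignored).

      import re

      CALL = re.compile(r"\b([A-Za-z_]\w*)\s*\(")
      KEYWORDS = {"if", "while", "for", "switch", "return", "sizeof"}

      def calls_in(source):
          found = []
          for lineno, line in enumerate(source.splitlines(), start=1):
              for name in CALL.findall(line):
                  if name not in KEYWORDS:
                      found.append((name, lineno))
          return found

      src = 'int main(void) {\n    printf("hi\\n");\n    return 0;\n}\n'
      defined = {"main"}
      external = [(n, l) for n, l in calls_in(src) if n not in defined]
      print(external)        # [('printf', 2)] -> check against standards lists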

  20. ParseCNV integrative copy number variation association software with quality tracking

    PubMed Central

    Glessner, Joseph T.; Li, Jin; Hakonarson, Hakon

    2013-01-01

    A number of copy number variation (CNV) calling algorithms exist; however, comprehensive software tools for CNV association studies are lacking. We describe ParseCNV, unique software that takes CNV calls and creates probe-based statistics for CNV occurrence in both case–control designs and in family based studies, addressing both de novo and inheritance events, which are then summarized based on CNV regions (CNVRs). CNVRs are defined in a dynamic manner to allow for complex CNV overlap while maintaining precise association regions. Using this approach, we avoid the failure-to-converge and non-monotonic curve-fitting weaknesses of programs such as CNVtools and CNVassoc; and although Plink is easy to use, it only provides combined CNV state probe-based statistics, not state-specific CNVRs. Existing CNV association methods do not provide any quality tracking information to filter confident associations, a key issue which is fully addressed by ParseCNV. In addition, uncertainty in the CNV calls underlying CNV associations is evaluated to verify significant results, including CNV overlap profiles, genomic context, the number of probes supporting the CNV and single-probe intensities. When optimal quality control parameters are followed using ParseCNV, 90% of CNVs validate by polymerase chain reaction, an often problematic stage because of inadequate review of significant associations. ParseCNV is freely available at http://parsecnv.sourceforge.net. PMID:23293001

  1. ParseCNV integrative copy number variation association software with quality tracking.

    PubMed

    Glessner, Joseph T; Li, Jin; Hakonarson, Hakon

    2013-03-01

    A number of copy number variation (CNV) calling algorithms exist; however, comprehensive software tools for CNV association studies are lacking. We describe ParseCNV, unique software that takes CNV calls and creates probe-based statistics for CNV occurrence in both case-control designs and in family based studies, addressing both de novo and inheritance events, which are then summarized based on CNV regions (CNVRs). CNVRs are defined in a dynamic manner to allow for complex CNV overlap while maintaining precise association regions. Using this approach, we avoid the failure-to-converge and non-monotonic curve-fitting weaknesses of programs such as CNVtools and CNVassoc; and although Plink is easy to use, it only provides combined CNV state probe-based statistics, not state-specific CNVRs. Existing CNV association methods do not provide any quality tracking information to filter confident associations, a key issue which is fully addressed by ParseCNV. In addition, uncertainty in the CNV calls underlying CNV associations is evaluated to verify significant results, including CNV overlap profiles, genomic context, the number of probes supporting the CNV and single-probe intensities. When optimal quality control parameters are followed using ParseCNV, 90% of CNVs validate by polymerase chain reaction, an often problematic stage because of inadequate review of significant associations. ParseCNV is freely available at http://parsecnv.sourceforge.net.

  2. An efficient and provable secure revocable identity-based encryption scheme.

    PubMed

    Wang, Changji; Li, Yuan; Xia, Xiaonan; Zheng, Kangjia

    2014-01-01

    Revocation functionality is necessary and crucial to identity-based cryptosystems. Revocable identity-based encryption (RIBE) has attracted a lot of attention in recent years; many RIBE schemes have been proposed in the literature but have been shown to be either insecure or inefficient. In this paper, we propose a new scalable RIBE scheme with decryption key exposure resilience by combining Lewko and Waters' identity-based encryption scheme with the complete subtree method, and we prove our RIBE scheme to be semantically secure using the dual system encryption methodology. Compared to existing scalable and semantically secure RIBE schemes, our proposed RIBE scheme is more efficient in terms of ciphertext size, public parameter size and decryption cost, at the price of a slightly looser security reduction. To the best of our knowledge, this is the first construction of a scalable and semantically secure RIBE scheme with constant-size public system parameters.

  3. An Effective Semantic Event Matching System in the Internet of Things (IoT) Environment

    PubMed Central

    Alhakbani, Noura; Ykhlef, Mourad

    2017-01-01

    IoT sensors use the publish/subscribe model for communication to benefit from its decoupled nature with respect to space, time, and synchronization. Because of the heterogeneity of communicating parties, semantic decoupling is added as a fourth dimension. The added semantic decoupling complicates the matching process and reduces its efficiency. Our proposed algorithm clusters subscriptions and events according to topic and performs the matching process within these clusters, which increases the throughput by reducing the matching time from the range of 16–18 ms to 2–4 ms. Moreover, the accuracy of matching is improved when subscriptions must be fully approximated, as demonstrated by an over 40% increase in F-score results. This work shows the benefit of clustering, as well as the improvement in the matching accuracy and efficiency achieved using this approach. PMID:28869508

  4. Semantic processing in connected speech at a uniformly early stage of autopsy-confirmed Alzheimer's disease.

    PubMed

    Ahmed, Samrah; de Jager, Celeste A; Haigh, Anne-Marie; Garrard, Peter

    2013-01-01

    The aim of the present study was to quantify the semantic content of connected speech produced by patients at a uniformly early stage of pathologically proven Alzheimer's disease (AD). A secondary aim was to establish whether semantic units were reduced globally, or whether there was a disproportionate reduction in specific classes of information. Discourse samples were obtained from 18 pathologically confirmed AD patients and 18 matched controls. Semantic unit identification was scored overall and for four subclasses: subjects, locations, objects, and actions. Idea density and efficiency were calculated. AD transcripts showed significantly reduced units overall, particularly actions and subjects, as well as reduced efficiency. Total semantic units and a combination of subject-, location-, and object-related units ("noun" units) correlated with the Expression subscore of the Cambridge Cognitive Examination (CAMCOG). Subject-related units correlated with the CAMCOG Abstract Thinking scale. Logistic regression analyses confirmed that all measures that were lower in AD than in controls were predictive of group membership. An exploratory comparison between units expressed mainly using nouns and those expressed mainly using verbs showed that the latter was the stronger of the two predictors. The present study adds a lexico-semantic dimension to the linguistic profile based on discourse analysis in typical AD recently described by the same authors (2012, 83(11): 1056-1062). The suggestion of a differential importance of verb and noun use in the present study may be related to the reduction in syntactic complexity that was reported, using the same set of discourse samples, in the earlier study.

  5. BioWarehouse: a bioinformatics database warehouse toolkit

    PubMed Central

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D

    2006-01-01

    Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the database integration problem for bioinformatics. PMID:16556315

  6. BioWarehouse: a bioinformatics database warehouse toolkit.

    PubMed

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D

    2006-03-23

    This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.
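
    The warehouse idea in miniature: once the loaders have put every source into one schema, a single SQL query can span sources. The SQLite schema below is an invented stand-in for the real BioWarehouse schema, used to pose the kind of query both records describe, i.e., which enzyme activities lack any sequence.

      import sqlite3

      db = sqlite3.connect(":memory:")
      db.executescript("""
      CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
      CREATE TABLE protein_seq (ec_number TEXT, sequence TEXT);
      INSERT INTO enzyme_activity VALUES ('1.1.1.1'), ('2.7.7.7'), ('4.2.1.11');
      INSERT INTO protein_seq VALUES ('1.1.1.1', 'MST...'), ('2.7.7.7', 'MKV...');
      """)
      row = db.execute("""
          SELECT COUNT(*) * 100.0 / (SELECT COUNT(*) FROM enzyme_activity)
          FROM enzyme_activity a
          LEFT JOIN protein_seq p USING (ec_number)
          WHERE p.ec_number IS NULL
      """).fetchone()
      print(f"{row[0]:.0f}% of activities lack a sequence")   # 33% on this toy data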

  7. Using XML to encode TMA DES metadata.

    PubMed

    Lyttleton, Oliver; Wright, Alexander; Treanor, Darren; Lewis, Paul

    2011-01-01

    The Tissue Microarray Data Exchange Specification (TMA DES) is an XML specification for encoding TMA experiment data. While TMA DES data is encoded in XML, the files that describe its syntax, structure, and semantics are not. The DTD format is used to describe the syntax and structure of TMA DES, and the ISO 11179 format is used to define the semantics of TMA DES. However, XML Schema can be used in place of DTDs, and another XML encoded format, RDF, can be used in place of ISO 11179. Encoding all TMA DES data and metadata in XML would simplify the development and usage of programs which validate and parse TMA DES data. XML Schema has advantages over DTDs such as support for data types, and a more powerful means of specifying constraints on data values. An advantage of RDF encoded in XML over ISO 11179 is that XML defines rules for encoding data, whereas ISO 11179 does not. We created an XML Schema version of the TMA DES DTD. We wrote a program that converted ISO 11179 definitions to RDF encoded in XML, and used it to convert the TMA DES ISO 11179 definitions to RDF. We validated a sample TMA DES XML file that was supplied with the publication that originally specified TMA DES using our XML Schema. We successfully validated the RDF produced by our ISO 11179 converter with the W3C RDF validation service. All TMA DES data could be encoded using XML, which simplifies its processing. XML Schema allows datatypes and valid value ranges to be specified for CDEs, which enables a wider range of error checking to be performed using XML Schemas than could be performed using DTDs.

  8. Using XML to encode TMA DES metadata

    PubMed Central

    Lyttleton, Oliver; Wright, Alexander; Treanor, Darren; Lewis, Paul

    2011-01-01

    Background: The Tissue Microarray Data Exchange Specification (TMA DES) is an XML specification for encoding TMA experiment data. While TMA DES data is encoded in XML, the files that describe its syntax, structure, and semantics are not. The DTD format is used to describe the syntax and structure of TMA DES, and the ISO 11179 format is used to define the semantics of TMA DES. However, XML Schema can be used in place of DTDs, and another XML encoded format, RDF, can be used in place of ISO 11179. Encoding all TMA DES data and metadata in XML would simplify the development and usage of programs which validate and parse TMA DES data. XML Schema has advantages over DTDs such as support for data types, and a more powerful means of specifying constraints on data values. An advantage of RDF encoded in XML over ISO 11179 is that XML defines rules for encoding data, whereas ISO 11179 does not. Materials and Methods: We created an XML Schema version of the TMA DES DTD. We wrote a program that converted ISO 11179 definitions to RDF encoded in XML, and used it to convert the TMA DES ISO 11179 definitions to RDF. Results: We validated a sample TMA DES XML file that was supplied with the publication that originally specified TMA DES using our XML Schema. We successfully validated the RDF produced by our ISO 11179 converter with the W3C RDF validation service. Conclusions: All TMA DES data could be encoded using XML, which simplifies its processing. XML Schema allows datatypes and valid value ranges to be specified for CDEs, which enables a wider range of error checking to be performed using XML Schemas than could be performed using DTDs. PMID:21969921

  9. Semantic annotation in biomedicine: the current landscape.

    PubMed

    Jovanović, Jelena; Bagheri, Ebrahim

    2017-09-22

    The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators. Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state-of-the-art biomedical semantic annotators, focusing particularly on general-purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.

  10. Orthographic versus semantic matching in visual search for words within lists.

    PubMed

    Léger, Laure; Rouet, Jean-François; Ros, Christine; Vibert, Nicolas

    2012-03-01

    An eye-tracking experiment was performed to assess the influence of orthographic and semantic distractor words on visual search for words within lists. The target word (e.g., "raven") was either shown to participants before the search (literal search) or defined by its semantic category (e.g., "bird", categorical search). In both cases, the type of words included in the list affected visual search times and eye movement patterns. In the literal condition, the presence of orthographic distractors sharing initial and final letters with the target word strongly increased search times. Indeed, the orthographic distractors attracted participants' gaze and were fixated for longer times than other words in the list. The presence of semantic distractors related to the target word also increased search times, which suggests that significant automatic semantic processing of nontarget words took place. In the categorical condition, semantic distractors were expected to have a greater impact on the search task. As expected, the presence in the list of semantic associates of the target word led to target selection errors. However, semantic distractors did not significantly increase search times any more, whereas orthographic distractors still did. Hence, the visual characteristics of nontarget words can be strong predictors of the efficiency of visual search even when the exact target word is unknown. The respective impacts of orthographic and semantic distractors depended more on the characteristics of lists than on the nature of the search task.

  11. Differentiation of perceptual and semantic subsequent memory effects using an orthographic paradigm.

    PubMed

    Kuo, Michael C C; Liu, Karen P Y; Ting, Kin Hung; Chan, Chetwyn C H

    2012-11-27

    This study aimed to differentiate perceptual and semantic encoding processes using subsequent memory effects (SMEs) elicited by the recognition of orthographs of single Chinese characters. Participants studied a series of Chinese characters perceptually (by inspecting orthographic components) or semantically (by determining the object making sounds), and then made studied or unstudied judgments during the recognition phase. Recognition performance in terms of the d-prime measure was higher, though not significantly, in the semantic condition than in the perceptual condition. The differences in SMEs between the perceptual and semantic conditions at P550 and late positive component latencies (700-1000 ms) were not significant in the frontal area. An additional analysis identified a larger SME in the semantic condition during 600-1000 ms in the frontal pole regions. These results indicate that coordination and incorporation of orthographic information into mental representation is essential to both task conditions. The differentiation was also revealed in earlier SMEs (perceptual > semantic) at the N3 (240-360 ms) latency, which is a novel finding. The left-distributed N3 was interpreted as reflecting more efficient processing of meaning with semantically learned characters. Frontal pole SMEs indicated strategic processing by executive functions, which would further enhance memory.

  12. Qualitative dynamics semantics for SBGN process description.

    PubMed

    Rougny, Adrien; Froidevaux, Christine; Calzone, Laurence; Paulevé, Loïc

    2016-06-16

    Qualitative dynamics semantics provide a coarse-grain modeling of network dynamics by abstracting away kinetic parameters. They make it possible to capture general features of systems dynamics, such as attractors or reachability properties, for which scalable analyses exist. The Systems Biology Graphical Notation Process Description language (SBGN-PD) has become a standard to represent reaction networks. However, no qualitative dynamics semantics taking into account all the main features available in SBGN-PD had been proposed so far. We propose two qualitative dynamics semantics for SBGN-PD reaction networks, namely the general semantics and the stories semantics, which we formalize using asynchronous automata networks. While the general semantics extends the standard Boolean semantics of reaction networks by taking into account all the main features of SBGN-PD, the stories semantics allows several molecules of a network to be modeled by a single variable. The obtained qualitative models can be checked against dynamical properties and therefore validated with respect to biological knowledge. We apply our framework to reason on the qualitative dynamics of a large network (more than 200 nodes) modeling the regulation of the cell cycle by RB/E2F. The proposed semantics provide a direct formalization of SBGN-PD networks in qualitative dynamical models that can be further analyzed using standard tools for discrete models. The dynamics under the stories semantics have a lower dimension than under the general semantics and prune multiple behaviors (which can be considered spurious) by enforcing mutual exclusiveness between the activity of different nodes of the same story. Overall, the qualitative semantics for SBGN-PD make it possible to capture efficiently important dynamical features of reaction network models, which can be exploited to further refine them.
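
    To make the notion of an asynchronous automata network concrete, here is a toy Boolean sketch in Python; the three-node network is invented for illustration and is unrelated to the RB/E2F model analyzed in the paper:

      from itertools import product

      # Next-state function of each automaton over the global state.
      rules = {
          "a": lambda s: s["c"],
          "b": lambda s: s["a"] and not s["c"],
          "c": lambda s: not s["b"],
      }

      def successors(state):
          """Asynchronous semantics: update one automaton at a time."""
          for node, f in rules.items():
              nxt = dict(state)
              nxt[node] = f(state)
              if nxt != state:
                  yield nxt

      # Enumerate all 2^3 states and report the fixed points (attractors
      # of size one), a typical reachability-style qualitative analysis.
      states = [dict(zip("abc", bits)) for bits in product([False, True], repeat=3)]
      print([s for s in states if not list(successors(s))])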

  13. A neotropical Miocene pollen database employing image-based search and semantic modeling.

    PubMed

    Han, Jing Ginger; Cao, Hongfei; Barb, Adrian; Punyasena, Surangi W; Jaramillo, Carlos; Shyu, Chi-Ren

    2014-08-01

    Digital microscopic pollen images are being generated with increasing speed and volume, producing opportunities to develop new computational methods that increase the consistency and efficiency of pollen analysis and provide the palynological community a computational framework for information sharing and knowledge transfer. • Mathematical methods were used to assign trait semantics (abstract morphological representations) of the images of neotropical Miocene pollen and spores. Advanced database-indexing structures were built to compare and retrieve similar images based on their visual content. A Web-based system was developed to provide novel tools for automatic trait semantic annotation and image retrieval by trait semantics and visual content. • Mathematical models that map visual features to trait semantics can be used to annotate images with morphology semantics and to search image databases with improved reliability and productivity. Images can also be searched by visual content, providing users with customized emphases on traits such as color, shape, and texture. • Content- and semantic-based image searches provide a powerful computational platform for pollen and spore identification. The infrastructure outlined provides a framework for building a community-wide palynological resource, streamlining the process of manual identification, analysis, and species discovery.

  14. Semantic-based surveillance video retrieval.

    PubMed

    Hu, Weiming; Xie, Dan; Fu, Zhouyu; Zeng, Wenrong; Maybank, Steve

    2007-04-01

    Visual surveillance produces large amounts of video data. Effective indexing and retrieval from surveillance video databases are very important. Although there are many ways to represent the content of video clips in current video retrieval algorithms, there still exists a semantic gap between users and retrieval systems. Visual surveillance systems supply a platform for investigating semantic-based video retrieval. In this paper, a semantic-based video retrieval framework for visual surveillance is proposed. A cluster-based tracking algorithm is developed to acquire motion trajectories. The trajectories are then clustered hierarchically using the spatial and temporal information, to learn activity models. A hierarchical structure of semantic indexing and retrieval of object activities, where each individual activity automatically inherits all the semantic descriptions of the activity model to which it belongs, is proposed for accessing video clips and individual objects at the semantic level. The proposed retrieval framework supports various queries including queries by keywords, multiple object queries, and queries by sketch. For multiple object queries, succession and simultaneity restrictions, together with depth and breadth first orders, are considered. For sketch-based queries, a method for matching trajectories drawn by users to spatial trajectories is proposed. The effectiveness and efficiency of our framework are tested in a crowded traffic scene.

  15. Yes! An object-oriented compiler compiler (YOOCC)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Avotins, J.; Mingins, C.; Schmidt, H.

    1995-12-31

    Grammar-based processor generation is one of the most widely studied areas in language processor construction. However, there have been very few approaches to date that reconcile object-oriented principles, processor generation, and an object-oriented language. Pertinent here also is that developing a processor using the Eiffel Parse libraries currently requires far too much time to be expended on tasks that can be automated. For these reasons, we have developed YOOCC (Yes! an Object-Oriented Compiler Compiler), which produces a processor framework from a grammar using an enhanced version of the Eiffel Parse libraries, incorporating the ideas hypothesized by Meyer, and Grape and Walden, as well as many others. Various essential changes have been made to the Eiffel Parse libraries. Examples are presented to illustrate the development of a processor using YOOCC, and it is concluded that the Eiffel Parse libraries are now not only an intelligent, but also a productive, option for processor construction.

  16. Historical Semantic Chaining and Efficient Communication: The Case of Container Names

    ERIC Educational Resources Information Center

    Xu, Yang; Regier, Terry; Malt, Barbara C.

    2016-01-01

    Semantic categories in the world's languages often reflect a historical process of "chaining": A name for one referent is extended to a conceptually related referent, and from there on to other referents, producing a chain of exemplars that all bear the same name. The beginning and end points of such a chain might in principle be rather…

  17. Exploring reading-spelling connection as locus of dyslexia in Chinese.

    PubMed

    Leong, C K; Cheng, P W; Lam, C C

    2000-01-01

    This paper advances the argument that in learning to read/spell Chinese characters and words, it is important for learners to understand the role of the component parts. These constituents consist of phonetic and semantic radicals, or bujians, made up of clusters of strokes in their proper sequence. Beginning readers/spellers need to be sensitive to the positional hierarchy and internal structure of these constituent parts. Those Chinese children diagnosed with developmental dyslexia tend to have more difficulties in spelling Chinese characters and in writing to dictation than in reading. A lexical decision study with two groups of tertiary students differing in their Chinese language ability was carried out to test their efficiency in processing real and pseudo characters as a function of printed frequency of the characters, and the consistency of their component semantic radicals. There is some evidence that even for adult readers differing in their Chinese language ability, lexicality, frequency of characters and the consistency of the semantic radicals affect accurate and rapid character identification. Suggestions for research and teaching approaches are made to enhance the analysis and synthesis of the phonetic and semantic radicals to promote efficient reading and spelling in Chinese.

  18. Using triggered operations to offload collective communication operations.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barrett, Brian W.; Hemmert, K. Scott; Underwood, Keith Douglas

    2010-04-01

    Efficient collective operations are a major component of application scalability. Offload of collective operations onto the network interface reduces many of the latencies that are inherent in network communications and, consequently, reduces the time to perform the collective operation. To support offload, it is desirable to expose semantic building blocks that are simple to offload and yet powerful enough to implement a variety of collective algorithms. This paper presents the implementation of barrier and broadcast leveraging triggered operations - a semantic building block for collective offload. Triggered operations are shown to be both semantically powerful and capable of improving performance.

  19. Construction of a robust, large-scale, collaborative database for raw data in computational chemistry: the Collaborative Chemistry Database Tool (CCDBT).

    PubMed

    Chen, Mingyang; Stott, Amanda C; Li, Shenggang; Dixon, David A

    2012-04-01

    A robust metadata database called the Collaborative Chemistry Database Tool (CCDBT) for massive amounts of computational chemistry raw data has been designed and implemented. It performs data synchronization and simultaneously extracts the metadata. Computational chemistry data in various formats from different computing sources, software packages, and users can be parsed into uniform metadata for storage in a MySQL database. Parsing is performed by a parsing pyramid, including parsers written for different levels of data types and sets created by the parser loader after loading parser engines and configurations.
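
    The layered parsing idea can be sketched as follows in Python; the output format, field names, and the SQLite stand-in for MySQL are illustrative assumptions, not CCDBT's actual schema:

      import re, sqlite3

      def parse_energy(text):
          """Bottom of the pyramid: one parser for one datum."""
          m = re.search(r"Final energy\s*=\s*(-?\d+\.\d+)", text)
          return {"energy_hartree": float(m.group(1))} if m else {}

      def parse_job(text):
          """Higher level: combine low-level parsers into a uniform record."""
          meta = {"program": "hypothetical-qc-code"}
          meta.update(parse_energy(text))
          return meta

      db = sqlite3.connect(":memory:")
      db.execute("CREATE TABLE metadata (program TEXT, energy_hartree REAL)")
      rec = parse_job("... Final energy = -76.026760 ...")
      db.execute("INSERT INTO metadata VALUES (?, ?)",
                 (rec["program"], rec.get("energy_hartree")))
      print(db.execute("SELECT * FROM metadata").fetchall())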

  20. Speed up of XML parsers with PHP language implementation

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2012-11-01

    In this paper, the authors introduce PHP5's XML implementation and show how to read, parse, and write a short and uncomplicated XML file using SimpleXML in a PHP environment. The possibilities for combined use of the PHP5 language and the XML standard are described. The details of the parsing process with SimpleXML are also clarified. A practical PHP-XML-MySQL project presents the advantages of XML implementation in PHP modules. This approach allows comparatively simple search of XML hierarchical data by means of PHP software tools. The proposed project includes a database, which can be extended with new data and new XML parsing functions.

  1. Using RDF to Model the Structure and Process of Systems

    NASA Astrophysics Data System (ADS)

    Rodriguez, Marko A.; Watkins, Jennifer H.; Bollen, Johan; Gershenson, Carlos

    Many systems can be described in terms of networks of discrete elements and their various relationships to one another. A semantic network, or multi-relational network, is a directed labeled graph consisting of a heterogeneous set of entities connected by a heterogeneous set of relationships. Semantic networks serve as a promising general-purpose modeling substrate for complex systems. Various standardized formats and tools are now available to support practical, large-scale semantic network models. First, the Resource Description Framework (RDF) offers a standardized semantic network data model that can be further formalized by ontology modeling languages such as RDF Schema (RDFS) and the Web Ontology Language (OWL). Second, the recent introduction of highly performant triple-stores (i.e. semantic network databases) allows semantic network models on the order of 10^9 edges to be efficiently stored and manipulated. RDF and its related technologies are currently used extensively in the domains of computer science, digital library science, and the biological sciences. This article will provide an introduction to RDF/RDFS/OWL and an examination of its suitability to model discrete element complex systems.
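
    A small, self-contained illustration of RDF modeling in Python using the rdflib library; the vocabulary and URIs are made up for the example:

      from rdflib import Graph, Namespace, RDF, Literal

      EX = Namespace("http://example.org/system#")
      g = Graph()

      # Heterogeneous entities joined by heterogeneous, labeled relationships.
      g.add((EX.neuronA, RDF.type, EX.Neuron))
      g.add((EX.neuronB, RDF.type, EX.Neuron))
      g.add((EX.neuronA, EX.excites, EX.neuronB))
      g.add((EX.neuronA, EX.firingRate, Literal(12.5)))

      # SPARQL query over the resulting semantic network.
      for row in g.query("""
          PREFIX ex: <http://example.org/system#>
          SELECT ?src ?dst WHERE { ?src ex:excites ?dst }"""):
          print(row.src, row.dst)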

  2. Enhancing biomedical text summarization using semantic relation extraction.

    PubMed

    Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

    2011-01-01

    Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from the large amount of biomedical literature efficiently. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) we extract semantic relations in each sentence using the semantic knowledge representation tool SemRep; 2) we develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation; 3) for relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate a text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance, and our results are better than those of the MEAD system, a well-known tool for text summarization.

  3. Mention Detection: Heuristics for the OntoNotes Annotations

    DTIC Science & Technology

    2011-01-01

    Mention Detection: Heuristics for the OntoNotes annotations Jonathan K. Kummerfeld, Mohit Bansal, David Burkett and Dan Klein Computer Science...considered the provided parses and parses produced by the Berkeley parser (Petrov et al., 2006) trained on the provided training data. We added a

  4. Simultaneous Translation: Idiom Interpretation and Parsing Heuristics.

    ERIC Educational Resources Information Center

    McDonald, Janet L.; Carpenter, Patricia A.

    1981-01-01

    Presents a model of interpretation, parsing and error recovery in simultaneous translation using two experts and two amateur German-English bilingual translators orally translating from English to German. Argues that the translator first comprehends the text in English and divides it into meaningful units before translating. Study also…

  5. Memory mechanisms supporting syntactic comprehension.

    PubMed

    Caplan, David; Waters, Gloria

    2013-04-01

    Efforts to characterize the memory system that supports sentence comprehension have historically drawn extensively on short-term memory as a source of mechanisms that might apply to sentences. The focus of these efforts has changed significantly in the past decade. As a result of changes in models of short-term working memory (ST-WM) and developments in models of sentence comprehension, the effort to relate entire components of an ST-WM system, such as those in the model developed by Baddeley (Nature Reviews Neuroscience 4: 829-839, 2003) to sentence comprehension has largely been replaced by an effort to relate more specific mechanisms found in modern models of ST-WM to memory processes that support one aspect of sentence comprehension--the assignment of syntactic structure (parsing) and its use in determining sentence meaning (interpretation) during sentence comprehension. In this article, we present the historical background to recent studies of the memory mechanisms that support parsing and interpretation and review recent research into this relation. We argue that the results of this research do not converge on a set of mechanisms derived from ST-WM that apply to parsing and interpretation. We argue that the memory mechanisms supporting parsing and interpretation have features that characterize another memory system that has been postulated to account for skilled performance--long-term working memory. We propose a model of the relation of different aspects of parsing and interpretation to ST-WM and long-term working memory.

  6. Effects of Tasks on BOLD Signal Responses to Sentence Contrasts: Review and Commentary

    ERIC Educational Resources Information Center

    Caplan, David; Gow, David

    2012-01-01

    Functional neuroimaging studies of syntactic processing have been interpreted as identifying the neural locations of parsing and interpretive operations. However, current behavioral studies of sentence processing indicate that many operations occur simultaneously with parsing and interpretation. In this review, we point to issues that arise in…

  7. Applications of Parsing Theory to Computer-Assisted Instruction.

    ERIC Educational Resources Information Center

    Markosian, Lawrence Z.; Ager, Tryg A.

    1983-01-01

    Applications of an LR-1 parsing algorithm to intelligent programs for computer assisted instruction in symbolic logic and foreign languages are discussed. The system has been adequately used for diverse instructional applications, including analysis of student input, generation of pattern drills, and modeling the student's understanding of the…

  8. Parsing the Practice of Teaching

    ERIC Educational Resources Information Center

    Kennedy, Mary

    2016-01-01

    Teacher education programs typically teach novices about one part of teaching at a time. We might offer courses on different topics--cultural foundations, learning theory, or classroom management--or we may parse teaching practice itself into a set of discrete techniques, such as core teaching practices, that can be taught individually. Missing…

  9. Sawja: Static Analysis Workshop for Java

    NASA Astrophysics Data System (ADS)

    Hubert, Laurent; Barré, Nicolas; Besson, Frédéric; Demange, Delphine; Jensen, Thomas; Monfort, Vincent; Pichardie, David; Turpin, Tiphaine

    Static analysis is a powerful technique for automatic verification of programs but raises major engineering challenges when developing a full-fledged analyzer for a realistic language such as Java. Efficiency and precision of such a tool rely partly on low level components which only depend on the syntactic structure of the language and therefore should not be redesigned for each implementation of a new static analysis. This paper describes the Sawja library: a static analysis workshop fully compliant with Java 6 which provides OCaml modules for efficiently manipulating Java bytecode programs. We present the main features of the library, including i) efficient functional data-structures for representing a program with implicit sharing and lazy parsing, ii) an intermediate stack-less representation, and iii) fast computation and manipulation of complete programs. We provide experimental evaluations of the different features with respect to time, memory and precision.

  10. Capturing, Sharing, and Discovering Product Data at a Semantic Level--Moving Forward to the Semantic Web for Advancing the Engineering Product Design Process

    ERIC Educational Resources Information Center

    Zhu, Lijuan

    2011-01-01

    Along with the greater productivity that CAD automation provides nowadays, the product data of engineering applications needs to be shared and managed efficiently to gain a competitive edge for the engineering product design. However, exchanging and sharing the heterogeneous product data is still challenging. This dissertation first presents a…

  11. OntoADR a semantic resource describing adverse drug reactions to support searching, coding, and information retrieval.

    PubMed

    Souvignet, Julien; Declerck, Gunnar; Asfari, Hadyl; Jaulent, Marie-Christine; Bousquet, Cédric

    2016-10-01

    Efficient searching and coding in databases that use terminological resources requires that they support efficient data retrieval. The Medical Dictionary for Regulatory Activities (MedDRA) is a reference terminology used by several countries and organizations to code adverse drug reactions (ADRs) for pharmacovigilance. Ontologies available in the medical domain provide several advantages, such as reasoning to improve data retrieval. The field of pharmacovigilance does not yet benefit from a fully operational ontology to formally represent the MedDRA terms. Our objective was to build a semantic resource based on formal description logic to improve MedDRA term retrieval and aid the generation of on-demand custom groupings by appropriately and efficiently selecting terms: OntoADR. The method consists of the following steps: (1) mapping between MedDRA terms and SNOMED-CT, (2) generation of semantic definitions using semi-automatic methods, (3) storage of the resource, and (4) manual curation by pharmacovigilance experts. We built a semantic resource for ADRs enabling a new type of semantics-based term search. OntoADR adds new search capabilities relative to previous approaches, overcoming the usual limitations of computation using lightweight description logic, such as the intractability of union or negation queries, bringing it closer to user needs. Our automated approach for defining MedDRA terms enabled the association of at least one defining relationship with 67% of preferred terms. The curation work performed on our sample showed an error level of 14% for this automated approach. We tested OntoADR in practice, which allowed us to build custom groupings for several medical topics of interest. The methods we describe in this article could be adapted and extended to other terminologies which do not benefit from a formal semantic representation, thus enabling better data retrieval performance. Our custom groupings of MedDRA terms were used while performing signal detection, which suggests that the graphical user interface we are currently implementing to process OntoADR could be usefully integrated into specialized pharmacovigilance software that relies on MedDRA.

  12. Performance characteristics of a suite of volume phase holographic gratings produced for the Subaru prime focus spectrograph

    NASA Astrophysics Data System (ADS)

    Arns, James A.

    2016-08-01

    The Subaru Prime Focus Spectrograph[1] (PFS) requires a suite of volume phase holographic (VPH) gratings that parse the observational spectrum into three sub-spectral regions. In addition, the red region has a second, higher resolution arm that includes a VPH grating that will eventually be incorporated into a grism. This paper describes the specifications of the four grating types, gives the theoretical performances of diffraction efficiency for the production designs and presents the measured performances on the gratings produced to date.

  13. [Schizophrenia and semantic priming effects].

    PubMed

    Lecardeur, L; Giffard, B; Eustache, F; Dollfus, S

    2006-01-01

    This article is a review of studies using the semantic priming paradigm to assess the functioning of semantic memory in schizophrenic patients. Semantic priming describes the phenomenon whereby the speed with which a string of letters (the target) is recognized as a word (lexical decision task) increases when a semantically related word (the prime) is presented to the subject prior to the appearance of the target word. This semantic priming is linked to both automatic and controlled processes, depending on experimental conditions (stimulus onset asynchrony (SOA), percentage of related words, and explicit memory instructions). The automatic process, observed with short SOAs, a low related-word percentage, and instructions asking only that the target be processed, can be linked to the "automatic spreading activation" through the semantic network. Controlled processes involve "semantic matching" (the number of related and unrelated pairs influences the subject's decision) and "expectancy" (the prime leads the subject to generate an expectancy set of potential targets to the prime). These processes can be observed whatever the SOA for the former and with long SOAs for the latter, but both only with a high related-word percentage and explicit memory instructions. Studies evaluating semantic priming effects in schizophrenia show conflicting results: schizophrenic patients can present hyperpriming (the semantic priming effect is larger in patients than in controls), hypopriming (the semantic priming effect is smaller in patients than in controls), or semantic priming effects equal to those of control subjects. These results could be associated with a global impairment of controlled processes in schizophrenia, essentially a dysfunction of the semantic matching process. On the other hand, the efficiency of the automatic spreading activation process remains controversial. These discrepancies could be linked to the different experimental conditions used (duration of SOA, proportion of related pairs, and instructions), which influence the degree of involvement of controlled processes and therefore prevent a true assessment of their functioning. In addition, manipulations of the relation between prime and target (semantic distance, type of semantic relation, and strength of semantic relation) seem to influence reaction times. However, the mediated prime-target relation frequently used may not be the most relevant relation for understanding how activation spreads through the semantic network in patients with schizophrenia. Finally, patients with formal thought disorders present particularly high priming effects relative to controls. These abnormal semantic priming effects could reflect a dysfunction of the automatic spreading activation process and consequently an exaggerated diffusion of activation in the semantic network. In the future, the inclusion of different groups of schizophrenic subjects could allow us to determine whether semantic memory disorders are pathognomonic of schizophrenia or specific to a particular group of patients.

  14. Computationally Efficient Clustering of Audio-Visual Meeting Data

    NASA Astrophysics Data System (ADS)

    Hung, Hayley; Friedland, Gerald; Yeo, Chuohao

    This chapter presents novel computationally efficient algorithms to extract semantically meaningful acoustic and visual events related to each of the participants in a group discussion using the example of business meeting recordings. The recording setup involves relatively few audio-visual sensors, comprising a limited number of cameras and microphones. We first demonstrate computationally efficient algorithms that can identify who spoke and when, a problem in speech processing known as speaker diarization. We also extract visual activity features efficiently from MPEG4 video by taking advantage of the processing that was already done for video compression. Then, we present a method of associating the audio-visual data together so that the content of each participant can be managed individually. The methods presented in this article can be used as a principal component that enables many higher-level semantic analysis tasks needed in search, retrieval, and navigation.

  15. Semantic distance as a critical factor in icon design for in-car infotainment systems.

    PubMed

    Silvennoinen, Johanna M; Kujala, Tuomo; Jokinen, Jussi P P

    2017-11-01

    In-car infotainment systems require icons that enable fluent cognitive information processing and safe interaction while driving. An important issue is how to find an optimised set of icons for different functions in terms of semantic distance. In an optimised icon set, every icon needs to be semantically as close as possible to the function it visually represents and semantically as far as possible from the other functions represented concurrently. In three experiments (N = 21 each), semantic distances of 19 icons to four menu functions were studied with preference rankings, verbal protocols, and the primed product comparisons method. The results show that the primed product comparisons method can be efficiently utilised for finding an optimised set of icons for time-critical applications out of a larger set of icons. The findings indicate the benefits of the novel methodological perspective into the icon design for safety-critical contexts in general.

  16. Topological Alterations and Symptom-Relevant Modules in the Whole-Brain Structural Network in Semantic Dementia.

    PubMed

    Ding, Junhua; Chen, Keliang; Zhang, Weibin; Li, Ming; Chen, Yan; Yang, Qing; Lv, Yingru; Guo, Qihao; Han, Zaizhu

    2017-01-01

    Semantic dementia (SD) is characterized by a selective decline in semantic processing. Although the neuropsychological pattern of this disease has been identified, its topological global alterations and symptom-relevant modules in the whole-brain anatomical network have not been fully elucidated. This study aims to explore the topological alteration of anatomical network in SD and reveal the modules associated with semantic deficits in this disease. We first constructed the whole-brain white-matter networks of 20 healthy controls and 19 patients with SD. Then, the network metrics of graph theory were compared between these two groups. Finally, we separated the network of SD patients into different modules and correlated the structural integrity of each module with the severity of the semantic deficits across patients. The network of the SD patients presented a significantly reduced global efficiency, indicating that the long-distance connections were damaged. The network was divided into the following four distinctive modules: the left temporal/occipital/parietal, frontal, right temporal/occipital, and frontal/parietal modules. The first two modules were associated with the semantic deficits of SD. These findings illustrate the skeleton of the neuroanatomical network of SD patients and highlight the key role of the left temporal/occipital/parietal module and the left frontal module in semantic processing.

  17. A neotropical Miocene pollen database employing image-based search and semantic modeling1

    PubMed Central

    Han, Jing Ginger; Cao, Hongfei; Barb, Adrian; Punyasena, Surangi W.; Jaramillo, Carlos; Shyu, Chi-Ren

    2014-01-01

    • Premise of the study: Digital microscopic pollen images are being generated with increasing speed and volume, producing opportunities to develop new computational methods that increase the consistency and efficiency of pollen analysis and provide the palynological community a computational framework for information sharing and knowledge transfer. • Methods: Mathematical methods were used to assign trait semantics (abstract morphological representations) of the images of neotropical Miocene pollen and spores. Advanced database-indexing structures were built to compare and retrieve similar images based on their visual content. A Web-based system was developed to provide novel tools for automatic trait semantic annotation and image retrieval by trait semantics and visual content. • Results: Mathematical models that map visual features to trait semantics can be used to annotate images with morphology semantics and to search image databases with improved reliability and productivity. Images can also be searched by visual content, providing users with customized emphases on traits such as color, shape, and texture. • Discussion: Content- and semantic-based image searches provide a powerful computational platform for pollen and spore identification. The infrastructure outlined provides a framework for building a community-wide palynological resource, streamlining the process of manual identification, analysis, and species discovery. PMID:25202648

  18. Semantic Segmentation of Indoor Point Clouds Using Convolutional Neural Network

    NASA Astrophysics Data System (ADS)

    Babacan, K.; Chen, L.; Sohn, G.

    2017-11-01

    As Building Information Modelling (BIM) thrives, geometry alone is no longer sufficient; an ever increasing variety of semantic information is needed to express an indoor model adequately. On the other hand, for existing buildings, automatically generating semantically enriched BIM from point cloud data is in its infancy. Previous research on enhancing semantic content relies on frameworks in which specific rules and/or features are hand coded by specialists. These methods inherently lack generalization and easily break in different circumstances. On this account, a generalized framework is urgently needed to automatically and accurately generate semantic information. We therefore propose to employ deep learning techniques for the semantic segmentation of point clouds into meaningful parts. More specifically, we build a volumetric data representation in order to efficiently generate the high number of training samples needed to train a convolutional neural network architecture. Feedforward propagation is then used to perform classification at the voxel level, achieving semantic segmentation. The method is tested on both a mobile laser scanner point cloud and a larger-scale synthetically generated dataset. We also demonstrate a case study in which our method can be effectively used to leverage the extraction of planar surfaces in challenging, cluttered indoor environments.
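
    The volumetric representation step can be sketched in Python with NumPy; the grid size is an assumption and the network consuming the volume is omitted:

      import numpy as np

      def voxelize(points, grid=32):
          """Bin an (N, 3) point cloud into a binary occupancy volume."""
          mins = points.min(axis=0)
          spans = points.max(axis=0) - mins + 1e-9
          idx = ((points - mins) / spans * (grid - 1)).astype(int)
          volume = np.zeros((grid, grid, grid), dtype=np.float32)
          volume[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0  # mark occupied voxels
          return volume

      cloud = np.random.rand(1000, 3)  # stand-in for a scanner tile
      print(int(voxelize(cloud).sum()), "occupied voxels")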

  19. Neural bases of event knowledge and syntax integration in comprehension of complex sentences.

    PubMed

    Malaia, Evie; Newman, Sharlene

    2015-01-01

    Comprehension of complex sentences is necessarily supported by both syntactic and semantic knowledge, but what linguistic factors trigger a reader's reliance on a specific system? This functional neuroimaging study orthogonally manipulated argument plausibility and verb event type to investigate the cortical bases of the semantic effect on argument comprehension during reading. The data suggest that telic verbs facilitate online processing by consolidating event schemas in episodic memory and by easing the computation of syntactico-thematic hierarchies in the left inferior frontal gyrus. The results demonstrate that syntax-semantics integration relies on trade-offs among a distributed network of regions for maximum comprehension efficiency.

  20. A Match Method Based on Latent Semantic Analysis for Earthquake Hazard Emergency Plan

    NASA Astrophysics Data System (ADS)

    Sun, D.; Zhao, S.; Zhang, Z.; Shi, X.

    2017-09-01

    The structure of earthquake emergency plans is complex, and it is difficult for a decision maker to make a decision in a short time. To solve this problem, this paper presents a match method based on Latent Semantic Analysis (LSA). After word segmentation preprocessing of the emergency plans, we carry out keyword extraction according to the part-of-speech and frequency of words. Then, through LSA, we map the documents and query information to the semantic space and calculate the correlation of documents and queries by the relation between vectors. The experimental results indicate that LSA can efficiently improve the accuracy of emergency plan retrieval.
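
    A compact sketch of this kind of LSA-based retrieval in Python with scikit-learn; the toy "plans" stand in for pre-segmented, keyword-filtered documents:

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.decomposition import TruncatedSVD
      from sklearn.metrics.pairwise import cosine_similarity

      plans = [
          "evacuate buildings assess aftershock risk",
          "restore power grid dispatch repair crews",
          "field hospital triage medical supplies",
      ]
      query = ["aftershock risk assessment"]

      vec = TfidfVectorizer()
      X = vec.fit_transform(plans)

      lsa = TruncatedSVD(n_components=2)   # map into the latent semantic space
      doc_vecs = lsa.fit_transform(X)
      q_vec = lsa.transform(vec.transform(query))

      scores = cosine_similarity(q_vec, doc_vecs)[0]
      print(max(zip(scores, plans)))       # best-matching plan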

  1. An index-based algorithm for fast on-line query processing of latent semantic analysis

    PubMed Central

    Zhang, Mingxi; Li, Pohan; Wang, Wei

    2017-01-01

    Latent Semantic Analysis (LSA) is widely used for finding documents whose semantics are similar to a query of keywords. Although LSA yields promising similarity results, existing LSA algorithms involve many unnecessary operations in similarity computation and candidate checking during on-line query processing, which is expensive in terms of time cost and cannot respond to query requests efficiently, especially when the dataset becomes large. In this paper, we study the efficiency problem of on-line query processing for LSA with the goal of efficiently searching the documents similar to a given query. We rewrite the similarity equation of LSA in terms of an intermediate value called partial similarity, which is stored in a purpose-built index called the partial index. To reduce the search space, we give an approximate form of the similarity equation and then develop an efficient algorithm for building the partial index, which skips the partial similarities lower than a given threshold θ. Based on the partial index, we develop an efficient algorithm called ILSA for supporting fast on-line query processing. The given query is transformed into a pseudo-document vector, and the similarities between the query and candidate documents are computed by accumulating the partial similarities obtained from the index nodes corresponding to non-zero entries in the pseudo-document vector. Compared to the LSA algorithm, ILSA reduces the time cost of on-line query processing by pruning candidate documents that are not promising and skipping operations that contribute little to similarity scores. Extensive experiments comparing ILSA with LSA demonstrate the efficiency and effectiveness of the proposed algorithm. PMID:28520747

  2. An index-based algorithm for fast on-line query processing of latent semantic analysis.

    PubMed

    Zhang, Mingxi; Li, Pohan; Wang, Wei

    2017-01-01

    Latent Semantic Analysis (LSA) is widely used for finding documents whose semantics are similar to a query of keywords. Although LSA yields promising similarity results, existing LSA algorithms involve many unnecessary operations in similarity computation and candidate checking during on-line query processing, which is expensive in terms of time cost and cannot respond to query requests efficiently, especially when the dataset becomes large. In this paper, we study the efficiency problem of on-line query processing for LSA with the goal of efficiently searching the documents similar to a given query. We rewrite the similarity equation of LSA in terms of an intermediate value called partial similarity, which is stored in a purpose-built index called the partial index. To reduce the search space, we give an approximate form of the similarity equation and then develop an efficient algorithm for building the partial index, which skips the partial similarities lower than a given threshold θ. Based on the partial index, we develop an efficient algorithm called ILSA for supporting fast on-line query processing. The given query is transformed into a pseudo-document vector, and the similarities between the query and candidate documents are computed by accumulating the partial similarities obtained from the index nodes corresponding to non-zero entries in the pseudo-document vector. Compared to the LSA algorithm, ILSA reduces the time cost of on-line query processing by pruning candidate documents that are not promising and skipping operations that contribute little to similarity scores. Extensive experiments comparing ILSA with LSA demonstrate the efficiency and effectiveness of the proposed algorithm.
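
    An illustrative reconstruction of the partial-index idea in Python (not the authors' code): keep, per term, only the partial similarities above θ, and answer a query by accumulating partial scores over its non-zero entries:

      from collections import defaultdict

      THETA = 0.05

      def build_partial_index(doc_vectors):
          """doc_vectors: {doc_id: {term: weight}} in the reduced space."""
          index = defaultdict(list)
          for doc_id, vec in doc_vectors.items():
              for term, w in vec.items():
                  if abs(w) >= THETA:          # skip low partial similarities
                      index[term].append((doc_id, w))
          return index

      def query(index, q_vec):
          scores = defaultdict(float)
          for term, qw in q_vec.items():
              for doc_id, dw in index.get(term, []):
                  scores[doc_id] += qw * dw    # accumulate partial similarities
          return sorted(scores.items(), key=lambda kv: -kv[1])

      docs = {"d1": {"quake": 0.9, "risk": 0.3},
              "d2": {"grid": 0.8, "risk": 0.02}}
      print(query(build_partial_index(docs), {"quake": 1.0, "risk": 0.5}))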

  3. Information extraction and knowledge graph construction from geoscience literature

    NASA Astrophysics Data System (ADS)

    Wang, Chengbin; Ma, Xiaogang; Chen, Jianguo; Chen, Jingwen

    2018-03-01

    Geoscience literature published online is an important part of open data, and brings both challenges and opportunities for data analysis. Compared with studies of numerical geoscience data, there has been limited work on information extraction and knowledge discovery from textual geoscience data. This paper presents a workflow and a few empirical case studies for that topic, with a focus on documents written in Chinese. First, we set up a hybrid corpus combining generic and geology terms from geology dictionaries to train the Chinese word segmentation rules of a Conditional Random Fields model. Second, we used the word segmentation rules to parse documents into individual words, and removed the stop-words from the segmentation results to get a corpus constituted of content-words. Third, we used a statistical method to analyze the semantic links between content-words, and we selected chord and bigram graphs to visualize the content-words and their links as nodes and edges in a knowledge graph, respectively. The resulting graph presents a clear overview of the key information in an unstructured document. This study demonstrates the usefulness of the designed workflow, and shows the potential of leveraging natural language processing and knowledge graph technologies for geoscience.
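
    The bigram-graph step of such a workflow reduces to counting adjacent content-word pairs; a minimal Python sketch follows (segmentation and stop-word removal are assumed already done, and the vocabulary is invented):

      from collections import Counter

      content_words = ["porphyry", "copper", "deposit", "porphyry",
                       "copper", "mineralization", "copper", "deposit"]

      # Adjacent pairs become weighted edges of the knowledge graph.
      edges = Counter(zip(content_words, content_words[1:]))
      for (w1, w2), weight in edges.most_common():
          print(f"{w1} --{weight}--> {w2}")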

  4. Recursive Optimization of Digital Circuits

    DTIC Science & Technology

    1990-12-14

    [Garbled table-of-contents fragment; the recoverable content lists appendices covering non-MDS optimization of SAMPLE and the BORIS Recursive Optimization System software, including the DESIGN.S, PARSE.S, TABULAR.S, MDS.S, and COST.S files.]

  5. Time-Driven Effects on Parsing during Reading

    ERIC Educational Resources Information Center

    Roll, Mikael; Lindgren, Magnus; Alter, Kai; Horne, Merle

    2012-01-01

    The phonological trace of perceived words starts fading away in short-term memory after a few seconds. Spoken utterances are usually 2-3 s long, possibly to allow the listener to parse the words into coherent prosodic phrases while they still have a clear representation. Results from this brain potential study suggest that even during silent…

  6. The Effect of Exposure on Syntactic Parsing in Spanish-English Bilinguals

    ERIC Educational Resources Information Center

    Dussias, Paola E.; Sagarra, Nuria

    2007-01-01

    An eye tracking experiment examined how exposure to a second language (L2) influences sentence parsing in the first language. Forty-four monolingual Spanish speakers, 24 proficient Spanish-English bilinguals with limited immersion experience in the L2 environment and 20 proficient Spanish-English bilinguals with extensive L2 immersion experience…

  7. Semantics driven approach for knowledge acquisition from EMRs.

    PubMed

    Perera, Sujan; Henson, Cory; Thirunarayan, Krishnaprasad; Sheth, Amit; Nair, Suhas

    2014-03-01

    Semantic computing technologies have matured to be applicable to many critical domains such as national security, life sciences, and health care. However, the key to their success is the availability of a rich domain knowledge base. The creation and refinement of domain knowledge bases pose difficult challenges. The existing knowledge bases in the health care domain are rich in taxonomic relationships, but they lack nontaxonomic (domain) relationships. In this paper, we describe a semiautomatic technique for enriching existing domain knowledge bases with causal relationships gleaned from Electronic Medical Records (EMR) data. We determine missing causal relationships between domain concepts by validating domain knowledge against EMR data sources and leveraging semantic-based techniques to derive plausible relationships that can rectify knowledge gaps. Our evaluation demonstrates that semantic techniques can be employed to improve the efficiency of knowledge acquisition.

  8. A python tool for the implementation of domain-specific languages

    NASA Astrophysics Data System (ADS)

    Dejanović, Igor; Vaderna, Renata; Milosavljević, Gordana; Simić, Miloš; Vuković, Željko

    2017-07-01

    In this paper we describe textX, a meta-language and a tool for building Domain-Specific Languages. It is implemented in Python using the Arpeggio PEG (Parsing Expression Grammar) parser library. From a single language description (grammar), textX will build a parser and a meta-model (a.k.a. abstract syntax) of the language. The parser is used to parse textual representations of models conforming to the meta-model. As a result of parsing, a Python object graph is automatically created, and its structure conforms to the meta-model defined by the grammar. This approach frees a developer from the need to manually analyse a parse tree and transform it into another suitable representation. The textX library is independent of any integrated development environment and can be easily integrated into any Python project. The textX tool works as a grammar interpreter: the parser is configured at run-time using the grammar. textX is a free and open-source project available on GitHub.
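
    Since textX itself is a Python library, a small usage sketch is the natural illustration; the metamodel_from_str call is real textX API, while the toy grammar and model are our own invention:

      from textx import metamodel_from_str  # pip install textX

      grammar = """
      Model: commands+=Command;
      Command: 'move' direction=Direction steps=INT;
      Direction: 'up' | 'down' | 'left' | 'right';
      """

      # One grammar yields both the parser and the meta-model; parsing a
      # model returns a Python object graph conforming to that meta-model.
      mm = metamodel_from_str(grammar)
      model = mm.model_from_str("move up 3 move left 2")
      for cmd in model.commands:
          print(cmd.direction, cmd.steps)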

  9. Examining the Relationship Between Word Reading Efficiency and Oral Reading Rate in Predicting Comprehension Among Different Types of Readers

    PubMed Central

    Eason, Sarah H.; Sabatini, John; Goldberg, Lindsay; Bruce, Kelly; Cutting, Laurie E.

    2013-01-01

    To further explore contextual reading rate, an important aspect of reading fluency, we examined the relationship between word reading efficiency (WRE) and contextual oral reading rate (ORR), the degree to which they overlap across different comprehension measures, whether oral language (semantics and syntax) predicts ORR beyond contributions of word-level skills, and whether the WRE–ORR relationship varies based on different reader profiles. Assessing reading and language of average readers, poor decoders, and poor comprehenders, ages 10 to 14, ORR was the strongest predictor of comprehension across various formats; WRE contributed no unique variance after taking ORR into account. Findings indicated that semantics, not syntax, contributed to ORR. Poor comprehenders performed below average on measures of ORR, despite average WRE, expanding previous findings suggesting specific weaknesses in ORR for this group. Together, findings suggest that ORR draws upon skills beyond those captured by WRE and suggests a role for oral language (semantics) in ORR. PMID:23667307

  10. Perception of object trajectory: parsing retinal motion into self and object movement components.

    PubMed

    Warren, Paul A; Rushton, Simon K

    2007-08-16

    A moving observer needs to be able to estimate the trajectory of other objects moving in the scene. Without the ability to do so, it would be difficult to avoid obstacles or catch a ball. We hypothesized that neural mechanisms sensitive to the patterns of motion generated on the retina during self-movement (optic flow) play a key role in this process, "parsing" motion due to self-movement from that due to object movement. We investigated this "flow parsing" hypothesis by measuring the perceived trajectory of a moving probe placed within a flow field that was consistent with movement of the observer. In the first experiment, the flow field was consistent with an eye rotation; in the second experiment, it was consistent with a lateral translation of the eyes. We manipulated the distance of the probe in both experiments and assessed the consequences. As predicted by the flow parsing hypothesis, manipulating the distance of the probe had differing effects on the perceived trajectory of the probe in the two experiments. The results were consistent with the scene geometry and the type of simulated self-movement. In a third experiment, we explored the contribution of local and global motion processing to the results of the first two experiments. The data suggest that the parsing process involves global motion processing, not just local motion contrast. The findings of this study support a role for optic flow processing in the perception of object movement during self-movement.

  11. Memory mechanisms supporting syntactic comprehension

    PubMed Central

    Caplan, David; Waters, Gloria

    2013-01-01

    Efforts to characterize the memory system that supports sentence comprehension have historically drawn extensively on short-term memory as a source of mechanisms that might apply to sentences. The focus of these efforts has changed significantly in the past decade. As a result of changes in models of short-term working memory (ST-WM) and developments in models of sentence comprehension, the effort to relate entire components of an ST-WM system, such as those in the model developed by Baddeley (Nature Reviews Neuroscience 4: 829–839, 2003) to sentence comprehension has largely been replaced by an effort to relate more specific mechanisms found in modern models of ST-WM to memory processes that support one aspect of sentence comprehension—the assignment of syntactic structure (parsing) and its use in determining sentence meaning (interpretation) during sentence comprehension. In this article, we present the historical background to recent studies of the memory mechanisms that support parsing and interpretation and review recent research into this relation. We argue that the results of this research do not converge on a set of mechanisms derived from ST-WM that apply to parsing and interpretation. We argue that the memory mechanisms supporting parsing and interpretation have features that characterize another memory system that has been postulated to account for skilled performance—long-term working memory. We propose a model of the relation of different aspects of parsing and interpretation to ST-WM and long-term working memory. PMID:23319178

  12. High-performance analysis of filtered semantic graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buluc, Aydin; Fox, Armando; Gilbert, John R.

    2012-01-01

    High performance is a crucial consideration when executing a complex analytic query on a massive semantic graph. In a semantic graph, vertices and edges carry "attributes" of various types. Analytic queries on semantic graphs typically depend on the values of these attributes; thus, the computation must either view the graph through a filter that passes only those individual vertices and edges of interest, or else must first materialize a subgraph or subgraphs consisting of only the vertices and edges of interest. The filtered approach is superior due to its generality, ease of use, and memory efficiency, but may carry a performance cost. In the Knowledge Discovery Toolbox (KDT), a Python library for parallel graph computations, the user writes filters in a high-level language, but those filters result in relatively low performance due to the bottleneck of having to call into the Python interpreter for each edge. In this work, we use the Selective Embedded JIT Specialization (SEJITS) approach to automatically translate filters defined by programmers into a lower-level efficiency language, bypassing the upcall into Python. We evaluate our approach by comparing it with the high-performance C++/MPI Combinatorial BLAS engine, and show that the productivity gained by using a high-level filtering language comes without sacrificing performance.
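
    A generic Python sketch of the filtered-graph idea (not KDT's actual API): the analysis sees the graph through a per-edge predicate, which the SEJITS approach described above would translate to an efficiency language rather than invoking the interpreter per edge:

      from collections import deque

      # Edges carry typed attributes: (src, dst, attributes).
      edges = [
          (0, 1, {"type": "cites", "year": 2010}),
          (1, 2, {"type": "emails", "year": 2008}),
          (1, 3, {"type": "cites", "year": 2011}),
      ]

      def bfs_filtered(start, keep):
          adj = {}
          for s, d, attrs in edges:
              if keep(attrs):                  # the filter hides other edges
                  adj.setdefault(s, []).append(d)
          seen, frontier = {start}, deque([start])
          while frontier:
              v = frontier.popleft()
              for w in adj.get(v, []):
                  if w not in seen:
                      seen.add(w)
                      frontier.append(w)
          return seen

      # Traverse only citation edges from 2010 onward.
      print(bfs_filtered(0, lambda a: a["type"] == "cites" and a["year"] >= 2010))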

  13. Learning From Short Text Streams With Topic Drifts.

    PubMed

    Li, Peipei; He, Lu; Wang, Haiyan; Hu, Xuegang; Zhang, Yuhong; Li, Lei; Wu, Xindong

    2017-09-18

    Short text streams such as search snippets and micro blogs have become popular on the Web with the emergence of social media. Unlike traditional normal text streams, these data present the characteristics of short length, weak signal, high volume, high velocity, topic drift, etc. Short text stream classification is hence a very challenging and significant task. However, this challenge has received little attention from the research community. Therefore, a new feature extension approach is proposed for short text stream classification with the help of a large-scale semantic network obtained from a Web corpus. It is built on an incremental ensemble classification model for efficiency. First, more semantic contexts based on the senses of terms in short texts are introduced to make up for the data sparsity, using the open semantic network, in which all terms are disambiguated by their senses to reduce the impact of noise. Second, a concept cluster-based topic drift detection method is proposed to effectively track hidden topic drifts. Finally, extensive studies demonstrate that, compared to several well-known concept drift detection methods for data streams, our approach can detect topic drifts effectively, and that it handles short text streams effectively while maintaining efficiency relative to several state-of-the-art short text classification approaches.

  14. Preparatory neural activity predicts performance on a conflict task.

    PubMed

    Stern, Emily R; Wager, Tor D; Egner, Tobias; Hirsch, Joy; Mangels, Jennifer A

    2007-10-24

    Advance preparation has been shown to improve the efficiency of conflict resolution. Yet, with little empirical work directly linking preparatory neural activity to the performance benefits of advance cueing, it is not clear whether this relationship results from preparatory activation of task-specific networks, or from activity associated with general alerting processes. Here, fMRI data were acquired during a spatial Stroop task in which advance cues either informed subjects of the upcoming relevant feature of conflict stimuli (spatial or semantic) or were neutral. Informative cues decreased reaction time (RT) relative to neutral cues, and cues indicating that spatial information would be task-relevant elicited greater activity than neutral cues in multiple areas, including right anterior prefrontal and bilateral parietal cortex. Additionally, preparatory activation in bilateral parietal cortex and right dorsolateral prefrontal cortex predicted faster RT when subjects responded to spatial location. No regions were found to be specific to semantic cues at conventional thresholds, and lowering the threshold further revealed little overlap between activity associated with spatial and semantic cueing effects, thereby demonstrating a single dissociation between activations related to preparing a spatial versus semantic task-set. This relationship between preparatory activation of spatial processing networks and efficient conflict resolution suggests that advance information can benefit performance by leading to domain-specific biasing of task-relevant information.

  15. Quality Assurance of UMLS Semantic Type Assignments Using SNOMED CT Hierarchies.

    PubMed

    Gu, H; Chen, Y; He, Z; Halper, M; Chen, L

    2016-01-01

    The Unified Medical Language System (UMLS) is one of the largest biomedical terminological systems, with over 2.5 million concepts in its Metathesaurus repository. The UMLS's Semantic Network (SN) with its collection of 133 high-level semantic types serves as an abstraction layer on top of the Metathesaurus. In particular, the SN elaborates an aspect of the Metathesaurus's concepts via the assignment of one or more types to each concept. Due to the scope and complexity of the Metathesaurus, errors are all but inevitable in this semantic-type assignment process. Our objective was to develop a semi-automated methodology to help assure the quality of semantic-type assignments within the UMLS. The methodology uses a cross-validation strategy involving SNOMED CT's hierarchies in combination with UMLS semantic types. Semantically uniform, disjoint concept groups are generated programmatically by partitioning the collection of all concepts in the same SNOMED CT hierarchy according to their respective semantic-type assignments in the UMLS. Domain experts are then called upon to review the concepts in any group having a small number of concepts. It is our hypothesis that a semantic-type assignment combination applicable only to a very small number of concepts in a SNOMED CT hierarchy is an indicator of potential problems. The methodology was applied to the UMLS 2013AA release along with the SNOMED CT release from January 2013. An overall error rate of 33% was found for concepts proposed by the quality-assurance methodology. Supporting our hypothesis, that number was four times higher than the error rate found in control samples. The results show that the quality-assurance methodology can aid in effective and efficient identification of UMLS semantic-type assignment errors.
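
    The partitioning step lends itself to a compact illustration. The following Python sketch (with fabricated concept data; not the authors' code) groups the concepts of one hypothetical SNOMED CT hierarchy by their combination of UMLS semantic-type assignments and flags small groups for expert review, mirroring the hypothesis that rare combinations signal potential errors.

        from collections import defaultdict

        # Hypothetical concepts from one SNOMED CT hierarchy: name -> set of
        # UMLS semantic types assigned in the Metathesaurus.
        concepts = {
            "Myocardial infarction": {"Disease or Syndrome"},
            "Pneumonia":             {"Disease or Syndrome"},
            "Fracture of femur":     {"Injury or Poisoning"},
            "Odd concept":           {"Disease or Syndrome", "Plant"},  # suspicious
        }

        REVIEW_THRESHOLD = 2  # groups smaller than this go to a domain expert

        groups = defaultdict(list)
        for concept, stys in concepts.items():
            groups[frozenset(stys)].append(concept)  # partition by ST combination

        for combo, members in groups.items():
            if len(members) < REVIEW_THRESHOLD:
                print("Review:", sorted(combo), "->", members)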

  16. Interoperable cross-domain semantic and geospatial framework for automatic change detection

    NASA Astrophysics Data System (ADS)

    Kuo, Chiao-Ling; Hong, Jung-Hong

    2016-01-01

    With the increasingly diverse types of geospatial data established over the last few decades, semantic interoperability in integrated applications has attracted much interest in the field of Geographic Information Systems (GIS). This paper proposes a new strategy and framework to process cross-domain geodata at the semantic level. The framework leverages the semantic equivalence of concepts between domains through a bridge ontology and facilitates the integrated use of data from different domains, which has long been considered an essential strength of GIS but is impeded by the lack of understanding of the semantics implicitly hidden in the data. We choose the task of change detection to demonstrate how the introduction of ontology concepts can make such integration possible. We analyze the common properties of geodata and change detection factors, then construct rules and summarize possible change scenarios for making final decisions. The use of topographic map data to detect changes in land use shows promising success in terms of improved efficiency and level of automation. We believe the ontology-oriented approach will enable a new way of integrating data across different domains from the perspective of semantic interoperability, and may even open a new dimension for future GIS.

  17. Categorizing words through semantic memory navigation

    NASA Astrophysics Data System (ADS)

    Borge-Holthoefer, J.; Arenas, A.

    2010-03-01

    Semantic memory is the cognitive system devoted to the storage and retrieval of conceptual knowledge. Empirical data indicate that semantic memory is organized in a network structure. Everyday experience shows that word search and retrieval processes produce fluent and coherent speech, i.e. they are efficient. This implies either that semantic memory encodes, besides thousands of words, different kinds of links for different relationships (introducing greater complexity and storage costs), or that the structure evolves to facilitate the differentiation of long-lasting semantic relations from incidental, phenomenological ones. Assuming the latter possibility, we explore a mechanism to disentangle the underlying semantic backbone, which comprises the conceptual structure (categorical relations between pairs of words), from the rest of the information present in the structure. To this end, we first present and characterize an empirical data set modeled as a network; then we simulate stochastic cognitive navigation on this topology. We schematize the latter process as uncorrelated random walks from node to node, which converge to a network of feature vectors. In doing so we both introduce a novel mechanism for information retrieval and point to the problem of category formation in close connection with linguistic and non-linguistic experience.
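
    A minimal sketch of the navigation process, assuming a toy word graph (this illustrates the general mechanism, not the authors' implementation): uncorrelated random walks from a start node yield normalized visit counts that serve as that word's feature vector, so words from the same category acquire similar vectors.

        import random
        from collections import defaultdict

        # Toy undirected semantic network (free-association style links).
        edges = [("dog", "cat"), ("cat", "mouse"), ("dog", "bone"),
                 ("car", "road"), ("road", "truck"), ("car", "truck")]
        adj = defaultdict(list)
        for a, b in edges:
            adj[a].append(b)
            adj[b].append(a)

        def feature_vector(start, steps=10_000):
            """Visit frequencies from an uncorrelated random walk started at
            `start`; words in the same category should yield similar vectors."""
            counts = defaultdict(int)
            node = start
            for _ in range(steps):
                node = random.choice(adj[node])
                counts[node] += 1
            total = sum(counts.values())
            return {w: c / total for w, c in counts.items()}

        print(feature_vector("dog"))
        print(feature_vector("car"))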

  18. Enhancing Biomedical Text Summarization Using Semantic Relation Extraction

    PubMed Central

    Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

    2011-01-01

    Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from a large amount of biomedical literature efficiently. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate a text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance, and our results are better than those of the MEAD system, a well-known tool for text summarization. PMID:21887336

  19. Centrality-based Selection of Semantic Resources for Geosciences

    NASA Astrophysics Data System (ADS)

    Cerba, Otakar; Jedlicka, Karel

    2017-04-01

    Semantic questions arise in almost all disciplines dealing with geographic data and information, because relevant semantics is crucial for any kind of communication and interaction among humans as well as among machines. However, the existence of a large number of different semantic resources (such as various thesauri, controlled vocabularies, knowledge bases, and ontologies) makes the implementation of semantics much more difficult and complicates exploiting its advantages, because in many cases users are not able to find the most suitable resource for their purposes. The research presented in this paper introduces a methodology, based on an analysis of identity relations in the Linked Data space (which covers a majority of semantic resources), for finding a suitable source of semantic information. Identity links interconnect representations of an object or a concept across semantic resources. This type of relation is therefore considered crucial from the Linked Data point of view, because such links provide new additional information, including different views of one concept based on different cultural or regional aspects (the so-called social role of Linked Data). For these reasons, one reasonable criterion for assessing semantic resources, for almost all domains including the geosciences, is their position in the network of interconnected semantic resources and their level of linking to other knowledge bases and similar products. The presented methodology searches for mutual connections between various instances of one concept using the "follow your nose" approach. The extracted data on interconnections between semantic resources are arranged into directed graphs and processed with various metrics patterned on centrality computation (degree, closeness, or betweenness centrality). Semantic resources recommended by this research could be used to provide semantically described keywords for metadata records or as names of items in data models. Such an approach enables much more efficient data harmonization, integration, sharing, and exploitation. * * * * This publication was supported by the project LO1506 of the Czech Ministry of Education, Youth and Sports. This publication was supported by project Data-Driven Bioeconomy (DataBio) from the ICT-15-2016-2017, Big Data PPP call.
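
    The graph-metric step is easy to illustrate. The sketch below, using the networkx library and an invented toy set of identity links between named resources, computes the degree, closeness, and betweenness centralities mentioned in the abstract; it is a schematic illustration rather than the authors' pipeline.

        import networkx as nx

        # Hypothetical graph of identity ("sameAs"-style) links between semantic
        # resources; edge u -> v means a concept in u is declared identical to
        # a concept in v.
        links = [("DBpedia", "Wikidata"), ("GeoNames", "DBpedia"),
                 ("AGROVOC", "DBpedia"), ("Wikidata", "GeoNames")]
        g = nx.DiGraph(links)

        # Centrality metrics named in the abstract; higher scores suggest
        # resources that are better connected and hence better reuse candidates.
        for name, metric in [("degree", nx.degree_centrality),
                             ("closeness", nx.closeness_centrality),
                             ("betweenness", nx.betweenness_centrality)]:
            print(name, metric(g))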

  20. Multidimensional incremental parsing for universal source coding.

    PubMed

    Bae, Soo Hyun; Juang, Biing-Hwang

    2008-10-01

    A multidimensional incremental parsing algorithm (MDIP) for multidimensional discrete sources, as a generalization of the Lempel-Ziv coding algorithm, is investigated. It consists of three essential component schemes: maximum decimation matching, a hierarchical structure for multidimensional source coding, and dictionary augmentation. As a counterpart of the longest-match search in the Lempel-Ziv algorithm, two classes of maximum decimation matching are studied. Also, the underlying behavior of the dictionary augmentation scheme for estimating the source statistics is examined. For an m-dimensional source, m augmentative patches are appended to the dictionary at each coding epoch, thus requiring the transmission of a substantial amount of information to the decoder. The hierarchical structure of the source coding algorithm resolves this issue by successively incorporating lower-dimensional coding procedures in the scheme. In regard to universal lossy source coders, we propose two distortion functions: the local average distortion and the local minimax distortion with a set of threshold levels for each source symbol. For performance evaluation, we implemented three image compression algorithms based upon the MDIP; one is lossless and the others are lossy. The lossless image compression algorithm does not perform better than Lempel-Ziv-Welch coding, but experimentally shows efficiency in capturing the source structure. The two lossy image compression algorithms are implemented using the two distortion functions, respectively. The algorithm based on the local average distortion is efficient at minimizing signal distortion, whereas the images produced with the local minimax distortion show good perceptual fidelity compared with other compression algorithms. These insights suggest future research on feature extraction for multidimensional discrete sources.
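
    Since MDIP generalizes Lempel-Ziv coding, the one-dimensional baseline is worth recalling. The sketch below implements classical LZ78 incremental parsing, the scheme whose longest-match search and dictionary growth MDIP extends to multiple dimensions; the multidimensional decimation matching itself is not reproduced here.

        def lz78_parse(seq):
            """Classical one-dimensional LZ78 incremental parsing: emit
            (dictionary index of longest known prefix, next symbol) pairs."""
            dictionary = {"": 0}          # phrase -> index
            phrase, output = "", []
            for symbol in seq:
                if phrase + symbol in dictionary:
                    phrase += symbol      # extend the longest match
                else:
                    output.append((dictionary[phrase], symbol))
                    dictionary[phrase + symbol] = len(dictionary)
                    phrase = ""
            if phrase:                    # flush a trailing matched phrase
                output.append((dictionary[phrase], ""))
            return output

        print(lz78_parse("abababc"))  # [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'c')]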

  1. Error-Correcting Parsing for Syntactic Pattern Recognition

    DTIC Science & Technology

    1977-08-01

    Siromoney, G., Siromoney, R., and Krithivasan, K., "Abstract Families of Matrices and Picture Languages," Computer Graphics and Image Processing, 1971.

  2. Perceiving Event Dynamics and Parsing Hollywood Films

    ERIC Educational Resources Information Center

    Cutting, James E.; Brunick, Kaitlin L.; Candan, Ayse

    2012-01-01

    We selected 24 Hollywood movies released from 1940 through 2010 to serve as a film corpus. Eight viewers, three per film, parsed them into events, which are best termed subscenes. While watching a film a second time, viewers scrolled through frames and recorded the frame number where each event began. Viewers agreed about 90% of the time. We then…

  3. The Storage and Processing of Morphologically Complex Words in L2 Spanish

    ERIC Educational Resources Information Center

    Foote, Rebecca

    2017-01-01

    Research with native speakers indicates that, during word recognition, regularly inflected words undergo parsing that segments them into stems and affixes. In contrast, studies with learners suggest that this parsing may not take place in L2. This study's research questions are: Do L2 Spanish learners store and process regularly inflected,…

  4. Parsing Protocols Using Problem Solving Grammars. AI Memo 385.

    ERIC Educational Resources Information Center

    Miller, Mark L.; Goldstein, Ira P.

    A theory of the planning and debugging of computer programs is formalized as a context free grammar, which is used to reveal the constituent structure of problem solving episodes by parsing protocols in which programs are written, tested, and debugged. This is illustrated by the detailed analysis of an actual session with a beginning student…

  5. Parsing in a Dynamical System: An Attractor-Based Account of the Interaction of Lexical and Structural Constraints in Sentence Processing.

    ERIC Educational Resources Information Center

    Tabor, Whitney; And Others

    1997-01-01

    Proposes a dynamical systems approach to parsing in which syntactic hypotheses are associated with attractors in a metric space. The experiments discussed documented various contingent frequency effects that cut across traditional linguistic grains, each of which was predicted by the dynamical systems model. (47 references) (Author/CK)

  6. Effects of Prosodic and Lexical Constraints on Parsing in Young Children (and Adults)

    ERIC Educational Resources Information Center

    Snedeker, Jesse; Yuan, Sylvia

    2008-01-01

    Prior studies of ambiguity resolution in young children have found that children rely heavily on lexical information but persistently fail to use referential constraints in online parsing [Trueswell, J.C., Sekerina, I., Hill, N.M., & Logrip, M.L, (1999). The kindergarten-path effect: Studying on-line sentence processing in young children.…

  7. Writing filter processes for the SAGA editor, appendix G

    NASA Technical Reports Server (NTRS)

    Kirslis, Peter A.

    1985-01-01

    The SAGA editor provides a mechanism by which separate processes can be invoked during an editing session to traverse portions of the parse tree being edited. These processes, termed filter processes, read, analyze, and possibly transform the parse tree, returning the result to the editor. By defining new commands that invoke filter processes with the editor's user-defined command facility, authors of filters can provide complex operations as simple commands. A tree plotter, a pretty printer, and a Pascal tree transformation program have already been written using this facility. This appendix introduces filter processes, describes the parse tree structure, and documents the library interface made available to the programmer. It also discusses how to compile and run filter processes. Examples are presented to illustrate aspects of each of these areas.

  8. Remembering 'zeal' but not 'thing': reverse frequency effects as a consequence of deregulated semantic processing.

    PubMed

    Hoffman, Paul; Jefferies, Elizabeth; Ralph, Matthew A Lambon

    2011-02-01

    More efficient processing of high frequency (HF) words is a ubiquitous finding in healthy individuals, yet frequency effects are often small or absent in stroke aphasia. We propose that some patients fail to show the expected frequency effect because processing of HF words places strong demands on semantic control and regulation processes, counteracting the usual effect. This may occur because HF words appear in a wide range of linguistic contexts, each associated with distinct semantic information. This theory predicts that in extreme circumstances, patients with impaired semantic control should show an outright reversal of the normal frequency effect. To test this prediction, we assessed two patients with impaired semantic control using a delayed repetition task that emphasised activation of semantic representations. By alternating HF and low frequency (LF) trials, we demonstrated a significant repetition advantage for LF words, principally because of perseverative errors in which patients produced the previous LF response in place of the HF target. These errors indicated that HF words were more weakly activated than LF words. We suggest that when presented with no contextual information, patients generate a weak and unstable pattern of semantic activation for HF words because information relating to many possible contexts and interpretations is activated. In contrast, LF words are associated with more stable patterns of activation because similar semantic information is activated whenever they are encountered. Copyright © 2011 Elsevier Ltd. All rights reserved.

  9. Effects of Semantic Context and Fundamental Frequency Contours on Mandarin Speech Recognition by Second Language Learners.

    PubMed

    Zhang, Linjun; Li, Yu; Wu, Han; Li, Xin; Shu, Hua; Zhang, Yang; Li, Ping

    2016-01-01

    Speech recognition by second language (L2) learners in optimal and suboptimal conditions has been examined extensively, with English as the target language in most previous studies. This study extended existing experimental protocols (Wang et al., 2013) to investigate Mandarin speech recognition by Japanese learners of Mandarin at two different levels of proficiency (elementary vs. intermediate). The overall results showed that in addition to L2 proficiency, semantic context, F0 contours, and listening condition all affected recognition performance on the Mandarin sentences. However, the effects of semantic context and F0 contours on L2 speech recognition diverged to some extent. Specifically, there was a significant modulating effect of listening condition on semantic context, indicating that L2 learners made use of semantic context less efficiently in the interfering background than in quiet. In contrast, no significant modulating effect of listening condition on F0 contours was found. Furthermore, there was a significant interaction between semantic context and F0 contours, indicating that semantic context becomes more important for L2 speech recognition when F0 information is degraded. None of these effects were found to be modulated by L2 proficiency. The discrepancy between the effects of semantic context and F0 contours on L2 speech recognition in the interfering background might be related to differences in the processing capacities required by the two types of information in adverse listening conditions.

  10. Efficient Exact Inference With Loss Augmented Objective in Structured Learning.

    PubMed

    Bauer, Alexander; Nakajima, Shinichi; Muller, Klaus-Robert

    2016-08-19

    Structural support vector machine (SVM) is an elegant approach for building complex and accurate models with structured outputs. However, its applicability relies on the availability of efficient inference algorithms: the state-of-the-art training algorithms repeatedly perform inference to compute a subgradient or to find the most violating configuration. In this paper, we propose an exact inference algorithm for maximizing nondecomposable objectives that arise from a special type of high-order potential having a decomposable internal structure. As an important application, our method covers loss-augmented inference, which enables the slack and margin scaling formulations of structural SVM with a variety of dissimilarity measures, e.g., Hamming loss, precision and recall, Fβ-loss, intersection over union, and many other functions that can be efficiently computed from the contingency table. We demonstrate the advantages of our approach in natural language parsing and sequence segmentation applications.
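
    For the simplest decomposable case, loss-augmented inference has a well-known closed form: with Hamming loss on a chain model, the loss term adds one to the unary score of every non-gold label, after which ordinary Viterbi decoding finds the most violating configuration. The sketch below illustrates that textbook case (not the paper's algorithm for nondecomposable objectives), with random scores standing in for a trained model.

        import numpy as np

        def loss_augmented_viterbi(unary, pairwise, y_true):
            """argmax_y  score(y) + Hamming(y_true, y)  for a chain model.
            Hamming loss decomposes per position, so it simply adds +1 to the
            unary score of every label that disagrees with the gold one."""
            T, K = unary.shape
            aug = unary + 1.0
            aug[np.arange(T), y_true] -= 1.0   # no bonus for the gold label
            dp = np.zeros((T, K))
            back = np.zeros((T, K), dtype=int)
            dp[0] = aug[0]
            for t in range(1, T):
                scores = dp[t - 1][:, None] + pairwise + aug[t][None, :]
                back[t] = scores.argmax(axis=0)
                dp[t] = scores.max(axis=0)
            y = [int(dp[-1].argmax())]
            for t in range(T - 1, 0, -1):      # backtrace
                y.append(int(back[t][y[-1]]))
            return y[::-1]

        rng = np.random.default_rng(0)
        unary, pairwise = rng.normal(size=(5, 3)), rng.normal(size=(3, 3))
        print(loss_augmented_viterbi(unary, pairwise, y_true=[0, 1, 2, 1, 0]))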

  11. Automatic textual annotation of video news based on semantic visual object extraction

    NASA Astrophysics Data System (ADS)

    Boujemaa, Nozha; Fleuret, Francois; Gouet, Valerie; Sahbi, Hichem

    2003-12-01

    In this paper, we present our work on automatic generation of textual metadata based on visual content analysis of video news. We present two methods for semantic object detection and recognition from a cross-modal image-text thesaurus. These thesauri represent a supervised association between models and semantic labels. This paper is concerned with two kinds of semantic objects: faces and TV logos. In the first part, we present our work on efficient face detection and recognition with automatic name generation. This method also allows us to suggest textual annotation of shots through close-up estimation. In the second part, we automatically detect and recognize the different TV logos present in incoming news from different TV channels. This work was done jointly with the French TV channel TF1 within the "MediaWorks" project, a hybrid text-image indexing and retrieval platform for video news.

  12. Neuroanatomic organization of sound memory in humans.

    PubMed

    Kraut, Michael A; Pitcock, Jeffery A; Calhoun, Vince; Li, Juan; Freeman, Thomas; Hart, John

    2006-11-01

    The neural interface between sensory perception and memory is a central issue in neuroscience, particularly initial memory organization following perceptual analyses. We used functional magnetic resonance imaging to identify anatomic regions extracting initial auditory semantic memory information related to environmental sounds. Two distinct anatomic foci were detected in the right superior temporal gyrus when subjects identified sounds representing either animals or threatening items. Threatening animal stimuli elicited signal changes in both foci, suggesting a distributed neural representation. Our results demonstrate both category- and feature-specific responses to nonverbal sounds in early stages of extracting semantic memory information from these sounds. This organization allows these category-feature detection nodes to extract early semantic memory information for efficient processing of transient sound stimuli. The neural regions selective for threatening sounds are similar to those of nonhuman primates, demonstrating that semantic memory organization for basic biological/survival primitives is present across species.

  13. A DNA-based semantic fusion model for remote sensing data.

    PubMed

    Sun, Heng; Weng, Jian; Yu, Guangchuang; Massawe, Richard H

    2013-01-01

    Semantic technology plays a key role in various domains, from conversation understanding to algorithm analysis. As the most efficient semantic tool, ontology can represent, process and manage widespread knowledge. Nowadays, many researchers use ontologies to collect and organize data's semantic information in order to maximize research productivity. In this paper, we first describe our work on the development of a remote sensing data ontology, with a primary focus on semantic fusion-driven research for big data. Our ontology is made up of 1,264 concepts and 2,030 semantic relationships. However, the growth of big data is straining the capacities of current semantic fusion and reasoning practices. Considering the massive parallelism of DNA strands, we propose a novel DNA-based semantic fusion model. In this model, a parallel strategy is developed to encode the semantic information in DNA for a large volume of remote sensing data. The semantic information is read in a parallel and bit-wise manner, and each individual bit is converted to a base. By doing so, a considerable amount of conversion time can be saved; the cluster-based multi-process program reduces the conversion time from 81,536 seconds to 4,937 seconds for 4.34 GB of source data files. Moreover, the size of the result file recording the DNA sequences is 54.51 GB for the parallel C program, compared with 57.89 GB for the sequential Perl implementation. This shows that our parallel method can also reduce the DNA synthesis cost. In addition, data types are encoded in our model, which is a basis for building a type system in our future DNA computer. Finally, we describe theoretically an algorithm for DNA-based semantic fusion. This algorithm enables the integration of knowledge from disparate remote sensing data sources into a consistent, accurate, and complete representation. The process depends solely on ligation reactions and screening operations instead of the ontology.
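
    The bit-wise encoding step can be illustrated in a few lines. The sketch below assumes a one-bit-per-base mapping (0 to 'A', 1 to 'T'), which is an invented stand-in for the paper's actual encoding; it shows only the shape of the conversion, not the parallel cluster implementation.

        # Toy sketch of bit-wise DNA encoding. The concrete mapping used here
        # (0 -> 'A', 1 -> 'T') is an assumption for illustration only.
        BIT_TO_BASE = {"0": "A", "1": "T"}
        BASE_TO_BIT = {v: k for k, v in BIT_TO_BASE.items()}

        def encode(data: bytes) -> str:
            bits = "".join(f"{byte:08b}" for byte in data)
            return "".join(BIT_TO_BASE[b] for b in bits)

        def decode(strand: str) -> bytes:
            bits = "".join(BASE_TO_BIT[base] for base in strand)
            return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

        strand = encode(b"GIS")
        assert decode(strand) == b"GIS"
        print(strand)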

  14. A DNA-Based Semantic Fusion Model for Remote Sensing Data

    PubMed Central

    Sun, Heng; Weng, Jian; Yu, Guangchuang; Massawe, Richard H.

    2013-01-01

    Semantic technology plays a key role in various domains, from conversation understanding to algorithm analysis. As the most efficient semantic tool, ontology can represent, process and manage widespread knowledge. Nowadays, many researchers use ontologies to collect and organize data's semantic information in order to maximize research productivity. In this paper, we first describe our work on the development of a remote sensing data ontology, with a primary focus on semantic fusion-driven research for big data. Our ontology is made up of 1,264 concepts and 2,030 semantic relationships. However, the growth of big data is straining the capacities of current semantic fusion and reasoning practices. Considering the massive parallelism of DNA strands, we propose a novel DNA-based semantic fusion model. In this model, a parallel strategy is developed to encode the semantic information in DNA for a large volume of remote sensing data. The semantic information is read in a parallel and bit-wise manner, and each individual bit is converted to a base. By doing so, a considerable amount of conversion time can be saved; the cluster-based multi-process program reduces the conversion time from 81,536 seconds to 4,937 seconds for 4.34 GB of source data files. Moreover, the size of the result file recording the DNA sequences is 54.51 GB for the parallel C program, compared with 57.89 GB for the sequential Perl implementation. This shows that our parallel method can also reduce the DNA synthesis cost. In addition, data types are encoded in our model, which is a basis for building a type system in our future DNA computer. Finally, we describe theoretically an algorithm for DNA-based semantic fusion. This algorithm enables the integration of knowledge from disparate remote sensing data sources into a consistent, accurate, and complete representation. The process depends solely on ligation reactions and screening operations instead of the ontology. PMID:24116207

  15. Is human sentence parsing serial or parallel? Evidence from event-related brain potentials.

    PubMed

    Hopf, Jens-Max; Bader, Markus; Meng, Michael; Bayer, Josef

    2003-01-01

    In this ERP study we investigate the processes that occur in syntactically ambiguous German sentences at the point of disambiguation. Whereas most psycholinguistic theories agree on the view that processing difficulties arise when parsing preferences are disconfirmed (so-called garden-path effects), important differences exist with respect to theoretical assumptions about the parser's recovery from a misparse. A key distinction can be made between parsers that compute all alternative syntactic structures in parallel (parallel parsers) and parsers that compute only a single preferred analysis (serial parsers). To distinguish empirically between parallel and serial parsing models, we compare ERP responses to garden-path sentences with ERP responses to truly ungrammatical sentences. Garden-path sentences contain a temporary and ultimately curable ungrammaticality, whereas truly ungrammatical sentences remain so permanently--a difference which gives rise to different predictions in the two classes of parsing architectures. At the disambiguating word, ERPs in both sentence types show negative shifts of similar onset latency, amplitude, and scalp distribution in an initial time window between 300 and 500 ms. In a following time window (500-700 ms), the negative shift to garden-path sentences disappears at right central parietal sites, while it continues in permanently ungrammatical sentences. These data are taken as evidence for a strictly serial parser. The absence of a difference in the early time window indicates that temporary and permanent ungrammaticalities trigger the same kind of parsing responses. Later differences can be related to successful reanalysis in garden-path but not in ungrammatical sentences. Copyright 2003 Elsevier Science B.V.

  16. A Practical Approach to Implementing Real-Time Semantics

    NASA Technical Reports Server (NTRS)

    Luettgen, Gerald; Bhat, Girish; Cleaveland, Rance

    1999-01-01

    This paper investigates implementations of process algebras which are suitable for modeling concurrent real-time systems. It suggests an approach for efficiently implementing real-time semantics using dynamic priorities. For this purpose a process algebra with dynamic priority is defined, whose semantics corresponds one-to-one to traditional real-time semantics. The advantage of the dynamic-priority approach is that it drastically reduces the state-space sizes of the systems in question while preserving all properties of their functional and real-time behavior. The utility of the technique is demonstrated by a case study which deals with the formal modeling and verification of the SCSI-2 bus protocol. The case study is carried out in the Concurrency Workbench of North Carolina, an automated verification tool in which the process algebra with dynamic priority is implemented. It turns out that the state space of the bus-protocol model is about an order of magnitude smaller than the one resulting from real-time semantics. The accuracy of the model is proved by applying model checking to verify several mandatory properties of the bus protocol.

  17. Semantic Enrichment of Movement Behavior with Foursquare--A Visual Analytics Approach.

    PubMed

    Krueger, Robert; Thom, Dennis; Ertl, Thomas

    2015-08-01

    In recent years, many approaches have been developed that efficiently and effectively visualize movement data, e.g., by providing suitable aggregation strategies to reduce visual clutter. Analysts can use them to identify distinct movement patterns, such as trajectories with similar direction, form, length, and speed. However, less effort has been spent on finding the semantics behind movements, i.e., why somebody or something is moving. This can be of great value for different applications, such as product usage and consumer analysis, for better understanding urban dynamics, and for improving situational awareness. Unfortunately, semantic information often gets lost when data is recorded. Thus, we propose enriching trajectory data with POI information from social media services and show how semantic insights can be gained. Furthermore, we show how to handle semantic uncertainties in time and space, which result from noisy, imprecise, and missing data, by introducing a POI decision model in combination with highly interactive visualizations. Finally, we evaluate our approach with two case studies on a large electric scooter data set and test our model on data with known ground truth.

  18. ViennaNGS: A toolbox for building efficient next- generation sequencing analysis pipelines

    PubMed Central

    Wolfinger, Michael T.; Fallmann, Jörg; Eggenhofer, Florian; Amman, Fabian

    2015-01-01

    Recent achievements in next-generation sequencing (NGS) technologies have led to a high demand for reusable software components to easily compile customized analysis workflows for big genomics data. We present ViennaNGS, an integrated collection of Perl modules focused on building efficient pipelines for NGS data processing. It comes with functionality for extracting and converting features from common NGS file formats, computation and evaluation of read mapping statistics, as well as normalization of RNA abundance. Moreover, ViennaNGS provides software components for identification and characterization of splice junctions from RNA-seq data, parsing and condensing sequence motif data, automated construction of Assembly and Track Hubs for the UCSC genome browser, as well as wrapper routines for a set of commonly used NGS command line tools. PMID:26236465

  19. A fast and efficient python library for interfacing with the Biological Magnetic Resonance Data Bank.

    PubMed

    Smelter, Andrey; Astra, Morgan; Moseley, Hunter N B

    2017-03-17

    The Biological Magnetic Resonance Data Bank (BMRB) is a public repository of Nuclear Magnetic Resonance (NMR) spectroscopic data of biological macromolecules. It is an important resource for many researchers using NMR to study structural, biophysical, and biochemical properties of biological macromolecules. It is primarily maintained and accessed in a flat-file ASCII format known as NMR-STAR. While the format is human readable, the size of most BMRB entries makes computer readability and explicit representation a practical requirement for almost any rigorous systematic analysis. To aid in the use of this public resource, we have developed a package called nmrstarlib in the popular open-source programming language Python. The nmrstarlib implementation is very efficient, in both design and execution. The library has facilities for reading and writing both NMR-STAR version 2.1 and 3.1 formatted files, parsing them into usable Python dictionary- and list-based data structures, making access and manipulation of the experimental data very natural within Python programs (e.g., "saveframe" and "loop" records are represented as individual Python dictionary data structures). Another major advantage of this design is that data stored in original NMR-STAR can be easily converted into the equivalent JavaScript Object Notation (JSON) format, a lightweight data interchange format, facilitating data access and manipulation using Python and any other programming language that implements a JSON parser/generator (i.e., all popular programming languages). We have also developed tools to visualize assigned chemical shift values and to convert between NMR-STAR and JSONized NMR-STAR formatted files. Full API reference documentation, a user guide, and a tutorial with code examples are also available. We have tested this new library on all current BMRB entries: 100% of all entries are parsed without any errors for both NMR-STAR version 2.1 and version 3.1 formatted files. We also compared our software to three currently available Python libraries for parsing NMR-STAR formatted files: PyStarLib, NMRPyStar, and PyNMRSTAR. The nmrstarlib package is a simple, fast, and efficient library for accessing data from the BMRB. The library provides an intuitive dictionary-based interface with which Python programs can read, edit, and write NMR-STAR formatted files and their equivalent JSONized NMR-STAR files. The nmrstarlib package can be used as a library for accessing and manipulating data stored in NMR-STAR files and as a command-line tool to convert from the NMR-STAR file format into its equivalent JSON file format and vice versa, and to visualize chemical shift values. Furthermore, the nmrstarlib implementation provides a guide for effectively JSONizing other older scientific formats, improving the FAIRness of data in these formats.
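
    To illustrate the JSONizing idea without reproducing nmrstarlib's actual API, the toy sketch below parses a drastically simplified saveframe-style syntax into nested Python dictionaries and serializes the result with the standard json module; real NMR-STAR files are far richer (loops, multi-line values) than this fragment.

        import json

        # Drastically simplified saveframe-style input (NOT real NMR-STAR).
        text = """save_entry_information
        _Entry.ID          15000
        _Entry.Title       Example
        save_
        save_chemical_shifts
        _Shift.Value       4.73
        save_
        """

        def parse_star_like(text):
            """Collect tag/value pairs into one dictionary per saveframe."""
            frames, current = {}, None
            for line in text.splitlines():
                line = line.strip()
                if line.startswith("save_") and line != "save_":
                    current = line[len("save_"):]   # open a new frame
                    frames[current] = {}
                elif line == "save_":
                    current = None                  # close the frame
                elif line.startswith("_") and current is not None:
                    tag, value = line.split(None, 1)
                    frames[current][tag] = value
            return frames

        print(json.dumps(parse_star_like(text), indent=2))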

  20. Ontology Reuse in Geoscience Semantic Applications

    NASA Astrophysics Data System (ADS)

    Mayernik, M. S.; Gross, M. B.; Daniels, M. D.; Rowan, L. R.; Stott, D.; Maull, K. E.; Khan, H.; Corson-Rikert, J.

    2015-12-01

    The tension between local ontology development and wider ontology connections is fundamental to the Semantic Web. It is often unclear, however, what the key decision points should be for new semantic web applications in deciding when to reuse existing ontologies and when to develop original ontologies. In addition, with the growth of semantic web ontologies and applications, new semantic web applications can struggle to efficiently and effectively identify and select ontologies to reuse. This presentation will describe the ontology comparison, selection, and consolidation effort within the EarthCollab project. UCAR, Cornell University, and UNAVCO are collaborating on the EarthCollab project to use semantic web technologies to enable the discovery of the research output from a diverse array of projects. The EarthCollab project is using the VIVO semantic web software suite to increase discoverability of research information and data related to the following two geoscience-based communities: (1) the Bering Sea Project, an interdisciplinary field program whose data archive is hosted by NCAR's Earth Observing Laboratory (EOL), and (2) diverse research projects informed by geodesy through the UNAVCO geodetic facility and consortium. This presentation will outline EarthCollab use cases and provide an overview of key ontologies being used, including the VIVO-Integrated Semantic Framework (VIVO-ISF), Global Change Information System (GCIS), and Data Catalog (DCAT) ontologies. We will discuss issues related to bringing these ontologies together to provide a robust ontological structure to support the EarthCollab use cases. It is rare that a single pre-existing ontology meets all of a new application's needs. New projects need to stitch ontologies together in ways that fit into the broader semantic web ecosystem.

  1. Pippi — Painless parsing, post-processing and plotting of posterior and likelihood samples

    NASA Astrophysics Data System (ADS)

    Scott, Pat

    2012-11-01

    Interpreting samples from likelihood or posterior probability density functions is rarely as straightforward as it seems it should be. Producing publication-quality graphics of these distributions is often similarly painful. In this short note I describe pippi, a simple, publicly available package for parsing and post-processing such samples, as well as generating high-quality PDF graphics of the results. Pippi is easily and extensively configurable and customisable, both in its options for parsing and post-processing samples, and in the visual aspects of the figures it produces. I illustrate some of these using an existing supersymmetric global fit, performed in the context of a gamma-ray search for dark matter. Pippi can be downloaded and followed at http://github.com/patscott/pippi.
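
    The core post-processing step that pippi automates can be sketched in a few lines of numpy (this is an illustration of the idea, not pippi's code): weighted posterior samples are collapsed onto one parameter by histogramming, yielding a one-dimensional marginal from which summary quantities can be read off.

        import numpy as np

        # Fake two-parameter chain standing in for a real scan's output.
        rng = np.random.default_rng(1)
        samples = rng.normal(loc=0.5, scale=0.2, size=(10_000, 2))
        weights = np.ones(len(samples))   # e.g. sample multiplicities

        # 1-D marginal posterior of parameter 0, normalized to unit area.
        density, edges = np.histogram(samples[:, 0], bins=40,
                                      weights=weights, density=True)
        centres = 0.5 * (edges[:-1] + edges[1:])
        print(centres[density.argmax()])  # crude posterior-mode estimate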

  2. The Parsing Syllable Envelopes Test for Assessment of Amplitude Modulation Discrimination Skills in Children: Development, Normative Data, and Test-Retest Reliability Studies.

    PubMed

    Cameron, Sharon; Chong-White, Nicky; Mealings, Kiri; Beechey, Tim; Dillon, Harvey; Young, Taegan

    2018-02-01

    Intensity peaks and valleys in the acoustic signal are salient cues to syllable structure, which is accepted to be a crucial early step in phonological processing. As such, the ability to detect low-rate (envelope) modulations in signal amplitude is essential to parse an incoming speech signal into smaller phonological units. The Parsing Syllable Envelopes (ParSE) test was developed to quantify the ability of children to recognize syllable boundaries using an amplitude modulation detection paradigm. The envelope of a 750-msec steady-state /a/ vowel is modulated into two or three pseudo-syllables using notches with modulation depths varying between 0% and 100% along an 11-step continuum. In an adaptive three-alternative forced-choice procedure, the participant identified whether one, two, or three pseudo-syllables were heard. The study comprised development of the ParSE stimuli and test protocols, and collection of normative and test-retest reliability data. Participants were eleven adults (aged 23 yr 10 mo to 50 yr 9 mo, mean 32 yr 10 mo) and 134 typically developing, primary-school children (aged 6 yr 0 mo to 12 yr 4 mo, mean 9 yr 3 mo); there were 73 males and 72 females. Data were collected using a touchscreen computer. Psychometric functions (PFs) were automatically fit to individual data by the ParSE software. Performance was related to the modulation depth at which syllables can be detected with 88% accuracy (referred to as the upper boundary of the uncertainty region [UBUR]). A shallower PF slope reflected a greater level of uncertainty. Age effects were determined based on raw scores. z scores were calculated to account for the effect of age on performance. Outliers, and individual data for which the confidence interval of the UBUR exceeded a maximum allowable value, were removed. Nonparametric tests were used as the data were skewed toward negative performance. Across participants, the performance criterion (UBUR) was met at a median modulation depth of 42%. The effect of age on the UBUR was significant (p < 0.00001). The UBUR ranged from 50% modulation depth for 6-yr-olds to 25% for adults. Children aged 6-10 had significantly higher uncertainty region boundaries than adults. A distribution skewed toward negative performance was observed (p = 0.00007). There was no significant difference in performance on the ParSE between males and females (p = 0.60). Test-retest z scores were strongly correlated (r = 0.68, p < 0.0000001). The ParSE normative data show that the ability to identify syllable boundaries based on changes in amplitude modulation improves with age, and that some children in the general population perform much worse than their age peers. The test is suitable for use in planned studies in a clinical population. American Academy of Audiology.
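
    The psychometric-function step can be sketched as follows, assuming a logistic function with a 1/3 guessing floor (three-alternative forced choice) and fabricated response data; the exact functional form the ParSE software fits is not specified in the abstract. The fitted curve is inverted at the 88%-correct criterion to obtain the UBUR.

        import numpy as np
        from scipy.optimize import curve_fit

        def pf(depth, midpoint, slope, chance=1/3):
            """Logistic psychometric function with a 3AFC guessing floor."""
            return chance + (1 - chance) / (1 + np.exp(-slope * (depth - midpoint)))

        depths = np.linspace(0, 100, 11)                  # % modulation depth
        p_correct = np.array([.33, .35, .40, .52, .66, .78,
                              .88, .94, .97, .99, 1.0])   # fabricated data

        (mid, slope), _ = curve_fit(pf, depths, p_correct, p0=[50, 0.1])

        # Invert the fitted PF at the 88%-correct criterion.
        target = 0.88
        ubur = mid - np.log((1 - 1/3) / (target - 1/3) - 1) / slope
        print(f"UBUR ≈ {ubur:.1f}% modulation depth")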

  3. The effect of working memory load on semantic illusions: what the phonological loop and central executive have to contribute.

    PubMed

    Büttner, Anke Caroline

    2012-01-01

    When asked how many animals of each kind Moses took on the Ark, most people respond with "two" despite the substituted name (Moses for Noah) in the question. Possible explanations for semantic illusions appear to be related to processing limitations such as those of working memory. Indeed, individual working memory capacity has an impact upon how sentences containing substitutions are processed. This experiment examined further the role of working memory in the occurrence of semantic illusions using a dual-task working memory load approach. Participants verified statements while engaging in either articulatory suppression or random number generation. Secondary task type had a significant effect on semantic illusion rate, but only when comparing the control condition to the two dual-task conditions. Furthermore, secondary task performance in the random number generation condition declined, suggesting a tradeoff between tasks. Response time analyses also showed a different pattern of processing across the conditions. The findings suggest that the phonological loop plays a role in representing semantic illusion sentences coherently and in monitoring for details, while the role of the central executive is to assist gist-processing of sentences. This usually efficient strategy leads to error in the case of semantic illusions.

  4. Decoding the Formation of New Semantics: MVPA Investigation of Rapid Neocortical Plasticity during Associative Encoding through Fast Mapping.

    PubMed

    Atir-Sharon, Tali; Gilboa, Asaf; Hazan, Hananel; Koilis, Ester; Manevitz, Larry M

    2015-01-01

    Neocortical structures typically only support slow acquisition of declarative memory; however, learning through fast mapping may facilitate rapid learning-induced cortical plasticity and hippocampal-independent integration of novel associations into existing semantic networks. During fast mapping the meaning of new words and concepts is inferred, and durable novel associations are incidentally formed, a process thought to support early childhood's exuberant learning. The anterior temporal lobe, a cortical semantic memory hub, may critically support such learning. We investigated encoding of semantic associations through fast mapping using fMRI and multivoxel pattern analysis. Subsequent memory performance following fast mapping was more efficiently predicted using anterior temporal lobe than hippocampal voxels, while standard explicit encoding was best predicted by hippocampal activity. Searchlight algorithms revealed additional activity patterns that predicted successful fast mapping semantic learning located in lateral occipitotemporal and parietotemporal neocortex and ventrolateral prefrontal cortex. By contrast, successful explicit encoding could be classified by activity in medial and dorsolateral prefrontal and parahippocampal cortices. We propose that fast mapping promotes incidental rapid integration of new associations into existing neocortical semantic networks by activating related, nonoverlapping conceptual knowledge. In healthy adults, this is better captured by unique anterior and lateral temporal lobe activity patterns, while hippocampal involvement is less predictive of this kind of learning.

  5. Artificial Intelligence-Based Semantic Internet of Things in a User-Centric Smart City

    PubMed Central

    Guo, Kun; Lu, Yueming; Gao, Hui; Cao, Ruohan

    2018-01-01

    Smart city (SC) technologies can provide appropriate services according to citizens’ demands. One of the key enablers in a SC is the Internet of Things (IoT) technology, which enables a massive number of devices to connect with each other. However, these devices usually come from different manufacturers with different product standards, which creates interactive control problems. Moreover, these devices produce large amounts of data, and efficiently analyzing these data for intelligent services is a challenge. In this paper, we propose a novel artificial intelligence-based semantic IoT (AI-SIoT) hybrid service architecture to integrate heterogeneous IoT devices in support of intelligent services. In particular, the proposed architecture is empowered by semantic and AI technologies, which enable flexible connections among heterogeneous devices. The AI technology supports efficient data analysis and accurate decisions on service provision of various kinds. Furthermore, we present several practical use cases of the proposed AI-SIoT architecture, and the opportunities and challenges of implementing the proposed AI-SIoT for future SCs are also discussed. PMID:29701679

  6. Artificial Intelligence-Based Semantic Internet of Things in a User-Centric Smart City.

    PubMed

    Guo, Kun; Lu, Yueming; Gao, Hui; Cao, Ruohan

    2018-04-26

    Smart city (SC) technologies can provide appropriate services according to citizens’ demands. One of the key enablers in a SC is the Internet of Things (IoT) technology, which enables a massive number of devices to connect with each other. However, these devices usually come from different manufacturers with different product standards, which creates interactive control problems. Moreover, these devices produce large amounts of data, and efficiently analyzing these data for intelligent services is a challenge. In this paper, we propose a novel artificial intelligence-based semantic IoT (AI-SIoT) hybrid service architecture to integrate heterogeneous IoT devices in support of intelligent services. In particular, the proposed architecture is empowered by semantic and AI technologies, which enable flexible connections among heterogeneous devices. The AI technology supports efficient data analysis and accurate decisions on service provision of various kinds. Furthermore, we present several practical use cases of the proposed AI-SIoT architecture, and the opportunities and challenges of implementing the proposed AI-SIoT for future SCs are also discussed.

  7. Aggregation Tool to Create Curated Data albums to Support Disaster Recovery and Response

    NASA Astrophysics Data System (ADS)

    Ramachandran, R.; Kulkarni, A.; Maskey, M.; Li, X.; Flynn, S.

    2014-12-01

    Economic losses due to natural hazards are estimated to be around 6-10 billion dollars annually for the U.S., and this number keeps increasing every year. This increase has been attributed to population growth and migration to more hazard-prone locations. As this trend continues, in concert with shifts in weather patterns caused by climate change, it is anticipated that losses associated with natural disasters will keep growing substantially. One of the challenges disaster response and recovery analysts face is to quickly find, access, and utilize the vast variety of relevant geospatial data collected by different federal agencies. Analysts are often familiar with a limited set of specific datasets and are often unaware of or unfamiliar with a large quantity of other useful resources. Finding airborne or satellite data relevant to a natural disaster event often requires a time-consuming search through web pages and data archives. The search process could be made much more efficient and productive if a tool could go beyond a typical search engine and provide not just links to web sites but actual links to specific data relevant to the natural disaster, parse unstructured reports for useful information nuggets, and gather other related reports, summaries, news stories, and images. This presentation will describe a semantic aggregation tool developed to address a similar problem for Earth Science researchers. This tool provides automated curation and creates "Data Albums" to support case studies. The generated "Data Albums" are compiled collections of information related to a specific science topic or event, containing links to relevant data files (granules) from different instruments; tools and services for visualization and analysis; information about the event contained in news reports; and images or videos to supplement research analysis. An ontology-based relevancy-ranking algorithm drives the curation of relevant data sets for a given event. This tool is now being used to generate a catalog of case studies focusing on hurricanes and severe storms.

  8. Conceptual Plural Information Is Used to Guide Early Parsing Decisions: Evidence from Garden-Path Sentences with Reciprocal Verbs

    ERIC Educational Resources Information Center

    Patson, Nikole D.; Ferreira, Fernanda

    2009-01-01

    In three eyetracking studies, we investigated the role of conceptual plurality in initial parsing decisions in temporarily ambiguous sentences with reciprocal verbs (e.g., "While the lovers kissed the baby played alone"). We varied the subject of the first clause using three types of plural noun phrases: conjoined noun phrases ("the bride and the…

  9. Expanding the Extent of a UMLS Semantic Type via Group Neighborhood Auditing

    PubMed Central

    Chen, Yan; Gu, Huanying; Perl, Yehoshua; Halper, Michael; Xu, Junchuan

    2009-01-01

    Objective Each Unified Medical Language System (UMLS) concept is assigned one or more semantic types (ST). A dynamic methodology for aiding an auditor in finding concepts that are missing the assignment of a given ST, S, is presented. Design The first part of the methodology exploits the previously introduced Refined Semantic Network and accompanying refined semantic types (RST) to help narrow the search space for offending concepts. The auditing is focused in a neighborhood surrounding the extent of an RST, T (of S), called an envelope, consisting of parents and children of concepts in the extent. The audit moves outward as long as missing assignments are discovered. In the second part, concepts not reached previously are processed and reassigned T as needed during the processing of S's other RSTs. The set of such concepts is expanded in a similar way to that in the first part. Measurements The number of errors discovered is reported. To measure the methodology's efficiency, "error hit rates" (i.e., errors found in concepts examined) are computed. Results The methodology was applied to three STs: Experimental Model of Disease (EMD), Environmental Effect of Humans, and Governmental or Regulatory Activity. The EMD experienced the most drastic change. For its RST "EMD ∩ Neoplastic Process" (RST "EMD") with only 33 (31) original concepts, 915 (134) concepts were found by the first (second) part to be missing the EMD assignment. Changes to the other two STs were smaller. Conclusion The results show that the proposed auditing methodology can help to effectively and efficiently identify concepts lacking the assignment of a particular semantic type. PMID:19567802

  10. The Yin and the Yang of Prediction: An fMRI Study of Semantic Predictive Processing

    PubMed Central

    Weber, Kirsten; Lau, Ellen F.; Stillerman, Benjamin; Kuperberg, Gina R.

    2016-01-01

    Probabilistic prediction plays a crucial role in language comprehension. When predictions are fulfilled, the resulting facilitation allows for fast, efficient processing of ambiguous, rapidly-unfolding input; when predictions are not fulfilled, the resulting error signal allows us to adapt to broader statistical changes in this input. We used functional Magnetic Resonance Imaging to examine the neuroanatomical networks engaged in semantic predictive processing and adaptation. We used a relatedness proportion semantic priming paradigm, in which we manipulated the probability of predictions while holding local semantic context constant. Under conditions of higher (versus lower) predictive validity, we replicate previous observations of reduced activity to semantically predictable words in the left anterior superior/middle temporal cortex, reflecting facilitated processing of targets that are consistent with prior semantic predictions. In addition, under conditions of higher (versus lower) predictive validity we observed significant differences in the effects of semantic relatedness within the left inferior frontal gyrus and the posterior portion of the left superior/middle temporal gyrus. We suggest that together these two regions mediated the suppression of unfulfilled semantic predictions and lexico-semantic processing of unrelated targets that were inconsistent with these predictions. Moreover, under conditions of higher (versus lower) predictive validity, a functional connectivity analysis showed that the left inferior frontal and left posterior superior/middle temporal gyrus were more tightly interconnected with one another, as well as with the left anterior cingulate cortex. The left anterior cingulate cortex was, in turn, more tightly connected to superior lateral frontal cortices and subcortical regions—a network that mediates rapid learning and adaptation and that may have played a role in switching to a more predictive mode of processing in response to the statistical structure of the wider environmental context. Together, these findings highlight close links between the networks mediating semantic prediction, executive function and learning, giving new insights into how our brains are able to flexibly adapt to our environment. PMID:27010386

  12. Texture Mixing via Universal Simulation

    DTIC Science & Technology

    2005-08-01

    classes and universal simulation. Based on the well-known Lempel and Ziv (LZ) universal compression scheme, the universal type class of a one...length that produce the same tree (dictionary) under the Lempel-Ziv (LZ) incremental parsing defined in the well-known LZ78 universal compression ...the well-known Lempel-Ziv parsing algorithm. The goal is not just to synthesize mixed textures, but to understand what texture is. We are currently
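
    The LZ78 incremental parse referenced above is simple to restate: each new phrase is the longest previously seen phrase extended by one symbol, and the accumulated phrases form the dictionary tree that defines a sequence's universal type class. A minimal, self-contained sketch of that parse (illustrative only, not the report's texture-synthesis code):

    ```python
    def lz78_parse(s):
        """Incrementally parse s into LZ78 phrases.

        Each phrase is the longest previously seen phrase plus one new
        symbol; the set of phrases forms the dictionary tree that
        defines the sequence's universal type class.
        """
        dictionary = {"": 0}   # phrase -> index; "" is the root of the tree
        phrases = []
        current = ""
        for ch in s:
            if current + ch in dictionary:
                current += ch  # keep extending the match
            else:
                phrases.append((dictionary[current], ch))  # (prefix index, new symbol)
                dictionary[current + ch] = len(dictionary)
                current = ""
        if current:  # trailing phrase that was already in the dictionary
            phrases.append((dictionary[current[:-1]], current[-1]))
        return phrases

    print(lz78_parse("aababbbaaaba"))
    # -> [(0, 'a'), (1, 'b'), (2, 'b'), (0, 'b'), (1, 'a'), (2, 'a')],
    #    i.e., the phrases a, ab, abb, b, aa, aba
    ```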

  13. Using the Longman Mini-concordancer on Tagged and Parsed Corpora, with Special Reference to Their Use as an Aid to Grammar Learning.

    ERIC Educational Resources Information Center

    Qiao, Hong Liang; Sussex, Roland

    1996-01-01

    Presents methods for using the Longman Mini-Concordancer on tagged and parsed corpora rather than plain text corpora. The article discusses several aspects with models to be applied in the classroom as an aid to grammar learning. This paper suggests exercises suitable for teaching English to both native and nonnative speakers. (13 references)…

  14. Parsing and Tagging of Bilingual Dictionary

    DTIC Science & Technology

    2003-09-01

    LAMP-TR-106 CAR-TR-991 CS-TR-4529 UMIACS-TR-2003-97 PARSING AND TAGGING OF BILINGUAL DICTIONARY Huanfeng Ma, Burcu Karagol-Ayan, David... dictionaries hold great potential as a source of lexical resources for training and testing automated systems for optical character recognition, machine...translation, and cross-language information retrieval. In this paper, we describe a system for extracting term lexicons from printed bilingual dictionaries

  15. Annotating Socio-Cultural Structures in Text

    DTIC Science & Technology

    2012-10-31

    parts of speech (POS) within text, using the Stanford Part of Speech Tagger (Stanford Log-Linear, 2011). The ERDC-CERL taxonomy is then used to...annotated. NP/VP Pane: shows the sentence parsed using the parts-of-speech tagger. Document View Pane: specifies the document (being annotated) in three...first parsed using the Stanford parts-of-speech tagger and converted to an XML document, both components of which are done through the Import function

  16. Development of an HL7 interface engine, based on tree structure and streaming algorithm, for large-size messages which include image data.

    PubMed

    Um, Ki Sung; Kwak, Yun Sik; Cho, Hune; Kim, Il Kon

    2005-11-01

    A basic assumption of the Health Level Seven (HL7) protocol is 'no limitation of message length'. However, most existing commercial HL7 interface engines do limit message length because they use the string array method, which runs in main memory during HL7 message parsing. Specifically, messages with image and multimedia data create a long string array and can cause critical, fatal failures in the computer system. Consequently, HL7 messages could not handle the image and multimedia data necessary in modern medical records. This study aims to solve this problem with the 'streaming algorithm' method. This new method for HL7 message parsing applies a character-stream object that processes the message character by character between main memory and the hard disk, so that the processing load on main memory is alleviated. The main functions of this new engine are generating, parsing, validating, browsing, sending, and receiving HL7 messages. The engine can also parse and generate XML-formatted HL7 messages. This new HL7 engine successfully exchanged HL7 messages containing 10-megabyte images and discharge summary information between two university hospitals.
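
    The streaming idea is easy to illustrate: rather than materializing the whole message as one string, the parser consumes a character stream and emits one segment at a time, so only a bounded buffer ever resides in main memory. The sketch below is a hedged reconstruction of that idea, not the authors' engine, and assumes standard HL7 v2 framing (carriage-return segment terminators, '|' field separators):

    ```python
    import io

    def stream_segments(reader, chunk_size=8192):
        """Yield HL7 v2 segments one at a time from a character stream.

        Only one chunk plus a partial segment is buffered, so a message
        carrying megabytes of embedded image data never has to be held
        in memory as a single string.
        """
        buffer = ""
        while True:
            chunk = reader.read(chunk_size)
            if not chunk:
                break
            buffer += chunk
            while "\r" in buffer:             # '\r' terminates an HL7 segment
                segment, buffer = buffer.split("\r", 1)
                if segment:
                    yield segment.split("|")  # '|' is the default field separator
        if buffer:
            yield buffer.split("|")

    # Illustrative two-segment message (field values are made up).
    msg = ("MSH|^~\\&|SENDING_APP|HOSP_A|RECEIVING_APP|HOSP_B|202001011200||ADT^A01|MSG0001|P|2.5\r"
           "PID|1||12345^^^MRN||DOE^JOHN\r")
    for fields in stream_segments(io.StringIO(msg)):
        print(fields[0], "-", len(fields), "fields")
    ```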

  17. Text-to-phonemic transcription and parsing into mono-syllables of English text

    NASA Astrophysics Data System (ADS)

    Jusgir Mullick, Yugal; Agrawal, S. S.; Tayal, Smita; Goswami, Manisha

    2004-05-01

    The present paper describes a program that converts English text (entered through the normal computer keyboard) into its phonemic representation and then parses it into mono-syllables. For every letter a set of context-based rules is defined in lexical order. A default rule is also defined separately for each letter. Beginning from the first letter of the word, the rules are checked and the most appropriate rule is applied to the letter to find its actual orthographic representation. If no matching rule is found, the default rule is applied. The current rule sets the next position to be analyzed. Proceeding in the same manner, the orthographic representation of each word can be found. For example, ``reading'' is represented as ``rEdiNX'' by applying the following rules: r --> r (move 1 position ahead); ead --> Ed (move 3 positions ahead); i --> i (move 1 position ahead); ng --> NX (move 2 positions ahead, i.e., end of word). The phonemic representations obtained from the above procedure are parsed to get mono-syllabic representations for various combinations such as CVC, CVCC, CV, CVCVC, etc. For example, the above phonemic representation will be parsed as rEdiNX --> /rE/ /diNX/. This study is a part of developing TTS for Indian English.
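
    The scanning procedure amounts to longest-match rule application with a per-letter default: at each position the most specific matching rule fires, emits its phoneme, and advances the cursor by the length of its grapheme. A toy sketch of that loop, using only the rules quoted in the abstract (the paper's full rule inventory is not given here):

    ```python
    # Context rules per starting letter, most specific first; the default
    # rule is the single-letter entry. Only the abstract's example rules
    # are included, so this handles "reading" and little else.
    RULES = {
        "r": [("r", "r")],
        "e": [("ead", "Ed"), ("e", "e")],
        "a": [("a", "a")],
        "d": [("d", "d")],
        "i": [("i", "i")],
        "n": [("ng", "NX"), ("n", "n")],
        "g": [("g", "g")],
    }

    def to_phonemes(word):
        out, pos = [], 0
        while pos < len(word):
            # Fall back to an identity rule for letters with no rule set.
            for grapheme, phoneme in RULES.get(word[pos], [(word[pos], word[pos])]):
                if word.startswith(grapheme, pos):
                    out.append(phoneme)
                    pos += len(grapheme)  # the applied rule sets the next position
                    break
        return "".join(out)

    print(to_phonemes("reading"))  # -> rEdiNX, as in the abstract's example
    ```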

  18. Automatic markerless registration of point clouds with semantic-keypoint-based 4-points congruent sets

    NASA Astrophysics Data System (ADS)

    Ge, Xuming

    2017-08-01

    The coarse registration of point clouds from urban building scenes has become a key topic in applications of terrestrial laser scanning technology. Sampling-based algorithms in the random sample consensus (RANSAC) model have emerged as mainstream solutions to address coarse registration problems. In this paper, we propose a novel combined solution to automatically align two markerless point clouds from building scenes. Firstly, the method segments non-ground points from ground points. Secondly, the proposed method detects feature points from each cross section and then obtains semantic keypoints by connecting feature points with specific rules. Finally, the detected semantic keypoints from two point clouds act as inputs to a modified 4PCS algorithm. Examples are presented and the results compared with those of K-4PCS to demonstrate the main contributions of the proposed method, which are the extension of the original 4PCS to handle heavy datasets and the use of semantic keypoints to improve K-4PCS in relation to registration accuracy and computational efficiency.

  19. A memory learning framework for effective image retrieval.

    PubMed

    Han, Junwei; Ngan, King N; Li, Mingjing; Zhang, Hong-Jiang

    2005-04-01

    Most current content-based image retrieval systems are still incapable of providing users with their desired results. The major difficulty lies in the gap between low-level image features and high-level image semantics. To address the problem, this study reports a framework for effective image retrieval by employing a novel idea of memory learning. It forms a knowledge memory model to store the semantic information by simply accumulating user-provided interactions. A learning strategy is then applied to predict the semantic relationships among images according to the memorized knowledge. Image queries are finally performed based on a seamless combination of low-level features and learned semantics. One important advantage of our framework is its ability to efficiently annotate images and also propagate the keyword annotation from the labeled images to unlabeled images. The presented algorithm has been integrated into a practical image retrieval system. Experiments on a collection of 10,000 general-purpose images demonstrate the effectiveness of the proposed framework.
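
    The memory-accumulation idea can be sketched compactly: images a user marks relevant within one feedback session are treated as semantically linked, the link counts are memorized, and queries rank images by a blend of low-level and learned similarity. This is an illustrative reconstruction of the idea, not the paper's exact model; the blend weight alpha is a hypothetical parameter:

    ```python
    import numpy as np

    class MemoryModel:
        """Accumulate relevance feedback as pairwise semantic links."""

        def __init__(self, n_images):
            self.links = np.zeros((n_images, n_images))

        def record_session(self, relevant_ids):
            # Images judged relevant to the same query are assumed related.
            for i in relevant_ids:
                for j in relevant_ids:
                    if i != j:
                        self.links[i, j] += 1.0

        def semantic_similarity(self):
            m = self.links.max()
            return self.links / m if m > 0 else self.links

    def rank(query_id, feature_sim, memory, alpha=0.5):
        """Blend low-level feature similarity with memorized semantics."""
        score = (alpha * feature_sim[query_id]
                 + (1 - alpha) * memory.semantic_similarity()[query_id])
        return np.argsort(-score)

    memory = MemoryModel(4)
    memory.record_session([0, 2])        # user marked images 0 and 2 relevant
    feature_sim = np.eye(4)              # stand-in low-level similarities
    print(rank(0, feature_sim, memory))  # image 2 now ranks just behind image 0
    ```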

  20. Simultaneous perception of a spoken and a signed language: The brain basis of ASL-English code-blends

    PubMed Central

    Weisberg, Jill; McCullough, Stephen; Emmorey, Karen

    2018-01-01

    Code-blends (simultaneous words and signs) are a unique characteristic of bimodal bilingual communication. Using fMRI, we investigated code-blend comprehension in hearing native ASL-English bilinguals who made a semantic decision (edible?) about signs, audiovisual words, and semantically equivalent code-blends. English and ASL recruited a similar fronto-temporal network with expected modality differences: stronger activation for English in auditory regions of bilateral superior temporal cortex, and stronger activation for ASL in bilateral occipitotemporal visual regions and left parietal cortex. Code-blend comprehension elicited activity in a combination of these regions, and no cognitive control regions were additionally recruited. Furthermore, code-blends elicited reduced activation relative to ASL presented alone in bilateral prefrontal and visual extrastriate cortices, and relative to English alone in auditory association cortex. Consistent with behavioral facilitation observed during semantic decisions, the findings suggest that redundant semantic content induces more efficient neural processing in language and sensory regions during bimodal language integration. PMID:26177161

  1. Real-time image annotation by manifold-based biased Fisher discriminant analysis

    NASA Astrophysics Data System (ADS)

    Ji, Rongrong; Yao, Hongxun; Wang, Jicheng; Sun, Xiaoshuai; Liu, Xianming

    2008-01-01

    Automatic Linguistic Annotation is a promising solution to bridge the semantic gap in content-based image retrieval. However, two crucial issues are not well addressed in state-of-the-art annotation algorithms: 1. the Small Sample Size (3S) problem in keyword classifier/model learning; 2. most annotation algorithms cannot be extended to real-time online usage due to their low computational efficiency. This paper presents a novel Manifold-based Biased Fisher Discriminant Analysis (MBFDA) algorithm to address these two issues by transductive semantic learning and keyword filtering. To address the 3S problem, co-training based manifold learning is adopted for keyword model construction. To achieve real-time annotation, a Biased Fisher Discriminant Analysis (BFDA) based semantic feature reduction algorithm is presented for keyword confidence discrimination and semantic feature reduction. Different from all existing annotation methods, MBFDA views image annotation from a novel Eigen semantic feature (which corresponds to keywords) selection aspect. As demonstrated in experiments, our manifold-based biased Fisher discriminant analysis annotation algorithm outperforms classical and state-of-the-art annotation methods (1. K-NN expansion; 2. one-to-all SVM; 3. PWC-SVM) in both computational time and annotation accuracy by a large margin.

  2. An agent-based peer-to-peer architecture for semantic discovery of manufacturing services across virtual enterprises

    NASA Astrophysics Data System (ADS)

    Zhang, Wenyu; Zhang, Shuai; Cai, Ming; Jian, Wu

    2015-04-01

    With the development of the virtual enterprise (VE) paradigm, the usage of service-oriented architecture (SOA) is increasingly being considered for facilitating the integration and utilisation of distributed manufacturing resources. However, due to the heterogeneous nature among VEs, the dynamic nature of a VE and the autonomous nature of each VE member, the lack of both a sophisticated coordination mechanism in the popular centralised infrastructure and semantic expressivity in the existing SOA standards makes the current centralised, syntactic service discovery method undesirable. This motivates the proposed agent-based peer-to-peer (P2P) architecture for semantic discovery of manufacturing services across VEs. Multi-agent technology provides autonomous and flexible problem-solving capabilities in dynamic and adaptive VE environments. A peer-to-peer overlay provides highly scalable coupling across decentralised VEs, each of which acts as a peer composed of multiple agents dealing with manufacturing services. The proposed architecture utilises a novel, efficient, two-stage search strategy - semantic peer discovery and semantic service discovery - to handle complex searches of manufacturing services across VEs through fast peer filtering. The operation and experimental evaluation of the prototype system are presented to validate the implementation of the proposed approach.

  3. Semantic Likelihood Models for Bayesian Inference in Human-Robot Interaction

    NASA Astrophysics Data System (ADS)

    Sweet, Nicholas

    Autonomous systems, particularly unmanned aerial systems (UAS), remain limited in autonomous capabilities largely due to a poor understanding of their environment. Current sensors simply do not match human perceptive capabilities, impeding progress towards full autonomy. Recent work has shown the value of humans as sources of information within a human-robot team; in target applications, communicating human-generated 'soft data' to autonomous systems enables higher levels of autonomy through large, efficient information gains. This requires development of a 'human sensor model' that allows soft data fusion through Bayesian inference to update the probabilistic belief representations maintained by autonomous systems. Current human sensor models that capture linguistic inputs as semantic information are limited in their ability to generalize likelihood functions for semantic statements: they may be learned from dense data; they do not exploit the contextual information embedded within groundings; and they often limit human input to restrictive and simplistic interfaces. This work provides mechanisms to synthesize human sensor models from constraints based on easily attainable a priori knowledge, develops compression techniques to capture information-dense semantics, and investigates the problem of capturing and fusing semantic information contained within unstructured natural language. A robotic experimental testbed is also developed to validate the above contributions.

  4. Age-related differences in the neural bases of phonological and semantic processes

    PubMed Central

    Diaz, Michele T.; Johnson, Micah A.; Burke, Deborah M.; Madden, David J.

    2014-01-01

    Changes in language functions during normal aging are greater for phonological compared to semantic processes. To investigate the behavioral and neural basis for these age-related differences, we used functional magnetic resonance imaging (fMRI) to examine younger and older adults who made semantic and phonological decisions about pictures. The behavioral performance of older adults was less accurate and less efficient than younger adults’ in the phonological task, but did not differ in the semantic task. In the fMRI analyses, the semantic task activated left-hemisphere language regions, while the phonological task activated bilateral cingulate and ventral precuneus. Age-related effects were widespread throughout the brain, and most often expressed as greater activation for older adults. Activation was greater for younger compared to older adults in ventral brain regions involved in visual and object processing. Although there was not a significant Age x Condition interaction in the whole-brain fMRI results, correlations examining the relationship between behavior and fMRI activation were stronger for younger compared to older adults. Our results suggest that the relationship between behavior and neural activation declines with age and this may underlie some of the observed declines in performance. PMID:24893737

  5. Parafoveal processing in silent and oral reading: Reading mode influences the relative weighting of phonological and semantic information in Chinese.

    PubMed

    Pan, Jinger; Laubrock, Jochen; Yan, Ming

    2016-08-01

    We examined how reading mode (i.e., silent vs. oral reading) influences parafoveal semantic and phonological processing during the reading of Chinese sentences, using the gaze-contingent boundary paradigm. In silent reading, we found in 2 experiments that reading times on target words were shortened with semantic previews in early and late processing, whereas phonological preview effects mainly occurred in gaze duration or second-pass reading. In contrast, results showed that phonological preview information is obtained early on in oral reading. Strikingly, in oral reading, we observed a semantic preview cost on the target word in Experiment 1 and a decrease in the effect size of preview benefit from first- to second-pass measures in Experiment 2, which we hypothesize to result from increased preview duration. Taken together, our results indicate that parafoveal semantic information can be obtained irrespective of reading mode, whereas readers more efficiently process parafoveal phonological information in oral reading. We discuss implications for notions of information processing priority and saccade generation during silent and oral reading. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  6. Actively learning human gaze shifting paths for semantics-aware photo cropping.

    PubMed

    Zhang, Luming; Gao, Yue; Ji, Rongrong; Xia, Yingjie; Dai, Qionghai; Li, Xuelong

    2014-05-01

    Photo cropping is a widely used tool in the printing industry, photography, and cinematography. Conventional cropping models suffer from three challenges. First, they de-emphasize semantic content, which is many times more important than low-level features in photo aesthetics. Second, existing models lack a sequential ordering, whereas humans look at semantically important regions sequentially when viewing a photo. Third, it is difficult to leverage inputs from multiple users, although experience from multiple users is particularly critical in cropping because photo assessment is quite a subjective task. To address these challenges, this paper proposes semantics-aware photo cropping, which crops a photo by simulating the process of humans sequentially perceiving semantically important regions of a photo. We first project the local features (graphlets in this paper) onto the semantic space, which is constructed based on the category information of the training photos. An efficient learning algorithm is then derived to sequentially select semantically representative graphlets of a photo, and the selection process can be interpreted as a path, which simulates humans actively perceiving semantics in a photo. Furthermore, we learn a prior distribution of such active graphlet paths from training photos that are marked as aesthetically pleasing by multiple users. The learned priors enforce the corresponding active graphlet path of a test photo to be maximally similar to those from the training photos. Experimental results show that: 1) the active graphlet path accurately predicts human gaze shifting, and thus is more indicative of photo aesthetics than conventional saliency maps; and 2) the cropped photos produced by our approach outperform those of its competitors in both qualitative and quantitative comparisons.

  7. Right inferior frontal gyrus activation is associated with memory improvement in patients with left frontal low-grade glioma resection.

    PubMed

    Miotto, Eliane C; Balardin, Joana B; Vieira, Gilson; Sato, Joao R; Martin, Maria da Graça M; Scaff, Milberto; Teixeira, Manoel J; Junior, Edson Amaro

    2014-01-01

    Patients with low-grade glioma (LGG) have been studied as a model of functional brain reorganization due to their slow-growing nature. However, there is no information regarding which brain areas are involved during verbal memory encoding after extensive left frontal LGG resection. In addition, it remains unknown whether these patients can improve their memory performance after instructions to apply efficient strategies. The neural correlates of verbal memory encoding were investigated in patients who had undergone extensive left frontal lobe (LFL) LGG resections and healthy controls using fMRI both before and after directed instructions were given for semantic organizational strategies. Participants were scanned during the encoding of word lists under three different conditions before and after a brief period of practice. The conditions included semantically unrelated (UR), related-non-structured (RNS), and related-structured words (RS), allowing for different levels of semantic organization. All participants improved on memory recall and semantic strategy application after the instructions for the RNS condition. Healthy subjects showed increased activation in the left inferior frontal gyrus (IFG) and middle frontal gyrus (MFG) during encoding for the RNS condition after the instructions. Patients with LFL excisions demonstrated increased activation in the right IFG for the RNS condition after instructions were given for the semantic strategies. Despite extensive damage in relevant areas that support verbal memory encoding and semantic strategy applications, patients that had undergone resections for LFL tumor could recruit the right-sided contralateral homologous areas after instructions were given and semantic strategies were practiced. These results provide insights into changes in brain activation areas typically implicated in verbal memory encoding and semantic processing.

  8. Top tagging: a method for identifying boosted hadronically decaying top quarks.

    PubMed

    Kaplan, David E; Rehermann, Keith; Schwartz, Matthew D; Tweedie, Brock

    2008-10-03

    A method is introduced for distinguishing top jets (boosted, hadronically decaying top quarks) from light-quark and gluon jets using jet substructure. The procedure involves parsing the jet cluster to resolve its subjets and then imposing kinematic constraints. With this method, light-quark or gluon jets with p_T ≈ 1 TeV can be rejected with an efficiency of around 99% while retaining up to 40% of top jets. This reduces the dijet background to heavy tt̄ resonances by a factor of approximately 10,000, thereby allowing resonance searches in tt̄ to be extended into the all-hadronic channel. In addition, top tagging can be used in tt̄ events when one of the top quarks decays semileptonically, in events with missing energy, and in studies of b-tagging efficiency at high p_T.

  9. Ground Operations Aerospace Language (GOAL) textbook

    NASA Technical Reports Server (NTRS)

    Dickison, L. R.

    1973-01-01

    The textbook provides a semantical explanation accompanying a complete set of GOAL syntax diagrams, system concepts, language component interaction, and general language concepts necessary for efficient language implementation/execution.

  10. A general natural-language text processor for clinical radiology.

    PubMed Central

    Friedman, C; Alderson, P O; Austin, J H; Cimino, J J; Johnson, S B

    1994-01-01

    OBJECTIVE: Development of a general natural-language processor that identifies clinical information in narrative reports and maps that information into a structured representation containing clinical terms. DESIGN: The natural-language processor provides three phases of processing, all of which are driven by different knowledge sources. The first phase performs the parsing. It identifies the structure of the text through use of a grammar that defines semantic patterns and a target form. The second phase, regularization, standardizes the terms in the initial target structure via a compositional mapping of multi-word phrases. The third phase, encoding, maps the terms to a controlled vocabulary. Radiology is the test domain for the processor and the target structure is a formal model for representing clinical information in that domain. MEASUREMENTS: The impression sections of 230 radiology reports were encoded by the processor. Results of an automated query of the resultant database for the occurrences of four diseases were compared with the analysis of a panel of three physicians to determine recall and precision. RESULTS: Without training specific to the four diseases, recall and precision of the system (combined effect of the processor and query generator) were 70% and 87%. Training of the query component increased recall to 85% without changing precision. PMID:7719797
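
    The three knowledge-driven phases can be caricatured in miniature: a semantic pattern parses the sentence into a target form, a regularization table canonicalizes multi-word phrases, and an encoding table maps terms to a controlled vocabulary. Every pattern, table entry, and code below is a toy stand-in, not one of the processor's actual knowledge sources:

    ```python
    import re

    # Phase 1 knowledge: a single semantic pattern defining the target form.
    PATTERN = re.compile(r"(?P<certainty>no|possible)?\s*(?P<finding>[\w ]+?)\s+in\s+(?P<site>[\w ]+)")
    # Phase 2 knowledge: regularization of multi-word phrases.
    REGULARIZE = {"pleural effusions": "pleural effusion"}
    # Phase 3 knowledge: mapping to a controlled vocabulary (codes are illustrative).
    ENCODE = {"pleural effusion": "D010996", "left lung": "T-28300"}

    def process(sentence):
        m = PATTERN.search(sentence.lower())                  # 1. parse
        if not m:
            return None
        form = {k: (v or "").strip() for k, v in m.groupdict().items()}
        form["finding"] = REGULARIZE.get(form["finding"], form["finding"])  # 2. regularize
        return {k: ENCODE.get(v, v) for k, v in form.items()}               # 3. encode

    print(process("No pleural effusions in left lung"))
    # {'certainty': 'no', 'finding': 'D010996', 'site': 'T-28300'}
    ```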

  11. Semantic interoperability--HL7 Version 3 compared to advanced architecture standards.

    PubMed

    Blobel, B G M E; Engel, K; Pharow, P

    2006-01-01

    To meet the challenge for high quality and efficient care, highly specialized and distributed healthcare establishments have to communicate and co-operate in a semantically interoperable way. Information and communication technology must be open, flexible, scalable, knowledge-based and service-oriented as well as secure and safe. For enabling semantic interoperability, a unified process for defining and implementing the architecture, i.e. structure and functions of the cooperating systems' components, as well as the approach for knowledge representation, i.e. the used information and its interpretation, algorithms, etc. have to be defined in a harmonized way. Deploying the Generic Component Model, systems and their components, underlying concepts and applied constraints must be formally modeled, strictly separating platform-independent from platform-specific models. As HL7 Version 3 claims to represent the most successful standard for semantic interoperability, HL7 has been analyzed regarding the requirements for model-driven, service-oriented design of semantic interoperable information systems, thereby moving from a communication to an architecture paradigm. The approach is compared with advanced architectural approaches for information systems such as OMG's CORBA 3 or EHR systems such as GEHR/openEHR and CEN EN 13606 Electronic Health Record Communication. HL7 Version 3 is maturing towards an architectural approach for semantic interoperability. Despite current differences, there is a close collaboration between the teams involved guaranteeing a convergence between competing approaches.

  12. Mining Quality Phrases from Massive Text Corpora

    PubMed Central

    Liu, Jialu; Shang, Jingbo; Wang, Chi; Ren, Xiang; Han, Jiawei

    2015-01-01

    Text data are ubiquitous and play an essential role in big data applications. However, text data are mostly unstructured. Transforming unstructured text into structured units (e.g., semantically meaningful phrases) will substantially reduce semantic ambiguity and enhance the power and efficiency at manipulating such data using database technology. Thus mining quality phrases is a critical research problem in the field of databases. In this paper, we propose a new framework that extracts quality phrases from text corpora integrated with phrasal segmentation. The framework requires only limited training but the quality of phrases so generated is close to human judgment. Moreover, the method is scalable: both computation time and required space grow linearly as corpus size increases. Our experiments on large text corpora demonstrate the quality and efficiency of the new method. PMID:26705375
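
    As a flavor of the corpus statistics involved, the sketch below scores two-word candidates by pointwise mutual information, a crude stand-in for the paper's combination of phrase-quality features and phrasal segmentation: phrases whose words co-occur far more often than chance score highly.

    ```python
    import math
    from collections import Counter

    def score_bigram_phrases(tokens, min_count=2):
        """Rank candidate two-word phrases by pointwise mutual information."""
        unigrams = Counter(tokens)
        bigrams = Counter(zip(tokens, tokens[1:]))
        n = len(tokens)
        scores = {}
        for (w1, w2), c in bigrams.items():
            if c >= min_count:  # frequency floor: quality phrases must be frequent
                pmi = math.log((c / n) / ((unigrams[w1] / n) * (unigrams[w2] / n)))
                scores[w1 + " " + w2] = pmi
        return sorted(scores.items(), key=lambda kv: -kv[1])

    text = "support vector machine learning uses a support vector machine model".split()
    print(score_bigram_phrases(text))
    # [('support vector', 1.609...), ('vector machine', 1.609...)]
    ```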

  13. Design and implementation of a health data interoperability mediator.

    PubMed

    Kuo, Mu-Hsing; Kushniruk, Andre William; Borycki, Elizabeth Marie

    2010-01-01

    The objective of this study is to design and implement a common-gateway oriented mediator to solve the health data interoperability problems that exist among heterogeneous health information systems. The proposed mediator has three main components: (1) a Synonym Dictionary (SD) that stores a set of global metadata and terminologies to serve as the mapping intermediary, (2) a Semantic Mapping Engine (SME) that can be used to map metadata and instance semantics, and (3) a DB-to-XML module that translates source health data stored in a database into XML format and back. A routine admission notification data exchange scenario is used to test the efficiency and feasibility of the proposed mediator. The study results show that the proposed mediator can make health information exchange more efficient.
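
    The mediator's division of labor is easy to mock up: the Synonym Dictionary supplies global terms, the mapping engine renames local fields, and the DB-to-XML module serializes the mapped record for exchange. The field names and mappings below are illustrative assumptions, not the study's actual dictionary:

    ```python
    import xml.etree.ElementTree as ET

    # Toy synonym dictionary: local field names from two systems mapped
    # to shared global metadata terms.
    SYNONYM_DICT = {
        "pat_name": "patient_name", "pname": "patient_name",
        "dob": "birth_date", "birthdate": "birth_date",
    }

    def map_semantics(record):
        """Semantic mapping engine: rename local fields to global terms."""
        return {SYNONYM_DICT.get(k, k): v for k, v in record.items()}

    def to_xml(record, root_tag="AdmissionNotification"):
        """DB-to-XML module: serialize a mapped database row for exchange."""
        root = ET.Element(root_tag)
        for field, value in map_semantics(record).items():
            ET.SubElement(root, field).text = str(value)
        return ET.tostring(root, encoding="unicode")

    # Rows from two heterogeneous systems map to the same XML terms.
    print(to_xml({"pname": "Doe, John", "dob": "1970-01-01"}))
    print(to_xml({"pat_name": "Doe, John", "birthdate": "1970-01-01"}))
    ```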

  14. Ground Operations Aerospace Language (GOAL). Volume 2: Compiler

    NASA Technical Reports Server (NTRS)

    1973-01-01

    The principal elements and functions of the Ground Operations Aerospace Language (GOAL) compiler are presented. The technique used to transcribe the syntax diagrams into machine processable format for use by the parsing routines is described. An explanation of the parsing technique used to process GOAL source statements is included. The compiler diagnostics and the output reports generated during a GOAL compilation are explained. A description of the GOAL program package is provided.

  15. Imitation as behaviour parsing.

    PubMed Central

    Byrne, R W

    2003-01-01

    Non-human great apes appear to be able to acquire elaborate skills partly by imitation, raising the possibility of the transfer of skill by imitation in animals that have only rudimentary mentalizing capacities: in contrast to the frequent assumption that imitation depends on prior understanding of others' intentions. Attempts to understand the apes' behaviour have led to the development of a purely mechanistic model of imitation, the 'behaviour parsing' model, in which the statistical regularities that are inevitable in planned behaviour are used to decipher the organization of another agent's behaviour, and thence to imitate parts of it. Behaviour can thereby be understood statistically in terms of its correlations (circumstances of use, effects on the environment) without understanding of intentions or the everyday physics of cause-and-effect. Thus, imitation of complex, novel behaviour may not require mentalizing, but conversely behaviour parsing may be a necessary preliminary to attributing intention and cause. PMID:12689378

  16. Perceived visual speed constrained by image segmentation

    NASA Technical Reports Server (NTRS)

    Verghese, P.; Stone, L. S.

    1996-01-01

    Little is known about how or where the visual system parses the visual scene into objects or surfaces. However, it is generally assumed that the segmentation and grouping of pieces of the image into discrete entities is due to 'later' processing stages, after the 'early' processing of the visual image by local mechanisms selective for attributes such as colour, orientation, depth, and motion. Speed perception is also thought to be mediated by early mechanisms tuned for speed. Here we show that manipulating the way in which an image is parsed changes the way in which local speed information is processed. Manipulations that cause multiple stimuli to appear as parts of a single patch degrade speed discrimination, whereas manipulations that perceptually divide a single large stimulus into parts improve discrimination. These results indicate that processes as early as speed perception may be constrained by the parsing of the visual image into discrete entities.

  17. Ubiquitous Computing Services Discovery and Execution Using a Novel Intelligent Web Services Algorithm

    PubMed Central

    Choi, Okkyung; Han, SangYong

    2007-01-01

    Ubiquitous Computing makes it possible to determine in real time the location and situation of service requesters in a web service environment, as it enables access to computers at any time and in any place. Though research on various aspects of ubiquitous commerce is progressing at enterprises and research centers, both domestically and overseas, analysis of a customer's personal preferences based on the semantic web and rule-based services using semantics is not currently being conducted. This paper proposes a Ubiquitous Computing Services System that enables rule-based as well as semantics-based search, reflecting the fact that the electronic space and the physical space can be combined into one and making possible the real-time search for web services and the construction of efficient web services.

  18. Building the Knowledge Base to Support the Automatic Animation Generation of Chinese Traditional Architecture

    NASA Astrophysics Data System (ADS)

    Wei, Gongjin; Bai, Weijing; Yin, Meifang; Zhang, Songmao

    We present a practice of applying the Semantic Web technologies in the domain of Chinese traditional architecture. A knowledge base consisting of one ontology and four rule bases is built to support the automatic generation of animations that demonstrate the construction of various Chinese timber structures based on the user's input. Different Semantic Web formalisms are used, e.g., OWL DL, SWRL and Jess, to capture the domain knowledge, including the wooden components needed for a given building, construction sequence, and the 3D size and position of every piece of wood. Our experience in exploiting the current Semantic Web technologies in real-world application systems indicates their prominent advantages (such as the reasoning facilities and modeling tools) as well as the limitations (such as low efficiency).

  19. Quantifying Narrative Ability in Autism Spectrum Disorder: A Computational Linguistic Analysis of Narrative Coherence

    PubMed Central

    Losh, Molly; Gordon, Peter C.

    2014-01-01

    Autism Spectrum Disorder (ASD) is characterized by difficulties with social communication and functioning, and ritualistic/repetitive behaviors (American Psychiatric Association, 2013). While substantial heterogeneity exists in symptom expression, impairments in language discourse skills, including narrative, are universally observed (Tager-Flusberg, Paul, & Lord, 2005). This study applied a computational linguistic tool, Latent Semantic Analysis (LSA), to objectively characterize narrative performance in ASD across two narrative contexts differing in interpersonal and cognitive demands. Results indicated that individuals with ASD produced narratives comparable in semantic content to those from controls when narrating from a picture book, but produced narratives diminished in semantic quality in a more demanding narrative recall task. Results are discussed in terms of the utility of LSA as a quantitative, objective, and efficient measure of narrative ability. PMID:24915929
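
    LSA itself reduces a TF-IDF term-document matrix to a low-rank semantic space and compares texts by cosine similarity there. A minimal sketch with scikit-learn, using toy narratives rather than the study's data and an arbitrary choice of two dimensions:

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    narratives = [
        "the frog rode the boy 's bicycle down to the pond",
        "a boy took his bike to the pond together with a frog",
        "the stock market fell sharply on monday morning",
    ]

    tfidf = TfidfVectorizer().fit_transform(narratives)   # term-document matrix
    lsa = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
    print(cosine_similarity(lsa[:1], lsa[1:]))  # narrative 0 vs. 1 (high) and vs. 2 (low)
    ```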

  20. PyGlobal: A toolkit for automated compilation of DFT-based descriptors.

    PubMed

    Nath, Shilpa R; Kurup, Sudheer S; Joshi, Kaustubh A

    2016-06-15

    Density Functional Theory (DFT)-based global reactivity descriptor calculations have emerged as powerful tools for studying the reactivity, selectivity, and stability of chemical and biological systems. A Python-based module, PyGlobal, has been developed for systematically parsing a typical Gaussian outfile and extracting the relevant energies of the HOMO and LUMO. The corresponding global reactivity descriptors are then calculated and the data are saved into a spreadsheet compatible with applications like Microsoft Excel and LibreOffice. The efficiency of the module has been assessed by measuring the processing time for randomly selected Gaussian outfiles for 1000 molecules. © 2016 Wiley Periodicals, Inc.
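
    The pipeline PyGlobal automates is straightforward to sketch: scan the Gaussian output for the "Alpha occ./virt. eigenvalues" blocks, take the last occupied and first virtual energies as the HOMO and LUMO, and apply Koopmans-type descriptor formulas. The sketch below is a simplified reconstruction, not PyGlobal itself; it assumes a single population section per file, and the descriptor definitions follow one common convention (conventions vary across the literature):

    ```python
    import re

    NUM = re.compile(r"-?\d+\.\d+")

    def homo_lumo(path):
        """Extract HOMO/LUMO energies (hartree) from a Gaussian output file."""
        occ, virt = [], []
        with open(path) as f:
            for line in f:
                if "Alpha  occ. eigenvalues" in line:
                    occ += [float(x) for x in NUM.findall(line)]
                elif "Alpha virt. eigenvalues" in line:
                    virt += [float(x) for x in NUM.findall(line)]
        return occ[-1], virt[0]  # HOMO is the last occupied, LUMO the first virtual

    def descriptors(e_homo, e_lumo):
        """Koopmans-type global reactivity descriptors (one common convention)."""
        mu = (e_homo + e_lumo) / 2     # chemical potential
        eta = (e_lumo - e_homo) / 2    # chemical hardness
        return {
            "chemical_potential": mu,
            "hardness": eta,
            "softness": 1 / (2 * eta),
            "electrophilicity": mu ** 2 / (2 * eta),
            "electronegativity": -mu,
        }

    print(descriptors(-0.25898, -0.00925))  # illustrative frontier-orbital energies
    ```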

  1. PCC Framework for Program-Generators

    NASA Technical Reports Server (NTRS)

    Kong, Soonho; Choi, Wontae; Yi, Kwangkeun

    2009-01-01

    In this paper, we propose a proof-carrying code framework for program-generators. The enabling technique is abstract parsing, a static string analysis technique, which is used as a component for generating and validating certificates. Our framework provides an efficient solution for certifying program-generators whose safety properties are expressed in terms of the grammar representing the generated programs. The fixed-point solution of the analysis is generated and attached to the program-generator on the code producer's side. The consumer receives the code with the fixed-point solution and validates that the received fixed point is indeed a fixed point of the received code. This validation can be done in a single pass.
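
    The producer/consumer split rests on a classical asymmetry: computing a fixed point requires iteration, but checking a claimed fixed point takes one application of the transfer functions. A schematic illustration of the consumer-side check, with a toy transfer function standing in for the paper's abstract-parsing analysis:

    ```python
    def is_valid_certificate(transfer, claim):
        """Single-pass validation of a shipped fixed-point solution.

        The producer computes `claim` by iterating to a fixed point; the
        consumer only re-applies each transfer function once and checks
        containment, which is far cheaper than recomputing from scratch.
        """
        return all(transfer(node, claim) <= claim[node] for node in claim)

    # Toy string analysis over a two-node generator: node 'b' may emit
    # everything node 'a' emits plus one more string.
    transfer = lambda node, x: {"a": {"<p>"}, "b": x["a"] | {"<p></p>"}}[node]
    claim = {"a": {"<p>"}, "b": {"<p>", "<p></p>"}}
    print(is_valid_certificate(transfer, claim))  # True: the claim checks out
    ```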

  2. Using ontology-based semantic similarity to facilitate the article screening process for systematic reviews.

    PubMed

    Ji, Xiaonan; Ritter, Alan; Yen, Po-Yin

    2017-05-01

    Systematic Reviews (SRs) are utilized to summarize evidence from high quality studies and are considered the preferred source of evidence-based practice (EBP). However, conducting SRs can be time- and labor-intensive due to the high cost of article screening. In previous studies, we demonstrated utilizing established (lexical) article relationships to facilitate the identification of relevant articles in an efficient and effective manner. Here we propose to enhance article relationships with background semantic knowledge derived from Unified Medical Language System (UMLS) concepts and ontologies. We developed a pipelined semantic concepts representation process to represent articles from an SR into an optimized and enriched semantic space of UMLS concepts. Throughout the process, we leveraged concepts and concept relations encoded in biomedical ontologies (SNOMED-CT and MeSH) within the UMLS framework to prompt concept features of each article. Article relationships (similarities) were established and represented as a semantic article network, which was readily applied to assist with the article screening process. We incorporated the concept of active learning to simulate an interactive article recommendation process, and evaluated the performance on 15 completed SRs. We used work saved over sampling at 95% recall (WSS95) as the performance measure. We compared the WSS95 performance of our ontology-based semantic approach to existing lexical feature approaches and corpus-based semantic approaches, and found that our approach achieved better WSS95 in most SRs. It also had the highest average WSS95 (43.81%) and the highest total WSS95 (657.18%). We demonstrated using ontology-based semantics to facilitate the identification of relevant articles for SRs. Effective concepts and concept relations derived from UMLS ontologies can be utilized to establish article semantic relationships. Our approach provided promising performance and generalizes readily to any SR topic in the biomedical domain. Copyright © 2017 Elsevier Inc. All rights reserved.
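
    WSS95 has a simple operational reading: screen articles in the order the model recommends, stop once 95% of the relevant ones have been found, and report the fraction of articles skipped minus the 5% that random screening would leave unscreened at the same recall. A small sketch under that reading:

    ```python
    import math

    def wss_at_recall(ranked_relevance, recall=0.95):
        """Work saved over sampling at a target recall level.

        `ranked_relevance` holds the relevance labels (1/0) of articles
        in the order the model recommends screening them.
        """
        total = len(ranked_relevance)
        needed = math.ceil(recall * sum(ranked_relevance))
        found = 0
        for screened, rel in enumerate(ranked_relevance, start=1):
            found += rel
            if found >= needed:
                return (total - screened) / total - (1 - recall)
        return 0.0

    # 3 relevant articles among 10; all found after screening only 4.
    print(wss_at_recall([1, 1, 0, 1, 0, 0, 0, 0, 0, 0]))  # 0.55
    ```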

  3. A linked GeoData map for enabling information access

    USGS Publications Warehouse

    Powell, Logan J.; Varanka, Dalia E.

    2018-01-10

    Overview: The Geospatial Semantic Web (GSW) is an emerging technology that uses the Internet for more effective knowledge engineering and information extraction. Among the aims of the GSW are to structure the semantic specifications of data to reduce ambiguity and to link those data more efficiently. The data are stored as triples, the basic data unit in graph databases, which are similar to the vector data model of geographic information systems (GIS); that is, a node-edge-node model that forms a graph of semantically related information. The GSW is supported by emerging technologies such as linked geospatial data, described below, that enable it to store and manage geographical data that require new cartographic methods for visualization. This report describes a map that can interact with linked geospatial data using a simulation of a data query approach called the browsable graph to find information that is semantically related to a subject of interest, visualized using the Data Driven Documents (D3) library. Such a semantically enabled map functions as a map knowledge base (MKB) (Varanka and Usery, 2017). An MKB differs from a database in an important way. The central element of a triple, alternatively called the edge or property, is composed of a logic formalization that structures the relation between the first and third parts, the nodes or objects. Node-edge-node represents the graphic form of the triple, and the subject-property-object terms represent the data structure. Object classes connect to build a federated graph, similar to a network in visual form. Because the triple property is a logical statement (a predicate), the data graph represents logical propositions or assertions accepted to be true about the subject matter. These logical formalizations can be manipulated to calculate new triples, representing inferred logical assertions, from the existing data. To demonstrate an MKB system, a technical proof-of-concept is developed that uses geographically attributed Resource Description Framework (RDF) serializations of linked data for mapping. The proof-of-concept focuses on accessing triple data from visual elements of a geographic map as the interface to the MKB. The map interface is embedded with other essential functions such as SPARQL Protocol and RDF Query Language (SPARQL) data query endpoint services and the reasoning capabilities of Apache Marmotta (Apache Software Foundation, 2017). An RDF database of the Geographic Names Information System (GNIS), which contains official names of domestic features in the United States, was linked to a county data layer from The National Map of the U.S. Geological Survey. The county data are part of a broader Government Units theme offered to the public as Esri shapefiles. The shapefile used to draw the map itself was converted to a geographic-oriented JavaScript Object Notation (JSON) (GeoJSON) format and linked through various properties with a linked geodata version of the GNIS database called “GNIS–LD” (Butler and others, 2016; B. Regalia and others, University of California-Santa Barbara, written commun., 2017). The GNIS–LD files originated in Terse RDF Triple Language (Turtle) format but were converted to a JSON format specialized in linked data, “JSON–LD” (Beckett and Berners-Lee, 2011; Sporny and others, 2014). The GNIS–LD database is composed of roughly three predominant triple data graphs: Features, Names, and History. The graphs include a set of namespace prefixes used by each of the attributes.
    Predefining the prefixes made the conversion to the JSON–LD format simple to complete because Turtle and JSON–LD are variant specifications of the basic RDF concept. To convert a shapefile into GeoJSON format to capture the geospatial coordinate geometry objects, an online converter, Mapshaper, was used (Bloch, 2013). To convert the Turtle files, a custom converter written in Java reconstructs the files by parsing each grouping of attributes belonging to one subject and pasting the data into a new file that follows the syntax of JSON–LD. Additionally, the Features file contained its own set of geometries, which were exported into a separate JSON–LD file along with their elevation values to form a fourth file, named “features-geo.json.” Extracted data from external files can be represented in HyperText Markup Language (HTML) path objects. The goal was to import multiple JSON–LD files using this approach.
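
    Because Turtle and JSON–LD are both serializations of the same RDF graph, the conversion the authors implemented in Java can be illustrated in a few lines with rdflib. The prefix and IRIs below are illustrative, not the actual GNIS–LD vocabulary, and the example assumes rdflib 6 or later, where the JSON-LD serializer is built in:

    ```python
    from rdflib import Graph

    # A GNIS-style feature described in Turtle (made-up namespace and IRI).
    ttl = """
    @prefix ex: <http://example.org/gnis#> .
    ex:feature123 ex:officialName "Pikes Peak" ;
                  ex:elevationInMeters 4302 .
    """

    g = Graph()
    g.parse(data=ttl, format="turtle")    # read the Turtle triples
    print(g.serialize(format="json-ld"))  # re-serialize the same graph as JSON-LD
    ```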

  4. Digital patient: Personalized and translational data management through the MyHealthAvatar EU project.

    PubMed

    Kondylakis, Haridimos; Spanakis, Emmanouil G; Sfakianakis, Stelios; Sakkalis, Vangelis; Tsiknakis, Manolis; Marias, Kostas; Xia Zhao; Hong Qing Yu; Feng Dong

    2015-08-01

    The advancements in healthcare practice have brought to the fore the need for flexible access to health-related information and created an ever-growing demand for the design and the development of data management infrastructures for translational and personalized medicine. In this paper, we present the data management solution implemented for the MyHealthAvatar EU research project, a project that attempts to create a digital representation of a patient's health status. The platform is capable of aggregating several knowledge sources relevant for the provision of individualized personal services. To this end, state of the art technologies are exploited, such as ontologies to model all available information, semantic integration to enable data and query translation and a variety of linking services to allow connecting to external sources. All original information is stored in a NoSQL database for reasons of efficiency and fault tolerance. Then it is semantically uplifted through a semantic warehouse which enables efficient access to it. All different technologies are combined to create a novel web-based platform allowing seamless user interaction through APIs that support personalized, granular and secure access to the relevant information.

  5. Effects of Tasks on BOLD Signal Responses to Sentence Contrasts: Review and Commentary

    PubMed Central

    Caplan, David; Gow, David

    2010-01-01

    Functional neuroimaging studies of syntactic processing have been interpreted as identifying the neural locations of parsing and interpretive operations. However, current behavioral studies of sentence processing indicate that many operations occur simultaneously with parsing and interpretation. In this review, we point to issues that arise in discriminating the effects of these concurrent processes from those of the parser/interpreter in neural measures and to approaches that may help resolve them. PMID:20932562

  6. Generic Detection of Register Realignment

    NASA Astrophysics Data System (ADS)

    Ďurfina, Lukáš; Kolář, Dušan

    2011-09-01

    Register realignment is a method of binary obfuscation used by malware writers. The paper introduces a method by which register realignment can be recognized through analysis based on scattered context grammars. Such an analysis includes exploring the bytes affected by realignment, finding new valid values for them, building the scattered context grammar, and parsing the obfuscated code with this grammar. The created grammar has the LL property, so the obfuscated code can be parsed with an LL parser.

  8. Design, Implementation and Testing of a Common Data Model Supporting Autonomous Vehicle Compatibility and Interoperability

    DTIC Science & Technology

    2006-09-01

    is that it is universally applicable. That is, it can be used to parse an instance of any Chomsky Normal Form context-free grammar. This relative... Chomsky-Normal-Form grammar corresponding to the vehicle-specific data format, use of the Cocke-Younger-Kasami algorithm to generate a parse tree...05). The productions of a Chomsky Normal Form context-free grammar have three significant characteristics: • There are no useless symbols (i.e
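
    The universal applicability claimed in the snippet comes from the Cocke-Younger-Kasami algorithm itself, which recognizes any string against any Chomsky-Normal-Form grammar by dynamic programming over spans. A compact recognizer sketch, with a toy grammar in place of the vehicle-specific one:

    ```python
    def cyk_parse(tokens, grammar, start="S"):
        """CYK recognition for a Chomsky-Normal-Form grammar.

        `grammar` maps each nonterminal to productions that are either a
        1-tuple (a terminal) or a 2-tuple (two nonterminals).
        """
        n = len(tokens)
        # table[i][span] = set of nonterminals deriving tokens[i:i+span]
        table = [[set() for _ in range(n + 1)] for _ in range(n)]
        for i, tok in enumerate(tokens):                  # length-1 spans
            for lhs, rhss in grammar.items():
                if (tok,) in rhss:
                    table[i][1].add(lhs)
        for span in range(2, n + 1):                      # longer spans
            for i in range(n - span + 1):
                for split in range(1, span):
                    for lhs, rhss in grammar.items():
                        for rhs in rhss:
                            if (len(rhs) == 2 and rhs[0] in table[i][split]
                                    and rhs[1] in table[i + split][span - split]):
                                table[i][span].add(lhs)
        return start in table[0][n]

    # Toy CNF grammar for assignments like "id = id".
    grammar = {
        "S": {("V", "R")},
        "R": {("EQ", "V")},
        "V": {("id",)},
        "EQ": {("=",)},
    }
    print(cyk_parse("id = id".split(), grammar))  # True
    ```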

  9. A Database Design for the Brazilian Air Force Flying Unit Operational Control System.

    DTIC Science & Technology

    1984-12-14

    Company, 1980. 23. Pereira Filho, Jorge da Cunha. "Banco de Dados Hoje" ["Databases Today"], Dados e Ideias - Brazilian Magazine, 99: 55-63 (February 1979). 24. Rich... "QUAIS SAO AS CONDICOES DE EMPREGO OPERACIONAL DO MIRAGE 2120?" ("What are the operational employment conditions of the Mirage 2120?") LIKE "WHAT IS THE FORCESTATUS OF MIRAGE 2120?" % Parsed! % [Production added to system... 4 - QUAIS SAO AS CONDICOES DE EMPREGO OPERACIONAL DO MIRAGE 2120? % Parsed! % (S DEPLOC) = sbbr % (S-TIME) = 1000 % Endurance = 0200 % (SITCODE

  10. Incremental Learning of Context Free Grammars by Parsing-Based Rule Generation and Rule Set Search

    NASA Astrophysics Data System (ADS)

    Nakamura, Katsuhiko; Hoshina, Akemi

    This paper discusses recent improvements and extensions to the Synapse system for inductive inference of context-free grammars (CFGs) from sample strings. Synapse uses incremental learning, rule generation based on bottom-up parsing, and search over rule sets. The form of production rules in the previous system is extended from Revised Chomsky Normal Form, A→βγ, to Extended Chomsky Normal Form, which also includes A→B, where each of β and γ is either a terminal or a nonterminal symbol. From the result of bottom-up parsing, a rule generation mechanism synthesizes the minimum production rules required for parsing positive samples. Instead of the inductive CYK algorithm of the previous version of Synapse, the improved version uses a novel rule generation method, called ``bridging,'' which bridges the lacking part of the derivation tree for a positive string. The improved version also employs a novel search strategy, called serial search, in addition to the minimum rule set search. The synthesis of grammars by serial search is faster than by minimum set search in most cases. On the other hand, the size of the generated CFGs is generally larger than that found by minimum set search, and the system can find no appropriate grammar for some CFLs by serial search. The paper shows experimental results of incremental learning of several fundamental CFGs and compares the methods of rule generation and search strategies.

  11. 'Visual’ parsing can be taught quickly without visual experience during critical periods

    PubMed Central

    Reich, Lior; Amedi, Amir

    2015-01-01

    Cases of invasive sight restoration in congenitally blind adults demonstrated that acquiring visual abilities is extremely challenging, presumably because visual experience during critical periods is crucial for learning visual-unique concepts (e.g. size constancy). Visual rehabilitation can also be achieved using sensory substitution devices (SSDs), which convey visual information non-invasively through sounds. We tested whether one critical concept – visual parsing, which is highly impaired in sight-restored patients – can be learned using an SSD. To this end, congenitally blind adults participated in a unique, relatively short (~70 hours), SSD-‘vision’ training. Following this, participants successfully parsed 2D and 3D visual objects. Control individuals naïve to SSDs demonstrated that while some aspects of parsing with an SSD are intuitive, the blind participants' success could not be attributed to auditory processing alone. Furthermore, we had a unique opportunity to compare the SSD users' abilities to those reported for sight-restored patients who performed similar tasks visually, and who had months of eyesight. Intriguingly, the SSD users outperformed the patients on most criteria tested. These results suggest that with adequate training and technologies, key high-order visual features can be quickly acquired in adulthood, and that a lack of visual experience during critical periods can be somewhat compensated for. Practically, these findings highlight the potential of SSDs as standalone aids or combined with invasive restoration approaches. PMID:26482105

  12. RADIANCE: An automated, enterprise-wide solution for archiving and reporting CT radiation dose estimates.

    PubMed

    Cook, Tessa S; Zimmerman, Stefan L; Steingall, Scott R; Maidment, Andrew D A; Kim, Woojin; Boonn, William W

    2011-01-01

    There is growing interest in the ability to monitor, track, and report exposure to radiation from medical imaging. Historically, however, dose information has been stored on an image-based dose sheet, an arrangement that precludes widespread indexing. Although scanner manufacturers are beginning to include dose-related parameters in the Digital Imaging and Communications in Medicine (DICOM) headers of imaging studies, there remains a vast repository of retrospective computed tomographic (CT) data with image-based dose sheets. Consequently, it is difficult for imaging centers to monitor their dose estimates or participate in the American College of Radiology (ACR) Dose Index Registry. An automated extraction software pipeline known as Radiation Dose Intelligent Analytics for CT Examinations (RADIANCE) has been designed that quickly and accurately parses CT dose sheets to extract and archive dose-related parameters. Optical character recognition of the information in the dose sheet produces a text file, which, along with the DICOM study header, is parsed to extract dose-related data. The data are then stored in a relational database that can be queried for dose monitoring and report creation. RADIANCE allows efficient dose analysis of CT examinations and more effective education of technologists, radiologists, and referring physicians regarding patient exposure to radiation at CT. RADIANCE also allows compliance with the ACR's dose reporting guidelines and greater awareness of patient radiation dose, ultimately resulting in improved patient care and treatment.
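
    Once OCR has turned the dose-sheet image into text, the extraction step is ordinary pattern matching: find the per-series rows and pull out the dose indices for the relational database. The layout and field names below are illustrative (actual dose sheets vary by scanner vendor), and this is a sketch of the idea rather than RADIANCE's parser:

    ```python
    import re

    # Text as it might come back from OCR of a CT dose sheet.
    ocr_text = """
    Series  Type     Scan Range      CTDIvol(mGy)  DLP(mGy-cm)
    2       Helical  S10.0-I350.0    12.45         434.90
    3       Helical  S10.0-I350.0     8.10         282.75
    """

    ROW = re.compile(r"^\s*(\d+)\s+\w+\s+\S+\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$", re.M)

    def parse_dose_sheet(text):
        """Extract per-series CTDIvol and DLP values from OCR'd text."""
        return [{"series": int(s), "ctdi_vol_mgy": float(c), "dlp_mgy_cm": float(d)}
                for s, c, d in ROW.findall(text)]

    print(parse_dose_sheet(ocr_text))
    ```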

  13. Optimizing ROOT’s Performance Using C++ Modules

    NASA Astrophysics Data System (ADS)

    Vassilev, Vassil

    2017-10-01

    ROOT comes with a C++ compliant interpreter, cling. Cling needs to understand the content of libraries in order to interact with them. Exposing the full shared library descriptors to the interpreter at runtime translates into an increased memory footprint. ROOT's exploratory programming concepts allow implicit and explicit runtime shared library loading, which requires the interpreter to load the library descriptor. Re-parsing the descriptors' content has a noticeable effect on runtime performance. The present state-of-the-art lazy parsing technique brings runtime performance to reasonable levels but proves to be fragile and can introduce correctness issues. An elegant solution is to load information from the descriptor lazily and in a non-recursive way. The LLVM community is advancing its C++ Modules technology, providing an I/O-efficient, on-disk representation capable of reducing build times and peak memory usage. The feature is standardized as a C++ technical specification. C++ Modules are a flexible concept which can be employed to match CMS and other experiments' requirements for ROOT: to optimize both runtime memory usage and performance. Cling technically “inherits” the feature; however, tweaking it to ROOT scale and beyond is a complex endeavor. The paper discusses the status of C++ Modules in the context of ROOT, supported by a few preliminary performance results. It shows a step-by-step migration plan and describes potential challenges which could appear.

  14. C3: The Compositional Construction of Content: A New, More Effective and Efficient Way to Marshal Inferences from Background Knowledge that will Enable More Natural and Effective Communication with Autonomous Systems

    DTIC Science & Technology

    2014-01-06

    ... products derived from this funding. This includes two proposed activities for Summer 2014: • Deep Semantic Annotation with Shallow Methods; James ... • ... process that we need to ensure that words are unambiguous before we read them (present in just the semantic field that is presently active). • Publication ... Technical Report). MIT Artificial Intelligence Laboratory. Allen, J., Manshadi, M., Dzikovska, M., & Swift, M. (2007). Deep linguistic processing for ...

  15. tESA: a distributional measure for calculating semantic relatedness.

    PubMed

    Rybinski, Maciej; Aldana-Montes, José Francisco

    2016-12-28

    Semantic relatedness is a measure that quantifies the strength of a semantic link between two concepts. Often, it can be efficiently approximated with methods that operate on words, which represent these concepts. Approximating semantic relatedness between texts and concepts represented by these texts is an important part of many text and knowledge processing tasks of crucial importance in the ever-growing domain of biomedical informatics. The problem with most state-of-the-art methods for calculating semantic relatedness is their dependence on highly specialized, structured knowledge resources, which makes these methods poorly adaptable for many usage scenarios. On the other hand, the domain knowledge in the Life Sciences has become more and more accessible, but mostly in its unstructured form - as texts in large document collections - which makes its use more challenging for automated processing. In this paper we present tESA, an extension to the well-known Explicit Semantic Analysis (ESA) method. In our extension we use two separate sets of vectors, corresponding to different sections of the articles from the underlying corpus of documents, as opposed to the original method, which only uses a single vector space. We present an evaluation of the Life Sciences domain-focused applicability of both tESA and domain-adapted Explicit Semantic Analysis. The methods are tested against a set of standard benchmarks established for the evaluation of biomedical semantic relatedness quality. Our experiments show that the proposed method achieves results comparable with or superior to the current state-of-the-art methods. Additionally, a comparative discussion of the results obtained with tESA and ESA is presented, together with a study of the adaptability of the methods to different corpora and their performance with different input parameters. Our findings suggest that combined use of the semantics from different sections (i.e. extending the original ESA methodology with the use of title vectors) of the documents of scientific corpora may be used to enhance the performance of distributional semantic relatedness measures, which can be observed in the largest reference datasets. We also present the impact of the proposed extension on the size of distributional representations.
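
    A minimal sketch of the idea behind tESA follows: a term's ESA representation is its vector of TF-IDF weights across the corpus documents, and tESA adds a second vector space built from a different article section. The toy corpus and the averaging of the two cosine scores are assumptions for illustration; the paper's actual corpus and combination rule are not reproduced here.

    ```python
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Toy stand-in corpus; in tESA the two spaces come from different article
    # sections (e.g. titles vs. abstracts) of a large scientific corpus.
    titles = ["myocardial infarction", "gene expression profiling",
              "protein folding dynamics"]
    abstracts = [
        "risk factors for heart attack and myocardial damage",
        "measuring mrna levels and gene activity across tissues",
        "how amino acid chains fold into protein structures",
    ]

    def esa_space(docs):
        vec = TfidfVectorizer()
        mat = vec.fit_transform(docs)        # documents x vocabulary
        return vec, mat

    def esa_vector(term, vec, mat):
        """ESA vector: the term's TF-IDF weight in every corpus document."""
        idx = vec.vocabulary_.get(term)
        if idx is None:
            return np.zeros((1, mat.shape[0]))
        return mat[:, idx].T.toarray()       # 1 x documents

    def tesa_relatedness(t1, t2):
        score = 0.0
        for docs in (titles, abstracts):     # one vector space per section
            vec, mat = esa_space(docs)
            score += cosine_similarity(esa_vector(t1, vec, mat),
                                       esa_vector(t2, vec, mat))[0, 0]
        return score / 2                     # averaging is an assumed rule

    print(tesa_relatedness("gene", "mrna"))
    ```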

  16. Design and Applications of a GeoSemantic Framework for Integration of Data and Model Resources in Hydrologic Systems

    NASA Astrophysics Data System (ADS)

    Elag, M.; Kumar, P.

    2016-12-01

    Hydrologists today have to integrate resources such as data and models, which originate and reside in multiple autonomous and heterogeneous repositories over the Web. Several resource management systems have emerged within geoscience communities for sharing long-tail data, which are collected by individual or small research groups, and long-tail models, which are developed by scientists or small modeling communities. While these systems have increased the availability of resources within geoscience domains, deficiencies remain due to the heterogeneity in the methods used to describe, encode, and publish information about resources over the Web. This heterogeneity limits our ability to access the right information in the right context so that it can be efficiently retrieved and understood without the hydrologist's mediation. A primary challenge of the Web today is the lack of semantic interoperability among the massive number of resources, which already exist and are continually being generated at rapid rates. To address this challenge, we have developed a decentralized GeoSemantic (GS) framework, which provides three sets of micro-web services to support (i) semantic annotation of resources, (ii) semantic alignment between the metadata of two resources, and (iii) semantic mediation among Standard Names. Here we present the design of the framework and demonstrate its application to semantic integration between data and models used in the IML-CZO. First, we show how the IML-CZO data are annotated using the Semantic Annotation Services. Then we illustrate how the Resource Alignment Services and Knowledge Integration Services are used to create a semantic workflow between the TopoFlow model, a spatially distributed hydrologic model, and the annotated data. Results of this work are (i) a demonstration of how the GS framework advances the integration of heterogeneous data and models of water-related disciplines by seamlessly handling their semantic heterogeneity, (ii) the introduction of a new paradigm for reusing existing and new standards, as well as tools and models, without reimplementing them in the cyberinfrastructures of water-related disciplines, and (iii) an investigation of a methodology by which distributed models can be coupled in a workflow using the GS services.
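
    Since the GS framework exposes its three service sets as micro-web services, a client interaction can be sketched as simple HTTP calls. Everything below is hypothetical: the host, endpoint paths, and payload shapes are invented for illustration and are not documented in the abstract.

    ```python
    import requests

    # Hypothetical service host; the real GS endpoints are not given here.
    GS_BASE = "http://gs-framework.example.org"

    def annotate_resource(metadata):
        """Ask the (hypothetical) Semantic Annotation Service to tag a
        dataset's variables with standard names."""
        r = requests.post(f"{GS_BASE}/annotate", json=metadata, timeout=30)
        r.raise_for_status()
        return r.json()

    def align_resources(meta_a, meta_b):
        """Ask the (hypothetical) Resource Alignment Service for a mapping
        between two resources, e.g. a dataset and the TopoFlow model inputs."""
        r = requests.post(f"{GS_BASE}/align",
                          json={"source": meta_a, "target": meta_b},
                          timeout=30)
        r.raise_for_status()
        return r.json()
    ```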

  17. High-content image informatics of the structural nuclear protein NuMA parses trajectories for stem/progenitor cell lineages and oncogenic transformation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vega, Sebastián L.; Liu, Er; Arvind, Varun

    Stem and progenitor cells that exhibit significant regenerative potential and critical roles in cancer initiation and progression remain difficult to characterize. Cell fates are determined by reciprocal signaling between the cell microenvironment and the nucleus; hence parameters derived from nuclear remodeling are ideal candidates for stem/progenitor cell characterization. Here we applied high-content, single cell analysis of nuclear shape and organization to examine stem and progenitor cells destined to distinct differentiation endpoints, yet undistinguishable by conventional methods. Nuclear descriptors defined through image informatics classified mesenchymal stem cells poised to either adipogenic or osteogenic differentiation, and oligodendrocyte precursors isolated from different regions of the brain and destined to distinct astrocyte subtypes. Nuclear descriptors also revealed early changes in stem cells after chemical oncogenesis, allowing the identification of a class of cancer-mitigating biomaterials. To capture the metrology of nuclear changes, we developed a simple and quantitative “imaging-derived” parsing index, which reflects the dynamic evolution of the high-dimensional space of nuclear organizational features. A comparative analysis of parsing outcomes via either nuclear shape or textural metrics of the nuclear structural protein NuMA indicates that nuclear shape alone is a weak phenotypic predictor. In contrast, variations in the NuMA organization parsed emergent cell phenotypes and discerned emergent stages of stem cell transformation, supporting a prognosticating role for this protein in the outcomes of nuclear functions. Highlights: • High-content analysis of nuclear shape and organization classify stem and progenitor cells poised for distinct lineages. • Early oncogenic changes in mesenchymal stem cells (MSCs) are also detected with nuclear descriptors. • A new class of cancer-mitigating biomaterials was identified based on image informatics. • Textural metrics of the nuclear structural protein NuMA are sufficient to parse emergent cell phenotypes.

  18. Leveraging Pattern Semantics for Extracting Entities in Enterprises

    PubMed Central

    Tao, Fangbo; Zhao, Bo; Fuxman, Ariel; Li, Yang; Han, Jiawei

    2015-01-01

    Entity Extraction is a process of identifying meaningful entities from text documents. In enterprises, extracting entities improves enterprise efficiency by facilitating numerous applications, including search, recommendation, etc. However, the problem is particularly challenging in enterprise domains for several reasons. First, the lack of redundancy of enterprise entities makes previous web-based systems like NELL and OpenIE ineffective, since using only high-precision/low-recall patterns like those systems would miss the majority of sparse enterprise entities, while using more low-precision patterns in a sparse setting drastically introduces noise. Second, semantic drift is common in enterprises (“Blue” refers to “Windows Blue”), such that public signals from the web cannot be directly applied to entities. Moreover, many internal entities never appear on the web; sparse internal signals are the only source for discovering them. To address these challenges, we propose an end-to-end framework for extracting entities in enterprises, taking an enterprise corpus and limited seeds as input and generating a high-quality entity collection as output. We introduce the novel concept of the Semantic Pattern Graph to leverage public signals to understand the underlying semantics of lexical patterns, reinforce pattern evaluation using mined semantics, and yield more accurate and complete entities. Experiments on Microsoft enterprise data show the effectiveness of our approach. PMID:26705540

  19. Leveraging Pattern Semantics for Extracting Entities in Enterprises.

    PubMed

    Tao, Fangbo; Zhao, Bo; Fuxman, Ariel; Li, Yang; Han, Jiawei

    2015-05-01

    Entity Extraction is a process of identifying meaningful entities from text documents. In enterprises, extracting entities improves enterprise efficiency by facilitating numerous applications, including search, recommendation, etc. However, the problem is particularly challenging in enterprise domains for several reasons. First, the lack of redundancy of enterprise entities makes previous web-based systems like NELL and OpenIE ineffective, since using only high-precision/low-recall patterns like those systems would miss the majority of sparse enterprise entities, while using more low-precision patterns in a sparse setting drastically introduces noise. Second, semantic drift is common in enterprises ("Blue" refers to "Windows Blue"), such that public signals from the web cannot be directly applied to entities. Moreover, many internal entities never appear on the web; sparse internal signals are the only source for discovering them. To address these challenges, we propose an end-to-end framework for extracting entities in enterprises, taking an enterprise corpus and limited seeds as input and generating a high-quality entity collection as output. We introduce the novel concept of the Semantic Pattern Graph to leverage public signals to understand the underlying semantics of lexical patterns, reinforce pattern evaluation using mined semantics, and yield more accurate and complete entities. Experiments on Microsoft enterprise data show the effectiveness of our approach.
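
    The abstract's bootstrapping setting (limited seeds, lexical patterns, pattern evaluation) can be illustrated with a toy sketch. The following is a generic pattern-bootstrapping loop standing in for the paper's framework; the corpus, entity names, and the crude support-based pattern score (in place of the actual Semantic Pattern Graph reinforcement) are all assumptions.

    ```python
    import re
    from collections import defaultdict

    # Toy enterprise corpus with a sparse internal entity ("CloudStore") that
    # a seed entity ("Autopilot") helps discover; names invented for the demo.
    corpus = [
        "deploy the build with Autopilot before release",
        "deploy the build with CloudStore before release",
        "teams rely on Autopilot for rollouts",
        "teams rely on CloudStore for rollouts",
    ]
    seeds = {"Autopilot"}

    def extract(corpus, seeds, rounds=2):
        entities = set(seeds)
        for _ in range(rounds):
            # 1. Induce lexical patterns from sentences with known entities.
            patterns = defaultdict(set)
            for sent in corpus:
                for e in entities:
                    if e in sent:
                        patterns[sent.replace(e, "<E>")].add(e)
            # 2. Keep patterns supported by known entities (a crude stand-in
            #    for reinforcement via the paper's Semantic Pattern Graph).
            good = [p for p, es in patterns.items() if len(es) >= 1]
            # 3. Apply the kept patterns to harvest new candidate entities.
            for sent in corpus:
                for p in good:
                    rx = r"([A-Z]\w+)".join(re.escape(part)
                                            for part in p.split("<E>"))
                    m = re.fullmatch(rx, sent)
                    if m:
                        entities.add(m.group(1))
        return entities

    print(extract(corpus, seeds))   # {'Autopilot', 'CloudStore'}
    ```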

  20. Where Is the Semantic System? A Critical Review and Meta-Analysis of 120 Functional Neuroimaging Studies

    PubMed Central

    Desai, Rutvik H.; Graves, William W.; Conant, Lisa L.

    2009-01-01

    Semantic memory refers to knowledge about people, objects, actions, relations, self, and culture acquired through experience. The neural systems that store and retrieve this information have been studied for many years, but a consensus regarding their identity has not been reached. Using strict inclusion criteria, we analyzed 120 functional neuroimaging studies focusing on semantic processing. Reliable areas of activation in these studies were identified using the activation likelihood estimate (ALE) technique. These activations formed a distinct, left-lateralized network comprised of 7 regions: posterior inferior parietal lobe, middle temporal gyrus, fusiform and parahippocampal gyri, dorsomedial prefrontal cortex, inferior frontal gyrus, ventromedial prefrontal cortex, and posterior cingulate gyrus. Secondary analyses showed specific subregions of this network associated with knowledge of actions, manipulable artifacts, abstract concepts, and concrete concepts. The cortical regions involved in semantic processing can be grouped into 3 broad categories: posterior multimodal and heteromodal association cortex, heteromodal prefrontal cortex, and medial limbic regions. The expansion of these regions in the human relative to the nonhuman primate brain may explain uniquely human capacities to use language productively, plan, solve problems, and create cultural and technological artifacts, all of which depend on the fluid and efficient retrieval and manipulation of semantic knowledge. PMID:19329570

  1. Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce

    NASA Astrophysics Data System (ADS)

    Farhan Husain, Mohammad; Doshi, Pankil; Khan, Latifur; Thuraisingham, Bhavani

    Handling huge amounts of data scalably has been a matter of concern for a long time, and the same is true for Semantic Web data. Current Semantic Web frameworks lack this ability. In this paper, we describe a framework that we built using Hadoop to store and retrieve large numbers of RDF triples. We describe our schema to store RDF data in the Hadoop Distributed File System. We also present our algorithms to answer a SPARQL query, making use of Hadoop's MapReduce framework to actually answer the queries. Our results reveal that we can store huge amounts of Semantic Web data in Hadoop clusters built mostly from cheap commodity-class hardware and still answer queries fast enough. We conclude that ours is a scalable framework, able to handle large amounts of RDF data efficiently.
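
    The MapReduce idea behind answering a SPARQL join can be sketched without a cluster: map each triple to (join-key, pattern-tag) pairs, shuffle by key, and reduce by checking that all patterns matched. The sketch below simulates that flow in plain Python on assumed toy data; it is not the paper's actual HDFS schema or job layout.

    ```python
    from collections import defaultdict

    # Toy RDF triples (subject, predicate, object); in the paper these live
    # in HDFS files.
    triples = [
        ("alice", "worksAt", "acme"),
        ("bob", "worksAt", "acme"),
        ("alice", "livesIn", "springfield"),
        ("bob", "livesIn", "shelbyville"),
    ]

    # Query: SELECT ?p WHERE { ?p worksAt acme . ?p livesIn springfield }

    def map_phase(triple):
        """Emit (join-key, tag) pairs keyed on the shared variable ?p."""
        s, p, o = triple
        if p == "worksAt" and o == "acme":
            yield s, "pattern1"
        if p == "livesIn" and o == "springfield":
            yield s, "pattern2"

    def reduce_phase(key, tags):
        """A subject satisfies the query if both patterns matched it."""
        if {"pattern1", "pattern2"} <= set(tags):
            yield key

    # Simulate the shuffle that Hadoop performs between map and reduce.
    shuffled = defaultdict(list)
    for t in triples:
        for k, v in map_phase(t):
            shuffled[k].append(v)

    results = [r for k, vs in shuffled.items() for r in reduce_phase(k, vs)]
    print(results)   # ['alice']
    ```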

  2. Cohort Selection and Management Application Leveraging Standards-based Semantic Interoperability and a Groovy DSL

    PubMed Central

    Bucur, Anca; van Leeuwen, Jasper; Chen, Njin-Zu; Claerhout, Brecht; de Schepper, Kristof; Perez-Rey, David; Paraiso-Medina, Sergio; Alonso-Calvo, Raul; Mehta, Keyur; Krykwinski, Cyril

    2016-01-01

    This paper describes a new Cohort Selection application implemented to support streamlining the definition phase of multi-centric clinical research in oncology. Our approach aims at both ease of use and precision in defining the selection filters expressing the characteristics of the desired population. The application leverages our standards-based Semantic Interoperability Solution and a Groovy DSL to provide high expressiveness in the definition of filters and flexibility in their composition into complex selection graphs including splits and merges. Widely-adopted ontologies such as SNOMED-CT are used to represent the semantics of the data and to express concepts in the application filters, facilitating data sharing and collaboration on joint research questions in large communities of clinical users. The application supports patient data exploration and efficient collaboration in multi-site, heterogeneous and distributed data environments. PMID:27570644
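
    To illustrate the filter-composition model (filters combined into selection graphs with splits and merges), here is a small Python stand-in; the production application expresses such graphs in its Groovy DSL, whose syntax the abstract does not show, so the functions and patient records below are illustrative only.

    ```python
    def has_code(code):
        """Filter: keep patients whose record carries a SNOMED CT code."""
        return lambda patients: [p for p in patients if code in p["codes"]]

    def split_merge(patients, *branches):
        """Split the cohort across branch filters, then merge (union)."""
        merged = {}
        for branch in branches:
            for p in branch(patients):
                merged[p["id"]] = p
        return list(merged.values())

    patients = [
        {"id": 1, "codes": {"254837009"}},   # malignant neoplasm of breast
        {"id": 2, "codes": {"363358000"}},   # malignant tumor of lung
        {"id": 3, "codes": {"22298006"}},    # myocardial infarction
    ]
    cohort = split_merge(patients,
                         has_code("254837009"), has_code("363358000"))
    print([p["id"] for p in cohort])   # [1, 2]
    ```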

  3. Review of the "AS-BUILT BIM" Approaches

    NASA Astrophysics Data System (ADS)

    Hichri, N.; Stefani, C.; De Luca, L.; Veron, P.

    2013-02-01

    Today, we need 3D models of heritage buildings in order to handle projects of restoration, documentation and maintenance more efficiently. In this context, developing an effective approach, based on a first phase of building survey, is a necessary step in order to build a semantically enriched digital model. For this purpose, Building Information Modeling (BIM) is an efficient tool for storing and exchanging knowledge about buildings. In order to create such a model, there are three fundamental steps: acquisition, segmentation and modeling. For these reasons, it is essential to understand and analyze the entire chain that leads to a well-structured and enriched 3D digital model. This paper proposes a survey and an analysis of the existing approaches on these topics and defines a new approach to semantic structuring that takes into account the complexity of this chain.

  4. Cognitive correlates of verbal memory and verbal fluency in schizophrenia, and differential effects of various clinical symptoms between male and female patients.

    PubMed

    Brébion, Gildas; Villalta-Gil, Victoria; Autonell, Jaume; Cervilla, Jorge; Dolz, Montserrat; Foix, Alexandrina; Haro, Josep Maria; Usall, Judith; Vilaplana, Miriam; Ochoa, Susana

    2013-06-01

    Impairment of higher cognitive functions in patients with schizophrenia might stem from perturbation of more basic functions, such as processing speed. Various clinical symptoms might affect cognitive efficiency as well. Notably, previous research has revealed the role of affective symptoms in memory performance in this population, and suggested sex-specific effects. We conducted a post-hoc analysis of an extensive neuropsychological study of 88 patients with schizophrenia. Regression analyses were conducted on verbal memory and verbal fluency data to investigate the contribution of semantic organisation and processing speed to performance. The role of negative and affective symptoms and of attention disorders in verbal memory and verbal fluency was investigated separately in male and female patients. Semantic clustering contributed to verbal recall, and a measure of reading speed contributed to verbal recall as well as to phonological and semantic fluency. Negative symptoms affected verbal recall and verbal fluency in the male patients, whereas attention disorders affected these abilities in the female patients. Furthermore, depression affected verbal recall in women, whereas anxiety affected it in men. These results confirm the association of processing speed with cognitive efficiency in patients with schizophrenia. They also confirm the previously observed sex-specific associations of depression and anxiety with memory performance in these patients, and suggest that negative symptoms and attention disorders likewise are related to cognitive efficiency differently in men and women.

  5. Automatic and Controlled Semantic Retrieval: TMS Reveals Distinct Contributions of Posterior Middle Temporal Gyrus and Angular Gyrus

    PubMed Central

    Davey, James; Cornelissen, Piers L.; Thompson, Hannah E.; Sonkusare, Saurabh; Hallam, Glyn; Smallwood, Jonathan

    2015-01-01

    Semantic retrieval involves both (1) automatic spreading activation between highly related concepts and (2) executive control processes that tailor this activation to suit the current context or goals. Two structures in left temporoparietal cortex, angular gyrus (AG) and posterior middle temporal gyrus (pMTG), are thought to be crucial to semantic retrieval and are often recruited together during semantic tasks; however, they show strikingly different patterns of functional connectivity at rest (coupling with the “default mode network” and “frontoparietal control system,” respectively). Here, transcranial magnetic stimulation (TMS) was used to establish a causal yet dissociable role for these sites in semantic cognition in human volunteers. TMS to AG disrupted thematic judgments particularly when the link between probe and target was strong (e.g., a picture of an Alsatian with a bone), and impaired the identification of objects at a specific but not a superordinate level (for the verbal label “Alsatian” not “animal”). In contrast, TMS to pMTG disrupted thematic judgments for weak but not strong associations (e.g., a picture of an Alsatian with razor wire), and impaired identity matching for both superordinate and specific-level labels. Thus, stimulation to AG interfered with the automatic retrieval of specific concepts from the semantic store while stimulation of pMTG impaired semantic cognition when there was a requirement to flexibly shape conceptual activation in line with the task requirements. These results demonstrate that AG and pMTG make a dissociable contribution to automatic and controlled aspects of semantic retrieval. SIGNIFICANCE STATEMENT We demonstrate a novel functional dissociation between the angular gyrus (AG) and posterior middle temporal gyrus (pMTG) in conceptual processing. These sites are often coactivated during neuroimaging studies using semantic tasks, but their individual contributions are unclear. Using transcranial magnetic stimulation and tasks designed to assess different aspects of semantics (item identity and thematic matching), we tested two alternative theoretical accounts. Neither site showed the pattern expected for a “thematic hub” (i.e., a site storing associations between concepts) since stimulation disrupted both tasks. Instead, the data indicated that pMTG contributes to the controlled retrieval of conceptual knowledge, while AG is critical for the efficient automatic retrieval of specific semantic information. PMID:26586812

  6. Distributional analyses in the picture-word interference paradigm: Exploring the semantic interference and the distractor frequency effects.

    PubMed

    Scaltritti, Michele; Navarrete, Eduardo; Peressotti, Francesca

    2015-01-01

    The present study explores the distributional features of two important effects within the picture-word interference paradigm: the semantic interference and the distractor frequency effects. These two effects display different and specific distributional profiles. Semantic interference appears greatly reduced in faster response times, while it reaches its full magnitude only in slower responses. This can be interpreted as a sign of fluctuant attentional efficiency in resolving response conflict. In contrast, the distractor frequency effect is mediated mainly by a distributional shift, with low-frequency distractors uniformly shifting reaction time distribution towards a slower range of latencies. This finding fits with the idea that distractor frequency exerts its effect by modulating the point in time in which operations required to discard the distractor can start. Taken together, these results are congruent with current theoretical accounts of both the semantic interference and distractor frequency effects. Critically, distributional analyses highlight and further describe the different cognitive dynamics underlying these two effects, suggesting that this analytical tool is able to offer important insights about lexical access during speech production.

  7. Word add-in for ontology recognition: semantic enrichment of scientific literature.

    PubMed

    Fink, J Lynn; Fernicola, Pablo; Chandran, Rahul; Parastatidis, Savas; Wade, Alex; Naim, Oscar; Quinn, Gregory B; Bourne, Philip E

    2010-02-24

    In the current era of scientific research, efficient communication of information is paramount. As such, the nature of scholarly and scientific communication is changing; cyberinfrastructure is now absolutely necessary and new media are allowing information and knowledge to be more interactive and immediate. One approach to making knowledge more accessible is the addition of machine-readable semantic data to scholarly articles. The Word add-in presented here will assist authors in this effort by automatically recognizing and highlighting words or phrases that are likely information-rich, allowing authors to associate semantic data with those words or phrases, and to embed that data in the document as XML. The add-in and source code are publicly available at http://www.codeplex.com/UCSDBioLit. The Word add-in for ontology term recognition makes it possible for an author to add semantic data to a document as it is being written and it encodes these data using XML tags that are effectively a standard in life sciences literature. Allowing authors to mark-up their own work will help increase the amount and quality of machine-readable literature metadata.
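
    The add-in's core behavior, recognizing ontology terms and embedding machine-readable tags, can be approximated in a few lines. The tag schema below is an assumption for illustration, not the add-in's actual XML encoding; the GO identifiers are real Gene Ontology IDs.

    ```python
    import re

    # Toy lookup from surface terms to ontology identifiers. GO:0006915 is
    # the Gene Ontology ID for apoptotic process, GO:0007165 for signal
    # transduction; the <term> tag schema is illustrative only.
    ONTOLOGY_TERMS = {
        "apoptosis": "GO:0006915",
        "signal transduction": "GO:0007165",
    }

    def mark_up(sentence):
        """Wrap recognized ontology terms in XML tags."""
        for term, term_id in ONTOLOGY_TERMS.items():
            sentence = re.sub(
                rf"\b{re.escape(term)}\b",
                lambda m: f'<term ref="{term_id}">{m.group(0)}</term>',
                sentence,
                flags=re.IGNORECASE,
            )
        return sentence

    print(mark_up("Signal transduction pathways regulate apoptosis."))
    ```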

  8. Human ecstasy (MDMA) polydrug users have altered brain activation during semantic processing.

    PubMed

    Watkins, Tristan J; Raj, Vidya; Lee, Junghee; Dietrich, Mary S; Cao, Aize; Blackford, Jennifer U; Salomon, Ronald M; Park, Sohee; Benningfield, Margaret M; Di Iorio, Christina R; Cowan, Ronald L

    2013-05-01

    Ecstasy (3,4-methylenedioxymethamphetamine [MDMA]) polydrug users have verbal memory performance that is statistically significantly lower than that of control subjects. Studies have correlated long-term MDMA use with altered brain activation in regions that play a role in verbal memory. The aim of our study was to examine the association of lifetime ecstasy use with semantic memory performance and brain activation in ecstasy polydrug users. A total of 23 abstinent ecstasy polydrug users (age = 24.57 years) and 11 controls (age = 22.36 years) performed a two-part functional magnetic resonance imaging (fMRI) semantic encoding and recognition task. To isolate brain regions activated during each semantic task, we created statistical activation maps in which brain activation was greater for word stimuli than for non-word stimuli (corrected p < 0.05). During the encoding phase, ecstasy polydrug users had greater activation during semantic encoding bilaterally in language processing regions, including Brodmann areas 7, 39, and 40. Of this bilateral activation, signal intensity with a peak T in the right superior parietal lobe was correlated with lifetime ecstasy use (rs = 0.43, p = 0.042). Behavioral performance did not differ between groups. These findings demonstrate that ecstasy polydrug users have increased brain activation during semantic processing. This increase in brain activation in the absence of behavioral deficits suggests that ecstasy polydrug users have reduced cortical efficiency during semantic encoding, possibly secondary to MDMA-induced 5-HT neurotoxicity. Although pre-existing differences cannot be ruled out, this suggests the possibility of a compensatory mechanism allowing ecstasy polydrug users to perform equivalently to controls, providing additional support for an association of altered cerebral neurophysiology with MDMA exposure.

  9. Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis.

    PubMed

    Velupillai, S; Mowery, D; South, B R; Kvist, M; Dalianis, H

    2015-08-13

    We present a review of recent advances in clinical Natural Language Processing (NLP), with a focus on semantic analysis and key subtasks that support such analysis. We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant referenced publications from the included papers. Significant articles published within this time-span were included and are discussed from the perspective of semantic analysis. Three key clinical NLP subtasks that enable such analysis were identified: 1) developing more efficient methods for corpus creation (annotation and de-identification), 2) generating building blocks for extracting meaning (morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical utility (NLP applications and infrastructure for clinical use cases). Finally, we provide a reflection upon most recent developments and potential areas of future NLP development and applications. There has been an increase of advances within key NLP subtasks that support semantic analysis. Performance of NLP semantic analysis is, in many cases, close to that of agreement between humans. The creation and release of corpora annotated with complex semantic information models has greatly supported the development of new tools and approaches. Research on non-English languages is continuously growing. NLP methods have sometimes been successfully employed in real-world clinical tasks. However, there is still a gap between the development of advanced resources and their utilization in clinical settings. A plethora of new clinical use cases are emerging due to established health care initiatives and additional patient-generated sources through the extensive use of social media and other devices.

  10. Human ecstasy (MDMA) polydrug users have altered brain activation during semantic processing

    PubMed Central

    Watkins, Tristan J.; Raj, Vidya; Lee, Junghee; Dietrich, Mary S.; Cao, Aize; Blackford, Jennifer U.; Salomon, Ronald M.; Park, Sohee; Benningfield, Margaret M.; Di Iorio, Christina R.; Cowan, Ronald L.

    2012-01-01

    Rationale Ecstasy (MDMA) polydrug users have verbal memory performance that is statistically significantly lower than comparison control subjects. Studies have correlated long-term MDMA use with altered brain activation in regions that play a role in verbal memory. Objectives The aim of our study was to examine the association of lifetime ecstasy use with semantic memory performance and brain activation in ecstasy polydrug users. Methods 23 abstinent ecstasy polydrug users (age=24.57) and 11 controls (age=22.36) performed a two-part fMRI semantic encoding and recognition task. To isolate brain regions activated during each semantic task, we created statistical activation maps in which brain activation was greater for word stimuli than for non-word stimuli (corrected p<0.05). Results During the encoding phase, ecstasy polydrug users had greater activation during semantic encoding bilaterally in language processing regions, including Brodmann Areas 7, 39, and 40. Of this bilateral activation, signal intensity with a peak T in the right superior parietal lobe was correlated with lifetime ecstasy use (rs=0.43, p=0.042). Behavioral performance did not differ between groups. Conclusions These findings demonstrate that ecstasy polydrug users have increased brain activation during semantic processing. This increase in brain activation in the absence of behavioral deficits suggests that ecstasy polydrug users have reduced cortical efficiency during semantic encoding, possibly secondary to MDMA-induced 5-HT neurotoxicity. Although pre-existing differences cannot be ruled out, this suggests the possibility of a compensatory mechanism allowing ecstasy polydrug users to perform equivalently to controls, providing additional support for an association of altered cerebral neurophysiology with MDMA exposure. PMID:23241648

  11. Effects of semantic relatedness on age-related associative memory deficits: the role of theta oscillations.

    PubMed

    Crespo-Garcia, Maite; Cantero, Jose L; Atienza, Mercedes

    2012-07-16

    Growing evidence suggests that age-related deficits in associative memory are alleviated when the to-be-associated items are semantically related. Here we investigate whether this beneficial effect of semantic relatedness is paralleled by spatio-temporal changes in cortical EEG dynamics during incidental encoding. Young and older adults were presented with faces at a particular spatial location preceded by a biographical cue that was either semantically related or unrelated. As expected, automatic encoding of face-location associations benefited from semantic relatedness in both age groups. This effect correlated with increased power of theta oscillations over medial and anterior lateral regions of the prefrontal cortex (PFC) and lateral regions of the posterior parietal cortex (PPC) in both groups. But better-performing elders also showed increased brain-behavior correlation in the theta band over the right inferior frontal gyrus (IFG) as compared to young adults. Semantic relatedness was, however, insufficient to fully eliminate age-related differences in associative memory. In line with this finding, poorer-performing elders relative to young adults showed significant reductions of theta power in the left IFG that were further predictive of behavioral impairment in the recognition task. Altogether, these results suggest that older adults benefit less than young adults from executive processes during encoding, mainly due to neural inefficiency over regions of the left ventrolateral prefrontal cortex (VLPFC). But this associative deficit may be partially compensated for by engaging preexistent semantic knowledge, which likely leads to an efficient recruitment of attentional and integration processes supported by the left PPC and left anterior PFC respectively, together with neural compensatory mechanisms governed by the right VLPFC.

  12. Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis

    PubMed Central

    Mowery, D.; South, B. R.; Kvist, M.; Dalianis, H.

    2015-01-01

    Summary Objectives We present a review of recent advances in clinical Natural Language Processing (NLP), with a focus on semantic analysis and key subtasks that support such analysis. Methods We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant referenced publications from the included papers. Results Significant articles published within this time-span were included and are discussed from the perspective of semantic analysis. Three key clinical NLP subtasks that enable such analysis were identified: 1) developing more efficient methods for corpus creation (annotation and de-identification), 2) generating building blocks for extracting meaning (morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical utility (NLP applications and infrastructure for clinical use cases). Finally, we provide a reflection upon most recent developments and potential areas of future NLP development and applications. Conclusions There has been an increase of advances within key NLP subtasks that support semantic analysis. Performance of NLP semantic analysis is, in many cases, close to that of agreement between humans. The creation and release of corpora annotated with complex semantic information models has greatly supported the development of new tools and approaches. Research on non-English languages is continuously growing. NLP methods have sometimes been successfully employed in real-world clinical tasks. However, there is still a gap between the development of advanced resources and their utilization in clinical settings. A plethora of new clinical use cases are emerging due to established health care initiatives and additional patient-generated sources through the extensive use of social media and other devices. PMID:26293867

  13. An automatic method to generate domain-specific investigator networks using PubMed abstracts.

    PubMed

    Yu, Wei; Yesupriya, Ajay; Wulf, Anja; Qu, Junfeng; Gwinn, Marta; Khoury, Muin J

    2007-06-20

    Collaboration among investigators has become critical to scientific research. This includes ad hoc collaboration established through personal contacts as well as formal consortia established by funding agencies. Continued growth in online resources for scientific research and communication has promoted the development of highly networked research communities. Extending these networks globally requires identifying additional investigators in a given domain, profiling their research interests, and collecting current contact information. We present a novel strategy for building investigator networks dynamically and producing detailed investigator profiles using data available in PubMed abstracts. We developed a novel strategy to obtain detailed investigator information by automatically parsing the affiliation string in PubMed records. We illustrated the results by using a published literature database in human genome epidemiology (HuGE Pub Lit) as a test case. Our parsing strategy extracted country information from 92.1% of the affiliation strings in a random sample of PubMed records and in 97.0% of HuGE records, with accuracies of 94.0% and 91.0%, respectively. Institution information was parsed from 91.3% of the general PubMed records (accuracy 86.8%) and from 94.2% of HuGE PubMed records (accuracy 87.0%). We demonstrated the application of our approach to dynamic creation of investigator networks by creating a prototype information system containing a large database of PubMed abstracts relevant to human genome epidemiology (HuGE Pub Lit), indexed using PubMed medical subject headings converted to Unified Medical Language System concepts. Our method was able to identify 70-90% of the investigators/collaborators in three different human genetics fields; it also successfully identified 9 of 10 genetics investigators within the PREBIC network, an existing preterm birth research network. We successfully created a web-based prototype capable of creating domain-specific investigator networks based on an application that accurately generates detailed investigator profiles from PubMed abstracts combined with robust standard vocabularies. This approach could be used for other biomedical fields to efficiently establish domain-specific investigator networks.
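
    A heuristic version of the affiliation-string parsing is easy to sketch: strip e-mail addresses, split on commas, and read the institution from the first segment and the country from the last. The published method's rules are more elaborate than this; the sample affiliation below is illustrative.

    ```python
    import re

    def parse_affiliation(affiliation):
        """Heuristic split of a PubMed affiliation string (a sketch, not the
        paper's full rule set)."""
        # Drop e-mail addresses, then trailing punctuation, before splitting.
        text = re.sub(r"\S+@\S+", "", affiliation).rstrip(". ")
        parts = [p.strip() for p in text.split(",") if p.strip()]
        if not parts:
            return {"institution": None, "country": None}
        # Institution is typically the unit named first; country is usually
        # the final comma-separated segment.
        return {"institution": parts[0], "country": parts[-1]}

    aff = ("Office of Public Health Genomics, Centers for Disease Control "
           "and Prevention, Atlanta, GA, USA. someone@example.gov")
    print(parse_affiliation(aff))
    # {'institution': 'Office of Public Health Genomics', 'country': 'USA'}
    ```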

  14. An automatic method to generate domain-specific investigator networks using PubMed abstracts

    PubMed Central

    Yu, Wei; Yesupriya, Ajay; Wulf, Anja; Qu, Junfeng; Gwinn, Marta; Khoury, Muin J

    2007-01-01

    Background Collaboration among investigators has become critical to scientific research. This includes ad hoc collaboration established through personal contacts as well as formal consortia established by funding agencies. Continued growth in online resources for scientific research and communication has promoted the development of highly networked research communities. Extending these networks globally requires identifying additional investigators in a given domain, profiling their research interests, and collecting current contact information. We present a novel strategy for building investigator networks dynamically and producing detailed investigator profiles using data available in PubMed abstracts. Results We developed a novel strategy to obtain detailed investigator information by automatically parsing the affiliation string in PubMed records. We illustrated the results by using a published literature database in human genome epidemiology (HuGE Pub Lit) as a test case. Our parsing strategy extracted country information from 92.1% of the affiliation strings in a random sample of PubMed records and in 97.0% of HuGE records, with accuracies of 94.0% and 91.0%, respectively. Institution information was parsed from 91.3% of the general PubMed records (accuracy 86.8%) and from 94.2% of HuGE PubMed records (accuracy 87.0%). We demonstrated the application of our approach to dynamic creation of investigator networks by creating a prototype information system containing a large database of PubMed abstracts relevant to human genome epidemiology (HuGE Pub Lit), indexed using PubMed medical subject headings converted to Unified Medical Language System concepts. Our method was able to identify 70–90% of the investigators/collaborators in three different human genetics fields; it also successfully identified 9 of 10 genetics investigators within the PREBIC network, an existing preterm birth research network. Conclusion We successfully created a web-based prototype capable of creating domain-specific investigator networks based on an application that accurately generates detailed investigator profiles from PubMed abstracts combined with robust standard vocabularies. This approach could be used for other biomedical fields to efficiently establish domain-specific investigator networks. PMID:17584920

  15. Pragmatic precision oncology: the secondary uses of clinical tumor molecular profiling

    PubMed Central

    Thota, Ramya; Staggs, David B; Johnson, Douglas B; Warner, Jeremy L

    2016-01-01

    Background Precision oncology increasingly utilizes molecular profiling of tumors to determine treatment decisions with targeted therapeutics. The molecular profiling data is valuable in the treatment of individual patients as well as for multiple secondary uses. Objective To automatically parse, categorize, and aggregate clinical molecular profile data generated during cancer care, as well as use this data to address multiple secondary use cases. Methods A system to parse, categorize and aggregate molecular profile data was created. A naïve Bayesian classifier categorized results according to clinical groups. The accuracy of these systems was validated against a published, expertly curated subset of molecular profiling data. Results Following one year of operation, 819 samples have been accurately parsed and categorized to generate a data repository of 10,620 genetic variants. The database has been used for operational, clinical trial, and discovery science research. Conclusions A real-time database of molecular profiling data is a pragmatic solution to several knowledge management problems in the practice and science of precision oncology. PMID:27026612
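
    The categorization step, a naïve Bayesian classifier assigning parsed results to clinical groups, can be sketched with scikit-learn. The report texts and category labels below are invented stand-ins for the paper's expertly curated data.

    ```python
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Toy training data: free-text variant results with assigned clinical
    # categories (illustrative labels, not the paper's curated set).
    reports = [
        "BRAF V600E mutation detected",
        "EGFR exon 19 deletion detected",
        "no clinically significant variants identified",
        "variant of uncertain significance in TP53",
    ]
    labels = ["actionable", "actionable", "negative", "uncertain"]

    # Bag-of-words features feeding a multinomial naive Bayes classifier.
    clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
    clf.fit(reports, labels)

    print(clf.predict(["KRAS G12C mutation detected"]))  # likely 'actionable'
    ```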

  16. High-content image informatics of the structural nuclear protein NuMA parses trajectories for stem/progenitor cell lineages and oncogenic transformation.

    PubMed

    Vega, Sebastián L; Liu, Er; Arvind, Varun; Bushman, Jared; Sung, Hak-Joon; Becker, Matthew L; Lelièvre, Sophie; Kohn, Joachim; Vidi, Pierre-Alexandre; Moghe, Prabhas V

    2017-02-01

    Stem and progenitor cells that exhibit significant regenerative potential and critical roles in cancer initiation and progression remain difficult to characterize. Cell fates are determined by reciprocal signaling between the cell microenvironment and the nucleus; hence parameters derived from nuclear remodeling are ideal candidates for stem/progenitor cell characterization. Here we applied high-content, single cell analysis of nuclear shape and organization to examine stem and progenitor cells destined to distinct differentiation endpoints, yet undistinguishable by conventional methods. Nuclear descriptors defined through image informatics classified mesenchymal stem cells poised to either adipogenic or osteogenic differentiation, and oligodendrocyte precursors isolated from different regions of the brain and destined to distinct astrocyte subtypes. Nuclear descriptors also revealed early changes in stem cells after chemical oncogenesis, allowing the identification of a class of cancer-mitigating biomaterials. To capture the metrology of nuclear changes, we developed a simple and quantitative "imaging-derived" parsing index, which reflects the dynamic evolution of the high-dimensional space of nuclear organizational features. A comparative analysis of parsing outcomes via either nuclear shape or textural metrics of the nuclear structural protein NuMA indicates that nuclear shape alone is a weak phenotypic predictor. In contrast, variations in the NuMA organization parsed emergent cell phenotypes and discerned emergent stages of stem cell transformation, supporting a prognosticating role for this protein in the outcomes of nuclear functions.

  17. Parallel Processing Strategies of the Primate Visual System

    PubMed Central

    Nassi, Jonathan J.; Callaway, Edward M.

    2009-01-01

    Preface Incoming sensory information is sent to the brain along modality-specific channels corresponding to the five senses. Each of these channels further parses the incoming signals into parallel streams to provide a compact, efficient input to the brain. Ultimately, these parallel input signals must be elaborated upon and integrated within the cortex to provide a unified and coherent percept. Recent studies in the primate visual cortex have greatly contributed to our understanding of how this goal is accomplished. Multiple strategies including retinal tiling, hierarchical and parallel processing and modularity, defined spatially and by cell type-specific connectivity, are all used by the visual system to recover the rich detail of our visual surroundings. PMID:19352403

  18. GIDL: a rule based expert system for GenBank Intelligent Data Loading into the Molecular Biodiversity database

    PubMed Central

    2012-01-01

    Background In the scientific biodiversity community, the need to build a bridge between molecular and traditional biodiversity studies is increasingly perceived. We believe that information technology could have a preeminent role in integrating the information generated by these studies with the large amount of molecular data available in public bioinformatics databases. This work is primarily aimed at building a bioinformatic infrastructure for the integration of public and private biodiversity data through the development of GIDL, an Intelligent Data Loader coupled with the Molecular Biodiversity Database. The system presented here organizes in an ontological way and locally stores the sequence and annotation data contained in the GenBank primary database. Methods The GIDL architecture consists of a relational database and of intelligent data loader software. The relational database schema is designed to manage biodiversity information (Molecular Biodiversity Database) and is organized in four areas: MolecularData, Experiment, Collection and Taxonomy. The MolecularData area is inspired by an established standard in Generic Model Organism Databases, the Chado relational schema. The peculiarity of Chado, and also its strength, is the adoption of an ontological schema which makes use of the Sequence Ontology. The Intelligent Data Loader (IDL) component of GIDL is Extract, Transform and Load (ETL) software able to parse data, to discover hidden information in the GenBank entries and to populate the Molecular Biodiversity Database. The IDL is composed of three main modules: the Parser, able to parse GenBank flat files; the Reasoner, which automatically builds CLIPS facts mapping the biological knowledge expressed by the Sequence Ontology; and the DBFiller, which translates the CLIPS facts into ordered SQL statements used to populate the database. In GIDL, Semantic Web technologies have been adopted due to their advantages in data representation, integration and processing. Results and conclusions Entries coming from the Virus (814,122), Plant (1,365,360) and Invertebrate (959,065) divisions of GenBank rel. 180 have been loaded into the Molecular Biodiversity Database by GIDL. Our system, combining the Sequence Ontology and the Chado schema, allows more powerful query expressiveness compared with the most commonly used sequence retrieval systems like Entrez or SRS. PMID:22536971
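
    The Parser-to-DBFiller path can be approximated with Biopython and SQLite; the sketch below omits GIDL's CLIPS-based Reasoner and uses a toy table in place of the Chado-inspired schema, so it illustrates the ETL flow rather than reproducing the system.

    ```python
    import sqlite3

    from Bio import SeqIO   # Biopython GenBank flat-file parser

    def load_genbank(flatfile, db_path="biodiv.db"):
        """Parse a GenBank flat file and load features into a toy relational
        schema (a stand-in for the Molecular Biodiversity Database); GIDL's
        CLIPS-based Reasoner step is omitted here."""
        con = sqlite3.connect(db_path)
        con.execute("""CREATE TABLE IF NOT EXISTS feature(
                           accession TEXT, organism TEXT, feature_type TEXT,
                           start_pos INTEGER, end_pos INTEGER)""")
        for record in SeqIO.parse(flatfile, "genbank"):
            organism = record.annotations.get("organism", "unknown")
            for feat in record.features:
                con.execute("INSERT INTO feature VALUES (?, ?, ?, ?, ?)",
                            (record.id, organism, feat.type,
                             int(feat.location.start), int(feat.location.end)))
        con.commit()
        con.close()

    load_genbank("gbvrl1.seq")   # one flat file from GenBank's Virus division
    ```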

  19. Effective Web and Desktop Retrieval with Enhanced Semantic Spaces

    NASA Astrophysics Data System (ADS)

    Daoud, Amjad M.

    We describe the design and implementation of the NETBOOK prototype system for collecting, structuring and efficiently creating semantic vectors for concepts, noun phrases, and documents from a corpus of free full-text ebooks available on the World Wide Web. Automatic generation of concept maps from correlated index terms and extracted noun phrases is used to build a powerful conceptual index of individual pages. To ensure the scalability of our system, dimension reduction is performed using Random Projection [13]. Furthermore, we present a complete evaluation of the relative effectiveness of the NETBOOK system versus Google Desktop [8].
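
    Random Projection itself is compact enough to show directly: multiply the data by a random Gaussian matrix scaled by 1/sqrt(k), which approximately preserves pairwise distances by the Johnson-Lindenstrauss lemma. The dimensions below are illustrative, not NETBOOK's actual settings.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # 1,000 semantic vectors in a 10,000-dimensional term space (toy
    # stand-ins for NETBOOK's concept/document vectors).
    X = rng.random((1000, 10_000))

    k = 256  # target dimensionality
    # Random projection matrix; scaling by 1/sqrt(k) approximately preserves
    # pairwise distances (Johnson-Lindenstrauss lemma).
    R = rng.standard_normal((10_000, k)) / np.sqrt(k)
    X_reduced = X @ R

    print(X_reduced.shape)   # (1000, 256)
    ```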

  20. Towards a semantics-based approach in the development of geographic portals

    NASA Astrophysics Data System (ADS)

    Athanasis, Nikolaos; Kalabokidis, Kostas; Vaitis, Michail; Soulakellis, Nikolaos

    2009-02-01

    As the demand for geospatial data increases, the lack of efficient ways to find suitable information becomes critical. In this paper, a new methodology for knowledge discovery in geographic portals is presented. Based on the Semantic Web, our approach exploits the Resource Description Framework (RDF) in order to describe the geoportal's information with ontology-based metadata. When users traverse from page to page in the portal, they take advantage of the metadata infrastructure to navigate easily through data of interest. New metadata descriptions are published in the geoportal according to the RDF schemas.
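
    An RDF metadata description of the kind the abstract envisions can be written with rdflib; the ontology namespace, properties, and dataset below are invented for illustration and do not come from the paper.

    ```python
    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import DCTERMS, RDF

    # Illustrative geoportal ontology namespace (not from the paper).
    GEO = Namespace("http://geoportal.example.org/ontology#")

    g = Graph()
    dataset = URIRef("http://geoportal.example.org/data/fire-risk-lesvos")
    g.add((dataset, RDF.type, GEO.WildfireRiskMap))
    g.add((dataset, DCTERMS.title, Literal("Wildfire risk map of Lesvos")))
    g.add((dataset, DCTERMS.spatial, Literal("Lesvos, Greece")))
    g.add((dataset, GEO.derivedFrom,
           URIRef("http://geoportal.example.org/data/landcover")))

    # Such descriptions let portal pages link to related data of interest.
    print(g.serialize(format="turtle"))
    ```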

  1. Utilizing semantic networks to database and retrieve generalized stochastic colored Petri nets

    NASA Technical Reports Server (NTRS)

    Farah, Jeffrey J.; Kelley, Robert B.

    1992-01-01

    Previous work has introduced the Planning Coordinator (PCOORD), a coordinator functioning within the hierarchy of the Intelligent Machine Model. Within the structure of the Planning Coordinator resides the Primitive Structure Database (PSDB), which functions to provide the primitive structures utilized by the Planning Coordinator in establishing error recovery or on-line path plans. This report further explores the Primitive Structure Database and establishes the potential of utilizing semantic networks as a means of efficiently storing and retrieving the Generalized Stochastic Colored Petri Nets from which the error recovery plans are derived.

  2. From the Bench to the Bedside: The Role of Semantic Web and Translational Medicine for Enabling the Next Generation Healthcare Enterprise

    NASA Astrophysics Data System (ADS)

    Kashyap, Vipul

    The success of new innovations and technologies is very often disruptive in nature. At the same time, they enable novel next-generation infrastructures and solutions. These solutions introduce great efficiencies in the form of streamlined processes and the ability to create, organize, share and manage knowledge effectively, and at the same time provide crucial enablers for proposing and realizing new visions. In this paper, we propose a new vision of the next-generation healthcare enterprise and discuss how Translational Medicine, which aims to improve communication between the basic and clinical sciences, is a key requirement for achieving this vision, so that therapeutic insights may be derived from new scientific ideas, and vice versa. Translational research goes from bench to bedside, where theories emerging from preclinical experimentation are tested on disease-affected human subjects, and from bedside to bench, where information obtained from preliminary human experimentation can be used to refine our understanding of the biological principles underpinning the heterogeneity of human disease and polymorphism(s). Informatics, and semantic technologies in particular, have a big role to play in making this a reality. We identify critical requirements, viz., data integration, clinical decision support, and knowledge maintenance and provenance; and illustrate semantics-based solutions with respect to example scenarios and use cases.

  3. Practical life log video indexing based on content and context

    NASA Astrophysics Data System (ADS)

    Tancharoen, Datchakorn; Yamasaki, Toshihiko; Aizawa, Kiyoharu

    2006-01-01

    Today, multimedia information has gained an important role in daily life, and people can use imaging devices to capture their visual experiences. In this paper, we present our personal Life Log system to record personal experiences in the form of wearable video and environmental data; in addition, an efficient retrieval system is demonstrated to recall the desired media. We summarize practical video indexing techniques based on Life Log content and context to detect talking scenes using audio/visual cues and to select semantic key frames from GPS data. Voice annotation is also demonstrated as a practical indexing method. Moreover, we apply body media sensors to record continuous lifestyle data and use these data to index the semantic key frames. In the experiments, we demonstrate various video indexing results, which provide their semantic contents, and show Life Log visualizations for examining personal life effectively.

  4. The SeaHorn Verification Framework

    NASA Technical Reports Server (NTRS)

    Gurfinkel, Arie; Kahsai, Temesghen; Komuravelli, Anvesh; Navas, Jorge A.

    2015-01-01

    In this paper, we present SeaHorn, a software verification framework. The key distinguishing feature of SeaHorn is its modular design that separates the concerns of the syntax of the programming language, its operational semantics, and the verification semantics. SeaHorn encompasses several novelties: it (a) encodes verification conditions using an efficient yet precise inter-procedural technique, (b) provides flexibility in the verification semantics to allow different levels of precision, (c) leverages the state-of-the-art in software model checking and abstract interpretation for verification, and (d) uses Horn-clauses as an intermediate language to represent verification conditions which simplifies interfacing with multiple verification tools based on Horn-clauses. SeaHorn provides users with a powerful verification tool and researchers with an extensible and customizable framework for experimenting with new software verification techniques. The effectiveness and scalability of SeaHorn are demonstrated by an extensive experimental evaluation using benchmarks from SV-COMP 2015 and real avionics code.
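
    SeaHorn's use of Horn clauses as an intermediate verification language can be illustrated with Z3's Fixedpoint (Spacer) engine, one of the Horn-clause solvers such verification conditions target. The tiny program and its encoding below are an illustrative example, not output produced by SeaHorn.

    ```python
    from z3 import And, Bool, BoolSort, Fixedpoint, Function, Int, IntSort

    # Program being verified: x := 0; while (x < 10) x := x + 1; assert x == 10
    fp = Fixedpoint()
    fp.set(engine="spacer")

    Inv = Function("Inv", IntSort(), BoolSort())  # loop-head invariant
    Err = Bool("Err")                             # error relation (0-ary)
    x, xp = Int("x"), Int("xp")
    fp.register_relation(Inv, Err.decl())
    fp.declare_var(x, xp)

    fp.rule(Inv(x), x == 0)                             # loop entry
    fp.rule(Inv(xp), And(Inv(x), x < 10, xp == x + 1))  # loop body
    fp.rule(Err, And(Inv(x), x >= 10, x != 10))         # assertion violation

    print(fp.query(Err))   # 'unsat' means the assertion can never fail
    ```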

  5. An Ontology-based Architecture for Integration of Clinical Trials Management Applications

    PubMed Central

    Shankar, Ravi D.; Martins, Susana B.; O’Connor, Martin; Parrish, David B.; Das, Amar K.

    2007-01-01

    Management of complex clinical trials involves the coordinated use of a myriad of software applications by trial personnel. The applications typically use distinct knowledge representations and generate enormous amounts of information during the course of a trial. It becomes vital that the applications exchange trial semantics for the efficient management of trials and the subsequent analysis of clinical trial data. Existing model-based frameworks do not address the requirements of semantic integration of heterogeneous applications. We have built an ontology-based architecture to support the interoperation of clinical trial software applications. Central to our approach is a suite of clinical trial ontologies, which we call Epoch, that define the vocabulary and semantics necessary to represent information on clinical trials. We are continuing to demonstrate and validate our approach with different clinical trials management applications and with a growing number of clinical trials. PMID:18693919

  6. A technological infrastructure to sustain Internetworked Enterprises

    NASA Astrophysics Data System (ADS)

    La Mattina, Ernesto; Savarino, Vincenzo; Vicari, Claudia; Storelli, Davide; Bianchini, Devis

    In the Web 3.0 scenario, where information and services are connected by means of their semantics, organizations can improve their competitive advantage by publishing their business and service descriptions. In this scenario, Semantic Peer to Peer (P2P) can play a key role in defining dynamic and highly reconfigurable infrastructures. Organizations can share knowledge and services, using this infrastructure to move towards value networks, an emerging organizational model characterized by fluid boundaries and complex relationships. This chapter collects and defines the technological requirements and architecture of a modular and multi-Layer Peer to Peer infrastructure for SOA-based applications. This technological infrastructure, based on the combination of Semantic Web and P2P technologies, is intended to sustain Internetworked Enterprise configurations, defining a distributed registry and enabling more expressive queries and efficient routing mechanisms. The following sections focus on the overall architecture, while describing the layers that form it.

  7. The Dissociation between Polarity, Semantic Orientation, and Emotional Tone as an Early Indicator of Cognitive Impairment.

    PubMed

    Arias Tapia, Susana A; Martínez-Tomás, Rafael; Gómez, Héctor F; Hernández Del Salto, Víctor; Sánchez Guerrero, Javier; Mocha-Bonilla, J A; Barbosa Corbacho, José; Khan, Azizudin; Chicaiza Redin, Veronica

    2016-01-01

    The present study aims to identify early cognitive impairment through the efficient use of therapies that can improve the quality of daily life and prevent disease progress. We propose a methodology based on the hypothesis that the dissociation between oral semantic expression and the physical expressions, facial gestures, or emotions transmitted in a person's tone of voice is a possible indicator of cognitive impairment. Experiments were carried out with phrases, analyzing the semantics of the message, and the tone of the voice of patients through unstructured interviews in healthy people and patients at an early Alzheimer's stage. The results show that the dissociation in cognitive impairment was an effective indicator, arising from patterns of inconsistency between the analyzed elements. Although the results of our study are encouraging, we believe that further studies are necessary to confirm that this dissociation is a probable indicator of cognitive impairment.

  8. The Dissociation between Polarity, Semantic Orientation, and Emotional Tone as an Early Indicator of Cognitive Impairment

    PubMed Central

    Arias Tapia, Susana A.; Martínez-Tomás, Rafael; Gómez, Héctor F.; Hernández del Salto, Víctor; Sánchez Guerrero, Javier; Mocha-Bonilla, J. A.; Barbosa Corbacho, José; Khan, Azizudin; Chicaiza Redin, Veronica

    2016-01-01

    The present study aims to identify cognitive impairment early, enabling the efficient use of therapies that can improve quality of daily life and prevent disease progress. We propose a methodology based on the hypothesis that dissociation between oral semantic expression and the physical expressions, facial gestures, or emotions transmitted in a person's tone of voice is a possible indicator of cognitive impairment. Experiments were carried out with phrases, analyzing the semantics of the message and the tone of voice of patients through unstructured interviews with healthy people and with patients at an early Alzheimer's stage. The results show that this dissociation was an effective indicator of cognitive impairment, arising from patterns of inconsistency between the analyzed elements. Although the results of our study are encouraging, we believe that further studies are necessary to confirm that this dissociation is a probable indicator of cognitive impairment. PMID:27683555

  9. Automated Predictive Big Data Analytics Using Ontology Based Semantics.

    PubMed

    Nural, Mustafa V; Cotterell, Michael E; Peng, Hao; Xie, Rui; Ma, Ping; Miller, John A

    2015-10-01

    Predictive analytics in the big data era is taking on an increasingly important role. Issues related to the choice of modeling technique, estimation procedure (or algorithm), and efficient execution can present significant challenges. For example, selection of appropriate and optimal models for big data analytics often requires careful investigation and considerable expertise, which might not always be readily available. In this paper, we propose to use semantic technology to assist data analysts and data scientists in selecting appropriate modeling techniques and building specific models, as well as in providing the rationale for the techniques and models selected. To formally describe the modeling techniques, models, and results, we developed the Analytics Ontology, which supports inferencing for semi-automated model selection. The SCALATION framework, which currently supports over thirty modeling techniques for predictive big data analytics, is used as a testbed for evaluating the use of semantic technology.

  10. Automated Predictive Big Data Analytics Using Ontology Based Semantics

    PubMed Central

    Nural, Mustafa V.; Cotterell, Michael E.; Peng, Hao; Xie, Rui; Ma, Ping; Miller, John A.

    2017-01-01

    Predictive analytics in the big data era is taking on an increasingly important role. Issues related to the choice of modeling technique, estimation procedure (or algorithm), and efficient execution can present significant challenges. For example, selection of appropriate and optimal models for big data analytics often requires careful investigation and considerable expertise, which might not always be readily available. In this paper, we propose to use semantic technology to assist data analysts and data scientists in selecting appropriate modeling techniques and building specific models, as well as in providing the rationale for the techniques and models selected. To formally describe the modeling techniques, models, and results, we developed the Analytics Ontology, which supports inferencing for semi-automated model selection. The SCALATION framework, which currently supports over thirty modeling techniques for predictive big data analytics, is used as a testbed for evaluating the use of semantic technology. PMID:29657954

  11. Semantic enrichment of medical forms - semi-automated coding of ODM-elements via web services.

    PubMed

    Breil, Bernhard; Watermann, Andreas; Haas, Peter; Dziuballe, Philipp; Dugas, Martin

    2012-01-01

    Semantic interoperability is an unsolved problem that arises when working with medical forms from different information systems or institutions. Standards like ODM or CDA ensure structural homogenization, but in order to compare elements from different data models it is necessary to use semantic concepts and codes at the item level of those structures. We developed and implemented a web-based tool that enables a domain expert to perform semi-automated coding of ODM files. For each item it is possible to query web services that return unique concept codes without leaving the context of the document. Although fully automated coding was not feasible, we implemented a dialog-based method to perform efficient coding of all data elements in the context of the whole document. The proportion of codable items was comparable to results from previous studies.

  12. Exploiting salient semantic analysis for information retrieval

    NASA Astrophysics Data System (ADS)

    Luo, Jing; Meng, Bo; Quan, Changqin; Tu, Xinhui

    2016-11-01

    Recently, many Wikipedia-based methods have been proposed to improve the performance of different natural language processing (NLP) tasks, such as semantic relatedness computation, text classification and information retrieval. Among these methods, salient semantic analysis (SSA) has been proven to be an effective way to generate conceptual representations for words or documents. However, its feasibility and effectiveness in information retrieval are mostly unknown. In this paper, we study how to use SSA efficiently to improve information retrieval performance, and propose an SSA-based retrieval method under the language model framework. First, the SSA model is adopted to build conceptual representations for documents and queries. Then, these conceptual representations and the bag-of-words (BOW) representations are used in combination to estimate the language models of queries and documents. Experiments on several standard Text REtrieval Conference (TREC) collections show that the proposed models consistently outperform existing Wikipedia-based retrieval methods.
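
    As a concrete illustration of the combination step described above, the following minimal Python sketch interpolates a word-level and a concept-level query likelihood with a single weight. The Dirichlet smoothing, the weight lam=0.7, and all data structures are illustrative assumptions, not the authors' implementation.

      import math
      from collections import Counter

      def query_loglik(query_units, doc_units, collection_probs, mu=2000.0):
          # Dirichlet-smoothed query log-likelihood for one representation
          # (surface words or SSA concepts).
          tf, dlen = Counter(doc_units), len(doc_units)
          loglik = 0.0
          for u in query_units:
              p_coll = collection_probs.get(u, 1e-9)   # background probability
              loglik += math.log((tf[u] + mu * p_coll) / (dlen + mu))
          return loglik

      def combined_score(q_words, q_concepts, d_words, d_concepts,
                         coll_w, coll_c, lam=0.7):
          # Mix word-level and concept-level evidence with weight lam.
          return (lam * query_loglik(q_words, d_words, coll_w)
                  + (1 - lam) * query_loglik(q_concepts, d_concepts, coll_c))

      # Toy usage: one document seen through its word and concept views.
      print(combined_score(["wms"], ["C:map"], ["wms", "layer"], ["C:map"],
                           {"wms": 0.01}, {"C:map": 0.05}))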

  13. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback

    PubMed Central

    Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in its metadata make discovery difficult for human users, and end-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining a Support Vector Machine (SVM) with user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery. PMID:27861505
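
    The feedback loop the abstract describes can be pictured roughly as follows: an SVM ranks candidate layers by decision value, the user labels the top results, and the model is retrained. This is a minimal sketch with invented feature vectors and simulated clicks, not the authors' system.

      import numpy as np
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      layer_features = rng.normal(size=(200, 64))    # 200 candidate WMS layers

      # Seed training set: a few layers the user already labelled (1 = relevant).
      X_train = layer_features[:10]
      y_train = np.array([1, 1, 1, 0, 0, 0, 0, 1, 0, 0])

      for round_ in range(2):                        # relevance-feedback rounds
          clf = SVC(kernel="rbf").fit(X_train, y_train)
          scores = clf.decision_function(layer_features)
          top = np.argsort(scores)[::-1][:5]         # show the 5 best layers
          feedback = (scores[top] > 0).astype(int)   # stand-in for user clicks
          X_train = np.vstack([X_train, layer_features[top]])
          y_train = np.concatenate([y_train, feedback])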

  14. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback.

    PubMed

    Hu, Kai; Gui, Zhipeng; Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in its metadata make discovery difficult for human users, and end-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining a Support Vector Machine (SVM) with user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.

  15. Semantic integration to identify overlapping functional modules in protein interaction networks

    PubMed Central

    Cho, Young-Rae; Hwang, Woochang; Ramanathan, Murali; Zhang, Aidong

    2007-01-01

    Background The systematic analysis of protein-protein interactions can enable a better understanding of cellular organization, processes and functions. Functional modules can be identified from the protein interaction networks derived from experimental data sets. However, these analyses are challenging because of the presence of unreliable interactions and the complex connectivity of the network. The integration of protein-protein interactions with data from other sources can be leveraged for improving the effectiveness of functional module detection algorithms. Results We have developed novel metrics, called semantic similarity and semantic interactivity, which use Gene Ontology (GO) annotations to measure the reliability of protein-protein interactions. The protein interaction networks can be converted into a weighted graph representation by assigning the reliability values to each interaction as a weight. We present a flow-based modularization algorithm to efficiently identify overlapping modules in the weighted interaction networks. The experimental results show that the semantic similarity and semantic interactivity of interacting pairs were positively correlated with functional co-occurrence. The effectiveness of the algorithm for identifying modules was evaluated using functional categories from the MIPS database. We demonstrated that our algorithm had higher accuracy compared to other competing approaches. Conclusion The integration of protein interaction networks with GO annotation data and the capability of detecting overlapping modules substantially improve the accuracy of module identification. PMID:17650343
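
    One simple way to picture GO-based edge weighting is a term-set overlap score, as in the sketch below; the Jaccard measure and toy annotations are stand-ins for the paper's semantic similarity and semantic interactivity metrics.

      # Weight protein-protein interaction edges by GO-annotation overlap.
      go_annotations = {
          "P1": {"GO:0006397", "GO:0008380"},
          "P2": {"GO:0008380", "GO:0000398"},
          "P3": {"GO:0006412"},
      }

      def edge_weight(a, b):
          # Jaccard similarity of the two proteins' GO term sets.
          ta, tb = go_annotations[a], go_annotations[b]
          return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

      ppi_edges = [("P1", "P2"), ("P2", "P3")]
      weighted = {(a, b): edge_weight(a, b) for a, b in ppi_edges}
      print(weighted)   # {('P1', 'P2'): 0.333..., ('P2', 'P3'): 0.0}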

  16. Natural-Language Parser for PBEM

    NASA Technical Reports Server (NTRS)

    James, Mark

    2010-01-01

    A computer program called "Hunter" accepts, as input, a colloquial-English description of a set of policy-based-management rules, and parses that description into a form useable by policy-based enterprise management (PBEM) software. PBEM is a rules-based approach suitable for automating some management tasks. PBEM simplifies the management of a given enterprise through establishment of policies addressing situations that are likely to occur. Hunter was developed to have a unique capability to extract the intended meaning instead of focusing on parsing the exact ways in which individual words are used.

  17. Translation lexicon acquisition from bilingual dictionaries

    NASA Astrophysics Data System (ADS)

    Doermann, David S.; Ma, Huanfeng; Karagol-Ayan, Burcu; Oard, Douglas W.

    2001-12-01

    Bilingual dictionaries hold great potential as a source of lexical resources for training automated systems for optical character recognition, machine translation and cross-language information retrieval. In this work we describe a system for extracting term lexicons from printed copies of bilingual dictionaries. We describe our approach to page and definition segmentation and entry parsing. We have used the approach to parse a number of dictionaries, and we demonstrate the results for retrieval by using a French-English dictionary to generate a translation lexicon and a corpus of English queries applied to French documents to evaluate cross-language IR.

  18. PyParse: a semiautomated system for scoring spoken recall data.

    PubMed

    Solway, Alec; Geller, Aaron S; Sederberg, Per B; Kahana, Michael J

    2010-02-01

    Studies of human memory often generate data on the sequence and timing of recalled items, but scoring such data using conventional methods is difficult or impossible. We describe a Python-based semiautomated system that greatly simplifies this task. This software, called PyParse, can easily be used in conjunction with many common experiment authoring systems. Scored data is output in a simple ASCII format and can be accessed with the programming language of choice, allowing for the identification of features such as correct responses, prior-list intrusions, extra-list intrusions, and repetitions.

  19. Distributed Coordination of Heterogeneous Agents Using a Semantic Overlay Network and a Goal-Directed Graphplan Planner

    PubMed Central

    Lopes, António Luís; Botelho, Luís Miguel

    2013-01-01

    In this paper, we describe a distributed coordination system that allows agents to seamlessly cooperate in problem solving by partially contributing to a problem solution and delegating the subproblems for which they do not have the required skills or knowledge to appropriate agents. The coordination mechanism relies on a dynamically built semantic overlay network that allows the agents to efficiently locate, even in very large unstructured networks, the necessary skills for a specific problem. Each agent performs partial contributions to the problem solution using a new distributed goal-directed version of the Graphplan algorithm. This new goal-directed version of the original Graphplan algorithm provides an efficient solution to the problem of "distraction", which most forward-chaining algorithms suffer from. We also discuss a set of heuristics to be used in the backward-search process of the planning algorithm in order to distribute this process amongst idle agents in an attempt to find a solution in less time. The evaluation results show that our approach is effective in building a scalable and efficient agent society capable of solving complex distributable problems. PMID:23704885

  20. Don’t speak too fast! Processing of fast rate speech in children with specific language impairment

    PubMed Central

    Bedoin, Nathalie; Krifi-Papoz, Sonia; Herbillon, Vania; Caillot-Bascoul, Aurélia; Gonzalez-Monge, Sibylle; Boulenger, Véronique

    2018-01-01

    Background Perception of speech rhythm requires the auditory system to track temporal envelope fluctuations, which carry syllabic and stress information. Reduced sensitivity to rhythmic acoustic cues has been evidenced in children with Specific Language Impairment (SLI), impeding syllabic parsing and speech decoding. Our study investigated whether these children experience specific difficulties processing fast rate speech as compared with typically developing (TD) children. Method Sixteen French children with SLI (8–13 years old) with mainly expressive phonological disorders and with preserved comprehension and 16 age-matched TD children performed a judgment task on sentences produced 1) at normal rate, 2) at fast rate or 3) time-compressed. The sensitivity index (d′) to semantically incongruent sentence-final words was measured. Results Overall, children with SLI performed significantly worse than TD children. Importantly, as revealed by the significant Group × Speech Rate interaction, children with SLI find it more challenging than TD children to process both naturally and artificially accelerated speech. The two groups do not significantly differ in normal rate speech processing. Conclusion In agreement with rhythm-processing deficits in atypical language development, our results suggest that children with SLI face difficulties adjusting to rapid speech rate. These findings are interpreted in light of temporal sampling and prosodic phrasing frameworks and of oscillatory mechanisms underlying speech perception. PMID:29373610
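
    The sensitivity index reported here has a standard signal-detection definition, d' = z(hit rate) - z(false-alarm rate). A minimal computation, with an assumed correction for extreme rates and invented counts, might look like this:

      from scipy.stats import norm

      def d_prime(hits, misses, false_alarms, correct_rejections):
          n_signal = hits + misses
          n_noise = false_alarms + correct_rejections
          # Nudge rates of exactly 0 or 1 inward (a common correction).
          hr = min(max(hits / n_signal, 0.5 / n_signal), 1 - 0.5 / n_signal)
          far = min(max(false_alarms / n_noise, 0.5 / n_noise), 1 - 0.5 / n_noise)
          return norm.ppf(hr) - norm.ppf(far)

      # E.g., detecting incongruent sentence-final words: d' ~= 2.12
      print(d_prime(hits=18, misses=2, false_alarms=4, correct_rejections=16))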

  1. Pedoinformatics Approach to Soil Text Analytics

    NASA Astrophysics Data System (ADS)

    Furey, J.; Seiter, J.; Davis, A.

    2017-12-01

    The several extant schemata for the classification of soils rely on differing criteria, but the major soil science taxonomies, including the United States Department of Agriculture (USDA) and the international harmonized World Reference Base for Soil Resources systems, are based principally on inferred pedogenic properties. These taxonomies largely result from compiled individual observations of soil morphologies within soil profiles, and the vast majority of this pedologic information is contained in qualitative text descriptions. We present text mining analyses of hundreds of gigabytes of parsed text and other data in the digitally available USDA soil taxonomy documentation, the Soil Survey Geographic (SSURGO) database, and the National Cooperative Soil Survey (NCSS) soil characterization database. These analyses implemented IPython calls to Gensim modules for topic modelling, with latent semantic indexing applied down to paragraphs at the lowest taxon level (soil series). Via a custom extension of the Natural Language Toolkit (NLTK), approximately one percent of the USDA soil series descriptions were used to train a classifier for the remainder of the documents, essentially by treating soil science words as comprising a novel language. While location-specific descriptors at the soil series level are amenable to geomatics methods, unsupervised clustering of the occurrence of other soil science words did not closely follow the usual hierarchy of soil taxa. We present preliminary phrasal analyses that may account for some of these effects.
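
    For readers unfamiliar with the toolchain, a minimal Gensim latent-semantic-indexing pipeline of the kind the abstract names might look like the sketch below; the three "documents" are invented stand-ins for SSURGO soil-series paragraphs, not project code.

      from gensim import corpora, models, similarities

      docs = [
          "fine loamy mixed mesic typic hapludalfs on uplands".split(),
          "sandy skeletal mixed frigid typic cryorthents on moraines".split(),
          "fine loamy mixed mesic aquic fragiudalfs on till plains".split(),
      ]
      dictionary = corpora.Dictionary(docs)             # word <-> id mapping
      corpus = [dictionary.doc2bow(d) for d in docs]    # bag-of-words vectors
      lsi = models.LsiModel(corpus, id2word=dictionary, num_topics=2)
      index = similarities.MatrixSimilarity(lsi[corpus])

      query = dictionary.doc2bow("loamy mesic soils on till plains".split())
      print(index[lsi[query]])   # cosine similarity of the query to each series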

  2. Production and comprehension show divergent constituent order preferences: Evidence from elicited pantomime

    PubMed Central

    Hall, Matthew L.; Ahn, Y. Danbi; Mayberry, Rachel I.; Ferreira, Victor S.

    2015-01-01

    All natural languages develop devices to communicate who did what to whom. Elicited pantomime provides one model for studying this process, by providing a window into how humans (hearing non-signers) behave in a natural communicative modality (silent gesture) without established conventions from a grammar. Most studies in this paradigm focus on production, although they sometimes make assumptions about how comprehenders would likely behave. Here, we directly assess how naïve speakers of English (Experiments 1 & 2), Korean (Experiment 1), and Turkish (Experiment 2) comprehend pantomimed descriptions of transitive events, which are either semantically reversible (Experiments 1 & 2) or not (Experiment 2). Contrary to previous assumptions, we find no evidence that Person-Person-Action sequences are ambiguous to comprehenders, who simply adopt an agent-first parsing heuristic for all constituent orders. We do find that Person-Action-Person sequences yield the most consistent interpretations, even in native speakers of SOV languages. The full range of behavior in both production and comprehension provides counter-evidence to the notion that producers’ utterances are motivated by the needs of comprehenders. Instead, we argue that production and comprehension are subject to different sets of cognitive pressures, and that the dynamic interaction between these competing pressures can help explain synchronic and diachronic constituent order phenomena in natural human languages, both signed and spoken. PMID:25642018

  3. Morphosemantic parsing of medical compound words: transferring a French analyzer to English.

    PubMed

    Deléger, Louise; Namer, Fiammetta; Zweigenbaum, Pierre

    2009-04-01

    Medical language, like many technical languages, is rich in morphologically complex words, many of which take their roots from Greek and Latin--in which case they are called neoclassical compounds. Morphosemantic analysis can help generate definitions of such words. The structural similarity of those compounds across several European languages has also been observed, which suggests that the same linguistic analysis could be applied to neoclassical compounds from different languages with minor modifications. This paper reports work on the adaptation of a morphosemantic analyzer dedicated to French (DériF) to analyze English medical neoclassical compounds. It presents the principles of this transposition and its current performance. The analyzer was tested on a set of 1299 compounds extracted from the WHO-ART terminology. 859 could be decomposed and defined, 675 of them successfully. An advantage of this process is that complex linguistic analyses designed for French could be successfully transposed to the analysis of English medical neoclassical compounds, which confirms our hypothesis of transferability. The fact that the method was successfully applied to a Germanic language such as English suggests that performance would be at least as high for Romance languages such as Spanish. Finally, the resulting system can produce more complete analyses of English medical compounds than existing systems, including a hierarchical decomposition and semantic gloss of each word.
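
    A toy illustration of neoclassical decomposition (not DériF's actual lexicon or algorithm) is a greedy split of a term into known Greek/Latin morphs followed by a gloss:

      # Tiny illustrative morph table; real analyzers use large lexicons.
      MORPHS = {
          "gastr": "stomach", "cardi": "heart", "my": "muscle",
          "algia": "pain in", "itis": "inflammation of", "o": None,  # linker
      }

      def decompose(word, parts=()):
          # Recursively consume the word with the longest matching morph.
          if not word:
              return parts
          for m in sorted(MORPHS, key=len, reverse=True):
              if word.startswith(m):
                  rest = decompose(word[len(m):], parts + (m,))
                  if rest is not None:
                      return rest
          return None

      parts = decompose("gastralgia")
      print(parts)                              # ('gastr', 'algia')
      gloss = [MORPHS[p] for p in parts if MORPHS[p]]
      print(f"{gloss[-1]} the {gloss[0]}")      # 'pain in the stomach'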

  4. Research issues of geometry-based visual languages and some solutions

    NASA Astrophysics Data System (ADS)

    Green, Thorn G.

    This dissertation addresses the problem of how to design visual language systems that are based upon Geometric Algebra and provide a visual coupling of algebraic expressions and geometric depictions. This coupling of algebraic expressions and geometric depictions provides a new means for expressing both mathematical and geometric relationships present in mathematics, physics, and Computer-Aided Geometric Design (CAGD). Another significant feature of such a system is that the result of changing a parameter (by dragging the mouse) can be seen immediately in the depiction(s) of all expressions that use that parameter, which greatly aids the cognition of the relationships between variables. Systems for representing such a coupling of algebra and geometry have characteristics of both visual language systems and systems for scientific visualization. Instead of using a parsing or dataflow paradigm for the visual language representation, these systems represent equations as manipulable constrained diagrams for their visualization. The design of such a system therefore requires (but is not limited to): a means for parsing equations entered by the user; a scheme for producing a visual representation of these equations; techniques for maintaining the coupling between the expressions entered and the diagrams displayed; algorithms for maintaining the consistency of the diagrams; and indexing capabilities efficient enough to allow diagrams to be created and manipulated in a short enough period of time. The author proposes solutions for how such a design can be realized.

  5. Some further clarifications on age-related differences in Stroop interference.

    PubMed

    Augustinova, Maria; Clarys, David; Spatola, Nicolas; Ferrand, Ludovic

    2018-04-01

    Both the locus and the processes underlying age-related differences in Stroop interference are usually inferred from changes in the magnitude of standard (i.e., overall) Stroop interference. Therefore, this study addressed these still-open issues directly. To this end, a sample of younger (18-26 years old) and healthy older (72-97 years old) adults was administered the semantic Stroop paradigm (which assesses the relative contribution of semantic conflict compared to response conflict, both of which contribute to overall Stroop interference) combined with a single-letter coloring and cuing (SLCC) procedure. Independently of an increased attentional focus on the relevant color dimension of Stroop words induced by SLCC (as compared to all letters colored and cued, ALCC), greater magnitudes of standard Stroop interference were observed in older (as compared to younger) adults. These differences were due to greater magnitudes of response conflict, whereas magnitudes of semantic conflict remained significant and unchanged by healthy aging and SLCC. Thus, this direct evidence places the locus of age-related differences in Stroop interference at the level of response conflict (as opposed to semantic and/or both conflicts). In terms of the processes underlying these differences, the reported evidence shows that both age groups are equally (in)efficient in (a) focusing on the relevant color dimension and (b) suppressing the meaning of the irrelevant word dimension of Stroop words. Healthy older adults are simply less efficient in suppressing the (pre-)response activity primed by the fully processed meaning of the irrelevant word dimension. Standard interpretations of age-related differences in Stroop interference and the more general issue of how attentional selectivity actually operates in the Stroop task are therefore reconsidered in this paper.

  6. Treatment of category generation and retrieval in aphasia: Effect of typicality of category items.

    PubMed Central

    Kiran, Swathi; Sandberg, Chaleece; Sebastian, Rajani

    2011-01-01

    Purpose: Kiran and colleagues (Kiran, 2007, 2008; Kiran & Johnson, 2008; Kiran & Thompson, 2003) have previously suggested that training atypical examples within a semantic category is a more efficient treatment approach for facilitating generalization within the category than training typical examples. The present study extended our previous work by examining the notion of semantic complexity within goal-derived (ad hoc) categories in individuals with aphasia. Methods: Six individuals with fluent aphasia (range = 39-84 years) and varying degrees of naming deficits and semantic impairments were involved. Thirty typical and thirty atypical items from each of two categories were selected after an extensive stimulus norming task. Generative naming for the two categories was tested during baseline and treatment. Results: As predicted, training atypical examples in the category resulted in generalization to untrained typical examples in five out of the five patient-treatment conditions. In contrast, training typical examples (which was examined in three conditions) produced mixed results: one patient showed generalization to untrained atypical examples, whereas two patients did not. Conclusions: Results of the present study supplement our existing data on the effect of a semantically based treatment for lexical retrieval by manipulating the typicality of category exemplars. PMID:21173393

  7. Item-specific and generalization effects on brain activation when learning Chinese characters

    PubMed Central

    Deng, Yuan; Booth, James R.; Chou, Tai-Li; Ding, Guo-Sheng; Peng, Dan-Ling

    2009-01-01

    Neural changes related to learning of the meaning of Chinese characters in English speakers were examined using functional magnetic resonance imaging (fMRI). We examined item specific learning effects for trained characters, but also the generalization of semantic knowledge to novel transfer characters that shared a semantic radical (part of a character that gives a clue to word meaning, e.g. water for lake) with trained characters. Behavioral results show that acquired semantic knowledge improves performance for both trained and transfer characters. Neuroimaging results show that the left fusiform gyrus plays a central role in the visual processing of orthographic information in characters. The left superior parietal cortex seems to play a crucial role in learning the visual–spatial aspects of the characters because it shows learning related decreases for trained characters, is correlated with behavioral improvement from early to late in learning for the trained characters, and is correlated with better long-term retention for the transfer characters. The inferior frontal gyrus seems to be associated with the efficiency of retrieving and manipulating semantic representations because there are learning related decreases for trained characters and this decrease is correlated with greater behavioral improvement from early to late in learning. PMID:18514678

  8. The next generation of similarity measures that fully explore the semantics in biomedical ontologies.

    PubMed

    Couto, Francisco M; Pinto, H Sofia

    2013-10-01

    There is a prominent trend to augment and improve the formality of biomedical ontologies, as shown, for example, by the current effort to add description logic axioms such as disjointness. One of the key ontology applications that can take advantage of this effort is conceptual (functional) similarity measurement. The presence of description logic axioms in biomedical ontologies makes the current structural or extensional approaches weaker and further away from providing sound semantics-based similarity measures. Although beneficial in small ontologies, the exploration of description logic axioms by semantics-based similarity measures is computationally expensive. This limitation is critical for biomedical ontologies, which normally contain thousands of concepts. Thus, in the process of gaining their rightful place, biomedical functional similarity measures have to take the journey of finding how this rich and powerful knowledge can be fully explored while keeping computational costs feasible. This manuscript aims at promoting and guiding the development of compelling tools that deliver what the biomedical community will require in the near future: a next generation of biomedical similarity measures that efficiently and fully explore the semantics present in biomedical ontologies.

  9. Word add-in for ontology recognition: semantic enrichment of scientific literature

    PubMed Central

    2010-01-01

    Background In the current era of scientific research, efficient communication of information is paramount. As such, the nature of scholarly and scientific communication is changing; cyberinfrastructure is now absolutely necessary and new media are allowing information and knowledge to be more interactive and immediate. One approach to making knowledge more accessible is the addition of machine-readable semantic data to scholarly articles. Results The Word add-in presented here will assist authors in this effort by automatically recognizing and highlighting words or phrases that are likely information-rich, allowing authors to associate semantic data with those words or phrases, and to embed that data in the document as XML. The add-in and source code are publicly available at http://www.codeplex.com/UCSDBioLit. Conclusions The Word add-in for ontology term recognition makes it possible for an author to add semantic data to a document as it is being written and it encodes these data using XML tags that are effectively a standard in life sciences literature. Allowing authors to mark-up their own work will help increase the amount and quality of machine-readable literature metadata. PMID:20181245

  10. The Unification Space implemented as a localist neural net: predictions and error-tolerance in a constraint-based parser.

    PubMed

    Vosse, Theo; Kempen, Gerard

    2009-12-01

    We introduce a novel computer implementation of the Unification-Space parser (Vosse and Kempen in Cognition 75:105-143, 2000) in the form of a localist neural network whose dynamics is based on interactive activation and inhibition. The wiring of the network is determined by Performance Grammar (Kempen and Harbusch in Verb constructions in German and Dutch. Benjamins, Amsterdam, 2003), a lexicalist formalism with feature unification as binding operation. While the network is processing input word strings incrementally, the evolving shape of parse trees is represented in the form of changing patterns of activation in nodes that code for syntactic properties of words and phrases, and for the grammatical functions they fulfill. The system is capable, at least qualitatively and rudimentarily, of simulating several important dynamic aspects of human syntactic parsing, including garden-path phenomena and reanalysis, effects of complexity (various types of clause embeddings), fault-tolerance in case of unification failures and unknown words, and predictive parsing (expectation-based analysis, surprisal effects). English is the target language of the parser described.
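
    The interactive activation and inhibition dynamics can be sketched as mutually inhibiting candidate nodes updated until one wins; all weights, rates, and node meanings below are illustrative assumptions, not the published parameters of the Unification-Space implementation.

      import numpy as np

      a = np.array([0.55, 0.45])        # two competing attachment nodes
      support = np.array([0.6, 0.5])    # bottom-up lexical/syntactic support
      inhibition = 0.8                  # strength of mutual inhibition

      for step in range(50):
          net = support - inhibition * a[::-1]      # each rival suppresses the other
          # Standard interactive-activation update: growth toward 1 when net
          # input is positive, decay toward 0 when it is negative.
          delta = np.where(net > 0, net * (1 - a), net * a)
          a = np.clip(a + 0.1 * delta, 0.0, 1.0)

      print(a)   # the better-supported candidate approaches 1; the rival decays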

  11. Representations of the language recognition problem for a theorem prover

    NASA Technical Reports Server (NTRS)

    Minker, J.; Vanderbrug, G. J.

    1972-01-01

    Two representations of the language recognition problem for a theorem prover in first order logic are presented and contrasted. One of the representations is based on the familiar method of generating sentential forms of the language, and the other is based on the Cocke parsing algorithm. An augmented theorem prover is described which permits recognition of recursive languages. The state-transformation method developed by Cordell Green to construct problem solutions in resolution-based systems can be used to obtain the parse tree. In particular, the end-order traversal of the parse tree is derived in one of the representations. An inference system, termed the cycle inference system, is defined which makes it possible for the theorem prover to model the method on which the representation is based. The general applicability of the cycle inference system to state space problems is discussed. Given an unsatisfiable set S, where each clause has at most one positive literal, it is shown that there exists an input proof. The clauses for the two representations satisfy these conditions, as do many state space problems.
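
    The Cocke parsing algorithm underlying the second representation is what is now usually called CYK recognition; a minimal version over a toy grammar in Chomsky normal form is sketched below.

      from itertools import product

      RULES = {
          "S": [("NP", "VP")], "NP": [("Det", "N")], "VP": [("V", "NP")],
          "Det": [("the",)], "N": [("dog",), ("cat",)], "V": [("saw",)],
      }

      def cyk(words):
          n = len(words)
          # table[i][span] = set of nonterminals deriving words[i:i+span]
          table = [[set() for _ in range(n + 1)] for _ in range(n)]
          for i, w in enumerate(words):                      # length-1 spans
              table[i][1] = {A for A, rhss in RULES.items() if (w,) in rhss}
          for span, i in product(range(2, n + 1), range(n)):
              if i + span > n:
                  continue
              for split in range(1, span):
                  for A, rhss in RULES.items():
                      for rhs in rhss:
                          if (len(rhs) == 2 and rhs[0] in table[i][split]
                                  and rhs[1] in table[i + split][span - split]):
                              table[i][span].add(A)
          return "S" in table[0][n]

      print(cyk("the dog saw the cat".split()))   # True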

  12. Industrial application of semantic process mining

    NASA Astrophysics Data System (ADS)

    Espen Ingvaldsen, Jon; Atle Gulla, Jon

    2012-05-01

    Process mining relates to the extraction of non-trivial and useful information from information system event logs. It is a new research discipline that has evolved significantly since the early work on idealistic process logs. Over the last years, process mining prototypes have incorporated elements from semantics and data mining and targeted visualisation techniques that are more user-friendly to business experts and process owners. In this article, we present a framework for evaluating different aspects of enterprise process flows and address practical challenges of state-of-the-art industrial process mining. We also explore the inherent strengths of the technology for more efficient process optimisation.

  13. Recognition of a person named entity from the text written in a natural language

    NASA Astrophysics Data System (ADS)

    Dolbin, A. V.; Rozaliev, V. L.; Orlova, Y. A.

    2017-01-01

    This work is devoted to the semantic analysis of texts written in a natural language. The main goal of the research was to compare latent Dirichlet allocation and latent semantic analysis for identifying elements of human appearance in text. The completeness of information retrieval was chosen as the efficiency criterion for comparing the methods. However, choosing only one method was insufficient for achieving high recognition rates, so additional methods were used for finding references to the personality in the text. All these methods are based on the created information model, which represents a person's appearance.

  14. Effect of episodic and working memory impairments on semantic and cognitive procedural learning at alcohol treatment entry.

    PubMed

    Pitel, Anne Lise; Witkowski, Thomas; Vabret, François; Guillery-Girard, Bérengère; Desgranges, Béatrice; Eustache, Francis; Beaunieux, Hélène

    2007-02-01

    Chronic alcoholism is known to impair the functioning of episodic and working memory, which may consequently reduce the ability to learn complex novel information. Nevertheless, semantic and cognitive procedural learning have not been properly explored at alcohol treatment entry, despite their potential clinical relevance. The goal of the present study was therefore to determine whether alcoholic patients, immediately after the weaning phase, are cognitively able to acquire complex new knowledge, given their episodic and working memory deficits. Twenty alcoholic inpatients with episodic memory and working memory deficits at alcohol treatment entry and a control group of 20 healthy subjects underwent a protocol of semantic acquisition and cognitive procedural learning. The semantic learning task consisted of the acquisition of 10 novel concepts, while subjects were administered the Tower of Toronto task to measure cognitive procedural learning. Analyses showed that although alcoholic subjects were able to acquire the category and features of the semantic concepts, albeit slowly, they presented impaired label learning. In the control group, executive functions and episodic memory predicted semantic learning in the first and second halves of the protocol, respectively. In addition to the cognitive processes involved in the learning strategies invoked by controls, alcoholic subjects seem to attempt to compensate for their impaired cognitive functions by invoking capacities of short-term passive storage. Regarding cognitive procedural learning, although the patients eventually achieved the same results as the controls, they failed to automate the procedure. Contrary to the control group, the alcoholic group's learning performance was predicted by controlled cognitive functions throughout the protocol. At alcohol treatment entry, alcoholic patients with neuropsychological deficits have difficulty acquiring novel semantic and cognitive procedural knowledge. Compared with controls, they seem to use more costly learning strategies, which are nonetheless less efficient. These learning disabilities need to be considered when treatment requiring the acquisition of complex novel information is envisaged.

  15. A Computational Linguistic Measure of Clustering Behavior on Semantic Verbal Fluency Task Predicts Risk of Future Dementia in the Nun Study

    PubMed Central

    Pakhomov, Serguei V.S.; Hemmy, Laura S.

    2014-01-01

    Generative semantic verbal fluency (SVF) tests show early and disproportionate decline relative to other abilities in individuals developing Alzheimer’s disease. Optimal performance on SVF tests depends on the efficiency of using clustered organization of semantically related items and the ability to switch between clusters. Traditional approaches to clustering and switching have relied on manual determination of clusters. We evaluated a novel automated computational linguistic approach for quantifying clustering behavior. Our approach is based on Latent Semantic Analysis (LSA) for computing the strength of semantic relatedness between pairs of words produced in response to the SVF test. The mean size of semantic clusters (MCS) and semantic chains (MChS) are calculated based on pairwise relatedness values between words. We evaluated the predictive validity of these measures on a set of 239 participants in the Nun Study, a longitudinal study of aging. All were cognitively intact at baseline assessment, measured with the CERAD battery, and were followed in 18-month waves for up to 20 years. The onset of either dementia or memory impairment was used as the outcome in Cox proportional hazards models adjusted for age and education and censored at follow-up waves 5 (6.3 years) and 13 (16.96 years). Higher MCS was associated with a 38% reduction in dementia risk at wave 5 and a 26% reduction at wave 13, but not with the onset of memory impairment. Higher (+1 SD) MChS was associated with a 39% dementia risk reduction at wave 5 but not wave 13, and the association with memory impairment was not significant. Higher traditional SVF scores were associated with 22–29% reductions in memory impairment risk and 35–40% reductions in dementia risk. SVF scores were not correlated with either MCS or MChS. Our study suggests that an automated approach to measuring clustering behavior can be used to estimate dementia risk in cognitively normal individuals. PMID:23845236
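
    A threshold-based reading of the clustering measure can be sketched as follows: group consecutive responses whose pairwise relatedness exceeds a cutoff and average the group sizes. The relatedness function and cutoff below are illustrative stand-ins, not the paper's calibrated LSA cosines or exact MCS/MChS definitions.

      def mean_cluster_size(words, relatedness, cutoff=0.3):
          # Grow a cluster while consecutive words are related; otherwise
          # close it and start a new one, then average the cluster sizes.
          sizes, current = [], 1
          for w1, w2 in zip(words, words[1:]):
              if relatedness(w1, w2) >= cutoff:
                  current += 1
              else:
                  sizes.append(current)
                  current = 1
          sizes.append(current)
          return sum(sizes) / len(sizes)

      toy_cosine = lambda a, b: 0.8 if a[0] == b[0] else 0.1  # LSA stand-in
      print(mean_cluster_size(["cat", "cow", "hawk", "hen"], toy_cosine))  # 2.0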

  16. A computational linguistic measure of clustering behavior on semantic verbal fluency task predicts risk of future dementia in the nun study.

    PubMed

    Pakhomov, Serguei V S; Hemmy, Laura S

    2014-06-01

    Generative semantic verbal fluency (SVF) tests show early and disproportionate decline relative to other abilities in individuals developing Alzheimer's disease. Optimal performance on SVF tests depends on the efficiency of using clustered organization of semantically related items and the ability to switch between clusters. Traditional approaches to clustering and switching have relied on manual determination of clusters. We evaluated a novel automated computational linguistic approach for quantifying clustering behavior. Our approach is based on Latent Semantic Analysis (LSA) for computing the strength of semantic relatedness between pairs of words produced in response to the SVF test. The mean size of semantic clusters (MCS) and semantic chains (MChS) are calculated based on pairwise relatedness values between words. We evaluated the predictive validity of these measures on a set of 239 participants in the Nun Study, a longitudinal study of aging. All were cognitively intact at baseline assessment, measured with the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) battery, and were followed in 18-month waves for up to 20 years. The onset of either dementia or memory impairment was used as the outcome in Cox proportional hazards models adjusted for age and education and censored at follow-up waves 5 (6.3 years) and 13 (16.96 years). Higher MCS was associated with a 38% reduction in dementia risk at wave 5 and a 26% reduction at wave 13, but not with the onset of memory impairment. Higher [+1 standard deviation (SD)] MChS was associated with a 39% dementia risk reduction at wave 5 but not wave 13, and the association with memory impairment was not significant. Higher traditional SVF scores were associated with 22-29% reductions in memory impairment risk and 35-40% reductions in dementia risk. SVF scores were not correlated with either MCS or MChS. Our study suggests that an automated approach to measuring clustering behavior can be used to estimate dementia risk in cognitively normal individuals. Copyright © 2013 Elsevier Ltd. All rights reserved.

  17. The anterior-ventrolateral temporal lobe contributes to boosting visual working memory capacity for items carrying semantic information.

    PubMed

    Chiou, Rocco; Lambon Ralph, Matthew A

    2018-04-01

    Working memory (WM) is a buffer that temporarily maintains information, be it visual or auditory, in an active state, caching its contents for online rehearsal or manipulation. How the brain enables long-term semantic knowledge to affect the WM buffer is a theoretically significant issue awaiting further investigation. In the present study, we capitalise on the knowledge about famous individuals as a 'test-case' to study how it impinges upon WM capacity for human faces and its neural substrate. Using continuous theta-burst transcranial stimulation combined with a psychophysical task probing WM storage for varying contents, we provide compelling evidence that (1) faces (regardless of familiarity) continued to accrue in the WM buffer with longer encoding time, whereas for meaningless stimuli (colour shades) there was little increment; (2) the rate of WM accrual was significantly more efficient for famous faces, compared to unknown faces; (3) the right anterior-ventrolateral temporal lobe (ATL) causally mediated this superior WM storage for famous faces. Specifically, disrupting the ATL (a region tuned to semantic knowledge including person identity) selectively hinders WM accrual for celebrity faces while leaving the accrual for unfamiliar faces intact. Further, this 'semantically-accelerated' storage is impervious to disruption of the right middle frontal gyrus and vertex, supporting the specific and causative contribution of the right ATL. Our finding advances the understanding of the neural architecture of WM, demonstrating that it depends on interaction with long-term semantic knowledge underpinned by the ATL, which causally expands the WM buffer when visual content carries semantic information. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  18. Higher Language Ability is Related to Angular Gyrus Activation Increase During Semantic Processing, Independent of Sentence Incongruency.

    PubMed

    Van Ettinger-Veenstra, Helene; McAllister, Anita; Lundberg, Peter; Karlsson, Thomas; Engström, Maria

    2016-01-01

    This study investigates the relation between individual language ability and neural semantic processing abilities. Our aim was to explore whether high-level language ability would correlate with decreased activation in language-specific regions or rather with increased activation in supporting language regions during the processing of sentences. Moreover, we were interested in whether the observed neural activation patterns are modulated by semantic incongruency similarly to previously observed changes upon syntactic congruency modulation. We investigated 27 healthy adults with a sentence reading task, which tapped language comprehension and inference and modulated sentence congruency, employing functional magnetic resonance imaging (fMRI). We assessed the relation between neural activation, congruency modulation, and test performance on a high-level language ability assessment with multiple regression analysis. Our results showed increased activation in the left-hemispheric angular gyrus extending to the temporal lobe related to high language ability. This effect was independent of semantic congruency, and no significant relation between language ability and incongruency modulation was observed. Furthermore, there was a significant increase of activation in the inferior frontal gyrus (IFG) bilaterally when the sentences were incongruent, indicating that processing incongruent sentences was more demanding than processing congruent sentences and required increased activation in language regions. The correlation of high-level language ability with increased rather than decreased activation in the left angular gyrus, a region specific for language processing, is opposed to what the neural efficiency hypothesis would predict. We conclude that no evidence is found for an interaction between semantic-congruency-related brain activation and high-level language performance, even though the semantically incongruent condition proved more demanding and evoked more neural activation.

  19. Closed-Loop Lifecycle Management of Service and Product in the Internet of Things: Semantic Framework for Knowledge Integration.

    PubMed

    Yoo, Min-Jung; Grozel, Clément; Kiritsis, Dimitris

    2016-07-08

    This paper describes our conceptual framework of closed-loop lifecycle information sharing for product-service in the Internet of Things (IoT). The framework is based on the ontology model of product-service and a type of IoT message standard, Open Messaging Interface (O-MI) and Open Data Format (O-DF), which ensures data communication. (1) BACKGROUND: Based on an existing product lifecycle management (PLM) methodology, we enhanced the ontology model to efficiently integrate the newly developed product-service ontology model; (2) METHODS: The IoT message transfer layer is vertically integrated into a semantic knowledge framework inside which a Semantic Info-Node Agent (SINA) uses the message format as a common protocol for product-service lifecycle data transfer; (3) RESULTS: The product-service ontology model facilitates information retrieval and knowledge extraction during the product lifecycle, while making more information available for the sake of service business creation. The vertical integration of IoT message transfer, encompassing all semantic layers, helps achieve a more flexible and modular approach to knowledge sharing in an IoT environment; (4) CONTRIBUTION: Semantic data annotation applied to the IoT can enrich the types of data collected, which enables richer knowledge extraction. The ontology-based PLM model also enables the horizontal integration of heterogeneous PLM data, breaking traditional vertical information silos; (5) CONCLUSION: The framework was applied to a fictive case study with an electric car service for demonstration. To demonstrate the feasibility of the approach, the semantic model is implemented with Sesame APIs, which play the role of an Internet-connected Resource Description Framework (RDF) database.
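
    To make the knowledge-integration idea concrete, the sketch below stores and queries a lifecycle triple with Python's rdflib, standing in for the Sesame repository used in the paper; the namespace and property names are hypothetical.

      from rdflib import Graph, Literal, Namespace, RDF

      EX = Namespace("http://example.org/pls#")   # hypothetical vocabulary
      g = Graph()
      g.bind("ex", EX)

      car = EX["ElectricCar42"]
      g.add((car, RDF.type, EX.ProductServiceInstance))
      g.add((car, EX.batteryLevel, Literal(0.83)))     # e.g., an O-MI/O-DF reading
      g.add((car, EX.lastServiceKm, Literal(41250)))

      # Query the lifecycle data back out, as a SINA-style agent might.
      for s, p, o in g.triples((car, None, None)):
          print(p.n3(g.namespace_manager), o)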

  20. Closed-Loop Lifecycle Management of Service and Product in the Internet of Things: Semantic Framework for Knowledge Integration

    PubMed Central

    Yoo, Min-Jung; Grozel, Clément; Kiritsis, Dimitris

    2016-01-01

    This paper describes our conceptual framework of closed-loop lifecycle information sharing for product-service in the Internet of Things (IoT). The framework is based on the ontology model of product-service and a type of IoT message standard, Open Messaging Interface (O-MI) and Open Data Format (O-DF), which ensures data communication. (1) Background: Based on an existing product lifecycle management (PLM) methodology, we enhanced the ontology model to efficiently integrate the newly developed product-service ontology model; (2) Methods: The IoT message transfer layer is vertically integrated into a semantic knowledge framework inside which a Semantic Info-Node Agent (SINA) uses the message format as a common protocol for product-service lifecycle data transfer; (3) Results: The product-service ontology model facilitates information retrieval and knowledge extraction during the product lifecycle, while making more information available for the sake of service business creation. The vertical integration of IoT message transfer, encompassing all semantic layers, helps achieve a more flexible and modular approach to knowledge sharing in an IoT environment; (4) Contribution: Semantic data annotation applied to the IoT can enrich the types of data collected, which enables richer knowledge extraction. The ontology-based PLM model also enables the horizontal integration of heterogeneous PLM data, breaking traditional vertical information silos; (5) Conclusion: The framework was applied to a fictive case study with an electric car service for demonstration. To demonstrate the feasibility of the approach, the semantic model is implemented with Sesame APIs, which play the role of an Internet-connected Resource Description Framework (RDF) database. PMID:27399717

  1. Higher Language Ability is Related to Angular Gyrus Activation Increase During Semantic Processing, Independent of Sentence Incongruency

    PubMed Central

    Van Ettinger-Veenstra, Helene; McAllister, Anita; Lundberg, Peter; Karlsson, Thomas; Engström, Maria

    2016-01-01

    This study investigates the relation between individual language ability and neural semantic processing abilities. Our aim was to explore whether high-level language ability would correlate with decreased activation in language-specific regions or rather with increased activation in supporting language regions during the processing of sentences. Moreover, we were interested in whether the observed neural activation patterns are modulated by semantic incongruency similarly to previously observed changes upon syntactic congruency modulation. We investigated 27 healthy adults with a sentence reading task, which tapped language comprehension and inference and modulated sentence congruency, employing functional magnetic resonance imaging (fMRI). We assessed the relation between neural activation, congruency modulation, and test performance on a high-level language ability assessment with multiple regression analysis. Our results showed increased activation in the left-hemispheric angular gyrus extending to the temporal lobe related to high language ability. This effect was independent of semantic congruency, and no significant relation between language ability and incongruency modulation was observed. Furthermore, there was a significant increase of activation in the inferior frontal gyrus (IFG) bilaterally when the sentences were incongruent, indicating that processing incongruent sentences was more demanding than processing congruent sentences and required increased activation in language regions. The correlation of high-level language ability with increased rather than decreased activation in the left angular gyrus, a region specific for language processing, is opposed to what the neural efficiency hypothesis would predict. We conclude that no evidence is found for an interaction between semantic-congruency-related brain activation and high-level language performance, even though the semantically incongruent condition proved more demanding and evoked more neural activation. PMID:27014040

  2. Modulation of the inter-hemispheric processing of semantic information during normal aging. A divided visual field experiment.

    PubMed

    Hoyau, E; Cousin, E; Jaillard, A; Baciu, M

    2016-12-01

    We evaluated the effect of normal aging on the inter-hemispheric processing of semantic information by using the divided visual field (DVF) method with words and pictures. Two main theoretical models were considered: (a) the HAROLD model, which posits that aging is associated with supplementary recruitment of the right hemisphere (RH) and decreased hemispheric specialization, and (b) the RH decline theory, which assumes that the RH becomes less efficient with aging, associated with increased LH specialization. Two groups of subjects were examined, a Young Group (YG) and an Old Group (OG), while participants performed a semantic categorization task (living vs. non-living) on words and pictures. The DVF was realized in two steps: (a) unilateral DVF presentation, with stimuli presented separately in each visual field, left or right, allowing for their initial processing by only one hemisphere, right or left, respectively; (b) bilateral DVF presentation (BVF), with stimuli presented simultaneously in both visual fields, followed by their processing by both hemispheres. These two types of presentation permitted the evaluation of two main characteristics of the inter-hemispheric processing of information: hemispheric specialization (HS) and inter-hemispheric cooperation (IHC). Moreover, the BVF allowed us to determine which hemisphere drives the processing of information presented in BVF. Results obtained in the OG indicated that: (a) semantic categorization was performed as accurately as in the YG, even if more slowly; (b) a non-semantic RH decline was observed; and (c) the LH controls semantic processing during the BVF, suggesting an increased role of the LH in aging. However, despite the stronger involvement of the LH in the OG, the RH is not completely devoid of semantic abilities. As discussed in the paper, neither the HAROLD model nor the RH decline theory fully explains this pattern of results. We rather suggest that the effect of aging on hemispheric specialization and inter-hemispheric cooperation during semantic processing is explained not by a single model but by an interaction between several complementary mechanisms and models. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Nonschematic drawing recognition: a new approach based on attributed graph grammar with flexible embedding

    NASA Astrophysics Data System (ADS)

    Lee, Kyu J.; Kunii, T. L.; Noma, T.

    1993-01-01

    In this paper, we propose a syntactic pattern recognition method for non-schematic drawings, based on a new attributed graph grammar with flexible embedding. In our graph grammar, the embedding rule permits the nodes of a guest graph to be arbitrarily connected with the nodes of a host graph. The ambiguity caused by this flexible embedding is controlled by evaluating synthesized attributes and checking context sensitivity. To integrate parsing with synthesized-attribute evaluation and the context-sensitivity check, we also develop a bottom-up parsing algorithm.

  4. KEGGParser: parsing and editing KEGG pathway maps in Matlab.

    PubMed

    Arakelyan, Arsen; Nersisyan, Lilit

    2013-02-15

    The KEGG pathway database is a collection of manually drawn pathway maps accompanied by KGML-format files intended for use in automatic analysis. KGML files, however, do not contain the information required for a complete reproduction of all the events indicated in the static image of a pathway map. Several parsers and editors of KEGG pathways exist for processing KGML files. We introduce KEGGParser, a MATLAB-based tool for KEGG pathway parsing, semiautomatic fixing, editing, visualization and analysis in the MATLAB environment. It also works with Scilab. The source code is available at http://www.mathworks.com/matlabcentral/fileexchange/37561.
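
    KEGGParser itself is MATLAB code, but the KGML structure it consumes (entry and relation elements) can be illustrated with a few lines of standard-library Python; the file name below is a placeholder for a downloaded KGML file.

      import xml.etree.ElementTree as ET

      root = ET.parse("hsa04010.xml").getroot()     # a downloaded KGML file

      # Map entry ids to KEGG identifiers, and collect typed relations.
      entries = {e.get("id"): e.get("name") for e in root.iter("entry")}
      edges = [(r.get("entry1"), r.get("entry2"), r.get("type"))
               for r in root.iter("relation")]

      for e1, e2, kind in edges[:5]:
          print(entries.get(e1), "->", entries.get(e2), f"({kind})")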

  5. A novel argument for the Universality of Parsing principles.

    PubMed

    Grillo, Nino; Costa, João

    2014-10-01

    Previous work on Relative Clause attachment has overlooked a crucial grammatical distinction across both the languages and the structures tested: the selective availability of Pseudo Relatives. We reconsider the literature in light of this observation and argue that, all else being equal, local attachment is found with genuine Relative Clauses and that non-local attachment emerges when their surface-identical imposters, Pseudo Relatives, are available. Hence, apparent cross-linguistic variation in parsing preferences is reducible to grammatical factors. The results from two novel experiments in Italian are presented in support of these conclusions. Copyright © 2014 Elsevier B.V. All rights reserved.

  6. Development of Pulsar Detection Methods for a Galactic Center Search

    NASA Astrophysics Data System (ADS)

    Thornton, Stephen; Wharton, Robert; Cordes, James; Chatterjee, Shami

    2018-01-01

    Finding pulsars within the inner parsec of the Galactic Center would be incredibly beneficial: for pulsars sufficiently close to Sagittarius A*, extremely precise tests of general relativity in the strong-field regime could be performed through measurement of post-Keplerian parameters. Binary pulsar systems with sufficiently short orbital periods would provide similar laboratories for testing existing theories. Fast and efficient methods are needed to parse large sets of time-domain data from different telescopes, search for periodicity in signals, and differentiate radio frequency interference (RFI) from pulsar signals. Here we demonstrate several techniques to reduce red noise (low-frequency interference), generate signals from pulsars in binary orbits, and create plots that allow for fast detection of both RFI and pulsars.
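
    Two of the generic ingredients named above, red-noise suppression and periodicity search, can be illustrated independently of the authors' pipeline. A minimal sketch, assuming a regularly sampled time series; the running-median window and the toy signal are arbitrary choices, not values from this work.

        import numpy as np

        def red_noise_filter(series, window=1001):
            """Subtract a running median to suppress low-frequency (red) noise."""
            pad = window // 2
            padded = np.pad(series, pad, mode="edge")
            baseline = np.array([np.median(padded[i:i + window])
                                 for i in range(len(series))])
            return series - baseline

        def periodicity_spectrum(series, dt):
            """Power spectrum of a series sampled at interval dt (seconds)."""
            power = np.abs(np.fft.rfft(series)) ** 2
            freqs = np.fft.rfftfreq(len(series), d=dt)
            return freqs, power

        # Toy usage: a weak 10 Hz pulse train buried in a red-noise random walk.
        t = np.arange(0, 10, 1e-3)
        x = 0.1 * np.sin(2 * np.pi * 10 * t) \
            + np.cumsum(np.random.randn(t.size)) * 1e-3
        freqs, power = periodicity_spectrum(red_noise_filter(x), dt=1e-3)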

  7. Energy-Efficient Integration of Continuous Context Sensing and Prediction into Smartwatches.

    PubMed

    Rawassizadeh, Reza; Tomitsch, Martin; Nourizadeh, Manouchehr; Momeni, Elaheh; Peery, Aaron; Ulanova, Liudmila; Pazzani, Michael

    2015-09-08

    As the availability and use of wearables increases, they are becoming a promising platform for context sensing and context analysis. Smartwatches are a particularly interesting platform for this purpose, as they offer salient advantages, such as their proximity to the human body. However, they also have limitations associated with their small form factor, such as processing power and battery life, which makes it difficult to simply transfer smartphone-based context sensing and prediction models to smartwatches. In this paper, we introduce an energy-efficient, generic, integrated framework for continuous context sensing and prediction on smartwatches. Our work extends previous approaches for context sensing and prediction on wrist-mounted wearables that perform predictive analytics outside the device. We offer a generic sensing module and a novel energy-efficient, on-device prediction module that is based on a semantic abstraction approach to convert sensor data into meaningful information objects, similar to human perception of a behavior. Through six evaluations, we analyze the energy efficiency of our framework modules, identify the optimal file structure for data access and demonstrate an increase in accuracy of prediction through our semantic abstraction method. The proposed framework is hardware independent and can serve as a reference model for implementing context sensing and prediction on small wearable devices beyond smartwatches, such as body-mounted cameras.
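
    To make the idea of semantic abstraction concrete, converting windows of raw sensor readings into meaningful information objects, here is a deliberately simple sketch. The activity labels and magnitude thresholds are invented for illustration; they are not the paper's actual abstraction rules.

        import numpy as np

        # Hypothetical cutoffs on mean accelerometer magnitude (gravity removed).
        LEVELS = [(0.5, "stationary"), (2.0, "walking"), (float("inf"), "running")]

        def abstract_window(samples):
            """Map a window of raw (x, y, z) readings to a semantic object."""
            magnitude = float(np.linalg.norm(samples, axis=1).mean())
            label = next(name for cutoff, name in LEVELS if magnitude <= cutoff)
            return {"activity": label, "intensity": round(magnitude, 2)}

        window = np.random.randn(50, 3) * 0.3   # 50 simulated sensor samples
        print(abstract_window(window))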

  8. Energy-Efficient Integration of Continuous Context Sensing and Prediction into Smartwatches

    PubMed Central

    Rawassizadeh, Reza; Tomitsch, Martin; Nourizadeh, Manouchehr; Momeni, Elaheh; Peery, Aaron; Ulanova, Liudmila; Pazzani, Michael

    2015-01-01

    As the availability and use of wearables increases, they are becoming a promising platform for context sensing and context analysis. Smartwatches are a particularly interesting platform for this purpose, as they offer salient advantages, such as their proximity to the human body. However, they also have limitations associated with their small form factor, such as processing power and battery life, which makes it difficult to simply transfer smartphone-based context sensing and prediction models to smartwatches. In this paper, we introduce an energy-efficient, generic, integrated framework for continuous context sensing and prediction on smartwatches. Our work extends previous approaches for context sensing and prediction on wrist-mounted wearables that perform predictive analytics outside the device. We offer a generic sensing module and a novel energy-efficient, on-device prediction module that is based on a semantic abstraction approach to convert sensor data into meaningful information objects, similar to human perception of a behavior. Through six evaluations, we analyze the energy efficiency of our framework modules, identify the optimal file structure for data access and demonstrate an increase in accuracy of prediction through our semantic abstraction method. The proposed framework is hardware independent and can serve as a reference model for implementing context sensing and prediction on small wearable devices beyond smartwatches, such as body-mounted cameras. PMID:26370997

  9. Musical Expertise Increases Top–Down Modulation Over Hippocampal Activation during Familiarity Decisions

    PubMed Central

    Gagnepain, Pierre; Fauvel, Baptiste; Desgranges, Béatrice; Gaubert, Malo; Viader, Fausto; Eustache, Francis; Groussard, Mathilde; Platel, Hervé

    2017-01-01

    The hippocampus has classically been associated with episodic memory, but is sometimes also recruited during semantic memory tasks, especially for the skilled exploration of familiar information. Cognitive control mechanisms guiding semantic memory search may benefit from the set of cognitive processes at stake during musical training. Here, we examined using functional magnetic resonance imaging, whether musical expertise would promote the top–down control of the left inferior frontal gyrus (LIFG) over the generation of hippocampally based goal-directed thoughts mediating the familiarity judgment of proverbs and musical items. Analyses of behavioral data confirmed that musical experts more efficiently access familiar melodies than non-musicians although such increased ability did not transfer to verbal semantic memory. At the brain level, musical expertise specifically enhanced the recruitment of the hippocampus during semantic access to melodies, but not proverbs. Additionally, hippocampal activation contributed to speed of access to familiar melodies, but only in musicians. Critically, causal modeling of neural dynamics between LIFG and the hippocampus further showed that top–down excitatory regulation over the hippocampus during familiarity decision specifically increases with musical expertise – an effect that generalized across melodies and proverbs. At the local level, our data show that musical expertise modulates the online recruitment of hippocampal response to serve semantic memory retrieval of familiar melodies. The reconfiguration of memory network dynamics following musical training could constitute a promising framework to understand its ability to preserve brain functions. PMID:29033805

  10. A path-oriented knowledge representation system: Defusing the combinatorial system

    NASA Technical Reports Server (NTRS)

    Karamouzis, Stamos T.; Barry, John S.; Smith, Steven L.; Feyock, Stefan

    1995-01-01

    LIMAP is a programming system oriented toward efficient information manipulation over fixed finite domains, and quantification over paths and predicates. A generalization of Warshall's Algorithm to precompute paths in a sparse matrix representation of semantic nets is employed to allow questions involving paths between components to be posed and answered easily. LIMAP's ability to cache all paths between two components in a matrix cell proved to be a computational obstacle, however, when the semantic net grew to realistic size. The present paper describes a means of mitigating this combinatorial explosion to an extent that makes the use of the LIMAP representation feasible for problems of significant size. The technique we describe radically reduces the size of the search space in which LIMAP must operate; semantic nets of more than 500 nodes have been attacked successfully. Furthermore, it appears that the procedure described is applicable not only to LIMAP, but to a number of other combinatorially explosive search space problems found in AI as well.
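
    For reference, the classic boolean form of Warshall's algorithm, which LIMAP generalizes so that matrix cells cache the paths themselves rather than mere reachability, can be sketched as follows (a generic sketch, not LIMAP code):

        import numpy as np

        def warshall(adj):
            """Transitive closure of a boolean adjacency matrix."""
            reach = adj.copy()
            for k in range(reach.shape[0]):
                # i reaches j if i reaches k and k reaches j
                reach |= np.outer(reach[:, k], reach[k, :])
            return reach

        adj = np.array([[0, 1, 0],
                        [0, 0, 1],
                        [0, 0, 0]], dtype=bool)
        print(warshall(adj).astype(int))   # node 0 now also reaches node 2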

  11. Ontology-based vector space model and fuzzy query expansion to retrieve knowledge on medical computational problem solutions.

    PubMed

    Bratsas, Charalampos; Koutkias, Vassilis; Kaimakamis, Evangelos; Bamidis, Panagiotis; Maglaveras, Nicos

    2007-01-01

    Medical Computational Problem (MCP) solving is related to medical problems and their computerized algorithmic solutions. In this paper, an extension of an ontology-based model to fuzzy logic is presented, as a means to enhance the information retrieval (IR) procedure in semantic management of MCPs. We present herein the methodology followed for the fuzzy expansion of the ontology model, the fuzzy query expansion procedure, as well as an appropriate ontology-based Vector Space Model (VSM) that was constructed for efficient mapping of user-defined MCP search criteria and MCP acquired knowledge. The relevant fuzzy thesaurus is constructed by calculating the simultaneous occurrences of terms and the term-to-term similarities derived from the ontology that utilizes UMLS (Unified Medical Language System) concepts by using Concept Unique Identifiers (CUI), synonyms, semantic types, and broader-narrower relationships for fuzzy query expansion. The current approach constitutes a sophisticated advance for effective, semantics-based MCP-related IR.
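
    The flavor of fuzzy query expansion can be shown with a toy thesaurus: each query term pulls in related terms weighted by a membership degree, and the weighted term set then feeds a vector space query. The thesaurus entries and cutoff below are illustrative only; the paper derives its degrees from term co-occurrences and UMLS relationships.

        # Hypothetical fuzzy thesaurus: term -> {related term: similarity degree}
        THESAURUS = {
            "hypertension": {"high blood pressure": 0.9, "blood pressure": 0.7},
            "algorithm":    {"computational method": 0.8, "procedure": 0.6},
        }

        def expand_query(terms, cutoff=0.6):
            """Add related terms whose similarity degree passes the cutoff."""
            expanded = {t: 1.0 for t in terms}
            for term in terms:
                for related, degree in THESAURUS.get(term, {}).items():
                    if degree >= cutoff:
                        expanded[related] = max(expanded.get(related, 0.0), degree)
            return expanded   # term -> weight, usable in a weighted VSM query

        print(expand_query(["hypertension", "algorithm"]))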

  12. Incremental Query Rewriting with Resolution

    NASA Astrophysics Data System (ADS)

    Riazanov, Alexandre; Aragão, Marcelo A. T.

    We address the problem of semantic querying of relational databases (RDB) modulo knowledge bases using very expressive knowledge representation formalisms, such as full first-order logic or its various fragments. We propose to use a resolution-based first-order logic (FOL) reasoner for computing schematic answers to deductive queries, with the subsequent translation of these schematic answers to SQL queries which are evaluated using a conventional relational DBMS. We call our method incremental query rewriting, because an original semantic query is rewritten into a (potentially infinite) series of SQL queries. In this chapter, we outline the main idea of our technique - using abstractions of databases and constrained clauses for deriving schematic answers, and provide completeness and soundness proofs to justify the applicability of this technique to the case of resolution for FOL without equality. The proposed method can be directly used with regular RDBs, including legacy databases. Moreover, we propose it as a potential basis for an efficient Web-scale semantic search technology.

  13. The development of sentence interpretation: effects of perceptual, attentional and semantic interference.

    PubMed

    Leech, Robert; Aydelott, Jennifer; Symons, Germaine; Carnevale, Julia; Dick, Frederic

    2007-11-01

    How does the development and consolidation of perceptual, attentional, and higher cognitive abilities interact with language acquisition and processing? We explored children's (ages 5-17) and adults' (ages 18-51) comprehension of morphosyntactically varied sentences under several competing speech conditions that varied in the degree of attentional demands, auditory masking, and semantic interference. We also evaluated the relationship between subjects' syntactic comprehension and their word reading efficiency and general 'speed of processing'. We found that the interactions between perceptual and attentional processes and complex sentence interpretation changed considerably over the course of development. Perceptual masking of the speech signal had an early and lasting impact on comprehension, particularly for more complex sentence structures. In contrast, increased attentional demand in the absence of energetic auditory masking primarily affected younger children's comprehension of difficult sentence types. Finally, the predictability of syntactic comprehension abilities by external measures of development and expertise is contingent upon the perceptual, attentional, and semantic milieu in which language processing takes place.

  14. Cross-Domain Shoe Retrieval with a Semantic Hierarchy of Attribute Classification Network.

    PubMed

    Zhan, Huijing; Shi, Boxin; Kot, Alex C

    2017-08-04

    Cross-domain shoe image retrieval is a challenging problem, because the query photo from the street domain (daily-life scenario) and the reference photo in the online domain (online shop images) have significant visual differences due to viewpoint and scale variation, self-occlusion, and cluttered backgrounds. This paper proposes the Semantic Hierarchy Of attributE Convolutional Neural Network (SHOE-CNN) with a three-level feature representation for discriminative shoe feature expression and efficient retrieval. The SHOE-CNN, with its newly designed loss function, systematically merges semantic attributes of closer visual appearance to prevent shoe images with obvious visual differences from being confused with each other; the features extracted at the image, region, and part levels effectively match shoe images across different domains. We collect a large-scale shoe dataset composed of 14,341 street-domain and 12,652 corresponding online-domain images with fine-grained attributes to train our network and evaluate our system. The top-20 retrieval accuracy improves significantly over the solution with pre-trained CNN features.

  15. Locating and parsing bibliographic references in HTML medical articles

    PubMed Central

    Zou, Jie; Le, Daniel; Thoma, George R.

    2010-01-01

    The set of references that typically appear toward the end of journal articles is sometimes, though not always, a field in bibliographic (citation) databases. But even if references do not constitute such a field, they can be useful as a preprocessing step in the automated extraction of other bibliographic data from articles, as well as in computer-assisted indexing of articles. Automation in data extraction and indexing to minimize human labor is key to the affordable creation and maintenance of large bibliographic databases. Extracting the components of references, such as author names, article title, journal name, publication date and other entities, is therefore a valuable and sometimes necessary task. This paper describes a two-step process using statistical machine learning algorithms, to first locate the references in HTML medical articles and then to parse them. Reference locating identifies the reference section in an article and then decomposes it into individual references. We formulate this step as a two-class classification problem based on text and geometric features. An evaluation conducted on 500 articles drawn from 100 medical journals achieves near-perfect precision and recall rates for locating references. Reference parsing identifies the components of each reference. For this second step, we implement and compare two algorithms. One relies on sequence statistics and trains a Conditional Random Field. The other focuses on local feature statistics and trains a Support Vector Machine to classify each individual word, followed by a search algorithm that systematically corrects low confidence labels if the label sequence violates a set of predefined rules. The overall performance of these two reference-parsing algorithms is about the same: above 99% accuracy at the word level, and over 97% accuracy at the chunk level. PMID:20640222

  16. Locating and parsing bibliographic references in HTML medical articles.

    PubMed

    Zou, Jie; Le, Daniel; Thoma, George R

    2010-06-01

    The set of references that typically appear toward the end of journal articles is sometimes, though not always, a field in bibliographic (citation) databases. But even if references do not constitute such a field, they can be useful as a preprocessing step in the automated extraction of other bibliographic data from articles, as well as in computer-assisted indexing of articles. Automation in data extraction and indexing to minimize human labor is key to the affordable creation and maintenance of large bibliographic databases. Extracting the components of references, such as author names, article title, journal name, publication date and other entities, is therefore a valuable and sometimes necessary task. This paper describes a two-step process using statistical machine learning algorithms, to first locate the references in HTML medical articles and then to parse them. Reference locating identifies the reference section in an article and then decomposes it into individual references. We formulate this step as a two-class classification problem based on text and geometric features. An evaluation conducted on 500 articles drawn from 100 medical journals achieves near-perfect precision and recall rates for locating references. Reference parsing identifies the components of each reference. For this second step, we implement and compare two algorithms. One relies on sequence statistics and trains a Conditional Random Field. The other focuses on local feature statistics and trains a Support Vector Machine to classify each individual word, followed by a search algorithm that systematically corrects low confidence labels if the label sequence violates a set of predefined rules. The overall performance of these two reference-parsing algorithms is about the same: above 99% accuracy at the word level, and over 97% accuracy at the chunk level.

  17. Constructing Adverse Outcome Pathways: a Demonstration of an Ontology-based Semantics Mapping Approach

    EPA Science Inventory

    Adverse outcome pathway (AOP) provides a conceptual framework to evaluate and integrate chemical toxicity and its effects across the levels of biological organization. As such, it is essential to develop a resource-efficient and effective approach to extend molecular initiating ...

  18. Strategic search from long-term memory: an examination of semantic and autobiographical recall.

    PubMed

    Unsworth, Nash; Brewer, Gene A; Spillers, Gregory J

    2014-01-01

    Searching long-term memory is theoretically driven by both directed (search strategies) and random components. In the current study we conducted four experiments evaluating strategic search in semantic and autobiographical memory. Participants were required to generate either exemplars from the category of animals or the names of their friends for several minutes. Self-reported strategies suggested that participants typically relied on visualization strategies for both tasks and were less likely to rely on ordered strategies (e.g., alphabetic search). When participants were instructed to use particular strategies, the visualization strategy resulted in the highest levels of performance and the most efficient search, whereas ordered strategies resulted in the lowest levels of performance and fairly inefficient search. These results are consistent with the notion that retrieval from long-term memory is driven, in part, by search strategies employed by the individual, and that one particularly efficient strategy is to visualize various situational contexts that one has experienced in the past in order to constrain the search and generate the desired information.

  19. Semantics-based distributed I/O with the ParaMEDIC framework.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Balaji, P.; Feng, W.; Lin, H.

    2008-01-01

    Many large-scale applications simultaneously rely on multiple resources for efficient execution. For example, such applications may require both large compute and storage resources; however, very few supercomputing centers can provide large quantities of both. Thus, data generated at the compute site oftentimes has to be moved to a remote storage site for either storage or visualization and analysis. Clearly, this is not an efficient model, especially when the two sites are distributed over a wide-area network. Thus, we present a framework called 'ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing' which uses application-specific semantic information to convert the generated data to orders-of-magnitude smaller metadata at the compute site, transfer the metadata to the storage site, and re-process the metadata at the storage site to regenerate the output. Specifically, ParaMEDIC trades a small amount of additional computation (in the form of data post-processing) for a potentially significant reduction in the data that needs to be transferred in distributed environments.

  20. Fast Semantic Segmentation of 3d Point Clouds with Strongly Varying Density

    NASA Astrophysics Data System (ADS)

    Hackel, Timo; Wegner, Jan D.; Schindler, Konrad

    2016-06-01

    We describe an effective and efficient method for point-wise semantic classification of 3D point clouds. The method can handle unstructured and inhomogeneous point clouds such as those derived from static terrestrial LiDAR or photogrammetric reconstruction, and it is computationally efficient, making it possible to process point clouds with many millions of points in a matter of minutes. The key issue, both to cope with strong variations in point density and to bring down computation time, turns out to be careful handling of neighborhood relations. By choosing appropriate definitions of a point's (multi-scale) neighborhood, we obtain a feature set that is both expressive and fast to compute. We evaluate our classification method both on benchmark data from a mobile mapping platform and on a variety of large terrestrial laser scans with greatly varying point density. The proposed feature set outperforms the state of the art with respect to per-point classification accuracy, while at the same time being much faster to compute.
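
    Eigenvalue features of a point's k-nearest neighborhood are a common concrete realization of such neighborhood definitions in this literature. The sketch below shows the general pattern at a single scale (repeat at several k for a multi-scale variant); it is a generic illustration, not the authors' exact feature set.

        import numpy as np
        from scipy.spatial import cKDTree

        def geometric_features(points, k=20):
            """Covariance-eigenvalue features from each point's k-neighborhood."""
            tree = cKDTree(points)
            _, idx = tree.query(points, k=k)        # k nearest neighbors per point
            feats = np.empty((len(points), 3))
            for i, neighbors in enumerate(idx):
                cov = np.cov(points[neighbors].T)
                lam = np.sort(np.linalg.eigvalsh(cov))[::-1]
                lam = lam / (lam.sum() + 1e-12)
                feats[i] = [lam[0] - lam[1],        # linearity-like contrast
                            lam[1] - lam[2],        # planarity-like contrast
                            lam[2]]                 # scattering
            return feats

        cloud = np.random.rand(1000, 3)
        features = geometric_features(cloud, k=20)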

  1. Intelligence as the efficiency of cue-driven retrieval from secondary memory.

    PubMed

    Liesefeld, Heinrich René; Hoffmann, Eugenia; Wentura, Dirk

    2016-01-01

    Complex-span (working-memory-capacity) tasks are among the most successful predictors of intelligence. One important contributor to this relationship is the ability to efficiently employ cues for the retrieval from secondary memory. Presumably, intelligent individuals can considerably restrict their memory search sets by using such cues and can thereby improve recall performance. We here test this assumption by experimentally manipulating the validity of retrieval cues. When memoranda are drawn from the same semantic category on two successive trials of a verbal complex-span task, the category is a very strong retrieval cue on its first occurrence (strong-cue trial) but loses some of its validity on its second occurrence (weak-cue trial). If intelligent individuals make better use of semantic categories as retrieval cues, their recall accuracy suffers more from this loss of cue validity. Accordingly, our results show that less variance in intelligence is explained by recall accuracy on weak-cue compared with strong-cue trials.

  2. Feature hashing for fast image retrieval

    NASA Astrophysics Data System (ADS)

    Yan, Lingyu; Fu, Jiarun; Zhang, Hongxin; Yuan, Lu; Xu, Hui

    2018-03-01

    Currently, research on content-based image retrieval mainly focuses on robust feature extraction. However, due to the exponential growth of online images, it is necessary to consider searching among large-scale image collections, which is very time-consuming and unscalable; hence, the efficiency of image retrieval deserves much attention. In this paper, we propose a feature hashing method for image retrieval which not only generates a compact fingerprint for image representation, but also prevents large semantic loss during the hashing process. To generate the fingerprint, an objective function of semantic loss is constructed and minimized, combining the influence of both the neighborhood structure of the feature data and the mapping error. Since the machine-learning-based hashing effectively preserves the neighborhood structure of the data, it yields visual words with strong discriminability. Furthermore, the generated binary codes keep image representation low-complexity, making it efficient and scalable to large databases. Experimental results show the good performance of our approach.
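
    As a point of reference for what such binary fingerprints look like, here is the data-independent random-hyperplane baseline that learned hashing methods of this kind improve on. The dimensions are arbitrary, and the paper's learned, semantic-loss-minimizing projection is not reproduced here.

        import numpy as np

        def hash_features(X, n_bits=64, seed=0):
            """Random-hyperplane hashing: real features -> compact binary codes."""
            rng = np.random.default_rng(seed)
            planes = rng.standard_normal((X.shape[1], n_bits))
            return (X @ planes > 0).astype(np.uint8)   # one bit per hyperplane

        def hamming(a, b):
            """Number of differing bits between two binary codes."""
            return int(np.count_nonzero(a != b))

        X = np.random.default_rng(1).standard_normal((100, 512))  # 512-d features
        codes = hash_features(X)
        print(hamming(codes[0], codes[1]))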

  3. Energy-efficient privacy protection for smart home environments using behavioral semantics.

    PubMed

    Park, Homin; Basaran, Can; Park, Taejoon; Son, Sang Hyuk

    2014-09-02

    Research on smart environments saturated with ubiquitous computing devices is rapidly advancing while raising serious privacy issues. According to recent studies, privacy concerns significantly hinder widespread adoption of smart home technologies. Previous work has shown that it is possible to infer the activities of daily living within environments equipped with wireless sensors by monitoring radio fingerprints and traffic patterns. Since data encryption cannot prevent privacy invasions exploiting transmission pattern analysis and statistical inference, various methods based on fake data generation for concealing traffic patterns have been studied. In this paper, we describe an energy-efficient, light-weight, low-latency algorithm for creating dummy activities that are semantically similar to the observed phenomena. By using these cloaking activities, the amount of fake data transmissions can be flexibly controlled to support a trade-off between energy efficiency and privacy protection. According to the experiments using real data collected from a smart home environment, our proposed method can extend the lifetime of the network by more than 2× compared to the previous methods in the literature. Furthermore, the activity cloaking method supports low latency transmission of real data while also significantly reducing the accuracy of the wireless snooping attacks.

  4. Energy-Efficient Privacy Protection for Smart Home Environments Using Behavioral Semantics

    PubMed Central

    Park, Homin; Basaran, Can; Park, Taejoon; Son, Sang Hyuk

    2014-01-01

    Research on smart environments saturated with ubiquitous computing devices is rapidly advancing while raising serious privacy issues. According to recent studies, privacy concerns significantly hinder widespread adoption of smart home technologies. Previous work has shown that it is possible to infer the activities of daily living within environments equipped with wireless sensors by monitoring radio fingerprints and traffic patterns. Since data encryption cannot prevent privacy invasions exploiting transmission pattern analysis and statistical inference, various methods based on fake data generation for concealing traffic patterns have been studied. In this paper, we describe an energy-efficient, light-weight, low-latency algorithm for creating dummy activities that are semantically similar to the observed phenomena. By using these cloaking activities, the amount of fake data transmissions can be flexibly controlled to support a trade-off between energy efficiency and privacy protection. According to the experiments using real data collected from a smart home environment, our proposed method can extend the lifetime of the network by more than 2× compared to the previous methods in the literature. Furthermore, the activity cloaking method supports low latency transmission of real data while also significantly reducing the accuracy of the wireless snooping attacks. PMID:25184489

  5. Semantic and syntactic interoperability in online processing of big Earth observation data.

    PubMed

    Sudmanns, Martin; Tiede, Dirk; Lang, Stefan; Baraldi, Andrea

    2018-01-01

    The challenge of enabling syntactic and semantic interoperability for comprehensive and reproducible online processing of big Earth observation (EO) data is still unsolved. Supporting both types of interoperability is one of the requirements for efficiently extracting valuable information from the large amount of available multi-temporal gridded data sets. The proposed system wraps world models (semantic interoperability) into OGC Web Processing Services (syntactic interoperability) for semantic online analyses. World models describe spatio-temporal entities and their relationships in a formal way. The proposed system serves as an enabler for (1) technical interoperability, using a standardised interface to be used by all types of clients, and (2) allowing experts from different domains to develop complex analyses together as a collaborative effort. Users connect the world models online to the data, which are maintained in centralised storage as 3D spatio-temporal data cubes. The system also allows non-experts to extract valuable information from EO data, because data management, low-level interactions and specific software issues can be ignored. We discuss the concept of the proposed system, provide a technical implementation example, and describe three use cases for extracting changes from EO images, demonstrating the usability also for non-EO, gridded, multi-temporal data sets (CORINE land cover).

  6. Semantic and syntactic interoperability in online processing of big Earth observation data

    PubMed Central

    Sudmanns, Martin; Tiede, Dirk; Lang, Stefan; Baraldi, Andrea

    2018-01-01

    The challenge of enabling syntactic and semantic interoperability for comprehensive and reproducible online processing of big Earth observation (EO) data is still unsolved. Supporting both types of interoperability is one of the requirements for efficiently extracting valuable information from the large amount of available multi-temporal gridded data sets. The proposed system wraps world models (semantic interoperability) into OGC Web Processing Services (syntactic interoperability) for semantic online analyses. World models describe spatio-temporal entities and their relationships in a formal way. The proposed system serves as an enabler for (1) technical interoperability, using a standardised interface to be used by all types of clients, and (2) allowing experts from different domains to develop complex analyses together as a collaborative effort. Users connect the world models online to the data, which are maintained in centralised storage as 3D spatio-temporal data cubes. The system also allows non-experts to extract valuable information from EO data, because data management, low-level interactions and specific software issues can be ignored. We discuss the concept of the proposed system, provide a technical implementation example, and describe three use cases for extracting changes from EO images, demonstrating the usability also for non-EO, gridded, multi-temporal data sets (CORINE land cover). PMID:29387171

  7. Constructing Adverse Outcome Pathways: a Demonstration of ...

    EPA Pesticide Factsheets

    Adverse outcome pathway (AOP) provides a conceptual framework to evaluate and integrate chemical toxicity and its effects across the levels of biological organization. As such, it is essential to develop a resource-efficient and effective approach to extend molecular initiating events (MIEs) of chemicals to their downstream phenotypes of greater regulatory relevance. A number of ongoing public phenomics (high-throughput phenotyping) efforts have been generating abundant phenotypic data annotated with ontology terms. These phenotypes can be analyzed semantically and linked to MIEs of interest, all in the context of a knowledge base integrated from a variety of ontologies for various species and knowledge domains. In such analyses, two phenotypic profiles (PPs; anchored by genes or diseases), each characterized by multiple ontology terms, are compared for their semantic similarities within a common ontology graph, but across boundaries of species and knowledge domains. Taking advantage of publicly available ontologies and software tool kits, we have implemented an OS-Mapping (Ontology-based Semantics Mapping) approach as a Java application, and constructed a network of 19,383 PPs as nodes with edges weighted by their pairwise semantic similarity scores. Individual PPs were assembled from public phenomics data. Out of the possible 1.87×10⁸ pairwise connections among these nodes, about 71% have similarity scores between 0.2 and the maximum possible of 1.0.

  8. Coinductive Logic Programming with Negation

    NASA Astrophysics Data System (ADS)

    Min, Richard; Gupta, Gopal

    We introduce negation into coinductive logic programming (co-LP) via what we term Coinductive SLDNF (co-SLDNF) resolution. We present the declarative and operational semantics of co-SLDNF resolution and establish their equivalence under the restriction of rationality. Co-LP with co-SLDNF resolution provides a powerful, practical and efficient operational semantics for Fitting's Kripke-Kleene three-valued logic under the restriction of rationality. Further, applications of co-SLDNF resolution are discussed and illustrated, showing that co-SLDNF resolution allows one to develop elegant implementations of modal logics. Moreover, it provides the capability of non-monotonic inference (e.g., predicate Answer Set Programming) that can be used to develop novel and effective first-order modal non-monotonic inference engines.

  9. Service composition towards increasing end-user accessibility.

    PubMed

    Kaklanis, Nikolaos; Votis, Konstantinos; Tzovaras, Dimitrios

    2015-01-01

    This paper presents the Cloud4all Service Synthesizer Tool, a framework that enables efficient orchestration of accessibility services, as well as their combination into complex forms, providing more advanced functionalities towards increasing the accessibility of end-users with various types of functional limitations. The supported services are described formally within an ontology, thus enabling semantic service composition. The proposed service composition approach is based on semantic matching between service specifications on the one hand and user needs/preferences and the current context of use on the other. The use of automatic composition of accessibility services can significantly enhance end-users' accessibility, especially in cases where assistive solutions are not available on their device.

  10. IDEA: Interactive Display for Evolutionary Analyses.

    PubMed

    Egan, Amy; Mahurkar, Anup; Crabtree, Jonathan; Badger, Jonathan H; Carlton, Jane M; Silva, Joana C

    2008-12-08

    The availability of complete genomic sequences for hundreds of organisms promises to make obtaining genome-wide estimates of substitution rates, selective constraints and other molecular evolution variables of interest an increasingly important approach to addressing broad evolutionary questions. Two of the programs most widely used for this purpose are codeml and baseml, parts of the PAML (Phylogenetic Analysis by Maximum Likelihood) suite. A significant drawback of these programs is their lack of a graphical user interface, which can limit their user base and considerably reduce their efficiency. We have developed IDEA (Interactive Display for Evolutionary Analyses), an intuitive graphical input and output interface which interacts with PHYLIP for phylogeny reconstruction and with codeml and baseml for molecular evolution analyses. IDEA's graphical input and visualization interfaces eliminate the need to edit and parse text input and output files, reducing the likelihood of errors and improving processing time. Further, its interactive output display gives the user immediate access to results. Finally, IDEA can process data in parallel on a local machine or computing grid, allowing genome-wide analyses to be completed quickly. IDEA provides a graphical user interface that allows the user to follow a codeml or baseml analysis from parameter input through to the exploration of results. Novel options streamline the analysis process, and post-analysis visualization of phylogenies, evolutionary rates and selective constraint along protein sequences simplifies the interpretation of results. The integration of these functions into a single tool eliminates the need for lengthy data handling and parsing, significantly expediting access to global patterns in the data.

  11. IDEA: Interactive Display for Evolutionary Analyses

    PubMed Central

    Egan, Amy; Mahurkar, Anup; Crabtree, Jonathan; Badger, Jonathan H; Carlton, Jane M; Silva, Joana C

    2008-01-01

    Background The availability of complete genomic sequences for hundreds of organisms promises to make obtaining genome-wide estimates of substitution rates, selective constraints and other molecular evolution variables of interest an increasingly important approach to addressing broad evolutionary questions. Two of the programs most widely used for this purpose are codeml and baseml, parts of the PAML (Phylogenetic Analysis by Maximum Likelihood) suite. A significant drawback of these programs is their lack of a graphical user interface, which can limit their user base and considerably reduce their efficiency. Results We have developed IDEA (Interactive Display for Evolutionary Analyses), an intuitive graphical input and output interface which interacts with PHYLIP for phylogeny reconstruction and with codeml and baseml for molecular evolution analyses. IDEA's graphical input and visualization interfaces eliminate the need to edit and parse text input and output files, reducing the likelihood of errors and improving processing time. Further, its interactive output display gives the user immediate access to results. Finally, IDEA can process data in parallel on a local machine or computing grid, allowing genome-wide analyses to be completed quickly. Conclusion IDEA provides a graphical user interface that allows the user to follow a codeml or baseml analysis from parameter input through to the exploration of results. Novel options streamline the analysis process, and post-analysis visualization of phylogenies, evolutionary rates and selective constraint along protein sequences simplifies the interpretation of results. The integration of these functions into a single tool eliminates the need for lengthy data handling and parsing, significantly expediting access to global patterns in the data. PMID:19061522

  12. Processing of ICARTT Data Files Using Fuzzy Matching and Parser Combinators

    NASA Technical Reports Server (NTRS)

    Rutherford, Matthew T.; Typanski, Nathan D.; Wang, Dali; Chen, Gao

    2014-01-01

    In this paper, the task of parsing and matching inconsistent, poorly formed text data through the use of parser combinators and fuzzy matching is discussed. An object-oriented implementation of the parser combinator technique is used to provide a relatively simple interface for adapting base parsers. For matching tasks, a fuzzy matching algorithm with Levenshtein distance calculations is implemented to match string pairs that are otherwise difficult to match due to the aforementioned irregularities and errors in one or both pair members. Used in concert, the two techniques allow parsing and matching operations to be performed that had previously only been done manually.
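
    The Levenshtein distance at the heart of such a fuzzy matcher is standard dynamic programming; a minimal sketch follows. The example strings are hypothetical variable names, not taken from actual ICARTT files.

        def levenshtein(a, b):
            """Edit distance between strings a and b via dynamic programming."""
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, start=1):
                curr = [i]
                for j, cb in enumerate(b, start=1):
                    curr.append(min(prev[j] + 1,                  # deletion
                                    curr[j - 1] + 1,              # insertion
                                    prev[j - 1] + (ca != cb)))    # substitution
                prev = curr
            return prev[-1]

        # Fuzzy match: accept pairs within a small edit budget.
        print(levenshtein("NO2_ppbv", "NO2ppbv"))   # 1 -> likely the same field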

  13. Thermo-msf-parser: an open source Java library to parse and visualize Thermo Proteome Discoverer msf files.

    PubMed

    Colaert, Niklaas; Barsnes, Harald; Vaudel, Marc; Helsens, Kenny; Timmerman, Evy; Sickmann, Albert; Gevaert, Kris; Martens, Lennart

    2011-08-05

    The Thermo Proteome Discoverer program integrates both peptide identification and quantification into a single workflow for peptide-centric proteomics. Furthermore, its close integration with Thermo mass spectrometers has made it increasingly popular in the field. Here, we present a Java library to parse the msf files that constitute the output of Proteome Discoverer. The parser is also implemented in a graphical user interface allowing convenient access to the information found in msf files, and in Rover, a program to analyze and validate quantitative proteomics information. All code, binaries, and documentation are freely available at http://thermo-msf-parser.googlecode.com.

  14. Pragmatic precision oncology: the secondary uses of clinical tumor molecular profiling.

    PubMed

    Rioth, Matthew J; Thota, Ramya; Staggs, David B; Johnson, Douglas B; Warner, Jeremy L

    2016-07-01

    Precision oncology increasingly utilizes molecular profiling of tumors to guide treatment decisions with targeted therapeutics. The molecular profiling data are valuable in the treatment of individual patients as well as for multiple secondary uses. Our objective was to automatically parse, categorize, and aggregate the clinical molecular profile data generated during cancer care, and to use these data to address multiple secondary use cases. A system to parse, categorize and aggregate molecular profile data was created. A naïve Bayesian classifier categorized results according to clinical groups. The accuracy of these systems was validated against a published, expertly curated subset of molecular profiling data. Following one year of operation, 819 samples have been accurately parsed and categorized to generate a data repository of 10,620 genetic variants. The database has been used for operational, clinical trial, and discovery science research. A real-time database of molecular profiling data is a pragmatic solution to several knowledge management problems in the practice and science of precision oncology. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
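
    A naïve Bayes text categorizer of the general kind described can be assembled from standard components. The sketch below is purely illustrative: the example reports and clinical groups are invented, and the paper's actual categories and training data are not public.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline

        # Invented toy training data (three one-document classes).
        reports = ["BRAF V600E mutation detected",
                   "EGFR exon 19 deletion identified",
                   "no pathogenic variants found"]
        groups = ["melanoma-actionable", "lung-actionable", "no-action"]

        classifier = make_pipeline(CountVectorizer(), MultinomialNB())
        classifier.fit(reports, groups)
        print(classifier.predict(["BRAF mutation present"]))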

  15. Acoustic facilitation of object movement detection during self-motion

    PubMed Central

    Calabro, F. J.; Soto-Faraco, S.; Vaina, L. M.

    2011-01-01

    In humans, as well as most animal species, perception of object motion is critical to successful interaction with the surrounding environment. Yet, as the observer also moves, the retinal projections of the various motion components add to each other and extracting accurate object motion becomes computationally challenging. Recent psychophysical studies have demonstrated that observers use a flow-parsing mechanism to estimate and subtract self-motion from the optic flow field. We investigated whether concurrent acoustic cues for motion can facilitate visual flow parsing, thereby enhancing the detection of moving objects during simulated self-motion. Participants identified an object (the target) that moved either forward or backward within a visual scene containing nine identical textured objects simulating forward observer translation. We found that spatially co-localized, directionally congruent, moving auditory stimuli enhanced object motion detection. Interestingly, subjects who performed poorly on the visual-only task benefited more from the addition of moving auditory stimuli. When auditory stimuli were not co-localized to the visual target, improvements in detection rates were weak. Taken together, these results suggest that parsing object motion from self-motion-induced optic flow can operate on multisensory object representations. PMID:21307050

  16. Attribute And-Or Grammar for Joint Parsing of Human Pose, Parts and Attributes.

    PubMed

    Park, Seyoung; Nie, Xiaohan; Zhu, Song-Chun

    2017-07-25

    This paper presents an attribute and-or grammar (A-AOG) model for jointly inferring human body pose and human attributes in a parse graph, with attributes augmented to nodes in the hierarchical representation. In contrast to other popular methods in the current literature that train separate classifiers for poses and individual attributes, our method explicitly represents the decomposition and articulation of body parts, and accounts for the correlations between poses and attributes. The A-AOG model is an amalgamation of three traditional grammar formulations: (i) a phrase structure grammar representing the hierarchical decomposition of the human body from whole to parts; (ii) a dependency grammar modeling the geometric articulation by a kinematic graph of the body pose; and (iii) an attribute grammar accounting for the compatibility relations between different parts in the hierarchy so that their appearances follow a consistent style. The parse graph outputs human detection, pose estimation, and attribute prediction simultaneously, and these outputs are intuitive and interpretable. We conduct experiments on two tasks on two datasets, and the experimental results demonstrate the advantage of joint modeling in comparison with computing poses and attributes independently. Furthermore, our model obtains better performance over existing methods for both pose estimation and attribute prediction tasks.

  17. Semantic Segmentation and Difference Extraction via Time Series Aerial Video Camera and its Application

    NASA Astrophysics Data System (ADS)

    Amit, S. N. K.; Saito, S.; Sasaki, S.; Kiyoki, Y.; Aoki, Y.

    2015-04-01

    Google Earth, with its high-resolution imagery, typically takes months to process new images before they appear online. This is a time-consuming, slow process, especially for post-disaster applications. The objective of this research is to develop a fast and effective method of updating maps by detecting local differences that occur over a time series, so that only regions with differences are updated. In our system, aerial images from the Massachusetts road and building open datasets and from Saitama district datasets are used as input images. Semantic segmentation, a pixel-wise classification of images implemented with deep neural networks, is then applied to the input images. The deep-neural-network technique is used because it is not only efficient in learning highly discriminative image features such as roads and buildings, but also partially robust to incomplete and poorly registered target maps. Aerial images carrying this semantic information are then stored as a database in the 5D World Map and serve as ground-truth images. This system visualizes multimedia data in five dimensions: three spatial dimensions, one temporal dimension, and one degenerated dimension combining semantics and colour. Next, ground-truth images chosen from the 5D World Map database and a new aerial image with the same spatial information but a different time stamp are compared via a difference extraction method. The map is updated only where local changes have occurred. Hence, map updating becomes cheaper, faster and more effective, especially for post-disaster applications, by leaving unchanged regions alone and updating only changed regions.

  18. Brain oscillatory subsequent memory effects differ in power and long-range synchronization between semantic and survival processing.

    PubMed

    Fellner, Marie-Christin; Bäuml, Karl-Heinz T; Hanslmayr, Simon

    2013-10-01

    Memory crucially depends on the way information is processed during encoding. Differences in processes during encoding not only lead to differences in memory performance but also rely on different brain networks. Although these assumptions are corroborated by several previous fMRI and ERP studies, little is known about how brain oscillations dissociate between different memory encoding tasks. The present study therefore compared encoding related brain oscillatory activity elicited by two very efficient encoding tasks: a typical deep semantic item feature judgment task and a more elaborative survival encoding task. Subjects were asked to judge words either for survival relevance or for animacy, as indicated by a cue presented prior to the item. This allowed dissociating pre-item activity from item-related activity for both tasks. Replicating prior studies, survival processing led to higher recognition performance than semantic processing. Successful encoding in the semantic condition was reflected by a strong decrease in alpha and beta power, whereas successful encoding in the survival condition was related to increased alpha and beta long-range phase synchrony. Moreover, a pre-item subsequent memory effect in theta power was found which did not vary with encoding condition. These results show that measures of local synchrony (power) and global long range-synchrony (phase synchronization) dissociate between memory encoding processes. Whereas semantic encoding was reflected in decreases in local synchrony, increases in global long range synchrony were related to elaborative survival encoding, presumably reflecting the involvement of a more widespread cortical network in this task. Copyright © 2013 Elsevier Inc. All rights reserved.

  19. Exploiting Semantic Web Technologies to Develop OWL-Based Clinical Practice Guideline Execution Engines.

    PubMed

    Jafarpour, Borna; Abidi, Samina Raza; Abidi, Syed Sibte Raza

    2016-01-01

    Computerizing paper-based CPG and then executing them can provide evidence-informed decision support to physicians at the point of care. Semantic web technologies, especially web ontology language (OWL) ontologies, have been profusely used to represent computerized CPG. Using semantic web reasoning capabilities to execute OWL-based computerized CPG unties them from a specific custom-built CPG execution engine and increases their shareability, as any OWL reasoner and triple store can be utilized for CPG execution. However, existing semantic-web-reasoning-based CPG execution engines suffer from a limited ability to execute CPG with high levels of expressivity, and from the high cognitive load of computerizing paper-based CPG and of updating their computerized versions. In order to address these limitations, we have developed three CPG execution engines based on OWL 1 DL, OWL 2 DL and OWL 2 DL + semantic web rule language (SWRL). OWL 1 DL serves as the base execution engine capable of executing a wide range of CPG constructs; for executing highly complex CPG, the OWL 2 DL and OWL 2 DL + SWRL engines offer additional execution capabilities. We evaluated the technical performance and medical correctness of our execution engines using a range of CPG. Technical evaluations show the efficiency of our CPG execution engines in terms of CPU time and the validity of the generated recommendations in comparison to existing CPG execution engines. Medical evaluations by domain experts show the validity of the CPG-mediated therapy plans in terms of relevance, safety, and ordering for a wide range of patient scenarios.

  20. Formalization of treatment guidelines using Fuzzy Cognitive Maps and semantic web tools.

    PubMed

    Papageorgiou, Elpiniki I; Roo, Jos De; Huszka, Csaba; Colaert, Dirk

    2012-02-01

    Therapy decision making and support in medicine deals with uncertainty and needs to take into account the patient's clinical parameters, the context of illness, and the medical knowledge of the physician and guidelines in order to recommend a treatment therapy. This research study focuses on the formalization of medical knowledge using a cognitive process called Fuzzy Cognitive Maps (FCMs) together with a semantic web approach. The FCM technique is capable of dealing with situations that include uncertain descriptions, using a procedure similar to human reasoning. Thus, it was selected for modeling and knowledge integration of clinical practice guidelines. Semantic web tools were used to implement the FCM approach. The knowledge base was constructed from the clinical guidelines in the form of if-then fuzzy rules. These fuzzy rules were transferred to the FCM modeling technique, and the whole formalization was accomplished through the semantic web tools. The problem of urinary tract infection (UTI) in the adult community was examined with the proposed approach. Forty-seven clinical concepts and eight therapy concepts were identified for the antibiotic treatment therapy problem of UTIs. A preliminary pilot-evaluation study with 55 patient cases showed interesting findings: 91% of the antibiotic treatments proposed by the implemented approach were in full agreement with the guidelines and the physicians' opinions. The results show that the suggested approach formalizes medical knowledge efficiently and gives a front-end decision on antibiotic suggestions for cystitis. Concluding, modeling medical knowledge and therapeutic guidelines using cognitive methods and semantic web tools is both reliable and useful. Copyright © 2011 Elsevier Inc. All rights reserved.
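
    The inference step of an FCM is an iterated update of concept activations through a signed causal-weight matrix. A minimal sketch using one common formulation of the update rule; the three concepts and the weights are invented for illustration and are not the paper's 47-concept UTI model.

        import numpy as np

        def fcm_run(state, weights, steps=20):
            """Iterate A(t+1) = sigmoid(A(t) + W^T A(t)); variants without the
            self-memory term A(t) also appear in the FCM literature."""
            for _ in range(steps):
                state = 1.0 / (1.0 + np.exp(-(state + weights.T @ state)))
            return state

        # weights[i, j] is the causal influence of concept i on concept j.
        weights = np.array([[0.0, 0.6, 0.4],
                            [0.0, 0.0, 0.7],
                            [0.0, 0.0, 0.0]])
        print(fcm_run(np.array([1.0, 0.0, 0.0]), weights))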

  1. Bizarreness effect in dream recall.

    PubMed

    Cipolli, C; Bolzani, R; Cornoldi, C; De Beni, R; Fagioli, I

    1993-02-01

    This study aimed to ascertain a) whether morning reports of dream experience more frequently reproduce bizarre contents of night reports than nonbizarre ones and b) whether this effect depends on the rarity of bizarre contents in the dream or on their richer encoding in memory. Ten subjects were awakened in rapid eye movement (REM) sleep three times per night for 4 nonconsecutive nights and asked to report their previous dream experiences. In the morning they were asked to re-report those dreams. Two separate pairs of judges scored the reports: the former identified the parts in each report with bizarre events, characters or feelings and the latter parsed each report into content units using transformational grammar criteria. By combining the data of the two analyses, content units were classified as bizarre or nonbizarre and, according to whether present in both the night and corresponding morning reports, as semantically equivalent or nonequivalent. The proportion of bizarre contents common to night and morning reports was about twice that of nonbizarre contents and was positively correlated to the quantity of bizarre contents present in the night report. These findings support the view that bizarreness enhances recall of dream contents and that this memory advantage is determined by a richer encoding at the moment of dream generation. Such a view would seem to explain why dreams in everyday life, which are typically remembered after a rather long interval, appear more markedly bizarre than those recalled in the sleep laboratory.

  2. A framework for medical image retrieval using machine learning and statistical similarity matching techniques with relevance feedback.

    PubMed

    Rahman, Md Mahmudur; Bhattacharya, Prabir; Desai, Bipin C

    2007-01-01

    A content-based image retrieval (CBIR) framework for diverse collection of medical images of different imaging modalities, anatomic regions with different orientations and biological systems is proposed. Organization of images in such a database (DB) is well defined with predefined semantic categories; hence, it can be useful for category-specific searching. The proposed framework consists of machine learning methods for image prefiltering, similarity matching using statistical distance measures, and a relevance feedback (RF) scheme. To narrow down the semantic gap and increase the retrieval efficiency, we investigate both supervised and unsupervised learning techniques to associate low-level global image features (e.g., color, texture, and edge) in the projected PCA-based eigenspace with their high-level semantic and visual categories. Specially, we explore the use of a probabilistic multiclass support vector machine (SVM) and fuzzy c-mean (FCM) clustering for categorization and prefiltering of images to reduce the search space. A category-specific statistical similarity matching is proposed in a finer level on the prefiltered images. To incorporate a better perception subjectivity, an RF mechanism is also added to update the query parameters dynamically and adjust the proposed matching functions. Experiments are based on a ground-truth DB consisting of 5000 diverse medical images of 20 predefined categories. Analysis of results based on cross-validation (CV) accuracy and precision-recall for image categorization and retrieval is reported. It demonstrates the improvement, effectiveness, and efficiency achieved by the proposed framework.
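
    Of the learning components named above, fuzzy c-mean clustering is compact enough to sketch directly. The following generic implementation (placeholder feature dimensionality and cluster count, not the paper's 20 categories) shows how soft memberships can drive the prefiltering step.

        import numpy as np

        def fuzzy_cmeans(X, c=4, m=2.0, iters=50, seed=0):
            """Basic fuzzy c-means: soft memberships U (n x c) and centers V."""
            rng = np.random.default_rng(seed)
            U = rng.random((len(X), c))
            U /= U.sum(axis=1, keepdims=True)
            for _ in range(iters):
                W = U ** m
                V = (W.T @ X) / W.sum(axis=0)[:, None]        # weighted centers
                d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
                U = 1.0 / d ** (2.0 / (m - 1.0))              # u ~ d^(-2/(m-1))
                U /= U.sum(axis=1, keepdims=True)
            return U, V

        X = np.random.rand(500, 16)      # stand-in for global image features
        U, V = fuzzy_cmeans(X)
        candidates = U.argmax(axis=1)    # prefilter: search best category only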

  3. A Format for Phylogenetic Placements

    PubMed Central

    Matsen, Frederick A.; Hoffman, Noah G.; Gallagher, Aaron; Stamatakis, Alexandros

    2012-01-01

    We have developed a unified format for phylogenetic placements, that is, mappings of environmental sequence data (e.g., short reads) into a phylogenetic tree. We are motivated to do so by the growing number of tools for computing and post-processing phylogenetic placements, and the lack of an established standard for storing them. The format is lightweight, versatile, extensible, and is based on the JSON format, which can be parsed by most modern programming languages. Our format is already implemented in several tools for computing and post-processing parsimony- and likelihood-based phylogenetic placements and has worked well in practice. We believe that establishing a standard format for analyzing read placements at this early stage will lead to a more efficient development of powerful and portable post-analysis tools for the growing applications of phylogenetic placement. PMID:22383988
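
    Because the format is plain JSON, reading placements takes only a few lines. A minimal sketch in Python follows; the field names match the published "jplace" examples (a "fields" list describing the per-edge columns), but treat them as assumptions, and the file name is hypothetical:

```python
# Sketch: read a jplace-style placement file and print one row per
# candidate edge for each placed query sequence.
import json

with open("example.jplace") as fh:       # hypothetical input file
    doc = json.load(fh)

fields = doc["fields"]                   # e.g. ["edge_num", "likelihood", ...]
for placement in doc["placements"]:
    for p in placement["p"]:             # one row per candidate edge
        row = dict(zip(fields, p))
        print(placement.get("nm", placement.get("n")), row)
```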

  5. Panacea, a semantic-enabled drug recommendations discovery framework.

    PubMed

    Doulaverakis, Charalampos; Nikolaidis, George; Kleontas, Athanasios; Kompatsiaris, Ioannis

    2014-03-06

    Personalized drug prescription can benefit from intelligent information management and sharing. International standard classifications and terminologies have been developed to provide unique and unambiguous information representation, and such standards can serve as the basis of automated decision support systems for drug-drug and drug-disease interaction discovery. Semantic Web technologies have also been proposed in earlier works to support such systems. This paper presents Panacea, a semantic framework capable of offering drug-drug and drug-disease interaction discovery. To enable this kind of service, medical information and terminology had to be translated into ontological terms and appropriately coupled with medical knowledge of the field. International standard classifications and terminologies provide the backbone of the common representation of medical data, while the medical knowledge of drug interactions is represented by a rule base that makes use of the aforementioned standards. Representation is based on a lightweight ontology. A layered reasoning approach is implemented: at the first layer, ontological inference is used to discover underlying knowledge, while at the second layer a two-step rule selection strategy is followed, resulting in a computationally efficient reasoning approach. Details of the system architecture are presented, along with an outline of the difficulties that had to be overcome. Panacea is evaluated both in terms of quality of recommendations against real clinical data and in terms of performance. The recommendation quality assessment gave useful insights regarding requirements for real-world deployment and revealed several parameters that affected the recommendation results. Performance-wise, Panacea is compared to GalenOWL, a drug recommendation service previously published by the authors, highlighting their differences in modeling and approach and pinpointing the advantages of Panacea. Overall, the paper presents a framework for an efficient drug recommendations service in which Semantic Web technologies are coupled with traditional business rule engines.
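
    For flavor, here is a toy sketch of rule-based interaction checking keyed by standard drug codes. Panacea's actual rule base is coupled with ontological inference and is far richer; the single rule and the helper below are illustrative only (the ATC codes shown are the standard ones for warfarin, aspirin, and omeprazole):

```python
# Sketch: a tiny drug-drug interaction rule base keyed by ATC codes.
INTERACTION_RULES = {
    frozenset({"ATC:B01AA03", "ATC:N02BA01"}): "warfarin + aspirin: bleeding risk",
}

def check_prescription(drug_codes):
    """Return warnings for every rule whose drugs are all prescribed."""
    codes = set(drug_codes)
    return [warning for pair, warning in INTERACTION_RULES.items()
            if pair <= codes]             # both interacting drugs present

print(check_prescription(["ATC:B01AA03", "ATC:N02BA01", "ATC:A02BC01"]))
```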

  6. Towards a Semantic-Based Approach for Affect and Metaphor Detection

    ERIC Educational Resources Information Center

    Zhang, Li; Barnden, John

    2013-01-01

    Affect detection from open-ended virtual improvisational contexts is a challenging task. To achieve this research goal, the authors developed an intelligent agent which was able to engage in virtual improvisation and perform sentence-level affect detection from user inputs. This affect detection development was efficient for the improvisational…

  7. Pronouns in Catalan: Information, Discourse and Strategy

    ERIC Educational Resources Information Center

    Mayol, Laia

    2009-01-01

    This thesis investigates the variation between null and overt pronouns in subject position in Catalan, a null subject language. I argue that null and overt subject pronouns are two resources that speakers efficiently deploy to signal their intended interpretation regarding antecedent choice or semantic meaning, and that communicative agents…

  8. Explaining Variance in Comprehension for Students in a High-Poverty Setting

    ERIC Educational Resources Information Center

    Conradi, Kristin; Amendum, Steven J.; Liebfreund, Meghan D.

    2016-01-01

    This study examined the contributions of decoding, language, spelling, and motivation to the reading comprehension of elementary school readers in a high-poverty setting. Specifically, the research questions addressed whether and how the influences of word reading efficiency, semantic knowledge, reading self-concept, and spelling on reading…

  9. Improving Semantic Updating Method on 3d City Models Using Hybrid Semantic-Geometric 3d Segmentation Technique

    NASA Astrophysics Data System (ADS)

    Sharkawi, K.-H.; Abdul-Rahman, A.

    2013-09-01

    Entities in cities and urban areas, such as building structures, are becoming more complex as modern human civilization continues to evolve. The ability to plan and manage every territory, especially urban areas, is very important to every government in the world. Planning and managing cities and urban areas based on printed maps and 2D data is becoming insufficient and inefficient to cope with the complexity of new developments in big cities. The emergence of 3D city models has boosted the efficiency of analysing and managing urban areas, as 3D data are proven to represent real-world objects more accurately, and they have since been adopted as the new trend in building and urban management and planning applications. Nowadays, many countries around the world generate virtual 3D representations of their major cities. The growing interest in improving the usability of 3D city models has resulted in the development of various analysis tools based on them. Today, 3D city models are generated for various purposes such as tourism, location-based services, disaster management and urban planning. Meanwhile, modelling 3D objects is getting easier with the emergence of user-friendly 3D modelling tools on the market, and generating 3D buildings with high accuracy has also become easier with the availability of airborne Lidar and terrestrial laser scanning equipment. The availability of and accessibility to this technology make it more sensible to analyse buildings in urban areas using 3D data, as they accurately represent real-world objects. The Open Geospatial Consortium (OGC) has accepted the CityGML specification as one of the international standards for representing and exchanging spatial data, making it easier to visualize, store and manage 3D city model data efficiently. CityGML is able to represent the semantics, geometry, topology and appearance of 3D city models in five well-defined Levels of Detail (LoD), namely LoD0 to LoD4. The accuracy and structural complexity of the 3D objects increase with the LoD level, where LoD0 is the simplest (2.5D; a Digital Terrain Model (DTM) plus building or roof prints) while LoD4 is the most complex (architectural details with interior structures). Semantic information is one of the main components of CityGML and 3D city models, and it provides important input for any analysis. However, more often than not, semantic information is not available for a 3D city model due to unstandardized modelling processes; for example, a building is often generated as one object, without specific feature layers such as Roof, Ground floor, Level 1, Level 2, Block A or Block B. This research develops a method to improve the semantic data updating process by segmenting the 3D building into simpler parts, which makes it easier for users to select and update the semantic information. The methodology is implemented for 3D buildings in LoD2, where the buildings are generated without architectural details but with distinct roof structures. This paper also introduces a hybrid semantic-geometric 3D segmentation method that performs hierarchical segmentation of a 3D building based on its semantic value and surface characteristics, fitted by one of the predefined primitives. For future work, the segmentation method will be implemented as part of a change detection module that can detect changes to the 3D buildings, store and retrieve semantic information of the changed structure, automatically update the 3D models and visualize the results in a user-friendly graphical user interface (GUI).
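
    Since CityGML is an XML application, enumerating a building's semantic parts as a starting point for per-part updates can be done with standard XML tooling. A minimal sketch with lxml, assuming the standard CityGML 2.0 namespace URIs and a hypothetical input file:

```python
# Sketch: list buildings and their boundary surfaces in a CityGML 2.0 file.
from lxml import etree

NS = {
    "bldg": "http://www.opengis.net/citygml/building/2.0",
    "gml":  "http://www.opengis.net/gml",
}

tree = etree.parse("city_lod2.gml")               # hypothetical LoD2 model
for b in tree.iterfind(".//bldg:Building", NS):
    gml_id = b.get("{http://www.opengis.net/gml}id")
    parts = (b.findall(".//bldg:RoofSurface", NS)
             + b.findall(".//bldg:GroundSurface", NS))
    print(gml_id, "boundary surfaces:", len(parts))
```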

  10. Computable visually observed phenotype ontological framework for plants

    PubMed Central

    2011-01-01

    Background: The ability to search for and precisely compare similar phenotypic appearances within and across species has vast potential in plant science and genetic research. The difficulty in doing so lies in the fact that many visual phenotypic data, especially visually observed phenotypes that often cannot be directly measured quantitatively, are in the form of text annotations, and these descriptions are plagued by semantic ambiguity, heterogeneity, and low granularity. Though several bio-ontologies have been developed to standardize phenotypic (and genotypic) information and permit comparisons across species, these semantic issues persist and prevent precise analysis and retrieval of information. A framework suitable for the modeling and analysis of precise computable representations of such phenotypic appearances is needed. Results: We have developed a new framework called the Computable Visually Observed Phenotype Ontological Framework for plants. This work provides a novel quantitative view of descriptions of plant phenotypes that leverages existing bio-ontologies and utilizes a computational approach to capture and represent domain knowledge in a machine-interpretable form. This is accomplished by means of a robust and accurate semantic mapping module that automatically maps high-level semantics to low-level measurements computed from phenotype imagery. The framework was applied to two different plant species, with semantic rules mined and an ontology constructed. Rule quality was evaluated and showed high-quality rules for most semantics. This framework also facilitates automatic annotation of phenotype images and can be adopted by different plant communities to aid in their research. Conclusions: The Computable Visually Observed Phenotype Ontological Framework for plants has been developed for more efficient and accurate management of visually observed phenotypes, which play a significant role in plant genomics research. The uniqueness of this framework is its ability to bridge the knowledge of informaticians and plant science researchers by translating descriptions of visually observed phenotypes into standardized, machine-understandable representations, thus enabling the development of advanced information retrieval and phenotype annotation analysis tools for the plant science community. PMID:21702966

  11. Verification and Planning Based on Coinductive Logic Programming

    NASA Technical Reports Server (NTRS)

    Bansal, Ajay; Min, Richard; Simon, Luke; Mallya, Ajay; Gupta, Gopal

    2008-01-01

    Coinduction is a powerful technique for reasoning about unfounded sets, unbounded structures, infinite automata, and interactive computations [6]. Where induction corresponds to least fixed point semantics, coinduction corresponds to greatest fixed point semantics. Recently coinduction has been incorporated into logic programming and an elegant operational semantics developed for it [11, 12]. This operational semantics is the greatest fixed point counterpart of SLD resolution (SLD resolution imparts operational semantics to least fixed point based computations) and is termed co-SLD resolution. In co-SLD resolution, a predicate goal p(t) succeeds if it unifies with one of its ancestor calls. In addition, rational infinite terms are allowed as arguments of predicates. Infinite terms are represented as solutions to unification equations, and the occurs check is omitted during the unification process. Coinductive Logic Programming (Co-LP) and co-SLD resolution can be used to elegantly perform model checking and planning. A combined SLD and co-SLD resolution based LP system forms the common basis for planning, scheduling, verification, model checking, and constraint solving [9, 4]. This is achieved by amalgamating SLD resolution, co-SLD resolution, and constraint logic programming [13] in a single logic programming system. Given that parallelism in logic programs can be implicitly exploited [8], complex, compute-intensive applications (planning, scheduling, model checking, etc.) can be executed in parallel on multi-core machines. Parallel execution can result in speed-ups as well as in larger instances of the problems being solved. In the remainder we elaborate on (i) how planning can be elegantly and efficiently performed under real-time constraints, (ii) how real-time systems can be elegantly and efficiently model-checked, as well as (iii) how hybrid systems can be verified in a combined system with both co-SLD and SLD resolution. Implementations of co-SLD resolution as well as preliminary implementations of the planning and verification applications have been developed [4]. Co-LP and Model Checking: The vast majority of properties to be verified can be classified into safety properties and liveness properties. It is well known within model checking that safety properties can be verified by reachability analysis, i.e., if a counterexample to the property exists, it can be finitely determined by enumerating all the reachable states of the Kripke structure.

  12. The lived experience of doing the right thing: a parse method study.

    PubMed

    Smith, Sandra Maxwell

    2012-01-01

    The purposes of this research were to discover the structure of the experience of doing the right thing and to contribute to nursing knowledge. The Parse research method was used in this study to answer the research question: What is the structure of the lived experience of doing the right thing? Participants were 10 individuals living in the community. The central finding of this study was the following structure: The lived experience of doing the right thing is steadfast uprightness amid adversity, as honorableness with significant affiliations emerges with contentment. New knowledge extended the theory of humanbecoming and enhanced understanding of the experience of doing the right thing.

  13. Intelligent Text Retrieval and Knowledge Acquisition from Texts for NASA Applications: Preprocessing Issues

    NASA Technical Reports Server (NTRS)

    2002-01-01

    A system that retrieves problem reports from a NASA database is described. The database is queried with natural language questions. Part-of-speech tags are first assigned to each word in the question using a rule-based tagger. A partial parse of the question is then produced with independent sets of deterministic finite-state automata. Using the partial parse information, a look-up strategy searches the database for problem reports relevant to the question. A bigram stemmer and irregular verb conjugates have been incorporated into the system to improve accuracy. The system is evaluated on a set of fifty-five questions posed by NASA engineers. A discussion of future research is also presented.
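
    The tag-then-chunk pipeline can be sketched with off-the-shelf tools. Below is a small example using NLTK in place of the system's own tagger and automata; the question and the single NP rule are invented for illustration:

```python
# Sketch: rule-based POS tagging plus finite-state partial parsing
# (chunking). Requires nltk.download("punkt") and
# nltk.download("averaged_perceptron_tagger") beforehand.
import nltk

question = "Which reports describe valve failures on the orbiter?"
tagged = nltk.pos_tag(nltk.word_tokenize(question))   # POS tags per word

grammar = "NP: {<DT>?<JJ>*<NN.*>+}"    # one deterministic chunk rule
chunker = nltk.RegexpParser(grammar)
print(chunker.parse(tagged))           # partial parse: NP chunks only
```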

  14. The value of parsing as feature generation for gene mention recognition

    PubMed Central

    Smith, Larry H; Wilbur, W John

    2009-01-01

    We measured the extent to which information surrounding a base noun phrase reflects the presence of a gene name, and evaluated seven different parsers in their ability to provide information for that purpose. Using the GENETAG corpus as a gold standard, we performed machine learning to recognize from its context when a base noun phrase contained a gene name. Starting with the best lexical features, we assessed the gain of adding dependency or dependency-like relations from a full sentence parse. Features derived from parsers improved performance in this partial gene mention recognition task by a small but statistically significant amount. There were virtually no differences between parsers in these experiments. PMID:19345281

  15. The lived experience of serenity: using Parse's research method.

    PubMed

    Kruse, B G

    1999-04-01

    Parse's research method was used to investigate the meaning of serenity for survivors of a life-threatening illness or traumatic event. Ten survivors of cancer told their stories of the meaning of serenity as they had lived it in their lives. Descriptions were aided by photographs chosen by each participant to represent the meaning of serenity for them. The structure of serenity was generated through the extraction-synthesis process. Four main concepts--steering-yielding with the flow, savoring remembered visions of engaging surroundings, abiding with aloneness-togetherness, and attesting to a loving presence--emerged and led to a theoretical structure of serenity from the human becoming perspective. Findings confirm serenity as a multidimensional process.

  16. Development of 3D browsing and interactive web system

    NASA Astrophysics Data System (ADS)

    Shi, Xiaonan; Fu, Jian; Jin, Chaolin

    2017-09-01

    In the current market, users need to download specific software or plug-ins to browse a 3D model, the browsing system may be unstable, and interaction with the 3D model is not possible. To solve these problems, this paper presents a solution in which the model is parsed on the server side for interactive browsing: when the system is used, the user only needs to enter the system URL and upload a 3D model file in order to browse it. The server parses the 3D model in real time and the interactive response is fast. This fully follows a minimalist philosophy for the user and addresses the current obstacles to 3D content development in the market.

  17. Ontology modularization to improve semantic medical image annotation.

    PubMed

    Wennerberg, Pinar; Schulz, Klaus; Buitelaar, Paul

    2011-02-01

    Searching for medical images and patient reports is a significant challenge in a clinical setting. The contents of such documents are often not described in sufficient detail, making it difficult to utilize the inherent wealth of information contained within them. Semantic image annotation addresses this problem by describing the contents of images and reports using medical ontologies. Medical images and patient reports are then linked to each other through common annotations. Subsequently, search algorithms can more effectively find related sets of documents on the basis of these semantic descriptions. A prerequisite to realizing such a semantic search engine is that the data contained within should have been previously annotated with concepts from medical ontologies. One major challenge in this regard is the size and complexity of medical ontologies as annotation sources. Manual annotation is particularly time-consuming and labor-intensive in a clinical environment. In this article we propose an approach to reducing the size of clinical ontologies for more efficient manual image and text annotation. More precisely, our goal is to identify smaller fragments of a large anatomy ontology that are relevant for annotating medical images from patients suffering from lymphoma. Our work is in the area of ontology modularization, which is a recent and active field of research. We describe our approach, methods and data set in detail and we discuss our results.

  18. Exemplar-Based Image and Video Stylization Using Fully Convolutional Semantic Features.

    PubMed

    Zhu, Feida; Yan, Zhicheng; Bu, Jiajun; Yu, Yizhou

    2017-05-10

    Color and tone stylization in images and videos strives to enhance unique themes with artistic color and tone adjustments. It has a broad range of applications, from professional image postprocessing to photo sharing over social networks. Mainstream photo enhancement software, such as Adobe Lightroom and Instagram, provides users with predefined styles, which are often hand-crafted through a trial-and-error process. Such photo adjustment tools lack a semantic understanding of image contents, and the resulting global color transforms limit the range of artistic styles they can represent. Stylistic enhancement, on the other hand, needs to apply distinct adjustments to various semantic regions; such an ability enables a broader range of visual styles. In this paper, we first propose a novel deep learning architecture for exemplar-based image stylization, which learns local enhancement styles from image pairs. Our deep learning architecture consists of fully convolutional networks (FCNs) for automatic semantics-aware feature extraction and fully connected neural layers for adjustment prediction. Image stylization can be efficiently accomplished with a single forward pass through our deep network. To extend our deep network from image stylization to video stylization, we exploit temporal superpixels (TSPs) to facilitate the transfer of artistic styles from image exemplars to videos. Experiments on a number of datasets for image stylization as well as a diverse set of video clips demonstrate the effectiveness of our deep learning architecture.

  19. Formal ontologies in biomedical knowledge representation.

    PubMed

    Schulz, S; Jansen, L

    2013-01-01

    Medical decision support and other intelligent applications in the life sciences depend on increasing amounts of digital information. Knowledge bases as well as formal ontologies are being used to organize biomedical knowledge and data. However, these two kinds of artefacts are not always clearly distinguished. Whereas the popular RDF(S) standard provides an intuitive triple-based representation, it is semantically weak. Description-logic-based ontology languages such as OWL-DL carry a clear-cut semantics, but they are computationally expensive, and they are often misinterpreted to encode all kinds of statements, including those which are not ontological. We distinguish four kinds of statements needed to comprehensively represent domain knowledge: universal statements, terminological statements, statements about particulars and contingent statements. We argue that the task of formal ontologies is solely to represent universal statements, while the non-ontological kinds of statements can nevertheless be connected with ontological representations. To illustrate these four types of representations, we use a running example from parasitology. We finally formulate recommendations for semantically adequate ontologies that can efficiently be used as a stable framework for more context-dependent biomedical knowledge representation and reasoning applications like clinical decision support systems.

  20. Focused Belief Measures for Uncertainty Quantification in High Performance Semantic Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Joslyn, Cliff A.; Weaver, Jesse R.

    In web-scale semantic data analytics there is a great need for methods which aggregate uncertainty claims, on the one hand respecting the information provided as accurately as possible, while on the other still being tractable. Traditional statistical methods are more robust, but only represent distributional, additive uncertainty. Generalized information theory methods, including fuzzy systems and Dempster-Shafer (DS) evidence theory, represent multiple forms of uncertainty, but are computationally and methodologically difficult. We require methods which provide an effective balance between the complete representation of the full complexity of uncertainty claims in their interaction, while satisfying the needs of both computational complexity and human cognition. Here we build on Jøsang's subjective logic to posit methods in focused belief measures (FBMs), where a full DS structure is focused to a single event. The resulting ternary logical structure is posited to be able to capture the minimal amount of generalized complexity needed at a maximum of computational efficiency. We demonstrate the efficacy of this approach in a web ingest experiment over the 2012 Billion Triple dataset from the Semantic Web Challenge.
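
    To make the central operation concrete, here is a toy illustration of focusing a Dempster-Shafer mass assignment on a single event, yielding a subjective-logic-style (belief, disbelief, uncertainty) triple. This sketches the general idea only, not the paper's FBM construction:

```python
# Sketch: focus a DS mass assignment on one event A, returning
# (belief, disbelief, uncertainty) with b + d + u = 1.
def focus(masses, event):
    """masses: dict mapping frozenset of outcomes -> mass (sums to 1)."""
    belief = sum(m for s, m in masses.items() if s <= event)   # Bel(A)
    plaus = sum(m for s, m in masses.items() if s & event)     # Pl(A)
    return belief, 1.0 - plaus, plaus - belief

m = {frozenset({"a"}): 0.5, frozenset({"b"}): 0.2, frozenset({"a", "b"}): 0.3}
print(focus(m, frozenset({"a"})))   # (0.5, 0.2, 0.3)
```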

  1. Vigi4Med Scraper: A Framework for Web Forum Structured Data Extraction and Semantic Representation

    PubMed Central

    Audeh, Bissan; Beigbeder, Michel; Zimmermann, Antoine; Jaillon, Philippe; Bousquet, Cédric

    2017-01-01

    The extraction of information from social media is an essential yet complicated step for data analysis in multiple domains. In this paper, we present Vigi4Med Scraper, a generic open source framework for extracting structured data from web forums. Our framework is highly configurable; using a configuration file, the user can freely choose the data to extract from any web forum. The extracted data are anonymized and represented in a semantic structure using Resource Description Framework (RDF) graphs. This representation enables efficient manipulation by data analysis algorithms and allows the collected data to be directly linked to any existing semantic resource. To avoid server overload, an integrated proxy with caching functionality imposes a minimal delay between sequential requests. Vigi4Med Scraper represents the first step of Vigi4Med, a project funded by the French drug safety agency Agence Nationale de Sécurité du Médicament (ANSM) to detect adverse drug reactions (ADRs) from social networks. Vigi4Med Scraper has successfully extracted more than 200 gigabytes of data from the web forums of over 20 different websites. PMID:28122056
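
    A minimal sketch of the kind of RDF representation described here, using rdflib; the vocabulary URIs and the post are invented for illustration and are not Vigi4Med's schema:

```python
# Sketch: represent one anonymized forum post as an RDF graph.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

EX = Namespace("http://example.org/forum#")   # hypothetical vocabulary
g = Graph()

post = URIRef("http://example.org/post/123")
g.add((post, RDF.type, EX.Post))
g.add((post, EX.author, Literal("user_7f3a")))             # pseudonymized id
g.add((post, EX.content, Literal("Headache after drug X")))
print(g.serialize(format="turtle"))
```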

  2. Semantic Information Processing of Physical Simulation Based on Scientific Concept Vocabulary Model

    NASA Astrophysics Data System (ADS)

    Kino, Chiaki; Suzuki, Yoshio; Takemiya, Hiroshi

    Scientific Concept Vocabulary (SCV) has been developed to realize the Cognitive methodology based Data Analysis System (CDAS), which supports researchers in analyzing large-scale data efficiently and comprehensively. SCV is an information model for processing semantic information in physics and engineering. In the SCV model, all semantic information is related to substantial data and algorithms. Consequently, SCV enables a data analysis system to recognize the meaning of execution results output from a numerical simulation. This method has allowed a data analysis system to extract important information from a scientific point of view. Previous research has shown that SCV is able to describe simple scientific indices and scientific perceptions. However, it is difficult to describe complex scientific perceptions with the currently proposed SCV. In this paper, a new data structure for SCV is proposed in order to describe scientific perceptions in more detail. Additionally, a prototype of the new model has been constructed and applied to actual numerical simulation data. The results show that the new SCV is able to describe more complex scientific perceptions.

  3. Semantics-Based Intelligent Indexing and Retrieval of Digital Images - A Case Study

    NASA Astrophysics Data System (ADS)

    Osman, Taha; Thakker, Dhavalkumar; Schaefer, Gerald

    The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches as they typically rely on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this chapter we present a semantically enabled image annotation and retrieval engine that is designed to satisfy the requirements of the commercial image collections market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matching the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries, and we present our initial thoughts on exploiting lexical databases for explicit semantic-based query expansion.

  4. Integrated Data Capturing Requirements for 3d Semantic Modelling of Cultural Heritage: the Inception Protocol

    NASA Astrophysics Data System (ADS)

    Di Giulio, R.; Maietti, F.; Piaia, E.; Medici, M.; Ferrari, F.; Turillazzi, B.

    2017-02-01

    The generation of high-quality 3D models can still be very time-consuming and expensive, and the outcome of digital reconstructions is frequently provided in formats that are not interoperable and therefore cannot be easily accessed. This challenge is even more crucial for complex architectures and large heritage sites, which involve a large amount of data to be acquired, managed and enriched by metadata. In this framework, the ongoing EU-funded project INCEPTION - Inclusive Cultural Heritage in Europe through 3D semantic modelling - proposes a workflow aimed at the achievement of efficient 3D digitization methods, post-processing tools for enriched semantic modelling, and web-based solutions and applications to ensure wide access for experts and non-experts. In order to face these challenges and to start solving the issue of the large amount of captured data and time-consuming processes in the production of 3D digital models, an Optimized Data Acquisition Protocol (DAP) has been set up. Its purpose is to guide the processes of digitization of cultural heritage, respecting the needs, requirements and specificities of cultural assets.

  5. Digital Investigations of AN Archaeological Smart Point Cloud: a Real Time Web-Based Platform to Manage the Visualisation of Semantical Queries

    NASA Astrophysics Data System (ADS)

    Poux, F.; Neuville, R.; Hallot, P.; Van Wersch, L.; Luczfalvy Jancsó, A.; Billen, R.

    2017-05-01

    While virtual copies of the real world tend to be created faster than ever through point clouds and derivatives, their proficient use by all professionals demands adapted tools to facilitate knowledge dissemination. Digital investigations are changing the way cultural heritage researchers, archaeologists, and curators work and collaborate to progressively aggregate expertise through one common platform. In this paper, we present a web application in a WebGL framework accessible on any HTML5-compatible browser. It allows real-time point cloud exploration of the mosaics in the Oratory of Germigny-des-Prés, and it emphasises ease of use as well as performance. Our reasoning engine is constructed over a semantically rich point cloud data structure, where metadata has been injected a priori. We developed a tool that directly allows semantic extraction and visualisation of pertinent information for end users. It leads to efficient communication between actors by proposing optimal 3D viewpoints as a basis on which interactions can grow.

  6. High-Level Data-Abstraction System

    NASA Technical Reports Server (NTRS)

    Fishwick, P. A.

    1986-01-01

    Communication with data-base processor flexible and efficient. High Level Data Abstraction (HILDA) system is three-layer system supporting data-abstraction features of Intel data-base processor (DBP). Purpose of HILDA establishment of flexible method of efficiently communicating with DBP. Power of HILDA lies in its extensibility with regard to syntax and semantic changes. HILDA's high-level query language readily modified. Offers powerful potential to computer sites where DBP attached to DEC VAX-series computer. HILDA system written in Pascal and FORTRAN 77 for interactive execution.

  7. Automated vocabulary discovery for geo-parsing online epidemic intelligence.

    PubMed

    Keller, Mikaela; Freifeld, Clark C; Brownstein, John S

    2009-11-24

    Automated surveillance of the Internet provides a timely and sensitive method for alerting on global emerging infectious disease threats. HealthMap is part of a new generation of online systems designed to monitor and visualize, on a real-time basis, disease outbreak alerts as reported by online news media and public health sources. HealthMap is of specific interest for national and international public health organizations and international travelers. A particular task that makes such surveillance useful is the automated discovery of the geographic references contained in the retrieved outbreak alerts. This task is sometimes referred to as "geo-parsing". A typical approach to geo-parsing would demand an expensive training corpus of alerts manually tagged by a human. Given that human readers perform this kind of task by using both their lexical and contextual knowledge, we developed an approach which relies on a relatively small expert-built gazetteer, thus limiting the need for human input, but focuses on learning the context in which geographic references appear. We show in a set of experiments that this approach exhibits a substantial capacity to discover geographic locations outside of its initial lexicon. The results of this analysis provide a framework for future automated global surveillance efforts that reduce manual input and improve timeliness of reporting.

  8. [Pilot study of domain-specific terminology adaptation for morphological analysis: research on unknown terms in national examination documents of radiological technologists].

    PubMed

    Tsuji, Shintarou; Nishimoto, Naoki; Ogasawara, Katsuhiko

    2008-07-20

    Although large volumes of medical text are stored in electronic format, they are seldom reused because of the difficulty of processing narrative text by computer. Morphological analysis is a key technology for extracting medical terms correctly and automatically; this process parses a sentence into its smallest units, morphemes. Phrases consisting of two or more technical terms, however, cause morphological analysis software to fail in parsing the sentence and to output unprocessed terms as "unknown words." The purpose of this study was to reduce the number of unknown words in medical narrative text processing. The results of parsing the text with additional dictionaries were compared in an analysis of the number of unknown words in the national examination for radiological technologists. The ratio of unknown words was reduced from 1.0% to 0.36% by adding terminologies of radiological technology, MeSH, and ICD-10 labels. The terminology of radiological technology was the most effective resource, accounting for a 0.62-point reduction. This result clearly showed the necessity of careful selection of additional dictionaries and attention to trends in unknown words. The potential of this investigation is to make available a large body of clinical information that would otherwise be inaccessible to applications other than manual health care review by personnel.

  9. Exploring information from the topology beneath the Gene Ontology terms to improve semantic similarity measures.

    PubMed

    Zhang, Shu-Bo; Lai, Jian-Huang

    2016-07-15

    Measuring the similarity between pairs of biological entities is important in molecular biology. The introduction of the Gene Ontology (GO) provides us with a promising approach to quantifying the semantic similarity between two genes or gene products. This kind of similarity measure is closely associated with the GO terms annotated to the biological entities under consideration and the structure of the GO graph. However, previous works in this field mainly focused on the upper part of the graph and seldom considered the lower part. In this study, we aim to explore information from the lower part of the GO graph for better semantic similarity. We propose a framework to quantify the similarity measure beneath a term pair, which takes into account both the information two ancestral terms share and the probability that they co-occur with their common descendants. The effectiveness of our approach was evaluated against seven typical measurements on the public platform CESSM, protein-protein interaction and gene expression datasets. Experimental results consistently show that the similarity derived from the lower part contributes to a better semantic similarity measure. The promising features of our approach are the following: (1) it provides a mirror model to characterize the information two ancestral terms share with respect to their common descendant; (2) it quantifies the probability that two terms co-occur with their common descendant in an efficient way; and (3) our framework can effectively capture the similarity measure beneath two terms, which can serve as an add-on to improve traditional semantic similarity measures between two GO terms. The algorithm was implemented in Matlab and is freely available from http://ejl.org.cn/bio/GOBeneath/.
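
    For context, the traditional "upper part" measures that this work augments are ancestor-based. Below is a toy Resnik-style information-content similarity over a made-up mini-DAG; the real measures operate on the full GO graph with corpus-derived IC values, so everything here is illustrative:

```python
# Sketch: IC of the most informative common ancestor (Resnik-style).
parents = {"t4": {"t2", "t3"}, "t2": {"t1"}, "t3": {"t1"}, "t1": set()}
ic = {"t1": 0.0, "t2": 1.2, "t3": 1.0, "t4": 2.5}   # hypothetical IC values

def ancestors(t):
    """The term itself plus all terms reachable via parent links."""
    out = {t}
    for p in parents[t]:
        out |= ancestors(p)
    return out

def resnik(t_a, t_b):
    common = ancestors(t_a) & ancestors(t_b)
    return max(ic[t] for t in common)   # most informative common ancestor

print(resnik("t2", "t3"))   # 0.0 -- only the root is shared
```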

  10. Don’t Like RDF Reification? Making Statements about Statements Using Singleton Property

    PubMed Central

    Nguyen, Vinh; Bodenreider, Olivier; Sheth, Amit

    2015-01-01

    Statements about RDF statements, or meta triples, provide additional information about individual triples, such as the source, the occurring time or place, or the certainty. Integrating such meta triples into semantic knowledge bases would enable the querying and reasoning mechanisms to be aware of the provenance, time, location, or certainty of triples. However, an efficient RDF representation for such meta knowledge of triples remains challenging. The existing standard reification approach allows such meta knowledge of RDF triples to be expressed using RDF in two steps. The first step is representing the triple by a Statement instance which has its subject, predicate, and object indicated separately in three different triples. The second step is creating assertions about that instance as if it were a statement. While reification is simple and intuitive, this approach does not have formal semantics and is not commonly used in practice, as described in the RDF Primer. In this paper, we propose a novel approach called the Singleton Property for representing statements about statements and provide a formal semantics for it. We explain how this singleton property approach fits well with the existing syntax and formal semantics of RDF, and with the syntax of the SPARQL query language. We also demonstrate the use of singleton properties in the representation and querying of meta knowledge in two examples of Semantic Web knowledge bases: YAGO2 and BKR. Our experiments on the BKR show that the singleton property approach gives a decent performance in terms of number of triples, query length and query execution time compared to existing approaches. This approach, which is also simple and intuitive, can be easily adopted for representing and querying statements about statements in other knowledge bases. PMID:25750938
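
    A minimal sketch of the singleton-property pattern with rdflib: a unique property instance stands in for one statement, and the meta triples attach to it. The example URIs are invented, and the `singletonPropertyOf` predicate is the one proposed in the paper rather than a term in the standard RDF vocabulary:

```python
# Sketch: one statement via a singleton property, plus meta triples.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")
RDFNS = Namespace("http://www.w3.org/1999/02/22-rdf-syntax-ns#")

g = Graph()
sp = EX["hasSpouse#1"]                 # singleton property for one statement
g.add((EX.Bob, sp, EX.Ann))            # the statement itself
g.add((sp, RDFNS.singletonPropertyOf, EX.hasSpouse))
g.add((sp, EX.source, EX.Wikipedia))   # meta triple about the statement
g.add((sp, EX.validFrom, Literal("1990")))
print(g.serialize(format="turtle"))
```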

  11. Semantic Metadata for Heterogeneous Spatial Planning Documents

    NASA Astrophysics Data System (ADS)

    Iwaniak, A.; Kaczmarek, I.; Łukowicz, J.; Strzelecki, M.; Coetzee, S.; Paluszyński, W.

    2016-09-01

    Spatial planning documents contain information about the principles and rights of land use in different zones of a local authority. They are the basis for administrative decision making in support of sustainable development. In Poland these documents are published on the Web according to a prescribed non-extendable XML schema, designed for optimum presentation to humans in HTML web pages. There is no document standard, and limited functionality exists for adding references to external resources. The text in these documents is discoverable and searchable by general-purpose web search engines, but the semantics of the content cannot be discovered or queried. The spatial information in these documents is geographically referenced but not machine-readable. Major manual efforts are required to integrate such heterogeneous spatial planning documents from various local authorities for analysis, scenario planning and decision support. This article presents results of an implementation using machine-readable semantic metadata to identify relationships among regulations in the text, spatial objects in the drawings and links to external resources. A spatial planning ontology was used to annotate different sections of spatial planning documents with semantic metadata in the Resource Description Framework in Attributes (RDFa). The semantic interpretation of the content, links between document elements and links to external resources were embedded in XHTML pages. An example and use case from the spatial planning domain in Poland is presented to evaluate its efficiency and applicability. The solution enables the automated integration of spatial planning documents from multiple local authorities to assist decision makers with understanding and interpreting spatial planning information. The approach is equally applicable to legal documents from other countries and domains, such as cultural heritage and environmental management.

  12. A Semantic Rule-Based Framework for Efficient Retrieval of Educational Materials

    ERIC Educational Resources Information Center

    Mahmoudi, Maryam Tayefeh; Taghiyareh, Fattaneh; Badie, Kambiz

    2013-01-01

    Retrieving resources in an appropriate manner has a promising role in increasing the performance of educational support systems. A variety of works have been done to organize materials for educational purposes using tagging techniques. Despite the effectiveness of these techniques within certain domains, organizing resources in a way being…

  13. Semantic Grammar: An Engineering Technique for Constructing Natural Language Understanding Systems.

    ERIC Educational Resources Information Center

    Burton, Richard R.

    In an attempt to overcome the lack of natural means of communication between student and computer, this thesis addresses the problem of developing a system which can understand natural language within an educational problem-solving environment. The nature of the environment imposes efficiency, habitability, self-teachability, and awareness of…

  14. Single-Event Rapid Word Collection Workshops: Efficient, Effective, Empowering

    ERIC Educational Resources Information Center

    Boerger, Brenda H.; Stutzman, Verna

    2018-01-01

    In this paper we describe single-event Rapid Word Collection (RWC) workshop results in 12 languages, and compare these results to fieldwork lexicons collected by other means. We show that this methodology of collecting words by semantic domain by community engagement leads to obtaining more words in less time than conventional collection methods.…

  15. Scalable and expressive medical terminologies.

    PubMed

    Mays, E; Weida, R; Dionne, R; Laker, M; White, B; Liang, C; Oles, F J

    1996-01-01

    The K-Rep system, based on description logic, is used to represent and reason with large and expressive controlled medical terminologies. Expressive concept descriptions incorporate semantically precise definitions composed using logical operators, together with important non-semantic information such as synonyms and codes. Examples are drawn from our experience with K-Rep in modeling the InterMed laboratory terminology and also in developing a large clinical terminology now in production use at Kaiser Permanente. System-level scalability of performance is achieved through an object-oriented database system which efficiently maps persistent memory to virtual memory. Equally important is conceptual scalability: the ability to support collaborative development, organization, and visualization of a substantial terminology as it evolves over time. K-Rep addresses this need by logically completing concept definitions and automatically classifying concepts in a taxonomy via subsumption inferences. The K-Rep system includes a general-purpose GUI environment for terminology development and browsing, a custom interface for formulary term maintenance, a C++ application program interface, and a distributed client-server mode which provides lightweight clients with efficient run-time access to K-Rep by means of a scripting language.

  16. Harvesting Intelligence in Multimedia Social Tagging Systems

    NASA Astrophysics Data System (ADS)

    Giannakidou, Eirini; Kaklidou, Foteini; Chatzilari, Elisavet; Kompatsiaris, Ioannis; Vakali, Athena

    As more people adopt tagging practices, social tagging systems tend to form rich knowledge repositories that enable the extraction of patterns reflecting the way content semantics is perceived by web users. This is of particular importance, especially in the case of multimedia content, since the availability of such content on the web is very high and its efficient retrieval using textual annotations or content-based automatically extracted metadata still remains a challenge. It is argued that complementing multimedia analysis techniques with knowledge drawn from web social annotations may facilitate multimedia content management. This chapter focuses on analyzing tagging patterns and combining them with content feature extraction methods, thus generating intelligence from multimedia social tagging systems. Emphasis is placed on using all available "tracks" of knowledge, that is, tag co-occurrence together with semantic relations among tags and low-level features of the content. Towards this direction, a survey of the theoretical background and the adopted practices for analysis of multimedia social content is presented. A case study from Flickr illustrates the efficiency of the proposed approach.

  17. Domain Adaption of Parsing for Operative Notes

    PubMed Central

    Wang, Yan; Pakhomov, Serguei; Ryan, James O.; Melton, Genevieve B.

    2016-01-01

    Background: Full syntactic parsing of clinical text as a part of clinical natural language processing (NLP) is critical for a wide range of applications, such as identification of adverse drug reactions, patient cohort identification, and gene interaction extraction. Several robust syntactic parsers are publicly available to produce linguistic representations for sentences. However, these existing parsers are mostly trained on general English text and often require adaptation for optimal performance on clinical text. Our objective was to adapt an existing general English parser for the clinical text of operative reports via lexicon augmentation, statistics adjustment, and grammar rule modification based on a set of biomedical text. Method: The Stanford unlexicalized probabilistic context-free grammar (PCFG) parser lexicon was expanded with the SPECIALIST lexicon, along with statistics collected from a limited set of operative notes tagged with two POS taggers (GENIA tagger and MedPost). The most frequently occurring verb entries of the SPECIALIST lexicon were adjusted based on manual review of verb usage in operative notes. Stanford parser grammar production rules were also modified based on linguistic features of operative reports. An analogous approach was then applied to the GENIA corpus to test the generalizability of this approach to biomedical text. Results: The new unlexicalized PCFG parser, extended with the extra lexicon from SPECIALIST along with accurate statistics collected from an operative note corpus tagged with the GENIA POS tagger, improved parser performance by 2.26%, from 87.64% to 89.90%. There was a progressive improvement with the addition of multiple approaches; most of the improvement occurred with lexicon augmentation combined with statistics from the operative notes corpus. Application of this approach to the GENIA corpus showed that parsing performance was boosted by 3.81% with a simple new grammar and the addition of the GENIA corpus lexicon. Conclusion: Using statistics collected from clinical text tagged with POS taggers, along with proper modification of the grammars and lexicons of an unlexicalized PCFG parser, can improve parsing performance. PMID:25661593
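
    To see the kind of grammar-plus-statistics machinery being adapted here, consider a toy PCFG parse with NLTK; the rules, probabilities, and the operative-note-like sentence are invented for illustration, not drawn from the Stanford parser or the authors' lexicon:

```python
# Sketch: most-probable parse of a sentence under a tiny PCFG.
import nltk

grammar = nltk.PCFG.fromstring("""
    S   -> NP VP          [1.0]
    NP  -> DT NN          [0.6] | NN [0.4]
    VP  -> VBD NP         [1.0]
    DT  -> 'the'          [1.0]
    NN  -> 'incision'     [0.5] | 'surgeon' [0.5]
    VBD -> 'closed'       [1.0]
""")
parser = nltk.ViterbiParser(grammar)
for tree in parser.parse("the surgeon closed the incision".split()):
    print(tree)   # the most probable parse, with its probability
```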

  18. Hierarchical Recurrent Neural Hashing for Image Retrieval With Hierarchical Convolutional Features.

    PubMed

    Lu, Xiaoqiang; Chen, Yaxiong; Li, Xuelong

    Hashing has been an important and effective technology in image retrieval due to its computational efficiency and fast search speed. Traditional hashing methods usually learn hash functions to obtain binary codes by exploiting hand-crafted features, which cannot optimally represent the information in the sample. Recently, deep learning methods have achieved better performance, since deep learning architectures can learn more effective image representation features. However, these methods use only semantic features to generate hash codes by shallow projection and ignore texture details. In this paper, we propose a novel hashing method, namely hierarchical recurrent neural hashing (HRNH), which exploits a hierarchical recurrent neural network to generate effective hash codes. There are three contributions in this paper. First, a deep hashing method is proposed to extensively exploit both spatial details and semantic information; we leverage hierarchical convolutional features to construct an image pyramid representation. Second, our proposed deep network can directly exploit convolutional feature maps as input to preserve the spatial structure of the feature maps. Finally, we propose a new loss function that considers the quantization error of binarizing the continuous embeddings into discrete binary codes, and simultaneously maintains the semantic similarity and balanceable property of the hash codes. Experimental results on four widely used data sets demonstrate that the proposed HRNH can achieve superior performance over other state-of-the-art hashing methods.
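
    The binarize-and-rank step common to deep hashing pipelines like this one is easy to sketch; here the "embeddings" are synthetic stand-ins for the network outputs, and the code length is arbitrary:

```python
# Sketch: sign-threshold continuous embeddings into binary codes,
# then rank the database by Hamming distance to a query code.
import numpy as np

rng = np.random.default_rng(1)
embeddings = rng.normal(size=(1000, 48))    # stand-in for network outputs
codes = (embeddings > 0).astype(np.uint8)   # 48-bit binary codes

def hamming_rank(query_code, k=5):
    d = np.count_nonzero(codes != query_code, axis=1)
    return np.argsort(d)[:k]                # indices of the nearest codes

print(hamming_rank(codes[0]))
```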

  19. Fast and Efficient XML Data Access for Next-Generation Mass Spectrometry.

    PubMed

    Röst, Hannes L; Schmitt, Uwe; Aebersold, Ruedi; Malmström, Lars

    2015-01-01

    In mass spectrometry-based proteomics, XML formats such as mzML and mzXML provide an open and standardized way to store and exchange the raw data (spectra and chromatograms) of mass spectrometric experiments. These file formats are being used by a multitude of open-source and cross-platform tools which allow the proteomics community to access algorithms in a vendor-independent fashion and perform transparent and reproducible data analysis. Recent improvements in mass spectrometry instrumentation have increased the data size produced in a single LC-MS/MS measurement and put substantial strain on open-source tools, particularly those that are not equipped to deal with XML data files that reach dozens of gigabytes in size. Here we present a fast and versatile parsing library for mass spectrometric XML formats available in C++ and Python, based on the mature OpenMS software framework. Our library implements an API for obtaining spectra and chromatograms under memory constraints using random access or sequential access functions, allowing users to process datasets that are much larger than system memory. For fast access to the raw data structures, small XML files can also be completely loaded into memory. In addition, we have improved the parsing speed of the core mzML module by over 4-fold (compared to OpenMS 1.11), making our library suitable for a wide variety of algorithms that need fast access to dozens of gigabytes of raw mass spectrometric data. Our C++ and Python implementations are available for the Linux, Mac, and Windows operating systems. All proposed modifications to the OpenMS code have been merged into the OpenMS mainline codebase and are available to the community at https://github.com/OpenMS/OpenMS.
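
    A short sketch of the memory-constrained access mode described here, using the OpenMS Python bindings (pyopenms). The method names follow the pyopenms API as we understand it, but treat the details as assumptions; the file name is hypothetical and indexed random access requires an indexed mzML file:

```python
# Sketch: fetch single spectra on demand instead of loading the whole run.
import pyopenms

ondisc = pyopenms.OnDiscMSExperiment()   # random access without a full load
ondisc.openFile("large_run.mzML")        # hypothetical indexed mzML file
print("spectra:", ondisc.getNrSpectra())

spec = ondisc.getSpectrum(0)             # one spectrum, read from disk
mz, intensity = spec.get_peaks()
print(mz[:5], intensity[:5])
```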

  1. Analysis of the phase transition in the two-dimensional Ising ferromagnet using a Lempel-Ziv string-parsing scheme and black-box data-compression utilities

    NASA Astrophysics Data System (ADS)

    Melchert, O.; Hartmann, A. K.

    2015-02-01

    In this work we consider information-theoretic observables to analyze short symbolic sequences, comprising time series that represent the orientation of a single spin in a two-dimensional (2D) Ising ferromagnet on a square lattice of size L^2 = 128^2 for different system temperatures T. The latter were chosen from an interval enclosing the critical point T_c of the model. At small temperatures the sequences are thus very regular; at high temperatures they are maximally random. In the vicinity of the critical point, nontrivial, long-range correlations appear. Here we implement estimators for the entropy rate, excess entropy (i.e., "complexity"), and multi-information. First, we implement a Lempel-Ziv string-parsing scheme, providing seemingly elaborate entropy rate and multi-information estimates and an approximate estimator for the excess entropy. Furthermore, we apply easy-to-use black-box data-compression utilities, providing approximate estimators only. For comparison and to yield results for benchmarking purposes, we implement the information-theoretic observables also based on the well-established M-block Shannon entropy, which is more tedious to apply compared to the first two "algorithmic" entropy estimation procedures. To test how well one can exploit the potential of such data-compression techniques, we aim at detecting the critical point of the 2D Ising ferromagnet. Among the above observables, the multi-information, which is known to exhibit an isolated peak at the critical point, is very easy to replicate by means of both efficient algorithmic entropy estimation procedures. Finally, we assess how well the various algorithmic entropy estimates compare to the more conventional block entropy estimates and illustrate a simple modification that yields enhanced results.
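
    Both "algorithmic" estimators are compact to implement. The sketch below, for a generic symbolic sequence given as a Python string, counts Lempel-Ziv phrases (an LZ76/LZ78-style parsing; normalization conventions vary between papers) and compares the resulting entropy-rate estimate with a black-box one from zlib. Neither is the authors' exact implementation; this is only the general idea.

      # Sketch: entropy-rate estimates for a symbolic sequence via
      # (a) Lempel-Ziv phrase counting and (b) a black-box compressor.
      import math
      import zlib

      def lz_phrases(s):
          # each new phrase is the shortest substring not yet in the dictionary
          phrases, count, i = set(), 0, 0
          while i < len(s):
              j = i + 1
              while j <= len(s) and s[i:j] in phrases:
                  j += 1
              phrases.add(s[i:j])
              count += 1
              i = j
          return count

      def lz_entropy_rate(s):
          # asymptotic estimate h ~ c(n) * log2(c(n)) / n, in bits per symbol
          c = lz_phrases(s)
          return c * math.log2(c) / len(s)

      def zlib_entropy_rate(s):
          # black-box estimate: compressed size in bits per input symbol
          return 8 * len(zlib.compress(s.encode(), 9)) / len(s)

      seq = "01" * 500                      # highly regular, low entropy rate
      print(lz_entropy_rate(seq), zlib_entropy_rate(seq))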

  2. Interactive Cohort Identification of Sleep Disorder Patients Using Natural Language Processing and i2b2.

    PubMed

    Chen, W; Kowatch, R; Lin, S; Splaingard, M; Huang, Y

    2015-01-01

    Nationwide Children's Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semi-structured sleep study reports. The system was shown to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and an NLP-based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. 26,550 concepts were extracted, with 99% being textual concepts. 1.01 million facts were extracted from sleep study documents, including demographic information, sleep study lab results, medications, procedures, and diagnoses, among others. The average accuracy of terminology parsing was over 83% when compared against expert annotations. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly, from a few weeks to a few seconds. Natural language processing was shown to be powerful for quickly converting large amounts of semi-structured or unstructured clinical data into discrete concepts, which, in combination with intuitive domain-specific ontologies, allows fast and effective interactive cohort identification through the i2b2 platform for research and clinical use.
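
    The regular-expression parser for numeric concepts can be pictured with a short sketch. The report text, concept names, and pattern below are invented for illustration; they are not from the Nationwide Children's system.

      # Sketch: regex extraction of numeric concepts from a semi-structured
      # sleep study report (field names and format are hypothetical).
      import re

      REPORT = """
      Sleep efficiency: 84.5 %
      Total sleep time: 402 min
      Apnea-hypopnea index: 7.2 /hr
      """

      # one generic pattern: concept name, numeric value, optional unit
      PATTERN = re.compile(
          r"^\s*(?P<concept>[A-Za-z][A-Za-z \-]+?):\s*"
          r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>[%\w/]*)\s*$",
          re.MULTILINE,
      )

      for m in PATTERN.finditer(REPORT):
          # each match becomes a discrete (concept, value, unit) fact
          print(m["concept"].strip(), float(m["value"]), m["unit"])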

  3. Interactive Cohort Identification of Sleep Disorder Patients Using Natural Language Processing and i2b2

    PubMed Central

    Chen, W.; Kowatch, R.; Lin, S.; Splaingard, M.

    2015-01-01

    Summary: Nationwide Children’s Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semi-structured sleep study reports. The system was shown to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. Objective: We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. Methods: We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and an NLP-based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. Results: 26,550 concepts were extracted, with 99% being textual concepts. 1.01 million facts were extracted from sleep study documents, including demographic information, sleep study lab results, medications, procedures, and diagnoses, among others. The average accuracy of terminology parsing was over 83% when compared against expert annotations. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly, from a few weeks to a few seconds. Conclusion: Natural language processing was shown to be powerful for quickly converting large amounts of semi-structured or unstructured clinical data into discrete concepts, which, in combination with intuitive domain-specific ontologies, allows fast and effective interactive cohort identification through the i2b2 platform for research and clinical use. PMID:26171080

  4. Discovering discovery patterns with Predication-based Semantic Indexing.

    PubMed

    Cohen, Trevor; Widdows, Dominic; Schvaneveldt, Roger W; Davies, Peter; Rindflesch, Thomas C

    2012-12-01

    In this paper we utilize methods of hyperdimensional computing to mediate the identification of therapeutically useful connections for the purpose of literature-based discovery. Our approach, named Predication-based Semantic Indexing (PSI), is utilized to empirically identify sequences of relationships known as "discovery patterns", such as "drug x INHIBITS substance y, substance y CAUSES disease z", that link pharmaceutical substances to diseases they are known to treat. These sequences are derived from semantic predications extracted from the biomedical literature by the SemRep system, and subsequently utilized to direct the search for known treatments for a held-out set of diseases. Rapid and efficient inference is accomplished through the application of geometric operators in PSI space, allowing for both the derivation of discovery patterns from a large set of known TREATS relationships, and the application of these discovered patterns to constrain the search for therapeutic relationships at scale. Our results include the rediscovery of discovery patterns that had been constructed manually by other authors in previous research, as well as the discovery of a set of previously unrecognized patterns. The application of these patterns to direct search through PSI space results in better recovery of therapeutic relationships than is accomplished with models based on distributional statistics alone. These results demonstrate the utility of efficient approximate inference in geometric space as a means to identify therapeutic relationships, suggesting a role for these methods in drug repurposing efforts. In addition, the results provide strong support for the utility of the discovery pattern approach pioneered by Hristovski and his colleagues. Copyright © 2012 Elsevier Inc. All rights reserved.
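
    The geometric inference step can be illustrated with a toy vector-symbolic model: random bipolar vectors bound by elementwise multiplication. This approximates the spirit of PSI's operators without reproducing its exact encoding; all vectors and names below are synthetic. A minimal sketch:

      # Sketch: toy vector-symbolic "discovery pattern" inference.
      import numpy as np

      rng = np.random.default_rng(0)
      DIM = 4096

      def rand_vec():
          # random bipolar (+1/-1) elemental vector; self-inverse under
          # elementwise multiplication, which serves as the binding operator
          return rng.choice([-1.0, 1.0], size=DIM)

      substance_y, disease_z, other, thing = (rand_vec() for _ in range(4))
      INHIBITS, CAUSES = rand_vec(), rand_vec()

      # semantic vectors as superpositions of bound predications
      drug_x_sem = INHIBITS * substance_y + other * thing  # drug_x INHIBITS substance_y
      substance_y_sem = CAUSES * disease_z                 # substance_y CAUSES disease_z

      def sim(a, b):
          return float(a @ b) / DIM      # cosine-like similarity in [-1, 1]

      # two-step discovery-pattern inference by unbinding:
      step1 = INHIBITS * drug_x_sem      # ~ substance_y plus crosstalk noise
      print(sim(step1, substance_y))     # close to 1
      step2 = CAUSES * substance_y_sem   # ~ disease_z
      print(sim(step2, disease_z))       # 1.0 in this toy case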

  5. Discovering discovery patterns with predication-based Semantic Indexing

    PubMed Central

    Cohen, Trevor; Widdows, Dominic; Schvaneveldt, Roger W.; Davies, Peter; Rindflesch, Thomas C.

    2012-01-01

    In this paper we utilize methods of hyperdimensional computing to mediate the identification of therapeutically useful connections for the purpose of literature-based discovery. Our approach, named Predication-based Semantic Indexing (PSI), is utilized to empirically identify sequences of relationships known as “discovery patterns”, such as “drug x INHIBITS substance y, substance y CAUSES disease z”, that link pharmaceutical substances to diseases they are known to treat. These sequences are derived from semantic predications extracted from the biomedical literature by the SemRep system, and subsequently utilized to direct the search for known treatments for a held-out set of diseases. Rapid and efficient inference is accomplished through the application of geometric operators in PSI space, allowing for both the derivation of discovery patterns from a large set of known TREATS relationships, and the application of these discovered patterns to constrain the search for therapeutic relationships at scale. Our results include the rediscovery of discovery patterns that had been constructed manually by other authors in previous research, as well as the discovery of a set of previously unrecognized patterns. The application of these patterns to direct search through PSI space results in better recovery of therapeutic relationships than is accomplished with models based on distributional statistics alone. These results demonstrate the utility of efficient approximate inference in geometric space as a means to identify therapeutic relationships, suggesting a role for these methods in drug repurposing efforts. In addition, the results provide strong support for the utility of the discovery pattern approach pioneered by Hristovski and his colleagues. PMID:22841748

  6. The Semantic Distance Task: Quantifying Semantic Distance with Semantic Network Path Length

    ERIC Educational Resources Information Center

    Kenett, Yoed N.; Levi, Effi; Anaki, David; Faust, Miriam

    2017-01-01

    Semantic distance is a determining factor in cognitive processes, such as semantic priming, operating upon semantic memory. The main computational approach to compute semantic distance is through latent semantic analysis (LSA). However, objections have been raised against this approach, mainly in its failure at predicting semantic priming. We…

  7. A Multi-Encoding Approach for LTL Symbolic Satisfiability Checking

    NASA Technical Reports Server (NTRS)

    Rozier, Kristin Y.; Vardi, Moshe Y.

    2011-01-01

    Formal behavioral specifications written early in the system-design process and communicated across all design phases have been shown to increase the efficiency, consistency, and quality of the system under development. To prevent introducing design or verification errors, it is crucial to test specifications for satisfiability. Our focus here is on specifications expressed in linear temporal logic (LTL). We introduce a novel encoding of symbolic transition-based Büchi automata and a novel, "sloppy," transition encoding, both of which result in improved scalability. We also define novel BDD variable orders based on tree decomposition of formula parse trees. We describe and extensively test a new multi-encoding approach utilizing these novel encoding techniques to create 30 encoding variations. We show that our novel encodings translate to significant, sometimes exponential, improvement over the current standard encoding for symbolic LTL satisfiability checking.
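
    The parse-tree-derived variable orders can be illustrated with a toy prefix-notation LTL parser: a depth-first walk of the tree yields a naive first-occurrence ordering of the atomic propositions, a stand-in for the tree-decomposition-based orders the paper actually proposes. A minimal sketch:

      # Sketch: parse a prefix-notation LTL formula and derive a naive
      # BDD variable order from a depth-first walk of its parse tree.
      UNARY, BINARY = {"!", "X", "G", "F"}, {"&", "|", "U", "R"}

      def parse(tokens):
          # returns (subtree, remaining tokens); subtrees are nested tuples
          head, rest = tokens[0], tokens[1:]
          if head in UNARY:
              sub, rest = parse(rest)
              return (head, sub), rest
          if head in BINARY:
              left, rest = parse(rest)
              right, rest = parse(rest)
              return (head, left, right), rest
          return head, rest                    # atomic proposition

      def var_order(tree, order):
          # depth-first first-occurrence order of atomic propositions
          if isinstance(tree, tuple):
              for child in tree[1:]:
                  var_order(child, order)
          elif tree not in order:
              order.append(tree)
          return order

      tree, _ = parse("& G p U q r".split())   # (G p) & (q U r)
      print(tree)                              # ('&', ('G', 'p'), ('U', 'q', 'r'))
      print(var_order(tree, []))               # ['p', 'q', 'r']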

  8. Gro2mat: a package to efficiently read gromacs output in MATLAB.

    PubMed

    Dien, Hung; Deane, Charlotte M; Knapp, Bernhard

    2014-07-30

    Molecular dynamics (MD) simulations are a state-of-the-art computational method used to investigate molecular interactions at atomic scale. Interaction processes out of experimental reach can be monitored using MD software such as Gromacs. Here, we present the gro2mat package, which allows fast and easy access to Gromacs output files from MATLAB. Gro2mat enables direct parsing of the most common Gromacs output formats, including the binary xtc format; no openly available MATLAB parser currently exists for this format. The xtc reader is orders of magnitude faster than other available pdb/ascii workarounds. Gro2mat is especially useful for scientists with an interest in quick prototyping of new mathematical and statistical approaches for Gromacs trajectory analyses. Copyright © 2014 Wiley Periodicals, Inc.

  9. Trying something new.

    PubMed

    Condon, Barbara Backer

    2013-01-01

    Trying something new is a universal living experience of health. Although trying something new frequently occurs in healthcare, its meaning has never explicitly been studied. Parse's humanbecoming school of thought is the theoretical perspective for this study. The research question for this study is: What is the structure of the living experience of trying something new? The purpose of this study was to advance nursing science. Parse's qualitative phenomenological-hermeneutic research method was used to guide this study. Participants were 8 men and 2 women, ages 29 to 65, who used an outpatient mental health facility in the Midwest. Data were collected through dialogical engagement. The major finding of the study is the structure: trying something new is engaging in capricious exploitations with vacillating sentiments, as wistful contemplation surfaces with disparate affiliations.

  10. Motion based parsing for video from observational psychology

    NASA Astrophysics Data System (ADS)

    Kokaram, Anil; Doyle, Erika; Lennon, Daire; Joyeux, Laurent; Fuller, Ray

    2006-01-01

    In Psychology it is common to conduct studies involving the observation of humans undertaking some task. The sessions are typically recorded on video and used for subjective visual analysis. The subjective analysis is tedious and time consuming, not only because much useless video material is recorded but also because subjective measures of human behaviour are not necessarily repeatable. This paper presents tools using content based video analysis that allow automated parsing of video from one such study involving Dyslexia. The tools rely on implicit measures of human motion that can be generalised to other applications in the domain of human observation. Results comparing quantitative assessment of human motion with subjective assessment are also presented, illustrating that the system is a useful scientific tool.
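
    A simple implicit motion measure of the kind described is per-frame motion energy. The sketch below, run on a synthetic grayscale "video" (numpy arrays rather than real footage), thresholds the mean absolute frame difference to locate an activity burst; it illustrates the idea, not the authors' actual system.

      # Sketch: segment a recorded session by per-frame motion energy
      # (mean absolute difference between consecutive grayscale frames).
      import numpy as np

      rng = np.random.default_rng(1)
      frames = np.zeros((200, 64, 64))                      # static "video"
      frames[80:120] = rng.normal(0, 5, size=(40, 64, 64))  # burst of activity

      energy = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))
      threshold = energy.mean() + 2 * energy.std()          # naive threshold
      active = np.flatnonzero(energy > threshold)
      print("motion detected between frames", active.min(), "and", active.max() + 1)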

  11. Introduction of statistical information in a syntactic analyzer for document image recognition

    NASA Astrophysics Data System (ADS)

    Maroneze, André O.; Coüasnon, Bertrand; Lemaitre, Aurélie

    2011-01-01

    This paper presents an improvement to document layout analysis systems, offering a possible solution to Sayre's paradox (which states that an element "must be recognized before it can be segmented; and it must be segmented before it can be recognized"). This improvement, based on stochastic parsing, allows the integration of statistical information, obtained from recognizers, during syntactic layout analysis. We present how this fusion of numeric and symbolic information in a feedback loop can be applied to syntactic methods to improve the expressiveness of document descriptions. To limit combinatorial explosion during the exploration of solutions, we devised an operator that allows optional activation of the stochastic parsing mechanism. Our evaluation on 1250 handwritten business letters shows that this method improves global recognition scores.

  12. GFFview: A Web Server for Parsing and Visualizing Annotation Information of Eukaryotic Genome.

    PubMed

    Deng, Feilong; Chen, Shi-Yi; Wu, Zhou-Lin; Hu, Yongsong; Jia, Xianbo; Lai, Song-Jia

    2017-10-01

    Owing to the wide application of RNA sequencing (RNA-seq) technology, more and more eukaryotic genomes have been extensively annotated, including gene structure, alternative splicing, and noncoding loci. Genome annotation information is prevalently stored as plain text in General Feature Format (GFF), and such files can be hundreds or thousands of megabytes in size. Manipulating GFF files is therefore a challenge for biologists without bioinformatics skills. In this study, we provide a web server (GFFview) for parsing the annotation information of a eukaryotic genome and then generating statistical descriptions of six indices for visualization. GFFview is very useful for investigating the quality of, and differences among, de novo assembled transcriptomes in RNA-seq studies.
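
    One of the core operations behind such a server is streaming a large GFF file and accumulating summary statistics. A minimal sketch (the file name is hypothetical, and the tally of feature types per sequence is just one of many possible indices):

      # Sketch: stream a GFF3 file and tally feature types per sequence
      # without loading the whole (potentially multi-GB) file into memory.
      from collections import Counter

      def gff_stats(path):
          counts = Counter()
          with open(path) as fh:
              for line in fh:
                  if line.startswith("#") or not line.strip():
                      continue                 # skip comments and blank lines
                  cols = line.rstrip("\n").split("\t")
                  if len(cols) < 8:
                      continue                 # skip malformed rows
                  seqid, ftype = cols[0], cols[2]
                  counts[(seqid, ftype)] += 1
          return counts

      for (seqid, ftype), n in sorted(gff_stats("annotation.gff3").items()):
          print(seqid, ftype, n)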

  13. Indexing and retrieval of MPEG compressed video

    NASA Astrophysics Data System (ADS)

    Kobla, Vikrant; Doermann, David S.

    1998-04-01

    To keep pace with the increased popularity of digital video as an archival medium, the development of techniques for fast and efficient analysis of video streams is essential. In particular, solutions to the problems of storing, indexing, browsing, and retrieving video data from large multimedia databases are necessary to allow access to these collections. Given that video is often stored efficiently in a compressed format, the costly overhead of decompression can be reduced by analyzing the compressed representation directly. In earlier work, we presented compressed-domain parsing techniques which identified shots, subshots, and scenes. In this article, we present efficient key frame selection, feature extraction, indexing, and retrieval techniques that are directly applicable to MPEG compressed video. We develop a frame-type-independent representation which normalizes spatial and temporal features including frame type, frame size, macroblock encoding, and motion compensation vectors. Features for indexing are derived directly from this representation and mapped to a low-dimensional space where they can be accessed using standard database techniques. Spatial information is used as the primary index into the database, and temporal information is used to rank retrieved clips and enhance the robustness of the system. The techniques presented enable efficient indexing, querying, and retrieval of compressed video, as demonstrated by our system, which typically takes a fraction of a second to retrieve similar video scenes from a database, with over 95 percent recall.
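
    Once compressed-domain features are mapped to a low-dimensional space, retrieval reduces to nearest-neighbor search. A minimal sketch with synthetic stand-in features (one could imagine, say, the fraction of intra-coded macroblocks and the mean motion-vector magnitude per clip; the values here are random):

      # Sketch: index clips by low-dimensional feature vectors and rank a
      # query by Euclidean distance (features here are synthetic stand-ins).
      import numpy as np

      rng = np.random.default_rng(2)
      db = rng.random((1000, 4))        # 1000 indexed clips, 4 features each
      query = rng.random(4)             # feature vector of the query clip

      dist = np.linalg.norm(db - query, axis=1)
      top5 = np.argsort(dist)[:5]       # closest clips by spatial features
      print("closest clips:", top5, "distances:", np.round(dist[top5], 3))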

  14. CLOUD: general-purpose instrument monitoring and data managing software

    NASA Astrophysics Data System (ADS)

    Dias, António; Amorim, António; Tomé, António

    2016-04-01

    An effective experiment depends on the ability to store and deliver data and information to all participant parties, regardless of their degree of involvement in the specific parts that make the experiment a whole. Having fast, efficient, and ubiquitous access to data will increase visibility and discussion, such that the outcome will have already been reviewed several times, strengthening the conclusions. The CLOUD project aims at providing users with a general-purpose data acquisition, management, and instrument monitoring platform that is fast, easy to use, lightweight, and accessible to all participants of an experiment. This work is now implemented in the CLOUD experiment at CERN and will be fully integrated with the experiment as of 2016. Despite being used in an experiment of the scale of CLOUD, this software can also be used in experiments or monitoring stations of any size, from single computers to large networks of computers, to monitor any sort of instrument output without influencing the individual instrument's DAQ. Instrument data and metadata are stored and accessed via a specially designed database architecture, and any type of instrument output is accepted by our continuously growing parsing application. Multiple databases can be used to separate different data-taking periods, or a single database can be used if, for instance, an experiment is continuous. A simple web-based application gives the user total control over the monitored instruments and their data, allowing data visualization and download, upload of processed data, and the ability to edit existing instruments or add new instruments to the experiment. When in a network, new computers are immediately recognized and added to the system, and are able to monitor instruments connected to them. Automatic computer integration is achieved by a locally running Python-based parsing agent that communicates with a main server application, guaranteeing that all instruments assigned to that computer are monitored with parsing intervals as fast as milliseconds. This software (server + agents + interface + database) comes in easy, ready-to-use packages that can be installed on any operating system, including Android and iOS. This software is ideal for use in modular experiments or monitoring stations with large variability in instruments and measuring methods, or in large collaborations where data require homogenization in order to be effectively transmitted to all involved parties. This work presents the software and provides a performance comparison with previously used monitoring systems in the CLOUD experiment at CERN.
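
    The agent-plus-database pattern can be sketched in a few lines: a parsing agent turns raw instrument output lines into typed records and stores them. The line format, instrument name, and in-memory SQLite backend below are illustrative assumptions, not the CLOUD software's actual schema or protocol.

      # Sketch: a minimal "parsing agent" that ingests instrument output
      # lines (simulated "timestamp,value" CSV) into a SQLite store.
      import sqlite3
      import time

      db = sqlite3.connect(":memory:")   # a real agent would use a shared store
      db.execute("""CREATE TABLE IF NOT EXISTS readings
                    (instrument TEXT, ts REAL, value REAL)""")

      def ingest(instrument, line):
          ts, value = line.strip().split(",")   # expected format: "ts,value"
          db.execute("INSERT INTO readings VALUES (?, ?, ?)",
                     (instrument, float(ts), float(value)))
          db.commit()

      for line in [f"{time.time()},{v}" for v in (1.2, 1.3, 1.1)]:
          ingest("anemometer-01", line)          # hypothetical instrument name
      print(db.execute("SELECT COUNT(*) FROM readings").fetchone()[0], "rows stored")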

  15. A Linked Fusion of Things, Services, and Data to Support a Collaborative Data Management Facility

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stephan, Eric G.; Elsethagen, Todd O.; Wynne, Adam S.

    The purpose of this paper is to illustrate the use of semantic technologies and approaches to seamlessly link things, services, and data in the proposed design of a scientific offshore wind energy research facility for the U.S. Department of Energy Wind and Water Technology Office of the Office of Energy Efficiency and Renewable Energy (EERE). By adapting linked community best practices, we were able to design a collaborative facility supporting both operational staff and end users that incorporates off-the-shelf components and overcomes traditional barriers between devices, resulting data, and processing services. This was made largely possible through complementary advances in the Internet of Things (IoT), semantic web, Linked Services, and Linked Data communities, which provide the foundation for our design.

  16. Semantic Catalog of Things, Services, and Data to Support a Wind Data Management Facility

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stephan, E. G.; Elsethagen, T. O.; Berg, L. K.

    The purpose of this paper is to discuss how community vocabularies and linked open data best practices are being used to seamlessly link things, data, and off-the-shelf services to support scientific offshore wind energy research for the U.S. Department of Energy’s Office of Energy Efficiency and Renewable Energy (EERE) Wind and Water Power Program. This is largely made possible by leveraging collaborative advances in the Internet of Things (IoT), Semantic Web, Linked Services, Linked Open Data (LOD), and RDF vocabulary communities, which provide the foundation for our design. By adapting these linked community best practices, we designed a wind characterization data management facility capable of continually collecting, processing, and preserving in situ and remote sensing instrument data.

  17. A program for the conversion of The National Map data from proprietary format to resource description framework (RDF)

    USGS Publications Warehouse

    Bulen, Andrew; Carter, Jonathan J.; Varanka, Dalia E.

    2011-01-01

    To expand data functionality and capabilities for users of The National Map of the U.S. Geological Survey, data sets for six watersheds and three urban areas were converted from the Best Practices vector data model formats to Semantic Web data formats. This report describes and documents the conversion process. The report begins with an introduction to basic Semantic Web standards and the background of The National Map. Data were converted from a proprietary format to Geography Markup Language to capture the geometric footprint of topographic data features. Configuration files were designed to eliminate redundancy and make the conversion more efficient. A SPARQL endpoint was established for data validation and queries. The report concludes by describing the results of the conversion.
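
    The core of such a conversion is emitting RDF triples for each feature record. A minimal sketch using the rdflib library; the namespace and attribute names are illustrative stand-ins, not the actual The National Map schema:

      # Sketch: emit RDF triples (N-Triples) for one topographic feature
      # record, using an invented namespace and invented attributes.
      from rdflib import Graph, Literal, Namespace, RDF, URIRef

      TNM = Namespace("http://example.org/thenationalmap/")   # hypothetical
      g = Graph()

      feature = URIRef(TNM["feature/12345"])                  # hypothetical ID
      g.add((feature, RDF.type, TNM.StreamSegment))
      g.add((feature, TNM.name, Literal("Clear Creek")))
      g.add((feature, TNM.lengthKm, Literal(4.7)))

      print(g.serialize(format="nt"))

    A SPARQL endpoint loaded with such triples can then answer validation queries (for example, listing all features missing a name), which is the role the endpoint plays in the report.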

  18. Development of an Agile Knowledge Engineering Framework in Support of Multi-Disciplinary Translational Research

    PubMed Central

    Borlawsky, Tara B.; Dhaval, Rakesh; Hastings, Shannon L.; Payne, Philip R. O.

    2009-01-01

    In October 2006, the National Institutes of Health launched a new national consortium, funded through Clinical and Translational Science Awards (CTSA), with the primary objective of improving the conduct and efficiency of the inherently multi-disciplinary field of translational research. To help meet this goal, the Ohio State University Center for Clinical and Translational Science has launched a knowledge management initiative that is focused on facilitating widespread semantic interoperability among administrative, basic science, clinical, and research computing systems, both internally and among the translational research community at-large, through the integration of domain-specific standard terminologies and ontologies with local annotations. This manuscript describes an agile framework that builds upon prevailing knowledge engineering and semantic interoperability methods, and will be implemented as part of this initiative. PMID:21347164

  19. Development of an agile knowledge engineering framework in support of multi-disciplinary translational research.

    PubMed

    Borlawsky, Tara B; Dhaval, Rakesh; Hastings, Shannon L; Payne, Philip R O

    2009-03-01

    In October 2006, the National Institutes of Health launched a new national consortium, funded through Clinical and Translational Science Awards (CTSA), with the primary objective of improving the conduct and efficiency of the inherently multi-disciplinary field of translational research. To help meet this goal, the Ohio State University Center for Clinical and Translational Science has launched a knowledge management initiative that is focused on facilitating widespread semantic interoperability among administrative, basic science, clinical, and research computing systems, both internally and among the translational research community at-large, through the integration of domain-specific standard terminologies and ontologies with local annotations. This manuscript describes an agile framework that builds upon prevailing knowledge engineering and semantic interoperability methods, and will be implemented as part of this initiative.

  20. Semantic Memory in the Clinical Progression of Alzheimer Disease.

    PubMed

    Tchakoute, Christophe T; Sainani, Kristin L; Henderson, Victor W

    2017-09-01

    Semantic memory measures may be useful in tracking and predicting the progression of Alzheimer disease. We investigated relationships among semantic memory tasks and their 1-year predictive value in women with Alzheimer disease. We conducted secondary analyses of a randomized clinical trial of raloxifene in 42 women with late-onset mild-to-moderate Alzheimer disease. We assessed semantic memory with tests of oral confrontation naming, category fluency, semantic recognition and semantic naming, and semantic density in written narrative discourse. We measured global cognition (Alzheimer Disease Assessment Scale, cognitive subscale), dementia severity (Clinical Dementia Rating sum of boxes), and daily function (Activities of Daily Living Inventory) at baseline and 1 year. At baseline and 1 year, most semantic memory scores correlated highly or moderately with each other and with global cognition, dementia severity, and daily function. Semantic memory task performance at 1 year had worsened by one-third to one-half of a standard deviation. Factor analysis of baseline test scores distinguished processes in semantic and lexical retrieval (semantic recognition, semantic naming, confrontation naming) from processes in lexical search (semantic density, category fluency). The semantic-lexical retrieval factor predicted global cognition at 1 year. Considered separately, baseline confrontation naming and category fluency predicted dementia severity, while semantic recognition and a composite of semantic recognition and semantic naming predicted global cognition. No individual semantic memory test predicted daily function. Semantic-lexical retrieval and lexical search may represent distinct aspects of semantic memory. Semantic memory processes are sensitive to cognitive decline and dementia severity in Alzheimer disease.
