Sample records for natural language parsers

  1. Policy-Based Management Natural Language Parser

    NASA Technical Reports Server (NTRS)

    James, Mark

    2009-01-01

    The Policy-Based Management Natural Language Parser (PBEM) is a rules-based approach to enterprise management that can be used to automate certain management tasks. This parser simplifies the management of a given endeavor by establishing policies to deal with situations that are likely to occur. Policies are operating rules that can be referred to as a means of maintaining order, security, consistency, or other ways of successfully furthering a goal or mission. PBEM provides a way of managing configuration of network elements, applications, and processes via a set of high-level rules or business policies rather than managing individual elements, thus switching the control to a higher level. This software allows unique management rules (or commands) to be specified and applied to a cross-section of the Global Information Grid (GIG). This software embodies a parser that is capable of recognizing and understanding conversational English. Because all possible dialect variants cannot be anticipated, a unique capability was developed that parses based on conversation intent rather than the exact way the words are used. This software can increase productivity by enabling a user to converse with the system in conversational English to define network policies. PBEM can be used in both manned and unmanned science-gathering programs. Because policy statements can be domain-independent, this software can be applied equally to a wide variety of applications.

  2. From Natural Language Specifications to Program Input Parsers

    E-print Network

    Lei, Tao

    We present a method for automatically generating input parsers from English specifications of input file formats. We use a Bayesian generative model to capture relevant natural language phenomena and translate the English ...

  3. Benchmarking natural-language parsers for biological applications using dependency graphs

    PubMed Central

    Clegg, Andrew B; Shepherd, Adrian J

    2007-01-01

    Background: Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. Results: Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. Conclusion: Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques. PMID:17254351

  4. Parallel Processing of Natural Language Parsers M. P. van Lohuizen

    E-print Network

    Kuzmanov, Georgi

    demand for computing power. Almost all of today's NLP systems use unification-based parsers the most computationally expensive component. In order to let NLP systems be responsive, NLP systems Parallel parsing for NLP in specific has been researched extensively. For example, Tomita[9] presented

  5. Are Efficient Natural Language Parsers Robust? Kyongho Min & William H. Wilson

    E-print Network

    Wilson, Bill

    -correcting chart parser: a basic bottom-up chart parser, and chart parsers employing selectivity, top-down filtering, and a combination of selectivity and a top-down filtering. The combined selectivity and top-down at the syntactic level. Keywords: robust parsing, ill-formed text, selectivity, top-down filtering, spelling

  6. Flexible natural language parser based on a two-level representation of syntax

    SciTech Connect

    Lesmo, L.; Torasso, P.

    1983-01-01

    In this paper the authors present a parser that makes explicit the interconnections between syntax and semantics, analyzes sentences in a quasi-deterministic fashion and, in many cases, identifies the roles of the various constituents even if the sentence is ill-formed. The main feature of the approach on which the parser is based is a two-level representation of the syntactic knowledge: a first set of rules emits hypotheses about the constituents of the sentence and their functional role, and another set of rules verifies whether a hypothesis satisfies the constraints about the well-formedness of sentences. However, the application of the second set of rules is delayed until the semantic knowledge confirms the acceptability of the hypothesis. If the semantics reject it, a new hypothesis is obtained by applying a simple and relatively inexpensive natural modification; a set of these modifications is predefined, and only when none of them is applicable is a real backup performed: in most cases this situation corresponds to a case where people would normally garden path. 19 references.

  7. Fence - An Efficient Parser with Ambiguity Support for Model-Driven Language Specification

    E-print Network

    Quesada, Luis; Cortijo, Francisco J

    2011-01-01

    Model-based language specification has applications in the implementation of language processors, the design of domain-specific languages, model-driven software development, data integration, text mining, natural language processing, and corpus-based induction of models. Model-based language specification decouples language design from language processing and, unlike traditional grammar-driven approaches, which constrain language designers to specific kinds of grammars, it needs general parser generators able to deal with ambiguities. In this paper, we propose Fence, an efficient bottom-up parsing algorithm with lexical and syntactic ambiguity support that enables the use of model-based language specification in practice.

  8. The Accelerator Markup Language and the Universal Accelerator Parser

    SciTech Connect

    Sagan, D.; Forster, M.; /Cornell U., LNS; Bates, D.A.; /LBL, Berkeley; Wolski, A.; /Liverpool U. /Cockcroft Inst. Accel. Sci. Tech.; Schmidt, F.; /CERN; Walker, N.J.; /DESY; Larrieu, T.; Roblin, Y.; /Jefferson Lab; Pelaia, T.; /Oak Ridge; Tenenbaum, P.; Woodley, M.; /SLAC; Reiche, S.; /UCLA

    2006-10-06

    A major obstacle to collaboration on accelerator projects has been the sharing of lattice description files between modeling codes. To address this problem, a lattice description format called Accelerator Markup Language (AML) has been created. AML is based upon the standard eXtensible Markup Language (XML) format; this provides the flexibility for AML to be easily extended to satisfy changing requirements. In conjunction with AML, a software library, called the Universal Accelerator Parser (UAP), is being developed to speed the integration of AML into any program. The UAP is structured to make it relatively straightforward (by giving appropriate specifications) to read and write lattice files in any format. This will allow programs that use the UAP code to read a variety of different file formats. Additionally, this will greatly simplify conversion of files from one format to another. Currently, besides AML, the UAP supports the MAD lattice format.

  9. The Accelerator Markup Language and the Universal Accelerator Parser

    SciTech Connect

    Sagan, David; Forster, M.; Bates, D.; Wolski, A.; Schmidt, F.; Walker, N.J.; Larrieu, Theodore; Roblin, Yves; Pelaia, T.; Tenenbaum, P.; Woodley, M.; Reiche, S.

    2006-07-01

    A major obstacle to collaboration on accelerator projects has been the sharing of lattice description files between modeling codes. To address this problem, a lattice description format called Accelerator Markup Language (AML) has been created. AML is based upon the standard eXtensible Markup Language (XML) format; this provides the flexibility for AML to be easily extended to satisfy changing requirements. In conjunction with AML, a software library, called the Universal Accelerator Parser (UAP), is being developed to speed the integration of AML into any program. The UAP is structured to make it relatively straightforward (by giving appropriate specifications) to read and write lattice files in any format. This will allow programs that use the UAP code to read a variety of different file formats. Additionally this will greatly simplify conversion of files from one format to another. Currently, besides AML, the UAP supports the MAD lattice format.

  10. DynGenPar - A Dynamic Generalized Parser for Common Mathematical Language

    E-print Network

    Neumaier, Arnold

    DynGenPar - A Dynamic Generalized Parser for Common Mathematical Language Kevin Kofler and Arnold Neumaier University of Vienna, Austria Faculty of Mathematics Nordbergstr. 15, 1090 Wien, Austria kevin-term goal is to computerize a large library of existing mathematical knowledge using the new parser

  11. Natural-Language Parser for PBEM

    NASA Technical Reports Server (NTRS)

    James, Mark

    2010-01-01

    A computer program called "Hunter" accepts, as input, a colloquial-English description of a set of policy-based-management rules, and parses that description into a form useable by policy-based enterprise management (PBEM) software. PBEM is a rules-based approach suitable for automating some management tasks. PBEM simplifies the management of a given enterprise through establishment of policies addressing situations that are likely to occur. Hunter was developed to have a unique capability to extract the intended meaning instead of focusing on parsing the exact ways in which individual words are used.

  12. Wait-and-See Strategies for Parsing Natural Language

    E-print Network

    Marcus, Mitchell P.

    The intent of this paper is to convey one idea central to the structure of a natural language parser currently under development, the notion of wait-and-see strategies. This notion will hopefully allow the recognition of ...

  13. Speed up of XML parsers with PHP language implementation

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2012-11-01

    In this paper, the authors introduce PHP5's XML implementation and show how to read, parse, and write a short and uncomplicated XML file using SimpleXML in a PHP environment. The ways in which the PHP5 language and the XML standard can work together are described. The details of the parsing process with SimpleXML are also clarified. A practical project, PHP-XML-MySQL, presents the advantages of XML implementation in PHP modules. This approach allows comparatively simple search of XML hierarchical data by means of PHP software tools. The proposed project includes a database, which can be extended with new data and new XML parsing functions.

  14. A natural language interface to databases

    NASA Technical Reports Server (NTRS)

    Ford, D. R.

    1988-01-01

    The development of a Natural Language Interface which is semantic-based and uses Conceptual Dependency representation is presented. The system was developed using Lisp and currently runs on a Symbolics Lisp machine. A key point is that the parser handles morphological analysis, which expands its ability to understand more words.

  15. NEWCAT: Parsing natural language using left-associative grammar

    SciTech Connect

    Hausser, R.

    1986-01-01

    This book shows that constituent structure analysis induces an irregular order of linear composition which is the direct cause of extreme computational inefficiency. It proposes an alternative left-associative grammar which operates with a regular order of linear compositions. Left-associative grammar is based on building up and cancelling valencies. Left-associative parsers differ from all other systems in that the history of the parse doubles as the linguistic analysis. Left-associative grammar is illustrated with two left-associative natural language parsers: one for German and one for English.

  16. A NATURAL LANGUAGE PARSER WITH INTERLEAVED SPELLING CORRECTION

    E-print Network

    GRAMMAR AND ILL-FORMED INPUT BY MOHAMMAD ALI ELMI Submitted in partial fulfillment of the requirements of Technology Advisor Approved Chicago, Illinois December 1994 ii © Copyright by Mohammad Ali Elmi 1994

  17. Natural Language Processing.

    ERIC Educational Resources Information Center

    Chowdhury, Gobinda G.

    2003-01-01

    Discusses issues related to natural language processing, including theoretical developments; natural language understanding; tools and techniques; natural language text processing systems; abstracting; information extraction; information retrieval; interfaces; software; Internet, Web, and digital library applications; machine translation for…

  18. Toward a theory of distributed word expert natural language parsing

    NASA Technical Reports Server (NTRS)

    Rieger, C.; Small, S.

    1981-01-01

    An approach to natural language meaning-based parsing in which the unit of linguistic knowledge is the word rather than the rewrite rule is described. In the word expert parser, knowledge about language is distributed across a population of procedural experts, each representing a word of the language, and each an expert at diagnosing that word's intended usage in context. The parser is structured around a coroutine control environment in which the generator-like word experts ask questions and exchange information in coming to collective agreement on sentence meaning. The word expert theory is advanced as a better cognitive model of human language expertise than the traditional rule-based approach. The technical discussion is organized around examples taken from the prototype LISP system which implements parts of the theory.

  19. Natural language processors

    SciTech Connect

    Rauzino, V.C.

    1983-09-05

    The development of natural language processors has required a shift in the perception of language structures to bring the user interface closer to the ultimate ease of natural language dialogue. This article explains the principles of these new natural language processors which are increasingly becoming commercially available.

  20. Processing Natural Language without Natural Language Processing

    Microsoft Academic Search

    Eric Brill

    2003-01-01

    We can still create computer programs displaying only the most rudimentary natural language processing capabilities. One of the greatest barriers to advanced natural language processing is our inability to overcome the linguistic knowledge acquisition bottleneck. In this paper, we describe recent work in a number of areas, including grammar checker development, automatic question answering, and language modeling, where state of

  1. LRSYS. PASCAL LR(1) Parser Generator System

    SciTech Connect

    O'Hair, K. [Lawrence Livermore National Lab., CA (United States)]

    1985-04-01

    LRSYS is a complete LR(1) parser generator system written entirely in a portable subset of Pascal. The system, LRSYS, includes a grammar analyzer program (LR) which reads a context-free (BNF) grammar as input and produces LR(1) parsing tables as output, a lexical analyzer generator (LEX) which reads regular expressions created by the REG process as input and produces lexical tables as output, and various parser skeletons that get merged with the tables to produce complete parsers (SMAKE). Current parser skeletons include Pascal, FORTRAN 77, and C. Other language skeletons can easily be added to the system. LRSYS is based on the LR program.

  2. Automatic natural language parsing

    SciTech Connect

    Sparck Jones, K.; Wilks, Y.

    1985-01-01

    This collection of papers on automatic natural language parsing examines research and development in language processing over the past decade. It focuses on current trends toward a phrase structure grammar and deterministic parsing.

  3. Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability

    Microsoft Academic Search

    Ana-Maria Popescu; Alex Armanasu; Oren Etzioni; David Ko; Alexander Yates

    2004-01-01

    Natural Language Interfaces to Databases (NLIs) can benefit from the advances in statistical parsing over the last fifteen years or so. However, statistical parsers require training on a massive, labeled corpus, and manually creating such a corpus for each database is prohibitively expensive. To address this quandary, this paper reports on the PRECISE NLI, which uses a

  4. Errors and Intelligence in Computer-Assisted Language Learning: Parsers and Pedagogues. Routledge Studies in Computer Assisted Language Learning

    ERIC Educational Resources Information Center

    Heift, Trude; Schulze, Mathias

    2012-01-01

    This book provides the first comprehensive overview of theoretical issues, historical developments and current trends in ICALL (Intelligent Computer-Assisted Language Learning). It assumes a basic familiarity with Second Language Acquisition (SLA) theory and teaching, CALL and linguistics. It is of interest to upper undergraduate and/or graduate…

  5. Natural language parsing in a hybrid connectionist-symbolic architecture

    NASA Astrophysics Data System (ADS)

    Mueller, Adrian; Zell, Andreas

    1991-03-01

    Most connectionist parsers either cannot guarantee the correctness of their derivations or have to simulate a serial flow of control. In the first case, users have to restrict the tasks of the parser (e.g. parse less complex or shorter sentences) or they need to believe in the soundness of the result. In the second case, the resulting network has lost most of its attractiveness because seriality needs to be hard-coded into the structure of the net. We here present a hybrid symbolic connectionist parser, which was designed to fulfill the following goals: (1) parsing of sentences without length restriction, (2) soundness and completeness for any context-free grammar, and (3) learning the applicability of parsing rules with a neural network. Our hybrid architecture consists of a serial parsing algorithm and a trainable net. BrainC (Backtracking and Backpropagation in C) combines the well-known technique of shift-reduce parsing with backtracking and a backpropagation network used to learn and represent the typical properties of the trained natural language grammars. The system has been implemented as a subsystem of the Rochester Connectionist Simulator (RCS) on SUN workstations and was tested with several grammars for English and German. We discuss how BrainC reached its design goals and what results we observed.
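
    The serial half of the architecture described above, shift-reduce parsing with backtracking, can be illustrated with a minimal Python sketch. This is not the BrainC implementation: the toy grammar, the function name shift_reduce, and the fixed order in which rules are tried are all assumptions made for illustration, and the sketch assumes a grammar without empty or cyclic unit productions. In the system the abstract describes, a trained network learns which parsing rules are applicable; here every matching rule is simply tried in turn.

```python
# Hypothetical toy grammar: (left-hand side, right-hand side) pairs.
TOY_RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("Det", ("the",)),
    ("N", ("dog",)),
    ("N", ("cat",)),
    ("V", ("sees",)),
]


def shift_reduce(tokens, rules, start="S"):
    """Recognize `tokens` with a backtracking shift-reduce search."""
    tokens = tuple(tokens)

    def step(stack, pos):
        # Success: all input consumed and only the start symbol remains.
        if pos == len(tokens) and stack == (start,):
            return True
        # Try each reduction whose right-hand side matches the stack top.
        for lhs, rhs in rules:
            n = len(rhs)
            if n and stack[-n:] == rhs:
                if step(stack[:-n] + (lhs,), pos):
                    return True
        # No reduction led to success: backtrack into a shift instead.
        if pos < len(tokens):
            return step(stack + (tokens[pos],), pos + 1)
        return False

    return step((), 0)


print(shift_reduce("the dog sees the cat".split(), TOY_RULES))
```

    Because a failed reduction falls through to the next rule and finally to a shift, the search explores every shift/reduce choice before rejecting a sentence, which is what makes the recognizer complete for grammars of this restricted kind.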

  6. Building friendly parsers

    Microsoft Academic Search

    Fahimeh Jalili; Jean H. Gallier

    1982-01-01

    An essential part of any interactive programming development system is an incremental parser capable of error recovery. This paper presents a general incremental parser for LR(1) grammars allowing several/any form of modifications in the input program. The parser is supplemented with error recovery routines using an error automaton. Errors can be corrected automatically by recovery routines, or by the user

  7. Natural Language Processing

    Microsoft Academic Search

    Stefano Ferilli

    Text processing represents a preliminary phase to many document content handling tasks aimed at extracting and organizing information therein. The computer science disciplines devoted to understanding language, and hence useful for such objectives, are Computational Linguistics and Natural Language Processing. They rely on the availability of suitable linguistic resources (corpora, computational lexica, etc.) and of standard representation models of linguistic

  8. LRSYS. PASCAL LR(1) Parser Generator System

    SciTech Connect

    O'Hair, K. [Lawrence Livermore National Lab., CA (United States)]

    1985-04-01

    LRSYS is a complete LR(1) parser generator system written entirely in a portable subset of Pascal. The system, LRSYS, includes a grammar analyzer program (LR) which reads a context-free (BNF) grammar as input and produces LR(1) parsing tables as output, a lexical analyzer generator (LEX) which reads regular expressions created by the REG process as input and produces lexical tables as output, and various parser skeletons that get merged with the tables to produce complete parsers (SMAKE). Current parser skeletons include Pascal, FORTRAN 77, and C. In addition, the Cray1 version contains LRLTRAN and CFT-FORTRAN 77 skeletons. Other language skeletons can easily be added to the system. LRSYS is based on the LR program.

  9. LRSYS. PASCAL LR(1) Parser Generator System

    SciTech Connect

    O'Hair, K. [Lawrence Livermore National Lab., CA (United States)]

    1985-04-01

    LRSYS is a complete LR(1) parser generator system written entirely in a portable subset of Pascal. The system, LRSYS, includes a grammar analyzer program (LR) which reads a context-free (BNF) grammar as input and produces LR(1) parsing tables as output, a lexical analyzer generator (LEX) which reads regular expressions created by the REG process as input and produces lexical tables as output, and various parser skeletons that get merged with the tables to produce complete parsers (SMAKE). Current parser skeletons include Pascal, FORTRAN 77, and C. In addition, the DEC VAX11 version contains LRLTRAN and CFT-FORTRAN 77 skeletons. Other language skeletons can easily be added to the system. LRSYS is based on the LR program.

  10. A distributed intelligent information system with natural language input for ad hoc knowledge discovery in databases

    SciTech Connect

    Fass, D.; Hall, G.; Laurens, O.; McFetridge, P.; Popowich, F.; Rueden, M. von [Simon Fraser Univ., Burnaby, British Columbia (Canada)]

    1996-11-01

    A distributed information system is described which features a graphic user interface incorporating natural language input and which provides ad hoc knowledge discovery in relational databases. The system is comprised of multiple processes which communicate with each other over a network. The knowledge discovery process involves extracting generalizations from data using background knowledge in the form of concept hierarchies and a learning procedure based upon an attribute-oriented induction technique. The natural language understanding process is a parser based on Head-Driven Phrase Structure Grammar (HPSG), a modern lexicon-based grammar formalism better equipped than older rule-based approaches for handling the often idiosyncratic behavior of words. To generate semantic interpretations, the parser makes use of a process which orders logical access paths in unnormalized databases based on the strength of their dependency structures and on their efficiency of execution.

  11. Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors

    E-print Network

    Kuhn, Tobias

    2011-01-01

    Existing grammar frameworks do not work out particularly well for controlled natural languages (CNL), especially if they are to be used in predictive editors. I introduce in this paper a new grammar notation, called Codeco, which is designed specifically for CNLs and predictive editors. Two different parsers have been implemented and a large subset of Attempto Controlled English (ACE) has been represented in Codeco. The results show that Codeco is practical, adequate and efficient.

  12. Natural Language Introduction to NLP

    E-print Network

    Inkpen, Diana

    Translation 1/11/2014 Speech and Language Processing - Jurafsky and Martin 12 The automatic translation Natural Language Processing Introduction to NLP Speech and Language Processing - Jurafsky and Martin 2 Natural Language Processing We're going to study what goes into getting computers to perform

  13. La Description des langues naturelles en vue d'applications linguistiques: Actes du colloque (The Description of Natural Languages with a View to Linguistic Applications: Conference Papers). Publication K-10.

    ERIC Educational Resources Information Center

    Ouellon, Conrad, Comp.

    Presentations from a colloquium on applications of research on natural languages to computer science address the following topics: (1) analysis of complex adverbs; (2) parser use in computerized text analysis; (3) French language utilities; (4) lexicographic mapping of official language notices; (5) phonographic codification of Spanish; (6)…

  14. NLP (Natural Language Processing) for NLP (Natural Language Programming)

    Microsoft Academic Search

    Rada Mihalcea; Hugo Liu; Henry Lieberman

    2006-01-01

    Natural Language Processing holds great promise for making computer interfaces that are easier to use for people, since people will (hopefully) be able to talk to the computer in their own language, rather than learn a specialized language of computer commands. For programming, however, the necessity of a formal programming language for communicating with a computer has always been

  15. A Model-Driven Parser Generator, from Abstract Syntax Trees to Abstract Syntax Graphs

    E-print Network

    Quesada, Luis; Cubero, Juan-Carlos

    2012-01-01

    Model-based parser generators decouple language specification from language processing. The model-driven approach avoids the limitations that conventional parser generators impose on the language designer. Conventional tools require the designed language grammar to conform to the specific kind of grammar supported by the particular parser generator (LL and LR parser generators being the most common). Model-driven parser generators, like ModelCC, do not require a grammar specification, since that grammar can be automatically derived from the language model and, if needed, adapted to conform to the requirements of the given kind of parser, all of this without interfering with the conceptual design of the language and its associated applications. Moreover, model-driven tools such as ModelCC are able to automatically resolve references between language elements, hence producing abstract syntax graphs instead of abstract syntax trees as the result of the parsing process. Such graphs are not confined to directed ac...

  16. Introduction to natural language processing

    SciTech Connect

    Harris, M.D.

    1984-01-01

    This book presents an overview of the production by computers and utilization of natural language, as differentiated from programming language. It considers both the practical and theoretical problems of natural language input-output. It presents the computational aspects of the subject with exceptional clarity through the use of concrete programs written in Pascal. It outlines methods for analysis, synthesis, and transformation of language. The book treats syntax and grammar (structure), semantics (inherent meaning), and representation of knowledge (storage and access).

  17. The Nature of Natural Languages.

    ERIC Educational Resources Information Center

    Pierce, Joe E.

    A variety of types of evidence are examined to help determine the true nature of "deep structure" and what, if any, implications this has for linguistic theory as well as culture theory generally. The evidence accumulated over the past century on the nature of phonetic and phonemic systems is briefly discussed, and the following areas of analysis…

  18. Naturally Embedded Query Languages

    Microsoft Academic Search

    Val Tannen; Peter Buneman; Limsoon Wong

    1992-01-01

    We investigate the properties of a simple programming language whose main computational engine is structural recursion on sets. We describe a progression of sublanguages in this paradigm that (1) have increasing expressive power, and (2) illustrate robust conceptual restrictions thus exhibiting interesting additional properties. These properties suggest that we consider our sublanguages as candidates for query languages. Viewing query languages

  19. Left-corner unification-based natural language processing

    SciTech Connect

    Lytinen, S.L.; Tomuro, N. [DePaul Univ., Chicago, IL (United States)]

    1996-12-31

    In this paper, we present an efficient algorithm for parsing natural language using unification grammars. The algorithm is an extension of left-corner parsing, a bottom-up algorithm which utilizes top-down expectations. The extension exploits unification grammar's uniform representation of syntactic, semantic, and domain knowledge, by incorporating all types of grammatical knowledge into parser expectations. In particular, we extend the notion of the reachability table, which provides information as to whether or not a top-down expectation can be realized by a potential subconstituent, by including all types of grammatical information in table entries, rather than just phrase structure information. While our algorithm's worst-case computational complexity is no better than that of many other algorithms, we present empirical testing in which average-case linear time performance is achieved. Our testing indicates this to be much improved average-case performance over previous left-corner techniques.
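
    The reachability table mentioned above can be illustrated, in its plain phrase-structure form only, with a short Python sketch. This is not the authors' algorithm: their extension stores all types of grammatical information in the table entries, whereas this sketch records category symbols alone. The toy grammar and the names left_corner_table and can_start are assumptions introduced for illustration.

```python
from collections import defaultdict

# Hypothetical toy grammar: (left-hand side, right-hand side) pairs.
TOY_RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("NP", ("NP", "PP")),
    ("VP", ("V", "NP")),
    ("PP", ("P", "NP")),
]


def left_corner_table(rules):
    """Map each symbol to the set of symbols that can begin it.

    The result is the reflexive, transitive closure of the relation
    "B is the first symbol on the right-hand side of a rule for A".
    """
    symbols = set()
    direct = defaultdict(set)
    for lhs, rhs in rules:
        symbols.add(lhs)
        symbols.update(rhs)
        if rhs:
            direct[lhs].add(rhs[0])

    reach = {s: {s} | direct[s] for s in symbols}
    changed = True
    while changed:  # iterate until no entry grows any further
        changed = False
        for a in symbols:
            extra = set()
            for b in reach[a]:
                extra |= reach[b]
            if not extra <= reach[a]:
                reach[a] |= extra
                changed = True
    return reach


TABLE = left_corner_table(TOY_RULES)


def can_start(expected, found):
    """True if a constituent `found` can be the left corner of `expected`."""
    return found in TABLE[expected]


print(can_start("S", "Det"), can_start("S", "V"))
```

    A left-corner parser would consult can_start before building on a newly recognized constituent, discarding it early when it cannot begin the category currently expected top-down.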

  20. Coping with Ambiguity in Knowledge-based Natural Language Analysis

    E-print Network

    Jordan, Pamela W.

    University klb@cs.cmu.edu Abstract This paper describes the strategies and techniques used by the English (KBMT) system designed to translate English source documents into multiple target languages the "Universal Parser" [Tomita and Carbonell, 1987] with a grammar formalism based on Lexical-Functional Grammar

  1. SRI International: Natural Language Program

    NSDL National Science Digital Library

    This website describes the Natural Language Program that is part of SRI International's Artificial Intelligence Center. The center's research focuses on natural language theory and applications, with emphasis on three subgroups of study. The subprogram on Multimedia / Multimodal Interfaces seeks to understand the optimal ways in which natural language can be incorporated into multimedia interfaces. The subprogram on Spoken Language Systems integrates linguistic processing with speech recognition for use in ATIS, a system for retrieving airline schedules, fares, and related information from a relational database. The subprogram on Written Language Systems researches the problems involved in interpreting and extracting information from written text, such as on-line newspaper articles. Additional information on these projects, related publications, and software are available from this website.

  2. Building representations from natural language

    E-print Network

    Seifter, Mark J

    2007-01-01

    In this thesis, I describe a system I built that produces instantiated representations from descriptions embedded in natural language. For example, in the sentence 'The girl walked to the table', my system produces a ...

  3. Toward understanding natural language directions

    E-print Network

    Kollar, Thomas Fleming

    Speaking using unconstrained natural language is an intuitive and flexible way for humans to interact with robots. Understanding this kind of linguistic input is challenging because diverse words and phrases must be mapped ...

  4. Natural Language Processing for Biosurveillance

    Microsoft Academic Search

    Wendy W. Chapman; Adi V. Gundlapalli; Brett R. South; John N. Dowling

    Information described in electronic clinical reports can be useful for both detection and characterization of outbreaks. However, the information is in unstructured, free-text format and is not available to computerized applications. Natural language processing methods structure free-text information by classifying, extracting, and encoding details from the text. We provide a brief description of the types of natural language processing techniques

  5. The parser generator as a general purpose tool

    NASA Technical Reports Server (NTRS)

    Noonan, R. E.; Collins, W. R.

    1985-01-01

    The parser generator has proven to be an extremely useful, general purpose tool. It can be used effectively by programmers having only a knowledge of grammars and no training at all in the theory of formal parsing. Some of the application areas for which a table-driven parser can be used include interactive query languages, menu systems, translators, and programming support tools. Each of these is illustrated by an example grammar.

  6. Prolog implementation of lexical functional grammar as a base for a natural language processing system

    SciTech Connect

    Frey, W.; Reyle, U.

    1983-01-01

    The authors present a system which constructs a database out of a narrative natural language text. Firstly they give a detailed description of the PROLOG implementation of the parser which is based on the theory of lexical functional grammar (LFG). They show that PROLOG provides an efficient tool for LFG implementation. Secondly, they postulate some requirements a semantic representation has to fulfil in order to be able to analyse whole texts. They show how Kamp's theory meets these requirements by analysing sample discourses involving anaphoric NPs. 4 references.

  7. Natural language processing of lyrics

    Microsoft Academic Search

    Jose P. G. Mahedero; Álvaro Martínez; Pedro Cano; Markus Koppenberger; Fabien Gouyon

    2005-01-01

    We report experiments on the use of standard natural language processing (NLP) tools for the analysis of music lyrics. A significant amount of music audio has lyrics. Lyrics encode an important part of the semantics of a song, therefore their analysis complements that of acoustic and cultural metadata and is fundamental for the development of complete music information retrieval systems.

  8. A shallow parser based on closed-class words to capture relations in biomedical text.

    PubMed

    Leroy, Gondy; Chen, Hsinchun; Martinez, Jesse D

    2003-06-01

    Natural language processing for biomedical text currently focuses mostly on entity and relation extraction. These entities and relations are usually pre-specified entities, e.g., proteins, and pre-specified relations, e.g., inhibit relations. A shallow parser that captures the relations between noun phrases automatically from free text has been developed and evaluated. It uses heuristics and a noun phraser to capture entities of interest in the text. Cascaded finite state automata structure the relations between individual entities. The automata are based on closed-class English words and model generic relations not limited to specific words. The parser also recognizes coordinating conjunctions and captures negation in text, a feature usually ignored by others. Three cancer researchers evaluated 330 relations extracted from 26 abstracts of interest to them. There were 296 relations correctly extracted from the abstracts resulting in 90% precision of the relations and an average of 11 correct relations per abstract. PMID:14615225
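
    The reported figures can be checked directly from the counts given in the abstract. The variable names below are chosen for this example only.

```python
# Counts taken from the abstract above.
extracted = 330   # relations evaluated by the three researchers
correct = 296     # relations judged correctly extracted
abstracts = 26    # abstracts the relations came from

precision = correct / extracted
per_abstract = correct / abstracts

print(f"precision: {precision:.1%}")                  # precision: 89.7%
print(f"correct per abstract: {per_abstract:.1f}")    # correct per abstract: 11.4
```

    Rounded to whole numbers, these agree with the stated 90% precision and 11 correct relations per abstract.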

  9. Connectionist natural language parsing with BrainC

    NASA Astrophysics Data System (ADS)

    Mueller, Adrian; Zell, Andreas

    1991-08-01

    A close examination of pure neural parsers shows that they either could not guarantee the correctness of their derivations or had to hard-code seriality into the structure of the net. The authors therefore decided to use a hybrid architecture, consisting of a serial parsing algorithm and a trainable net. The system fulfills the following design goals: (1) parsing of sentences without length restriction, (2) soundness and completeness for any context-free language, and (3) learning the applicability of parsing rules with a neural network to increase the efficiency of the whole system. BrainC (backtracking and backpropagation in C) combines the well-known shift-reduce parsing technique with backtracking with a backpropagation network to learn and represent typical structures of the trained natural language grammars. The system has been implemented as a subsystem of the Rochester Connectionist Simulator (RCS) on SUN workstations and was tested with several grammars for English and German. The design of the system and then the results are discussed.
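
    The serial half of such a hybrid can be pictured with a minimal shift-reduce recognizer that backtracks over its choices. This is a sketch under stated assumptions, not BrainC itself: the toy grammar and the function name parse are hypothetical, the input is a sequence of part-of-speech tags, and rules are tried in a fixed order where BrainC would use a trained network to judge which rule is applicable.

```python
# Hypothetical toy grammar over part-of-speech tags: (lhs, rhs) pairs.
# It has no empty rules and no unit-rule cycles, so the search terminates.
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("VP", ("V",)),
]

def parse(stack, tokens, goal="S"):
    """Return True if some sequence of shift/reduce actions yields `goal`."""
    if not tokens and stack == (goal,):
        return True
    # Try each reduce action whose right-hand side matches the stack top.
    for lhs, rhs in GRAMMAR:
        n = len(rhs)
        if stack[-n:] == rhs:
            if parse(stack[:-n] + (lhs,), tokens, goal):
                return True          # this choice led to a full parse
            # otherwise fall through: backtrack and try the next action
    # Shift the next token onto the stack.
    if tokens:
        return parse(stack + (tokens[0],), tokens[1:], goal)
    return False
```

    With the input ("Det", "N", "V", "Det", "N"), the early reduction of V to VP leads to a dead end, and the recognizer backtracks to shift the object noun phrase instead. A learned rule-selection component aims to avoid exactly this kind of wasted search.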

  10. Java Mathematical Expression Parser

    NSDL National Science Digital Library

    Funk, Nathan.

    The Java Mathematical Expression Parser (JEP) is a handy tool "for parsing and evaluating mathematical expressions." It is a no-frills package that incorporates several important features, including user-definable functions and implicit multiplication for easy use. JEP can be downloaded as a complete application, or a couple of its features can be used online as applets. There is a separate page of documentation and installation instructions. Also available on this Web site is the AutoAbacus, which allows users to input a system of equations and obtain the solutions instantaneously.

  11. Discriminative Reranking for Natural Language Parsing

    Microsoft Academic Search

    Michael Collins

    2000-01-01

    This paper considers approaches which rerank the output of an existing probabilistic parser. The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking, using additional features of the tree as evidence. We describe and compare

  12. White paper on natural language processing

    Microsoft Academic Search

    Ralph Weischedel; Jaime Carbonell; Barbara Grosz; Wendy Lehnert; Mitchell Marcus; Raymond Perrault; Robert Wilensky

    1989-01-01

    We take the ultimate goal of natural language processing (NLP) to be the ability to use natural languages as effectively as humans do. Natural language, whether spoken, written, or typed, is the most natural means of communication between humans, and the mode of expression of choice for most of the documents they produce. As computers play a larger role in

  13. The design of a parser generator

    Microsoft Academic Search

    David A. Workman; John B. Higdon

    1978-01-01

    In this paper we present an algorithm for generating bounded-context parsers using a modified version of Knuth's algorithm for generating LR(k) parsers. We also describe the internal design of a parser generating system capable of constructing the parsing tables for (s,l) bounded-context and LR(1) parsers. The parser generating system is written in PL/1 and is operational on an IBM-370/165. The

  14. Deep Natural Language Processing for Italian Sign Language Translation

    E-print Network

    Mazzei, Alessandro

    Deep Natural Language Processing for Italian Sign Language Translation No Author Given No Institute Sign Language. We describe the main features of the four modules of this architecture, i showed a growing interest toward sign languages, and a number of projects concerning the translation

  15. GEMINI: A Natural Language System for Spoken-Language Understanding

    Microsoft Academic Search

    John Dowding; Jean Mark Gawron; Douglas E. Appelt; John Bear; Lynn Cherny; Robert C. Moore; Douglas B. Moran

    1993-01-01

    Gemini is a natural language understanding system developed for spoken language applications. This paper describes the details of the system, and includes relevant measurements of size, efficiency, and performance of each of its sub-components in detail.

  16. Foundations of Statistical Natural Language Processing

    Microsoft Academic Search

    Christopher D. Manning; Hinrich Schütze

    1999-01-01

    this paper as "the first clear demonstration of a probabilistic parser outperforming a trigram model" (pg. 457), it does not discuss what features of the algorithm lead to its superior results

  17. Natural language processing: an introduction

    PubMed Central

    Ohno-Machado, Lucila; Chapman, Wendy W

    2011-01-01

    Objectives To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. Target audience This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. Scope We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field. PMID:21846786

  18. A lex-based mad parser and its applications

    SciTech Connect

    Oleg Krivosheev et al.

    2001-07-03

    An embeddable and portable Lex-based MAD language parser has been developed. The parser consists of a front-end which reads a MAD file and keeps beam elements, beam line data and algebraic expressions in tree-like structures, and a back-end, which processes the front-end data to generate an input file or data structures compatible with user applications. Three working programs are described, namely, a MAD to C++ converter, a dynamic C++ object factory and a MAD-MARS beam line builder. Design and implementation issues are discussed.

  19. PARSING TURKISH SENTENCES FOR NATURAL LANGUAGE WATERMARKING

    E-print Network

    Güngör, Tunga

    , methodologies, and design decisions for transforming a Treebank for Turkish Language to a hierarchical model for information hiding in natural language text can be grouped under two categories. First group is based PARSING TURKISH SENTENCES FOR NATURAL LANGUAGE WATERMARKING Ersin Ihsan Ünkar June 2007 Submitted

  20. Natural Language Processing, Instructor: Diana Inkpen

    E-print Network

    Inkpen, Diana

    @eecs.uottawa.ca Preliminaries Why study Natural Language Processing (NLP)? NLP is a very important current area to the world NLP and related terms Natural language processing (NLP) = manipulation, processing the way people do it". Language engineering = Building systems that apply the techniques of NLP; has

  1. ALGORITHMIC ASPECTS OF NATURAL LANGUAGE PROCESSING

    E-print Network

    Nederhof, Mark-Jan

    ) studies algorithms, tools and techniques for automatic processing of natural languages. A related ALGORITHMIC ASPECTS OF NATURAL LANGUAGE PROCESSING Mark-Jan Nederhof, University of St Andrews languages, which are designed to allow easy processing by computer algorithms. Typically, programs in pro

  2. Edinburgh Research Explorer Historical Post Office Directory Parser (POD Parser) Software

    E-print Network

    Millar, Andrew J.

    Edinburgh Research Explorer Historical Post Office Directory Parser (POD Parser) Software From, 'Historical Post Office Directory Parser (POD Parser) Software From the AddressingHistory Project' Journal Office Directories (PODs) with contemporaneous historical maps. TPODs emerged during the late

  3. Natural language and spatial reasoning

    E-print Network

    Tellex, Stefanie, 1980-

    2010-01-01

    Making systems that understand language has long been a dream of artificial intelligence. This thesis develops a model for understanding language about space and movement in realistic situations. The system understands ...

  4. Recent Developments in Natural Language Text Retrieval

    Microsoft Academic Search

    Tomek Strzalkowski; Jose Perez Carballo

    1993-01-01

    This paper reports on some recent developments in our natural language text retrieval system. The system uses advanced natural language processing techniques to enhance the effectiveness of term-based document retrieval. The backbone of our system is a traditional statistical engine which builds inverted index files from pre-processed documents, and then searches and ranks the documents in response to user queries. Natural language processing is used to (1) preprocess the
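
    The statistical backbone described in this abstract, an inverted index that is searched and ranked against a query, can be sketched in a few lines. This is an illustration only: the function names build_index and search, the whitespace tokenizer, and the raw term-count scoring are assumptions for the example, not the authors' engine or weighting scheme.

```python
from collections import Counter, defaultdict

def build_index(docs):
    """Build an inverted index: term -> {doc_id: term count in that doc}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index

def search(index, query):
    """Rank documents by the summed counts of the query terms they contain."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    return [doc_id for doc_id, _ in scores.most_common()]

# Hypothetical three-document collection.
DOCS = {
    1: "natural language parsing",
    2: "language language models",
    3: "image retrieval",
}
INDEX = build_index(DOCS)
```

    Here search(INDEX, "language") ranks document 2 ahead of document 1 because the term occurs twice in it. Linguistic preprocessing of the kind the abstract mentions would change which terms enter the index, not the lookup itself.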

  5. Parallel processing of natural language

    SciTech Connect

    Chang, H.O.

    1986-01-01

    Two types of parallel natural language processing are studied in this work: (1) the parallelism between syntactic and nonsyntactic processing and (2) the parallelism within syntactic processing. It is recognized that a syntactic category can potentially be attached to more than one node in the syntactic tree of a sentence. Even if all the attachments are syntactically well-formed, nonsyntactic factors such as semantic and pragmatic consideration may require one particular attachment. Syntactic processing must synchronize and communicate with nonsyntactic processing. Two syntactic processing algorithms are proposed for use in a parallel environment: Earley's algorithm and the LR(k) algorithm. Conditions are identified to detect the syntactic ambiguity and the algorithms are augmented accordingly. It is shown that by using nonsyntactic information during syntactic processing, backtracking can be reduced, and the performance of the syntactic processor is improved. For the second type of parallelism, it is recognized that one portion of a grammar can be isolated from the rest of the grammar and be processed by a separate processor. A partial grammar of a larger grammar is defined. Parallel syntactic processing is achieved by using two processors concurrently: the main processor (mp) and the auxiliary processor (ap).

  6. Empirical learning of natural language processing tasks

    Microsoft Academic Search

    Walter Daelemans; A. Van den Bosch

    1997-01-01

    Language learning has thus far not been a hot application for machine-learning (ML) research. This limited attention for work on empirical learning of language knowledge and behaviour from text and speech data seems unjustified. After all, it is becoming apparent that empirical learning of Natural Language Processing (NLP) can alleviate NLP's all-time main problem, viz. the knowledge acquisition bottleneck: empirical

  7. A Robust Parser-Interpreter for Jazz Chord Sequences

    E-print Network

    Steedman, Mark

    A Robust Parser-Interpreter for Jazz Chord Sequences Mark Granroth-Wilding1 and Mark Steedman2 1 to semantics in language. Our parser uses a formal grammar of jazz chord sequences, of a kind widely used in the grammar. Using machine learning over a small corpus of jazz chord sequences annotated with harmonic anal

  8. Natural language processing based ontology learning

    Microsoft Academic Search

    Chengxiang Yuan; Yi Zhuang; Xiaojun Li

    2010-01-01

    Though the utility of domain Ontologies is now widely acknowledged in an increasing number of domains, a critical task of identifying, defining, and entering the concept definitions is still intractable. Nowadays, natural language becomes a more and more important information source, and natural language processing (NLP) becomes more and more practical. To reduce time, cost and making use of NLP,

  9. Natural Language Processing on the Web

    E-print Network

    Natural Language Processing on the Web Guy Lapalme RALI-DIRO, Université de Montréal ! http://www.iro.umontreal.ca/~lapalme Overview · What is Natural Language Processing (NLP) · NLP for the Web · The Web for NLP 2 recognition 5 http://rali.iro.umontreal.ca NLP for the syntactic Web search engines · NLP saved

  10. Natural language processing: a prolog perspective

    Microsoft Academic Search

    Christian Bitter; David A. Elizondo; Yingjie Yang

    2010-01-01

    Natural language processing (NLP) is a vibrant field of interdisciplinary Computer Science research. Ultimately, NLP seeks to build intelligence into software so that software will be able to process a natural language as skillfully and artfully as humans. Prolog, a general purpose logic programming language, has been used extensively to develop NLP applications or components thereof. This report is concerned

  11. Natural Language Processing and Digital Libraries

    Microsoft Academic Search

    Jean-pierre Chanod

    1999-01-01

    As one envisions a document model where language, physical location and medium - electronic, paper or other - impose no barrier to effective use, natural language processing will play an increasing role, especially in the context of digital libraries. This paper presents language components based mostly on finite-state technology that improve our capabilities for exploring, enriching and interacting in various

  12. Natural language search of structured documents

    E-print Network

    Oney, Stephen W

    2008-01-01

    This thesis focuses on techniques with which natural language can be used to search for specific elements in a structured document, such as an XML file. The goal is to create a system capable of being trained to identify ...

  13. Sepia: a Framework for Natural Language Semantics

    E-print Network

    Marton, Gregory Adam

    2009-05-28

    To help explore linguistic semantics in the context of computational natural language understanding, Sepia provides a realization of the central theoretical idea of categorial grammar: linking words and phrases to compositional ...

  14. Natural Language Processing and Systems Biology

    Microsoft Academic Search

    K. Bretonnel Cohen; Lawrence Hunter

    This chapter outlines the basic families of applications of natural language processing techniques to questions of interest to systems biologists and describes publicly available resources for such applications.

  15. Decoding algorithms for complex natural language tasks

    E-print Network

    Deshpande, Pawan

    2007-01-01

    This thesis focuses on developing decoding techniques for complex Natural Language Processing (NLP) tasks. The goal of decoding is to find an optimal or near optimal solution given a model that defines the goodness of a ...

  16. Learning Semantic Maps from Natural Language Descriptions

    E-print Network

    Walter, Matthew R.

    This paper proposes an algorithm that enables robots to efficiently learn human-centric models of their environment from natural language descriptions. Typical semantic mapping approaches augment metric maps with higher-level ...

  17. Natural Language Interfaces to Databases - An Introduction

    Microsoft Academic Search

    Ion Androutsopoulos; Graeme D. Ritchie; Peter Thanisch

    1995-01-01

    This paper is an introduction to natural language interfaces to databases (Nlidbs). A brief overview of the history of Nlidbs is first given. Some advantages and disadvantages of Nlidbs are then discussed, comparing Nlidbs to formal query languages, form-based interfaces, and graphical interfaces. An introduction to some of the linguistic problems Nlidbs have to confront follows, for the benefit

  18. Dynamic semantics for a controlled natural language

    Microsoft Academic Search

    Rolf Schwitter

    2004-01-01

    In this paper the author presents a dynamic approach for constructing an unambiguous semantic representation for a text written in a controlled natural language called PENG. The semantic representation is built up incrementally - in left-to-right order - while the user of the PENG system writes the text. For each word form that the user types, the language processor creates a

  19. Introduction: Natural Language Processing and Information Retrieval.

    ERIC Educational Resources Information Center

    Smeaton, Alan F.

    1990-01-01

    Discussion of research into information and text retrieval problems highlights the work with automatic natural language processing (NLP) that is reported in this issue. Topics discussed include the occurrences of nominal compounds; anaphoric references; discontinuous language constructs; automatic back-of-the-book indexing; and full-text analysis.…

  20. Visual Tools for Natural Language Processing

    Microsoft Academic Search

    Robert J. Gaizauskas; Peter J. Rodgers; Kevin Humphreys

    2001-01-01

    We describe GATE, the General Architecture for Text Engineering, an integrated visual development environment to support the visual assembly, execution and analysis of modular natural language processing systems. The visual model is an executable data flow program graph, automatically synthesised from data dependency declarations of language processing modules. The graph is then directly executable: modules are run interactively in the

  1. Acquiring Correct Knowledge for Natural Language Generation

    Microsoft Academic Search

    Ehud Reiter; Somayajulu Sripada; Roma Robertson

    2003-01-01

    Natural language generation (NLG) systems are computer software systems that produce texts in English and other human languages, often from non-linguistic input data. NLG systems, like most AI systems, need substantial amounts of knowledge. However, our experience in two NLG projects suggests that it is difficult to acquire correct knowledge for NLG systems; indeed, every knowledge acquisition (KA) technique

  2. Linguistic Aspects of Natural Language Processing

    Microsoft Academic Search

    Eva Hajicová

    1992-01-01

    As we stated at the beginning of our paper, systems using natural language as a means of human/machine communication exhibit varying degrees of complexity; closely related to it there is the degree of complexity of the NLP module. In any case, however, NLP is both a practically necessary and a theoretically stimulative task. A command of language is an inherent

  3. Type Parser 1.1

    NSDL National Science Digital Library

    This handy utility helps users who want to get more information on the usage of their drives. With Type Parser, users can discover wasted drive space, the types of files responsible, and where they reside. The options menu allows users to specify whether they wish to calculate cluster size or directory data and what size files they wish to analyze. A good way to discover long forgotten files and recover some of that seemingly ever-elusive disk space.

  4. Parser Combinators: a Practical Application for Generating Parsers for NMR Data

    PubMed Central

    Fenwick, Matthew; Weatherby, Gerard; Ellis, Heidi JC; Gryk, Michael R.

    2013-01-01

    Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of large data sets to the BMRB (Bio-Magnetic Resonance Data Bank). This data is stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex, making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. Translation of the grammatical model to an executable parser may be automatically accomplished. This paper will show how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional programming world known as “parser combinators”. This paper demonstrates the effectiveness of a principled approach to file specification and parsing. This paper also builds upon our previous work [1], in that 1) it applies concepts from Functional Programming (which is relevant even though the implementation language, Java, is more mainstream than Functional Programming), and 2) all work and accomplishments from this project will be made available under standard open source licenses to provide the community with the opportunity to learn from our techniques and methods. PMID:24352525
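
    The parser-combinator idea named in this abstract can be shown with a minimal sketch. It is written in Python for brevity, whereas the paper's implementation language is Java, and it does not parse NMR-Star. The combinator names literal, seq, alt, and many and the two-rule toy grammar are assumptions made for the example. Each parser is a function taking a string and a position and returning either a (value, new position) pair or None on failure.

```python
def literal(text):
    """Parser that matches exactly `text` at the current position."""
    def parser(s, i):
        if s.startswith(text, i):
            return text, i + len(text)
        return None
    return parser

def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parser(s, i):
        values = []
        for p in parsers:
            result = p(s, i)
            if result is None:
                return None
            value, i = result
            values.append(value)
        return values, i
    return parser

def alt(*parsers):
    """Return the result of the first parser that succeeds."""
    def parser(s, i):
        for p in parsers:
            result = p(s, i)
            if result is not None:
                return result
        return None
    return parser

def many(p):
    """Apply `p` zero or more times, collecting the values."""
    def parser(s, i):
        values = []
        while True:
            result = p(s, i)
            if result is None or result[1] == i:   # stop on failure or no progress
                return values, i
            value, i = result
            values.append(value)
    return parser

# Toy BNF:  bit ::= "0" | "1"      bits ::= bit*
bit = alt(literal("0"), literal("1"))
bits = many(bit)
```

    Each BNF rule maps onto one combinator expression, which is what makes the translation from grammar to executable parser nearly mechanical. For instance, bits("1012", 0) consumes the first three characters and stops at the "2".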

  5. A Natural Language Processing Infrastructure for Turkish

    Microsoft Academic Search

    A. C. Cem SAY; Özlem ÇETİNOĞLU

    We built an open-source software platform intended to serve as a common infrastructure that can be of use in the development of new applications involving the processing of Turkish. The platform incorporates a lexicon, a morphological analyzer/generator, and a DCG parser/generator that translates Turkish sentences to predicate logic formulas, and a knowledge base framework. Several developers have

  6. Natural Language Technology in Precision Content Retrieval

    Microsoft Academic Search

    Jacek Ambroziak; William A. Woods

    1998-01-01

    This paper describes a new approach to information access that combines techniques from natural language processing and knowledge representation with a new technique for relevance estimation and passage retrieval. Unlike many attempts to combine natural language processing with information retrieval, these results show significant benefit from using linguistic knowledge. Subsumption technology is used to automatically integrate syntactic, semantic,

  7. Statistical Physics for Natural Language Processing

    E-print Network

    Moreno, Juan-Manuel Torres; SanJuan, Eric

    2010-01-01

    In this paper we study the Enertex model that has been applied to fundamental tasks in Natural Language Processing (NLP) including automatic document summarization and topic segmentation. The model is language independent. It is based on the intuitive concept of Textual Energy, inspired by Neural Networks and Statistical Physics of magnetic systems. It can be implemented using simple matrix operations and, in contrast to PageRank algorithms, it avoids any iterative process.

  8. Knowledge engineering approach to natural language understanding

    SciTech Connect

    Shapiro, S.C.; Neal, J.G.

    1982-01-01

    The authors describe the results of a preliminary study of a knowledge engineering approach to natural language understanding. A computer system is being developed to handle the acquisition, representation, and use of linguistic knowledge. The computer system is rule-based and utilizes a semantic network for knowledge storage and representation. In order to facilitate the interaction between user and system, input of linguistic knowledge and computer responses are in natural language. Knowledge of various types can be entered and utilized: syntactic and semantic; assertions and rules. The inference tracing facility is also being developed as a part of the rule-based system with output in natural language. A detailed example is presented to illustrate the current capabilities and features of the system. 12 references.

  9. From NLP (Natural Language Processing) to MLP (Machine Language Processing)

    Microsoft Academic Search

    Peter Teufl; Udo Payer; Guenter Lackner

    2010-01-01

    Natural Language Processing (NLP) in combination with Machine Learning techniques plays an important role in the field of automatic text analysis. Motivated by the successful use of NLP in solving text classification problems in the area of e-Participation and inspired by our prior work in the field of polymorphic shellcode detection we gave classical NLP-processes a trial in the special

  10. Multilingual Natural Language Generation (Experience from AGILE Project)

    E-print Network

    Borissova, Daniela

    Language Generation is an interesting and challenging field of Natural Language Processing. Automatic generation of texts in natural language could be viewed as a final part of automated translation process from Language Processing technologies, which concentrate the researchers' attention on Natural Language

  11. The Rhetorical Parsing of Natural Language Texts

    Microsoft Academic Search

    Daniel Marcu

    1997-01-01

    We derive the rhetorical structures of texts by means of two new, surface-form-based algorithms: one that identifies discourse usages of cue phrases and breaks sentences into clauses, and one that produces valid rhetorical structure trees for unrestricted natural language texts. The algorithms use information that was derived from a corpus analysis of cue phrases.

  12. Two Interpretive Systems for Natural Language?

    ERIC Educational Resources Information Center

    Frazier, Lyn

    2015-01-01

    It is proposed that humans have available to them two systems for interpreting natural language. One system is familiar from formal semantics. It is a type based system that pairs a syntactic form with its interpretation using grammatical rules of composition. This system delivers both plausible and implausible meanings. The other proposed system…

  13. Capturing Practical Natural Language Transformations

    E-print Network

    Knight, Kevin

    Capturing Practical Natural Language Transformations Kevin Knight Information Sciences Institute: - Expressiveness: can we express the required linguistic knowledge in the formalism? © 2008 Kluwer Academic Publishers. Printed in the Netherlands. - Modularity: can we

  14. Natural Language Syntax and First Order Inference

    E-print Network

    Givan, Bob

    Natural Language Syntax and First Order Inference David McAllester and Robert Givan MIT Artificial­standard syntax for first order logic. In this paper we define a syntax for first order logic based than analogous procedures based on either classical or taxonomic syntax. This paper appeared

  15. HEAD-DRIVEN STATISTICAL MODELS FOR NATURAL LANGUAGE PARSING

    E-print Network

    Collins, Michael

    HEAD-DRIVEN STATISTICAL MODELS FOR NATURAL LANGUAGE PARSING Michael Collins A DISSERTATION of Dissertation Professor Jean Gallier Graduate Group Chair COPYRIGHT Michael Collins 1999 breadth and depth of their feedback. I had countless impromptu but influential discussions with Jason

  16. Dynamic Semantics for a Controlled Natural Language

    Microsoft Academic Search

    Rolf Schwitter

    2004-01-01

    In this paper I present a dynamic approach for constructing an unambiguous semantic representation for a text written in a controlled natural language called PENG. The semantic representation is built up incrementally - in left-to-right order - while the user of the PENG system writes the text. For each word form that the user types, the

  17. Natural Language Information Retrieval: Progress Report.

    ERIC Educational Resources Information Center

    Perez-Carballo, Jose; Strzalkowski, Tomek

    2000-01-01

    Reports on the progress of the natural language information retrieval project, a joint effort led by GE (General Electric) Research, and its evaluation at the sixth TREC (Text Retrieval Conference). Discusses stream-based information retrieval, which uses alternative methods of document indexing; advanced linguistic streams; weighting; and query…

  18. Natural Language Annotations for the Semantic Web

    E-print Network

    Massachusetts Institute of Technology (MIT), Computer Science and Artificial Intelligence Laboratory, InfoLab

    Natural Language Annotations for the Semantic Web. Boris Katz (1), Jimmy Lin (1), and Dennis Quan (2) ... Because the ultimate purpose of the Semantic Web is to help users locate, organize, and process ... of the Semantic Web, was designed to be easily processed by computers, not humans. To render RDF friendlier

  19. Natural Language Annotations for the Semantic Web

    E-print Network

    Lin, Jimmy

    Natural Language Annotations for the Semantic Web. Boris Katz (1), Jimmy Lin (1), and Dennis Quan (2); (1) MIT ... Because the ultimate purpose of the Semantic Web is to help users locate, organize, and process ... of the Semantic Web, was designed to be easily processed by computers, not humans. To render RDF friendlier

  20. Adaptive value within natural language discourse

    Microsoft Academic Search

    Michael L. Best

    2006-01-01

    A trait is of adaptive value if it confers a fitness advantage to its possessor. Thus adaptiveness is an ahistorical identification of a trait affording some selective advantage to an agent within some particular environment. In results reported here we identify a trait within natural language discourse as having adaptive value by computing a trait/fitness covariance; the

  1. Transition network grammars for natural language analysis

    Microsoft Academic Search

    William A. Woods

    1970-01-01

    The use of augmented transition network grammars for the analysis of natural language sentences is described. Structure-building actions associated with the arcs of the grammar network allow for the reordering, restructuring, and copying of constituents necessary to produce deep-structure representations of the type normally obtained from a transformational analysis, and conditions on the arcs allow for a powerful selectivity which

  2. Natural Language Processing in the Medical and Biological Domains

    E-print Network

    Zweigenbaum, Pierre

    Natural Language Processing in the Medical and Biological Domains: a Parallel Perspective. Pierre ... Language Processing in the Medical and Biological Domains: Why are they different? Pierre Zweigenbaum, LIMSI ... Introduction. Natural Language Processing in the Medical Domain. Natural Language Non

  3. Towards a Bio-computational Model of Natural Language Learning

    E-print Network

    Boyer, Edmond

    together the theory of Grammatical Inference and the studies of natural language acquisition. We discuss how the studies of natural language acquisition can improve results in the field of Grammatical in natural language learning. 1 Introduction Children, independently of their culture and the language

  4. Current trends with natural language processing.

    PubMed

    Rassinoux, A M; Michel, P A; Wagner, J; Baud, R

    1995-01-01

    Natural Language Processing in the medical domain becomes more and more powerful, efficient, and ready to be used in daily practice. The needs for such tools are enormous in the medical field, due to the vast amount of written texts for medical records. In the authors' point of view, the Electronic Patient Record (EPR) is achieved neither with Information Systems of all kinds nor with commercially available word processing systems. Natural Language Processing (NLP) is one dimension of the EPR, as well as Image Processing and Decision Support Systems. Analysis of medical texts to facilitate indexing and retrieval is well known. The need for a generation tool is to produce progress notes from menu driven systems. The computer systems of tomorrow cannot miss any single dimension. Since 1988, we've been developing an NLP system; it is supported by the European program AIM (Advanced Informatics in Medicine) within the GALEN and HELIOS consortium and the CERS (Commission d'Encouragement á la Recherche Scientifique) in Switzerland. The main directions of development are: a medical language analyzer, a language generator, a query processor, and dictionary building tools to support the Medical Linguistic Knowledge Base (MLKB). The knowledge representation schema is essentially based on Sowa's conceptual graphs, and the MLKB is multilingual from its design phase; it currently incorporates the English and the French languages; it will also continue using German. The goal of this demonstration is to provide evidence of what exists today, what will be soon available, and what is planned for the long term. Complete sentences will be processed in real time, and the browsing capabilities of the MLKB will be exercised. 
    In particular, the following features will be presented: Analysis of complete sentences with verbs and relatives, as extracted from clinical narratives, with special attention to the method of "proximity processing" as developed in our group and the rule based approach to language description to resolve the specific surface language problems as well as the language independent semantic situations. Comparison of results for English, French, and German sentences, showing the commonalities between these languages and, therefore, the re-usable features and the language specific aspects. Generation of noun phrases in English and French, showing the opportunities for translation between these two languages. Application of the analyzer to build a knowledge representation of ICD under the form of conceptual graphs and presentation of the possibilities of a natural language encoding of diagnosis. Strategies for query processing through a sample of abdominal ultrasonography reports, which have been analyzed and stored under the form of conceptual graphs. Feeding in and browsing of the Medical Linguistic Knowledge Base and other Dictionary Building Tools, using the perspective of an international initiative to converge towards a multilingual universal solution, valid for the medical domain. The demonstration platform is Microsoft Windows 4 on a PC, with Microsoft Visual Basic as the GUI and Quintus Prolog as NLP tools language. The same programs were originally developed for Unix-based workstations and are available on multiple platforms under Motif and X11. PMID:8591530

  5. Learning procedures from interactive natural language instructions

    NASA Technical Reports Server (NTRS)

    Huffman, Scott B.; Laird, John E.

    1994-01-01

    Despite its ubiquity in human learning, very little work has been done in artificial intelligence on agents that learn from interactive natural language instructions. In this paper, the problem of learning procedures from interactive, situated instruction is examined in which the student is attempting to perform tasks within the instructional domain, and asks for instruction when it is needed. Presented is Instructo-Soar, a system that behaves and learns in response to interactive natural language instructions. Instructo-Soar learns completely new procedures from sequences of instruction, and also learns how to extend its knowledge of previously known procedures to new situations. These learning tasks require both inductive and analytic learning. Instructo-Soar exhibits a multiple execution learning process in which initial learning has a rote, episodic flavor, and later executions allow the initially learned knowledge to be generalized properly.

  6. Substitutional Semantics and Natural Language Quantification

    E-print Network

    Ludlow, Peter

    sharpening of certain points came out of discussion of these issues with Noam Chomsky and Norbert Hornstein. Finally, I would like to thank the MIT Department of Linguistics and Philosophy for making their facilities available to me during my tenure as a... Visiting Scholar. For example, Dale Gottlieb, Ontological Economy: Substitutional Quantification and Mathematics (Oxford: Oxford University Press, 1980). Noam Chomsky has suggested to me that it is illegitimate to suppose that natural language...

  7. Reference And Description In Natural Language

    NASA Astrophysics Data System (ADS)

    Steinberg, Alan N.

    1988-03-01

    We propose a theory for modeling the semantic and pragmatic properties of natural language expressions used to refer. The sorts of expressions to be discussed include proper names, definite noun phrases and personal pronouns. We will focus in this paper on such expressions in the singular, having discussed elsewhere procedures for extending the present sort of analysis to various plural uses of these expressions. Propositions involving referential expressions are formally redefined in a second order predicate calculus, in which various semantic and pragmatic factors involved in establishing and interpreting references are modeled as rules of inference. Uses of referential utterances are differentiated according to the means used for individuating the object referred to. Analyses are provided for anaphoric, contextual, demonstrative, introductory and citational individuative devices. We analyze sentences like 'The man [or John] is wise' as conditionals of the form 'Whatever is uniquely a man [or named "John"] relevant to the present discourse is wise'. So modeled, the presupposition of existence (which historically has concerned much logical analysis of such sentences) is represented as a conversational implicature of the sort which obtains from any proposition of the form '(P -> Q)' to the corresponding 'P'. This formalization is intended to serve as part of an empirical theory of natural language phenomena. Being an empirical theory, ours will strive to model the greatest possible diversity of phenomena using a minimum of formal apparatus. Such a theory may provide a foundation for automatic systems to predict and replicate natural language phenomena for purposes of text understanding and synthesis.

  8. Representing the Semantics of Natural Language as Constraint Expressions

    E-print Network

    Grossman, Richard W.

    The issue of how to represent the "meaning" of an utterance is central to the problem of computer understanding of natural language. Rather than relying on ad-hoc structures or forcing the complexities of natural language ...

  9. Using Semantic Unification to Generate Regular Expressions from Natural Language

    E-print Network

    Kushman, Nate

    We consider the problem of translating natural language text queries into regular expressions which represent their meaning. The mismatch in the level of abstraction between the natural language representation and the ...

  10. Two interpretive systems for natural language?

    PubMed

    Frazier, Lyn

    2015-02-01

    It is proposed that humans have available to them two systems for interpreting natural language. One system is familiar from formal semantics. It is a type based system that pairs a syntactic form with its interpretation using grammatical rules of composition. This system delivers both plausible and implausible meanings. The other proposed system is one that uses the grammar together with knowledge of how the human production system works. It is token based and only delivers plausible meanings, including meanings based on a repaired input when the input might have been produced as a speech error. PMID:25420935

  11. Natural Language Processing: Toward Large-Scale, Robust Systems.

    ERIC Educational Resources Information Center

    Haas, Stephanie W.

    1996-01-01

    Natural language processing (NLP) is concerned with getting computers to do useful things with natural language. Major applications include machine translation, text generation, information retrieval, and natural language interfaces. Reviews important developments since 1987 that have led to advances in NLP; current NLP applications; and problems…

  12. Natural languages as collections of resources Robin Cooper

    E-print Network

    Cooper, Robin

    Natural languages as collections of resources. Robin Cooper, Göteborg University; Aarne Ranta ... propose a view on which natural languages are rather to be regarded as collections of resources ... on general resources for natural languages and we will give a brief characterization of the system in section

  13. Natural language processing and advanced information management

    NASA Technical Reports Server (NTRS)

    Hoard, James E.

    1989-01-01

    Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate the material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides a natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures that will satisfy the users' requests. The best candidate for an intelligent agent that can satisfactorily make use of RSs and transform documents (TDs) appears to be an object oriented data base (OODB). OODBs have, apparently, an inherent capacity to use the large numbers of RSs and TDs that will be required by an AIM system and an inherent capacity to use them in an effective way.

  14. The economics of natural language interfaces: natural language processing technology as a scarce resource

    Microsoft Academic Search

    Sumali J. Conlon; John R. Conlon; Tabitha L. James

    2004-01-01

    This paper discusses appropriate application areas for natural language interfaces (NLIs) to databases. This requires comparing NLIs with competing approaches, including other user-friendly interfaces, and training of users with less user-friendly interfaces. Also, since NLI technology is still limited, users may need to learn how to use NLIs themselves. This suggests that NLI popularity may snowball at some point, as

  15. XSS-FP: Browser Fingerprinting using HTML Parser Quirks

    E-print Network

    Paris-Sud XI, Université de

    XSS-FP: Browser Fingerprinting using HTML Parser Quirks. Abgrall Erwan, Yves Le Traon, Martin ... Firefox 15) of a web-browser, exploiting HTML parser quirks exercised through XSS. Our experiments show ... to use the behavior of the HTML parser under specific inputs to fingerprint the type and version

  16. Natural language processing technologies for developing a language learning environment

    Microsoft Academic Search

    Harald Wahl; Werner Winiwarter; Gerald Quirchmayr

    2010-01-01

    So far, Computer-Assisted Language Learning (CALL) comes in many different flavors. Our research work focuses on developing an integrated e-learning environment that allows improving language skills in specific contexts. Integrated e-learning environment means that it is a Web-based solution that performs language learning tasks using common working environments like, for instance, Web browsers or Email clients. It should be accessible

  17. Understanding natural language for spacecraft sequencing

    NASA Technical Reports Server (NTRS)

    Katz, Boris; Brooks, Robert N., Jr.

    1987-01-01

    The paper describes a natural language understanding system, START, that translates English text into a knowledge base. The understanding and the generating modules of START share a Grammar which is built upon reversible transformations. Users can retrieve information by querying the knowledge base in English; the system then produces an English response. START can be easily adapted to many different domains. One such domain is spacecraft sequencing. A high-level overview of sequencing as it is practiced at JPL is presented in the paper, and three areas within this activity are identified for potential application of the START system. Examples are given of an actual dialog with START based on simulated data for the Mars Observer mission.

  18. An Overview of Computer-Based Natural Language Processing.

    ERIC Educational Resources Information Center

    Gevarter, William B.

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial…

  19. Intelligent CAI: An Author Aid for a Natural Language Interface.

    ERIC Educational Resources Information Center

    Burton, Richard R.; Brown, John Seely

    This report addresses the problems of using natural language (English) as the communication language for advanced computer-based instructional systems. The instructional environment places requirements on a natural language understanding system that exceed the capabilities of all existing systems, including: (1) efficiency, (2) habitability, (3)…

  20. ON THE NATURE AND NURTURE OF LANGUAGE Elizabeth Bates

    E-print Network

    ON THE NATURE AND NURTURE OF LANGUAGE. Elizabeth Bates, University of California, San Diego ... Support, University of California at San Diego, La Jolla, CA 92093-0526, or bates@crl.ucsd.edu. ... Language is the crowning achievement of the human species

  1. 6.881 Natural Language Processing, Fall 2004

    E-print Network

    Barzilay, Regina

    This course is a graduate level introduction to natural language processing, the primary concern of which is the study of human language from a computational perspective. The class will cover models at the level of syntactic, ...

  2. Evolutionary Explanations for Natural Language - Criteria from Evolutionary Biology

    E-print Network

    Zuidema, Jelle

    Evolutionary Explanations for Natural Language - Criteria from Evolutionary Biology Willem Zuidema Spuistraat 210, 1012 VT Amsterdam, The Netherlands Abstract Theories of the evolutionary origins of language surveys the requirements on evolutionary scenarios that derive from mathematical evolutionary biology

  3. Sign languages are naturally occurring languages used by members of deaf communities throughout the world.

    E-print Network

    Sign languages are naturally occurring languages used by members of deaf communities throughout ... of vocal tract articulators, established sign languages display the full array of hierarchically ... of the present research was in sign-speech correspondence in the perceptual effects of language experience

  4. Automatic induction of n-gram language models from a natural language grammar

    Microsoft Academic Search

    Stephanie Seneff; Chao Wang; Timothy J. Hazen

    2003-01-01

    This paper details our work in developing a technique which can automatically generate class n-gram language models from natural language (NL) grammars in dialogue systems. The procedure eliminates the need for double maintenance of the recognizer language model and NL grammar. The resulting language model adopts the standard class n-gram framework for computational efficiency. Moreover, both the
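
    The "standard class n-gram framework" mentioned in this abstract can be illustrated with a small sketch. The code below is not the authors' induction procedure (which derives the model from an NL grammar); it only shows the usual class bigram factorization, P(w | w_prev) = P(c | c_prev) * P(w | c), estimated by counting over a toy corpus. The function name, the word classes, and the sentences are all assumptions made up for illustration.

```python
from collections import Counter

def train_class_bigram(sentences, word_to_class):
    """Estimate a class bigram model by relative frequency (no smoothing)."""
    class_bigrams = Counter()   # counts of (previous class, current class)
    class_context = Counter()   # counts of a class used as the left context
    word_counts = Counter()     # counts of each word
    class_totals = Counter()    # counts of each class over all word tokens
    for words in sentences:
        classes = ["<s>"] + [word_to_class[w] for w in words]
        for prev_c, cur_c in zip(classes, classes[1:]):
            class_bigrams[(prev_c, cur_c)] += 1
            class_context[prev_c] += 1
        for w in words:
            word_counts[w] += 1
            class_totals[word_to_class[w]] += 1

    def prob(prev_word, word):
        """P(word | prev_word) = P(class | prev class) * P(word | class)."""
        prev_c = "<s>" if prev_word is None else word_to_class[prev_word]
        cur_c = word_to_class[word]
        if class_context[prev_c] == 0 or class_totals[cur_c] == 0:
            return 0.0
        class_transition = class_bigrams[(prev_c, cur_c)] / class_context[prev_c]
        word_given_class = word_counts[word] / class_totals[cur_c]
        return class_transition * word_given_class

    return prob

# Hypothetical toy data: two city names share one class.
word_to_class = {"fly": "VERB", "to": "TO", "boston": "CITY", "denver": "CITY"}
sentences = [["fly", "to", "boston"], ["fly", "to", "denver"], ["fly", "to", "boston"]]
prob = train_class_bigram(sentences, word_to_class)
p_denver = prob("to", "denver")
p_boston = prob("to", "boston")
```

    Because "boston" and "denver" share the CITY class, the class transition TO -> CITY is estimated from all three sentences, and only the within-class word probability differs between the two cities.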

  5. A Maximum-Entropy-Inspired Parser

    Microsoft Academic Search

    Eugene Charniak

    2000-01-01

    We present a new parser for parsing down to Penn tree-bank style parse trees that achieves 90.1% average precision/recall for sentences of length 40 and less, and 89.5% for sentences of length 100 and less when trained and tested on the previously established [5, 9, 10, 15, 17]
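
    For readers unfamiliar with how precision and recall are computed for parse trees, the sketch below shows one common formulation: each tree is reduced to a list of labeled spans (label, start, end), and the predicted spans are compared with the gold-standard spans. This is an illustrative assumption about the metric, not the evaluation code used in the paper; the function name and the example spans are hypothetical.

```python
from collections import Counter

def bracket_scores(gold, predicted):
    """Return (precision, recall) over labeled spans given as (label, start, end)."""
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: a span counts as matched at most as often as it
    # occurs in both the gold and the predicted tree.
    matched = sum((gold_counts & pred_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical five-word sentence: the predicted tree mislabels one constituent.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
pred = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]
precision, recall = bracket_scores(gold, pred)
```

    Here three of the four predicted spans appear in the gold tree and three of the four gold spans are recovered, so both scores are 0.75.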

  6. Confluent Preorder Parser as Finite State Automata

    Microsoft Academic Search

    Edward Kei Shiu Ho; Lai-wan Chan

    1996-01-01

    We present the Confluent Preorder Parser (CPP) in which syntactic parsing is achieved via a holistic transformation from the sentence representation to the desired parse tree representation. Simulation results show that CPP has achieved excellent generalization performance and is capable of handling erroneous sentences and resolving syntactic ambiguities. An analysis is presented which elucidates the operations of CPP as governed by a finite state automaton.

  7. Fuzzy Modeling and Natural Language Processing for Panini's Sanskrit Grammar

    E-print Network

    Reddy, P Venkata Subba

    2010-01-01

    Indian languages have a long history among the world's natural languages. Panini was the first to define a grammar for the Sanskrit language, with about 4000 rules, in the fifth century. These rules contain uncertain information, and computer processing of the Sanskrit language is not possible with uncertain information. In this paper, fuzzy logic and fuzzy reasoning are proposed to eliminate uncertain information when reasoning with Sanskrit grammar. Sanskrit language processing is also discussed.

  8. The Linguistic Nature of Language and Communication.

    ERIC Educational Resources Information Center

    Zitlow, Connie S., Ed.

    2001-01-01

    Discusses five recent books about language that address issues that arise in classrooms with an increasing number of diverse dialects and varied home languages. Discusses the complexities of language, misunderstandings in the Ebonics controversy, socioeducational issues, and classroom ideas for teachers. Describes two web sites. (SR)

  9. Natural Language Metaphors Covertly Influence Reasoning

    PubMed Central

    Thibodeau, Paul H.; Boroditsky, Lera

    2013-01-01

    Metaphors pervade discussions of social issues like climate change, the economy, and crime. We ask how natural language metaphors shape the way people reason about such social issues. In previous work, we showed that describing crime metaphorically as a beast or a virus, led people to generate different solutions to a city’s crime problem. In the current series of studies, instead of asking people to generate a solution on their own, we provided them with a selection of possible solutions and asked them to choose the best ones. We found that metaphors influenced people’s reasoning even when they had a set of options available to compare and select among. These findings suggest that metaphors can influence not just what solution comes to mind first, but also which solution people think is best, even when given the opportunity to explicitly compare alternatives. Further, we tested whether participants were aware of the metaphor. We found that very few participants thought the metaphor played an important part in their decision. Further, participants who had no explicit memory of the metaphor were just as much affected by the metaphor as participants who were able to remember the metaphorical frame. These findings suggest that metaphors can act covertly in reasoning. Finally, we examined the role of political affiliation on reasoning about crime. The results confirm our previous findings that Republicans are more likely to generate enforcement and punishment solutions for dealing with crime, and are less swayed by metaphor than are Democrats or Independents. PMID:23301009

  10. Using Speech and Natural Language Technology in Language Intervention

    E-print Network

    ... Current research and practice in remediation both stress the need for achieving engagement and sustaining ... or incidence of childhood cancer. With health care and education costs near $20,000 per year per child in verbal expressive language have been found to be the most stressful type of impairment with which parents

  11. Transportable natural-language interfaces: problems and techniques

    SciTech Connect

    Grosz, B.J.

    1982-01-01

    The author considers the question of natural language database access within the context of a project at SRI, TEAM, that is developing techniques for transportable natural-language interfaces. The goal of transportability is to enable nonspecialists to adapt a natural-language processing system for access to an existing conventional database. TEAM is designed to interact with two different kinds of users. During an acquisition dialogue, a database expert (DBE) provides TEAM with information about the files and fields in the conventional database for which a natural-language interface is desired. (Typically this database already exists and is populated, but TEAM also provides facilities for creating small local databases.) This dialogue results in extension of the language-processing and data access components that make it possible for an end user to query the new database in natural language. 13 references.

  12. INMED/TINS special issue Nature and nurture in language

    E-print Network

    Dehaene-Lambertz, Ghislaine

    INMED/TINS special issue Nature and nurture in language acquisition: anatomical and functional/TINS special issue Nature and nurture in brain development and neurological disorders, based on presentations

  13. Natural Language Processing based Automatic Multilingual Code Generation

    Microsoft Academic Search

    Imran Sarwar Bajwa; M. Shahid Naveed; M. Abbas Choudhary

    2006-01-01

    Unified modeling language is being used as a premier tool for modeling the user requirements. These CASE tools provide an easy way to get efficient solutions. This paper presents a natural language processing based automated system for generating code in multiple languages after modeling the user requirements based on UML. UML diagrams are first generated by analyzing the given business

  14. Sex and Gender in Natural Language.

    ERIC Educational Resources Information Center

    Percival, W. Keith

    The relation between a real-world category (sex) and a linguistic category (gender) is examined. The gender system of Indo-European languages is discussed, and the way gender works in Greek, one of the older Indo-European languages, is examined at some length. The conclusion is that, but for the existence of separate gender-sensitive adjectival…

  15. Overview of computer-based Natural Language Processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language processing and understanding is the key to enabling humans and their creations to interact with machines in natural language (in contrast to computer language). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state-of-the-art of the technology, issues and research requirements, the major participants, and finally, future trends and expectations.

  16. Natural Language Dialog with a Tutor System for Mathematical Proofs

    Microsoft Academic Search

    Christoph Benzmüller; Helmut Horacek; Ivana Kruijff-korbayová; Manfred Pinkal; Jörg H. Siekmann; Magdalena Wolska

    2005-01-01

    Natural language interaction between a student and a tutoring or an assistance system for mathematics is a new multi-disciplinary challenge that requires the interaction of (i) advanced natural language processing, (ii) flexible tutorial dialog strategies including hints, and (iii) mathematical domain reasoning. This paper provides an overview on the current research in the multi-disciplinary research project Dialog, whose goal

  17. Interactive Location-Based Services Combined with Natural Language

    Microsoft Academic Search

    Ma Chang-jie; Fang Jin-yun

    2007-01-01

    Extending beyond today's geographical information system (GIS) applications and wireless positioning techniques, location-based services (LBS) have become a compelling new branch, supplying egocentric location information anywhere, anytime. To speed up interactive location-based queries in moving or field environments, the paper attempts to combine LBS with natural language and achieves: (1) Natural language

  18. The SIS Project: Software Reuse with a Natural Language Approach

    E-print Network

    Prechelt, Lutz

    The SIS Project: Software Reuse with a Natural Language Approach Lutz Prechelt (prechelt Karlsruhe, Germany Technical Report 2/92 June 16, 1992 Abstract The SIS (Software Information System in the knowledge base. In the SIS project, both the knowledge representation system YAKS and the natural language

  19. MENELAS: an access system for medical records using natural language

    Microsoft Academic Search

    Pierre Zweigenbaum

    1994-01-01

    The overall goal of Menelas is to provide better access to the information contained in natural language patient discharge summaries, through the design and implementation of a pilot system able to access medical reports through natural languages. A first, experimental version of the Menelas indexing prototype for French has been assembled. Its function is to encode free text PDSs into

  20. Natural Language Processing and User Modeling: Synergies and Limitations

    E-print Network

    Zukerman, Ingrid

    precisely; the incorporation of user models into Natural Language Generation (NLG) systems has yielded ... on developing complete practical systems, albeit of limited scope. The dream of the natural language community ... in order to achieve this dream. Parsing techniques must be robust enough to handle ill

  1. CSI 5180: Topics in AI: Natural Language Processing,

    E-print Network

    Inkpen, Diana

    Inkpen e-mail: diana@site.uottawa.ca Preliminaries Why study Natural Language Processing (NLP)? NLP the world? - how to connect utterances to the world NLP and related terms Natural language processing (NLP the techniques of NLP; has an emphasis on the creation of large systems, software engineering Computational

  2. Maurice Gross' grammar lexicon and Natural Language Processing

    Microsoft Academic Search

    Claire Gardent; Bruno Guillaume; Guy Perrier; Ingrid Falk

    Maurice Gross' grammar lexicon contains extremely rich and exhaustive information about the morphosyntactic and semantic properties of French syntactic functors (verbs, adjectives, nouns). Yet its use within natural language processing systems is still restricted. In this paper, we first argue that the information contained in the grammar lexicon is potentially useful for Natural Language Processing (NLP). We then

  3. Inferring heuristic classification hierarchies from natural language input

    NASA Technical Reports Server (NTRS)

    Hull, Richard; Gomez, Fernando

    1993-01-01

    A methodology for inferring hierarchies representing heuristic knowledge about the check out, control, and monitoring sub-system (CCMS) of the space shuttle launch processing system from natural language input is explained. Our method identifies failures explicitly and implicitly described in natural language by domain experts and uses those descriptions to recommend classifications for inclusion in the experts' heuristic hierarchies.

  4. Using Sequence Package Analysis to Improve Natural Language Understanding

    Microsoft Academic Search

    Amy Neustein

    2001-01-01

    Developers of dialogue systems must confront the complexities of natural language. The purpose of this paper is to demonstrate how “sequence package” analysis, as a novel approach, can help to improve natural language understanding. Such an approach would go beyond the standard grammatical formalisms represented in most dialogue systems, to include context-dependent utterance sequences that are shaped by the unfolding

  5. SBVR Business Rules Generation from Natural Language Specification

    E-print Network

    Lee, Mark

    , the business rule analyst generates and manages business rules. Moreover, Business rule management (BRMSBVR Business Rules Generation from Natural Language Specification Imran S. Bajwa, Mark G. Lee of translating natural languages specification to SBVR business rules. The business rules constraint business

  6. Tutorial on techniques and applications for natural language processing

    SciTech Connect

    Hayes, P.J.; Carbonell, J.G.

    1983-10-17

    Natural language communication with computers has long been a major goal of Artificial Intelligence both for what it can tell us about intelligence in general and for its practical utility - data bases, software packages, and AI-based expert systems all require flexible interfaces to a growing community of users who are not able or do not wish to communicate with computers in formal, artificial command languages. Whereas many of the fundamental problems of general natural language processing (NLP) by machine remain to be solved, the area has matured in recent years to the point where practical natural language interfaces to software systems can be constructed in many restricted, but nevertheless useful, circumstances. This tutorial is intended to survey the current state of applied natural language processing by presenting computationally effective NLP techniques, by discussing the range of capabilities these techniques provide for NLP systems, and by discussing their current limitations. Following the introduction, this document is divided into two major sections: the first on language recognition strategies at the single sentence level, and the second on language processing issues that arise during interactive dialogues. In both cases, we concentrate on those aspects of the problem appropriate for interactive natural language interfaces, but relate the techniques and systems discussed to more general work on natural language, independent of application domain.

  7. MyProLang - My Programming Language: A Template-Driven Automatic Natural Programming Language

    E-print Network

    Bassil, Youssef

    2012-01-01

    Modern computer programming languages are governed by complex syntactic rules. They are unlike natural languages; they require extensive manual work and a significant amount of learning and practicing for an individual to become skilled at and to write correct programs. Computer programming is a difficult, complicated, unfamiliar, non-automated, and a challenging discipline for everyone; especially, for students, new programmers and end-users. This paper proposes a new programming language and an environment for writing computer applications based on source-code generation. It is mainly a template-driven automatic natural imperative programming language called MyProLang. It harnesses GUI templates to generate proprietary natural language source-code, instead of having computer programmers write the code manually. MyProLang is a blend of five elements. A proprietary natural programming language with unsophisticated grammatical rules and expressive syntax; automation templates that automate the generation of in...

  8. A Robust Finite-State Parser For French

    Microsoft Academic Search

    Jean-pierre Chanod; Pasi Tapanainen

    1997-01-01

    This paper describes a robust finite-state parser implemented for French. The parser attaches morpho-syntactic tags to each word and determines clause boundaries. It is a reductionist parser based on finite-state networks and their intersection. We describe essential elements of the rule writing system, and show how it is actually applied to solve various phenomena, such as argument uniqueness, agreement or apposition. We show some results

  9. Construction of Efficient Generalized LR Parsers

    Microsoft Academic Search

    Miguel A. Alonso; David Cabrero; Manuel Vilares Ferro

    1997-01-01

    We show how LR parsers for the analysis of arbitrary context-free grammars can be derived from classical Earley's parsing algorithm. The result is a Generalized LR parsing algorithm working at complexity O(n^3) in the worst case, which is achieved by the use of dynamic programming to represent the non-deterministic evolution of the stack instead of graph-structured stack representations, as has

  10. Natural language processing for usage based indexing of web resources

    E-print Network

    Paris-Sud XI, Université de

    Natural language processing for usage based indexing of web resources Anne Boyer and Armelle Brun: we ignore the nature, the content and the structure of resources. We describe a new approach taking the content, the nature, the format and the structure of resources. This paper describes our intended work

  11. Concepts and implementations of natural language query systems

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (editor); Liu, I-Hsiung

    1984-01-01

    The currently developed user language interfaces of information systems are generally intended for serious users. These interfaces commonly ignore potentially the largest user group, i.e., casual users. This project discusses the concepts and implementations of a natural query language system that satisfies the nature and information needs of casual users by allowing them to communicate with the system in the form of their native (natural) language. In addition, a framework for the development of such an interface is also introduced for the MADAM (Multics Approach to Data Access and Management) system at the University of Southwestern Louisiana.

  12. Two Types of Definites in Natural Language

    ERIC Educational Resources Information Center

    Schwarz, Florian

    2009-01-01

    This thesis is concerned with the description and analysis of two semantically different types of definite articles in German. While the existence of distinct article paradigms in various Germanic dialects and other languages has been acknowledged in the descriptive literature for quite some time, the theoretical implications of their existence…

  13. Traditional Logic, Modern Logic and Natural Language

    Microsoft Academic Search

    Wilfrid Hodges

    2009-01-01

    In a recent paper Johan van Benthem reviews earlier work done by himself and colleagues on ‘natural logic’. His paper makes a number of challenging comments on the relationships between traditional logic, modern logic and natural logic. I respond to his challenge, by drawing what I think are the most significant lines dividing traditional logic from modern. The leading difference

  14. Nature of language impairment in motor neurone disease 

    E-print Network

    Rewaj, Phillipa Jane

    2014-07-01

    Background: Language impairment associated with Motor Neurone Disease (MND) has been documented since the late 19th century, yet little is understood about the pervasiveness or nature of these deficits. The common clinical ...

  15. MOOIDE : natural language interface for programming MOO environments

    E-print Network

    Ahmad, Moinuddin

    2008-01-01

    MOOIDE is an interface to allow novice users to program a MOO environment using natural language. Programming the MOO involves a variety of tasks like creating objects and their states, assigning verb actions to objects, ...

  16. Natural Language Processing Neural Network Considering Deep Cases

    NASA Astrophysics Data System (ADS)

    Sagara, Tsukasa; Hagiwara, Masafumi

    In this paper, we propose a novel neural network considering deep cases. It can learn knowledge from natural language documents and can perform recall and inference. Various techniques of natural language processing using neural networks have been proposed. However, natural language sentences used in these techniques consist of only a few words, and they cannot handle complicated sentences. In order to solve these problems, the proposed network divides natural language sentences into a sentence layer, a knowledge layer, ten kinds of deep case layers and a dictionary layer. It can learn the relations among sentences and among words by dividing sentences. The advantages of the method are as follows: (1) ability to handle complicated sentences; (2) ability to restructure sentences; (3) usage of the conceptual dictionary, Goi-Taikei, as the long term memory in a brain. Two kinds of experiments were carried out by using goo dictionary and Wikipedia as knowledge sources. Superior performance of the proposed neural network has been confirmed.

  17. A Preliminary Report on a Program for Generating Natural Language

    E-print Network

    McDonald, David

    A program framework has been designed in which the linguistic facts and heuristics necessary for generating fluent natural language can be encoded. The linguistic data is represented in annotated procedures and data ...

  18. Information extraction to facilitate translation of natural language legislation

    E-print Network

    Wang, Samuel (Samuel Siyue)

    2011-01-01

    There is a large body of existing legislation and policies that govern how government organizations and corporations can share information. Since these rules are generally expressed in natural language, it is difficult and ...

  19. A discriminative model for understanding natural language route directions

    E-print Network

    Kollar, Thomas

    2010-01-01

    To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. However, determining the correct sequence of actions in response to a set of spoken instructions is a ...

  20. Natural language processing for unmanned aerial vehicle guidance interfaces

    E-print Network

    Craparo, Emily M. (Emily Marie), 1980-

    2004-01-01

    In this thesis, the opportunities and challenges involved in applying natural language processing techniques to the control of unmanned aerial vehicles (UAVs) are addressed. The problem of controlling an unmanned aircraft ...

  1. Natural Language Generation for the Semantic Web: Unsupervised template extraction 

    E-print Network

    Duma, Daniel

    2012-11-28

    I propose an architecture for a Natural Language Generation system that automatically learns sentence templates, together with statistical document planning, from parallel RDF data and text. To this end, I design, build ...

  2. A natural language teaching paradigm for nonverbal autistic children

    Microsoft Academic Search

    Robert L. Koegel; Mary C. O'Dell; Lynn Kern Koegel

    1987-01-01

    The purpose of this study was to attempt to improve verbal language acquisition for nonverbal autistic children by manipulating traditional teaching techniques so they incorporated parameters of natural language interactions and motivational techniques. Within a multiple baseline design, treatment was conducted in a baseline condition with trials presented serially in a traditional analogue clinical format where the therapist presented instructions,

  3. On the Representation of Physical Quantities in Natural Language Text

    E-print Network

    Forbus, Kenneth D.

    language. Our focus is on physical quantities found in descriptions of physical processes that water will eventually boil if you heat it on a stove, that a ball placed at the top of a steep ramp continuous properties can appear in written natural language. Our focus is on physical quantities found

  4. Using natural language processing technology for qualitative data analysis

    Microsoft Academic Search

    Kevin Crowston; Eileen E. Allen; Robert Heckman

    2011-01-01

    Social researchers often apply qualitative research methods to study groups and their communications artifacts. The use of computer-mediated communications has dramatically increased the volume of text available, but coding such text requires considerable manual effort. We discuss how systems that process text in human languages (i.e. natural language processing [NLP]) might partially automate content analysis by extracting theoretical evidence. We

  5. Natural Language and Spatial Reasoning Stefanie Anne Tellex

    E-print Network

    Roy, Deb

    that match a spatial language description such as "People walking through the kitchen and then going to the dining room" and following natural language commands such as "Go down the hall towards the fireplace ... Boris Katz, Principal Research Scientist, MIT Computer Science and Artificial Intelligence Laboratory

  6. Natural selection of the critical period for language acquisition

    E-print Network

    Nowak, Martin A.

    Natural selection of the critical period for language acquisition Natalia L. Komarova1,2 and Martin of Mathematics, University of Leeds, Leeds LS2 9JT, UK The language acquisition period in humans lasts about 13 acquisition devices, which differ in the length of the learning period. There are two selective forces

  7. An Historical Overview of Natural Language Processing Systems that Learn

    Microsoft Academic Search

    Robin Collier

    1994-01-01

    A fundamental issue in natural language processing is the prerequisite of an enormous quantity of preprogrammed knowledge concerning both the language and the domain under examination. Manual acquisition of this knowledge is tedious and error prone. Development of an automated acquisition process would prove invaluable. This paper references and overviews a range of the systems that have

  8. Plan-Based Integration of Natural Language and Graphics Generation

    Microsoft Academic Search

    Wolfgang Wahlster; Elisabeth André; Wolfgang Finkler; Hans-jürgen Profitlich; Thomas Rist

    1993-01-01

    W. Wahlster, E. André, W. Finkler, H.-J. Profitlich and T. Rist, Plan-based integration of natural language and graphics generation, Artificial Intelligence 63 (1993) 387-427. Multimodal interfaces combining natural language and graphics take advantage of both the individual strength of each communication mode and the fact that several modes can be employed in parallel. The central claim of this paper is

  9. An Evaluation of LOLITA and Related Natural Language Processing Systems

    Microsoft Academic Search

    Paul Callaghan

    1998-01-01

    An Evaluation of LOLITA and related Natural Language Processing Systems. Paul Callaghan. Submitted to the University of Durham for the degree of Ph.D., August 1997. This research addresses the question, "how do we evaluate systems like LOLITA?" LOLITA is the Natural Language Processing (NLP) system under development at the University of Durham. It is intended as a platform for building NL applications. We are therefore

  10. Innovations in Natural Language Document Processing for Requirements Engineering

    Microsoft Academic Search

    Valdis Berzins; Craig Martell; Luqi; Paige Adams

    2007-01-01

    This paper evaluates the potential contributions of natural language processing to requirements engineering. We present a selective history of the relationship between requirements engineering (RE) and natural-language processing (NLP), and briefly summarize relevant recent trends in NLP. The paper outlines basic issues in RE and how they relate to interactions between a NLP front end and system-development processes. We suggest

  11. A General Natural-language Text Processor for Clinical Radiology

    Microsoft Academic Search

    Carol Friedman; Philip O Alderson; John H M Austin; James J Cimino; Stephen B Johnson

    1994-01-01

    Objective: Development of a general natural-language processor that identifies clinical information in narrative reports and maps that information into a structured representation containing clinical terms. Design: The natural-language processor provides three phases of processing, all of which are driven by different knowledge sources. The first phase performs the parsing. It identifies the structure of the text through use of a grammar that defines

  12. On LR(k)-Parsers of Polynomial Size

    Microsoft Academic Search

    Norbert Blum

    2010-01-01

    Usually, a parser for an LR(k)-grammar G is a deterministic pushdown transducer which produces backwards the unique rightmost derivation for a given input string x ∈ L(G). The best known upper bound for the size of such a parser is O(2^(|G||Σ|^(k+1))) where |G| and |Σ| are the sizes of the grammar G and the terminal alphabet Σ,

  13. Proceedings of the MT Summit XI Workshop Using Corpora for Natural Language Generation

    E-print Network

    Belz, Anja

    KEYNOTE PAPER: "Automatic Language Translation Generation Help Needs Badly", Kevin Knight ... Declarative Syntactic Processing of Natural Language Using Concurrent Constraint Programming ... Proceedings of the MT Summit XI Workshop Using Corpora for Natural Language Generation: Language

  14. Parent-Implemented Natural Language Paradigm to Increase Language and Play in Children with Autism

    ERIC Educational Resources Information Center

    Gillett, Jill N.; LeBlanc, Linda A.

    2007-01-01

    Three parents of children with autism were taught to implement the Natural Language Paradigm (NLP). Data were collected on parent implementation, multiple measures of child language, and play. The parents were able to learn to implement the NLP procedures quickly and accurately with beneficial results for their children. Increases in the overall…

  15. Natural Language Processing Techniques in Computer-Assisted Language Learning: Status and Instructional Issues.

    ERIC Educational Resources Information Center

    Holland, V. Melissa; Kaplan, Jonathan D.

    1995-01-01

    Describes the role of natural language processing (NLP) techniques, such as parsing and semantic analysis, within current language tutoring systems. Examines trends, design issues and tradeoffs, and potential contributions of NLP techniques with respect to instructional theory and educational practice. Addresses limitations and problems in using…

  16. Using Natural Language Processing Applications to Learn a Foreign Language on the Web

    Microsoft Academic Search

    Chris Shei; Ming-Chu Hsu

    The Internet is a large community which offers boundless opportunities for acquiring all sorts of knowledge and skills. This paper discusses how one can learn a foreign language with the help of natural language processing based applications, evaluating such mechanisms as automatic error correction, human-computer dialogue systems and machine translation. This paper also introduces key concepts in using the web as

  17. Natural language processing techniques in computer-assisted language learning: Status and instructional issues

    Microsoft Academic Search

    V. Melissa Holland; Jonathan D. Kaplan

    1995-01-01

    The role of natural language processing (NLP) techniques, such as parsing and semantic analysis, is described within current language tutoring systems. Significant trends are distinguished in the exploitation of these techniques, design issues and tradeoffs are examined, and current and potential contributions of NLP technology are discussed with respect to instructional theory and educational practice. Limitations and problems are addressed

  18. FromTo-CLIR: Web-Based Natural Language Interface for Cross-Language Information Retrieval.

    ERIC Educational Resources Information Center

    Kim, Taewan; Sim, Chul-Min; Yuh, Sanghwa; Jung, Hanmin; Kim, Young-Kil; Choi, Sung-Kwon; Park, Dong-In; Choi, Key Sun

    1999-01-01

    Describes the implementation of FromTo-CLIR, a Web-based natural-language interface for cross-language information retrieval that was tested with Korean and Japanese. Proposes a method that uses a semantic category tree and collocation to resolve the ambiguity of query translation. (Author/LRW)

  19. Analyzing Learner Language: Towards a Flexible Natural Language Processing Architecture for Intelligent Language Tutors

    ERIC Educational Resources Information Center

    Amaral, Luiz; Meurers, Detmar; Ziai, Ramon

    2011-01-01

    Intelligent language tutoring systems (ILTS) typically analyze learner input to diagnose learner language properties and provide individualized feedback. Despite a long history of ILTS research, such systems are virtually absent from real-life foreign language teaching (FLT). Taking a step toward more closely linking ILTS research to real-life…

  20. An overview of computer-based natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1983-01-01

    Computer based Natural Language Processing (NLP) is the key to enabling humans and their computer based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  1. Overview of Computer-based Natural Language Processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  2. Overview of computer-based natural language processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants, and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  3. The human factors of natural language query systems

    Microsoft Academic Search

    William C. Ogden

    1985-01-01

    Understanding the hidden limitations and constraints of the system is the largest potential problem for users of natural language query (NLQ). By their nature, most NLQ systems hide “how it works” because they are intended for users who do not want to know. However, human factors research indicates that when users do not have a good understanding of a system,

  4. Detection of Duplicate Defect Reports Using Natural Language Processing

    Microsoft Academic Search

    Per Runeson; Magnus Alexandersson; Oskar Nyholm

    2007-01-01

    Defect reports are generated from various testing and development activities in software engineering. Sometimes two reports are submitted that describe the same problem, leading to duplicate reports. These reports are mostly written in structured natural language, and as such, it is hard to compare two reports for similarity with formal methods. In order to identify duplicates, we investigate using natural

  5. The integration hypothesis of human language evolution and the nature of contemporary languages

    PubMed Central

    Miyagawa, Shigeru; Ojima, Shiro; Berwick, Robert C.; Okanoya, Kazuo

    2014-01-01

    How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa et al. (2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, that holds that human language is composed of two components, E for expressive, and L for lexical. Each component has an antecedent in nature: E as found, for example, in birdsong, and L in, for example, the alarm calls of monkeys. E and L integrated uniquely in humans to give rise to language. A challenge to the Integration Hypothesis is that while these non-human systems are finite-state in nature, human language is known to require characterization by a non-finite state grammar. Our claim is that E and L, taken separately, are in fact finite-state; when a grammatical process crosses the boundary between E and L, it gives rise to the non-finite state character of human language. We provide empirical evidence for the Integration Hypothesis by showing that certain processes found in contemporary languages that have been characterized as non-finite state in nature can in fact be shown to be finite-state. We also speculate on how human language actually arose in evolution through the lens of the Integration Hypothesis. PMID:24936195

  6. NaturalJava: A Natural Language Interface for Programming in Java

    E-print Network

    Riloff, Ellen

    manager that provides the interface used by the case frame interpreter to manage the syntax tree,riloff,zachary,blharvey}@cs.utah.edu ABSTRACT NaturalJava is a prototype for an intelligent natural-language-based user interface for creating language processing system accepts English sentences as input and uses information extraction techniques

  7. Artificial intelligence, expert systems, computer vision, and natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  8. The Rhetorical Parsing of Natural Language Texts Daniel Marcu

    E-print Network

    Marcu, Daniel

    The Rhetorical Parsing of Natural Language Texts Daniel Marcu Department of Computer Science University of Toronto Toronto, Ontario Canada M5S 3G4 marcu@cs.toronto.edu Abstract We derive the rhetorical structures of texts by means of two new, surface-form-based algorithms: one that identifies discourse usages

  9. Anthropologism, naturalism, and the pragmatic study of language

    Microsoft Academic Search

    José Medina

    2004-01-01

    This paper is a critical assessment of Wittgenstein's anthropological perspective and Quine's naturalistic perspective as solutions to the problem of semantic indeterminacy. The three stages of my argument try to establish the following points: (1) that Wittgenstein and Quine offer two substantially different philosophical models of language learning and cognitive development; (2) that unlike Quine's naturalism, Wittgenstein's anthropologism is not

  10. Generating Natural Language specifications from UML class diagrams

    Microsoft Academic Search

    Farid Meziane; Nikos Athanasakis; Sophia Ananiadou

    2008-01-01

    Early phases of software development are known to be problematic, difficult to manage and errors occurring during these phases are expensive to correct. Many systems have been developed to aid the transition from informal Natural Language requirements to semi-structured or formal specifications. Furthermore, consistency checking is seen by many software engineers as the solution to reduce the number

  11. FCGlight: A System for Studying the Evolution of Natural Language

    Microsoft Academic Search

    Vlad Saveluc; Liviu Ciortuz

    2010-01-01

    We defined FCGlight, a refined version of the Fluid Construction Grammar (FCG), which is a formalism for studying the evolution of the natural language. We picked a core subset of FCG, and expressed it in the semantic framework of the Order-Sorted Features (OSF) logic. This allows for efficient processing, and also gives FCG a solid formal background for further analysis

  12. Understanding Natural Language Commands for Robotic Navigation and Mobile Manipulation

    E-print Network

    Teller, Seth

    command according to the command's hierarchical and compositional semantic structure. Our system commands such as "Put the tire pallet on the truck." The model is trained using a corpus of commands. Previous approaches (Kollar et al., 2010; Shimizu and Haas, 2009) assume that natural language commands

  13. What can natural language processing do for clinical decision support?

    Microsoft Academic Search

    Dina Demner-fushman; Wendy Webber Chapman; Clement J. Mcdonald

    2009-01-01

    Computerized clinical decision support (CDS) aims to aid decision making of health care providers and the public by providing easily accessible health-related information at the point and time it is needed. Natural language processing (NLP) is instrumental in using free-text information to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative. The early innovative

  14. Recurrent Artificial Neural Networks and Finite State Natural Language Processing.

    ERIC Educational Resources Information Center

    Moisl, Hermann

    It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…

  15. A Natural Language Processing for Semantic Web Services

    Microsoft Academic Search

    M. Stanojevic; S. Vranes

    2005-01-01

    The problem of natural language understanding is one of the first problems researchers in AI were trying to solve, and our brain is the best proof that the problem can be solved. In this paper we propose a model that could be used to describe roughly the process of understanding as it happens in our brain. Using the proposed model

  16. Natural Language Grammatical Inference with Recurrent Neural Networks

    E-print Network

    Fong, Sandiway

    Natural Language Grammatical Inference with Recurrent Neural Networks Steve Lawrence, Member, IEEE of a complex grammar with neural networks; specifically, the task considered is that of training a network-and-Binding theory. Neural networks are trained, without the division into learned vs. innate components assumed

  17. Anaphora in Natural Language Processing and Information Retrieval.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth DuRoss

    1990-01-01

    Describes the linguistic phenomenon of anaphora; surveys the approaches to anaphora undertaken in theoretical linguistics and natural language processing (NLP); presents results of research conducted at Syracuse University on anaphora in information retrieval; and discusses the future of anaphora research in regard to information retrieval tasks.…

  18. The EMNLP 2014 Workshop on Arabic Natural Language Processing

    E-print Network

    There has been a lot of progress in the last 15 years in the area of Arabic Natural Language Processing (NLP). In particular, the TIDES, GALE, and BOLT programs provided a significant boost to Arabic NLP, both in generating research initiatives. This creates the hope that our own research field, NLP, and especially Arabic NLP

  19. Natural Language Processing: A Terminological And Statistical Approach

    Microsoft Academic Search

    Gabriella Pardelli; Manuela Sassi; Sara Goggi; Paola Orsolini

    The aim of this article is to provide a statistical representation of significant terms used in the field of Natural Language Processing from the 1960s to the present, in order to draft a survey of the most significant research trends in that period. By retrieving these keywords it should be possible to highlight the ebb and flow of some thematic topics.

  20. Using Natural Language Processing for Automatic Detection of Plagiarism

    Microsoft Academic Search

    Miranda Chong; Lucia Specia; Ruslan Mitkov

    Current plagiarism detection tools are mostly limited to comparisons of suspicious plagiarised texts and potential original texts at string level. In this study the aim is to improve the accuracy of plagiarism detection by incorporating Natural Language Processing (NLP) techniques into existing approaches. We propose a framework for external plagiarism detection in which a number of NLP techniques are applied

  1. Natural Language Processing With Modular PDP Networks and Distributed Lexicon

    Microsoft Academic Search

    Risto Miikkulainen; Michael G. Dyer

    1991-01-01

    An approach to connectionist natural language processing is proposed, which is based on hierarchically organized modular parallel distributed processing (PDP) networks and a central lexicon of distributed input/output representations. The modules communicate using these representations, which are global and publicly available in the system. The representations are developed automatically by all networks while they are learning their processing tasks. The

  2. Applications of Finite-State Transducers in Natural Language Processing

    Microsoft Academic Search

    Lauri Karttunen

    2000-01-01

    This paper is a review of some of the major applications of finite-state transducers in Natural Language Processing ranging from morphological analysis to finite-state parsing. The analysis and generation of inflected word forms can be performed efficiently by means of lexical transducers. Such transducers can be compiled using an extended regular expression calculus with restriction and replacement operators. These

  3. Proof-Theoretic Semantics for a Natural Language Fragment

    NASA Astrophysics Data System (ADS)

    Francez, Nissim; Dyckhoff, Roy

    We propose a Proof-Theoretic Semantics (PTS) for a (positive) fragment E+0 of Natural Language (NL) (English in this case). The semantics is intended [7] to be incorporated into actual grammars, within the framework of Type-Logical Grammar (TLG) [12]. Thereby, this semantics constitutes an alternative to the traditional model-theoretic semantics (MTS), originating in Montague's seminal work [11], used in TLG.

  4. A finite and real-time processor for natural language

    SciTech Connect

    Blank, G.D. (Lehigh Univ., Computer Science and Electrical Engineering Dept., Packard Lab., PA (US))

    1989-10-01

    People process natural language in real time and with very limited short-term memories. This article describes a computational architecture for syntactic performance that also requires fixed finite resources. The processor presented here represents syntactic versatility without incurring combinatorial redundancy in the number of transitions or rules. It avoids both excess grammar size and excessive computational complexity.

  5. CS769 Spring 2010 Advanced Natural Language Processing Logistic Regression

    E-print Network

    Zhu, Xiaojin "Jerry"

    CS769 Spring 2010 Advanced Natural Language Processing Logistic Regression Lecturer: Xiaojin Zhu(y|x) directly. A model that estimates p(y|x) directly is known as a discriminative model. Logistic regression interpret it as the label probability. This is done via the logistic function: p(y = 1|x) = ( x) = 1 1 + exp
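
    As a rough illustration of the logistic function this record refers to, the sketch below computes p(y = 1|x) from a linear score. The names `theta`, `x`, `logistic`, and `label_probability` are assumptions introduced here for illustration and are not taken from the lecture notes.

```python
import math


def logistic(z: float) -> float:
    """Standard logistic (sigmoid) function 1 / (1 + exp(-z)).

    The two branches avoid overflow in exp() for large |z|.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def label_probability(theta, x):
    """p(y = 1 | x) for a linear score theta . x (hypothetical parameter names)."""
    score = sum(t * xi for t, xi in zip(theta, x))
    return logistic(score)
```

    A score of zero maps to probability 0.5, and large positive or negative scores approach 1 or 0, which is why the output can be read as a label probability.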

  6. Word-based self-indexes for natural language text

    Microsoft Academic Search

    Antonio Fariña; Nieves R. Brisaboa; Gonzalo Navarro; Francisco Claude

    2012-01-01

    The inverted index supports efficient full-text searches on natural language text collections. It requires some extra space over the compressed text that can be traded for search speed. It is usually fast for single-word searches, yet phrase searches require more expensive intersections. In this article we introduce a different kind of index. It replaces the text using essentially the same

  7. Natural-language access to databases-theoretical/technical issues

    SciTech Connect

    Moore, R.C.

    1982-01-01

    Although there have been many experimental systems for natural-language access to databases, with some now going into actual use, many problems in this area remain to be solved. The author presents descriptions of five problem areas that seem not to be adequately handled by any existing system.

  8. Conversing with management information systems in a natural language

    Microsoft Academic Search

    Robert W. Blanning

    1984-01-01

    Wouldn't it make life easier if people could communicate with a management information system in everyday language? In fact, there are already available a number of systems that allow users to do this with MISs that retrieve and display stored data and perform simple calculations. But what about being able to converse naturally with MISs that contain decision models?

  9. From Natural Language to RDF Graphs with Pregroups Antonin Delpeuch

    E-print Network

    From Natural Language to RDF Graphs with Pregroups Antonin Delpeuch École Normale Supérieure 45 rue to the formal syntax of RDF, an existential conjunctive logic widely used on the Semantic Web. Our translation extensional models. We establish a one-to-one correspondence between extensional models and RDF models

  10. Orwell's 1984: Natural Language Searching and the Contemporary Metaphor.

    ERIC Educational Resources Information Center

    Dadlez, Eva M.

    1984-01-01

    Describes a natural language searching strategy for retrieving current material which has bearing on George Orwell's "1984," and identifies four main themes (technology, authoritarianism, press and psychological/linguistic implications of surveillance, political oppression) which have emerged from cross-database searches of the "Big Brother"…

  11. A natural language generation system based on dynamic knowledge base

    Microsoft Academic Search

    Wang Weiwei; Lin Biqin; Chen Fang; Yuan Baozong

    1996-01-01

    This paper describes a domain-specialized natural language generation system, the Intelligent Scenes Inquiring System (ISIS), which automatically produces Chinese text with information about a park and the scenes in it when the user asks a question in Chinese. One feature of ISIS is the discourse focusing strategy in text planning, another is the use of a dynamic knowledge base that includes a

  12. Time, Tense and Aspect in Natural Language Database Interfaces

    Microsoft Academic Search

    Ion Androutsopoulos; Graeme D. Ritchie; Peter Thanisch

    1998-01-01

    Most existing natural language database interfaces (NLDBs) were designed to be used with database systems that provide very limited facilities for manipulating time-dependent data, and they do not adequately support temporal linguistic mechanisms (verb tenses, temporal adverbials, temporal subordinate clauses, etc.). The database community is becoming increasingly interested in temporal database systems, which are intended to store and manipulate

  13. Research at Yale in Natural Language Processing. Research Report #84.

    ERIC Educational Resources Information Center

    Schank, Roger C.

    This report summarizes the capabilities of five computer programs at Yale that do automatic natural language processing as of the end of 1976. For each program an introduction to its overall intent is given, followed by the input/output, a short discussion of the research underlying the program, and a prognosis for future development. The programs…

  14. ACPYPE - AnteChamber PYthon Parser interfacE

    PubMed Central

    2012-01-01

    Background: ACPYPE (or AnteChamber PYthon Parser interfacE) is a wrapper script around the ANTECHAMBER software that simplifies the generation of small molecule topologies and parameters for a variety of molecular dynamics programmes like GROMACS, CHARMM and CNS. It is written in the Python programming language and was developed as a tool for interfacing with other Python based applications such as the CCPN software suite (for NMR data analysis) and ARIA (for structure calculations from NMR data). ACPYPE is open source code, under GNU GPL v3, and is available as a stand-alone application at http://www.ccpn.ac.uk/acpype and as a web portal application at http://webapps.ccpn.ac.uk/acpype. Findings: We verified the topologies generated by ACPYPE in three ways: by comparing with default AMBER topologies for standard amino acids; by generating and verifying topologies for a large set of ligands from the PDB; and by recalculating the structures for 5 protein–ligand complexes from the PDB. Conclusions: ACPYPE is a tool that simplifies the automatic generation of topology and parameters in different formats for different molecular mechanics programmes, including calculation of partial charges, while being object oriented for integration with other applications. PMID:22824207

  15. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

    ERIC Educational Resources Information Center

    Jarman, Jay

    2011-01-01

    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  16. CS 585: Natural Language Processing James Allen, Natural Language Understanding, Second Edition. Benjamin/Cummings, Menlo Park, CA, 1995.

    E-print Network

    Heller, Barbara

    and generating responses in intelligent tutoring systems 1.5 hours · Student project reports 3.0 hours Total 45 Edition. Benjamin/Cummings, Menlo Park, CA, 1995. Judith Markowitz, Using Speech Recognition Prentice Hall corpora and other natural language resources. Choose a speech recognition system for a given application

  17. Natural language understanding and speech recognition for industrial vision systems

    NASA Astrophysics Data System (ADS)

    Batchelor, Bruce G.

    1992-11-01

    The accepted method of programming machine vision systems for a new application is to incorporate sub-routines from a standard library into code, written specially for the given task. Typical programming languages that might be used here are Pascal, C, and assembly code, although other `conventional' (i.e., imperative) languages are often used instead. The representation of an algorithm to recognize a certain object, in the form of, say, a C language program is clumsy and unnatural, compared to the alternative process of describing the object itself and leaving the software to search for it. The latter method, known as declarative programming, is used extensively both when programming in Prolog and when people talk to one another in English, or other natural languages. Programs to understand a limited sub-set of a natural language can also be written conveniently in Prolog. The article considers the prospects for talking to an image processing system, using only slightly constrained English. Moderately priced speech recognition devices, which interface to a standard desk-top computer and provide a limited repertoire (200 words) as well as the ability to identify isolated words, are already available commercially. At the moment, the goal of talking in English to a computer is incompletely fulfilled. Yet, sufficient progress has been made to encourage greater effort in this direction.

  18. NLP and Linguistics Introduction to Natural Language Processing

    E-print Network

    Smith, David A.

    Language learning: Children listen to language [unsupervised] Children are corrected?? [supervised] Children

  19. Automatic and Unsupervised Methods in Natural Language Processing

    Microsoft Academic Search

    Johnny Bigert

    2005-01-01

    Natural language processing (NLP) means the computer-aided processing of language produced by a human. But human language is inherently irregular and the most reliable results are obtained when a human is involved in at least some part of the processing. However, manual work is time-consuming and expensive. This thesis focuses on what can be accomplished in NLP when manual work is kept to a minimum. We describe the construction of two tools that greatly simplify the implementation of automatic evaluation. They are used to implement several supervised, semi-supervised

  20. The application of structured learning in natural language processing

    Microsoft Academic Search

    Yizhao Ni; Craig Saunders; Sándor Szedmák; Mahesan Niranjan

    2010-01-01

    We propose a structured learning approach, max-margin structure (MMS), which is targeted at natural language processing (NLP) tasks. The architecture of our approach is shown to capture structural aspects of the problem domains, leading to demonstrable performance improvements on two NLP tasks: part-of-speech tagging and statistical machine translation (SMT). We present a perceptron-based online learning algorithm to train the model

  1. Towards Enhanced Usability of Natural Language Interfaces to Knowledge Bases

    Microsoft Academic Search

    Danica Damljanović; Kalina Bontcheva

    2009-01-01

    Many Natural Language Interfaces (NLIs) to knowledge bases have been developed in order to provide easy access to structured data for casual users. However, those that have reasonable performance are domain-specific and tend to require customisation for each new domain, which, from a developer’s perspective, makes them expensive to maintain and unattractive for practical applications spanning different domains. This paper

  2. Advanced Natural Language Processing, Spring 2013 Problem Set 3

    E-print Network

    Carreras, Xavier

    Advanced Natural Language Processing, Spring 2013 Problem Set 3 deadline: May 2 2013 Instructions: the current tag is a and the current word is capitalized f1,a(x1:n, i, j, y) = 1 if is capitalized(xi) and yi = a 0 otherwise · Type 2: the current tag is a and the current word is not capitalized f2,a(x1:n, i, j
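
    The two indicator feature types described in this record can be sketched roughly as follows. This is a simplified illustration: the signature `(words, i, tags)` and the helper name `make_capitalization_features` are assumptions made here, and the extra arguments in the garbled original signature are not reconstructed.

```python
def make_capitalization_features(tag):
    """Build the two indicator features for one tag (hypothetical helper)."""

    def f1(words, i, tags):
        # Type 1: fires when the word at position i is capitalized
        # and the tag at position i equals `tag`.
        return 1 if words[i][:1].isupper() and tags[i] == tag else 0

    def f2(words, i, tags):
        # Type 2: fires when the word at position i is NOT capitalized
        # and the tag at position i equals `tag`.
        return 1 if not words[i][:1].isupper() and tags[i] == tag else 0

    return f1, f2


# Toy usage with made-up words and tags.
words = ["Paris", "is", "nice"]
tags = ["N", "V", "A"]
f1_n, f2_n = make_capitalization_features("N")
f1_v, f2_v = make_capitalization_features("V")
```

    For any position whose tag matches, exactly one of the two features fires, so together they partition tagged words by capitalization.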

  3. Web-based models for natural language processing

    Microsoft Academic Search

    Mirella Lapata; Frank Keller

    2005-01-01

    Previous work demonstrated that Web counts can be used to approximate bigram counts, suggesting that Web-based frequencies should be useful for a wide variety of Natural Language Processing (NLP) tasks. However, only a limited number of tasks have so far been tested using Web-scale data sets. The present article overcomes this limitation by systematically investigating the performance of Web-based models

  4. Integration of Natural Language Processing Chains in Content Management Systems

    Microsoft Academic Search

    Diman Karagiozov; N. Genchev

    Modern web application hype revolves around a rich user interface experience. A lesser-known aspect of modern applications is the use of techniques that enable the intelligent processing of information and add value that can’t be delivered by other means. This article presents a scalable, maintainable and inter-operable approach for combining content management functionalities with natural language processing (NLP) tools. The

  5. Applications of Weighted Automata in Natural Language Processing

    Microsoft Academic Search

    Kevin Knight; Jonathan May

    2009-01-01

    We explain why weighted automata are an attractive knowledge representation for natural language problems. We first trace the close historical ties between the two fields, then present two complex real-world problems, transliteration and translation. These problems are usefully decomposed into a pipeline of weighted transducers, and weights can be set to maximize the likelihood of a training corpus using standard

  6. Automated Encoding of Clinical Documents Based on Natural Language Processing

    Microsoft Academic Search

    Carol Friedman; Lyudmila Shagina; Yves Lussier; George Hripcsak

    2004-01-01

    Objective: The aim of this study was to develop a method based on natural language processing (NLP) that automatically maps an entire clinical document to codes with modifiers and to quantitatively evaluate the method. Methods: An existing NLP system, MedLEE, was adapted to automatically generate codes. The method involves matching of structured output generated by MedLEE consisting of findings and modifiers to obtain

  7. Natural Language Processing Across Time: An Empirical Investigation on Italian

    Microsoft Academic Search

    Marco Pennacchiotti; Fabio Massimo Zanzotto

    2008-01-01

    In this paper, we study how existing natural language processing tools for Italian perform on ancient texts. The first goal is to understand to what extent such tools can be used “as they are” for the automatic analysis of old literary works. Indeed, while NLP tools for Italian achieve good performance today, it is not clear if they could be

  8. Text-based requirements preprocessing using nature language processing techniques

    Microsoft Academic Search

    Huafeng Chen; Keqing He; Peng Liang; Rong Li

    2010-01-01

    In a distributed environment, non-technical stakeholders are required to write down requirement statements by themselves. Natural language is the first choice for them. In order to alleviate the burden on requirements engineers of reading free-text requirement documents, we extract goals and relevant stakeholders from requirement statements automatically in a computer-assisted way. In this paper, requirements are divided into system level

  9. How evolutionary algorithms are applied to statistical natural language processing

    Microsoft Academic Search

    Lourdes Araujo

    2007-01-01

    Statistical natural language processing (NLP) and evolutionary algorithms (EAs) are two very active areas of research which have been combined many times. In general, statistical models applied to deal with NLP tasks require designing specific algorithms to be trained and applied to process new texts. The development of such algorithms may be hard. This makes EAs attractive since they offer

  10. Tracking health disparities through natural-language processing.

    PubMed

    Wieland, Mark L; Wu, Stephen T; Kaggal, Vinod C; Yawn, Barbara P

    2013-03-01

    Health disparities and solutions are heterogeneous within and among racial and ethnic groups, yet existing administrative databases lack the granularity to reflect important sociocultural distinctions. We measured the efficacy of a natural-language-processing algorithm to identify a specific immigrant group. The algorithm demonstrated accuracy and precision in identifying Somali patients from the electronic medical records at a single institution. This technology holds promise to identify and track immigrants and refugees in the United States in local health care settings. PMID:23327237

  11. Towards a natural language semantics without functors and operands

    Microsoft Academic Search

    Miklós Erdélyi-Szabó; László Kálmán; Agi Kurucz

    2008-01-01

    The paper sets out to offer an alternative to the function/argument approach to the most essential aspects of natural language meanings. That is, we question the assumption that semantic completeness (of, e.g., propositions) or incompleteness (of, e.g., predicates) exactly replicate the corresponding grammatical concepts (of, e.g., sentences and verbs, respectively). We argue that even if one gives up this assumption,

  12. Head-Driven Statistical Models for Natural Language Parsing

    Microsoft Academic Search

    Michael Collins

    2003-01-01

    This article describes three statistical models for natural language parsing. The models extend methods from probabilistic context-free grammars to lexicalized grammars, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree. Independence assumptions then lead to parameters that encode the X-bar schema, subcategorization, ordering of complements,

  13. Mathematics as an Exact and Precise Language of Nature

    E-print Network

    Afsar Abbas

    2005-11-05

    One of the outstanding problems of philosophy of science and mathematics today is whether there is just "one" unique mathematics or the same can be bifurcated into "pure" and "applied" categories. A novel solution for this problem is offered here. This will allow us to appreciate the manner in which mathematics acts as an exact and precise language of nature. This has significant implications for Artificial Intelligence.

  14. Knowledge discovery and data mining to assist natural language understanding.

    PubMed

    Wilcox, A; Hripcsak, G

    1998-01-01

    As natural language processing systems become more frequent in clinical use, methods for interpreting the output of these programs become increasingly important. These methods require the effort of a domain expert, who must build specific queries and rules for interpreting the processor output. Knowledge discovery and data mining tools can be used instead of a domain expert to automatically generate these queries and rules. C5.0, a decision tree generator, was used to create a rule base for a natural language understanding system. A general-purpose natural language processor using this rule base was tested on a set of 200 chest radiograph reports. When a small set of reports, classified by physicians, was used as the training set, the generated rule base performed as well as lay persons, but worse than physicians. When a larger set of reports, using ICD9 coding to classify the set, was used for training the system, the rule base performed worse than the physicians and lay persons. It appears that a larger, more accurate training set is needed to increase performance of the method. PMID:9929336

  15. Representing Information in Patient Reports Using Natural Language Processing and the Extensible Markup Language

    Microsoft Academic Search

    Carol Friedman; George Hripcsak; Lyuda Shagina; Hongfang Liu

    1999-01-01

    Objective: To design a document model that provides reliable and efficient access to clinical information in patient reports for a broad range of clinical applications, and to implement an automated method using natural language processing that maps textual reports to a form consistent with the model. Methods: A document model that encodes structured clinical information in patient reports while retaining the original contents

  16. Applications of Natural Language Processing in Biodiversity Science

    PubMed Central

    Thessen, Anne E.; Cui, Hong; Mozzherin, Dmitry

    2012-01-01

    Centuries of biological knowledge are contained in the massive body of scientific literature, written for human-readability but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science. PMID:22685456

  17. Applications of natural language processing in biodiversity science.

    PubMed

    Thessen, Anne E; Cui, Hong; Mozzherin, Dmitry

    2012-01-01

    Centuries of biological knowledge are contained in the massive body of scientific literature, written for human-readability but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science. PMID:22685456

  18. Perceptron Training for a Wide-Coverage Lexicalized-Grammar Parser Stephen Clark

    E-print Network

    Koehn, Philipp

    Perceptron Training for a Wide-Coverage Lexicalized-Grammar Parser Stephen Clark Oxford University@it.usyd.edu.au Abstract This paper investigates perceptron training for a wide-coverage CCG parser and compares the perceptron with a log-linear model. The CCG parser uses a phrase-structure parsing model and dynamic

  19. Human task animation from performance models and natural language input

    NASA Technical Reports Server (NTRS)

    Esakov, Jeffrey; Badler, Norman I.; Jung, Moon

    1989-01-01

    Graphical manipulation of human figures is essential for certain types of human factors analyses such as reach, clearance, fit, and view. In many situations, however, the animation of simulated people performing various tasks may be based on more complicated functions involving multiple simultaneous reaches, critical timing, resource availability, and human performance capabilities. One rather effective means for creating such a simulation is through a natural language description of the tasks to be carried out. Given an anthropometrically-sized figure and a geometric workplace environment, various simple actions such as reach, turn, and view can be effectively controlled from language commands or standard NASA checklist procedures. The commands may also be generated by external simulation tools. Task timing is determined from actual performance models, if available, such as strength models or Fitts' Law. The resulting action specifications are animated on a Silicon Graphics Iris workstation in real-time.

  20. An approach for natural language understanding in GIS based on ontology

    NASA Astrophysics Data System (ADS)

    Zhou, Liguo; Feng, Xuezhi; She, Jiangfeng; Xie, Shunping

    2007-06-01

    A natural language interface can make a geographic information system (GIS) easy to use. It allows users to operate systems such as a digital city management system or a traffic guidance system quickly and conveniently in natural language. This paper discusses a method of natural language understanding in GIS based on ontology. Natural language understanding is generally applied in computer science and artificial intelligence research, yet in GIS it is mainly concerned with spatial information. To implement natural language understanding for spatial information properly, we use an ontology model. We first put forward a general process of natural language understanding in GIS and define the concept of the ontology; we then set up the ontology structure and the ontology-based understanding model, and describe the mechanism of natural language understanding based on ontology. Finally, we present a case study and a prototype, and discuss the limitations of the research and its expected future development.

  1. Children as Models for Computers: Natural Language Acquisition for Machine Learning

    E-print Network

    Paris-Sud XI, Université de

    Children as Models for Computers: Natural Language Acquisition for Machine Learning Leonor Becerra language in a very short time. Children are able to learn any natural language given the adequate input, if we are able to give machines the capacity of learning language as children do, maybe we could reach

  2. Signed or Spoken, Children need Natural Languages Daphne Bavelier, Elissa L. Newport, and Ted Supalla

    E-print Network

    DeAngelis, Gregory

    Signed or Spoken, Children need Natural Languages Daphne Bavelier, Elissa L. Newport, and Ted Supalla Signed or Spoken, Children Need Natural Languages Daphne Bavelier, Elissa L. Newport, and Ted Supalla Sign languages are as different, and as specific to their communities, as spoken languages

  3. Natural Language Processing as a Discipline at LLNL

    SciTech Connect

    Firpo, M A

    2005-02-04

    The field of Natural Language Processing (NLP) is described as it applies to the needs of LLNL in handling free-text. The state of the practice is outlined with the emphasis placed on two specific aspects of NLP: Information Extraction and Discourse Integration. A brief description is included of the NLP applications currently being used at LLNL. A gap analysis provides a look at where the technology needs work in order to meet the needs of LLNL. Finally, recommendations are made to meet these needs.

  4. Augmenting a database knowledge representation for natural language generation

    SciTech Connect

    McCoy, K.F.

    1982-01-01

    The knowledge representation is an important factor in natural language generation since it limits the semantic capabilities of the generation system. This paper identifies several information types in a knowledge representation that can be used to generate meaningful responses to questions about database structure. Creating such a knowledge representation, however, is a long and tedious process. A system is presented which uses the contents of the database to form part of this knowledge representation automatically. It employs three types of world knowledge axioms to ensure that the representation formed is meaningful and contains salient information. 7 references.

  5. UMLS knowledge for biomedical language processing.

    PubMed

    McCray, A T; Aronson, A R; Browne, A C; Rindflesch, T C; Razi, A; Srinivasan, S

    1993-04-01

    This paper describes efforts to provide access to the free text in biomedical databases. The focus of the effort is the development of SPECIALIST, an experimental natural language processing system for the biomedical domain. The system includes a broad coverage parser supported by a large lexicon, modules that provide access to the extensive Unified Medical Language System (UMLS) Knowledge Sources, and a retrieval module that permits experiments in information retrieval. The UMLS Metathesaurus and Semantic Network provide a rich source of biomedical concepts and their interrelationships. Investigations have been conducted to determine the type of information required to effect a map between the language of queries and the language of relevant documents. Mappings are never straightforward and often involve multiple inferences. PMID:8472004

  6. Automatic Item Generation via Frame Semantics: Natural Language Generation of Math Word Problems.

    ERIC Educational Resources Information Center

    Deane, Paul; Sheehan, Kathleen

    This paper is an exploration of the conceptual issues that have arisen in the course of building a natural language generation (NLG) system for automatic test item generation. While natural language processing techniques are applicable to general verbal items, mathematics word problems are particularly tractable targets for natural language…

  7. Towards Natural Language Processing: A Well-Formed Substring Table Approach to Understanding Garden Path Sentence

    Microsoft Academic Search

    Jia-li Du; Ping-fang Yu

    2010-01-01

    As computers have become more affordable and accessible, the theories and techniques of natural language processing (NLP) are increasingly used as a means for automatically decoding natural language. The well-formed substring table (WFST) is an efficient parsing algorithm used to decode natural language. The form (START, FINISH, LABEL → FOUND . TO FIND) is accepted by the system as its basic model, and its
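
    The edge record named in the abstract above (START, FINISH, LABEL, FOUND, TO FIND) can be illustrated with a small sketch. This is not the authors' implementation: the class name Edge, the function extend, and the toy categories S, NP, VP are assumptions made for illustration. The sketch shows only the standard chart-parsing step in which an incomplete edge absorbs an adjacent complete edge whose label it is still looking for.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Edge:
    """One chart entry: a span plus a partially recognised rule (hypothetical layout)."""
    start: int                  # position where the span begins
    finish: int                 # position where the span ends
    label: str                  # category being built, e.g. "S"
    found: Tuple[str, ...]      # categories already recognised
    to_find: Tuple[str, ...]    # categories still needed

    def is_complete(self) -> bool:
        return not self.to_find


def extend(active: Edge, complete: Edge) -> Optional[Edge]:
    """Combine an incomplete edge with an adjacent complete edge, if they fit."""
    if (active.to_find
            and complete.is_complete()
            and active.finish == complete.start
            and active.to_find[0] == complete.label):
        return Edge(active.start, complete.finish, active.label,
                    active.found + (complete.label,), active.to_find[1:])
    return None


# Toy usage: S -> NP . VP over [0, 1] meets a finished VP over [1, 3].
active = Edge(0, 1, "S", ("NP",), ("VP",))
complete = Edge(1, 3, "VP", ("V", "NP"), ())
result = extend(active, complete)
```

    In this toy run, result spans positions 0 to 3 with nothing left to find, which is how a full parse would be recorded in such a table.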

  8. Writing Application Protocol Parsers Jeffrey Kirby

    E-print Network

    Valtorta, Marco

    connection: pair of flows ... binpac Language Features: Integrating custom computation; C/C++ code may ... Tedious and error prone; protocols are complex; need to think about corner, or rare, cases; hacker analyzers ... binpac Language: declarative language; describes what computation should be performed

  9. On LR(k)-parsers of polynomial size Norbert Blum

    E-print Network

    Eckmiller, Rolf

    of the current right sentential form such that the unique rightmost derivation of the input can be computed structure. Using some ingenious data structures and increasing the parsing time by a small constant factor, the size of the extended parser can be reduced to O(|G|+#LA|N|k2 ). The parsing time is O(ld(input) + k

  11. Parsley: a Command-Line Parser for Astronomical Applications

    Microsoft Academic Search

    William Deich

    1996-01-01

    Parsley is a sophisticated keyword + value parser, packaged as a library of routines that offers an easy method for providing command-line arguments to programs. It makes it easy for the user to enter values, and it makes it easy for the programmer to collect and validate the user's entries. Parsley is tuned for astronomical applications: for example, dates entered

  12. Emerging Approach of Natural Language Processing in Opinion Mining: A Review

    Microsoft Academic Search

    Tai-Hoon Kim

    2010-01-01

    Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages. This paper outlines a framework to use computer and natural language techniques for various levels of learners to learn foreign languages in Computer-based Learning environment. We propose some ideas for using the computer as

  13. Literature-Based Knowledge Discovery using Natural Language Processing

    NASA Astrophysics Data System (ADS)

    Hristovski, D.; Friedman, C.; Rindflesch, T. C.; Peterlin, B.

    Literature-based discovery (LBD) is an emerging methodology for uncovering nonovert relationships in the online research literature. Making such relationships explicit supports hypothesis generation and discovery. Currently LBD systems depend exclusively on co-occurrence of words or concepts in target documents, regardless of whether relations actually exist between the words or concepts. We describe a method to enhance LBD through capture of semantic relations from the literature via use of natural language processing (NLP). This paper reports on an application of LBD that combines two NLP systems: BioMedLEE and SemRep, which are coupled with an LBD system called BITOLA. The two NLP systems complement each other to increase the types of information utilized by BITOLA. We also discuss issues associated with combining heterogeneous systems. Initial experiments suggest this approach can uncover new associations that were not possible using previous methods.

  14. From Web Directories to Ontologies: Natural Language Processing Challenges

    NASA Astrophysics Data System (ADS)

    Zaihrayeu, Ilya; Sun, Lei; Giunchiglia, Fausto; Pan, Wei; Ju, Qi; Chi, Mingmin; Huang, Xuanjing

    Hierarchical classifications are used pervasively by humans as a means to organize their data and knowledge about the world. One of their main advantages is that natural language labels, used to describe their contents, are easily understood by human users. However, at the same time, this is also one of their main disadvantages as these same labels are ambiguous and very hard to be reasoned about by software agents. This fact creates an insuperable hindrance for classifications to being embedded in the Semantic Web infrastructure. This paper presents an approach to converting classifications into lightweight ontologies, and it makes the following contributions: (i) it identifies the main NLP problems related to the conversion process and shows how they are different from the classical problems of NLP; (ii) it proposes heuristic solutions to these problems, which are especially effective in this domain; and (iii) it evaluates the proposed solutions by testing them on DMoz data.

  15. Natural language processing in biomedicine: a unified system architecture overview.

    PubMed

    Doan, Son; Conway, Mike; Phuong, Tu Minh; Ohno-Machado, Lucila

    2014-01-01

    In contemporary electronic medical records much of the clinically important data (signs and symptoms, symptom severity, disease status, etc.) are not provided in structured data fields but rather are encoded in clinician-generated narrative text. Natural language processing (NLP) provides a means of unlocking this important data source for applications in clinical decision support, quality assurance, and public health. This chapter provides an overview of representative NLP systems in biomedicine based on a unified architectural view. A general architecture in an NLP system consists of two main components: background knowledge that includes biomedical knowledge resources and a framework that integrates NLP tools to process text. Systems differ in both components, which we review briefly. Additionally, the challenge facing current research efforts in biomedical NLP includes the paucity of large, publicly available annotated corpora, although initiatives that facilitate data sharing, system evaluation, and collaborative work between researchers in clinical NLP are starting to emerge. PMID:24870142

  16. Natural Language Processing Methods and Systems for Biomedical Ontology Learning

    PubMed Central

    Liu, Kaihong; Hogan, William R.; Crowley, Rebecca S.

    2010-01-01

    While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they must achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships as well as difficulty in updating the ontology as knowledge changes. Methodologies developed in the fields of natural language processing, information extraction, information retrieval and machine learning provide techniques for automating the enrichment of an ontology from free-text documents. In this article, we review existing methodologies and developed systems, and discuss how existing methods can benefit the development of biomedical ontologies. PMID:20647054

  17. Spatial and numerical abilities without a complete natural language

    PubMed Central

    Hyde, Daniel C.; Winkler-Rhoades, Nathan; Lee, Sang-Ah; Izard, Veronique; Shapiro, Kevin A.; Spelke, Elizabeth S.

    2011-01-01

    We studied the cognitive abilities of a 13-year-old deaf child, deprived of most linguistic input from late infancy, in a battery of tests designed to reveal the nature of numerical and geometrical abilities in the absence of a full linguistic system. Tests revealed widespread proficiency in basic symbolic and non-symbolic numerical computations involving the use of both exact and approximate numbers. Tests of spatial and geometrical abilities revealed an interesting patchwork of age-typical strengths and localized deficits. In particular, the child performed extremely well on navigation tasks involving geometrical or landmark information presented in isolation, but very poorly on otherwise similar tasks that required the combination of the two types of spatial information. Tests of number- and space-specific language revealed proficiency in the use of number words and deficits in the use of spatial terms. This case suggests that a full linguistic system is not necessary to reap the benefits of linguistic vocabulary on basic numerical tasks. Furthermore, it suggests that language plays an important role in the combination of mental representations of space. PMID:21168425

  18. Constructing Concept Schemes From Astronomical Telegrams Via Natural Language Clustering

    NASA Astrophysics Data System (ADS)

    Graham, Matthew; Zhang, M.; Djorgovski, S. G.; Donalek, C.; Drake, A. J.; Mahabal, A.

    2012-01-01

    The rapidly emerging field of time domain astronomy is one of the most exciting and vibrant new research frontiers, ranging in scientific scope from studies of the Solar System to extreme relativistic astrophysics and cosmology. It is being enabled by a new generation of large synoptic digital sky surveys - LSST, PanStarrs, CRTS - that cover large areas of sky repeatedly, looking for transient objects and phenomena. One of the biggest challenges facing these is the automated classification of transient events, a process that needs machine-processible astronomical knowledge. Semantic technologies enable the formal representation of concepts and relations within a particular domain. ATELs (http://www.astronomerstelegram.org) are a commonly-used means for reporting and commenting upon new astronomical observations of transient sources (supernovae, stellar outbursts, blazar flares, etc). However, they are loose and unstructured and employ scientific natural language for description: this makes automated processing of them - a necessity within the next decade with petascale data rates - a challenge. Nevertheless they represent a potentially rich corpus of information that could lead to new and valuable insights into transient phenomena. This project lies in the cutting-edge field of astrosemantics, a branch of astroinformatics, which applies semantic technologies to astronomy. The ATELs have been used to develop an appropriate concept scheme - a representation of the information they contain - for transient astronomy using hierarchical clustering of processed natural language. This allows us to automatically organize ATELs based on the vocabulary used. We conclude that we can use simple algorithms to process and extract meaning from astronomical textual data.

  19. Tasking and sharing sensing assets using controlled natural language

    NASA Astrophysics Data System (ADS)

    Preece, Alun; Pizzocaro, Diego; Braines, David; Mott, David

    2012-06-01

    We introduce an approach to representing intelligence, surveillance, and reconnaissance (ISR) tasks at a relatively high level in controlled natural language. We demonstrate that this facilitates both human interpretation and machine processing of tasks. More specifically, it allows the automatic assignment of sensing assets to tasks, and the informed sharing of tasks between collaborating users in a coalition environment. To enable automatic matching of sensor types to tasks, we created a machine-processable knowledge representation based on the Military Missions and Means Framework (MMF), and implemented a semantic reasoner to match task types to sensor types. We combined this mechanism with a sensor-task assignment procedure based on a well-known distributed protocol for resource allocation. In this paper, we re-formulate the MMF ontology in Controlled English (CE), a type of controlled natural language designed to be readable by a native English speaker whilst representing information in a structured, unambiguous form to facilitate machine processing. We show how CE can be used to describe both ISR tasks (for example, detection, localization, or identification of particular kinds of object) and sensing assets (for example, acoustic, visual, or seismic sensors, mounted on motes or unmanned vehicles). We show how these representations enable an automatic sensor-task assignment process. Where a group of users are cooperating in a coalition, we show how CE task summaries give users in the field a high-level picture of ISR coverage of an area of interest. This allows them to make efficient use of sensing resources by sharing tasks.

  20. Knowledge-Assisted Document Retrieval: I. The Natural-Language Interface. II. The Retrieval Process.

    ERIC Educational Resources Information Center

    Biswas, Gautam; And Others

    1987-01-01

    Two articles describe a model for processing natural-language queries in information retrieval systems. Part I proposes a language interface based on fuzzy set techniques to handle the uncertainty inherent in natural-language semantics. Part II develops a model of the retrieval system and describes an implementation using a knowledge-based systems…

  1. The integration hypothesis of human language evolution and the nature of contemporary languages

    E-print Network

    Miyagawa, Shigeru

    How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa et al. (2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, that holds that human language ...

  2. NLGbAse: A Free Linguistic Resource for Natural Language Processing Systems

    Microsoft Academic Search

    Eric Charton; Juan Manuel Torres Moreno

    2010-01-01

    Availability of labeled language resources, such as annotated corpora and domain dependent labeled language resources is crucial for experiments in the field of Natural Language Processing. Most often, due to lack of resources, manual verification and annotation of electronic text material is a prerequisite for the development of NLP tools. In the context of under-resourced language, the lack of corpora

  3. Natural and Artificial Intelligence, Language, Consciousness, Emotion, and Anticipation

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2010-11-01

    The classical paradigm of the neural brain as the seat of human natural intelligence is too restrictive. This paper defends the idea that the neural ectoderm is the actual brain, based on the development of the human embryo. Indeed, the neural ectoderm includes the neural crest, given by pigment cells in the skin and ganglia of the autonomic nervous system, and the neural tube, given by the brain, the spinal cord, and motor neurons. So the brain is completely integrated in the ectoderm, and cannot work alone. The paper presents fundamental properties of the brain as follows. Firstly, Paul D. MacLean proposed the triune human brain, which consists of three brains in one, following the species evolution, given by the reptilian complex, the limbic system, and the neo-cortex. Secondly, the consciousness and conscious awareness are analysed. Thirdly, the anticipatory unconscious free will and conscious free veto are described in agreement with the experiments of Benjamin Libet. Fourthly, the main section explains the development of the human embryo and shows that the neural ectoderm is the whole neural brain. Fifthly, a conjecture is proposed that the neural brain is completely programmed with scripts written in biological low-level and high-level languages, in a manner similar to the programmed cells by the genetic code. Finally, it is concluded that the proposition of the neural ectoderm as the whole neural brain is a breakthrough in the understanding of natural intelligence, and also in the future design of robots with artificial intelligence.

  4. Using a natural language and gesture interface for unmanned vehicles

    NASA Astrophysics Data System (ADS)

    Perzanowski, Dennis; Schultz, Alan C.; Adams, William; Marsh, Elaine

    2000-07-01

    Unmanned vehicles, such as mobile robots, must exhibit adjustable autonomy. They must be able to be self-sufficient when the situation warrants; however, as they interact with each other and with humans, they must exhibit an ability to dynamically adjust their independence or dependence as co-operative agents attempting to achieve some goal. This is what we mean by adjustable autonomy. We have been investigating various modes of communication that enhance a robot's capability to work interactively with other robots and with humans. Specifically, we have been investigating how natural language and gesture can provide a user-friendly interface to mobile robots. We have extended this initial work to include semantic and pragmatic procedures that allow humans and robots to act co-operatively, based on whether or not goals have been achieved by the various agents in the interaction. By processing commands that are either spoken or initiated by clicking buttons on a Personal Digital Assistant and by gesturing either naturally or symbolically, we are tracking the various goals of the interaction, the agent involved in the interaction, and whether or not the goal has been achieved. The various agents involved in achieving the goals are each aware of their own and others' goals and what goals have been stated or accomplished so that eventually any member of the group, be it robot or a human, if necessary, can interact with the other members to achieve the stated goals of a mission.

  5. Toward the use of speech and natural language technology in intervention for a language-disordered population

    Microsoft Academic Search

    Jill Fain Lehman

    1998-01-01

    We describe the design of Simone Says an interactive software environment for language remediation that brings together research in speech recognition, natural language processing and computer-aided instruction. The underlying technology for the implementation and the system's eventual evaluation are also discussed. 1 Motivation The Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) defines pervasive developmental disorders (alternatively, autistic spectrum disorders

  6. Automated Encoding of Clinical Documents Based on Natural Language Processing

    PubMed Central

    Friedman, Carol; Shagina, Lyudmila; Lussier, Yves; Hripcsak, George

    2004-01-01

    Objective: The aim of this study was to develop a method based on natural language processing (NLP) that automatically maps an entire clinical document to codes with modifiers and to quantitatively evaluate the method. Methods: An existing NLP system, MedLEE, was adapted to automatically generate codes. The method involves matching of structured output generated by MedLEE consisting of findings and modifiers to obtain the most specific code. Recall and precision applied to Unified Medical Language System (UMLS) coding were evaluated in two separate studies. Recall was measured using a test set of 150 randomly selected sentences, which were processed using MedLEE. Results were compared with a reference standard determined manually by seven experts. Precision was measured using a second test set of 150 randomly selected sentences from which UMLS codes were automatically generated by the method and then validated by experts. Results: Recall of the system for UMLS coding of all terms was .77 (95% CI .72–.81), and for coding terms that had corresponding UMLS codes recall was .83 (.79–.87). Recall of the system for extracting all terms was .84 (.81–.88). Recall of the experts ranged from .69 to .91 for extracting terms. The precision of the system was .89 (.87–.91), and precision of the experts ranged from .61 to .91. Conclusion: Extraction of relevant clinical information and UMLS coding were accomplished using a method based on NLP. The method appeared to be comparable to or better than six experts. The advantage of the method is that it maps text to codes along with other related information, rendering the coded output suitable for effective retrieval. PMID:15187068
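
    The recall and precision figures quoted in the abstract above follow the usual set-based definitions, which a short sketch can make concrete. The function name precision_recall and the placeholder code strings are assumptions for illustration; they are not real UMLS codes and this is not the evaluation procedure used in the study.

```python
def precision_recall(system_codes, reference_codes):
    """Return (precision, recall) of system output against a reference standard."""
    system = set(system_codes)
    reference = set(reference_codes)
    true_positives = len(system & reference)
    # Precision: share of the system's codes that the reference confirms.
    precision = true_positives / len(system) if system else 0.0
    # Recall: share of the reference codes that the system found.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Placeholder identifiers only; "code_a" and friends stand in for coded terms.
system_output = ["code_a", "code_b", "code_c", "code_d"]
reference_standard = ["code_a", "code_b", "code_e"]
precision, recall = precision_recall(system_output, reference_standard)
```

    Here two of the four system codes are confirmed (precision 0.5) and two of the three reference codes are found (recall about 0.67).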

  7. Semantic Grammar: A Technique for Constructing Natural Language Interfaces to Instructional Systems.

    ERIC Educational Resources Information Center

    Burton, Richard R.; Brown, John Seely

    A major obstacle to the effective educational use of computers is the lack of a natural means of communication between the student and the computer. This report describes a technique for generating such natural language front-ends for advanced instructional systems. It discusses: (1) the essential properties of a natural language front-end, (2)…

  8. An Evaluation of Strategies for Selective Utterance Verification for Spoken Natural Language Dialog

    E-print Network

    Smith, Ronnie W.

    An Evaluation of Strategies for Selective Utterance Verification for Spoken Natural Language Dialog ... Selective Verification of Questionable User Inputs: Every system that uses natural language understanding ... different users, 141 problem-solving dialogs, and 2840 user utterances, the Circuit Fix-It Shop natural

  9. Neurolinguistics and psycholinguistics as a basis for computer acquisition of natural language

    SciTech Connect

    Powers, D.M.W.

    1983-04-01

    Research into natural language understanding systems for computers has concentrated on implementing particular grammars and grammatical models of the language concerned. This paper presents a rationale for research into natural language understanding systems based on neurological and psychological principles. Important features of the approach are that it seeks to place the onus of learning the language on the computer, and that it seeks to make use of the vast wealth of relevant psycholinguistic and neurolinguistic theory. 22 references.

  10. Logical connectives in natural language: a cultural evolutionary approach 

    E-print Network

    van Wijk, Maarten

    2006-01-01

    mechanism that links the proposed causes to language form. The evolutionary accounts propose that cultural evolution be that mechanism. However, these accounts use a representation of language structure that is too impoverished to answer the research...

  11. Emerging Approach of Natural Language Processing in Opinion Mining: A Review

    NASA Astrophysics Data System (ADS)

    Kim, Tai-Hoon

    Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages. This paper outlines a framework to use computer and natural language techniques for various levels of learners to learn foreign languages in Computer-based Learning environment. We propose some ideas for using the computer as a practical tool for learning foreign language where the most of courseware is generated automatically. We then describe how to build Computer Based Learning tools, discuss its effectiveness, and conclude with some possibilities using on-line resources.

  12. Storing files in a parallel computing system based on user-specified parser function

    DOEpatents

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron

    2014-10-21

    Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.

  13. A Cache-Based Natural Language Model for Speech Recognition

    Microsoft Academic Search

    Roland Kuhn; Renato De Mori

    1990-01-01

    Speech-recognition systems must often decide between competing ways of breaking up the acoustic input into strings of words. Since the possible strings may be acoustically similar, a language model is required; given a word string, the model returns its linguistic probability. Several Markov language models are discussed. A novel kind of language model which reflects short-term patterns of word use
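
    The idea of a language model that reflects short-term patterns of word use can be sketched as a static model interpolated with a cache of recently seen words. The sketch below is a deliberately simplified unigram version and not the model of the paper: the class name CacheUnigramModel, the cache size, and the interpolation weight are all assumed values chosen for illustration.

```python
from collections import Counter, deque


class CacheUnigramModel:
    """Static unigram probabilities mixed with relative frequencies in a recent-word cache."""

    def __init__(self, static_probs, cache_size=200, cache_weight=0.2):
        self.static_probs = static_probs        # word -> long-term probability
        self.cache = deque(maxlen=cache_size)   # oldest words drop out automatically
        self.cache_weight = cache_weight        # assumed mixing weight, not from the paper

    def observe(self, word):
        """Record a word that has just been recognised."""
        self.cache.append(word)

    def prob(self, word):
        """Interpolated probability; falls back to the static model if the cache is empty."""
        static_p = self.static_probs.get(word, 0.0)
        if not self.cache:
            return static_p
        cache_p = Counter(self.cache)[word] / len(self.cache)
        return (1 - self.cache_weight) * static_p + self.cache_weight * cache_p


model = CacheUnigramModel({"a": 0.5, "b": 0.5}, cache_size=4, cache_weight=0.5)
before = model.prob("a")
model.observe("a")
model.observe("a")
after = model.prob("a")
```

    After the word "a" has been observed twice, its probability rises above the static value while that of "b" falls, which is the short-term adaptation effect the abstract describes.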

  14. Parallel Earley's parser and its application to syntactic image analysis

    SciTech Connect

    Chiang, Y.P.; Fu, K.S.

    1983-01-01

    A complete Earley parser which includes recognition and parse extraction has been implemented on a triangular array of processors. The detailed analysis of the complete parser is given. The recognition algorithm is executed in parallel by adopting a new operator, x*, and restricting the input context-free grammar to be lambda-free. The parse extraction algorithm which follows recognition uses a nonrecursive subroutine to generate the correct right-parse in parallel. A special busing arrangement within this array enables the right data to reach the right place at the right time. Simulation examples are provided. The results show that when a string of length n is under testing, at the system time 2n + 1, the correct right-parse will be obtained if the string is accepted. 15 references.
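
    For readers unfamiliar with Earley recognition, a minimal sequential sketch follows. It is not the parallel triangular-array algorithm of the paper and does not perform parse extraction; it only shows the predict, scan, and complete steps for a grammar without empty productions, the same restriction the abstract mentions. The function name earley_recognize and the toy grammar are assumptions for illustration.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if tokens derive from start.

    grammar maps each nonterminal to a list of right-hand sides (tuples of
    symbols). Empty right-hand sides are assumed not to occur.
    """
    # An item is (lhs, rhs, dot position, origin index).
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:
                    # Predict: start every rule for the expected nonterminal.
                    for production in grammar[symbol]:
                        item = (symbol, production, 0, i)
                        if item not in chart[i]:
                            chart[i].add(item)
                            agenda.append(item)
                elif i < len(tokens) and tokens[i] == symbol:
                    # Scan: the expected terminal matches the next token.
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:
                # Complete: advance every item that was waiting for lhs.
                for p_lhs, p_rhs, p_dot, p_origin in list(chart[origin]):
                    if p_dot < len(p_rhs) and p_rhs[p_dot] == lhs:
                        item = (p_lhs, p_rhs, p_dot + 1, p_origin)
                        if item not in chart[i]:
                            chart[i].add(item)
                            agenda.append(item)

    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])


# Toy left-recursive grammar: S -> S + a | a
toy_grammar = {"S": [("S", "+", "a"), ("a",)]}
accepted = earley_recognize(toy_grammar, "S", ["a", "+", "a"])
rejected = earley_recognize(toy_grammar, "S", ["a", "+"])
```

    The toy grammar is left-recursive on purpose, since handling such rules without looping is one of the usual reasons for choosing Earley's method.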

  15. Automating curation using a natural language processing pipeline

    PubMed Central

    Alex, Beatrice; Grover, Claire; Haddow, Barry; Kabadjov, Mijail; Klein, Ewan; Matthews, Michael; Tobin, Richard; Wang, Xinglong

    2008-01-01

    Background: The tasks in BioCreative II were designed to approximate some of the laborious work involved in curating biomedical research papers. The approach to these tasks taken by the University of Edinburgh team was to adapt and extend the existing natural language processing (NLP) system that we have developed as part of a commercial curation assistant. Although this paper concentrates on using NLP to assist with curation, the system can be equally employed to extract types of information from the literature that is immediately relevant to biologists in general. Results: Our system was among the highest performing on the interaction subtasks, and competitive performance on the gene mention task was achieved with minimal development effort. For the gene normalization task, a string matching technique that can be quickly applied to new domains was shown to perform close to average. Conclusion: The technologies being developed were shown to be readily adapted to the BioCreative II tasks. Although high performance may be obtained on individual tasks such as gene mention recognition and normalization, and document classification, tasks in which a number of components must be combined, such as detection and normalization of interacting protein pairs, are still challenging for NLP systems. PMID:18834488

  16. Automatic retrieval of bone fracture knowledge using natural language processing.

    PubMed

    Do, Bao H; Wu, Andrew S; Maley, Joan; Biswal, Sandip

    2013-08-01

    Natural language processing (NLP) techniques to extract data from unstructured text into formal computer representations are valuable for creating robust, scalable methods to mine data in medical documents and radiology reports. As voice recognition (VR) becomes more prevalent in radiology practice, there is opportunity for implementing NLP in real time for decision-support applications such as context-aware information retrieval. For example, as the radiologist dictates a report, an NLP algorithm can extract concepts from the text and retrieve relevant classification or diagnosis criteria or calculate disease probability. NLP can work in parallel with VR to potentially facilitate evidence-based reporting (for example, automatically retrieving the Bosniak classification when the radiologist describes a kidney cyst). For these reasons, we developed and validated an NLP system which extracts fracture and anatomy concepts from unstructured text and retrieves relevant bone fracture knowledge. We implement our NLP in an HTML5 web application to demonstrate a proof-of-concept feedback NLP system which retrieves bone fracture knowledge in real time. PMID:23053906

  17. Perceptron Training for a Wide-Coverage Lexicalized-Grammar Parser Stephen Clark

    E-print Network

    Curran, James R.

    Perceptron Training for a Wide-Coverage Lexicalized-Grammar Parser. Stephen Clark, Oxford University ... @it.usyd.edu.au. Abstract: This paper investigates perceptron training for a wide-coverage CCG parser and compares the perceptron with a log-linear model. The CCG parser uses a phrase-structure parsing model and dynamic

  18. FrAG, a Hybrid Constraint Grammar Parser for French

    Microsoft Academic Search

    Eckhard Bick

    2010-01-01

    This paper describes a hybrid tagger/parser for French (FrAG), and presents results from ongoing development work, corpus annotation and evaluation. The core of the system is a sentence scope Constraint Grammar (CG), with linguist-written rules. However, unlike traditional CG, the system uses hybrid techniques on both its morphological input side and its syntactic output side. Thus, FrAG draws on a

  19. Integrating natural language into the word graph search for simultaneous speech recognition and understanding

    Microsoft Academic Search

    Stephanie Seneff; Michael K. McCandless; Victor Zue

    1995-01-01

    This paper describes work aimed towards replacing traditional N-gram language models in a recognizer with a more linguistically motivated language model. We report on experiments involving an A* search through a large word graph of candidate hypotheses, within the ARPA ATIS domain. We show that the TINA natural language system, when properly trained, can compete favorably with a traditional word

  20. Sam Noble Oklahoma Museum of Natural History Department of Native American Languages

    E-print Network

    Oklahoma, University of

    Sam Noble Oklahoma Museum of Natural History Department of Native American Languages Restrictions archives for language materials, especially for Native Americans. A. The Department of Native American collections in the Department of Native American Languages archives: · central location for Tribal access

  1. The Bermuda Triangle: Natural Language Semantics Between Linguistics, Knowledge Representation, and Knowledge Processing

    Microsoft Academic Search

    Peter Bosch

    1991-01-01

    Linguistic parameters alone cannot determine the interpretation of natural language utterances. They can only constrain their interpretation and must leave the rest to other knowledge sources and other processes: language understanding is not just a matter of knowing the language, but also to a considerable degree a matter of logical inference and world knowledge. This is no news as far

  2. Selectional restrictions in natural language sentence generation Raymond Kozlowski, Kathleen F. McCoy, and K. Vijay-Shanker

    E-print Network

    McCoy, Kathleen F.

    Selectional restrictions in natural language sentence generation Raymond Kozlowski, Kathleen F. Mc selectional restrictions can be naturally incorporated into our generation architecture and our notion of a lexico-grammatical resource. Keywords: Natural language generation, Selectional restrictions, Lexical

  3. Selectional restrictions in natural language sentence generation Raymond Kozlowski, Kathleen F. McCoy, and K. Vijay-Shanker

    E-print Network

    McCoy, Kathleen F.

    Selectional restrictions in natural language sentence generation Raymond Kozlowski, Kathleen F. Mc selectional restrictions can be naturally incorporated into our generation architecture and our notion of a lexico-grammatical resource. Keywords: Natural language generation, Selectional restrictions, Lexical

  4. Of Substance: The Nature of Language Effects on Entity Construal

    ERIC Educational Resources Information Center

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-01-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind [whisk; Imai, M., & Gentner, D.…

  5. Nature and Nurture in School-Based Second Language Achievement

    ERIC Educational Resources Information Center

    Dale, Philip S.; Harlaar, Nicole; Plomin, Robert

    2012-01-01

    Variability in achievement across learners is a hallmark of second language (L2) learning, especially in academic-based learning. The Twins Early Development Study (TEDS), based on a large, population-representative sample in the United Kingdom, provides the first opportunity to examine individual differences in second language achievement in a…

  6. Notes on the Nature of Bilingual Specific Language Impairment

    ERIC Educational Resources Information Center

    de Jong, Jan

    2010-01-01

    Johanne Paradis' Keynote Article can be read as a concise critical review of the research that focuses on the sometimes strained relationship between bilingualism and specific language impairment (SLI). In my comments I will add some thoughts based on our own research on the learning of Dutch as a second language (L2) by children with SLI.

  7. Planning in AI and Text Planning in Natural Language Jong-Gyun Lim

    E-print Network

    Planning in AI and Text Planning in Natural Language Generation Jong-Gyun Lim Columbia University the content and structure of the natural language text and that of other AI planning tasks. The problem of text planning and other AI planning problems have been studied separately from each other, and while

  8. ScratchTalk and Social Computation: Towards a natural language scripting model

    E-print Network

    ScratchTalk and Social Computation: Towards a natural language scripting model Ian Eslick MIT Media Lab 20 Ames St. E15-320R Cambridge, MA 02139 USA eslick@media.mit.edu ABSTRACT Natural Language- lenges. This paper introduces Social Computation, a theoretical model targeting both challenges

  9. Sydney OWL Syntax - towards a Controlled Natural Language Syntax for OWL 1.1

    E-print Network

    Schwitter, Rolf

    Sydney OWL Syntax - towards a Controlled Natural Language Syntax for OWL 1.1 Anne Cregan1,2 , Rolf new syntax that can be used to write and read OWL ontologies in Controlled Natural Language (CNL): a well-defined subset of the English language. Following the lead of Manchester OWL Syntax in making OWL

  10. Meta-Knowledge Annotation for Efficient Natural-Language Question-Answering

    E-print Network

    Veale, Tony

    a tight integration of natural language processing (NLP), information retrieval (IR) and information question Q, generate a natural-language representation nlp(Q) 2. From nlp(Q), generate an information-retrieval query ir(nlp(Q)) 3. Use ir(nlp(Q)) to retrieve D documents from an authoritative text archive 4

  11. Natural language processing for transparent communication between public administration and citizens

    Microsoft Academic Search

    Bernardo Magnini; Oliviero Stock; Carlo Strapparava

    2000-01-01

    This paper presents two projects concerned with the application of natural language processing technology for improving communication between Public Administration and citizens. The first project, GIST, is concerned with automatic multilingual generation of instructional texts for form-filling. The second project, TAMIC, aims at providing an interface for interactive access to information, centered on natural language processing and supposed to be used

  12. Large Lexicons for Natural Language Processing: Utilising the Grammar Coding System of LDOCE

    Microsoft Academic Search

    Branimir Boguraev; Ted Briscoe

    1987-01-01

    This article focusses on the derivation of large lexicons for natural language processing. We describe the development of a dictionary support environment linking a restructured version of the Longman Dictionary of Contemporary English to natural language processing systems. The process of restructuring the information in the machine readable version of the dictionary is discussed. The Longman grammar code system is

  13. The Role and Resolution of Textual Entailment in Natural Language Processing Applications

    Microsoft Academic Search

    Zornitsa Kozareva; Andrés Montoyo

    2006-01-01

    A fundamental phenomenon in Natural Language Processing concerns the semantic variability of expressions. Identifying that two texts express the same meaning with different words is a challenging problem. We discuss the role of entailment for various Natural Language Processing applications and develop a machine learning system for their resolution. In our system, text similarity is based on the number

  14. Evolving Readable String Test Inputs Using a Natural Language Model to Reduce Human Oracle Cost

    E-print Network

    McMinn, Phil

    . In this paper, we apply a natural language model to the automatic generation of string inputs, with the aim in a variety of areas including natural language processing [7], where one of their applications is to assist with automatic translation [8], and in speech processing [9], where they are used to choose between the possible

  15. Class Diagram Extraction from Textual Requirements Using Natural Language Processing (NLP) Techniques

    Microsoft Academic Search

    Mohd Ibrahim; Rodina Ahmad

    2010-01-01

    The automation of class generation from natural language requirements is highly challenging. This paper proposes a method and a tool to facilitate the requirements analysis process and class diagram extraction from textual requirements, supporting natural language processing (NLP) and Domain Ontology techniques. Requirements engineers analyze requirements manually to come out with analysis artifacts such as a class diagram. The time spent on

  16. Testing of a Natural Language Retrieval System for a Full Text Knowledge Base.

    ERIC Educational Resources Information Center

    Bernstein, Lionel M.; Williamson, Robert E.

    1984-01-01

    The Hepatitis Knowledge Base (text of prototype information system) was used for modifying and testing "A Navigator of Natural Language Organized (Textual) Data" (ANNOD), a retrieval system which combines probabilistic, linguistic, and empirical means to rank individual paragraphs of full text for similarity to natural language queries proposed by…

  17. Success story in software engineering using NIAM (Natural language Information Analysis Methodology)

    SciTech Connect

    Eaton, S.M.; Eaton, D.S.

    1995-10-01

    To create an information system, we employ NIAM (Natural language Information Analysis Methodology). NIAM supports the goals of both the customer and the analyst completely understanding the information. We use the customer's own unique vocabulary, collect real examples, and validate the information in natural language sentences. Examples are discussed from a successfully implemented information system.

  18. Linear Algebra as a Natural Language for Special Relativity and Its Paradoxes.

    E-print Network

    Kaup, David J.

    Linear Algebra as a Natural Language for Special Relativity and Its Paradoxes. A talk by Prof. John WHERE: MSB 318 TIME: 11:00am USING BASIC LINEAR ALGEBRA as a natural language of special relativity relativity. A BASIC ASSUMPTION of special relativity (SR) is that the speed of light in a vacuum is the same

  19. AbstFinder, A Prototype Natural Language Text Abstraction Finder for Use in Requirements Elicitation

    Microsoft Academic Search

    Leah Goldin; Daniel M. Berry

    1997-01-01

    Abstraction identification is named as a key problem in requirements analysis. Typically, the abstractions must be found among the large mass of natural language text collected from the clients and users. This paper motivates and describes a new approach, based on traditional signal processing methods, for finding abstractions in natural language text and offers a new tool, AbstFinder as

  20. Natural language vs. Boolean query evaluation: a comparison of retrieval performance

    Microsoft Academic Search

    Howard R. Turtle

    1994-01-01

    The results of experiments comparing the relative performance of natural language and Boolean query formulations are presented. The experiments show that on average a current generation natural language system provides better retrieval performance than expert searchers using a Boolean retrieval system when searching full-text legal materials. Methodological issues are reviewed and the effect of database size on query formulation strategy

  1. Visual display, pointing, and natural language: the power of multimodal interaction

    Microsoft Academic Search

    Antonella De Angeli; Walter Gerbino; Giulia Cassano; Daniela Petrelli

    1998-01-01

    This paper examines user behavior during multimodal human-computer interaction (HCI). It discusses how pointing, natural language, and graphical layout should be integrated to enhance the usability of multimodal systems. Two experiments were run to study simulated systems capable of understanding written natural language and mouse-supported pointing gestures. Results allowed to: (a) develop a taxonomy of communication acts aimed at identifying

  2. Moving Toward a Unified Effort to Understand the Nature and Causes of Language Disorders

    E-print Network

    Rice, Mabel L.; Warren, Steven F.

    2005-01-01

    University of Kansas ADDRESS FOR CORRESPONDENCE Mabel L. Rice, University of Kansas, Child Language Doctoral Program, 1000 Sunnyside Avenue, 3031 Dole Center, Lawrence, KS 66045-7555. E-mail: mabel@ku.edu The nature and causes of language disorders..., neurocortical processes, cognitive neurolinguistics, behavioral phenotypes, and language intervention. This list is not complete by any means, but it does suggest the wide front of the current search for better knowledge about what causes language disorders...

  3. On the neurolinguistic nature of language abnormalities in Huntington's disease.

    PubMed Central

    Wallesch, C W; Fehrenbach, R A

    1988-01-01

    Spontaneous language of 18 patients suffering from Huntington's disease and 15 dysarthric controls suffering from Friedreich's ataxia was investigated. In addition, language functions in various modalities were assessed with the Aachen Aphasia Test (AAT). The Huntington patients exhibited deficits in the syntactical complexity of spontaneous speech and in the Token Test, confrontation naming, and language comprehension subtests of the AAT, which are interpreted as resulting from their dementia. Errors affecting word access mechanisms and production of syntactical structures as such were not encountered. PMID:2452241

  4. MATHEMATICS, RHYTHM, AND NATURAL LANGUAGE IN CHINESE AND EUROPEAN CULTURE

    E-print Network

    Spagnolo, Filippo

    's musical language has evolved in the direction of a complex architectural form, with complex rhythmic in China 21 An aesthetics of sound in China 24 Tempo in China 24 From the temple to the Chinese concert 26

  5. Computational Nonlinear Morphology with Emphasis on Semitic Languages. Studies in Natural Language Processing.

    ERIC Educational Resources Information Center

    Kiraz, George Anton

    This book presents a tractable computational model that can cope with complex morphological operations, especially in Semitic languages, and less complex morphological systems present in Western languages. It outlines a new generalized regular rewrite rule system that uses multiple finite-state automata to cater to root-and-pattern morphology,…

  6. The Present Use of Statistics in the Evaluation of NLP Parsers J.Entwisle D.M.W.Powers

    E-print Network

    The Present Use of Statistics in the Evaluation of NLP Parsers J.Entwisle D powers~cs, flinders, edu. au Abstract We are concerned that the quality of results produced by an NLP parser bears little, if any, relation to the percentage-results claimed by the various NLP parser

  7. Words Gone Wild: Language in Rolston's Philosophy of Nature

    Microsoft Academic Search

    Brenda Hausauer

    “Nature writing” has a long and rich tradition around the world. In the English-speaking world during the past two hundred years, for example, there has been an incredibly diverse number of essays, poems, and assorted manuscripts written about nature. Nature was the primary topic for all the English Romantic poets, the American Transcendentalists, the writer-naturalists who explored and charted America,

  8. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

    PubMed Central

    Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure. PMID:24982952

  9. Semantic Grammar: An Engineering Technique for Constructing Natural Language Understanding Systems.

    ERIC Educational Resources Information Center

    Burton, Richard R.

    In an attempt to overcome the lack of natural means of communication between student and computer, this thesis addresses the problem of developing a system which can understand natural language within an educational problem-solving environment. The nature of the environment imposes efficiency, habitability, self-teachability, and awareness of…

  10. Language Teaching in Literature Departments: Natural Partnership or Shotgun Marriage?

    ERIC Educational Resources Information Center

    Shumway, Nicolas

    1990-01-01

    It is noted that language teaching in literature departments in many colleges and universities does not confer the same benefits or offer the same rewards as teaching literature. The problems this imbalance creates in academic rigor and continuity and the questions addressing why this imbalance exists are discussed. (GLR)

  11. Integrating Corpus-Based Resources and Natural Language Processing.

    ERIC Educational Resources Information Center

    Cantos, Pascual

    2002-01-01

    Surveys computational linguistic tools presently available, but whose potential has neither been fully considered nor exploited to its full in modern computer assisted language learning (CALL). Discusses the rationale of DDL to engage learning, presenting typical data-driven learning (DDL)-activities, DDL-software, and potential extensions of…

  12. Inferring Speaker Affect in Spoken Natural Language Communication

    ERIC Educational Resources Information Center

    Pon-Barry, Heather Roberta

    2013-01-01

    The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…

  13. CS/Informatics Colloquium, 2006-03-03 Natural language,

    E-print Network

    Gasser, Michael

    research · Inter-relationships among - Knowledge - Language - (Power) - Informatics What this talk to particular regions of the world. · Easily accessible knowledge is largely western. Knowledge inequity: "stimulate respect for cultural identity, cultural and linguistic diversity, traditions and religions

  14. Evaluation of Machine Learning Methods for Natural Language Processing Tasks

    Microsoft Academic Search

    Walter Daelemans; Veronique Hoste

    2002-01-01

    We show that the methodology currently in use for comparing symbolic supervised learning methods applied to human language technology tasks is unreliable. We show that the interaction between algorithm parameter settings and feature selection within a single algorithm often accounts for a higher variation in results than differences between different algorithms or information sources. We illustrate this with experiments

  15. Unit 1001: The Nature of Meaning in Language.

    ERIC Educational Resources Information Center

    Minnesota Univ., Minneapolis. Center for Curriculum Development in English.

    This 10th-grade unit in Minnesota's "language-centered" curriculum introduces the complexity of linguistic meaning by demonstrating the relationships among linguistic symbols, their referents, their interpreters, and the social milieu. The unit begins with a discussion of Ray Bradbury's "The Kilimanjaro Machine," which illustrates how an otherwise…

  16. Natural Language Processing (NLP) as an Instrument of Raising the Language Awareness of Learners of English as a Second Language

    ERIC Educational Resources Information Center

    Dodigovic, Marina

    2003-01-01

    Based on the statistical regularity of certain error types, an interlanguage grammar could be devised and applied to develop an intelligent computer tool, capable not only of identifying the typical errors in L2 student writing, but also of making adequate corrections. The purpose of the corrections is to make the student aware of the language…

  17. Machine Learning for Efficient Natural-Language Processing

    Microsoft Academic Search

    Fernando C. N. Pereira

    2000-01-01

    Much of computational linguistics in the past thirty years assumed a ready supply of general and linguistic knowledge, and limitless computational resources to use it in understanding and producing language. However, accurate knowledge is hard to acquire and computational power is limited. Over the last ten years, inspired in part by advances in speech recognition, computational linguists have been investigating

  18. IR-NLI: an expert natural language interface to online data bases

    SciTech Connect

    Guida, G.; Tasso, C.

    1983-01-01

    Constructing natural language interfaces to computer systems often requires achievement of advanced reasoning and expert capabilities in addition to basic natural language understanding. In this paper the above issues are faced in the context of an actual application concerning the design of a natural language interface for access to online information retrieval systems. After a short discussion of the peculiarities of this application, which requires both natural language understanding and reasoning capabilities, the general architecture and fundamental design criteria of IR-NLI, a system presently being developed at the University of Udine, are presented. Attention is then focused on the basic functions of IR-NLI, namely, understanding and dialogue, strategy generation, and reasoning. Knowledge representation methods and algorithms adopted are also illustrated. A short example of interaction with IR-NLI is presented. Perspectives and directions for future research are also discussed. 15 references.

  19. A Tutorial on Dual Decomposition and Lagrangian Relaxation for Inference in Natural Language Processing

    E-print Network

    Rush, Alexander Matthew

    Dual decomposition, and more generally Lagrangian relaxation, is a classical method for combinatorial optimization; it has recently been applied to several inference problems in natural language processing (NLP). This ...

  20. Biomimetic design through natural language analysis to facilitate cross-domain information retrieval

    E-print Network

    Shu, Lily H.

    Biomimetic design through natural language analysis to facilitate cross-domain information, Toronto, Ontario, Canada (Received October 11, 2005; Accepted May 17, 2006) Abstract Biomimetic. Several instances of biomimetic design result from personal observations of biological phenomena. How

  1. Logic-Based Rhetorical Structuring for Natural Language Generation in Human-Computer Dialogue

    Microsoft Academic Search

    Vladimir Popescu; Jean Caelen; Corneliu Burileanu

    2007-01-01

    Rhetorical structuring is a field approached mostly by research in natural language (pragmatic) interpretation. However, in natural language generation (NLG) the rhetorical structure plays an important part, in monologues and dialogues as well. Hence, several approaches in this direction exist. In most of these, the rhetorical structure is calculated and built in the framework of Rhetorical Structure Theory (RST),

  2. Using the Natural Language Paradigm (NLP) to Increase Vocalizations of Older Adults with Cognitive Impairments

    ERIC Educational Resources Information Center

    LeBlanc, Linda A.; Geiger, Kaneen B.; Sautter, Rachael A.; Sidener, Tina M.

    2007-01-01

    The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated…

  3. A Fuzzy Set Approach to Modifiers and Vagueness in Natural Language

    ERIC Educational Resources Information Center

    Hersh, Harry M.; Caramazza, Alfonso

    1976-01-01

    The proposition that natural language concepts are represented as fuzzy sets, a generalization of the traditional theory of sets, of meaning components and that language operators--adverbs, negative markers, and adjectives--can be considered as operators on fuzzy sets was assessed empirically. (Editor/RK)

  4. SQUALL: a Controlled Natural Language for Querying and Updating RDF Graphs

    E-print Network

    Paris-Sud XI, Université de

    SQUALL: a Controlled Natural Language for Querying and Updating RDF Graphs Sébastien Ferré IRISA and maintain. We introduce SQUALL, a controlled natural language for querying and updating RDF graphs. It has a strong adequacy with RDF, an expressiveness close to SPARQL 1.1, and a CNL syntax that completely

  5. Automation of Software System Development Using Natural Language Processing and Two-Level Grammar

    Microsoft Academic Search

    Beum-seuk Lee; Barrett R. Bryant

    2002-01-01

    In software engineering, even with recent active research on formal methods and automated tools, users’ involvement is inevitable and crucial throughout the software development lifecycle. Automation of these manual tasks would assist the developers throughout the development. Our project goal is to help the engineers to resolve ambiguity in natural language (NL) using Natural Language Processing and to overcome different

  6. Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part-of-Speech Tagging

    Microsoft Academic Search

    Eric Brill

    1995-01-01

    Recently, there has been a rebirth of empiricism in the field of natural language processing. Manual encoding of linguistic information is being challenged by automated corpus-based learning as a method of providing a natural language processing system with linguistic knowledge. Although corpus-based approaches have been successful in many different areas of natural language processing, it is often the case that

  7. Dynamic changes in network activations characterize early learning of a natural language.

    PubMed

    Plante, Elena; Patterson, Dianne; Dailey, Natalie S; Kyle, R Almyrde; Fridriksson, Julius

    2014-09-01

    Those who are initially exposed to an unfamiliar language have difficulty separating running speech into individual words, but over time will recognize both words and the grammatical structure of the language. Behavioral studies have used artificial languages to demonstrate that humans are sensitive to distributional information in language input, and can use this information to discover the structure of that language. This is done without direct instruction and learning occurs over the course of minutes rather than days or months. Moreover, learners may attend to different aspects of the language input as their own learning progresses. Here, we examine processing associated with the early stages of exposure to a natural language, using fMRI. Listeners were exposed to an unfamiliar language (Icelandic) while undergoing four consecutive fMRI scans. The Icelandic stimuli were constrained in ways known to produce rapid learning of aspects of language structure. After approximately 4 min of exposure to the Icelandic stimuli, participants began to differentiate between correct and incorrect sentences at above chance levels, with significant improvement between the first and last scan. An independent component analysis of the imaging data revealed four task-related components, two of which were associated with behavioral performance early in the experiment, and two with performance later in the experiment. This outcome suggests dynamic changes occur in the recruitment of neural resources even within the initial period of exposure to an unfamiliar natural language. PMID:25058056

  8. Spatial and Numerical Abilities without a Complete Natural Language

    ERIC Educational Resources Information Center

    Hyde, Daniel C.; Winkler-Rhoades, Nathan; Lee, Sang-Ah; Izard, Veronique; Shapiro, Kevin A.; Spelke, Elizabeth S.

    2011-01-01

    We studied the cognitive abilities of a 13-year-old deaf child, deprived of most linguistic input from late infancy, in a battery of tests designed to reveal the nature of numerical and geometrical abilities in the absence of a full linguistic system. Tests revealed widespread proficiency in basic symbolic and non-symbolic numerical computations…

  9. Natural Language Interfaces for Data Warehouses Nicolas Kuchmann-Beauger

    E-print Network

    Boyer, Edmond

    -to-text technologies like Siri 2 has made natural search interfaces popular. This observation, and the fact that users are more comfortable to use such query interfaces compared to very structured ones have been pointed out://www.wolfram.com/mathematica/. 2. For more information about Siri: http://www.apple.com/iphone/features/#siri hal-00704293,version1

  10. Modeling Nature's Emergent Patterns with Multi-agent Languages

    E-print Network

    Boone, Randall B.

    have been adopted across a wide array of natural and social sciences. An understanding of complex systems is becoming an essential part of every scientist's knowledge and skills. The time has come for these ideas and methods to become a central part of every student's learning. Despite its adoption

  11. Processing of ICARTT Data Files Using Fuzzy Matching and Parser Combinators

    NASA Technical Reports Server (NTRS)

    Rutherford, Matthew T.; Typanski, Nathan D.; Wang, Dali; Chen, Gao

    2014-01-01

    In this paper, the task of parsing and matching inconsistent, poorly formed text data through the use of parser combinators and fuzzy matching is discussed. An object-oriented implementation of the parser combinator technique is used to allow for a relatively simple interface for adapting base parsers. For matching tasks, a fuzzy matching algorithm with Levenshtein distance calculations is implemented to match string pairs, which are otherwise difficult to match due to the aforementioned irregularities and errors in one or both pair members. Used in concert, the two techniques allow parsing and matching operations to be performed which had previously only been done manually.
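
    The fuzzy matching step described in this record can be illustrated with a small sketch. This is not the authors' implementation: the function names `levenshtein` and `best_match`, the `max_distance` threshold, and the sample variable names are all assumptions made for illustration. It computes the Levenshtein (edit) distance with the standard dynamic program and picks the closest known name for a misspelled one.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via a two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_match(query, candidates, max_distance=2):
    """Return the candidate closest to query, or None if none is close enough.

    Comparison is case-insensitive; max_distance is a hypothetical tolerance.
    """
    if not candidates:
        return None
    dist, cand = min((levenshtein(query.lower(), c.lower()), c) for c in candidates)
    return cand if dist <= max_distance else None


if __name__ == "__main__":
    known = ["Latitude", "Longitude", "Altitude"]  # illustrative names only
    print(levenshtein("kitten", "sitting"))        # 3
    print(best_match("Latitide", known))           # Latitude
    print(best_match("xyz", known))                # None
```

    A distance threshold of this kind trades recall for precision: a larger `max_distance` matches more badly formed names but raises the risk of pairing unrelated ones.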

  12. For the People...Citizenship Education and Naturalization Information. An English as a Second Language Text.

    ERIC Educational Resources Information Center

    Short, Deborah J.; And Others

    A textbook for English-as-a-Second-Language (ESL) students presents lessons on U.S. citizenship education and naturalization information. The nine lessons cover the following topics: the U.S. system of government; the Bill of Rights; responsibilities and rights of citizens; voting; requirements for naturalization; the application process; the…

  13. Using Edit Distance to Analyse Errors in a Natural Language to Logic Translation Corpus

    ERIC Educational Resources Information Center

    Barker-Plummer, Dave; Dale, Robert; Cox, Richard; Romanczuk, Alex

    2012-01-01

    We have assembled a large corpus of student submissions to an automatic grading system, where the subject matter involves the translation of natural language sentences into propositional logic. Of the 2.3 million translation instances in the corpus, 286,000 (approximately 12%) are categorized as being in error. We want to understand the nature of…

  14. CAL Abstract Can natural language recognition technologies be used to enhance the learning

    E-print Network

    experience of young children? Background Natural language as a bridge to useable technology. The features for children in that they become familiar with them at an early age ­ however they are disobedient technologies if children prefer this natural technology to the keyboard. The `word input rate' as well as the `correction

  15. Natural language processing for information assurance and security: an overview and implementations

    Microsoft Academic Search

    Mikhail J. Atallah; Craig J. Mcdonough; Victor Raskin; Sergei Nirenburg

    2001-01-01

    This paper explores a promising interface between natural language processing (NLP) and information assurance and security (IAS). More specifically, it is devoted to possible applications of the accumulated considerable resources in NLP to IAS. The paper is of a mixed theoretical and empirical nature. Of the four possible venues of applications, (i) memorizing randomly generated passwords with the help of automatically generated funny jingles,

  16. Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems

    Microsoft Academic Search

    Zhe Chen; Dunwei Wen

    2007-01-01

    With the development of Natural Language Processing (NLP), more and more systems want to adopt NLP in User Interface Module to process user input, in order to communicate with user in a natural way. However, this raises a speed problem. That is, if NLP module can not process sentences in durable time delay, users will never use the system. As

  17. A framework is summarized which supports the planning of natural language argument struc

    E-print Network

    Reed, Chris

    Abstract A framework is summarized which supports the planning of natural language argument structure. One key aspect of natural argument is the order in which components are presented. This is in part responsible for both the coherency and persuasive effect of an argument. One means of effecting

  18. Parsing spoken input introduces serious problems not present in parsing typed natural language. In particular, indeterminacies and

    E-print Network

    Hauptmann, Alexander G.

    , there is no natural way to make use of such likelihood scores in most natural language processing techniques. Abstract: Parsing spoken input introduces serious problems not present in parsing typed natural ... in an integral manner. Many techniques for parsing typed natural language do not adapt well to these extra

  19. Stochastic Model for the Vocabulary Growth in Natural Languages

    NASA Astrophysics Data System (ADS)

    Gerlach, Martin; Altmann, Eduardo G.

    2013-04-01

    We propose a stochastic model for the number of different words in a given database which incorporates the dependence on the database size and historical changes. The main feature of our model is the existence of two different classes of words: (i) a finite number of core words, which have higher frequency and do not affect the probability of a new word to be used, and (ii) the remaining virtually infinite number of noncore words, which have lower frequency and, once used, reduce the probability of a new word to be used in the future. Our model relies on a careful analysis of the Google Ngram database of books published in the last centuries, and its main consequence is the generalization of Zipf’s and Heaps’ law to two-scaling regimes. We confirm that these generalizations yield the best simple description of the data among generic descriptive models and that the two free parameters depend only on the language but not on the database. From the point of view of our model, the main change on historical time scales is the composition of the specific words included in the finite list of core words, which we observe to decay exponentially in time with a rate of approximately 30 words per year for English.
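    The quantity this abstract models, the number of different words as a function of database size, can be measured directly from a token stream. The sketch below only computes that empirical vocabulary-growth curve; it does not implement the authors' stochastic model, and the function name and the toy sentence are assumptions made for illustration.

```python
def vocabulary_growth(tokens):
    """Return, for each position n, the number of distinct words among the first n tokens."""
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)           # a repeated word leaves the set unchanged
        growth.append(len(seen))  # vocabulary size after reading this token
    return growth


# Toy input, made up for illustration only.
print(vocabulary_growth("the cat saw the other cat".split()))
```

    Plotting this curve on log-log axes for a large corpus is the usual way to inspect Heaps'-law-style scaling and any change of slope between regimes.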

  20. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 286-296, Jeju Island, Korea, 12-14 July 2012. © 2012 Association for Computational Linguistics

    E-print Network

    Ng, Hwee Tou

    Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 286-296, Jeju Island, Korea, 12-14 July 2012. © 2012 Association for Computational Linguistics Source Language Adaptation for Resource-Poor Machine Translation Pidong Wang

  1. Semi-automatic syntactic and semantic corpus annotation with a deep parser

    Microsoft Academic Search

    Mary D. Swift; Myroslava O. Dzikovska; Joel R. Tetreault; James F. Allen

    2004-01-01

    We describe a semi-automatic method for linguistically rich corpus annotation using a broad-coverage deep parser to generate syntactic structure, semantic representation and discourse information for task-oriented dialogs. The parser-generated analyses are checked by trained annotators. Incomplete coverage and incorrect analyses are addressed through lexicon and grammar development, after which the dialogs undergo another cycle of parsing and checking. Currently we

  2. Evaluation of two dependency parsers on biomedical corpus targeted at protein-protein interactions

    Microsoft Academic Search

    Sampo Pyysalo; Filip Ginter; Tapio Pahikkala; Jorma Boberg; Jouni Järvinen; Tapio Salakoski

    2006-01-01

    Summary We present an evaluation of Link Grammar and Connexor Machinese Syntax, two major broad-coverage dependency parsers, on a custom hand-annotated corpus consisting of sentences regarding protein-protein interactions. In the evaluation, we apply the notion of an interaction subgraph, which is the subgraph of a dependency graph expressing a protein-protein interaction. We measure the performance of the parsers

  3. High-School Projects at the Laboratory for Laser Energetics (2011) Brandon Avila (Allendale Columbia) researched Natural Language Processing (NLP) for extracting information from LLE

    E-print Network

    Portman, Douglas

    2011-01-01

    Columbia) researched Natural Language Processing (NLP) for extracting information from LLE documentation libraries. He developed an NLP application using Python, XML, and Natural Language Toolkit modules. Andrew

  4. The feasibility of using natural language processing to extract clinical information from breast pathology reports

    PubMed Central

    Buckley, Julliette M.; Coopey, Suzanne B.; Sharko, John; Polubriaginof, Fernanda; Drohan, Brian; Belli, Ahmet K.; Kim, Elizabeth M. H.; Garber, Judy E.; Smith, Barbara L.; Gadd, Michele A.; Specht, Michelle C.; Roche, Constance A.; Gudewicz, Thomas M.; Hughes, Kevin S.

    2012-01-01

    Objective: The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. Approach and Procedure: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. Results: There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. Conclusion: We have demonstrated how a large body of free text medical information such as seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task. PMID:22934236
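    The sensitivity and specificity figures quoted in this abstract are ratios over a confusion table that compares the software's output with expert human coding. A minimal sketch of the two formulas follows. The counts in the example are hypothetical and are not taken from the study, which does not report them in this abstract.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly positive reports that the system flagged as positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of truly negative reports that the system flagged as negative."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts for illustration only (not data from the study).
tp, fn, tn, fp = 90, 10, 80, 20
print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```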

  5. A Natural Language for AdS/CFT Correlators

    SciTech Connect

    Fitzpatrick, A.Liam; /Boston U.; Kaplan, Jared; /SLAC; Penedones, Joao; /Perimeter Inst. Theor. Phys.; Raju, Suvrat; /Harish-Chandra Res. Inst.; van Rees, Balt C.; /YITP, Stony Brook

    2012-02-14

    We provide dramatic evidence that 'Mellin space' is the natural home for correlation functions in CFTs with weakly coupled bulk duals. In Mellin space, CFT correlators have poles corresponding to an OPE decomposition into 'left' and 'right' sub-correlators, in direct analogy with the factorization channels of scattering amplitudes. In the regime where these correlators can be computed by tree level Witten diagrams in AdS, we derive an explicit formula for the residues of Mellin amplitudes at the corresponding factorization poles, and we use the conformal Casimir to show that these amplitudes obey algebraic finite difference equations. By analyzing the recursive structure of our factorization formula we obtain simple diagrammatic rules for the construction of Mellin amplitudes corresponding to tree-level Witten diagrams in any bulk scalar theory. We prove the diagrammatic rules using our finite difference equations. Finally, we show that our factorization formula and our diagrammatic rules morph into the flat space S-Matrix of the bulk theory, reproducing the usual Feynman rules, when we take the flat space limit of AdS/CFT. Throughout we emphasize a deep analogy with the properties of flat space scattering amplitudes in momentum space, which suggests that the Mellin amplitude may provide a holographic definition of the flat space S-Matrix.

  6. On the Natural Language of Signs and Its Value and Uses in the Instruction of the Deaf and Dumb.

    ERIC Educational Resources Information Center

    Gallaudet, Thomas H.

    1997-01-01

    This reprinted article discusses the intrinsic value and indispensable necessity of the use of natural signs in the education of students with deafness. The benefits of sign language over oral language, the use of sign language to teach moral development, and the need for a common language is discussed. (CR)

  7. Vision based Interpretation of Natural Sign Languages Richard Bowden12, Andrew Zisserman2, Timor Kadir2, Mike Brady2

    E-print Network

    Bowden, Richard

    Vision based Interpretation of Natural Sign Languages Richard Bowden12, Andrew Zisserman2, Timor stated. This allows the same system to be used for different sign languages requiring only a change of the knowledge base. 1 Introduction Sign Language is a visual language and consists of 3 major components: 1

  8. A study of the medical record interface to natural language processing.

    PubMed

    Takemura, Tadamasa; Ashida, Nobuyuki

    2002-04-01

    The information about a patient tends to be handled more on a computer system. However, it is not sufficiently rational enough because of the fundamental difference between man and a computer. Up to now, man has treated information using a natural language. Therefore, if it can be applied to handle medical information electrically, that will become more rational. For this reason, we developed a new classification method that interfaces a computer with the human being, using a natural language. This method was named as a situation-oriented medical record, and this depicts changes in a situation by the case frame from a viewpoint of man's cognition. Moreover, the medical communication by a natural language, which is currently used when a patient changes a hospital, was analyzed in order to confirm the validity of this method. In addition, we developed a prototype system that allows computers to implement this kind of communication. PMID:11993574

  9. Ontology-Based Controlled Natural Language Editor Using CFG with Lexical Dependency

    NASA Astrophysics Data System (ADS)

    Namgoong, Hyun; Kim, Hong-Gee

    In recent years, CNL (Controlled Natural Language) has received much attention with regard to ontology-based knowledge acquisition systems. CNLs, as subsets of natural languages, can be useful for both humans and computers by eliminating ambiguity of natural languages. Our previous work, OntoPath [10], proposed to edit natural language-like narratives that are structured in RDF (Resource Description Framework) triples, using a domain-specific ontology as their language constituents. However, our previous work and other systems employing CFG for grammar definition have difficulties in enlarging the expression capacity. A newly developed editor, which we propose in this paper, permits grammar definitions through CFG-LD (Context-Free Grammar with Lexical Dependency) that includes sequential and semantic structures of the grammars. With CFG describing the sequential structure of grammar, lexical dependencies between sentence elements can be designated in the definition system. Through the defined grammars, the implemented editor guides users' narratives in more familiar expressions with a domain-specific ontology and translates the content into RDF triples.

  10. Selecting the Best Mobile Information Service with Natural Language User Input

    NASA Astrophysics Data System (ADS)

    Feng, Qiangze; Qi, Hongwei; Fukushima, Toshikazu

    Information services accessed via mobile phones provide information directly relevant to subscribers’ daily lives and are an area of dynamic market growth worldwide. Although many information services are currently offered by mobile operators, many of the existing solutions require a unique gateway for each service, and it is inconvenient for users to have to remember a large number of such gateways. Furthermore, the Short Message Service (SMS) is very popular in China and Chinese users would prefer to access these services in natural language via SMS. This chapter describes a Natural Language Based Service Selection System (NL3S) for use with a large number of mobile information services. The system can accept user queries in natural language and navigate it to the required service. Since it is difficult for existing methods to achieve high accuracy and high coverage and anticipate which other services a user might want to query, the NL3S is developed based on a Multi-service Ontology (MO) and Multi-service Query Language (MQL). The MO and MQL provide semantic and linguistic knowledge, respectively, to facilitate service selection for a user query and to provide adaptive service recommendations. Experiments show that the NL3S can achieve 75-95% accuracies and 85-95% satisfactions for processing various styles of natural language queries. A trial involving navigation of 30 different mobile services shows that the NL3S can provide a viable commercial solution for mobile operators.

  11. SWAN: An expert system with natural language interface for tactical air capability assessment

    NASA Technical Reports Server (NTRS)

    Simmons, Robert M.

    1987-01-01

    SWAN is an expert system and natural language interface for assessing the war fighting capability of Air Force units in Europe. The expert system is an object oriented knowledge based simulation with an alternate worlds facility for performing what-if excursions. Responses from the system take the form of generated text, tables, or graphs. The natural language interface is an expert system in its own right, with a knowledge base and rules which understand how to access external databases, models, or expert systems. The distinguishing feature of the Air Force expert system is its use of meta-knowledge to generate explanations in the frame and procedure based environment.

  12. QATT: a Natural Language Interface for QPE. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    White, Douglas Robert-Graham

    1989-01-01

    QATT, a natural language interface developed for the Qualitative Process Engine (QPE) system is presented. The major goal was to evaluate the use of a preexisting natural language understanding system designed to be tailored for query processing in multiple domains of application. The other goal of QATT is to provide a comfortable environment in which to query envisionments in order to gain insight into the qualitative behavior of physical systems. It is shown that the use of the preexisting system made possible the development of a reasonably useful interface in a few months.

  13. Optimizing Planar and 2-Planar Parsers with MaltOptimizer Optimizando los Parsers Planar y 2-Planar con MaltOptimizer

    E-print Network

    de Madrid, Spain; Universidade da Coruña, Spain; Uppsala University, Sweden ... current ones require a complete configuration to obtain state-of-the-art results, and different and recently incorporated algorithms in MaltParser. In the present article we present ...

  14. Technology-Mediated Telepathy: A Natural Language Brain-Computer Interface

    Microsoft Academic Search

    Anand Kulkarni; Kevin Simler; Alex Storer

    We present a new model for a communication interface between the human brain and a computer. The neurological mechanisms of thought and language in the brain are at present poorly understood. By contrast, the basis of motor activity in the brain is relatively well-known. Our research involves reading the motor signals generated by the human brain while communicating naturally and

  15. AutoTutor and Family: A Review of 17 Years of Natural Language Tutoring

    ERIC Educational Resources Information Center

    Nye, Benjamin D.; Graesser, Arthur C.; Hu, Xiangen

    2014-01-01

    AutoTutor is a natural language tutoring system that has produced learning gains across multiple domains (e.g., computer literacy, physics, critical thinking). In this paper, we review the development, key research findings, and systems that have evolved from AutoTutor. First, the rationale for developing AutoTutor is outlined and the advantages…

  16. Verification Processes in Recognition Memory: The Role of Natural Language Mediators

    ERIC Educational Resources Information Center

    Marshall, Philip H.; Smith, Randolph A. S.

    1977-01-01

    The existence of verification processes in recognition memory was confirmed in the context of Adams' (Adams & Bray, 1970) closed-loop theory. Subjects' recognition was tested following a learning session. The expectation was that data would reveal consistent internal relationships supporting the position that natural language mediation plays an…

  17. A curriculum database with boolean natural-language searching in HyperCard.

    PubMed Central

    Mann, D.; Goodrum, K.; DeWine, J. M.; McVicker, J.

    1992-01-01

    A curriculum database including both natural-language and keyword searching was developed to assist faculty in curriculum research and reform. HyperCard (with extensions) on the Apple Macintosh provides a flexible single-user or networked environment for entering, indexing, searching and retrieving content in detailed faculty notes for the instructional activities in a four-year predoctoral curriculum. PMID:1482977

  18. Natural Vs. Precise Concise Languages for Human Operation of Computers: Research Issues and Experimental Approaches

    Microsoft Academic Search

    Ben Shneiderman

    1980-01-01

    This paper raises concerns that natural language front ends for computer systems can limit a researcher's scope of thinking, yield inappropriately complex systems, and exaggerate public fear of computers. Alternative modes of computer use are suggested and the role of psychologically oriented controlled experimentation is emphasized. Research methods and recent experimental results are briefly reviewed.

  19. FASTUS: A Cascaded Finite-State Transducer for Extracting Information from Natural-Language Text

    Microsoft Academic Search

    Jerry R. Hobbs; Douglas E. Appelt; John Bear; David J. Israel; Megumi Kameyama; Mark E. Stickel; Mabry Tyson

    1997-01-01

    Abstract FASTUS is a system for extracting information from natural language text for entry into a database and for other applications. It works essentially as a cascaded, nondeterministic finite-state automaton. There are five stages in the operation of FASTUS. In Stage 1, names and other fixed form expressions are recognized. In Stage 2, basic noun groups, verb groups, and prepositions

  20. AI agents combining natural language interaction, task planning, and business ontologies can help

    E-print Network

    Fox, Mark S.

    AI agents combining natural language interaction, task planning, and business ontologies can help a new customer than to keep an existing one. How can AI help in addressing this problem? For several years we have built a domain-independent AI platform for creating conversational customer

  1. Drawing Dynamic Geometry Figures Online with Natural Language for Junior High School Geometry

    ERIC Educational Resources Information Center

    Wong, Wing-Kwong; Yin, Sheng-Kai; Yang, Chang-Zhe

    2012-01-01

    This paper presents a tool for drawing dynamic geometric figures by understanding the texts of geometry problems. With the tool, teachers and students can construct dynamic geometric figures on a web page by inputting a geometry problem in natural language. First we need to build the knowledge base for understanding geometry problems. With the…

  2. Natural Language Interface for Fault Diagnosis System of Nuclear Power Plant Control Systems

    Microsoft Academic Search

    Yukiharu OHGA; Yukio NAGAOKA; Satoshi SUZUKI; Tetsuo ITO

    1990-01-01

    A fault diagnosis system was developed to improve the availability and maintainability of control systems in nuclear power plants. To facilitate man-machine communications in the system a natural language interface was introduced. Features of the interface include two-step analysis of the meaning of input sentences, identification of the kind of input content using sentence patterns, and retrieval of the information

  3. The Importance of Lexicalized Syntax Models for Natural Language Generation Tasks

    E-print Network

    Marcu, Daniel

    The Importance of Lexicalized Syntax Models for Natural Language Generation Tasks Hal Daumé III recognized the importance of lexicalized models of syntax. By contrast, these models do not appear that a lexicalized model of syntax improves the performance of a statistical text compression system, and show

  4. Natural Language Syntax and First Order Inference David McAllester and Robert Givan

    E-print Network

    McAllester, David

    Natural Language Syntax and First Order Inference David McAllester and Robert Givan MIT­standard syntax for first order logic. In this paper we define a syntax for first order logic based than analogous procedures based on either classical or taxonomic syntax. This paper appeared

  5. GENIES: a natural-language processing system for the extraction of molecular pathways from journal articles

    Microsoft Academic Search

    Carol Friedman; Pauline Kra; Hong Yu; Michael Krauthammer; Andrey Rzhetsky

    2001-01-01

    Systems that extract structured information from natural language passages have been highly successful in specialized domains. The time is opportune for developing analogous applications for molecular biology and genomics. We present a system, GENIES, that extracts and structures information about cellular pathways from the biological literature in accordance with a knowledge model that we developed earlier. We implemented GENIES

  6. A Qualitative Analysis Framework Using Natural Language Processing and Graph Theory

    ERIC Educational Resources Information Center

    Tierney, Patrick J.

    2012-01-01

    This paper introduces a method of extending natural language-based processing of qualitative data analysis with the use of a very quantitative tool--graph theory. It is not an attempt to convert qualitative research to a positivist approach with a mathematical black box, nor is it a "graphical solution". Rather, it is a method to help qualitative…

  7. Combining data integration with natural language technology for the semantic web

    Microsoft Academic Search

    D. Williams; A. Poulovassilis

    2003-01-01

    Abstract: Current data integration systems allow a variety of heterogeneous structured or semi-structured data sources to be combined and queried by providing an integrated view over them. The Semantic Web also requires us to be able to integrate information from a variety of heterogeneous information sources. However, these information sources will also include natural language (e.g. web pages) and ontologies. In this paper we describe an architecture which combines the data integration approach with...

  8. The linguistic correlates of conversational deception: Comparing natural language processing technologies

    Microsoft Academic Search

    NICHOLAS D. DURAN; CHARLES HALL; PHILIP M. MCCARTHY; DANIELLE S. MCNAMARA

    2010-01-01

    The words people use and the way they use them can reveal a great deal about their mental states when they attempt to deceive. The challenge for researchers is how to reliably distinguish the linguistic features that characterize these hidden states. In this study, we use a natural language processing tool called Coh-Metrix to evaluate deceptive and truthful conversations that

  9. Using Natural Language Generation Technology to Improve Information Flows in Intensive Care Units

    Microsoft Academic Search

    Jim Hunter; Albert Gatt; François Portet; Ehud Reiter; Somayajulu Sripada

    2008-01-01

    In the drive to improve patient safety, patients in modern intensive care units are closely monitored with the generation of very large volumes of data. Unless the data are further processed, it is difficult for medical and nursing staff to assimilate what is important. It has been demonstrated that data summarization in natural language has the potential to improve clinical

  10. The Nature of Auditory Discrimination Problems in Children with Specific Language Impairment: An MMN Study

    ERIC Educational Resources Information Center

    Davids, Nina; Segers, Eliane; van den Brink, Danielle; Mitterer, Holger; van Balkom, Hans; Hagoort, Peter; Verhoeven, Ludo

    2011-01-01

    Many children with specific language impairment (SLI) show impairments in discriminating auditorily presented stimuli. The present study investigates whether these discrimination problems are speech specific or of a general auditory nature. This was studied using a linguistic and nonlinguistic contrast that were matched for acoustic complexity in…

  11. Waking Up a Sleeping Rabbit: On Natural-Language Sentence Generation with FF

    E-print Network

    Paris-Sud XI, Université de

    Waking Up a Sleeping Rabbit: On Natural-Language Sentence Generation with FF Alexander Koller analyze in detail the reasons for ineffectiveness in FF, resulting in a few minor implementation fixes in FF's preprocessor, and in a basic reconfiguration of its search options. The performance

  12. New Techniques for Disambiguation in Natural Language and Their Application to Biological Text

    Microsoft Academic Search

    Filip Ginter; Jorma Boberg; Jouni Järvinen; Tapio Salakoski

    2004-01-01

    We study the problems of disambiguation in natural language, focusing on the problem of gene vs. protein name disambiguation in biological text and also considering the problem of context-sensitive spelling error correction. We introduce a new family of classifiers based on ordering and weighting the feature vectors obtained from word counts and word co-occurrence in the text, and inspect several

  13. Working while Driving: Corpus based language modelling of a natural English

    E-print Network

    Berzins, M.

    for the degree of Master of Science (by Research) The University of Leeds, School of Computer Studies September 1996 The candidate confirms that the work submitted is his own and that appropriate credit has been study of a potential industrial application of Natural Language Processing (NLP). An industrial sponsor

  14. Natural Language Processing (NLP) tools for the analysis of incident and accident reports

    E-print Network

    Paris-Sud XI, Université de

    Natural Language Processing (NLP) tools for the analysis of incident and accident reports project, we use NLP methods to facilitate experience feedback in the field of civil aviation safety. In this paper, we present how NLP methods based on the extraction of textual information from the Air France ASR

  15. A Sublanguage Approach to Natural Language Processing for an Expert System.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth D.; And Others

    1993-01-01

    Reports on the development of an NLP (natural language processing) component for processing the free-text comments on life insurance applications for evaluation by an underwriting expert system. A sublanguage grammar approach with strong reliance on semantic word classes is described. Highlights include lexical analysis, adjacency analysis, and…

  16. Effectiveness and Efficiency in Natural Language Processing for Large Amounts of Text.

    ERIC Educational Resources Information Center

    Ruge, Gerda; And Others

    1991-01-01

    Describes a system that was developed in Germany for natural language processing (NLP) to improve free text analysis for information retrieval. Techniques from empirical linguistics are discussed, system architecture is explained, and rules for dealing with conjunctions in dependency analysis for free text processing are proposed. (13 references)…

  17. An Overview of a Role of Natural Language Processing in An Intelligent Information Retrieval System

    Microsoft Academic Search

    Asanee Kawtrakul

    In information-age society, advanced retrieval technique and the automatic extraction of useful information from streams of text become the societal needs. This system should provide more in depth about the contents than does a standard information retrieval system, which relies on keyword based analysis and matching computation. Clearly, natural language processing does play a role for capturing text information and making

  18. A Web-based Integrated Knowledge Mining Aid System Using Term-oriented Natural Language Processing

    Microsoft Academic Search

    Hideki MIMA; Sophia ANANIADOU; Junichi TSUJII

    1999-01-01

    In this paper, we propose a web-based integrated knowledge mining aid system in which information extraction and intelligent database access are combined using term-oriented natural language tools. Our domain is molecular biology and our aim is to provide efficient access to heterogeneous biological and genomic databases, enabling users to use a wide range of textual and non textual resources

  19. Applying natural language processing (NLP) based metadata extraction to automatically acquire user preferences

    Microsoft Academic Search

    Woojin Paik; Sibel Yilmazel; Eric Brown; Maryjane Poulin; Stephane Dubon; Christophe Amice

    2001-01-01

    This paper describes a metadata extraction technique based on natural language processing (NLP) which extracts personalized information from email communications between financial analysts and their clients. Personalized means connecting users with content in a personally meaningful way to create, grow, and retain online relationships. Personalization often results in the creation of user profiles that store individuals' preferences regarding goods or

  20. Sentiment Analyzer: Extracting Sentiments about a Given Topic using Natural Language Processing Techniques

    Microsoft Academic Search

    Jeonghee Yi; Tetsuya Nasukawa; Razvan C. Bunescu; Wayne Niblack

    2003-01-01

    We present Sentiment Analyzer (SA) that extracts sentiment (or opinion) about a subject from online text documents. Instead of classifying the sentiment of an entire document about a subject, SA detects all references to the given subject, and determines sentiment in each of the references using natural language processing (NLP) techniques. Our sentiment analysis consists of 1)

  1. i2b2 Workshop on Natural Language Processing Challenges for Clinical Records

    Microsoft Academic Search

    Ozlem Uzuner; Peter Szolovits; Isaac Kohane

    This workshop aims to bring together computational linguists and medical informaticians interested in automatic linguistic processing of clinical records such as medical discharge summaries and radiology reports. Lack of a publicly available and standardized data set has been one of the biggest barriers to systematic progress of Natural Language Processing techniques for clinical data. Within the framework of the i2b2

  2. Classifying free-text triage chief complaints into syndromic categories with natural language processing

    Microsoft Academic Search

    Wendy Webber Chapman; Lee M. Christensen; Michael M. Wagner; Peter J. Haug; Oleg Ivanov; John N. Dowling; Robert T. Olszewski

    2005-01-01

    Objective: Develop and evaluate a natural language processing application for classifying chief complaints into syndromic categories for syndromic surveillance. Introduction: Much of the input data for artificial intelligence applications in the medical field are free-text patient medical records, including dictated medical reports and triage chief complaints. To be useful for automated systems, the free-text must be translated into encoded form.

  3. Not for its own sake: knowledge as a byproduct of natural language processing

    Microsoft Academic Search

    William B. Dolan; Stephen D. Richardson

    Recent claims in the literature as well as current research trends suggest that a long-held goal of natural language processing, the ability to map automatically from machine readable dictionaries into structured knowledge bases that can be used for various artificial intelligence tasks may be impossible. This paper argues to the contrary, describing an extremely large and rich lexical knowledge base

  4. Natural Language Processing based on Semantic inferentialism for extracting crime information from text

    Microsoft Academic Search

    Vladia Pinheiro; Vasco Furtado; Tarcisio H. C. Pequeno; Douglas Nogueira

    2010-01-01

    This article describes an architecture for Information Extraction systems on the web, based on Natural Language Processing (NLP) and especially geared toward the exploration of information about crime. The main feature of the architecture is its NLP module, which is based on the Semantic Inferential Model. We demonstrate the feasibility of the architecture through the implementation thereof to provide input

  5. A methodology and tool suite for evaluation of accuracy of interoperating statistical natural language processing engines

    Microsoft Academic Search

    Uma Murthy; John F. Pitrelli; Ganesh N. Ramaswamy; Martin Franz; Burn L. Lewis

    2008-01-01

    Evaluation of accuracy of natural language processing (NLP) engines plays an important role in their development and improvement. Such evaluation usually takes place at a per-engine level. For example, there are evaluation methods for engines such as speech recognition, machine translation, story boundary detection, etc. Many real-world applications require combinations of these functions. This has become possible

  6. Towards a Cascade of Morpho-syntactic Tools for Arabic Natural Language Processing

    Microsoft Academic Search

    Slim Mesfar

    2010-01-01

    This paper presents a cascade of morpho-syntactic tools to deal with Arabic natural language processing. It begins with the description of a large coverage formalization of the Arabic lexicon. The built electronic dictionary, named

  7. Natural language processing for information retrieval: the time is ripe (again)

    Microsoft Academic Search

    Matthew Lease

    2007-01-01

    Paraphrasing van Rijsbergen (37), the time is ripe for another attempt at using natural language processing (NLP) for information retrieval (IR). This paper introduces my dissertation study, which will explore methods for integrating modern NLP with state-of-the-art IR techniques. In addition to text, I will also apply retrieval to conversational speech data, which poses a unique set

  8. A Comprehensive Neural-Based Approach for Text Recognition in Videos using Natural Language Processing

    E-print Network

    Paris-Sud XI, Université de

    … videos. For this, we developed a complete video Optical Character Recognition system (OCR), specifically adapted to detect and recognize embedded texts in videos. Based on a neural approach, this new method

  9. The Contemporary Thesaurus of Social Science Terms and Synonyms: A Guide for Natural Language Computer Searching.

    ERIC Educational Resources Information Center

    Knapp, Sara D., Comp.

    This book is designed primarily to help users find meaningful words for natural language, or free-text, computer searching of bibliographic and textual databases in the social and behavioral sciences. Additionally, it covers many socially relevant and technical topics not covered by the usual literary thesaurus, therefore it may also be useful for…

  10. Ontology-Based Natural Language Processing for In-store Shopping Situations

    Microsoft Academic Search

    Sabine Janzen; Wolfgang Maass

    2009-01-01

    Natural Language communication between customers and products within in-store shopping environments enables new forms of product interfaces and an improved filtering and intuitive presentation of product information. In this article, we describe how customer's access to product information at the point of sale can be improved through the use of dialogue systems and heterogeneous web-based representations of product information based

  11. Bayesian Inference with Tears a tutorial workbook for natural language researchers

    E-print Network

    Zhang, Yi

    an idiot. That was not the turning point in my life, though. The turning point was EM. Here was a learning experiments without lots of new code and new bugs. 3. Another turning point? When I recently started seeing September 2009 1. Introduction When I first saw this in a natural language paper, it certainly brought tears

  12. A Rule-Based Semiautomated Approach to Building Natural Language Question Answering (NLQA) Systems

    Microsoft Academic Search

    Kaushik Krishnasamy; Brian P. Butz; Michael Duarte

    2004-01-01

    This paper presents a rule-based approach to natural language question answering that can be easily implemented for any domain. We discuss the framework in the context of a National Science Foundation funded project - Universal Virtual Laboratory (UVL). UVL is a virtual electrical engineering (EE) laboratory for able and disabled individuals to construct, simulate and understand the characteristics of basic

  13. Human-computer interaction through natural language and hypermedia in AlFresco

    Microsoft Academic Search

    Oliviero Stock; Carlo Strapparava; Massimo Zancanaro

    1996-01-01

    Multimodality is a powerful concept for dealing with dialogue cohesion in a Human-Computer Natural Language centered system. Two issues, important for the more effective exploitation of the potentially large bandwidth of communication provided by this situation are presented: (i) the integration of navigational and mediated aspects of interaction; (ii) the use of a graphical representation of the dialogue structure

  14. Using Natural Language Generation Technology to Improve Information Flows in Intensive Care Units

    E-print Network

    Paris-Sud XI, Université de

    … real-time patient care. In the healthcare sector there is increased interest in deploying technology … feature of this technology is that it brings together a diverse set of techniques such as medical signal

  15. Generalized Probabilistic LR Parsing of Natural Language (Corpora) with Unification-Based Grammars

    Microsoft Academic Search

    Ted Briscoe; John Carroll

    1993-01-01

    We describe work toward the construction of a very wide-coverage probabilistic parsing system for natural language (NL), based on LR parsing techniques. The system is intended to rank the large number of syntactic analyses produced by NL grammars according to the frequency of occurrence of the individual rules deployed in each analysis. We discuss a fully automatic procedure for constructing

  16. THEORETICAL REVIEW Zipf's word frequency law in natural language: A critical review

    E-print Network

    Makous, Walter

    … approximately follows a simple mathematical form known as Zipf's law. This article first shows that human … law, although prior data visualization methods have obscured this fact. A number of empirical
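
    In its classic form, Zipf's law says that a word's frequency is roughly inversely proportional to its frequency rank. The following Python sketch only shows how a rank-frequency table could be computed from raw text so that the relationship can be inspected; it is not taken from the article, and the function name `rank_frequency` and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, frequency) triples, most frequent word first.

    Tokenization is a naive lowercase whitespace split (an assumption made
    for this sketch); real studies use more careful tokenizers.
    """
    counts = Counter(text.lower().split())
    ranked = counts.most_common()  # sorted by descending frequency
    return [(rank, word, freq) for rank, (word, freq) in enumerate(ranked, start=1)]


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    for rank, word, freq in rank_frequency(sample):
        # Under an idealized Zipf distribution, rank * freq stays roughly constant.
        print(rank, word, freq, rank * freq)
```

    On a large corpus, plotting frequency against rank on log-log axes is the usual way to check how closely the data follow a straight line.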

  17. AbstFinder, A Prototype Natural Language Text Abstraction Finder for Use in Requirements

    E-print Network

    Berry, Daniel M.

    …, requirements elicitation, evaluation of tool, tool use method 1. Introduction 1.1. The Problem Requirements … is the requirements elicitation and specification stage, and that within this stage, elicitation is less understood

  18. Generalized Hebbian Algorithm for Incremental Singular Value Decomposition in Natural Language Processing

    Microsoft Academic Search

    Genevieve Gorrell

    2006-01-01

    An algorithm based on the Generalized Hebbian Algorithm is described that allows the singular value decomposition of a dataset to be learned based on single observation pairs presented serially. The algorithm has minimal memory requirements, and is therefore interesting in the natural language domain, where very large datasets are often used, and datasets quickly become

  19. Introduction to Special Issue: Understanding the Nature-Nurture Interactions in Language and Learning Differences.

    ERIC Educational Resources Information Center

    Berninger, Virginia Wise

    2001-01-01

    The introduction to this special issue on nature-nurture interactions notes that the following articles represent five biologically oriented research approaches which each provide a tutorial on the investigator's major research tool, a summary of current research understandings regarding language and learning differences, and a discussion of…

  20. ELIZA — a computer program for the study of natural language communication between man and machine

    Microsoft Academic Search

    Joseph Weizenbaum

    1983-01-01

    ELIZA is a program operating within the MAC time-sharing system of MIT which makes certain kinds of natural language conversation between man and computer possible. Input sentences are analyzed on the basis of decomposition rules which are triggered by key words appearing in the input text. Responses are generated by reassembly rules associated with selected decomposition rules. The fundamental technical
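
    The keyword-triggered decomposition and reassembly rules described in this abstract can be sketched roughly as follows. This is a minimal Python illustration, not Weizenbaum's original script: the rule table `RULES`, the `FALLBACK` reply, and the function name `respond` are all hypothetical, and steps such as keyword ranking and pronoun swapping are left out.

```python
import re

# Hypothetical rule table (not Weizenbaum's script): each entry pairs a keyword
# with a decomposition pattern and a reassembly template.
RULES = [
    ("i am", re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    ("my", re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "Please go on."  # used when no keyword fires


def respond(sentence):
    """Return a reply built from the first rule whose keyword and pattern match."""
    text = sentence.strip().rstrip(".!?")
    lowered = text.lower()
    for keyword, decomposition, reassembly in RULES:
        if keyword not in lowered:
            continue  # keyword absent: this rule is not triggered
        match = decomposition.search(text)
        if match:
            # The decomposition captured a fragment; reassemble it into a reply.
            return reassembly.format(match.group(1))
    return FALLBACK


if __name__ == "__main__":
    print(respond("I am unhappy."))
    print(respond("Hello there"))
```

    Keeping the rules in a data table rather than in code mirrors the abstract's separation between the program and its rule sets, so new keywords can be added without changing `respond`.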

  1. Natural language processing with dynamic classification improves P300 speller accuracy and bit rate

    NASA Astrophysics Data System (ADS)

    Speier, William; Arnold, Corey; Lu, Jessica; Taira, Ricky K.; Pouratian, Nader

    2012-02-01

    The P300 speller is an example of a brain-computer interface that can restore functionality to victims of neuromuscular disorders. Although the most common application of this system has been communicating language, the properties and constraints of the linguistic domain have not to date been exploited when decoding brain signals that pertain to language. We hypothesized that combining the standard stepwise linear discriminant analysis with a Naive Bayes classifier and a trigram language model would increase the speed and accuracy of typing with the P300 speller. With integration of natural language processing, we observed significant improvements in accuracy and 40-60% increases in bit rate for all six subjects in a pilot study. This study suggests that integrating information about the linguistic domain can significantly improve signal classification.
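
    One way to picture how a language model can sharpen character classification is Bayes' rule: the evidence from the signal classifier for each candidate character is multiplied by the probability the language model assigns to that character given the preceding two. The Python sketch below illustrates only that combination step; it is not the authors' implementation, and the function name `posterior` and the toy numbers are assumptions.

```python
def posterior(likelihoods, trigram_prior):
    """Combine per-character classifier evidence with a language-model prior.

    likelihoods:   dict mapping character -> P(signal | character)
    trigram_prior: dict mapping character -> P(character | previous two characters)
    Returns a dict mapping character -> normalized posterior probability.
    """
    unnormalized = {
        char: likelihoods[char] * trigram_prior.get(char, 0.0)
        for char in likelihoods
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("prior assigns zero probability to every candidate")
    return {char: value / total for char, value in unnormalized.items()}


if __name__ == "__main__":
    # Toy numbers (hypothetical): the signal slightly favors "x",
    # but after "th" the language model strongly favors "e".
    signal = {"e": 0.4, "x": 0.6}
    prior_after_th = {"e": 0.9, "x": 0.1}
    print(posterior(signal, prior_after_th))
```

    In this toy case the prior outweighs the weak signal evidence, which is the kind of effect the abstract attributes to exploiting the linguistic domain.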

  2. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 332–343, Jeju Island, Korea, 12–14 July 2012. © 2012 Association for Computational Linguistics

    E-print Network

    Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing. By examining data for over 100 languages, we train a statistical model to automatically relate graphemic, these resources are entirely lacking for the vast majority of the world's languages. Thus, automatic and generic

  3. The Exploring Nature of Definitions and Classifications of Language Learning Strategies (LLSs) in the Current Studies of Second/Foreign Language Learning

    ERIC Educational Resources Information Center

    Fazeli, Seyed Hossein

    2011-01-01

    This study aims to explore the nature of definitions and classifications of Language Learning Strategies (LLSs) in the current studies of second/foreign language learning in order to show the current problems regarding such definitions and classifications. The present study shows that there is not a universal agreeable definition and…

  4. A prototype natural language interface to a large complex knowledge base, the Foundational Model of Anatomy.

    PubMed

    Distelhorst, Gregory; Srivastava, Vishrut; Rosse, Cornelius; Brinkley, James F

    2003-01-01

    We describe a constrained natural language interface to a large knowledge base, the Foundational Model of Anatomy (FMA). The interface, called GAPP, handles simple or nested questions that can be parsed to the form, subject-relation-object, where subject or object is unknown. With the aid of domain-specific dictionaries the parsed sentence is converted to queries in the StruQL graph-searching query language, then sent to a server we developed, called OQAFMA, that queries the FMA and returns output as XML. Preliminary evaluation shows that GAPP has the potential to be used in the evaluation of the FMA by domain experts in anatomy. PMID:14728162
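
    The subject-relation-object form with one unknown slot can be illustrated with a toy lookup. The Python sketch below is not GAPP, StruQL, or OQAFMA: the in-memory `TRIPLES` list, the relation name `part_of`, and the function `answer` are hypothetical stand-ins that show only the shape of such a query.

```python
# Hypothetical toy knowledge base of (subject, relation, object) triples.
TRIPLES = [
    ("left ventricle", "part_of", "heart"),
    ("right ventricle", "part_of", "heart"),
]


def answer(subject, relation, obj):
    """Fill in the unknown slot of a subject-relation-object question.

    Exactly one of `subject` and `obj` must be None; it marks the unknown.
    """
    if (subject is None) == (obj is None):
        raise ValueError("exactly one of subject or obj must be unknown")
    if subject is None:
        # "What is part of the heart?" -> look up matching subjects.
        return [s for s, r, o in TRIPLES if r == relation and o == obj]
    # "What is the left ventricle part of?" -> look up matching objects.
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


if __name__ == "__main__":
    print(answer(None, "part_of", "heart"))
    print(answer("left ventricle", "part_of", None))
```

    A nested question could be handled in the same spirit by feeding the result of one call into the known slot of another.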

  5. Natural Language Engineering 5 (3): 113–133. Printed in the United Kingdom © 2000 Cambridge University Press

    E-print Network

    Yarowsky, David

    … language processing. For English, at least, performance of state-of-the-art systems on other lexical tasks … language processing since the inception of the field (Weaver 1949), algorithms for word sense selection

  6. Adapting a WSJ-Trained Parser to Grammatically Noisy Text Jennifer Foster, Joachim Wagner and Josef van Genabith

    E-print Network

    van Genabith, Josef

    … performance on the ungrammatical test sets. We show how a classifier can be used to prevent performance … it is not clear that it has solved the accurate robustness problem. The problem of adapting parsers

  7. Naturalism and Ideological Work: How Is Family Language Policy Renegotiated as Both Parents and Children Learn a Threatened Minority Language?

    ERIC Educational Resources Information Center

    Armstrong, Timothy Currie

    2014-01-01

    Parents who enroll their children to be educated through a threatened minority language frequently do not speak that language themselves and classes in the language are sometimes offered to parents in the expectation that this will help them to support their children's education and to use the minority language in the home. Providing…

  8. The Nature of the Language Faculty and Its Implications for Evolution of Language (Reply to Fitch, Hauser, and Chomsky)

    ERIC Educational Resources Information Center

    Jackendoff, Ray; Pinker, Steven

    2005-01-01

    In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the "narrow language faculty") consists only of recursion, and that this part cannot be considered an adaptation to communication. We…

  9. Toward enhanced Natural Language Processing to databases: Building a specific domain Ontology derived from database conceptual model

    Microsoft Academic Search

    Rajeh Khamis; Safwan Shatnawi

    2010-01-01

    Natural Language Interface to Database (NLIDB) applications achieve great success when dealing with simple user requests; however, most NLIDB applications fail dramatically when users issue indirect or sophisticated requests. One modern approach to enhancing NLIDB is to use an ontology. Ontologies are very helpful when used with Natural Language Processing applications for supporting extraction of relevant elements from databases. This paper

  10. TreeParser-Aided Klee Diagrams Display Taxonomic Clusters in DNA Barcode and Nuclear Gene Datasets

    PubMed Central

    Stoeckle, Mark Y.; Coffran, Cameron

    2013-01-01

    Indicator vector analysis of a nucleotide sequence alignment generates a compact heat map, called a Klee diagram, with potential insight into clustering patterns in evolution. However, so far this approach has examined only mitochondrial cytochrome c oxidase I (COI) DNA barcode sequences. To further explore, we developed TreeParser, a freely-available web-based program that sorts a sequence alignment according to a phylogenetic tree generated from the dataset. We applied TreeParser to nuclear gene and COI barcode alignments from birds and butterflies. Distinct blocks in the resulting Klee diagrams corresponded to species and higher-level taxonomic divisions in both groups, and this enabled graphic comparison of phylogenetic information in nuclear and mitochondrial genes. Our results demonstrate TreeParser-aided Klee diagrams objectively display taxonomic clusters in nucleotide sequence alignments. This approach may help establish taxonomy in poorly studied groups and investigate higher-level clustering which appears widespread but not well understood. PMID:24022383

  11. The effect of teachers' language on students' conceptions of the nature of science

    NASA Astrophysics Data System (ADS)

    Zeidler, Dana L.; Lederman, Norman G.

    Conveying an adequate conception of the nature of science to students is implicit in the broader context of what has come to be known as scientific literacy. However, it has previously been demonstrated that possession of valid conceptions of the nature of science does not necessarily result in the performance of those teaching behaviors that are related to improved student conceptions. The present study examines the possibility that the language teachers use to communicate science content may provide the context (Realist or Instrumentalist orientations) in which students come to formulate a world view of science. Eighteen high school biology teachers and one randomly selected class from each of their sections (n = 409 students) were administered pre- and posttests at the beginning and end of the fall term using the Nature of Scientific Knowledge Scale (NSKS). Composite scores of the student changes on the Testable, Developmental, and Creative subscales were used to compare those six classes that exhibited the greatest change with those six classes that had the least change on the NSKS. Intensive qualitative observations of each teacher were also conducted over the fall semester, resulting in complete transcripts of teacher-student interactions. Qualitative comparisons of classes with respect to six variables related to Realist and Instrumentalist conceptions of the nature of science were conducted. Teachers' ordinary language in the presentation of subject matter was found to have significant impact on students' conceptions of the nature of science. These variables represented different contexts (Realist-Instrumental) teachers used to express themselves, scientific information, and concepts. Determining the extent to which teachers' language has an impact on changes in students' conception of the nature of science has direct bearing on all preservice and inservice science teacher education programs.

  12. NLP-SIR: A Natural Language Approach for Spreadsheet Information Retrieval

    E-print Network

    Flood, Derek; Caffery, Fergal Mc

    2009-01-01

    Spreadsheets are a ubiquitous software tool, used for a wide variety of tasks such as financial modelling, statistical analysis and inventory management. Extracting meaningful information from such data can be a difficult task, especially for novice users unfamiliar with the advanced data processing features of many spreadsheet applications. We believe that through the use of Natural Language Processing (NLP) techniques this task can be made considerably easier. This paper introduces NLP-SIR, a Natural language interface for spreadsheet information retrieval. The results of a recent evaluation which compared NLP-SIR with existing Information retrieval tools are also outlined. This evaluation has shown that NLP-SIR is a more effective method of spreadsheet information retrieval.

  13. Keyword and Natural Language Query Processing for Semi-Structured Data Sources

    Microsoft Academic Search

    Arash Termehchy

    2009-01-01

    The unprecedentedly large volume of semi-structured data has exacerbated the need for an easy-to-use query interface for semi-structured data sources. Natural language interfaces and keyword search techniques that take advantage of the data set structure make it very easy for ordinary users to access the data. In this paper, we introduce the important challenges that lie in the

  14. GenI: Natural language generation in Haskell INRIA/LORIA/UHP

    E-print Network

    Paris-Sud XI, Université de

    In this article we present GenI, a chart based surface realisation tool implemented in Haskell. GenI takes … corresponds to the input semantics. The aim of the article is not so much to present GenI or to describe how

  15. Automatic reconstruction of a bacterial regulatory network using Natural Language Processing

    Microsoft Academic Search

    Carlos Rodríguez Penagos; Heladia Salgado; Irma Martínez-Flores; Julio Collado-Vides

    2007-01-01

    BACKGROUND: Manual curation of biological databases, an expensive and labor-intensive process, is essential for high quality integrated data. In this paper we report the implementation of a state-of-the-art Natural Language Processing system that creates computer-readable networks of regulatory interactions directly from different collections of abstracts and full-text papers. Our major aim is to understand how automatic annotation using Text-Mining techniques

  16. An Overview of Corpus-Based Statistics-Oriented (CBSO) Techniques for Natural Language Processing

    Microsoft Academic Search

    Keh-Yih Su; Tung-Hui Chiang; Jing-Shin Chang

    1996-01-01

    A Corpus-Based Statistics-Oriented (CBSO) methodology, which is an attempt to avoid the drawbacks of traditional rule-based approaches and purely statistical approaches, is introduced in this paper. Rule-based approaches, with rules induced by human experts, had been the dominant paradigm in the natural language processing community. Such approaches, however, suffer from serious difficulties in knowledge acquisition in terms of cost and

  17. Intelligent Software Development Environments: Integrating Natural Language Processing with the Eclipse Platform

    Microsoft Academic Search

    René Witte; Bahar Sateli; Ninus Khamis; Juergen Rilling

    Software engineers need to be able to create, modify, and analyze knowledge stored in software artifacts. A significant amount of these artifacts contain natural language, like version control commit messages, source code comments, or bug reports. Integrated software development environments (IDEs) are widely used, but they are only concerned with structured software artifacts – they do not offer support for

  18. Combining Goal Inference and Natural-Language Dialogue for Human-Robot Joint Action

    Microsoft Academic Search

    Mary Ellen Foster; Manuel Giuliani; Markus Rickert; Alois Knoll; Wolfram Erlhagen; Estela Bicho; Luis Louro

    We demonstrate how combining the reasoning components from two existing systems designed for human-robot joint action produces an integrated system with greater capabilities than either of the individual systems. One of the systems supports primarily non-verbal interaction and uses dynamic neural fields to infer the user's goals and to suggest appropriate system responses; the other emphasises natural-language

  19. Natural language processing-based COTS software and related technologies survey.

    SciTech Connect

    Stickland, Michael G.; Conrad, Gregory N.; Eaton, Shelley M.

    2003-09-01

    Natural language processing-based knowledge management software, traditionally developed for security organizations, is now becoming commercially available. An informal survey was conducted to discover and examine current NLP and related technologies and potential applications for information retrieval, information extraction, summarization, categorization, terminology management, link analysis, and visualization for possible implementation at Sandia National Laboratories. This report documents our current understanding of the technologies, lists software vendors and their products, and identifies potential applications of these technologies.

  20. The Use of Natural Language as an Intuitive Semantic Integration System Interface

    Microsoft Academic Search

    Stanisław Kozielski; Michał Świderski; Małgorzata Bach

    This paper describes the need for intuitive interfaces to complex systems that take their origin from the concepts of Semantic Web. Then it shows how Semantic Integration System HILLS can benefit from being merged with Pseudo Natural Language layer of Metalog system. The cooperation of these two systems is not perfect though - second part of the paper shows guidelines

  1. Agile sensor tasking for CoIST using natural language knowledge representation and reasoning

    NASA Astrophysics Data System (ADS)

    Braines, David; de Mel, Geeth; Gwilliams, Chris; Parizas, Christos; Pizzocaro, Diego; Bergamaschi, Flavio; Preece, Alun

    2014-06-01

    We describe a system architecture aimed at supporting Intelligence, Surveillance, and Reconnaissance (ISR) activities in a Company Intelligence Support Team (CoIST) using natural language-based knowledge representation and reasoning, and semantic matching of mission tasks to ISR assets. We illustrate an application of the architecture using a High Value Target (HVT) surveillance scenario which demonstrates semi-automated matching and assignment of appropriate ISR assets based on information coming in from existing sensors and human patrols operating in an area of interest and encountering a potential HVT vehicle. We highlight a number of key components of the system but focus mainly on the human/machine conversational interaction involving soldiers on the field providing input in natural language via spoken voice to a mobile device, which is then processed to machine-processable Controlled Natural Language (CNL) and confirmed with the soldier. The system also supports CoIST analysts obtaining real-time situation awareness on the unfolding events through fused CNL information via tools available at the Command and Control (C2). The system demonstrates various modes of operation including: automatic task assignment following inference of new high-importance information, as well as semi-automatic processing, providing the CoIST analyst with situation awareness information relevant to the area of operation.

  2. Visual language recognition with a feed-forward network of spiking neurons

    SciTech Connect

    Rasmussen, Craig E [Los Alamos National Laboratory; Garrett, Kenyan [Los Alamos National Laboratory; Sottile, Matthew [GALOIS; Shreyas, Ns [INDIANA UNIV.

    2010-01-01

    An analogy is made and exploited between the recognition of visual objects and language parsing. A subset of regular languages is used to define a one-dimensional 'visual' language, in which the words are translational and scale invariant. This allows an exploration of the viewpoint invariant languages that can be solved by a network of concurrent, hierarchically connected processors. A language family is defined that is hierarchically tiling system recognizable (HREC). As inspired by nature, an algorithm is presented that constructs a cellular automaton that recognizes strings from a language in the HREC family. It is demonstrated how a language recognizer can be implemented from the cellular automaton using a feed-forward network of spiking neurons. This parser recognizes fixed-length strings from the language in parallel and as the computation is pipelined, a new string can be parsed in each new interval of time. The analogy with formal language theory allows inferences to be drawn regarding what class of objects can be recognized by visual cortex operating in purely feed-forward fashion and what class of objects requires a more complicated network architecture.

  3. Reachability Analysis of the HTML5 Parser Specification and its Application to

    E-print Network

    Minamide, Yasuhiko

    … for HTML, HTML5, includes the detailed specification of the parsing algorithm for HTML5 documents, includ … of HTML5 and automatically generate HTML documents to test compatibilities of Web browsers. The set

  4. VALIDATION OF BITSTREAM SYNTAX AND SYNTHESIS OF PARSERS IN THE MPEG RECONFIGURABLE VIDEO CODING FRAMEWORK

    E-print Network

    Paris-Sud XI, Université de

    … requires systematic procedures and tools capable of describing the new bitstream syntaxes of such new … further explains the problem and describes the technologies used for describing new bitstream syntaxes

  5. Formalizing natural-language spatial relations descriptions with fuzzy decision tree algorithm

    NASA Astrophysics Data System (ADS)

    Xu, Jun; Yao, Changqing

    2006-10-01

    People usually use qualitative terms to express spatial relations, while current geographic information systems (GIS) all use quantitative approaches to store spatial information. The abilities of current GIS to represent and query spatial information about geographic space are limited. In order to incorporate the concepts and methods people use to infer information about geographic space into GIS, research on the formal model of common sense geography becomes increasingly important. Previous research on formalizing natural-language descriptions of spatial relations is all based on crisp classification algorithms. But human language about spatial relations is ambiguous: there is no clear boundary between "yes" and "no" when deciding whether a spatial relation predicate can express the spatial relations between objects. So the results of crisp classification algorithms cannot formalize natural-language terms well. This paper uses a fuzzy decision tree method to formalize the spatial relations between two linear objects. Topologic and metric indices are used as variables, and the results of a human-subject test are used as training data. The formalization result of the fuzzy decision tree is compared with the result of a crisp decision tree.

  6. The substantive nature of psycholexical personality factors: a comparison across languages.

    PubMed

    Peabody, Dean; De Raad, Boele

    2002-10-01

    The psycholexical approach to personality structure in American English has led to the Big Five factors. The present study considers whether this result is similar or different in other languages. Instead of placing the usual emphasis on quantitative indices, this study examines the substantive nature of the factors. Six studies in European languages were used to develop a taxonomy of content categories. The English translations of the relevant terms were then classified under this taxonomy. The results support the generality of Big Five Factor III (Conscientiousness). Factors IV (Emotional Stability) and V (Intellect) generally did not cohere. Factors I (Extraversion) and II (Agreeableness) tended to split when this was necessary to produce 5 factors. The analysis was extended to several additional studies. PMID:12374448

  7. Proceedings of the Fifteenth Conference on Computational Natural Language Learning, pages 58-67, Portland, Oregon, USA, 23-24 June 2011. © 2011 Association for Computational Linguistics

    E-print Network

    of automatic natural language processing (NLP). Urdu is one such language. The main objective of our research.61% F1). The morphological richness of the Urdu language enables us to extract features based on noun

  8. Knowledge acquisition from natural language for expert systems based on classification problem-solving methods

    NASA Technical Reports Server (NTRS)

    Gomez, Fernando

    1989-01-01

    It is shown how certain kinds of domain independent expert systems based on classification problem-solving methods can be constructed directly from natural language descriptions by a human expert. The expert knowledge is not translated into production rules. Rather, it is mapped into conceptual structures which are integrated into long-term memory (LTM). The resulting system is one in which problem-solving, retrieval and memory organization are integrated processes. In other words, the same algorithm and knowledge representation structures are shared by these processes. As a result of this, the system can answer questions, solve problems or reorganize LTM.

  9. MEDSYNDIKATE--a natural language system for the extraction of medical information from findings reports.

    PubMed

    Hahn, Udo; Romacker, Martin; Schulz, Stefan

    2002-12-01

    MEDSYNDIKATE is a natural language processor, which automatically acquires medical information from findings reports. In the course of text analysis their content is transferred to conceptual representation structures, which constitute a corresponding text knowledge base. MEDSYNDIKATE is particularly adapted to deal properly with text structures, such as various forms of anaphoric reference relations spanning several sentences. The strong demands MEDSYNDIKATE poses on the availability of expressive knowledge sources are accounted for by two alternative approaches to acquire medical domain knowledge (semi)automatically. We also present data for the information extraction performance of MEDSYNDIKATE in terms of the semantic interpretation of three major syntactic patterns in medical documents. PMID:12460632

  10. Discovering novel causal patterns from biomedical natural-language texts using Bayesian nets.

    PubMed

    Atkinson, John; Rivas, Alejandro

    2008-11-01

    Most of the biomedicine text mining approaches do not deal with specific cause-effect patterns that may explain the discoveries. In order to fill this gap, this paper proposes an effective new model for text mining from biomedicine literature that helps to discover cause-effect hypotheses related to diseases, drugs, etc. The supervised approach combines Bayesian inference methods with natural-language processing techniques in order to generate simple and interesting patterns. The results of applying the model to biomedicine text databases and its comparison with other state-of-the-art methods are also discussed. PMID:19000950

  11. Aspects of a Natural Language Based Artificial Intelligence System Report Number Seven: Language and the Structure of Knowledge.

    ERIC Educational Resources Information Center

    Borden, George A.

    ARIS is an artificial intelligence system which uses the English language to learn, understand, and communicate. The system attempts to simulate the psychoneurological processes which enable man to communicate verbally. It uses a modified stratificational grammar model and is being programed in PL/1 (a programing language) for an IBM 360/67…

  12. Language in Nature: on the Evolutionary Roots of a Cultural Phenomenon (draft chapter for The Language Phenomenon)

    E-print Network

    Amsterdam, University of

    for The Language Phenomenon) Willem Zuidema 1. Introduction What distinguishes Man from beast? For all of human, it is no overstatement to say that, from an evolutionary point of view, language is the most striking aspect of the human some of these sources of information to get an idea what form an evolutionary explanation for the human

  13. Fremdsprachenunterricht und natuerliche Zweitsprachigkeit: Spracherwerbssituationen im Vergleich (Foreign Language Teaching and Natural Bilingualism: A Comparison of Language Learning Situations).

    ERIC Educational Resources Information Center

    Butzkamm, Wolfgang

    1978-01-01

    A 6th-grade test in English as a foreign language is described and contrasted with second language acquisition by the children of foreign laborers in Germany. The latter calls for special teaching procedures; tests and tapes are described. The teacher's "feel" is considered more important than "scientific" methodology. (IFS/WGA)

  14. IR, NLP, AI and UFOS: or IRrelevance, natural language problems, artful intelligence and user-friendly online systems

    Microsoft Academic Search

    Tamas E. Doszkocs

    1986-01-01

    User Friendly Online Searching is examined in the context of Natural Language Processing in Information Retrieval and Artificial Intelligence. Opportunities for synergetic R & D are identified as the basis for Intelligent Information Retrieval and Artificial Retrieval Intelligence.

  15. On the use of Web resources and natural language processing techniques to improve automatic speech recognition systems

    Microsoft Academic Search

    Guillaume Gravier; Pascale S

    Language models used in current automatic speech recognition systems are trained on general-purpose corpora and are therefore not relevant to transcribe spoken documents dealing with successive precise topics, such as long multimedia streams, frequently tackling reports and debates. To overcome this problem, this paper shows that Web resources and natural language processing techniques can be effective to automatically collect a

  16. Prevalence and natural history of primary speech and language delay: findings from a systematic review of the literature

    Microsoft Academic Search

    James Law; James Boyle; Frances Harris; Avril Harkness; Chad Nye

    The prevalence and the natural history of primary speech and language delays were two of four domains covered in a systematic review of the literature related to screening for speech and language delay carried out for the NHS in the UK. The structure and process of the full literature review is introduced and criteria for inclusion in the two domains

  17. Bridging the Gap to Natural Language: A Review on Intelligent Tutoring Systems based on Latent Semantic Analysis

    Microsoft Academic Search

    Wolfgang Lenhard

    One of the major drawbacks in the implementation of intelligent tutoring systems is the limited capacity to process natural language and to automatically deal with unexpected or unknown vocabulary. Latent Semantic Analysis (LSA) is a statistical technique of automatic language processing, which can attenuate the

  18. The Esterel Synchronous Programming Language: Design, Semantics, Implementation

    Microsoft Academic Search

    Gérard Berry; Georges Gonthier

    1992-01-01

    this paper, we shall mostly be concerned by reactive kernels that constitute the central and most difficult part of reactive systems. In fact, ESTEREL is not a full-fledged programming language, but rather a program generator used to program reactive kernels in the same way as YACC [32] is used to program parsers from grammars. The interface and data handling must be specified in

  19. Grammar as a Programming Language. Artificial Intelligence Memo 391.

    ERIC Educational Resources Information Center

    Rowe, Neil

    Student projects that involve writing generative grammars in the computer language, "LOGO," are described in this paper, which presents a grammar-running control structure that allows students to modify and improve the grammar interpreter itself while learning how a simple kind of computer parser works. Included are procedures for programing a…

  20. Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers Andre F. T. Martins

    E-print Network

    Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers. André F. T. Martins, Miguel B. Almeida, Noah A. Smith. Priberam Labs, Alameda D. Afonso Henriques, 41, 2º, 1000-123 Lisboa, Portugal

  1. Knowledge-based machine indexing from natural language text: Knowledge base design, development, and maintenance

    NASA Technical Reports Server (NTRS)

    Genuardi, Michael T.

    1993-01-01

    One strategy for machine-aided indexing (MAI) is to provide a concept-level analysis of the textual elements of documents or document abstracts. In such systems, natural-language phrases are analyzed in order to identify and classify concepts related to a particular subject domain. The overall performance of these MAI systems is largely dependent on the quality and comprehensiveness of their knowledge bases. These knowledge bases function to (1) define the relations between a controlled indexing vocabulary and natural language expressions; (2) provide a simple mechanism for disambiguation and the determination of relevancy; and (3) allow the extension of concept-hierarchical structure to all elements of the knowledge file. After a brief description of the NASA Machine-Aided Indexing system, concerns related to the development and maintenance of MAI knowledge bases are discussed. Particular emphasis is given to statistically-based text analysis tools designed to aid the knowledge base developer. One such tool, the Knowledge Base Building (KBB) program, presents the domain expert with a well-filtered list of synonyms and conceptually-related phrases for each thesaurus concept. Another tool, the Knowledge Base Maintenance (KBM) program, functions to identify areas of the knowledge base affected by changes in the conceptual domain (for example, the addition of a new thesaurus term). An alternate use of the KBM as an aid in thesaurus construction is also discussed.

  2. Teaching the tacit knowledge of programming to novices with natural language tutoring

    NASA Astrophysics Data System (ADS)

    Lane, H. Chad; Vanlehn, Kurt

    2005-09-01

    For beginning programmers, inadequate problem solving and planning skills are among the most salient of their weaknesses. In this paper, we test the efficacy of natural language tutoring to teach and scaffold acquisition of these skills. We describe ProPL (Pro-PELL), a dialogue-based intelligent tutoring system that elicits goal decompositions and program plans from students in natural language. The system uses a variety of tutoring tactics that leverage students' intuitive understandings of the problem, how it might be solved, and the underlying concepts of programming. We report the results of a small-scale evaluation comparing students who used ProPL with a control group who read the same content. Our primary findings are that students who received tutoring from ProPL seem to have developed an improved ability to solve the composition problem and displayed behaviors that suggest they were able to think at greater levels of abstraction than students in the read-only group.

  3. Formal ontology for natural language processing and the integration of biomedical databases.

    PubMed

    Simon, Jonathan; Dos Santos, Mariana; Fielding, James; Smith, Barry

    2006-01-01

    The central hypothesis underlying this communication is that the methodology and conceptual rigor of a philosophically inspired formal ontology can bring significant benefits in the development and maintenance of application ontologies [A. Flett, M. Dos Santos, W. Ceusters, Some Ontology Engineering Procedures and their Supporting Technologies, EKAW2002, 2003]. This hypothesis has been tested in the collaboration between Language and Computing (L&C), a company specializing in software for supporting natural language processing especially in the medical field, and the Institute for Formal Ontology and Medical Information Science (IFOMIS), an academic research institution concerned with the theoretical foundations of ontology. In the course of this collaboration L&C's ontology, LinKBase, which is designed to integrate and support reasoning across a plurality of external databases, has been subjected to a thorough auditing on the basis of the principles underlying IFOMIS's Basic Formal Ontology (BFO) [B. Smith, Basic Formal Ontology, 2002. http://ontology.buffalo.edu/bfo]. The goal is to transform a large terminology-based ontology into one with the ability to support reasoning applications. Our general procedure has been the implementation of a meta-ontological definition space in which the definitions of all the concepts and relations in LinKBase are standardized in the framework of first-order logic. In this paper we describe how this principles-based standardization has led to a greater degree of internal coherence of the LinKBase structure, and how it has facilitated the construction of mappings between external databases using LinKBase as translation hub. We argue that the collaboration here described represents a new phase in the quest to solve the so-called "Tower of Babel" problem of ontology integration [F. Montayne, J. Flanagan, Formal Ontology: The Foundation for Natural Language Processing, 2003. http://www.landcglobal.com/]. PMID:16153885

  4. A Principled Framework for Constructing Natural Language Interfaces To Temporal Databases

    NASA Astrophysics Data System (ADS)

    Androutsopoulos, Ion

    1996-09-01

    Most existing natural language interfaces to databases (NLIDBs) were designed to be used with "snapshot" database systems, that provide very limited facilities for manipulating time-dependent data. Consequently, most NLIDBs also provide very limited support for the notion of time. The database community is becoming increasingly interested in temporal database systems. These are intended to store and manipulate in a principled manner information not only about the present, but also about the past and future. This thesis develops a principled framework for constructing English NLIDBs for temporal databases (NLITDBs), drawing on research in tense and aspect theories, temporal logics, and temporal databases. I first explore temporal linguistic phenomena that are likely to appear in English questions to NLITDBs. Drawing on existing linguistic theories of time, I formulate an account for a large number of these phenomena that is simple enough to be embodied in practical NLITDBs. Exploiting ideas from temporal logics, I then define a temporal meaning representation language, TOP, and I show how the HPSG grammar theory can be modified to incorporate the tense and aspect account of this thesis, and to map a wide range of English questions involving time to appropriate TOP expressions. Finally, I present and prove the correctness of a method to translate from TOP to TSQL2, TSQL2 being a temporal extension of the SQL-92 database language. This way, I establish a sound route from English questions involving time to a general-purpose temporal database language, that can act as a principled framework for building NLITDBs. To demonstrate that this framework is workable, I employ it to develop a prototype NLITDB, implemented using ALE and Prolog.

  5. Syn: A Single Language for Specifying Abstract Syntax Trees, Lexical Analysis,

    E-print Network

    Haddadi, Hamed

    Syn: A Single Language for Specifying Abstract Syntax Trees, Lexical Analysis, Parsing and Pretty aspects of context-free syntax can be specified without redundancy. The language is essentially lexers, parsers, pretty-printers and abstract syntax tree representations from a Syn specification. 1

  6. Role of PROLOG (Programming and Logic) in natural-language processing. Report for September-December 1987

    SciTech Connect

    McHale, M.L.

    1988-03-01

    The field of artificial intelligence strives to produce computer programs that exhibit intelligent behavior. One of the areas of interest is the processing of natural language. This report discusses the role of the computer language PROLOG in Natural Language Processing (NLP) both from theoretic and pragmatic viewpoints. The reasons for using PROLOG for NLP are numerous. First, linguists can write natural-language grammars almost directly as PROLOG programs; this allows fast-prototyping of NLP systems and facilitates analysis of NLP theories. Second, semantic representations of natural-language texts that use logic formalisms are readily produced in PROLOG because of PROLOG's logical foundations. Third, PROLOG's built-in inferencing mechanisms are often sufficient for inferences on the logical forms produced by NLPs. Fourth, the logical, declarative nature of PROLOG may make it the language of choice for parallel computing systems. Finally, the fact that PROLOG has a de facto standard (Edinburgh) makes the porting of code from one computer system to another virtually trouble free. Perhaps the strongest tie one could make between NLP and PROLOG was stated by John Stuart Mill in his inaugural address at St. Andrews: "The structure of every sentence is a lesson in logic."
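
    The first point in the abstract above, that a grammar can be written down almost directly as a program, can be illustrated with a minimal sketch. It is written in Python rather than PROLOG, and the toy grammar, the vocabulary, and the names GRAMMAR, parse, and recognizes are all assumptions made for this illustration; none of them come from the report. Each rule is stored as data, and a small backtracking recognizer tries the alternatives in order, loosely mirroring how a definite clause grammar is resolved.

```python
# Toy grammar (an assumption for illustration): each nonterminal maps to a
# list of alternative right-hand sides. Any symbol without an entry is
# treated as a terminal word.
GRAMMAR = {
    "s": [["np", "vp"]],
    "np": [["det", "n"]],
    "vp": [["v", "np"], ["v"]],
    "det": [["the"], ["a"]],
    "n": [["parser"], ["sentence"]],
    "v": [["reads"], ["halts"]],
}


def parse(symbol, words, start):
    """Yield every position at which `symbol` can end when matched from `start`."""
    if symbol not in GRAMMAR:
        # Terminal: it must equal the next word.
        if start < len(words) and words[start] == symbol:
            yield start + 1
        return
    for rhs in GRAMMAR[symbol]:
        # Thread the set of reachable positions through the right-hand side.
        positions = [start]
        for part in rhs:
            positions = [end for pos in positions for end in parse(part, words, pos)]
        yield from positions


def recognizes(sentence):
    """Return True if the whole sentence can be derived from the start symbol 's'."""
    words = sentence.split()
    return any(end == len(words) for end in parse("s", words, 0))


print(recognizes("the parser reads a sentence"))
print(recognizes("parser the reads"))
```

    The sketch only recognizes sentences; it builds no parse tree or logical form, and it would loop on a left-recursive rule, so it should be read as an illustration of the idea rather than as the approach taken in the report.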

  7. Abstract--Named entity recognition (NER) is a popular domain of natural language processing. For this reason, many

    E-print Network

    Boyer, Edmond

    Abstract-- Named entity recognition (NER) is a popular domain of natural language processing. For this reason, many tools exist to perform this task. Amongst other points, they differ in the processing method they rely upon, the entity types they can detect, the nature of the text they can handle, and their input

  8. An English Constraint Grammar (ENGCG) a surface-syntactic parser of English

    Microsoft Academic Search

    Atro Voutilainen; Juha Heikkilä

    This paper outlines a reductionistic surface-syntactic parsing description of English that was developed 1989-1992 by Atro Voutilainen, Juha Heikkilä and Arto Anttila (Voutilainen et al. 1992; Karlsson et al. 1994) within the Constraint Grammar framework originally proposed by Fred Karlsson (1990). The parsing system employs two main components: a morphological analyser and a reductionistic parser. The morphological analyser and description is based on Koskenniemi's Two-level Model (1983)

  9. CS 536 Fall 2002: Java CUP is a parser-generation tool,

    E-print Network

    Fischer, Charles N.

    to tell the generated parser how to get tokens from the scanner. Terminal and Non. RESULT names the left-hand side non-terminal. The Java classes of the symbols { stmts } The left brace is given the name l; the stmts non-terminal is called s. In the action code

  10. CS 536 Fall 2002: Java CUP is a parser-generation tool,

    E-print Network

    Fischer, Charles N.

    */ :} This code is used to tell the generated parser how to get tokens from the scanner. -terminal in a rule. RESULT names the left-hand side non-terminal. The Java classes prog { stmts } The left brace is given the name l; the stmts non-terminal is called s. In the action

  11. For a new edition of the Encyclopedia of Language and Linguistics. The History of Natural Language Processing and Machine Translation

    Microsoft Academic Search

    Yorick Wilks

    The article surveys fifty years of work in computational language processing and machine translation, and suggests that a great number of the important ideas were present in the earliest days and hampered only by lack of computational power. Sections review the influence of linguistics proper on the computational area, as well as the influence of artificial intelligence and concerns from

  12. Disclosure Control of Natural Language Information to Enable Secure and Enjoyable Communication over the Internet

    NASA Astrophysics Data System (ADS)

    Kataoka, Haruno; Utsumi, Akira; Hirose, Yuki; Yoshiura, Hiroshi

    Disclosure control of natural language information (DCNL), which we are trying to realize, is described. DCNL will be used for securing human communications over the internet, such as through blogs and social network services. Before sentences in the communications are disclosed, they are checked by DCNL and any phrases that could reveal sensitive information are transformed or omitted so that they are no longer revealing. DCNL checks not only phrases that directly represent sensitive information but also those that indirectly suggest it. Combinations of phrases are also checked. DCNL automatically learns the knowledge of sensitive phrases and the suggestive relations between phrases by using co-occurrence analysis and Web retrieval. The users' burden is therefore minimized, i.e., they do not need to define many disclosure control rules. DCNL complements the traditional access control in fields where reliability needs to be balanced with enjoyment and object classes for the access control cannot be predefined.

  13. Workshop on using natural language processing applications for enhancing clinical decision making: an executive summary.

    PubMed

    Pai, Vinay M; Rodgers, Mary; Conroy, Richard; Luo, James; Zhou, Ruixia; Seto, Belinda

    2014-02-01

    In April 2012, the National Institutes of Health organized a two-day workshop entitled 'Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making' (NLP-CDS). This report is a summary of the discussions during the second day of the workshop. Collectively, the workshop presenters and participants emphasized the need for unstructured clinical notes to be included in the decision making workflow and the need for individualized longitudinal data tracking. The workshop also discussed the need to: (1) combine evidence-based literature and patient records with machine-learning and prediction models; (2) provide trusted and reproducible clinical advice; (3) prioritize evidence and test results; and (4) engage healthcare professionals, caregivers, and patients. The overall consensus of the NLP-CDS workshop was that there are promising opportunities for NLP and CDS to deliver cognitive support for healthcare professionals, caregivers, and patients. PMID:23921193

  14. Current and future applications of natural language processing in the field of digestive diseases.

    PubMed

    Hou, Jason K; Imler, Timothy D; Imperiale, Thomas F

    2014-08-01

    Natural language processing (NLP) is a technology that uses computer-based linguistics and artificial intelligence to identify and extract information from free-text data sources such as progress notes, procedure and pathology reports, and laboratory and radiologic test results. With the creation of large databases and the trajectory of health care reform, NLP holds the promise of enhancing the availability, quality, and utility of clinical information with the goal of improving documentation, quality, and efficiency of health care in the United States. To date, NLP has shown promise in automatically determining appropriate colonoscopy intervals and identifying cases of inflammatory bowel disease from electronic health records. The objectives of this review are to provide background on NLP and its associated terminology, to describe how NLP has been used thus far in the field of digestive diseases, and to identify its potential future uses. PMID:24858706

  15. Extracting important information from Chinese Operation Notes with natural language processing methods.

    PubMed

    Wang, Hui; Zhang, Weide; Zeng, Qiang; Li, Zuofeng; Feng, Kaiyan; Liu, Lei

    2014-04-01

    Extracting information from unstructured clinical narratives is valuable for many clinical applications. Although natural language processing (NLP) methods have been profoundly studied in electronic medical records (EMR), few studies have explored NLP in extracting information from Chinese clinical narratives. In this study, we report the development and evaluation of extracting tumor-related information from operation notes of hepatic carcinomas which were written in Chinese. Using 86 operation notes manually annotated by physicians as the training set, we explored both rule-based and supervised machine-learning approaches. Evaluating on 29 unseen operation notes, our best approach yielded 69.6% in precision, 58.3% in recall and 63.5% in F-score. PMID:24486562
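
    The three figures reported above are mutually consistent if the F-score is the balanced F1 measure, the harmonic mean of precision and recall. The abstract does not say which F-measure was used, so that is an assumption here; the short Python check below, with the hypothetical helper name f_score, recomputes the value from the reported precision and recall.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract (as fractions rather than percentages).
reported_precision = 0.696
reported_recall = 0.583

f1 = f_score(reported_precision, reported_recall)
print(f"F1 = {f1:.1%}")  # prints "F1 = 63.5%"
```

    Because the harmonic mean is pulled toward the smaller of the two inputs, the F-score sits closer to the 58.3% recall than a simple average of the two figures would.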

  16. Recognizing Questions and Answers in EMR Templates Using Natural Language Processing.

    PubMed

    Divita, Guy; Shen, Shuying; Carter, Marjorie E; Redd, Andrew; Forbush, Tyler; Palmer, Miland; Samore, Matthew H; Gundlapalli, Adi V

    2014-01-01

    Templated boilerplate structures pose challenges to natural language processing (NLP) tools used for information extraction (IE). Routine error analyses while performing an IE task using Veterans Affairs (VA) medical records identified templates as an important cause of false positives. The baseline NLP pipeline (V3NLP) was adapted to recognize negation, questions and answers (QA) in various template types by adding a negation and slot:value identification annotator. The system was trained using a corpus of 975 documents developed as a reference standard for extracting psychosocial concepts. Iterative processing using the baseline tool and baseline+negation+QA revealed loss of numbers of concepts with a modest increase in true positives in several concept categories. Similar improvement was noted when the adapted V3NLP was used to process a random sample of 318,000 notes. We demonstrate the feasibility of adapting an NLP pipeline to recognize templates. PMID:25000038

  17. Conceptual Dissonance: Evaluating the Efficacy of Natural Language Processing Techniques for Validating Translational Knowledge Constructs

    PubMed Central

    Payne, Philip R.O.; Kwok, Alan; Dhaval, Rakesh; Borlawsky, Tara B.

    2009-01-01

    The conduct of large-scale translational studies presents significant challenges related to the storage, management and analysis of integrative data sets. Ideally, the application of methodologies such as conceptual knowledge discovery in databases (CKDD) provides a means for moving beyond intuitive hypothesis discovery and testing in such data sets, and towards the high-throughput generation and evaluation of knowledge-anchored relationships between complex bio-molecular and phenotypic variables. However, the induction of such high-throughput hypotheses is non-trivial, and requires correspondingly high-throughput validation methodologies. In this manuscript, we describe an evaluation of the efficacy of a natural language processing-based approach to validating such hypotheses. As part of this evaluation, we will examine a phenomenon that we have labeled as “Conceptual Dissonance” in which conceptual knowledge derived from two or more sources of comparable scope and granularity cannot be readily integrated or compared using conventional methods and automated tools. PMID:21347178

  18. Parallelism and the Penman natural-language-generation system. Research report

    SciTech Connect

    Tung, Y.W.; Matthiessen, C.; Sondheimer, N.

    1988-04-01

    This report discusses parallel processing for the Penman natural-language-generation system. The authors first analyze the computational requirement of the generation process. They then identify aspects of this computation that could benefit from being carried out in parallel. The Penman generator is composed of a systemic grammar, the Nigel grammar, and its environment. These two components are functionally separated and interface to each other via an inquiry mechanism. This implies that Nigel and its environment can be processed in a distributed way. We also illustrate how both Nigel and the major part of its environment, the KL-TWO knowledge base, can each be processed in parallel. In the Nigel grammar, the systems, choosers and realization statements can be activated simultaneously according to some computational dependency that resembles the system network. The KL-TWO knowledge base can be implemented as a parallel computing system, and two existing approaches, using Classifier Systems and Connectionist Models, respectively, are analyzed and assessed.

  19. Interset: A natural language interface for teleoperated robotic assembly of the EASE space structure

    NASA Technical Reports Server (NTRS)

    Boorsma, Daniel K.

    1989-01-01

    A teleoperated robot was used to assemble the Experimental Assembly of Structures in Extra-vehicular activity (EASE) space structure under neutral buoyancy conditions, simulating a telerobot performing structural assembly in the zero gravity of space. This previous work used a manually controlled teleoperator as a test bed for system performance evaluations. From these results several Artificial Intelligence options were proposed. One of these was further developed into a real time assembly planner. The interface for this system is effective in assembling EASE structures using windowed graphics and a set of networked menus. As the problem space becomes more complex and hence the set of control options increases, a natural language interface may prove to be beneficial to supplement the menu based control strategy. This strategy can be beneficial in situations such as: describing the local environment, maintaining a data base of task event histories, modifying a plan or a heuristic dynamically, summarizing a task in English, or operating in a novel situation.

  20. How many kinds of reasoning? Inference, probability, and natural language semantics.

    PubMed

    Lassiter, Daniel; Goodman, Noah D

    2015-03-01

    The "new paradigm" unifying deductive and inductive reasoning in a Bayesian framework (Oaksford & Chater, 2007; Over, 2009) has been claimed to be falsified by results which show sharp differences between reasoning about necessity vs. plausibility (Heit & Rotello, 2010; Rips, 2001; Rotello & Heit, 2009). We provide a probabilistic model of reasoning with modal expressions such as "necessary" and "plausible" informed by recent work in formal semantics of natural language, and show that it predicts the possibility of non-linear response patterns which have been claimed to be problematic. Our model also makes a strong monotonicity prediction, while two-dimensional theories predict the possibility of reversals in argument strength depending on the modal word chosen. Predictions were tested using a novel experimental paradigm that replicates the previously-reported response patterns with a minimal manipulation, changing only one word of the stimulus between conditions. We found a spectrum of reasoning "modes" corresponding to different modal words, and strong support for our model's monotonicity prediction. This indicates that probabilistic approaches to reasoning can account in a clear and parsimonious way for data previously argued to falsify them, as well as new, more fine-grained, data. It also illustrates the importance of careful attention to the semantics of language employed in reasoning experiments. PMID:25497521

  1. Neural substrates of figurative language during natural speech perception: an fMRI study

    PubMed Central

    Nagels, Arne; Kauschke, Christina; Schrauf, Judith; Whitney, Carin; Straube, Benjamin; Kircher, Tilo

    2013-01-01

    Many figurative expressions are fully conventionalized in everyday speech. Regarding the neural basis of figurative language processing, research has predominantly focused on metaphoric expressions in minimal semantic context. It remains unclear in how far metaphoric expressions during continuous text comprehension activate similar neural networks as isolated metaphors. We therefore investigated the processing of similes (figurative language, e.g., “He smokes like a chimney!”) occurring in a short story. Sixteen healthy, male, native German speakers listened to similes that came about naturally in a short story, while blood-oxygenation-level-dependent (BOLD) responses were measured with functional magnetic resonance imaging (fMRI). For the event-related analysis, similes were contrasted with non-figurative control sentences (CS). The stimuli differed with respect to figurativeness, while they were matched for frequency of words, number of syllables, plausibility, and comprehensibility. Similes contrasted with CS resulted in enhanced BOLD responses in the left inferior (IFG) and adjacent middle frontal gyrus. Concrete CS as compared to similes activated the bilateral middle temporal gyri as well as the right precuneus and the left middle frontal gyrus (LMFG). Activation of the left IFG for similes in a short story is consistent with results on single sentence metaphor processing. The findings strengthen the importance of the left inferior frontal region in the processing of abstract figurative speech during continuous, ecologically-valid speech comprehension; the processing of concrete semantic contents goes along with a down-regulation of bilateral temporal regions. PMID:24065897

  2. ImageParser: a tool for finite element generation from three-dimensional medical images

    PubMed Central

    Yin, HM; Sun, LZ; Wang, G; Yamada, T; Wang, J; Vannier, MW

    2004-01-01

    Background: The finite element method (FEM) is a powerful mathematical tool to simulate and visualize the mechanical deformation of tissues and organs during medical examinations or interventions. It remains a challenge to build an FEM mesh directly from a volumetric image, partly because the regions (or structures) of interest (ROIs) may be irregular and fuzzy. Methods: A software package, ImageParser, is developed to generate an FEM mesh from 3-D tomographic medical images. This software uses a semi-automatic method to detect ROIs from the context of the image, including neighboring tissues and organs, completes segmentation of different tissues, and meshes the organ into elements. Results: The ImageParser is shown to build up an FEM model for simulating the mechanical responses of the breast based on 3-D CT images. The breast is compressed by two plate paddles under an overall displacement as large as 20% of the initial distance between the paddles. The strain and tangential Young's modulus distributions are specified for the biomechanical analysis of breast tissues. Conclusion: The ImageParser can successfully extract the geometry of ROIs from a complex medical image and generate the FEM mesh with customer-defined segmentation information. PMID:15461787

  3. LABORATORY PROCESS CONTROLLER USING NATURAL LANGUAGE COMMANDS FROM A PERSONAL COMPUTER

    NASA Technical Reports Server (NTRS)

    Will, H.

    1994-01-01

    The complex environment of the typical research laboratory requires flexible process control. This program provides natural language process control from an IBM PC or compatible machine. Sometimes process control schedules require changes frequently, even several times per day. These changes may include adding, deleting, and rearranging steps in a process. This program sets up a process control system that can either run without an operator, or be run by workers with limited programming skills. The software system includes three programs. Two of the programs, written in FORTRAN77, record data and control research processes. The third program, written in Pascal, generates the FORTRAN subroutines used by the other two programs to identify the user commands with the user-written device drivers. The software system also includes an input data set which allows the user to define the user commands which are to be executed by the computer. To set the system up the operator writes device driver routines for all of the controlled devices. Once set up, this system requires only an input file containing natural language command lines which tell the system what to do and when to do it. The operator can make up custom commands for operating and taking data from external research equipment at any time of the day or night without the operator in attendance. This process control system requires a personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. The program requires a FORTRAN77 compiler and user-written device drivers. This program was developed in 1989 and has a memory requirement of about 62 Kbytes.

  4. Interpretation of natural-language data base queries using optimization methods

    SciTech Connect

    Leigh, W.E.

    1984-01-01

    The automatic interpretation of natural language (in this work, English) database questions formulated by a user untrained in the technical aspects of database querying is an established problem in the field of artificial intelligence. State-of-the-art approaches involve the analysis of queries with syntactic and semantic grammars expressed in phrase structure grammar or transition network formalisms. With such methods, difficulties exist with the detection and resolution of ambiguity, with the misinterpretation possibilities inherent in finite-length look-ahead, and with the modification and extension of a mechanism for other sources of semantic knowledge. This work examines the potential of optimization techniques to solve these problems and interpret natural language database queries. The proposed method involves developing a 0-1 integer programming problem for each query. The possible values that the set of variables in the optimization may take on are an enumeration of the possible individual associations between the database schema and the query. The solution to the integer programming problem corresponds to a single assignment of database data items and relationships to the words in the query. Constraints are derived from systematic and database schema knowledge stored as libraries of templates. An objective function is used to rank the possible associations as to their likelihood of agreement with the intent of the questioner. A test mechanism was built to support evaluation of the proposed method. Suitable knowledge source template sets and an objective function were developed experimentally with the test mechanism from a learning sample of queries. Then the performance of the method was compared to that of an established system (PLANES) on a test set of queries. The performance of the new method was found to be comparable to that of the established system.
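
    The 0-1 formulation described in this abstract can be illustrated with a very small sketch. Everything in it is an assumption made for illustration: the query words, the schema items, the score table, and the single "exactly one schema item per word" constraint are hypothetical and are not taken from the report, and exhaustive enumeration stands in for a real integer programming solver.

```python
from itertools import product

# Hypothetical query words and database schema items (illustrative only).
words = ["ships", "length"]
schema_items = ["SHIP", "SHIP.LENGTH", "PORT"]

# Hypothetical objective weights: a higher value means a more plausible
# association between a query word and a schema item.
score = {
    ("ships", "SHIP"): 3, ("ships", "SHIP.LENGTH"): 1, ("ships", "PORT"): 0,
    ("length", "SHIP"): 0, ("length", "SHIP.LENGTH"): 3, ("length", "PORT"): 0,
}

# One binary decision variable per (word, schema item) pair.
pairs = [(w, s) for w in words for s in schema_items]


def feasible(x):
    """Assumed constraint: each query word maps to exactly one schema item."""
    return all(sum(x[(w, s)] for s in schema_items) == 1 for w in words)


def solve():
    """Enumerate every 0-1 vector and keep the best feasible one."""
    best, best_value = None, float("-inf")
    for bits in product((0, 1), repeat=len(pairs)):
        x = dict(zip(pairs, bits))
        if not feasible(x):
            continue
        value = sum(score[p] * x[p] for p in pairs)
        if value > best_value:
            best, best_value = x, value
    # Report the chosen association for each word.
    return {w: s for (w, s), bit in best.items() if bit}, best_value


assignment, value = solve()
print(assignment, value)
```

    Enumeration is only workable for a handful of variables; a dedicated integer programming solver would be needed for queries of realistic size.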

  5. Wikipedia and Medicine: Quantifying Readership, Editors, and the Significance of Natural Language

    PubMed Central

    West, Andrew G

    2015-01-01

    Background Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of health care information by both professionals and the lay public. Objective This paper quantifies the production and consumption of Wikipedia’s medical content along 4 dimensions. First, we measured the amount of medical content in both articles and bytes and, second, the citations that supported that content. Third, we analyzed the medical readership against that of other health care websites between Wikipedia’s natural language editions and its relationship with disease prevalence. Fourth, we surveyed the quantity/characteristics of Wikipedia’s medical contributors, including year-over-year participation trends and editor demographics. Methods Using a well-defined categorization infrastructure, we identified medically pertinent English-language Wikipedia articles and links to their foreign language equivalents. With these, Wikipedia can be queried to produce metadata and full texts for entire article histories. Wikipedia also makes available hourly reports that aggregate reader traffic at per-article granularity. An online survey was used to determine the background of contributors. Standard mining and visualization techniques (eg, aggregation queries, cumulative distribution functions, and/or correlation metrics) were applied to each of these datasets. Analysis focused on year-end 2013, but historical data permitted some longitudinal analysis. Results Wikipedia’s medical content (at the end of 2013) was made up of more than 155,000 articles and 1 billion bytes of text across more than 255 languages. This content was supported by more than 950,000 references. Content was viewed more than 4.88 billion times in 2013. This makes it one of if not the most viewed medical resource(s) globally. The core editor community numbered less than 300 and declined over the past 5 years. 
    The members of this community were half health care providers and 85.5% (100/117) had a university education. Conclusions Although Wikipedia has a considerable volume of multilingual medical content that is extensively read and well-referenced, the core group of editors that contribute and maintain that content is small and shrinking in size. PMID:25739399

  6. Vision based Interpretation of Natural Sign Languages

    E-print Network

    Bowden, Richard; Zisserman, Andrew; Windridge, Dave; Kadir, Timor; Brady, Mike

    ... base to be explicitly stated. This allows the same system to be used for different sign languages requiring only a change of the knowledge base. Introduction: Sign Language is a visual language and consists ...

  7. Proceedings of the 9th Conference on Computational Natural Language Learning (CoNLL), pages 120-127, Ann Arbor, June 2005. ©2005 Association for Computational Linguistics

    E-print Network

    ... Natural Language Processing (NLP). We then present experimental results obtained on two morphological ... of analogical relationships and to efficiently implement their computation. In Natural Language Processing (NLP ... analogies in various contexts: automatic word pronunciation (Yvon, 1999), morphological analysis (Lepage ...

  8. Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 867-874, Vancouver, October 2005. ©2005 Association for Computational Linguistics

    E-print Network

    ... creation of new OCR capabilities for low density languages, improvement of OCR performance for a native com ... accuracy for low density languages altogether lacking an OCR system, to significantly improve ... -specific OCR development, given the expensive and time consuming nature of OCR development for new languages ...

  9. A Scheme For Assessing The Nature Of A Young Child's Language Competence

    ERIC Educational Resources Information Center

    McFetridge, Patricia A.

    1974-01-01

    Article considered recent research in language conducted by a teacher educator from St. Lucia, West Indies. Her specific focus was on the methods devised for collection and analysis of language samples. (Author/RK)

  10. Language in Nature: On the Evolutionary Roots of a Cultural Phenomenon

    NASA Astrophysics Data System (ADS)

    Zuidema, Willem

    What could an evolutionary explanation for language look like? Here I review relevant evidence from linguistics, comparative biology, evolutionary theory and the fossil record, which suggest vocal imitation and hierarchical compositionality as the essential and uniquely human biological foundations of language. I also outline a plausible scenario for how human language evolved, and propose that language preceded, and facilitated the development of, other cognitive domains such as reasoning, the ability to plan, and consciousness.

  11. 9th Annual Oklahoma Native American Youth Language Fair Sam Noble Oklahoma Museum of Natural History

    E-print Network

    Oklahoma, University of

    9th Annual Oklahoma Native American Youth Language Fair Sam Noble Oklahoma ... from the Oklahoma Native American Youth Language Fair! We are pleased ... K-12th) · Song in Native American Language (PreK-12th) NEW for 2011

  12. Natural Language as a Tool for Analyzing the Proving Process: The Case of Plane Geometry Proof

    ERIC Educational Resources Information Center

    Robotti, Elisabetta

    2012-01-01

    In the field of human cognition, language plays a special role that is connected directly to thinking and mental development (e.g., Vygotsky, "1938"). Thanks to "verbal thought", language allows humans to go beyond the limits of immediately perceived information, to form concepts and solve complex problems (Luria, "1975"). So, it appears language…

  13. Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus.

    PubMed

    Comeau, Donald C; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W John

    2014-01-01

    BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net. PMID:24935050

  14. Semi-supervised learning of statistical models for natural language understanding.

    PubMed

    Zhou, Deyu; He, Yulan

    2014-01-01

    Natural language understanding is to specify a computational model that maps sentences to their semantic meaning representation. In this paper, we propose a novel framework to train the statistical models without using expensive fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied on two statistical models, the conditional random fields (CRFs) and the hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA communicator data show that both CRFs and HM-SVMs outperform the baseline approach, the previously proposed hidden vector state (HVS) model, which is also trained on abstract semantic annotations. In addition, the proposed framework outperforms two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, with relative error reduction rates of about 25% and 15% being achieved in F-measure. PMID:25152899

  15. Identifying Abdominal Aortic Aneurysm Cases and Controls using Natural Language Processing of Radiology Reports

    PubMed Central

    Sohn, Sunghwan; Ye, Zi; Liu, Hongfang; Chute, Christopher G.; Kullo, Iftikhar J.

    Prevalence of abdominal aortic aneurysm (AAA) is increasing due to longer life expectancy and implementation of screening programs. Patient-specific longitudinal measurements of AAA are important to understand pathophysiology of disease development and modifiers of abdominal aortic size. In this paper, we applied natural language processing (NLP) techniques to process radiology reports and developed a rule-based algorithm to identify AAA patients and also extract the corresponding aneurysm size with the examination date. AAA patient cohorts were determined by a hierarchical approach that: 1) selected potential AAA reports using keywords; 2) classified reports into AAA-case vs. non-case using rules; and 3) determined the AAA patient cohort based on a report-level classification. Our system was built in an Unstructured Information Management Architecture framework that allows efficient use of existing NLP components. Our system produced an F-score of 0.961 for AAA-case report classification with an accuracy of 0.984 for aneurysm size extraction. PMID:24303276
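
    The three-step hierarchy in this abstract (keyword selection, rule-based report classification, patient-level cohort) might be sketched as follows. The keyword list, the negation rule, the size pattern, and the "any positive report makes the patient a case" policy are all assumptions for illustration; the abstract does not give the study's actual rules, and this sketch does not use the Unstructured Information Management Architecture framework.

```python
import re

# Hypothetical patterns; the study's real keywords and rules are not given.
KEYWORDS = re.compile(r"\b(aneurysm|AAA)\b", re.IGNORECASE)
NEGATION = re.compile(
    r"\b(no|without|negative for)\b[^.]*\b(aneurysm|AAA)\b", re.IGNORECASE
)
SIZE = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.IGNORECASE)


def is_candidate(report):
    """Step 1: keep only reports that mention an aneurysm keyword."""
    return bool(KEYWORDS.search(report))


def is_case(report):
    """Step 2: a candidate report is a case unless the mention is negated."""
    return is_candidate(report) and not NEGATION.search(report)


def extract_size_cm(report):
    """Return the first size found in the report, converted to centimetres."""
    match = SIZE.search(report)
    if match is None:
        return None
    value = float(match.group(1))
    return value / 10 if match.group(2).lower() == "mm" else value


def patient_cohort(reports):
    """Step 3 (assumed policy): a patient is a case if any report is a case."""
    return {patient_id for patient_id, text in reports if is_case(text)}


# Made-up example reports, not data from the study.
reports = [
    ("p1", "Infrarenal abdominal aortic aneurysm measuring 4.5 cm."),
    ("p2", "No evidence of aneurysm."),
    ("p3", "Normal chest radiograph."),
]
print(patient_cohort(reports), extract_size_cm(reports[0][1]))
```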

  16. Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing

    PubMed Central

    Wu, Stephen; Miller, Timothy; Masanz, James; Coarr, Matt; Halgrim, Scott; Carrell, David; Clark, Cheryl

    2014-01-01

    A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been “solved.” This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP. PMID:25393544

  17. Towards symbiosis in knowledge representation and natural language processing for structuring clinical practice guidelines.

    PubMed

    Weng, Chunhua; Payne, Philip R O; Velez, Mark; Johnson, Stephen B; Bakken, Suzanne

    2014-01-01

    The successful adoption by clinicians of evidence-based clinical practice guidelines (CPGs) contained in clinical information systems requires efficient translation of free-text guidelines into computable formats. Natural language processing (NLP) has the potential to improve the efficiency of such translation. However, it is laborious to develop NLP to structure free-text CPGs using existing formal knowledge representations (KR). In response to this challenge, this vision paper discusses the value and feasibility of supporting symbiosis in text-based knowledge acquisition (KA) and KR. We compare two ontologies: (1) an ontology manually created by domain experts for CPG eligibility criteria and (2) an upper-level ontology derived from a semantic pattern-based approach for automatic KA from CPG eligibility criteria text. Then we discuss the strengths and limitations of interweaving KA and NLP for KR purposes and important considerations for achieving the symbiosis of KR and NLP for structuring CPGs to achieve evidence-based clinical practice. PMID:24943582

  18. Automated extraction of BI-RADS final assessment categories from radiology reports with natural language processing.

    PubMed

    Sippo, Dorothy A; Warden, Graham I; Andriole, Katherine P; Lacson, Ronilda; Ikuta, Ichiro; Birdwell, Robyn L; Khorasani, Ramin

    2013-10-01

    The objective of this study is to evaluate a natural language processing (NLP) algorithm that determines American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) final assessment categories from radiology reports. This HIPAA-compliant study was granted institutional review board approval with waiver of informed consent. This cross-sectional study involved 1,165 breast imaging reports in the electronic medical record (EMR) from a tertiary care academic breast imaging center from 2009. Reports included screening mammography, diagnostic mammography, breast ultrasound, combined diagnostic mammography and breast ultrasound, and breast magnetic resonance imaging studies. Over 220 reports were included from each study type. The recall (sensitivity) and precision (positive predictive value) of an NLP algorithm to collect BI-RADS final assessment categories stated in the report final text were evaluated against a manual human review standard reference. For all breast imaging reports, the NLP algorithm demonstrated a recall of 100.0 % (95 % confidence interval (CI), 99.7, 100.0 %) and a precision of 96.6 % (95 % CI, 95.4, 97.5 %) for correct identification of BI-RADS final assessment categories. The NLP algorithm demonstrated high recall and precision for extraction of BI-RADS final assessment categories from the free text of breast imaging reports. NLP may provide an accurate, scalable data extraction mechanism from reports within EMRs to create databases to track breast imaging performance measures and facilitate optimal breast cancer population management strategies. PMID:23868515
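
    For readers unfamiliar with the two metrics reported above, the sketch below shows the standard definitions of recall (sensitivity) and precision (positive predictive value). The counts in the example call are hypothetical and are not the study's data.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) computed from raw counts.

    precision = TP / (TP + FP)   (positive predictive value)
    recall    = TP / (TP + FN)   (sensitivity)
    A metric is returned as None when its denominator is zero.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else None
    recall = true_positives / actual if actual else None
    return precision, recall


# Hypothetical counts chosen only to exercise the function.
precision, recall = precision_recall(
    true_positives=90, false_positives=10, false_negatives=0
)
print(precision, recall)
```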

  19. Negation's not solved: generalizability versus optimizability in clinical natural language processing.

    PubMed

    Wu, Stephen; Miller, Timothy; Masanz, James; Coarr, Matt; Halgrim, Scott; Carrell, David; Clark, Cheryl

    2014-01-01

    A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been "solved." This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP. PMID:25393544

  20. Natural Language Processing Versus Content-Based Image Analysis for Medical Document Retrieval

    PubMed Central

    Névéol, Aurélie; Deserno, Thomas M.; Darmoni, Stéfan J.; Güld, Mark Oliver; Aronson, Alan R.

    2009-01-01

    One of the most significant recent advances in health information systems has been the shift from paper to electronic documents. While research on automatic text and image processing has taken separate paths, there is a growing need for joint efforts, particularly for electronic health records and biomedical literature databases. This work aims at comparing text-based versus image-based access to multimodal medical documents using state-of-the-art methods of processing text and image components. A collection of 180 medical documents containing an image accompanied by a short text describing it was divided into training and test sets. Content-based image analysis and natural language processing techniques are applied individually and combined for multimodal document analysis. The evaluation consists of an indexing task and a retrieval task based on the “gold standard” codes manually assigned to corpus documents. The performance of text-based and image-based access, as well as combined document features, is compared. Image analysis proves more adequate for both the indexing and retrieval of the images. In the indexing task, multimodal analysis outperforms both independent image and text analysis. This experiment shows that text describing images can be usefully analyzed in the framework of a hybrid text/image retrieval system. PMID:19633735

  1. Natural-Language Syntax as Procedures for Interpretation: The Dynamics of Ellipsis Construal

    NASA Astrophysics Data System (ADS)

    Kempson, Ruth; Gregoromichelaki, Eleni; Meyer-Viol, Wilfried; Purver, Matthew; White, Graham; Cann, Ronnie

    In this paper we set out the preliminaries needed for a formal theory of context, relative to a linguistic framework in which natural-language syntax is defined as procedures for context-dependent interpretation. Dynamic Syntax provides a formalism where both representations of content and context are defined dynamically and structurally, with time-linear monotonic growth across sequences of partial trees as the core structure-inducing notion. The primary data involve elliptical fragments, as these provide less familiar evidence of the requisite concept of context than anaphora, but equally central. As part of our sketch of the framework, we show how apparent anomalies for a time-linear basis for interpretation can be straightforwardly characterised once we adopt a new perspective on syntax as the dynamics of transitions between parse-states. We then take this as the basis for providing an integrated account of ellipsis construal. And, as a bonus, we will show how this intrinsically dynamic perspective extends in a seamless way to dialogue exchanges with free shifting of role between speaking and hearing (split-utterances). We shall argue that what is required to explain such dialogue phenomena is for contexts, as representations of content, to include not merely partial structures but also the sequence of actions that led to such structures.

  2. Pythagoras and the Language of Nature

    E-print Network

    O'Laughlin, Jay

    Pythagoras and the Language of Nature. Walking past the village blacksmith's shop, the pleasant ring ... by a universal and all pervading mathematics, numbers in balance and harmony. You are Pythagoras (580-497 B ... of space and mathematics leads you, Pythagoras, to a critical discovery. * * * * * Let us replicate in our ...

  3. Does It Really Matter whether Students' Contributions Are Spoken versus Typed in an Intelligent Tutoring System with Natural Language?

    ERIC Educational Resources Information Center

    D'Mello, Sidney K.; Dowell, Nia; Graesser, Arthur

    2011-01-01

    There is the question of whether learning differs when students speak versus type their responses when interacting with intelligent tutoring systems with natural language dialogues. Theoretical bases exist for three contrasting hypotheses. The "speech facilitation" hypothesis predicts that spoken input will "increase" learning, whereas the "text…

  4. An Examination of Natural Language as a Query Formation Tool for Retrieving Information on E-Health from Pub Med.

    ERIC Educational Resources Information Center

    Peterson, Gabriel M.; Su, Kuichun; Ries, James E.; Sievert, Mary Ellen C.

    2002-01-01

    Discussion of Internet use for information searches on health-related topics focuses on a study that examined complexity and variability of natural language in using search terms that express the concept of electronic health (e-health). Highlights include precision of retrieved information; shift in terminology; and queries using the Pub Med…

  5. The Past and 3½ Futures of NLP

    E-print Network

    Chen, Sheng-Wei

    The Past and 3½ Futures of NLP. Natural Language text and speech processing (Computational ... and large-scale processing is increasingly being adopted (especially for commercial NLP) in this decade ... researchers focus on algorithms to effect the transformation of representation required in NLP; and the large ...

  6. AbstFinder, a prototype abstraction finder for natural language text for use in requirements elicitation: design, methodology, and evaluation

    Microsoft Academic Search

    Leah Goldin; Daniel M. Berry

    1994-01-01

    In order to help solve the problems of requirements elicitation, this paper motivates and describes a new approach, based on traditional signal processing methods, for finding abstractions in natural language text. The design of AbstFinder, an implementation of the approach, and the evaluation of its effectiveness on an industrial-strength example are described.

  7. Evaluation of Automated Natural Language Processing in the Further Development of Science Information Retrieval. String Program Reports No. 10.

    ERIC Educational Resources Information Center

    Sager, Naomi

    This investigation matches the emerging techniques in computerized natural language processing against emerging needs for such techniques in the information field to evaluate and extend such techniques for future applications and to establish a basis and direction for further research toward these goals. An overview describes developments in the…

  8. Preprint of: Robotics and Autonomous Systems, 38 (3-4): 171-181(2002) Mobile Robot Programming Using Natural Language.

    E-print Network

    Bugmann, Guido

    2002-01-01

    .plym.ac.uk/soc/staff/guidbugm/ibl/index.html KEYWORDS: Natural Language, Human-robot dialogue, mobile robots learning, corpus collection, route, the system will learn it by combining primitives as instructed by the user. This paper describes the components of the Instruction Based Learning architecture and discusses issues of knowledge representation

  9. A natural language processing (NLP) program effectively extracts key pathologic findings from radical prostatectomy reports.

    PubMed

    Kim, Brian; Merchant, Madhur; Zheng, Chengyi; Thomas, Anil Abraham; Contreras, Richard; Jacobsen, Steven J; Chien, Gary

    2014-08-01

    Introduction and Objective: Natural language processing (NLP) software programs have been widely developed to transform complex, free text into simplified, organized data. Potential applications in the field of medicine include automated report summaries, physician alerts, patient repositories, electronic medical record (EMR) billing, and quality metric reports. Despite these prospects and the recent widespread adoption of EMR, NLP has been relatively underutilized. The objective of this study was to evaluate the performance of an internally developed NLP program in extracting select pathologic findings from radical prostatectomy specimen reports in the EMR. Methods: An NLP program was generated by a software engineer to extract key variables from prostatectomy reports in the EMR within our healthcare system, which included: TNM stage, Gleason grade, presence of a tertiary Gleason pattern, histologic subtype, size of dominant tumor nodule, seminal vesicle invasion (SVI), perineural invasion (PNI), angiolymphatic invasion (ALI), extracapsular extension (ECE), and surgical margin status (SMS). The program was validated by comparing NLP results to a "gold standard" compiled by two blinded manual reviewers for 100 random pathology reports. Results: NLP demonstrated 100% accuracy for identifying Gleason grade, presence of a tertiary Gleason pattern, SVI, ALI, and ECE. It also demonstrated near-perfect accuracy for extracting histologic subtype (99.0%), PNI (98.9%), TNM stage (98.0%), SMS (97.0%), and dominant tumor size (95.7%). The overall accuracy of NLP was 98.7%. NLP generated a result in <1 second, whereas the manual reviewers averaged 3.2 minutes per report. Conclusions: This novel program demonstrated high accuracy and efficiency identifying key pathologic details from the prostatectomy report within an EMR system. NLP has the potential to assist urologists by summarizing and highlighting relevant information from verbose pathology reports. It may also facilitate future urologic research through the rapid and automated creation of large databases. PMID:25083914

  10. Measuring Information Acquisition from Sensory Input Using Automated Scoring of Natural-Language Descriptions

    PubMed Central

    Saunders, Daniel R.; Bex, Peter J.; Rose, Dylan J.; Woods, Russell L.

    2014-01-01

    Information acquisition, the gathering and interpretation of sensory information, is a basic function of mobile organisms. We describe a new method for measuring this ability in humans, using free-recall responses to sensory stimuli which are scored objectively using a “wisdom of crowds” approach. As an example, we demonstrate this metric using perception of video stimuli. Immediately after viewing a 30 s video clip, subjects responded to a prompt to give a short description of the clip in natural language. These responses were scored automatically by comparison to a dataset of responses to the same clip by normally-sighted viewers (the crowd). In this case, the normative dataset consisted of responses to 200 clips by 60 subjects who were stratified by age (range 22 to 85y) and viewed the clips in the lab, for 2,400 responses, and by 99 crowdsourced participants (age range 20 to 66y) who viewed clips in their Web browser, for 4,000 responses. We compared different algorithms for computing these similarities and found that a simple count of the words in common had the best performance. It correctly matched 75% of the lab-sourced and 95% of crowdsourced responses to their corresponding clips. We validated the measure by showing that when the amount of information in the clip was degraded using defocus lenses, the shared word score decreased across the five predetermined visual-acuity levels, demonstrating a dose-response effect (N = 15). This approach, of scoring open-ended immediate free recall of the stimulus, is applicable not only to video, but also to other situations where a measure of the information that is successfully acquired is desirable. Information acquired will be affected by stimulus quality, sensory ability, and cognitive processes, so our metric can be used to assess each of these components when the others are controlled. PMID:24695546
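
    The "count of the words in common" idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the averaging over crowd responses, and the names `tokenize`, `shared_word_score`, `best_matching_clip`, and the toy `crowd_by_clip` data are all assumptions made for illustration.

```python
import re

def tokenize(text):
    """Lowercase a description and return its set of distinct words (assumed tokenizer)."""
    return set(re.findall(r"[a-z']+", text.lower()))

def shared_word_score(response, crowd_responses):
    """Mean number of distinct words the response shares with each crowd response."""
    if not crowd_responses:
        return 0.0
    words = tokenize(response)
    overlaps = [len(words & tokenize(other)) for other in crowd_responses]
    return sum(overlaps) / len(overlaps)

def best_matching_clip(response, crowd_by_clip):
    """Return the clip id whose crowd descriptions share the most words with the response."""
    return max(crowd_by_clip,
               key=lambda clip: shared_word_score(response, crowd_by_clip[clip]))

# Hypothetical normative data: clip id -> descriptions given by the crowd.
crowd_by_clip = {
    "clip_a": ["a man walks a dog in the park", "man and dog walking outside"],
    "clip_b": ["two women cook dinner in a kitchen", "women cooking food"],
}
response = "a man is walking his dog"

print(shared_word_score(response, crowd_by_clip["clip_a"]))  # 3.0
print(best_matching_clip(response, crowd_by_clip))           # clip_a
```

    A response counts as correctly matched when the clip with the highest score is the clip the subject actually viewed. Whether the study averaged, summed, or otherwise normalized the per-response counts is not stated in the abstract.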

  11. Using rule-based natural language processing to improve disease normalization in biomedical text

    PubMed Central

    Kang, Ning; Singh, Bharat; Afzal, Zubair; van Mulligen, Erik M; Kors, Jan A

    2013-01-01

    Background and objective: In order for computers to extract useful information from unstructured text, a concept normalization system is needed to link relevant concepts in a text to sources that contain further information about the concept. Popular concept normalization tools in the biomedical field are dictionary-based. In this study we investigate the usefulness of natural language processing (NLP) as an adjunct to dictionary-based concept normalization. Methods: We compared the performance of two biomedical concept normalization systems, MetaMap and Peregrine, on the Arizona Disease Corpus, with and without the use of a rule-based NLP module. Performance was assessed for exact and inexact boundary matching of the system annotations with those of the gold standard and for concept identifier matching. Results: Without the NLP module, MetaMap and Peregrine attained F-scores of 61.0% and 63.9%, respectively, for exact boundary matching, and 55.1% and 56.9% for concept identifier matching. With the aid of the NLP module, the F-scores of MetaMap and Peregrine improved to 73.3% and 78.0% for boundary matching, and to 66.2% and 69.8% for concept identifier matching. For inexact boundary matching, performances further increased to 85.5% and 85.4%, and to 73.6% and 73.3% for concept identifier matching. Conclusions: We have shown the added value of NLP for the recognition and normalization of diseases with MetaMap and Peregrine. The NLP module is general and can be applied in combination with any concept normalization system. Whether its use for concept types other than disease is equally advantageous remains to be investigated. PMID:23043124
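
    The difference between exact and inexact boundary matching can be made concrete with a small F-score sketch. The abstract does not define its matching criteria, so the sketch assumes that annotations are (start, end) character spans, that an exact match requires identical spans, and that an inexact match requires any overlap. The function names and the toy spans are hypothetical, and concept identifier matching is not modeled.

```python
def overlaps(a, b):
    """True if two (start, end) spans share at least one character (end exclusive)."""
    return a[0] < b[1] and b[0] < a[1]

def f_score(system, gold, exact=True):
    """F-score of system spans against gold spans under exact or overlap matching."""
    match = (lambda s, g: s == g) if exact else overlaps
    # System annotations that match some gold annotation (for precision).
    matched_system = sum(any(match(s, g) for g in gold) for s in system)
    # Gold annotations that are matched by some system annotation (for recall).
    matched_gold = sum(any(match(s, g) for s in system) for g in gold)
    precision = matched_system / len(system) if system else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy data: the second system span has a shifted left boundary.
gold = [(0, 8), (20, 35), (50, 58)]
system = [(0, 8), (22, 35), (70, 75)]

print(round(f_score(system, gold, exact=True), 3))   # 0.333
print(round(f_score(system, gold, exact=False), 3))  # 0.667
```

    Under these assumptions the inexact score can never be lower than the exact score, which is consistent with the direction of the differences reported in the abstract.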

  12. Combination of finite state automata and neural network for spoken language understanding

    Microsoft Academic Search

    Chai Wutiwiwatchai; Sadaoki Furui

    2003-01-01

    This paper proposes a novel approach for spoken language understanding based on a combination of weighted finite state automata and an artificial neural network. The former machine acts as a robust parser, which extracts some semantic information called subframes from an input sentence, then the latter machine interprets a concept of the sentence by considering the existence of subframes and

  13. Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 127-135, Honolulu, October 2008. © 2008 Association for Computational Linguistics

    E-print Network

    Wiebe, Janyce M.

    Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 127- glish, to automatically generate resources for subjectivity analysis for a new target language in the target language? Second, assuming the availability of a tool for automatic subjectivity analysis

  14. Surmounting the Tower of Babel: Monolingual and bilingual 2-year-olds' understanding of the nature of foreign language words.

    PubMed

    Byers-Heinlein, Krista; Chen, Ke Heng; Xu, Fei

    2014-03-01

    Languages function as independent and distinct conventional systems, and so each language uses different words to label the same objects. This study investigated whether 2-year-old children recognize that speakers of their native language and speakers of a foreign language do not share the same knowledge. Two groups of children unfamiliar with Mandarin were tested: monolingual English-learning children (n=24) and bilingual children learning English and another language (n=24). An English speaker taught children the novel label fep. On English mutual exclusivity trials, the speaker asked for the referent of a novel label (wug) in the presence of the fep and a novel object. Both monolingual and bilingual children disambiguated the reference of the novel word using a mutual exclusivity strategy, choosing the novel object rather than the fep. On similar trials with a Mandarin speaker, children were asked to find the referent of a novel Mandarin label kuò. Monolinguals again chose the novel object rather than the object with the English label fep, even though the Mandarin speaker had no access to conventional English words. Bilinguals did not respond systematically to the Mandarin speaker, suggesting that they had enhanced understanding of the Mandarin speaker's ignorance of English words. The results indicate that monolingual children initially expect words to be conventionally shared across all speakers, native and foreign. Early bilingual experience facilitates children's discovery of the nature of foreign language words. PMID:24268905

  15. Natural and constrained language production as a function of age and cognitive abilities

    Microsoft Academic Search

    Cristina D. Rabaglia; Timothy A. Salthouse

    2010-01-01

    Although it is often claimed that verbal abilities are relatively well maintained across the adult lifespan, certain aspects of language production have been found to exhibit cross-sectional differences and longitudinal declines. In the current project age-related differences in controlled and naturalistic elicited language production tasks were examined within the context of a reference battery of cognitive abilities in a moderately

  16. Natural and constrained language production as a function of age and cognitive abilities

    Microsoft Academic Search

    Cristina D. Rabaglia; Timothy A. Salthouse

    2011-01-01

    Although it is often claimed that verbal abilities are relatively well maintained across the adult lifespan, certain aspects of language production have been found to exhibit cross-sectional differences and longitudinal declines. In the current project age-related differences in controlled and naturalistic elicited language production tasks were examined within the context of a reference battery of cognitive abilities in a moderately

  17. The Chaotic Nature of Speech Rhythm: Hints for Fluency in the Language Acquisition Process

    Microsoft Academic Search

    B. Zellner; E. Keller

    2000-01-01

    The acquisition of speech rhythm and speech fluency are important components of a language acquisition process. This article reviews issues of fluency and speech rhythm, based on empirical, experimental and mathematical evidence. An integrated multilevel model of temporal control for speech is motivated. Additional insights are obtained through a comparison with chaotic systems. Introduction: An important aspect of any language acquisition process concerns speech fluency. Speech fluency...

  18. Proceedings of the 15th Conference on Computational Natural Language Learning: Shared Task, pages 127-130, Portland, Oregon, 23-24 June 2011. © 2011 Association for Computational Linguistics

    E-print Network

    Proceedings of the 15th Conference on Computational Natural Language Learning: Shared Task, pages will be identified. As the core of natural language processing, coreference resolution is significant to message in the windows of three words before and after the target word: { ,..., }. (2) Capitalization: Determine whether

  19. A natural-language approach to biomimetic design Biomimetics for Innovation and Design Laboratory, Department of Mechanical and Industrial Engineering, University of Toronto,

    E-print Network

    Shu, Lily H.

    A natural-language approach to biomimetic design L.H. SHU Biomimetics for Innovation and Design for engineering design. Keywords: Analogical Reasoning; Biologically Inspired Design; Biomimetic Design to biomimetic design. First highlighted are challenges in natural-language processing and analogical

  20. Toward a Theory-Based Natural Language Capability in Robots and Other Embodied Agents: Evaluating Hausser's SLIM Theory and Database Semantics

    ERIC Educational Resources Information Center

    Burk, Robin K.

    2010-01-01

    Computational natural language understanding and generation have been a goal of artificial intelligence since McCarthy, Minsky, Rochester and Shannon first proposed to spend the summer of 1956 studying this and related problems. Although statistical approaches dominate current natural language applications, two current research trends bring…