Science.gov

Sample records for natural language parsers

  1. Policy-Based Management Natural Language Parser

    NASA Technical Reports Server (NTRS)

    James, Mark

    2009-01-01

    The Policy-Based Management Natural Language Parser (PBEM) is a rules-based approach to enterprise management that can be used to automate certain management tasks. This parser simplifies the management of a given endeavor by establishing policies to deal with situations that are likely to occur. Policies are operating rules that can be referred to as a means of maintaining order, security, consistency, or other ways of successfully furthering a goal or mission. PBEM provides a way of managing configuration of network elements, applications, and processes via a set of high-level rules or business policies rather than managing individual elements, thus switching the control to a higher level. This software allows unique management rules (or commands) to be specified and applied to a cross-section of the Global Information Grid (GIG). This software embodies a parser that is capable of recognizing and understanding conversational English. Because all possible dialect variants cannot be anticipated, a unique capability was developed that parses based on conversational intent rather than the exact way the words are used. This software can increase productivity by enabling a user to converse with the system in conversational English to define network policies. PBEM can be used in both manned and unmanned science-gathering programs. Because policy statements can be domain-independent, this software can be applied equally to a wide variety of applications.

  2. Extending a natural language parser with UMLS knowledge.

    PubMed Central

    McCray, A. T.

    1991-01-01

    Over the past several years our research efforts have been directed toward the identification of natural language processing methods and techniques for improving access to biomedical information stored in computerized form. To provide a testing ground for some of these ideas we have undertaken the development of SPECIALIST, a prototype system for parsing and accessing biomedical text. The system includes linguistic and biomedical knowledge. Linguistic knowledge involves rules and facts about the grammar of the language. Biomedical knowledge involves rules and facts about the domain of biomedicine. The UMLS knowledge sources, Meta-1 and the Semantic Network, as well as the UMLS test collection, have recently contributed to the development of the SPECIALIST system. PMID:1807586

  3. Recognizing noun phrases in medical discharge summaries: an evaluation of two natural language parsers.

    PubMed Central

    Spackman, K. A.; Hersh, W. R.

    1996-01-01

    We evaluated the ability of two natural language parsers, CLARIT and the Xerox Tagger, to identify simple noun phrases in medical discharge summaries. In twenty randomly selected discharge summaries, there were 1909 unique simple noun phrases. CLARIT and the Xerox Tagger exactly identified 77.0% and 68.7% of the phrases, respectively, and partially identified 85.7% and 80.8% of the phrases. Neither system had been specially modified or tuned to the medical domain. These results suggest that it is possible to apply existing natural language processing (NLP) techniques to large bodies of medical text, in order to empirically identify the terminology used in medicine. Virtually all the noun phrases could be regarded as having special medical connotation and would be candidates for entry into a controlled medical vocabulary. PMID:8947647

  4. Flexible natural language parser based on a two-level representation of syntax

    SciTech Connect

    Lesmo, L.; Torasso, P.

    1983-01-01

    In this paper the authors present a parser that makes explicit the interconnections between syntax and semantics, analyzes sentences in a quasi-deterministic fashion and, in many cases, identifies the roles of the various constituents even if the sentence is ill-formed. The main feature of the approach on which the parser is based is a two-level representation of syntactic knowledge: a first set of rules emits hypotheses about the constituents of the sentence and their functional roles, and a second set of rules verifies whether a hypothesis satisfies the well-formedness constraints on sentences. However, the application of the second set of rules is delayed until the semantic knowledge confirms the acceptability of the hypothesis. If the semantics rejects it, a new hypothesis is obtained by applying a simple and relatively inexpensive natural modification; a set of these modifications is predefined, and only when none of them is applicable is a real backup performed. In most cases this situation corresponds to one where people would normally be led down a garden path. 19 references.

  5. Fence - An Efficient Parser with Ambiguity Support for Model-Driven Language Specification

    E-print Network

    Quesada, Luis; Cortijo, Francisco J

    2011-01-01

    Model-based language specification has applications in the implementation of language processors, the design of domain-specific languages, model-driven software development, data integration, text mining, natural language processing, and corpus-based induction of models. Model-based language specification decouples language design from language processing and, unlike traditional grammar-driven approaches, which constrain language designers to specific kinds of grammars, it needs general parser generators able to deal with ambiguities. In this paper, we propose Fence, an efficient bottom-up parsing algorithm with lexical and syntactic ambiguity support that enables the use of model-based language specification in practice.

  6. The Accelerator Markup Language and the Universal Accelerator Parser

    SciTech Connect

    Sagan, D.; Forster, M.; Bates, D.A.; Wolski, A.; Schmidt, F.; Walker, N.J.; Larrieu, T.; Roblin, Y.; Pelaia, T.; Tenenbaum, P.; Woodley, M.; Reiche, S.

    2006-10-06

    A major obstacle to collaboration on accelerator projects has been the sharing of lattice description files between modeling codes. To address this problem, a lattice description format called Accelerator Markup Language (AML) has been created. AML is based upon the standard eXtensible Markup Language (XML) format; this provides the flexibility for AML to be easily extended to satisfy changing requirements. In conjunction with AML, a software library, called the Universal Accelerator Parser (UAP), is being developed to speed the integration of AML into any program. The UAP is structured to make it relatively straightforward (by giving appropriate specifications) to read and write lattice files in any format. This will allow programs that use the UAP code to read a variety of different file formats. Additionally, this will greatly simplify conversion of files from one format to another. Currently, besides AML, the UAP supports the MAD lattice format.

  7. Deep Natural Language Processing for Italian Sign Language Translation

    E-print Network

    Mazzei, Alessandro

    This paper presents the architecture of a translator from written Italian into Italian ... a dependency parser for Italian, an ontology-based semantic interpreter, a generator based on expert ...

  8. Natural-Language Parser for PBEM

    NASA Technical Reports Server (NTRS)

    James, Mark

    2010-01-01

    A computer program called "Hunter" accepts, as input, a colloquial-English description of a set of policy-based-management rules, and parses that description into a form useable by policy-based enterprise management (PBEM) software. PBEM is a rules-based approach suitable for automating some management tasks. PBEM simplifies the management of a given enterprise through establishment of policies addressing situations that are likely to occur. Hunter was developed to have a unique capability to extract the intended meaning instead of focusing on parsing the exact ways in which individual words are used.

  9. Speed up of XML parsers with PHP language implementation

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2012-11-01

    In this paper, the authors introduce PHP5's XML implementation and show how to read, parse, and write a short and uncomplicated XML file using SimpleXML in a PHP environment. The ways in which the PHP5 language and the XML standard can work together are described. The details of the parsing process with SimpleXML are also clarified. A practical project, PHP-XML-MySQL, presents the advantages of XML implementation in PHP modules. This approach allows comparatively simple search of XML hierarchical data by means of PHP software tools. The proposed project includes a database, which can be extended with new data and new XML parsing functions.
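
    The read, search, and extend cycle described above can be sketched outside PHP as well. The following is a rough analogue using Python's standard `xml.etree.ElementTree` module, not the authors' SimpleXML code; the sample document and its element names (`library`, `book`, `title`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
SOURCE = "<library><book id='1'><title>Parsing</title></book></library>"

# Parse the XML text into an element tree.
root = ET.fromstring(SOURCE)

# Search the hierarchy: collect the text of every <title> element.
titles = [t.text for t in root.iter("title")]

# Extend the document with new data.
new_book = ET.SubElement(root, "book", id="2")
ET.SubElement(new_book, "title").text = "Grammars"

# Serialize the modified tree back to a string.
output = ET.tostring(root, encoding="unicode")
```

    Here `titles` holds the titles found before the document is extended, and `output` contains both `book` elements.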

  10. A natural language interface to databases

    NASA Technical Reports Server (NTRS)

    Ford, D. R.

    1988-01-01

    The development of a Natural Language Interface which is semantic-based and uses Conceptual Dependency representation is presented. The system was developed using Lisp and currently runs on a Symbolics Lisp machine. A key point is that the parser handles morphological analysis, which extends its ability to understand more words.

  11. Recognizing Bangla Grammar using Predictive Parser

    E-print Network

    Hasan, K M Azharul; Mondal, Amit; Saha, Amit

    2012-01-01

    We describe a context-free grammar (CFG) for the Bangla language and propose a Bangla parser based on that grammar. Our approach is general enough to apply to Bangla sentences, and the method is well accepted for parsing the language of a grammar. The proposed parser is a predictive parser, and we construct the parse table for recognizing Bangla grammar. Using the parse table, we recognize syntactic mistakes in Bangla sentences when there is no entry for a terminal in the parse table. If a natural language can be successfully parsed, then grammar checking for that language becomes possible. The proposed scheme is based on the top-down parsing method, and we have avoided the left recursion of the CFG using the idea of left factoring.
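
    The table-driven recognition described above can be illustrated with a minimal sketch. The toy subject-object-verb grammar, the word categories (`adj`, `noun`, `verb`), and the function name `recognize` are assumptions made for illustration; this is not the authors' Bangla grammar or parse table.

```python
# Toy LL(1) parse table: (nonterminal, lookahead) -> right-hand side.
# Hypothetical grammar over word categories:
#   S -> NP VP ; NP -> adj NP | noun ; VP -> NP verb | verb
TABLE = {
    ("S", "adj"): ["NP", "VP"],
    ("S", "noun"): ["NP", "VP"],
    ("NP", "adj"): ["adj", "NP"],
    ("NP", "noun"): ["noun"],
    ("VP", "adj"): ["NP", "verb"],
    ("VP", "noun"): ["NP", "verb"],
    ("VP", "verb"): ["verb"],
}
NONTERMINALS = {"S", "NP", "VP"}
END = "$"  # end-of-input marker


def recognize(tags):
    """Return True if the sequence of word categories is derivable from S."""
    tokens = list(tags) + [END]
    stack = [END, "S"]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                return False  # empty table entry: syntactic mistake
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1  # terminal (or end marker) matched; advance the input
        else:
            return False
    return pos == len(tokens)
```

    With this table, `recognize(["noun", "noun", "verb"])` succeeds, while `recognize(["verb", "noun"])` fails at the first lookup because the table has no entry for `S` with lookahead `verb`.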

  12. Conjunction, Ellipsis, and Other Discontinuous Constituents in the Constituent Object Parser.

    ERIC Educational Resources Information Center

    Metzler, Douglas P.; And Others

    1990-01-01

    Describes the Constituent Object Parser (COP), a domain independent syntactic parser developed for use in information retrieval and similar applications. The syntactic structure of natural language entities is discussed, and the mechanisms by which COP handles the problems of conjunctions, ellipsis, and discontinuous constituents are explained.…

  13. NEWCAT: Parsing natural language using left-associative grammar

    SciTech Connect

    Hausser, R.

    1986-01-01

    This book shows that constituent structure analysis induces an irregular order of linear composition which is the direct cause of extreme computational inefficiency. It proposes an alternative left-associative grammar which operates with a regular order of linear compositions. Left-associative grammar is based on building up and cancelling valencies. Left-associative parsers differ from all other systems in that the history of the parse doubles as the linguistic analysis. Left-associative grammar is illustrated with two left-associative natural language parsers: one for German and one for English.

  14. Natural Language Processing.

    ERIC Educational Resources Information Center

    Chowdhury, Gobinda G.

    2003-01-01

    Discusses issues related to natural language processing, including theoretical developments; natural language understanding; tools and techniques; natural language text processing systems; abstracting; information extraction; information retrieval; interfaces; software; Internet, Web, and digital library applications; machine translation for…

  15. Toward a theory of distributed word expert natural language parsing

    NASA Technical Reports Server (NTRS)

    Rieger, C.; Small, S.

    1981-01-01

    An approach to natural language meaning-based parsing in which the unit of linguistic knowledge is the word rather than the rewrite rule is described. In the word expert parser, knowledge about language is distributed across a population of procedural experts, each representing a word of the language, and each an expert at diagnosing that word's intended usage in context. The parser is structured around a coroutine control environment in which the generator-like word experts ask questions and exchange information in coming to collective agreement on sentence meaning. The word expert theory is advanced as a better cognitive model of human language expertise than the traditional rule-based approach. The technical discussion is organized around examples taken from the prototype LISP system which implements parts of the theory.

  16. FORMAL SPECIFICATION OF NATURAL LANGUAGE SYNTAX The two-level grammar is investigated as a notation for giving formal

    E-print Network

    FORMAL SPECIFICATION OF NATURAL LANGUAGE SYNTAX. ABSTRACT: The two-level grammar is investigated ... noun modification by relative clauses, is formalized using a two-level grammar. The principal advantages of two-level grammar ... a two-level grammar for natural language syntax, we can derive a parser automatically without writing ...

  17. Fifth Conference Natural Language

    E-print Network

    Fifth Conference on Applied Natural Language Processing Association for Computational Linguistics natural language analyzers and generators. Our tool kit is slowly growing -- adding, in particular didn't understand about natural language. But we are also learning to make better use of the tools we

  18. Automatic natural language parsing

    SciTech Connect

    Sparck Jones, K.; Wilks, Y.

    1985-01-01

    This collection of papers on automatic natural language parsing examines research and development in language processing over the past decade. It focuses on current trends toward a phrase structure grammar and deterministic parsing.

  19. LRSYS. PASCAL LR(1) Parser Generator System

    SciTech Connect

    O'Hair, K.

    1985-04-01

    LRSYS is a complete LR(1) parser generator system written entirely in a portable subset of Pascal. The system, LRSYS, includes a grammar analyzer program (LR) which reads a context-free (BNF) grammar as input and produces LR(1) parsing tables as output, a lexical analyzer generator (LEX) which reads regular expressions created by the REG process as input and produces lexical tables as output, and various parser skeletons that get merged with the tables to produce complete parsers (SMAKE). Current parser skeletons include Pascal, FORTRAN 77, and C. Other language skeletons can easily be added to the system. LRSYS is based on the LR program.

  20. PASCAL LR(1) Parser Generator System

    Energy Science and Technology Software Center (ESTSC)

    1988-05-04

    LRSYS is a complete LR(1) parser generator system written entirely in a portable subset of Pascal. The system, LRSYS, includes a grammar analyzer program (LR) which reads a context-free (BNF) grammar as input and produces LR(1) parsing tables as output, a lexical analyzer generator (LEX) which reads regular expressions created by the REG process as input and produces lexical tables as output, and various parser skeletons that get merged with the tables to produce complete parsers (SMAKE). Current parser skeletons include Pascal, FORTRAN 77, and C. Other language skeletons can easily be added to the system. LRSYS is based on the LR program.

  1. Discriminative Reranking for Natural Language Parsing

    E-print Network

    Collins, Michael

    Michael Collins (MCOLLINS@RESEARCH.ATT.COM), AT ... applied to reranking output of the parser of Collins (1999) on the Wall Street Journal corpus, with a 13 ... that of Collins (1999), achieved 88.1/88.3% recall/precision on this task. The new model achieves 89.6/89.9% recall ...

  2. A Natural Language Processing Infrastructure for Turkish

    E-print Network

    A Natural Language Processing Infrastructure for Turkish. A. C. Cem SAY, Department of Computer ... of new applications involving the processing of Turkish. The platform incorporates a lexicon, a morphological analyzer/generator, and a DCG parser/generator that translates Turkish sentences to predicate ...

  3. Natural Language Processing in aid of FlyBase curators

    E-print Network

    Karamanis, Nikiforos; Seal, Ruth; Lewin, Ian; McQuilton, Peter; Vlachos, Andreas; Gasperin, Caroline; Drysdale, Rachel; Briscoe, Ted

    2008-04-14

    ... be performed either by using a Hidden Markov Model (HMM) or Conditional Random Fields (CRFs) as discussed by Vlachos [18]. Then, the RASP parser [19] is employed to identify... curate in a similar way as FlyBase, this study is likely to have far-reaching implications. Availability and Requirements: Project name: FlySlip; Project website: http://www.wiki.cl.cam.ac.uk/rowiki/NaturalLanguage/FlySlip; Programming language: Java... (BMC Bioinformatics 2008, 9:193, http://www.biomedcentral.com/1471-2105/9/193)

  4. Errors and Intelligence in Computer-Assisted Language Learning: Parsers and Pedagogues. Routledge Studies in Computer Assisted Language Learning

    ERIC Educational Resources Information Center

    Heift, Trude; Schulze, Mathias

    2012-01-01

    This book provides the first comprehensive overview of theoretical issues, historical developments and current trends in ICALL (Intelligent Computer-Assisted Language Learning). It assumes a basic familiarity with Second Language Acquisition (SLA) theory and teaching, CALL and linguistics. It is of interest to upper undergraduate and/or graduate…

  5. Natural Language Sourcebook.

    ERIC Educational Resources Information Center

    Baker, Eva; And Others

    This sourcebook is intended to provide researchers and users of natural language computer systems with a classification scheme to describe language-related problems associated with such systems. Methods from the disciplines of artificial intelligence (AI), education, linguistics, psychology, anthropology, and psychometrics were applied in an…

  6. Natural language parsing in a hybrid connectionist-symbolic architecture

    NASA Astrophysics Data System (ADS)

    Mueller, Adrian; Zell, Andreas

    1991-03-01

    Most connectionist parsers either cannot guarantee the correctness of their derivations or have to simulate a serial flow of control. In the first case, users have to restrict the tasks of the parser (e.g. parse less complex or shorter sentences) or they need to trust the soundness of the result. In the second case, the resulting network has lost most of its attractiveness because seriality needs to be hard-coded into the structure of the net. We here present a hybrid symbolic-connectionist parser, which was designed to fulfill the following goals: (1) parsing of sentences without length restriction, (2) soundness and completeness for any context-free grammar, and (3) learning the applicability of parsing rules with a neural network. Our hybrid architecture consists of a serial parsing algorithm and a trainable net. BrainC (Backtracking and Backpropagation in C) combines the well-known shift-reduce parsing technique, extended with backtracking, with a backpropagation network that learns and represents the typical properties of the trained natural language grammars. The system has been implemented as a subsystem of the Rochester Connectionist Simulator (RCS) on SUN workstations and was tested with several grammars for English and German. We discuss how BrainC reached its design goals and what results we observed.
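
    The serial half of such a hybrid, shift-reduce parsing with backtracking, can be sketched as follows. This is a minimal illustration under stated assumptions, not the BrainC implementation: the toy grammar and the function name `parse` are hypothetical, and the candidate rules are simply tried in list order, where a trained network would instead rank them by learned applicability.

```python
# Hypothetical toy grammar: each rule is (left-hand side, right-hand side).
# It has no empty rules and no unit cycles, so the search always terminates.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "noun")),
    ("NP", ("noun",)),
    ("VP", ("verb", "NP")),
    ("VP", ("verb",)),
]


def parse(tokens, stack=()):
    """Backtracking shift-reduce recognizer: True if tokens reduce to S."""
    if not tokens and stack == ("S",):
        return True
    # Try every reduction whose right-hand side matches the top of the stack.
    # A learned model could reorder this loop; here the order is fixed.
    for lhs, rhs in RULES:
        n = len(rhs)
        if len(stack) >= n and stack[len(stack) - n:] == rhs:
            if parse(tokens, stack[:len(stack) - n] + (lhs,)):
                return True  # this reduction led to a full parse
    # No reduction succeeded: backtrack to this point and shift instead.
    if tokens:
        return parse(tokens[1:], stack + (tokens[0],))
    return False
```

    For the input `("det", "noun", "verb", "noun")` the recognizer first reduces `verb` to `VP` too early, fails, backtracks, and then succeeds by shifting the final `noun` before reducing. Pass the word categories as a tuple so that stack slices compare equal to the rule right-hand sides.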

  7. Natural Language Spatial Reasoning

    E-print Network

    Tellex, Stefanie

    Natural Language and Spatial Reasoning Stefanie Tellex MIT Media Lab Ph.D. Thesis Defense Where, then, do, any, like, my, now, over, such, our, man, me, even, most, made, after, also, well, did, many, water, until, always, away, public, something, fact, less, through, far, put, head, think, called, set

  8. A distributed intelligent information system with natural language input for ad hoc knowledge discovery in databases

    SciTech Connect

    Fass, D.; Hall, G.; Laurens, O.; McFetridge, P.; Popowich, F.; Rueden, M. von

    1996-11-01

    A distributed information system is described which features a graphic user interface incorporating natural language input and which provides ad hoc knowledge discovery in relational databases. The system is comprised of multiple processes which communicate with each other over a network. The knowledge discovery process involves extracting generalizations from data using background knowledge in the form of concept hierarchies and a learning procedure based upon an attribute-oriented induction technique. The natural language understanding process is a parser based on Head-Driven Phrase Structure Grammar (HPSG), a modern lexicon-based grammar formalism better equipped than older rule-based approaches for handling the often idiosyncratic behavior of words. To generate semantic interpretations, the parser makes use of a process which orders logical access paths in unnormalized databases based on the strength of their dependency structures and on their efficiency of execution.

  9. Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors

    E-print Network

    Kuhn, Tobias

    2011-01-01

    Existing grammar frameworks do not work out particularly well for controlled natural languages (CNL), especially if they are to be used in predictive editors. I introduce in this paper a new grammar notation, called Codeco, which is designed specifically for CNLs and predictive editors. Two different parsers have been implemented and a large subset of Attempto Controlled English (ACE) has been represented in Codeco. The results show that Codeco is practical, adequate and efficient.

  10. LRSYS. PASCAL LR(1) Parser Generator System

    SciTech Connect

    O'Hair, K.

    1985-04-01

    LRSYS is a complete LR(1) parser generator system written entirely in a portable subset of Pascal. The system, LRSYS, includes a grammar analyzer program (LR) which reads a context-free (BNF) grammar as input and produces LR(1) parsing tables as output, a lexical analyzer generator (LEX) which reads regular expressions created by the REG process as input and produces lexical tables as output, and various parser skeletons that get merged with the tables to produce complete parsers (SMAKE). Current parser skeletons include Pascal, FORTRAN 77, and C. In addition, the Cray1 version contains LRLTRAN and CFT-FORTRAN 77 skeletons. Other language skeletons can easily be added to the system. LRSYS is based on the LR program.

  11. LRSYS. PASCAL LR(1) Parser Generator System

    SciTech Connect

    O'Hair, K.

    1985-04-01

    LRSYS is a complete LR(1) parser generator system written entirely in a portable subset of Pascal. The system, LRSYS, includes a grammar analyzer program (LR) which reads a context-free (BNF) grammar as input and produces LR(1) parsing tables as output, a lexical analyzer generator (LEX) which reads regular expressions created by the REG process as input and produces lexical tables as output, and various parser skeletons that get merged with the tables to produce complete parsers (SMAKE). Current parser skeletons include Pascal, FORTRAN 77, and C. In addition, the DEC VAX11 version contains LRLTRAN and CFT-FORTRAN 77 skeletons. Other language skeletons can easily be added to the system. LRSYS is based on the LR program.

  12. Introduction to natural language processing

    SciTech Connect

    Harris, M.D.

    1984-01-01

    This book presents an overview of the production and utilization of natural language by computers, as differentiated from programming language. It considers both the practical and theoretical problems of natural language input-output. It presents the computational aspects of the subject with exceptional clarity through the use of concrete programs written in Pascal. It outlines methods for analysis, synthesis, and transformation of language. The book treats syntax and grammar (structure), semantics (inherent meaning), and representation of knowledge (storage and access).

  13. Readings in natural language processing

    SciTech Connect

    Grosz, B.J.; Jones, K.S.; Webber, B.L.

    1986-01-01

    The book presents papers on natural language processing, focusing on the central issues of representation, reasoning, and recognition. The introduction discusses theoretical issues, historical developments, and current problems and approaches. The book presents work in syntactic models (parsing and grammars), semantic interpretation, discourse interpretation, language action and intentions, language generation, and systems.

  14. Designing and Implementing a Syntactic Parser.

    ERIC Educational Resources Information Center

    Sanders, Alton; Sanders, Ruth

    1987-01-01

    Describes the development in progress of a syntactic parser of German called "Syncheck," which uses the programming language "Prolog." The grammar is written in a formalism called "Definite Clause Grammar." The purpose of "Syncheck" is to provide advice on grammatical correctness to intermediate and advanced college students of German. (Author/LMO)

  15. Left-corner unification-based natural language processing

    SciTech Connect

    Lytinen, S.L.; Tomuro, N.

    1996-12-31

    In this paper, we present an efficient algorithm for parsing natural language using unification grammars. The algorithm is an extension of left-corner parsing, a bottom-up algorithm which utilizes top-down expectations. The extension exploits unification grammar's uniform representation of syntactic, semantic, and domain knowledge, by incorporating all types of grammatical knowledge into parser expectations. In particular, we extend the notion of the reachability table, which provides information as to whether or not a top-down expectation can be realized by a potential subconstituent, by including all types of grammatical information in table entries, rather than just phrase structure information. While our algorithm's worst-case computational complexity is no better than that of many other algorithms, we present empirical testing in which average-case linear time performance is achieved. Our testing indicates this to be much improved average-case performance over previous left-corner techniques.
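
    The plain phrase-structure reachability table that the abstract takes as its starting point can be sketched as below. This shows only the classic version over category symbols, not the authors' extension to full unification-grammar information; the toy grammar and the names `reachability` and `can_start` are hypothetical, and rules are assumed to have non-empty right-hand sides.

```python
# Hypothetical toy grammar: (left-hand side, right-hand side) pairs.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "noun")),
    ("NP", ("noun",)),
    ("VP", ("verb", "NP")),
    ("VP", ("verb",)),
]


def reachability(rules):
    """Map each category to the set of symbols that can be its left corner.

    Computes the reflexive-transitive closure of the relation
    "X is the first symbol of some rule for A".
    """
    reach = {}
    for lhs, rhs in rules:
        reach.setdefault(lhs, {lhs}).add(rhs[0])
    changed = True
    while changed:
        changed = False
        for lhs in reach:
            for sym in list(reach[lhs]):
                new = reach.get(sym, set()) - reach[lhs]
                if new:
                    reach[lhs] |= new
                    changed = True
    return reach


def can_start(reach, goal, symbol):
    """True if a constituent `symbol` can begin the expected category `goal`."""
    return symbol in reach.get(goal, {goal})
```

    A left-corner parser consults such a table before building a constituent bottom-up: with this grammar, a `det` can begin an expected `S`, whereas a `verb` cannot, so that hypothesis is pruned immediately.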

  16. Advances in natural language processing.

    PubMed

    Hirschberg, Julia; Manning, Christopher D

    2015-07-17

    Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area. PMID:26185244

  17. Building representations from natural language

    E-print Network

    Seifter, Mark J

    2007-01-01

    In this thesis, I describe a system I built that produces instantiated representations from descriptions embedded in natural language. For example, in the sentence 'The girl walked to the table', my system produces a ...

  18. The Nature of Natural Languages.

    ERIC Educational Resources Information Center

    Pierce, Joe E.

    A variety of types of evidence are examined to help determine the true nature of "deep structure" and what, if any, implications this has for linguistic theory as well as culture theory generally. The evidence accumulated over the past century on the nature of phonetic and phonemic systems is briefly discussed, and the following areas of analysis…

  19. The parser generator as a general purpose tool

    NASA Technical Reports Server (NTRS)

    Noonan, R. E.; Collins, W. R.

    1985-01-01

    The parser generator has proven to be an extremely useful, general purpose tool. It can be used effectively by programmers having only a knowledge of grammars and no training at all in the theory of formal parsing. Some of the application areas for which a table-driven parser can be used include interactive query languages, menu systems, translators, and programming support tools. Each of these is illustrated by an example grammar.

  20. Prolog implementation of lexical functional grammar as a base for a natural language processing system

    SciTech Connect

    Frey, W.; Reyle, U.

    1983-01-01

    The authors present a system which constructs a database out of a narrative natural language text. Firstly, they give a detailed description of the PROLOG implementation of the parser, which is based on the theory of lexical functional grammar (LFG). They show that PROLOG provides an efficient tool for LFG implementation. Secondly, they postulate some requirements a semantic representation has to fulfil in order to be able to analyse whole texts. They show how Kamp's theory meets these requirements by analysing sample discourses involving anaphoric NPs. 4 references.

  1. Wild Words: Nature, Language and Outdoor Education.

    ERIC Educational Resources Information Center

    Meisner, Mark

    1993-01-01

    Discusses ways that language misrepresents nature, pointing out that frequently used metaphors and problematic language usage provide limited conceptual and emotional understanding of the natural world and contribute to a degraded view of nature. Discusses strategies for changing language as the first step in changing attitudes toward nature. (LP)

  2. Lagrangian relaxation for natural language decoding

    E-print Network

    Rush, Alexander M. (Alexander Matthew)

    2014-01-01

    The major success story of natural language processing over the last decade has been the development of high-accuracy statistical methods for a wide range of language applications. The availability of large textual data ...

  3. Comparing Italian parsers on a common treebank: the Evalita experience

    E-print Network

    Mazzei, Alessandro

    Comparing Italian parsers on a common treebank: the Evalita experience C. Bosco*, A. Mazzei*, V Parsing Task has been the first contest among parsing systems for Italian. It is the first attempt in treebank-driven parsing for Italian and other Romance languages. It focuses on datasets, parsing paradigms

  4. A lex-based mad parser and its applications

    SciTech Connect

    Oleg Krivosheev et al.

    2001-07-03

    An embeddable and portable Lex-based MAD language parser has been developed. The parser consists of a front-end which reads a MAD file and keeps beam elements, beam line data and algebraic expressions in tree-like structures, and a back-end, which processes the front-end data to generate an input file or data structures compatible with user applications. Three working programs are described, namely, a MAD to C++ converter, a dynamic C++ object factory and a MAD-MARS beam line builder. Design and implementation issues are discussed.

  5. Natural Artificial Languages: Low-Level Processes.

    ERIC Educational Resources Information Center

    Perlman, Gary

    This paper explores languages for communicating precise ideas within limited domains, which include mathematical notation and general purpose and high level computer programming languages. Low-level properties of such natural artificial languages are discussed, with emphasis on those in which names are chosen for concepts and symbols are chosen…

  6. Multilingual environment and natural acquisition of language

    NASA Astrophysics Data System (ADS)

    Takano, Shunichi; Nakamura, Shigeru

    2000-06-01

    Language and humans are not outside of nature. Not only babies but also adults can acquire a new language naturally if they have a natural multilingual environment around them. This may be possible because every human has the ability to grasp a language as a whole and, at the same time, language has an order that is the easiest for humans to acquire. The process of this natural acquisition and the result of an investigation into the order of Japanese vowels are introduced.

  7. Natural language processing: an introduction

    PubMed Central

    Ohno-Machado, Lucila; Chapman, Wendy W

    2011-01-01

    Objectives: To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. Target audience: This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. Scope: We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field. PMID:21846786

  8. Language Engineering : The Real Bottle Neck of Natural Language Processing

    E-print Network

    PANEL Language Engineering : The Real Bottle Neck of Natural Language Processing Panel Organizer, Makoto Nagao Department of Electrical Engineering Kyoto University, Sakyo, Kyoto, Japan The bottle neck of simplicity. For example: punctua- tion is very important for processing real text, but LTs have nothing

  9. Natural language and spatial reasoning

    E-print Network

    Tellex, Stefanie, 1980-

    2010-01-01

    Making systems that understand language has long been a dream of artificial intelligence. This thesis develops a model for understanding language about space and movement in realistic situations. The system understands ...

  10. Formal Language Theory for Natural Language Processing Shuly Wintner

    E-print Network

    linguistics students typically come from two different disciplines: Linguistics or Computer Science with background in linguistics are interested in computational linguistics but are overwhelmed Linguistics. Natural Language Processing and Computational Linguistics, Philadelphia, Proceedings

  11. HEAD-DRIVEN STATISTICAL MODELS FOR NATURAL LANGUAGE PARSING

    E-print Network

    Collins, Michael

    HEAD-DRIVEN STATISTICAL MODELS FOR NATURAL LANGUAGE PARSING Michael Collins A DISSERTATION of Dissertation Professor Jean Gallier Graduate Group Chair COPYRIGHT Michael Collins 1999 HEAD-DRIVEN STATISTICAL MODELS FOR NATURAL LANGUAGE PARSING Michael Collins Supervisor: Professor Mitch

  12. Natural Language Processing on the Web

    E-print Network

    Montréal, Université de

    Natural Language Processing on the Web Guy Lapalme RALI-DIRO, Université de Montréal http://www.iro.umontreal.ca/~lapalme Overview · What is Natural Language Processing (NLP) · NLP for the Web · The Web for NLP 2 recognition 5 http://rali.iro.umontreal.ca NLP for the syntactic Web search engines · NLP saved

  13. How Language Processing Constrains (Computational) Natural Language Processing: A Cognitive Perspective

    E-print Network

    How Language Processing Constrains (Computational) Natural Language Processing: A Cognitive at sketching out in bare outline a new model/framework of language processing with its implications for natural language processing. Research in theoretical linguistics, computational linguistics and mathematical

  14. Natural language interface for command and control

    NASA Technical Reports Server (NTRS)

    Shuler, Robert L., Jr.

    1986-01-01

    A working prototype of a flexible 'natural language' interface for command and control situations is presented. This prototype is analyzed from two standpoints. First is the role of natural language for command and control, its realistic requirements, and how well the role can be filled with current practical technology. Second, technical concepts for implementation are discussed and illustrated by their application in the prototype system. It is also shown how adaptive or 'learning' features can greatly ease the task of encoding language knowledge in the language processor.

  15. Parallel processing of natural language

    SciTech Connect

    Chang, H.O.

    1986-01-01

    Two types of parallel natural language processing are studied in this work: (1) the parallelism between syntactic and nonsyntactic processing and (2) the parallelism within syntactic processing. It is recognized that a syntactic category can potentially be attached to more than one node in the syntactic tree of a sentence. Even if all the attachments are syntactically well-formed, nonsyntactic factors such as semantic and pragmatic considerations may require one particular attachment. Syntactic processing must synchronize and communicate with nonsyntactic processing. Two syntactic processing algorithms are proposed for use in a parallel environment: Earley's algorithm and the LR(k) algorithm. Conditions are identified to detect the syntactic ambiguity and the algorithms are augmented accordingly. It is shown that by using nonsyntactic information during syntactic processing, backtracking can be reduced, and the performance of the syntactic processor is improved. For the second type of parallelism, it is recognized that one portion of a grammar can be isolated from the rest of the grammar and be processed by a separate processor. A partial grammar of a larger grammar is defined. Parallel syntactic processing is achieved by using two processors concurrently: the main processor (mp) and the auxiliary processor (ap).
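
    For readers unfamiliar with the first of the two algorithms named in this abstract, the following is a minimal sketch of a plain, sequential Earley recognizer in Python. It is not the augmented parallel version the record describes; the toy grammar, the function name earley_recognize, and the restriction to grammars without empty productions are all assumptions made for illustration.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if tokens can be derived from start.

    grammar maps each nonterminal to a list of right-hand sides (tuples).
    Any symbol that is not a key of grammar is treated as a terminal.
    Assumption: the grammar has no empty (epsilon) productions.
    """
    # chart[i] holds items (lhs, rhs, dot, origin) that end at position i.
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:
                    # Predict: expand the nonterminal after the dot.
                    for production in grammar[symbol]:
                        item = (symbol, production, 0, i)
                        if item not in chart[i]:
                            chart[i].add(item)
                            agenda.append(item)
                elif i < len(tokens) and tokens[i] == symbol:
                    # Scan: the terminal matches the next input token.
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:
                # Complete: advance every item that was waiting for lhs.
                for p_lhs, p_rhs, p_dot, p_origin in list(chart[origin]):
                    if p_dot < len(p_rhs) and p_rhs[p_dot] == lhs:
                        item = (p_lhs, p_rhs, p_dot + 1, p_origin)
                        if item not in chart[i]:
                            chart[i].add(item)
                            agenda.append(item)

    return any(
        lhs == start and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in chart[-1]
    )


# Hypothetical toy grammar, used only to exercise the sketch.
TOY_GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("she",), ("fish",)],
    "VP": [("eats", "NP"), ("eats",)],
}
```

    The chart keeps every partial analysis alive at once, which is where attachment ambiguities of the kind mentioned in the abstract would show up as multiple completed items over the same span.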

  16. Interpreting natural language queries using the UMLS.

    PubMed Central

    Johnson, S. B.; Aguirre, A.; Peng, P.; Cimino, J.

    1993-01-01

    This paper describes AQUA (A QUery Analyzer), the natural language front end of a prototype information retrieval system. AQUA translates a user's natural language query into a representation in the Conceptual Graph formalism. The graph is then used by subsequent components to search various resources such as databases of the medical literature. The focus of the parsing method is on semantics rather than syntax, with semantic restrictions being provided by the UMLS Semantic Net. The intent of the approach is to provide a method that can be emulated easily in applications that require simple natural language interfaces. PMID:8130481

  17. Introduction: Natural Language Processing and Information Retrieval.

    ERIC Educational Resources Information Center

    Smeaton, Alan F.

    1990-01-01

    Discussion of research into information and text retrieval problems highlights the work with automatic natural language processing (NLP) that is reported in this issue. Topics discussed include the occurrences of nominal compounds; anaphoric references; discontinuous language constructs; automatic back-of-the-book indexing; and full-text analysis.…

  18. A System for Natural Language Sentence Generation.

    ERIC Educational Resources Information Center

    Levison, Michael; Lessard, Gregory

    1992-01-01

    Describes the natural language computer program, "Vinci." Explains that using an attribute grammar formalism, Vinci can simulate components of several current linguistic theories. Considers the design of the system and its applications in linguistic modelling and second language acquisition research. Notes Vinci's uses in linguistics instruction…

  19. A Natural Language Interface to Databases

    NASA Technical Reports Server (NTRS)

    Ford, D. R.

    1990-01-01

    The development of a Natural Language Interface (NLI) is presented which is semantic-based and uses Conceptual Dependency representation. The system was developed using Lisp and currently runs on a Symbolics Lisp machine.

  20. Sepia: a Framework for Natural Language Semantics

    E-print Network

    Marton, Gregory Adam

    2009-05-28

    To help explore linguistic semantics in the context of computational natural language understanding, Sepia provides a realization of the central theoretical idea of categorial grammar: linking words and phrases to compositional ...

  1. Decoding algorithms for complex natural language tasks

    E-print Network

    Deshpande, Pawan

    2007-01-01

    This thesis focuses on developing decoding techniques for complex Natural Language Processing (NLP) tasks. The goal of decoding is to find an optimal or near optimal solution given a model that defines the goodness of a ...

  2. Natural language search of structured documents

    E-print Network

    Oney, Stephen W

    2008-01-01

    This thesis focuses on techniques with which natural language can be used to search for specific elements in a structured document, such as an XML file. The goal is to create a system capable of being trained to identify ...

  3. Learning semantic maps from natural language

    E-print Network

    Hemachandra, Sachithra Madhawa

    2015-01-01

    As robots move into human-occupied environments, the need for effective mechanisms to enable interactions with humans becomes vital. Natural language is a flexible, intuitive medium that can enable such interactions, but ...

  4. Graphical law beneath each written natural language

    E-print Network

    Anindya Kumar Biswas

    2013-10-08

    We study twenty-four written natural languages. We plot, on a log scale, the number of words starting with a letter against the rank of the letter, both normalised. We find that all the graphs are of a similar type. The graphs are tantalisingly close to the curves of reduced magnetisation vs reduced temperature for magnetic materials. We make a weak conjecture that a curve of magnetisation underlies a written natural language.
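
    The measurement described in this abstract can be sketched in a few lines of Python: count words by initial letter, rank the letters by count, normalise, and take logarithms. The normalisation used here (dividing by the largest count and the largest rank) is an assumption, since the abstract does not state the convention, and the word list is a made-up sample rather than data from the paper.

```python
import math
from collections import Counter


def initial_letter_curve(words):
    """Return (log normalised rank, log normalised count) pairs, one per initial letter."""
    counts = Counter(w[0].lower() for w in words if w and w[0].isalpha())
    if not counts:
        return []
    ranked = sorted(counts.values(), reverse=True)
    max_count = ranked[0]
    max_rank = len(ranked)
    # Assumed normalisation: divide by the largest count and the largest rank.
    return [
        (math.log(rank / max_rank), math.log(count / max_count))
        for rank, count in enumerate(ranked, start=1)
    ]


# Hypothetical sample; a real study would use a full dictionary per language.
sample = ["apple", "avocado", "apricot", "banana", "blueberry", "cherry"]
curve = initial_letter_curve(sample)
```

    Plotting the resulting pairs for a full dictionary would give one curve per language of the kind the abstract compares.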

  5. Natural Language Description of Emotion

    ERIC Educational Resources Information Center

    Kazemzadeh, Abe

    2013-01-01

    This dissertation studies how people describe emotions with language and how computers can simulate this descriptive behavior. Although many non-human animals can express their current emotions as social signals, only humans can communicate about emotions symbolically. This symbolic communication of emotion allows us to talk about emotions that we…

  6. Combining Semantic Wikis and Controlled Natural Language

    E-print Network

    Kuhn, Tobias

    2008-01-01

    We demonstrate AceWiki, a semantic wiki that uses the controlled natural language Attempto Controlled English (ACE). The goal is to enable easy creation and modification of ontologies through the web. Texts in ACE can automatically be translated into first-order logic and other languages, for example OWL. Previous evaluation showed that ordinary people are able to use AceWiki without being instructed.

  7. Knowledge engineering approach to natural language understanding

    SciTech Connect

    Shapiro, S.C.; Neal, J.G.

    1982-01-01

    The authors describe the results of a preliminary study of a knowledge engineering approach to natural language understanding. A computer system is being developed to handle the acquisition, representation, and use of linguistic knowledge. The computer system is rule-based and utilizes a semantic network for knowledge storage and representation. In order to facilitate the interaction between user and system, input of linguistic knowledge and computer responses are in natural language. Knowledge of various types can be entered and utilized: syntactic and semantic; assertions and rules. The inference tracing facility is also being developed as a part of the rule-based system with output in natural language. A detailed example is presented to illustrate the current capabilities and features of the system. 12 references.

  8. Enhanced Text Retrieval Using Natural Language Processing.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth D.

    1998-01-01

    Defines natural language processing (NLP); describes the use of NLP in information retrieval (IR); provides seven levels of linguistic analysis: phonological, morphological, lexical, syntactic, semantic, discourse, and pragmatic. Discusses the commercial use of NLP in IR with the example of DR-LINK (Document Retrieval using LINguistic Knowledge)…

  9. Dual Decomposition for Natural Language Processing

    E-print Network

    Collins, Michael

    Dual Decomposition for Natural Language Processing Alexander M. Rush and Michael Collins · syntactic machine translation (Rush and Collins, 2011) NP-Hard · symmetric HMM alignment (DeNero and Macherey, 2011) · phrase-based translation (Chang and Collins, 2011) · higher-order non

  10. Natural Language Information Retrieval: Progress Report.

    ERIC Educational Resources Information Center

    Perez-Carballo, Jose; Strzalkowski, Tomek

    2000-01-01

    Reports on the progress of the natural language information retrieval project, a joint effort led by GE (General Electric) Research, and its evaluation at the sixth TREC (Text Retrieval Conference). Discusses stream-based information retrieval, which uses alternative methods of document indexing; advanced linguistic streams; weighting; and query…

  11. Book Review Natural Language Processing with Python

    E-print Network

    Book Review Natural Language Processing with Python Steven Bird, Ewan Klein, and Edward Loper, xx+482 pp; paperbound, ISBN 978-0-596-51649-9, $44.99; on-line free of charge at nltk.org/book Reviewed by Michael Elhadad Ben-Gurion University This book comes with "batteries included" (a reference

  12. Natural language information retrieval in digital libraries

    SciTech Connect

    Strzalkowski, T.; Perez-Carballo, J.; Marinescu, M.

    1996-12-31

    In this paper we report on some recent developments in the joint NYU and GE natural language information retrieval system. The main characteristic of this system is the use of advanced natural language processing to enhance the effectiveness of term-based document retrieval. The system is designed around a traditional statistical backbone consisting of the indexer module, which builds inverted index files from pre-processed documents, and a retrieval engine which searches and ranks the documents in response to user queries. Natural language processing is used to (1) preprocess the documents in order to extract content-carrying terms, (2) discover inter-term dependencies and build a conceptual hierarchy specific to the database domain, and (3) process user's natural language requests into effective search queries. This system has been used in NIST-sponsored Text Retrieval Conferences (TREC), where we worked with approximately 3.3 GBytes of text articles including material from the Wall Street Journal, the Associated Press newswire, the Federal Register, Ziff Communications's Computer Library, Department of Energy abstracts, U.S. Patents and the San Jose Mercury News, totaling more than 500 million words of English. The system has been designed to facilitate its scalability to deal with ever increasing amounts of data. In particular, a randomized index-splitting mechanism has been installed which allows the system to create a number of smaller indexes that can be independently and efficiently searched.
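
    The idea of splitting one inverted index into several smaller, independently searchable ones can be illustrated with a short Python sketch. This is only a toy under stated assumptions: the function names, the uniform random assignment of documents to sub-indexes, and the whitespace tokenizer are hypothetical, and ranking is omitted entirely; the abstract does not describe how the actual system does any of these.

```python
import random
from collections import defaultdict


def build_split_indexes(docs, num_splits, seed=0):
    """Randomly assign each document to one of num_splits small inverted indexes."""
    rng = random.Random(seed)
    indexes = [defaultdict(set) for _ in range(num_splits)]
    for doc_id, text in docs.items():
        # Assumed policy: pick a sub-index uniformly at random per document.
        index = indexes[rng.randrange(num_splits)]
        for term in text.lower().split():
            index[term].add(doc_id)
    return indexes


def search(indexes, term):
    """Query every sub-index independently and merge the matching document ids."""
    hits = set()
    for index in indexes:
        hits |= index.get(term.lower(), set())
    return hits


# Hypothetical three-document collection.
docs = {
    1: "natural language retrieval",
    2: "language processing",
    3: "document retrieval",
}
indexes = build_split_indexes(docs, num_splits=2)
```

    Because each document lives in exactly one sub-index, the merged result is the same as searching a single large index, while each sub-index can be built and searched separately.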

  13. A Priori Analysis of Natural Language Queries.

    ERIC Educational Resources Information Center

    Spiegler, Israel; Elata, Smadar

    1988-01-01

    Presents a model for the a priori analysis of natural language queries which uses an algorithm to transform the query into a logical pattern that is used to determine the answerability of the query. The results of testing by a prototype system implemented in PROLOG are discussed. (20 references) (CLB)

  14. Brain readiness and the nature of language

    PubMed Central

    Bouchard, Denis

    2015-01-01

    To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their “representations” may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax.
Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the unique means of representation that allow perceptual forms and contents to be linked into signs. PMID:26441751

  15. Spoken Language Systems - Technical Challenges for Speech and Natural Language Processing

    E-print Network

    Spoken Language Systems - Technical Challenges for Speech and Natural Language Processing Chin is the most natural means of communication among humans. It is also believed that spoken language processing of the existing spoken language systems are rather primitive. For example, speech synthesizers for reading

  16. Identifying Patterns in Geospatial Natural Language Kristin Stock

    E-print Network

    Stock, Kristin

    Identifying Patterns in Geospatial Natural Language Kristin Stock Nottingham Geospatial Institute University of Nottingham Abstract The automated interpretation of geospatial be suitable as an approach to the representation of geospatial natural language that supports

  17. Evolutionary Explanations for Natural Language - Criteria from Evolutionary Biology

    E-print Network

    Amsterdam, University of

    Evolutionary Explanations for Natural Language - Criteria from Evolutionary Biology Willem Zuidema surveys the requirements on evolutionary scenarios that derive from mathematical evolutionary biology for why humans, and humans alone, are capable of acquiring and using natural languages. Second

  18. Representing the Semantics of Natural Language as Constraint Expressions

    E-print Network

    Grossman, Richard W.

    The issue of how to represent the "meaning" of an utterance is central to the problem of computer understanding of natural language. Rather than relying on ad-hoc structures or forcing the complexities of natural language ...

  19. Towards Automatic Generation of Natural Language Generation , Srinivas Bangalore

    E-print Network

    that interact with the user via natural language are in their infancy. As these systems mature and become more system. 1 Introduction Systems that interact with the user via natural language are in their infancy

  20. Natural Language Processing: Toward Large-Scale, Robust Systems.

    ERIC Educational Resources Information Center

    Haas, Stephanie W.

    1996-01-01

    Natural language processing (NLP) is concerned with getting computers to do useful things with natural language. Major applications include machine translation, text generation, information retrieval, and natural language interfaces. Reviews important developments since 1987 that have led to advances in NLP; current NLP applications; and problems…

  1. Natural languages as collections of resources Robin Cooper

    E-print Network

    Cooper, Robin

    Natural languages as collections of resources Robin Cooper Göteborg University Aarne Ranta.2) propose a view on which natural languages are rather to be regarded as collections of resources on general resources for natural languages and we will give a brief characterization of the system in section

  2. Natural Language Text Generation in the Oz Interactive Fiction Project

    E-print Network

    GLINDA: Natural Language Text Generation in the Oz Interactive Fiction Project Mark Kantrowitz July­3890 mkant+@cs.cmu.edu Abstract Interactive fiction presents new requirements for natural language generation. GLINDA, the natural language generation module of the Oz interactive fiction system, is an implemented

  3. THE LOCALITY PHENOMENON AND PARALLEL PROCESSING OF NATURAL LANGUAGE

    E-print Network

    natural language processing systems employ the sequential mode as a necessary evil, or do not even THE LOCALITY PHENOMENON AND PARALLEL PROCESSING OF NATURAL LANGUAGE E. L. Lozinskii and S traditions established in computer processing of natural language during the twenty-odd years of research

  4. Learning procedures from interactive natural language instructions

    NASA Technical Reports Server (NTRS)

    Huffman, Scott B.; Laird, John E.

    1994-01-01

    Despite its ubiquity in human learning, very little work has been done in artificial intelligence on agents that learn from interactive natural language instructions. In this paper, the problem of learning procedures from interactive, situated instruction is examined in which the student is attempting to perform tasks within the instructional domain, and asks for instruction when it is needed. Presented is Instructo-Soar, a system that behaves and learns in response to interactive natural language instructions. Instructo-Soar learns completely new procedures from sequences of instruction, and also learns how to extend its knowledge of previously known procedures to new situations. These learning tasks require both inductive and analytic learning. Instructo-Soar exhibits a multiple execution learning process in which initial learning has a rote, episodic flavor, and later executions allow the initially learned knowledge to be generalized properly.

  5. Automated database design from natural language input

    NASA Technical Reports Server (NTRS)

    Gomez, Fernando; Segami, Carlos; Delaune, Carl

    1995-01-01

    Users and programmers of small systems typically do not have the skills needed to design a database schema from an English description of a problem. This paper describes a system that automatically designs databases for such small applications from English descriptions provided by end-users. Although the system has been motivated by the space applications at Kennedy Space Center, and portions of it have been designed with that idea in mind, it can be applied to different situations. The system consists of two major components: a natural language understander and a problem-solver. The paper describes briefly the knowledge representation structures constructed by the natural language understander, and, then, explains the problem-solver in detail.

  6. Discovering protein similarity using natural language processing.

    PubMed Central

    Sarkar, Indra N.; Rindflesch, Thomas C.

    2002-01-01

    Extracting protein interaction relationships from textual repositories, such as MEDLINE, may prove useful in generating novel biological hypotheses. Using abstracts relevant to two known functionally related proteins, we modified an existing natural language processing tool to extract protein interaction terms. We were able to obtain functional information about two proteins, Amyloid Precursor Protein and Prion Protein, that have been implicated in the etiology of Alzheimer's Disease and Creutzfeldt-Jakob Disease, respectively. PMID:12463910

  7. An expert system for natural language processing

    NASA Technical Reports Server (NTRS)

    Hennessy, John F.

    1988-01-01

    A solution to the natural language processing problem that uses a rule-based system, written in OPS5, to replace the traditional parsing method is proposed. The advantages of using a rule-based system are explored. Specifically, the extensibility of a rule-based solution is discussed as well as the value of maintaining rules that function independently. Finally, the power of using semantics to supplement the syntactic analysis of a sentence is considered.

  8. Substitutional Semantics and Natural Language Quantification

    E-print Network

    Ludlow, Peter

    sharpening of certain points came out of discussion of these issues with Noam Chomsky and Norbert Hornstein. Finally, I would like to thank the MIT Department of Linguistics and Philosophy for making their facilities available to me during my tenure as a... Visiting Scholar. 'For example, Dale Gottlieb, Ontological Economy: Substitutional Quantification and Mathematics (Oxford: Oxford University Press, 1980). 'Noam Chomsky has suggested to me that it is illegitimate to suppose that natural language...

  9. Temporal Action Language (TAL): A Controlled Language for Consistency Checking of Natural

    E-print Network

    Hayes, Jane E.

    Temporal Action Language (TAL): A Controlled Language for Consistency Checking of Natural Language formal and human readable Proposed Approach · Temporal Action Language TAL as bridge Once a message(Receiver,Msg,Sender) @ 10ms after Requirements In NL TAL Representation sat1(c6,action(Sender, send(Msg, Receiver

  10. Intelligent CAI: An Author Aid for a Natural Language Interface.

    ERIC Educational Resources Information Center

    Burton, Richard R.; Brown, John Seely

    This report addresses the problems of using natural language (English) as the communication language for advanced computer-based instructional systems. The instructional environment places requirements on a natural language understanding system that exceed the capabilities of all existing systems, including: (1) efficiency, (2) habitability, (3)…

  11. Toward Natural Language Computation 1 Alan W. Biermann

    E-print Network

    . On the other hand, many problems could arise when natural language programming is attempted (Dijkstra[11 as a subroutine, thus extending the set of available operations and allowing larger English-language programs constructs such as "if", "repeat", and procedure definition. 1. Introduction Natural language programming

  12. An Overview of Computer-Based Natural Language Processing.

    ERIC Educational Resources Information Center

    Gevarter, William B.

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial…

  13. A Framework for Connecting Natural Language and Symbol Sense ...

    E-print Network

    Rachael Kenney

    2013-08-09

    mathematical word problems for English language learners. Submitted to .... For example, symbols name, label, signify, communicate, simplify .... This allows for a more direct translation from the natural language to the symbolic form, which ...

  14. 6.881 Natural Language Processing, Fall 2004

    E-print Network

    Barzilay, Regina

    This course is a graduate level introduction to natural language processing, the primary concern of which is the study of human language from a computational perspective. The class will cover models at the level of syntactic, ...

  15. Natural language processing and advanced information management

    NASA Technical Reports Server (NTRS)

    Hoard, James E.

    1989-01-01

    Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate the material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides a natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures that will satisfy the users' requests. The best candidate for an intelligent agent that can satisfactorily make use of RSs and transform documents (TDs) appears to be an object oriented data base (OODB). OODBs have, apparently, an inherent capacity to use the large numbers of RSs and TDs that will be required by an AIM system and an inherent capacity to use them in an effective way.

  16. Transportable natural-language interfaces: problems and techniques

    SciTech Connect

    Grosz, B.J.

    1982-01-01

    The author considers the question of natural language database access within the context of a project at SRI, TEAM, that is developing techniques for transportable natural-language interfaces. The goal of transportability is to enable nonspecialists to adapt a natural-language processing system for access to an existing conventional database. TEAM is designed to interact with two different kinds of users. During an acquisition dialogue, a database expert (DBE) provides TEAM with information about the files and fields in the conventional database for which a natural-language interface is desired. (Typically this database already exists and is populated, but TEAM also provides facilities for creating small local databases.) This dialogue results in extension of the language-processing and data access components that make it possible for an end user to query the new database in natural language. 13 references.

  17. Understanding and representing natural language meaning

    SciTech Connect

    Waltz, D.L.; Maran, L.R.; Dorfman, M.H.; Dinitz, R.; Farwell, D.

    1982-12-01

    During this contract period the authors have: (a) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (b) written a parsing program which selects appropriate word and sentence meanings by a parallel process known as activation and inhibition; (c) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (d) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (e) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (f) constructed a general model for the representation of tense and aspect of verbs; (g) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.

  18. Understanding and representing natural language meaning

    NASA Astrophysics Data System (ADS)

    Waltz, D. L.; Maran, L. R.; Dorfman, M. H.; Dinitz, R.; Farwell, D.

    1982-12-01

    During this contract period the authors have: (1) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (2) written a parsing program which selects appropriate word and sentence meanings by a parallel process known as activation and inhibition; (3) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (4) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (5) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (6) constructed a general model for the representation of tense and aspect of verbs; (7) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.

  19. Understanding natural language for spacecraft sequencing

    NASA Technical Reports Server (NTRS)

    Katz, Boris; Brooks, Robert N., Jr.

    1987-01-01

    The paper describes a natural language understanding system, START, that translates English text into a knowledge base. The understanding and the generating modules of START share a Grammar which is built upon reversible transformations. Users can retrieve information by querying the knowledge base in English; the system then produces an English response. START can be easily adapted to many different domains. One such domain is spacecraft sequencing. A high-level overview of sequencing as it is practiced at JPL is presented in the paper, and three areas within this activity are identified for potential application of the START system. Examples are given of an actual dialog with START based on simulated data for the Mars Observer mission.

  20. Overview of computer-based Natural Language Processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language processing and understanding is the key to enabling humans and their creations to interact with machines in natural language (in contrast to computer language). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state-of-the-art of the technology, issues and research requirements, the major participants, and finally, future trends and expectations.

  1. Automated Methods for Interpreting Geospatial Natural Language Dr Kristin Stock, Nottingham Geospatial Institute, University of Nottingham

    E-print Network

    Stock, Kristin

    NaturalGeo Automated Methods for Interpreting Geospatial Natural Language Dr Kristin Stock, Nottingham Geospatial Institute, University of Nottingham 2012-2014 What is geospatial natural language diagrams best match 2000 geospatial natural language expressions. The results will populate our

  2. Natural Language Processing: What's Really Involved?

    E-print Network

    NLP to be. If we accept the common assumption that language processing can be isolated from the rest to failure because it misses the point of what language is all about. It is only possible for research to approach human levels of language understanding are, first, to endow them with the kind of knowledge bases

  3. Incremental Generation of LR Parsers R. Nigel Horspool

    E-print Network

    Horspool, R. Nigel

    . The basic design philosophy of an incremental parser generator, and incremental algorithms for LR0, SLR1 and LALR1 parser generation are discussed in this paper. Some of these algorithms have been incorporated into an implementation of an incremental LALR1 parser generator. Index Terms Compilers, Compiler Tools, Program

  4. Incremental Parser Generation for Tree Adjoining Grammars Anoop Sarkar

    E-print Network

    Sarkar, Anoop

    far. 1 LR Parser Generation Tree Adjoining Grammars (TAGs) are tree rewriting systems which combine- tion of an LR parsing algorithm for TAGs1. Parser generation here is taken to be the con- struction described here can be extended to use SLR(1) tables (Schabes and Vijay-Shanker, 1990). made by the parser

  5. Incremental Parser Generation for Tree Adjoining Grammars* Anoop Sarkar

    E-print Network

    tables built so far. 1 LR Parser Generation Tree Adjoining Grammars (TAGs) are tree rewriting systems) describes the construction of an LR parsing algorithm for TAGs 1. Parser generation here is taken Incremental Parser Generation for Tree Adjoining Grammars* Anoop Sarkar University of Pennsylvania

  6. Natural Language Processing in Game Studies Research: An Overview

    ERIC Educational Resources Information Center

    Zagal, Jose P.; Tomuro, Noriko; Shepitsen, Andriy

    2012-01-01

    Natural language processing (NLP) is a field of computer science and linguistics devoted to creating computer systems that use human (natural) language as input and/or output. The authors propose that NLP can also be used for game studies research. In this article, the authors provide an overview of NLP and describe some research possibilities…

  7. GEMS 2011 Workshop on GEometrical Models of Natural Language Semantics

    E-print Network

    GEMS 2011 GEMS 2011 Workshop on GEometrical Models of Natural Language Semantics Proceedings-937284-16-6 ii Introduction GEMS 2011 -- GEometrical Models of Natural Language Semantics -- is the third, is not without its problems. The aim of GEMS is to address two orthogonal types of current challenges. First

  8. Microsoft Natural Language Understanding System and Grammar Checker

    E-print Network

    , the grammar checker integrated in Microsoft Word 97 (also included in Office 97), which was released Microsoft Natural Language Understanding System and Grammar Checker Steve Richardson Microsoft Research One Microsoft Way Redmond, WA 98052 steveri@microsoft.com The Natural Language Processing (NLP

  9. Inferring heuristic classification hierarchies from natural language input

    NASA Technical Reports Server (NTRS)

    Hull, Richard; Gomez, Fernando

    1993-01-01

    A methodology for inferring hierarchies representing heuristic knowledge about the check out, control, and monitoring sub-system (CCMS) of the space shuttle launch processing system from natural language input is explained. Our method identifies failures explicitly and implicitly described in natural language by domain experts and uses those descriptions to recommend classifications for inclusion in the experts' heuristic hierarchies.

  10. SBVR Business Rules Generation from Natural Language Specification

    E-print Network

    Lee, Mark

    models, software components, 2 Artificial Intelligence for Business Agility -- Papers from the AAAI 2011 SBVR Business Rules Generation from Natural Language Specification Imran S. Bajwa, Mark G. Lee of translating natural languages specification to SBVR business rules. The business rules constraint business

  11. MAXIMUM ENTROPY MODELS FOR NATURAL LANGUAGE AMBIGUITY RESOLUTION

    E-print Network

    Rodriguez, Carlos

    MAXIMUM ENTROPY MODELS FOR NATURAL LANGUAGE AMBIGUITY RESOLUTION Adwait Ratnaparkhi A DISSERTATION with whom you argue, discuss, and nurture your ideas. I thank all of the people at Penn and elsewhere who the intellectual freedom to pursue what I believed to be the best way to approach natural language processing

  12. INMED/TINS special issue Nature and nurture in language

    E-print Network

    Dehaene-Lambertz, Ghislaine

    INMED/TINS special issue Nature and nurture in language acquisition: anatomical and functional/TINS special issue Nature and nurture in brain development and neurological disorders, based on presentations

  13. A Hybrid Architecture For Natural Language Understanding

    NASA Astrophysics Data System (ADS)

    Loatman, R. Bruce

    1987-05-01

    The PRC Adaptive Knowledge-based Text Understanding System (PAKTUS) is an environment for developing natural language understanding (NLU) systems. It uses a knowledge-based approach in an integrated hybrid architecture based on a factoring of the NLU problem into its lexical, syntactic, conceptual, domain-specific, and pragmatic components. The goal is a robust system that benefits from the strengths of several NLU methodologies, each applied where most appropriate. PAKTUS employs a frame-based knowledge representation and associative networks throughout. The lexical component uses morphological knowledge and word experts. Syntactic knowledge is represented in an Augmented Transition Network (ATN) grammar that incorporates rule-based programming. Case grammar is used for canonical conceptual representation with constraints. Domain-specific templates represent knowledge about specific applications as patterns of the form used in logic programming. Pragmatic knowledge may augment any of the other types and is added wherever needed for a particular domain. The system has been constructed in an interactive graphic programming environment. It has been used successfully to build a prototype front end for an expert system. This integration of existing technologies makes limited but practical NLU feasible now for narrow, well-defined domains.

  14. Automatically Generating Natural Language Status Reports

    NASA Astrophysics Data System (ADS)

    Kalita, Jugal; Shende, Sunil

    1988-03-01

    In this paper, we describe a system which generates compact natural language status reports for a set of inter-related processes at various stages of progress. The system has three modules - a rule-based domain knowledge representation module, an elaborate text planning module, and a surface generation module. The knowledge representation module models a set of processes that are encountered in a typical office environment, using a body of explicitly sequenced production rules implemented by an augmented Petri net mechanism. The system employs an interval-based temporal network for storing historical information. A text planning module traverses this network to search for events which need to be mentioned in a coherent report describing the current status of the system. The planner combines similar information for succinct presentation whenever applicable. It also takes into consideration various issues such as relevance and redundancy, simple mechanisms for viewing events from multiple perspectives and the application of discourse focus techniques for the generation of good quality text. Finally, an available surface generation module which has been suitably augmented is used to produce well-structured textual reports for our chosen domain.

  15. A Statistical Parser for Czech* Michael Collins

    E-print Network

    A Statistical Parser for Czech* Michael Collins AT&T Labs-Research, Shannon Laboratory, 180 Park in building on the parsing model of (Collins 97). Our final results - 80% dependency accuracy - represent a baseline approach, based on the parsing model of (Collins 97), which recovers dependencies with 72

  16. Apple Pie Parser (version 5.7)

    E-print Network

    Manual of Apple Pie Parser (version 5.7) July. 15. 1996 Satoshi SEKINE Computer Science Department://cs.nyu.edu/cs/projects/proteus/sekine Golden Rules to Make Perfect Apple Pie 1. Perfect and fresh apples should be used 2. All, preheated oven to ensure well-done fruit, brown and flaky crust, and thick juices 5. Have friends who love

  17. Influenza detection from emergency department reports using natural language processing and Bayesian network classifiers

    PubMed Central

    Ye, Ye; Tsui, Fuchiang (Rich); Wagner, Michael; Espino, Jeremy U; Li, Qi

    2014-01-01

    Objectives: To evaluate factors affecting performance of influenza detection, including accuracy of natural language processing (NLP), discriminative ability of Bayesian network (BN) classifiers, and feature selection. Methods: We derived a testing dataset of 124 influenza patients and 87 non-influenza (shigellosis) patients. To assess NLP finding-extraction performance, we measured the overall accuracy, recall, and precision of Topaz and MedLEE parsers for 31 influenza-related findings against a reference standard established by three physician reviewers. To elucidate the relative contribution of NLP and BN classifier to classification performance, we compared the discriminative ability of nine combinations of finding-extraction methods (expert, Topaz, and MedLEE) and classifiers (one human-parameterized BN and two machine-parameterized BNs). To assess the effects of feature selection, we conducted secondary analyses of discriminative ability using the most influential findings defined by their likelihood ratios. Results: The overall accuracy of Topaz was significantly better than MedLEE (with post-processing) (0.78 vs 0.71, p<0.0001). Classifiers using human-annotated findings were superior to classifiers using Topaz/MedLEE-extracted findings (average area under the receiver operating characteristic (AUROC): 0.75 vs 0.68, p=0.0113), and machine-parameterized classifiers were superior to the human-parameterized classifier (average AUROC: 0.73 vs 0.66, p=0.0059). The classifiers using the 17 ‘most influential’ findings were more accurate than classifiers using all 31 subject-matter expert-identified findings (average AUROC: 0.76>0.70, p<0.05). Conclusions: Using a three-component evaluation method we demonstrated how one could elucidate the relative contributions of components under an integrated framework. To improve classification performance, this study encourages researchers to improve NLP accuracy, use a machine-parameterized classifier, and apply feature selection methods. PMID:24406261
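
    The measures named in this abstract (accuracy, recall, precision, AUROC) can be sketched as follows. This is a minimal illustration, not the study's evaluation code: the function names `extraction_metrics` and `auroc`, and all input values, are hypothetical.

```python
def extraction_metrics(predicted, reference):
    """Accuracy, precision and recall for per-finding yes/no decisions.

    `predicted` and `reference` are parallel lists of booleans: what a
    parser extracted versus what the reference standard says.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have equal length")
    tp = sum(p and r for p, r in zip(predicted, reference))
    fp = sum(p and not r for p, r in zip(predicted, reference))
    fn = sum(not p and r for p, r in zip(predicted, reference))
    tn = sum(not p and not r for p, r in zip(predicted, reference))
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def auroc(positive_scores, negative_scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive case receives the higher score; ties
    count as half a pair."""
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one score in each class")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up inputs, for illustration only.
metrics = extraction_metrics(
    [True, True, False, False, True],
    [True, False, False, True, True],
)
area = auroc([0.9, 0.8, 0.4], [0.5, 0.3])
```

    The pairwise form of AUROC is quadratic in the number of cases; it is used here only because it makes the definition explicit.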

  18. Concepts and implementations of natural language query systems

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (editor); Liu, I-Hsiung

    1984-01-01

    The currently developed user language interfaces of information systems are generally intended for serious users. These interfaces commonly ignore potentially the largest user group, i.e., casual users. This project discusses the concepts and implementations of a natural query language system which satisfies the nature and information needs of casual users by allowing them to communicate with the system in the form of their native (natural) language. In addition, a framework for the development of such an interface is also introduced for the MADAM (Multics Approach to Data Access and Management) system at the University of Southwestern Louisiana.

  19. NLP Meets the Jabberwocky: Natural Language Processing in Information Retrieval.

    ERIC Educational Resources Information Center

    Feldman, Susan

    1999-01-01

    Focuses on natural language processing (NLP) in information retrieval. Defines the seven levels at which people extract meaning from text/spoken language. Discusses the stages of information processing; how an information retrieval system works; advantages to adding full NLP to information retrieval systems; and common problems with information…

  20. Statistical Approaches to Natural Language Processing CS 4390/5319

    E-print Network

    Ward, Nigel

    Statistical Approaches to Natural Language Processing CS 4390/5319 Spring Semester, 2003 Syllabus language and automata theory - human-computer interaction NLP Syllabus 2003 2 - machine learning and AI - simple data structures - basic programming skills - the engineering issues involved in building systems

  1. Getting Answers to Natural Language Questions on the Web.

    ERIC Educational Resources Information Center

    Radev, Dragomir R.; Libner, Kelsey; Fan, Weiguo

    2002-01-01

    Describes a study that investigated the use of natural language questions on Web search engines. Highlights include query languages; differences in search engine syntax; and results of logistic regression and analysis of variance that showed aspects of questions that predicted significantly different performances, including the number of words,…

  2. On the Representation of Physical Quantities in Natural Language Text

    E-print Network

    Forbus, Kenneth D.

    language. Our focus is on physical quantities found in descriptions of physical processes that water will eventually boil if you heat it on a stove, that a ball placed at the top of a steep ramp continuous properties can appear in written natural language. Our focus is on physical quantities found

  3. AUTONOMOUS ACQUISITION OF NATURAL LANGUAGE Eric Nivel,1

    E-print Network

    Thórisson, Kristinn Rúnar

    Laboratory ABSTRACT An important part of human intelligence is the ability to use language. Humans learn how between the humans, with no grammar being provided to it a priori, and only high-level information about, natural language, communication 1. INTRODUCTION One of the most useful skills to evolve in humans

  4. Parent-Implemented Natural Language Paradigm to Increase Language and Play in Children with Autism

    ERIC Educational Resources Information Center

    Gillett, Jill N.; LeBlanc, Linda A.

    2007-01-01

    Three parents of children with autism were taught to implement the Natural Language Paradigm (NLP). Data were collected on parent implementation, multiple measures of child language, and play. The parents were able to learn to implement the NLP procedures quickly and accurately with beneficial results for their children. Increases in the overall…

  5. Natural Language Processing Techniques in Computer-Assisted Language Learning: Status and Instructional Issues.

    ERIC Educational Resources Information Center

    Holland, V. Melissa; Kaplan, Jonathan D.

    1995-01-01

    Describes the role of natural language processing (NLP) techniques, such as parsing and semantic analysis, within current language tutoring systems. Examines trends, design issues and tradeoffs, and potential contributions of NLP techniques with respect to instructional theory and educational practice. Addresses limitations and problems in using…

  6. A discriminative model for understanding natural language route directions

    E-print Network

    Kollar, Thomas Fleming

    2010-01-01

    To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. However, determining the correct sequence of actions in response to a set of spoken instructions is a ...

  7. Survey of Natural Language Processing Techniques in Bioinformatics

    PubMed Central

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  8. Natural language processing for unmanned aerial vehicle guidance interfaces

    E-print Network

    Craparo, Emily M. (Emily Marie), 1980-

    2004-01-01

    In this thesis, the opportunities and challenges involved in applying natural language processing techniques to the control of unmanned aerial vehicles (UAVs) are addressed. The problem of controlling an unmanned aircraft ...

  9. Head-Driven Statistical Models for Natural Language Parsing

    E-print Network

    Collins, Michael

    Head-Driven Statistical Models for Natural Language Parsing Michael Collins MIT Artificial on this approach. The models were originally introduced in (Collins 1997); the current paper 1 gives considerably

  10. Natural language command of an autonomous micro-air vehicle

    E-print Network

    Huang, Albert S.

    Natural language is a flexible and intuitive modality for conveying directions and commands to a robot but presents a number of computational challenges. Diverse words and phrases must be mapped into structures that the ...

  11. Mixed-Initiative Natural Language Dialogue with Variable Communicative Modes 

    E-print Network

    Ishizaki, Masato

    As speech and natural language processing technology advance, it now reaches a stage where the dialogue control or initiative can be studied to realise usable and friendly human computer interface programs such as computer ...

  12. MOOIDE : natural language interface for programming MOO environments

    E-print Network

    Ahmad, Moinuddin

    2008-01-01

    MOOIDE is an interface to allow novice users to program a MOO environment using natural language. Programming the MOO involves a variety of tasks like creating objects and their states, assigning verb actions to objects, ...

  13. Understanding natural language commands for robotic navigation and mobile manipulation

    E-print Network

    Tellex, Stefanie A.

    This paper describes a new model for understanding natural language commands given to autonomous systems that perform navigation and mobile manipulation in semi-structured environments. Previous approaches have used models ...

  14. The integration hypothesis of human language evolution and the nature of contemporary languages

    PubMed Central

    Miyagawa, Shigeru; Ojima, Shiro; Berwick, Robert C.; Okanoya, Kazuo

    2014-01-01

    How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa et al. (2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, that holds that human language is composed of two components, E for expressive, and L for lexical. Each component has an antecedent in nature: E as found, for example, in birdsong, and L in, for example, the alarm calls of monkeys. E and L integrated uniquely in humans to give rise to language. A challenge to the Integration Hypothesis is that while these non-human systems are finite-state in nature, human language is known to require characterization by a non-finite state grammar. Our claim is that E and L, taken separately, are in fact finite-state; when a grammatical process crosses the boundary between E and L, it gives rise to the non-finite state character of human language. We provide empirical evidence for the Integration Hypothesis by showing that certain processes found in contemporary languages that have been characterized as non-finite state in nature can in fact be shown to be finite-state. We also speculate on how human language actually arose in evolution through the lens of the Integration Hypothesis. PMID:24936195
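
    The finite-state versus non-finite-state distinction drawn in this abstract can be illustrated with two toy recognizers. This is a generic formal-language sketch, not the authors' analysis: the languages (ab)* and a^n b^n and both function names are illustrative assumptions.

```python
def accepts_ab_star(s):
    """Finite-state recognizer for (ab)*: two states and a fixed table,
    so memory use does not grow with the input."""
    transitions = {(0, "a"): 1, (1, "b"): 0}
    state = 0
    for ch in s:
        key = (state, ch)
        if key not in transitions:
            return False
        state = transitions[key]
    return state == 0


def accepts_anbn(s):
    """Recognizer for a^n b^n. The counter is unbounded, which is exactly
    what a finite-state device lacks: no fixed number of states can
    match arbitrarily many a's against the same number of b's."""
    count = 0
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:
                return False  # an 'a' after a 'b' breaks the pattern
            count += 1
        elif ch == "b":
            seen_b = True
            count -= 1
            if count < 0:
                return False  # more b's than a's so far
        else:
            return False
    return count == 0
```

    The first recognizer needs only its current state; the second needs memory that grows with the input, which is the sense in which a^n b^n lies beyond finite-state description.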

  15. An overview of computer-based natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1983-01-01

    Computer based Natural Language Processing (NLP) is the key to enabling humans and their computer based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  16. Overview of computer-based natural language processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state-of-the-art of the technology, issues and research requirements, the major participants, and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  17. Overview of Computer-based Natural Language Processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  18. Analyzing Learner Language: Towards a Flexible Natural Language Processing Architecture for Intelligent Language Tutors

    ERIC Educational Resources Information Center

    Amaral, Luiz; Meurers, Detmar; Ziai, Ramon

    2011-01-01

    Intelligent language tutoring systems (ILTS) typically analyze learner input to diagnose learner language properties and provide individualized feedback. Despite a long history of ILTS research, such systems are virtually absent from real-life foreign language teaching (FLT). Taking a step toward more closely linking ILTS research to real-life…

  19. Artificial intelligence, expert systems, computer vision, and natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  20. From Object-Process Diagrams to a Natural Object-Process Language

    E-print Network

    Peleg, Mor

    , it is evident from this small example that formal logic-based languages are far from being natural and intuitive From Object-Process Diagrams to a Natural Object-Process Language Mor Peleg and Dov Dori Faculty. We propose the Object-Process Language as a textual natural language means for systems specifi

  1. Recycling Terms into a Partial Parser Christian Jacquemin

    E-print Network

    Recycling Terms into a Partial Parser Christian Jacquemin Institut de Recherche en Informatique de with an on-line dictionary. Through FASTR, large terminological data can be recycled for text processing parser by an ability to recycle linguistic knowledge embodied in terminological data. Higher quality

  2. Implementing a lexicalised statistical parser Corrin Lakeland, Alistair Knott

    E-print Network

    will describe our own implementation of a statistical parser. 1 Introduction Between 1996 and 1999, Michael Collins developed a statistical parser (Collins, 1996; 1999) which has become tremendously influential in NLP. Collins' thesis and published papers discuss the theoretical underpinnings of his system

  3. Comparing Italian parsers on a common treebank: the Evalita experience

    E-print Network

    Mazzei, Alessandro

    Comparing Italian parsers on a common treebank: the Evalita experience C. Bosco , A. Mazzei , V been the first contest among parsing systems for Italian. It is the first attempt to compare of participants' parsers are very promising and higher than the state-of-the-art for dependency parsing of Italian

  4. Proof-Theoretic Semantics for a Natural Language Fragment

    NASA Astrophysics Data System (ADS)

    Francez, Nissim; Dyckhoff, Roy

    We propose a Proof-Theoretic Semantics (PTS) for a (positive) fragment E+0 of Natural Language (NL) (English in this case). The semantics is intended [7] to be incorporated into actual grammars, within the framework of Type-Logical Grammar (TLG) [12]. Thereby, this semantics constitutes an alternative to the traditional model-theoretic semantics (MTS), originating in Montague's seminal work [11], used in TLG.

  5. Analyzing Discourse Processing Using a Simple Natural Language Processing Tool

    ERIC Educational Resources Information Center

    Crossley, Scott A.; Allen, Laura K.; Kyle, Kristopher; McNamara, Danielle S.

    2014-01-01

    Natural language processing (NLP) provides a powerful approach for discourse processing researchers. However, there remains a notable degree of hesitation by some researchers to consider using NLP, at least on their own. The purpose of this article is to introduce and make available a "simple" NLP (SiNLP) tool. The overarching goal of…

  6. ON THE NATURE AND NURTURE OF LANGUAGE Elizabeth Bates

    E-print Network

    to Plato and Kant, but in modern times it is most clearly associated with the linguist Noam Chomsky (see on the innateness of language and Plato's original position on the nature of mind, as follows: "How can we interpret [Plato's] proposal in modern terms? A modern variant would be that certain aspects of our knowledge

  7. CS769 Spring 2010 Advanced Natural Language Processing Logistic Regression

    E-print Network

    Zhu, Xiaojin "Jerry"

    CS769 Spring 2010 Advanced Natural Language Processing Logistic Regression Lecturer: Xiaojin Zhu (y|x) directly. A model that estimates p(y|x) directly is known as a discriminative model. Logistic regression (y = 1|x) and p(y = -1|x) with p(y|x) = 1 / (1 + exp(-y x)). (3) Logistic regression can be easily
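
    The excerpt above describes a discriminative model that estimates p(y|x) directly, with labels y = 1 and y = -1 and a logistic form. A minimal sketch of that idea follows, assuming a weight vector `w` applied to the features through a dot product; `w` is not visible in the excerpt, and the function names and values are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0 here, so exp(z) cannot overflow
    return ez / (1.0 + ez)


def label_probability(y, x, w):
    """p(y | x) for a label y in {-1, +1}.

    Assumption: the model scores x with the dot product w . x, so that
    p(y | x) = 1 / (1 + exp(-y * (w . x))).
    """
    if y not in (-1, 1):
        raise ValueError("y must be -1 or +1")
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    score = sum(wi * xi for wi, xi in zip(w, x))
    return sigmoid(y * score)


# Made-up feature and weight values, for illustration only.
x = [1.0, 2.0]
w = [0.5, -0.25]
p_pos = label_probability(1, x, w)
p_neg = label_probability(-1, x, w)
```

    Because the two labels differ only in the sign of the score, `p_pos` and `p_neg` sum to one, so a single expression covers both classes.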

  8. A finite and real-time processor for natural language

    SciTech Connect

    Blank, G.D.

    1989-10-01

    People process natural language in real time and with very limited short-term memories. This article describes a computational architecture for syntactic performance that also requires fixed finite resources. The processor presented here represents syntactic versatility without incurring combinatorial redundancy in the number of transitions or rules. It avoids both excess grammar size and excessive computational complexity.

  9. THE XCALIBUR PROJECT: A Natural Language Interface To Expert Systems

    E-print Network

    Carbonell, Jaime

    in natural language. This renders an otherwise open-ended task tractable. · Recent advances in multi order. Line item 1 added: (2 rp07-aa) >Add a printer with graphics capability fixed or changeable font) >Tell me about the Ixyl 1 The Ixyl 1 is a 240 LPM line printer with plotting capability Except

  10. Learning from a Computer Tutor with Natural Language Capabilities

    ERIC Educational Resources Information Center

    Michael, Joel; Rovick, Allen; Glass, Michael; Zhou, Yujian; Evens, Martha

    2003-01-01

    CIRCSIM-Tutor is a computer tutor designed to carry out a natural language dialogue with a medical student. Its domain is the baroreceptor reflex, the part of the cardiovascular system that is responsible for maintaining a constant blood pressure. CIRCSIM-Tutor's interaction with students is modeled after the tutoring behavior of two experienced…

  11. The Cambridge/ACL Series STUDIES IN NATURAL LANGUAGE PROCESSING

    E-print Network

    · theory and methodology of translation · natural language parsing - morphological, syntactic, semantic translation as an application of theoretical linguistics: theories of syntax, semantics and pragmatics ... Machine Translation Editor: Sergei Nirenburg, Carnegie Mellon University, Center for Machine Translation

  12. Natural Language Analysis of Patent Claims Svetlana Sheremetyeva

    E-print Network

    Natural Language Analysis of Patent Claims Svetlana Sheremetyeva Department of Computational ... @mail.dk Abstract We propose an NLP methodology for analyzing patent claims that combines symbolic grammar. The methodology can be used in any patent-related application, such as machine translation, improving readability

  13. Exploiting Lexical Regularities in Designing Natural Language Systems

    E-print Network

    Exploiting Lexical Regularities in Designing Natural Language Systems Boris Katz Artificial ... with alternate expressions of the arguments of verbs. The design of the system takes advantage of the results ... being told that David dressed the baby. Here the appropriate answer would be "I don't know

  14. Institute for Natural Language Processing: Evaluating noise reduction strategies

    E-print Network

    Reyle, Uwe

    extraction quality filtering by syntactic constraints "cover the *surface with linseed oil varnish" filtering extraction Johannes Schäfer, Ina Rösiger, Ulrich Heid, Michael Dorna Universität Stuttgart, Universität ... extraction ... Institute for Natural Language Processing Why do we need noise reduction strategies

  15. Semantic Lexicon Acquisition for Learning Natural Language Interfaces

    E-print Network

    English into Spanish, Japanese, and Turkish and ran experiments on learning database interfaces ... Semantic Lexicon Acquisition for Learning Natural Language Interfaces Cynthia A. Thompson ... @cs.utexas.edu, mooney@cs.utexas.edu Abstract This paper describes a system, WOLFIE (WOrd Learning From Interpreted

  16. Anaphora in Natural Language Processing and Information Retrieval.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth DuRoss

    1990-01-01

    Describes the linguistic phenomenon of anaphora; surveys the approaches to anaphora undertaken in theoretical linguistics and natural language processing (NLP); presents results of research conducted at Syracuse University on anaphora in information retrieval; and discusses the future of anaphora research in regard to information retrieval tasks.…

  17. Learning to Disambiguate Natural Language Using World Knowledge

    E-print Network

    Collobert, Ronan

    unique objects in the world, are ... Now at Google Labs, New York, USA. ... involved. If one wants ... Learning to Disambiguate Natural Language Using World Knowledge Antoine Bordes, Nicolas Usunier LIP ... allows both world knowledge and linguistic information to be used during learning and prediction. We show

  18. From Web Content Mining to Natural Language Processing

    E-print Network

    Illinois at Chicago, University of

    panels, copyright notices, etc. Surface Web and deep Web. Surface Web: pages that can be browsed using a Web browser. Deep Web: databases that can only be accessed through parameterized query interfaces ... From Web Content Mining to Natural Language Processing Bing Liu Department of Computer Science

  19. Enhancing Subject Access to OPACs: Controlled Vocabulary vs. Natural Language.

    ERIC Educational Resources Information Center

    Cousins, Shirley Anne

    1992-01-01

    Investigation of retrieval performance of controlled vocabulary derived from natural language terms in tables of contents and book indexes assumed that controlled vocabulary representative of users' queries should adequately represent documents' contents. Queries were indexed using Library of Congress Subject Headings (LCSH), Dewey Decimal…

  20. Introduction to Natural Language Processing Computer Science 585--Fall 2009

    E-print Network

    Smith, David A.

    (tm) ice cream [link], then take out my frustration on a variety of great flash games from PopCap Games ... Why NLP? Introduction to Natural Language Processing Computer Science 585--Fall 2009 University ... (E); in part by the National Science Foundation (Grant GP-2495), the National Institutes of Health

  1. Natural-language access to databases-theoretical/technical issues

    SciTech Connect

    Moore, R.C.

    1982-01-01

    Although there have been many experimental systems for natural-language access to databases, with some now going into actual use, many problems in this area remain to be solved. The author describes five problem areas that do not seem to be adequately handled by any existing system.

  2. A Controlled Natural Language for Business Intelligence Monitoring

    E-print Network

    Pace, Gordon J.

    A Controlled Natural Language for Business Intelligence Monitoring Christian Colombo, Jean ... not an option for non-technical business analysts. On the other hand, off-the-shelf business intelligence ... -the-shelf solution would be to present a simple interface which would allow a business intelligence analyst

  3. Recurrent Artificial Neural Networks and Finite State Natural Language Processing.

    ERIC Educational Resources Information Center

    Moisl, Hermann

    It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…

  4. IELR(1): Practical LR(1) Parser Tables for Non-LR(1) Grammars with

    E-print Network

    Malloy, Brian

    -mentation is feasible for generating minimal LR(1) parsers for those grammars. Categories and Subject ... IELR(1): Practical LR(1) Parser Tables for Non-LR(1) Grammars with Conflict Resolution Joel E ... of the art of practical LR parser table generation. Specifically, LALR sometimes generates parser tables

  5. UMLS knowledge for biomedical language processing.

    PubMed Central

    McCray, A T; Aronson, A R; Browne, A C; Rindflesch, T C; Razi, A; Srinivasan, S

    1993-01-01

    This paper describes efforts to provide access to the free text in biomedical databases. The focus of the effort is the development of SPECIALIST, an experimental natural language processing system for the biomedical domain. The system includes a broad coverage parser supported by a large lexicon, modules that provide access to the extensive Unified Medical Language System (UMLS) Knowledge Sources, and a retrieval module that permits experiments in information retrieval. The UMLS Metathesaurus and Semantic Network provide a rich source of biomedical concepts and their interrelationships. Investigations have been conducted to determine the type of information required to effect a map between the language of queries and the language of relevant documents. Mappings are never straightforward and often involve multiple inferences. PMID:8472004

  6. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

    ERIC Educational Resources Information Center

    Jarman, Jay

    2011-01-01

    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  7. Developing Formal Correctness Properties from Natural Language Requirements

    NASA Technical Reports Server (NTRS)

    Nikora, Allen P.

    2006-01-01

    This viewgraph presentation reviews the rationale of the program to transform natural language specifications into formal notation, specifically to automate generation of Linear Temporal Logic (LTL) correctness properties from natural language temporal specifications. There are several reasons for this approach: (1) model-based techniques are becoming more widely accepted; (2) analytical verification techniques (e.g., model checking, theorem proving) are significantly more effective at detecting certain types of specification design errors (e.g., race conditions, deadlock) than manual inspection; (3) many requirements are still written in natural language, since specification languages and their associated tools have a steep learning curve and increased schedule and budget pressure on projects reduces training opportunities for engineers; and (4) formulation of correctness properties for system models can be a difficult problem. This has relevance to NASA in that it would simplify development of formal correctness properties, lead to more widespread use of model-based specification and design techniques, assist in earlier identification of defects, and reduce residual defect content for space mission software systems. The presentation also discusses potential applications, accomplishments and/or technological transfer potential, and the next steps.

  8. Blurring the Inputs: A Natural Language Approach to Sensitivity Analysis

    NASA Technical Reports Server (NTRS)

    Kleb, William L.; Thompson, Richard A.; Johnston, Christopher O.

    2007-01-01

    To document model parameter uncertainties and to automate sensitivity analyses for numerical simulation codes, a natural-language-based method to specify tolerances has been developed. With this new method, uncertainties are expressed in a natural manner, i.e., as one would on an engineering drawing, namely, 5.25 +/- 0.01. This approach is robust and readily adapted to various application domains because it does not rely on parsing the particular structure of input file formats. Instead, tolerances of a standard format are added to existing fields within an input file. As a demonstration of the power of this simple, natural language approach, a Monte Carlo sensitivity analysis is performed for three disparate simulation codes: fluid dynamics (LAURA), radiation (HARA), and ablation (FIAT). Effort required to harness each code for sensitivity analysis was recorded to demonstrate the generality and flexibility of this new approach.
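
    The idea of attaching tolerances to existing fields, independent of the input file's structure, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' tool: the regular expression for `value +/- tol`, the uniform sampling distribution, and the field names in the sample text are all hypothetical.

```python
import random
import re

# Matches tokens such as "5.25 +/- 0.01" anywhere in a text file
# (assumed tolerance syntax; the file's own structure is never parsed).
NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
TOLERANCE = re.compile(rf"({NUM})\s*\+/-\s*({NUM})")

def blur(text, rng):
    """Replace each 'value +/- tol' with a draw from [value - tol, value + tol]."""
    def sample(match):
        value = float(match.group(1))
        tol = abs(float(match.group(2)))
        # Uniform sampling is an assumption; another distribution could be used.
        return repr(rng.uniform(value - tol, value + tol))
    return TOLERANCE.sub(sample, text)

rng = random.Random(0)  # seeded so the runs are repeatable
template = "wall_temperature = 5.25 +/- 0.01\nmach = 20.0\n"
# Each call yields one perturbed input file for a Monte Carlo run.
samples = [blur(template, rng) for _ in range(3)]
```

    Fields without a tolerance, such as the second line of the sample text, pass through unchanged, which is what lets the same approach wrap input files of different formats.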

  9. Proceedings of the Fourth International Natural Language Generation Conference, pages 95-102, Sydney, July 2006. ©2006 Association for Computational Linguistics

    E-print Network

    Proceedings of the Fourth International Natural Language Generation Conference, pages 95 ... Databases using Natural Language Generation Techniques Catalina Hallett Center for Research in Computing ... presents a method of querying databases by means of a natural language-like interface which offers

  10. NLP and Linguistics Introduction to Natural Language Processing

    E-print Network

    Smith, David A.

    Engineering vs. Science? · One story · NLP took formal language theory and generative linguistics (same source ... observe language in context Children observe frequencies of language ... Language learning: Children in context Children observe frequencies of language ... Language learning: Children listen to language

  11. Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages, pages 27-34, Hyderabad, India, January 2008. ©2008 Asian Federation of Natural Language Processing

    E-print Network

    from at least two drawbacks, which we will call the `Expertise Problem' and the `Half-Life Problem technology for building parsers. The Half-Life Problem concerns the fact that once a parser has been built parser from a corpus (see for example Creutz and Lagus, 2007; Goldsmith, 2001; Goldsmith and Hu, 2004

  12. Natural Language, Knowledge Representation and Discourse James F Allen and Lenhart K Schubert

    E-print Network

    Natural Language, Knowledge Representation and Discourse James F Allen and Lenhart K Schubert Department of Computer Science, University of Rochester, Rochester, NY 14627 Goals Natural language ... and in text analysis and information retrieval. Unfortunately, in most approaches the natural language

  13. Using natural language processing techniques to inform research on nanotechnology

    PubMed Central

    Lewinski, Nastassja A

    2015-01-01

    Summary Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics. PMID:26199848

  14. Knowledge discovery and data mining to assist natural language understanding.

    PubMed Central

    Wilcox, A.; Hripcsak, G.

    1998-01-01

    As natural language processing systems become more frequent in clinical use, methods for interpreting the output of these programs become increasingly important. These methods require the effort of a domain expert, who must build specific queries and rules for interpreting the processor output. Knowledge discovery and data mining tools can be used instead of a domain expert to automatically generate these queries and rules. C5.0, a decision tree generator, was used to create a rule base for a natural language understanding system. A general-purpose natural language processor using this rule base was tested on a set of 200 chest radiograph reports. When a small set of reports, classified by physicians, was used as the training set, the generated rule base performed as well as lay persons, but worse than physicians. When a larger set of reports, using ICD9 coding to classify the set, was used for training the system, the rule base performed worse than the physicians and lay persons. It appears that a larger, more accurate training set is needed to increase performance of the method. PMID:9929336

  15. The future role of language resources for natural language parsing (We won't be able to rely on Pierre Vinken forever...

    E-print Network

    The future role of language resources for natural language parsing (We won't be able to rely natural language parsing. I will also reflect on what the translation of existing resources into other@illinois.edu The transformation that natural language parsing has undergone since the nineties would have been impossible without

  16. Human task animation from performance models and natural language input

    NASA Technical Reports Server (NTRS)

    Esakov, Jeffrey; Badler, Norman I.; Jung, Moon

    1989-01-01

    Graphical manipulation of human figures is essential for certain types of human factors analyses such as reach, clearance, fit, and view. In many situations, however, the animation of simulated people performing various tasks may be based on more complicated functions involving multiple simultaneous reaches, critical timing, resource availability, and human performance capabilities. One rather effective means for creating such a simulation is through a natural language description of the tasks to be carried out. Given an anthropometrically-sized figure and a geometric workplace environment, various simple actions such as reach, turn, and view can be effectively controlled from language commands or standard NASA checklist procedures. The commands may also be generated by external simulation tools. Task timing is determined from actual performance models, if available, such as strength models or Fitts' Law. The resulting action specifications are animated on a Silicon Graphics Iris workstation in real time.
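
    To make the timing step concrete, here is a small Python sketch of Fitts' Law in its original formulation, MT = a + b * log2(2D / W). The abstract does not say which formulation or constants the system uses, so the function name and the values of `a` and `b` below are hypothetical.

```python
import math

def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted time (s) to reach a target of size `width` at `distance`.

    Original Fitts formulation: MT = a + b * log2(2 * distance / width).
    The constants a (s) and b (s/bit) are placeholder values; in practice
    they are fitted to measured performance data.
    """
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    index_of_difficulty = math.log2(2.0 * distance / width)  # in bits
    return a + b * index_of_difficulty

# A reach of 40 cm to a 10 cm target: index of difficulty = log2(8) = 3 bits.
reach_time = fitts_movement_time(distance=40.0, width=10.0)
```

    An animation system could use such a prediction as the duration of a reach action when no task-specific timing is supplied.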

  17. Medical Facts to Support Inferencing in Natural Language Processing

    PubMed Central

    Rindflesch, Thomas C.; Pakhomov, Serguei V.; Fiszman, Marcelo; Kilicoglu, Halil; Sanchez, Vincent R.

    2005-01-01

    We report on the use of medical facts to support the enhancement of natural language processing of biomedical text. Inferencing in semantic interpretation depends on a fact repository as well as an ontology. We used statistical methods to construct a repository of drug-disorder co-occurrences from a large collection of clinical notes, and this resource is used to validate inferences automatically drawn during semantic interpretation of Medline citations about pharmacologic interventions for disease. We evaluated the results against a published reference standard for treatment of diseases. PMID:16779117

  18. Augmenting a database knowledge representation for natural language generation

    SciTech Connect

    McCoy, K.F.

    1982-01-01

    The knowledge representation is an important factor in natural language generation since it limits the semantic capabilities of the generation system. This paper identifies several information types in a knowledge representation that can be used to generate meaningful responses to questions about database structure. Creating such a knowledge representation, however, is a long and tedious process. A system is presented which uses the contents of the database to form part of this knowledge representation automatically. It employs three types of world knowledge axioms to ensure that the representation formed is meaningful and contains salient information. 7 references.

  19. Emerging Approach of Natural Language Processing in Opinion Mining: A Review

    NASA Astrophysics Data System (ADS)

    Kim, Tai-Hoon

    Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages. This paper outlines a framework for using computer and natural language techniques to help learners at various levels learn foreign languages in a computer-based learning environment. We propose some ideas for using the computer as a practical tool for learning a foreign language, where most of the courseware is generated automatically. We then describe how to build computer-based learning tools, discuss their effectiveness, and conclude with some possibilities for using on-line resources.

  20. The integration hypothesis of human language evolution and the nature of contemporary languages

    E-print Network

    Miyagawa, Shigeru

    How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa et al. (2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, that holds that human language ...

  1. Understanding the Bottom-Up SLR Parser Sami Khuri Jason Williams

    E-print Network

    Khuri, Sami

    -traditional environment, 1. INTRODUCTION In this work we show how the techniques behind the bottom-up SLR parser can ... Understanding the Bottom-Up SLR Parser Sami Khuri Jason Williams (408) 924-5081 (408) 996 ... behind the bottom-up SLR parser can be used to perform computer animation. The different phases

  2. An alternative method of training probabilistic LR parsers Mark-Jan Nederhof

    E-print Network

    An alternative method of training probabilistic LR parsers Mark-Jan Nederhof Faculty of Arts ... @dei.unipd.it Abstract We discuss existing approaches to train LR parsers, which have been used for statistical ... in terms of a context-free grammar cannot be expressed in terms of the LR parser constructed from

  3. Validating LR(1) Parsers Jacques-Henri Jourdan1,2

    E-print Network

    . However, Barthwal and Norrish's proof is specific to a particular parser generator that only accepts SLR ... validation of an LR(1) automaton produced by an untrusted parser generator, as depicted in Fig. 1. After ... Validating LR(1) Parsers Jacques-Henri Jourdan, François Pottier, and Xavier Leroy ... École

  4. Validating LR(1) Parsers Jacques-Henri Jourdan1,2

    E-print Network

    Leroy, Xavier

    [Fig. 1. General architecture. Labels: instrumented parser generator; grammar; LR(1) automaton; certificate; validator (OK / error); pushdown ...; legend: not verified, not trusted.] ... directly to a hand-written or generated parser. However ... 's proof is specific to a particular parser generator that only accepts SLR grammars. It so happens

  5. COMPARING HUMAN AND MACHINE PERFORMANCE FOR NATURAL LANGUAGE INFORMATION EXTRACTION

    E-print Network

    COMPARING HUMAN AND MACHINE PERFORMANCE FOR NATURAL LANGUAGE INFORMATION EXTRACTION: Results ... for studying human performance was to better understand the nature of the extraction task and the relative ... the state of technology for extracting information from natural language text by machine, it is valuable

  6. Semantic Grammar: A Technique for Constructing Natural Language Interfaces to Instructional Systems.

    ERIC Educational Resources Information Center

    Burton, Richard R.; Brown, John Seely

    A major obstacle to the effective educational use of computers is the lack of a natural means of communication between the student and the computer. This report describes a technique for generating such natural language front-ends for advanced instructional systems. It discusses: (1) the essential properties of a natural language front-end, (2)…

  7. Evaluation of Natural Language Tools for Italian: EVALITA 2007 B. Magnini1

    E-print Network

    Mazzei, Alessandro

    Evaluation of Natural Language Tools for Italian: EVALITA 2007 B. Magnini1 , A. Cappelli2 , F Processing tools for Italian, provided a shared framework where participants' systems had the possibility is to promote the development of language technologies for the Italian language, by providing a shared framework

  8. ONE GRAMMAR OR TWO? Sign Languages and the Nature of Human Language.

    PubMed

    Lillo-Martin, Diane C; Gajewski, Jon

    2014-01-01

    Linguistic research has identified abstract properties that seem to be shared by all languages - such properties may be considered defining characteristics. In recent decades, the recognition that human language is found not only in the spoken modality, but also in the form of sign languages, has led to a reconsideration of some of these potential linguistic universals. In large part, the linguistic analysis of sign languages has led to the conclusion that universal characteristics of language can be stated at an abstract enough level to include languages in both spoken and signed modalities. For example, languages in both modalities display hierarchical structure at sub-lexical and phrasal level, and recursive rule application. However, this does not mean that modality-based differences between signed and spoken languages are trivial. In this article, we consider several candidate domains for modality effects, in light of the overarching question: are signed and spoken languages subject to the same abstract grammatical constraints, or is a substantially different conception of grammar needed for the sign language case? We look at differences between language types based on the use of space, iconicity, and the possibility for simultaneity in linguistic expression. The inclusion of sign languages does support some broadening of the conception of human language - in ways that are applicable for spoken languages as well. Still, the overall conclusion is that one grammar applies for human language, no matter the modality of expression. PMID:25013534

  9. Building Gold Standard Corpora for Medical Natural Language Processing Tasks

    PubMed Central

    Deleger, Louise; Li, Qi; Lingren, Todd; Kaiser, Megan; Molnar, Katalin; Stoutenborough, Laura; Kouril, Michal; Marsolo, Keith; Solti, Imre

    2012-01-01

    We present the construction of three annotated corpora to serve as gold standards for medical natural language processing (NLP) tasks. Clinical notes from the medical record, clinical trial announcements, and FDA drug labels are annotated. We report high inter-annotator agreements (overall F-measures between 0.8467 and 0.9176) for the annotation of Personal Health Information (PHI) elements for a de-identification task and of medications, diseases/disorders, and signs/symptoms for an information extraction (IE) task. The annotated corpora of clinical trials and FDA labels will be publicly released. To facilitate translational NLP tasks that require cross-corpora interoperability (e.g., clinical trial eligibility screening), their annotation schemas are aligned with a large-scale, NIH-funded clinical text annotation project. PMID:23304283

  10. What can Natural Language Processing do for Clinical Decision Support?

    PubMed Central

    Demner-Fushman, Dina; Chapman, Wendy W.; McDonald, Clement J.

    2009-01-01

    Computerized Clinical Decision Support (CDS) aims to aid decision making of health care providers and the public by providing easily accessible health-related information at the point and time it is needed. Natural Language Processing (NLP) is instrumental in using free-text information to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative. The early innovative NLP research of clinical narrative was followed by a period of stable research conducted at the major clinical centers and a shift of mainstream interest to biomedical NLP. This review primarily focuses on the recently renewed interest in development of fundamental NLP methods and advances in the NLP systems for CDS. The current solutions to challenges posed by distinct sublanguages, intended user groups, and support goals are discussed. PMID:19683066

  11. AutoTutor: a tutor with dialogue in natural language.

    PubMed

    Graesser, Arthur C; Lu, Shulan; Jackson, George Tanner; Mitchell, Heather Hite; Ventura, Mathew; Olney, Andrew; Louwerse, Max M

    2004-05-01

    AutoTutor is a learning environment that tutors students by holding a conversation in natural language. AutoTutor has been developed for Newtonian qualitative physics and computer literacy. Its design was inspired by explanation-based constructivist theories of learning, intelligent tutoring systems that adaptively respond to student knowledge, and empirical research on dialogue patterns in tutorial discourse. AutoTutor presents challenging problems (formulated as questions) from a curriculum script and then engages in mixed initiative dialogue that guides the student in building an answer. It provides the student with positive, neutral, or negative feedback on the student's typed responses, pumps the student for more information, prompts the student to fill in missing words, gives hints, fills in missing information with assertions, identifies and corrects erroneous ideas, answers the student's questions, and summarizes answers. AutoTutor has produced learning gains of approximately .70 sigma for deep levels of comprehension. PMID:15354683

  12. Detection of Blood Culture Bacterial Contamination using Natural Language Processing

    PubMed Central

    Matheny, Michael E.; FitzHenry, Fern; Speroff, Theodore; Hathaway, Jacob; Murff, Harvey J.; Brown, Steven H.; Fielstein, Elliot M.; Dittus, Robert S.; Elkin, Peter L.

    2009-01-01

    Microbiology results are reported in semi-structured formats and have a high content of useful patient information. We developed and validated a hybrid regular expression and natural language processing solution for processing blood culture microbiology reports. Multi-center Veterans Affairs training and testing data sets were randomly extracted and manually reviewed to determine the culture and sensitivity as well as contamination results. The tool was iteratively developed for both outcomes using a training dataset, and then evaluated on the test dataset to determine antibiotic susceptibility data extraction and contamination detection performance. Our algorithm had a sensitivity of 84.8% and a positive predictive value of 96.0% for mapping the antibiotics and bacteria with appropriate sensitivity findings in the test data. The bacterial contamination detection algorithm had a sensitivity of 83.3% and a positive predictive value of 81.8%. PMID:20351890
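
    The two figures reported above, sensitivity and positive predictive value, are computed from counts of true positives, false positives, and false negatives against the manual review. A minimal Python sketch follows; the counts are hypothetical round numbers chosen for illustration and are not the study's data.

```python
def sensitivity(tp, fn):
    """Share of manually confirmed cases the tool found: TP / (TP + FN)."""
    return tp / (tp + fn)

def positive_predictive_value(tp, fp):
    """Share of the tool's positive calls that were correct: TP / (TP + FP)."""
    return tp / (tp + fp)

# Hypothetical confusion counts, not taken from the study.
tp, fp, fn = 80, 10, 20
sens = sensitivity(tp, fn)               # 80 / 100
ppv = positive_predictive_value(tp, fp)  # 80 / 90
```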

  13. Towards a semantic lexicon for clinical natural language processing.

    PubMed

    Liu, Hongfang; Wu, Stephen T; Li, Dingcheng; Jonnalagadda, Siddhartha; Sohn, Sunghwan; Wagholikar, Kavishwar; Haug, Peter J; Huff, Stanley M; Chute, Christopher G

    2012-01-01

    A semantic lexicon which associates words and phrases in text to concepts is critical for extracting and encoding clinical information in free text and therefore achieving semantic interoperability between structured and unstructured data in Electronic Health Records (EHRs). Directly using existing standard terminologies may have limited coverage with respect to concepts and their corresponding mentions in text. In this paper, we analyze how tokens and phrases in a large corpus distribute and how well the UMLS captures the semantics. A corpus-driven semantic lexicon, MedLex, has been constructed where the semantics is based on the UMLS assisted with variants mined and usage information gathered from clinical text. The detailed corpus analysis of tokens, chunks, and concept mentions shows the UMLS is an invaluable source for natural language processing. Increasing the semantic coverage of tokens provides a good foundation in capturing clinical information comprehensively. The study also yields some insights in developing practical NLP systems. PMID:23304329

  14. Natural Language Processing Methods and Systems for Biomedical Ontology Learning

    PubMed Central

    Liu, Kaihong; Hogan, William R.; Crowley, Rebecca S.

    2010-01-01

    While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they must achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships as well as difficulty in updating the ontology as knowledge changes. Methodologies developed in the fields of natural language processing, information extraction, information retrieval and machine learning provide techniques for automating the enrichment of an ontology from free-text documents. In this article, we review existing methodologies and developed systems, and discuss how existing methods can benefit the development of biomedical ontologies. PMID:20647054

  15. Natural language processing in biomedicine: a unified system architecture overview.

    PubMed

    Doan, Son; Conway, Mike; Phuong, Tu Minh; Ohno-Machado, Lucila

    2014-01-01

    In contemporary electronic medical records much of the clinically important data (signs and symptoms, symptom severity, disease status, etc.) are not provided in structured data fields but rather are encoded in clinician-generated narrative text. Natural language processing (NLP) provides a means of unlocking this important data source for applications in clinical decision support, quality assurance, and public health. This chapter provides an overview of representative NLP systems in biomedicine based on a unified architectural view. A general architecture in an NLP system consists of two main components: background knowledge that includes biomedical knowledge resources and a framework that integrates NLP tools to process text. Systems differ in both components, which we review briefly. Additionally, the challenge facing current research efforts in biomedical NLP includes the paucity of large, publicly available annotated corpora, although initiatives that facilitate data sharing, system evaluation, and collaborative work between researchers in clinical NLP are starting to emerge. PMID:24870142

  16. Natural language estimates of nonlinear response structural sensitivity

    NASA Astrophysics Data System (ADS)

    Kleiber, M.

    1989-09-01

    In many cases attempts to employ numeric, or quantitative, methods to construct adequate computer models of real engineering situations fall definitely short of expectations. It is believed that one of the reasons may be the computer's inability to process imprecise, or fuzzy, terms like “very low”, “about four to six”, etc., which are typical of any judgements made by humans. The objective of the paper is to propose a computer-assisted algorithm for assessing nonlinear response structural sensitivity to imperfections. Fuzzy set theory is used to represent imperfections in terms of natural language expressions. Both the theory and an illustrative example are meant to display the significance of such approximate reasoning in getting realistic estimates for structural sensitivity.
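
    As an illustration of how a term such as “about four to six” can be given a fuzzy-set meaning, the following is a minimal Python sketch. It is not taken from the paper: the trapezoidal shape, the breakpoints 3, 4, 6 and 7, and the function names are all assumptions chosen for the example. The `very` hedge uses Zadeh's common convention of squaring the membership grade.

```python
def trapezoidal(x, a, b, c, d):
    """Membership grade of x in a trapezoidal fuzzy set.

    The grade rises from 0 at a to 1 at b, stays 1 up to c,
    and falls back to 0 at d (requires a < b <= c < d).
    """
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def about_four_to_six(x):
    # Hypothetical breakpoints: fully true on [4, 6], fading out by 3 and 7.
    return trapezoidal(x, 3.0, 4.0, 6.0, 7.0)


def very(grade):
    # Zadeh's concentration hedge: "very A" has membership grade mu_A(x) ** 2.
    return grade ** 2


if __name__ == "__main__":
    for value in (2.0, 3.5, 5.0, 6.5, 8.0):
        print(value, about_four_to_six(value), very(about_four_to_six(value)))
```

    A value of 5 belongs fully to the set, while 3.5 belongs to it only to degree 0.5, which is the kind of graded judgement the abstract refers to.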

  17. Tasking and sharing sensing assets using controlled natural language

    NASA Astrophysics Data System (ADS)

    Preece, Alun; Pizzocaro, Diego; Braines, David; Mott, David

    2012-06-01

    We introduce an approach to representing intelligence, surveillance, and reconnaissance (ISR) tasks at a relatively high level in controlled natural language. We demonstrate that this facilitates both human interpretation and machine processing of tasks. More specifically, it allows the automatic assignment of sensing assets to tasks, and the informed sharing of tasks between collaborating users in a coalition environment. To enable automatic matching of sensor types to tasks, we created a machine-processable knowledge representation based on the Military Missions and Means Framework (MMF), and implemented a semantic reasoner to match task types to sensor types. We combined this mechanism with a sensor-task assignment procedure based on a well-known distributed protocol for resource allocation. In this paper, we re-formulate the MMF ontology in Controlled English (CE), a type of controlled natural language designed to be readable by a native English speaker whilst representing information in a structured, unambiguous form to facilitate machine processing. We show how CE can be used to describe both ISR tasks (for example, detection, localization, or identification of particular kinds of object) and sensing assets (for example, acoustic, visual, or seismic sensors, mounted on motes or unmanned vehicles). We show how these representations enable an automatic sensor-task assignment process. Where a group of users are cooperating in a coalition, we show how CE task summaries give users in the field a high-level picture of ISR coverage of an area of interest. This allows them to make efficient use of sensing resources by sharing tasks.

  18. The Nature of Spanish versus English Language Use at Home

    ERIC Educational Resources Information Center

    Branum-Martin, Lee; Mehta, Paras D.; Carlson, Coleen D.; Francis, David J.; Goldenberg, Claude

    2014-01-01

    Home language experiences are important for children's development of language and literacy. However, the home language context is complex, especially for Spanish-speaking children in the United States. A child's use of Spanish or English likely ranges along a continuum, influenced by preferences of particular people involved, such as parents,…

  19. A Statistical Parser for Czech

    E-print Network

    Collins, Michael

    A Statistical Parser for Czech Michael Collins AT&T Labs--Research, Shannon Laboratory, 180 in building on the parsing model of (Collins 97). Our final results -- 80% dependency accuracy -- represent of (Collins 97), which recovers dependencies with 72% accuracy. We then describe a series of refinements

  20. Weakly Supervised Training of Semantic Parsers Jayant Krishnamurthy

    E-print Network

    Mitchell, Tom

    with state-of-the-art accuracy, while simultaneously recovering much richer semantic structures, such as con corpus. The trained semantic parser extracts binary relations with state-of-the-art performance, while, represented by the "Lex" entries. The second stage applies CCG combination rules, in this case bot

  1. Comparing Italian parsers on a common treebank: the Evalita experience

    E-print Network

    Mazzei, Alessandro

    Comparing Italian parsers on a common treebank: the Evalita experience C. Bosco , A. Mazzei , V contest among parsing systems for Italian. It is the first attempt to compare the approaches of the parsing results are very promising and higher than the state-of-the-art for dependency parsing of Italian

  2. Linking Parser Development to Acquisition of Syntactic Knowledge

    ERIC Educational Resources Information Center

    Omaki, Akira; Lidz, Jeffrey

    2015-01-01

    Traditionally, acquisition of syntactic knowledge and the development of sentence comprehension behaviors have been treated as separate disciplines. This article reviews a growing body of work on the development of incremental sentence comprehension mechanisms and discusses how a better understanding of the developing parser can shed light on two…

  3. Integrating case-based learning and cognitive biases for machine learning of natural language

    E-print Network

    Cardie, Claire

    Integrating case-based learning and cognitive biases for machine learning of natural language@cs.cornell.edu Running head: Integrating CBL and cognitive biases August 9, 1999 1 Integrating case-based learning and cognitive biases for machine learning of natural language Abstract This paper shows that psychological

  4. Testing of a Natural Language Retrieval System for a Full Text Knowledge Base.

    ERIC Educational Resources Information Center

    Bernstein, Lionel M.; Williamson, Robert E.

    1984-01-01

    The Hepatitis Knowledge Base (text of prototype information system) was used for modifying and testing "A Navigator of Natural Language Organized (Textual) Data" (ANNOD), a retrieval system which combines probabilistic, linguistic, and empirical means to rank individual paragraphs of full text for similarity to natural language queries proposed by…

  5. A Visually Grounded Natural Language Interface for Reference to Spatial Scenes

    E-print Network

    Gorniak, Peter

    using only language (e.g., "the cylinder at the back under a cube" in the 3-D design case, or in a carA Visually Grounded Natural Language Interface for Reference to Spatial Scenes Peter Gorniak MIT in the shared space. We present an analysis of how people describe objects in spatial scenes using natural

  6. Planning in AI and Text Planning in Natural Language JongGyun Lim

    E-print Network

    generation to design systems which can automate the generation of communicative acts. Text planning is a task1 Planning in AI and Text Planning in Natural Language Generation Jong-Gyun Lim Columbia University the content and structure of the natural language text and that of other AI planning tasks. The problem

  7. Success story in software engineering using NIAM (Natural language Information Analysis Methodology)

    SciTech Connect

    Eaton, S.M.; Eaton, D.S.

    1995-10-01

    To create an information system, we employ NIAM (Natural language Information Analysis Methodology). NIAM supports the goals of both the customer and the analyst completely understanding the information. We use the customer's own unique vocabulary, collect real examples, and validate the information in natural language sentences. Examples are discussed from a successfully implemented information system.

  8. A Prototype Natural Language Interface to a Large Complex Knowledge Base, the Foundational Model of Anatomy

    E-print Network

    Washington at Seattle, University of

    Model of Anatomy (FMA)1 . We describe a program, named GAPP, which takes natural language (NL) questionsA Prototype Natural Language Interface to a Large Complex Knowledge Base, the Foundational Model of Anatomy Gregory Distelhorst, Vishrut Srivastava, Cornelius Rosse, MD, DSc, and James F. Brinkley, MD, Ph

  9. NATURAL LANGUAGE DATA BASE ACCESS WITH PEARL Wendy Lehnert and Steve Shwartz

    E-print Network

    NATURAL LANGUAGE DATA BASE ACCESS WITH PEARL Wendy Lehnert and Steve Shwartz Department of Computer- face or "front-end" system that can process requests stated in the user's natural language. PEARL-specific queries to existing data bases. PEARL analyzes English input with expectation-driven parsing techniques

  10. Natural and Artificial Intelligence, Language, Consciousness, Emotion, and Anticipation

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2010-11-01

    The classical paradigm of the neural brain as the seat of human natural intelligence is too restrictive. This paper defends the idea that the neural ectoderm is the actual brain, based on the development of the human embryo. Indeed, the neural ectoderm includes the neural crest, given by pigment cells in the skin and ganglia of the autonomic nervous system, and the neural tube, given by the brain, the spinal cord, and motor neurons. So the brain is completely integrated in the ectoderm, and cannot work alone. The paper presents fundamental properties of the brain as follows. Firstly, Paul D. MacLean proposed the triune human brain, which consists of three brains in one, following the species evolution, given by the reptilian complex, the limbic system, and the neo-cortex. Secondly, consciousness and conscious awareness are analysed. Thirdly, the anticipatory unconscious free will and conscious free veto are described in agreement with the experiments of Benjamin Libet. Fourthly, the main section explains the development of the human embryo and shows that the neural ectoderm is the whole neural brain. Fifthly, a conjecture is proposed that the neural brain is completely programmed with scripts written in biological low-level and high-level languages, in a manner similar to the programming of cells by the genetic code. Finally, it is concluded that the proposition of the neural ectoderm as the whole neural brain is a breakthrough in the understanding of natural intelligence, and also in the future design of robots with artificial intelligence.

  11. Using a natural language and gesture interface for unmanned vehicles

    NASA Astrophysics Data System (ADS)

    Perzanowski, Dennis; Schultz, Alan C.; Adams, William; Marsh, Elaine

    2000-07-01

    Unmanned vehicles, such as mobile robots, must exhibit adjustable autonomy. They must be able to be self-sufficient when the situation warrants; however, as they interact with each other and with humans, they must exhibit an ability to dynamically adjust their independence or dependence as co-operative agents attempting to achieve some goal. This is what we mean by adjustable autonomy. We have been investigating various modes of communication that enhance a robot's capability to work interactively with other robots and with humans. Specifically, we have been investigating how natural language and gesture can provide a user-friendly interface to mobile robots. We have extended this initial work to include semantic and pragmatic procedures that allow humans and robots to act co-operatively, based on whether or not goals have been achieved by the various agents in the interaction. By processing commands that are either spoken or initiated by clicking buttons on a Personal Digital Assistant and by gesturing either naturally or symbolically, we are tracking the various goals of the interaction, the agent involved in the interaction, and whether or not the goal has been achieved. The various agents involved in achieving the goals are each aware of their own and others' goals and what goals have been stated or accomplished so that eventually any member of the group, be it robot or a human, if necessary, can interact with the other members to achieve the stated goals of a mission.

  12. "Natural" Language Learning and Learning a Foreign Language in the Classroom.

    ERIC Educational Resources Information Center

    Harris, Vee

    1988-01-01

    Compares the ways students acquire their native language and learn a foreign language. Findings revealed additional areas in need of investigation, including the kinds of activities which tap students' ability to acquire language and the ways in which formal learning can use these activities. (CB)

  13. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

    PubMed Central

    Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure. PMID:24982952

  14. Three-dimensional grammar in the brain: Dissociating the neural correlates of natural sign language and manually coded spoken language.

    PubMed

    Jednoróg, Katarzyna; Bola, Łukasz; Mostowski, Piotr; Szwed, Marcin; Boguszewski, Paweł M; Marchewka, Artur; Rutkowski, Paweł

    2015-05-01

    In several countries natural sign languages were considered inadequate for education. Instead, new sign-supported systems were created, based on the belief that spoken/written language is grammatically superior. One such system called SJM (system językowo-migowy) preserves the grammatical and lexical structure of spoken Polish and since the 1960s has been extensively employed in schools and on TV. Nevertheless, the Deaf community avoids using SJM for everyday communication, its preferred language being PJM (polski język migowy), a natural sign language, structurally and grammatically independent of spoken Polish and featuring classifier constructions (CCs). Here, for the first time, we compare, using fMRI, the neural bases of natural vs. devised communication systems. Deaf signers were presented with three types of signed sentences (SJM and PJM with/without CCs). Consistent with previous findings, PJM with CCs compared to either SJM or PJM without CCs recruited the parietal lobes. The reverse comparison revealed activation in the anterior temporal lobes, suggesting increased semantic combinatory processes in lexical sign comprehension. Finally, PJM compared with SJM engaged left posterior superior temporal gyrus and anterior temporal lobe, areas crucial for sentence-level speech comprehension. We suggest that activity in these two areas reflects greater processing efficiency for naturally evolved sign language. PMID:25858311

  15. Applications and Discovery of Granularity Structures in Natural Language Rutu Mulkar-Mehta, Jerry Hobbs and Eduard Hovy

    E-print Network

    Hobbs, Jerry R.

    Applications and Discovery of Granularity Structures in Natural Language Discourse Rutu Mulkar-Mehta of Artificial Intelligence (www.aaai.org). All rights reserved. natural language. In our previous work (Mulkar-Mehta

  16. Automatic retrieval of bone fracture knowledge using natural language processing.

    PubMed

    Do, Bao H; Wu, Andrew S; Maley, Joan; Biswal, Sandip

    2013-08-01

    Natural language processing (NLP) techniques to extract data from unstructured text into formal computer representations are valuable for creating robust, scalable methods to mine data in medical documents and radiology reports. As voice recognition (VR) becomes more prevalent in radiology practice, there is opportunity for implementing NLP in real time for decision-support applications such as context-aware information retrieval. For example, as the radiologist dictates a report, an NLP algorithm can extract concepts from the text and retrieve relevant classification or diagnosis criteria or calculate disease probability. NLP can work in parallel with VR to potentially facilitate evidence-based reporting (for example, automatically retrieving the Bosniak classification when the radiologist describes a kidney cyst). For these reasons, we developed and validated an NLP system which extracts fracture and anatomy concepts from unstructured text and retrieves relevant bone fracture knowledge. We implement our NLP in an HTML5 web application to demonstrate a proof-of-concept feedback NLP system which retrieves bone fracture knowledge in real time. PMID:23053906

  17. Nature and Nurture in School-Based Second Language Achievement

    ERIC Educational Resources Information Center

    Dale, Philip S.; Harlaar, Nicole; Plomin, Robert

    2012-01-01

    Variability in achievement across learners is a hallmark of second language (L2) learning, especially in academic-based learning. The Twins Early Development Study (TEDS), based on a large, population-representative sample in the United Kingdom, provides the first opportunity to examine individual differences in second language achievement in a…

  18. Notes on the Nature of Bilingual Specific Language Impairment

    ERIC Educational Resources Information Center

    de Jong, Jan

    2010-01-01

    Johanne Paradis' Keynote Article can be read as a concise critical review of the research that focuses on the sometimes strained relationship between bilingualism and specific language impairment (SLI). In my comments I will add some thoughts based on our own research on the learning of Dutch as a second language (L2) by children with SLI.

  19. Storing files in a parallel computing system based on user-specified parser function

    DOEpatents

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron

    2014-10-21

    Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.

  20. Applying semantic-based probabilistic context-free grammar to medical language processing--a preliminary study on parsing medication sentences.

    PubMed

    Xu, Hua; AbdelRahman, Samir; Lu, Yanxin; Denny, Joshua C; Doan, Son

    2011-12-01

    Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently result in two or more parse trees. One possible solution, which has not been extensively explored previously, is to augment productions in medical sublanguage grammars with probabilities to resolve the ambiguity. In this study, we associated probabilities with production rules in a semantic-based grammar for medication findings and evaluated its performance on reducing parsing ambiguity. Using the existing data set from 2009 i2b2 NLP (Natural Language Processing) challenge for medication extraction, we developed a semantic-based CFG (Context Free Grammar) for parsing medication sentences and manually created a Treebank of 4564 medication sentences from discharge summaries. Using the Treebank, we derived a semantic-based PCFG (Probabilistic Context Free Grammar) for parsing medication sentences. Our evaluation using a 10-fold cross validation showed that the PCFG parser dramatically improved parsing performance when compared to the CFG parser. PMID:21856440
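
    The idea of using rule probabilities to choose between competing parse trees can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the grammar symbols (`MED`, `DOSE`, `FREQ`, and so on), the probabilities, and the example phrase are hypothetical and are not the grammar or Treebank from the study. Lexical rules are treated as having probability 1 to keep the sketch short.

```python
# Hypothetical rule probabilities; for each left-hand side they sum to 1.
RULE_PROB = {
    ("MED", ("DRUG", "DOSE", "FREQ")): 0.5,
    ("MED", ("DRUG", "FREQ")): 0.3,
    ("MED", ("DRUG", "DOSE")): 0.2,
    ("DOSE", ("NUM", "UNIT")): 0.8,
    ("DOSE", ("NUM",)): 0.2,
    ("FREQ", ("NUM", "PERIOD")): 0.6,
    ("FREQ", ("PERIOD",)): 0.4,
}


def tree_probability(tree):
    """Probability of a parse tree as the product of its rule probabilities.

    A tree is a tuple (label, child, ...); a child is either a subtree or,
    under a preterminal, a word string.
    """
    label, *children = tree
    if all(isinstance(child, str) for child in children):
        return 1.0  # lexical rule: simplifying assumption for this sketch
    rhs = tuple(child[0] for child in children)
    probability = RULE_PROB[(label, rhs)]
    for child in children:
        probability *= tree_probability(child)
    return probability


# Two competing analyses of the ambiguous phrase "aspirin 2 daily".
parse_a = ("MED", ("DRUG", "aspirin"),
           ("DOSE", ("NUM", "2")),
           ("FREQ", ("PERIOD", "daily")))
parse_b = ("MED", ("DRUG", "aspirin"),
           ("FREQ", ("NUM", "2"), ("PERIOD", "daily")))

best_parse = max([parse_a, parse_b], key=tree_probability)

if __name__ == "__main__":
    print(tree_probability(parse_a), tree_probability(parse_b))
    print(best_parse)
```

    A plain CFG would return both trees with no way to rank them; with probabilities attached, the reading in which "2 daily" is a frequency scores higher under these made-up numbers and is selected.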

  1. Natural Language Query System Design for Interactive Information Storage and Retrieval Systems. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (editor); Liu, I-Hsiung

    1985-01-01

    The currently developed multi-level language interfaces of information systems are generally designed for experienced users. These interfaces commonly ignore the nature and needs of the largest user group, i.e., casual users. This research identifies the importance of natural language query system research within information storage and retrieval system development; addresses the topics of developing such a query system; and finally, proposes a framework for the development of natural language query systems in order to facilitate the communication between casual users and information storage and retrieval systems.

  2. Parallel Earley's parser and its application to syntactic image analysis

    SciTech Connect

    Chiang, Y.P.; Fu, K.S.

    1983-01-01

    A complete Earley parser which includes recognition and parse extraction has been implemented on a triangular array of processors. The detailed analysis of the complete parser is given. The recognition algorithm is executed in parallel by adopting a new operator, x*, and restricting the input context-free grammar to be lambda-free. The parse extraction algorithm which follows recognition uses a nonrecursive subroutine to generate the correct right-parse in parallel. A special busing arrangement within this array enables the right data to reach the right place at the right time. Simulation examples are provided. The results show that when a string of length n is under testing, at the system time 2n + 1, the correct right-parse will be obtained if the string is accepted. 15 references.
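
    For orientation, a conventional sequential Earley recognizer is sketched below in Python. It is not the parallel array algorithm of the report and does not perform parse extraction; the function name, the grammar encoding, and the toy arithmetic grammar are assumptions made for the example. Like the report, it assumes a lambda-free grammar, which keeps the simple completer step correct.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if tokens can be derived from start.

    grammar maps each nonterminal to a list of right-hand sides (tuples of
    symbols); any symbol that is not a key of grammar is a terminal.
    The grammar must be lambda-free (no empty right-hand sides).
    """
    # An item is (lhs, rhs, dot position, origin chart index).
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:
                    # Predictor: expand the nonterminal after the dot.
                    for production in grammar[symbol]:
                        item = (symbol, production, 0, i)
                        if item not in chart[i]:
                            chart[i].add(item)
                            agenda.append(item)
                elif i < len(tokens) and tokens[i] == symbol:
                    # Scanner: consume a matching terminal.
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:
                # Completer: advance items that were waiting for lhs.
                for p_lhs, p_rhs, p_dot, p_origin in list(chart[origin]):
                    if p_dot < len(p_rhs) and p_rhs[p_dot] == lhs:
                        item = (p_lhs, p_rhs, p_dot + 1, p_origin)
                        if item not in chart[i]:
                            chart[i].add(item)
                            agenda.append(item)

    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])


# Toy left-recursive grammar for sums and products of the token "n".
GRAMMAR = {
    "S": [("S", "+", "M"), ("M",)],
    "M": [("M", "*", "T"), ("T",)],
    "T": [("n",)],
}

if __name__ == "__main__":
    print(earley_recognize(GRAMMAR, "S", "n + n * n".split()))
    print(earley_recognize(GRAMMAR, "S", "n + * n".split()))
```

    The chart has one item set per input position, so a string of length n yields n + 1 sets; the report's contribution is to distribute the construction of these sets over a triangular processor array.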

  3. Learning and comprehension of BASIC and natural language computer programming by novices

    SciTech Connect

    Dyck, J.L.

    1987-01-01

    This study examined the effectiveness of teaching novices to program in Natural Language as a prerequisite for learning BASIC, and the learning and comprehension processes for Natural Language and BASIC computer-programming languages. Three groups of computer-naive subjects participated in five self-paced learning sessions; in each session, subjects solved a series of programming problems with immediate feedback. Twenty-four subjects learned to solve BASIC programming problems (BASIC group) for all five sessions, 23 subjects learned to solve corresponding Natural Language programming problems for all five sessions (Natural Language group), and 23 subjects learned to solve Natural Language programming problems for three sessions and then transferred to BASIC for the final two sessions (Transfer group). At the end of the fifth session, all subjects completed a post-test which required the subjects to use their programming knowledge in a new way. Results indicated that the Natural Language trained subjects had complete transfer to BASIC, as indicated by no overall difference in comprehension time or accuracy for final BASIC sessions (i.e., sessions four and five) for the Transfer and BASIC groups. In addition, there was an interaction between group and session on accuracy, in which the Transfer group increased its accuracy at a faster rate than the BASIC group.

  4. Computational Nonlinear Morphology with Emphasis on Semitic Languages. Studies in Natural Language Processing.

    ERIC Educational Resources Information Center

    Kiraz, George Anton

    This book presents a tractable computational model that can cope with complex morphological operations, especially in Semitic languages, and less complex morphological systems present in Western languages. It outlines a new generalized regular rewrite rule system that uses multiple finite-state automata to cater to root-and-pattern morphology,…

  5. The Preservation and Use of Our Languages: Respecting the Natural Order of the Creator.

    ERIC Educational Resources Information Center

    Kirkness, Verna J.

    As a world community, Indigenous peoples are faced with many common challenges in their attempts to maintain the vitality of their respective languages and to honor the "natural order of the Creator." Ten strategies are discussed that are critical to the task of renewing and maintaining Indigenous languages. These strategies are: (1) banking…

  6. International Joint Conference on Natural Language Processing, pages 680-684, Nagoya, Japan, 14-18 October 2013.

    E-print Network

    the potential for natural language processing in the language of mental illness. Similarly, much research that there will be textual indications of these mental illnesses also. In Stirman and Pennebaker (2001), word use, research into the application of NLP to the detection of health illnesses has proved fruitful

  7. Paradigms of Evaluation in Natural Language Processing: Field Linguistics for Glass Box Testing

    ERIC Educational Resources Information Center

    Cohen, Kevin Bretonnel

    2010-01-01

    Although software testing has been well-studied in computer science, it has received little attention in natural language processing. Nonetheless, a fully developed methodology for glass box evaluation and testing of language processing applications already exists in the field methods of descriptive linguistics. This work lays out a number of…

  8. Using the Natural Language Paradigm (NLP) to Increase Vocalizations of Older Adults with Cognitive Impairments

    ERIC Educational Resources Information Center

    LeBlanc, Linda A.; Geiger, Kaneen B.; Sautter, Rachael A.; Sidener, Tina M.

    2007-01-01

    The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated…

  9. Zipf's word frequency law in natural language: a critical review and future directions

    E-print Network

    Makous, Walter

    Zipf's word frequency law in natural language: a critical review and future directions Steven T form known as Zipf's law. This paper first shows that human language has highly complex, reliable structure in the frequency distribution over and above this classic law, though prior data visualization
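
    Zipf's law states that the frequency of the r-th most frequent word falls off roughly as 1/r^α, with α close to 1. A minimal Python sketch of how a rank-frequency table and the exponent can be estimated from a token list follows. It is not the analysis used in the paper: the function names are made up for the example, and the ordinary least-squares fit in log-log space is a simple, commonly used estimator chosen here as an assumption.

```python
from collections import Counter
import math


def rank_frequency(tokens):
    """Return (rank, frequency) pairs, rank 1 being the most frequent word."""
    frequencies = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(frequencies, start=1))


def zipf_exponent(tokens):
    """Estimate alpha in frequency ~ rank ** (-alpha).

    Uses an ordinary least-squares line through (log rank, log frequency);
    needs at least two distinct words.
    """
    pairs = rank_frequency(tokens)
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(freq) for _, freq in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope


# Synthetic corpus whose word counts are exactly 60 / rank.
SYNTHETIC = (["the"] * 60 + ["of"] * 30 + ["and"] * 20
             + ["to"] * 15 + ["in"] * 12)

if __name__ == "__main__":
    print(rank_frequency(SYNTHETIC))
    print(zipf_exponent(SYNTHETIC))
```

    On real text the fitted exponent is only a summary; structure in the distribution beyond the plain power law does not show up in a single slope.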

  10. Large Scale Information Processing System. Volume I. Compiler, Natural Language, and Information Processing.

    ERIC Educational Resources Information Center

    Peterson, Philip L.; And Others

    This volume, the first of three dealing with a number of investigations and studies into the formal structure, advanced technology and application of large scale information processing systems, is concerned with the areas of compiler languages, natural languages and information storage and retrieval. The first report is entitled "Semantics and…

  11. MOUNTAIN: A Translation-based Approach to Natural Language Generation for Dialog

    E-print Network

    Black, Alan W

    MOUNTAIN: A Translation-based Approach to Natural Language Generation for Dialog Systems Brian, USA {blangner,awb}@cs.cmu.edu Abstract. This paper describes the Mountain language generation system a corpus of in-domain human responses, and show typical output of the Mountain system. The results of our

  12. Steps Towards Scenario-Based Programming with a Natural Language Interface

    E-print Network

    Harel, David

    , whenever the TV is turned on and the light outside is too bright. However, if the kids are home, they do phone applications, languages and toolkits that let children program games, robots and more [27, are not programming. Such natural language interactions are often called command and control [38,12], and can

  13. The Nature of Chinese Language Classroom Learning Environments in Singapore Secondary Schools

    ERIC Educational Resources Information Center

    Chua, Siew Lian; Wong, Angela F. L.; Chen, Der-Thanq V.

    2011-01-01

    This article reports findings from a classroom environment study which was designed to investigate the nature of Chinese Language classroom environments in Singapore secondary schools. We used a perceptual instrument, the Chinese Language Classroom Environment Inventory, to investigate teachers' and students' perceptions towards their Chinese…

  14. Development and Evaluation of a Thai Learning System on the Web Using Natural Language Processing.

    ERIC Educational Resources Information Center

    Dansuwan, Suyada; Nishina, Kikuko; Akahori, Kanji; Shimizu, Yasutaka

    2001-01-01

    Describes the Thai Learning System, which is designed to help learners acquire the Thai word order system. The system facilitates the lessons on the Web using HyperText Markup Language and Perl programming, which interfaces with natural language processing by means of Prolog. (Author/VWL)

  15. Description Logics for Natural Language Processing D. Fehrer, U. Hustadt, M. Jaeger, A. Nonnengart,

    E-print Network

    Schmidt, Renate A.

    , called motel. In the late eighties inference in kl-one was shown to be undecidable. Since-known description language ALC. Our system motel serves on one hand as a knowledge base for the natural language known as terminological logics or kl-one-based knowledge representation formalisms) have restricted ex

  16. Using Natural Language Processing, Locus Link, and the Gene Ontology to Compare OMIM to MEDLINE

    E-print Network

    Using Natural Language Processing, Locus Link, and the Gene Ontology to Compare OMIM to MEDLINE, such as the Unified Medical Language System, Gene Ontology, LocusLink, and the Online Inheritance In Man (OMIM applicability to biomedical text and that takes advantage of online resources such as LocusLink and the Gene

  17. Automation of a problem list using natural language processing

    PubMed Central

    Meystre, Stephane; Haug, Peter J

    2005-01-01

    Background: The medical problem list is an important part of the electronic medical record in development in our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To alleviate this issue, we are building an environment where the problem list can be easily and effectively maintained. Methods: For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system composed of two main components: a background and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries from the list of 80 targeted problems detected in the multiple free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list. Results: The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms to detect first document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences. Conclusion: The global aim of our project is to automate the process of creating and maintaining a problem list for hospitalized patients and thereby help to guarantee the timeliness, accuracy and completeness of this information. PMID:16135244
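
    As a rough illustration of the two evaluation metrics reported in the abstract above, the sketch below computes sensitivity and positive predictive value from counts of true positives, false negatives, and false positives. The function names and the counts in the usage lines are hypothetical and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real items that the detector found: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of detected items that are real: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for a sentence detector (illustrative only,
# not data from the study).
tp, fn, fp = 90, 10, 5
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"PPV = {positive_predictive_value(tp, fp):.2f}")
```

    Both functions assume a non-zero denominator; a production evaluation script would need to handle the case where nothing was detected or nothing was present.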

  18. A taxonomy of situated language in natural contexts

    E-print Network

    Shaw, George Macaulay

    2011-01-01

    This thesis develops a multi-modal dataset consisting of transcribed speech along with the locations in which that speech took place. Speech with location attached is called situated language, and is represented here as ...

  19. A natural language interface plug-in for cooperative query answering in biological databases

    PubMed Central

    2012-01-01

    Background: One of the many unique features of biological databases is that the mere existence of a ground data item is not always a precondition for a query response. It may be argued that from a biologist's standpoint, queries are not always best posed using a structured language. By this we mean that approximate and flexible responses to natural-language-like queries are well suited for this domain. This is partly due to biologists' tendency to seek simpler interfaces and partly due to the fact that questions in biology involve high level concepts that are open to interpretations computed using sophisticated tools. In such highly interpretive environments, rigidly structured databases do not always perform well. In this paper, our goal is to propose a semantic correspondence plug-in to aid natural language query processing over arbitrary biological database schema with an aim to providing cooperative responses to queries tailored to users' interpretations. Results: Natural language interfaces for databases are generally effective when they are tuned to the underlying database schema and its semantics. Therefore, changes in database schema become impossible to support, or a substantial reorganization cost must be absorbed to reflect any change. We leverage developments in natural language parsing, rule languages and ontologies, and data integration technologies to assemble a prototype query processor that is able to transform a natural language query into a semantically equivalent structured query over the database. We allow knowledge rules and their frequent modifications as part of the underlying database schema. The approach we adopt in our plug-in overcomes some of the serious limitations of many contemporary natural language interfaces, including support for schema modifications and independence from underlying database schema. Conclusions: The plug-in introduced in this paper is generic and facilitates connecting user selected natural language interfaces to arbitrary databases using a semantic description of the intended application. We demonstrate the feasibility of our approach with a practical example. PMID:22759613

  20. Of Substance: The Nature of Language Effects on Entity Construal

    PubMed Central

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-01-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind (whisk; Imai and Gentner, 1997). Five experiments replicated this language type effect on entity construal, extended it to quite different stimuli from those studied before, and extended it to a comparison between Mandarin-speakers and English-speakers. A sixth experiment, which did not involve interpreting the meaning of a noun or a pronoun that stands for a noun, failed to find any effect of language type on entity construal. Thus, the overall pattern of findings supports a non-Whorfian, language on language account, according to which sensitivity to lexical statistics in a count/mass language leads adults to assign a novel noun in neutral syntax the status of a count noun, influencing construal of ambiguous entities. The experiments also document and explore cross-linguistically universal factors that influence entity construal, and favor Prasada's (1999) hypothesis that features indicating non-accidentalness of an entity's form lead participants to a construal of object-kind rather than substance-kind. Finally, the experiments document the age at which the language type effect emerges in lexical projection. The details of the developmental pattern are consistent with the lexical statistics hypothesis, along with a universal increase in sensitivity to material kind. PMID:19230873

  1. Resolution of linear entity and path geometries expressed via partially-geospatial natural language

    E-print Network

    Marrero, John Javier

    2010-01-01

    When conveying geospatial information via natural language, people typically combine implicit, commonsense knowledge with explicitly-stated information. Usually, much of this is contextual and relies on establishing locations ...

  2. Biomimetic design through natural language analysis to facilitate cross-domain information retrieval

    E-print Network

    Shu, Lily H.

    Biomimetic design through natural language analysis to facilitate cross-domain information ... Toronto, Ontario, Canada (Received October 11, 2005; Accepted May 17, 2006). Abstract: Biomimetic ... Several instances of biomimetic design result from personal observations of biological phenomena. How ...

  3. A Principled Framework for Constructing Natural Language Interfaces to Temporal Databases 

    E-print Network

    Androutsopoulos, Ioannis

    Most existing natural language interfaces to databases (NLIDBs) were designed to be used with snapshot database systems, which provide very limited facilities for manipulating time-dependent data. Consequently, most NLIDBs ...

  4. Approach to the organization of knowledge and its use in natural language recall tasks

    SciTech Connect

    Mccalla, G.I.

    1983-01-01

    The viewpoint espoused in this paper is that natural language understanding and production is the action of a number of highly integrated domain-specific specialists. Described first is an object-oriented representation scheme which allows these specialists to be built. Discussed next is the organization of these specialists into a four-level goal hierarchy that enables the modelling of natural language conversation. It is shown how the representation and natural language structures can be used to facilitate the recall of earlier natural language conversations. Six specific kinds of recall tasks are outlined in terms of these structures, and their occurrence in several legal dialogues is examined. Finally, the need for intelligent garbage collection of old episodic information is pointed out. 38 references.

  5. Efficient Lagrangian relaxation algorithms for exact inference in natural language tasks

    E-print Network

    Rush, Alexander M. (Alexander Matthew)

    2011-01-01

    For many tasks in natural language processing, finding the best solution requires a search over a large set of possible structures. Solving these combinatorial search problems exactly can be inefficient, and so researchers ...

  6. Speech and Natural Language Where are we now, and where are we heading?

    E-print Network

    Cortes, Corinna

    Ciprian Chelba (ciprianchelba@google.com), 04/16/2013, Quo Vadis Speech and Natural Language ... Case Study: Google Search by Voice. What contributed to success: clearly set user expectation by existing text app ...

  7. Bayesian Inference with Tears a tutorial workbook for natural language researchers

    E-print Network

    Knight, Kevin

    Bayesian Inference with Tears: a tutorial workbook for natural language researchers. Kevin Knight ... parameters to twist, and search involves a lot of black art. "How much burn-in did you use?" Good question ...

  8. Web Text Corpus for Natural Language Processing Vinci Liu and James R. Curran

    E-print Network

    Curran, James R.

    Web Text Corpus for Natural Language Processing. Vinci Liu and James R. Curran, School of Information Technologies, University of Sydney, NSW 2006, Australia, {vinci,james}@it.usyd.edu.au. Abstract: Web text has been ...

  9. IR-NLI: an expert natural language interface to online data bases

    SciTech Connect

    Guida, G.; Tasso, C.

    1983-01-01

    Constructing natural language interfaces to computer systems often requires achievement of advanced reasoning and expert capabilities in addition to basic natural language understanding. In this paper the above issues are faced in the context of an actual application concerning the design of a natural language interface for access to online information retrieval systems. After a short discussion of the peculiarities of this application, which requires both natural language understanding and reasoning capabilities, the general architecture and fundamental design criteria of IR-NLI, a system presently being developed at the University of Udine, are presented. Attention is then focused on the basic functions of IR-NLI, namely, understanding and dialogue, strategy generation, and reasoning. Knowledge representation methods and algorithms adopted are also illustrated. A short example of interaction with IR-NLI is presented. Perspectives and directions for future research are also discussed. 15 references.

  10. Improving Flexibility in Natural Language Interfaces by Reducing Vagueness and Ambiguity

    E-print Network

    ... Professor of Media Art & Sciences, MIT Media Lab. Reader: Christopher W. Geib, Research Fellow, University ... -controlled personal assistant, Siri, allow users to communicate with computers using natural language. A user ...

  11. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 600-608, Jeju Island, Korea, 12-14 July 2012. © 2012 Association for Computational Linguistics

    E-print Network

    ... of language ability (spoken and written), it is imperative that we develop methods for quantifying ... Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 600-608, Jeju Island, Korea, 12-14 July 2012. © 2012 Association ...

  12. Semantic Grammar: An Engineering Technique for Constructing Natural Language Understanding Systems.

    ERIC Educational Resources Information Center

    Burton, Richard R.

    In an attempt to overcome the lack of natural means of communication between student and computer, this thesis addresses the problem of developing a system which can understand natural language within an educational problem-solving environment. The nature of the environment imposes efficiency, habitability, self-teachability, and awareness of…

  13. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 843-853, Jeju Island, Korea, 12-14 July 2012. © 2012 Association for Computational Linguistics

    E-print Network

    ... for Computational Linguistics. Inducing a Discriminative Parser to Optimize Machine Translation Reordering. Graham ... -5 Hikari-dai, Seika-cho, Soraku-gun, Kyoto, Japan. Abstract: This paper proposes a method for learning a discriminative parser for machine translation reordering using only aligned parallel text. This is done ...

  14. Comparing the Effects of Structural and Natural Language Use during Direct Instruction with Children with Mental Retardation.

    ERIC Educational Resources Information Center

    Kircaali-Iftar, Gonul; Birkan, Bunyamin; Uysal, Ayten

    1998-01-01

    Effects of structural and natural language use during direct instruction in teaching color and shape concepts to eight Turkish elementary children with moderate mental retardation were compared using an adapted alternating treatments design. Results indicated that natural language use was as effective or more effective than structural language

  15. Language Revitalization.

    ERIC Educational Resources Information Center

    Hinton, Leanne

    2003-01-01

    Surveys developments in language revitalization and language death. Focusing on indigenous languages, discusses the role and nature of appropriate linguistic documentation, possibilities for bilingual education, and methods of promoting oral fluency and intergenerational transmission in affected languages. (Author/VWL)

  16. The language faculty that wasn't: a usage-based account of natural language recursion.

    PubMed

    Christiansen, Morten H; Chater, Nick

    2015-01-01

    In the generative tradition, the language faculty has been shrinking-perhaps to include only the mechanism of recursion. This paper argues that even this view of the language faculty is too expansive. We first argue that a language faculty is difficult to reconcile with evolutionary considerations. We then focus on recursion as a detailed case study, arguing that our ability to process recursive structure does not rely on recursion as a property of the grammar, but instead emerges gradually by piggybacking on domain-general sequence learning abilities. Evidence from genetics, comparative work on non-human primates, and cognitive neuroscience suggests that humans have evolved complex sequence learning skills, which were subsequently pressed into service to accommodate language. Constraints on sequence learning therefore have played an important role in shaping the cultural evolution of linguistic structure, including our limited abilities for processing recursive structure. Finally, we re-evaluate some of the key considerations that have often been taken to require the postulation of a language faculty. PMID:26379567

  17. The language faculty that wasn't: a usage-based account of natural language recursion

    PubMed Central

    Christiansen, Morten H.; Chater, Nick

    2015-01-01

    In the generative tradition, the language faculty has been shrinking—perhaps to include only the mechanism of recursion. This paper argues that even this view of the language faculty is too expansive. We first argue that a language faculty is difficult to reconcile with evolutionary considerations. We then focus on recursion as a detailed case study, arguing that our ability to process recursive structure does not rely on recursion as a property of the grammar, but instead emerges gradually by piggybacking on domain-general sequence learning abilities. Evidence from genetics, comparative work on non-human primates, and cognitive neuroscience suggests that humans have evolved complex sequence learning skills, which were subsequently pressed into service to accommodate language. Constraints on sequence learning therefore have played an important role in shaping the cultural evolution of linguistic structure, including our limited abilities for processing recursive structure. Finally, we re-evaluate some of the key considerations that have often been taken to require the postulation of a language faculty. PMID:26379567

  18. Proceedings of the Workshop on Language Processing and Crisis Information 2013, pages 10-18, Nagoya, Japan, 14 October 2013. © 2013 Asian Federation of Natural Language Processing

    E-print Network

    ... .y@lab.ntt.co.jp, iwatuki@riec.tohoku.ac.jp. Abstract: In order to achieve high-level resilience against disasters, ... effective language processing applications. 1 Introduction: In order to confront natural disasters which could become ... Japan, 14 October 2013. © 2013 Asian Federation of Natural Language Processing ... Computer ...

  19. The Trend towards Statistical Models in Natural Language Processing

    E-print Network

    Pennsylvania, University of

    ... lexicography and studies of language change, to methods for automated indexing and information retrieval ... of sense and reference), similarity measures among chunks of text (information retrieval, message routing ... of Computer and Information Science, University of Pennsylvania. 1 A Flowering of Corpus-Based Research. Over ...

  20. Inferring Speaker Affect in Spoken Natural Language Communication

    ERIC Educational Resources Information Center

    Pon-Barry, Heather Roberta

    2013-01-01

    The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…

  1. CS/Informatics Colloquium, 2006-03-03 Natural language,

    E-print Network

    Gasser, Michael

    ... Knowledge and cognitive representation; tacit knowledge, knowing how (Polanyi, Anderson) ... Knowledge, and the democratization of knowledge. Mike Gasser. Collaborators: Steve Hockema, Matt Kane, Amr Sabry, Ahmed Hamed ... research ... Inter-relationships among knowledge, language, (power), and informatics ... What this talk ...

  2. Getting Answers to Natural Language Questions Dragomir R. Radev

    E-print Network

    Radev, Dragomir R.

    ... the presence of a proper noun, and whether the question is time dependent. An additional analysis tested ... in the United States?," "What percentage of the world's plant and animal species can be found in the Amazon ... different search engines have different syntaxes. Some support advanced query languages, others do not. Stop ...

  3. Evolutionary Developmental Linguistics: Naturalization of the Faculty of Language

    ERIC Educational Resources Information Center

    Locke, John L.

    2009-01-01

    Since language is a biological trait, it is necessary to investigate its evolution, development, and functions, along with the mechanisms that have been set aside, and are now recruited, for its acquisition and use. It is argued here that progress toward each of these goals can be facilitated by new programs of research, carried out within a new…

  4. Annotation of Tutorial Dialogue Goals for Natural Language Generation

    E-print Network

    Luther, Ken

    ... tutors exhibit a rich variety of strategies, tactics, and language. Machine tutors typically do not. We ... and what kinds of behavior we wanted from it. The domain of CIRCSIM-Tutor is the baroreceptor reflex ... -based intelligent tutoring system, we have observed many hours of human tutoring of baroreceptor reflex problems ...

  5. Integrating Corpus-Based Resources and Natural Language Processing.

    ERIC Educational Resources Information Center

    Cantos, Pascual

    2002-01-01

    Surveys computational linguistic tools that are presently available but whose potential has been neither fully considered nor fully exploited in modern computer-assisted language learning (CALL). Discusses the rationale of data-driven learning (DDL) to engage learning, presenting typical DDL activities, DDL software, and potential extensions of…

  6. Natural language modeling for phoneme-to-text transcription

    SciTech Connect

    Derouault, A.M.; Merialdo, B.

    1986-11-01

    This paper relates different kinds of language modeling methods that can be applied to the linguistic decoding part of a speech recognition system with a very large vocabulary. These models are studied experimentally on a pseudophonetic input arising from French stenotypy. The authors propose a model which combines the advantages of a statistical modeling with information theoretic tools, and those of a grammatical approach.

  7. Proceedings of the Fourteenth Conference on Computational Natural Language Learning, pages 18-27, Uppsala, Sweden, 15-16 July 2010. © 2010 Association for Computational Linguistics

    E-print Network

    ... is a major bottleneck in scaling semantic parsers. This paper presents a new learning paradigm aimed ... is a major challenge in scaling semantic parsers. This paper proposes a new model and learning paradigm ...

  8. Scaling Semantic Parsers with On-the-fly Ontology Matching Tom Kwiatkowski Eunsol Choi Yoav Artzi Luke Zettlemoyer

    E-print Network

    Zettlemoyer, Luke

    Scaling Semantic Parsers with On-the-fly Ontology Matching. Tom Kwiatkowski, Eunsol Choi, Yoav Artzi ... ,eunsol,yoav,lsz}@cs.washington.edu. Abstract: We consider the challenge of learning semantic parsers that scale to large, open-domain problems ... on a recent Freebase QA corpus. 1 Introduction: Semantic parsers map sentences to formal representations ...

  9. Outline Introduction Parsing natural mathematical language FMathL and GF Conclusion Formal Mathematical Language

    E-print Network

    Neumaier, Arnold

    Peter Schodl: working primarily on the Semantic Matrix; some work on a grammar for German ... Mathematics language: very restricted domain; small set of frequently repeated phrases; usually exact meaning ... test case. This was the raw material for a lexicon of about 1500 German basic words, a simple morphological grammar (to ...

  10. The Unification Space implemented as a localist neural net: predictions and error-tolerance in a constraint-based parser.

    PubMed

    Vosse, Theo; Kempen, Gerard

    2009-12-01

    We introduce a novel computer implementation of the Unification-Space parser (Vosse and Kempen in Cognition 75:105-143, 2000) in the form of a localist neural network whose dynamics is based on interactive activation and inhibition. The wiring of the network is determined by Performance Grammar (Kempen and Harbusch in Verb constructions in German and Dutch. Benjamins, Amsterdam, 2003), a lexicalist formalism with feature unification as binding operation. While the network is processing input word strings incrementally, the evolving shape of parse trees is represented in the form of changing patterns of activation in nodes that code for syntactic properties of words and phrases, and for the grammatical functions they fulfill. The system is capable, at least qualitatively and rudimentarily, of simulating several important dynamic aspects of human syntactic parsing, including garden-path phenomena and reanalysis, effects of complexity (various types of clause embeddings), fault-tolerance in case of unification failures and unknown words, and predictive parsing (expectation-based analysis, surprisal effects). English is the target language of the parser described. PMID:19784798

  11. Concreteness and Psychological Distance in Natural Language Use.

    PubMed

    Snefjella, Bryor; Kuperman, Victor

    2015-09-01

    Existing evidence shows that more abstract mental representations are formed and more abstract language is used to characterize phenomena that are more distant from the self. Yet the precise form of the functional relationship between distance and linguistic abstractness is unknown. In four studies, we tested whether more abstract language is used in textual references to more geographically distant cities (Study 1), time points further into the past or future (Study 2), references to more socially distant people (Study 3), and references to a specific topic (Study 4). Using millions of linguistic productions from thousands of social-media users, we determined that linguistic concreteness is a curvilinear function of the logarithm of distance, and we discuss psychological underpinnings of the mathematical properties of this relationship. We also demonstrated that gradient curvilinear effects of geographic and temporal distance on concreteness are nearly identical, which suggests uniformity in representation of abstractness along multiple dimensions. PMID:26239108

  12. Parse trees: from formal to natural languages Matilde Marcolli

    E-print Network

    Marcolli, Matilde

    ... and Computational Linguistics, Winter 2015. CS101 Win2015: Linguistics Parse trees. Context-free grammars $G = (V_N, \ldots$ ... by a grammar $G$: $L_G = \{ w \in V_T^{*} \mid S \to^{*}_{P} w \}$, language with alphabet $V_T$ ... ) then $A \to w_1 \cdots w_n \in P$ ... Example. Grammar: $G = \{\{S, A\}, \{a, b\}, P$ ...

  13. Book Reviews Speech and Language Processing: An Introduction to Natural

    E-print Network

    ... review will focus primarily on the format and content of Speech and Language Processing (SLP) as well as its ... extraction and ... 1 See, for example, the book's page at amazon.com or the book's home page at www.cs.colorado.edu/~martin/slp ... to name a few--are quoted as well. 3. Content: SLP delves into topics as diverse as articulatory phonetics ...

  14. Using Edit Distance to Analyse Errors in a Natural Language to Logic Translation Corpus

    ERIC Educational Resources Information Center

    Barker-Plummer, Dave; Dale, Robert; Cox, Richard; Romanczuk, Alex

    2012-01-01

    We have assembled a large corpus of student submissions to an automatic grading system, where the subject matter involves the translation of natural language sentences into propositional logic. Of the 2.3 million translation instances in the corpus, 286,000 (approximately 12%) are categorized as being in error. We want to understand the nature of…
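
    The abstract does not say which edit-distance variant the authors use. As a minimal sketch, assuming plain Levenshtein distance over token sequences, the following shows how a submitted formula could be compared with a reference translation. The tokenization and the two formulas are hypothetical examples, not data from the corpus.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences.

    Insertions, deletions, and substitutions each cost 1.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting the first i items of a gives the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical reference and student answers, already split into tokens.
reference = ["P", "->", "Q"]
submitted = ["Q", "->", "P"]
print(edit_distance(reference, submitted))
```

    Working on tokens rather than characters keeps multi-character connectives such as "->" as single units; that choice is an assumption of this sketch.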

  15. Fertility Models for Statistical Natural Language Understanding Stephen Della Pietra , Mark Epstein, Salim Roukos, Todd Ward

    E-print Network

    Fertility Models for Statistical Natural Language Understanding. Stephen Della Pietra, Mark ... Translation Group's concept of fertility (Brown et al., 1993) to the generation of clumps for natural ... lish as many disjoint clump of words. We present two fertility models which attempt to capture ...

  16. A Visually Grounded Natural Language Interface for Reference to Spatial Scenes

    E-print Network

    Roy, Deb

    ... under a cube" in the 3-D design case, or in a car navigation domain, "you are looking for the first ... A Visually Grounded Natural Language Interface for Reference to Spatial Scenes. Peter Gorniak, MIT ... in the shared space. We present an analysis of how people describe objects in spatial scenes using natural ...

  17. Rimac: A Natural-Language Dialogue System that Engages Students in Deep Reasoning Dialogues about Physics

    ERIC Educational Resources Information Center

    Katz, Sandra; Jordan, Pamela; Litman, Diane

    2011-01-01

    The natural-language tutorial dialogue system that the authors are developing will allow them to focus on the nature of interactivity during tutoring as a malleable factor. Specifically, it will serve as a research platform for studies that manipulate the frequency and types of verbal alignment processes that take place during tutoring, such as…

  18. For the People...Citizenship Education and Naturalization Information. An English as a Second Language Text.

    ERIC Educational Resources Information Center

    Short, Deborah J.; And Others

    A textbook for English-as-a-Second-Language (ESL) students presents lessons on U.S. citizenship education and naturalization information. The nine lessons cover the following topics: the U.S. system of government; the Bill of Rights; responsibilities and rights of citizens; voting; requirements for naturalization; the application process; the…

  19. Visual language recognition with a feed-forward network of spiking neurons

    SciTech Connect

    Rasmussen, Craig E; Garrett, Kenyan; Sottile, Matthew; Shreyas, Ns

    2010-01-01

    An analogy is made and exploited between the recognition of visual objects and language parsing. A subset of regular languages is used to define a one-dimensional 'visual' language, in which the words are translational and scale invariant. This allows an exploration of the viewpoint invariant languages that can be solved by a network of concurrent, hierarchically connected processors. A language family is defined that is hierarchically tiling system recognizable (HREC). As inspired by nature, an algorithm is presented that constructs a cellular automaton that recognizes strings from a language in the HREC family. It is demonstrated how a language recognizer can be implemented from the cellular automaton using a feed-forward network of spiking neurons. This parser recognizes fixed-length strings from the language in parallel and as the computation is pipelined, a new string can be parsed in each new interval of time. The analogy with formal language theory allows inferences to be drawn regarding what class of objects can be recognized by visual cortex operating in purely feed-forward fashion and what class of objects requires a more complicated network architecture.

  20. Deciphering the language of nature: cryptography, secrecy, and alterity in Francis Bacon.

    PubMed

    Clody, Michael C

    2011-01-01

    The essay argues that Francis Bacon's considerations of parables and cryptography reflect larger interpretative concerns of his natural philosophic project. Bacon describes nature as having a language distinct from those of God and man, and, in so doing, establishes a central problem of his natural philosophy: namely, how can the language of nature be accessed through scientific representation? Ultimately, Bacon's solution relies on a theory of differential and duplicitous signs that conceal within them the hidden voice of nature, which is best recognized in the natural forms of efficient causality. The "alphabet of nature" (those tables of natural occurrences) consequently plays a central role in his program, as it renders nature's language susceptible to a process of decryption that mirrors the model of the biliteral cipher. It is argued that while the writing of Bacon's natural philosophy strives for literality, its investigative process preserves a space for alterity within scientific representation that is made accessible to those with the interpretative key. PMID:22371983

  1. The parser doesn't ignore intransitivity, after all

    PubMed Central

    Staub, Adrian

    2015-01-01

    Several previous studies (Adams, Clifton, & Mitchell, 1998; Mitchell, 1987; van Gompel & Pickering, 2001) have explored the question of whether the parser initially analyzes a noun phrase that follows an intransitive verb as the verb's direct object. Three eyetracking experiments examined this issue in more detail. Experiment 1 strongly replicated the finding (van Gompel & Pickering, 2001) that readers experience difficulty on this noun phrase in normal reading, and found that this difficulty occurs even with a class of intransitive verbs for which a direct object is categorically prohibited. Experiment 2, however, demonstrated that this effect is not due to syntactic misanalysis, but is instead due to disruption that occurs when a comma is absent at a subordinate clause/main clause boundary. Exploring a different construction, Experiment 3 replicated the finding (Pickering & Traxler, 2003; Traxler & Pickering, 1996) that when a noun phrase “filler” is an implausible direct object for an optionally transitive relative clause verb, processing difficulty results; however, there was no evidence for such difficulty when the relative clause verb was strictly intransitive. Taken together, the three experiments undermine the support for the claim that the parser initially ignores a verb's subcategorization restrictions. PMID:17470005

  2. Ontology-Based Controlled Natural Language Editor Using CFG with Lexical Dependency

    NASA Astrophysics Data System (ADS)

    Namgoong, Hyun; Kim, Hong-Gee

    In recent years, CNL (Controlled Natural Language) has received much attention with regard to ontology-based knowledge acquisition systems. CNLs, as subsets of natural languages, can be useful for both humans and computers by eliminating ambiguity of natural languages. Our previous work, OntoPath [10], proposed to edit natural language-like narratives that are structured in RDF (Resource Description Framework) triples, using a domain-specific ontology as their language constituents. However, our previous work and other systems employing CFG for grammar definition have difficulties in enlarging the expression capacity. A newly developed editor, which we propose in this paper, permits grammar definitions through CFG-LD (Context-Free Grammar with Lexical Dependency) that includes sequential and semantic structures of the grammars. With CFG describing the sequential structure of grammar, lexical dependencies between sentence elements can be designated in the definition system. Through the defined grammars, the implemented editor guides users' narratives in more familiar expressions with a domain-specific ontology and translates the content into RDF triples.

  3. Selecting the Best Mobile Information Service with Natural Language User Input

    NASA Astrophysics Data System (ADS)

    Feng, Qiangze; Qi, Hongwei; Fukushima, Toshikazu

    Information services accessed via mobile phones provide information directly relevant to subscribers’ daily lives and are an area of dynamic market growth worldwide. Although many information services are currently offered by mobile operators, many of the existing solutions require a unique gateway for each service, and it is inconvenient for users to have to remember a large number of such gateways. Furthermore, the Short Message Service (SMS) is very popular in China and Chinese users would prefer to access these services in natural language via SMS. This chapter describes a Natural Language Based Service Selection System (NL3S) for use with a large number of mobile information services. The system can accept user queries in natural language and navigate them to the required service. Since it is difficult for existing methods to achieve high accuracy and high coverage and anticipate which other services a user might want to query, the NL3S is developed based on a Multi-service Ontology (MO) and Multi-service Query Language (MQL). The MO and MQL provide semantic and linguistic knowledge, respectively, to facilitate service selection for a user query and to provide adaptive service recommendations. Experiments show that the NL3S can achieve 75-95% accuracy and 85-95% satisfaction when processing various styles of natural language queries. A trial involving navigation of 30 different mobile services shows that the NL3S can provide a viable commercial solution for mobile operators.

  4. Book Reviews A Connectionist Language Generator

    E-print Network

    language generation system FIG. FIG uses a structured connectionist network to generate either English or Japanese output. The input to FIG comes either from a simple parser of Japanese, or test inputs created generator. 2. A Description of FIG Except for some minor parts, FIG is implemented as a structured

  5. SWAN: An expert system with natural language interface for tactical air capability assessment

    NASA Technical Reports Server (NTRS)

    Simmons, Robert M.

    1987-01-01

    SWAN is an expert system and natural language interface for assessing the war fighting capability of Air Force units in Europe. The expert system is an object oriented knowledge based simulation with an alternate worlds facility for performing what-if excursions. Responses from the system take the form of generated text, tables, or graphs. The natural language interface is an expert system in its own right, with a knowledge base and rules which understand how to access external databases, models, or expert systems. The distinguishing feature of the Air Force expert system is its use of meta-knowledge to generate explanations in the frame and procedure based environment.

  6. QATT: a Natural Language Interface for QPE. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    White, Douglas Robert-Graham

    1989-01-01

    QATT, a natural language interface developed for the Qualitative Process Engine (QPE) system is presented. The major goal was to evaluate the use of a preexisting natural language understanding system designed to be tailored for query processing in multiple domains of application. The other goal of QATT is to provide a comfortable environment in which to query envisionments in order to gain insight into the qualitative behavior of physical systems. It is shown that the use of the preexisting system made possible the development of a reasonably useful interface in a few months.

  7. Natural language processing with dynamic classification improves P300 speller accuracy and bit rate

    NASA Astrophysics Data System (ADS)

    Speier, William; Arnold, Corey; Lu, Jessica; Taira, Ricky K.; Pouratian, Nader

    2012-02-01

    The P300 speller is an example of a brain-computer interface that can restore functionality to victims of neuromuscular disorders. Although the most common application of this system has been communicating language, the properties and constraints of the linguistic domain have not to date been exploited when decoding brain signals that pertain to language. We hypothesized that combining the standard stepwise linear discriminant analysis with a Naive Bayes classifier and a trigram language model would increase the speed and accuracy of typing with the P300 speller. With integration of natural language processing, we observed significant improvements in accuracy and 40-60% increases in bit rate for all six subjects in a pilot study. This study suggests that integrating information about the linguistic domain can significantly improve signal classification.
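
    The combination described in this abstract, per-character classifier evidence weighted by a trigram language model prior, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the add-alpha smoothing, the function names `train_trigram` and `posterior`, and the likelihood values are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math

def train_trigram(corpus, alphabet, alpha=1.0):
    """Build P(char | two previous chars) with add-alpha smoothing."""
    counts = defaultdict(Counter)
    for word in corpus:
        padded = "  " + word  # two spaces act as the start-of-word context
        for i in range(2, len(padded)):
            counts[padded[i - 2:i]][padded[i]] += 1

    def prob(context, char):
        c = counts[context]
        return (c[char] + alpha) / (sum(c.values()) + alpha * len(alphabet))

    return prob

def posterior(log_likelihood, context, prob, alphabet):
    """Bayes update: posterior is proportional to likelihood times trigram prior."""
    log_post = {ch: log_likelihood[ch] + math.log(prob(context, ch))
                for ch in alphabet}
    norm = math.log(sum(math.exp(v) for v in log_post.values()))
    return {ch: math.exp(v - norm) for ch, v in log_post.items()}

alphabet = "abcdefghijklmnopqrstuvwxyz"
prob = train_trigram(["the", "then", "them", "that"], alphabet)

# Hypothetical classifier evidence: the signal cannot tell "e" from "c".
log_lik = {ch: math.log(0.4 / 24) for ch in alphabet}
log_lik["e"] = log_lik["c"] = math.log(0.3)

post = posterior(log_lik, "th", prob, alphabet)
best = max(post, key=post.get)
```

    In this toy case the signal evidence for "e" and "c" is identical, so the choice is made by the language model prior for the context "th".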

  8. Natural Language Processing with Dynamic Classification Improves P300 Speller Accuracy and Bit Rate

    PubMed Central

    Speier, William; Arnold, Corey; Lu, Jessica; Taira, Ricky K.; Pouratian, Nader

    2012-01-01

    The P300 speller is an example of a brain-computer interface that can restore functionality to victims of neuromuscular disorders. Although the most common application of this system has been communicating language, the properties and constraints of the linguistic domain have not to date been exploited when decoding brain signals that pertain to language. We hypothesized that combining the standard stepwise linear discriminant analysis with a Naive Bayes classifier and a trigram language model would increase the speed and accuracy of typing with the P300 speller. With integration of natural language processing, we observed significant improvements in accuracy and 40%–60% increases in bit rate for all six subjects in a pilot study. This study suggests that integrating information about the linguistic domain can significantly improve signal classification. PMID:22156110

  9. The Exploring Nature of Definitions and Classifications of Language Learning Strategies (LLSs) in the Current Studies of Second/Foreign Language Learning

    ERIC Educational Resources Information Center

    Fazeli, Seyed Hossein

    2011-01-01

    This study aims to explore the nature of definitions and classifications of Language Learning Strategies (LLSs) in the current studies of second/foreign language learning in order to show the current problems regarding such definitions and classifications. The present study shows that there is not a universal agreeable definition and…

  10. Processing of ICARTT Data Files Using Fuzzy Matching and Parser Combinators

    NASA Technical Reports Server (NTRS)

    Rutherford, Matthew T.; Typanski, Nathan D.; Wang, Dali; Chen, Gao

    2014-01-01

    In this paper, the task of parsing and matching inconsistent, poorly formed text data through the use of parser combinators and fuzzy matching is discussed. An object-oriented implementation of the parser combinator technique is used to allow for a relatively simple interface for adapting base parsers. For matching tasks, a fuzzy matching algorithm with Levenshtein distance calculations is implemented to match string pairs that are otherwise difficult to match due to the aforementioned irregularities and errors in one or both pair members. Used in concert, the two techniques allow parsing and matching operations to be performed which had previously only been done manually.
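
    The two techniques named in this abstract can be sketched together in a few lines. This is an illustration under stated assumptions, not the paper's code: the `Parser` class, the `regex` helper, the "name, unit" header layout, and the variable names in the example are all hypothetical.

```python
import re

def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def fuzzy_match(name, candidates, max_dist=2):
    """Return the closest candidate within max_dist edits, else None."""
    best = min(candidates, key=lambda c: levenshtein(name.lower(), c.lower()))
    return best if levenshtein(name.lower(), best.lower()) <= max_dist else None

class Parser:
    """Wraps a function (text, pos) -> (value, new_pos), or None on failure."""
    def __init__(self, fn):
        self.fn = fn

    def __call__(self, text, pos=0):
        return self.fn(text, pos)

    def then(self, other):
        """Sequence two parsers and return both values as a tuple."""
        def run(text, pos):
            first = self(text, pos)
            if first is None:
                return None
            second = other(text, first[1])
            if second is None:
                return None
            return (first[0], second[0]), second[1]
        return Parser(run)

    def map(self, f):
        """Transform the parsed value without consuming more input."""
        def run(text, pos):
            result = self(text, pos)
            return None if result is None else (f(result[0]), result[1])
        return Parser(run)

def regex(pattern):
    """Base parser that matches a regular expression at the current position."""
    compiled = re.compile(pattern)
    def run(text, pos):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return Parser(run)

field = regex(r"\s*[^,\s][^,]*").map(str.strip)
comma = regex(r"\s*,")
# Hypothetical header layout: "<variable name>, <unit>"
header = field.then(comma).then(field).map(lambda v: (v[0][0], v[1]))

parsed, _ = header("Temperatur , degC")
matched = fuzzy_match(parsed[0], ["Temperature", "Pressure", "Ozone"])
```

    The combinator parses the irregularly spaced header into a name and a unit, and the fuzzy matcher then maps the misspelled name onto the nearest known variable.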

  11. Stochastic Model for the Vocabulary Growth in Natural Languages

    NASA Astrophysics Data System (ADS)

    Gerlach, Martin; Altmann, Eduardo G.

    2013-04-01

    We propose a stochastic model for the number of different words in a given database which incorporates the dependence on the database size and historical changes. The main feature of our model is the existence of two different classes of words: (i) a finite number of core words, which have higher frequency and do not affect the probability of a new word to be used, and (ii) the remaining virtually infinite number of noncore words, which have lower frequency and, once used, reduce the probability of a new word to be used in the future. Our model relies on a careful analysis of the Google Ngram database of books published in the last centuries, and its main consequence is the generalization of Zipf’s and Heaps’ law to two-scaling regimes. We confirm that these generalizations yield the best simple description of the data among generic descriptive models and that the two free parameters depend only on the language but not on the database. From the point of view of our model, the main change on historical time scales is the composition of the specific words included in the finite list of core words, which we observe to decay exponentially in time with a rate of approximately 30 words per year for English.
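
    For orientation, the quantity being modeled (the number of distinct words as a function of database size) can be measured directly. The sketch below computes that growth curve and fits the classic single-regime Heaps' law exponent by least squares in log-log space. It does not implement the two-regime model proposed in the paper, and the function names and toy token list are assumptions.

```python
import math

def vocabulary_growth(tokens):
    """Return the number of distinct words seen after each token."""
    seen, curve = set(), []
    for tok in tokens:
        seen.add(tok)
        curve.append(len(seen))
    return curve

def heaps_exponent(curve):
    """Least-squares slope of log(vocabulary size) against log(text size)."""
    xs = [math.log(m) for m in range(1, len(curve) + 1)]
    ys = [math.log(n) for n in curve]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

tokens = "a b a c b d a e".split()
curve = vocabulary_growth(tokens)
exponent = heaps_exponent(curve)
```

    A text in which every token is new gives an exponent of 1, while repeated words pull the exponent below 1, which is the sublinear growth that Heaps' law describes.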

  12. A Qualitative Analysis Framework Using Natural Language Processing and Graph Theory

    ERIC Educational Resources Information Center

    Tierney, Patrick J.

    2012-01-01

    This paper introduces a method of extending natural language-based processing of qualitative data analysis with the use of a very quantitative tool--graph theory. It is not an attempt to convert qualitative research to a positivist approach with a mathematical black box, nor is it a "graphical solution". Rather, it is a method to help qualitative…

  13. The Application of Natural Language Processing to Augmentative and Alternative Communication

    ERIC Educational Resources Information Center

    Higginbotham, D. Jeffery; Lesher, Gregory W.; Moulton, Bryan J.; Roark, Brian

    2012-01-01

    Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of next…

  14. NLPIR: A Theoretical Framework for Applying Natural Language Processing to Information Retrieval.

    ERIC Educational Resources Information Center

    Zhou, Lina; Zhang, Dongsong

    2003-01-01

    Proposes a theoretical framework called NLPIR that integrates natural language processing (NLP) into information retrieval (IR) based on the assumption that there exists representation distance between queries and documents. Discusses problems in traditional keyword-based IR, including relevance, and describes some existing NLP techniques.…

  15. Design and Evaluation of a Natural Language Tutor for Force and Motion

    E-print Network

    Heckler, Andrew F.

    methodologies and to significant levels of success, have tackled physics topics such as forces, kinematicsDesign and Evaluation of a Natural Language Tutor for Force and Motion Ryan Badeau and Andrew F. Heckler Department of Physics, The Ohio State University, 191 W. Woodruff Avenue, Columbus, OH 43210

  16. Introduction to Special Issue: Understanding the Nature-Nurture Interactions in Language and Learning Differences.

    ERIC Educational Resources Information Center

    Berninger, Virginia Wise

    2001-01-01

    The introduction to this special issue on nature-nurture interactions notes that the following articles represent five biologically oriented research approaches which each provide a tutorial on the investigator's major research tool, a summary of current research understandings regarding language and learning differences, and a discussion of…

  17. Computer Applications in Professional Writing: Systems that Analyze and Describe Natural Language.

    ERIC Educational Resources Information Center

    O'Brien, Frank

    Two varieties of user-friendly computer systems that deal with natural language are now available, providing either at-the-monitor stylistic and grammatic correction of keyed-in writing or a sorting, selecting, and generating of statistical data for any written or spoken document. The editor programs, such as "The Writer's Workbench" (Bell…

  18. BRINGING NATURAL LANGUAGE PROCESSING TO THE MICROCOMPUTER MARKET THE STORY OF Q&A

    E-print Network

    BRINGING NATURAL LANGUAGE PROCESSING TO THE MICROCOMPUTER MARKET THE STORY OF Q&A Gary G. Hendrix of the reaction: * SoftSel, the largest US distributor of microcomputer software, publishes a biweekly "hot list to provide the microcomputer industry's most objective testing, gave Q&A the highest overall evaluation ever

  19. FORMAL AND NATURAL LANGUAGE GENERATION IN THE MERCURY CONVERSATIONAL SYSTEM

    E-print Network

    FORMAL AND NATURAL LANGUAGE GENERATION IN THE MERCURY CONVERSATIONAL SYSTEM Stephanie Seneff MERCURY flight reservation conversational system. Generation makes use of the GENESIS-II generation server that provides access to real flight information, world wide. The system, called Mercury, which has been im

  20. GENERATION - A NEW FRONTIER OF NATURAL LANGUAGE PROCESSING? Aravind K. Joshi

    E-print Network

    GENERATION - A NEW FRONTIER OF NATURAL LANGUAGE PROCESSING? Aravind K. Joshi Department of Computer and Information Science University of Pennsylvania Comprehension and generation are the two complementary aspects that comprehension is harder than generation, (2) problems in comprehension could be formulated in the AI paradigm

  1. Evolving Readable String Test Inputs Using a Natural Language Model to Reduce Human Oracle Cost

    E-print Network

    McMinn, Phil

    Evolving Readable String Test Inputs Using a Natural Language Model to Reduce Human Oracle Cost, checking software behaviour is frequently a painstakingly manual task. Despite the high cost of human time-consuming. One source of human oracle cost is the inherent unreadability of machine

  2. Evaluating a Natural Language Dialogue System: Results and Experiences , Anne De Roeck, Udo Kruschwitz,

    E-print Network

    Webb, Nick

    to measure subjective performance. We introduce a Natural Language Dialogue System, the YPA, developed-display advertisements. We examine the evaluation methodologies used to assess the performance of the YPA during. It has become important to be able to demonstrate the increased effectiveness of the DM component

  3. NATURAL LANGUAGE INPUT TO A COMPUTER-BASED GLAUCOMA CONSULTATION SYSTEM

    E-print Network

    NATURAL LANGUAGE INPUT TO A COMPUTER-BASED GLAUCOMA CONSULTATION SYSTEM Victor B. Ciesielski for a Computer-Based Glaucoma Consultation System is described. The system views a case as a description with a representation of the structured object GLAUCOMA-PATIENT. There is also a facility for adding domain dependent

  4. Extracting Information on Pneumonia in Infants Using Natural Language Processing of Radiology Reports

    E-print Network

    Extracting Information on Pneumonia in Infants Using Natural Language Processing of Radiology of healthcare-associated pneumonia in newborns (e.g. neonates) because it produces significant rates to identify healthcare-associated pneumonia in neonates. We estimated sensitivity, specificity

  5. The International English Language Testing System (IELTS): Its Nature and Development.

    ERIC Educational Resources Information Center

    Ingram, D. E.

    The nature and development of the recently released International English Language Testing System (IELTS) instrument are described. The test is the result of a joint Australian-British project to develop a new test for use with foreign students planning to study in English-speaking countries. It is expected that the modular instrument will become…

  6. Competitively Evolving Decision Trees Against Fixed Training Cases for Natural Language Processing

    E-print Network

    Fernandez, Thomas

    Competitively Evolving Decision Trees Against Fixed Training Cases for Natural Language Processing decision trees for the problem of word sense disambiguation. The decision trees contain embedded bit of overlearning, we have implemented a fitness penalty function specialized for decision trees which is dependent

  7. Visualization of health information with predications extracted using natural language processing and filtered using the UMLS.

    PubMed

    Miller, Trudi; Leroy, Gondy

    2008-01-01

    Increased availability of and reliance on written health information can tax the abilities of unskilled readers. We are developing a system that uses natural language processing to extract phrases, identify medical terms using the UMLS, and visualize the propositions. This system substantially reduces the amount of information a consumer must read, while providing an alternative to traditional prose based text. PMID:18999136

  8. An Analysis of Methods for Preparing a Large Natural Language Data Base.

    ERIC Educational Resources Information Center

    Porch, Ann

    Relative cost and effectiveness of techniques for preparing a computer compatible data base consisting of approximately one million words of natural language are outlined. Considered are dollar cost, ease of editing, and time consumption. Facility for insertion of identifying information within the text, and updating of a text by merging with…

  9. BIT BY BIT: A Game Simulating Natural Language Processing in Computers

    ERIC Educational Resources Information Center

    Kato, Taichi; Arakawa, Chuichi

    2008-01-01

    BIT BY BIT is an encryption game that is designed to improve students' understanding of natural language processing in computers. Participants encode clear words into binary code using an encryption key and exchange them in the game. BIT BY BIT enables participants who do not understand the concept of binary numbers to perform the process of…
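
    The abstract does not specify the encoding scheme used in the game. As a rough illustration of what encoding clear words into binary with a key can look like, the sketch below assumes 8-bit character codes combined with a repeating key by XOR. The scheme, the function names, and the example key are assumptions, not a description of BIT BY BIT itself.

```python
def to_bits(word):
    """Encode each character as an 8-bit binary string."""
    return "".join(f"{ord(ch):08b}" for ch in word)

def xor_bits(bits, key_bits):
    """XOR a bit string with a repeating key (hypothetical scheme)."""
    return "".join(str(int(b) ^ int(key_bits[i % len(key_bits)]))
                   for i, b in enumerate(bits))

def from_bits(bits):
    """Decode a string of 8-bit groups back into characters."""
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

key = "10110010"                        # example key, chosen arbitrarily
cipher = xor_bits(to_bits("CAT"), key)  # what one participant would send
plain = from_bits(xor_bits(cipher, key))  # what the receiver recovers
```

    Applying the same key twice restores the original bits, so sender and receiver only need to share the key.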

  10. The Linguistic Correlates of Conversational Deception: Comparing Natural Language Processing Technologies

    ERIC Educational Resources Information Center

    Duran, Nicholas D.; Hall, Charles; McCarthy, Philip M.; McNamara, Danielle S.

    2010-01-01

    The words people use and the way they use them can reveal a great deal about their mental states when they attempt to deceive. The challenge for researchers is how to reliably distinguish the linguistic features that characterize these hidden states. In this study, we use a natural language processing tool called Coh-Metrix to evaluate deceptive…

  11. An Evaluation of Help Mechanisms in Natural Language Information Retrieval Systems.

    ERIC Educational Resources Information Center

    Kreymer, Oleg

    2002-01-01

    Evaluates the current state of natural language processing information retrieval systems from the user's point of view, focusing on the structure and components of the systems' help mechanisms. Topics include user/system interaction; semantic parsing; syntactic parsing; semantic mapping; and concept matching. (Author/LRW)

  12. Effectiveness and Efficiency in Natural Language Processing for Large Amounts of Text.

    ERIC Educational Resources Information Center

    Ruge, Gerda; And Others

    1991-01-01

    Describes a system that was developed in Germany for natural language processing (NLP) to improve free text analysis for information retrieval. Techniques from empirical linguistics are discussed, system architecture is explained, and rules for dealing with conjunctions in dependency analysis for free text processing are proposed. (13 references)…

  13. A Sublanguage Approach to Natural Language Processing for an Expert System.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth D.; And Others

    1993-01-01

    Reports on the development of an NLP (natural language processing) component for processing the free-text comments on life insurance applications for evaluation by an underwriting expert system. A sublanguage grammar approach with strong reliance on semantic word classes is described. Highlights include lexical analysis, adjacency analysis, and…

  14. Discrimination of Coronal Stops by Bilingual Adults: The Timing and Nature of Language Interaction

    ERIC Educational Resources Information Center

    Sundara, Megha; Polka, Linda

    2008-01-01

    The current study was designed to investigate the timing and nature of interaction between the two languages of bilinguals. For this purpose, we compared discrimination of Canadian French and Canadian English coronal stops by simultaneous bilingual, monolingual and advanced early L2 learners of French and English. French /d/ is phonetically…

  15. CAL Abstract Can natural language recognition technologies be used to enhance the learning

    E-print Network

    experience of young children? Background Natural language as a bridge to useable technology. The features that although the use of a computer did not improve the quality or quantity of children's writing, the high. Given that there are benefits to be gained from the use of word processing software, children

  16. Extracting Database Role Based Access Control from Unconstrained Natural Language Text

    E-print Network

    Young, R. Michael

    the database. Insider threats exist if individuals can bypass application level security and access-based process to 1) parse existing, unaltered natural language documents such as requirements and policy the "gold standard" [10], to enforce authorization policies within systems. However, many software systems

  17. Drawing Dynamic Geometry Figures Online with Natural Language for Junior High School Geometry

    ERIC Educational Resources Information Center

    Wong, Wing-Kwong; Yin, Sheng-Kai; Yang, Chang-Zhe

    2012-01-01

    This paper presents a tool for drawing dynamic geometric figures by understanding the texts of geometry problems. With the tool, teachers and students can construct dynamic geometric figures on a web page by inputting a geometry problem in natural language. First we need to build the knowledge base for understanding geometry problems. With the…

  18. MathNat - Mathematical Text in a Controlled Natural Language

    E-print Network

    Abstract. The MathNat project aims at being a first step towards automatic formalisation and verification is a precisely defined subset of English with restricted grammar and dictionary. To make CLM natural and formal mathematics, reduces the usefulness of computer assisted theorem proving in learning, teaching

  19. NATURAL LANGUAGE AND COMPUTER INTERFACE DESIGN MURRAY TUROFF

    E-print Network

    and precise in nature. Pilots and police are good examples of this. Even working groups within a field allow very young children or deaf persons to better utilize the computer. I see it as immoral- lieve, far more beneficial to the child than giving him canned lessons as his or her first impression

  20. A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text Kenneth Ward Church

    E-print Network

    A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text Kenneth Ward Church Bell Laboratories 600 Mountain Ave. Murray Hill, N.J., USA 201-582-5325 alice!k-wc It is well-known that part

  1. The Nature of the Language Faculty and Its Implications for Evolution of Language (Reply to Fitch, Hauser, and Chomsky)

    ERIC Educational Resources Information Center

    Jackendoff, Ray; Pinker, Steven

    2005-01-01

    In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the ''narrow language faculty'') consists only of recursion, and that this part cannot be considered an adaptation to communication. We…

  2. Naturalism and Ideological Work: How Is Family Language Policy Renegotiated as Both Parents and Children Learn a Threatened Minority Language?

    ERIC Educational Resources Information Center

    Armstrong, Timothy Currie

    2014-01-01

    Parents who enroll their children to be educated through a threatened minority language frequently do not speak that language themselves and classes in the language are sometimes offered to parents in the expectation that this will help them to support their children's education and to use the minority language in the home. Providing…

  3. TR-IIS-07-017 BibPro: A Citation Parser Based on

    E-print Network

    Chen, Sheng-Wei

    TR-IIS-07-017 BibPro: A Citation Parser Based on Sequence Alignment Techniques Kai-Hsiang Yang.sinica.edu.tw/page/library/LIB/TechReport/tr2007/tr07.html BibPro: A Citation Parser Based on Sequence Alignment Techniques Kai-Hsiang Yang the publications utilize many different citation formats, the problem of accurately extracting metadata from

  4. A Cognitive Neural Architecture Able to Learn and Communicate through Natural Language

    PubMed Central

    Golosio, Bruno; Cangelosi, Angelo; Gamotina, Olesya; Masala, Giovanni Luca

    2015-01-01

    Communicative interactions involve a kind of procedural knowledge that is used by the human brain for processing verbal and nonverbal inputs and for language production. Although considerable work has been done on modeling human language abilities, it has been difficult to bring them together to a comprehensive tabula rasa system compatible with current knowledge of how verbal information is processed in the brain. This work presents a cognitive system, entirely based on a large-scale neural architecture, which was developed to shed light on the procedural knowledge involved in language elaboration. The main component of this system is the central executive, which is a supervising system that coordinates the other components of the working memory. In our model, the central executive is a neural network that takes as input the neural activation states of the short-term memory and yields as output mental actions, which control the flow of information among the working memory components through neural gating mechanisms. The proposed system is capable of learning to communicate through natural language starting from tabula rasa, without any a priori knowledge of the structure of phrases, meaning of words, role of the different classes of words, only by interacting with a human through a text-based interface, using an open-ended incremental learning process. It is able to learn nouns, verbs, adjectives, pronouns and other word classes, and to use them in expressive language. The model was validated on a corpus of 1587 input sentences, based on literature on early language assessment, at the level of a child about 4 years old, and produced 521 output sentences, expressing a broad range of language processing functionalities. PMID:26560154

  5. A Natural Language for AdS/CFT Correlators

    SciTech Connect

    Fitzpatrick, A.Liam; Kaplan, Jared; Penedones, Joao; Raju, Suvrat; van Rees, Balt C.; /YITP, Stony Brook

    2012-02-14

    We provide dramatic evidence that 'Mellin space' is the natural home for correlation functions in CFTs with weakly coupled bulk duals. In Mellin space, CFT correlators have poles corresponding to an OPE decomposition into 'left' and 'right' sub-correlators, in direct analogy with the factorization channels of scattering amplitudes. In the regime where these correlators can be computed by tree level Witten diagrams in AdS, we derive an explicit formula for the residues of Mellin amplitudes at the corresponding factorization poles, and we use the conformal Casimir to show that these amplitudes obey algebraic finite difference equations. By analyzing the recursive structure of our factorization formula we obtain simple diagrammatic rules for the construction of Mellin amplitudes corresponding to tree-level Witten diagrams in any bulk scalar theory. We prove the diagrammatic rules using our finite difference equations. Finally, we show that our factorization formula and our diagrammatic rules morph into the flat space S-Matrix of the bulk theory, reproducing the usual Feynman rules, when we take the flat space limit of AdS/CFT. Throughout we emphasize a deep analogy with the properties of flat space scattering amplitudes in momentum space, which suggests that the Mellin amplitude may provide a holographic definition of the flat space S-Matrix.

  6. Adapting existing natural language processing resources for cardiovascular risk factors identification in clinical notes.

    PubMed

    Khalifa, Abdulrahman; Meystre, Stéphane

    2015-12-01

    The 2014 i2b2 natural language processing shared task focused on identifying cardiovascular risk factors such as high blood pressure, high cholesterol levels, obesity and smoking status among other factors found in health records of diabetic patients. In addition, the task involved detecting medications, and time information associated with the extracted data. This paper presents the development and evaluation of a natural language processing (NLP) application conceived for this i2b2 shared task. For increased efficiency, the application main components were adapted from two existing NLP tools implemented in the Apache UIMA framework: Textractor (for dictionary-based lookup) and cTAKES (for preprocessing and smoking status detection). The application achieved a final (micro-averaged) F1-measure of 87.5% on the final evaluation test set. Our attempt was mostly based on existing tools adapted with minimal changes and allowed for satisfying performance with limited development efforts. PMID:26318122
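
    The micro-averaged F1-measure reported in this abstract pools true positives, false positives, and false negatives over all categories before computing precision and recall. A minimal sketch of that calculation follows. The category names and counts are invented for illustration and are not results from the paper.

```python
def micro_f1(counts):
    """counts maps label -> (true_pos, false_pos, false_neg)."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical per-category counts, not taken from the i2b2 evaluation.
example = {
    "hypertension": (90, 10, 5),
    "obesity": (40, 5, 10),
    "smoker": (20, 5, 5),
}
score = micro_f1(example)
```

    Because the counts are pooled first, frequent categories influence the micro-averaged score more than rare ones, unlike a macro average that weights each category equally.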

  7. Using Open Geographic Data to Generate Natural Language Descriptions for Hydrological Sensor Networks

    PubMed Central

    Molina, Martin; Sanchez-Soriano, Javier; Corcho, Oscar

    2015-01-01

    Providing descriptions of isolated sensors and sensor networks in natural language, understandable by the general public, is useful to help users find relevant sensors and analyze sensor data. In this paper, we discuss the feasibility of using geographic knowledge from public databases available on the Web (such as OpenStreetMap, Geonames, or DBpedia) to automatically construct such descriptions. We present a general method that uses such information to generate sensor descriptions in natural language. The results of the evaluation of our method in a national hydrological sensor network showed that this approach is feasible and capable of generating adequate sensor descriptions with a lower development effort compared to other approaches. In the paper we also analyze certain problems that we found in public databases (e.g., heterogeneity, non-standard use of labels, or rigid search methods) and their impact on the generation of sensor descriptions. PMID:26151211

  8. New Trends in Computing Anticipatory Systems : Emergence of Artificial Conscious Intelligence with Machine Learning Natural Language

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2008-10-01

    This paper deals with the challenge of creating an Artificial Intelligence System with an Artificial Consciousness. For that, an introduction to computing anticipatory systems is presented, with the definitions of strong and weak anticipation. The quasi-anticipatory systems of Robert Rosen are linked to open-loop controllers. Then, some properties of the natural brain are presented in relation to the triune brain theory of Paul D. MacLean, and the mind time of Benjamin Libet, with his veto of the free will. The theory of the hyperincursive discrete anticipatory systems is recalled in order to introduce the concept of hyperincursive free will, which gives a similar veto mechanism: free will as unpredictable hyperincursive anticipation. The concepts of endo-anticipation and exo-anticipation are then defined. Finally, some ideas about artificial conscious intelligence with natural language are presented, in relation to the Turing Machine, Formal Language, Intelligent Agents and Multi-Agent System.

  9. Laboratory process control using natural language commands from a personal computer

    NASA Technical Reports Server (NTRS)

    Will, Herbert A.; Mackin, Michael A.

    1989-01-01

    PC software is described which provides flexible natural language process control capability with an IBM PC or compatible machine. Hardware requirements include the PC and suitable hardware interfaces to all controlled devices. Software required includes the Microsoft Disk Operating System (MS-DOS), a PC-based FORTRAN-77 compiler, and user-written device drivers. Instructions for use of the software are given as well as a description of an application of the system.

  10. Natural language processing-based COTS software and related technologies survey.

    SciTech Connect

    Stickland, Michael G.; Conrad, Gregory N.; Eaton, Shelley M.

    2003-09-01

    Natural language processing-based knowledge management software, traditionally developed for security organizations, is now becoming commercially available. An informal survey was conducted to discover and examine current NLP and related technologies and potential applications for information retrieval, information extraction, summarization, categorization, terminology management, link analysis, and visualization for possible implementation at Sandia National Laboratories. This report documents our current understanding of the technologies, lists software vendors and their products, and identifies potential applications of these technologies.

  11. Agile sensor tasking for CoIST using natural language knowledge representation and reasoning

    NASA Astrophysics Data System (ADS)

    Braines, David; de Mel, Geeth; Gwilliams, Chris; Parizas, Christos; Pizzocaro, Diego; Bergamaschi, Flavio; Preece, Alun

    2014-06-01

    We describe a system architecture aimed at supporting Intelligence, Surveillance, and Reconnaissance (ISR) activities in a Company Intelligence Support Team (CoIST) using natural language-based knowledge representation and reasoning, and semantic matching of mission tasks to ISR assets. We illustrate an application of the architecture using a High Value Target (HVT) surveillance scenario which demonstrates semi-automated matching and assignment of appropriate ISR assets based on information coming in from existing sensors and human patrols operating in an area of interest and encountering a potential HVT vehicle. We highlight a number of key components of the system but focus mainly on the human/machine conversational interaction involving soldiers on the field providing input in natural language via spoken voice to a mobile device, which is then processed to machine-processable Controlled Natural Language (CNL) and confirmed with the soldier. The system also supports CoIST analysts obtaining real-time situation awareness on the unfolding events through fused CNL information via tools available at the Command and Control (C2). The system demonstrates various modes of operation including: automatic task assignment following inference of new high-importance information, as well as semi-automatic processing, providing the CoIST analyst with situation awareness information relevant to the area of operation.

  12. Using Natural Language And Voice To Control High Level Tasks In A Robotics Environment

    NASA Astrophysics Data System (ADS)

    Hackenberg, Robert G.

    1987-03-01

    RCA's Advanced Technology Laboratories (ATL) has implemented an integrated system which permits control of high level tasks in a robotics environment through voice input in the form of natural language syntax. The paper to be presented will outline the architecture used to integrate voice recognition and synthesis hardware and natural language and intelligent reasoning software with a supervisory processor that controls robotic and vision operations in the robotic testbed. The application is intended to give the human operator of a Puma 782 industrial robot the ability to combine joystick teleoperation with voice input in order to provide a flexible man-machine interface in a hands-busy environment. The system is designed to give the operator a speech interface which is unobtrusive and undemanding in terms of predetermined syntax requirements. The voice recognizer accepts continuous speech and the natural language processor accepts full and partial sentence fragments and can perform a fair amount of disambiguation and context analysis. Output to the operator comes via the parallel channel of speech synthesis so that the operator does not have to consult the computer's CRT for messages. The messages are generated from the software and offer warnings about unacceptable situations, confirmations of actions completed, and feedback of system data.

  13. Discriminative Syntactic Language Modeling for Speech Recognition Michael Collins

    E-print Network

    Saraçlar, Murat

    Discriminative Syntactic Language Modeling for Speech Recognition Michael Collins MIT CSAIL is parsed using the statistical parser of Collins (1999) to give a parse tree T (w). Information from vector ¯ using the perceptron algorithm (Collins, 2004; Collins, 2002). The perceptron algorithm

  14. Grammar as a Programming Language. Artificial Intelligence Memo 391.

    ERIC Educational Resources Information Center

    Rowe, Neil

    Student projects that involve writing generative grammars in the computer language, "LOGO," are described in this paper, which presents a grammar-running control structure that allows students to modify and improve the grammar interpreter itself while learning how a simple kind of computer parser works. Included are procedures for programing a…

  15. Exploring culture, language and the perception of the nature of science

    NASA Astrophysics Data System (ADS)

    Sutherland, Dawn

    2002-01-01

    One dimension of early Canadian education is the attempt of the government to use the education system as an assimilative tool to integrate the First Nations and Métis people into Euro-Canadian society. Despite these attempts, many First Nations and Métis people retained their culture and their indigenous language. Few science educators have examined First Nations and Western scientific worldviews and the impact they may have on science learning. This study explored the views some First Nations (Cree) and Euro-Canadian Grade-7-level students in Manitoba had about the nature of science. Both qualitative (open-ended questions and interviews) and quantitative (a Likert-scale questionnaire) instruments were used to explore student views. A central hypothesis to this research programme is the possibility that the different world-views of two student populations, Cree and Euro-Canadian, are likely to influence their perceptions of science. This preliminary study explored a range of methodologies to probe the perceptions of the nature of science in these two student populations. It was found that the two cultural groups differed significantly between some of the tenets in a Nature of Scientific Knowledge Scale (NSKS). Cree students significantly differed from Euro-Canadian students on the developmental, testable and unified tenets of the nature of scientific knowledge scale. No significant differences were found in NSKS scores between language groups (Cree students who speak English in the home and those who speak English and Cree or Cree only). The differences found between language groups were primarily in the open-ended questions where preformulated responses were absent. Interviews about critical incidents provided more detailed accounts of the Cree students' perception of the nature of science. The implications of the findings of this study are discussed in relation to the challenges related to research methodology, further areas for investigation, science teaching in First Nations communities and science curriculum development.

  16. Formalizing natural-language spatial relations descriptions with fuzzy decision tree algorithm

    NASA Astrophysics Data System (ADS)

    Xu, Jun; Yao, Changqing

    2006-10-01

    People usually use qualitative terms to express spatial relations, while current geographic information systems (GIS) all use quantitative approaches to store spatial information. The abilities of current GIS to represent and query spatial information about geographic space are limited. In order to incorporate the concepts and methods people use to infer information about geographic space into GIS, research on the formal model of common sense geography becomes increasingly important. Previous research on the formalization of natural-language descriptions of spatial relations is all based on crisp classification algorithms. But human language about spatial relations is ambiguous. There is no clear boundary between "yes" and "no" when deciding whether a spatial relation predicate can express the spatial relations between objects. So the results of crisp classification algorithms cannot formalize natural-language terms well. This paper uses a fuzzy decision tree method to formalize the spatial relations between two linear objects. Topologic and metric indices are used as variables, and the results of a human-subject test are used as training data. The formalization result of the fuzzy decision tree is compared with the result of a crisp decision tree.

  17. Using a Broad-Coverage Parser for Word-Breaking in Japanese Hisami Suzuki, Chris Brockett and Gary Kacmarcik

    E-print Network

    Using a Broad-Coverage Parser for Word-Breaking in Japanese Hisami Suzuki, Chris Brockett and Gary Kacmarcik Microsoft Research One Microsoft Way Redmond WA 98052 USA {hisamis, chrisbkt, garykac}@microsoft.com Abstract We describe a method of word segmentation in Japanese in which a broad-coverage parser selects

  18. Language processor generation with BNF inputs: methods and implementation.

    PubMed

    Shapiro, B

    1977-06-01

    This paper describes an SLR(1) parser generator written in SAIL for the PDP-10. It accepts grammars defined in a BNF formalism and produces a SAIL program module which is the bottom-up parser produced from the grammar. This module may then be compiled and loaded with semantic routines to produce a language processor. The generator is written in SAIL with a heavy emphasis placed on the use of the LEAP facilities of SAIL for the manipulation of the data structures required for the generator. PMID:862394
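
    Illustrative note (not taken from the cited paper, whose generator is written in SAIL): an SLR(1) generator relies on FOLLOW sets to decide when a reduction is allowed. The Python sketch below computes FIRST and FOLLOW sets for a BNF-style grammar under the simplifying assumption that the grammar has no empty productions; the dictionary encoding of the grammar and the function names are hypothetical.

```python
def first_sets(grammar):
    """FIRST sets for a grammar without empty productions."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    """FOLLOW sets, iterated to a fixed point; '$' marks end of input."""
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue  # terminals have no FOLLOW set
                    if i + 1 < len(rhs):
                        nxt = rhs[i + 1]
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:
                        new = follow[nt]  # sym ends the production
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


# Classic expression grammar: E -> E + T | T ; T -> T * F | F ; F -> ( E ) | id
grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}
follow = follow_sets(grammar, "E")
print(sorted(follow["E"]))
```

    In an SLR(1) table, a completed item A -> α triggers a reduce action only on the terminals in FOLLOW(A).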

  19. Evaluation of two dependency parsers on biomedical corpus targeted at protein-protein interactions.

    PubMed

    Pyysalo, Sampo; Ginter, Filip; Pahikkala, Tapio; Boberg, Jorma; Järvinen, Jouni; Salakoski, Tapio

    2006-06-01

    We present an evaluation of Link Grammar and Connexor Machinese Syntax, two major broad-coverage dependency parsers, on a custom hand-annotated corpus consisting of sentences regarding protein-protein interactions. In the evaluation, we apply the notion of an interaction subgraph, which is the subgraph of a dependency graph expressing a protein-protein interaction. We measure the performance of the parsers for recovery of individual dependencies, fully correct parses, and interaction subgraphs. For Link Grammar, an open system that can be inspected in detail, we further perform a comprehensive failure analysis, report specific causes of error, and suggest potential modifications to the grammar. We find that both parsers perform worse on biomedical English than previously reported on general English. While Connexor Machinese Syntax significantly outperforms Link Grammar, the failure analysis suggests specific ways in which the latter could be modified for better performance in the domain. PMID:16099201

  20. LINGUISTICS AND AUTOMATED LANGUAGE PROCESSING 1 0.1 This paper is concerned with natural language, computers,

    E-print Network

    linguists with a proper perspective on automated language processing, or computer scientists with a proper with the perspective of linguists on automated language processing and computer scientists on linguistics; Section 2 this discussion with a brief inquiry into the sources of the common focus of linguists and computer scientists

  1. Proceedings of the Workshop on Language Processing and Crisis Information 2013, pages 44-50, Nagoya, Japan, 14 October 2013. ©2013 Asian Federation of Natural Language Processing

    E-print Network

    , often referred to in Japan as the Great East Japan Earthquake, was a magnitude 9.0 under sea megathrust, Japan, 14 October 2013. c 2013 Asian Federation of Natural Language Processing Returning-Home Analysis in Tokyo Metropolitan Area at the time of the Great East Japan Earthquake using Twitter Data Yusuke Hara

  2. Proceedings of the Workshop on Language Processing and Crisis Information 2013, pages 19-25, Nagoya, Japan, 14 October 2013. ©2013 Asian Federation of Natural Language Processing

    E-print Network

    , Japan, 14 October 2013. c 2013 Asian Federation of Natural Language Processing Rescue Activity for the Great East Japan Earthquake Based on a Website that Extracts Rescue Requests from the Net Shin Aida@nict.go.jp Abstract At the early phase of the Great East Japan Earthquake a vast number of tweets were made on Twitter

  3. Evaluation of unsupervised semantic mapping of natural language with Leximancer concept mapping.

    PubMed

    Smith, Andrew E; Humphreys, Michael S

    2006-05-01

    The Leximancer system is a relatively new method for transforming lexical co-occurrence information from natural language into semantic patterns in an unsupervised manner. It employs two stages of co-occurrence information extraction (semantic and relational), using a different algorithm for each stage. The algorithms used are statistical, but they employ nonlinear dynamics and machine learning. This article is an attempt to validate the output of Leximancer, using a set of evaluation criteria taken from content analysis that are appropriate for knowledge discovery tasks. PMID:16956103
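
    Illustrative note (not Leximancer's algorithm, which the abstract does not specify): lexical co-occurrence information is commonly gathered by counting word pairs that appear within a short window of each other. The sketch below assumes whitespace tokenization and a forward-looking window; the function name and the sample sentence are hypothetical.

```python
from collections import Counter


def cooccurrence_counts(tokens, window=2):
    """Count unordered word pairs that occur within `window` tokens of each other."""
    counts = Counter()
    for i, word in enumerate(tokens):
        # Look only ahead so that each pair of positions is counted once.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                counts[tuple(sorted((word, other)))] += 1
    return counts


text = "the parser reads the sentence and the parser builds a tree"
counts = cooccurrence_counts(text.split(), window=2)
print(counts[("parser", "the")])
```

    Counts of this kind are raw material only; turning them into semantic patterns requires a further statistical or learning step.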

  4. Knowledge acquisition from natural language for expert systems based on classification problem-solving methods

    NASA Technical Reports Server (NTRS)

    Gomez, Fernando

    1989-01-01

    It is shown how certain kinds of domain independent expert systems based on classification problem-solving methods can be constructed directly from natural language descriptions by a human expert. The expert knowledge is not translated into production rules. Rather, it is mapped into conceptual structures which are integrated into long-term memory (LTM). The resulting system is one in which problem-solving, retrieval and memory organization are integrated processes. In other words, the same algorithm and knowledge representation structures are shared by these processes. As a result of this, the system can answer questions, solve problems or reorganize LTM.

  5. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING: SPECIAL ISSUE ON SOFTWARE LANGUAGE ENGINEERING 1 Grammar Recovery from Parse Trees and

    E-print Network

    Malloy, Brian

    . However, a grammar is not always available for a language, and in these cases, acquiring a grammar of a case study in which we recover and refactor a grammar from version 4.0.0 of the GNU C++ parser IEEE TRANSACTIONS ON SOFTWARE ENGINEERING: SPECIAL ISSUE ON SOFTWARE LANGUAGE ENGINEERING 1 Grammar

  6. Knowledge-based machine indexing from natural language text: Knowledge base design, development, and maintenance

    NASA Technical Reports Server (NTRS)

    Genuardi, Michael T.

    1993-01-01

    One strategy for machine-aided indexing (MAI) is to provide a concept-level analysis of the textual elements of documents or document abstracts. In such systems, natural-language phrases are analyzed in order to identify and classify concepts related to a particular subject domain. The overall performance of these MAI systems is largely dependent on the quality and comprehensiveness of their knowledge bases. These knowledge bases function to (1) define the relations between a controlled indexing vocabulary and natural language expressions; (2) provide a simple mechanism for disambiguation and the determination of relevancy; and (3) allow the extension of concept-hierarchical structure to all elements of the knowledge file. After a brief description of the NASA Machine-Aided Indexing system, concerns related to the development and maintenance of MAI knowledge bases are discussed. Particular emphasis is given to statistically-based text analysis tools designed to aid the knowledge base developer. One such tool, the Knowledge Base Building (KBB) program, presents the domain expert with a well-filtered list of synonyms and conceptually-related phrases for each thesaurus concept. Another tool, the Knowledge Base Maintenance (KBM) program, functions to identify areas of the knowledge base affected by changes in the conceptual domain (for example, the addition of a new thesaurus term). An alternate use of the KBM as an aid in thesaurus construction is also discussed.

  7. Testing Grammars For Top-Down Parsers A.M. Paracha and F. Franek

    E-print Network

    Franek, Frantisek

    grammar used. Test cases should cover all possible valid and invalid input conditions. One of the major is specified by means of a formal grammar. A grammar is the main input for the test case generation process Testing Grammars For Top-Down Parsers A.M. Paracha and F. Franek Dept. of Computing and Software Mc

  8. Reachability Analysis of the HTML5 Parser Specification and its Application to

    E-print Network

    Minamide, Yasuhiko

    Reachability Analysis of the HTML5 Parser Specification and its Application to Compatibility for HTML, HTML5, includes the detailed specification of the parsing algorithm for HTML5 documents, includ of HTML5 and automatically generate HTML documents to test compatibilities of Web browsers. The set

  9. Evalita'09 Parsing Task: constituency parsers and the Penn format for Italian

    E-print Network

    Mazzei, Alessandro

    Evalita'09 Parsing Task: constituency parsers and the Penn format for Italian Cristina Bosco is at defining and extending the state of the art for parsing Italian by encouraging the application of existing and constituency. This second track is based on a development set in a format, which is an adaptation for Italian

  10. BibPro: A Citation Parser Based on Sequence Alignment Techniques Chien-Chih Chen1

    E-print Network

    Yang, Kai-Hsiang

    BibPro: A Citation Parser Based on Sequence Alignment Techniques Chien-Chih Chen1 , Kai-Hsiang Yang in different conferences and journals follow different citation formats, so the problem of accurately citation strings by using a gene sequence alignment tool. The main enhancement of BibPro to our previously

  11. Belief Ascription and Model Generative Reasoning: joining two paradigms to a robust parser of messages.

    E-print Network

    Hartley, Roger

    a semantics-based parser for robust parsing of noisy message data. ViewGen represents the beliefs of agents access to both Blue data-bases and military doctrine (in terms of beliefs and goals) and, in certain of messages. Yorick Wilks and Roger Hartley Computing Research Laboratory New Mexico State University Box

  12. Belief Ascription and Model Generative Reasoning: joining two paradigms to a robust parser of messages.

    E-print Network

    a semantics-based parser for robust parsing of noisy message data. ViewGen represents the beliefs of agents and data, has access to both Blue data-bases and military doctrine (in terms of beliefs and goals) and of messages. Yorick Wilks and Roger Hartley Computing Research Laboratory New Mexico State University Box

  13. Building Languages

    MedlinePLUS

    ... Language American Sign Language (ASL) Conceptually Accurate Signed English (CASE) Cued Speech Finger Spelling Listening/Auditory Training Manually Coded English Natural Gestures Speech Speech Reading Family Decision Making ...

  14. Abductive Equivalential Translation and its application to Natural Language Database Interfacing

    NASA Astrophysics Data System (ADS)

    Rayner, Manny

    1994-05-01

    The thesis describes a logical formalization of natural-language database interfacing. We assume the existence of a "natural language engine" capable of mediating between surface linguistic strings and their representations as "literal" logical forms: the focus of interest will be the question of relating "literal" logical forms to representations in terms of primitives meaningful to the underlying database engine. We begin by describing the nature of the problem, and show how a variety of interface functionalities can be considered as instances of a type of formal inference task which we call "Abductive Equivalential Translation" (AET); functionalities which can be reduced to this form include answering questions, responding to commands, reasoning about the completeness of answers, answering meta-questions of type "Do you know...", and generating assertions and questions. In each case, a "linguistic domain theory" (LDT) ? and an input formula F are given, and the goal is to construct a formula with certain properties which is equivalent to F, given ? and a set of permitted assumptions. If the LDT is of a certain specified type, whose formulas are either conditional equivalences or Horn-clauses, we show that the AET problem can be reduced to a goal-directed inference method. We present an abstract description of this method, and sketch its realization in Prolog. The relationship between AET and several problems previously discussed in the literature is examined. In particular, we show how AET can provide a simple and elegant solution to the so-called "Doctor on Board" problem, and in effect allows a "relativization" of the Closed World Assumption. The ideas in the thesis have all been implemented concretely within the SRI CLARE project, using a real projects and payments database. The LDT for the example database is described in detail, and examples of the types of functionality that can be achieved within the example domain are presented.
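
    Illustrative note (not the AET method itself, which also handles conditional equivalences and permitted assumptions, and which the thesis realizes in Prolog): goal-directed inference over Horn clauses can be pictured as backward chaining from a goal to known facts. The Python sketch below is propositional only; the rule and fact names are hypothetical and the depth limit is an assumption added to guard against cyclic rules.

```python
def prove(goal, rules, facts, depth=0, max_depth=20):
    """Backward chaining over propositional Horn clauses.

    rules: list of (head, [body atoms]); facts: set of atoms known to hold.
    """
    if goal in facts:
        return True
    if depth >= max_depth:
        return False  # give up rather than loop on cyclic rules
    for head, body in rules:
        if head == goal and all(
            prove(atom, rules, facts, depth + 1, max_depth) for atom in body
        ):
            return True
    return False


# Hypothetical toy theory, for illustration only.
rules = [
    ("answerable", ["has_record", "record_complete"]),
    ("has_record", ["in_database"]),
]
facts = {"in_database", "record_complete"}
print(prove("answerable", rules, facts))
```

    The search starts from the query and works back toward database facts, which is the sense in which the inference is goal-directed.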

  15. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1235-1245, Seattle, Washington, USA, 18-21 October 2013. ©2013 Association for Computational Linguistics

    E-print Network

    Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1235 applying AGENT THEME LOCATION cosmetics face (c) woman putting AGENT THEME makeup (d) woman

  16. The nature of the visual environment induces implicit biases during language-mediated visual search.

    PubMed

    Huettig, Falk; McQueen, James M

    2011-08-01

    Four eyetracking experiments examined whether semantic and visual-shape representations are routinely retrieved from printed word displays and used during language-mediated visual search. Participants listened to sentences containing target words that were similar semantically or in shape to concepts invoked by concurrently displayed printed words. In Experiment 1, the displays contained semantic and shape competitors of the targets along with two unrelated words. There were significant shifts in eye gaze as targets were heard toward semantic but not toward shape competitors. In Experiments 2-4, semantic competitors were replaced with unrelated words, semantically richer sentences were presented to encourage visual imagery, or participants rated the shape similarity of the stimuli before doing the eyetracking task. In all cases, there were no immediate shifts in eye gaze to shape competitors, even though, in response to the Experiment 1 spoken materials, participants looked to these competitors when they were presented as pictures (Huettig & McQueen, 2007). There was a late shape-competitor bias (more than 2,500 ms after target onset) in all experiments. These data show that shape information is not used in online search of printed word displays (whereas it is used with picture displays). The nature of the visual environment appears to induce implicit biases toward particular modes of processing during language-mediated visual search. PMID:21461784

  17. Gender Differences in Natural Language Factors of Subjective Intoxication in College Students: An Experimental Vignette Study

    PubMed Central

    Levitt, Ash; Schlauch, Robert C.; Bartholow, Bruce D.; Sher, Kenneth J.

    2013-01-01

    Background: Examining the natural language college students use to describe various levels of intoxication can provide important insight into subjective perceptions of college alcohol use. Previous research (Levitt et al., 2009) has shown that intoxication terms reflect moderate and heavy levels of intoxication, and that self-use of these terms differs by gender among college students. However, it is still unknown whether these terms similarly apply to other individuals and, if so, whether similar gender differences exist. Method: To address these issues, the current study examined the application of intoxication terms to characters in experimentally manipulated vignettes of naturalistic drinking situations within a sample of university undergraduates (N = 145). Results: Findings supported and extended previous research by showing that other-directed applications of intoxication terms are similar to self-directed applications, and depend on the gender of both the target and the user. Specifically, moderate intoxication terms were applied to and from women more than men, even when the character was heavily intoxicated, whereas heavy intoxication terms were applied to and from men more than women. Conclusions: The findings suggest that gender differences in the application of intoxication terms are other-directed as well as self-directed, and that intoxication language can inform gender-specific prevention and intervention efforts targeting problematic alcohol use among college students. PMID:23841828

  18. Wikipedia and Medicine: Quantifying Readership, Editors, and the Significance of Natural Language

    PubMed Central

    West, Andrew G

    2015-01-01

    Background: Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of health care information by both professionals and the lay public. Objective: This paper quantifies the production and consumption of Wikipedia’s medical content along 4 dimensions. First, we measured the amount of medical content in both articles and bytes and, second, the citations that supported that content. Third, we analyzed the medical readership against that of other health care websites between Wikipedia’s natural language editions and its relationship with disease prevalence. Fourth, we surveyed the quantity/characteristics of Wikipedia’s medical contributors, including year-over-year participation trends and editor demographics. Methods: Using a well-defined categorization infrastructure, we identified medically pertinent English-language Wikipedia articles and links to their foreign language equivalents. With these, Wikipedia can be queried to produce metadata and full texts for entire article histories. Wikipedia also makes available hourly reports that aggregate reader traffic at per-article granularity. An online survey was used to determine the background of contributors. Standard mining and visualization techniques (eg, aggregation queries, cumulative distribution functions, and/or correlation metrics) were applied to each of these datasets. Analysis focused on year-end 2013, but historical data permitted some longitudinal analysis. Results: Wikipedia’s medical content (at the end of 2013) was made up of more than 155,000 articles and 1 billion bytes of text across more than 255 languages. This content was supported by more than 950,000 references. Content was viewed more than 4.88 billion times in 2013. This makes it one of the most viewed medical resources globally, if not the most viewed. The core editor community numbered fewer than 300 and declined over the past 5 years. Half of the members of this community were health care providers and 85.5% (100/117) had a university education. Conclusions: Although Wikipedia has a considerable volume of multilingual medical content that is extensively read and well-referenced, the core group of editors that contribute and maintain that content is small and shrinking in size. PMID:25739399

  19. Language in Nature: on the Evolutionary Roots of a Cultural Phenomenon (draft chapter for The Language Phenomenon)

    E-print Network

    Koolen, Marijn

    for The Language Phenomenon) Willem Zuidema 1. Introduction What distinguishes Man from beast? For all of human of the popular answers. Humans might walk upright more than any other ape, have less hair, be better at long, are differences of degree and not of kind. One answer, however, has survived all serious scrutiny: humans have

  20. The Usual and the Unusual: Solving Remote Associates Test Tasks Using Simple Statistical Natural Language Processing Based on Language Use

    ERIC Educational Resources Information Center

    Klein, Ariel; Badia, Toni

    2015-01-01

    In this study we show how complex creative relations can arise from fairly frequent semantic relations observed in everyday language. By doing this, we reflect on some key cognitive aspects of linguistic and general creativity. In our experimentation, we automated the process of solving a battery of Remote Associates Test tasks. By applying…

  1. Detection of practice pattern trends through Natural Language Processing of clinical narratives and biomedical literature.

    PubMed

    Chen, Elizabeth S; Stetson, Peter D; Lussier, Yves A; Markatou, Marianthi; Hripcsak, George; Friedman, Carol

    2007-01-01

    Clinical knowledge, best evidence, and practice patterns evolve over time. The ability to track these changes and study practice trends may be valuable for performance measurement and quality improvement efforts. The goal of this study was to assess the feasibility and validity of methods to generate and compare trends in biomedical literature and clinical narrative. We focused on the challenge of detecting trends in medication usage over time for two diseases: HIV/AIDS and asthma. Information about disease-specific medications in published randomized controlled trials and discharge summaries at NewYork-Presbyterian Hospital over a ten-year period was extracted using Natural Language Processing. This paper reports on the ability of our semi-automated process to discover disease-drug practice pattern trends and interpretation of findings across the biomedical and clinical text sources. PMID:18693810

  2. Parallelism and the Penman natural-language-generation system. Research report

    SciTech Connect

    Tung, Y.W.; Matthiessen, C.; Sondheimer, N.

    1988-04-01

    This report discusses parallel processing for the Penman natural-language-generation system. The authors first analyze the computational requirement of the generation process. They then identify aspects of this computation that could benefit from being carried out in parallel. The Penman generator is composed of a systemic grammar, the Nigel grammar, and its environment. These two components are functionally separated and interface to each other via an inquiry mechanism. This implies that Nigel and its environment can be processed in a distributed way. We also illustrate how both Nigel and the major part of its environment, the KL-TWO knowledge base, can each be processed in parallel. In the Nigel grammar, the systems, choosers and realization statements can be activated simultaneously according to some computational dependency that resembles the system network. The KL-TWO knowledge base can be implemented as a parallel computing system, and two existing approaches, using Classifier Systems and Connectionist Models, respectively, are analyzed and assessed.

  3. HUNTER-GATHERER: Three search techniques integrated for natural language semantics

    SciTech Connect

    Beale, S.; Nirenburg, S.; Mahesh, K.

    1996-12-31

    This work integrates three related AI search techniques - constraint satisfaction, branch-and-bound and solution synthesis - and applies the result to semantic processing in natural language (NL). We summarize the approach as "Hunter-Gatherer": (1) branch-and-bound and constraint satisfaction allow us to "hunt down" non-optimal and impossible solutions and prune them from the search space; (2) solution synthesis methods then "gather" all optimal solutions, avoiding exponential complexity. Each of the three techniques is briefly described, as well as their extensions and combinations used in our system. We focus on the combination of solution synthesis and branch-and-bound methods which has enabled near-linear-time processing in our applications. Finally, we illustrate how the use of our technique in a large-scale MT project allowed a drastic reduction in search space.

  4. Interset: A natural language interface for teleoperated robotic assembly of the EASE space structure

    NASA Technical Reports Server (NTRS)

    Boorsma, Daniel K.

    1989-01-01

    A teleoperated robot was used to assemble the Experimental Assembly of Structures in Extra-vehicular activity (EASE) space structure under neutral buoyancy conditions, simulating a telerobot performing structural assembly in the zero gravity of space. This previous work used a manually controlled teleoperator as a test bed for system performance evaluations. From these results several Artificial Intelligence options were proposed. One of these was further developed into a real time assembly planner. The interface for this system is effective in assembling EASE structures using windowed graphics and a set of networked menus. As the problem space becomes more complex and hence the set of control options increases, a natural language interface may prove to be beneficial to supplement the menu based control strategy. This strategy can be beneficial in situations such as: describing the local environment, maintaining a data base of task event histories, modifying a plan or a heuristic dynamically, summarizing a task in English, or operating in a novel situation.

  5. On application of image analysis and natural language processing for music search

    NASA Astrophysics Data System (ADS)

    Gwardys, Grzegorz

    2013-10-01

    In this paper, I investigate the problem of finding the most similar music tracks using techniques popular in Natural Language Processing, such as TF-IDF and LDA. I defined a document as a music track. Each music track is transformed into a spectrogram, so that I can use well-known techniques to extract words from images. I used the SURF operation to detect characteristic points and a novel approach for their description. Standard k-means was used for clustering. Clustering here is identical to dictionary construction, so afterwards I can transform spectrograms into text documents and perform TF-IDF and LDA. Finally, I can make a query in the resulting vector space. The research was done on 16 music tracks for training and 336 for testing, split into four categories: Hiphop, Jazz, Metal and Pop. Although the technique used is completely unsupervised, the results are satisfactory and encourage further research.
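
    The TF-IDF retrieval step described in this record can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: it assumes each track has already been reduced to a list of integer cluster ids (standing in for the "visual words" obtained from SURF descriptors and k-means), and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Turn token lists into sparse TF-IDF vectors (dicts keyed by token)."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each token occurs.
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append(
            {w: (c / total) * math.log(n_docs / df[w]) for w, c in counts.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query_idx, vectors):
    """Index of the document closest to the query document."""
    scores = [
        (cosine(vectors[query_idx], v), i)
        for i, v in enumerate(vectors)
        if i != query_idx
    ]
    return max(scores)[1]


# Hypothetical visual-word ids for three tracks (illustrative data only).
tracks = [
    [3, 3, 7, 1],
    [3, 7, 7, 1],
    [5, 5, 9, 2],
]
vectors = tf_idf(tracks)
print(most_similar(0, vectors))
```

    With this toy data, tracks 0 and 1 share all their visual words and track 2 shares none, so the query for track 0 returns track 1. The LDA step mentioned in the abstract is not covered by this sketch.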

  6. Workshop on using natural language processing applications for enhancing clinical decision making: an executive summary.

    PubMed

    Pai, Vinay M; Rodgers, Mary; Conroy, Richard; Luo, James; Zhou, Ruixia; Seto, Belinda

    2014-02-01

    In April 2012, the National Institutes of Health organized a two-day workshop entitled 'Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making' (NLP-CDS). This report is a summary of the discussions during the second day of the workshop. Collectively, the workshop presenters and participants emphasized the need for unstructured clinical notes to be included in the decision making workflow and the need for individualized longitudinal data tracking. The workshop also discussed the need to: (1) combine evidence-based literature and patient records with machine-learning and prediction models; (2) provide trusted and reproducible clinical advice; (3) prioritize evidence and test results; and (4) engage healthcare professionals, caregivers, and patients. The overall consensus of the NLP-CDS workshop was that there are promising opportunities for NLP and CDS to deliver cognitive support for healthcare professionals, caregivers, and patients. PMID:23921193

  7. Interpretation of natural-language data base queries using optimization methods

    SciTech Connect

    Leigh, W.E.

    1984-01-01

    The automatic interpretation of natural language (in this work, English) database questions formulated by a user untrained in the technical aspects of database querying is an established problem in the field of artificial intelligence. State-of-the-art approaches involve the analysis of queries with syntactic and semantic grammars expressed in phrase structure grammar or transition network formalisms. With such methods, difficulties exist with the detection and resolution of ambiguity, with the misinterpretation possibilities inherent in finite-length look-ahead, and with the modification and extension of a mechanism for other sources of semantic knowledge. This work examines the potential of optimization techniques to solve these problems and interpret natural language database queries. The proposed method involves developing a 0-1 integer programming problem for each query. The possible values that the set of variables in the optimization may take on are an enumeration of the possible individual associations between the database schema and the query. The solution to the integer programming problem corresponds to a single assignment of database data items and relationships to the words in the query. Constraints are derived from systematic and database schema knowledge stored as libraries of templates. An objective function is used to rank the possible associations as to their likelihood of agreement with the intent of the questioner. A test mechanism was built to support evaluation of the proposed method. Suitable knowledge source template sets and an objective function were developed experimentally with the test mechanism from a learning sample of queries. Then the performance of the method was compared to that of an established system (PLANES) on a test set of queries. The performance of the new method was found to be comparable to that of the established system.
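
    The 0-1 formulation described in this record might look something like the sketch below. It is an assumption-laden toy, not the method from the report: the query words, schema items, weights and the single incompatibility constraint are all hypothetical, and the program is solved by brute-force enumeration rather than by an integer programming solver.

```python
from itertools import product

# Hypothetical query words and candidate schema items (not from the source).
words = ["planes", "hours"]
candidates = {
    "planes": ["AIRCRAFT.id", "FLIGHT.plane_id"],
    "hours": ["FLIGHT.hours", "MAINT.hours"],
}
# Hypothetical objective weights: higher means a more plausible association.
score = {
    ("planes", "AIRCRAFT.id"): 3,
    ("planes", "FLIGHT.plane_id"): 2,
    ("hours", "FLIGHT.hours"): 3,
    ("hours", "MAINT.hours"): 1,
}
# Hypothetical schema constraint: these two associations cannot both hold.
incompatible = [(("planes", "AIRCRAFT.id"), ("hours", "FLIGHT.hours"))]

# One binary decision variable per (word, schema item) association.
pairs = [(w, item) for w in words for item in candidates[w]]


def feasible(x):
    """Each word maps to exactly one item; no incompatible pair is selected."""
    one_per_word = all(
        sum(x[p] for p in pairs if p[0] == w) == 1 for w in words
    )
    compatible = all(x[a] + x[b] <= 1 for a, b in incompatible)
    return one_per_word and compatible


def solve():
    """Enumerate all 0-1 vectors and keep the feasible one with the best objective."""
    best, best_value = None, float("-inf")
    for bits in product((0, 1), repeat=len(pairs)):
        x = dict(zip(pairs, bits))
        if not feasible(x):
            continue
        value = sum(score[p] * x[p] for p in pairs)
        if value > best_value:
            best, best_value = x, value
    assignment = {w: item for (w, item), bit in best.items() if bit}
    return assignment, best_value


print(solve())
```

    Enumeration is exponential in the number of variables, so it only serves to make the structure of the problem visible (binary variables, constraints, linear objective); a realistic query would need a proper solver.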

  8. Integrating natural language processing and web GIS for interactive knowledge domain visualization

    NASA Astrophysics Data System (ADS)

    Du, Fangming

    Recent years have seen a powerful shift towards data-rich environments throughout society. This has extended to a change in how the artifacts and products of scientific knowledge production can be analyzed and understood. Bottom-up approaches are on the rise that combine access to huge amounts of academic publications with advanced computer graphics and data processing tools, including natural language processing. Knowledge domain visualization is one of those multi-technology approaches, with its aim of turning domain-specific human knowledge into highly visual representations in order to better understand the structure and evolution of domain knowledge. For example, network visualizations built from co-author relations contained in academic publications can provide insight on how scholars collaborate with each other in one or multiple domains, and visualizations built from the text content of articles can help us understand the topical structure of knowledge domains. These knowledge domain visualizations need to support interactive viewing and exploration by users. Such spatialization efforts are increasingly looking to geography and GIS as a source of metaphors and practical technology solutions, even when non-georeferenced information is managed, analyzed, and visualized. When it comes to deploying spatialized representations online, web mapping and web GIS can provide practical technology solutions for interactive viewing of knowledge domain visualizations, from panning and zooming to the overlay of additional information. This thesis presents a novel combination of advanced natural language processing - in the form of topic modeling - with dimensionality reduction through self-organizing maps and the deployment of web mapping/GIS technology towards intuitive, GIS-like, exploration of a knowledge domain visualization. 
    A complete workflow is proposed and implemented that processes any corpus of input text documents into a map form and leverages a web application framework to let users explore knowledge domain maps interactively. This workflow is implemented and demonstrated for a data set of more than 66,000 conference abstracts.

  9. LABORATORY PROCESS CONTROLLER USING NATURAL LANGUAGE COMMANDS FROM A PERSONAL COMPUTER

    NASA Technical Reports Server (NTRS)

    Will, H.

    1994-01-01

    The complex environment of the typical research laboratory requires flexible process control. This program provides natural language process control from an IBM PC or compatible machine. Sometimes process control schedules require changes frequently, even several times per day. These changes may include adding, deleting, and rearranging steps in a process. This program sets up a process control system that can either run without an operator, or be run by workers with limited programming skills. The software system includes three programs. Two of the programs, written in FORTRAN77, record data and control research processes. The third program, written in Pascal, generates the FORTRAN subroutines used by the other two programs to identify the user commands with the user-written device drivers. The software system also includes an input data set which allows the user to define the user commands which are to be executed by the computer. To set the system up the operator writes device driver routines for all of the controlled devices. Once set up, this system requires only an input file containing natural language command lines which tell the system what to do and when to do it. The operator can make up custom commands for operating and taking data from external research equipment at any time of the day or night without the operator in attendance. This process control system requires a personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. The program requires a FORTRAN77 compiler and user-written device drivers. This program was developed in 1989 and has a memory requirement of about 62 Kbytes.

  10. Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 190-198, Prague, June 2007. © 2007 Association for Computational Linguistics

    E-print Network

    Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing Research University of Colorado at Boulder yc@colorado.edu James Martin Department of Computer Science no control. A frequent special case of these applications involves queries containing named entities

  11. Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus.

    PubMed

    Comeau, Donald C; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W John

    2014-01-01

    BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net. PMID:24935050

  12. Moving Toward a Unified Effort to Understand the Nature and Causes of Language Disorders

    E-print Network

    Rice, Mabel L.; Warren, Steven F.

    2005-01-01

    of affectedness. Investigators (and their funding sources) often focus on a particular clinical group, such as specific language impairment (SLI), autism/autism spectrum disorders (ASD), Williams syndrome (WS), Down syndrome (DS), or fragile X syndrome (FXS... precursors and predictors of language acquisition in developmental language disorders, the genes that contribute to developmental language disorders in different syndromes, and the Applied Psycholinguistics 26:1 5 Rice & Warren: Unifying the effort...

  13. Language in Nature: On the Evolutionary Roots of a Cultural Phenomenon

    NASA Astrophysics Data System (ADS)

    Zuidema, Willem

    What could an evolutionary explanation for language look like? Here I review relevant evidence from linguistics, comparative biology, evolutionary theory and the fossil record, which suggest vocal imitation and hierarchical compositionality as the essential and uniquely human biological foundations of language. I also outline a plausible scenario for how human language evolved, and propose that language preceded, and facilitated the development of, other cognitive domains such as reasoning, the ability to plan, and consciousness.

  14. Natural Language as a Tool for Analyzing the Proving Process: The Case of Plane Geometry Proof

    ERIC Educational Resources Information Center

    Robotti, Elisabetta

    2012-01-01

    In the field of human cognition, language plays a special role that is connected directly to thinking and mental development (e.g., Vygotsky, "1938"). Thanks to "verbal thought", language allows humans to go beyond the limits of immediately perceived information, to form concepts and solve complex problems (Luria, "1975"). So, it appears language

  15. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1545-1556, Seattle, Washington, USA, 18-21 October 2013. © 2013 Association for Computational Linguistics

    E-print Network

    Semantic Parsers with On-the-fly Ontology Matching Tom Kwiatkowski Eunsol Choi Yoav Artzi Luke Zettlemoyer­1556, Seattle, Washington, USA, 18-21 October 2013. c 2013 Association for Computational Linguistics Scaling

  16. ScheMoose - Supporting a Functional Language in Moose Katerina Barone-Adesi and Michele Lanza

    E-print Network

    Lanza, Michele

    ScheMoose - Supporting a Functional Language in Moose Katerina Barone-Adesi and Michele Lanza Faculty of Informatics - University of Lugano, Switzerland Abstract The Moose Reengineering environment obtained while implementing ScheMoose, a prototype parser and MSE exporter of Scheme code written itself

  17. Proceedings of the First Workshop on Eye-tracking and Natural Language Processing, pages 5-20, COLING 2012, Mumbai, December 2012.

    E-print Network

    Proceedings of the First Workshop on Eye-tracking and Natural Language Processing, pages 5 HEAT MAPS: an eye-tracking study using multiple input sources Fabio ALVES1, José Luiz GONÇALVES2 effort in studies of reading and writing processes. More recently, eye tracking has also been applied

  18. Some Hypotheses on the Nature of Difficulty and Ease in Second Language Reading: An Application of Schema Theory.

    ERIC Educational Resources Information Center

    Hauptman, Philip C.

    2000-01-01

    Examines one aspect of the second language reading process: the nature of the concepts of "difficulty" and "ease" within a schema-theoretic perspective. Addresses this topic through L2 reading at the university or adult level. Proposes a number of hypotheses suggested by both the empirical and pedagogical literature that will lay the groundwork…

  19. Proceedings of the First Workshop on Eye-tracking and Natural Language Processing, pages 71-80, COLING 2012, Mumbai, December 2012.

    E-print Network

    Proceedings of the First Workshop on Eye-tracking and Natural Language Processing, pages 71-80, COLING 2012, Mumbai, December 2012. A heuristic-based approach for systematic error correction of gaze@cse.iitb.ac.in, mc.isv@cbs.dk, pb@cse.iitb.ac.in ABSTRACT In eye-tracking research, temporally constant deviations

  20. Proceedings of the First Workshop on Eye-tracking and Natural Language Processing, pages 1-4, COLING 2012, Mumbai, December 2012.

    E-print Network

    Proceedings of the First Workshop on Eye-tracking and Natural Language Processing, pages 1-4, COLING 2012, Mumbai, December 2012. Grounding spoken interaction with real-time gaze in dynamic virtual environments Matthew Crocker Saarland University crocker@coli.uni-saarland.de ABSTRACT Gaze is an important cue

  1. Does It Really Matter whether Students' Contributions Are Spoken versus Typed in an Intelligent Tutoring System with Natural Language?

    ERIC Educational Resources Information Center

    D'Mello, Sidney K.; Dowell, Nia; Graesser, Arthur

    2011-01-01

    There is the question of whether learning differs when students speak versus type their responses when interacting with intelligent tutoring systems with natural language dialogues. Theoretical bases exist for three contrasting hypotheses. The "speech facilitation" hypothesis predicts that spoken input will "increase" learning, whereas the "text…

  2. Surmounting the Tower of Babel: Monolingual and bilingual 2-year-olds' understanding of the nature of foreign language words.

    PubMed

    Byers-Heinlein, Krista; Chen, Ke Heng; Xu, Fei

    2014-03-01

    Languages function as independent and distinct conventional systems, and so each language uses different words to label the same objects. This study investigated whether 2-year-old children recognize that speakers of their native language and speakers of a foreign language do not share the same knowledge. Two groups of children unfamiliar with Mandarin were tested: monolingual English-learning children (n=24) and bilingual children learning English and another language (n=24). An English speaker taught children the novel label fep. On English mutual exclusivity trials, the speaker asked for the referent of a novel label (wug) in the presence of the fep and a novel object. Both monolingual and bilingual children disambiguated the reference of the novel word using a mutual exclusivity strategy, choosing the novel object rather than the fep. On similar trials with a Mandarin speaker, children were asked to find the referent of a novel Mandarin label kuò. Monolinguals again chose the novel object rather than the object with the English label fep, even though the Mandarin speaker had no access to conventional English words. Bilinguals did not respond systematically to the Mandarin speaker, suggesting that they had enhanced understanding of the Mandarin speaker's ignorance of English words. The results indicate that monolingual children initially expect words to be conventionally shared across all speakers, native and foreign. Early bilingual experience facilitates children's discovery of the nature of foreign language words. PMID:24268905

  3. Kerly, A., Hall, P. & Bull, S. (2006). Bringing Chatbots into Education: Towards Natural Language Negotiation of Open Learner Models, in R. Ellis, T. Allen & A.

    E-print Network

    Bull, Susan

    2006-01-01

    and Applications of Artificial Intelligence, Springer. Bringing Chatbots into Education: Towards Natural Language Kerly, A., Hall, P. & Bull, S. (2006). Bringing Chatbots into Education: Towards Natural Language Negotiation of Open Learner Models, in R. Ellis, T. Allen & A. Tuson (eds), Applications and Innovations

  4. A natural-language approach to biomimetic design Biomimetics for Innovation and Design Laboratory, Department of Mechanical and Industrial Engineering, University of Toronto,

    E-print Network

    Shu, Lily H.

    A natural-language approach to biomimetic design L.H. SHU Biomimetics for Innovation and Design for engineering design. Keywords: Analogical Reasoning; Biologically Inspired Design; Biomimetic Design to biomimetic design. First highlighted are challenges in natural-language processing and analogical

  5. Toward a Theory-Based Natural Language Capability in Robots and Other Embodied Agents: Evaluating Hausser's SLIM Theory and Database Semantics

    ERIC Educational Resources Information Center

    Burk, Robin K.

    2010-01-01

    Computational natural language understanding and generation have been a goal of artificial intelligence since McCarthy, Minsky, Rochester and Shannon first proposed to spend the summer of 1956 studying this and related problems. Although statistical approaches dominate current natural language applications, two current research trends bring…

  6. Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis

    PubMed Central

    Mowery, D.; South, B. R.; Kvist, M.; Dalianis, H.

    2015-01-01

    Summary. Objectives: We present a review of recent advances in clinical Natural Language Processing (NLP), with a focus on semantic analysis and key subtasks that support such analysis. Methods: We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant referenced publications from the included papers. Results: Significant articles published within this time-span were included and are discussed from the perspective of semantic analysis. Three key clinical NLP subtasks that enable such analysis were identified: 1) developing more efficient methods for corpus creation (annotation and de-identification), 2) generating building blocks for extracting meaning (morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical utility (NLP applications and infrastructure for clinical use cases). Finally, we provide a reflection upon most recent developments and potential areas of future NLP development and applications. Conclusions: There has been an increase of advances within key NLP subtasks that support semantic analysis. Performance of NLP semantic analysis is, in many cases, close to that of agreement between humans. The creation and release of corpora annotated with complex semantic information models has greatly supported the development of new tools and approaches. Research on non-English languages is continuously growing. NLP methods have sometimes been successfully employed in real-world clinical tasks. However, there is still a gap between the development of advanced resources and their utilization in clinical settings. A plethora of new clinical use cases are emerging due to established health care initiatives and additional patient-generated sources through the extensive use of social media and other devices. PMID:26293867

  7. Errors and Intelligence in Computer-Assisted Language Learning: Parsers and Pedagogues

    E-print Network

    of psychology. Three sections (37 pages together) are devoted to an outline of the history of computational, the theory of formal grammar. The book consists of a brief introduction, four chapters (which the authors

  8. Text mixing shapes the anatomy of rank-frequency distributions: A modern Zipfian mechanics for natural language

    E-print Network

    Williams, Jake Ryland; Danforth, Christopher M; Dodds, Peter Sheridan

    2014-01-01

    Natural languages are full of rules and exceptions. One of the most famous quantitative rules is Zipf's law, which states that the frequency of occurrence of a word is approximately inversely proportional to its rank. Though this 'law' of ranks has been found to hold across disparate texts and forms of data, analyses of increasingly large corpora over the last 15 years have revealed the existence of two scaling regimes. These regimes have thus far been explained by a hypothesis suggesting a separability of languages into core and non-core lexica. Here, we present and defend an alternative hypothesis, that the two scaling regimes result from the act of aggregating texts. We observe that text mixing leads to an effective decay of word introduction, which we show provides accurate predictions of the location and severity of breaks in scaling. Upon examining large corpora from 10 languages, we find emphatic empirical support for the universality of our claim.
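
    As a rough illustration of the rank-frequency relation this record refers to, the sketch below builds a rank-frequency list from a token sequence and estimates a single Zipf exponent by ordinary least squares on the log-log values. The function names and the synthetic corpus are assumptions made for the example; a single straight-line fit is a crude estimator and does not model the two scaling regimes the authors analyse.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, frequency) pairs, most frequent word first."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(counts, start=1))


def zipf_exponent(rank_freq):
    """Least-squares slope of log(frequency) against log(rank), sign-flipped."""
    xs = [math.log(r) for r, _ in rank_freq]
    ys = [math.log(f) for _, f in rank_freq]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope


# Synthetic corpus in which the word of rank k occurs 60 // k times,
# i.e. frequency is exactly inversely proportional to rank.
tokens = [f"w{k}" for k in range(1, 6) for _ in range(60 // k)]
rf = rank_frequency(tokens)
print(rf, zipf_exponent(rf))
```

    On real text the fitted exponent depends strongly on corpus size and on which ranks are included, which is part of what the scaling-break analysis in this record addresses.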

  9. The Ising model for changes in word ordering rules in natural languages

    NASA Astrophysics Data System (ADS)

    Itoh, Yoshiaki; Ueda, Sumie

    2004-11-01

    The order of ‘noun and adposition’ is an important parameter of word ordering rules in the world’s languages. The seven parameters, ‘adverb and verb’ and others, depend strongly on the ‘noun and adposition’. Japanese as well as Korean, Tamil and several other languages seem to have a stable structure of word ordering rules, while Thai and other languages, which have the opposite word ordering rules to Japanese, are also stable in structure. It seems therefore that each language in the world fluctuates between these two structures like the Ising model for finite lattice.

  10. A framework for the natural-language-perception-based creative control of unmanned ground vehicles

    NASA Astrophysics Data System (ADS)

    Ghaffari, Masoud; Liao, Xiaoqun; Hall, Ernest L.

    2004-09-01

    Mobile robots must often operate in an unstructured environment cluttered with obstacles and with many possible action paths. That is why mobile robotics problems are complex with many unanswered questions. To reach a high degree of autonomous operation, a new level of learning is required. On the one hand, promising learning theories such as the adaptive critic and creative control have been proposed, while on the other hand the human brain's processing ability has amazed and inspired researchers in the area of Unmanned Ground Vehicles but has been difficult to emulate in practice. A new direction in fuzzy theory tries to develop a theory to deal with the perceptions conveyed by natural language. This paper tries to combine these two fields and present a framework for autonomous robot navigation. The proposed creative controller, like the adaptive critic controller, has information stored in a dynamic database (DB), plus a dynamic task control center (TCC) that functions as a command center to decompose tasks into sub-tasks with different dynamic models and multi-criteria functions. The TCC module utilizes the computational theory of perceptions to deal with the high levels of task planning. The authors are currently trying to implement the model on a real mobile robot, and the preliminary results are described in this paper.

  11. Adapting Semantic Natural Language Processing Technology to Address Information Overload in Influenza Epidemic Management

    PubMed Central

    Keselman, Alla; Rosemblat, Graciela; Kilicoglu, Halil; Fiszman, Marcelo; Jin, Honglan; Shin, Dongwook; Rindflesch, Thomas C.

    2013-01-01

    Explosion of disaster health information results in information overload among response professionals. The objective of this project was to determine the feasibility of applying semantic natural language processing (NLP) technology to addressing this overload. The project characterizes concepts and relationships commonly used in disaster health-related documents on influenza pandemics, as the basis for adapting an existing semantic summarizer to the domain. Methods include human review and semantic NLP analysis of a set of relevant documents. This is followed by a pilot-test in which two information specialists use the adapted application for a realistic information seeking task. According to the results, the ontology of influenza epidemics management can be described via a manageable number of semantic relationships that involve concepts from a limited number of semantic types. Test users demonstrate several ways to engage with the application to obtain useful information. This suggests that existing semantic NLP algorithms can be adapted to support information summarization and visualization in influenza epidemics and other disaster health areas. However, additional research is needed in the areas of terminology development (as many relevant relationships and terms are not part of existing standardized vocabularies), NLP, and user interface design. PMID:24311971

  12. Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing

    PubMed Central

    Wu, Stephen; Miller, Timothy; Masanz, James; Coarr, Matt; Halgrim, Scott; Carrell, David; Clark, Cheryl

    2014-01-01

    A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been “solved.” This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP. PMID:25393544

  13. Towards symbiosis in knowledge representation and natural language processing for structuring clinical practice guidelines.

    PubMed

    Weng, Chunhua; Payne, Philip R O; Velez, Mark; Johnson, Stephen B; Bakken, Suzanne

    2014-01-01

    The successful adoption by clinicians of evidence-based clinical practice guidelines (CPGs) contained in clinical information systems requires efficient translation of free-text guidelines into computable formats. Natural language processing (NLP) has the potential to improve the efficiency of such translation. However, it is laborious to develop NLP to structure free-text CPGs using existing formal knowledge representations (KR). In response to this challenge, this vision paper discusses the value and feasibility of supporting symbiosis in text-based knowledge acquisition (KA) and KR. We compare two ontologies: (1) an ontology manually created by domain experts for CPG eligibility criteria and (2) an upper-level ontology derived from a semantic pattern-based approach for automatic KA from CPG eligibility criteria text. Then we discuss the strengths and limitations of interweaving KA and NLP for KR purposes and important considerations for achieving the symbiosis of KR and NLP for structuring CPGs to achieve evidence-based clinical practice. PMID:24943582

  14. Bringing Chatbots into education: Towards Natural Language Negotiation of Open Learner Models

    NASA Astrophysics Data System (ADS)

    Kerlyl, Alice; Hall, Phil; Bull, Susan

    There is an extensive body of work on Intelligent Tutoring Systems: computer environments for education, teaching and training that adapt to the needs of the individual learner. Work on personalisation and adaptivity has included research into allowing the student user to enhance the system's adaptivity by improving the accuracy of the underlying learner model. Open Learner Modelling, where the system's model of the user's knowledge is revealed to the user, has been proposed to support student reflection on their learning. Increased accuracy of the learner model can be obtained by the student and system jointly negotiating the learner model. We present the initial investigations into a system to allow people to negotiate the model of their understanding of a topic in natural language. This paper discusses the development and capabilities of both conversational agents (or chatbots) and Intelligent Tutoring Systems, in particular Open Learner Modelling. We describe a Wizard-of-Oz experiment to investigate the feasibility of using a chatbot to support negotiation, and conclude that a fusion of the two fields can lead to developing negotiation techniques for chatbots and the enhancement of the Open Learner Model. This technology, if successful, could have widespread application in schools, universities and other training scenarios.

  15. `Ideal learning' of natural language: Positive results about learning from positive

    E-print Network

    Chater, Nick

    , Gregory Mulhauser, Mike Oaksford, Luca Onnis, Martin Pickering, Emmanuel Pothos, Matthew Roberts, Jerry Seligman and Gerry Wolff and two anonymous reviewers for valuable discussions of these ideas at various is of central importance for the study of human language and language acquisition (e.g., Crain & Lillo-Martin

  16. International Joint Conference on Natural Language Processing, pages 293–301, Nagoya, Japan, 14-18 October 2013.

    E-print Network

    .96 and 0.94 on Chinese and English corpora, respectively. Comparison result also shows that the proposed within multiple classes in bootstrapping. We evaluate our method in two languages, i.e., Chinese for five categories in both languages, including star, film, TV play, song, and PC game. Experimental

  17. Adaptation of Language Resources and Tools for Closely Related Languages and Language Variants

    E-print Network

    Adaptation of Language Resources and Tools for Closely Related Languages and Language Variants Proceedings of the Adaptation of Language Resources and Tools for Closely Related Languages and Language Variants associated with The 9th International Conference on Recent Advances in Natural Language Processing

  18. DBPQL: A view-oriented query language for the Intel Data Base Processor

    NASA Technical Reports Server (NTRS)

    Fishwick, P. A.

    1983-01-01

    An interactive query language (DBPQL) for the Intel Data Base Processor (DBP) is defined. DBPQL includes a parser generator package which permits the analyst to easily create and manipulate the query statement syntax and semantics. The prototype language, DBPQL, includes trace and performance commands to aid the analyst when implementing new commands and analyzing the execution characteristics of the DBP. The DBPQL grammar file and associated key procedures are included as an appendix to this report.

  19. Engineering natural language processing solutions for structured information from clinical text: extracting sentinel events from palliative care consult letters.

    PubMed

    Barrett, Neil; Weber-Jahnke, Jens H; Thai, Vincent

    2013-01-01

    Despite a trend to formalize and codify medical information, natural language communications still play a prominent role in health care workflows, in particular when it comes to hand-overs between providers. Natural language processing (NLP) attempts to bridge the gap between informal, natural language information and coded, machine-interpretable data. This paper reports on a study that applies an advanced NLP method for the extraction of sentinel events in palliative care consult letters. Sentinel events are of interest to predict survival and trajectory for patients with acute palliative conditions. Our NLP method combines several novel characteristics, e.g., the consideration of topological knowledge structures sourced from an ontological terminology system (SNOMED CT). The method has been applied to the extraction of different types of sentinel events, including simple facts, temporal conditions, quantities, and degrees. A random selection of 215 anonymized consult letters was used for the study. The results of the NLP extraction were evaluated by comparison with coded sentinel event data captured independently by clinicians. The average accuracy of the automated extraction was 73.6%. PMID:23920625

  20. Computing Accurate Grammatical Feedback in a Virtual Writing Conference for German-Speaking Elementary-School Children: An Approach Based on Natural Language Generation

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2009-01-01

    We built a natural language processing (NLP) system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary…

  1. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 862–870, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP

    E-print Network

    Bilingual dictionary generation for low-resourced language@yz.yamagata-u.ac.jp Abstract Bilingual dictionaries are vital resources in many areas of natural language processing. Numerous methods of machine translation require bilingual dictionaries with large coverage, but less

  2. Linguistics in Language Education

    ERIC Educational Resources Information Center

    Kumar, Rajesh; Yunus, Reva

    2014-01-01

    This article looks at the contribution of insights from theoretical linguistics to an understanding of language acquisition and the nature of language in terms of their potential benefit to language education. We examine the ideas of innateness and universal language faculty, as well as multilingualism and the language-society relationship. Modern…

  3. A natural language processing (NLP) program effectively extracts key pathologic findings from radical prostatectomy reports.

    PubMed

    Kim, Brian; Merchant, Madhur; Zheng, Chengyi; Thomas, Anil Abraham; Contreras, Richard; Jacobsen, Steven J; Chien, Gary

    2014-08-01

    Introduction and Objective: Natural language processing (NLP) software programs have been widely developed to transform complex, free text into simplified, organized data. Potential applications in the field of medicine include automated report summaries, physician alerts, patient repositories, electronic medical record (EMR) billing, and quality metric reports. Despite these prospects and the recent widespread adoption of EMR, NLP has been relatively underutilized. The objective of this study was to evaluate the performance of an internally developed NLP program in extracting select pathologic findings from radical prostatectomy specimen reports in the EMR. Methods: An NLP program was generated by a software engineer to extract key variables from prostatectomy reports in the EMR within our healthcare system, which included: TNM stage, Gleason grade, presence of a tertiary Gleason pattern, histologic subtype, size of dominant tumor nodule, seminal vesicle invasion (SVI), perineural invasion (PNI), angiolymphatic invasion (ALI), extracapsular extension (ECE), and surgical margin status (SMS). The program was validated by comparing NLP results to a "gold standard" compiled by two blinded manual reviewers for 100 random pathology reports. Results: NLP demonstrated 100% accuracy for identifying Gleason grade, presence of a tertiary Gleason pattern, SVI, ALI, and ECE. It also demonstrated near-perfect accuracy for extracting histologic subtype (99.0%), PNI (98.9%), TNM stage (98.0%), SMS (97.0%), and dominant tumor size (95.7%). The overall accuracy of NLP was 98.7%. NLP generated a result in <1 second, whereas the manual reviewers averaged 3.2 minutes per report. Conclusions: This novel program demonstrated high accuracy and efficiency identifying key pathologic details from the prostatectomy report within an EMR system. NLP has the potential to assist urologists by summarizing and highlighting relevant information from verbose pathology reports. It may also facilitate future urologic research through the rapid and automated creation of large databases. PMID:25083914
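
    As a rough illustration of the kind of extraction described in the abstract above, the sketch below pulls a Gleason score and a perineural invasion flag out of report text with regular expressions. It is not the authors' program, which the abstract does not detail; the report wording, both patterns, and the name `extract_findings` are assumptions made for this example.

```python
import re

# Hypothetical patterns; real pathology reports vary far more than this.
GLEASON = re.compile(
    r"gleason\s+(?:score|grade)?\s*:?\s*(\d)\s*\+\s*(\d)", re.IGNORECASE
)
PNI = re.compile(r"perineural invasion\s*:\s*(present|absent)", re.IGNORECASE)


def extract_findings(report):
    """Return the findings matched in a report as a small dictionary."""
    findings = {}
    match = GLEASON.search(report)
    if match:
        primary, secondary = int(match.group(1)), int(match.group(2))
        # Store primary pattern, secondary pattern, and their sum.
        findings["gleason"] = (primary, secondary, primary + secondary)
    match = PNI.search(report)
    if match:
        findings["pni"] = match.group(1).lower() == "present"
    return findings


sample = "Gleason score 3+4=7. Perineural invasion: Present."
result = extract_findings(sample)
```

    A finding that is not matched is simply left out of the dictionary, so a caller can tell "not found" apart from "absent".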

  4. Clinical Natural Language Processing in 2014: Foundational Methods Supporting Efficient Healthcare

    PubMed Central

    2015-01-01

    Summary. Objective: To summarize recent research and present a selection of the best papers published in 2014 in the field of clinical Natural Language Processing (NLP). Method: A systematic review of the literature was performed by the two section editors of the IMIA Yearbook NLP section by searching bibliographic databases with a focus on NLP efforts applied to clinical texts or aimed at a clinical outcome. A shortlist of candidate best papers was first selected by the section editors before being peer-reviewed by independent external reviewers. Results: The clinical NLP best paper selection shows that the field is tackling text analysis methods of increasing depth. The full review process highlighted five papers addressing foundational methods in clinical NLP using clinically relevant texts from online forums or encyclopedias, clinical texts from Electronic Health Records, and included studies specifically aiming at a practical clinical outcome. The increased access to clinical data that was made possible with the recent progress of de-identification paved the way for the scientific community to address complex NLP problems such as word sense disambiguation, negation, temporal analysis and specific information nugget extraction. These advances in turn allowed for efficient application of NLP to clinical problems such as cancer patient triage. Another line of research investigates online clinically relevant texts and brings interesting insight on communication strategies to convey health-related information. Conclusions: The field of clinical NLP is thriving through the contributions of both NLP researchers and healthcare professionals interested in applying NLP techniques for concrete healthcare purposes. Clinical NLP is becoming mature for practical applications with a significant clinical impact. PMID:26293868

  5. Validation of natural language processing to extract breast cancer pathology procedures and results

    PubMed Central

    Wieneke, Arika E.; Bowles, Erin J. A.; Cronkite, David; Wernli, Karen J.; Gao, Hongyuan; Carrell, David; Buist, Diana S. M.

    2015-01-01

    Background: Pathology reports typically require manual review to abstract research data. We developed a natural language processing (NLP) system to automatically interpret free-text breast pathology reports with limited assistance from manual abstraction. Methods: We used an iterative approach of machine learning algorithms and constructed groups of related findings to identify breast-related procedures and results from free-text pathology reports. We evaluated the NLP system using an all-or-nothing approach to determine which reports could be processed entirely using NLP and which reports needed manual review beyond NLP. We divided 3234 reports for development (2910, 90%), and evaluation (324, 10%) purposes using manually reviewed pathology data as our gold standard. Results: NLP correctly coded 12.7% of the evaluation set, flagged 49.1% of reports for manual review, incorrectly coded 30.8%, and correctly omitted 7.4% from the evaluation set due to irrelevancy (i.e. not breast-related). Common procedures and results were identified correctly (e.g. invasive ductal with 95.5% precision and 94.0% sensitivity), but entire reports were flagged for manual review because of rare findings and substantial variation in pathology report text. Conclusions: The NLP system we developed did not perform sufficiently for abstracting entire breast pathology reports. The all-or-nothing approach resulted in too broad of a scope of work and limited our flexibility to identify breast pathology procedures and results. Our NLP system was also limited by the lack of the gold standard data on rare findings and wide variation in pathology text. Focusing on individual, common elements and improving pathology text report standardization may improve performance. PMID:26167382

  6. International Joint Conference on Natural Language Processing, pages 429–437, Nagoya, Japan, 14-18 October 2013.

    E-print Network

    limitations. The main reason for this is that the source language does not usually contain all the information-side of the synchronous grammar. Although transfer-based MT (Lavie, 2008) uses rich feature structures, grammar rules

  7. Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax

    E-print Network

    Emily M. Bender (University of Washington) Morgan & Claypool (Synthesis Lectures on Human Language as omnivorous as NLP. For example, while theorists might disagree about whether morphophonology is best modeled

  8. Applications of Finite-State Transducers in Natural-Language Processing

    E-print Network

    Karttunen, Lauri

    inadequate. Noam Chomsky's seminal 1957 work, Syntactic Structures [3], includes a short chapter devoted. In this section Chomsky demonstrates in a few paragraphs that English is not a finite state language. (p. 21) In any

  9. Unsupervised learning of natural languages Zach Solan*, David Horn*, Eytan Ruppin

    E-print Network

    Edelman, Shimon

    , 2004) We address the problem, fundamental to linguistics, bioinformatics, and certain other- guages as diverse as English and Chinese, and on protein data correlating sequence with function. computational linguistics grammar induction language acquisition machine learning protein classification Many

  10. Proceedings of Recent Advances in Natural Language Processing, pages 557–561, Hissar, Bulgaria, 12-14 September 2011.

    E-print Network

    Towards a Corpus-based Approach to Modelling Language Production of Foreign Language to modelling the language use of foreign language learners in various contexts. We focus on learners with their contexts, at different levels of language proficiency. 1 Introduction Learning a foreign language

  11. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 767–777, MIT, Massachusetts, USA, 9-11 October 2010. © 2010 Association for Computational Linguistics

    E-print Network

    Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 767. The works (Chelba and Acero, 2004; Daumé III, 2007; Finkel and Manning, 2009) belong to this category.

  12. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pages 1085–1094,

    E-print Network

    International Joint Conference on Natural Language Processing, pages 1085–1094, Beijing, China, July 26-31, 2015, wages and promotions (Guven and Islam, 2013). There has been a considerable success in automatically

  13. Zipf’s word frequency law in natural language: A critical review and future directions

    PubMed Central

    2014-01-01

    The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf’s law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf’s law and are then used to evaluate many of the theoretical explanations of Zipf’s law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf’s law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data. PMID:24664880
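
    Zipf's law is usually stated as word frequency falling off as a power of frequency rank, f(r) ∝ 1/r^α with α close to 1, which appears as a slope near -1 on log-log axes. The sketch below is a minimal illustration, assuming pre-tokenized input and an ordinary least-squares fit; it is not the analysis used in the article, and `zipf_slope` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Toy corpus whose counts are exactly 60 / rank (60, 30, 20, 15, 12, 10).
toy = []
for rank, word in enumerate(["the", "of", "and", "to", "in", "a"], start=1):
    toy.extend([word] * (60 // rank))

slope = zipf_slope(toy)  # close to -1 for this idealized corpus
```

    The function needs at least two distinct word types; with only one, the variance of the log ranks is zero and the fit is undefined.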

  14. In silico Evolutionary Developmental Neurobiology and the Origin of Natural Language

    NASA Astrophysics Data System (ADS)

    Szathmáry, Eörs; Szathmáry, Zoltán; Ittzés, Péter; Orbán, Gergő; Zachár, István; Huszár, Ferenc; Fedor, Anna; Varga, Máté; Számadó, Szabolcs

    It is justified to assume that part of our genetic endowment contributes to our language skills, yet it is impossible to tell at this moment exactly how genes affect the language faculty. We complement experimental biological studies by an in silico approach in that we simulate the evolution of neuronal networks under selection for language-related skills. At the heart of this project is the Evolutionary Neurogenetic Algorithm (ENGA) that is deliberately biomimetic. The design of the system was inspired by important biological phenomena such as brain ontogenesis, neuron morphologies, and indirect genetic encoding. Neuronal networks were selected and were allowed to reproduce as a function of their performance in the given task. The selected neuronal networks in all scenarios were able to solve the communication problem they had to face. The most striking feature of the model is that it works with highly indirect genetic encoding--just as brains do.

  15. Proceedings of the Conference on Language and Language Behavior.

    ERIC Educational Resources Information Center

    Zale, Eric M., Ed.

    This volume contains the papers read at the Conference on Language and Language Behavior held at the University of Michigan's Center for Research on Language and Language Behavior in October 1966. Papers are ordered under the following topics: First Language Acquisition in Natural Setting, Controlled Acquisition of First Language Skills, Second…

  16. PNEPs, NEPs for Context Free Parsing: Application to Natural Language Processing

    E-print Network

    Alfonseca, Manuel

    , respectively, for unambiguous and ambiguous grammars in the worst case) This paper is focused on this step languages. We propose PNEP, a simple extension to NEP, and a procedure to translate a grammar into a PNEP to the structure of the grammar, which can contain all kinds of recursive, lambda or ambiguous rules

  17. Proceedings of the Workshop on Biomedical Natural Language Processing, pages 11–18, Hissar, Bulgaria, 15 September 2011.

    E-print Network

    ; (iii) anamnesis; (iv) patient status; (v) lab data; (vi) medical examiners comments; (vii) discussion in Bulgarian language. Diseases are often described in the medical patient records as free text using. To describe diseases as free text in the medical patient records (PRs) usually is used different terminology

  18. The Sentence Fairy: A Natural-Language Generation System to Support Children's Essay Writing

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2008-01-01

    We built an NLP system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary texts produced by pupils…

  19. Children with Specific Language Impairments Perceive Speech Most Categorically when Tokens Are Natural and Meaningful

    ERIC Educational Resources Information Center

    Coady, Jeffry A.; Evans, Julia L.; Mainela-Arnold, Elina; Kluender, Keith R.

    2007-01-01

    Purpose: To examine perceptual deficits as a potential underlying cause of specific language impairments (SLI). Method: Twenty-one children with SLI (8;7-11;11 [years;months]) and 21 age-matched controls participated in categorical perception tasks using four series of syllables for which perceived syllable-initial voicing varied. Series were…

  20. International Joint Conference on Natural Language Processing, pages 815–821, Nagoya, Japan, 14-18 October 2013.

    E-print Network

    (Walker et al., 2006) and OntoNotes (Weischedel et al., 2008). BART, the Beautiful Anaphora Resolution for anaphora resolution in Indian languages. In 2011 a shared task on NLP Tools Contest on Anaphora Resolution Processing (ICON 2011) 1. Four teams participated in this contest with the varying approaches (Chatterji et al

  1. International Joint Conference on Natural Language Processing, pages 1230–1236, Nagoya, Japan, 14-18 October 2013.

    E-print Network

    A Hybrid Morphological Disambiguation System for Turkish Mucahid Kutlu Dept. of Computer, we propose a morphological disambiguation method for Turkish, which is an agglutinative language. We use a hybrid method, which combines statistical information with handcrafted rules and learned

  2. School Meaning Systems: The Symbiotic Nature of Culture and "Language-In-Use"

    ERIC Educational Resources Information Center

    Abawi, Lindy

    2013-01-01

    Recent research has produced evidence to suggest a strong reciprocal link between school context-specific language constructions that reflect a school's vision and schoolwide pedagogy, and the way that meaning making occurs, and a school's culture is characterized. This research was conducted within three diverse settings: one school in…

  3. Natural Language Engineering 1 (1): 000--000 © 1994 Cambridge University Press

    E-print Network

    MacKay, David J.C.

    .1 Generalizations 15 5.2 Relationship to previous `empirical Bayes' approaches 17 A The Gamma function and Digamma for language modelling. The ideas of this paper are also applicable to other problems such as the modelling with smoothing on a two million word corpus. The methods prove to be about equally accurate

  4. Methodology and Implications of Reconstruction and Automatic Processing of Natural Language of the Classroom.

    ERIC Educational Resources Information Center

    Marlin, Marjorie; Barron, Nancy

    This paper discusses in some detail the procedural areas of reconstruction and automatic processing used by the Classroom Interaction Project of the University of Missouri's Center for Research in Social Behavior in the analysis of classroom language. First discussed is the process of reconstruction, here defined as the "process of adding to…

  5. ON THE COEVOLUTION OF THEORY AND LANGUAGE AND THE NATURE OF SUCCESSFUL INQUIRY

    E-print Network

    Johnson, Kent

    commitments, a satisfactory model of empirical knowledge should describe the coordinated evolution of both norms. 1. Knowledge and the evolution of descriptive language On the standard account of propositional for modeling the evolution of knowledge, even a cursory consideration of actual empirical inquiry indicates

  6. The Ability of Children with Language Impairment to Dissemble Emotions in Hypothetical Scenarios and Natural Situations

    ERIC Educational Resources Information Center

    Brinton, Bonnie; Fujiki, Martin; Hurst, Noel Quist; Jones, Emily Rowberry; Spackman, Matthew P.

    2015-01-01

    Purpose: This study examined the ability of children with language impairment (LI) to dissemble (hide) emotional reactions when socially appropriate to do so. Method: Twenty-two children with LI and their typically developing peers (7;1-10;11 [years;months]) participated in two tasks. First, participants were presented with hypothetical scenarios…

  7. 2015 Security First Corp. May be reproduced only in its original entirety (without revision). SecureParser

    E-print Network

    for Data & Share authentication/integrity 0.07 06/07/2007 Security First Corp. Added Algorithm certificate ©2015 Security First Corp. May be reproduced only in its original entirety (without revision). SecureParser® Version 4.7.0 Non-Proprietary Security Policy Revision 1.37 28 March

  8. Copyright Security First Corp. May be reproduced only in its original entirety (without revision). SecureParser

    E-print Network

    key is used for Data & Share authentication/integrity 0.07 06/07/2007 Security First Corp. Added Copyright Security First Corp. May be reproduced only in its original entirety (without revision). SecureParser® Version 4.7.0 Security Policy Revision 1.31 6 August 2009 © Security First Corp. 2009 All

  9. IEEE TRANSACTIONS ON JOURNAL NAME, MANUSCRIPT ID 1 BibPro: A Citation Parser Based on Sequence

    E-print Network

    Yang, Kai-Hsiang

    IEEE TRANSACTIONS ON JOURNAL NAME, MANUSCRIPT ID 1 BibPro: A Citation Parser Based on Sequence adopt different citation styles. It is an interesting problem to accurately extract metadata from a citation string which is formatted in one of thousands of different styles. It has attracted a great deal

  10. Crowdsourcing a Normative Natural Language Dataset: A Comparison of Amazon Mechanical Turk and In-Lab Data Collection

    PubMed Central

    Bex, Peter J; Woods, Russell L

    2013-01-01

    Background: Crowdsourcing has become a valuable method for collecting medical research data. This approach, recruiting through open calls on the Web, is particularly useful for assembling large normative datasets. However, it is not known how natural language datasets collected over the Web differ from those collected under controlled laboratory conditions. Objective: To compare the natural language responses obtained from a crowdsourced sample of participants with responses collected in a conventional laboratory setting from participants recruited according to specific age and gender criteria. Methods: We collected natural language descriptions of 200 half-minute movie clips, from Amazon Mechanical Turk workers (crowdsourced) and 60 participants recruited from the community (lab-sourced). Crowdsourced participants responded to as many clips as they wanted and typed their responses, whereas lab-sourced participants gave spoken responses to 40 clips, and their responses were transcribed. The content of the responses was evaluated using a take-one-out procedure, which compared responses to other responses to the same clip and to other clips, with a comparison of the average number of shared words. Results: In contrast to the 13 months of recruiting that was required to collect normative data from 60 lab-sourced participants (with specific demographic characteristics), only 34 days were needed to collect normative data from 99 crowdsourced participants (contributing a median of 22 responses). The majority of crowdsourced workers were female, and the median age was 35 years, lower than the lab-sourced median of 62 years but similar to the median age of the US population. The responses contributed by the crowdsourced participants were longer on average, that is, 33 words compared to 28 words (P<.001), and they used a less varied vocabulary. However, there was strong similarity in the words used to describe a particular clip between the two datasets, as a cross-dataset count of shared words showed (P<.001). Within both datasets, responses contained substantial relevant content, with more words in common with responses to the same clip than to other clips (P<.001). There was evidence that responses from female and older crowdsourced participants had more shared words (P=.004 and .01 respectively), whereas younger participants had higher numbers of shared words in the lab-sourced population (P=.01). Conclusions: Crowdsourcing is an effective approach to quickly and economically collect a large reliable dataset of normative natural language responses. PMID:23689038
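
    The take-one-out comparison mentioned in the abstract above can be sketched roughly as follows: each response is held out in turn and compared, by number of shared words, with the other responses to the same clip and with responses to other clips. This is only an illustration under simplifying assumptions (lowercased whitespace tokens, distinct words counted once); the authors' exact procedure is not given in the abstract, and `take_one_out` and the demo data are hypothetical.

```python
def words(text):
    """Set of distinct lowercased whitespace-separated tokens."""
    return set(text.lower().split())


def mean_shared(target, others):
    """Average number of distinct words target shares with each other response."""
    if not others:
        return 0.0
    target_words = words(target)
    return sum(len(target_words & words(o)) for o in others) / len(others)


def take_one_out(responses):
    """responses maps a clip id to a list of response strings.

    Returns (same_clip_mean, other_clip_mean), averaged over all responses.
    """
    same, other = [], []
    for clip, texts in responses.items():
        other_pool = [t for c, ts in responses.items() if c != clip for t in ts]
        for i, text in enumerate(texts):
            same_pool = texts[:i] + texts[i + 1:]  # hold the current response out
            same.append(mean_shared(text, same_pool))
            other.append(mean_shared(text, other_pool))
    return sum(same) / len(same), sum(other) / len(other)


demo = {
    "clip_a": ["a man rides a horse", "man on a horse"],
    "clip_b": ["two dogs play ball", "dogs chase a ball"],
}
same_mean, other_mean = take_one_out(demo)
```

    A same-clip mean well above the other-clip mean is the pattern the abstract reports as evidence that responses carry clip-relevant content.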

  11. A natural language query system for Hubble Space Telescope proposal selection

    NASA Technical Reports Server (NTRS)

    Hornick, Thomas; Cohen, William; Miller, Glenn

    1987-01-01

    The proposal selection process for the Hubble Space Telescope is assisted by a robust and easy to use query program (TACOS). The system parses an English subset language sentence regardless of the order of the keyword phrases, allowing the user a greater flexibility than a standard command query language. Capabilities for macro and procedure definition are also integrated. The system was designed for flexibility in both use and maintenance. In addition, TACOS can be applied to any knowledge domain that can be expressed in terms of a single relation. The system was implemented mostly in Common LISP. The TACOS design is described in detail, with particular attention given to the implementation methods of sentence processing.
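
    The idea of parsing a sentence regardless of keyword-phrase order, against a single relation, can be illustrated with the short sketch below. It is written in Python rather than Common LISP and is not the TACOS implementation; the keyword table, the column names, and the functions `parse_query` and `select` are all assumptions made for the example.

```python
import re

# Hypothetical keywords mapped to columns of one relation (table).
KEYWORDS = {"instrument": "instrument", "by": "pi", "cycle": "cycle"}


def parse_query(sentence):
    """Collect keyword/value pairs wherever they occur in the sentence."""
    tokens = re.findall(r"\w+", sentence.lower())
    constraints = {}
    i = 0
    while i < len(tokens) - 1:
        column = KEYWORDS.get(tokens[i])
        if column is not None:
            constraints[column] = tokens[i + 1]  # value follows its keyword
            i += 2
        else:
            i += 1  # skip words that are not keywords
    return constraints


def select(rows, constraints):
    """Return the rows of the relation that satisfy every constraint."""
    return [
        row for row in rows
        if all(str(row[col]).lower() == val for col, val in constraints.items())
    ]


proposals = [
    {"pi": "Smith", "instrument": "WFPC", "cycle": 1},
    {"pi": "Jones", "instrument": "WFPC", "cycle": 1},
]
first = parse_query("list proposals by Smith instrument WFPC")
second = parse_query("list instrument WFPC proposals by Smith")
matches = select(proposals, first)
```

    Because the constraints are gathered into a dictionary, both word orders produce the same query.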

  12. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 165–171, October 25-29, 2014, Doha, Qatar. © 2014 Association for Computational Linguistics

    E-print Network

    -Augmented Machine Translation using Syntax-Label Clustering Hideya Mino, Taro Watanabe and Eiichiro Sumita National Institute of Information and Communications Technology 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, JAPAN, especially when syntax labels are projected from a parser in syntax-augmented machine translation

  13. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pages 939–949,

    E-print Network

    International Joint Conference on Natural Language Processing, pages 939–949, Beijing, China, July 26-31, 2015 the two sentences. We experiment on three different natural sentence rewriting tasks and obtain state-of-the-art; Islam and Inkpen, 2009; Yih et al., 2013). Models based on syntactic trees remain the typical choice

  14. A Requirements-Based Exploration of Open-Source Software Development Projects--Towards a Natural Language Processing Software Analysis Framework

    ERIC Educational Resources Information Center

    Vlas, Radu Eduard

    2012-01-01

    Open source projects do have requirements; they are, however, mostly informal, text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming, and for large projects,…

  15. Identification of methicillin-resistant Staphylococcus aureus within the Nation’s Veterans Affairs Medical Centers using natural language processing

    PubMed Central

    2012-01-01

    Background: Accurate information is needed to direct healthcare systems’ efforts to control methicillin-resistant Staphylococcus aureus (MRSA). Assembling complete and correct microbiology data is vital to understanding and addressing the multiple drug-resistant organisms in our hospitals. Methods: Herein, we describe a system that securely gathers microbiology data from the Department of Veterans Affairs (VA) network of databases. Using natural language processing methods, we applied an information extraction process to extract organisms and susceptibilities from the free-text data. We then validated the extraction against independently derived electronic data and expert annotation. Results: We estimate that the collected microbiology data are 98.5% complete and that methicillin-resistant Staphylococcus aureus was extracted accurately 99.7% of the time. Conclusions: Applying natural language processing methods to microbiology records appears to be a promising way to extract accurate and useful nosocomial pathogen surveillance data. Both scientific inquiry and the data’s reliability will be dependent on the surveillance system’s capability to compare from multiple sources and circumvent systematic error. The dataset constructed and methods used for this investigation could contribute to a comprehensive infectious disease surveillance system or other pressing needs. PMID:22533507

  16. Research in knowledge representation for natural language communication and planning assistance. Final report, 18 March 1985-30 September 1988

    SciTech Connect

    Goodman, B.A.; Grosz, B.; Haas, A.; Litman, D.; Reinhardt, T.

    1988-11-01

    BBN's DARPA project in Knowledge Representation for Natural Language Communication and Planning Assistance has two primary objectives: 1) To perform research on aspects of the interaction between users who are making complex decisions and systems that are assisting them with their task. In particular, this research is focused on communication and the reasoning required for performing its underlying task of discourse processing, planning, and plan recognition and communication repair. 2) Based on the research objectives to build tools for communication, plan recognition, and planning assistance and for the representation of knowledge and reasoning that underlie all of these processes. This final report summarizes BBN's research activities performed under this contract in the areas of knowledge representation and speech and natural language. In particular, the report discusses the work in the areas of knowledge representation, planning, and discourse modeling. We describe a parallel truth maintenance system. We provide an extension to the sentential theory of propositional attitudes by adding a sentential semantics. The report also contains a description of our research in discourse modelling in the areas of planning and plan recognition.

  17. L3 Interactive Data Language

    Energy Science and Technology Software Center (ESTSC)

    2006-09-05

    The L3 system is a computational steering environment for image processing and scientific computing. It consists of an interactive graphical language and interface. Its purpose is to help advanced users in controlling their computational software and assist in the management of data accumulated during numerical experiments. L3 provides a combination of features not found in other environments; these are: textual and graphical construction of programs; persistence of programs and associated data; direct mapping between the scripts, the parameters, and the produced data; implicit hierarchical data organization; full programmability, including conditionals and functions; and incremental execution of programs. The software includes the l3 language and the graphical environment. The language is a single-assignment functional language; the implementation consists of lexer, parser, interpreter, storage handler, and editing support. The graphical environment is an event-driven nested list viewer/editor providing graphical elements corresponding to the language. These elements are both the representation of a user's program and active interfaces to the values computed by that program.

  18. On the nature and evolution of the neural bases of human language

    NASA Technical Reports Server (NTRS)

    Lieberman, Philip

    2002-01-01

    The traditional theory equating the brain bases of language with Broca's and Wernicke's neocortical areas is wrong. Neural circuits linking activity in anatomically segregated populations of neurons in subcortical structures and the neocortex throughout the human brain regulate complex behaviors such as walking, talking, and comprehending the meaning of sentences. When we hear or read a word, neural structures involved in the perception or real-world associations of the word are activated as well as posterior cortical regions adjacent to Wernicke's area. Many areas of the neocortex and subcortical structures support the cortical-striatal-cortical circuits that confer complex syntactic ability, speech production, and a large vocabulary. However, many of these structures also form part of the neural circuits regulating other aspects of behavior. For example, the basal ganglia, which regulate motor control, are also crucial elements in the circuits that confer human linguistic ability and abstract reasoning. The cerebellum, traditionally associated with motor control, is active in motor learning. The basal ganglia are also key elements in reward-based learning. Data from studies of Broca's aphasia, Parkinson's disease, hypoxia, focal brain damage, and a genetically transmitted brain anomaly (the putative "language gene," family KE), and from comparative studies of the brains and behavior of other species, demonstrate that the basal ganglia sequence the discrete elements that constitute a complete motor act, syntactic process, or thought process. Imaging studies of intact human subjects and electrophysiologic and tracer studies of the brains and behavior of other species confirm these findings. As Dobzansky put it, "Nothing in biology makes sense except in the light of evolution" (cited in Mayr, 1982). That applies with as much force to the human brain and the neural bases of language as it does to the human foot or jaw. 
    The converse follows: the mark of evolution on the brains of human beings and other species provides insight into the evolution of the brain bases of human language. The neural substrate that regulated motor control in the common ancestor of apes and humans most likely was modified to enhance cognitive and linguistic ability. Speech communication played a central role in this process. However, the process that ultimately resulted in the human brain may have started when our earliest hominid ancestors began to walk.

  19. Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 451–458, Vancouver, October 2005. © 2005 Association for Computational Linguistics

    E-print Network

    for Information Retrieval Hema Raghavan and James Allan Department of Computer Science University of Massachusetts Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural a string edit distance based method. The effectiveness of these models is evaluated on a name query

  20. The Two Cultures of Science: On Language-Culture Incommensurability Concerning "Nature" and "Observation"

    ERIC Educational Resources Information Center

    Loo, Seng Piew

    2007-01-01

    Culture without nature is empty, nature without culture is deaf. Intercultural dialogue in higher education around the globe is needed to improve the theory, policy and practice of science and science education. The culture, cosmology and philosophy of "global" science as practiced today in all societies around the world are seemingly anchored in…

  1. Natural language processing of asthma discharge summaries for the monitoring of patient care.

    PubMed

    Sager, N; Lyman, M; Tick, L J; Nhàn, N T; Bucknall, C E

    1993-01-01

    A technique for monitoring healthcare via the processing of routinely collected narrative documentation is presented. A checklist of important details of asthma management in use in the Glasgow Royal Infirmary (GRI) was translated into SQL queries and applied to a database of 59 GRI discharge summaries analyzed by the New York University Linguistic String Project medical language processor. Tables of retrieved information obtained for each query were compared with the text of the original documents by physician reviewers. Categories (unit = document) were: (1) information present, retrieved correctly; (2) information not present; (3) information present, retrieved with minor or major error; (4) information present, retrieved with minor or major omissions. Category 2 (physician "documentation score") could be used to prioritize manual review and guide feedback to physicians to improve documentation. The semantic structuring and relative completeness of retrieved data suggest their potential use as input to further quality assurance procedures. PMID:8130474

  2. The development of a natural language interface to a geographical information system

    NASA Technical Reports Server (NTRS)

    Toledo, Sue Walker; Davis, Bruce

    1993-01-01

    This paper will discuss a two and a half year long project undertaken to develop an English-language interface for the geographical information system GRASS. The work was carried out for NASA by a small business, Netrologic, based in San Diego, California, under Phase 1 and 2 Small Business Innovative Research contracts. We consider here the potential value of this system whose current functionality addresses numerical, categorical and boolean raster layers and includes the display of point sets defined by constraints on one or more layers, answers yes/no and numerical questions, and creates statistical reports. It also handles complex queries and lexical ambiguities, and allows temporarily switching to UNIX or GRASS.

  3. Writing in science: Exploring teachers' and students' views of the nature of science in language enriched environments

    NASA Astrophysics Data System (ADS)

    Decoito, Isha

    Writing in science can be used to address some of the issues relevant to contemporary scientific literacy, such as the nature of science, which describes the scientific enterprise for science education. This has implications for the kinds of writing tasks students should attempt in the classroom, and for how students should understand the rationale and claims of these tasks. While scientific writing may train the mind to think scientifically in a disciplined and structured way thus encouraging students to gain access to the public domain of scientific knowledge, the counter-argument is that students need to be able to express their thoughts freely in their own language. Writing activities must aim to promote philosophical and epistemological views of science that accurately portray contemporary science. This mixed-methods case study explored language-enriched environments, in this case, secondary science classrooms with a focus on teacher-developed activities, involving diversified writing styles, that were directly linked to the science curriculum. The research foci included: teachers' implementation of these activities in their classrooms; how the activities reflected the teachers' nature of science views; common attributes between students' views of science and how they represented science in their writings; and if, and how the activities influenced students' nature of science views. Teachers' and students' views of writing and the nature of science are illustrated through pre-and post-questionnaire responses; interviews; student work; and classroom observations. Results indicated that diversified writing activities have the potential to accurately portray science to students, personalize learning in science, improve students' overall attitude towards science, and enhance scientific literacy through learning science, learning about science, and doing science. 
    Further research is necessary to develop an understanding of whether the choice of genre has an influence on meaning construction and understanding in science. Finally, this study concluded that the relationship between students' views of the nature of science and writing in science is complex and is dependent on several factors including the teachers' influence and attitude towards student writing in science.

  4. An evaluation of a natural language processing tool for identifying and encoding allergy information in emergency department clinical notes.

    PubMed

    Goss, Foster R; Plasek, Joseph M; Lau, Jason J; Seger, Diane L; Chang, Frank Y; Zhou, Li

    2014-01-01

    Emergency department (ED) visits due to allergic reactions are common. Allergy information is often recorded in free-text provider notes; however, this domain has not yet been widely studied by the natural language processing (NLP) community. We developed an allergy module built on the MTERMS NLP system to identify and encode food, drug, and environmental allergies and allergic reactions. The module included updates to our lexicon using standard terminologies, and novel disambiguation algorithms. We developed an annotation schema and annotated 400 ED notes that served as a gold standard for comparison to MTERMS output. MTERMS achieved an F-measure of 87.6% for the detection of allergen names and no known allergies, 90% for identifying true reactions in each allergy statement where true allergens were also identified, and 69% for linking reactions to their allergen. These preliminary results demonstrate the feasibility of using NLP to extract and encode allergy information from clinical notes. PMID:25954363
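    For readers unfamiliar with the metric reported above, the following sketch computes an F-measure from raw counts. It assumes the balanced form (F1, the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name and the example counts are illustrative and are not taken from the study.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives   # items the system extracted
    actual = true_positives + false_negatives      # items in the gold standard
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only (not from the paper): 8 correct, 2 spurious, 2 missed.
score = f_measure(8, 2, 2)
```

    With these made-up counts, precision and recall are both 0.8, so the F-measure is also 0.8.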

  5. Computer-Aided TRIZ Ideality and Level of Invention Estimation Using Natural Language Processing and Machine Learning

    NASA Astrophysics Data System (ADS)

    Adams, Christopher; Tate, Derrick

    Patent textual descriptions provide a wealth of information that can be used to understand the underlying design approaches that result in the generation of novel and innovative technology. This article will discuss a new approach for estimating Degree of Ideality and Level of Invention metrics from the theory of inventive problem solving (TRIZ) using patent textual information. Patent text includes information that can be used to model both the functions performed by a design and the associated costs and problems that affect a design’s value. The motivation of this research is to use patent data with calculation of TRIZ metrics to help designers understand which combinations of system components and functions result in creative and innovative design solutions. This article will discuss in detail methods to estimate these TRIZ metrics using natural language processing and machine learning with the use of neural networks.

  6. Classification of CT pulmonary angiography reports by presence, chronicity, and location of pulmonary embolism with natural language processing.

    PubMed

    Yu, Sheng; Kumamaru, Kanako K; George, Elizabeth; Dunne, Ruth M; Bedayat, Arash; Neykov, Matey; Hunsaker, Andetta R; Dill, Karin E; Cai, Tianxi; Rybicki, Frank J

    2014-12-01

    In this paper we describe an efficient tool based on natural language processing for classifying the detail state of pulmonary embolism (PE) recorded in CT pulmonary angiography reports. The classification tasks include: PE present vs. absent, acute PE vs. others, central PE vs. others, and subsegmental PE vs. others. Statistical learning algorithms were trained with features extracted using the NLP tool and gold standard labels obtained via chart review from two radiologists. The areas under the receiver operating characteristic curves (AUC) for the four tasks were 0.998, 0.945, 0.987, and 0.986, respectively. We compared our classifiers with bag-of-words Naive Bayes classifiers, a standard text mining technology, which gave AUC 0.942, 0.765, 0.766, and 0.712, respectively. PMID:25117751
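    The AUC values quoted above can be read as the probability that a randomly chosen positive report is scored higher than a randomly chosen negative one. The sketch below computes AUC that way (the pairwise Mann-Whitney formulation); it is not the authors' code, and the labels and scores are made up for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))

# Made-up classifier scores for four reports (1 = PE present, 0 = PE absent).
perfect = auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
imperfect = auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

    The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based computation would be preferable for large test sets.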

  7. Classification of CT Pulmonary Angiography Reports by Presence, Chronicity, and Location of Pulmonary Embolism with Natural Language Processing

    PubMed Central

    Yu, Sheng; Kumamaru, Kanako K.; George, Elizabeth; Dunne, Ruth M.; Bedayat, Arash; Neykov, Matey; Hunsaker, Andetta R.; Dill, Karin E.; Cai, Tianxi; Rybicki, Frank J.

    2014-01-01

    In this paper we describe an efficient tool based on natural language processing for classifying the detail state of pulmonary embolism (PE) recorded in CT pulmonary angiography reports. The classification tasks include: PE present vs. absent, acute PE vs. others, central PE vs. others, and sub-segmental PE vs. others. Statistical learning algorithms were trained with features extracted using the NLP tool and gold standard labels obtained via chart review from two radiologists. The areas under the receiver operating characteristic curves (AUC) for the four tasks were 0.998, 0.945, 0.987, and 0.986, respectively. We compared our classifiers with bag-of-words Naive Bayes classifiers, a standard text mining technology, which gave AUC 0.942, 0.765, 0.766, and 0.712, respectively. PMID:25117751

  8. Proceedings of The 2nd Workshop on Natural Language Processing Techniques for Educational Applications, pages 94–98, Beijing, China, July 31, 2015. © 2015 Association for Computational Linguistics and Asian Federation of Natural Language Processing

    E-print Network

    preliminary study for building an educational application to help foreign language learning between Turkish for foreign language learning; namely Turkish for English native speakers and English for Turkish native Language Learning Hasan Kaya Istanbul Technical University Department of Computer Engineering Istanbul

  9. Statistical Language Modelling 

    E-print Network

    Gotoh, Yoshihiko; Renals, Steve

    2003-01-01

    Grammar-based natural language processing has reached a level where it can 'understand' language to a limited degree in restricted domains. For example, it is possible to parse textual material very accurately and assign ...

  10. American Sign Language

    MedlinePLUS

    ... Langue des Signes Française). Today’s ASL includes some elements of LSF plus the original local sign languages, ... can also be used to model the essential elements and organization of natural language. Another NIDCD-funded ...

  11. Examining the nature of deaf children's vocabulary learning: British Sign Language (BSL) and written English

    E-print Network

    Warnock, Kristen

    2006-01-01

    The present study had three aims: firstly, we wished to examine the nature of deaf children’s semantic representations on ‘familiar’ and ‘less familiar’ vocabulary items (determined by naming ability). Four tasks were ...

  12. Why is combinatorial communication rare in the natural world, and why is language an exception to this trend?

    PubMed Central

    Scott-Phillips, Thomas C.; Blythe, Richard A.

    2013-01-01

    In a combinatorial communication system, some signals consist of the combinations of other signals. Such systems are more efficient than equivalent, non-combinatorial systems, yet despite this they are rare in nature. Why? Previous explanations have focused on the adaptive limits of combinatorial communication, or on its purported cognitive difficulties, but neither of these explains the full distribution of combinatorial communication in the natural world. Here, we present a nonlinear dynamical model of the emergence of combinatorial communication that, unlike previous models, considers how initially non-communicative behaviour evolves to take on a communicative function. We derive three basic principles about the emergence of combinatorial communication. We hence show that the interdependence of signals and responses places significant constraints on the historical pathways by which combinatorial signals might emerge, to the extent that anything other than the most simple form of combinatorial communication is extremely unlikely. We also argue that these constraints can be bypassed if individuals have the socio-cognitive capacity to engage in ostensive communication. Humans, but probably no other species, have this ability. This may explain why language, which is massively combinatorial, is such an extreme exception to nature's general trend for non-combinatorial communication. PMID:24047871

  13. Proceedings of the 8th Workshop on Asian Language Resources, pages 14–21, Beijing, China, 21-22 August 2010. © 2010 Asian Federation for Natural Language Processing

    E-print Network

    on Princeton WordNet 2.0 (PWN) as its reference model, can provide means for shallow semantic processing synsets (synonym sets) share with those of PWN, and of which Korean language specific information comes

  14. The nature of the working memory system underlying language processing and its relationship to the long-term memory system

    E-print Network

    Fedorenko, Evelina Georgievna

    2007-01-01

    This thesis examines two questions concerning the working memory system underlying language processing: (1) To what extent is the working memory system underlying language processing domain-specific? and (2) What is the ...

  15. Proceedings of Recent Advances in Natural Language Processing, pages 579–583, Hissar, Bulgaria, 7-13 September 2013.

    E-print Network

    .fatiha@uqam.ca Abstract Arabic is a morphologically rich and complex language, which presents significant challenges engines. 1 Introduction Arabic is a morphologically rich and complex language, in which a word carries inflectional language, which makes the morphological analysis complicated. In Arabic, many coordinat

  16. Arbitrary Symbolism in Natural Language Revisited: When Word Forms Carry Meaning

    PubMed Central

    Reilly, Jamie; Westbury, Chris; Kean, Jacob; Peelle, Jonathan E.

    2012-01-01

    Cognitive science has a rich history of interest in the ways that languages represent abstract and concrete concepts (e.g., idea vs. dog). Until recently, this focus has centered largely on aspects of word meaning and semantic representation. However, recent corpora analyses have demonstrated that abstract and concrete words are also marked by phonological, orthographic, and morphological differences. These regularities in sound-meaning correspondence potentially allow listeners to infer certain aspects of semantics directly from word form. We investigated this relationship between form and meaning in a series of four experiments. In Experiments 1–2 we examined the role of metalinguistic knowledge in semantic decision by asking participants to make semantic judgments for aurally presented nonwords selectively varied by specific acoustic and phonetic parameters. Participants consistently associated increased word length and diminished wordlikeness with abstract concepts. In Experiment 3, participants completed a semantic decision task (i.e., abstract or concrete) for real words varied by length and concreteness. Participants were more likely to misclassify longer, inflected words (e.g., “apartment”) as abstract and shorter uninflected abstract words (e.g., “fate”) as concrete. In Experiment 4, we used a multiple regression to predict trial level naming data from a large corpus of nouns which revealed significant interaction effects between concreteness and word form. Together these results provide converging evidence for the hypothesis that listeners map sound to meaning through a non-arbitrary process using prior knowledge about statistical regularities in the surface forms of words. PMID:22879931

  17. Proceedings of the 14th European Workshop on Natural Language Generation, pages 178–182, Sofia, Bulgaria, August 8-9 2013. © 2013 Association for Computational Linguistics

    E-print Network

    Proceedings of the 14th European Workshop on Natural Language Generation, pages 178–182, Sofia corpus of historical texts, GenNext allows the user to generate a template bank organized by semantic oil rig weather reports (SUMTIME-METEO (Reiter et al., 2005)) and require significant investments

  18. Abstract A computer can come to understand natural language the same way Helen Keller did: by using "syntactic semantics"--a theory of how syntax can suffice

    E-print Network

    Rapaport, William J.

    Abstract A computer can come to understand natural language the same way Helen Keller did: by using-life approximations of Chinese Rooms, focusing on Helen Keller's experiences growing up deaf and blind, locked of this kind of naming. Keywords Animal communication · Chinese Room Argument · Helen Keller · Herbert Terrace

  19. Natural language query system design for interactive information storage and retrieval systems. Presentation visuals. M.S. Thesis Final Report, 1 Jul. 1985 - 31 Dec. 1987

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (editor); Liu, I-Hsiung

    1985-01-01

    This Working Paper Series entry represents a collection of presentation visuals associated with the companion report entitled Natural Language Query System Design for Interactive Information Storage and Retrieval Systems, USL/DBMS NASA/RECON Working Paper Series report number DBMS.NASA/RECON-17.

  20. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1880–1891, Seattle, Washington, USA, 18-21 October 2013. © 2013 Association for Computational Linguistics

    E-print Network

    Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1880 created special interest, both theoretical and computational, in short texts. This has led to many recent such a system, including identity fraud and phishing. In this paper, we introduce the concept of k-signatures

  1. Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 1600–1610, Edinburgh, Scotland, UK, July 27–31, 2011. © 2011 Association for Computational Linguistics

    E-print Network

    Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 1600 the present study is detection of phishing (Myers, 2007), the attempt to defraud through texts to phishing consists of technical methods such as email authentication; another looks at profiling

  2. Proceedings of the Second Workshop on Arabic Natural Language Processing, pages 36–48, Beijing, China, July 26-31, 2015. © 2014 Association for Computational Linguistics

    E-print Network

    these solutions, respectively. 2.1 Morphological Analysis and POS Tagging The morphology of dialectal Arabic had), a morphological analyzer for Egyptian Arabic is proposed with further development in (Salloum & Habash, 2014 Proceedings of the Second Workshop on Arabic Natural Language Processing, pages 36–48, Beijing

  3. Proceedings of the 5th International Joint Conference on Natural Language Processing, pages 474–482, Chiang Mai, Thailand, November 8–13, 2011. © 2011 AFNLP

    E-print Network

    comprehension. For instance, a person in 5th grade can comprehend a comic book easily but will struggle Proceedings of the 5th International Joint Conference on Natural Language Processing, pages 474 that the simplified text produced by the proposed system reduces 1.7 Flesch-Kincaid grade level when compared

  4. Proceedings of the Second Workshop on Natural Language Processing for Social Media (SocialNLP), pages 50–58, Dublin, Ireland, August 24 2014.

    E-print Network

    Proceedings of the Second Workshop on Natural Language Processing for Social Media (Social Institute for Creative Technologies Los Angeles, CA 90094 metro.smiles@gmail.com, { park, hshim, sagae is motivated by prior research findings in psychology indicating that verbal behavior is a promising

  5. Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL-X), page 165, New York City, June 2006. © 2006 Association for Computational Linguistics

    E-print Network

    Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL-X), page 165, and B. Say. 2003. The annotation process in the Turkish treebank. In Proc. of the 4th Intern. Workshop. Say, D. Zeynep Hakkani-Tür, and G. Tür. 2003. Building a Turkish treebank. In Abeillé (Abeill

  6. Proceedings of the 2015 Workshop on Biomedical Natural Language Processing (BioNLP 2015), pages 177–182, Beijing, China, July 30, 2015. © 2015 Association for Computational Linguistics

    E-print Network

    Diagnostic Criteria in Quality Data Model Using Natural Language Processing Na Hong Mayo Clinic/ Rochester.edu Dingcheng Li Mayo Clinic/ Rochester, MN, USA Li.dingcheng@mayo.edu Yue Yu Mayo Clinic/ Rochester, MN, USA School of Public Health, Jilin University/ Changchun, China Yu.yue@mayo.edu Hongfang Liu Mayo Clinic

  7. Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 344–352, Honolulu, October 2008. © 2008 Association for Computational Linguistics

    E-print Network

    Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 344 data sets and few states, and that Variational Bayes does well on large data sets and is competitive with the Gibbs samplers. In terms of times of convergence, we find that Variational Bayes was the fastest

  8. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 152–162, Seattle, Washington, USA, 18-21 October 2013. © 2013 Association for Computational Linguistics

    E-print Network

    Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 152- ing a simple expectation-maximization (EM) algorithm. We empirically evaluate the proposed method multinomial Naive Bayes model with latent variables to conduct supervised word clustering on labeled

  9. Proceedings of the EACL 2009 Workshop on GEMS: GEometrical Models of Natural Language Semantics, pages 74–82, Athens, Greece, 31 March 2009. © 2009 Association for Computational Linguistics

    E-print Network

    Unsupervised and Constrained Dirichlet Process Mixture Models for Verb Clustering Andreas Vlachos Computer Dirichlet Process Mixture Models (DPMMs) to a learning task in natural language processing (NLP): lexical-semantic verb clustering. We thoroughly evaluate a method of guiding DPMMs towards a particular clustering

  10. Proceedings of the 14th European Workshop on Natural Language Generation, pages 10–19, Sofia, Bulgaria, August 8-9 2013. © 2013 Association for Computational Linguistics

    E-print Network

    both an ontology and an ontology lexicon in lemon format. Finally, we evaluate fluency and adequacy. For this purpose, lemon, a lexicon model for ontologies, has been developed (McCrae et al., 2011). One of the use cases of lemon is to support natural language generation systems that take as input a knowledge

  11. Pupils Reasoning about the Nature of Change Using an Abstract Picture Language.

    ERIC Educational Resources Information Center

    Stylianidou, Fani; Boohan, Richard

    The research is concerned with investigating children's understanding of physical, chemical, and biological changes while using an approach developed by the project Energy and Change. This project aimed to provide novel ways of teaching about the nature and direction of changes, in particular introducing ideas related to the Second Law of…

  12. International Joint Conference on Natural Language Processing, pages 1012-1016, Nagoya, Japan, 14-18 October 2013.

    E-print Network

    -18 October 2013. Induction of Root and Pattern Lexicon for Unsupervised Morphological Analysis of Arabic on unsupervised learning of Arabic morphology in that it is applicable to naturally-written, unvowelled text. 1 non-concatenative morphology. We apply our approach to inducing an Arabic lexicon of trilateral roots

  13. A perspective on the advancement of natural language processing tasks via topological analysis of complex networks. Comment on "Approaching human language with complex networks" by Cong and Liu

    NASA Astrophysics Data System (ADS)

    Amancio, Diego Raphael

    2014-12-01

    Concepts and methods of complex networks have been applied to probe the properties of a myriad of real systems [1]. The finding that written texts modeled as graphs share several properties of other completely different real systems has inspired the study of language as a complex system [2]. Actually, language can be represented as a complex network in its several levels of complexity. As a consequence, morphological, syntactical and semantical properties have been employed in the construction of linguistic networks [3]. Even the character level has been useful to unfold particular patterns [4,5]. In the review by Cong and Liu [6], the authors emphasize the need to use the topological information of complex networks modeling the various spheres of the language to better understand its origins, evolution and organization. In addition, the authors cite the use of networks in applications aiming at holistic typology and stylistic variations. In this context, I will discuss some possible directions that could be followed in future research directed towards the understanding of language via topological characterization of complex linguistic networks. In addition, I will comment on the use of network models for language processing applications. Additional prospects for future practical research lines will also be discussed in this comment.

  14. Neurolinguistic Approach to Natural Language Processing with Applications to Medical Text Analysis

    PubMed Central

    Matykiewicz, Paweł; Pestian, John

    2008-01-01

    Understanding written or spoken language presumably involves spreading neural activation in the brain. This process may be approximated by spreading activation in semantic networks, providing enhanced representations that involve concepts that are not found directly in the text. Approximation of this process is of great practical and theoretical interest. Although activations of neural circuits involved in representation of words rapidly change in time, snapshots of these activations spreading through associative networks may be captured in a vector model. Concepts of similar type activate larger clusters of neurons, priming areas in the left and right hemisphere. Analysis of recent brain imaging experiments shows the importance of the right hemisphere non-verbal clusterization. Medical ontologies enable development of a large-scale practical algorithm to re-create pathways of spreading neural activations. First, concepts of specific semantic type are identified in the text, and then all related concepts of the same type are added to the text, providing expanded representations. To avoid rapid growth of the extended feature space, after each step only the most useful features that increase document clusterization are retained. Short hospital discharge summaries are used to illustrate how this process works on real, very noisy data. Expanded texts show significantly improved clustering and may be classified with much higher accuracy. Although better approximations to the spreading of neural activations may be devised, the practical approach presented in this paper helps to discover pathways used by the brain to process specific concepts, and may be used in large-scale applications. PMID:18614334

  15. The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text.

    PubMed

    Rindflesch, Thomas C; Fiszman, Marcelo

    2003-12-01

    Interpretation of semantic propositions in free-text documents such as MEDLINE citations would provide valuable support for biomedical applications, and several approaches to semantic interpretation are being pursued in the biomedical informatics community. In this paper, we describe a methodology for interpreting linguistic structures that encode hypernymic propositions, in which a more specific concept is in a taxonomic relationship with a more general concept. In order to effectively process these constructions, we exploit underspecified syntactic analysis and structured domain knowledge from the Unified Medical Language System (UMLS). After introducing the syntactic processing on which our system depends, we focus on the UMLS knowledge that supports interpretation of hypernymic propositions. We first use semantic groups from the Semantic Network to ensure that the two concepts involved are compatible; hierarchical information in the Metathesaurus then determines which concept is more general and which more specific. A preliminary evaluation of a sample based on the semantic group Chemicals and Drugs provides 83% precision. An error analysis was conducted and potential solutions to the problems encountered are presented. The research discussed here serves as a paradigm for investigating the interaction between domain knowledge and linguistic structure in natural language processing, and could also make a contribution to research on automatic processing of discourse structure. Additional implications of the system we present include its integration in advanced semantic interpretation processors for biomedical text and its use for information extraction in specific domains. The approach has the potential to support a range of applications, including information retrieval and ontology engineering. PMID:14759819
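
    The two-step check described in this abstract (group compatibility first, then hierarchy direction) can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionaries SEMANTIC_GROUP and PARENT are hypothetical toy data standing in for the UMLS Semantic Network groups and Metathesaurus hierarchy, and the function names are invented here; none of this is the authors' system or a real UMLS API.

```python
from typing import Optional, Set, Tuple

# Hypothetical toy data; NOT real UMLS content and not a UMLS API.
SEMANTIC_GROUP = {
    "aspirin": "Chemicals & Drugs",
    "analgesic": "Chemicals & Drugs",
    "headache": "Disorders",
}
# child -> parent links, standing in for hierarchical information
PARENT = {"aspirin": "analgesic"}


def ancestors(concept: str) -> Set[str]:
    """Collect every concept reachable by following parent links upward."""
    seen: Set[str] = set()
    while concept in PARENT:
        concept = PARENT[concept]
        if concept in seen:  # guard against accidental cycles in toy data
            break
        seen.add(concept)
    return seen


def interpret_hypernym(a: str, b: str) -> Optional[Tuple[str, str]]:
    """Return (specific, general) if the pair passes both checks, else None."""
    group_a, group_b = SEMANTIC_GROUP.get(a), SEMANTIC_GROUP.get(b)
    # Step 1: both concepts must belong to the same semantic group.
    if group_a is None or group_a != group_b:
        return None
    # Step 2: the hierarchy decides which concept is the more general one.
    if b in ancestors(a):
        return (a, b)
    if a in ancestors(b):
        return (b, a)
    return None


result = interpret_hypernym("analgesic", "aspirin")
```

    The order of the arguments does not matter in this sketch; a pair from different groups, or a pair with no hierarchical link, yields None rather than a guessed direction.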

  16. The nature and prevalence of disability in a Ghanaian community as measured by the Language Independent Functional Evaluation

    PubMed Central

    Kelemen, Benjamin William; Haig, Andrew John; Goodnight, Siera; Nyante, Gifty

    2013-01-01

    Introduction The current study uses the Language Independent Functional Evaluation (L.I.F.E.) to evaluate disability in a smaller Ghanaian coastal town to characterize the extent and nature of disability. The L.I.F.E. is a video animated, language free equivalent of the standard 10-item verbal/written Barthel Index functional assessment. Methods Over a four-month period, the L.I.F.E. survey was given to members of the village of Anomabo in a preliminary survey which consisted of recruitment in an uncontrolled manner, followed by a systematic, comprehensive survey of three neighborhood clusters. Basic demographics were also collected, along with the observer's assessment of disability. Results 541 inhabitants (264 in the preliminary survey and 277 in systematic survey) completed the L.I.F.E. Participants ranged from 7-100 years old (mean age 32.88, s.d. 20.64) and were 55.9% female. In the systematic study, 16.6% of participants had a less than perfect score on the L.I.F.E., indicating some degree of impairment. Significant differences were found between age groups, but not between sexes, the preliminary and systematic survey, and study location (α=.05). Conclusion The L.I.F.E. and this study methodology can be used to measure the prevalence of disability in African communities. Disability in this community was higher than the frequently cited estimate of 10%. African policymakers can use the L.I.F.E. to measure disability and thus more rationally allocate resources for medical rehabilitation. PMID:23717718

  17. SIMD-parallel understanding of natural language with application to magnitude-only optical parsing of text

    NASA Astrophysics Data System (ADS)

    Schmalz, Mark S.

    1992-08-01

    A novel parallel model of natural language (NL) understanding is presented which can realize high levels of semantic abstraction, and is designed for implementation on synchronous SIMD architectures and optical processors. Theory is expressed in terms of the Image Algebra (IA), a rigorous, concise, inherently parallel notation which unifies the design, analysis, and implementation of image processing algorithms. The IA has been implemented on numerous parallel architectures, and IA preprocessors and interpreters are available for the FORTRAN and Ada languages. In a previous study, we demonstrated the utility of IA for mapping MEA-conformable (Multiple Execution Array) algorithms to optical architectures. In this study, we extend our previous theory to map serial parsing algorithms to the synchronous SIMD paradigm. We initially derive a two-dimensional image that is based upon the adjacency matrix of a semantic graph. Via IA template mappings, the operations of bottom-up parsing, semantic disambiguation, and referential resolution are implemented as image-processing operations upon the adjacency matrix. Pixel-level operations are constrained to Hadamard addition and multiplication, thresholding, and row/column summation, which are available in magnitude-only optics. Assuming high parallelism in the parse rule base, the parsing of n input symbols with a grammar consisting of M rules of arity H, on an N-processor architecture, could exhibit time complexity of T(n)

  18. Combining Speech Recognition/Natural Language Processing with 3D Online Learning Environments to Create Distributed Authentic and Situated Spoken Language Learning

    ERIC Educational Resources Information Center

    Jones, Greg; Squires, Todd; Hicks, Jeramie

    2008-01-01

    This article will describe research done at the National Institute of Multimedia in Education, Japan and the University of North Texas on the creation of a distributed Internet-based spoken language learning system that would provide more interactive and motivating learning than current multimedia and audiotape-based systems. The project combined…

  19. Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages, pages 123-130, Hyderabad, India, January 2008. © 2008 Asian Federation of Natural Language Processing

    E-print Network

    machine translation: Biblical chatter from Finnish to English David Ellis Brown University Providence, RI of the steps in the process, from speech recognition to synthesis, deriving a model of translation that is effective in the domain of spoken language is an interesting and challenging task. If we could teach our

  20. Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages, pages 105-112, Hyderabad, India, January 2008. © 2008 Asian Federation of Natural Language Processing

    E-print Network

    the commercial speech recognition systems available in the market are IBM Via Voice and Scansoft Dragon system-human interaction. These are Speech to Text conversion i.e. Speech Recognition & Text To Speech (TTS) conversion. In this paper the implementation of one issue Speech Recognition for Indian Languages is presented. 1

  1. Language, the Forgotten Content.

    ERIC Educational Resources Information Center

    Kelly, Patricia P., Ed.; Small, Robert C., Jr., Ed.

    1987-01-01

    The ways that students can learn about the nature of the English language and develop a sense of excitement about their language are explored in this focused journal issue. The titles of the essays and their authors are as follows: (1) "Language, the Forgotten Content" (R. Small and P. P. Kelly); (2) "What Should English Teachers Know about…

  2. Language, Gesture, and Space.

    ERIC Educational Resources Information Center

    Emmorey, Karen, Ed.; Reilly, Judy S., Ed.

    A collection of papers addresses a variety of issues regarding the nature and structure of sign language, gesture, and gesture systems. Articles include: "Theoretical Issues Relating Language, Gesture, and Space: An Overview" (Karen Emmorey, Judy S. Reilly); "Real, Surrogate, and Token Space: Grammatical Consequences in ASL American Sign Language"…

  3. Salience: the key to the selection problem in natural language generation

    SciTech Connect

    Conklin, E.J.; McDonald, D.D.

    1982-01-01

    The authors argue that in domains where a strong notion of salience can be defined, it can be used to provide: (1) an elegant solution to the selection problem, i.e. the problem of how to decide whether a given fact should or should not be mentioned in the text; and (2) a simple and direct control framework for the entire deep generation process, coordinating proposing, planning, and realization. (Deep generation involves reasoning about conceptual and rhetorical facts, as opposed to the narrowly linguistic reasoning that takes place during realization.) The authors report on an empirical study of salience in pictures of natural scenes, and its use in a computer program that generates descriptive paragraphs comparable to those produced by people. 13 references.

  4. Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 325-333, Prague, June 2007. © 2007 Association for Computational Linguistics

    E-print Network

    Linguistics Extending a Thesaurus in the Pan-Chinese Context Oi Yee Kwong and Benjamin K. Tsou Language {rlolivia,rlbtsou}@cityu.edu.hk Abstract In this paper, we address a unique problem in Chinese language processing and report on our study on extending a Chinese thesaurus with region-specific words, mostly from

  5. Tracking irregular morphophonological dependencies in natural language: evidence from the acquisition of subject-verb agreement in French.

    PubMed

    Nazzi, Thierry; Barrière, Isabelle; Goyet, Louise; Kresh, Sarah; Legendre, Géraldine

    2011-07-01

    This study examines French-learning infants' sensitivity to grammatical non-adjacent dependencies involving subject-verb agreement (e.g., le/les garçons lit/lisent 'the boy(s) read(s)') where number is audible on both the determiner of the subject DP and the agreeing verb, and the dependency is spanning across two syntactic phrases. A further particularity of this subsystem of French subject-verb agreement is that number marking on the verb is phonologically highly irregular. Despite the challenge, the HPP results for 24- and 18-month-olds demonstrate knowledge of both number dependencies: between the singular determiner le and the non-adjacent singular verbal forms and between the plural determiner les and the non-adjacent plural verbal forms. A control experiment suggests that the infants are responding to known verb forms, not phonological regularities. Given the paucity of such forms in the adult input documented through a corpus study, these results are interpreted as evidence that 18-month-olds have the ability to extract complex patterns across a range of morphophonologically inconsistent and infrequent items in natural language. PMID:21497801

  6. Automatic extraction of nanoparticle properties using natural language processing: NanoSifter an application to acquire PAMAM dendrimer properties.

    PubMed

    Jones, David E; Igo, Sean; Hurdle, John; Facelli, Julio C

    2014-01-01

    In this study, we demonstrate the use of natural language processing methods to extract, from nanomedicine literature, numeric values of biomedical property terms of poly(amidoamine) dendrimers. We have developed a method for extracting these values for properties taken from the NanoParticle Ontology, using the General Architecture for Text Engineering and a Nearly-New Information Extraction System. We also created a method for associating the identified numeric values with their corresponding dendrimer properties, called NanoSifter. We demonstrate that our system can correctly extract numeric values of dendrimer properties reported in the cancer treatment literature with high recall, precision, and f-measure. The micro-averaged recall was 0.99, precision was 0.84, and f-measure was 0.91. Similarly, the macro-averaged recall was 0.99, precision was 0.87, and f-measure was 0.92. To our knowledge, these results are the first application of text mining to extract and associate dendrimer property terms and their corresponding numeric values. PMID:24392101
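
    The abstract reports both micro- and macro-averaged scores; the difference between the two can be illustrated with a short sketch. The per-property counts below are invented purely for illustration and are not the study's data. The sketch also assumes one common convention, in which the macro-averaged F-measure is the mean of the per-property F-measures; the abstract does not say which convention the authors used.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F-measure from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def micro_average(counts: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Pool the counts over all properties, then score once."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return prf(tp, fp, fn)


def macro_average(counts: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Score each property separately, then take the unweighted mean."""
    scores = [prf(*c) for c in counts.values()]
    n = len(scores)
    return (
        sum(s[0] for s in scores) / n,
        sum(s[1] for s in scores) / n,
        sum(s[2] for s in scores) / n,
    )


# Hypothetical counts for two property types (illustration only).
example = {"property_a": (9, 1, 0), "property_b": (1, 1, 1)}
micro = micro_average(example)
macro = macro_average(example)
```

    Micro-averaging lets frequent properties dominate the score, while macro-averaging weights every property equally, which is why the two sets of figures in a report can differ.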

  7. Integrating Learner Corpora and Natural Language Processing: A Crucial Step towards Reconciling Technological Sophistication and Pedagogical Effectiveness

    ERIC Educational Resources Information Center

    Granger, Sylviane; Kraif, Olivier; Ponton, Claude; Antoniadis, Georges; Zampa, Virginie

    2007-01-01

    Learner corpora, electronic collections of spoken or written data from foreign language learners, offer unparalleled access to many hitherto uncovered aspects of learner language, particularly in their error-tagged format. This article aims to demonstrate the role that the learner corpus can play in CALL, particularly when used in conjunction with…

  8. 2 Evolution in Language and Elsewhere It is a natural principle that the script and the sounds

    E-print Network

    of speciation in life forms.2 And, reflecting the similarities between models of genetic inheritance and those of biological organisms. As biological organisms become extinct, languages also die, never to be heard again linguistic innovations. However, it has long been recognized that language change differs in significant ways

  9. Proceedings of Recent Advances in Natural Language Processing, pages 302-308, Hissar, Bulgaria, 12-14 September 2011.

    E-print Network

    construction method is an Expectation Maximization (EM) approach which uses Princeton WordNet 3.0 (PWNNet in any language. Links between PWN synsets and target language words are extracted using a bilingual dictionary. For each of these links a parameter is defined that shows probability of selecting PWN synset

  10. The Acquisition of Written Language: Response and Revision. Writing Research: Multidisciplinary Inquiries into the Nature of Writing Series.

    ERIC Educational Resources Information Center

    Freedman, Sarah Warshauer, Ed.

    Viewing writing as both a form of language learning and an intellectual skill, this book presents essays on how writers acquire trusted inner voices and the roles schools and teachers can play in helping student writers in the learning process. The essays in the book focus on one of three topics: the language of instruction and how response and…

  11. Learning Language through Content: Learning Content through Language.

    ERIC Educational Resources Information Center

    Met, Myriam

    1991-01-01

    A definition and description of elementary school content-based foreign language instruction notes how it promotes natural language learning and higher-order thinking skills, and also addresses curriculum development, language objective definition, and specific applications in mathematics, science, reading and language arts, social studies, and…

  12. Language as a Liberal Art.

    ERIC Educational Resources Information Center

    Stein, Jack M.

    Language, considered as a liberal art, is examined in the light of other philosophical viewpoints concerning the nature of language in relation to second language instruction in this paper. Critical of an earlier mechanistic audio-lingual learning theory, translation approaches to language learning, vocabulary list-oriented courses, graduate…

  13. Gendered Language in Interactive Discourse

    ERIC Educational Resources Information Center

    Hussey, Karen A.; Katz, Albert N.; Leith, Scott A.

    2015-01-01

    Over two studies, we examined the nature of gendered language in interactive discourse. In the first study, we analyzed gendered language from a chat corpus to see whether tokens of gendered language proposed in the gender-as-culture hypothesis (Maltz and Borker in "Language and social identity." Cambridge University Press, Cambridge, pp…

  14. Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL-X), pages 196-200, New York City, June 2006. © 2006 Association for Computational Linguistics

    E-print Network

    Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL-X), pages 196 the impact of feature engineering and the choice of machine learning algorithm, with particular focus.84 89.95 Portuguese 88.96 84.59 Slovene 81.77 72.42 Spanish 84.87 80.36 Swedish 89.54 79.69 Turkish 73

  15. A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools

    PubMed Central

    2012-01-01

    Background We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. Results Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. Conclusions The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications. PMID:22901054

  16. Social Network Development, Language Use, and Language Acquisition during Study Abroad: Arabic Language Learners' Perspectives

    ERIC Educational Resources Information Center

    Dewey, Dan P.; Belnap, R. Kirk; Hillstrom, Rebecca

    2013-01-01

    Language learners and educators have subscribed to the belief that those who go abroad will have many opportunities to use the target language and will naturally become proficient. They also assume that language learners will develop relationships with native speakers allowing them to use the language and become more fluent, an assumption…

  17. The Common Alerting Protocol (CAP) and Emergency Data Exchange Language (EDXL) - Application in Early Warning Systems for Natural Hazard

    NASA Astrophysics Data System (ADS)

    Lendholt, Matthias; Hammitzsch, Martin; Wächter, Joachim

    2010-05-01

    The Common Alerting Protocol (CAP) [1] is an XML-based data format for exchanging public warnings and emergencies between alerting technologies. In conjunction with the Emergency Data Exchange Language (EDXL) Distribution Element (-DE) [2] these data formats can be used for warning message dissemination in early warning systems for natural hazards. Application took place in the DEWS (Distance Early Warning System) [3] project where CAP serves as central message format containing both human readable warnings and structured data for automatic processing by message receivers. In particular the spatial reference capabilities are of paramount importance both in CAP and EDXL. Affected areas are addressable via geo codes like HASC (Hierarchical Administrative Subdivision Codes) [4] or UN/LOCODE [5] but also with arbitrary polygons that can be directly generated out of GML [6]. For each affected area standardized criticality values (urgency, severity and certainty) have to be set but also application specific key-value-pairs like estimated time of arrival or maximum inundation height can be specified. This enables - together with multilingualism, message aggregation and message conversion for different dissemination channels - the generation of user-specific tailored warning messages. [1] CAP, http://www.oasis-emergency.org/cap [2] EDXL-DE, http://docs.oasis-open.org/emergency/edxl-de/v1.0/EDXL-DE_Spec_v1.0.pdf [3] DEWS, http://www.dews-online.org [4] HASC, "Administrative Subdivisions of Countries: A Comprehensive World Reference, 1900 Through 1998" ISBN 0-7864-0729-8 [5] UN/LOCODE, http://www.unece.org/cefact/codesfortrade/codes_index.htm [6] GML, http://www.opengeospatial.org/standards/gml
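
    To make the structure described above more concrete, here is a minimal sketch of a CAP-style info block built with Python's standard xml.etree.ElementTree. The element names (info, urgency, severity, certainty, parameter, area, polygon, geocode, valueName, value) follow the CAP specification as best understood here; the helper build_info, the parameter names, and all values are hypothetical placeholders, not DEWS output. The fragment omits the CAP namespace and the required alert-level fields, so it is not a complete or schema-valid message.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def build_info(
    event: str,
    urgency: str,
    severity: str,
    certainty: str,
    area_desc: str,
    polygon: List[Tuple[float, float]],
    geocodes: Dict[str, str],
    parameters: Dict[str, str],
) -> ET.Element:
    """Build an <info> fragment with criticality values, parameters, and one area."""
    info = ET.Element("info")
    ET.SubElement(info, "category").text = "Geo"
    ET.SubElement(info, "event").text = event
    # Standardized criticality values mentioned in the abstract.
    ET.SubElement(info, "urgency").text = urgency
    ET.SubElement(info, "severity").text = severity
    ET.SubElement(info, "certainty").text = certainty
    # Application-specific key-value pairs.
    for name, value in parameters.items():
        param = ET.SubElement(info, "parameter")
        ET.SubElement(param, "valueName").text = name
        ET.SubElement(param, "value").text = value
    # Spatial reference: free-form polygon plus geo codes.
    area = ET.SubElement(info, "area")
    ET.SubElement(area, "areaDesc").text = area_desc
    # Whitespace-separated "lat,lon" pairs; first and last pair close the ring.
    ET.SubElement(area, "polygon").text = " ".join(
        f"{lat},{lon}" for lat, lon in polygon
    )
    for name, value in geocodes.items():
        code = ET.SubElement(area, "geocode")
        ET.SubElement(code, "valueName").text = name
        ET.SubElement(code, "value").text = value
    return info


# Placeholder values for illustration only.
info = build_info(
    event="Tsunami",
    urgency="Immediate",
    severity="Extreme",
    certainty="Observed",
    area_desc="Example coastal district",
    polygon=[(5.0, 95.0), (5.0, 96.0), (6.0, 96.0), (5.0, 95.0)],
    geocodes={"HASC": "XX.YY"},
    parameters={"estimatedTimeOfArrival": "2010-05-01T12:00:00+00:00",
                "maxInundationHeight": "2.5 m"},
)
xml_text = ET.tostring(info, encoding="unicode")
```

    Keeping the human-readable fields and the structured parameters in the same block is what lets a receiver either display the warning or process it automatically.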

  18. Understanding Natural Language Understanding

    E-print Network

    Pratt, Vaughan

    'm afraid that's something I cannot allow to happen. HAL Siri (2011) Colbert: ... I don't want to search for anything! I want to write the show! Siri: Searching the Web for "search for anything. I want to write the shuffle." Colbert: ... For the love of God, the cameras are on, give me something? Siri: What kind

  19. Proceedings of the 9th International Workshop on Finite State Methods and Natural Language Processing, pages 48-56, Blois (France), July 12-15, 2011. © 2011 Association for Computational Linguistics

    E-print Network

    to precision. 1 Introduction Recognition of named entities (NER) has been a hot topic in Natural Language Processing community for more than fifteen years. Ever since their introduction in the scope of the Sixth

  20. Web 2.0-Based Crowdsourcing for High-Quality Gold Standard Development in Clinical Natural Language Processing

    PubMed Central

    Deleger, Louise; Li, Qi; Kaiser, Megan; Stoutenborough, Laura

    2013-01-01

    Background A high-quality gold standard is vital for supervised, machine learning-based, clinical natural language processing (NLP) systems. In clinical NLP projects, expert annotators traditionally create the gold standard. However, traditional annotation is expensive and time-consuming. To reduce the cost of annotation, general NLP projects have turned to crowdsourcing based on Web 2.0 technology, which involves submitting smaller subtasks to a coordinated marketplace of workers on the Internet. Many studies have been conducted in the area of crowdsourcing, but only a few have focused on tasks in the general NLP field and only a handful in the biomedical domain, usually based upon very small pilot sample sizes. In addition, the quality of the crowdsourced biomedical NLP corpora was never exceptional when compared to traditionally-developed gold standards. The previously reported results on a medical named entity annotation task showed a 0.68 F-measure based agreement between crowdsourced and traditionally-developed corpora. Objective Building upon previous work from the general crowdsourcing research, this study investigated the usability of crowdsourcing in the clinical NLP domain with special emphasis on achieving high agreement between crowdsourced and traditionally-developed corpora. Methods To build the gold standard for evaluating the crowdsourcing workers’ performance, 1042 clinical trial announcements (CTAs) from the ClinicalTrials.gov website were randomly selected and double annotated for medication names, medication types, and linked attributes. For the experiments, we used CrowdFlower, an Amazon Mechanical Turk-based crowdsourcing platform. We calculated sensitivity, precision, and F-measure to evaluate the quality of the crowd’s work and tested the statistical significance (P<.001, chi-square test) to detect differences between the crowdsourced and traditionally-developed annotations. Results The agreement between the crowd’s annotations and the traditionally-generated corpora was high for: (1) annotations (0.87, F-measure for medication names; 0.73, medication types), (2) correction of previous annotations (0.90, medication names; 0.76, medication types), and excellent for (3) linking medications with their attributes (0.96). Simple voting provided the best judgment aggregation approach. There was no statistically significant difference between the crowd and traditionally-generated corpora. Our results showed a 27.9% improvement over previously reported results on the medication named entity annotation task. Conclusions This study offers three contributions. First, we proved that crowdsourcing is a feasible, inexpensive, fast, and practical approach to collect high-quality annotations for clinical text (when protected health information was excluded). We believe that well-designed user interfaces and a rigorous quality control strategy for entity annotation and linking were critical to the success of this work. Second, as a further contribution to the Internet-based crowdsourcing field, we will publicly release the JavaScript and CrowdFlower Markup Language infrastructure code that is necessary to utilize CrowdFlower’s quality control and crowdsourcing interfaces for named entity annotations. Finally, to spur future research, we will release the CTA annotations that were generated by traditional and crowdsourced approaches. PMID:23548263
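
    Two points in this abstract lend themselves to a small worked sketch: simple voting as a judgment aggregation rule, and the 27.9% figure. The code below is an illustration only; the function names and example labels are hypothetical, and the tie-handling rule (return no decision) is an assumption, since the abstract does not describe how ties were resolved. Reading the 27.9% as the relative change from the earlier 0.68 F-measure to the reported 0.87 is also an assumption, although it does reproduce the stated number.

```python
from collections import Counter
from typing import Optional, Sequence


def simple_vote(labels: Sequence[str]) -> Optional[str]:
    """Return the label chosen by most workers, or None on an empty list or a tie."""
    ranked = Counter(labels).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # assumption: ties are left undecided
    return ranked[0][0]


def relative_improvement(old: float, new: float) -> float:
    """Relative change from an old score to a new one, as a fraction."""
    return (new - old) / old


# Hypothetical judgments from three workers on one text span.
decision = simple_vote(["medication", "medication", "not_medication"])

# Relative change from the previously reported 0.68 to the reported 0.87.
improvement_percent = round(relative_improvement(0.68, 0.87) * 100, 1)
```

    Majority voting needs no per-worker weighting or training data, which may be part of why such a simple rule can be competitive when the task interface and quality controls are well designed.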