Science.gov

Sample records for natural language parsers

  1. Policy-Based Management Natural Language Parser

    NASA Technical Reports Server (NTRS)

    James, Mark

    2009-01-01

    The Policy-Based Management Natural Language Parser (PBEM) is a rules-based approach to enterprise management that can be used to automate certain management tasks. This parser simplifies the management of a given endeavor by establishing policies to deal with situations that are likely to occur. Policies are operating rules that can be referred to as a means of maintaining order, security, consistency, or other ways of successfully furthering a goal or mission. PBEM provides a way of managing configuration of network elements, applications, and processes via a set of high-level rules or business policies rather than managing individual elements, thus switching the control to a higher level. This software allows unique management rules (or commands) to be specified and applied to a cross-section of the Global Information Grid (GIG). This software embodies a parser that is capable of recognizing and understanding conversational English. Because all possible dialect variants cannot be anticipated, a unique capability was developed that parses based on conversation intent rather than the exact way the words are used. This software can increase productivity by enabling a user to converse with the system in conversational English to define network policies. PBEM can be used in both manned and unmanned science-gathering programs. Because policy statements can be domain-independent, this software can be applied equally to a wide variety of applications.

  2. Extending a natural language parser with UMLS knowledge.

    PubMed Central

    McCray, A. T.

    1991-01-01

    Over the past several years our research efforts have been directed toward the identification of natural language processing methods and techniques for improving access to biomedical information stored in computerized form. To provide a testing ground for some of these ideas we have undertaken the development of SPECIALIST, a prototype system for parsing and accessing biomedical text. The system includes linguistic and biomedical knowledge. Linguistic knowledge involves rules and facts about the grammar of the language. Biomedical knowledge involves rules and facts about the domain of biomedicine. The UMLS knowledge sources, Meta-1 and the Semantic Network, as well as the UMLS test collection, have recently contributed to the development of the SPECIALIST system. PMID:1807586

  3. Flexible natural language parser based on a two-level representation of syntax

    SciTech Connect

    Lesmo, L.; Torasso, P.

    1983-01-01

    In this paper the authors present a parser which makes explicit the interconnections between syntax and semantics, analyzes sentences in a quasi-deterministic fashion and, in many cases, identifies the roles of the various constituents even if the sentence is ill-formed. The main feature of the approach on which the parser is based is a two-level representation of syntactic knowledge: a first set of rules emits hypotheses about the constituents of the sentence and their functional roles, and a second set of rules verifies whether a hypothesis satisfies the constraints on the well-formedness of sentences. However, the application of the second set of rules is delayed until the semantic knowledge confirms the acceptability of the hypothesis. If the semantic knowledge rejects it, a new hypothesis is obtained by applying a simple and relatively inexpensive natural modification; a set of these modifications is predefined, and a real backup is performed only when none of them is applicable. In most cases this situation corresponds to a sentence on which people would normally garden-path. 19 references.

  4. Application of a rules-based natural language parser to critical value reporting in anatomic pathology.

    PubMed

    Owens, Scott R; Balis, Ulysses G J; Lucas, David R; Myers, Jeffrey L

    2012-03-01

    Critical values in anatomic pathology are rare occurrences and difficult to define with precision. Nevertheless, accrediting institutions require effective and timely communication of all critical values generated by clinical and anatomic laboratories. Provisional gating criteria for potentially critical anatomic diagnoses have been proposed, with some success in their implementation reported in the literature. Ensuring effective communication is challenging, however, making the case for programmatic implementation of a turnkey-style integrated information technology solution. To address this need, we developed a generically deployable laboratory information system-based tool, using a tiered natural language processing predicate calculus inference engine to identify qualifying cases that meet criteria for critical diagnoses but lack an indication in the electronic medical record for an appropriate clinical discussion with the ordering physician of record. Using this tool, we identified an initial cohort of 13,790 cases over a 49-month period, which were further explored by reviewing the available electronic medical record for each patient. Of these cases, 35 (0.3%) were judged to require intervention in the form of direct communication between the attending pathologist and the clinical physician of record. In 8 of the 35 cases, this intervention resulted in the conveyance of new information to the requesting physician and/or a change in the patient's clinical plan. The very low percentage of such cases (0.058%) illustrates their rarity in daily practice, making it unlikely that manual identification/notification approaches alone can reliably manage them. The automated turnkey system was useful in avoiding missed handoffs of significant, clinically actionable diagnoses. PMID:22343338

  5. COD::CIF::Parser: an error-correcting CIF parser for the Perl language

    PubMed Central

    Merkys, Andrius; Vaitkus, Antanas; Butkus, Justas; Okulič-Kazarinas, Mykolas; Kairys, Visvaldas; Gražulis, Saulius

    2016-01-01

    A syntax-correcting CIF parser, COD::CIF::Parser, is presented that can parse CIF 1.1 files and accurately report the position and the nature of the discovered syntactic problems. In addition, the parser is able to automatically fix the most common and the most obvious syntactic deficiencies of the input files. Bindings for Perl, C and Python programming environments are available. Based on COD::CIF::Parser, the cod-tools package for manipulating the CIFs in the Crystallography Open Database (COD) has been developed. The cod-tools package has been successfully used for continuous updates of the data in the automated COD data deposition pipeline, and to check the validity of COD data against the IUCr data validation guidelines. The performance, capabilities and applications of different parsers are compared. PMID:26937241

  6. The Accelerator Markup Language and the Universal Accelerator Parser

    SciTech Connect

    Sagan, David; Forster, M.; Bates, D.; Wolski, A.; Schmidt, F.; Walker, N.J.; Larrieu, Theodore; Roblin, Yves; Pelaia, T.; Tenenbaum, P.; Woodley, M.; Reiche, S.

    2006-07-01

    A major obstacle to collaboration on accelerator projects has been the sharing of lattice description files between modeling codes. To address this problem, a lattice description format called Accelerator Markup Language (AML) has been created. AML is based upon the standard eXtensible Markup Language (XML) format; this provides the flexibility for AML to be easily extended to satisfy changing requirements. In conjunction with AML, a software library, called the Universal Accelerator Parser (UAP), is being developed to speed the integration of AML into any program. The UAP is structured to make it relatively straightforward (by giving appropriate specifications) to read and write lattice files in any format. This will allow programs that use the UAP code to read a variety of different file formats. Additionally this will greatly simplify conversion of files from one format to another. Currently, besides AML, the UAP supports the MAD lattice format.

  7. The Accelerator Markup Language and the Universal Accelerator Parser

    SciTech Connect

    Sagan, D.; Forster, M.; Bates, D.A.; Wolski, A.; Schmidt, F.; Walker, N.J.; Larrieu, T.; Roblin, Y.; Pelaia, T.; Tenenbaum, P.; Woodley, M.; Reiche, S.

    2006-10-06

    A major obstacle to collaboration on accelerator projects has been the sharing of lattice description files between modeling codes. To address this problem, a lattice description format called Accelerator Markup Language (AML) has been created. AML is based upon the standard eXtensible Markup Language (XML) format; this provides the flexibility for AML to be easily extended to satisfy changing requirements. In conjunction with AML, a software library, called the Universal Accelerator Parser (UAP), is being developed to speed the integration of AML into any program. The UAP is structured to make it relatively straightforward (by giving appropriate specifications) to read and write lattice files in any format. This will allow programs that use the UAP code to read a variety of different file formats. Additionally, this will greatly simplify conversion of files from one format to another. Currently, besides AML, the UAP supports the MAD lattice format.

  8. Evaluation of a parallel chart parser. Project Memo

    SciTech Connect

    Grishman, R.; Chitrao, M.

    1987-09-01

    A parallel implementation of a chart parser is described for a shared-memory multiprocessor. The speedups obtained with this parser were measured for a number of small natural-language grammars. For the largest of these, part of an operational question-answering system, the parser ran 5 to 7 times faster than the serial version.
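
    Chart parsing fills a table of recognized constituents over spans of the input so that partial analyses are shared rather than recomputed. As a general illustration of the technique (not the parallel implementation or the grammars measured in this memo), the sketch below is a sequential CKY-style recognizer over an invented toy grammar.

        # A minimal sequential chart (CKY) recognizer over a toy grammar in
        # Chomsky normal form -- an illustration of chart parsing in general,
        # not the parallel parser described in the project memo.
        from itertools import product

        GRAMMAR = {                      # (left child, right child) -> parents
            ("NP", "VP"): {"S"},
            ("Det", "N"): {"NP"},
            ("V", "NP"): {"VP"},
        }
        LEXICON = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}

        def recognize(words):
            n = len(words)
            # chart[i][j] holds every category that spans words[i:j]
            chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
            for i, w in enumerate(words):
                chart[i][i + 1] = set(LEXICON.get(w, set()))
            for span in range(2, n + 1):
                for i in range(n - span + 1):
                    j = i + span
                    for k in range(i + 1, j):        # try every split point
                        for pair in product(chart[i][k], chart[k][j]):
                            chart[i][j] |= GRAMMAR.get(pair, set())
            return "S" in chart[0][n]

        print(recognize("the dog saw the cat".split()))   # True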

  9. Using a Statistical Natural Language Parser Augmented with the UMLS Specialist Lexicon to Assign SNOMED CT Codes to Anatomic Sites and Pathologic Diagnoses in Full Text Pathology Reports

    PubMed Central

    Lowe, Henry J.; Huang, Yang; Regula, Donald P.

    2009-01-01

    To address the problem of extracting structured information from pathology reports for research purposes in the STRIDE Clinical Data Warehouse, we adapted the ChartIndex Medical Language Processing system to automatically identify and map anatomic and diagnostic noun phrases found in full-text pathology reports to SNOMED CT concept descriptors. An evaluation of the system's performance showed a positive predictive value for anatomic concepts of 92.3% and positive predictive value for diagnostic concepts of 84.4%. The experiment also suggested strategies for improving ChartIndex's performance in coding pathology reports. PMID:20351885

  10. Natural-Language Parser for PBEM

    NASA Technical Reports Server (NTRS)

    James, Mark

    2010-01-01

    A computer program called "Hunter" accepts, as input, a colloquial-English description of a set of policy-based-management rules, and parses that description into a form useable by policy-based enterprise management (PBEM) software. PBEM is a rules-based approach suitable for automating some management tasks. PBEM simplifies the management of a given enterprise through establishment of policies addressing situations that are likely to occur. Hunter was developed to have a unique capability to extract the intended meaning instead of focusing on parsing the exact ways in which individual words are used.

  11. An Introductory Lisp Parser.

    ERIC Educational Resources Information Center

    Loritz, Donald

    1987-01-01

    Gives a short grammar of the Lisp computer language. Presents an introductory English parser (Simparse) as an example of how to write a parser in Lisp. Lists references for further explanation. Intended as preparation for teachers who may use computer-assisted language instruction in the future. (LMO)

  12. Parsers in Tutors: What Are They Good For?

    ERIC Educational Resources Information Center

    Holland, Melissa V.; And Others

    1993-01-01

    Possibilities and limitations of a natural language processing technology, with its central engine, the parser, are discussed. Observations are drawn from a project by the U.S. Army Research Institute to develop a German tutor, the BRIDGE, which revolves around a parser. (Contains 19 references.) (Author/LB)

  13. Speed up of XML parsers with PHP language implementation

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2012-11-01

    In this paper, the authors introduce PHP5's XML implementation and show how to read, parse, and write a short, uncomplicated XML file using SimpleXML in a PHP environment. The possibilities for joint use of the PHP5 language and the XML standard are described, and the details of the parsing process with SimpleXML are clarified. A practical PHP-XML-MySQL project presents the advantages of XML implementation in PHP modules. This approach allows a comparatively simple search of XML hierarchical data by means of PHP software tools. The proposed project includes a database, which can be extended with new data and new XML parsing functions.
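
    The record above concerns PHP5 and SimpleXML; since this listing does not include the authors' code, the sketch below shows the same read-parse-search pattern in Python with the standard-library xml.etree.ElementTree module. The document and element names are invented for illustration.

        # Read, parse, and query a small XML document -- the same pattern the
        # abstract describes for SimpleXML, shown here with Python's ElementTree.
        import xml.etree.ElementTree as ET

        XML_DOC = """
        <catalog>
          <book id="1"><title>Parsing Techniques</title><year>2008</year></book>
          <book id="2"><title>Compilers</title><year>2006</year></book>
        </catalog>
        """

        root = ET.fromstring(XML_DOC)        # parse the text into an element tree
        for book in root.findall("book"):    # simple hierarchical search
            print(book.get("id"), book.findtext("title"), book.findtext("year"))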

  14. A natural language interface to databases

    NASA Technical Reports Server (NTRS)

    Ford, D. R.

    1988-01-01

    The development of a Natural Language Interface which is semantic-based and uses Conceptual Dependency representation is presented. The system was developed using Lisp and currently runs on a Symbolics Lisp machine. A key point is that the parser handles morphological analysis, which expands its capabilities of understanding more words.

  15. Conjunction, Ellipsis, and Other Discontinuous Constituents in the Constituent Object Parser.

    ERIC Educational Resources Information Center

    Metzler, Douglas P.; And Others

    1990-01-01

    Describes the Constituent Object Parser (COP), a domain independent syntactic parser developed for use in information retrieval and similar applications. The syntactic structure of natural language entities is discussed, and the mechanisms by which COP handles the problems of conjunctions, ellipsis, and discontinuous constituents are explained.

  17. Expression and cut parser for CMS event data

    NASA Astrophysics Data System (ADS)

    Lista, Luca; Jones, Christopher D.; Petrucciani, Giovanni

    2010-04-01

    We present a parser to evaluate expressions and Boolean selections that is applied on CMS event data for event filtering and analysis purposes. The parser is based on Boost Spirit grammar definition, and uses Reflex dictionaries for class introspection. The parser allows for a natural definition of expressions and cuts in users' configurations, and provides good runtime performance compared to other existing parsers.
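
    As a rough illustration of what an expression-and-cut evaluator does (the CMS parser itself is built on Boost Spirit and Reflex and is not reproduced here), the sketch below evaluates strings such as "pt > 20 and abs(eta) < 2.4" against a dictionary of event quantities, leaning on Python's ast module for the parsing step. The variable names and the example cut are invented.

        # A minimal expression-and-cut evaluator over a dict of event variables.
        # This is an illustrative sketch only, not the CMS implementation.
        import ast, math, operator

        ALLOWED_FUNCS = {"abs": abs, "sqrt": math.sqrt}   # whitelisted functions

        def evaluate_cut(expression, event):
            tree = ast.parse(expression, mode="eval")

            def ev(node):
                if isinstance(node, ast.Expression):
                    return ev(node.body)
                if isinstance(node, ast.Constant):
                    return node.value
                if isinstance(node, ast.Name):            # event quantity lookup
                    return event[node.id]
                if isinstance(node, ast.BinOp):           # arithmetic
                    ops = {ast.Add: operator.add, ast.Sub: operator.sub,
                           ast.Mult: operator.mul, ast.Div: operator.truediv}
                    return ops[type(node.op)](ev(node.left), ev(node.right))
                if isinstance(node, ast.Compare):         # e.g. pt > 20
                    cmps = {ast.Gt: operator.gt, ast.Lt: operator.lt,
                            ast.GtE: operator.ge, ast.LtE: operator.le}
                    left, result = ev(node.left), True
                    for op, comparator in zip(node.ops, node.comparators):
                        right = ev(comparator)
                        result = result and cmps[type(op)](left, right)
                        left = right
                    return result
                if isinstance(node, ast.BoolOp):          # and / or of sub-cuts
                    vals = [ev(v) for v in node.values]
                    return all(vals) if isinstance(node.op, ast.And) else any(vals)
                if isinstance(node, ast.Call):            # whitelisted calls only
                    return ALLOWED_FUNCS[node.func.id](*[ev(a) for a in node.args])
                raise ValueError(f"unsupported syntax: {ast.dump(node)}")

            return ev(tree)

        print(evaluate_cut("pt > 20 and abs(eta) < 2.4",
                           {"pt": 35.0, "eta": -1.1}))    # True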

  18. Can Parsers Be a Legitimate Pedagogical Tool?

    ERIC Educational Resources Information Center

    Vinther, Jane

    2004-01-01

    Parsers are rarely used in language instruction as a primary tool towards a pedagogical end. Visual Interactive Syntax Learning (=VISL) is a programme which is basically a parser put to pedagogical use. It can analyse sentences in 13 different languages, the most advanced programmes being English and Portuguese. The pedagogical purpose is to teach…

  19. Toward a theory of distributed word expert natural language parsing

    NASA Technical Reports Server (NTRS)

    Rieger, C.; Small, S.

    1981-01-01

    An approach to natural language meaning-based parsing in which the unit of linguistic knowledge is the word rather than the rewrite rule is described. In the word expert parser, knowledge about language is distributed across a population of procedural experts, each representing a word of the language, and each an expert at diagnosing that word's intended usage in context. The parser is structured around a coroutine control environment in which the generator-like word experts ask questions and exchange information in coming to collective agreement on sentence meaning. The word expert theory is advanced as a better cognitive model of human language expertise than the traditional rule-based approach. The technical discussion is organized around examples taken from the prototype LISP system which implements parts of the theory.
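
    The coroutine-style control the abstract describes can be suggested with a toy sketch in which each "word expert" is a Python generator that asks about its context before committing to a sense. The two experts and their senses below are invented stand-ins, not the LISP prototype.

        # Schematic sketch of the word-expert idea: each word is a small
        # procedure that inspects its neighbours and reports its contribution
        # to sentence meaning. Generators stand in for coroutines.
        def expert_deep(ctx):
            right = yield "need right neighbour"      # ask about context
            ctx["deep"] = "PROFOUND" if right == "pit" else "INTENSE"

        def expert_pit(ctx):
            yield "no questions"
            ctx["pit"] = "HOLE-IN-GROUND"

        EXPERTS = {"deep": expert_deep, "pit": expert_pit}

        def parse(words):
            meaning = {}
            for i, w in enumerate(words):
                coro = EXPERTS[w](meaning)
                next(coro)                            # run until it asks a question
                try:
                    right = words[i + 1] if i + 1 < len(words) else None
                    coro.send(right)                  # answer with the neighbour
                except StopIteration:
                    pass
            return meaning

        print(parse(["deep", "pit"]))
        # {'deep': 'PROFOUND', 'pit': 'HOLE-IN-GROUND'}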

  20. Natural Language Processing.

    ERIC Educational Resources Information Center

    Chowdhury, Gobinda G.

    2003-01-01

    Discusses issues related to natural language processing, including theoretical developments; natural language understanding; tools and techniques; natural language text processing systems; abstracting; information extraction; information retrieval; interfaces; software; Internet, Web, and digital library applications; machine translation for

  1. Errors and Intelligence in Computer-Assisted Language Learning: Parsers and Pedagogues. Routledge Studies in Computer Assisted Language Learning

    ERIC Educational Resources Information Center

    Heift, Trude; Schulze, Mathias

    2012-01-01

    This book provides the first comprehensive overview of theoretical issues, historical developments and current trends in ICALL (Intelligent Computer-Assisted Language Learning). It assumes a basic familiarity with Second Language Acquisition (SLA) theory and teaching, CALL and linguistics. It is of interest to upper undergraduate and/or graduate

  3. PASCAL LR(1) Parser Generator System

    Energy Science and Technology Software Center (ESTSC)

    1988-05-04

    LRSYS is a complete LR(1) parser generator system written entirely in a portable subset of Pascal. The system, LRSYS, includes a grammar analyzer program (LR) which reads a context-free (BNF) grammar as input and produces LR(1) parsing tables as output, a lexical analyzer generator (LEX) which reads regular expressions created by the REG process as input and produces lexical tables as output, and various parser skeletons that get merged with the tables to produce complete parsers (SMAKE). Current parser skeletons include Pascal, FORTRAN 77, and C. Other language skeletons can easily be added to the system. LRSYS is based on the LR program.
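
    The skeleton-plus-tables division of labor can be pictured with a toy table-driven parser: the ACTION and GOTO tables below were written by hand for the tiny grammar E -> E + n | n, whereas a generator such as LRSYS derives such tables automatically from the BNF input and merges them with a language-specific skeleton. The sketch is illustrative only.

        # Table-driven LR-style parsing with hand-built tables for E -> E + n | n.
        ACTION = {                      # (state, lookahead) -> shift/reduce/accept
            (0, "n"): ("shift", 2),
            (1, "+"): ("shift", 3), (1, "$"): ("accept",),
            (2, "+"): ("reduce", "E", 1), (2, "$"): ("reduce", "E", 1),
            (3, "n"): ("shift", 4),
            (4, "+"): ("reduce", "E", 3), (4, "$"): ("reduce", "E", 3),
        }
        GOTO = {(0, "E"): 1}            # state to enter after a reduction

        def parse(tokens):
            stack, tokens = [0], tokens + ["$"]
            while True:
                act = ACTION.get((stack[-1], tokens[0]))
                if act is None:
                    return False                     # syntax error
                if act[0] == "accept":
                    return True
                if act[0] == "shift":
                    stack.append(act[1])
                    tokens.pop(0)
                else:                                # reduce: pop RHS, push GOTO state
                    _, lhs, rhs_len = act
                    del stack[-rhs_len:]
                    stack.append(GOTO[(stack[-1], lhs)])

        print(parse(["n", "+", "n"]))   # True
        print(parse(["n", "+", "+"]))   # False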

  4. Natural language parsing in a hybrid connectionist-symbolic architecture

    NASA Astrophysics Data System (ADS)

    Mueller, Adrian; Zell, Andreas

    1991-03-01

    Most connectionist parsers either cannot guarantee the correctness of their derivations or have to simulate a serial flow of control. In the first case, users have to restrict the tasks of the parser (e.g., parse less complex or shorter sentences) or they need to trust the soundness of the result. In the second case, the resulting network loses most of its attractiveness because seriality needs to be hard-coded into the structure of the net. We here present a hybrid symbolic-connectionist parser, which was designed to fulfill the following goals: (1) parsing of sentences without length restriction, (2) soundness and completeness for any context-free grammar, and (3) learning the applicability of parsing rules with a neural network. Our hybrid architecture consists of a serial parsing algorithm and a trainable net. BrainC (Backtracking and Backpropagation in C) combines the well-known shift-reduce parsing technique (with backtracking) with a backpropagation network to learn and represent the typical properties of the trained natural language grammars. The system has been implemented as a subsystem of the Rochester Connectionist Simulator (RCS) on SUN workstations and was tested with several grammars for English and German. We discuss how BrainC reached its design goals and what results we observed.
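
    A schematic view of the control loop described above: classic shift-reduce parsing in which the choice between SHIFT and REDUCE is delegated to a scoring function, the slot BrainC fills with a backpropagation network. The grammar and the stand-in scorer below are invented, and the sketch omits backtracking entirely.

        # Shift-reduce loop with a pluggable action scorer (stand-in for a net).
        GRAMMAR = {("Det", "N"): "NP", ("V", "NP"): "VP", ("NP", "VP"): "S"}

        def rule_for(stack):
            return GRAMMAR.get(tuple(stack[-2:])) if len(stack) >= 2 else None

        def score_actions(stack, buffer):
            """Stand-in for the trained network: prefer REDUCE when a rule applies."""
            return {"REDUCE": 1.0 if rule_for(stack) else 0.0,
                    "SHIFT": 0.5 if buffer else 0.0}

        def parse(tags):
            stack, buffer = [], list(tags)
            while buffer or rule_for(stack):
                action = max(score_actions(stack, buffer),
                             key=score_actions(stack, buffer).get)
                if action == "REDUCE" and rule_for(stack):
                    stack[-2:] = [rule_for(stack)]   # replace children by parent
                elif buffer:
                    stack.append(buffer.pop(0))      # SHIFT the next category
                else:
                    break
            return stack

        print(parse(["Det", "N", "V", "Det", "N"]))   # ['S']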

  5. A natural language parsing system for encoding admitting diagnoses.

    PubMed Central

    Haug, P. J.; Christensen, L.; Gundersen, M.; Clemons, B.; Koehler, S.; Bauer, K.

    1997-01-01

    Free-text or natural language documents make up an increasing part of the computerized medical record. While they do provide accessible clinical information to health care personnel, they fail to support processes that require clinical data coded according to a shared lexicon and data structure. We have developed a natural language parser that converts free-text admitting diagnoses into a coded form. This application has proven acceptably accurate in the experimental laboratory to warrant a test in the target clinical environment. Here we describe an approach to moving this research application into a production environment where it can contribute to the efforts of the Health Information Services Department. This transition is essential if the products of natural language understanding research are to contribute to patient care in a routine and sustainable way. PMID:9357738

  6. A Parser for On-Line Search System Evaluation.

    ERIC Educational Resources Information Center

    Tremain, Russ; Cooper, Michael D.

    1983-01-01

    Describes computer program which converts text of user input and system responses from online search system into fixed format records describing interaction. Query language syntax, format of output record produced by parser, problems in constructing parser, its performance characteristics, and recommendations for improving process of logging…

  7. La Description des langues naturelles en vue d'applications linguistiques: Actes du colloque (The Description of Natural Languages with a View to Linguistic Applications: Conference Papers). Publication K-10.

    ERIC Educational Resources Information Center

    Ouellon, Conrad, Comp.

    Presentations from a colloquium on applications of research on natural languages to computer science address the following topics: (1) analysis of complex adverbs; (2) parser use in computerized text analysis; (3) French language utilities; (4) lexicographic mapping of official language notices; (5) phonographic codification of Spanish; (6)…

  8. Natural language generation

    NASA Astrophysics Data System (ADS)

    Maybury, Mark T.

    The goal of natural language generation is to replicate human writers or speakers: to generate fluent, grammatical, and coherent text or speech. Produced language, using both explicit and implicit means, must clearly and effectively express some intended message. This demands the use of a lexicon and a grammar together with mechanisms which exploit semantic, discourse and pragmatic knowledge to constrain production. Furthermore, special processors may be required to guide focus, extract presuppositions, and maintain coherency. As with interpretation, generation may require knowledge of the world, including information about the discourse participants as well as knowledge of the specific domain of discourse. All of these processes and knowledge sources must cooperate to produce well-written, unambiguous language. Natural language generation has received less attention than language interpretation due to the nature of language: it is important to interpret all the ways of expressing a message but we need to generate only one. Furthermore, the generative task can often be accomplished by canned text (e.g., error messages or user instructions). The advent of more sophisticated computer systems, however, has intensified the need to express multisentential English.

  9. Designing and Implementing a Syntactic Parser.

    ERIC Educational Resources Information Center

    Sanders, Alton; Sanders, Ruth

    1987-01-01

    Describes the development in progress of a syntactic parser of German called "Syncheck," which uses the programming language "Prolog." The grammar is written in a formalism called "Definite Clause Grammar." The purpose of "Syncheck" is to provide advice on grammatical correctness to intermediate and advanced college students of German. (Author/LMO)

  10. The parser generator as a general purpose tool

    NASA Technical Reports Server (NTRS)

    Noonan, R. E.; Collins, W. R.

    1985-01-01

    The parser generator has proven to be an extremely useful, general purpose tool. It can be used effectively by programmers having only a knowledge of grammars and no training at all in the theory of formal parsing. Some of the application areas for which a table-driven parser can be used include interactive query languages, menu systems, translators, and programming support tools. Each of these is illustrated by an example grammar.

  11. Towards Authentic Tasks and Experiences: The Example of Parser-based CALL.

    ERIC Educational Resources Information Center

    Schulze, Mathias; Hamel, Marie-Josee

    2000-01-01

    Discusses ways of achieving productive use of authentic language in a computer-assisted language learning (CALL) environment. Concentrates on a particular area of CALL--parser-based CALL. Outlines two existing parsers for two CALL tools that support text production by encouraging learners to concentrate on the linguistic structure of a text…

  12. Left-corner unification-based natural language processing

    SciTech Connect

    Lytinen, S.L.; Tomuro, N.

    1996-12-31

    In this paper, we present an efficient algorithm for parsing natural language using unification grammars. The algorithm is an extension of left-corner parsing, a bottom-up algorithm which utilizes top-down expectations. The extension exploits unification grammar's uniform representation of syntactic, semantic, and domain knowledge, by incorporating all types of grammatical knowledge into parser expectations. In particular, we extend the notion of the reachability table, which provides information as to whether or not a top-down expectation can be realized by a potential subconstituent, by including all types of grammatical information in table entries, rather than just phrase structure information. While our algorithm's worst-case computational complexity is no better than that of many other algorithms, we present empirical testing in which average-case linear time performance is achieved. Our testing indicates this to be much improved average-case performance over previous left-corner techniques.
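
    For a bare phrase-structure grammar, the reachability (left-corner) table mentioned above can be pictured as the transitive closure of the "can begin with" relation between categories. The toy grammar below is invented; a unification-grammar version would store feature structures rather than atomic symbols in the table entries.

        # Compute a left-corner reachability table for a toy CFG.
        GRAMMAR = {
            "S":  [["NP", "VP"]],
            "NP": [["Det", "N"], ["Pron"]],
            "VP": [["V", "NP"]],
        }

        def left_corner_table(grammar):
            # reach[X] = categories that can appear as the left corner of an X
            reach = {x: {x} for x in grammar}
            changed = True
            while changed:                      # transitive closure by iteration
                changed = False
                for lhs, rules in grammar.items():
                    for rhs in rules:
                        first = rhs[0]
                        new = {first} | reach.get(first, {first})
                        if not new <= reach[lhs]:
                            reach[lhs] |= new
                            changed = True
            return reach

        TABLE = left_corner_table(GRAMMAR)
        print(TABLE["S"] == {"S", "NP", "Det", "Pron"})   # True
        print("Det" in TABLE["S"])    # a Det can begin an S, so the expectation is viable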

  13. Readings in natural language processing

    SciTech Connect

    Grosz, B.J.; Jones, K.S.; Webber, B.L.

    1986-01-01

    The book presents papers on natural language processing, focusing on the central issues of representation, reasoning, and recognition. The introduction discusses theoretical issues, historical developments, and current problems and approaches. The book presents work in syntactic models (parsing and grammars), semantic interpretation, discourse interpretation, language action and intentions, language generation, and systems.

  14. Designing a Constraint Based Parser for Sanskrit

    NASA Astrophysics Data System (ADS)

    Kulkarni, Amba; Pokar, Sheetal; Shukl, Devanand

    Verbal understanding (śābdabodha) of any utterance requires the knowledge of how words in that utterance are related to each other. Such knowledge is usually available in the form of cognition of grammatical relations. Generative grammars describe how a language codes these relations. Thus the knowledge of what information various grammatical relations convey is available from the generation point of view and not the analysis point of view. In order to develop a parser based on any grammar one should then know precisely the semantic content of the grammatical relations expressed in a language string, the clues for extracting these relations and finally whether these relations are expressed explicitly or implicitly. Based on the design principles that emerge from this knowledge, we model the parser as finding a directed tree, given a graph with nodes representing the words and edges representing the possible relations between them. Further, we also use the Mīmāṃsā constraint of ākāṅkṣā (expectancy) to rule out non-solutions and sannidhi (proximity) to prioritize the solutions. We have implemented a parser based on these principles and its performance was found to be satisfactory, giving us confidence to extend its functionality to handle complex sentences.
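
    A toy rendering of the model described above: words are nodes, candidate grammatical relations are directed edges, and a parse is a choice of one incoming edge per word that forms a tree rooted at the verb, filtered by an expectancy-style constraint. The clause, the candidate edges, and the single constraint below are invented English stand-ins, not the authors' implementation.

        # Choose a dependency tree from a graph of candidate relations.
        from itertools import product

        ROOT = "eats"                                # verb of a toy clause
        # candidate edges: dependent -> list of (head, relation)
        CANDIDATES = {
            "Rama":  [("eats", "karta (agent)"), ("fruit", "genitive")],
            "fruit": [("eats", "karma (object)")],
        }

        def is_tree(assignment):
            """Every word must reach the root by following its chosen head."""
            for word in assignment:
                seen, w = set(), word
                while w != ROOT:
                    if w in seen:                    # cycle -> not a tree
                        return False
                    seen.add(w)
                    w = assignment[w][0]
            return True

        def satisfies_expectancy(assignment):
            # toy akanksha-style check: the verb expects exactly one agent
            agents = [d for d, (h, r) in assignment.items() if r.startswith("karta")]
            return len(agents) == 1

        solutions = []
        for choice in product(*CANDIDATES.values()):
            assignment = dict(zip(CANDIDATES.keys(), choice))
            if is_tree(assignment) and satisfies_expectancy(assignment):
                solutions.append(assignment)

        print(solutions)
        # [{'Rama': ('eats', 'karta (agent)'), 'fruit': ('eats', 'karma (object)')}]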

  15. A natural language interface for real-time dialogue in the flight domain

    NASA Technical Reports Server (NTRS)

    Ali, M.; Ai, C.-S.; Ferber, H. J.

    1986-01-01

    A flight expert system (FLES) is being developed to assist pilots in monitoring, diagnosing and recovering from in-flight faults. To provide a communications interface between the flight crew and FLES, a natural language interface (NALI) has been implemented. Input to NALI is processed by three processors: (1) the semantic parser, (2) the knowledge retriever, and (3) the response generator. The architecture of NALI has been designed to process both temporal and nontemporal queries. Provisions have also been made to reduce the number of system modifications required for adapting NALI to other domains. This paper describes the architecture and implementation of NALI.

  16. Prolog implementation of lexical functional grammar as a base for a natural language processing system

    SciTech Connect

    Frey, W.; Reyle, U.

    1983-01-01

    The authors present a system which constructs a database out of a narrative natural language text. Firstly, they give a detailed description of the PROLOG implementation of the parser, which is based on the theory of lexical functional grammar (LFG). They show that PROLOG provides an efficient tool for LFG implementation. Secondly, they postulate some requirements a semantic representation has to fulfil in order to be able to analyse whole texts. They show how Kamp's theory meets these requirements by analysing sample discourses involving anaphoric NPs. 4 references.

  17. Distributed problem solving and natural language understanding models

    NASA Technical Reports Server (NTRS)

    Rieger, C.

    1980-01-01

    A theory of organization and control for a meaning-based language understanding system is mapped out. In this theory, words, rather than rules, are the units of knowledge, and assume the form of procedural entities which execute as generator-like coroutines. Parsing a sentence in context demands a control environment in which experts can ask questions of each other, forward hints and suggestions to each other, and suspend. The theory is a cognitive theory of both language representation and parser control.

  18. Advances in natural language processing.

    PubMed

    Hirschberg, Julia; Manning, Christopher D

    2015-07-17

    Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area. PMID:26185244

  19. A Table Look-Up Parser in Online ILTS Applications

    ERIC Educational Resources Information Center

    Chen, Liang; Tokuda, Naoyuki; Hou, Pingkui

    2005-01-01

    A simple table look-up parser (TLUP) has been developed for parsing and consequently diagnosing syntactic errors in semi-free formatted learners' input sentences of an intelligent language tutoring system (ILTS). The TLUP finds a parse tree for a correct version of an input sentence, diagnoses syntactic errors of the learner by tracing and…

  1. Connectionist natural language parsing with BrainC

    NASA Astrophysics Data System (ADS)

    Mueller, Adrian; Zell, Andreas

    1991-08-01

    A close examination of pure neural parsers shows that they either could not guarantee the correctness of their derivations or had to hard-code seriality into the structure of the net. The authors therefore decided to use a hybrid architecture, consisting of a serial parsing algorithm and a trainable net. The system fulfills the following design goals: (1) parsing of sentences without length restriction, (2) soundness and completeness for any context-free language, and (3) learning the applicability of parsing rules with a neural network to increase the efficiency of the whole system. BrainC (Backtracking and Backpropagation in C) combines the well-known shift-reduce parsing technique (with backtracking) with a backpropagation network to learn and represent typical structures of the trained natural language grammars. The system has been implemented as a subsystem of the Rochester Connectionist Simulator (RCS) on SUN workstations and was tested with several grammars for English and German. The design of the system and then the results are discussed.

  2. A Lex-based MAD parser and its applications

    SciTech Connect

    Oleg Krivosheev et al.

    2001-07-03

    An embeddable and portable Lex-based MAD language parser has been developed. The parser consists of a front-end which reads a MAD file and keeps beam elements, beam line data and algebraic expressions in tree-like structures, and a back-end, which processes the front-end data to generate an input file or data structures compatible with user applications. Three working programs are described, namely, a MAD to C++ converter, a dynamic C++ object factory and a MAD-MARS beam line builder. Design and implementation issues are discussed.
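
    As a rough sketch of the front-end stage described above (tokenize the input, then keep elements in a small in-memory structure for a back-end to consume), the code below handles a single simplified MAD-style element definition. The tokenizer, the one-line input, and the dictionary layout are invented and far simpler than the real MAD grammar.

        # Tokenize and store one simplified MAD-style element definition.
        import re

        TOKEN_SPEC = [
            ("NAME",   r"[A-Za-z][\w.]*"),
            ("NUMBER", r"[-+]?\d+(?:\.\d*)?"),
            ("PUNCT",  r"[:,=;]"),
            ("SKIP",   r"\s+"),
        ]
        TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})"
                                       for name, pattern in TOKEN_SPEC))

        def tokenize(line):
            """Yield (kind, text) pairs, dropping whitespace."""
            for m in TOKEN_RE.finditer(line):
                if m.lastgroup != "SKIP":
                    yield m.lastgroup, m.group()

        def parse_element(line):
            """Parse 'LABEL: TYPE, attr=value, ...;' into a dictionary."""
            tokens = [text for _, text in tokenize(line)]
            element = {"label": tokens[0], "type": tokens[2], "attributes": {}}
            rest = tokens[3:]
            while rest and rest[0] == ",":           # ", attr = value" groups
                element["attributes"][rest[1]] = float(rest[3])
                rest = rest[4:]
            return element

        print(parse_element("QF: QUADRUPOLE, L=0.5, K1=0.2;"))
        # {'label': 'QF', 'type': 'QUADRUPOLE', 'attributes': {'L': 0.5, 'K1': 0.2}}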

  3. Wild Words: Nature, Language and Outdoor Education.

    ERIC Educational Resources Information Center

    Meisner, Mark

    1993-01-01

    Discusses ways that language misrepresents nature, pointing out that frequently used metaphors and problematic language usage provide limited conceptual and emotional understanding of the natural world and contribute to a degraded view of nature. Discusses strategies for changing language as the first step in changing attitudes toward nature. (LP)

  4. Readings in natural language processing

    SciTech Connect

    Grosz, B.J.; Jones, K.S.; Webber, B.L.

    1986-01-01

    The papers assembled fall naturally into six groups dealing respectively with parsing and grammars, semantic interpretation, discourse interpretation (covering, for example, anaphor resolution), language actions and the intentions underlying them, language generation, and systems (notably interface systems). The chapter headings are treated broadly and are taken to imply either that the authors are adopting a particular position about the way processing, and particularly input processing, should be done, or that problems and solutions assigned to one category have no relevance elsewhere. Many individual papers, placed in their most appropriate categories, also contribute to other areas.

  5. New trends in natural language processing: statistical natural language processing.

    PubMed

    Marcus, M

    1995-10-24

    The field of natural language processing (NLP) has seen a dramatic shift in both research direction and methodology in the past several years. In the past, most work in computational linguistics tended to focus on purely symbolic methods. Recently, more and more work is shifting toward hybrid methods that combine new empirical corpus-based methods, including the use of probabilistic and information-theoretic techniques, with traditional symbolic methods. This work is made possible by the recent availability of linguistic databases that add rich linguistic annotation to corpora of natural language text. Already, these methods have led to a dramatic improvement in the performance of a variety of NLP systems with similar improvement likely in the coming years. This paper focuses on these trends, surveying in particular three areas of recent progress: part-of-speech tagging, stochastic parsing, and lexical semantics. PMID:7479725

  6. Natural language in the DP world

    SciTech Connect

    Kaplan, S.J.; Ferris, D.

    1982-08-01

    A natural language system allows a computer user to interact with the machine in an ordinary language such as English or French. The article examines the problems in developing such systems. It also discusses the viability and desirability of natural language systems as an alternative to conventional programming languages.

  7. An efficient pancreatic cyst identification methodology using natural language processing.

    PubMed

    Mehrabi, Saeed; Schmidt, C Max; Waters, Joshua A; Beesley, Chris; Krishnan, Anand; Kesterson, Joe; Dexter, Paul; Al-Haddad, Mohammed A; Tierney, William M; Palakal, Mathew

    2013-01-01

    Pancreatic cancer is one of the deadliest cancers, mostly diagnosed at late stages. Patients with pancreatic cysts are at higher risk of developing cancer and their surveillance can help to diagnose the disease in earlier stages. In this retrospective study we collected a corpus of 1064 records from 44 patients at Indiana University Hospital from 1990 to 2012. A Natural Language Processing (NLP) system was developed and used to identify patients with pancreatic cysts. The NegEx algorithm was used initially to identify the negation status of concepts, resulting in precision and recall of 98.9% and 89% respectively. The Stanford Dependency parser (SDP) was then used to improve the NegEx performance, resulting in precision of 98.9% and recall of 95.7%. Features related to pancreatic cysts were also extracted from patient medical records using regular expressions and the NegEx algorithm with 98.5% precision and 97.43% recall. SDP improved the NegEx algorithm by increasing the recall to 98.12%. PMID:23920672
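
    To make the NegEx step concrete, the sketch below flags a concept mention as negated when a trigger phrase appears within a fixed window of left context. This is a heavily simplified illustration with an invented trigger list and window size, not the published algorithm or the dependency-parse refinement the authors evaluate.

        # Window-based negation detection in the spirit of NegEx (simplified).
        import re

        NEGATION_TRIGGERS = r"\b(no|without|denies|negative for)\b"
        CONCEPT = r"\bpancreatic cyst\b"
        WINDOW = 40                     # characters of left context to inspect

        def cyst_mentions(text):
            results = []
            for m in re.finditer(CONCEPT, text, flags=re.IGNORECASE):
                left_context = text[max(0, m.start() - WINDOW):m.start()]
                negated = re.search(NEGATION_TRIGGERS, left_context,
                                    flags=re.IGNORECASE)
                results.append((m.group(), "negated" if negated else "affirmed"))
            return results

        print(cyst_mentions("CT shows a pancreatic cyst in the tail."))
        print(cyst_mentions("There is no pancreatic cyst identified."))
        # [('pancreatic cyst', 'affirmed')]
        # [('pancreatic cyst', 'negated')]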

  8. Syntactic dependency parsers for biomedical-NLP.

    PubMed

    Cohen, Raphael; Elhadad, Michael

    2012-01-01

    Syntactic parsers have made a leap in accuracy and speed in recent years. The high-order structural information provided by dependency parsers is useful for a variety of NLP applications. We present a biomedical model for the EasyFirst parser, a fast and accurate parser for creating Stanford Dependencies. We evaluate the biomedical-domain models of EasyFirst and Clear-Parser on a number of task-oriented metrics. Both parsers provide state-of-the-art speed and accuracy of over 89% on the GENIA corpus. We show that Clear-Parser excels at tasks relating to negation identification while EasyFirst excels at tasks relating to Named Entities and is more robust to changes in domain. PMID:23304280

  9. Integration of speech with natural language understanding.

    PubMed Central

    Moore, R C

    1995-01-01

    The integration of speech recognition with natural language understanding raises issues of how to adapt natural language processing to the characteristics of spoken language; how to cope with errorful recognition output, including the use of natural language information to reduce recognition errors; and how to use information from the speech signal, beyond just the sequence of words, as an aid to understanding. This paper reviews current research addressing these questions in the Spoken Language Program sponsored by the Advanced Research Projects Agency (ARPA). I begin by reviewing some of the ways that spontaneous spoken language differs from standard written language and discuss methods of coping with the difficulties of spontaneous speech. I then look at how systems cope with errors in speech recognition and at attempts to use natural language information to reduce recognition errors. Finally, I discuss how prosodic information in the speech signal might be used to improve understanding. PMID:7479813

  10. Comparing Natural Language Retrieval: WIN and FREESTYLE.

    ERIC Educational Resources Information Center

    Pritchard-Schoch, Teresa

    1995-01-01

    Compares two natural language processing search engines, WIN (WESTLAW Is Natural) and FREESTYLE, developed by LEXIS. Legal issues in natural language queries were presented to identical libraries in both systems. Results showed that the editorials enhanced relevance; a search would be more thorough using both databases; and if only one system were

  11. Parser Combinators: a Practical Application for Generating Parsers for NMR Data

    PubMed Central

    Fenwick, Matthew; Weatherby, Gerard; Ellis, Heidi JC; Gryk, Michael R.

    2013-01-01

    Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of a large data set to the BMRB (Bio-Magnetic Resonance Data Bank). This data is stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex, making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. Translation of the grammatical model to an executable parser may be automatically accomplished. This paper will show how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional programming world known as parser combinators. This paper demonstrates the effectiveness of a principled approach to file specification and parsing. This paper also builds upon our previous work [1], in that 1) it applies concepts from Functional Programming (which is relevant even though the implementation language, Java, is more mainstream than Functional Programming), and 2) all work and accomplishments from this project will be made available under standard open source licenses to provide the community with the opportunity to learn from our techniques and methods. PMID:24352525

  12. Parser Combinators: a Practical Application for Generating Parsers for NMR Data.

    PubMed

    Fenwick, Matthew; Weatherby, Gerard; Ellis, Heidi Jc; Gryk, Michael R

    2013-01-01

    Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of a large data set to the BMRB (Bio-Magnetic Resonance Data Bank). This data is stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex, making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. Translation of the grammatical model to an executable parser may be automatically accomplished. This paper will show how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional programming world known as "parser combinators". This paper demonstrates the effectiveness of a principled approach to file specification and parsing. This paper also builds upon our previous work [1], in that 1) it applies concepts from Functional Programming (which is relevant even though the implementation language, Java, is more mainstream than Functional Programming), and 2) all work and accomplishments from this project will be made available under standard open source licenses to provide the community with the opportunity to learn from our techniques and methods. PMID:24352525
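
    In the parser-combinator style the records above refer to, a parser is just a function from an input position to a (value, new position) pair, and combinators assemble small parsers into bigger ones. The sketch below illustrates the idea in Python on a toy tag/value syntax loosely reminiscent of STAR loops; it is not the authors' Java parser or a real NMR-Star grammar.

        # Minimal parser combinators: a parser maps (text, pos) -> (value, pos) or None.
        import re

        def token(pattern):
            """Primitive parser: skip whitespace, then match a regular expression."""
            def parse(text, pos):
                m = re.compile(r"\s*(" + pattern + r")").match(text, pos)
                return (m.group(1), m.end()) if m else None
            return parse

        def seq(*parsers):
            """Combinator: run parsers in sequence, collecting their results."""
            def parse(text, pos):
                values = []
                for p in parsers:
                    result = p(text, pos)
                    if result is None:
                        return None
                    value, pos = result
                    values.append(value)
                return values, pos
            return parse

        def many(parser):
            """Combinator: apply a parser zero or more times."""
            def parse(text, pos):
                values = []
                while True:
                    result = parser(text, pos)
                    if result is None:
                        return values, pos
                    value, pos = result
                    values.append(value)
            return parse

        entry = seq(token(r"_\S+"), token(r"[^\s_]\S*"))   # a "_tag value" pair
        block = many(entry)

        text = "_Entry.ID 5678  _Entry.Title demo"
        print(block(text, 0)[0])
        # [['_Entry.ID', '5678'], ['_Entry.Title', 'demo']]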

  13. Multilingual environment and natural acquisition of language

    NASA Astrophysics Data System (ADS)

    Takano, Shunichi; Nakamura, Shigeru

    2000-06-01

    Language and humans are not outside of nature. Not only babies but even adults can acquire a new language naturally if they have a natural multilingual environment around them. This is possible because any human has the ability to grasp the whole of a language and, at the same time, language has an order which is the easiest for humans to acquire. The process of this natural acquisition and a result of investigating the order of Japanese vowels are introduced.

  14. Talking with computers in natural language

    SciTech Connect

    Werner, T.R.

    1986-01-01

    Great efforts have been made to find a solution to the problem of communication with computers. Two approaches can be distinguished: (1) making the computer language similar to the natural language; (2) making the user's language resemble that of computers through formalisation of the former. This book deals with the first approach: those systems are considered which make it possible to "talk" with the user in limited natural language. Contents: Theories and Principles of Designing the Model of Conversation. - Principles of Construction and General Organization of the Model of a Participant in the Conversation. - The Structure of Knowledge and Methods of Representing Reality: Knowledge About the Surrounding Environment. - The System's Knowledge About the Language and the Participants in the Conversation. - Input Sentence Analysis. - Connected-Text (Discourse) Processing. - Synthesis of Statements in Natural Language. - Work to Date on Designing Systems of Conversation. - Afterword. - References. - Subject Index.

  15. Natural language interface for command and control

    NASA Technical Reports Server (NTRS)

    Shuler, Robert L., Jr.

    1986-01-01

    A working prototype of a flexible 'natural language' interface for command and control situations is presented. This prototype is analyzed from two standpoints. First is the role of natural language for command and control, its realistic requirements, and how well the role can be filled with current practical technology. Second, technical concepts for implementation are discussed and illustrated by their application in the prototype system. It is also shown how adaptive or 'learning' features can greatly ease the task of encoding language knowledge in the language processor.

  16. Interpreting natural language queries using the UMLS.

    PubMed Central

    Johnson, S. B.; Aguirre, A.; Peng, P.; Cimino, J.

    1993-01-01

    This paper describes AQUA (A QUery Analyzer), the natural language front end of a prototype information retrieval system. AQUA translates a user's natural language query into a representation in the Conceptual Graph formalism. The graph is then used by subsequent components to search various resources such as databases of the medical literature. The focus of the parsing method is on semantics rather than syntax, with semantic restrictions being provided by the UMLS Semantic Net. The intent of the approach is to provide a method that can be emulated easily in applications that require simple natural language interfaces. PMID:8130481

  17. Language and the Multisemiotic Nature of Mathematics

    ERIC Educational Resources Information Center

    de Oliveira, Luciana C.; Cheng, Dazhi

    2011-01-01

    This article explores how language and the multisemiotic nature of mathematics can present potential challenges for English language learners (ELLs). Based on two qualitative studies of the discourse of mathematics, we discuss some of the linguistic challenges of mathematics for ELLs in order to highlight the potential difficulties they may have

  18. Introduction: Natural Language Processing and Information Retrieval.

    ERIC Educational Resources Information Center

    Smeaton, Alan F.

    1990-01-01

    Discussion of research into information and text retrieval problems highlights the work with automatic natural language processing (NLP) that is reported in this issue. Topics discussed include the occurrences of nominal compounds; anaphoric references; discontinuous language constructs; automatic back-of-the-book indexing; and full-text analysis.…

  19. A System for Natural Language Sentence Generation.

    ERIC Educational Resources Information Center

    Levison, Michael; Lessard, Gregory

    1992-01-01

    Describes the natural language computer program, "Vinci." Explains that using an attribute grammar formalism, Vinci can simulate components of several current linguistic theories. Considers the design of the system and its applications in linguistic modelling and second language acquisition research. Notes Vinci's uses in linguistics instruction

  20. A Natural Language Interface to Databases

    NASA Technical Reports Server (NTRS)

    Ford, D. R.

    1990-01-01

    The development of a Natural Language Interface (NLI) is presented which is semantic-based and uses Conceptual Dependency representation. The system was developed using Lisp and currently runs on a Symbolics Lisp machine.

  1. Dependency parsing for medical language and concept representation.

    PubMed

    Steimann, F

    1998-01-01

    The theory of conceptual structures serves as a common basis for natural language processing and medical concept representation. We present a PROLOG-based formalization of dependency grammar that can accommodate conceptual structures in its dependency rules. First results indicate that this formalization provides an operational basis for the implementation of medical language parsers and for the design of medical concept representation languages. PMID:9475953

  2. Natural Language Description of Emotion

    ERIC Educational Resources Information Center

    Kazemzadeh, Abe

    2013-01-01

    This dissertation studies how people describe emotions with language and how computers can simulate this descriptive behavior. Although many non-human animals can express their current emotions as social signals, only humans can communicate about emotions symbolically. This symbolic communication of emotion allows us to talk about emotions that we…

  4. Knowledge engineering approach to natural language understanding

    SciTech Connect

    Shapiro, S.C.; Neal, J.G.

    1982-01-01

    The authors describe the results of a preliminary study of a knowledge engineering approach to natural language understanding. A computer system is being developed to handle the acquisition, representation, and use of linguistic knowledge. The computer system is rule-based and utilizes a semantic network for knowledge storage and representation. In order to facilitate the interaction between user and system, input of linguistic knowledge and computer responses are in natural language. Knowledge of various types can be entered and utilized: syntactic and semantic; assertions and rules. The inference tracing facility is also being developed as a part of the rule-based system with output in natural language. A detailed example is presented to illustrate the current capabilities and features of the system. 12 references.

  5. Natural Language Information Retrieval: Progress Report.

    ERIC Educational Resources Information Center

    Perez-Carballo, Jose; Strzalkowski, Tomek

    2000-01-01

    Reports on the progress of the natural language information retrieval project, a joint effort led by GE (General Electric) Research, and its evaluation at the sixth TREC (Text Retrieval Conference). Discusses stream-based information retrieval, which uses alternative methods of document indexing; advanced linguistic streams; weighting; and query

  6. Diversity Writing: Natural Languages, Authentic Voices

    ERIC Educational Resources Information Center

    Marzluf, Phillip P.

    2006-01-01

    Though diversity serves as a valuable source for rhetorical inquiry, expressivist instructors who privilege diversity writing may also overemphasize the essential authenticity of their students' vernaculars. This romantic and salvationist impulse reveals the troubling implications of eighteenth-century Natural Language Theory and may,

  7. Enhanced Text Retrieval Using Natural Language Processing.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth D.

    1998-01-01

    Defines natural language processing (NLP); describes the use of NLP in information retrieval (IR); provides seven levels of linguistic analysis: phonological, morphological, lexical, syntactic, semantic, discourse, and pragmatic. Discusses the commercial use of NLP in IR with the example of DR-LINK (Document Retrieval using LINguistic Knowledge)…

  9. Two Interpretive Systems for Natural Language?

    ERIC Educational Resources Information Center

    Frazier, Lyn

    2015-01-01

    It is proposed that humans have available to them two systems for interpreting natural language. One system is familiar from formal semantics. It is a type based system that pairs a syntactic form with its interpretation using grammatical rules of composition. This system delivers both plausible and implausible meanings. The other proposed system

  10. A Priori Analysis of Natural Language Queries.

    ERIC Educational Resources Information Center

    Spiegler, Israel; Elata, Smadar

    1988-01-01

    Presents a model for the a priori analysis of natural language queries which uses an algorithm to transform the query into a logical pattern that is used to determine the answerability of the query. The results of testing by a prototype system implemented in PROLOG are discussed. (20 references) (CLB)

  11. Brain readiness and the nature of language

    PubMed Central

    Bouchard, Denis

    2015-01-01

    To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements, are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their “representations” may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the unique means of representation that allow perceptual forms and contents to be linked into signs. PMID:26441751

  12. Brain readiness and the nature of language.

    PubMed

    Bouchard, Denis

    2015-01-01

    To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements, are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their "representations" may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the unique means of representation that allow perceptual forms and contents to be linked into signs. PMID:26441751

  13. Current trends with natural language processing.

    PubMed

    Rassinoux, A M; Michel, P A; Wagner, J; Baud, R

    1995-01-01

    Natural Language Processing in the medical domain becomes more and more powerful, efficient, and ready to be used in daily practice. The needs for such tools are enormous in the medical field, due to the vast amount of written texts for medical records. In the authors' point of view, the Electronic Patient Record (EPR) is achieved neither with Information Systems of all kinds nor with commercially available word processing systems. Natural Language Processing (NLP) is one dimension of the EPR, as well as Image Processing and Decision Support Systems. Analysis of medical texts to facilitate indexing and retrieval is well known. The need for a generation tool is to produce progress notes from menu driven systems. The computer systems of tomorrow cannot miss any single dimension. Since 1988, we've been developing an NLP system; it is supported by the European program AIM (Advanced Informatics in Medicine) within the GALEN and HELIOS consortium and the CERS (Commission d'Encouragement á la Recherche Scientifique) in Switzerland. The main directions of development are: a medical language analyzer, a language generator, a query processor, and dictionary building tools to support the Medical Linguistic Knowledge Base (MLKB). The knowledge representation schema is essentially based on Sowa's conceptual graphs, and the MLKB is multilingual from its design phase; it currently incorporates the English and the French languages; it will also continue using German. The goal of this demonstration is to provide evidence of what exists today, what will be soon available, and what is planned for the long term. Complete sentences will be processed in real time, and the browsing capabilities of the MLKB will be exercised. In particular, the following features will be presented: Analysis of complete sentences with verbs and relatives, as extracted from clinical narratives, with special attention to the method of "proximity processing" as developed in our group and the rule based approach to language description to resolve the specific surface language problems as well as the language independent semantic situations. Comparison of results for English, French, and German sentences, showing the commonalities between these languages and, therefore, the re-usable features and the language specific aspects. Generation of noun phrases in English and French, showing the opportunities for translation between these two languages. Application of the analyzer to build a knowledge representation of ICD under the form of conceptual graphs and presentation of the possibilities of a natural language encoding of diagnosis. Strategies for query processing through a sample of abdominal ultrasonography reports, which have been analyzed and stored under the form of conceptual graphs. Feeding in and browsing of the Medical Linguistic Knowledge Base and other Dictionary Building Tools, using the perspective of an international initiative to converge towards a multilingual universal solution, valid for the medical domain. The demonstration platform is Microsoft Windows 4 on a PC, with Microsoft Visual Basic as the GUI and Quintus Prolog as NLP tools language. The same programs were originally developed for Unix-based workstations and are available on multiple platforms under Motif and X11. . PMID:8591530

  14. Automated database design from natural language input

    NASA Technical Reports Server (NTRS)

    Gomez, Fernando; Segami, Carlos; Delaune, Carl

    1995-01-01

    Users and programmers of small systems typically do not have the skills needed to design a database schema from an English description of a problem. This paper describes a system that automatically designs databases for such small applications from English descriptions provided by end-users. Although the system has been motivated by the space applications at Kennedy Space Center, and portions of it have been designed with that idea in mind, it can be applied to different situations. The system consists of two major components: a natural language understander and a problem-solver. The paper describes briefly the knowledge representation structures constructed by the natural language understander, and, then, explains the problem-solver in detail.
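
    As a rough illustration of the general idea (and not the system described in this record), the toy sketch below derives a relational table from one English sentence of the restricted form "A <entity> has a <attribute> ... and a <attribute>." The pattern, names, and SQL mapping are all assumptions for illustration; the actual system relies on a full natural language understander and problem-solver.

        import re

        def schema_from_sentence(sentence: str) -> str:
            """Turn 'A shuttle has a name and a launch date.' into a CREATE TABLE statement (toy)."""
            m = re.match(r"An? (\w+) has (.+)\.", sentence.strip())
            if m is None:
                raise ValueError("sentence does not match the toy pattern")
            entity, rest = m.group(1).lower(), m.group(2)
            # Pull out each 'a <attribute>' phrase; attributes become TEXT columns.
            attrs = [a.strip().replace(" ", "_") for a in re.findall(r"an? ([\w ]+?)(?:,| and|$)", rest)]
            cols = ", ".join(f"{a} TEXT" for a in attrs)
            return f"CREATE TABLE {entity} (id INTEGER PRIMARY KEY, {cols});"

        print(schema_from_sentence("A shuttle has a name and a launch date."))
        # CREATE TABLE shuttle (id INTEGER PRIMARY KEY, name TEXT, launch_date TEXT);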

  15. Learning procedures from interactive natural language instructions

    NASA Technical Reports Server (NTRS)

    Huffman, Scott B.; Laird, John E.

    1994-01-01

    Despite its ubiquity in human learning, very little work has been done in artificial intelligence on agents that learn from interactive natural language instructions. In this paper, the problem of learning procedures from interactive, situated instruction is examined in which the student is attempting to perform tasks within the instructional domain, and asks for instruction when it is needed. Presented is Instructo-Soar, a system that behaves and learns in response to interactive natural language instructions. Instructo-Soar learns completely new procedures from sequences of instruction, and also learns how to extend its knowledge of previously known procedures to new situations. These learning tasks require both inductive and analytic learning. Instructo-Soar exhibits a multiple execution learning process in which initial learning has a rote, episodic flavor, and later executions allow the initially learned knowledge to be generalized properly.

  16. An expert system for natural language processing

    NASA Technical Reports Server (NTRS)

    Hennessy, John F.

    1988-01-01

    A solution to the natural language processing problem that uses a rule-based system, written in OPS5, to replace the traditional parsing method is proposed. The advantages of using a rule-based system are explored. Specifically, the extensibility of a rule-based solution is discussed, as well as the value of maintaining rules that function independently. Finally, the power of using semantics to supplement the syntactic analysis of a sentence is considered.

  17. Discovering protein similarity using natural language processing.

    PubMed Central

    Sarkar, Indra N.; Rindflesch, Thomas C.

    2002-01-01

    Extracting protein interaction relationships from textual repositories, such as MEDLINE, may prove useful in generating novel biological hypotheses. Using abstracts relevant to two known functionally related proteins, we modified an existing natural language processing tool to extract protein interaction terms. We were able to obtain functional information about two proteins, Amyloid Precursor Protein and Prion Protein, that have been implicated in the etiology of Alzheimer's Disease and Creutzfeldt-Jakob Disease, respectively. PMID:12463910

  18. Reference And Description In Natural Language

    NASA Astrophysics Data System (ADS)

    Steinberg, Alan N.

    1988-03-01

    We propose a theory for modeling the semantic and pragmatic properties of natural language expressions used to refer. The sorts of expressions to be discussed include proper names, definite noun phrases and personal pronouns. We will focus in this paper on such expressions in the singular, having discussed elsewhere procedures for extending the present sort of analysis to various plural uses of these expressions. Propositions involving referential expressions are formally redefined in a second order predicate calculus, in which various semantic and pragmatic factors involved in establishing and interpreting references are modeled as rules of inference. Uses of referential utterances are differentiated according to the means used for individuating the object referred to. Analyses are provided for anaphoric, contextual, demonstrative, introductory and citational individuative devices. We analyze sentences like 'The man [or John] is wise' as conditionals of the form 'Whatever is uniquely a man [or named "John"] relevant to the present discourse is wise'. So modeled, the presupposition of existence (which historically has concerned much logical analysis of such sentences) is represented as a conversational implicature of the sort which obtains from any proposition of the form '(P -> Q)' to the corresponding `P'. This formalization is intended to serve as part of an empirical theory of natural language phenomena. Being an empirical theory, ours will strive to model the greatest possible diversity of phenomena using a minimum of formal apparatus. Such a theory may provide a foundation for automatic systems to predict and replicate natural language phenomena for purposes of text understanding and synthesis.
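
    To make the proposed analysis concrete, a simplified first-order rendering of 'The man is wise' along the lines sketched above might read as follows (the paper itself works in a second-order calculus and treats the individuating conditions as rules of inference, so this is only an approximation):

        \forall x\,\Big[\big(\mathrm{Man}(x)\land \mathrm{Rel}(x)\land
          \forall y\,[(\mathrm{Man}(y)\land \mathrm{Rel}(y))\rightarrow y=x]\big)
          \rightarrow \mathrm{Wise}(x)\Big]

    Here \mathrm{Rel} abbreviates "relevant to the present discourse". On this reading the existence presupposition is not asserted; it arises as a conversational implicature from the conditional form (P \rightarrow Q) to the corresponding P, as the abstract describes.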

  19. Understanding requirements via natural language information modeling

    SciTech Connect

    Sharp, J.K.; Becker, S.D.

    1993-07-01

    Information system requirements that are expressed as simple English sentences provide a clear understanding of what is needed between system specifiers, administrators, users, and developers of information systems. The approach used to develop the requirements is the Natural-language Information Analysis Methodology (NIAM). NIAM allows the processes, events, and business rules to be modeled using natural language. The natural language presentation enables the people who deal with the business issues that are to be supported by the information system to describe exactly the system requirements that designers and developers will implement. Computer prattle is completely eliminated from the requirements discussion. An example is presented that is based upon a section of a DOE Order involving nuclear materials management. Where possible, the section is analyzed to specify the process(es) to be done, the event(s) that start the process, and the business rules that are to be followed during the process. Examples, including constraints, are developed. The presentation steps through the modeling process and shows where the section of the DOE Order needs clarification, extensions or interpretations that could provide a more complete and accurate specification.

  20. Natural Language Processing: Toward Large-Scale, Robust Systems.

    ERIC Educational Resources Information Center

    Haas, Stephanie W.

    1996-01-01

    Natural language processing (NLP) is concerned with getting computers to do useful things with natural language. Major applications include machine translation, text generation, information retrieval, and natural language interfaces. Reviews important developments since 1987 that have led to advances in NLP; current NLP applications; and problems…

  1. Natural Language Processing: Toward Large-Scale, Robust Systems.

    ERIC Educational Resources Information Center

    Haas, Stephanie W.

    1996-01-01

    Natural language processing (NLP) is concerned with getting computers to do useful things with natural language. Major applications include machine translation, text generation, information retrieval, and natural language interfaces. Reviews important developments since 1987 that have led to advances in NLP; current NLP applications; and problems

  2. Robust natural language dialogues for instruction tasks

    NASA Astrophysics Data System (ADS)

    Scheutz, Matthias

    2010-04-01

    Being able to understand and carry out spoken natural instructions even in limited domains is extremely challenging for current robots. The difficulties are multifarious, ranging from problems with speech recognizers to difficulties with parsing disfluent speech or resolving references based on perceptual or task-based knowledge. In this paper, we present our efforts at starting to address these problems with an integrated natural language understanding system implemented in our DIARC architecture on a robot that can handle fairly unconstrained spoken ungrammatical and incomplete instructions reliably in a limited domain.

  3. Intelligent CAI: An Author Aid for a Natural Language Interface.

    ERIC Educational Resources Information Center

    Burton, Richard R.; Brown, John Seely

    This report addresses the problems of using natural language (English) as the communication language for advanced computer-based instructional systems. The instructional environment places requirements on a natural language understanding system that exceed the capabilities of all existing systems, including: (1) efficiency, (2) habitability, (3)

  4. An Overview of Computer-Based Natural Language Processing.

    ERIC Educational Resources Information Center

    Gevarter, William B.

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial…

  5. Interactive image retrieval by natural language

    NASA Astrophysics Data System (ADS)

    Harada, Shouji; Itoh, Yukihiro; Nakatani, Hiromasa

    1997-12-01

    This paper presents a method of building an image data retrieval system that can accept Japanese sentences and handle subjective expressions. A naive user who has little knowledge about objects in the database is likely to use subjective words to explain what he or she wants, for instance 'show me a cute one,' or 'I would like to have a simpler one.' Objective interpretation of those expressions is difficult but is indispensable to retrieval systems. In this paper we propose a technique for matching subjective expressions with color features and discuss the usability of our natural language interface.

  6. Written Language Is as Natural as Spoken Language: A Biolinguistic Perspective

    ERIC Educational Resources Information Center

    Aaron, P. G.; Joshi, R. Malatesha

    2006-01-01

    A commonly held belief is that language is an aspect of the biological system since the capacity to acquire language is innate and evolved along Darwinian lines. Written language, on the other hand, is thought to be an artifact and a surrogate of speech; it is, therefore, neither natural nor biological. This disparaging view of written language,

  7. Written Language Is as Natural as Spoken Language: A Biolinguistic Perspective

    ERIC Educational Resources Information Center

    Aaron, P. G.; Joshi, R. Malatesha

    2006-01-01

    A commonly held belief is that language is an aspect of the biological system since the capacity to acquire language is innate and evolved along Darwinian lines. Written language, on the other hand, is thought to be an artifact and a surrogate of speech; it is, therefore, neither natural nor biological. This disparaging view of written language,…

  8. Natural language processing and advanced information management

    NASA Technical Reports Server (NTRS)

    Hoard, James E.

    1989-01-01

    Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate the material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides a natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures that will satisfy the users' requests. The best candidate for an intelligent agent that can satisfactorily make use of RSs and transform documents (TDs) appears to be an object-oriented database (OODB). OODBs have, apparently, an inherent capacity to use the large numbers of RSs and TDs that will be required by an AIM system and an inherent capacity to use them in an effective way.

  9. Natural language insensitive short textual string compression

    NASA Astrophysics Data System (ADS)

    Constantinescu, Cornel; Trelewicz, Jennifer Q.; Arps, Ronald B.

    2004-01-01

    There are applications (such as Internet search engines) where short textual strings, for example abstracts or pieces of Web pages, need to be compressed independently of each other. The usual adaptive compression algorithms perform poorly on these short strings due to the lack of necessary data to learn. In this manuscript, we introduce a compression algorithm targeting short text strings, e.g., strings containing a few hundred symbols. We also target natural language insensitivity, to facilitate its robust compression and fast implementation. The algorithm is based on the following findings. Applying the move-to-front transform (MTFT) after the Burrows-Wheeler transform (BWT) brings the short textual strings to a "normalized form" where the distribution of the resulting "ranks" has a similar shape over the set of natural language strings. This facilitates the use of a static coding method with few variations, which we call shortBWT, where no on-line learning is needed, to encode the ranks. Finally, for short strings, shortBWT runs very fast because the strings fit into the cache of most current computers. The introduction of this paper will review the mathematical bases of BWT and MTF, and it will also review our recently published metric for rapidly pre-characterizing the compressibility of such short textual strings when using these transforms.
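
    The BWT-then-MTF pipeline described in this record can be sketched naively as follows. This is not the authors' shortBWT implementation, and the static coder for the ranks is omitted; the naive sorted-rotation BWT is acceptable only because the strings are short.

        def bwt(s: str, sentinel: str = "\x00") -> str:
            """Burrows-Wheeler transform via sorted rotations (fine for short strings)."""
            s += sentinel                       # unique end marker so the transform is invertible
            rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
            return "".join(rot[-1] for rot in rotations)

        def move_to_front(s: str) -> list:
            """Replace each symbol by its current rank in a move-to-front list."""
            table = [chr(i) for i in range(256)]
            ranks = []
            for ch in s:
                r = table.index(ch)
                ranks.append(r)
                table.insert(0, table.pop(r))
            return ranks

        # After the BWT, runs of identical symbols yield many small MTF ranks, so the
        # rank distribution has a similarly skewed shape across short texts, which is
        # what makes a single static code workable.
        print(move_to_front(bwt("the cat sat on the mat")))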

  10. Understanding and representing natural language meaning

    NASA Astrophysics Data System (ADS)

    Waltz, D. L.; Maran, L. R.; Dorfman, M. H.; Dinitz, R.; Farwell, D.

    1982-12-01

    During this contract period the authors have: (1) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (2) written a parsing program which selects appropriate word and sentence meanings by a parallel process known as activation and inhibition; (3) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (4) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (5) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (6) constructed a general model for the representation of tense and aspect of verbs; (7) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.

  11. Understanding and representing natural language meaning

    SciTech Connect

    Waltz, D.L.; Maran, L.R.; Dorfman, M.H.; Dinitz, R.; Farwell, D.

    1982-12-01

    During this contract period the authors have: (a) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (b) written a parsing program which selects appropriate word and sentence meanings by a parallel process known as activation and inhibition; (c) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (d) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (e) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (f) constructed a general model for the representation of tense and aspect of verbs; (g) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.
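
    The activation-and-inhibition idea mentioned in the two records above can be illustrated with a very small relaxation network for word-sense selection. This is only a toy, not the parsing program described in these reports; the words, senses, links, weights, and initial bias are all invented for the example.

        # Senses of the same word inhibit one another; mutually compatible senses of
        # different words excite one another. Iterating the updates lets one reading win.
        senses = {
            "bank":    {"bank/finance": 0.5, "bank/river": 0.5},
            "deposit": {"deposit/money": 0.6, "deposit/sediment": 0.4},  # money sense primed by earlier context
        }
        excite = {("bank/finance", "deposit/money"), ("bank/river", "deposit/sediment")}

        def relax(senses, excite, steps=15, e=0.2, i=0.2):
            act = {s: a for cands in senses.values() for s, a in cands.items()}
            for _ in range(steps):
                new = {}
                for cands in senses.values():
                    for s in cands:
                        boost = sum(act[t] for t in act if (s, t) in excite or (t, s) in excite)
                        damp = sum(act[t] for t in cands if t != s)
                        new[s] = min(1.0, max(0.0, act[s] + e * boost - i * damp))
                act = new
            return act

        act = relax(senses, excite)
        for word, cands in senses.items():
            print(word, "->", max(cands, key=act.get))
        # prints: bank -> bank/finance
        #         deposit -> deposit/money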

  12. Understanding natural language for spacecraft sequencing

    NASA Technical Reports Server (NTRS)

    Katz, Boris; Brooks, Robert N., Jr.

    1987-01-01

    The paper describes a natural language understanding system, START, that translates English text into a knowledge base. The understanding and the generating modules of START share a Grammar which is built upon reversible transformations. Users can retrieve information by querying the knowledge base in English; the system then produces an English response. START can be easily adapted to many different domains. One such domain is spacecraft sequencing. A high-level overview of sequencing as it is practiced at JPL is presented in the paper, and three areas within this activity are identified for potential application of the START system. Examples are given of an actual dialog with START based on simulated data for the Mars Observer mission.

  13. Transportable natural-language interfaces: problems and techniques

    SciTech Connect

    Grosz, B.J.

    1982-01-01

    The author considers the question of natural language database access within the context of a project at SRI, TEAM, that is developing techniques for transportable natural-language interfaces. The goal of transportability is to enable nonspecialists to adapt a natural-language processing system for access to an existing conventional database. TEAM is designed to interact with two different kinds of users. During an acquisition dialogue, a database expert (DBE) provides TEAM with information about the files and fields in the conventional database for which a natural-language interface is desired. (Typically this database already exists and is populated, but TEAM also provides facilities for creating small local databases.) This dialogue results in extension of the language-processing and data access components that make it possible for an end user to query the new database in natural language. 13 references.

  14. Applications of Weighted Automata in Natural Language Processing

    NASA Astrophysics Data System (ADS)

    Knight, Kevin; May, Jonathan

    We explain why weighted automata are an attractive knowledge representation for natural language problems. We first trace the close historical ties between the two fields, then present two complex real-world problems, transliteration and translation. These problems are usefully decomposed into a pipeline of weighted transducers, and weights can be set to maximize the likelihood of a training corpus using standard algorithms. We additionally describe the representation of language models, critical data sources in natural language processing, as weighted automata. We outline the wide range of work in natural language processing that makes use of weighted string and tree automata and describe current work and challenges.
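
    A minimal sketch of the weighted-transducer idea follows: two single-state transducers are composed, with weights multiplied along each path. Real transliteration and translation pipelines use full weighted finite-state toolkits with multi-state machines; the symbols and weights here are invented for illustration.

        # Each transducer maps an input symbol to weighted output candidates.
        t1 = {"k": [("k", 0.6), ("q", 0.4)]}          # e.g., phoneme -> candidate letters
        t2 = {"k": [("c", 0.7), ("k", 0.3)],          # letter -> spelling variants
              "q": [("q", 1.0)]}

        def compose(a, b):
            """Relational composition; weights multiply along each path (probability semiring)."""
            out = {}
            for x, arcs in a.items():
                for mid, w1 in arcs:
                    for y, w2 in b.get(mid, []):
                        out.setdefault(x, []).append((y, w1 * w2))
            return out

        print(compose(t1, t2))
        # k -> c (0.6*0.7), k (0.6*0.3), q (0.4*1.0)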

  15. Overview of computer-based Natural Language Processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language Processing and understanding is the key to enabling humans and their creations to interact with machines in natural language (in contrast to computer language). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state-of-the-art of the technology, issues and research requirements, the major participants, and finally, future trends and expectations.

  16. Inferring heuristic classification hierarchies from natural language input

    NASA Technical Reports Server (NTRS)

    Hull, Richard; Gomez, Fernando

    1993-01-01

    A methodology for inferring hierarchies representing heuristic knowledge about the check out, control, and monitoring sub-system (CCMS) of the space shuttle launch processing system from natural language input is explained. Our method identifies failures explicitly and implicitly described in natural language by domain experts and uses those descriptions to recommend classifications for inclusion in the experts' heuristic hierarchies.

  17. Natural Language Processing in Game Studies Research: An Overview

    ERIC Educational Resources Information Center

    Zagal, Jose P.; Tomuro, Noriko; Shepitsen, Andriy

    2012-01-01

    Natural language processing (NLP) is a field of computer science and linguistics devoted to creating computer systems that use human (natural) language as input and/or output. The authors propose that NLP can also be used for game studies research. In this article, the authors provide an overview of NLP and describe some research possibilities…

  18. Natural Language Processing in Game Studies Research: An Overview

    ERIC Educational Resources Information Center

    Zagal, Jose P.; Tomuro, Noriko; Shepitsen, Andriy

    2012-01-01

    Natural language processing (NLP) is a field of computer science and linguistics devoted to creating computer systems that use human (natural) language as input and/or output. The authors propose that NLP can also be used for game studies research. In this article, the authors provide an overview of NLP and describe some research possibilities

  19. Influenza detection from emergency department reports using natural language processing and Bayesian network classifiers

    PubMed Central

    Ye, Ye; Tsui, Fuchiang (Rich); Wagner, Michael; Espino, Jeremy U; Li, Qi

    2014-01-01

    Objectives To evaluate factors affecting performance of influenza detection, including accuracy of natural language processing (NLP), discriminative ability of Bayesian network (BN) classifiers, and feature selection. Methods We derived a testing dataset of 124 influenza patients and 87 non-influenza (shigellosis) patients. To assess NLP finding-extraction performance, we measured the overall accuracy, recall, and precision of Topaz and MedLEE parsers for 31 influenza-related findings against a reference standard established by three physician reviewers. To elucidate the relative contribution of NLP and BN classifier to classification performance, we compared the discriminative ability of nine combinations of finding-extraction methods (expert, Topaz, and MedLEE) and classifiers (one human-parameterized BN and two machine-parameterized BNs). To assess the effects of feature selection, we conducted secondary analyses of discriminative ability using the most influential findings defined by their likelihood ratios. Results The overall accuracy of Topaz was significantly better than MedLEE (with post-processing) (0.78 vs 0.71, p<0.0001). Classifiers using human-annotated findings were superior to classifiers using Topaz/MedLEE-extracted findings (average area under the receiver operating characteristic (AUROC): 0.75 vs 0.68, p=0.0113), and machine-parameterized classifiers were superior to the human-parameterized classifier (average AUROC: 0.73 vs 0.66, p=0.0059). The classifiers using the 17 ‘most influential’ findings were more accurate than classifiers using all 31 subject-matter expert-identified findings (average AUROC: 0.76>0.70, p<0.05). Conclusions Using a three-component evaluation method we demonstrated how one could elucidate the relative contributions of components under an integrated framework. To improve classification performance, this study encourages researchers to improve NLP accuracy, use a machine-parameterized classifier, and apply feature selection methods. PMID:24406261
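
    The core of the evaluation above is comparing the discriminative ability of classifiers by AUROC. The sketch below shows what such a comparison looks like with scikit-learn; the labels and scores are invented and stand in for classifier outputs fed by expert-annotated versus NLP-extracted findings (this is not the study's data or pipeline).

        from sklearn.metrics import roc_auc_score

        y_true = [1, 1, 1, 0, 0, 0, 1, 0]              # 1 = influenza, 0 = non-influenza
        scores_expert_findings = [0.9, 0.8, 0.7, 0.3, 0.2, 0.4, 0.6, 0.1]   # BN fed expert-annotated findings
        scores_nlp_findings    = [0.8, 0.6, 0.4, 0.5, 0.3, 0.4, 0.7, 0.2]   # BN fed NLP-extracted findings

        print("expert findings AUROC:", roc_auc_score(y_true, scores_expert_findings))
        print("NLP findings AUROC:   ", roc_auc_score(y_true, scores_nlp_findings))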

  20. Whole language and deaf bilingual-bicultural education--naturally!

    PubMed

    Mason, D; Ewoldt, C

    1996-10-01

    This position paper discusses how the tenets of Whole Language and Deaf Bilingual-Bicultural Education complement each other. It stresses that Whole Language is based on natural processes through which children can translate their constructs of personal experiences, observations, and perspectives into modes of communication that include written language and, in the present case, American Sign Language. The paper is based on two emphases: (a) Whole Language emphasizes a two-way teaching/learning process, teachers learning from children, and vice versa; and (b) Deaf Bilingual-Bicultural Education emphasizes American Sign Language as a language of instruction and builds on mutual respect for the similarities and differences in the sociocultural and socioeducational experiences and values of Deaf and hearing people. Both Whole Language and Deaf Bilingual-Bicultural Education attempt to authenticate curriculum by integrating Deaf persons' worldviews as part of educational experience. PMID:8936704

  1. Natural Language Metaphors Covertly Influence Reasoning

    PubMed Central

    Thibodeau, Paul H.; Boroditsky, Lera

    2013-01-01

    Metaphors pervade discussions of social issues like climate change, the economy, and crime. We ask how natural language metaphors shape the way people reason about such social issues. In previous work, we showed that describing crime metaphorically as a beast or a virus, led people to generate different solutions to a city's crime problem. In the current series of studies, instead of asking people to generate a solution on their own, we provided them with a selection of possible solutions and asked them to choose the best ones. We found that metaphors influenced people's reasoning even when they had a set of options available to compare and select among. These findings suggest that metaphors can influence not just what solution comes to mind first, but also which solution people think is best, even when given the opportunity to explicitly compare alternatives. Further, we tested whether participants were aware of the metaphor. We found that very few participants thought the metaphor played an important part in their decision. Further, participants who had no explicit memory of the metaphor were just as much affected by the metaphor as participants who were able to remember the metaphorical frame. These findings suggest that metaphors can act covertly in reasoning. Finally, we examined the role of political affiliation on reasoning about crime. The results confirm our previous findings that Republicans are more likely to generate enforcement and punishment solutions for dealing with crime, and are less swayed by metaphor than are Democrats or Independents. PMID:23301009

  2. A Hybrid Architecture For Natural Language Understanding

    NASA Astrophysics Data System (ADS)

    Loatman, R. Bruce

    1987-05-01

    The PRC Adaptive Knowledge-based Text Understanding System (PAKTUS) is an environment for developing natural language understanding (NLU) systems. It uses a knowledge-based approach in an integrated hybrid architecture based on a factoring of the NLU problem into its lexical, syntactic, conceptual, domain-specific, and pragmatic components. The goal is a robust system that benefits from the strengths of several NLU methodologies, each applied where most appropriate. PAKTUS employs a frame-based knowledge representation and associative networks throughout. The lexical component uses morphological knowledge and word experts. Syntactic knowledge is represented in an Augmented Transition Network (ATN) grammar that incorporates rule-based programming. Case grammar is used for canonical conceptual representation with constraints. Domain-specific templates represent knowledge about specific applications as patterns of the form used in logic programming. Pragmatic knowledge may augment any of the other types and is added wherever needed for a particular domain. The system has been constructed in an interactive graphic programming environment. It has been used successfully to build a prototype front end for an expert system. This integration of existing technologies makes limited but practical NLU feasible now for narrow, well-defined domains.
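
    To give a feel for the transition-network style of syntactic knowledge mentioned above, here is a heavily simplified recognizer for noun phrases of the shape Det? Adj* Noun. It is only a flavor of the approach, not PAKTUS's ATN, which additionally carries registers, tests, and actions on its arcs; the lexicon and states are invented.

        LEXICON = {"the": "Det", "old": "Adj", "rusty": "Adj", "valve": "Noun"}
        ARCS = {"S0": {"Det": "S1", "Adj": "S1", "Noun": "END"},
                "S1": {"Adj": "S1", "Noun": "END"}}

        def accepts_np(tokens):
            """Walk the network one word at a time; succeed only if we end in the final state."""
            state = "S0"
            for tok in tokens:
                cat = LEXICON.get(tok)
                state = ARCS.get(state, {}).get(cat)
                if state is None:
                    return False
            return state == "END"

        print(accepts_np("the old rusty valve".split()))   # True
        print(accepts_np("the old rusty".split()))         # False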

  3. Natural language metaphors covertly influence reasoning.

    PubMed

    Thibodeau, Paul H; Boroditsky, Lera

    2013-01-01

    Metaphors pervade discussions of social issues like climate change, the economy, and crime. We ask how natural language metaphors shape the way people reason about such social issues. In previous work, we showed that describing crime metaphorically as a beast or a virus, led people to generate different solutions to a city's crime problem. In the current series of studies, instead of asking people to generate a solution on their own, we provided them with a selection of possible solutions and asked them to choose the best ones. We found that metaphors influenced people's reasoning even when they had a set of options available to compare and select among. These findings suggest that metaphors can influence not just what solution comes to mind first, but also which solution people think is best, even when given the opportunity to explicitly compare alternatives. Further, we tested whether participants were aware of the metaphor. We found that very few participants thought the metaphor played an important part in their decision. Further, participants who had no explicit memory of the metaphor were just as much affected by the metaphor as participants who were able to remember the metaphorical frame. These findings suggest that metaphors can act covertly in reasoning. Finally, we examined the role of political affiliation on reasoning about crime. The results confirm our previous findings that Republicans are more likely to generate enforcement and punishment solutions for dealing with crime, and are less swayed by metaphor than are Democrats or Independents. PMID:23301009

  4. Automatically Generating Natural Language Status Reports

    NASA Astrophysics Data System (ADS)

    Kalita, Jugal; Shende, Sunil

    1988-03-01

    In this paper, we describe a system which generates compact natural language status reports for a set of inter-related processes at various stages of progress. The system has three modules - a rule-based domain knowledge representation module, an elaborate text planning module, and a surface generation module. The knowledge representation module models a set of processes that are encountered in a typical office environment, using a body of explicitly sequenced production rules implemented by an augmented Petri net mechanism. The system employs an interval-based temporal network for storing historical information. A text planning module traverses this network to search for events which need to be mentioned in a coherent report describing the current status of the system. The planner combines similar information for succinct presentation whenever applicable. It also takes into consideration various issues such as relevance and redundancy, simple mechanisms for viewing events from multiple perspectives and the application of discourse focus techniques for the generation of good quality text. Finally, an available surface generation module which has been suitably augmented is used to produce well-structured textual reports for our chosen domain.

  5. Porting a lexicalized-grammar parser to the biomedical domain.

    PubMed

    Rimell, Laura; Clark, Stephen

    2009-10-01

    This paper introduces a state-of-the-art, linguistically motivated statistical parser to the biomedical text mining community, and proposes a method of adapting it to the biomedical domain requiring only limited resources for data annotation. The parser was originally developed using the Penn Treebank and is therefore tuned to newspaper text. Our approach takes advantage of a lexicalized grammar formalism, Combinatory Categorial Grammar (ccg), to train the parser at a lower level of representation than full syntactic derivations. The ccg parser uses three levels of representation: a first level consisting of part-of-speech (pos) tags; a second level consisting of more fine-grained ccg lexical categories; and a third, hierarchical level consisting of ccg derivations. We find that simply retraining the pos tagger on biomedical data leads to a large improvement in parsing performance, and that using annotated data at the intermediate lexical category level of representation improves parsing accuracy further. We describe the procedure involved in evaluating the parser, and obtain accuracies for biomedical data in the same range as those reported for newspaper text, and higher than those previously reported for the biomedical resource on which we evaluate. Our conclusion is that porting newspaper parsers to the biomedical domain, at least for parsers which use lexicalized grammars, may not be as difficult as first thought. PMID:19141332
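
    The paper's first finding, that simply retraining the POS tagger on in-domain data gives a large gain, can be illustrated with a trivially simple most-frequent-tag tagger. This is not the CCG parser or its tagger; the sentences and tags below are invented, and the point is only the general principle of in-domain retraining.

        from collections import Counter, defaultdict

        def train_tagger(tagged_sentences):
            counts = defaultdict(Counter)
            for sent in tagged_sentences:
                for word, tag in sent:
                    counts[word.lower()][tag] += 1
            return {w: c.most_common(1)[0][0] for w, c in counts.items()}

        def tag(model, words, default="NN"):
            return [(w, model.get(w.lower(), default)) for w in words]

        newswire = [[("stocks", "NNS"), ("fell", "VBD")]]
        biomedical = [[("p53", "NN"), ("induces", "VBZ"), ("apoptosis", "NN")],
                      [("the", "DT"), ("protein", "NN"), ("binds", "VBZ"), ("DNA", "NN")]]

        # A tagger trained only on newspaper text falls back to the default tag for
        # domain terms; adding in-domain sentences fixes many of those decisions.
        print(tag(train_tagger(newswire), ["p53", "binds", "DNA"]))
        print(tag(train_tagger(newswire + biomedical), ["p53", "binds", "DNA"]))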

  6. Concepts and implementations of natural language query systems

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung

    1984-01-01

    The currently developed user language interfaces of information systems are generally intended for serious users. These interfaces commonly ignore potentially the largest user group, i.e., casual users. This project discusses the concepts and implementations of a natural language query system that satisfies the nature and information needs of casual users by allowing them to communicate with the system in the form of their native (natural) language. In addition, a framework for the development of such an interface is also introduced for the MADAM (Multics Approach to Data Access and Management) system at the University of Southwestern Louisiana.

  7. NLP Meets the Jabberwocky: Natural Language Processing in Information Retrieval.

    ERIC Educational Resources Information Center

    Feldman, Susan

    1999-01-01

    Focuses on natural language processing (NLP) in information retrieval. Defines the seven levels at which people extract meaning from text/spoken language. Discusses the stages of information processing; how an information retrieval system works; advantages to adding full NLP to information retrieval systems; and common problems with information…

  8. NLP Meets the Jabberwocky: Natural Language Processing in Information Retrieval.

    ERIC Educational Resources Information Center

    Feldman, Susan

    1999-01-01

    Focuses on natural language processing (NLP) in information retrieval. Defines the seven levels at which people extract meaning from text/spoken language. Discusses the stages of information processing; how an information retrieval system works; advantages to adding full NLP to information retrieval systems; and common problems with information

  9. The Nature of Written Language Deficits in Children with SLI

    ERIC Educational Resources Information Center

    Mackie, Clare; Dockrell, Julie E.

    2004-01-01

    Children with specific language impairment (SLI) have associated difficulties in reading decoding and reading comprehension. To date, few research studies have examined the children's written language. The aim of the present study was to (a) evaluate the nature and extent of the children's difficulties with writing and (b) investigate the

  10. Survey of Natural Language Processing Techniques in Bioinformatics

    PubMed Central

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  11. Natural Language Processing Neural Network Considering Deep Cases

    NASA Astrophysics Data System (ADS)

    Sagara, Tsukasa; Hagiwara, Masafumi

    In this paper, we propose a novel neural network considering deep cases. It can learn knowledge from natural language documents and can perform recall and inference. Various techniques of natural language processing using neural networks have been proposed. However, natural language sentences used in these techniques consist of only a few words, and they cannot handle complicated sentences. In order to solve these problems, the proposed network divides natural language sentences into a sentence layer, a knowledge layer, ten kinds of deep case layers and a dictionary layer. It can learn the relations among sentences and among words by dividing sentences. The advantages of the method are as follows: (1) ability to handle complicated sentences; (2) ability to restructure sentences; (3) usage of the conceptual dictionary, Goi-Taikei, as the long-term memory in the brain. Two kinds of experiments were carried out by using the goo dictionary and Wikipedia as knowledge sources. Superior performance of the proposed neural network has been confirmed.

  12. A Natural Language Interface Concordant with a Knowledge Base.

    PubMed

    Han, Yong-Jin; Park, Seong-Bae; Park, Se-Young

    2016-01-01

    The discordance between expressions interpretable by a natural language interface (NLI) system and those answerable by a knowledge base is a critical problem in the field of NLIs. In order to solve this discordance problem, this paper proposes a method to translate natural language questions into formal queries that can be generated from a graph-based knowledge base. The proposed method considers a subgraph of a knowledge base as a formal query. Thus, all formal queries corresponding to a concept or a predicate in the knowledge base can be generated prior to query time and all possible natural language expressions corresponding to each formal query can also be collected in advance. A natural language expression has a one-to-one mapping with a formal query. Hence, a natural language question is translated into a formal query by matching the question with the most appropriate natural language expression. If the confidence of this matching is not sufficiently high the proposed method rejects the question and does not answer it. Multipredicate queries are processed by regarding them as a set of collected expressions. The experimental results show that the proposed method thoroughly handles answerable questions from the knowledge base and rejects unanswerable ones effectively. PMID:26904105
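
    The match-or-reject idea described in this record can be sketched very simply: each pre-generated formal query is paired with collected natural language expressions, a question is mapped to the query of its best-matching expression, and the question is rejected when the match is too weak. The templates, queries, threshold, and similarity measure below are illustrative assumptions, not the paper's implementation.

        from difflib import SequenceMatcher

        EXPRESSIONS = {
            "who directed X":   "SELECT ?d WHERE { X :directedBy ?d }",
            "when was X born":  "SELECT ?t WHERE { X :birthDate ?t }",
        }

        def translate(question, threshold=0.6):
            def sim(a, b):
                return SequenceMatcher(None, a.lower(), b.lower()).ratio()
            best_expr = max(EXPRESSIONS, key=lambda e: sim(question, e))
            if sim(question, best_expr) < threshold:
                return None                      # reject: likely unanswerable from this knowledge base
            return EXPRESSIONS[best_expr]

        print(translate("Who directed X"))                  # matches the first template
        print(translate("What is the meaning of life"))     # no template is close enough -> None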

  13. A Natural Language Interface Concordant with a Knowledge Base

    PubMed Central

    Han, Yong-Jin; Park, Seong-Bae; Park, Se-Young

    2016-01-01

    The discordance between expressions interpretable by a natural language interface (NLI) system and those answerable by a knowledge base is a critical problem in the field of NLIs. In order to solve this discordance problem, this paper proposes a method to translate natural language questions into formal queries that can be generated from a graph-based knowledge base. The proposed method considers a subgraph of a knowledge base as a formal query. Thus, all formal queries corresponding to a concept or a predicate in the knowledge base can be generated prior to query time and all possible natural language expressions corresponding to each formal query can also be collected in advance. A natural language expression has a one-to-one mapping with a formal query. Hence, a natural language question is translated into a formal query by matching the question with the most appropriate natural language expression. If the confidence of this matching is not sufficiently high the proposed method rejects the question and does not answer it. Multipredicate queries are processed by regarding them as a set of collected expressions. The experimental results show that the proposed method thoroughly handles answerable questions from the knowledge base and rejects unanswerable ones effectively. PMID:26904105

  14. Natural Language Processing Techniques in Computer-Assisted Language Learning: Status and Instructional Issues.

    ERIC Educational Resources Information Center

    Holland, V. Melissa; Kaplan, Jonathan D.

    1995-01-01

    Describes the role of natural language processing (NLP) techniques, such as parsing and semantic analysis, within current language tutoring systems. Examines trends, design issues and tradeoffs, and potential contributions of NLP techniques with respect to instructional theory and educational practice. Addresses limitations and problems in using

  15. Natural Language Processing Techniques in Computer-Assisted Language Learning: Status and Instructional Issues.

    ERIC Educational Resources Information Center

    Holland, V. Melissa; Kaplan, Jonathan D.

    1995-01-01

    Describes the role of natural language processing (NLP) techniques, such as parsing and semantic analysis, within current language tutoring systems. Examines trends, design issues and tradeoffs, and potential contributions of NLP techniques with respect to instructional theory and educational practice. Addresses limitations and problems in using…

  16. Parent-Implemented Natural Language Paradigm to Increase Language and Play in Children with Autism

    ERIC Educational Resources Information Center

    Gillett, Jill N.; LeBlanc, Linda A.

    2007-01-01

    Three parents of children with autism were taught to implement the Natural Language Paradigm (NLP). Data were collected on parent implementation, multiple measures of child language, and play. The parents were able to learn to implement the NLP procedures quickly and accurately with beneficial results for their children. Increases in the overall…

  17. Parent-Implemented Natural Language Paradigm to Increase Language and Play in Children with Autism

    ERIC Educational Resources Information Center

    Gillett, Jill N.; LeBlanc, Linda A.

    2007-01-01

    Three parents of children with autism were taught to implement the Natural Language Paradigm (NLP). Data were collected on parent implementation, multiple measures of child language, and play. The parents were able to learn to implement the NLP procedures quickly and accurately with beneficial results for their children. Increases in the overall

  18. Two Types of Definites in Natural Language

    ERIC Educational Resources Information Center

    Schwarz, Florian

    2009-01-01

    This thesis is concerned with the description and analysis of two semantically different types of definite articles in German. While the existence of distinct article paradigms in various Germanic dialects and other languages has been acknowledged in the descriptive literature for quite some time, the theoretical implications of their existence

  19. Two Types of Definites in Natural Language

    ERIC Educational Resources Information Center

    Schwarz, Florian

    2009-01-01

    This thesis is concerned with the description and analysis of two semantically different types of definite articles in German. While the existence of distinct article paradigms in various Germanic dialects and other languages has been acknowledged in the descriptive literature for quite some time, the theoretical implications of their existence…

  20. An overview of computer-based natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1983-01-01

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  1. Overview of Computer-based Natural Language Processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  2. Overview of computer-based natural language processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state-of-the-art of the technology, issues and research requirements, the major participants, and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  3. The redundancy of recursion and infinity for natural language.

    PubMed

    Luuk, Erkki; Luuk, Hendrik

    2011-02-01

    An influential line of thought claims that natural language and arithmetic processing require recursion, a putative hallmark of human cognitive processing (Chomsky in Evolution of human language: biolinguistic perspectives. Cambridge University Press, Cambridge, pp 45-61, 2010; Fitch et al. in Cognition 97(2):179-210, 2005; Hauser et al. in Science 298(5598):1569-1579, 2002). First, we question the need for recursion in human cognitive processing by arguing that a generally simpler and less resource demanding process--iteration--is sufficient to account for human natural language and arithmetic performance. We argue that the only motivation for recursion, the infinity in natural language and arithmetic competence, is equally approachable by iteration and recursion. Second, we submit that the infinity in natural language and arithmetic competence reduces to imagining infinite embedding or concatenation, which is completely independent from the ability to implement infinite processing, and thus, independent from both recursion and iteration. Furthermore, we claim that a property of natural language is physically uncountable finity and not discrete infinity. PMID:20652723
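
    The paper's claim that iteration suffices for the unbounded embedding usually attributed to recursion can be illustrated with a minimal example, where nesting is modelled crudely as brackets around a core element (the bracket language is just a stand-in for syntactic embedding).

        def nest_recursive(n: int) -> str:
            return "x" if n == 0 else "(" + nest_recursive(n - 1) + ")"

        def nest_iterative(n: int) -> str:
            s = "x"
            for _ in range(n):
                s = "(" + s + ")"
            return s

        # Both procedures generate exactly the same, arbitrarily deep structures; neither
        # ever has to process an infinite object, which echoes the authors' point that
        # the "infinity" involved is imagined embedding, not implemented infinite processing.
        assert all(nest_recursive(k) == nest_iterative(k) for k in range(10))
        print(nest_recursive(3))   # (((x)))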

  4. The integration hypothesis of human language evolution and the nature of contemporary languages.

    PubMed

    Miyagawa, Shigeru; Ojima, Shiro; Berwick, Robert C; Okanoya, Kazuo

    2014-01-01

    How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa et al. (2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, that holds that human language is composed of two components, E for expressive, and L for lexical. Each component has an antecedent in nature: E as found, for example, in birdsong, and L in, for example, the alarm calls of monkeys. E and L integrated uniquely in humans to give rise to language. A challenge to the Integration Hypothesis is that while these non-human systems are finite-state in nature, human language is known to require characterization by a non-finite state grammar. Our claim is that E and L, taken separately, are in fact finite-state; when a grammatical process crosses the boundary between E and L, it gives rise to the non-finite state character of human language. We provide empirical evidence for the Integration Hypothesis by showing that certain processes found in contemporary languages that have been characterized as non-finite state in nature can in fact be shown to be finite-state. We also speculate on how human language actually arose in evolution through the lens of the Integration Hypothesis. PMID:24936195

  5. The integration hypothesis of human language evolution and the nature of contemporary languages

    PubMed Central

    Miyagawa, Shigeru; Ojima, Shiro; Berwick, Robert C.; Okanoya, Kazuo

    2014-01-01

    How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa et al. (2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, that holds that human language is composed of two components, E for expressive, and L for lexical. Each component has an antecedent in nature: E as found, for example, in birdsong, and L in, for example, the alarm calls of monkeys. E and L integrated uniquely in humans to give rise to language. A challenge to the Integration Hypothesis is that while these non-human systems are finite-state in nature, human language is known to require characterization by a non-finite state grammar. Our claim is that E and L, taken separately, are in fact finite-state; when a grammatical process crosses the boundary between E and L, it gives rise to the non-finite state character of human language. We provide empirical evidence for the Integration Hypothesis by showing that certain processes found in contemporary languages that have been characterized as non-finite state in nature can in fact be shown to be finite-state. We also speculate on how human language actually arose in evolution through the lens of the Integration Hypothesis. PMID:24936195

  6. Analyzing Learner Language: Towards a Flexible Natural Language Processing Architecture for Intelligent Language Tutors

    ERIC Educational Resources Information Center

    Amaral, Luiz; Meurers, Detmar; Ziai, Ramon

    2011-01-01

    Intelligent language tutoring systems (ILTS) typically analyze learner input to diagnose learner language properties and provide individualized feedback. Despite a long history of ILTS research, such systems are virtually absent from real-life foreign language teaching (FLT). Taking a step toward more closely linking ILTS research to real-life…

  7. Analyzing Learner Language: Towards a Flexible Natural Language Processing Architecture for Intelligent Language Tutors

    ERIC Educational Resources Information Center

    Amaral, Luiz; Meurers, Detmar; Ziai, Ramon

    2011-01-01

    Intelligent language tutoring systems (ILTS) typically analyze learner input to diagnose learner language properties and provide individualized feedback. Despite a long history of ILTS research, such systems are virtually absent from real-life foreign language teaching (FLT). Taking a step toward more closely linking ILTS research to real-life

  8. Artificial intelligence, expert systems, computer vision, and natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  9. UMLS knowledge for biomedical language processing.

    PubMed Central

    McCray, A T; Aronson, A R; Browne, A C; Rindflesch, T C; Razi, A; Srinivasan, S

    1993-01-01

    This paper describes efforts to provide access to the free text in biomedical databases. The focus of the effort is the development of SPECIALIST, an experimental natural language processing system for the biomedical domain. The system includes a broad coverage parser supported by a large lexicon, modules that provide access to the extensive Unified Medical Language System (UMLS) Knowledge Sources, and a retrieval module that permits experiments in information retrieval. The UMLS Metathesaurus and Semantic Network provide a rich source of biomedical concepts and their interrelationships. Investigations have been conducted to determine the type of information required to effect a map between the language of queries and the language of relevant documents. Mappings are never straightforward and often involve multiple inferences. PMID:8472004

  10. Learning from a Computer Tutor with Natural Language Capabilities

    ERIC Educational Resources Information Center

    Michael, Joel; Rovick, Allen; Glass, Michael; Zhou, Yujian; Evens, Martha

    2003-01-01

    CIRCSIM-Tutor is a computer tutor designed to carry out a natural language dialogue with a medical student. Its domain is the baroreceptor reflex, the part of the cardiovascular system that is responsible for maintaining a constant blood pressure. CIRCSIM-Tutor's interaction with students is modeled after the tutoring behavior of two experienced

  11. Anaphora in Natural Language Processing and Information Retrieval.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth DuRoss

    1990-01-01

    Describes the linguistic phenomenon of anaphora; surveys the approaches to anaphora undertaken in theoretical linguistics and natural language processing (NLP); presents results of research conducted at Syracuse University on anaphora in information retrieval; and discusses the future of anaphora research in regard to information retrieval tasks.…

  12. Recurrent Artificial Neural Networks and Finite State Natural Language Processing.

    ERIC Educational Resources Information Center

    Moisl, Hermann

    It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…

  13. Analyzing Discourse Processing Using a Simple Natural Language Processing Tool

    ERIC Educational Resources Information Center

    Crossley, Scott A.; Allen, Laura K.; Kyle, Kristopher; McNamara, Danielle S.

    2014-01-01

    Natural language processing (NLP) provides a powerful approach for discourse processing researchers. However, there remains a notable degree of hesitation by some researchers to consider using NLP, at least on their own. The purpose of this article is to introduce and make available a "simple" NLP (SiNLP) tool. The overarching goal of

  14. Proof-Theoretic Semantics for a Natural Language Fragment

    NASA Astrophysics Data System (ADS)

    Francez, Nissim; Dyckhoff, Roy

    We propose a Proof-Theoretic Semantics (PTS) for a (positive) fragment E+0 of Natural Language (NL) (English in this case). The semantics is intended [7] to be incorporated into actual grammars, within the framework of Type-Logical Grammar (TLG) [12]. Thereby, this semantics constitutes an alternative to the traditional model-theoretic semantics (MTS), originating in Montague's seminal work [11], used in TLG.

  15. A finite and real-time processor for natural language

    SciTech Connect

    Blank, G.D.

    1989-10-01

    People process natural language in real time and with very limited short-term memories. This article describes a computational architecture for syntactic performance that also requires fixed finite resources. The processor presented here represents syntactic versatility without incurring combinatorial redundancy in the number of transitions or rules. It avoids both excess grammar size and excessive computational complexity.

  16. Analyzing Discourse Processing Using a Simple Natural Language Processing Tool

    ERIC Educational Resources Information Center

    Crossley, Scott A.; Allen, Laura K.; Kyle, Kristopher; McNamara, Danielle S.

    2014-01-01

    Natural language processing (NLP) provides a powerful approach for discourse processing researchers. However, there remains a notable degree of hesitation by some researchers to consider using NLP, at least on their own. The purpose of this article is to introduce and make available a "simple" NLP (SiNLP) tool. The overarching goal of…

  17. CITE NLM: Natural-Language Searching in an Online Catalog.

    ERIC Educational Resources Information Center

    Doszkocs, Tamas E.

    1983-01-01

    The National Library of Medicine's Current Information Transfer in English public access online catalog offers unique subject search capabilities--natural-language query input, automatic medical subject headings display, closest match search strategy, ranked document output, dynamic end user feedback for search refinement. References, description

  18. Principles of Organization in Young Children's Natural Language Hierarchies.

    ERIC Educational Resources Information Center

    Callanan, Maureen A.; Markman, Ellen M.

    1982-01-01

    When preschool children think of objects as organized into collections (e.g., forest, army) they solve certain problems better than when they think of the same objects as organized into classes (e.g., trees, soldiers). Present studies indicate preschool children occasionally distort natural language inclusion hierarchies (e.g., oak, tree) into the

  19. The Nature of Object Marking in American Sign Language

    ERIC Educational Resources Information Center

    Gokgoz, Kadir

    2013-01-01

    In this dissertation, I examine the nature of object marking in American Sign Language (ASL). I investigate object marking by means of directionality (the movement of the verb towards a certain location in signing space) and by means of handling classifiers (certain handshapes accompanying the verb). I propose that object marking in ASL is…

  20. Natural-language access to databases-theoretical/technical issues

    SciTech Connect

    Moore, R.C.

    1982-01-01

    Although there have been many experimental systems for natural-language access to databases, with some now going into actual use, many problems in this area remain to be solved. The author presents descriptions of five problem areas that do not seem to be adequately handled by any existing system.

  1. What Is the Nature of Poststroke Language Recovery and Reorganization?

    PubMed Central

    Kiran, Swathi

    2012-01-01

    This review focuses on three main topics related to the nature of poststroke language recovery and reorganization. The first topic pertains to the nature of anatomical and physiological substrates in the infarcted hemisphere in poststroke aphasia, including the nature of the hemodynamic response in patients with poststroke aphasia, the nature of the peri-infarct tissue, and the neuronal plasticity potential in the infarcted hemisphere. The second section of the paper reviews the current neuroimaging evidence for language recovery in the acute, subacute, and chronic stages of recovery. The third and final section examines changes in connectivity as a function of recovery in poststroke aphasia, specifically in terms of changes in white matter connectivity, changes in functional effective connectivity, and changes in resting state connectivity after stroke. While much progress has been made in our understanding of language recovery, more work needs to be done. Future studies will need to examine whether reorganization of language in poststroke aphasia corresponds to a tighter, more coherent, and efficient network of residual and new regions in the brain. Answering these questions will go a long way towards being able to predict which patients are likely to recover and may benefit from future rehabilitation. PMID:23320190

  2. Research at Yale in Natural Language Processing. Research Report #84.

    ERIC Educational Resources Information Center

    Schank, Roger C.

    This report summarizes the capabilities of five computer programs at Yale that do automatic natural language processing as of the end of 1976. For each program an introduction to its overall intent is given, followed by the input/output, a short discussion of the research underlying the program, and a prognosis for future development. The programs

  3. Design of Lexicons in Some Natural Language Systems.

    ERIC Educational Resources Information Center

    Cercone, Nick; Mercer, Robert

    1980-01-01

    Discusses an investigation of certain problems concerning the structural design of lexicons used in computational approaches to natural language understanding. Emphasizes three aspects of design: retrieval of relevant portions of lexicals items, storage requirements, and representation of meaning in the lexicon. (Available from ALLC, Dr. Rex Last,…

  4. Naturally Simplified Input, Comprehension, and Second Language Acquisition.

    ERIC Educational Resources Information Center

    Ellis, Rod

    This article examines the concept of simplification in second language (SL) learning, reviewing research on the simplified input that both naturalistic and classroom SL learners receive. Research indicates that simplified input, particularly if derived from naturally occurring interactions, does aid comprehension but has not been shown to

  5. Natural language understanding and speech recognition for industrial vision systems

    NASA Astrophysics Data System (ADS)

    Batchelor, Bruce G.

    1992-11-01

    The accepted method of programming machine vision systems for a new application is to incorporate sub-routines from a standard library into code, written specially for the given task. Typical programming languages that might be used here are Pascal, C, and assembly code, although other `conventional' (i.e., imperative) languages are often used instead. The representation of an algorithm to recognize a certain object, in the form of, say, a C language program is clumsy and unnatural, compared to the alternative process of describing the object itself and leaving the software to search for it. The latter method, known as declarative programming, is used extensively both when programming in Prolog and when people talk to one another in English, or other natural languages. Programs to understand a limited sub-set of a natural language can also be written conveniently in Prolog. The article considers the prospects for talking to an image processing system, using only slightly constrained English. Moderately priced speech recognition devices, which interface to a standard desk-top computer and provide a limited repertoire (200 words) as well as the ability to identify isolated words, are already available commercially. At the moment, the goal of talking in English to a computer is incompletely fulfilled. Yet, sufficient progress has been made to encourage greater effort in this direction.

  6. Developing Formal Correctness Properties from Natural Language Requirements

    NASA Technical Reports Server (NTRS)

    Nikora, Allen P.

    2006-01-01

    This viewgraph presentation reviews the rationale of the program to transform natural language specifications into formal notation, specifically to automate the generation of Linear Temporal Logic (LTL) correctness properties from natural language temporal specifications. There are several reasons for this approach: (1) model-based techniques are becoming more widely accepted; (2) analytical verification techniques (e.g., model checking, theorem proving) are significantly more effective at detecting certain types of specification design errors (e.g., race conditions, deadlock) than manual inspection; (3) many requirements are still written in natural language, which imposes a high learning curve for specification languages and their associated tools, while increased schedule and budget pressure on projects reduces training opportunities for engineers; and (4) formulating correctness properties for system models can be a difficult problem. This is relevant to NASA in that it would simplify the development of formal correctness properties, lead to more widespread use of model-based specification and design techniques, assist in earlier identification of defects, and reduce residual defect content for space mission software systems. The presentation also discusses potential applications, accomplishments, technological transfer potential, and next steps.

  7. Blurring the Inputs: A Natural Language Approach to Sensitivity Analysis

    NASA Technical Reports Server (NTRS)

    Kleb, William L.; Thompson, Richard A.; Johnston, Christopher O.

    2007-01-01

    To document model parameter uncertainties and to automate sensitivity analyses for numerical simulation codes, a natural-language-based method to specify tolerances has been developed. With this new method, uncertainties are expressed in a natural manner, i.e., as one would on an engineering drawing, namely, 5.25 +/- 0.01. This approach is robust and readily adapted to various application domains because it does not rely on parsing the particular structure of input file formats. Instead, tolerances of a standard format are added to existing fields within an input file. As a demonstration of the power of this simple, natural language approach, a Monte Carlo sensitivity analysis is performed for three disparate simulation codes: fluid dynamics (LAURA), radiation (HARA), and ablation (FIAT). Effort required to harness each code for sensitivity analysis was recorded to demonstrate the generality and flexibility of this new approach.
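
    As a concrete illustration of the tolerance notation described above, the minimal sketch below (Python, not the authors' implementation) recognizes a "value +/- tolerance" field such as 5.25 +/- 0.01 inside an existing input file and draws one Monte Carlo realization. The regular expression, the uniform sampling, and the example field names are assumptions made for demonstration only.

```python
import random
import re

# Matches a nominal value followed by a "+/-" tolerance, e.g. "5.25 +/- 0.01".
# Pattern and sampling strategy are illustrative assumptions, not the format
# or distribution used by the tool described in the abstract.
TOLERANCE = re.compile(r"(-?\d+(?:\.\d+)?)\s*\+/-\s*(\d+(?:\.\d+)?)")

def blur_line(line: str, rng: random.Random) -> str:
    """Replace each 'value +/- tol' field with a uniform random draw."""
    def draw(match: re.Match) -> str:
        nominal, tol = float(match.group(1)), float(match.group(2))
        return f"{rng.uniform(nominal - tol, nominal + tol):.6g}"
    return TOLERANCE.sub(draw, line)

def blur_file(text: str, seed: int = 0) -> str:
    """Produce one Monte Carlo realization of an annotated input file."""
    rng = random.Random(seed)
    return "\n".join(blur_line(line, rng) for line in text.splitlines())

if __name__ == "__main__":
    sample = "wall_temperature = 5.25 +/- 0.01\nemissivity = 0.89 +/- 0.02"
    print(blur_file(sample, seed=42))
```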

  8. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

    ERIC Educational Resources Information Center

    Jarman, Jay

    2011-01-01

    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  9. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

    ERIC Educational Resources Information Center

    Jarman, Jay

    2011-01-01

    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,

  10. The Parser Doesn't Ignore Intransitivity, after All

    ERIC Educational Resources Information Center

    Staub, Adrian

    2007-01-01

    Several previous studies (B. C. Adams, C. Clifton, & D. C. Mitchell, 1998; D. C. Mitchell, 1987; R. P. G. van Gompel & M. J. Pickering, 2001) have explored the question of whether the parser initially analyzes a noun phrase that follows an intransitive verb as the verb's direct object. Three eye-tracking experiments examined this issue in more

  11. Linking Parser Development to Acquisition of Syntactic Knowledge

    ERIC Educational Resources Information Center

    Omaki, Akira; Lidz, Jeffrey

    2015-01-01

    Traditionally, acquisition of syntactic knowledge and the development of sentence comprehension behaviors have been treated as separate disciplines. This article reviews a growing body of work on the development of incremental sentence comprehension mechanisms and discusses how a better understanding of the developing parser can shed light on two

  12. The Parser Doesn't Ignore Intransitivity, after All

    ERIC Educational Resources Information Center

    Staub, Adrian

    2007-01-01

    Several previous studies (B. C. Adams, C. Clifton, & D. C. Mitchell, 1998; D. C. Mitchell, 1987; R. P. G. van Gompel & M. J. Pickering, 2001) have explored the question of whether the parser initially analyzes a noun phrase that follows an intransitive verb as the verb's direct object. Three eye-tracking experiments examined this issue in more…

  13. Linking Parser Development to Acquisition of Syntactic Knowledge

    ERIC Educational Resources Information Center

    Omaki, Akira; Lidz, Jeffrey

    2015-01-01

    Traditionally, acquisition of syntactic knowledge and the development of sentence comprehension behaviors have been treated as separate disciplines. This article reviews a growing body of work on the development of incremental sentence comprehension mechanisms and discusses how a better understanding of the developing parser can shed light on two…

  14. Conclusiveness of natural languages and recognition of images

    SciTech Connect

    Wojcik, Z.M.

    1983-01-01

    The conclusiveness is investigated using recognition processes and one-one correspondence between expressions of a natural language and graphs representing events. The graphs, as conceived in psycholinguistics, are obtained as a result of perception processes. It is possible to generate and process the graphs automatically, using computers, and then to convert the resulting graphs into expressions of a natural language. Correctness and conclusiveness of the graphs and sentences are investigated using the fundamental condition for events representation processes. Some consequences of the conclusiveness are discussed, e.g. undecidability of arithmetic, human brain asymmetry, correctness of statistical calculations and operations research. It is suggested that group theory should be imposed on mathematical models of any real system. Proof of the fundamental condition is also presented. 14 references.

  15. Using natural language processing techniques to inform research on nanotechnology

    PubMed Central

    Lewinski, Nastassja A

    2015-01-01

    Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics. PMID:26199848

  16. Using natural language processing techniques to inform research on nanotechnology.

    PubMed

    Lewinski, Nastassja A; McInnes, Bridget T

    2015-01-01

    Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics. PMID:26199848

  17. Elicitation of natural language representations of uncertainty using computer technology

    SciTech Connect

    Tonn, B.; Goeltz, R.; Travis, C.; Tennessee Univ., Knoxville, TN

    1989-01-01

    Knowledge elicitation is an important aspect of risk analysis. Knowledge about risks must be accurately elicited from experts for use in risk assessments. Knowledge and perceptions of risks must also be accurately elicited from the public in order to intelligently perform policy analysis and develop and implement programs. Oak Ridge National Laboratory is developing computer technology to effectively and efficiently elicit knowledge from experts and the public. This paper discusses software developed to elicit natural language representations of uncertainty. The software is written in Common Lisp and runs on VAX computer systems and Symbolics Lisp machines. The software has three goals: to determine preferences for using natural language terms to represent uncertainty; to determine likelihood rankings of the terms; and to determine how likelihood estimates are combined to form new terms. The first two goals relate to providing useful results for those interested in risk communication. The third relates to providing cognitive data to further our understanding of people's decision making under uncertainty. The software is used to elicit natural language terms expressing the likelihood of various agents causing cancer in humans, of cancer resulting in various maladies, and of everyday events. 6 refs., 4 figs., 4 tabs.

  18. Automatic Item Generation via Frame Semantics: Natural Language Generation of Math Word Problems.

    ERIC Educational Resources Information Center

    Deane, Paul; Sheehan, Kathleen

    This paper is an exploration of the conceptual issues that have arisen in the course of building a natural language generation (NLG) system for automatic test item generation. While natural language processing techniques are applicable to general verbal items, mathematics word problems are particularly tractable targets for natural language

  19. Applications of Natural Language Processing in Biodiversity Science

    PubMed Central

    Thessen, Anne E.; Cui, Hong; Mozzherin, Dmitry

    2012-01-01

    Centuries of biological knowledge are contained in the massive body of scientific literature, written for human-readability but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science. PMID:22685456

  20. Natural Language Processing Technologies in Radiology Research and Clinical Applications.

    PubMed

    Cai, Tianrun; Giannopoulos, Andreas A; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K; Rybicki, Frank J; Mitsouras, Dimitrios

    2016-01-01

    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively "mine" these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. "Intelligent" search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016. PMID:26761536

  1. Human task animation from performance models and natural language input

    NASA Technical Reports Server (NTRS)

    Esakov, Jeffrey; Badler, Norman I.; Jung, Moon

    1989-01-01

    Graphical manipulation of human figures is essential for certain types of human factors analyses such as reach, clearance, fit, and view. In many situations, however, the animation of simulated people performing various tasks may be based on more complicated functions involving multiple simultaneous reaches, critical timing, resource availability, and human performance capabilities. One rather effective means for creating such a simulation is through a natural language description of the tasks to be carried out. Given an anthropometrically-sized figure and a geometric workplace environment, various simple actions such as reach, turn, and view can be effectively controlled from language commands or standard NASA checklist procedures. The commands may also be generated by external simulation tools. Task timing is determined from actual performance models, if available, such as strength models or Fitts' Law. The resulting action specifications are animated on a Silicon Graphics Iris workstation in real-time.
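
    For reference, the Fitts' Law timing model mentioned above is commonly stated as follows (a standard formulation, not quoted from this work), giving movement time MT as a function of target distance D and target width W with empirically fitted constants a and b:

```latex
MT \;=\; a + b \,\log_{2}\!\left(\frac{2D}{W}\right)
```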

  2. Storing files in a parallel computing system based on user-specified parser function

    DOEpatents

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron

    2014-10-21

    Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.
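
    The sketch below illustrates the general idea of a user-supplied parser deciding what gets stored and what metadata travels with it. All of the types and names are hypothetical; the patent's actual interfaces are not reproduced in the abstract.

```python
from typing import Callable, Iterable

# Hypothetical types for illustration; the patented system's real interfaces
# are not described in enough detail in the abstract to reproduce here.
ParseResult = tuple[bool, dict]            # (satisfies semantic requirements, metadata)
Parser = Callable[[bytes], ParseResult]

class StorageNode:
    def __init__(self) -> None:
        self.objects: dict[str, tuple[bytes, dict]] = {}

    def put(self, name: str, data: bytes, metadata: dict) -> None:
        self.objects[name] = (data, metadata)

def store_files(files: Iterable[tuple[str, bytes]],
                parser: Parser,
                nodes: list[StorageNode]) -> None:
    """Store only files accepted by the user-supplied parser, keeping the
    metadata it extracts alongside the data (round-robin over nodes)."""
    for i, (name, data) in enumerate(files):
        keep, metadata = parser(data)
        if keep:
            nodes[i % len(nodes)].put(name, data, metadata)

# Example parser: keep files that look like CSV and record their column count.
def csv_parser(data: bytes) -> ParseResult:
    header = data.split(b"\n", 1)[0]
    is_csv = b"," in header
    return is_csv, ({"columns": header.count(b",") + 1} if is_csv else {})

if __name__ == "__main__":
    nodes = [StorageNode(), StorageNode()]
    store_files([("a.csv", b"x,y\n1,2"), ("b.bin", b"\x00\x01")], csv_parser, nodes)
    print(sum(len(n.objects) for n in nodes))  # 1: only the CSV was stored
```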

  3. Knowledge-Assisted Document Retrieval: I. The Natural-Language Interface. II. The Retrieval Process.

    ERIC Educational Resources Information Center

    Biswas, Gautam; And Others

    1987-01-01

    Two articles describe a model for processing natural-language queries in information retrieval systems. Part I proposes a language interface based on fuzzy set techniques to handle the uncertainty inherent in natural-language semantics. Part II develops a model of the retrieval system and describes an implementation using a knowledge-based systems

  4. Deviations in the Zipf and Heaps laws in natural languages

    NASA Astrophysics Data System (ADS)

    Bochkarev, Vladimir V.; Lerner, Eduard Yu; Shevlyakova, Anna V.

    2014-03-01

    This paper is devoted to verifying the empirical Zipf and Heaps laws in natural languages using Google Books Ngram corpus data. The connection between Zipf's law and Heaps' law, which predicts a power-law dependence of vocabulary size on text size, is discussed. In fact, the Heaps exponent in this dependence varies as the text corpus grows. To explain this, the obtained results are compared with a probabilistic model of text generation. Quasi-periodic variations with characteristic time periods of 60-100 years were also found.
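
    For reference, the standard formulations of the two laws under discussion (as usually stated in the literature, not quoted from this paper) are:

```latex
% Zipf's law: the frequency f of the word of rank r decays as a power law.
f(r) \;\propto\; r^{-\alpha}, \qquad \alpha \approx 1
% Heaps' law: vocabulary size V grows as a power of text size n; the exponent
% \beta < 1 is the quantity observed here to drift as the corpus grows.
V(n) \;\approx\; K\, n^{\beta}, \qquad 0 < \beta < 1
```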

  5. Natural Language Processing as a Discipline at LLNL

    SciTech Connect

    Firpo, M A

    2005-02-04

    The field of Natural Language Processing (NLP) is described as it applies to the needs of LLNL in handling free-text. The state of the practice is outlined with the emphasis placed on two specific aspects of NLP: Information Extraction and Discourse Integration. A brief description is included of the NLP applications currently being used at LLNL. A gap analysis provides a look at where the technology needs work in order to meet the needs of LLNL. Finally, recommendations are made to meet these needs.

  6. Augmenting a database knowledge representation for natural language generation

    SciTech Connect

    McCoy, K.F.

    1982-01-01

    The knowledge representation is an important factor in natural language generation since it limits the semantic capabilities of the generation system. This paper identifies several information types in a knowledge representation that can be used to generate meaningful responses to questions about database structure. Creating such a knowledge representation, however, is a long and tedious process. A system is presented which uses the contents of the database to form part of this knowledge representation automatically. It employs three types of world knowledge axioms to ensure that the representation formed is meaningful and contains salient information. 7 references.

  7. Medical Facts to Support Inferencing in Natural Language Processing

    PubMed Central

    Rindflesch, Thomas C.; Pakhomov, Serguei V.; Fiszman, Marcelo; Kilicoglu, Halil; Sanchez, Vincent R.

    2005-01-01

    We report on the use of medical facts to support the enhancement of natural language processing of biomedical text. Inferencing in semantic interpretation depends on a fact repository as well as an ontology. We used statistical methods to construct a repository of drug-disorder co-occurrences from a large collection of clinical notes, and this resource is used to validate inferences automatically drawn during semantic interpretation of Medline citations about pharmacologic interventions for disease. We evaluated the results against a published reference standard for treatment of diseases. PMID:16779117

  8. Neurolinguistics and psycholinguistics as a basis for computer acquisition of natural language

    SciTech Connect

    Powers, D.M.W.

    1983-04-01

    Research into natural language understanding systems for computers has concentrated on implementing particular grammars and grammatical models of the language concerned. This paper presents a rationale for research into natural language understanding systems based on neurological and psychological principles. Important features of the approach are that it seeks to place the onus of learning the language on the computer, and that it seeks to make use of the vast wealth of relevant psycholinguistic and neurolinguistic theory. 22 references.

  9. Solving problems on base of concepts formalization of language image and figurative meaning of the natural-language constructs

    NASA Astrophysics Data System (ADS)

    Bisikalo, Oleg V.; Cieszczyk, Sławomir; Yussupova, Gulbahar

    2015-12-01

    Building of "clever" thesaurus by algebraic means on base of concepts formalization of language image and figurative meaning of the natural-language constructs in the article are proposed. A formal theory based on a binary operator of directional associative relation is constructed and an understanding of an associative normal form of image constructions is introduced. A model of a commutative semigroup, which provides a presentation of a sentence as three components of an interrogative language image construction, is considered.

  10. Does textual feedback hinder spoken interaction in natural language?

    PubMed

    Le Bigot, Ludovic; Terrier, Patrice; Jamet, Eric; Botherel, Valerie; Rouet, Jean-Francois

    2010-01-01

    The aim of the study was to determine the influence of textual feedback on the content and outcome of spoken interaction with a natural language dialogue system. More specifically, the assumption that textual feedback could disrupt spoken interaction was tested in a human-computer dialogue situation. In total, 48 adult participants, familiar with the system, had to find restaurants based on simple or difficult scenarios using a real natural language service system in a speech-only (phone), speech plus textual dialogue history (multimodal) or text-only (web) modality. The linguistic contents of the dialogues differed as a function of modality, but were similar whether the textual feedback was included in the spoken condition or not. These results add to burgeoning research efforts on multimodal feedback, in suggesting that textual feedback may have little or no detrimental effect on information searching with a real system. STATEMENT OF RELEVANCE: The results suggest that adding textual feedback to interfaces for human-computer dialogue could enhance spoken interaction rather than create interference. The literature currently suggests that adding textual feedback to tasks that depend on the visual sense benefits human-computer interaction. The addition of textual output when the spoken modality is heavily taxed by the task was investigated. PMID:20069480

  11. Breaking the Molds: Signed Languages and the Nature of Human Language

    ERIC Educational Resources Information Center

    Slobin, Dan I.

    2008-01-01

    Grammars of signed languages tend to be based on grammars established for written languages, particularly the written language in use in the surrounding hearing community of a sign language. Such grammars presuppose categories of discrete elements which are combined into various sorts of structures. Recent analyses of signed languages go beyond…

  12. Breaking the Molds: Signed Languages and the Nature of Human Language

    ERIC Educational Resources Information Center

    Slobin, Dan I.

    2008-01-01

    Grammars of signed languages tend to be based on grammars established for written languages, particularly the written language in use in the surrounding hearing community of a sign language. Such grammars presuppose categories of discrete elements which are combined into various sorts of structures. Recent analyses of signed languages go beyond

  13. Emerging Approach of Natural Language Processing in Opinion Mining: A Review

    NASA Astrophysics Data System (ADS)

    Kim, Tai-Hoon

    Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages. This paper outlines a framework that uses computer and natural language techniques to help learners at various levels learn foreign languages in a computer-based learning environment. We propose some ideas for using the computer as a practical tool for learning a foreign language, where most of the courseware is generated automatically. We then describe how to build computer-based learning tools, discuss their effectiveness, and conclude with some possibilities for using on-line resources.

  14. Second-language instinct and instruction effects: nature and nurture in second-language acquisition.

    PubMed

    Yusa, Noriaki; Koizumi, Masatoshi; Kim, Jungho; Kimura, Naoki; Uchida, Shinya; Yokoyama, Satoru; Miura, Naoki; Kawashima, Ryuta; Hagiwara, Hiroko

    2011-10-01

    Adults seem to have greater difficulties than children in acquiring a second language (L2) because of the alleged "window of opportunity" around puberty. Postpuberty Japanese participants learned a new English rule with simplex sentences during one month of instruction, and then they were tested on "uninstructed complex sentences" as well as "instructed simplex sentences." The behavioral data show that they can acquire more knowledge than is instructed, suggesting the interweaving of nature (universal principles of grammar, UG) and nurture (instruction) in L2 acquisition. The comparison in the "uninstructed complex sentences" between post-instruction and pre-instruction using functional magnetic resonance imaging reveals a significant activation in Broca's area. Thus, this study provides new insight into Broca's area, where nature and nurture cooperate to produce L2 learners' rich linguistic knowledge. It also shows neural plasticity of adult L2 acquisition, arguing against a critical period hypothesis, at least in the domain of UG. PMID:21254799

  15. Semantic Grammar: A Technique for Constructing Natural Language Interfaces to Instructional Systems.

    ERIC Educational Resources Information Center

    Burton, Richard R.; Brown, John Seely

    A major obstacle to the effective educational use of computers is the lack of a natural means of communication between the student and the computer. This report describes a technique for generating such natural language front-ends for advanced instructional systems. It discusses: (1) the essential properties of a natural language front-end, (2)

  16. Spatial and numerical abilities without a complete natural language.

    PubMed

    Hyde, Daniel C; Winkler-Rhoades, Nathan; Lee, Sang-Ah; Izard, Veronique; Shapiro, Kevin A; Spelke, Elizabeth S

    2011-04-01

    We studied the cognitive abilities of a 13-year-old deaf child, deprived of most linguistic input from late infancy, in a battery of tests designed to reveal the nature of numerical and geometrical abilities in the absence of a full linguistic system. Tests revealed widespread proficiency in basic symbolic and non-symbolic numerical computations involving the use of both exact and approximate numbers. Tests of spatial and geometrical abilities revealed an interesting patchwork of age-typical strengths and localized deficits. In particular, the child performed extremely well on navigation tasks involving geometrical or landmark information presented in isolation, but very poorly on otherwise similar tasks that required the combination of the two types of spatial information. Tests of number- and space-specific language revealed proficiency in the use of number words and deficits in the use of spatial terms. This case suggests that a full linguistic system is not necessary to reap the benefits of linguistic vocabulary on basic numerical tasks. Furthermore, it suggests that language plays an important role in the combination of mental representations of space. PMID:21168425

  17. Spatial and numerical abilities without a complete natural language

    PubMed Central

    Hyde, Daniel C.; Winkler-Rhoades, Nathan; Lee, Sang-Ah; Izard, Veronique; Shapiro, Kevin A.; Spelke, Elizabeth S.

    2011-01-01

    We studied the cognitive abilities of a 13-year-old deaf child, deprived of most linguistic input from late infancy, in a battery of tests designed to reveal the nature of numerical and geometrical abilities in the absence of a full linguistic system. Tests revealed widespread proficiency in basic symbolic and non-symbolic numerical computations involving the use of both exact and approximate numbers. Tests of spatial and geometrical abilities revealed an interesting patchwork of age-typical strengths and localized deficits. In particular, the child performed extremely well on navigation tasks involving geometrical or landmark information presented in isolation, but very poorly on otherwise similar tasks that required the combination of the two types of spatial information. Tests of number- and space-specific language revealed proficiency in the use of number words and deficits in the use of spatial terms. This case suggests that a full linguistic system is not necessary to reap the benefits of linguistic vocabulary on basic numerical tasks. Furthermore, it suggests that language plays an important role in the combination of mental representations of space. PMID:21168425

  18. Towards a semantic lexicon for clinical natural language processing.

    PubMed

    Liu, Hongfang; Wu, Stephen T; Li, Dingcheng; Jonnalagadda, Siddhartha; Sohn, Sunghwan; Wagholikar, Kavishwar; Haug, Peter J; Huff, Stanley M; Chute, Christopher G

    2012-01-01

    A semantic lexicon which associates words and phrases in text to concepts is critical for extracting and encoding clinical information in free text and therefore achieving semantic interoperability between structured and unstructured data in Electronic Health Records (EHRs). Directly using existing standard terminologies may have limited coverage with respect to concepts and their corresponding mentions in text. In this paper, we analyze how tokens and phrases in a large corpus distribute and how well the UMLS captures the semantics. A corpus-driven semantic lexicon, MedLex, has been constructed where the semantics is based on the UMLS assisted with variants mined and usage information gathered from clinical text. The detailed corpus analysis of tokens, chunks, and concept mentions shows the UMLS is an invaluable source for natural language processing. Increasing the semantic coverage of tokens provides a good foundation in capturing clinical information comprehensively. The study also yields some insights in developing practical NLP systems. PMID:23304329

  19. Detection of Blood Culture Bacterial Contamination using Natural Language Processing

    PubMed Central

    Matheny, Michael E.; FitzHenry, Fern; Speroff, Theodore; Hathaway, Jacob; Murff, Harvey J.; Brown, Steven H.; Fielstein, Elliot M.; Dittus, Robert S.; Elkin, Peter L.

    2009-01-01

    Microbiology results are reported in semi-structured formats and have a high content of useful patient information. We developed and validated a hybrid regular expression and natural language processing solution for processing blood culture microbiology reports. Multi-center Veterans Affairs training and testing data sets were randomly extracted and manually reviewed to determine the culture and sensitivity as well as contamination results. The tool was iteratively developed for both outcomes using a training dataset, and then evaluated on the test dataset to determine antibiotic susceptibility data extraction and contamination detection performance. Our algorithm had a sensitivity of 84.8% and a positive predictive value of 96.0% for mapping the antibiotics and bacteria with appropriate sensitivity findings in the test data. The bacterial contamination detection algorithm had a sensitivity of 83.3% and a positive predictive value of 81.8%. PMID:20351890
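
    The sketch below shows the flavor of a hybrid regular-expression approach to such reports. The organism list, negation cue, and susceptibility pattern are invented for illustration and are not the published algorithm's rules.

```python
import re

# Illustrative patterns only; the published algorithm is not reproduced in the
# abstract, so these expressions are assumptions for demonstration.
ORGANISM = re.compile(
    r"(coagulase[- ]negative staph\w*|staphylococcus epidermidis|"
    r"corynebacterium|propionibacterium|bacillus(?! anthracis))",
    re.IGNORECASE,
)
SUSCEPTIBILITY = re.compile(
    r"(?P<antibiotic>\w+)\s*[:=]?\s*(?P<result>susceptible|intermediate|resistant)",
    re.IGNORECASE,
)
NEGATION = re.compile(r"\bno growth\b|\bnegative\b", re.IGNORECASE)

def flag_possible_contaminant(report: str) -> bool:
    """Flag reports mentioning a common skin-flora organism with no negation cue."""
    return bool(ORGANISM.search(report)) and not NEGATION.search(report)

def extract_susceptibilities(report: str) -> list[tuple[str, str]]:
    """Pull (antibiotic, result) pairs from the semi-structured text."""
    return [(m.group("antibiotic").lower(), m.group("result").lower())
            for m in SUSCEPTIBILITY.finditer(report)]

if __name__ == "__main__":
    text = "1 of 2 bottles: coagulase-negative staphylococci. Oxacillin: resistant."
    print(flag_possible_contaminant(text), extract_susceptibilities(text))
```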

  20. AutoTutor: a tutor with dialogue in natural language.

    PubMed

    Graesser, Arthur C; Lu, Shulan; Jackson, George Tanner; Mitchell, Heather Hite; Ventura, Mathew; Olney, Andrew; Louwerse, Max M

    2004-05-01

    AutoTutor is a learning environment that tutors students by holding a conversation in natural language. AutoTutor has been developed for Newtonian qualitative physics and computer literacy. Its design was inspired by explanation-based constructivist theories of learning, intelligent tutoring systems that adaptively respond to student knowledge, and empirical research on dialogue patterns in tutorial discourse. AutoTutor presents challenging problems (formulated as questions) from a curriculum script and then engages in mixed initiative dialogue that guides the student in building an answer. It provides the student with positive, neutral, or negative feedback on the student's typed responses, pumps the student for more information, prompts the student to fill in missing words, gives hints, fills in missing information with assertions, identifies and corrects erroneous ideas, answers the student's questions, and summarizes answers. AutoTutor has produced learning gains of approximately .70 sigma for deep levels of comprehension. PMID:15354683

  1. What can Natural Language Processing do for Clinical Decision Support?

    PubMed Central

    Demner-Fushman, Dina; Chapman, Wendy W.; McDonald, Clement J.

    2009-01-01

    Computerized Clinical Decision Support (CDS) aims to aid decision making of health care providers and the public by providing easily accessible health-related information at the point and time it is needed. Natural Language Processing (NLP) is instrumental in using free-text information to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative. The early innovative NLP research of clinical narrative was followed by a period of stable research conducted at the major clinical centers and a shift of mainstream interest to biomedical NLP. This review primarily focuses on the recently renewed interest in development of fundamental NLP methods and advances in the NLP systems for CDS. The current solutions to challenges posed by distinct sublanguages, intended user groups, and support goals are discussed. PMID:19683066

  2. Natural Language Processing Methods and Systems for Biomedical Ontology Learning

    PubMed Central

    Liu, Kaihong; Hogan, William R.; Crowley, Rebecca S.

    2010-01-01

    While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they must achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships as well as difficulty in updating the ontology as knowledge changes. Methodologies developed in the fields of natural language processing, information extraction, information retrieval and machine learning provide techniques for automating the enrichment of an ontology from free-text documents. In this article, we review existing methodologies and developed systems, and discuss how existing methods can benefit the development of biomedical ontologies. PMID:20647054

  3. Literature-Based Knowledge Discovery using Natural Language Processing

    NASA Astrophysics Data System (ADS)

    Hristovski, D.; Friedman, C.; Rindflesch, T. C.; Peterlin, B.

    Literature-based discovery (LBD) is an emerging methodology for uncovering nonovert relationships in the online research literature. Making such relationships explicit supports hypothesis generation and discovery. Currently LBD systems depend exclusively on co-occurrence of words or concepts in target documents, regardless of whether relations actually exist between the words or concepts. We describe a method to enhance LBD through capture of semantic relations from the literature via use of natural language processing (NLP). This paper reports on an application of LBD that combines two NLP systems: BioMedLEE and SemRep, which are coupled with an LBD system called BITOLA. The two NLP systems complement each other to increase the types of information utilized by BITOLA. We also discuss issues associated with combining heterogeneous systems. Initial experiments suggest this approach can uncover new associations that were not possible using previous methods.

  4. From Web Directories to Ontologies: Natural Language Processing Challenges

    NASA Astrophysics Data System (ADS)

    Zaihrayeu, Ilya; Sun, Lei; Giunchiglia, Fausto; Pan, Wei; Ju, Qi; Chi, Mingmin; Huang, Xuanjing

    Hierarchical classifications are used pervasively by humans as a means to organize their data and knowledge about the world. One of their main advantages is that natural language labels, used to describe their contents, are easily understood by human users. However, at the same time, this is also one of their main disadvantages, as these same labels are ambiguous and very hard for software agents to reason about. This fact creates an insuperable hindrance to embedding classifications in the Semantic Web infrastructure. This paper presents an approach to converting classifications into lightweight ontologies, and it makes the following contributions: (i) it identifies the main NLP problems related to the conversion process and shows how they are different from the classical problems of NLP; (ii) it proposes heuristic solutions to these problems, which are especially effective in this domain; and (iii) it evaluates the proposed solutions by testing them on DMoz data.

  5. Tasking and sharing sensing assets using controlled natural language

    NASA Astrophysics Data System (ADS)

    Preece, Alun; Pizzocaro, Diego; Braines, David; Mott, David

    2012-06-01

    We introduce an approach to representing intelligence, surveillance, and reconnaissance (ISR) tasks at a relatively high level in controlled natural language. We demonstrate that this facilitates both human interpretation and machine processing of tasks. More specifically, it allows the automatic assignment of sensing assets to tasks, and the informed sharing of tasks between collaborating users in a coalition environment. To enable automatic matching of sensor types to tasks, we created a machine-processable knowledge representation based on the Military Missions and Means Framework (MMF), and implemented a semantic reasoner to match task types to sensor types. We combined this mechanism with a sensor-task assignment procedure based on a well-known distributed protocol for resource allocation. In this paper, we re-formulate the MMF ontology in Controlled English (CE), a type of controlled natural language designed to be readable by a native English speaker whilst representing information in a structured, unambiguous form to facilitate machine processing. We show how CE can be used to describe both ISR tasks (for example, detection, localization, or identification of particular kinds of object) and sensing assets (for example, acoustic, visual, or seismic sensors, mounted on motes or unmanned vehicles). We show how these representations enable an automatic sensor-task assignment process. Where a group of users are cooperating in a coalition, we show how CE task summaries give users in the field a high-level picture of ISR coverage of an area of interest. This allows them to make efficient use of sensing resources by sharing tasks.
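
    A minimal sketch of matching sensing assets to ISR tasks by capability is shown below. The capability vocabulary and the greedy assignment rule are assumptions for illustration; the paper itself uses an MMF-based ontology in Controlled English and a distributed allocation protocol, neither of which is reproduced here.

```python
# Hypothetical task requirements and sensor capabilities, expressed as sets of
# sensing modalities; these stand in for the ontology-derived matches.
TASKS = {
    "detect-vehicle": {"acoustic", "seismic", "visual"},
    "identify-person": {"visual"},
    "localize-gunfire": {"acoustic"},
}
SENSORS = {
    "mote-17": {"acoustic", "seismic"},
    "uav-2": {"visual"},
    "mote-21": {"acoustic"},
}

def assign(tasks: dict[str, set[str]], sensors: dict[str, set[str]]) -> dict[str, str]:
    """Greedily give each task the first free sensor whose capabilities
    intersect the task's required modalities."""
    free = dict(sensors)
    assignment: dict[str, str] = {}
    for task, needed in tasks.items():
        match = next((s for s, caps in free.items() if caps & needed), None)
        if match is not None:
            assignment[task] = match
            del free[match]
    return assignment

if __name__ == "__main__":
    print(assign(TASKS, SENSORS))  # each task paired with a compatible sensor
```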

  6. Constructing Concept Schemes From Astronomical Telegrams Via Natural Language Clustering

    NASA Astrophysics Data System (ADS)

    Graham, Matthew; Zhang, M.; Djorgovski, S. G.; Donalek, C.; Drake, A. J.; Mahabal, A.

    2012-01-01

    The rapidly emerging field of time domain astronomy is one of the most exciting and vibrant new research frontiers, ranging in scientific scope from studies of the Solar System to extreme relativistic astrophysics and cosmology. It is being enabled by a new generation of large synoptic digital sky surveys - LSST, PanStarrs, CRTS - that cover large areas of sky repeatedly, looking for transient objects and phenomena. One of the biggest challenges facing these is the automated classification of transient events, a process that needs machine-processible astronomical knowledge. Semantic technologies enable the formal representation of concepts and relations within a particular domain. ATELs (http://www.astronomerstelegram.org) are a commonly-used means for reporting and commenting upon new astronomical observations of transient sources (supernovae, stellar outbursts, blazar flares, etc). However, they are loose and unstructured and employ scientific natural language for description: this makes automated processing of them - a necessity within the next decade with petascale data rates - a challenge. Nevertheless they represent a potentially rich corpus of information that could lead to new and valuable insights into transient phenomena. This project lies in the cutting-edge field of astrosemantics, a branch of astroinformatics, which applies semantic technologies to astronomy. The ATELs have been used to develop an appropriate concept scheme - a representation of the information they contain - for transient astronomy using hierarchical clustering of processed natural language. This allows us to automatically organize ATELs based on the vocabulary used. We conclude that we can use simple algorithms to process and extract meaning from astronomical textual data.
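
    As a toy illustration of the clustering step (not the project's actual pipeline), short notices can be grouped by hierarchical clustering of their TF-IDF vocabulary vectors; the example texts below are invented and scikit-learn is assumed to be available.

```python
# Group short astronomical notices by the vocabulary they share:
# TF-IDF vectors followed by agglomerative (hierarchical) clustering.
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer

telegrams = [
    "Spectroscopic classification of a type Ia supernova in NGC 1234",
    "Discovery of a new supernova candidate near the galaxy core",
    "Optical flare detected from the blazar 3C 279",
    "Gamma-ray flaring activity of blazar PKS 1510-089",
]

# Represent each telegram by TF-IDF weights over its words.
vectors = TfidfVectorizer(stop_words="english").fit_transform(telegrams).toarray()

# Hierarchical clustering groups telegrams with overlapping vocabulary.
labels = AgglomerativeClustering(n_clusters=2).fit_predict(vectors)
for text, label in zip(telegrams, labels):
    print(label, text[:60])
```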

  7. A BIBLIOGRAPHY ON THE NATURE, RECOGNITION AND TREATMENT OF LANGUAGE DIFFICULTIES.

    ERIC Educational Resources Information Center

    RAWSON, MARGARET B.

    A SELECTED READING AND REFERENCE LIST OF PUBLICATIONS FROM 1896 TO 1966 ON THE NATURE, RECOGNITION, AND TREATMENT OF LANGUAGE DIFFICULTIES IS PRESENTED. THE TITLES WERE SELECTED ON THE BASIS OF RELEVANCE TO THE GENERAL INTERESTS AND SPECIFIC NEEDS OF PEOPLE CONCERNED WITH LANGUAGE DISORDERS, PARTICULARLY WITH A SPECIFIC LANGUAGE DISABILITY.

  8. ONE GRAMMAR OR TWO? Sign Languages and the Nature of Human Language.

    PubMed

    Lillo-Martin, Diane C; Gajewski, Jon

    2014-01-01

    Linguistic research has identified abstract properties that seem to be shared by all languages - such properties may be considered defining characteristics. In recent decades, the recognition that human language is found not only in the spoken modality, but also in the form of sign languages, has led to a reconsideration of some of these potential linguistic universals. In large part, the linguistic analysis of sign languages has led to the conclusion that universal characteristics of language can be stated at an abstract enough level to include languages in both spoken and signed modalities. For example, languages in both modalities display hierarchical structure at sub-lexical and phrasal level, and recursive rule application. However, this does not mean that modality-based differences between signed and spoken languages are trivial. In this article, we consider several candidate domains for modality effects, in light of the overarching question: are signed and spoken languages subject to the same abstract grammatical constraints, or is a substantially different conception of grammar needed for the sign language case? We look at differences between language types based on the use of space, iconicity, and the possibility for simultaneity in linguistic expression. The inclusion of sign languages does support some broadening of the conception of human language - in ways that are applicable for spoken languages as well. Still, the overall conclusion is that one grammar applies for human language, no matter the modality of expression. PMID:25013534

  9. The Nature of Spanish versus English Language Use at Home

    ERIC Educational Resources Information Center

    Branum-Martin, Lee; Mehta, Paras D.; Carlson, Coleen D.; Francis, David J.; Goldenberg, Claude

    2014-01-01

    Home language experiences are important for children's development of language and literacy. However, the home language context is complex, especially for Spanish-speaking children in the United States. A child's use of Spanish or English likely ranges along a continuum, influenced by preferences of particular people involved, such as parents,

  10. Applying semantic-based probabilistic context-free grammar to medical language processing--a preliminary study on parsing medication sentences.

    PubMed

    Xu, Hua; AbdelRahman, Samir; Lu, Yanxin; Denny, Joshua C; Doan, Son

    2011-12-01

    Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently result in two or more parse trees. One possible solution, which has not been extensively explored previously, is to augment productions in medical sublanguage grammars with probabilities to resolve the ambiguity. In this study, we associated probabilities with production rules in a semantic-based grammar for medication findings and evaluated its performance on reducing parsing ambiguity. Using the existing data set from 2009 i2b2 NLP (Natural Language Processing) challenge for medication extraction, we developed a semantic-based CFG (Context Free Grammar) for parsing medication sentences and manually created a Treebank of 4564 medication sentences from discharge summaries. Using the Treebank, we derived a semantic-based PCFG (Probabilistic Context Free Grammar) for parsing medication sentences. Our evaluation using a 10-fold cross validation showed that the PCFG parser dramatically improved parsing performance when compared to the CFG parser. PMID:21856440
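
    The approach described above pairs a hand-built semantic grammar with rule probabilities estimated from a treebank. A toy illustration of the same mechanism, assuming NLTK and an invented miniature medication grammar (not the study's grammar or data):

        import nltk

        # A tiny probabilistic grammar for medication phrases; probabilities on
        # alternative productions let the Viterbi parser rank competing parses.
        grammar = nltk.PCFG.fromstring("""
            MED  -> DRUG DOSE [0.6] | DRUG DOSE FREQ [0.4]
            DRUG -> 'aspirin' [0.5] | 'metformin' [0.5]
            DOSE -> NUM UNIT [1.0]
            NUM  -> '81' [0.5] | '500' [0.5]
            UNIT -> 'mg' [1.0]
            FREQ -> 'daily' [0.5] | 'twice' 'daily' [0.5]
        """)

        parser = nltk.ViterbiParser(grammar)
        for tree in parser.parse(['aspirin', '81', 'mg', 'daily']):
            print(tree)   # the most probable parse, annotated with its probability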

  11. Using a natural language and gesture interface for unmanned vehicles

    NASA Astrophysics Data System (ADS)

    Perzanowski, Dennis; Schultz, Alan C.; Adams, William; Marsh, Elaine

    2000-07-01

    Unmanned vehicles, such as mobile robots, must exhibit adjustable autonomy. They must be able to be self-sufficient when the situation warrants; however, as they interact with each other and with humans, they must exhibit an ability to dynamically adjust their independence or dependence as co-operative agents attempting to achieve some goal. This is what we mean by adjustable autonomy. We have been investigating various modes of communication that enhance a robot's capability to work interactively with other robots and with humans. Specifically, we have been investigating how natural language and gesture can provide a user-friendly interface to mobile robots. We have extended this initial work to include semantic and pragmatic procedures that allow humans and robots to act co-operatively, based on whether or not goals have been achieved by the various agents in the interaction. By processing commands that are either spoken or initiated by clicking buttons on a Personal Digital Assistant and by gesturing either naturally or symbolically, we are tracking the various goals of the interaction, the agent involved in the interaction, and whether or not the goal has been achieved. The various agents involved in achieving the goals are each aware of their own and others' goals and what goals have been stated or accomplished so that eventually any member of the group, be it robot or a human, if necessary, can interact with the other members to achieve the stated goals of a mission.

  12. Natural and Artificial Intelligence, Language, Consciousness, Emotion, and Anticipation

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2010-11-01

    The classical paradigm of the neural brain as the seat of human natural intelligence is too restrictive. This paper defends the idea that the neural ectoderm is the actual brain, based on the development of the human embryo. Indeed, the neural ectoderm includes the neural crest, given by pigment cells in the skin and ganglia of the autonomic nervous system, and the neural tube, given by the brain, the spinal cord, and motor neurons. So the brain is completely integrated in the ectoderm, and cannot work alone. The paper presents fundamental properties of the brain as follows. Firstly, Paul D. MacLean proposed the triune human brain, which consists to three brains in one, following the species evolution, given by the reptilian complex, the limbic system, and the neo-cortex. Secondly, the consciousness and conscious awareness are analysed. Thirdly, the anticipatory unconscious free will and conscious free veto are described in agreement with the experiments of Benjamin Libet. Fourthly, the main section explains the development of the human embryo and shows that the neural ectoderm is the whole neural brain. Fifthly, a conjecture is proposed that the neural brain is completely programmed with scripts written in biological low-level and high-level languages, in a manner similar to the programmed cells by the genetic code. Finally, it is concluded that the proposition of the neural ectoderm as the whole neural brain is a breakthrough in the understanding of the natural intelligence, and also in the future design of robots with artificial intelligence.

  13. The parser doesn't ignore intransitivity, after all

    PubMed Central

    Staub, Adrian

    2015-01-01

    Several previous studies (Adams, Clifton, & Mitchell, 1998; Mitchell, 1987; van Gompel & Pickering, 2001) have explored the question of whether the parser initially analyzes a noun phrase that follows an intransitive verb as the verb's direct object. Three eyetracking experiments examined this issue in more detail. Experiment 1 strongly replicated the finding (van Gompel & Pickering, 2001) that readers experience difficulty on this noun phrase in normal reading, and found that this difficulty occurs even with a class of intransitive verbs for which a direct object is categorically prohibited. Experiment 2, however, demonstrated that this effect is not due to syntactic misanalysis, but is instead due to disruption that occurs when a comma is absent at a subordinate clause/main clause boundary. Exploring a different construction, Experiment 3 replicated the finding (Pickering & Traxler, 2003; Traxler & Pickering, 1996) that when a noun phrase filler is an implausible direct object for an optionally transitive relative clause verb, processing difficulty results; however, there was no evidence for such difficulty when the relative clause verb was strictly intransitive. Taken together, the three experiments undermine the support for the claim that the parser initially ignores a verb's subcategorization restrictions. PMID:17470005

  14. The Unification Space implemented as a localist neural net: predictions and error-tolerance in a constraint-based parser.

    PubMed

    Vosse, Theo; Kempen, Gerard

    2009-12-01

    We introduce a novel computer implementation of the Unification-Space parser (Vosse and Kempen in Cognition 75:105-143, 2000) in the form of a localist neural network whose dynamics is based on interactive activation and inhibition. The wiring of the network is determined by Performance Grammar (Kempen and Harbusch in Verb constructions in German and Dutch. Benjamins, Amsterdam, 2003), a lexicalist formalism with feature unification as binding operation. While the network is processing input word strings incrementally, the evolving shape of parse trees is represented in the form of changing patterns of activation in nodes that code for syntactic properties of words and phrases, and for the grammatical functions they fulfill. The system is capable, at least qualitatively and rudimentarily, of simulating several important dynamic aspects of human syntactic parsing, including garden-path phenomena and reanalysis, effects of complexity (various types of clause embeddings), fault-tolerance in case of unification failures and unknown words, and predictive parsing (expectation-based analysis, surprisal effects). English is the target language of the parser described. PMID:19784798
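
    As a rough illustration of interactive activation and inhibition in a localist network (not the Unification-Space implementation itself), the sketch below lets a handful of nodes with invented excitatory and inhibitory links settle toward a stable activation pattern; NumPy is assumed:

        import numpy as np

        # Symmetric link weights between four hypothetical parse nodes:
        # positive = mutual support, negative = competition (values are invented).
        W = np.array([[ 0.0,  0.5, -0.6,  0.0],
                      [ 0.5,  0.0, -0.6,  0.2],
                      [-0.6, -0.6,  0.0,  0.4],
                      [ 0.0,  0.2,  0.4,  0.0]])

        external = np.array([0.3, 0.0, 0.25, 0.0])   # bottom-up input from the current word
        a = np.full(4, 0.1)                          # initial activations

        for _ in range(100):
            net = W @ a + external                       # excitation and inhibition from neighbours
            a = np.clip(a + 0.1 * (net - a), 0.0, 1.0)   # leaky update, bounded to [0, 1]

        print(np.round(a, 2))                        # activation pattern after settling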

  15. Automated Encoding of Clinical Documents Based on Natural Language Processing

    PubMed Central

    Friedman, Carol; Shagina, Lyudmila; Lussier, Yves; Hripcsak, George

    2004-01-01

    Objective: The aim of this study was to develop a method based on natural language processing (NLP) that automatically maps an entire clinical document to codes with modifiers and to quantitatively evaluate the method. Methods: An existing NLP system, MedLEE, was adapted to automatically generate codes. The method involves matching of structured output generated by MedLEE consisting of findings and modifiers to obtain the most specific code. Recall and precision applied to Unified Medical Language System (UMLS) coding were evaluated in two separate studies. Recall was measured using a test set of 150 randomly selected sentences, which were processed using MedLEE. Results were compared with a reference standard determined manually by seven experts. Precision was measured using a second test set of 150 randomly selected sentences from which UMLS codes were automatically generated by the method and then validated by experts. Results: Recall of the system for UMLS coding of all terms was .77 (95% CI .72-.81), and for coding terms that had corresponding UMLS codes recall was .83 (.79-.87). Recall of the system for extracting all terms was .84 (.81-.88). Recall of the experts ranged from .69 to .91 for extracting terms. The precision of the system was .89 (.87-.91), and precision of the experts ranged from .61 to .91. Conclusion: Extraction of relevant clinical information and UMLS coding were accomplished using a method based on NLP. The method appeared to be comparable to or better than six experts. The advantage of the method is that it maps text to codes along with other related information, rendering the coded output suitable for effective retrieval. PMID:15187068
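
    For orientation, the recall and precision figures reported above follow the standard definitions over true positives, false negatives, and false positives; a small illustration with made-up counts rather than the study's data:

        def recall(tp, fn):
            """Proportion of reference-standard items the system found."""
            return tp / (tp + fn)

        def precision(tp, fp):
            """Proportion of system output that was correct."""
            return tp / (tp + fp)

        tp, fp, fn = 116, 14, 34   # illustrative counts only
        print(f"recall = {recall(tp, fn):.2f}, precision = {precision(tp, fp):.2f}")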

  16. Natural Language Understanding Systems Within the A. I. Paradigm: A Survey and Some Comparisons.

    ERIC Educational Resources Information Center

    Wilks, Yorick

    The paper surveys the major projects on the understanding of natural language that fall within what may now be called the artificial intelligence paradigm of natural language systems. Some space is devoted to arguing that the paradigm is now a reality and different in significant respects from the generative paradigm of present-day linguistics.

  17. Testing of a Natural Language Retrieval System for a Full Text Knowledge Base.

    ERIC Educational Resources Information Center

    Bernstein, Lionel M.; Williamson, Robert E.

    1984-01-01

    The Hepatitis Knowledge Base (text of prototype information system) was used for modifying and testing "A Navigator of Natural Language Organized (Textual) Data" (ANNOD), a retrieval system which combines probabilistic, linguistic, and empirical means to rank individual paragraphs of full text for similarity to natural language queries proposed by

  18. Success story in software engineering using NIAM (Natural language Information Analysis Methodology)

    SciTech Connect

    Eaton, S.M.; Eaton, D.S.

    1995-10-01

    To create an information system, we employ NIAM (Natural language Information Analysis Methodology). NIAM supports the goals of both the customer and the analyst completely understanding the information. We use the customer's own unique vocabulary, collect real examples, and validate the information in natural language sentences. Examples are discussed from a successfully implemented information system.

  19. "Natural" Language Learning and Learning a Foreign Language in the Classroom.

    ERIC Educational Resources Information Center

    Harris, Vee

    1988-01-01

    Compares the ways students acquire their native language and learn a foreign language. Findings revealed additional areas in need of investigation, including the kinds of activities which tap students' ability to acquire language and the ways in which formal learning can use these activities. (CB)

  20. A common type system for clinical natural language processing

    PubMed Central

    2013-01-01

    Background One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. Results We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. Conclusions We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types. PMID:23286462

  1. Automatic retrieval of bone fracture knowledge using natural language processing.

    PubMed

    Do, Bao H; Wu, Andrew S; Maley, Joan; Biswal, Sandip

    2013-08-01

    Natural language processing (NLP) techniques to extract data from unstructured text into formal computer representations are valuable for creating robust, scalable methods to mine data in medical documents and radiology reports. As voice recognition (VR) becomes more prevalent in radiology practice, there is opportunity for implementing NLP in real time for decision-support applications such as context-aware information retrieval. For example, as the radiologist dictates a report, an NLP algorithm can extract concepts from the text and retrieve relevant classification or diagnosis criteria or calculate disease probability. NLP can work in parallel with VR to potentially facilitate evidence-based reporting (for example, automatically retrieving the Bosniak classification when the radiologist describes a kidney cyst). For these reasons, we developed and validated an NLP system which extracts fracture and anatomy concepts from unstructured text and retrieves relevant bone fracture knowledge. We implement our NLP in an HTML5 web application to demonstrate a proof-of-concept feedback NLP system which retrieves bone fracture knowledge in real time. PMID:23053906

  2. Processing of ICARTT Data Files Using Fuzzy Matching and Parser Combinators

    NASA Technical Reports Server (NTRS)

    Rutherford, Matthew T.; Typanski, Nathan D.; Wang, Dali; Chen, Gao

    2014-01-01

    In this paper, the task of parsing and matching inconsistent, poorly formed text data through the use of parser combinators and fuzzy matching is discussed. An object-oriented implementation of the parser combinator technique is used to allow for a relatively simple interface for adapting base parsers. For matching tasks, a fuzzy matching algorithm with Levenshtein distance calculations is implemented to match string pairs, which are otherwise difficult to match due to the aforementioned irregularities and errors in one or both pair members. Used in concert, the two techniques allow parsing and matching operations to be performed which had previously only been done manually.
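
    The two techniques named above are easy to sketch independently of the authors' object-oriented implementation; the hypothetical snippet below shows a plain Levenshtein edit distance and a minimal parser-combinator core in Python (the field names and inputs are invented, not taken from ICARTT files):

        def levenshtein(a: str, b: str) -> int:
            """Edit distance via the standard dynamic-programming row recurrence."""
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, 1):
                cur = [i]
                for j, cb in enumerate(b, 1):
                    cur.append(min(prev[j] + 1,                 # deletion
                                   cur[j - 1] + 1,              # insertion
                                   prev[j - 1] + (ca != cb)))   # substitution
                prev = cur
            return prev[-1]

        def token(expected):
            """Base parser: consume an exact string at the front of the input."""
            def parse(text):
                if text.startswith(expected):
                    return expected, text[len(expected):]
                return None
            return parse

        def seq(*parsers):
            """Combinator: run parsers in order, collecting their results."""
            def parse(text):
                values = []
                for p in parsers:
                    result = p(text)
                    if result is None:
                        return None
                    value, text = result
                    values.append(value)
                return values, text
            return parse

        print(levenshtein("ICARTT", "ICARRT"))                    # 1
        print(seq(token("DATE,"), token("2014"))("DATE,2014-07-01"))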

  3. Three-dimensional grammar in the brain: Dissociating the neural correlates of natural sign language and manually coded spoken language.

    PubMed

    Jednoróg, Katarzyna; Bola, Łukasz; Mostowski, Piotr; Szwed, Marcin; Boguszewski, Paweł M; Marchewka, Artur; Rutkowski, Paweł

    2015-05-01

    In several countries natural sign languages were considered inadequate for education. Instead, new sign-supported systems were created, based on the belief that spoken/written language is grammatically superior. One such system called SJM (system językowo-migowy) preserves the grammatical and lexical structure of spoken Polish and since 1960s has been extensively employed in schools and on TV. Nevertheless, the Deaf community avoids using SJM for everyday communication, its preferred language being PJM (polski język migowy), a natural sign language, structurally and grammatically independent of spoken Polish and featuring classifier constructions (CCs). Here, for the first time, we compare, with fMRI method, the neural bases of natural vs. devised communication systems. Deaf signers were presented with three types of signed sentences (SJM and PJM with/without CCs). Consistent with previous findings, PJM with CCs compared to either SJM or PJM without CCs recruited the parietal lobes. The reverse comparison revealed activation in the anterior temporal lobes, suggesting increased semantic combinatory processes in lexical sign comprehension. Finally, PJM compared with SJM engaged left posterior superior temporal gyrus and anterior temporal lobe, areas crucial for sentence-level speech comprehension. We suggest that activity in these two areas reflects greater processing efficiency for naturally evolved sign language. PMID:25858311

  4. Nature and Nurture in School-Based Second Language Achievement

    ERIC Educational Resources Information Center

    Dale, Philip S.; Harlaar, Nicole; Plomin, Robert

    2012-01-01

    Variability in achievement across learners is a hallmark of second language (L2) learning, especially in academic-based learning. The Twins Early Development Study (TEDS), based on a large, population-representative sample in the United Kingdom, provides the first opportunity to examine individual differences in second language achievement in a…

  5. Whole Language and Deaf Bilingual-Bicultural Education--Naturally!.

    ERIC Educational Resources Information Center

    Mason, David; Ewoldt, Carolyn

    1996-01-01

    This position paper discusses how the tenets of Whole Language and Deaf Bilingual-Bicultural Education complement each other. It stresses that Whole Language emphasizes a two-way teaching/learning process and Deaf Bilingual-Bicultural Education emphasizes mutual respect in the sociocultural experiences and values of deaf and hearing people.

  6. Of Substance: The Nature of Language Effects on Entity Construal

    ERIC Educational Resources Information Center

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-01-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind [whisk; Imai, M., & Gentner, D.

  8. Dependency Parser-based Negation Detection in Clinical Narratives.

    PubMed

    Sohn, Sunghwan; Wu, Stephen; Chute, Christopher G

    2012-01-01

    Negation of clinical named entities is common in clinical documents and is a crucial factor to accurately compile patients' clinical conditions and to further support complex phenotype detection. In 2009, Mayo Clinic released the clinical Text Analysis and Knowledge Extraction System (cTAKES), which includes a negation annotator that identifies negation status of a named entity by searching for negation words within a fixed word distance. However, this negation strategy is not sophisticated enough to correctly identify complicated patterns of negation. This paper aims to investigate whether the dependency structure from the cTAKES dependency parser can improve the negation detection performance. Manually compiled negation rules, derived from dependency paths were tested. Dependency negation rules do not limit the negation scope to word distance; instead, they are based on syntactic context. We found that using a dependency-based negation proved a superior alternative to the current cTAKES negation annotator. PMID:22779038
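
    A rough analogue of the dependency-based rules described above can be written with spaCy rather than the cTAKES dependency parser; the sketch assumes the en_core_web_sm model is installed and uses a single simplistic rule (a negation dependent on the entity's syntactic head), not the authors' rule set:

        import spacy

        nlp = spacy.load("en_core_web_sm")

        def is_negated(sentence: str, entity: str) -> bool:
            """True if the token matching `entity` is governed by a head with a 'neg' dependent."""
            doc = nlp(sentence)
            for tok in doc:
                if tok.text.lower() == entity.lower():
                    return any(child.dep_ == "neg" for child in tok.head.children)
            return False

        print(is_negated("The patient does not have chest pain.", "pain"))   # expected: True
        print(is_negated("The patient has chest pain.", "pain"))             # expected: False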

  9. Batch Blast Extractor: an automated blastx parser application

    PubMed Central

    Pirooznia, Mehdi; Perkins, Edward J; Deng, Youping

    2008-01-01

    Motivation BLAST programs are very efficient in finding similarities for sequences. However for large datasets such as ESTs, manual extraction of the information from the batch BLAST output is needed. This can be time consuming, insufficient, and inaccurate. Therefore implementation of a parser application would be extremely useful in extracting information from BLAST outputs. Results We have developed a java application, Batch Blast Extractor, with a user friendly graphical interface to extract information from BLAST output. The application generates a tab delimited text file that can be easily imported into any statistical package such as Excel or SPSS for further analysis. For each BLAST hit, the program obtains and saves the essential features from the BLAST output file that would allow further analysis. The program was written in Java and therefore is OS independent. It works on both Windows and Linux OS with java 1.4 and higher. It is freely available from: PMID:18831775
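
    In the same spirit (though not the tool's own Java code), pulling a handful of essential fields from BLAST tabular output (-outfmt 6) into a tab-delimited summary takes only a few lines of Python; the file names and the chosen columns below are illustrative:

        import csv

        # Standard 12-column BLAST tabular layout (-outfmt 6).
        FIELDS = ["query", "subject", "pct_identity", "aln_length", "mismatches",
                  "gap_opens", "q_start", "q_end", "s_start", "s_end", "evalue", "bitscore"]

        def extract(blast_path, out_path):
            """Write a tab-delimited summary of selected columns for each BLAST hit."""
            with open(blast_path) as src, open(out_path, "w", newline="") as dst:
                writer = csv.writer(dst, delimiter="\t")
                writer.writerow(["query", "subject", "pct_identity", "evalue", "bitscore"])
                for row in csv.reader(src, delimiter="\t"):
                    hit = dict(zip(FIELDS, row))
                    writer.writerow([hit["query"], hit["subject"],
                                     hit["pct_identity"], hit["evalue"], hit["bitscore"]])

        # extract("batch_results.blastx.tsv", "hits_summary.tsv")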

  10. Parsley: a Command-Line Parser for Astronomical Applications

    NASA Astrophysics Data System (ADS)

    Deich, William

    Parsley is a sophisticated keyword + value parser, packaged as a library of routines that offers an easy method for providing command-line arguments to programs. It makes it easy for the user to enter values, and it makes it easy for the programmer to collect and validate the user's entries. Parsley is tuned for astronomical applications: for example, dates entered in Julian, Modified Julian, calendar, or several other formats are all recognized without special effort by the user or by the programmer; angles can be entered using decimal degrees or dd:mm:ss; time-like intervals as decimal hours, hh:mm:ss, or a variety of other units. Vectors of data are accepted as readily as scalars.
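
    The flexible value handling described can be imitated in a few lines; the sketch below (plain Python, not Parsley's own interface) accepts an angle given either as decimal degrees or as dd:mm:ss:

        def parse_angle(text: str) -> float:
            """Return an angle in decimal degrees from either '12.5' or 'dd:mm:ss' form."""
            if ":" in text:
                dd, mm, ss = (float(part) for part in text.split(":"))
                sign = -1.0 if text.strip().startswith("-") else 1.0
                return sign * (abs(dd) + mm / 60.0 + ss / 3600.0)
            return float(text)

        print(parse_angle("12.5"))        # 12.5
        print(parse_angle("-30:30:00"))   # -30.5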

  11. Statistical Learning in a Natural Language by 8-Month-Old Infants

    PubMed Central

    Pelucchi, Bruna; Hay, Jessica F.; Saffran, Jenny R.

    2013-01-01

    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants’ ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition. PMID:19489896
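
    The statistic at issue, forward transitional probability, is simply the conditional probability of one syllable given the preceding one; a toy computation over an invented syllable stream (not the Italian stimuli):

        from collections import Counter

        stream = ["bi", "da", "ku", "pa", "do", "ti", "bi", "da", "ku", "go", "la", "bu"]

        pair_counts = Counter(zip(stream, stream[1:]))   # adjacent syllable pairs
        first_counts = Counter(stream[:-1])              # how often each syllable starts a pair

        def transitional_probability(s1, s2):
            return pair_counts[(s1, s2)] / first_counts[s1]

        print(transitional_probability("bi", "da"))   # 1.0 -- consistent, word-internal transition
        print(transitional_probability("ku", "pa"))   # 0.5 -- weaker, boundary-spanning transition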

  12. Evaluation of PHI Hunter in Natural Language Processing Research

    PubMed Central

    Redd, Andrew; Pickard, Steve; Meystre, Stephane; Scehnet, Jeffrey; Bolton, Dan; Heavirland, Julia; Weaver, Allison Lynn; Hope, Carol; Garvin, Jennifer Hornung

    2015-01-01

    Objectives We introduce and evaluate a new, easily accessible tool using a common statistical analysis and business analytics software suite, SAS, which can be programmed to remove specific protected health information (PHI) from a text document. Removal of PHI is important because the quantity of text documents used for research with natural language processing (NLP) is increasing. When using existing data for research, an investigator must remove all PHI not needed for the research to comply with human subjects' right to privacy. This process is similar, but not identical, to de-identification of a given set of documents. Materials and methods PHI Hunter removes PHI from free-form text. It is a set of rules to identify and remove patterns in text. PHI Hunter was applied to 473 Department of Veterans Affairs (VA) text documents randomly drawn from a research corpus stored as unstructured text in VA files. Results PHI Hunter performed well with PHI in the form of identification numbers such as Social Security numbers, phone numbers, and medical record numbers. The most commonly missed PHI items were names and locations. Incorrect removal of information occurred with text that looked like identification numbers. Discussion PHI Hunter fills a niche role that is related to but not equal to the role of de-identification tools. It gives research staff a tool to reasonably increase patient privacy. It performs well for highly sensitive PHI categories that are rarely used in research, but still shows possible areas for improvement. More development for patterns of text and linked demographic tables from electronic health records (EHRs) would improve the program so that more precise identifiable information can be removed. Conclusions PHI Hunter is an accessible tool that can flexibly remove PHI not needed for research. If it can be tailored to the specific data set via linked demographic tables, its performance will improve in each new document set. PMID:26807078
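
    Rule-based PHI removal of the kind described reduces, at its core, to pattern substitution; the sketch below uses a few invented regular expressions (not PHI Hunter's SAS rules) for identifier-style PHI:

        import re

        # Illustrative patterns only; real rule sets are far more extensive.
        PATTERNS = {
            "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
            "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
            "MRN":   re.compile(r"\bMRN[:#]?\s*\d{6,10}\b", re.IGNORECASE),
        }

        def scrub(text: str) -> str:
            for label, pattern in PATTERNS.items():
                text = pattern.sub(f"[{label} REMOVED]", text)
            return text

        note = "Contact 555-123-4567 regarding MRN: 00123456, SSN 123-45-6789."
        print(scrub(note))
        # Contact [PHONE REMOVED] regarding [MRN REMOVED], SSN [SSN REMOVED].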

  13. Natural language processing in an intelligent writing strategy tutoring system.

    PubMed

    McNamara, Danielle S; Crossley, Scott A; Roscoe, Rod

    2013-06-01

    The Writing Pal is an intelligent tutoring system that provides writing strategy training. A large part of its artificial intelligence resides in the natural language processing algorithms to assess essay quality and guide feedback to students. Because writing is often highly nuanced and subjective, the development of these algorithms must consider a broad array of linguistic, rhetorical, and contextual features. This study assesses the potential for computational indices to predict human ratings of essay quality. Past studies have demonstrated that linguistic indices related to lexical diversity, word frequency, and syntactic complexity are significant predictors of human judgments of essay quality but that indices of cohesion are not. The present study extends prior work by including a larger data sample and an expanded set of indices to assess new lexical, syntactic, cohesion, rhetorical, and reading ease indices. Three models were assessed. The model reported by McNamara, Crossley, and McCarthy (Written Communication 27:57-86, 2010) including three indices of lexical diversity, word frequency, and syntactic complexity accounted for only 6% of the variance in the larger data set. A regression model including the full set of indices examined in prior studies of writing predicted 38% of the variance in human scores of essay quality with 91% adjacent accuracy (i.e., within 1 point). A regression model that also included new indices related to rhetoric and cohesion predicted 44% of the variance with 94% adjacent accuracy. The new indices increased accuracy but, more importantly, afford the means to provide more meaningful feedback in the context of a writing tutoring system. PMID:23055164
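
    The "adjacent accuracy" criterion quoted above counts a predicted essay score as correct when it lands within one point of the human rating; a short illustration with invented scores:

        human     = [4, 3, 5, 2, 4, 6, 3, 5]   # invented human ratings
        predicted = [4, 4, 3, 2, 5, 6, 1, 5]   # invented model predictions

        adjacent = sum(abs(h - p) <= 1 for h, p in zip(human, predicted)) / len(human)
        print(f"adjacent accuracy = {adjacent:.0%}")   # 75%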

  14. Automation of a problem list using natural language processing

    PubMed Central

    Meystre, Stephane; Haug, Peter J

    2005-01-01

    Background The medical problem list is an important part of the electronic medical record in development in our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To alleviate this issue, we are building an environment where the problem list can be easily and effectively maintained. Methods For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system composed of two main components: a background and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries from the list of 80 targeted problems detected in the multiple free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list. Results The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms to detect first document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences. Conclusion The global aim of our project is to automate the process of creating and maintaining a problem list for hospitalized patients and thereby help to guarantee the timeliness, accuracy and completeness of this information. PMID:16135244
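
    The three-stage pipeline described (sections, then sentences, then candidate problems) can be sketched with toy patterns and a toy problem dictionary; none of the patterns or vocabulary below come from the actual system:

        import re

        TARGET_PROBLEMS = {"atrial fibrillation", "congestive heart failure", "hypertension"}
        SECTION_HEADER = re.compile(r"^[A-Z][A-Z /]+:\s*$", re.MULTILINE)

        def split_sections(document: str):
            """Split a note into (header, body) pairs on ALL-CAPS headers ending in ':'."""
            parts = SECTION_HEADER.split(document)
            headers = SECTION_HEADER.findall(document)
            return list(zip(headers, parts[1:]))

        def split_sentences(body: str):
            return [s.strip() for s in re.split(r"(?<=[.!?])\s+", body) if s.strip()]

        def find_problems(sentence: str):
            lowered = sentence.lower()
            return [p for p in TARGET_PROBLEMS if p in lowered]

        note = "ASSESSMENT:\nLongstanding hypertension. New atrial fibrillation noted today."
        for header, body in split_sections(note):
            for sentence in split_sentences(body):
                print(header.strip(), "->", find_problems(sentence))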

  15. Natural Language Query System Design for Interactive Information Storage and Retrieval Systems. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung

    1985-01-01

    The currently developed multi-level language interfaces of information systems are generally designed for experienced users. These interfaces commonly ignore the nature and needs of the largest user group, i.e., casual users. This research identifies the importance of natural language query system research within information storage and retrieval system development; addresses the topics of developing such a query system; and finally, proposes a framework for the development of natural language query systems in order to facilitate the communication between casual users and information storage and retrieval systems.

  16. On the neurolinguistic nature of language abnormalities in Huntington's disease.

    PubMed Central

    Wallesch, C W; Fehrenbach, R A

    1988-01-01

    Spontaneous language of 18 patients suffering from Huntington's disease and 15 dysarthric controls suffering from Friedreich's ataxia were investigated. In addition, language functions in various modalities were assessed with the Aachen Aphasia Test (AAT). The Huntington patients exhibited deficits in the syntactical complexity of spontaneous speech and in the Token Test, confrontation naming, and language comprehension subtests of the AAT, which are interpreted as resulting from their dementia. Errors affecting word access mechanisms and production of syntactical structures as such were not encountered. PMID:2452241

  17. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

    PubMed Central

    Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure. PMID:24982952

  18. Learning and comprehension of BASIC and natural language computer programming by novices

    SciTech Connect

    Dyck, J.L.

    1987-01-01

    This study examined the effectiveness of teaching novices to program in Natural Language as a prerequisite for learning BASIC, and the learning and comprehension processes for Natural Language and BASIC computer-programming languages. Three groups of computer-naive subjects participated in five self-paced learning sessions; in each sessions, subjects solved a series of programming problems with immediate feedback. Twenty-four subjects learned to solve BASIC programming problems (BASIC group) for all five sessions, 23 subjects learned to solve corresponding Natural Language programming problems for all five sessions (Natural Language group), and 23 subjects learned to solve Natural Language programming problems for three sessions and then transferred to BASIC for the two sessions (Transfer group). At the end of the fifth session, all subjects completed a post-test which required the subjects to use their programming knowledge in a new way. Results indicated that the Natural Language trained subjects had complete transfer to BASIC, as indicated by no overall difference in comprehension time or accuracy for final BASIC sessions (i.e., sessions four and five) for the Transfer and BASIC groups. In addition, there was an interaction between group and session on accuracy, in which the Transfer group increased its accuracy at a faster rate than the BASIC group.

  19. Dynamic changes in network activations characterize early learning of a natural language.

    PubMed

    Plante, Elena; Patterson, Dianne; Dailey, Natalie S; Almyrde, Kyle R; Fridriksson, Julius

    2014-09-01

    Those who are initially exposed to an unfamiliar language have difficulty separating running speech into individual words, but over time will recognize both words and the grammatical structure of the language. Behavioral studies have used artificial languages to demonstrate that humans are sensitive to distributional information in language input, and can use this information to discover the structure of that language. This is done without direct instruction and learning occurs over the course of minutes rather than days or months. Moreover, learners may attend to different aspects of the language input as their own learning progresses. Here, we examine processing associated with the early stages of exposure to a natural language, using fMRI. Listeners were exposed to an unfamiliar language (Icelandic) while undergoing four consecutive fMRI scans. The Icelandic stimuli were constrained in ways known to produce rapid learning of aspects of language structure. After approximately 4 min of exposure to the Icelandic stimuli, participants began to differentiate between correct and incorrect sentences at above chance levels, with significant improvement between the first and last scan. An independent component analysis of the imaging data revealed four task-related components, two of which were associated with behavioral performance early in the experiment, and two with performance later in the experiment. This outcome suggests dynamic changes occur in the recruitment of neural resources even within the initial period of exposure to an unfamiliar natural language. PMID:25058056

  20. Dynamic Changes in Network Activations Characterize Early Learning of a Natural Language

    PubMed Central

    Plante, Elena; Patterson, Dianne; Dailey, Natalie S.; Almyrde, Kyle R.; Fridriksson, Julius

    2014-01-01

    Those who are initially exposed to an unfamiliar language have difficulty separating running speech into individual words, but over time will recognize both words and the grammatical structure of the language. Behavioral studies have used artificial languages to demonstrate that humans are sensitive to distributional information in language input, and can use this information to discover the structure of that language. This is done without direct instruction and learning occurs over the course of minutes rather than days or months. Moreover, learners may attend to different aspects of the language input as their own learning progresses. Here, we examine processing associated with the early stages of exposure to a natural language, using fMRI. Listeners were exposed to an unfamiliar language (Icelandic) while undergoing four consecutive fMRI scans. The Icelandic stimuli were constrained in ways known to produce rapid learning of aspects of language structure. After approximately 4 minutes of exposure to the Icelandic stimuli, participants began to differentiate between correct and incorrect sentences at above chance levels, with significant improvement between the first and last scan. An independent component analysis of the imaging data revealed four task-related components, two of which were associated with behavioral performance early in the experiment, and two with performance later in the experiment. This outcome suggests dynamic changes occur in the recruitment of neural resources even within the initial period of exposure to an unfamiliar natural language. PMID:25058056

  1. Computational Nonlinear Morphology with Emphasis on Semitic Languages. Studies in Natural Language Processing.

    ERIC Educational Resources Information Center

    Kiraz, George Anton

    This book presents a tractable computational model that can cope with complex morphological operations, especially in Semitic languages, and less complex morphological systems present in Western languages. It outlines a new generalized regular rewrite rule system that uses multiple finite-state automata to cater to root-and-pattern morphology,

  2. Natural Constraints in Sign Language Phonology: Data from Anatomy.

    ERIC Educational Resources Information Center

    Mandel, Mark A.

    1979-01-01

    Presents three sets of data (signs from the "Dictionary of ASL," 1976; loan signs; and case histories of specific signs) that demonstrate the involvement of the "knuckle-wrist connection" with American Sign Language phonology. (AM)

  3. Of Substance: The Nature of Language Effects on Entity Construal

    PubMed Central

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-01-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind (whisk; Imai and Gentner, 1997). Five experiments replicated this language type effect on entity construal, extended it to quite different stimuli from those studied before, and extended it to a comparison between Mandarin-speakers and English-speakers. A sixth experiment, which did not involve interpreting the meaning of a noun or a pronoun that stands for a noun, failed to find any effect of language type on entity construal. Thus, the overall pattern of findings supports a non-Whorfian, language on language account, according to which sensitivity to lexical statistics in a count/mass language leads adults to assign a novel noun in neutral syntax the status of a count noun, influencing construal of ambiguous entities. The experiments also document and explore cross-linguistically universal factors that influence entity construal, and favor Prasada's (1999) hypothesis that features indicating non-accidentalness of an entity's form lead participants to a construal of object-kind rather than substance-kind. Finally, the experiments document the age at which the language type effect emerges in lexical projection. The details of the developmental pattern are consistent with the lexical statistics hypothesis, along with a universal increase in sensitivity to material kind. PMID:19230873

  4. The Preservation and Use of Our Languages: Respecting the Natural Order of the Creator.

    ERIC Educational Resources Information Center

    Kirkness, Verna J.

    As a world community, Indigenous peoples are faced with many common challenges in their attempts to maintain the vitality of their respective languages and to honor the "natural order of the Creator." Ten strategies are discussed that are critical to the task of renewing and maintaining Indigenous languages. These strategies are: (1) banking

  5. Nine-Month-Olds Extract Structural Principles Required for Natural Language

    ERIC Educational Resources Information Center

    Gerken, LouAnn

    2004-01-01

    Infants' ability to rapidly extract properties of language-like systems during brief laboratory exposures has been taken as evidence about the innate linguistic state of humans. However, previous studies have focused on structural properties that are not central to descriptions of natural language. In the current study, infants were exposed to 3-

  6. Structured Natural-Language Descriptions for Semantic Content Retrieval of Visual Materials.

    ERIC Educational Resources Information Center

    Tam, A. M.; Leung, C. H. C.

    2001-01-01

    Proposes a structure for natural language descriptions of the semantic content of visual materials that requires descriptions to be (modified) keywords, phrases, or simple sentences, with components that are grammatical relations common to many languages. This structure makes it easy to implement a collection's descriptions as a relational

  7. Using the Natural Language Paradigm (NLP) to Increase Vocalizations of Older Adults with Cognitive Impairments

    ERIC Educational Resources Information Center

    LeBlanc, Linda A.; Geiger, Kaneen B.; Sautter, Rachael A.; Sidener, Tina M.

    2007-01-01

    The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated…

  8. Paradigms of Evaluation in Natural Language Processing: Field Linguistics for Glass Box Testing

    ERIC Educational Resources Information Center

    Cohen, Kevin Bretonnel

    2010-01-01

    Although software testing has been well-studied in computer science, it has received little attention in natural language processing. Nonetheless, a fully developed methodology for glass box evaluation and testing of language processing applications already exists in the field methods of descriptive linguistics. This work lays out a number of…

  10. The Nature of Chinese Language Classroom Learning Environments in Singapore Secondary Schools

    ERIC Educational Resources Information Center

    Chua, Siew Lian; Wong, Angela F. L.; Chen, Der-Thanq V.

    2011-01-01

    This article reports findings from a classroom environment study which was designed to investigate the nature of Chinese Language classroom environments in Singapore secondary schools. We used a perceptual instrument, the Chinese Language Classroom Environment Inventory, to investigate teachers' and students' perceptions towards their Chinese

  11. Transfer of a Natural Language System for Problem-Solving in Physics to Other Domains.

    ERIC Educational Resources Information Center

    Oberem, Graham E.

    The limited language capability of CAI systems has made it difficult to personalize problem-solving instruction. The intelligent tutoring system, ALBERT, is a problem-solving monitor and coach that has been used with high school and college level physics students for several years; it uses a natural language system to understand kinematics…

  12. Digging in the Dictionary: Building a Relational Lexicon To Support Natural Language Processing Applications.

    ERIC Educational Resources Information Center

    Evens, Martha; And Others

    Advanced learners of second languages and natural language processing systems both demand much more detailed lexical information than conventional dictionaries provide. Text composition, whether by humans or machines, requires a thorough understanding of relationships between words, such as selectional restrictions, case patterns, factives, and…

  14. Development and Evaluation of a Thai Learning System on the Web Using Natural Language Processing.

    ERIC Educational Resources Information Center

    Dansuwan, Suyada; Nishina, Kikuko; Akahori, Kanji; Shimizu, Yasutaka

    2001-01-01

    Describes the Thai Learning System, which is designed to help learners acquire the Thai word order system. The system facilitates the lessons on the Web using HyperText Markup Language and Perl programming, which interfaces with natural language processing by means of Prolog. (Author/VWL)

  15. A natural language interface plug-in for cooperative query answering in biological databases

    PubMed Central

    2012-01-01

    Background One of the many unique features of biological databases is that the mere existence of a ground data item is not always a precondition for a query response. It may be argued that from a biologist's standpoint, queries are not always best posed using a structured language. By this we mean that approximate and flexible responses to natural language like queries are well suited for this domain. This is partly due to biologists' tendency to seek simpler interfaces and partly due to the fact that questions in biology involve high level concepts that are open to interpretations computed using sophisticated tools. In such highly interpretive environments, rigidly structured databases do not always perform well. In this paper, our goal is to propose a semantic correspondence plug-in to aid natural language query processing over arbitrary biological database schema with an aim to providing cooperative responses to queries tailored to users' interpretations. Results Natural language interfaces for databases are generally effective when they are tuned to the underlying database schema and its semantics. Therefore, changes in database schema become impossible to support, or a substantial reorganization cost must be absorbed to reflect any change. We leverage developments in natural language parsing, rule languages and ontologies, and data integration technologies to assemble a prototype query processor that is able to transform a natural language query into a semantically equivalent structured query over the database. We allow knowledge rules and their frequent modifications as part of the underlying database schema. The approach we adopt in our plug-in overcomes some of the serious limitations of many contemporary natural language interfaces, including support for schema modifications and independence from underlying database schema. Conclusions The plug-in introduced in this paper is generic and facilitates connecting user selected natural language interfaces to arbitrary databases using a semantic description of the intended application. We demonstrate the feasibility of our approach with a practical example. PMID:22759613

  16. IR-NLI: an expert natural language interface to online data bases

    SciTech Connect

    Guida, G.; Tasso, C.

    1983-01-01

    Constructing natural language interfaces to computer systems often requires achievement of advanced reasoning and expert capabilities in addition to basic natural language understanding. In this paper the above issues are faced in the context of an actual application concerning the design of a natural language interface for access to online information retrieval systems. After a short discussion of the peculiarities of this application, which requires both natural language understanding and reasoning capabilities, the general architecture and fundamental design criteria of IR-NLI, a system presently being developed at the University of Udine, are presented. Attention is then focused on the basic functions of IR-NLI, namely, understanding and dialogue, strategy generation, and reasoning. Knowledge representation methods and algorithms adopted are also illustrated. A short example of interaction with IR-NLI is presented. Perspectives and directions for future research are also discussed. 15 references.

  17. Semantic Grammar: An Engineering Technique for Constructing Natural Language Understanding Systems.

    ERIC Educational Resources Information Center

    Burton, Richard R.

    In an attempt to overcome the lack of natural means of communication between student and computer, this thesis addresses the problem of developing a system which can understand natural language within an educational problem-solving environment. The nature of the environment imposes efficiency, habitability, self-teachability, and awareness of

  18. The natural order of events: How speakers of different languages represent events nonverbally

    PubMed Central

    Goldin-Meadow, Susan; So, Wing Chee; Özyürek, Aslı; Mylander, Carolyn

    2008-01-01

    To test whether the language we speak influences our behavior even when we are not speaking, we asked speakers of four languages differing in their predominant word orders (English, Turkish, Spanish, and Chinese) to perform two nonverbal tasks: a communicative task (describing an event by using gesture without speech) and a noncommunicative task (reconstructing an event with pictures). We found that the word orders speakers used in their everyday speech did not influence their nonverbal behavior. Surprisingly, speakers of all four languages used the same order on both nonverbal tasks. This order, actor-patient-act, is analogous to the subject-object-verb pattern found in many languages of the world and, importantly, in newly developing gestural languages. The findings provide evidence for a natural order that we impose on events when describing and reconstructing them nonverbally and exploit when constructing language anew. PMID:18599445

  19. Evolutionary Developmental Linguistics: Naturalization of the Faculty of Language

    ERIC Educational Resources Information Center

    Locke, John L.

    2009-01-01

    Since language is a biological trait, it is necessary to investigate its evolution, development, and functions, along with the mechanisms that have been set aside, and are now recruited, for its acquisition and use. It is argued here that progress toward each of these goals can be facilitated by new programs of research, carried out within a new…

  20. The Natural Approach to Language Teaching: An Update.

    ERIC Educational Resources Information Center

    Terrell, T. D.

    1985-01-01

    It is proposed that language acquisition improves if beginning students are allowed to experience three stages of acquisition: comprehension (preproduction), early speech production, and speech emergence. Each stage requires a different kind of activity building on the previous stage's development. (MSE)

  1. Unit 1001: The Nature of Meaning in Language.

    ERIC Educational Resources Information Center

    Minnesota Univ., Minneapolis. Center for Curriculum Development in English.

    This 10th-grade unit in Minnesota's "language-centered" curriculum introduces the complexity of linguistic meaning by demonstrating the relationships among linguistic symbols, their referents, their interpreters, and the social milieu. The unit begins with a discussion of Ray Bradbury's "The Kilimanjaro Machine," which illustrates how an otherwise

  2. Integrating Corpus-Based Resources and Natural Language Processing.

    ERIC Educational Resources Information Center

    Cantos, Pascual

    2002-01-01

    Surveys computational linguistic tools presently available, but whose potential has neither been fully considered nor exploited to its full in modern computer assisted language learning (CALL). Discusses the rationale of DDL to engage learning, presenting typical data-driven learning (DDL)-activities, DDL-software, and potential extensions of

  3. Inferring Speaker Affect in Spoken Natural Language Communication

    ERIC Educational Resources Information Center

    Pon-Barry, Heather Roberta

    2013-01-01

    The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…

  6. Natural language modeling for phoneme-to-text transcription

    SciTech Connect

    Derouault, A.M.; Merialdo, B.

    1986-11-01

    This paper relates different kinds of language modeling methods that can be applied to the linguistic decoding part of a speech recognition system with a very large vocabulary. These models are studied experimentally on a pseudophonetic input arising from French stenotypy. The authors propose a model which combines the advantages of a statistical modeling with information theoretic tools, and those of a grammatical approach.

  7. Language-Centered Social Studies: A Natural Integration.

    ERIC Educational Resources Information Center

    Barrera, Rosalinda B.; Aleman, Magdalena

    1983-01-01

    Described is a newspaper project in which elementary students report life as it was in the Middle Ages. Students are involved in a variety of language-centered activities. For example, they gather and evaluate information about medieval times and write, edit, and proofread articles for the newspaper. (RM)

  8. Comparing the Effects of Structural and Natural Language Use during Direct Instruction with Children with Mental Retardation.

    ERIC Educational Resources Information Center

    Kircaali-Iftar, Gonul; Birkan, Bunyamin; Uysal, Ayten

    1998-01-01

    Effects of structural and natural language use during direct instruction in teaching color and shape concepts to eight Turkish elementary children with moderate mental retardation were compared using an adapted alternating treatments design. Results indicated that natural language use was as effective as or more effective than structural language

  9. The language faculty that wasn't: a usage-based account of natural language recursion.

    PubMed

    Christiansen, Morten H; Chater, Nick

    2015-01-01

    In the generative tradition, the language faculty has been shrinking-perhaps to include only the mechanism of recursion. This paper argues that even this view of the language faculty is too expansive. We first argue that a language faculty is difficult to reconcile with evolutionary considerations. We then focus on recursion as a detailed case study, arguing that our ability to process recursive structure does not rely on recursion as a property of the grammar, but instead emerges gradually by piggybacking on domain-general sequence learning abilities. Evidence from genetics, comparative work on non-human primates, and cognitive neuroscience suggests that humans have evolved complex sequence learning skills, which were subsequently pressed into service to accommodate language. Constraints on sequence learning therefore have played an important role in shaping the cultural evolution of linguistic structure, including our limited abilities for processing recursive structure. Finally, we re-evaluate some of the key considerations that have often been taken to require the postulation of a language faculty. PMID:26379567

  10. The language faculty that wasn't: a usage-based account of natural language recursion

    PubMed Central

    Christiansen, Morten H.; Chater, Nick

    2015-01-01

    In the generative tradition, the language faculty has been shrinking—perhaps to include only the mechanism of recursion. This paper argues that even this view of the language faculty is too expansive. We first argue that a language faculty is difficult to reconcile with evolutionary considerations. We then focus on recursion as a detailed case study, arguing that our ability to process recursive structure does not rely on recursion as a property of the grammar, but instead emerges gradually by piggybacking on domain-general sequence learning abilities. Evidence from genetics, comparative work on non-human primates, and cognitive neuroscience suggests that humans have evolved complex sequence learning skills, which were subsequently pressed into service to accommodate language. Constraints on sequence learning therefore have played an important role in shaping the cultural evolution of linguistic structure, including our limited abilities for processing recursive structure. Finally, we re-evaluate some of the key considerations that have often been taken to require the postulation of a language faculty. PMID:26379567

  11. Concreteness and Psychological Distance in Natural Language Use.

    PubMed

    Snefjella, Bryor; Kuperman, Victor

    2015-09-01

    Existing evidence shows that more abstract mental representations are formed and more abstract language is used to characterize phenomena that are more distant from the self. Yet the precise form of the functional relationship between distance and linguistic abstractness is unknown. In four studies, we tested whether more abstract language is used in textual references to more geographically distant cities (Study 1), time points further into the past or future (Study 2), references to more socially distant people (Study 3), and references to a specific topic (Study 4). Using millions of linguistic productions from thousands of social-media users, we determined that linguistic concreteness is a curvilinear function of the logarithm of distance, and we discuss psychological underpinnings of the mathematical properties of this relationship. We also demonstrated that gradient curvilinear effects of geographic and temporal distance on concreteness are nearly identical, which suggests uniformity in representation of abstractness along multiple dimensions. PMID:26239108

  12. Using Edit Distance to Analyse Errors in a Natural Language to Logic Translation Corpus

    ERIC Educational Resources Information Center

    Barker-Plummer, Dave; Dale, Robert; Cox, Richard; Romanczuk, Alex

    2012-01-01

    We have assembled a large corpus of student submissions to an automatic grading system, where the subject matter involves the translation of natural language sentences into propositional logic. Of the 2.3 million translation instances in the corpus, 286,000 (approximately 12%) are categorized as being in error. We want to understand the nature of
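
    As a rough illustration of the kind of comparison such an error analysis relies on, the following sketch computes a standard Levenshtein edit distance over formula tokens; the tokenization and example formulas are hypothetical and are not taken from the corpus described above.

    ```python
    # A minimal dynamic-programming edit distance over formula tokens,
    # illustrating the general technique of comparing a student translation
    # with a reference formula. The example formulas are invented.
    def edit_distance(a, b):
        """Levenshtein distance between two token sequences."""
        prev = list(range(len(b) + 1))
        for i, x in enumerate(a, start=1):
            curr = [i]
            for j, y in enumerate(b, start=1):
                cost = 0 if x == y else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution / match
            prev = curr
        return prev[-1]

    student   = "Cube(a) & Small(a)".replace("(", " ( ").replace(")", " ) ").split()
    reference = "Cube(a) & Small(b)".replace("(", " ( ").replace(")", " ) ").split()
    print(edit_distance(student, reference))  # 1: a single token differs
    ```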

  13. Rimac: A Natural-Language Dialogue System that Engages Students in Deep Reasoning Dialogues about Physics

    ERIC Educational Resources Information Center

    Katz, Sandra; Jordan, Pamela; Litman, Diane

    2011-01-01

    The natural-language tutorial dialogue system that the authors are developing will allow them to focus on the nature of interactivity during tutoring as a malleable factor. Specifically, it will serve as a research platform for studies that manipulate the frequency and types of verbal alignment processes that take place during tutoring, such as…

  14. For the People...Citizenship Education and Naturalization Information. An English as a Second Language Text.

    ERIC Educational Resources Information Center

    Short, Deborah J.; And Others

    A textbook for English-as-a-Second-Language (ESL) students presents lessons on U.S. citizenship education and naturalization information. The nine lessons cover the following topics: the U.S. system of government; the Bill of Rights; responsibilities and rights of citizens; voting; requirements for naturalization; the application process; the…

  15. The feasibility of using natural language processing to extract clinical information from breast pathology reports

    PubMed Central

    Buckley, Julliette M.; Coopey, Suzanne B.; Sharko, John; Polubriaginof, Fernanda; Drohan, Brian; Belli, Ahmet K.; Kim, Elizabeth M. H.; Garber, Judy E.; Smith, Barbara L.; Gadd, Michele A.; Specht, Michelle C.; Roche, Constance A.; Gudewicz, Thomas M.; Hughes, Kevin S.

    2012-01-01

    Objective: The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine-readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine-readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. Approach and Procedure: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. Results: There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. Conclusion: We have demonstrated how a large body of free-text medical information, such as that seen in breast pathology reports, can be converted to a machine-readable format using natural language processing, and we have described the inherent complexities of the task. PMID:22934236
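
    A minimal sketch of two of the ideas above, dictionary-based matching of diagnosis surface forms and evaluation against expert coders, assuming an invented phrase list and invented reports; this is not the ClearForest software used in the study.

    ```python
    # Toy sketch: (1) dictionary-based matching of the many surface forms of a
    # diagnosis, and (2) sensitivity/specificity against expert human coding.
    # The phrase list and reports are invented.
    IDC_PATTERNS = {"invasive ductal carcinoma", "infiltrating ductal carcinoma", "idc"}

    def mentions_idc(report_text):
        text = report_text.lower()
        return any(p in text for p in IDC_PATTERNS)

    # (report text, expert label: True if IDC present)
    labeled_reports = [
        ("Final diagnosis: Infiltrating ductal carcinoma, grade 2.", True),
        ("Benign breast tissue with fibrocystic changes.", False),
        ("No evidence of invasive ductal carcinoma.", False),  # negation is the hard part
    ]

    tp = sum(mentions_idc(t) and y for t, y in labeled_reports)
    fn = sum((not mentions_idc(t)) and y for t, y in labeled_reports)
    tn = sum((not mentions_idc(t)) and (not y) for t, y in labeled_reports)
    fp = sum(mentions_idc(t) and (not y) for t, y in labeled_reports)

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
    ```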

  16. Visual language recognition with a feed-forward network of spiking neurons

    SciTech Connect

    Rasmussen, Craig E; Garrett, Kenyan; Sottile, Matthew; Shreyas, Ns

    2010-01-01

    An analogy is made and exploited between the recognition of visual objects and language parsing. A subset of regular languages is used to define a one-dimensional 'visual' language, in which the words are translational and scale invariant. This allows an exploration of the viewpoint invariant languages that can be solved by a network of concurrent, hierarchically connected processors. A language family is defined that is hierarchically tiling system recognizable (HREC). As inspired by nature, an algorithm is presented that constructs a cellular automaton that recognizes strings from a language in the HREC family. It is demonstrated how a language recognizer can be implemented from the cellular automaton using a feed-forward network of spiking neurons. This parser recognizes fixed-length strings from the language in parallel and as the computation is pipelined, a new string can be parsed in each new interval of time. The analogy with formal language theory allows inferences to be drawn regarding what class of objects can be recognized by visual cortex operating in purely feed-forward fashion and what class of objects requires a more complicated network architecture.
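
    The parallel, feed-forward flavor of this construction can be illustrated with a standard trick: map each symbol to a state-transition function of a finite automaton and compose these functions pairwise in log-depth layers. The automaton below (binary strings with an even number of 1s) is an invented example, not the paper's visual language or spiking-neuron network.

    ```python
    # Sketch of the underlying idea: a fixed-length string from a regular language
    # can be recognized by a feed-forward, log-depth network, because each symbol
    # maps to a state-transition function and these functions compose pairwise.
    DFA = {  # state -> {symbol: next_state}; accepts strings with an even number of 1s
        "even": {"0": "even", "1": "odd"},
        "odd":  {"0": "odd",  "1": "even"},
    }
    START, ACCEPT = "even", {"even"}

    def symbol_fn(sym):
        """Each input symbol becomes a function on DFA states (a dict acting as a map)."""
        return {s: DFA[s][sym] for s in DFA}

    def compose(f, g):
        """Apply f then g: one layer unit combining two child results."""
        return {s: g[f[s]] for s in DFA}

    def recognize(string):
        layer = [symbol_fn(c) for c in string]          # input layer, one unit per symbol
        while len(layer) > 1:                           # each loop = one feed-forward layer
            layer = [compose(layer[i], layer[i + 1]) if i + 1 < len(layer) else layer[i]
                     for i in range(0, len(layer), 2)]
        return layer[0][START] in ACCEPT

    print(recognize("1101"))  # False: three 1s
    print(recognize("1100"))  # True: two 1s
    ```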

  17. Deciphering the language of nature: cryptography, secrecy, and alterity in Francis Bacon.

    PubMed

    Clody, Michael C

    2011-01-01

    The essay argues that Francis Bacon's considerations of parables and cryptography reflect larger interpretative concerns of his natural philosophic project. Bacon describes nature as having a language distinct from those of God and man, and, in so doing, establishes a central problem of his natural philosophy—namely, how can the language of nature be accessed through scientific representation? Ultimately, Bacon's solution relies on a theory of differential and duplicitous signs that conceal within them the hidden voice of nature, which is best recognized in the natural forms of efficient causality. The "alphabet of nature"—those tables of natural occurrences—consequently plays a central role in his program, as it renders nature's language susceptible to a process and decryption that mirrors the model of the bilateral cipher. It is argued that while the writing of Bacon's natural philosophy strives for literality, its investigative process preserves a space for alterity within scientific representation, that is made accessible to those with the interpretative key. PMID:22371983

  18. TreeParser-Aided Klee Diagrams Display Taxonomic Clusters in DNA Barcode and Nuclear Gene Datasets

    PubMed Central

    Stoeckle, Mark Y.; Coffran, Cameron

    2013-01-01

    Indicator vector analysis of a nucleotide sequence alignment generates a compact heat map, called a Klee diagram, with potential insight into clustering patterns in evolution. However, so far this approach has examined only mitochondrial cytochrome c oxidase I (COI) DNA barcode sequences. To further explore, we developed TreeParser, a freely-available web-based program that sorts a sequence alignment according to a phylogenetic tree generated from the dataset. We applied TreeParser to nuclear gene and COI barcode alignments from birds and butterflies. Distinct blocks in the resulting Klee diagrams corresponded to species and higher-level taxonomic divisions in both groups, and this enabled graphic comparison of phylogenetic information in nuclear and mitochondrial genes. Our results demonstrate TreeParser-aided Klee diagrams objectively display taxonomic clusters in nucleotide sequence alignments. This approach may help establish taxonomy in poorly studied groups and investigate higher-level clustering which appears widespread but not well understood. PMID:24022383

  19. Evaluation of two dependency parsers on biomedical corpus targeted at protein-protein interactions.

    PubMed

    Pyysalo, Sampo; Ginter, Filip; Pahikkala, Tapio; Boberg, Jorma; Järvinen, Jouni; Salakoski, Tapio

    2006-06-01

    We present an evaluation of Link Grammar and Connexor Machinese Syntax, two major broad-coverage dependency parsers, on a custom hand-annotated corpus consisting of sentences regarding protein-protein interactions. In the evaluation, we apply the notion of an interaction subgraph, which is the subgraph of a dependency graph expressing a protein-protein interaction. We measure the performance of the parsers for recovery of individual dependencies, fully correct parses, and interaction subgraphs. For Link Grammar, an open system that can be inspected in detail, we further perform a comprehensive failure analysis, report specific causes of error, and suggest potential modifications to the grammar. We find that both parsers perform worse on biomedical English than previously reported on general English. While Connexor Machinese Syntax significantly outperforms Link Grammar, the failure analysis suggests specific ways in which the latter could be modified for better performance in the domain. PMID:16099201
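
    For orientation, here is a minimal sketch of dependency-level evaluation of the kind described above, comparing predicted (head, dependent, relation) triples with a gold standard; the example dependencies are invented, and the interaction-subgraph measure itself is not reproduced.

    ```python
    # Minimal dependency-evaluation sketch: precision and recall over dependency
    # triples plus an exact-parse check. The example parse is invented.
    gold = {("binds", "ProteinA", "subj"), ("binds", "ProteinB", "obj"),
            ("binds", "directly", "advmod")}
    predicted = {("binds", "ProteinA", "subj"), ("binds", "ProteinB", "obj"),
                 ("ProteinB", "directly", "advmod")}  # one attachment error

    correct = gold & predicted
    precision = len(correct) / len(predicted)
    recall = len(correct) / len(gold)
    fully_correct_parse = predicted == gold

    print(f"precision={precision:.2f} recall={recall:.2f} exact={fully_correct_parse}")
    ```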

  20. Integrated pathology reporting, indexing, and retrieval system using natural language diagnoses.

    PubMed

    Moore, G W; Boitnott, J K; Miller, R E; Eggleston, J C; Hutchins, G M

    1988-01-01

    Pathology computer systems are making increasing use of natural language diagnoses. The Johns Hopkins Medical Institutions integrated pathology reporting system, a commercial product with extensive, locally added enhancements, covers all information management functions within autopsy and surgical pathology divisions and has on-line linkages to clinical laboratory reports and the medical library's Mini-MEDLINE system. All diagnoses are written in natural language, using a word processor and spelling checker. A security system with personal passwords and different levels of access for different staff members allows reports to be signed out with an electronic signature. The system produces financial reports, overdue case reports, and Boolean searches of the database. Our experience with 128,790 consecutively entered pathology reports suggests that the greater precision of natural language diagnoses makes them the most suitable vehicle for follow-up, retrieval, and systems development functions in pathology. PMID:3070549

  1. Selecting the Best Mobile Information Service with Natural Language User Input

    NASA Astrophysics Data System (ADS)

    Feng, Qiangze; Qi, Hongwei; Fukushima, Toshikazu

    Information services accessed via mobile phones provide information directly relevant to subscribers' daily lives and are an area of dynamic market growth worldwide. Although many information services are currently offered by mobile operators, many of the existing solutions require a unique gateway for each service, and it is inconvenient for users to have to remember a large number of such gateways. Furthermore, the Short Message Service (SMS) is very popular in China and Chinese users would prefer to access these services in natural language via SMS. This chapter describes a Natural Language Based Service Selection System (NL3S) for use with a large number of mobile information services. The system can accept user queries in natural language and navigate the user to the required service. Since it is difficult for existing methods to achieve high accuracy and high coverage and to anticipate which other services a user might want to query, the NL3S is developed based on a Multi-service Ontology (MO) and Multi-service Query Language (MQL). The MO and MQL provide semantic and linguistic knowledge, respectively, to facilitate service selection for a user query and to provide adaptive service recommendations. Experiments show that the NL3S can achieve 75-95% accuracy and 85-95% user satisfaction when processing various styles of natural language queries. A trial involving navigation of 30 different mobile services shows that the NL3S can provide a viable commercial solution for mobile operators.
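
    A toy sketch of the routing problem, assuming simple keyword overlap in place of the MO/MQL-based matching described above; the service inventory and queries are invented.

    ```python
    # Toy illustration of routing a free-text query to the best-matching mobile
    # information service by keyword overlap. The inventory is a stand-in for
    # the ontology- and query-language-based matching described in the abstract.
    SERVICES = {
        "weather":    {"weather", "rain", "temperature", "forecast"},
        "train":      {"train", "ticket", "timetable", "station"},
        "restaurant": {"restaurant", "food", "dinner", "book", "table"},
    }

    def select_service(query):
        words = set(query.lower().split())
        scores = {name: len(words & keywords) for name, keywords in SERVICES.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else None

    print(select_service("will it rain tomorrow"))            # weather
    print(select_service("book a table for dinner tonight"))  # restaurant
    ```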

  2. Validation of clinical problems using a UMLS-based semantic parser.

    PubMed

    Goldberg, H S; Hsu, C; Law, V; Safran, C

    1998-01-01

    The capture and symbolization of data from the clinical problem list facilitates the creation of high-fidelity patient resumes for use in aggregate analysis and decision support. We report on the development of a UMLS-based semantic parser and present a preliminary evaluation of the parser in the recognition and validation of disease-related clinical problems. We randomly sampled 20% of the 26,858 unique non-dictionary clinical problems entered into OMR (Online Medical Record) between 1989 and August, 1997, and eliminated a series of qualified problem labels, e.g., history-of, to obtain a dataset of 4122 problem labels. Within this dataset, the authors identified 2810 labels (68.2%) as referring to a broad range of disease-related processes. The parser correctly recognized and validated 1398 of the 2810 disease-related labels (49.8 +/- 1.9%) and correctly excluded 1220 of 1312 non-disease-related labels (93.0 +/- 1.4%). 812 of the 1181 match failures (68.8%) were caused by terms either absent from UMLS or modifiers not accepted by the parser; 369 match failures (31.2%) were caused by labels having patterns not recognized by the parser. By enriching the UMLS lexicon with terms commonly found in provider-entered labels, it appears that performance of the parser can be significantly enhanced over a few subsequent iterations. This initial evaluation provides a foundation from which to make principled additions to the UMLS lexicon locally for use in symbolizing clinical data; further research is necessary to determine applicability to other health care settings. PMID:9929330
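
    The +/- figures quoted above appear consistent with roughly two binomial standard errors; the sketch below reproduces them from the raw counts (that the authors computed their intervals exactly this way is an assumption).

    ```python
    # Reproduce the reported percentages and +/- margins from the raw counts,
    # treating the margin as two binomial standard errors (an assumption).
    import math

    def proportion_pm_2se(successes, n):
        p = successes / n
        two_se = 2 * math.sqrt(p * (1 - p) / n)
        return 100 * p, 100 * two_se

    print("disease-related recognized: %.1f +/- %.1f%%" % proportion_pm_2se(1398, 2810))
    print("non-disease excluded:       %.1f +/- %.1f%%" % proportion_pm_2se(1220, 1312))
    # -> approximately 49.8 +/- 1.9% and 93.0 +/- 1.4%, matching the values above
    ```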

  3. SWAN: An expert system with natural language interface for tactical air capability assessment

    NASA Technical Reports Server (NTRS)

    Simmons, Robert M.

    1987-01-01

    SWAN is an expert system and natural language interface for assessing the war fighting capability of Air Force units in Europe. The expert system is an object oriented knowledge based simulation with an alternate worlds facility for performing what-if excursions. Responses from the system take the form of generated text, tables, or graphs. The natural language interface is an expert system in its own right, with a knowledge base and rules which understand how to access external databases, models, or expert systems. The distinguishing feature of the Air Force expert system is its use of meta-knowledge to generate explanations in the frame and procedure based environment.

  4. QATT: a Natural Language Interface for QPE. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    White, Douglas Robert-Graham

    1989-01-01

    QATT, a natural language interface developed for the Qualitative Process Engine (QPE) system is presented. The major goal was to evaluate the use of a preexisting natural language understanding system designed to be tailored for query processing in multiple domains of application. The other goal of QATT is to provide a comfortable environment in which to query envisionments in order to gain insight into the qualitative behavior of physical systems. It is shown that the use of the preexisting system made possible the development of a reasonably useful interface in a few months.

  5. Natural language processing with dynamic classification improves P300 speller accuracy and bit rate

    NASA Astrophysics Data System (ADS)

    Speier, William; Arnold, Corey; Lu, Jessica; Taira, Ricky K.; Pouratian, Nader

    2012-02-01

    The P300 speller is an example of a brain-computer interface that can restore functionality to victims of neuromuscular disorders. Although the most common application of this system has been communicating language, the properties and constraints of the linguistic domain have not to date been exploited when decoding brain signals that pertain to language. We hypothesized that combining the standard stepwise linear discriminant analysis with a Naive Bayes classifier and a trigram language model would increase the speed and accuracy of typing with the P300 speller. With integration of natural language processing, we observed significant improvements in accuracy and 40-60% increases in bit rate for all six subjects in a pilot study. This study suggests that integrating information about the linguistic domain can significantly improve signal classification.
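
    A simplified sketch of the fusion idea, combining a per-character classifier likelihood with a language-model prior over the next character; the probabilities are invented, and the actual system used stepwise LDA, a Naive Bayes classifier, and a trigram model as described above.

    ```python
    # Simplified fusion: posterior over the next character is proportional to the
    # EEG classifier likelihood times a language-model prior conditioned on the
    # letters already typed. All numbers below are invented.
    classifier_likelihood = {"T": 0.30, "R": 0.28, "B": 0.22, "Q": 0.20}  # P(EEG | char)
    lm_prior = {"T": 0.45, "R": 0.05, "B": 0.05, "Q": 0.01}  # P(char | context) from an LM

    posterior = {c: classifier_likelihood[c] * lm_prior.get(c, 1e-6)
                 for c in classifier_likelihood}
    total = sum(posterior.values())
    posterior = {c: p / total for c, p in posterior.items()}

    best = max(posterior, key=posterior.get)
    print(best, round(posterior[best], 3))  # the language model breaks the near-tie toward 'T'
    ```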

  6. Code-Switching: A Natural Phenomenon vs Language "Deficiency."

    ERIC Educational Resources Information Center

    Cheng, Li-Rong; Butler, Katharine

    1989-01-01

    Proposes that code switching (CS) and code mixing are natural phenomena that may result in increased competency in various communicative contexts. Both assets and deficits of CS are analyzed, and an ethnographic approach to the variable underlying CS is recommended. (32 references) (Author/VWL)

  7. Stochastic Model for the Vocabulary Growth in Natural Languages

    NASA Astrophysics Data System (ADS)

    Gerlach, Martin; Altmann, Eduardo G.

    2013-04-01

    We propose a stochastic model for the number of different words in a given database which incorporates the dependence on the database size and historical changes. The main feature of our model is the existence of two different classes of words: (i) a finite number of core words, which have higher frequency and do not affect the probability of a new word being used, and (ii) the remaining virtually infinite number of noncore words, which have lower frequency and, once used, reduce the probability of a new word being used in the future. Our model relies on a careful analysis of the Google Ngram database of books published in the last centuries, and its main consequence is the generalization of Zipf's and Heaps' laws to two scaling regimes. We confirm that these generalizations yield the best simple description of the data among generic descriptive models and that the two free parameters depend only on the language but not on the database. From the point of view of our model, the main change on historical time scales is the composition of the specific words included in the finite list of core words, which we observe to decay exponentially in time with a rate of approximately 30 words per year for English.
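
    For context, the empirical quantity such a model describes can be estimated in a few lines: vocabulary size as a function of database size (Heaps' law, V ~ N^beta). The sketch below uses a synthetic Zipf-like token stream and a crude log-log slope; it does not implement the paper's two-class core/non-core model or its double-scaling generalization.

    ```python
    # Estimate a Heaps-law exponent from a synthetic Zipf-like token stream.
    import math, random

    random.seed(0)
    ranks = range(1, 5001)
    weights = [1 / r for r in ranks]
    tokens = random.choices([f"w{r}" for r in ranks], weights=weights, k=50000)

    seen, growth = set(), []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        growth.append((n, len(seen)))

    # Crude Heaps exponent: slope of log V against log N between two checkpoints.
    (n1, v1), (n2, v2) = growth[999], growth[-1]
    beta = (math.log(v2) - math.log(v1)) / (math.log(n2) - math.log(n1))
    print(f"estimated Heaps exponent beta ~ {beta:.2f}")  # sublinear growth, beta < 1
    ```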

  8. The Exploring Nature of Definitions and Classifications of Language Learning Strategies (LLSs) in the Current Studies of Second/Foreign Language Learning

    ERIC Educational Resources Information Center

    Fazeli, Seyed Hossein

    2011-01-01

    This study aims to explore the nature of definitions and classifications of Language Learning Strategies (LLSs) in the current studies of second/foreign language learning in order to show the current problems regarding such definitions and classifications. The present study shows that there is not a universal agreeable definition and…

  9. MMN to natural Arabic CV syllables: 2 - cross language study.

    PubMed

    Zeftawi, M Samir

    2005-12-01

    Mismatch negativity (MMN) response parameters (latency, amplitude, and duration) to natural Arabic and natural English CV syllables were obtained from normal-hearing adult Egyptians in two experiments. In the first experiment, MMN was obtained in response to English CV syllable paradigms (Ba-Wa) and (Ga-Da), differing in formant duration and start of the third formant, respectively. In the second experiment, MMN responses were obtained for an Arabic paradigm (Baa-Waa), an English paradigm (Ba-Wa), and an Arabic-English paradigm (Waa-Wa). Results revealed that the three levels of speech representation (acoustic, phonetic, and phonologic) can be probed preattentively by MMN. The acoustic properties of the speech signal are processed earlier than the phonetic and phonologic properties. PMID:16055287

  10. NLPIR: A Theoretical Framework for Applying Natural Language Processing to Information Retrieval.

    ERIC Educational Resources Information Center

    Zhou, Lina; Zhang, Dongsong

    2003-01-01

    Proposes a theoretical framework called NLPIR that integrates natural language processing (NLP) into information retrieval (IR) based on the assumption that there exists representation distance between queries and documents. Discusses problems in traditional keyword-based IR, including relevance, and describes some existing NLP techniques.

  11. A Natural Language Intelligent Tutoring System for Training Pathologists: Implementation and Evaluation

    ERIC Educational Resources Information Center

    El Saadawi, Gilan M.; Tseytlin, Eugene; Legowski, Elizabeth; Jukic, Drazen; Castine, Melissa; Fine, Jeffrey; Gormley, Robert; Crowley, Rebecca S.

    2008-01-01

    Introduction: We developed and evaluated a Natural Language Interface (NLI) for an Intelligent Tutoring System (ITS) in Diagnostic Pathology. The system teaches residents to examine pathologic slides and write accurate pathology reports while providing immediate feedback on errors they make in their slide review and diagnostic reports. Residents

  12. The Linguistic Correlates of Conversational Deception: Comparing Natural Language Processing Technologies

    ERIC Educational Resources Information Center

    Duran, Nicholas D.; Hall, Charles; McCarthy, Philip M.; McNamara, Danielle S.

    2010-01-01

    The words people use and the way they use them can reveal a great deal about their mental states when they attempt to deceive. The challenge for researchers is how to reliably distinguish the linguistic features that characterize these hidden states. In this study, we use a natural language processing tool called Coh-Metrix to evaluate deceptive

  13. Self-Regulated Learning in Learning Environments with Pedagogical Agents that Interact in Natural Language

    ERIC Educational Resources Information Center

    Graesser, Arthur; McNamara, Danielle

    2010-01-01

    This article discusses the occurrence and measurement of self-regulated learning (SRL) both in human tutoring and in computer tutors with agents that hold conversations with students in natural language and help them learn at deeper levels. One challenge in building these computer tutors is to accommodate, encourage, and scaffold SRL because these

  14. Teaching the Tacit Knowledge of Programming to Novices with Natural Language Tutoring

    ERIC Educational Resources Information Center

    Lane, H. Chad; VanLehn, Kurt

    2005-01-01

    For beginning programmers, inadequate problem solving and planning skills are among the most salient of their weaknesses. In this paper, we test the efficacy of natural language tutoring to teach and scaffold acquisition of these skills. We describe ProPL (Pro-PELL), a dialogue-based intelligent tutoring system that elicits goal decompositions and

  15. Cross-Linguistic Evidence for the Nature of Age Effects in Second Language Acquisition

    ERIC Educational Resources Information Center

    Dekeyser, Robert; Alfi-Shabtay, Iris; Ravid, Dorit

    2010-01-01

    Few researchers would doubt that ultimate attainment in second language grammar is negatively correlated with age of acquisition, but considerable controversy remains about the nature of this relationship: the exact shape of the age-attainment function and its interpretation. This article presents two parallel studies with native speakers of

  16. A Sublanguage Approach to Natural Language Processing for an Expert System.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth D.; And Others

    1993-01-01

    Reports on the development of an NLP (natural language processing) component for processing the free-text comments on life insurance applications for evaluation by an underwriting expert system. A sublanguage grammar approach with strong reliance on semantic word classes is described. Highlights include lexical analysis, adjacency analysis, and…

  17. Effectiveness and Efficiency in Natural Language Processing for Large Amounts of Text.

    ERIC Educational Resources Information Center

    Ruge, Gerda; And Others

    1991-01-01

    Describes a system that was developed in Germany for natural language processing (NLP) to improve free text analysis for information retrieval. Techniques from empirical linguistics are discussed, system architecture is explained, and rules for dealing with conjunctions in dependency analysis for free text processing are proposed. (13 references)…

  18. Drawing Dynamic Geometry Figures Online with Natural Language for Junior High School Geometry

    ERIC Educational Resources Information Center

    Wong, Wing-Kwong; Yin, Sheng-Kai; Yang, Chang-Zhe

    2012-01-01

    This paper presents a tool for drawing dynamic geometric figures by understanding the texts of geometry problems. With the tool, teachers and students can construct dynamic geometric figures on a web page by inputting a geometry problem in natural language. First we need to build the knowledge base for understanding geometry problems. With the…

  19. The Application of Natural Language Processing to Augmentative and Alternative Communication

    ERIC Educational Resources Information Center

    Higginbotham, D. Jeffery; Lesher, Gregory W.; Moulton, Bryan J.; Roark, Brian

    2012-01-01

    Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of next

  20. The Rape of Mother Nature? Women in the Language of Environmental Discourse.

    ERIC Educational Resources Information Center

    Berman, Tzeporah

    1994-01-01

    Argues that the structure of language reflects and reproduces the dominant model, and reinforces many of the dualistic assumptions which underlie the separation of male and female, nature and culture, mind from body, emotion from reason, and intuition from fact. (LZ)

  1. AutoTutor and Family: A Review of 17 Years of Natural Language Tutoring

    ERIC Educational Resources Information Center

    Nye, Benjamin D.; Graesser, Arthur C.; Hu, Xiangen

    2014-01-01

    AutoTutor is a natural language tutoring system that has produced learning gains across multiple domains (e.g., computer literacy, physics, critical thinking). In this paper, we review the development, key research findings, and systems that have evolved from AutoTutor. First, the rationale for developing AutoTutor is outlined and the advantages

  2. The Contemporary Thesaurus of Social Science Terms and Synonyms: A Guide for Natural Language Computer Searching.

    ERIC Educational Resources Information Center

    Knapp, Sara D., Comp.

    This book is designed primarily to help users find meaningful words for natural language, or free-text, computer searching of bibliographic and textual databases in the social and behavioral sciences. Additionally, it covers many socially relevant and technical topics not covered by the usual literary thesaurus; therefore, it may also be useful for…

  3. A Qualitative Analysis Framework Using Natural Language Processing and Graph Theory

    ERIC Educational Resources Information Center

    Tierney, Patrick J.

    2012-01-01

    This paper introduces a method of extending natural language-based processing of qualitative data analysis with the use of a very quantitative tool--graph theory. It is not an attempt to convert qualitative research to a positivist approach with a mathematical black box, nor is it a "graphical solution". Rather, it is a method to help qualitative…

  4. The International English Language Testing System (IELTS): Its Nature and Development.

    ERIC Educational Resources Information Center

    Ingram, D. E.

    The nature and development of the recently released International English Language Testing System (IELTS) instrument are described. The test is the result of a joint Australian-British project to develop a new test for use with foreign students planning to study in English-speaking countries. It is expected that the modular instrument will become

  5. The Application of Natural Language Processing to Augmentative and Alternative Communication

    ERIC Educational Resources Information Center

    Higginbotham, D. Jeffery; Lesher, Gregory W.; Moulton, Bryan J.; Roark, Brian

    2012-01-01

    Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of next…

  6. NLPIR: A Theoretical Framework for Applying Natural Language Processing to Information Retrieval.

    ERIC Educational Resources Information Center

    Zhou, Lina; Zhang, Dongsong

    2003-01-01

    Proposes a theoretical framework called NLPIR that integrates natural language processing (NLP) into information retrieval (IR) based on the assumption that there exists representation distance between queries and documents. Discusses problems in traditional keyword-based IR, including relevance, and describes some existing NLP techniques.…

  7. An Evaluation of Help Mechanisms in Natural Language Information Retrieval Systems.

    ERIC Educational Resources Information Center

    Kreymer, Oleg

    2002-01-01

    Evaluates the current state of natural language processing information retrieval systems from the user's point of view, focusing on the structure and components of the systems' help mechanisms. Topics include user/system interaction; semantic parsing; syntactic parsing; semantic mapping; and concept matching. (Author/LRW)

  8. The Nature of Auditory Discrimination Problems in Children with Specific Language Impairment: An MMN Study

    ERIC Educational Resources Information Center

    Davids, Nina; Segers, Eliane; van den Brink, Danielle; Mitterer, Holger; van Balkom, Hans; Hagoort, Peter; Verhoeven, Ludo

    2011-01-01

    Many children with specific language impairment (SLI) show impairments in discriminating auditorily presented stimuli. The present study investigates whether these discrimination problems are speech specific or of a general auditory nature. This was studied using a linguistic and nonlinguistic contrast that were matched for acoustic complexity in

  9. A Sublanguage Approach to Natural Language Processing for an Expert System.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth D.; And Others

    1993-01-01

    Reports on the development of an NLP (natural language processing) component for processing the free-text comments on life insurance applications for evaluation by an underwriting expert system. A sublanguage grammar approach with strong reliance on semantic word classes is described. Highlights include lexical analysis, adjacency analysis, and

  10. The Development of Language and Abstract Concepts: The Case of Natural Number

    ERIC Educational Resources Information Center

    Condry, Kirsten F.; Spelke, Elizabeth S.

    2008-01-01

    What are the origins of abstract concepts such as "seven," and what role does language play in their development? These experiments probed the natural number words and concepts of 3-year-old children who can recite number words to ten but who can comprehend only one or two. Children correctly judged that a set labeled eight retains this label if…

  11. Discrimination of Coronal Stops by Bilingual Adults: The Timing and Nature of Language Interaction

    ERIC Educational Resources Information Center

    Sundara, Megha; Polka, Linda

    2008-01-01

    The current study was designed to investigate the timing and nature of interaction between the two languages of bilinguals. For this purpose, we compared discrimination of Canadian French and Canadian English coronal stops by simultaneous bilingual, monolingual and advanced early L2 learners of French and English. French /d/ is phonetically

  12. Introduction to Special Issue: Understanding the Nature-Nurture Interactions in Language and Learning Differences.

    ERIC Educational Resources Information Center

    Berninger, Virginia Wise

    2001-01-01

    The introduction to this special issue on nature-nurture interactions notes that the following articles represent five biologically oriented research approaches which each provide a tutorial on the investigator's major research tool, a summary of current research understandings regarding language and learning differences, and a discussion of…

  13. BIT BY BIT: A Game Simulating Natural Language Processing in Computers

    ERIC Educational Resources Information Center

    Kato, Taichi; Arakawa, Chuichi

    2008-01-01

    BIT BY BIT is an encryption game that is designed to improve students' understanding of natural language processing in computers. Participants encode clear words into binary code using an encryption key and exchange them in the game. BIT BY BIT enables participants who do not understand the concept of binary numbers to perform the process of

  14. An Analysis of Methods for Preparing a Large Natural Language Data Base.

    ERIC Educational Resources Information Center

    Porch, Ann

    Relative cost and effectiveness of techniques for preparing a computer compatible data base consisting of approximately one million words of natural language are outlined. Considered are dollar cost, ease of editing, and time consumption. Facility for insertion of identifying information within the text, and updating of a text by merging with

  15. Dimensions of Difficulty in Translating Natural Language into First Order Logic

    ERIC Educational Resources Information Center

    Barker-Plummer, Dave; Cox, Richard; Dale, Robert

    2009-01-01

    In this paper, we present a study of a large corpus of student logic exercises in which we explore the relationship between two distinct measures of difficulty: the proportion of students whose initial attempt at a given natural language to first-order logic translation is incorrect, and the average number of attempts that are required in order to

  16. Effectiveness and Efficiency in Natural Language Processing for Large Amounts of Text.

    ERIC Educational Resources Information Center

    Ruge, Gerda; And Others

    1991-01-01

    Describes a system that was developed in Germany for natural language processing (NLP) to improve free text analysis for information retrieval. Techniques from empirical linguistics are discussed, system architecture is explained, and rules for dealing with conjunctions in dependency analysis for free text processing are proposed. (13 references)

  17. The Use of Natural Language Entry and Laser Videodisk Technology in CAI.

    ERIC Educational Resources Information Center

    Abdulla, Abdulla M.; And Others

    1984-01-01

    The use of an authoring system is described that incorporates student interaction with the computer by natural language entry at the keyboard and the use of the microcomputer to direct a random-access laser video-disk player. (Author/MLW)

  18. AutoTutor and Family: A Review of 17 Years of Natural Language Tutoring

    ERIC Educational Resources Information Center

    Nye, Benjamin D.; Graesser, Arthur C.; Hu, Xiangen

    2014-01-01

    AutoTutor is a natural language tutoring system that has produced learning gains across multiple domains (e.g., computer literacy, physics, critical thinking). In this paper, we review the development, key research findings, and systems that have evolved from AutoTutor. First, the rationale for developing AutoTutor is outlined and the advantages…

  19. The Contemporary Thesaurus of Social Science Terms and Synonyms: A Guide for Natural Language Computer Searching.

    ERIC Educational Resources Information Center

    Knapp, Sara D., Comp.

    This book is designed primarily to help users find meaningful words for natural language, or free-text, computer searching of bibliographic and textual databases in the social and behavioral sciences. Additionally, it covers many socially relevant and technical topics not covered by the usual literary thesaurus; therefore, it may also be useful for

  20. Reconceptualizing the Nature of Goals and Outcomes in Language/s Education

    ERIC Educational Resources Information Center

    Leung, Constant; Scarino, Angela

    2016-01-01

    Transformations associated with the increasing speed, scale, and complexity of mobilities, together with the information technology revolution, have changed the demography of most countries of the world and brought about accompanying social, cultural, and economic shifts (Heugh, 2013). This complex diversity has changed the very nature of…

  1. A Natural Language for AdS/CFT Correlators

    SciTech Connect

    Fitzpatrick, A. Liam; Kaplan, Jared; Penedones, João; Raju, Suvrat; van Rees, Balt C.

    2012-02-14

    We provide dramatic evidence that 'Mellin space' is the natural home for correlation functions in CFTs with weakly coupled bulk duals. In Mellin space, CFT correlators have poles corresponding to an OPE decomposition into 'left' and 'right' sub-correlators, in direct analogy with the factorization channels of scattering amplitudes. In the regime where these correlators can be computed by tree level Witten diagrams in AdS, we derive an explicit formula for the residues of Mellin amplitudes at the corresponding factorization poles, and we use the conformal Casimir to show that these amplitudes obey algebraic finite difference equations. By analyzing the recursive structure of our factorization formula we obtain simple diagrammatic rules for the construction of Mellin amplitudes corresponding to tree-level Witten diagrams in any bulk scalar theory. We prove the diagrammatic rules using our finite difference equations. Finally, we show that our factorization formula and our diagrammatic rules morph into the flat space S-Matrix of the bulk theory, reproducing the usual Feynman rules, when we take the flat space limit of AdS/CFT. Throughout we emphasize a deep analogy with the properties of flat space scattering amplitudes in momentum space, which suggests that the Mellin amplitude may provide a holographic definition of the flat space S-Matrix.
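
    For orientation, the Mellin representation referred to above can be written schematically as follows, in a common convention from the literature; normalization constants and the integration contour are omitted, and this particular form is not quoted from the paper itself.

    ```latex
    % Schematic Mellin representation of a correlator of n scalar primaries O_i with
    % dimensions \Delta_i (common convention; normalizations and contour omitted).
    A(x_i) \;=\; \int [d\delta_{ij}]\, M(\delta_{ij})
                 \prod_{1 \le i < j \le n} \Gamma(\delta_{ij})\,
                 \bigl(x_{ij}^{2}\bigr)^{-\delta_{ij}},
    \qquad
    \delta_{ij} = \delta_{ji}, \quad \sum_{j \ne i} \delta_{ij} = \Delta_i .
    ```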

  2. The Nature of the Language Faculty and Its Implications for Evolution of Language (Reply to Fitch, Hauser, and Chomsky)

    ERIC Educational Resources Information Center

    Jackendoff, Ray; Pinker, Steven

    2005-01-01

    In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the ''narrow language faculty'') consists only of recursion, and that this part cannot be considered an adaptation to communication. We…

  3. Naturalism and Ideological Work: How Is Family Language Policy Renegotiated as Both Parents and Children Learn a Threatened Minority Language?

    ERIC Educational Resources Information Center

    Armstrong, Timothy Currie

    2014-01-01

    Parents who enroll their children to be educated through a threatened minority language frequently do not speak that language themselves and classes in the language are sometimes offered to parents in the expectation that this will help them to support their children's education and to use the minority language in the home. Providing

  4. The Nature of the Language Faculty and Its Implications for Evolution of Language (Reply to Fitch, Hauser, and Chomsky)

    ERIC Educational Resources Information Center

    Jackendoff, Ray; Pinker, Steven

    2005-01-01

    In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the ''narrow language faculty'') consists only of recursion, and that this part cannot be considered an adaptation to communication. We

  5. Naturalism and Ideological Work: How Is Family Language Policy Renegotiated as Both Parents and Children Learn a Threatened Minority Language?

    ERIC Educational Resources Information Center

    Armstrong, Timothy Currie

    2014-01-01

    Parents who enroll their children to be educated through a threatened minority language frequently do not speak that language themselves and classes in the language are sometimes offered to parents in the expectation that this will help them to support their children's education and to use the minority language in the home. Providing…

  6. A Cognitive Neural Architecture Able to Learn and Communicate through Natural Language.

    PubMed

    Golosio, Bruno; Cangelosi, Angelo; Gamotina, Olesya; Masala, Giovanni Luca

    2015-01-01

    Communicative interactions involve a kind of procedural knowledge that is used by the human brain for processing verbal and nonverbal inputs and for language production. Although considerable work has been done on modeling human language abilities, it has been difficult to bring them together into a comprehensive tabula rasa system compatible with current knowledge of how verbal information is processed in the brain. This work presents a cognitive system, entirely based on a large-scale neural architecture, which was developed to shed light on the procedural knowledge involved in language elaboration. The main component of this system is the central executive, which is a supervising system that coordinates the other components of the working memory. In our model, the central executive is a neural network that takes as input the neural activation states of the short-term memory and yields as output mental actions, which control the flow of information among the working memory components through neural gating mechanisms. The proposed system is capable of learning to communicate through natural language starting from tabula rasa, without any a priori knowledge of the structure of phrases, meaning of words, role of the different classes of words, only by interacting with a human through a text-based interface, using an open-ended incremental learning process. It is able to learn nouns, verbs, adjectives, pronouns and other word classes, and to use them in expressive language. The model was validated on a corpus of 1587 input sentences, based on literature on early language assessment, at the level of about a 4-year-old child, and produced 521 output sentences, expressing a broad range of language processing functionalities. PMID:26560154

  7. A Cognitive Neural Architecture Able to Learn and Communicate through Natural Language

    PubMed Central

    Golosio, Bruno; Cangelosi, Angelo; Gamotina, Olesya; Masala, Giovanni Luca

    2015-01-01

    Communicative interactions involve a kind of procedural knowledge that is used by the human brain for processing verbal and nonverbal inputs and for language production. Although considerable work has been done on modeling human language abilities, it has been difficult to bring them together into a comprehensive tabula rasa system compatible with current knowledge of how verbal information is processed in the brain. This work presents a cognitive system, entirely based on a large-scale neural architecture, which was developed to shed light on the procedural knowledge involved in language elaboration. The main component of this system is the central executive, which is a supervising system that coordinates the other components of the working memory. In our model, the central executive is a neural network that takes as input the neural activation states of the short-term memory and yields as output mental actions, which control the flow of information among the working memory components through neural gating mechanisms. The proposed system is capable of learning to communicate through natural language starting from tabula rasa, without any a priori knowledge of the structure of phrases, meaning of words, role of the different classes of words, only by interacting with a human through a text-based interface, using an open-ended incremental learning process. It is able to learn nouns, verbs, adjectives, pronouns and other word classes, and to use them in expressive language. The model was validated on a corpus of 1587 input sentences, based on literature on early language assessment, at the level of about a 4-year-old child, and produced 521 output sentences, expressing a broad range of language processing functionalities. PMID:26560154

  8. Grammar as a Programming Language. Artificial Intelligence Memo 391.

    ERIC Educational Resources Information Center

    Rowe, Neil

    Student projects that involve writing generative grammars in the computer language, "LOGO," are described in this paper, which presents a grammar-running control structure that allows students to modify and improve the grammar interpreter itself while learning how a simple kind of computer parser works. Included are procedures for programing a…

  9. Extending the VA CPRS electronic patient record order entry system using natural language processing techniques.

    PubMed Central

    Lovis, C.; Payne, T. H.

    2000-01-01

    An automated practitioner order entry system was recently implemented at the VA Puget Sound Health Care System. Since the introduction of this system, we have experienced various problems, among them an increase in the time required for practitioners to enter orders. In order to improve usability and acceptance of the order entry, an alternate pathway was built within CPRS that allows direct natural-language-based order entry. Implementation of the extension in CPRS has been made possible because of the three-layer CPRS architecture and its strong object-oriented models. This paper discusses the advantages of and need for a natural-language-based order entry system and its implementation within an existing order entry system. PMID:11079937

  10. Using Open Geographic Data to Generate Natural Language Descriptions for Hydrological Sensor Networks.

    PubMed

    Molina, Martin; Sanchez-Soriano, Javier; Corcho, Oscar

    2015-01-01

    Providing descriptions of isolated sensors and sensor networks in natural language, understandable by the general public, is useful to help users find relevant sensors and analyze sensor data. In this paper, we discuss the feasibility of using geographic knowledge from public databases available on the Web (such as OpenStreetMap, Geonames, or DBpedia) to automatically construct such descriptions. We present a general method that uses such information to generate sensor descriptions in natural language. The results of the evaluation of our method in a hydrologic national sensor network showed that this approach is feasible and capable of generating adequate sensor descriptions with a lower development effort compared to other approaches. In the paper we also analyze certain problems that we found in public databases (e.g., heterogeneity, non-standard use of labels, or rigid search methods) and their impact in the generation of sensor descriptions. PMID:26151211
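
    A minimal template-based sketch of the generation step, assuming invented field names and an invented sensor record; in the paper these attributes come from open databases such as OpenStreetMap, GeoNames, or DBpedia.

    ```python
    # Generate a natural language description of a sensor from structured
    # geographic attributes. Field names and the example record are invented.
    def describe_sensor(s):
        parts = [f"{s['type']} sensor {s['id']} measures {s['variable']}"]
        if s.get("river"):
            parts.append(f"on the {s['river']} river")
        if s.get("town") and s.get("distance_km") is not None:
            parts.append(f"about {s['distance_km']:.0f} km from {s['town']}")
        return " ".join(parts) + "."

    sensor = {"id": "H-042", "type": "hydrological", "variable": "water level",
              "river": "Ebro", "town": "Zaragoza", "distance_km": 12.4}
    print(describe_sensor(sensor))
    # -> hydrological sensor H-042 measures water level on the Ebro river about 12 km from Zaragoza.
    ```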

  11. Adapting existing natural language processing resources for cardiovascular risk factors identification in clinical notes.

    PubMed

    Khalifa, Abdulrahman; Meystre, Stéphane

    2015-12-01

    The 2014 i2b2 natural language processing shared task focused on identifying cardiovascular risk factors such as high blood pressure, high cholesterol levels, obesity and smoking status, among other factors found in health records of diabetic patients. In addition, the task involved detecting medications and time information associated with the extracted data. This paper presents the development and evaluation of a natural language processing (NLP) application conceived for this i2b2 shared task. For increased efficiency, the application's main components were adapted from two existing NLP tools implemented in the Apache UIMA framework: Textractor (for dictionary-based lookup) and cTAKES (for preprocessing and smoking status detection). The application achieved a final (micro-averaged) F1-measure of 87.5% on the final evaluation test set. Our attempt was mostly based on existing tools adapted with minimal changes and allowed for satisfactory performance with limited development effort. PMID:26318122
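
    A short sketch of the micro-averaged F1 measure reported above: pool true positives, false positives, and false negatives across categories before computing precision and recall. The per-category counts below are invented.

    ```python
    # Micro-averaged F1: aggregate counts across all categories first.
    counts = {  # category -> (tp, fp, fn); invented numbers
        "hypertension":   (90, 10, 12),
        "hyperlipidemia": (80, 6, 15),
        "obesity":        (40, 4, 9),
        "smoker":         (55, 7, 5),
    }

    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    micro_f1 = 2 * precision * recall / (precision + recall)
    print(f"micro-averaged F1 = {micro_f1:.3f}")
    ```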

  12. New Trends in Computing Anticipatory Systems : Emergence of Artificial Conscious Intelligence with Machine Learning Natural Language

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2008-10-01

    This paper deals with the challenge of creating an Artificial Intelligence System with an Artificial Consciousness. For that, an introduction to computing anticipatory systems is presented, with the definitions of strong and weak anticipation. The quasi-anticipatory systems of Robert Rosen are linked to open-loop controllers. Then, some properties of the natural brain are presented in relation to the triune brain theory of Paul D. MacLean and the mind time of Benjamin Libet, with his veto of free will. The theory of hyperincursive discrete anticipatory systems is recalled in order to introduce the concept of hyperincursive free will, which gives a similar veto mechanism: free will as unpredictable hyperincursive anticipation. The concepts of endo-anticipation and exo-anticipation are then defined. Finally, some ideas about artificial conscious intelligence with natural language are presented, in relation to the Turing Machine, Formal Language, Intelligent Agents and Multi-Agent Systems.

  13. Using Open Geographic Data to Generate Natural Language Descriptions for Hydrological Sensor Networks

    PubMed Central

    Molina, Martin; Sanchez-Soriano, Javier; Corcho, Oscar

    2015-01-01

    Providing descriptions of isolated sensors and sensor networks in natural language, understandable by the general public, is useful to help users find relevant sensors and analyze sensor data. In this paper, we discuss the feasibility of using geographic knowledge from public databases available on the Web (such as OpenStreetMap, Geonames, or DBpedia) to automatically construct such descriptions. We present a general method that uses such information to generate sensor descriptions in natural language. The results of the evaluation of our method in a hydrologic national sensor network showed that this approach is feasible and capable of generating adequate sensor descriptions with a lower development effort compared to other approaches. In the paper we also analyze certain problems that we found in public databases (e.g., heterogeneity, non-standard use of labels, or rigid search methods) and their impact in the generation of sensor descriptions. PMID:26151211

  14. Natural language processing-based COTS software and related technologies survey.

    SciTech Connect

    Stickland, Michael G.; Conrad, Gregory N.; Eaton, Shelley M.

    2003-09-01

    Natural language processing-based knowledge management software, traditionally developed for security organizations, is now becoming commercially available. An informal survey was conducted to discover and examine current NLP and related technologies and potential applications for information retrieval, information extraction, summarization, categorization, terminology management, link analysis, and visualization for possible implementation at Sandia National Laboratories. This report documents our current understanding of the technologies, lists software vendors and their products, and identifies potential applications of these technologies.

  15. Laboratory process control using natural language commands from a personal computer

    NASA Technical Reports Server (NTRS)

    Will, Herbert A.; Mackin, Michael A.

    1989-01-01

    PC software is described which provides flexible natural language process control capability with an IBM PC or compatible machine. Hardware requirements include the PC, and suitable hardware interfaces to all controlled devices. Software required includes the Microsoft Disk Operating System (MS-DOS) operating system, a PC-based FORTRAN-77 compiler, and user-written device drivers. Instructions for use of the software are given as well as a description of an application of the system.

  16. Agile sensor tasking for CoIST using natural language knowledge representation and reasoning

    NASA Astrophysics Data System (ADS)

    Braines, David; de Mel, Geeth; Gwilliams, Chris; Parizas, Christos; Pizzocaro, Diego; Bergamaschi, Flavio; Preece, Alun

    2014-06-01

    We describe a system architecture aimed at supporting Intelligence, Surveillance, and Reconnaissance (ISR) activities in a Company Intelligence Support Team (CoIST) using natural language-based knowledge representation and reasoning, and semantic matching of mission tasks to ISR assets. We illustrate an application of the architecture using a High Value Target (HVT) surveillance scenario which demonstrates semi-automated matching and assignment of appropriate ISR assets based on information coming in from existing sensors and human patrols operating in an area of interest and encountering a potential HVT vehicle. We highlight a number of key components of the system but focus mainly on the human/machine conversational interaction involving soldiers in the field providing input in natural language via spoken voice to a mobile device, which is then converted to machine-processable Controlled Natural Language (CNL) and confirmed with the soldier. The system also supports CoIST analysts obtaining real-time situation awareness of the unfolding events through fused CNL information via tools available at the Command and Control (C2). The system demonstrates various modes of operation including automatic task assignment following inference of new high-importance information, as well as semi-automatic processing, providing the CoIST analyst with situation awareness information relevant to the area of operation.

  17. De-Centering English: Highlighting the Dynamic Nature of the English Language to Promote the Teaching of Code-Switching

    ERIC Educational Resources Information Center

    White, John W.

    2011-01-01

    Embracing the dynamic nature of the English language can help students learn more about all forms of English. To fully engage students, teachers should not adhere to an anachronistic and static view of English. Instead, they must acknowledge, accept, and even use different language forms within the classroom to make that classroom dynamic, inclusive,…

  18. Exploring culture, language and the perception of the nature of science

    NASA Astrophysics Data System (ADS)

    Sutherland, Dawn

    2002-01-01

    One dimension of early Canadian education is the attempt of the government to use the education system as an assimilative tool to integrate the First Nations and Metis people into Euro-Canadian society. Despite these attempts, many First Nations and Metis people retained their culture and their indigenous language. Few science educators have examined First Nations and Western scientific worldviews and the impact they may have on science learning. This study explored the views some First Nations (Cree) and Euro-Canadian Grade-7-level students in Manitoba had about the nature of science. Both qualitative (open-ended questions and interviews) and quantitative (a Likert-scale questionnaire) instruments were used to explore student views. A central hypothesis of this research programme is that the different worldviews of the two student populations, Cree and Euro-Canadian, are likely to influence their perceptions of science. This preliminary study explored a range of methodologies to probe the perceptions of the nature of science in these two student populations. It was found that the two cultural groups differed significantly on some of the tenets in a Nature of Scientific Knowledge Scale (NSKS). Cree students differed significantly from Euro-Canadian students on the developmental, testable and unified tenets of the nature of scientific knowledge scale. No significant differences were found in NSKS scores between language groups (Cree students who speak English in the home and those who speak English and Cree or Cree only). The differences found between language groups were primarily in the open-ended questions where preformulated responses were absent. Interviews about critical incidents provided more detailed accounts of the Cree students' perception of the nature of science. The implications of the findings of this study are discussed in relation to the challenges related to research methodology, further areas for investigation, science teaching in First Nations communities and science curriculum development.

  19. Computerized measurement of the content analysis of natural language for use in biomedical and neuropsychiatric research.

    PubMed

    Gottschalk, L A; Bechtel, R

    1995-07-01

    Over several decades, the senior author, with various colleagues, has developed an objective method of measuring the magnitude of commonly useful and pertinent neuropsychiatric and neuropsychological dimensions from the content and form analysis of verbal behavior and natural language. Extensive reliability and validation studies using this method have been published involving English, German, Spanish and many other languages, confirming that these Content Analysis Scales can be reliably scored cross-culturally and have construct validity. The validated measures include the Anxiety Scale (and six subscales), the Hostility Outward Scale (and two subscales), the Hostility In Scale, the Ambivalent Hostility Scale, the Social Alienation-Personal Disorganization Scale, the Cognitive Impairment Scale, the Depression Scale (and seven subscales), and the Hope Scale. Here, the authors report the development of artificial intelligence (LISP-based) software that can reliably score these Content Analysis Scales, an achievement that facilitates the application of these measures to biomedical and neuropsychiatric research. PMID:7587159

  20. MEDSYNDIKATE--a natural language system for the extraction of medical information from findings reports.

    PubMed

    Hahn, Udo; Romacker, Martin; Schulz, Stefan

    2002-12-01

    MEDSYNDIKATE is a natural language processor that automatically acquires medical information from findings reports. In the course of text analysis, the reports' contents are transferred to conceptual representation structures, which constitute a corresponding text knowledge base. MEDSYNDIKATE is particularly adapted to deal properly with text structures, such as various forms of anaphoric reference relations spanning several sentences. The strong demands MEDSYNDIKATE places on the availability of expressive knowledge sources are accounted for by two alternative approaches to acquiring medical domain knowledge (semi)automatically. We also present data for the information extraction performance of MEDSYNDIKATE in terms of the semantic interpretation of three major syntactic patterns in medical documents. PMID:12460632

  1. Evaluation of unsupervised semantic mapping of natural language with Leximancer concept mapping.

    PubMed

    Smith, Andrew E; Humphreys, Michael S

    2006-05-01

    The Leximancer system is a relatively new method for transforming lexical co-occurrence information from natural language into semantic patterns in an unsupervised manner. It employs two stages of co-occurrence information extraction (semantic and relational), using a different algorithm for each stage. The algorithms used are statistical, but they employ nonlinear dynamics and machine learning. This article is an attempt to validate the output of Leximancer, using a set of evaluation criteria taken from content analysis that are appropriate for knowledge discovery tasks. PMID:16956103
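
    To make the notion of lexical co-occurrence extraction concrete, here is a minimal sliding-window co-occurrence counter in Python; it is a simplified stand-in for the kind of statistics such systems start from, not the Leximancer algorithm itself.

        # Sliding-window co-occurrence counts (illustrative; not Leximancer's
        # two-stage semantic/relational extraction).
        from collections import Counter
        from itertools import combinations

        def cooccurrence_counts(tokens, window=3):
            """Count how often each unordered word pair falls within the same window."""
            counts = Counter()
            for i in range(len(tokens)):
                span = set(tokens[i:i + window])
                for pair in combinations(sorted(span), 2):
                    counts[pair] += 1
            return counts

        tokens = "the parser maps natural language text to semantic patterns".split()
        for pair, n in cooccurrence_counts(tokens).most_common(3):
            print(pair, n)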

  2. Practical systems use natural languages and store human expertise (artificial intelligence)

    SciTech Connect

    Evanczuk, S.; Manuel, T.

    1983-12-01

    For earlier articles see T. Manuel et al., ibid., vol.56, no.22, p.127-37. This second part of a special report on commercial applications of artificial intelligence examines the milestones which mark this major new path for the software industry. It covers state-space search, the problem of ambiguity, augmented transition networks, early commercial products, current and expected personal computer software, natural-language interfaces, research projects, knowledge engineering, the workings of artificial-intelligence-based applications programs, LISP, attributes and object orientation.

  3. Knowledge acquisition from natural language for expert systems based on classification problem-solving methods

    NASA Technical Reports Server (NTRS)

    Gomez, Fernando

    1989-01-01

    It is shown how certain kinds of domain independent expert systems based on classification problem-solving methods can be constructed directly from natural language descriptions by a human expert. The expert knowledge is not translated into production rules. Rather, it is mapped into conceptual structures which are integrated into long-term memory (LTM). The resulting system is one in which problem-solving, retrieval and memory organization are integrated processes. In other words, the same algorithm and knowledge representation structures are shared by these processes. As a result of this, the system can answer questions, solve problems or reorganize LTM.

  4. Aspects of a Natural Language Based Artificial Intelligence System Report Number Seven: Language and the Structure of Knowledge.

    ERIC Educational Resources Information Center

    Borden, George A.

    ARIS is an artificial intelligence system which uses the English language to learn, understand, and communicate. The system attempts to simulate the psychoneurological processes which enable man to communicate verbally. It uses a modified stratificational grammar model and is being programed in PL/1 (a programing language) for an IBM 360/67…

  5. Fremdsprachenunterricht und natuerliche Zweitsprachigkeit: Spracherwerbssituationen im Vergleich (Foreign Language Teaching and Natural Bilingualism; A Comparison of language Learning Situations).

    ERIC Educational Resources Information Center

    Butzkamm, Wolfgang

    1978-01-01

    A 6th-grade test in English as a foreign language is described and contrasted with second language acquisition by the children of foreign laborers in Germany. The latter calls for special teaching procedures; tests and tapes are described. The teacher's "feel" is considered more important than "scientific" methodology. (IFS/WGA)

  6. Language.

    PubMed

    Cattaneo, Luigi

    2013-01-01

    Noninvasive focal brain stimulation by means of transcranial magnetic stimulation (TMS) has been used extensively in the past 20 years to investigate normal language functions. The picture emerging from this collection of empirical works is that of several independent modular functions mapped on left-lateralized temporofrontal circuits originating dorsally or ventrally to the auditory cortex. The identification of sounds as language (i.e., phonological transformations) is modulated by TMS applied over the posterior-superior temporal cortex and over the caudal inferior frontal gyrus/ventral premotor cortex complex. Conversely, attribution of semantics to words is modulated successfully by applying TMS to the rostral part of the inferior frontal gyrus. Speech production is typically interfered with by TMS applied to the left inferior frontal gyrus, onto the same cortical areas that also contain phonological representations. The cortical mapping of grammatical functions has been investigated with TMS mainly regarding the category of verbs, which seem to be represented in the left middle frontal gyrus. Most TMS studies have investigated the cortical processing of single words or sublexical elements. Conversely, complex elements of language such as syntax have not been investigated extensively, although a few studies have indicated a left temporal, frontal, and parietal system also involving the neocerebellar cortex. Finally, both the perception and production of nonlinguistic communicative properties of speech, such as prosody, have been mapped by TMS in the peri-Silvian region of the right hemisphere. PMID:24112933

  7. A Principled Framework for Constructing Natural Language Interfaces To Temporal Databases

    NASA Astrophysics Data System (ADS)

    Androutsopoulos, Ion

    1996-09-01

    Most existing natural language interfaces to databases (NLIDBs) were designed to be used with "snapshot" database systems, which provide very limited facilities for manipulating time-dependent data. Consequently, most NLIDBs also provide very limited support for the notion of time. The database community is becoming increasingly interested in temporal database systems. These are intended to store and manipulate in a principled manner information not only about the present, but also about the past and future. This thesis develops a principled framework for constructing English NLIDBs for temporal databases (NLITDBs), drawing on research in tense and aspect theories, temporal logics, and temporal databases. I first explore temporal linguistic phenomena that are likely to appear in English questions to NLITDBs. Drawing on existing linguistic theories of time, I formulate an account for a large number of these phenomena that is simple enough to be embodied in practical NLITDBs. Exploiting ideas from temporal logics, I then define a temporal meaning representation language, TOP, and I show how the HPSG grammar theory can be modified to incorporate the tense and aspect account of this thesis, and to map a wide range of English questions involving time to appropriate TOP expressions. Finally, I present and prove the correctness of a method to translate from TOP to TSQL2, TSQL2 being a temporal extension of the SQL-92 database language. This way, I establish a sound route from English questions involving time to a general-purpose temporal database language, which can act as a principled framework for building NLITDBs. To demonstrate that this framework is workable, I employ it to develop a prototype NLITDB, implemented using ALE and Prolog.

  8. Formal ontology for natural language processing and the integration of biomedical databases.

    PubMed

    Simon, Jonathan; Dos Santos, Mariana; Fielding, James; Smith, Barry

    2006-01-01

    The central hypothesis underlying this communication is that the methodology and conceptual rigor of a philosophically inspired formal ontology can bring significant benefits in the development and maintenance of application ontologies [A. Flett, M. Dos Santos, W. Ceusters, Some Ontology Engineering Procedures and their Supporting Technologies, EKAW2002, 2003]. This hypothesis has been tested in the collaboration between Language and Computing (L&C), a company specializing in software for supporting natural language processing especially in the medical field, and the Institute for Formal Ontology and Medical Information Science (IFOMIS), an academic research institution concerned with the theoretical foundations of ontology. In the course of this collaboration L&C's ontology, LinKBase, which is designed to integrate and support reasoning across a plurality of external databases, has been subjected to a thorough auditing on the basis of the principles underlying IFOMIS's Basic Formal Ontology (BFO) [B. Smith, Basic Formal Ontology, 2002. http://ontology.buffalo.edu/bfo]. The goal is to transform a large terminology-based ontology into one with the ability to support reasoning applications. Our general procedure has been the implementation of a meta-ontological definition space in which the definitions of all the concepts and relations in LinKBase are standardized in the framework of first-order logic. In this paper we describe how this principles-based standardization has led to a greater degree of internal coherence of the LinKBase structure, and how it has facilitated the construction of mappings between external databases using LinKBase as translation hub. We argue that the collaboration here described represents a new phase in the quest to solve the so-called "Tower of Babel" problem of ontology integration [F. Montayne, J. Flanagan, Formal Ontology: The Foundation for Natural Language Processing, 2003. http://www.landcglobal.com/]. PMID:16153885

  9. Language of the Earth: Exploring Natural Hazards through a Literary Anthology

    NASA Astrophysics Data System (ADS)

    Malamud, B. D.; Rhodes, F. H. T.

    2009-04-01

    This paper explores natural hazards teaching and communication through the use of a literary anthology of writings about the earth aimed at non-experts. Teaching natural hazards in high-school and university introductory Earth Science and Geography courses revolves mostly around lectures, examinations, and laboratory demonstrations/activities. Often the results of such a course are that a student 'memorizes' the answers, and is penalized when they miss a given fact [e.g., "You lost one point because you were off by 50 km/hr on the wind speed of an F5 tornado."] Although facts and general methodologies are certainly important when teaching natural hazards, a student's assimilation of, and enthusiasm for, this knowledge is strongly reinforced when it is supplemented by writings about the Earth. In this paper, we discuss a literary anthology which we developed [Language of the Earth, Rhodes, Stone, Malamud, Wiley-Blackwell, 2008] and which includes many descriptions of natural hazards. Using first- and second-hand accounts of landslides, earthquakes, tsunamis, floods and volcanic eruptions, through the writings of McPhee, Gaskill, Voltaire, Austin, Cloos, and many others, hazards become 'alive', and more than 'just' a compilation of facts and processes. Using short excerpts of remarkably written accounts and discussions of natural hazards, from this or other similar anthologies, turns 'dry' facts into more than just facts. These often highly personal viewpoints of our catastrophic world provide a useful supplement to a student's understanding of the turbulent world in which we live.

  10. Linking sounds to meanings: Infant statistical learning in a natural language

    PubMed Central

    Hay, Jessica F.; Pelucchi, Bruna; Estes, Katharine Graf; Saffran, Jenny R.

    2011-01-01

    The processes of infant word segmentation and infant word learning have largely been studied separately. However, the ease with which potential word forms are segmented from fluent speech seems likely to influence subsequent mappings between words and their referents. To explore this process, we tested the link between the statistical coherence of sequences presented in fluent speech and infants' subsequent use of those sequences as labels for novel objects. Notably, the materials were drawn from a natural language unfamiliar to the infants (Italian). The results of three experiments suggest that there is a close relationship between the statistics of the speech stream and subsequent mapping of labels to referents. Mapping was facilitated when the labels contained high transitional probabilities in the forward and/or backward direction (Experiment 1). When no transitional probability information was available (Experiment 2), or when the internal transitional probabilities of the labels were low in both directions (Experiment 3), infants failed to link the labels to their referents. Word learning appears to be strongly influenced by infants' prior experience with the distribution of sounds that make up words in natural languages. PMID:21762650
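
    The transitional probabilities referred to here have a simple definition: the forward probability of a syllable pair AB is freq(AB)/freq(A), and the backward probability is freq(AB)/freq(B). A short Python sketch follows; the syllable stream is invented, not the Italian stimuli used in the study.

        # Forward and backward transitional probabilities over a syllable stream.
        from collections import Counter

        def transitional_probabilities(syllables):
            unigrams = Counter(syllables)
            bigrams = Counter(zip(syllables, syllables[1:]))
            forward = {pair: n / unigrams[pair[0]] for pair, n in bigrams.items()}
            backward = {pair: n / unigrams[pair[1]] for pair, n in bigrams.items()}
            return forward, backward

        stream = "fu ga bi no ka fu ga bi to la fu ga".split()
        fwd, bwd = transitional_probabilities(stream)
        print(fwd[("fu", "ga")], bwd[("fu", "ga")])  # both 1.0: 'fu ga' is coherent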

  11. Knowledge-based machine indexing from natural language text: Knowledge base design, development, and maintenance

    NASA Technical Reports Server (NTRS)

    Genuardi, Michael T.

    1993-01-01

    One strategy for machine-aided indexing (MAI) is to provide a concept-level analysis of the textual elements of documents or document abstracts. In such systems, natural-language phrases are analyzed in order to identify and classify concepts related to a particular subject domain. The overall performance of these MAI systems is largely dependent on the quality and comprehensiveness of their knowledge bases. These knowledge bases function to (1) define the relations between a controlled indexing vocabulary and natural language expressions; (2) provide a simple mechanism for disambiguation and the determination of relevancy; and (3) allow the extension of concept-hierarchical structure to all elements of the knowledge file. After a brief description of the NASA Machine-Aided Indexing system, concerns related to the development and maintenance of MAI knowledge bases are discussed. Particular emphasis is given to statistically-based text analysis tools designed to aid the knowledge base developer. One such tool, the Knowledge Base Building (KBB) program, presents the domain expert with a well-filtered list of synonyms and conceptually-related phrases for each thesaurus concept. Another tool, the Knowledge Base Maintenance (KBM) program, functions to identify areas of the knowledge base affected by changes in the conceptual domain (for example, the addition of a new thesaurus term). An alternate use of the KBM as an aid in thesaurus construction is also discussed.

  12. Gesture language use in natural UI: pen-based sketching in conceptual design

    NASA Astrophysics Data System (ADS)

    Ma, Cuixia; Dai, Guozhong

    2003-04-01

    Natural user interfaces are an important class of next-generation interaction. Computers are no longer tools for a few specialists but for most people, and ubiquitous computing makes the world more convenient. In the design domain, current systems, which require detailed information, cannot conveniently support the conceptual design of the early phase. Pen and paper are natural, simple tools in daily life, especially in design, and gestures are a useful, natural mode of pen-based interaction. In a natural UI, gestures can be introduced and used in a manner similar to existing interaction resources. However, gestures are usually defined beforehand without regard to the users' intentions and are recognized to represent something only within certain applications, without being transferable to others. We propose a gesture description language (GDL) that allows useful gestures to be brought into applications conveniently, so that gestures can serve as an independent control resource, much like menus or icons. We therefore present the idea from two perspectives: an application-dependent point of view and an application-independent point of view.

  13. Abductive Equivalential Translation and its application to Natural Language Database Interfacing

    NASA Astrophysics Data System (ADS)

    Rayner, Manny

    1994-05-01

    The thesis describes a logical formalization of natural-language database interfacing. We assume the existence of a "natural language engine" capable of mediating between surface linguistic strings and their representations as "literal" logical forms: the focus of interest will be the question of relating "literal" logical forms to representations in terms of primitives meaningful to the underlying database engine. We begin by describing the nature of the problem, and show how a variety of interface functionalities can be considered as instances of a type of formal inference task which we call "Abductive Equivalential Translation" (AET); functionalities which can be reduced to this form include answering questions, responding to commands, reasoning about the completeness of answers, answering meta-questions of the type "Do you know...", and generating assertions and questions. In each case, a "linguistic domain theory" (LDT) Γ and an input formula F are given, and the goal is to construct a formula with certain properties which is equivalent to F, given Γ and a set of permitted assumptions. If the LDT is of a certain specified type, whose formulas are either conditional equivalences or Horn clauses, we show that the AET problem can be reduced to a goal-directed inference method. We present an abstract description of this method, and sketch its realization in Prolog. The relationship between AET and several problems previously discussed in the literature is discussed. In particular, we show how AET can provide a simple and elegant solution to the so-called "Doctor on Board" problem, and in effect allows a "relativization" of the Closed World Assumption. The ideas in the thesis have all been implemented concretely within the SRI CLARE project, using a real projects and payments database. The LDT for the example database is described in detail, and examples of the types of functionality that can be achieved within the example domain are presented.

  14. Thermo-msf-parser: an open source Java library to parse and visualize Thermo Proteome Discoverer msf files.

    PubMed

    Colaert, Niklaas; Barsnes, Harald; Vaudel, Marc; Helsens, Kenny; Timmerman, Evy; Sickmann, Albert; Gevaert, Kris; Martens, Lennart

    2011-08-01

    The Thermo Proteome Discoverer program integrates both peptide identification and quantification into a single workflow for peptide-centric proteomics. Furthermore, its close integration with Thermo mass spectrometers has made it increasingly popular in the field. Here, we present a Java library to parse the msf files that constitute the output of Proteome Discoverer. The parser is also implemented in a graphical user interface allowing convenient access to the information found in the msf files, and in Rover, a program to analyze and validate quantitative proteomics information. All code, binaries, and documentation are freely available at http://thermo-msf-parser.googlecode.com. PMID:21714566
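
    Since the msf files produced by Proteome Discoverer are SQLite databases (which is what the Java library parses), their contents can also be inspected directly; a minimal Python sketch, with "results.msf" as a placeholder file name:

        # Peek at the tables inside an msf file without the Java library
        # (illustrative; "results.msf" is a placeholder path).
        import sqlite3

        with sqlite3.connect("results.msf") as conn:
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            ).fetchall()
        for (name,) in rows:
            print(name)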

  15. Neural substrates of figurative language during natural speech perception: an fMRI study

    PubMed Central

    Nagels, Arne; Kauschke, Christina; Schrauf, Judith; Whitney, Carin; Straube, Benjamin; Kircher, Tilo

    2013-01-01

    Many figurative expressions are fully conventionalized in everyday speech. Regarding the neural basis of figurative language processing, research has predominantly focused on metaphoric expressions in minimal semantic context. It remains unclear in how far metaphoric expressions during continuous text comprehension activate similar neural networks as isolated metaphors. We therefore investigated the processing of similes (figurative language, e.g., He smokes like a chimney!) occurring in a short story. Sixteen healthy, male, native German speakers listened to similes that came about naturally in a short story, while blood-oxygenation-level-dependent (BOLD) responses were measured with functional magnetic resonance imaging (fMRI). For the event-related analysis, similes were contrasted with non-figurative control sentences (CS). The stimuli differed with respect to figurativeness, while they were matched for frequency of words, number of syllables, plausibility, and comprehensibility. Similes contrasted with CS resulted in enhanced BOLD responses in the left inferior (IFG) and adjacent middle frontal gyrus. Concrete CS as compared to similes activated the bilateral middle temporal gyri as well as the right precuneus and the left middle frontal gyrus (LMFG). Activation of the left IFG for similes in a short story is consistent with results on single sentence metaphor processing. The findings strengthen the importance of the left inferior frontal region in the processing of abstract figurative speech during continuous, ecologically-valid speech comprehension; the processing of concrete semantic contents goes along with a down-regulation of bilateral temporal regions. PMID:24065897

  16. Natural Language Query in the Biochemistry and Molecular Biology Domains Based on Cognition Search™

    PubMed Central

    Goldsmith, Elizabeth J.; Mendiratta, Saurabh; Akella, Radha; Dahlgren, Kathleen

    2009-01-01

    Motivation: With the increasing volume of scientific papers and heterogeneous nomenclature in the biomedical literature, it is apparent that an improvement over standard pattern matching available in existing search engines is required. Cognition Search Information Retrieval (CSIR) is a natural language processing (NLP) technology that possesses a large dictionary (lexicon) and large semantic databases, such that search can be based on meaning. Encoded synonymy, ontological relationships, phrases, and seeds for word sense disambiguation offer significant improvement over pattern matching. Thus, the CSIR has the right architecture to form the basis for a scientific search engine. Result: Here we have augmented CSIR to improve access to the MEDLINE database of scientific abstracts. New biochemical, molecular biological and medical language and acronyms were introduced from curated web-based sources. The resulting system was used to interpret MEDLINE abstracts. Meaning-based search of MEDLINE abstracts yields high precision (estimated at >90%), and high recall (estimated at >90%), where synonym, ontology, phrases and sense seeds have been encoded. The present implementation can be found at http://MEDLINE.cognition.com. Contact: Elizabeth.goldsmith@UTsouthwestern.edu Kathleen.dahlgren@cognition.com PMID:21347167

  18. Detection of practice pattern trends through Natural Language Processing of clinical narratives and biomedical literature.

    PubMed

    Chen, Elizabeth S; Stetson, Peter D; Lussier, Yves A; Markatou, Marianthi; Hripcsak, George; Friedman, Carol

    2007-01-01

    Clinical knowledge, best evidence, and practice patterns evolve over time. The ability to track these changes and study practice trends may be valuable for performance measurement and quality improvement efforts. The goal of this study was to assess the feasibility and validity of methods to generate and compare trends in biomedical literature and clinical narrative. We focused on the challenge of detecting trends in medication usage over time for two diseases: HIV/AIDS and asthma. Information about disease-specific medications in published randomized control trials and discharge summaries at NewYork-Presbyterian Hospital over a ten-year period were extracted using Natural Language Processing. This paper reports on the ability of our semi-automated process to discover disease-drug practice pattern trends and interpretation of findings across the biomedical and clinical text sources. PMID:18693810

  19. Interset: A natural language interface for teleoperated robotic assembly of the EASE space structure

    NASA Technical Reports Server (NTRS)

    Boorsma, Daniel K.

    1989-01-01

    A teleoperated robot was used to assemble the Experimental Assembly of Structures in Extra-vehicular activity (EASE) space structure under neutral buoyancy conditions, simulating a telerobot performing structural assembly in the zero gravity of space. This previous work used a manually controlled teleoperator as a test bed for system performance evaluations. From these results several Artificial Intelligence options were proposed. One of these was further developed into a real time assembly planner. The interface for this system is effective in assembling EASE structures using windowed graphics and a set of networked menus. As the problem space becomes more complex and hence the set of control options increases, a natural language interface may prove to be beneficial to supplement the menu based control strategy. This strategy can be beneficial in situations such as: describing the local environment, maintaining a data base of task event histories, modifying a plan or a heuristic dynamically, summarizing a task in English, or operating in a novel situation.

  20. HUNTER-GATHERER: Three search techniques integrated for natural language semantics

    SciTech Connect

    Beale, S.; Nirenburg, S.; Mahesh, K.

    1996-12-31

    This work integrates three related AI search techniques (constraint satisfaction, branch-and-bound, and solution synthesis) and applies the result to semantic processing in natural language (NL). We summarize the approach as "Hunter-Gatherer": (1) branch-and-bound and constraint satisfaction allow us to "hunt down" non-optimal and impossible solutions and prune them from the search space; (2) solution synthesis methods then "gather" all optimal solutions while avoiding exponential complexity. Each of the three techniques is briefly described, as well as their extensions and combinations used in our system. We focus on the combination of solution synthesis and branch-and-bound methods, which has enabled near-linear-time processing in our applications. Finally, we illustrate how the use of our technique in a large-scale MT project allowed a drastic reduction in search space.

  1. Using natural language processing to analyze physician modifications to data entry templates.

    PubMed Central

    Wilcox, Adam B.; Narus, Scott P.; Bowes, Watson A.

    2002-01-01

    Efficient data entry by clinicians remains a significant challenge for electronic medical records. Current approaches have largely focused on either structured data entry, which can be limiting in expressive power, or free-text entry, which restricts the use of the data for automated decision support. Text-based templates are a semi-structured data entry method that has been used to assist physicians in manually entering clinical notes, by allowing them to edit predefined example notes. We analyzed changes made to 18,726 sentences from text templates, using a natural language processor. The most common changes were addition or deletion of normal observations, or changes in certainty. We identified common modifications that could be captured in structured form by a graphical user interface. PMID:12463955

  2. Using Natural Language Processing to Improve Accuracy of Automated Notifiable Disease Reporting

    PubMed Central

    Friedlin, Jeff; Grannis, Shaun; Overhage, J. Marc

    2008-01-01

    We examined whether using a natural language processing (NLP) system results in improved accuracy and completeness of automated electronic laboratory reporting (ELR) of notifiable conditions. We used data from a community-wide health information exchange that has automated ELR functionality. We focused on methicillin-resistant Staphylococcus aureus (MRSA), a reportable infection found in unstructured, free-text culture result reports. We used the Regenstrief EXtraction tool (REX) for this work. REX processed 64,554 reports that mentioned MRSA and we compared its output to a gold standard (human review). REX correctly identified 39,491 (99.96%) of the 39,508 reports positive for MRSA, and committed only 74 false positive errors. It achieved high sensitivity, specificity, positive predictive value and F-measure. REX identified over two times as many MRSA-positive reports as the ELR system without NLP. Using NLP can improve the completeness and accuracy of automated ELR. PMID:18999177
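
    The counts stated above are enough to reproduce the headline metrics; a short recomputation in Python:

        # Metrics recomputed from the counts in the abstract: 64,554 reports
        # mentioning MRSA, 39,508 truly positive, 39,491 identified by REX,
        # and 74 false positives.
        total, gold_pos, tp, fp = 64_554, 39_508, 39_491, 74
        fn = gold_pos - tp              # 17 missed reports
        tn = total - gold_pos - fp      # remaining true negatives

        sensitivity = tp / (tp + fn)    # recall
        specificity = tn / (tn + fp)
        ppv = tp / (tp + fp)            # positive predictive value (precision)
        f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)

        print(f"sensitivity={sensitivity:.4f} specificity={specificity:.4f} "
              f"PPV={ppv:.4f} F={f_measure:.4f}")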

  3. Disclosure Control of Natural Language Information to Enable Secure and Enjoyable Communication over the Internet

    NASA Astrophysics Data System (ADS)

    Kataoka, Haruno; Utsumi, Akira; Hirose, Yuki; Yoshiura, Hiroshi

    Disclosure control of natural language information (DCNL), which we are trying to realize, is described. DCNL will be used for securing human communications over the internet, such as through blogs and social network services. Before sentences in the communications are disclosed, they are checked by DCNL and any phrases that could reveal sensitive information are transformed or omitted so that they are no longer revealing. DCNL checks not only phrases that directly represent sensitive information but also those that indirectly suggest it. Combinations of phrases are also checked. DCNL automatically learns knowledge of sensitive phrases and the suggestive relations between phrases using co-occurrence analysis and Web retrieval. The users' burden is therefore minimized, i.e., they do not need to define many disclosure control rules. DCNL complements traditional access control in fields where reliability needs to be balanced with enjoyment and where object classes for access control cannot be predefined.

  4. Workshop on using natural language processing applications for enhancing clinical decision making: an executive summary

    PubMed Central

    Pai, Vinay M; Rodgers, Mary; Conroy, Richard; Luo, James; Zhou, Ruixia; Seto, Belinda

    2014-01-01

    In April 2012, the National Institutes of Health organized a two-day workshop entitled Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making (NLP-CDS). This report is a summary of the discussions during the second day of the workshop. Collectively, the workshop presenters and participants emphasized the need for unstructured clinical notes to be included in the decision making workflow and the need for individualized longitudinal data tracking. The workshop also discussed the need to: (1) combine evidence-based literature and patient records with machine-learning and prediction models; (2) provide trusted and reproducible clinical advice; (3) prioritize evidence and test results; and (4) engage healthcare professionals, caregivers, and patients. The overall consensus of the NLP-CDS workshop was that there are promising opportunities for NLP and CDS to deliver cognitive support for healthcare professionals, caregivers, and patients. PMID:23921193

  5. On application of image analysis and natural language processing for music search

    NASA Astrophysics Data System (ADS)

    Gwardys, Grzegorz

    2013-10-01

    In this paper, I investigate the problem of finding the most similar music tracks using techniques popular in Natural Language Processing, such as TF-IDF and LDA. I defined a document as a music track. Each music track is transformed into a spectrogram; thanks to that, I can use well-known techniques to get words from images. I used the SURF operator to detect characteristic points and a novel approach for their description. Standard k-means was used for clustering, which here is identical to dictionary making, so after that I can transform spectrograms into text documents and perform TF-IDF and LDA. Finally, I can make a query in the obtained vector space. The research was done on 16 music tracks for training and 336 for testing, split into four categories: Hiphop, Jazz, Metal and Pop. Although the technique used is completely unsupervised, the results are satisfactory and encouraging for further research.
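
    The retrieval machinery described (a visual vocabulary built by k-means, per-track "word" histograms, TF-IDF weighting, cosine-similarity queries) can be sketched as follows; random vectors stand in for the SURF descriptors of spectrograms, so this only illustrates the pipeline, not the paper's exact setup.

        # Bag-of-visual-words + TF-IDF retrieval sketch (illustrative only).
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.feature_extraction.text import TfidfTransformer
        from sklearn.metrics.pairwise import cosine_similarity

        rng = np.random.default_rng(0)
        tracks = [rng.normal(size=(200, 64)) for _ in range(8)]  # fake descriptors per track

        # 1. Build the "dictionary" by clustering all descriptors (k-means).
        vocab = KMeans(n_clusters=32, n_init=10, random_state=0).fit(np.vstack(tracks))

        # 2. Turn each track into a histogram of visual-word counts.
        counts = np.stack([np.bincount(vocab.predict(d), minlength=32) for d in tracks])

        # 3. Apply TF-IDF and query the vector space by cosine similarity.
        tfidf = TfidfTransformer().fit_transform(counts)
        scores = cosine_similarity(tfidf[0], tfidf).ravel()
        print("tracks most similar to track 0:", np.argsort(scores)[::-1][1:4])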

  6. Adapting a Natural Language Processing Tool to Facilitate Clinical Trial Curation for Personalized Cancer Therapy

    PubMed Central

    Zeng, Jia; Wu, Yonghui; Bailey, Ann; Johnson, Amber; Holla, Vijaykumar; Bernstam, Elmer V.; Xu, Hua; Meric-Bernstam, Funda

    2014-01-01

    The design of personalized cancer therapy based upon patients' molecular profiles requires an enormous amount of effort to review, analyze and integrate molecular, pharmacological, clinical and patient-specific information. The vast size, rapid expansion and non-standardized formats of the relevant information sources make it difficult for oncologists to gather pertinent information that can support routine personalized treatment. In this paper, we introduce informatics tools that assist the retrieval and curation of cancer-related clinical trials involving targeted therapies. In particular, we adapted and extended an existing natural language processing tool, and explored its applicability in facilitating our annotation efforts. The system was evaluated using a gold standard of 539 curated clinical trials, demonstrating promising performance and good generalizability (81% accuracy in predicting genotype-selected trials and an average recall of 0.85 in predicting specific selection criteria). PMID:25717412

  7. Knowledge Extraction from MEDLINE by Combining Clustering with Natural Language Processing

    PubMed Central

    Miñarro-Giménez, Jose A.; Kreuzthaler, Markus; Schulz, Stefan

    2015-01-01

    The identification of relevant predicates between co-occurring concepts in scientific literature databases like MEDLINE is crucial for using these sources for knowledge extraction, in order to obtain meaningful biomedical predications as subject-predicate-object triples. We consider the manually assigned MeSH indexing terms (main headings and subheadings) in MEDLINE records as a rich resource for extracting a broad range of domain knowledge. In this paper, we explore the combination of a clustering method for co-occurring concepts based on their related MeSH subheadings in MEDLINE with the use of SemRep, a natural language processing engine, which extracts predications from free text documents. As a result, we generated sets of clusters of co-occurring concepts and identified the most significant predicates for each cluster. The association of such predicates with the co-occurrences of the resulting clusters produces the list of predications, which were checked for relevance. PMID:26958228

  8. Wikipedia and Medicine: Quantifying Readership, Editors, and the Significance of Natural Language

    PubMed Central

    West, Andrew G

    2015-01-01

    Background Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of health care information by both professionals and the lay public. Objective This paper quantifies the production and consumption of Wikipedia’s medical content along 4 dimensions. First, we measured the amount of medical content in both articles and bytes and, second, the citations that supported that content. Third, we analyzed the medical readership against that of other health care websites, across Wikipedia’s natural language editions, and its relationship with disease prevalence. Fourth, we surveyed the quantity/characteristics of Wikipedia’s medical contributors, including year-over-year participation trends and editor demographics. Methods Using a well-defined categorization infrastructure, we identified medically pertinent English-language Wikipedia articles and links to their foreign language equivalents. With these, Wikipedia can be queried to produce metadata and full texts for entire article histories. Wikipedia also makes available hourly reports that aggregate reader traffic at per-article granularity. An online survey was used to determine the background of contributors. Standard mining and visualization techniques (eg, aggregation queries, cumulative distribution functions, and/or correlation metrics) were applied to each of these datasets. Analysis focused on year-end 2013, but historical data permitted some longitudinal analysis. Results Wikipedia’s medical content (at the end of 2013) was made up of more than 155,000 articles and 1 billion bytes of text across more than 255 languages. This content was supported by more than 950,000 references. Content was viewed more than 4.88 billion times in 2013. This makes it one of, if not the, most viewed medical resources globally. The core editor community numbered fewer than 300 and declined over the past 5 years. Half of the members of this community were health care providers, and 85.5% (100/117) had a university education. Conclusions Although Wikipedia has a considerable volume of multilingual medical content that is extensively read and well-referenced, the core group of editors that contribute and maintain that content is small and shrinking in size. PMID:25739399

  9. LABORATORY PROCESS CONTROLLER USING NATURAL LANGUAGE COMMANDS FROM A PERSONAL COMPUTER

    NASA Technical Reports Server (NTRS)

    Will, H.

    1994-01-01

    The complex environment of the typical research laboratory requires flexible process control. This program provides natural language process control from an IBM PC or compatible machine. Sometimes process control schedules require changes frequently, even several times per day. These changes may include adding, deleting, and rearranging steps in a process. This program sets up a process control system that can either run without an operator, or be run by workers with limited programming skills. The software system includes three programs. Two of the programs, written in FORTRAN77, record data and control research processes. The third program, written in Pascal, generates the FORTRAN subroutines used by the other two programs to identify the user commands with the user-written device drivers. The software system also includes an input data set which allows the user to define the user commands which are to be executed by the computer. To set the system up the operator writes device driver routines for all of the controlled devices. Once set up, this system requires only an input file containing natural language command lines which tell the system what to do and when to do it. The operator can make up custom commands for operating and taking data from external research equipment at any time of the day or night without the operator in attendance. This process control system requires a personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. The program requires a FORTRAN77 compiler and user-written device drivers. This program was developed in 1989 and has a memory requirement of about 62 Kbytes.
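
    The control scheme described (an input file of natural language command lines matched to user-written device drivers) can be illustrated with a small dispatch loop; the command phrasing and driver routines below are invented, and the actual NASA system is written in FORTRAN 77 and Pascal.

        # Toy natural-language command dispatcher (illustrative only).
        def set_furnace_temperature(arg):
            print(f"[driver] furnace setpoint -> {float(arg)} C")

        def record_pressure(arg):
            print("[driver] logging pressure reading")

        # User-defined mapping from command phrases to device-driver routines.
        COMMANDS = {
            "set furnace temperature to": set_furnace_temperature,
            "record pressure": record_pressure,
        }

        def run_schedule(lines):
            """Match each command line against the known phrases and dispatch."""
            for line in lines:
                text = line.strip().lower()
                for phrase, driver in COMMANDS.items():
                    if text.startswith(phrase):
                        driver(text[len(phrase):].strip())
                        break
                else:
                    print(f"unrecognized command: {line!r}")

        run_schedule(["Set furnace temperature to 450", "Record pressure", "Open vent"])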

  10. Integrating natural language processing and web GIS for interactive knowledge domain visualization

    NASA Astrophysics Data System (ADS)

    Du, Fangming

    Recent years have seen a powerful shift towards data-rich environments throughout society. This has extended to a change in how the artifacts and products of scientific knowledge production can be analyzed and understood. Bottom-up approaches are on the rise that combine access to huge amounts of academic publications with advanced computer graphics and data processing tools, including natural language processing. Knowledge domain visualization is one of those multi-technology approaches, with its aim of turning domain-specific human knowledge into highly visual representations in order to better understand the structure and evolution of domain knowledge. For example, network visualizations built from co-author relations contained in academic publications can provide insight on how scholars collaborate with each other in one or multiple domains, and visualizations built from the text content of articles can help us understand the topical structure of knowledge domains. These knowledge domain visualizations need to support interactive viewing and exploration by users. Such spatialization efforts are increasingly looking to geography and GIS as a source of metaphors and practical technology solutions, even when non-georeferenced information is managed, analyzed, and visualized. When it comes to deploying spatialized representations online, web mapping and web GIS can provide practical technology solutions for interactive viewing of knowledge domain visualizations, from panning and zooming to the overlay of additional information. This thesis presents a novel combination of advanced natural language processing - in the form of topic modeling - with dimensionality reduction through self-organizing maps and the deployment of web mapping/GIS technology towards intuitive, GIS-like, exploration of a knowledge domain visualization. A complete workflow is proposed and implemented that processes any corpus of input text documents into a map form and leverages a web application framework to let users explore knowledge domain maps interactively. This workflow is implemented and demonstrated for a data set of more than 66,000 conference abstracts.
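
    The core of the workflow (documents to topic vectors to a 2-D map layout) can be sketched briefly; note that the thesis uses a self-organizing map for the spatial layout, for which PCA is substituted here only to keep the example short, and the corpus is invented.

        # Corpus -> LDA topic vectors -> 2-D coordinates (PCA standing in for the
        # self-organizing map used in the thesis). Illustrative only.
        from sklearn.decomposition import LatentDirichletAllocation, PCA
        from sklearn.feature_extraction.text import CountVectorizer

        abstracts = [
            "natural language processing of clinical narratives",
            "web gis for interactive map visualization",
            "topic models summarize large document collections",
            "self organizing maps project documents onto a grid",
        ]

        counts = CountVectorizer(stop_words="english").fit_transform(abstracts)
        topics = LatentDirichletAllocation(n_components=3, random_state=0).fit_transform(counts)
        xy = PCA(n_components=2).fit_transform(topics)   # 2-D coordinates for the map

        for text, (x, y) in zip(abstracts, xy):
            print(f"({x:+.2f}, {y:+.2f})  {text}")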

  11. The Usual and the Unusual: Solving Remote Associates Test Tasks Using Simple Statistical Natural Language Processing Based on Language Use

    ERIC Educational Resources Information Center

    Klein, Ariel; Badia, Toni

    2015-01-01

    In this study we show how complex creative relations can arise from fairly frequent semantic relations observed in everyday language. By doing this, we reflect on some key cognitive aspects of linguistic and general creativity. In our experimentation, we automated the process of solving a battery of Remote Associates Test tasks. By applying…

  12. Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus.

    PubMed

    Comeau, Donald C; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W John

    2014-01-01

    BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net. PMID:24935050
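
    The pipeline stages listed (sentence segmentation, tokenization, part-of-speech tagging, lemmatization) can be illustrated with NLTK; this is not the MedPost/Stanford toolchain wrapped by the BioC pipelines, and the sentence parsing stage is omitted for brevity.

        # Pipeline stages illustrated with NLTK (requires the punkt, POS-tagger,
        # and wordnet data packages to be downloaded beforehand).
        import nltk
        from nltk.stem import WordNetLemmatizer

        text = "BRCA1 mutations increase breast cancer risk. The corpus was annotated."
        lemmatizer = WordNetLemmatizer()

        for sentence in nltk.sent_tokenize(text):        # sentence segmentation
            tokens = nltk.word_tokenize(sentence)        # tokenization
            tagged = nltk.pos_tag(tokens)                # part-of-speech tagging
            lemmas = [lemmatizer.lemmatize(t.lower()) for t in tokens]  # lemmatization
            print(tagged)
            print(lemmas)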

  13. Natural Language as a Tool for Analyzing the Proving Process: The Case of Plane Geometry Proof

    ERIC Educational Resources Information Center

    Robotti, Elisabetta

    2012-01-01

    In the field of human cognition, language plays a special role that is connected directly to thinking and mental development (e.g., Vygotsky, "1938"). Thanks to "verbal thought", language allows humans to go beyond the limits of immediately perceived information, to form concepts and solve complex problems (Luria, "1975"). So, it appears language…

  14. Statistical Learning in a Natural Language by 8-Month-Old Infants

    ERIC Educational Resources Information Center

    Pelucchi, Bruna; Hay, Jessica F.; Saffran, Jenny R.

    2009-01-01

    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real…

  15. Image statistics of American Sign Language: comparison with faces and natural scenes

    NASA Astrophysics Data System (ADS)

    Bosworth, Rain G.; Bartlett, Marian Stewart; Dobkins, Karen R.

    2006-09-01

    Several lines of evidence suggest that the image statistics of the environment shape visual abilities. To date, the image statistics of natural scenes and faces have been well characterized using Fourier analysis. We employed Fourier analysis to characterize images of signs in American Sign Language (ASL). These images are highly relevant to signers who rely on ASL for communication, and thus the image statistics of ASL might influence signers' visual abilities. Fourier analysis was conducted on 105 static images of signs, and these images were compared with analyses of 100 natural scene images and 100 face images. We obtained two metrics from our Fourier analysis: mean amplitude and entropy of the amplitude across the image set (which is a measure from information theory) as a function of spatial frequency and orientation. The results of our analyses revealed interesting differences in image statistics across the three different image sets, setting up the possibility that ASL experience may alter visual perception in predictable ways. In addition, for all image sets, the mean amplitude results were markedly different from the entropy results, which raises the interesting question of which aspect of an image set (mean amplitude or entropy of the amplitude) is better able to account for known visual abilities.
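
    The two metrics can be computed with a two-dimensional Fourier transform; the sketch below uses random images as a stand-in for the ASL, face, and scene photographs, and the entropy line is one reasonable reading of "entropy of the amplitude across the image set".

        # Mean amplitude and across-set amplitude entropy per spatial-frequency
        # component (illustrative; random images replace the real stimuli).
        import numpy as np

        rng = np.random.default_rng(0)
        images = rng.random((105, 128, 128))             # image set (here: noise)

        amplitude = np.abs(np.fft.fftshift(np.fft.fft2(images), axes=(-2, -1)))
        mean_amplitude = amplitude.mean(axis=0)          # mean amplitude per frequency

        # Shannon entropy of the amplitude distribution across images, per frequency.
        p = amplitude / amplitude.sum(axis=0, keepdims=True)
        entropy = -(p * np.log2(p + 1e-12)).sum(axis=0)

        print(mean_amplitude.shape, entropy.shape)       # both (128, 128)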

  16. Does It Really Matter whether Students' Contributions Are Spoken versus Typed in an Intelligent Tutoring System with Natural Language?

    ERIC Educational Resources Information Center

    D'Mello, Sidney K.; Dowell, Nia; Graesser, Arthur

    2011-01-01

    There is the question of whether learning differs when students speak versus type their responses when interacting with intelligent tutoring systems with natural language dialogues. Theoretical bases exist for three contrasting hypotheses. The "speech facilitation" hypothesis predicts that spoken input will "increase" learning, whereas the "text…

  17. Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis

    PubMed Central

    Mowery, D.; South, B. R.; Kvist, M.; Dalianis, H.

    2015-01-01

    Summary Objectives We present a review of recent advances in clinical Natural Language Processing (NLP), with a focus on semantic analysis and key subtasks that support such analysis. Methods We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant referenced publications from the included papers. Results Significant articles published within this time-span were included and are discussed from the perspective of semantic analysis. Three key clinical NLP subtasks that enable such analysis were identified: 1) developing more efficient methods for corpus creation (annotation and de-identification), 2) generating building blocks for extracting meaning (morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical utility (NLP applications and infrastructure for clinical use cases). Finally, we provide a reflection upon most recent developments and potential areas of future NLP development and applications. Conclusions There has been an increase of advances within key NLP subtasks that support semantic analysis. Performance of NLP semantic analysis is, in many cases, close to that of agreement between humans. The creation and release of corpora annotated with complex semantic information models has greatly supported the development of new tools and approaches. Research on non-English languages is continuously growing. NLP methods have sometimes been successfully employed in real-world clinical tasks. However, there is still a gap between the development of advanced resources and their utilization in clinical settings. A plethora of new clinical use cases are emerging due to established health care initiatives and additional patient-generated sources through the extensive use of social media and other devices. PMID:26293867

  18. Toward a Theory-Based Natural Language Capability in Robots and Other Embodied Agents: Evaluating Hausser's SLIM Theory and Database Semantics

    ERIC Educational Resources Information Center

    Burk, Robin K.

    2010-01-01

    Computational natural language understanding and generation have been a goal of artificial intelligence since McCarthy, Minsky, Rochester and Shannon first proposed to spend the summer of 1956 studying this and related problems. Although statistical approaches dominate current natural language applications, two current research trends bring…

  20. Adapting Semantic Natural Language Processing Technology to Address Information Overload in Influenza Epidemic Management

    PubMed Central

    Keselman, Alla; Rosemblat, Graciela; Kilicoglu, Halil; Fiszman, Marcelo; Jin, Honglan; Shin, Dongwook; Rindflesch, Thomas C.

    2013-01-01

    Explosion of disaster health information results in information overload among response professionals. The objective of this project was to determine the feasibility of applying semantic natural language processing (NLP) technology to addressing this overload. The project characterizes concepts and relationships commonly used in disaster health-related documents on influenza pandemics, as the basis for adapting an existing semantic summarizer to the domain. Methods include human review and semantic NLP analysis of a set of relevant documents. This is followed by a pilot-test in which two information specialists use the adapted application for a realistic information seeking task. According to the results, the ontology of influenza epidemics management can be described via a manageable number of semantic relationships that involve concepts from a limited number of semantic types. Test users demonstrate several ways to engage with the application to obtain useful information. This suggests that existing semantic NLP algorithms can be adapted to support information summarization and visualization in influenza epidemics and other disaster health areas. However, additional research is needed in the areas of terminology development (as many relevant relationships and terms are not part of existing standardized vocabularies), NLP, and user interface design. PMID:24311971

  1. Towards symbiosis in knowledge representation and natural language processing for structuring clinical practice guidelines.

    PubMed

    Weng, Chunhua; Payne, Philip R O; Velez, Mark; Johnson, Stephen B; Bakken, Suzanne

    2014-01-01

    The successful adoption by clinicians of evidence-based clinical practice guidelines (CPGs) contained in clinical information systems requires efficient translation of free-text guidelines into computable formats. Natural language processing (NLP) has the potential to improve the efficiency of such translation. However, it is laborious to develop NLP to structure free-text CPGs using existing formal knowledge representations (KR). In response to this challenge, this vision paper discusses the value and feasibility of supporting symbiosis in text-based knowledge acquisition (KA) and KR. We compare two ontologies: (1) an ontology manually created by domain experts for CPG eligibility criteria and (2) an upper-level ontology derived from a semantic pattern-based approach for automatic KA from CPG eligibility criteria text. Then we discuss the strengths and limitations of interweaving KA and NLP for KR purposes and important considerations for achieving the symbiosis of KR and NLP for structuring CPGs to achieve evidence-based clinical practice. PMID:24943582

  2. Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing

    PubMed Central

    Wu, Stephen; Miller, Timothy; Masanz, James; Coarr, Matt; Halgrim, Scott; Carrell, David; Clark, Cheryl

    2014-01-01

    A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been “solved.” This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP. PMID:25393544
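
    To make the task concrete, here is a minimal rule-based negation sketch in Python, in the spirit of NegEx-style cue lists rather than the authors' machine-learning Polarity Module. The cue list and window size are illustrative assumptions; the fact that they would normally be tuned per corpus is exactly the optimization-versus-generalization tension described above.

      import re

      NEG_CUES = {"no", "not", "denies", "denied", "without"}   # illustrative cue list
      WINDOW = 5                                                # tokens to scan before the concept

      def is_negated(text, concept):
          tokens = re.findall(r"[a-z]+", text.lower())
          target = concept.lower().split()
          n = len(target)
          for i in range(len(tokens) - n + 1):
              if tokens[i:i + n] == target:
                  if any(tok in NEG_CUES for tok in tokens[max(0, i - WINDOW):i]):
                      return True
          return False

      print(is_negated("Patient denies chest pain or shortness of breath.", "chest pain"))  # True
      print(is_negated("Chest pain began two hours ago.", "chest pain"))                    # False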

  3. Determining the Reasons for Medication Prescriptions in the EHR using Knowledge and Natural Language Processing

    PubMed Central

    Li, Ying; Salmasian, Hojjat; Harpaz, Rave; Chase, Herbert; Friedman, Carol

    2011-01-01

    Knowledge of medication indications is significant for automatic applications aimed at improving patient safety, such as computerized physician order entry and clinical decision support systems. The Electronic Health Record (EHR) contains pertinent information related to patient safety such as information related to appropriate prescribing. However, the reasons for medication prescriptions are usually not explicitly documented in the patient record. This paper describes a method that determines the reasons for medication uses based on information occurring in outpatient notes. The method utilizes drug-indication knowledge that we acquired, and natural language processing. Evaluation showed the method obtained a sensitivity of 62.8%, specificity of 93.9%, precision of 90% and F-measure of 73.9%. This pilot study demonstrated that linking external drug indication knowledge to the EHR for determining the reasons for medication use was promising, but also revealed some challenges. Future work will focus on increasing the accuracy and coverage of the indication knowledge and evaluating its performance using a much larger set of drugs frequently used in the outpatient population. PMID:22195134
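
    The reported F-measure follows directly from the reported precision and sensitivity (recall) as their harmonic mean; a quick check with the rounded figures from the abstract:

      def f_measure(precision, recall):
          """Harmonic mean of precision and recall (F1)."""
          return 2 * precision * recall / (precision + recall)

      # ~0.740, consistent with the reported 73.9% (which was computed from unrounded counts)
      print(f_measure(0.90, 0.628))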

  4. Natural Language Processing for Lines and Devices in Portable Chest X-Rays

    PubMed Central

    Rubin, Daniel; Wang, Dan; Chambers, Dallas A.; Chambers, Justin G.; South, Brett R.; Goldstein, Mary K.

    2010-01-01

    Radiology reports are unstructured free text documents that describe abnormalities in patients that are visible via imaging modalities such as X-ray. The number of imaging examinations performed in clinical care is enormous, and mining large repositories of radiology reports connected with clinical data such as patient outcomes could enable epidemiological studies, such as correlating the frequency of infections to the presence or length of time medical devices are present in patients. We developed a natural language processing (NLP) system to recognize device mentions in radiology reports and information about their state (insertion or removal) to enable epidemiological research. We tested our system using a reference standard of reports that were annotated to indicate this information. Our system performed with high accuracy (recall and precision of 97% and 99% for device mentions and 91–96% for device insertion status). Our methods are generalizable to other types of radiology reports as well as to other information extraction tasks and could provide the foundation for tools that enable epidemiological research exploration based on mining radiology reports. PMID:21347067

  5. Semi-Supervised Learning of Statistical Models for Natural Language Understanding

    PubMed Central

    He, Yulan

    2014-01-01

    The goal of natural language understanding is to specify a computational model that maps sentences to their semantic meaning representations. In this paper, we propose a novel framework to train statistical models without using expensive, fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic-tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied to two statistical models, conditional random fields (CRFs) and hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA Communicator data show that both CRFs and HM-SVMs outperform the baseline approach, the previously proposed hidden vector state (HVS) model, which is also trained on abstract semantic annotations. In addition, the proposed framework shows superior performance to two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, with relative error reduction rates of about 25% and 15% in F-measure. PMID:25152899

  6. Negation's not solved: generalizability versus optimizability in clinical natural language processing.

    PubMed

    Wu, Stephen; Miller, Timothy; Masanz, James; Coarr, Matt; Halgrim, Scott; Carrell, David; Clark, Cheryl

    2014-01-01

    A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been "solved." This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP. PMID:25393544

  7. Discovering peripheral arterial disease cases from radiology notes using natural language processing.

    PubMed

    Savova, Guergana K; Fan, Jin; Ye, Zi; Murphy, Sean P; Zheng, Jiaping; Chute, Christopher G; Kullo, Iftikhar J

    2010-01-01

    As part of the Electronic Medical Records and Genomics Network, we applied, extended and evaluated an open source clinical Natural Language Processing system, Mayo's Clinical Text Analysis and Knowledge Extraction System, for the discovery of peripheral arterial disease cases from radiology reports. The manually created gold standard consisted of 223 positive, 19 negative, 63 probable and 150 unknown cases. Overall accuracy agreement between the system and the gold standard was 0.93 as compared to a named entity recognition baseline of 0.46. Sensitivity for the positive, probable and unknown cases was 0.93-0.96, and for the negative cases was 0.72. Specificity and negative predictive value for all categories were in the 90's. The positive predictive value for the positive and unknown categories was in the high 90's, for the negative category was 0.84, and for the probable category was 0.63. We outline the main sources of errors and suggest improvements. PMID:21347073

  8. Semi-supervised learning of statistical models for natural language understanding.

    PubMed

    Zhou, Deyu; He, Yulan

    2014-01-01

    The goal of natural language understanding is to specify a computational model that maps sentences to their semantic meaning representations. In this paper, we propose a novel framework to train statistical models without using expensive, fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic-tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied to two statistical models, conditional random fields (CRFs) and hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA Communicator data show that both CRFs and HM-SVMs outperform the baseline approach, the previously proposed hidden vector state (HVS) model, which is also trained on abstract semantic annotations. In addition, the proposed framework shows superior performance to two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, with relative error reduction rates of about 25% and 15% in F-measure. PMID:25152899

  9. A framework for the natural-language-perception-based creative control of unmanned ground vehicles

    NASA Astrophysics Data System (ADS)

    Ghaffari, Masoud; Liao, Xiaoqun; Hall, Ernest L.

    2004-09-01

    Mobile robots must often operate in an unstructured environment cluttered with obstacles and with many possible action paths. This is why mobile robotics problems are complex, with many unanswered questions. To reach a high degree of autonomous operation, a new level of learning is required. On the one hand, promising learning theories such as adaptive critic and creative control have been proposed; on the other hand, the human brain's processing ability has amazed and inspired researchers in the area of Unmanned Ground Vehicles but has been difficult to emulate in practice. A new direction in fuzzy theory tries to develop a theory to deal with the perceptions conveyed by natural language. This paper combines these two fields and presents a framework for autonomous robot navigation. The proposed creative controller, like the adaptive critic controller, has information stored in a dynamic database (DB), plus a dynamic task control center (TCC) that functions as a command center to decompose tasks into sub-tasks with different dynamic models and multi-criteria functions. The TCC module utilizes the computational theory of perceptions to deal with high-level task planning. The authors are currently implementing the model on a real mobile robot, and preliminary results are described in this paper.

  10. DBPQL: A view-oriented query language for the Intel Data Base Processor

    NASA Technical Reports Server (NTRS)

    Fishwick, P. A.

    1983-01-01

    An interactive query language (DBPQL) for the Intel Data Base Processor (DBP) is defined. DBPQL includes a parser generator package which permits the analyst to easily create and manipulate the query statement syntax and semantics. The prototype language, DBPQL, includes trace and performance commands to aid the analyst when implementing new commands and analyzing the execution characteristics of the DBP. The DBPQL grammar file and associated key procedures are included as an appendix to this report.

  11. The Ising model for changes in word ordering rules in natural languages

    NASA Astrophysics Data System (ADS)

    Itoh, Yoshiaki; Ueda, Sumie

    2004-11-01

    The order of noun and adposition is an important parameter of word ordering rules in the world's languages. Seven other parameters, such as the order of adverb and verb, depend strongly on the noun-adposition order. Japanese, as well as Korean, Tamil and several other languages, seems to have a stable structure of word ordering rules, while Thai and other languages, which have the opposite word ordering rules to Japanese, are also stable in structure. It therefore seems that each language in the world fluctuates between these two structures, like the Ising model on a finite lattice.
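
    A toy illustration of the idea, not the authors' model: binary word-order "parameters" on a small finite lattice, updated with Metropolis dynamics, tend to settle toward one of two consistent orderings. The lattice size, coupling J, and temperature T are arbitrary choices for the sketch.

      import numpy as np

      def simulate(n=12, steps=100000, J=1.0, T=1.2, seed=0):
          rng = np.random.default_rng(seed)
          spins = rng.choice([-1, 1], size=(n, n))    # +1 / -1 stand for the two opposite orderings
          for _ in range(steps):
              i, j = rng.integers(n), rng.integers(n)
              neighbors = (spins[(i + 1) % n, j] + spins[(i - 1) % n, j]
                           + spins[i, (j + 1) % n] + spins[i, (j - 1) % n])
              dE = 2 * J * spins[i, j] * neighbors    # energy change if this site flips
              if dE <= 0 or rng.random() < np.exp(-dE / T):
                  spins[i, j] *= -1
          return spins.mean()                          # near +1 or -1 once the lattice has ordered

      print(simulate())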

  12. Programming Languages.

    ERIC Educational Resources Information Center

    Tesler, Lawrence G.

    1984-01-01

    Discusses the nature of programming languages, considering the features of BASIC, LOGO, PASCAL, COBOL, FORTH, APL, and LISP. Also discusses machine/assembly codes, the operation of a compiler, and trends in the evolution of programming languages (including interest in notational systems called object-oriented languages). (JN)

  13. The Nature and Needs of Adult Audiences: Adult Education Language Students and the Media.

    ERIC Educational Resources Information Center

    Davey, Margaret

    1980-01-01

    Examines the type of student who enrolls in an adult education institute in order to learn a language. Assesses their needs in terms of the provisions the British Broadcasting Corporation makes in their multi-media language courses. (Author/PJM)

  14. A natural language processing (NLP) program effectively extracts key pathologic findings from radical prostatectomy reports.

    PubMed

    Kim, Brian; Merchant, Madhur; Zheng, Chengyi; Thomas, Anil Abraham; Contreras, Richard; Jacobsen, Steven J; Chien, Gary

    2014-08-01

    Introduction and Objective: Natural language processing (NLP) software programs have been widely developed to transform complex, free text into simplified, organized data. Potential applications in the field of medicine include automated report summaries, physician alerts, patient repositories, electronic medical record (EMR) billing, and quality metric reports. Despite these prospects and the recent widespread adoption of EMR, NLP has been relatively underutilized. The objective of this study was to evaluate the performance of an internally developed NLP program in extracting select pathologic findings from radical prostatectomy specimen reports in the EMR. Methods: An NLP program was generated by a software engineer to extract key variables from prostatectomy reports in the EMR within our healthcare system, which included: TNM stage, Gleason grade, presence of a tertiary Gleason pattern, histologic subtype, size of dominant tumor nodule, seminal vesicle invasion (SVI), perineural invasion (PNI), angiolymphatic invasion (ALI), extracapsular extension (ECE), and surgical margin status (SMS). The program was validated by comparing NLP results to a "gold standard" compiled by two blinded manual reviewers for 100 random pathology reports. Results: NLP demonstrated 100% accuracy for identifying Gleason grade, presence of a tertiary Gleason pattern, SVI, ALI, and ECE. It also demonstrated near-perfect accuracy for extracting histologic subtype (99.0%), PNI (98.9%), TNM stage (98.0%), SMS (97.0%), and dominant tumor size (95.7%). The overall accuracy of NLP was 98.7%. NLP generated a result in <1 second, whereas the manual reviewers averaged 3.2 minutes per report. Conclusions: This novel program demonstrated high accuracy and efficiency identifying key pathologic details from the prostatectomy report within an EMR system. NLP has the potential to assist urologists by summarizing and highlighting relevant information from verbose pathology reports. It may also facilitate future urologic research through the rapid and automated creation of large databases. PMID:25083914
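
    For illustration only (this is not the study's internally developed program), two of the listed variables can be pulled from report text with simple regular expressions; the phrasings matched below are assumptions about how reports are worded.

      import re

      GLEASON = re.compile(r"gleason\s+(?:score\s+)?(\d)\s*\+\s*(\d)", re.I)
      T_STAGE = re.compile(r"\bp?T([0-9][a-c]?)\b", re.I)

      def extract(report):
          out = {}
          m = GLEASON.search(report)
          if m:
              out["gleason"] = f"{m.group(1)}+{m.group(2)}"
          m = T_STAGE.search(report)
          if m:
              out["t_stage"] = "T" + m.group(1)
          return out

      print(extract("Prostatic adenocarcinoma, Gleason score 3+4=7, pT2c, margins negative."))
      # {'gleason': '3+4', 't_stage': 'T2c'}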

  15. Measuring Information Acquisition from Sensory Input Using Automated Scoring of Natural-Language Descriptions

    PubMed Central

    Saunders, Daniel R.; Bex, Peter J.; Rose, Dylan J.; Woods, Russell L.

    2014-01-01

    Information acquisition, the gathering and interpretation of sensory information, is a basic function of mobile organisms. We describe a new method for measuring this ability in humans, using free-recall responses to sensory stimuli which are scored objectively using a “wisdom of crowds” approach. As an example, we demonstrate this metric using perception of video stimuli. Immediately after viewing a 30 s video clip, subjects responded to a prompt to give a short description of the clip in natural language. These responses were scored automatically by comparison to a dataset of responses to the same clip by normally-sighted viewers (the crowd). In this case, the normative dataset consisted of responses to 200 clips by 60 subjects who were stratified by age (range 22 to 85y) and viewed the clips in the lab, for 2,400 responses, and by 99 crowdsourced participants (age range 20 to 66y) who viewed clips in their Web browser, for 4,000 responses. We compared different algorithms for computing these similarities and found that a simple count of the words in common had the best performance. It correctly matched 75% of the lab-sourced and 95% of crowdsourced responses to their corresponding clips. We validated the measure by showing that when the amount of information in the clip was degraded using defocus lenses, the shared word score decreased across the five predetermined visual-acuity levels, demonstrating a dose-response effect (N = 15). This approach, of scoring open-ended immediate free recall of the stimulus, is applicable not only to video, but also to other situations where a measure of the information that is successfully acquired is desirable. Information acquired will be affected by stimulus quality, sensory ability, and cognitive processes, so our metric can be used to assess each of these components when the others are controlled. PMID:24695546
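
    A minimal sketch of the "count of the words in common" scoring described above: a new free-recall response is scored against the crowd's responses to the same clip. The tokenization and the choice to average over crowd responses are assumptions made for illustration.

      import re

      def words(text):
          return set(re.findall(r"[a-z']+", text.lower()))

      def shared_word_score(response, crowd_responses):
          r = words(response)
          return sum(len(r & words(c)) for c in crowd_responses) / len(crowd_responses)

      crowd = ["a man walks a dog in the park",
               "someone walking their dog outside",
               "a dog and its owner in a park"]
      print(shared_word_score("a man is walking his dog", crowd))   # average words shared with the crowd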

  16. Clinical Natural Language Processing in 2014: Foundational Methods Supporting Efficient Healthcare

    PubMed Central

    2015-01-01

    Objective: To summarize recent research and present a selection of the best papers published in 2014 in the field of clinical Natural Language Processing (NLP). Method: A systematic review of the literature was performed by the two section editors of the IMIA Yearbook NLP section by searching bibliographic databases with a focus on NLP efforts applied to clinical texts or aimed at a clinical outcome. A shortlist of candidate best papers was first selected by the section editors before being peer-reviewed by independent external reviewers. Results: The clinical NLP best paper selection shows that the field is tackling text analysis methods of increasing depth. The full review process highlighted five papers addressing foundational methods in clinical NLP using clinically relevant texts from online forums or encyclopedias, clinical texts from Electronic Health Records, and included studies specifically aiming at a practical clinical outcome. The increased access to clinical data that was made possible with the recent progress of de-identification paved the way for the scientific community to address complex NLP problems such as word sense disambiguation, negation, temporal analysis and specific information nugget extraction. These advances in turn allowed for efficient application of NLP to clinical problems such as cancer patient triage. Another line of research investigates online clinically relevant texts and brings interesting insight on communication strategies to convey health-related information. Conclusions: The field of clinical NLP is thriving through the contributions of both NLP researchers and healthcare professionals interested in applying NLP techniques for concrete healthcare purposes. Clinical NLP is becoming mature for practical applications with a significant clinical impact. PMID:26293868

  17. Developing a natural language processing application for measuring the quality of colonoscopy procedures

    PubMed Central

    Chapman, Wendy W; Saul, Melissa; Dellon, Evan S; Schoen, Robert E; Mehrotra, Ateev

    2011-01-01

    Objective: The quality of colonoscopy procedures for colorectal cancer screening is often inadequate and varies widely among physicians. Routine measurement of quality is limited by the costs of manual review of free-text patient charts. Our goal was to develop a natural language processing (NLP) application to measure colonoscopy quality. Materials and methods: Using a set of quality measures published by physician specialty societies, we implemented an NLP engine that extracts 21 variables for 19 quality measures from free-text colonoscopy and pathology reports. We evaluated the performance of the NLP engine on a test set of 453 colonoscopy reports and 226 pathology reports, considering accuracy in extracting the values of the target variables from text, and the reliability of the outcomes of the quality measures as computed from the NLP-extracted information. Results: The average accuracy of the NLP engine over all variables was 0.89 (range: 0.62-1.0) and the average F measure over all variables was 0.74 (range: 0.49-0.89). The average agreement score, measured as Cohen's kappa, between the manually established and NLP-derived outcomes of the quality measures was 0.62 (range: 0.09-0.86). Discussion: For nine of the 19 colonoscopy quality measures, the agreement score was 0.70 or above, which we consider a sufficient score for the NLP-derived outcomes of these measures to be practically useful for quality measurement. Conclusion: The use of NLP for information extraction from free-text colonoscopy and pathology reports creates opportunities for large scale, routine quality measurement, which can support quality improvement in colonoscopy care. PMID:21946240
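
    The agreement statistic used above, Cohen's kappa, can be computed directly from paired manual and NLP-derived outcomes; the labels below are invented for illustration.

      from collections import Counter

      def cohens_kappa(manual, nlp):
          n = len(manual)
          observed = sum(m == p for m, p in zip(manual, nlp)) / n
          m_counts, p_counts = Counter(manual), Counter(nlp)
          expected = sum(m_counts[c] * p_counts[c] for c in set(manual) | set(nlp)) / n ** 2
          return (observed - expected) / (1 - expected)

      manual = ["pass", "pass", "fail", "pass", "fail", "pass"]
      nlp    = ["pass", "fail", "fail", "pass", "fail", "pass"]
      print(round(cohens_kappa(manual, nlp), 2))   # 0.67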

  18. Computing Accurate Grammatical Feedback in a Virtual Writing Conference for German-Speaking Elementary-School Children: An Approach Based on Natural Language Generation

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2009-01-01

    We built a natural language processing (NLP) system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary…

  19. The Development of Bilingual Proficiency. Final Report. Volume I: The Nature of Language Proficiency, Volume II: Classroom Treatment, Volume III: Social Context and Age.

    ERIC Educational Resources Information Center

    Harley, Birgit; And Others

    The Development of Bilingual Proficiency is a large-scale, five-year research project begun in 1981. The final report contains three volumes, each concentrating on specific issues investigated in the research: (1) the nature of language proficiency, including second language lexical proficiency and the development and growth of metaphor…

  1. The complex of neural networks and probabilistic methods for mathematical modeling of the syntactic structure of a sentence of natural language

    NASA Astrophysics Data System (ADS)

    Sboev, A.; Rybka, R.; Moloshnikov, I.; Gudovskikh, D.

    2016-02-01

    A formalized model for constructing the syntactic structure of sentences in a natural language is presented. On the basis of this model, a complex algorithm using neural networks was developed, founded on data from the Russian National Corpus and a set of parameters extracted from these data. The resulting accuracy is presented, along with the accuracy that could theoretically be achieved with these parameters.

  2. Reading, Language, and Learning.

    ERIC Educational Resources Information Center

    Radwin, Eugene, Ed.; Wolf-Ward, Maryanne, Ed.

    1977-01-01

    This special issue focuses on the nature of the reading process, the nature of language, and the nature of the relationships between them. Specific topics discussed include the bias of language in speech and writing, the functions of language, trends in second-language-acquisition research, learning about psycholinguistic processes by analyzing…

  3. Zipf's word frequency law in natural language: a critical review and future directions.

    PubMed

    Piantadosi, Steven T

    2014-10-01

    The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf's law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf's law and are then used to evaluate many of the theoretical explanations of Zipf's law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf's law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data. PMID:24664880
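
    The rank-frequency relation is easy to check empirically on any text: count word frequencies, sort them by rank, and fit the slope on log-log axes. The corpus path and the least-squares fit below are illustrative choices, not the article's methodology.

      import re
      from collections import Counter
      import numpy as np

      def zipf_exponent(text):
          counts = Counter(re.findall(r"[a-z']+", text.lower()))
          freqs = np.array(sorted(counts.values(), reverse=True), dtype=float)
          ranks = np.arange(1, len(freqs) + 1)
          slope, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)
          return -slope                     # roughly 1 for large natural-language corpora

      text = open("corpus.txt").read()      # hypothetical path; use any large plain-text corpus
      print(zipf_exponent(text))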

  4. In silico Evolutionary Developmental Neurobiology and the Origin of Natural Language

    NASA Astrophysics Data System (ADS)

    Szathmáry, Eörs; Szathmáry, Zoltán; Ittzés, Péter; Orbán, Gergő; Zachár, István; Huszár, Ferenc; Fedor, Anna; Varga, Máté; Számadó, Szabolcs

    It is justified to assume that part of our genetic endowment contributes to our language skills, yet it is impossible to tell at this moment exactly how genes affect the language faculty. We complement experimental biological studies by an in silico approach in that we simulate the evolution of neuronal networks under selection for language-related skills. At the heart of this project is the Evolutionary Neurogenetic Algorithm (ENGA) that is deliberately biomimetic. The design of the system was inspired by important biological phenomena such as brain ontogenesis, neuron morphologies, and indirect genetic encoding. Neuronal networks were selected and were allowed to reproduce as a function of their performance in the given task. The selected neuronal networks in all scenarios were able to solve the communication problem they had to face. The most striking feature of the model is that it works with highly indirect genetic encoding--just as brains do.

  5. Zipf's word frequency law in natural language: A critical review and future directions

    PubMed Central

    2014-01-01

    The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf's law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf's law and are then used to evaluate many of the theoretical explanations of Zipf's law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf's law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data. PMID:24664880

  6. Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search

    PubMed Central

    Smith, Sam; Sufi, Shoaib; Goble, Carole; Buchan, Iain

    2016-01-01

    Background: Data discovery, particularly the discovery of key variables and their inter-relationships, is key to secondary data analysis and, in turn, the evolving field of data science. Interface designers have presumed that their users are domain experts, and so they have provided complex interfaces to support these “experts.” Such interfaces hark back to a time when searches needed to be accurate first time as there was a high computational cost associated with each search. Our work is part of a governmental research initiative between the medical and social research funding bodies to improve the use of social data in medical research. Objective: The cross-disciplinary nature of data science can make no assumptions regarding the domain expertise of a particular scientist, whose interests may intersect multiple domains. Here we consider the common requirement for scientists to seek archived data for secondary analysis. This has more in common with search needs of the “Google generation” than with their single-domain, single-tool forebears. Our study compares a Google-like interface with traditional ways of searching for noncomplex health data in a data archive. Methods: Two user interfaces are evaluated for the same set of tasks in extracting data from surveys stored in the UK Data Archive (UKDA). One interface, Web search, is “Google-like,” enabling users to browse, search for, and view metadata about study variables, whereas the other, traditional search, has a standard multioption user interface. Results: Using a comprehensive set of tasks with 20 volunteers, we found that the Web search interface met data discovery needs and expectations better than the traditional search. A task × interface repeated measures analysis showed a main effect indicating that answers found through the Web search interface were more likely to be correct (F(1,19)=37.3, P<.001), with a main effect of task (F(3,57)=6.3, P<.001). Further, participants completed the task significantly faster using the Web search interface (F(1,19)=18.0, P<.001). There was also a main effect of task (F(2,38)=4.1, P=.025, Greenhouse-Geisser correction applied). Overall, participants were asked to rate learnability, ease of use, and satisfaction. Paired mean comparisons showed that the Web search interface received significantly higher ratings than the traditional search interface for learnability (P=.002, 95% CI [0.6-2.4]), ease of use (P<.001, 95% CI [1.2-3.2]), and satisfaction (P<.001, 95% CI [1.8-3.5]). The results show superior cross-domain usability of Web search, which is consistent with its general familiarity and with enabling queries to be refined as the search proceeds, which treats serendipity as part of the refinement. Conclusions: The results provide clear evidence that data science should adopt single-field natural language search interfaces for variable search supporting in particular: query reformulation; data browsing; faceted search; surrogates; relevance feedback; summarization, analytics, and visual presentation. PMID:26769334
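
    Because each participant used both interfaces, the comparison is within subject; a paired test on per-participant scores illustrates the logic behind the repeated-measures contrasts reported above. The scores below are invented for illustration.

      import numpy as np
      from scipy import stats

      web_search  = np.array([0.90, 0.80, 0.95, 0.85, 0.90, 0.70, 0.80, 0.90])   # proportion correct
      traditional = np.array([0.60, 0.70, 0.80, 0.65, 0.70, 0.50, 0.60, 0.75])

      t, p = stats.ttest_rel(web_search, traditional)
      print(f"t({len(web_search) - 1}) = {t:.2f}, p = {p:.4f}")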

  7. The Ability of Children with Language Impairment to Dissemble Emotions in Hypothetical Scenarios and Natural Situations

    ERIC Educational Resources Information Center

    Brinton, Bonnie; Fujiki, Martin; Hurst, Noel Quist; Jones, Emily Rowberry; Spackman, Matthew P.

    2015-01-01

    Purpose: This study examined the ability of children with language impairment (LI) to dissemble (hide) emotional reactions when socially appropriate to do so. Method: Twenty-two children with LI and their typically developing peers (7;1-10;11 [years;months]) participated in two tasks. First, participants were presented with hypothetical scenarios…

  8. School Meaning Systems: The Symbiotic Nature of Culture and "Language-In-Use"

    ERIC Educational Resources Information Center

    Abawi, Lindy

    2013-01-01

    Recent research has produced evidence to suggest a strong reciprocal link between school context-specific language constructions that reflect a school's vision and schoolwide pedagogy, and the way that meaning making occurs, and a school's culture is characterized. This research was conducted within three diverse settings: one school in…

  10. The Sentence Fairy: A Natural-Language Generation System to Support Children's Essay Writing

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2008-01-01

    We built an NLP system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary texts produced by pupils…

  12. Children with Specific Language Impairments Perceive Speech Most Categorically when Tokens Are Natural and Meaningful

    ERIC Educational Resources Information Center

    Coady, Jeffry A.; Evans, Julia L.; Mainela-Arnold, Elina; Kluender, Keith R.

    2007-01-01

    Purpose: To examine perceptual deficits as a potential underlying cause of specific language impairments (SLI). Method: Twenty-one children with SLI (8;7-11;11 [years;months]) and 21 age-matched controls participated in categorical perception tasks using four series of syllables for which perceived syllable-initial voicing varied. Series were…

  13. Crowdsourcing a Normative Natural Language Dataset: A Comparison of Amazon Mechanical Turk and In-Lab Data Collection

    PubMed Central

    Bex, Peter J; Woods, Russell L

    2013-01-01

    Background: Crowdsourcing has become a valuable method for collecting medical research data. This approach, recruiting through open calls on the Web, is particularly useful for assembling large normative datasets. However, it is not known how natural language datasets collected over the Web differ from those collected under controlled laboratory conditions. Objective: To compare the natural language responses obtained from a crowdsourced sample of participants with responses collected in a conventional laboratory setting from participants recruited according to specific age and gender criteria. Methods: We collected natural language descriptions of 200 half-minute movie clips, from Amazon Mechanical Turk workers (crowdsourced) and 60 participants recruited from the community (lab-sourced). Crowdsourced participants responded to as many clips as they wanted and typed their responses, whereas lab-sourced participants gave spoken responses to 40 clips, and their responses were transcribed. The content of the responses was evaluated using a take-one-out procedure, which compared responses to other responses to the same clip and to other clips, with a comparison of the average number of shared words. Results: In contrast to the 13 months of recruiting that was required to collect normative data from 60 lab-sourced participants (with specific demographic characteristics), only 34 days were needed to collect normative data from 99 crowdsourced participants (contributing a median of 22 responses). The majority of crowdsourced workers were female, and the median age was 35 years, lower than the lab-sourced median of 62 years but similar to the median age of the US population. The responses contributed by the crowdsourced participants were longer on average, that is, 33 words compared to 28 words (P<.001), and they used a less varied vocabulary. However, there was strong similarity in the words used to describe a particular clip between the two datasets, as a cross-dataset count of shared words showed (P<.001). Within both datasets, responses contained substantial relevant content, with more words in common with responses to the same clip than to other clips (P<.001). There was evidence that responses from female and older crowdsourced participants had more shared words (P=.004 and P=.01, respectively), whereas younger participants had higher numbers of shared words in the lab-sourced population (P=.01). Conclusions: Crowdsourcing is an effective approach to quickly and economically collect a large reliable dataset of normative natural language responses. PMID:23689038

  14. Linguistics in Language Education

    ERIC Educational Resources Information Center

    Kumar, Rajesh; Yunus, Reva

    2014-01-01

    This article looks at the contribution of insights from theoretical linguistics to an understanding of language acquisition and the nature of language in terms of their potential benefit to language education. We examine the ideas of innateness and universal language faculty, as well as multilingualism and the language-society relationship. Modern…

  16. Automatically Detecting Failures in Natural Language Processing Tools for Online Community Text

    PubMed Central

    Hartzler, Andrea L; Huh, Jina; McDonald, David W; Pratt, Wanda

    2015-01-01

    Background: The prevalence and value of patient-generated health text are increasing, but processing such text remains problematic. Although existing biomedical natural language processing (NLP) tools are appealing, most were developed to process clinician- or researcher-generated text, such as clinical notes or journal articles. In addition to being constructed for different types of text, other challenges of using existing NLP include constantly changing technologies, source vocabularies, and characteristics of text. These continuously evolving challenges warrant the need for applying low-cost systematic assessment. However, the primarily accepted evaluation method in NLP, manual annotation, requires tremendous effort and time. Objective: The primary objective of this study is to explore an alternative approach—using low-cost, automated methods to detect failures (eg, incorrect boundaries, missed terms, mismapped concepts) when processing patient-generated text with existing biomedical NLP tools. We first characterize common failures that NLP tools can make in processing online community text. We then demonstrate the feasibility of our automated approach in detecting these common failures using one of the most popular biomedical NLP tools, MetaMap. Methods: Using 9657 posts from an online cancer community, we explored our automated failure detection approach in two steps: (1) to characterize the failure types, we first manually reviewed MetaMap’s commonly occurring failures, grouped the inaccurate mappings into failure types, and then identified causes of the failures through iterative rounds of manual review using open coding, and (2) to automatically detect these failure types, we then explored combinations of existing NLP techniques and dictionary-based matching for each failure cause. Finally, we manually evaluated the automatically detected failures. Results: From our manual review, we characterized three types of failure: (1) boundary failures, (2) missed term failures, and (3) word ambiguity failures. Within these three failure types, we discovered 12 causes of inaccurate mappings of concepts. We used automated methods to detect almost half of 383,572 MetaMap’s mappings as problematic. Word sense ambiguity failure was the most widely occurring, comprising 82.22% of failures. Boundary failure was the second most frequent, amounting to 15.90% of failures, while missed term failures were the least common, making up 1.88% of failures. The automated failure detection achieved precision, recall, accuracy, and F1 score of 83.00%, 92.57%, 88.17%, and 87.52%, respectively. Conclusions: We illustrate the challenges of processing patient-generated online health community text and characterize failures of NLP tools on this patient-generated health text, demonstrating the feasibility of our low-cost approach to automatically detect those failures. Our approach shows the potential for scalable and effective solutions to automatically assess the constantly evolving NLP tools and source vocabularies to process patient-generated text. PMID:26323337
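
    A simplified stand-in for one of the automated checks (the missed-term failure): flag lexicon terms that occur in a post's text but in none of the concepts an NLP tool returned for it. MetaMap itself is not called here, and the lexicon and example data are illustrative assumptions.

      LEXICON = ("chemo", "neuropathy", "port", "tamoxifen")   # illustrative patient vocabulary

      def missed_terms(post_text, mapped_concepts):
          text = post_text.lower()
          mapped = " ".join(mapped_concepts).lower()
          return [term for term in LEXICON if term in text and term not in mapped]

      post = "Started chemo last week and the port site is already sore."
      mapped = ["Pharmaceutical Preparations"]        # concepts a tool returned for this post (made up)
      print(missed_terms(post, mapped))               # ['chemo', 'port'] -> candidate missed-term failures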

  17. A natural language query system for Hubble Space Telescope proposal selection

    NASA Technical Reports Server (NTRS)

    Hornick, Thomas; Cohen, William; Miller, Glenn

    1987-01-01

    The proposal selection process for the Hubble Space Telescope is assisted by a robust and easy-to-use query program (TACOS). The system parses an English-subset language sentence regardless of the order of the keyword phrases, allowing the user greater flexibility than a standard command query language. Capabilities for macro and procedure definition are also integrated. The system was designed for flexibility in both use and maintenance. In addition, TACOS can be applied to any knowledge domain that can be expressed in terms of a single relation. The system was implemented mostly in Common LISP. The TACOS design is described in detail, with particular attention given to the implementation methods of sentence processing.
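
    In the spirit of an order-independent keyword-phrase parser, the toy sketch below fills query fields from recognized phrases wherever they appear in the sentence. It is not TACOS itself (which was written mostly in Common LISP); the phrase table and fields are invented for illustration.

      import re

      PHRASES = {
          "instrument": r"\b(wfpc|fos|hrs)\b",
          "cycle":      r"\bcycle\s+(\d+)\b",
          "category":   r"\b(galaxy|quasar|star)\b",
      }

      def parse_query(sentence):
          query = {}
          s = sentence.lower()
          for field, pattern in PHRASES.items():
              m = re.search(pattern, s)
              if m:
                  query[field] = m.group(1)
          return query

      print(parse_query("List cycle 2 proposals that use the FOS for quasar targets"))
      print(parse_query("For quasar targets using the FOS, list cycle 2 proposals"))
      # both orders yield {'instrument': 'fos', 'cycle': '2', 'category': 'quasar'}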

  18. Research in knowledge representation for natural language communication and planning assistance. Final report, 18 March 1985-30 September 1988

    SciTech Connect

    Goodman, B.A.; Grosz, B.; Haas, A.; Litman, D.; Reinhardt, T.

    1988-11-01

    BBN's DARPA project in Knowledge Representation for Natural Language Communication and Planning Assistance has two primary objectives: 1) to perform research on aspects of the interaction between users who are making complex decisions and the systems that assist them with their task, focusing in particular on communication and the reasoning required for its underlying tasks of discourse processing, planning, plan recognition, and communication repair; and 2) based on these research objectives, to build tools for communication, plan recognition, and planning assistance, and for the representation of knowledge and reasoning that underlies all of these processes. This final report summarizes BBN's research activities performed under this contract in the areas of knowledge representation and speech and natural language. In particular, the report discusses work in the areas of knowledge representation, planning, and discourse modeling. We describe a parallel truth maintenance system. We provide an extension to the sentential theory of propositional attitudes by adding a sentential semantics. The report also contains a description of our research in discourse modeling in the areas of planning and plan recognition.

  19. A Requirements-Based Exploration of Open-Source Software Development Projects--Towards a Natural Language Processing Software Analysis Framework

    ERIC Educational Resources Information Center

    Vlas, Radu Eduard

    2012-01-01

    Open source projects do have requirements; they are, however, mostly informal, text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming, and for large projects,…

  1. Specific Language Impairment

    MedlinePLUS

    ... to understand the genetic underpinnings of SLI, the nature of the language deficits that cause it, and ...

  2. On the nature and evolution of the neural bases of human language

    NASA Technical Reports Server (NTRS)

    Lieberman, Philip

    2002-01-01

    The traditional theory equating the brain bases of language with Broca's and Wernicke's neocortical areas is wrong. Neural circuits linking activity in anatomically segregated populations of neurons in subcortical structures and the neocortex throughout the human brain regulate complex behaviors such as walking, talking, and comprehending the meaning of sentences. When we hear or read a word, neural structures involved in the perception or real-world associations of the word are activated as well as posterior cortical regions adjacent to Wernicke's area. Many areas of the neocortex and subcortical structures support the cortical-striatal-cortical circuits that confer complex syntactic ability, speech production, and a large vocabulary. However, many of these structures also form part of the neural circuits regulating other aspects of behavior. For example, the basal ganglia, which regulate motor control, are also crucial elements in the circuits that confer human linguistic ability and abstract reasoning. The cerebellum, traditionally associated with motor control, is active in motor learning. The basal ganglia are also key elements in reward-based learning. Data from studies of Broca's aphasia, Parkinson's disease, hypoxia, focal brain damage, and a genetically transmitted brain anomaly (the putative "language gene," family KE), and from comparative studies of the brains and behavior of other species, demonstrate that the basal ganglia sequence the discrete elements that constitute a complete motor act, syntactic process, or thought process. Imaging studies of intact human subjects and electrophysiologic and tracer studies of the brains and behavior of other species confirm these findings. As Dobzhansky put it, "Nothing in biology makes sense except in the light of evolution" (cited in Mayr, 1982). That applies with as much force to the human brain and the neural bases of language as it does to the human foot or jaw. The converse follows: the mark of evolution on the brains of human beings and other species provides insight into the evolution of the brain bases of human language. The neural substrate that regulated motor control in the common ancestor of apes and humans most likely was modified to enhance cognitive and linguistic ability. Speech communication played a central role in this process. However, the process that ultimately resulted in the human brain may have started when our earliest hominid ancestors began to walk.

  3. The Two Cultures of Science: On Language-Culture Incommensurability Concerning "Nature" and "Observation"

    ERIC Educational Resources Information Center

    Loo, Seng Piew

    2007-01-01

    Culture without nature is empty, nature without culture is deaf. Intercultural dialogue in higher education around the globe is needed to improve the theory, policy and practice of science and science education. The culture, cosmology and philosophy of "global" science as practiced today in all societies around the world are seemingly anchored in…

  4. What Is a Language?

    ERIC Educational Resources Information Center

    Le Page, R. B.

    A discussion on the nature of language argues the following: (1) the concept of a closed and finite rule system is inadequate for the description of natural languages; (2) as a consequence, the writing of variable rules to modify such rule systems so as to accommodate the properties of natural language is inappropriate; (3) the concept of such…

  5. L3 Interactive Data Language

    Energy Science and Technology Software Center (ESTSC)

    2006-09-05

    The L3 system is a computational steering environment for image processing and scientific computing. It consists of an interactive graphical language and interface. Its purpose is to help advanced users in controlling their computational software and assist in the management of data accumulated during numerical experiments. L3 provides a combination of features not found in other environments: textual and graphical construction of programs; persistence of programs and associated data; direct mapping between the scripts, the parameters, and the produced data; implicit hierarchical data organization; full programmability, including conditionals and functions; and incremental execution of programs. The software includes the L3 language and the graphical environment. The language is a single-assignment functional language; the implementation consists of lexer, parser, interpreter, storage handler, and editing support. The graphical environment is an event-driven nested list viewer/editor providing graphical elements corresponding to the language. These elements are both the representation of a user's program and active interfaces to the values computed by that program.

  6. Proceedings of the Conference on Language and Language Behavior.

    ERIC Educational Resources Information Center

    Zale, Eric M., Ed.

    This volume contains the papers read at the Conference on Language and Language Behavior held at the University of Michigan's Center for Research on Language and Language Behavior in October 1966. Papers are ordered under the following topics: First Language Acquisition in Natural Setting, Controlled Acquisition of First Language Skills, Second…

  7. Natural language processing of asthma discharge summaries for the monitoring of patient care.

    PubMed

    Sager, N; Lyman, M; Tick, L J; Nhàn, N T; Bucknall, C E

    1993-01-01

    A technique for monitoring healthcare via the processing of routinely collected narrative documentation is presented. A checklist of important details of asthma management in use in the Glasgow Royal Infirmary (GRI) was translated into SQL queries and applied to a database of 59 GRI discharge summaries analyzed by the New York University Linguistic String Project medical language processor. Tables of retrieved information obtained for each query were compared with the text of the original documents by physician reviewers. Categories (unit = document) were: (1) information present, retrieved correctly; (2) information not present; (3) information present, retrieved with minor or major error; (4) information present, retrieved with minor or major omissions. Category 2 (physician "documentation score") could be used to prioritize manual review and guide feedback to physicians to improve documentation. The semantic structuring and relative completeness of retrieved data suggest their potential use as input to further quality assurance procedures. PMID:8130474

  8. The development of a natural language interface to a geographical information system

    NASA Technical Reports Server (NTRS)

    Toledo, Sue Walker; Davis, Bruce

    1993-01-01

    This paper discusses a two-and-a-half-year project undertaken to develop an English-language interface for the geographical information system GRASS. The work was carried out for NASA by a small business, Netrologic, based in San Diego, California, under Phase 1 and 2 Small Business Innovation Research contracts. We consider here the potential value of this system, whose current functionality addresses numerical, categorical, and boolean raster layers; displays point sets defined by constraints on one or more layers; answers yes/no and numerical questions; and creates statistical reports. It also handles complex queries and lexical ambiguities, and allows temporarily switching to UNIX or GRASS.

  9. The embodied nature of medical concepts: image schemas and language for PAIN.

    PubMed

    Prieto Velasco, Juan Antonio; Tercedor Sánchez, Maribel

    2014-08-01

    Cognitive linguistics assumes that knowledge is both embodied and situated as far as it is acquired through our bodily interaction with the world in a specific environment (e.g. Barsalou in Lang Cogn Process 18:513-562, 2003; Connell et al. in PLoS One 7:3, 2012). Therefore, embodiment provides an explanation to the mental representation and linguistic expression of concepts. Among the first, we find multimodal conceptual structures, like image schemas, which are schematic representations of embodied experiences resulting from our conceptualization of the surrounding environment (Tercedor Sánchez et al. in J Spec Transl 18:187-205, 2012). Furthermore, the way we interact with the environment and its objects is dynamic and configures how we refer to concepts both by means of images and lexicalizations. In this article, we investigate how image schemas underlie verbal and visual representations. They both evoke concepts based on exteroception, interoception and proprioception which can be lexicalized through language. More specifically, we study (1) a multimodal corpus of medical texts to examine how image schemas lexicalize in the language of medicine to represent specialized concepts and (2) medical pictures to explore the depiction of image-schematic concepts, in order to account for the verbal and visual representation of embodied concepts. We explore the concept PAIN, a sensory and emotional experience associated with actual or potential tissue damage, using corpus analysis tools (Sketch Engine) to extract information about the lexicalization of underlying image schemas in definitions and defining contexts. Then, we use the image schemas behind medical concepts to consistently select images which depict our experience of pain and the way we understand it. Finally, such lexicalizations and visualizations will help us assess how we refer to PAIN both verbally and visually. PMID:24390539

  10. Writing in science: Exploring teachers' and students' views of the nature of science in language enriched environments

    NASA Astrophysics Data System (ADS)

    Decoito, Isha

    Writing in science can be used to address some of the issues relevant to contemporary scientific literacy, such as the nature of science, which describes the scientific enterprise for science education. This has implications for the kinds of writing tasks students should attempt in the classroom, and for how students should understand the rationale and claims of these tasks. While scientific writing may train the mind to think scientifically in a disciplined and structured way thus encouraging students to gain access to the public domain of scientific knowledge, the counter-argument is that students need to be able to express their thoughts freely in their own language. Writing activities must aim to promote philosophical and epistemological views of science that accurately portray contemporary science. This mixed-methods case study explored language-enriched environments, in this case, secondary science classrooms with a focus on teacher-developed activities, involving diversified writing styles, that were directly linked to the science curriculum. The research foci included: teachers' implementation of these activities in their classrooms; how the activities reflected the teachers' nature of science views; common attributes between students' views of science and how they represented science in their writings; and if, and how the activities influenced students' nature of science views. Teachers' and students' views of writing and the nature of science are illustrated through pre-and post-questionnaire responses; interviews; student work; and classroom observations. Results indicated that diversified writing activities have the potential to accurately portray science to students, personalize learning in science, improve students' overall attitude towards science, and enhance scientific literacy through learning science, learning about science, and doing science. Further research is necessary to develop an understanding of whether the choice of genre has an influence on meaning construction and understanding in science. Finally, this study concluded that the relationship between students' views of the nature of science and writing in science is complex and is dependent on several factors including the teachers' influence and attitude towards student writing in science.

  11. An Evaluation of a Natural Language Processing Tool for Identifying and Encoding Allergy Information in Emergency Department Clinical Notes

    PubMed Central

    Goss, Foster R.; Plasek, Joseph M.; Lau, Jason J.; Seger, Diane L.; Chang, Frank Y.; Zhou, Li

    2014-01-01

    Emergency department (ED) visits due to allergic reactions are common. Allergy information is often recorded in free-text provider notes; however, this domain has not yet been widely studied by the natural language processing (NLP) community. We developed an allergy module built on the MTERMS NLP system to identify and encode food, drug, and environmental allergies and allergic reactions. The module included updates to our lexicon using standard terminologies, and novel disambiguation algorithms. We developed an annotation schema and annotated 400 ED notes that served as a gold standard for comparison to MTERMS output. MTERMS achieved an F-measure of 87.6% for the detection of allergen names and no known allergies, 90% for identifying true reactions in each allergy statement where true allergens were also identified, and 69% for linking reactions to their allergen. These preliminary results demonstrate the feasibility of using NLP to extract and encode allergy information from clinical notes. PMID:25954363

  12. Computer-Aided TRIZ Ideality and Level of Invention Estimation Using Natural Language Processing and Machine Learning

    NASA Astrophysics Data System (ADS)

    Adams, Christopher; Tate, Derrick

    Patent textual descriptions provide a wealth of information that can be used to understand the underlying design approaches that result in the generation of novel and innovative technology. This article will discuss a new approach for estimating Degree of Ideality and Level of Invention metrics from the theory of inventive problem solving (TRIZ) using patent textual information. Patent text includes information that can be used to model both the functions performed by a design and the associated costs and problems that affect a design's value. The motivation of this research is to use patent data with calculation of TRIZ metrics to help designers understand which combinations of system components and functions result in creative and innovative design solutions. This article will discuss in detail methods to estimate these TRIZ metrics using natural language processing and machine learning with the use of neural networks.

  13. Classification of CT Pulmonary Angiography Reports by Presence, Chronicity, and Location of Pulmonary Embolism with Natural Language Processing

    PubMed Central

    Yu, Sheng; Kumamaru, Kanako K.; George, Elizabeth; Dunne, Ruth M.; Bedayat, Arash; Neykov, Matey; Hunsaker, Andetta R.; Dill, Karin E.; Cai, Tianxi; Rybicki, Frank J.

    2014-01-01

    In this paper we describe an efficient tool based on natural language processing for classifying the detailed state of pulmonary embolism (PE) recorded in CT pulmonary angiography reports. The classification tasks include: PE present vs. absent, acute PE vs. others, central PE vs. others, and sub-segmental PE vs. others. Statistical learning algorithms were trained with features extracted using the NLP tool and gold standard labels obtained via chart review from two radiologists. The areas under the receiver operating characteristic curves (AUC) for the four tasks were 0.998, 0.945, 0.987, and 0.986, respectively. We compared our classifiers with bag-of-words Naive Bayes classifiers, a standard text mining technology, which gave AUCs of 0.942, 0.765, 0.766, and 0.712, respectively. PMID:25117751
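
    For readers unfamiliar with the baseline, the bag-of-words Naive Bayes comparison can be sketched in a few lines; the report snippets, labels, and split below are invented toy data, not the authors' pipeline or corpus.

```python
# Minimal sketch of a bag-of-words Naive Bayes baseline scored with AUC.
# All texts and labels are invented; 1 = PE present, 0 = PE absent.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import roc_auc_score

train_reports = [
    "acute pulmonary embolism in the right main pulmonary artery",
    "no evidence of pulmonary embolism",
    "chronic subsegmental embolus in the left lower lobe",
    "negative examination, no filling defect identified",
]
train_labels = [1, 0, 1, 0]
test_reports = [
    "filling defect consistent with acute pulmonary embolism",
    "no filling defect and no pulmonary embolism",
]
test_labels = [1, 0]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(train_reports, train_labels)
scores = clf.predict_proba(test_reports)[:, 1]   # probability of the PE class
print("AUC:", roc_auc_score(test_labels, scores))
```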

  14. Why is combinatorial communication rare in the natural world, and why is language an exception to this trend?

    PubMed Central

    Scott-Phillips, Thomas C.; Blythe, Richard A.

    2013-01-01

    In a combinatorial communication system, some signals consist of the combinations of other signals. Such systems are more efficient than equivalent, non-combinatorial systems, yet despite this they are rare in nature. Why? Previous explanations have focused on the adaptive limits of combinatorial communication, or on its purported cognitive difficulties, but neither of these explains the full distribution of combinatorial communication in the natural world. Here, we present a nonlinear dynamical model of the emergence of combinatorial communication that, unlike previous models, considers how initially non-communicative behaviour evolves to take on a communicative function. We derive three basic principles about the emergence of combinatorial communication. We hence show that the interdependence of signals and responses places significant constraints on the historical pathways by which combinatorial signals might emerge, to the extent that anything other than the most simple form of combinatorial communication is extremely unlikely. We also argue that these constraints can be bypassed if individuals have the socio-cognitive capacity to engage in ostensive communication. Humans, but probably no other species, have this ability. This may explain why language, which is massively combinatorial, is such an extreme exception to nature's general trend for non-combinatorial communication. PMID:24047871

  15. Pupils Reasoning about the Nature of Change Using an Abstract Picture Language.

    ERIC Educational Resources Information Center

    Stylianidou, Fani; Boohan, Richard

    The research is concerned with investigating children's understanding of physical, chemical, and biological changes while using an approach developed by the project Energy and Change. This project aimed to provide novel ways of teaching about the nature and direction of changes, in particular introducing ideas related to the Second Law of

  16. Psychological linguistics: A natural science approach to the study of language interactions

    PubMed Central

    Bijou, Sidney W.; Umbreit, John; Ghezzi, Patrick M.; Chao, Chia-Chen

    1986-01-01

    Kantor's theoretical analysis of psychological linguistics offers a natural science approach to the study of linguistic behavior and interactions. This paper includes brief descriptions of (a) some of the basic assumptions of the approach, (b) Kantor's conception of linguistic behavior and interactions, (c) a compatible research method and sample research data, and (d) some areas of research and application. PMID:22477507

  17. Object-oriented knowledge representation in a natural language understanding system of economic surveys

    NASA Astrophysics Data System (ADS)

    Planes, Jean-Christophe; Trigano, Philippe

    1992-03-01

    The HIRONDELLE research project of the Banque de France intends to summarize economic surveys giving statements about a specific economic domain. The principal goal is the detection of causal relations between economic events appearing in the texts. We will focus on knowledge representation, based on three distinct hierarchical structures. The first one concerns the lexical items and allows inheritance of syntactic properties. Descriptions of the application domains are achieved by a taxonomy based on attribute-value models and case relations, adapted to the economic sectors. The summarization goal of this system defines a set of primitives representing statements and a causality meta-language. The semantic analysis of the texts is based on two phases. The first one leads to a propositional representation of the sentences through conceptual graphs formalization, taking into account the syntactic transformations of sentences. The second one is dedicated to the summarizing role of the system, detecting paraphrastic sentences by processing syntactic and semantic transformations like negation or metonymic constructions.

  18. Arbitrary Symbolism in Natural Language Revisited: When Word Forms Carry Meaning

    PubMed Central

    Reilly, Jamie; Westbury, Chris; Kean, Jacob; Peelle, Jonathan E.

    2012-01-01

    Cognitive science has a rich history of interest in the ways that languages represent abstract and concrete concepts (e.g., idea vs. dog). Until recently, this focus has centered largely on aspects of word meaning and semantic representation. However, recent corpora analyses have demonstrated that abstract and concrete words are also marked by phonological, orthographic, and morphological differences. These regularities in sound-meaning correspondence potentially allow listeners to infer certain aspects of semantics directly from word form. We investigated this relationship between form and meaning in a series of four experiments. In Experiments 1 and 2, we examined the role of metalinguistic knowledge in semantic decision by asking participants to make semantic judgments for aurally presented nonwords selectively varied by specific acoustic and phonetic parameters. Participants consistently associated increased word length and diminished wordlikeness with abstract concepts. In Experiment 3, participants completed a semantic decision task (i.e., abstract or concrete) for real words varied by length and concreteness. Participants were more likely to misclassify longer, inflected words (e.g., apartment) as abstract and shorter uninflected abstract words (e.g., fate) as concrete. In Experiment 4, we used a multiple regression to predict trial-level naming data from a large corpus of nouns, which revealed significant interaction effects between concreteness and word form. Together these results provide converging evidence for the hypothesis that listeners map sound to meaning through a non-arbitrary process using prior knowledge about statistical regularities in the surface forms of words. PMID:22879931
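
    The Experiment 4 analysis has the general shape of a least-squares regression with a concreteness-by-word-form interaction; the sketch below uses invented toy data and variable names (not the authors' corpus or exact predictors) simply to show how such an interaction term is specified.

```python
# Hedged sketch of a regression with a concreteness x word-length interaction;
# the data frame is invented toy data for illustration only.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "naming_rt":    [620, 655, 700, 640, 610, 690, 730, 660],   # hypothetical latencies (ms)
    "concreteness": [4.8, 4.5, 2.1, 4.9, 5.0, 2.3, 1.9, 2.5],
    "length":       [4, 6, 9, 5, 3, 8, 10, 7],                  # letters per word
})
model = smf.ols("naming_rt ~ concreteness * length", data=df).fit()
print(model.params)   # includes the concreteness:length interaction coefficient
```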

  19. The nature of facilitation and interference in the multilingual language system: insights from treatment in a case of trilingual aphasia.

    PubMed

    Keane, Caitlin; Kiran, Swathi

    2015-01-01

    The rehabilitation study described here sets out to test the premise of Abutalebi and Green's neurocognitive model--specifically, that language selection and control are components of overall cognitive control. We follow a trilingual woman (first language, L1: Amharic; second language, L2: English; third language, L3: French) with damage to the left frontal lobe and left basal ganglia who presented with cognitive control and naming deficits, through two periods of semantic treatment (French, followed by English) to alleviate naming deficits. The results showed that while the participant improved on trained items, she did not show within- or cross-language generalization. In addition, error patterns revealed a substantial increase of interference of the currently trained language into the nontrained language during each of the two treatment phases. These results are consistent with Abutalebi and Green's neurocognitive model and support the claim that language selection and control are components of overall cognitive control. PMID:26377506

  20. Neurolinguistic Approach to Natural Language Processing with Applications to Medical Text Analysis

    PubMed Central

    Matykiewicz, Paweł; Pestian, John

    2008-01-01

    Understanding written or spoken language presumably involves spreading neural activation in the brain. This process may be approximated by spreading activation in semantic networks, providing enhanced representations that involve concepts that are not found directly in the text. Approximation of this process is of great practical and theoretical interest. Although activations of neural circuits involved in the representation of words rapidly change in time, snapshots of these activations spreading through associative networks may be captured in a vector model. Concepts of similar type activate larger clusters of neurons, priming areas in the left and right hemisphere. Analysis of recent brain imaging experiments shows the importance of right-hemisphere non-verbal clusterization. Medical ontologies enable development of a large-scale practical algorithm to re-create pathways of spreading neural activations. First, concepts of a specific semantic type are identified in the text, and then all related concepts of the same type are added to the text, providing expanded representations. To avoid rapid growth of the extended feature space, after each step only the most useful features that increase document clusterization are retained. Short hospital discharge summaries are used to illustrate how this process works on real, very noisy data. Expanded texts show significantly improved clustering and may be classified with much higher accuracy. Although better approximations to the spreading of neural activations may be devised, the practical approach presented in this paper helps to discover pathways used by the brain to process specific concepts, and may be used in large-scale applications. PMID:18614334
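
    As a rough illustration of the expansion step described above (not the authors' algorithm), the sketch below performs a single spreading step: concepts detected in a document pull in associated concepts from a hypothetical semantic network, with the number of added concepts bounded to limit feature-space growth.

```python
# Toy sketch of one expansion step over a hypothetical association network.
semantic_network = {
    "myocardial infarction": ["chest pain", "troponin", "ischemia"],
    "pneumonia": ["cough", "fever", "infiltrate"],
}

def expand(found_concepts, network, keep=2):
    """Add up to `keep` associated concepts per detected concept."""
    expanded = set(found_concepts)
    for concept in found_concepts:
        expanded.update(network.get(concept, [])[:keep])
    return expanded

doc_concepts = {"myocardial infarction"}
print(expand(doc_concepts, semantic_network))
```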

  1. Natural language query system design for interactive information storage and retrieval systems. Presentation visuals. M.S. Thesis Final Report, 1 Jul. 1985 - 31 Dec. 1987

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (editor); Liu, I-Hsiung

    1985-01-01

    This Working Paper Series entry represents a collection of presentation visuals associated with the companion report entitled Natural Language Query System Design for Interactive Information Storage and Retrieval Systems, USL/DBMS NASA/RECON Working Paper Series report number DBMS.NASA/RECON-17.

  2. SIMD-parallel understanding of natural language with application to magnitude-only optical parsing of text

    NASA Astrophysics Data System (ADS)

    Schmalz, Mark S.

    1992-08-01

    A novel parallel model of natural language (NL) understanding is presented which can realize high levels of semantic abstraction, and is designed for implementation on synchronous SIMD architectures and optical processors. Theory is expressed in terms of the Image Algebra (IA), a rigorous, concise, inherently parallel notation which unifies the design, analysis, and implementation of image processing algorithms. The IA has been implemented on numerous parallel architectures, and IA preprocessors and interpreters are available for the FORTRAN and Ada languages. In a previous study, we demonstrated the utility of IA for mapping MEA-conformable (Multiple Execution Array) algorithms to optical architectures. In this study, we extend our previous theory to map serial parsing algorithms to the synchronous SIMD paradigm. We initially derive a two-dimensional image that is based upon the adjacency matrix of a semantic graph. Via IA template mappings, the operations of bottom-up parsing, semantic disambiguation, and referential resolution are implemented as image-processing operations upon the adjacency matrix. Pixel-level operations are constrained to Hadamard addition and multiplication, thresholding, and row/column summation, which are available in magnitude-only optics. Assuming high parallelism in the parse rule base, the parsing of n input symbols with a grammar consisting of M rules of arity H, on an N-processor architecture, could exhibit time complexity of T(n)
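
    A small numpy sketch can make the pixel-level primitives named above concrete; the adjacency matrix and template below are invented toy data, not the paper's Image Algebra formulation.

```python
# Illustrative use of Hadamard (elementwise) multiplication/addition,
# thresholding, and row/column summation on a toy adjacency matrix.
import numpy as np

A = np.array([[0, 1, 0],          # toy adjacency matrix of a semantic graph
              [1, 0, 1],
              [0, 1, 0]])
T = np.array([[0, 1, 1],          # hypothetical rule template
              [0, 0, 1],
              [0, 0, 0]])

product = A * T                    # Hadamard multiplication
summed = A + T                     # Hadamard addition
thresholded = (summed >= 1).astype(int)
print(product.sum(axis=1))         # row summation
print(thresholded.sum(axis=0))     # column summation
```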

  3. The nature and prevalence of disability in a Ghanaian community as measured by the Language Independent Functional Evaluation

    PubMed Central

    Kelemen, Benjamin William; Haig, Andrew John; Goodnight, Siera; Nyante, Gifty

    2013-01-01

    Introduction The current study uses the Language Independent Functional Evaluation (L.I.F.E.) to evaluate disability in a smaller Ghanaian coastal town to characterize the extent and nature of disability. The L.I.F.E. is a video-animated, language-free equivalent of the standard 10-item verbal/written Barthel Index functional assessment. Methods Over a four-month period, the L.I.F.E. survey was given to members of the village of Anomabo in a preliminary survey which consisted of recruitment in an uncontrolled manner, followed by a systematic, comprehensive survey of three neighborhood clusters. Basic demographics were also collected, along with the observer's assessment of disability. Results 541 inhabitants (264 in the preliminary survey and 277 in the systematic survey) completed the L.I.F.E. Participants ranged from 7 to 100 years old (mean age 32.88, s.d. 20.64) and were 55.9% female. In the systematic study, 16.6% of participants had a less than perfect score on the L.I.F.E., indicating some degree of impairment. Significant differences were found between age groups, but not between sexes, the preliminary and systematic survey, and study location (α = .05). Conclusion The L.I.F.E. and this study methodology can be used to measure the prevalence of disability in African communities. Disability in this community was higher than the frequently cited estimate of 10%. African policymakers can use the L.I.F.E. to measure disability and thus more rationally allocate resources for medical rehabilitation. PMID:23717718

  4. Evaluation of natural language processing from emergency department computerized medical records for intra-hospital syndromic surveillance

    PubMed Central

    2011-01-01

    Background The identification of patients who pose an epidemic hazard when they are admitted to a health facility plays a role in preventing the risk of hospital acquired infection. An automated clinical decision support system to detect suspected cases, based on the principle of syndromic surveillance, is being developed at the University of Lyon's Hôpital de la Croix-Rousse. This tool will analyse structured data and narrative reports from computerized emergency department (ED) medical records. The first step consists of developing an application (UrgIndex) which automatically extracts and encodes information found in narrative reports. The purpose of the present article is to describe and evaluate this natural language processing system. Methods Narrative reports have to be pre-processed before utilizing the French-language medical multi-terminology indexer (ECMT) for standardized encoding. UrgIndex identifies and excludes syntagmas containing a negation and replaces non-standard terms (abbreviations, acronyms, spelling errors...). Then, the phrases are sent to the ECMT through an Internet connection. The indexer's reply, based on Extensible Markup Language, returns codes and literals corresponding to the concepts found in phrases. UrgIndex filters codes corresponding to suspected infections. Recall is defined as the number of relevant processed medical concepts divided by the number of concepts evaluated (coded manually by the medical epidemiologist). Precision is defined as the number of relevant processed concepts divided by the number of concepts proposed by UrgIndex. Recall and precision were assessed for respiratory and cutaneous syndromes. Results Evaluation of 1,674 processed medical concepts contained in 100 ED medical records (50 for respiratory syndromes and 50 for cutaneous syndromes) showed an overall recall of 85.8% (95% CI: 84.1-87.3). Recall varied from 84.5% for respiratory syndromes to 87.0% for cutaneous syndromes. The most frequent cause of lack of processing was non-recognition of the term by UrgIndex (9.7%). Overall precision was 79.1% (95% CI: 77.3-80.8). It varied from 81.4% for respiratory syndromes to 77.0% for cutaneous syndromes. Conclusions This study demonstrates the feasibility of and interest in developing an automated method for extracting and encoding medical concepts from ED narrative reports, the first step required for the detection of potentially infectious patients at epidemic risk. PMID:21798029
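
    The recall and precision definitions above translate directly into arithmetic. In the sketch below, the 1,674 evaluated concepts come from the abstract, while the other two counts are hypothetical, back-derived only so the output roughly matches the reported 85.8% recall and 79.1% precision.

```python
# Recall/precision exactly as defined above; two of the counts are invented.
concepts_evaluated = 1674      # coded manually by the medical epidemiologist (from the abstract)
relevant_processed = 1436      # hypothetical: concepts processed correctly by UrgIndex
proposed_by_urgindex = 1815    # hypothetical: concepts proposed by UrgIndex

recall = relevant_processed / concepts_evaluated        # ~0.858
precision = relevant_processed / proposed_by_urgindex   # ~0.791
print(f"recall={recall:.1%}, precision={precision:.1%}")
```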

  5. A perspective on the advancement of natural language processing tasks via topological analysis of complex networks. Comment on "Approaching human language with complex networks" by Cong and Liu

    NASA Astrophysics Data System (ADS)

    Amancio, Diego Raphael

    2014-12-01

    Concepts and methods of complex networks have been applied to probe the properties of a myriad of real systems [1]. The finding that written texts modeled as graphs share several properties of other completely different real systems has inspired the study of language as a complex system [2]. Actually, language can be represented as a complex network in its several levels of complexity. As a consequence, morphological, syntactic and semantic properties have been employed in the construction of linguistic networks [3]. Even the character level has been useful to unfold particular patterns [4,5]. In the review by Cong and Liu [6], the authors emphasize the need to use the topological information of complex networks modeling the various spheres of the language to better understand its origins, evolution and organization. In addition, the authors cite the use of networks in applications aiming at holistic typology and stylistic variations. In this context, I will discuss some possible directions that could be followed in future research directed towards the understanding of language via topological characterization of complex linguistic networks. In addition, I will comment on the use of network models for language processing applications. Additional prospects for future practical research lines will also be discussed in this comment.

  6. Salience: the key to the selection problem in natural language generation

    SciTech Connect

    Conklin, E.J.; McDonald, D.D.

    1982-01-01

    The authors argue that in domains where a strong notion of salience can be defined, it can be used to provide: (1) an elegant solution to the selection problem, i.e. the problem of how to decide whether a given fact should or should not be mentioned in the text; and (2) a simple and direct control framework for the entire deep generation process, coordinating proposing, planning, and realization. (Deep generation involves reasoning about conceptual and rhetorical facts, as opposed to the narrowly linguistic reasoning that takes place during realization.) The authors report on an empirical study of salience in pictures of natural scenes, and its use in a computer program that generates descriptive paragraphs comparable to those produced by people. 13 references.

  7. American Sign Language

    MedlinePLUS

    ... Langue des Signes Française). Today's ASL includes some elements of LSF plus the original local sign languages, ... can also be used to model the essential elements and organization of natural language. Another NIDCD-funded ...

  8. Combining Speech Recognition/Natural Language Processing with 3D Online Learning Environments to Create Distributed Authentic and Situated Spoken Language Learning

    ERIC Educational Resources Information Center

    Jones, Greg; Squires, Todd; Hicks, Jeramie

    2008-01-01

    This article will describe research done at the National Institute of Multimedia in Education, Japan and the University of North Texas on the creation of a distributed Internet-based spoken language learning system that would provide more interactive and motivating learning than current multimedia and audiotape-based systems. The project combined

  9. BabelMeSH: Development of a Cross-Language Tool for MEDLINE/PubMed

    PubMed Central

    Liu, Fang; Fontelo, Paul; Ackerman, Michael

    2006-01-01

    BabelMeSH is a cross-language tool for searching MEDLINE/PubMed. Queries can be submitted as single terms or complex phrases in French, Spanish and Portuguese. Citations will be sent to the user in English. It uses a smart parser interface with a medical terms database in MySQL. Preliminary evaluation using compound key words in foreign language medical journals showed an accuracy of 68%, 60% and 51% for French, Spanish and Portuguese, respectively. Development is continuing. PMID:17238631

  11. First Language Acquisition and Teaching

    ERIC Educational Resources Information Center

    Cruz-Ferreira, Madalena

    2011-01-01

    "First language acquisition" commonly means the acquisition of a single language in childhood, regardless of the number of languages in a child's natural environment. Language acquisition is variously viewed as predetermined, wondrous, a source of concern, and as developing through formal processes. "First language teaching" concerns schooling in

  13. Informatics in radiology: RADTF: a semantic search-enabled, natural language processor-generated radiology teaching file.

    PubMed

    Do, Bao H; Wu, Andrew; Biswal, Sandip; Kamaya, Aya; Rubin, Daniel L

    2010-11-01

    Storing and retrieving radiology cases is an important activity for education and clinical research, but this process can be time-consuming. In the process of structuring reports and images into organized teaching files, incidental pathologic conditions not pertinent to the primary teaching point can be omitted, as when a user saves images of an aortic dissection case but disregards the incidental osteoid osteoma. An alternate strategy for identifying teaching cases is text search of reports in radiology information systems (RIS), but retrieved reports are unstructured, teaching-related content is not highlighted, and patient identifying information is not removed. Furthermore, searching unstructured reports requires sophisticated retrieval methods to achieve useful results. An open-source, RadLex-compatible teaching file solution called RADTF, which uses natural language processing (NLP) methods to process radiology reports, was developed to create a searchable teaching resource from the RIS and the picture archiving and communication system (PACS). The NLP system extracts and de-identifies teaching-relevant statements from full reports to generate a stand-alone database, thus converting existing RIS archives into an on-demand source of teaching material. Using RADTF, the authors generated a semantic search-enabled, Web-based radiology archive containing over 700,000 cases with millions of images. RADTF combines a compact representation of the teaching-relevant content in radiology reports and a versatile search engine with the scale of the entire RIS-PACS collection of case material. PMID:20801868

  14. Probabilistic Constraint Logic Programming. Formal Foundations of Quantitative and Statistical Inference in Constraint-Based Natural Language Processing

    NASA Astrophysics Data System (ADS)

    Riezler, Stefan

    2000-08-01

    In this thesis, we present two approaches to a rigorous mathematical and algorithmic foundation of quantitative and statistical inference in constraint-based natural language processing. The first approach, called quantitative constraint logic programming, is conceptualized in a clear logical framework, and presents a sound and complete system of quantitative inference for definite clauses annotated with subjective weights. This approach combines a rigorous formal semantics for quantitative inference based on subjective weights with efficient weight-based pruning for constraint-based systems. The second approach, called probabilistic constraint logic programming, introduces a log-linear probability distribution on the proof trees of a constraint logic program and an algorithm for statistical inference of the parameters and properties of such probability models from incomplete, i.e., unparsed data. The possibility of defining arbitrary properties of proof trees as properties of the log-linear probability model and efficiently estimating appropriate parameter values for them permits the probabilistic modeling of arbitrary context-dependencies in constraint logic programs. The usefulness of these ideas is evaluated empirically in a small-scale experiment on finding the correct parses of a constraint-based grammar. In addition, we address the problem of computational intractability of the calculation of expectations in the inference task and present various techniques to approximately solve this task. Moreover, we present an approximate heuristic technique for searching for the most probable analysis in probabilistic constraint logic programs.
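
    The log-linear distribution over proof trees mentioned above has the standard exponential-family form; it is written out below in generic notation (the symbols are assumptions here, not necessarily the thesis's own).

```latex
% Log-linear model over proof trees t, with feature functions f_i and weights \lambda_i.
p_{\lambda}(t) = \frac{1}{Z_{\lambda}} \exp\Big( \sum_{i} \lambda_i \, f_i(t) \Big),
\qquad
Z_{\lambda} = \sum_{t'} \exp\Big( \sum_{i} \lambda_i \, f_i(t') \Big)
```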

  15. Automatic extraction of nanoparticle properties using natural language processing: NanoSifter an application to acquire PAMAM dendrimer properties.

    PubMed

    Jones, David E; Igo, Sean; Hurdle, John; Facelli, Julio C

    2014-01-01

    In this study, we demonstrate the use of natural language processing methods to extract, from nanomedicine literature, numeric values of biomedical property terms of poly(amidoamine) dendrimers. We have developed a method for extracting these values for properties taken from the NanoParticle Ontology, using the General Architecture for Text Engineering and a Nearly-New Information Extraction System. We also created a method for associating the identified numeric values with their corresponding dendrimer properties, called NanoSifter. We demonstrate that our system can correctly extract numeric values of dendrimer properties reported in the cancer treatment literature with high recall, precision, and f-measure. The micro-averaged recall was 0.99, precision was 0.84, and f-measure was 0.91. Similarly, the macro-averaged recall was 0.99, precision was 0.87, and f-measure was 0.92. To our knowledge, these results are the first application of text mining to extract and associate dendrimer property terms and their corresponding numeric values. PMID:24392101
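
    The difference between the micro- and macro-averaged figures quoted above is worth spelling out; the sketch below uses invented per-property counts, not NanoSifter's actual results.

```python
# Micro- vs macro-averaged precision/recall/F-measure over property types;
# the (TP, FP, FN) counts per property are invented for illustration.
per_type = {
    "molecular weight":  (40, 6, 1),
    "particle diameter": (25, 5, 0),
}

def prf(tp, fp, fn):
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return p, r, 2 * p * r / (p + r)

# Macro: score each property type separately, then average the scores.
macro_scores = [prf(*counts) for counts in per_type.values()]
macro = tuple(sum(vals) / len(vals) for vals in zip(*macro_scores))

# Micro: pool the counts across types, then score once.
pooled = tuple(sum(c[i] for c in per_type.values()) for i in range(3))
micro = prf(*pooled)

print("macro P/R/F:", macro)
print("micro P/R/F:", micro)
```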

  16. Rethinking information delivery: using a natural language processing application for point-of-care data discovery

    PubMed Central

    Workman, T. Elizabeth; Stoddart, Joan M

    2012-01-01

    Objective: This paper examines the use of Semantic MEDLINE, a natural language processing application enhanced with a statistical algorithm known as Combo, as a potential decision support tool for clinicians. Semantic MEDLINE summarizes text in PubMed citations, transforming it into compact declarations that are filtered according to a user's information need that can be displayed in a graphic interface. Integration of the Combo algorithm enables Semantic MEDLINE to deliver information salient to many diverse needs. Methods: The authors selected three disease topics and crafted PubMed search queries to retrieve citations addressing the prevention of these diseases. They then processed the citations with Semantic MEDLINE, with the Combo algorithm enhancement. To evaluate the results, they constructed a reference standard for each disease topic consisting of preventive interventions recommended by a commercial decision support tool. Results: Semantic MEDLINE with Combo produced an average recall of 79% in primary and secondary analyses, an average precision of 45%, and a final average F-score of 0.57. Conclusion: This new approach to point-of-care information delivery holds promise as a decision support tool for clinicians. Health sciences libraries could implement such technologies to deliver tailored information to their users. PMID:22514507

  17. Tracking irregular morphophonological dependencies in natural language: evidence from the acquisition of subject-verb agreement in French.

    PubMed

    Nazzi, Thierry; Barrière, Isabelle; Goyet, Louise; Kresh, Sarah; Legendre, Géraldine

    2011-07-01

    This study examines French-learning infants' sensitivity to grammatical non-adjacent dependencies involving subject-verb agreement (e.g., le/les garçons lit/lisent 'the boy(s) read(s)') where number is audible on both the determiner of the subject DP and the agreeing verb, and the dependency is spanning across two syntactic phrases. A further particularity of this subsystem of French subject-verb agreement is that number marking on the verb is phonologically highly irregular. Despite the challenge, the HPP results for 24- and 18-month-olds demonstrate knowledge of both number dependencies: between the singular determiner le and the non-adjacent singular verbal forms and between the plural determiner les and the non-adjacent plural verbal forms. A control experiment suggests that the infants are responding to known verb forms, not phonological regularities. Given the paucity of such forms in the adult input documented through a corpus study, these results are interpreted as evidence that 18-month-olds have the ability to extract complex patterns across a range of morphophonologically inconsistent and infrequent items in natural language. PMID:21497801

  19. An Introduction to Natural Language Processing: How You Can Get More From Those Electronic Notes You Are Generating.

    PubMed

    Kimia, Amir A; Savova, Guergana; Landschaft, Assaf; Harper, Marvin B

    2015-07-01

    Electronically stored clinical documents may contain both structured data and unstructured data. The use of structured clinical data varies by facility, but clinicians are familiar with coded data such as International Classification of Diseases, Ninth Revision, and Systematized Nomenclature of Medicine-Clinical Terms codes, and commonly other data including patient chief complaints or laboratory results. Most electronic health records store much more clinical information as unstructured data, for example, clinical narrative such as the history of present illness, procedure notes, and clinical decision making. Despite the importance of this information, electronic capture or retrieval of unstructured clinical data has been challenging. The field of natural language processing (NLP) is undergoing rapid development, and existing tools can be successfully used for quality improvement, research, healthcare coding, and even billing compliance. In this brief review, we provide examples of successful uses of NLP on emergency medicine physician visit notes for various projects, discuss the challenges of retrieving specific data, and finally present practical methods that can run on a standard personal computer as well as high-end, state-of-the-art funded processes run by leading NLP informatics researchers. PMID:26148107

  20. The CMS DBS query language

    NASA Astrophysics Data System (ADS)

    Kuznetsov, Valentin; Riley, Daniel; Afaq, Anzar; Sekhri, Vijay; Guo, Yuyi; Lueking, Lee

    2010-04-01

    The CMS experiment has implemented a flexible and powerful system enabling users to find data within the CMS physics data catalog. The Dataset Bookkeeping Service (DBS) comprises a database and the services used to store and access metadata related to CMS physics data. To this, we have added a generalized query system in addition to the existing web and programmatic interfaces to the DBS. This query system is based on a query language that hides the complexity of the underlying database structure by discovering the join conditions between database tables. This provides a way of querying the system that is simple and straightforward for CMS data managers and physicists to use without requiring knowledge of the database tables or keys. The DBS Query Language uses the ANTLR tool to build the input query parser and tokenizer, followed by a query builder that uses a graph representation of the DBS schema to construct the SQL query sent to the underlying database. We will describe the design of the query system, provide details of the language components, and give an overview of how this component fits into the overall data discovery system architecture.
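
    The join-discovery idea, treating the schema as a graph and finding a path between the tables a query touches, can be sketched as a breadth-first search. The table names and links below are hypothetical placeholders, not the actual DBS schema.

```python
# Illustrative breadth-first search for a join path over a toy schema graph.
from collections import deque

schema_edges = {                     # hypothetical foreign-key links
    "dataset": ["block", "primary_dataset"],
    "block": ["dataset", "file"],
    "file": ["block"],
    "primary_dataset": ["dataset"],
}

def join_path(start, goal):
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in schema_edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(join_path("file", "primary_dataset"))   # file -> block -> dataset -> primary_dataset
```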

  1. The Acquisition of Written Language: Response and Revision. Writing Research: Multidisciplinary Inquiries into the Nature of Writing Series.

    ERIC Educational Resources Information Center

    Freedman, Sarah Warshauer, Ed.

    Viewing writing as both a form of language learning and an intellectual skill, this book presents essays on how writers acquire trusted inner voices and the roles schools and teachers can play in helping student writers in the learning process. The essays in the book focus on one of three topics: the language of instruction and how response and…

  2. Integrating Learner Corpora and Natural Language Processing: A Crucial Step towards Reconciling Technological Sophistication and Pedagogical Effectiveness

    ERIC Educational Resources Information Center

    Granger, Sylviane; Kraif, Olivier; Ponton, Claude; Antoniadis, Georges; Zampa, Virginie

    2007-01-01

    Learner corpora, electronic collections of spoken or written data from foreign language learners, offer unparalleled access to many hitherto uncovered aspects of learner language, particularly in their error-tagged format. This article aims to demonstrate the role that the learner corpus can play in CALL, particularly when used in conjunction with…

  4. MeSH Speller + askMEDLINE: auto-completes MeSH terms then searches MEDLINE/PubMed via free-text, natural language queries.

    PubMed

    Fontelo, Paul; Liu, Fang; Ackerman, Michael

    2005-01-01

    Medical terminology is challenging even for healthcare personnel. Spelling errors can make searching MEDLINE/PubMed ineffective. We developed a utility that provides MeSH term and Specialist Lexicon vocabulary suggestions as a term is typed on a search page. The correctly spelled term can be incorporated into a free-text, natural language search or used as a clinical queries search. PMID:16779244

  5. A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools

    PubMed Central

    2012-01-01

    Background We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. Results Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. Conclusions The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications. PMID:22901054

  6. Language, Gesture, and Space.

    ERIC Educational Resources Information Center

    Emmorey, Karen, Ed.; Reilly, Judy S., Ed.

    A collection of papers addresses a variety of issues regarding the nature and structure of sign language, gesture, and gesture systems. Articles include: "Theoretical Issues Relating Language, Gesture, and Space: An Overview" (Karen Emmorey, Judy S. Reilly); "Real, Surrogate, and Token Space: Grammatical Consequences in ASL American Sign Language"

  8. Web 2.0-Based Crowdsourcing for High-Quality Gold Standard Development in Clinical Natural Language Processing

    PubMed Central

    Deleger, Louise; Li, Qi; Kaiser, Megan; Stoutenborough, Laura

    2013-01-01

    Background A high-quality gold standard is vital for supervised, machine learning-based, clinical natural language processing (NLP) systems. In clinical NLP projects, expert annotators traditionally create the gold standard. However, traditional annotation is expensive and time-consuming. To reduce the cost of annotation, general NLP projects have turned to crowdsourcing based on Web 2.0 technology, which involves submitting smaller subtasks to a coordinated marketplace of workers on the Internet. Many studies have been conducted in the area of crowdsourcing, but only a few have focused on tasks in the general NLP field and only a handful in the biomedical domain, usually based upon very small pilot sample sizes. In addition, the quality of the crowdsourced biomedical NLP corpora was never exceptional when compared to traditionally-developed gold standards. The previously reported results on a medical named entity annotation task showed a 0.68 F-measure based agreement between crowdsourced and traditionally-developed corpora. Objective Building upon previous work from the general crowdsourcing research, this study investigated the usability of crowdsourcing in the clinical NLP domain with special emphasis on achieving high agreement between crowdsourced and traditionally-developed corpora. Methods To build the gold standard for evaluating the crowdsourcing workers' performance, 1042 clinical trial announcements (CTAs) from the ClinicalTrials.gov website were randomly selected and double annotated for medication names, medication types, and linked attributes. For the experiments, we used CrowdFlower, an Amazon Mechanical Turk-based crowdsourcing platform. We calculated sensitivity, precision, and F-measure to evaluate the quality of the crowd's work and tested the statistical significance (P<.001, chi-square test) to detect differences between the crowdsourced and traditionally-developed annotations. Results The agreement between the crowd's annotations and the traditionally-generated corpora was high for: (1) annotations (0.87, F-measure for medication names; 0.73, medication types), (2) correction of previous annotations (0.90, medication names; 0.76, medication types), and excellent for (3) linking medications with their attributes (0.96). Simple voting provided the best judgment aggregation approach. There was no statistically significant difference between the crowd and traditionally-generated corpora. Our results showed a 27.9% improvement over previously reported results on the medication named entity annotation task. Conclusions This study offers three contributions. First, we proved that crowdsourcing is a feasible, inexpensive, fast, and practical approach to collect high-quality annotations for clinical text (when protected health information was excluded). We believe that well-designed user interfaces and a rigorous quality control strategy for entity annotation and linking were critical to the success of this work. Second, as a further contribution to the Internet-based crowdsourcing field, we will publicly release the JavaScript and CrowdFlower Markup Language infrastructure code that is necessary to utilize CrowdFlower's quality control and crowdsourcing interfaces for named entity annotations. Finally, to spur future research, we will release the CTA annotations that were generated by traditional and crowdsourced approaches. PMID:23548263
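
    The "simple voting" aggregation that worked best here amounts to a majority vote over the workers' labels for each annotated span; a minimal sketch with invented worker labels follows.

```python
# Majority-vote aggregation of crowd labels per span; labels are invented toy data.
from collections import Counter

judgments = {
    "aspirin 81 mg": ["medication_name", "medication_name", "attribute"],
    "daily":         ["attribute", "attribute", "attribute"],
}
consensus = {span: Counter(labels).most_common(1)[0][0]
             for span, labels in judgments.items()}
print(consensus)
```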

  9. Advanced computer languages

    SciTech Connect

    Bryce, H.

    1984-05-03

    If software is to become an equal partner in the so-called fifth generation of computers (which of course it must), programming languages and the human interface will need to clear some high hurdles. Again, the solutions being sought turn to cerebral emulation, in this case the way that human beings understand language. The result would be natural or English-like languages that would allow a person to communicate with a computer much as he or she does with another person. In the discussion, the authors look at fourth- and fifth-level languages used in meeting the goals of AI; the higher-level languages aim to be non-procedural. Applications of LISP and Forth to natural language interfaces are described, as well as programs such as the Natural Link technology package, written in C.

  10. On Language Change: The Invisible Hand in Language.

    ERIC Educational Resources Information Center

    Keller, Rudi

    The nature of language change over time is examined, and an evolutionary theory of language is proposed. The text, intended for laymen, students, and experts alike, first addresses the reasons and mechanisms by which language changes, and attempts to identify a relationship between the essence of language, reasons for change, and the genesis of…

  12. Learning Language through Content: Learning Content through Language.

    ERIC Educational Resources Information Center

    Met, Myriam

    1991-01-01

    A definition and description of elementary school content-based foreign language instruction notes how it promotes natural language learning and higher-order thinking skills, and also addresses curriculum development, language objective definition, and specific applications in mathematics, science, reading and language arts, social studies, and

  13. Introducing a gender-neutral pronoun in a natural gender language: the influence of time on attitudes and behavior

    PubMed Central

    Gustafsson Sendén, Marie; Bäck, Emma A.; Lindqvist, Anna

    2015-01-01

    The implementation of gender-fair language is often associated with negative reactions and hostile attacks on people who propose a change. This was also the case in Sweden in 2012 when a third gender-neutral pronoun hen was proposed as an addition to the already existing Swedish pronouns for she (hon) and he (han). The pronoun hen can be used both generically, when gender is unknown or irrelevant, and as a transgender pronoun for people who categorize themselves outside the gender dichotomy. In this article we review the process from 2012 to 2015. No other language has so far added a third gender-neutral pronoun, existing in parallel with two gendered pronouns, that has actually reached the broader population of language users. This makes the situation in Sweden unique. We present data on attitudes toward hen during the past 4 years and analyze how time is associated with the attitudes in the process of introducing hen to the Swedish language. In 2012 the majority of the Swedish population was negative toward the word, but already in 2014 there was a significant shift to more positive attitudes. Time was one of the strongest predictors of attitudes, even when other relevant factors were controlled for. The actual use of the word also increased, although to a lesser extent than the attitudes shifted. We conclude that new words challenging the binary gender system evoke hostile and negative reactions, but also that attitudes can normalize rather quickly. We see this finding as very positive and hope it can motivate language amendments and initiatives for gender-fair language, although the first responses may be negative. PMID:26191016

  14. Gendered Language in Interactive Discourse

    ERIC Educational Resources Information Center

    Hussey, Karen A.; Katz, Albert N.; Leith, Scott A.

    2015-01-01

    Over two studies, we examined the nature of gendered language in interactive discourse. In the first study, we analyzed gendered language from a chat corpus to see whether tokens of gendered language proposed in the gender-as-culture hypothesis (Maltz and Borker in "Language and social identity." Cambridge University Press, Cambridge, pp…

  16. Language as a Liberal Art.

    ERIC Educational Resources Information Center

    Stein, Jack M.

    This paper examines language as a liberal art in light of other philosophical viewpoints on the nature of language as it relates to second language instruction. It is critical of an earlier mechanistic audio-lingual learning theory, translation approaches to language learning, vocabulary list-oriented courses, graduate…

  17. The Tao of Whole Language.

    ERIC Educational Resources Information Center

    Zola, Meguido

    1989-01-01

    Uses the philosophy of Taoism as a metaphor in describing the whole language approach to language arts instruction. The discussion covers the key principles that inform the whole language approach, the resulting holistic nature of language programs, and the role of the teacher in this approach. (16 references) (CLB)

  18. Cultural Perspectives Toward Language Learning

    ERIC Educational Resources Information Center

    Lin, Li-Li

    2008-01-01

    Cultural conflicts may arise from the use of inappropriate language, while appropriate linguistic-pragmatic competence can be fostered by exposure to varied, multicultural backgrounds. Culture and language are linked naturally, unconsciously, and closely in daily social life. Culture affects language and language affects culture through…

  19. Understanding Why Things Happen: Case-Studies of Pupils Using an Abstract Picture Language to Represent the Nature of Changes.

    ERIC Educational Resources Information Center

    Stylianidou, Fani; Boohan, Richard

    1998-01-01

    Six 12-year-old students were followed during an eight-month course using "Energy and Change" curricular materials, which introduce ideas related to the Second Law of Thermodynamics through an abstract picture language. Concludes that students had higher levels of generalization in their explanations of physical, chemical, and biological change.

  20. The Nature and Impact of Changes in Home Learning Environment on Development of Language and Academic Skills in Preschool Children

    ERIC Educational Resources Information Center

    Son, Seung-Hee; Morrison, Frederick J.

    2010-01-01

    In this study, we examined changes in the early home learning environment as children approached school entry and whether these changes predicted the development of children's language and academic skills. Findings from a national sample of the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development…

  1. Paying Attention to Attention Allocation in Second-Language Learning: Some Insights into the Nature of Linguistic Thresholds.

    ERIC Educational Resources Information Center

    Hawson, Anne

    1997-01-01

    Three threshold hypotheses proposed by Cummins (1976) and Diaz (1985) as explanations of data on the cognitive consequences of bilingualism are examined in depth and compared to one another. A neuroscientifically updated information-processing perspective on the interaction of second-language comprehension and visual-processing ability is

  2. Language-Dependent Pitch Encoding Advantage in the Brainstem Is Not Limited to Acceleration Rates that Occur in Natural Speech

    ERIC Educational Resources Information Center

    Krishnan, Ananthanarayan; Gandour, Jackson T.; Smalt, Christopher J.; Bidelman, Gavin M.

    2010-01-01

    Experience-dependent enhancement of neural encoding of pitch in the auditory brainstem has been observed for only specific portions of native pitch contours exhibiting high rates of pitch acceleration, irrespective of speech or nonspeech contexts. This experiment allows us to determine whether this language-dependent advantage transfers to

  3. The Nature and Impact of Changes in Home Learning Environment on Development of Language and Academic Skills in Preschool Children

    ERIC Educational Resources Information Center

    Son, Seung-Hee; Morrison, Frederick J.

    2010-01-01

    In this study, we examined changes in the early home learning environment as children approached school entry and whether these changes predicted the development of children's language and academic skills. Findings from a national sample of the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development

  4. Language-Dependent Pitch Encoding Advantage in the Brainstem Is Not Limited to Acceleration Rates that Occur in Natural Speech

    ERIC Educational Resources Information Center

    Krishnan, Ananthanarayan; Gandour, Jackson T.; Smalt, Christopher J.; Bidelman, Gavin M.

    2010-01-01

    Experience-dependent enhancement of neural encoding of pitch in the auditory brainstem has been observed for only specific portions of native pitch contours exhibiting high rates of pitch acceleration, irrespective of speech or nonspeech contexts. This experiment allows us to determine whether this language-dependent advantage transfers to…

  5. Social Network Development, Language Use, and Language Acquisition during Study Abroad: Arabic Language Learners' Perspectives

    ERIC Educational Resources Information Center

    Dewey, Dan P.; Belnap, R. Kirk; Hillstrom, Rebecca

    2013-01-01

    Language learners and educators have subscribed to the belief that those who go abroad will have many opportunities to use the target language and will naturally become proficient. They also assume that language learners will develop relationships with native speakers allowing them to use the language and become more fluent, an assumption…

  6. Individual biases, cultural evolution, and the statistical nature of language universals: the case of colour naming systems.

    PubMed

    Baronchelli, Andrea; Loreto, Vittorio; Puglisi, Andrea

    2015-01-01

    Language universals have long been attributed to an innate Universal Grammar. An alternative explanation states that linguistic universals emerged independently in every language in response to shared cognitive or perceptual biases. A computational model has recently shown how this could be the case, focusing on the paradigmatic example of the universal properties of colour naming patterns, and producing results in quantitative agreement with the experimental data. Here we investigate the role of an individual perceptual bias in the framework of the model. We study how, and to what extent, the structure of the bias influences the corresponding linguistic universal patterns. We show that the cultural history of a group of speakers introduces population-specific constraints that act against the pressure for uniformity arising from the individual bias, and we clarify the interplay between these two forces. PMID:26018391

  7. Individual Biases, Cultural Evolution, and the Statistical Nature of Language Universals: The Case of Colour Naming Systems

    PubMed Central

    Baronchelli, Andrea; Loreto, Vittorio; Puglisi, Andrea

    2015-01-01

    Language universals have long been attributed to an innate Universal Grammar. An alternative explanation states that linguistic universals emerged independently in every language in response to shared cognitive or perceptual biases. A computational model has recently shown how this could be the case, focusing on the paradigmatic example of the universal properties of colour naming patterns, and producing results in quantitative agreement with the experimental data. Here we investigate the role of an individual perceptual bias in the framework of the model. We study how, and to what extent, the structure of the bias influences the corresponding linguistic universal patterns. We show that the cultural history of a group of speakers introduces population-specific constraints that act against the pressure for uniformity arising from the individual bias, and we clarify the interplay between these two forces. PMID:26018391

  8. Simultaneous natural speech and AAC interventions for children with childhood apraxia of speech: lessons from a speech-language pathologist focus group.

    PubMed

    Oommen, Elizabeth R; McCarthy, John W

    2015-03-01

    In childhood apraxia of speech (CAS), children exhibit varying levels of speech intelligibility depending on the nature of errors in articulation and prosody. Augmentative and alternative communication (AAC) strategies are beneficial, and commonly adopted with children with CAS. This study focused on the decision-making process and strategies adopted by speech-language pathologists (SLPs) when simultaneously implementing interventions that focused on natural speech and AAC. Eight SLPs, with significant clinical experience in CAS and AAC interventions, participated in an online focus group. Thematic analysis revealed eight themes: key decision-making factors; treatment history and rationale; benefits; challenges; therapy strategies and activities; collaboration with team members; recommendations; and other comments. Results are discussed along with clinical implications and directions for future research. PMID:25664542

  9. Extracting noun phrases for all of MEDLINE.

    PubMed Central

    Bennett, N. A.; He, Q.; Powell, K.; Schatz, B. R.

    1999-01-01

    A natural language parser that could extract noun phrases for all medical texts would be of great utility in analyzing content for information retrieval. We discuss the extraction of noun phrases from MEDLINE, using a general parser not tuned specifically for any medical domain. The noun phrase extractor is made up of three modules: tokenization; part-of-speech tagging; noun phrase identification. Using our program, we extracted noun phrases from the entire MEDLINE collection, encompassing 9.3 million abstracts. Over 270 million noun phrases were generated, of which 45 million were unique. The quality of these phrases was evaluated by examining all phrases from a sample collection of abstracts. The precision and recall of the phrases from our general parser compared favorably with those from three other parsers we had previously evaluated. We are continuing to improve our parser and evaluate our claim that a generic parser can effectively extract all the different phrases across the entire medical literature. PMID:10566444
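
    As a rough illustration of the three-module pipeline described above (tokenization, part-of-speech tagging, noun phrase identification), the following sketch uses NLTK's general-purpose tokenizer, tagger, and a simple chunk grammar; it is not the authors' MEDLINE parser, and the chunk grammar is an assumption chosen for brevity.

        # Minimal noun-phrase extraction sketch: tokenize, POS-tag, then chunk NPs.
        # Requires the NLTK data packages "punkt" and "averaged_perceptron_tagger".
        import nltk

        def extract_noun_phrases(text):
            tokens = nltk.word_tokenize(text)                  # module 1: tokenization
            tagged = nltk.pos_tag(tokens)                      # module 2: POS tagging
            grammar = "NP: {<DT>?<JJ.*>*<NN.*>+}"              # module 3: NP identification
            tree = nltk.RegexpParser(grammar).parse(tagged)
            return [" ".join(word for word, _ in subtree.leaves())
                    for subtree in tree.subtrees(filter=lambda t: t.label() == "NP")]

        print(extract_noun_phrases("Aspirin reduces the risk of recurrent myocardial infarction."))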

  10. Partial Orders and Measures for Language Preferences.

    ERIC Educational Resources Information Center

    Egghe, Leo; Rousseau, Ronald

    2000-01-01

    Relative own-language preference (ROLP) depends on the publication share of the language and the self-citing rate. ROLP and the openness of one language with respect to another can be represented by a partial order. Logarithmic dependence on the language share(s) seems a natural additional requirement for measuring language preferences. Gives…

  11. Neural network processing of natural language: II. Towards a unified model of corticostriatal function in learning sentence comprehension and non-linguistic sequencing.

    PubMed

    Dominey, Peter Ford; Inui, Toshio; Hoen, Michel

    2009-01-01

    A central issue in cognitive neuroscience today concerns how distributed neural networks in the brain that are used in language learning and processing can be involved in non-linguistic cognitive sequence learning. This issue is informed by a wealth of functional neurophysiology studies of sentence comprehension, along with a number of recent studies that examined the brain processes involved in learning non-linguistic sequences, or artificial grammar learning (AGL). The current research attempts to reconcile these data with several current neurophysiologically based models of sentence processing, through the specification of a neural network model whose architecture is constrained by the known cortico-striato-thalamo-cortical (CSTC) neuroanatomy of the human language system. The challenge is to develop simulation models that take into account constraints both from neuroanatomical connectivity and from functional imaging data, and that can actually learn and perform the same kind of language and artificial syntax tasks. In our proposed model, structural cues encoded in a recurrent cortical network in BA47 activate a CSTC circuit to modulate the flow of lexical semantic information from BA45 to an integrated representation of meaning at the sentence level in BA44/6. During language acquisition, corticostriatal plasticity is employed to allow closed class structure to drive thematic role assignment. From the AGL perspective, repetitive internal structure in the AGL strings is encoded in BA47, and activates the CSTC circuit to predict the next element in the sequence. Simulation results from Caplan's [Caplan, D., Baker, C., & Dehaut, F. (1985). Syntactic determinants of sentence comprehension in aphasia. Cognition, 21, 117-175] test of syntactic comprehension, and from Gomez and Schvaneveldt's [Gomez, R. L., & Schvaneveldt, R. W. (1994). What is learned from artificial grammars? Transfer tests of simple association. Journal of Experimental Psychology: Learning, Memory and Cognition, 20, 396-410] artificial grammar learning experiments are presented. These results are discussed in the context of a brain architecture for learning grammatical structure for multiple natural languages, and non-linguistic sequences. PMID:18835637
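
    To make the next-element prediction component concrete, the sketch below trains a toy Elman-style recurrent network (plain numpy) on a handful of invented artificial-grammar strings; it is only loosely analogous to the CSTC model described above, and the grammar, network size, and training settings are arbitrary assumptions.

        # Toy next-element prediction over strings from a small artificial grammar,
        # using an Elman-style recurrent network; not an implementation of the CSTC model.
        import numpy as np

        rng = np.random.default_rng(0)
        symbols = ["M", "T", "V", "R", "X", "#"]               # '#' marks the end of a string
        idx = {s: i for i, s in enumerate(symbols)}
        corpus = ["MTVX#", "MTTVX#", "MVRX#", "MVRRX#", "MTVRX#"]   # invented legal strings
        n, h, lr = len(symbols), 16, 0.1
        Wxh = rng.normal(0, 0.1, (h, n))
        Whh = rng.normal(0, 0.1, (h, h))
        Why = rng.normal(0, 0.1, (n, h))

        def one_hot(i):
            v = np.zeros(n); v[i] = 1.0; return v

        for epoch in range(300):
            for s in corpus:
                hprev = np.zeros(h)
                for cur, nxt in zip(s, s[1:]):
                    x = one_hot(idx[cur])
                    hstate = np.tanh(Wxh @ x + Whh @ hprev)
                    logits = Why @ hstate
                    p = np.exp(logits - logits.max()); p /= p.sum()
                    dlogits = p - one_hot(idx[nxt])            # cross-entropy gradient
                    dh = (Why.T @ dlogits) * (1 - hstate ** 2) # one-step (truncated) backprop
                    Why -= lr * np.outer(dlogits, hstate)
                    Whh -= lr * np.outer(dh, hprev)
                    Wxh -= lr * np.outer(dh, x)
                    hprev = hstate

        # After training, the network should assign higher probability to grammar-consistent
        # next symbols, e.g. after "MT" it should prefer 'T' or 'V' over 'R' or '#'.
        hprev = np.zeros(h)
        for c in "MT":
            hprev = np.tanh(Wxh @ one_hot(idx[c]) + Whh @ hprev)
        probs = np.exp(Why @ hprev); probs /= probs.sum()
        print({s: round(float(probs[idx[s]]), 2) for s in symbols})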

  12. An English language interface for constrained domains

    NASA Technical Reports Server (NTRS)

    Page, Brenda J.

    1989-01-01

    The Multi-Satellite Operations Control Center (MSOCC) Jargon Interpreter (MJI) demonstrates an English language interface for a constrained domain. A constrained domain is defined as one with a small and well delineated set of actions and objects. The set of actions chosen for the MJI is from the domain of MSOCC Applications Executive (MAE) Systems Test and Operations Language (STOL) directives and contains directives for signing a cathode ray tube (CRT) on or off, calling up or clearing a display page, starting or stopping a procedure, and controlling history recording. The set of objects chosen consists of CRTs, display pages, STOL procedures, and history files. Translation from English sentences to STOL directives is done in two phases. In the first phase, an augmented transition net (ATN) parser and dictionary are used for determining grammatically correct parsings of input sentences. In the second phase, grammatically typed sentences are submitted to a forward-chaining rule-based system for interpretation and translation into equivalent MAE STOL directives. Tests of the MJI show that it is able to translate individual clearly stated sentences into the subset of directives selected for the prototype. This approach to an English language interface may be used for similarly constrained situations by modifying the MJI's dictionary and rules to reflect the change of domain.
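
    The two-phase idea (parse the English request first, then map the parsed form to a directive by rules) can be sketched in a few lines; the grammar, rule table, and directive strings below are invented for illustration and are not actual STOL syntax or the MJI's ATN grammar.

        # Minimal two-phase sketch of an English-to-directive translator (hypothetical names).
        import re

        # Phase 1: a crude "parse" extracting an action verb and an object phrase.
        def parse(sentence):
            m = re.match(r"(?i)\s*(?:please\s+)?(\w+)\s+(?:the\s+)?(.+?)\s*\.?\s*$", sentence)
            if not m:
                raise ValueError("could not parse: " + sentence)
            return m.group(1).lower(), m.group(2).lower()

        # Phase 2: forward chaining reduced here to a rule table keyed on (verb, object type).
        RULES = {
            ("clear", "display page"): "PAGE CLEAR",
            ("start", "procedure"):    "PROC START",
            ("stop", "procedure"):     "PROC STOP",
        }

        def translate(sentence):
            verb, obj = parse(sentence)
            for (rule_verb, obj_type), directive in RULES.items():
                if verb == rule_verb and obj_type in obj:
                    return directive
            return None

        print(translate("Please clear the display page."))   # -> PAGE CLEAR
        print(translate("Start the procedure."))              # -> PROC START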

  13. A case of "order insensitivity"? Natural and artificial language processing in a man with primary progressive aphasia.

    PubMed

    Zimmerer, Vitor C; Varley, Rosemary A

    2015-08-01

    Processing of linear word order (linear configuration) is important for virtually all languages and essential to languages such as English which have little functional morphology. Damage to systems underpinning configurational processing may specifically affect word-order reliant sentence structures. We explore order processing in WR, a man with primary progressive aphasia (PPA). In a previous report, we showed how WR showed impaired processing of actives, which rely strongly on word order, but not passives where functional morphology signals thematic roles. Using the artificial grammar learning (AGL) paradigm, we examined WR's ability to process order in non-verbal, visual sequences and compared his profile to that of healthy controls, and aphasic participants with and without severe syntactic disorder. Results suggested that WR, like some other patients with severe syntactic impairment, was unable to detect linear configurational structure. The data are consistent with the notion that disruption of possibly domain-general linearization systems differentially affects processing of active and passive sentence structures. Further research is needed to test this account, and we suggest hypotheses for future studies. PMID:26103599

  14. Modeling Coevolution between Language and Memory Capacity during Language Origin

    PubMed Central

    Gong, Tao; Shuai, Lan

    2015-01-01

    Memory is essential to many cognitive tasks including language. Apart from empirical studies of memory effects on language acquisition and use, there have been few evolutionary explorations of whether a high level of memory capacity is a prerequisite for language and whether language origin could influence memory capacity. In line with evolutionary theories that natural selection refined language-related cognitive abilities, we advocated a coevolution scenario between language and memory capacity, which incorporated the genetic transmission of individual memory capacity, cultural transmission of idiolects, and natural and cultural selection on individual reproduction and language teaching. To illustrate the coevolution dynamics, we adopted a multi-agent computational model simulating the emergence of lexical items and simple syntax through iterated communications. Simulations showed that, along with the origin of a communal language, an initially low memory capacity for acquired linguistic knowledge was boosted; that this coherent increase in linguistic understandability and memory capacity reflected a language-memory coevolution; and that the coevolution stopped once memory capacities became sufficient for language communication. Statistical analyses revealed that the coevolution was realized mainly by natural selection based on individual communicative success in cultural transmissions. This work elaborated the biology-culture parallelism of language evolution, demonstrated the driving force of culturally-constituted factors for natural selection of individual cognitive abilities, and suggested that the degree difference in language-related cognitive abilities between humans and nonhuman animals could result from a coevolution with language. PMID:26544876

  15. Learning to parse database queries using inductive logic programming

    SciTech Connect

    Zelle, J.M.; Mooney, R.J.

    1996-12-31

    This paper presents recent work using the CHILL parser acquisition system to automate the construction of a natural-language interface for database queries. CHILL treats parser acquisition as the learning of search-control rules within a logic program representing a shift-reduce parser and uses techniques from Inductive Logic Programming to learn relational control knowledge. Starting with a general framework for constructing a suitable logical form, CHILL is able to train on a corpus comprising sentences paired with database queries and induce parsers that map subsequent sentences directly into executable queries. Experimental results with a complete database-query application for U.S. geography show that CHILL is able to learn parsers that outperform a preexisting, hand-crafted counterpart. These results demonstrate the ability of a corpus-based system to produce more than purely syntactic representations. They also provide direct evidence of the utility of an empirical approach at the level of a complete natural language application.
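
    The shift-reduce skeleton that CHILL's learned control rules operate over looks roughly like the sketch below; here the control policy is hand-written for a toy grammar rather than induced by Inductive Logic Programming, and the categories and rules are assumptions for illustration only.

        # Skeleton shift-reduce parser whose shift/reduce choices are delegated to a
        # control policy; CHILL induces such control knowledge, here it is hand-coded.
        TOY_GRAMMAR = [                       # reduce rules: RHS categories -> LHS
            (("Det", "N"), "NP"),
            (("V", "NP"), "VP"),
            (("NP", "VP"), "S"),
        ]

        def policy(stack, buffer):
            """Return a grammar rule to reduce with, or None to shift."""
            for rhs, lhs in TOY_GRAMMAR:
                if tuple(cat for cat, _ in stack[-len(rhs):]) == rhs:
                    return rhs, lhs
            return None

        def shift_reduce_parse(tagged_words):
            stack, buffer = [], list(tagged_words)       # items are (category, content) pairs
            while buffer or len(stack) > 1:
                action = policy(stack, buffer)
                if action:                                # reduce
                    rhs, lhs = action
                    children = stack[len(stack) - len(rhs):]
                    stack = stack[:len(stack) - len(rhs)] + [(lhs, children)]
                elif buffer:                              # shift
                    stack.append(buffer.pop(0))
                else:
                    raise ValueError("no action available")
            return stack[0]

        print(shift_reduce_parse([("Det", "the"), ("N", "cat"), ("V", "saw"),
                                  ("Det", "the"), ("N", "dog")]))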

  16. Language Program Evaluation

    ERIC Educational Resources Information Center

    Norris, John M.

    2016-01-01

    Language program evaluation is a pragmatic mode of inquiry that illuminates the complex nature of language-related interventions of various kinds, the factors that foster or constrain them, and the consequences that ensue. Program evaluation enables a variety of evidence-based decisions and actions, from designing programs and implementing…

  17. A Collective Case Study of the Nature of Form-Focused Instruction among Secondary English as a Second Language Teachers

    ERIC Educational Resources Information Center

    Budak, Sevda

    2013-01-01

    The nature of teaching expertise for form-focused instruction (FFI) in secondary schools has received little research attention. FFI research carried out so far has devoted time to exploring the classification of one form of FFI or the effectiveness of one form of FFI over another. Given that exploring teaching expertise for FFI in

  18. Formulaic Language and the Lexicon.

    ERIC Educational Resources Information Center

    Wray, Alison

    This book explores the nature and purposes of formulaic language, examining patterns across research from the fields of discourse analysis, first language acquisition, language pathology, and applied linguistics. There are 14 chapters in 6 parts. Part 1, "What Formulaic Sequences Are," includes (1) "The Whole and the Parts," "Detecting

  19. Principles of Instructed Language Learning

    ERIC Educational Resources Information Center

    Ellis, Rod

    2005-01-01

    This article represents an attempt to draw together findings from a range of second language acquisition studies in order to formulate a set of general principles for language pedagogy. These principles address such issues as the nature of second language (L2) competence (as formulaic and rule-based knowledge), the contributions of both focus on

  1. Conceptual Complexity and Apparent Contradictions in Mathematics Language

    ERIC Educational Resources Information Center

    Gough, John

    2007-01-01

    Mathematics is like a language, although technically it is not a natural or informal human language, but a formal, that is, artificially constructed language. Importantly, educators use their natural everyday language to teach the formal language of mathematics. At times, however, instructors encounter problems when the technical words they use,

  2. Evaluation of a Command-line Parser-based Order Entry Pathway for the Department of Veterans Affairs Electronic Patient Record

    PubMed Central

    Lovis, Christian; Chapko, Michael K.; Martin, Diane P.; Payne, Thomas H.; Baud, Robert H.; Hoey, Patty J.; Fihn, Stephan D.

    2001-01-01

    Objective: To improve and simplify electronic order entry in an existing electronic patient record, the authors developed an alternative system for entering orders, which is based on a command-line interface using robust and simple natural-language techniques. Design: The authors conducted a randomized evaluation of the new entry pathway, measuring time to complete a standard set of orders and users' satisfaction as measured by questionnaire. A group of 16 physician volunteers from the staff of the Department of Veterans Affairs Puget Sound Health Care System–Seattle Division participated in the evaluation. Results: Thirteen of the 16 physicians (81%) were able to enter medical orders more quickly using the natural-language–based entry system than the standard graphical user interface that uses menus and dialogs (mean time spared, 16.06 ± 4.52 minutes; P=0.029). Compared with the graphical user interface, the command-line–based pathway was perceived as easier to learn (P<0.01), was considered easier to use and faster (P<0.01), and was rated better overall (P<0.05). Conclusion: Physicians found the command-line interface easier to learn and faster to use than the usual menu-driven system. The major advantage of the system is that it combines an intuitive graphical user interface with the power and speed of a natural-language analyzer. PMID:11522769
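
    A toy example of the kind of lightweight natural-language technique such an order-entry pathway can rely on is sketched below: a single pattern maps a free-text medication order to structured fields. The field names, units, and vocabulary are invented for the sketch and are not taken from the evaluated system.

        # Hypothetical free-text medication order -> structured fields.
        import re

        ORDER_PATTERN = re.compile(
            r"(?P<drug>[a-z]+)\s+(?P<dose>\d+(?:\.\d+)?)\s*(?P<unit>mg|mcg|g)"
            r"\s+(?P<route>po|iv|im)\s+(?P<freq>daily|bid|tid|qid)", re.IGNORECASE)

        def parse_order(text):
            """Return structured order fields, or None if the text does not match."""
            m = ORDER_PATTERN.search(text)
            return m.groupdict() if m else None

        # -> {'drug': 'lisinopril', 'dose': '10', 'unit': 'mg', 'route': 'po', 'freq': 'daily'}
        print(parse_order("lisinopril 10 mg po daily"))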

  3. Lampooning Language.

    ERIC Educational Resources Information Center

    Gillespie, Tim

    1982-01-01

    Uses trademarks that are calculated misspellings, bumper sticker slogans, the strained and pretentious language of Howard Cosell, and governmental jargon to illustrate how to attune students to the magic and power of language, while poking fun at language abuse. (RL)

  4. Language Death or Language Suicide?

    ERIC Educational Resources Information Center

    Denison, Norman

    1977-01-01

    A discussion of disappearing and no-longer-used languages in terms of the anthropomorphic metaphors "language death" and "language suicide." Three stages in the disappearance of several specific languages are described. Ultimately, the direct cause of "language suicide" is not the disappearance of rules but the disappearance of speakers; parents stop teaching the

  5. SOL - SIZING AND OPTIMIZATION LANGUAGE COMPILER

    NASA Technical Reports Server (NTRS)

    Scotti, S. J.

    1994-01-01

    SOL is a computer language which is geared to solving design problems. SOL includes the mathematical modeling and logical capabilities of a computer language like FORTRAN but also includes the additional power of non-linear mathematical programming methods (i.e. numerical optimization) at the language level (as opposed to the subroutine level). The language-level use of optimization has several advantages over the traditional, subroutine-calling method of using an optimizer: first, the optimization problem is described in a concise and clear manner which closely parallels the mathematical description of optimization; second, a seamless interface is automatically established between the optimizer subroutines and the mathematical model of the system being optimized; third, the results of an optimization (objective, design variables, constraints, termination criteria, and some or all of the optimization history) are output in a form directly related to the optimization description; and finally, automatic error checking and recovery from an ill-defined system model or optimization description is facilitated by the language-level specification of the optimization problem. Thus, SOL enables rapid generation of models and solutions for optimum design problems with greater confidence that the problem is posed correctly. The SOL compiler takes SOL-language statements and generates the equivalent FORTRAN code and system calls. Because of this approach, the modeling capabilities of SOL are extended by the ability to incorporate existing FORTRAN code into a SOL program. In addition, SOL has a powerful MACRO capability. The MACRO capability of the SOL compiler effectively gives the user the ability to extend the SOL language and can be used to develop easy-to-use shorthand methods of generating complex models and solution strategies. The SOL compiler provides syntactic and semantic error-checking, error recovery, and detailed reports containing cross-references to show where each variable was used. The listings summarize all optimizations, listing the objective functions, design variables, and constraints. The compiler offers error-checking specific to optimization problems, so that simple mistakes will not cost hours of debugging time. The optimization engine used by and included with the SOL compiler is a version of Vanderplaats' ADS system (Version 1.1) modified specifically to work with the SOL compiler. SOL allows the use of the over 100 ADS optimization choices such as Sequential Quadratic Programming, Modified Feasible Directions, interior and exterior penalty function and variable metric methods. Default choices of the many control parameters of ADS are made for the user; however, the user can override any of the ADS control parameters desired for each individual optimization. The SOL language and compiler were developed with an advanced compiler-generation system to ensure correctness and simplify program maintenance. Thus, SOL's syntax was defined precisely by an LALR(1) grammar and the SOL compiler's parser was generated automatically from the LALR(1) grammar with a parser-generator. Hence, unlike ad hoc, manually coded interfaces, the SOL compiler's lexical analysis ensures that the SOL compiler recognizes all legal SOL programs, can recover from and correct for many errors, and report the location of errors to the user. This version of the SOL compiler has been implemented on VAX/VMS computer systems and requires 204 KB of virtual memory to execute. Since the SOL compiler produces FORTRAN code, it requires the VAX FORTRAN compiler to produce an executable program. The SOL compiler consists of 13,000 lines of Pascal code. It was developed in 1986 and last updated in 1988. The ADS and other utility subroutines amount to 14,000 lines of FORTRAN code and were also updated in 1988.
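
    The language-level idea (a declarative problem description that is translated into optimizer calls behind the scenes) can be loosely illustrated in Python, with scipy.optimize standing in for the ADS engine; this is an analogy only and does not reproduce SOL syntax or the FORTRAN code it generates.

        # A declarative problem spec "compiled" into a call to an optimizer subroutine.
        from scipy.optimize import minimize

        problem = {
            "objective":   lambda x: (x[0] - 3) ** 2 + (x[1] + 1) ** 2,
            "variables":   [0.0, 0.0],                        # initial design point
            "constraints": [lambda x: x[0] + x[1] - 1],       # constraints in g(x) >= 0 form
        }

        def solve(spec):
            cons = [{"type": "ineq", "fun": g} for g in spec["constraints"]]
            return minimize(spec["objective"], spec["variables"], constraints=cons)

        result = solve(problem)
        print(result.x, result.fun)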

  6. Language Endangerment and Language Revival.

    ERIC Educational Resources Information Center

    Muhlhausler, Peter

    2003-01-01

    Reviews and discusses the following books: "Language Death," by David Crystal; "The Green Book of Language Revitalization in Practice," by Leanne Hinton; and "Vanishing Voices of the World's Languages," by David Nettle. (Author/VWL)

  7. Teaching Additional Languages. Educational Practices Series 6.

    ERIC Educational Resources Information Center

    Judd, Elliot L.; Tan, Lihua; Walberg, Herbert J.

    This booklet describes key principles of and research on teaching additional languages. The 10 chapters focus on the following: (1) "Comprehensible Input" (learners need exposure to meaningful, understandable language); (2) "Language Opportunities" (classroom activities should let students use natural and meaningful language with their

  8. On Teaching Strategies in Second Language Acquisition

    ERIC Educational Resources Information Center

    Yang, Hong

    2008-01-01

    How to acquire a second language is a question of obvious importance to teachers and language learners, and how to teach a second language has also become a matter of concern to linguists interested in the nature of primary linguistic data. Starting with the developmental stages of second language acquisition and Stephen Krashen's theory, this…

  9. Language Transfer in Language Learning. Language Acquisition & Language Disorders 5.

    ERIC Educational Resources Information Center

    Gass, Susan M., Ed.; Selinker, Larry, Ed.

    The study of native language influence in Second Language Acquisition has undergone significant changes over the past few decades. This book, which includes 12 chapters by distinguished researchers in the field of second language acquisition, traces the conceptual history of language transfer from its early role within a Contrastive Analysis

  10. Three design principles of language: the search for parsimony in redundancy.

    PubMed

    Beekhuizen, Barend; Bod, Rens; Zuidema, Willem

    2013-09-01

    In this paper we present three design principles of language (experience, heterogeneity, and redundancy) and review recent developments in a family of models incorporating them, namely Data-Oriented Parsing/Unsupervised Data-Oriented Parsing. Although the idea of some form of redundant storage has become part and parcel of parsing technologies and usage-based linguistic approaches alike, the question of how much of it is cognitively realistic and/or computationally efficient remains open. We argue that a segmentation-based approach (Bayesian Model Merging) combined with an all-subtrees approach reduces the number of rules needed to achieve optimal performance, thus making the parser more efficient. At the same time, starting from unsegmented wholes comes closer to the acquisitional situation of a language learner, and thus adds to the cognitive plausibility of the model. PMID:24416957
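
    A small concrete handle on the all-subtrees idea is to count, for a toy parse tree, how many fragments a Data-Oriented Parsing grammar would contain (each node either stops at a frontier nonterminal or continues into one of its child's fragments); the tree below is invented and the count follows the standard recursion, not the authors' estimator.

        # Count DOP "all-subtrees" fragments of a toy parse tree with NLTK.
        from nltk import Tree

        def fragments_rooted_at(node):
            """Fragments rooted at this node: for each child, either cut it off
            (leaving a frontier nonterminal) or attach one of the child's fragments."""
            if not isinstance(node, Tree):
                return 0                          # lexical leaf: no choice point
            count = 1
            for child in node:
                count *= 1 + fragments_rooted_at(child)
            return count

        tree = Tree.fromstring(
            "(S (NP (Det the) (N cat)) (VP (V saw) (NP (Det the) (N dog))))")
        total = sum(fragments_rooted_at(node) for node in tree.subtrees())
        print(total)                              # 78 fragments for this small tree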

  11. An AdaBoost Using a Weak-Learner Generating Several Weak-Hypotheses for Large Training Data of Natural Language Processing

    NASA Astrophysics Data System (ADS)

    Iwakura, Tomoya; Okamoto, Seishi; Asakawa, Kazuo

    AdaBoost is a method for creating a final hypothesis by repeatedly generating a weak hypothesis in each training iteration with a given weak learner. AdaBoost-based algorithms have been successfully applied to several tasks, such as Natural Language Processing (NLP) and OCR. However, learning from training data consisting of a large number of samples and features requires a long training time. We propose a fast AdaBoost-based algorithm for learning rules represented by combinations of features. Our algorithm constructs a final hypothesis by learning several weak hypotheses at each iteration. We assign a confidence-rated value to each weak hypothesis while ensuring a reduction in the theoretical upper bound of the training error of AdaBoost. We evaluate our methods on English POS tagging and text chunking. The experimental results show that our algorithm trains about 25 times faster than an AdaBoost-based learner and about 50 times faster than Support Vector Machines with a polynomial kernel on average, while maintaining state-of-the-art accuracy.
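
    For readers unfamiliar with the baseline being accelerated, the sketch below is a minimal discrete AdaBoost with one-feature threshold stumps in plain numpy; it shows the per-iteration weak hypothesis, its confidence weight, and the re-weighting step, but it does not implement the paper's variant that learns several weak hypotheses per iteration.

        # Minimal discrete AdaBoost with decision stumps (illustrative only).
        import numpy as np

        def train_adaboost(X, y, n_rounds=20):
            """X: (n_samples, n_features); y: labels in {-1, +1}."""
            n, d = X.shape
            w = np.full(n, 1.0 / n)                       # sample weights
            ensemble = []                                  # (feature, threshold, polarity, alpha)
            for _ in range(n_rounds):
                best = None
                for j in range(d):                         # exhaustive stump search
                    for thr in np.unique(X[:, j]):
                        for pol in (1, -1):
                            pred = pol * np.where(X[:, j] >= thr, 1, -1)
                            err = w[pred != y].sum()
                            if best is None or err < best[0]:
                                best = (err, j, thr, pol, pred)
                err, j, thr, pol, pred = best
                err = min(max(err, 1e-10), 1 - 1e-10)
                alpha = 0.5 * np.log((1 - err) / err)      # confidence of this weak hypothesis
                w *= np.exp(-alpha * y * pred)             # up-weight the mistakes
                w /= w.sum()
                ensemble.append((j, thr, pol, alpha))
            return ensemble

        def predict(ensemble, X):
            score = sum(alpha * pol * np.where(X[:, j] >= thr, 1, -1)
                        for j, thr, pol, alpha in ensemble)
            return np.sign(score)

        # Tiny sanity check on a separable toy set.
        X = np.array([[0.0], [1.0], [2.0], [3.0]])
        y = np.array([-1, -1, 1, 1])
        model = train_adaboost(X, y, n_rounds=5)
        print(predict(model, X))                           # -> [-1. -1.  1.  1.]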

  12. Facilitating Surveillance of Pulmonary Invasive Mold Diseases in Patients with Haematological Malignancies by Screening Computed Tomography Reports Using Natural Language Processing

    PubMed Central

    Ananda-Rajah, Michelle R.; Martinez, David; Slavin, Monica A.; Cavedon, Lawrence; Dooley, Michael; Cheng, Allen; Thursky, Karin A.

    2014-01-01

    Purpose Prospective surveillance of invasive mold diseases (IMDs) in haematology patients should be standard of care but is hampered by the absence of a reliable laboratory prompt and the difficulty of manual surveillance. We used a high throughput technology, natural language processing (NLP), to develop a classifier based on machine learning techniques to screen computed tomography (CT) reports supportive of IMDs. Patients and Methods We conducted a retrospective case-control study of CT reports from the clinical encounter and up to 12 weeks after, from a random subset of 79 of 270 case patients with 33 probable/proven IMDs by international definitions, and 68 of 257 uninfected-control patients identified from 3 tertiary haematology centres. The classifier was trained and tested on a reference standard of 449 physician annotated reports including a development subset (n = 366), from a total of 1880 reports, using 10-fold cross validation, comparing binary and probabilistic predictions to the reference standard to generate sensitivity, specificity and area under the receiver-operating-curve (ROC). Results For the development subset, sensitivity/specificity was 91% (95%CI 86% to 94%)/79% (95%CI 71% to 84%) and ROC area was 0.92 (95%CI 89% to 94%). Of 25 (5.6%) missed notifications, only 4 (0.9%) reports were regarded as clinically significant. Conclusion CT reports are a readily available and timely resource that may be exploited by NLP to facilitate continuous prospective IMD surveillance with translational benefits beyond surveillance alone. PMID:25250675
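
    A generic version of the evaluation setup (text features, a probabilistic classifier, 10-fold cross-validation scored by ROC AUC) can be sketched with scikit-learn as below; the report snippets and labels are invented placeholders, and the authors' actual feature engineering and learner are not reproduced.

        # Baseline text classification with 10-fold cross-validation and ROC AUC.
        from sklearn.pipeline import make_pipeline
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import StratifiedKFold, cross_val_score

        # Hypothetical annotated reports: 1 = supportive of invasive mold disease, 0 = not.
        reports = ["nodule with surrounding ground-glass halo in the right upper lobe",
                   "clear lungs, no focal consolidation",
                   "cavitating lesion with air-crescent sign",
                   "stable appearances, no new pulmonary abnormality"] * 25
        labels  = [1, 0, 1, 0] * 25

        clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                            LogisticRegression(max_iter=1000))
        cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
        scores = cross_val_score(clf, reports, labels, cv=cv, scoring="roc_auc")
        print("mean ROC AUC: %.2f" % scores.mean())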

  13. Nature

    NASA Astrophysics Data System (ADS)

    Heinhorst, Sabine; Cannon, Gordon

    1997-01-01

    The fact that two of the original articles by this year's Nobel laureates were published in Nature bears witness to the pivotal role of this journal in documenting pioneering discoveries in all areas of science. The prize for Physiology or Medicine was awarded to immunologists Peter C. Doherty (University of Tennessee) and Rolf M. Zinkernagel (University of Zurich, Switzerland), honoring work that, in the 1970s, laid the foundation for our current understanding of the way in which our immune system differentiates between healthy cells and virus-infected ones that are targeted for destruction (p 465 in the October 10 issue of vol. 383). Three researchers share the Chemistry award for their discovery of C60 buckminsterfullerenes. The work by Robert Curl, Richard Smalley (both at Rice University), and Harry Kroto (University of Sussex, UK) has led to a burst of new approaches to materials development and in carbon chemistry (p 561 of the October 17 issue of vol. 383). This year's Nobel prize in physics went to three U.S. researchers, Douglas Osheroff (Stanford University) and David M. Lee and Robert C. Richardson (Cornell University), who were honored for their work on superfluidity, a frictionless liquid state, of supercooled 3He (p 562 of the October 17 issue of vol. 383).

  14. Language Learning and Language Utilization.

    ERIC Educational Resources Information Center

    Lambert, Richard D.

    The benefits of learning a foreign language and arguments in support of language requirements in the college curriculum are discussed. The arguments that learning a foreign language confers indirect benefits not inherent in the language skill itself prove unconvincing and only serve to divert one's attention from…

  15. Language evolution and human-computer interaction

    NASA Technical Reports Server (NTRS)

    Grudin, Jonathan; Norman, Donald A.

    1991-01-01

    Many of the issues that confront designers of interactive computer systems also appear in natural language evolution. Natural languages and human-computer interfaces share as their primary mission the support of extended 'dialogues' between responsive entities. Because in each case one participant is a human being, some of the pressures operating on natural languages, causing them to evolve in order to better support such dialogue, also operate on human-computer 'languages' or interfaces. This does not necessarily push interfaces in the direction of natural language - since one entity in this dialogue is not a human, this is not to be expected. Nonetheless, by discerning where the pressures that guide natural language evolution also appear in human-computer interaction, we can contribute to the design of computer systems and obtain a new perspective on natural languages.

  16. A Case of Language Revitalisation in "Settled" Australia.

    ERIC Educational Resources Information Center

    Walsh, Michael

    2001-01-01

    Presents a case of language revitalisation in "settled" Australia, considers the nature of the language ecology in indigenous Australia, and advances some of the reasons for the success of this case of language revitalization. (Author/VWL)

  17. Is DNA a language?

    PubMed

    Tsonis, A A; Elsner, J B; Tsonis, P A

    1997-01-01

    DNA sequences usually involve local construction rules that affect different scales. As such, their "dictionary" may not follow Zipf's law (a power law), which every natural language follows. Indeed, analysis of many DNA sequences suggests that no linguistic connections to DNA exist and that, even though it has structure, DNA is not a language. Computer simulations and a biological approach to this problem further support these results. PMID:9039397
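
    The Zipf test referred to above amounts to fitting the slope of the rank-frequency curve in log-log space (natural-language text typically gives a slope near -1); the snippet below is illustrative only, and meaningful results require far longer sequences than these toy inputs.

        # Rank-frequency slope in log-log space as a rough Zipf check.
        from collections import Counter
        import numpy as np

        def zipf_slope(tokens):
            freqs = np.array(sorted(Counter(tokens).values(), reverse=True), dtype=float)
            ranks = np.arange(1, len(freqs) + 1, dtype=float)
            slope, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)
            return slope

        text = "the quick brown fox jumps over the lazy dog the fox the dog".split()
        # For DNA one might instead count overlapping k-mers, e.g. 3-letter "words":
        dna = "ACGTACGGTACCGTAACGT"
        kmers = [dna[i:i + 3] for i in range(len(dna) - 2)]
        print(zipf_slope(text), zipf_slope(kmers))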

  18. Using Natural Language Processing to Enable In-depth Analysis of Clinical Messages Posted to an Internet Mailing List: A Feasibility Study

    PubMed Central

    Kreinacke, Marcos; Spallek, Heiko; Song, Mei; O'Donnell, Jean A

    2011-01-01

    Background An Internet mailing list may be characterized as a virtual community of practice that serves as an information hub with easy access to expert advice and opportunities for social networking. We are interested in mining messages posted to a list for dental practitioners to identify clinical topics. Once we understand the topical domain, we can study dentists’ real information needs and the nature of their shared expertise, and can avoid delivering useless content at the point of care in future informatics applications. However, a necessary first step involves developing procedures to identify messages that are worth studying given our resources for planned, labor-intensive research. Objectives The primary objective of this study was to develop a workflow for finding a manageable number of clinically relevant messages from a much larger corpus of messages posted to an Internet mailing list, and to demonstrate the potential usefulness of our procedures for investigators by retrieving a set of messages tailored to the research question of a qualitative research team. Methods We mined 14,576 messages posted to an Internet mailing list from April 2008 to May 2009. The list has about 450 subscribers, mostly dentists from North America interested in clinical practice. After extensive preprocessing, we used the Natural Language Toolkit to identify clinical phrases and keywords in the messages. Two academic dentists classified collocated phrases in an iterative, consensus-based process to describe the topics discussed by dental practitioners who subscribe to the list. We then consulted with qualitative researchers regarding their research question to develop a plan for targeted retrieval. We used selected phrases and keywords as search strings to identify clinically relevant messages and delivered the messages in a reusable database. Results About half of the subscribers (245/450, 54.4%) posted messages. Natural language processing (NLP) yielded 279,193 clinically relevant tokens or processed words (19% of all tokens). Of these, 2.02% (5634 unique tokens) represent the vocabulary for dental practitioners. Based on pointwise mutual information score and clinical relevance, 325 collocated phrases (eg, fistula filled obturation and herpes zoster) with 108 keywords (eg, mercury) were classified into 13 broad categories with subcategories. In the demonstration, we identified 305 relevant messages (2.1% of all messages) over 10 selected categories with instances of collocated phrases, and 299 messages (2.1%) with instances of phrases or keywords for the category systemic disease. Conclusions A workflow with a sequence of machine-based steps and human classification of NLP-discovered phrases can support researchers who need to identify relevant messages in a much larger corpus. Discovered phrases and keywords are useful search strings to aid targeted retrieval. We demonstrate the potential value of our procedures for qualitative researchers by retrieving a manageable set of messages concerning systemic and oral disease. PMID:22112583
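
    The collocation-scoring step described above (ranking candidate phrases by pointwise mutual information) is available directly in NLTK; the token list below is a small invented stand-in for the preprocessed mailing-list corpus.

        # Rank bigrams by pointwise mutual information with NLTK.
        from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

        tokens = ("root canal treatment failed root canal retreatment herpes zoster "
                  "lesion herpes zoster rash root canal obturation").split()

        finder = BigramCollocationFinder.from_words(tokens)
        finder.apply_freq_filter(2)                      # keep bigrams seen at least twice
        bigram_measures = BigramAssocMeasures()
        for bigram, score in finder.score_ngrams(bigram_measures.pmi)[:5]:
            print(bigram, round(score, 2))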

  19. Evolutionary biology of language.

    PubMed Central

    Nowak, M A

    2000-01-01

    Language is the most important evolutionary invention of the last few million years. It was an adaptation that helped our species to exchange information, make plans, express new ideas and totally change the appearance of the planet. How human language evolved from animal communication is one of the most challenging questions for evolutionary biology. The aim of this paper is to outline the major principles that guided language evolution in terms of mathematical models of evolutionary dynamics and game theory. I will discuss how natural selection can lead to the emergence of arbitrary signs, the formation of words and syntactic communication. PMID:11127907

  20. Counteracting the Threat of Language Death: The Case of Minority Languages in Botswana

    ERIC Educational Resources Information Center

    Mooko, Theophilus

    2006-01-01

    When Botswana gained independence from the British in 1966, a political decision was taken to designate English as an official language and Setswana, one of the indigenous languages, as a national language. This move disregarded the multilingual nature of Botswana society. Furthermore, although not explicitly stated, the use of other languages