Science.gov

Sample records for natural language parsers

  1. Policy-Based Management Natural Language Parser

    NASA Technical Reports Server (NTRS)

    James, Mark

    2009-01-01

    The Policy-Based Management Natural Language Parser (PBEM) is a rules-based approach to enterprise management that can be used to automate certain management tasks. This parser simplifies the management of a given endeavor by establishing policies to deal with situations that are likely to occur. Policies are operating rules that can be referred to as a means of maintaining order, security, consistency, or other ways of successfully furthering a goal or mission. PBEM provides a way of managing configuration of network elements, applications, and processes via a set of high-level rules or business policies rather than managing individual elements, thus switching the control to a higher level. This software allows unique management rules (or commands) to be specified and applied to a cross-section of the Global Information Grid (GIG). This software embodies a parser that is capable of recognizing and understanding conversational English. Because all possible dialect variants cannot be anticipated, a unique capability was developed that parses passed on conversation intent rather than the exact way the words are used. This software can increase productivity by enabling a user to converse with the system in conversational English to define network policies. PBEM can be used in both manned and unmanned science-gathering programs. Because policy statements can be domain-independent, this software can be applied equally to a wide variety of applications.

  2. Evaluating contributions of natural language parsers to protein–protein interaction extraction

    PubMed Central

    Miyao, Yusuke; Sagae, Kenji; Sætre, Rune; Matsuzaki, Takuya; Tsujii, Jun'ichi

    2009-01-01

    Motivation: While text mining technologies for biomedical research have gained popularity as a way to take advantage of the explosive growth of information in text form in biomedical papers, selecting appropriate natural language processing (NLP) tools is still difficult for researchers who are not familiar with recent advances in NLP. This article provides a comparative evaluation of several state-of-the-art natural language parsers, focusing on the task of extracting protein–protein interaction (PPI) from biomedical papers. We measure how each parser, and its output representation, contributes to accuracy improvement when the parser is used as a component in a PPI system. Results: All the parsers attained improvements in accuracy of PPI extraction. The levels of accuracy obtained with these different parsers vary slightly, while differences in parsing speed are larger. The best accuracy in this work was obtained when we combined Miyao and Tsujii's Enju parser and Charniak and Johnson's reranking parser, and the accuracy is better than the state-of-the-art results on the same data. Availability: The PPI extraction system used in this work (AkanePPI) is available online at http://www-tsujii.is.s.u-tokyo.ac.jp/-100downloads/downloads.cgi. The evaluated parsers are also available online from each developer's site. Contact: yusuke@is.s.u-tokyo.ac.jp PMID:19073593

  3. Benchmarking natural-language parsers for biological applications using dependency graphs

    PubMed Central

    Clegg, Andrew B; Shepherd, Adrian J

    2007-01-01

    Background Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. Results Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. Conclusion Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques. PMID:17254351

  4. Application of a rules-based natural language parser to critical value reporting in anatomic pathology.

    PubMed

    Owens, Scott R; Balis, Ulysses G J; Lucas, David R; Myers, Jeffrey L

    2012-03-01

    Critical values in anatomic pathology are rare occurrences and difficult to define with precision. Nevertheless, accrediting institutions require effective and timely communication of all critical values generated by clinical and anatomic laboratories. Provisional gating criteria for potentially critical anatomic diagnoses have been proposed, with some success in their implementation reported in the literature. Ensuring effective communication is challenging, however, making the case for programmatic implementation of a turnkey-style integrated information technology solution. To address this need, we developed a generically deployable laboratory information system-based tool, using a tiered natural language processing predicate calculus inference engine to identify qualifying cases that meet criteria for critical diagnoses but lack an indication in the electronic medical record for an appropriate clinical discussion with the ordering physician of record. Using this tool, we identified an initial cohort of 13,790 cases over a 49-month period, which were further explored by reviewing the available electronic medical record for each patient. Of these cases, 35 (0.3%) were judged to require intervention in the form of direct communication between the attending pathologist and the clinical physician of record. In 8 of the 35 cases, this intervention resulted in the conveyance of new information to the requesting physician and/or a change in the patient's clinical plan. The very low percentage of such cases (0.058%) illustrates their rarity in daily practice, making it unlikely that manual identification/notification approaches alone can reliably manage them. The automated turnkey system was useful in avoiding missed handoffs of significant, clinically actionable diagnoses. PMID:22343338

  5. COD::CIF::Parser: an error-correcting CIF parser for the Perl language

    PubMed Central

    Merkys, Andrius; Vaitkus, Antanas; Butkus, Justas; Okulič-Kazarinas, Mykolas; Kairys, Visvaldas; Gražulis, Saulius

    2016-01-01

    A syntax-correcting CIF parser, COD::CIF::Parser, is presented that can parse CIF 1.1 files and accurately report the position and the nature of the discovered syntactic problems. In addition, the parser is able to automatically fix the most common and the most obvious syntactic deficiencies of the input files. Bindings for Perl, C and Python programming environments are available. Based on COD::CIF::Parser, the cod-tools package for manipulating the CIFs in the Crystallography Open Database (COD) has been developed. The cod-tools package has been successfully used for continuous updates of the data in the automated COD data deposition pipeline, and to check the validity of COD data against the IUCr data validation guidelines. The performance, capabilities and applications of different parsers are compared. PMID:26937241

  6. The Accelerator Markup Language and the Universal Accelerator Parser

    SciTech Connect

    Sagan, D.; Forster, M.; Bates, D.A.; Wolski, A.; Schmidt, F.; Walker, N.J.; Larrieu, T.; Roblin, Y.; Pelaia, T.; Tenenbaum, P.; Woodley, M.; Reiche, S.; /UCLA

    2006-10-06

    A major obstacle to collaboration on accelerator projects has been the sharing of lattice description files between modeling codes. To address this problem, a lattice description format called Accelerator Markup Language (AML) has been created. AML is based upon the standard eXtensible Markup Language (XML) format; this provides the flexibility for AML to be easily extended to satisfy changing requirements. In conjunction with AML, a software library, called the Universal Accelerator Parser (UAP), is being developed to speed the integration of AML into any program. The UAP is structured to make it relatively straightforward (by giving appropriate specifications) to read and write lattice files in any format. This will allow programs that use the UAP code to read a variety of different file formats. Additionally, this will greatly simplify conversion of files from one format to another. Currently, besides AML, the UAP supports the MAD lattice format.

  7. Natural-Language Parser for PBEM

    NASA Technical Reports Server (NTRS)

    James, Mark

    2010-01-01

    A computer program called "Hunter" accepts, as input, a colloquial-English description of a set of policy-based-management rules, and parses that description into a form useable by policy-based enterprise management (PBEM) software. PBEM is a rules-based approach suitable for automating some management tasks. PBEM simplifies the management of a given enterprise through establishment of policies addressing situations that are likely to occur. Hunter was developed to have a unique capability to extract the intended meaning instead of focusing on parsing the exact ways in which individual words are used.

  8. Speed up of XML parsers with PHP language implementation

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2012-11-01

    In this paper, authors introduce PHP5's XML implementation and show how to read, parse, and write a short and uncomplicated XML file using Simple XML in a PHP environment. The possibilities for mutual work of PHP5 language and XML standard are described. The details of parsing process with Simple XML are also cleared. A practical project PHP-XML-MySQL presents the advantages of XML implementation in PHP modules. This approach allows comparatively simple search of XML hierarchical data by means of PHP software tools. The proposed project includes database, which can be extended with new data and new XML parsing functions.

  9. A natural language interface to databases

    NASA Technical Reports Server (NTRS)

    Ford, D. R.

    1988-01-01

    The development of a Natural Language Interface which is semantic-based and uses Conceptual Dependency representation is presented. The system was developed using Lisp and currently runs on a Symbolics Lisp machine. A key point is that the parser handles morphological analysis, which expands its capabilities of understanding more words.

  10. Parsing clinical text: how good are the state-of-the-art parsers?

    PubMed Central

    2015-01-01

    Background Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain. Methods In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using following two datasets: (1) A Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which is developed based on pathology notes and clinical notes, containing 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using the 10-fold cross validation method. Finally we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Results Our results showed that the original parsers achieved lower performance in clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebank, all parsers achieved better performance, with the best performance from the Stanford parser that reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the combined clinical Treebanks and Penn Treebank was used, of the three parsers, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F

  11. Fast parsers for Entrez Gene.

    PubMed

    Liu, Mingyi; Grigoriev, Andrei

    2005-07-15

    NCBI completed the transition of its main genome annotation database from Locuslink to Entrez Gene in Spring 2005. However, to this date few parsers exist for the Entrez Gene annotation file. Owing to the widespread use of Locuslink and the popularity of Perl programming language in bioinformatics, a publicly available high performance Entrez Gene parser in Perl is urgently needed. We present four such parsers that were developed using several parsing approaches (Parse::RecDescent, Parse::Yapp, Perl-byacc and Perl 5 regular expressions) and provide the first in-depth comparison of these sophisticated Perl tools. Our fastest parser processes the entire human Entrez Gene annotation file in under 12 min on one Intel Xeon 2.4 GHz CPU and can be of help to the bioinformatics community during and after the transition from Locuslink to Entrez Gene. PMID:15879451

  12. Toward a theory of distributed word expert natural language parsing

    NASA Technical Reports Server (NTRS)

    Rieger, C.; Small, S.

    1981-01-01

    An approach to natural language meaning-based parsing in which the unit of linguistic knowledge is the word rather than the rewrite rule is described. In the word expert parser, knowledge about language is distributed across a population of procedural experts, each representing a word of the language, and each an expert at diagnosing that word's intended usage in context. The parser is structured around a coroutine control environment in which the generator-like word experts ask questions and exchange information in coming to collective agreement on sentence meaning. The word expert theory is advanced as a better cognitive model of human language expertise than the traditional rule-based approach. The technical discussion is organized around examples taken from the prototype LISP system which implements parts of the theory.

  13. Errors and Intelligence in Computer-Assisted Language Learning: Parsers and Pedagogues. Routledge Studies in Computer Assisted Language Learning

    ERIC Educational Resources Information Center

    Heift, Trude; Schulze, Mathias

    2012-01-01

    This book provides the first comprehensive overview of theoretical issues, historical developments and current trends in ICALL (Intelligent Computer-Assisted Language Learning). It assumes a basic familiarity with Second Language Acquisition (SLA) theory and teaching, CALL and linguistics. It is of interest to upper undergraduate and/or graduate…

  14. Natural Language Processing.

    ERIC Educational Resources Information Center

    Chowdhury, Gobinda G.

    2003-01-01

    Discusses issues related to natural language processing, including theoretical developments; natural language understanding; tools and techniques; natural language text processing systems; abstracting; information extraction; information retrieval; interfaces; software; Internet, Web, and digital library applications; machine translation for…

  15. PASCAL LR(1) Parser Generator System

    Energy Science and Technology Software Center (ESTSC)

    1988-05-04

    LRSYS is a complete LR(1) parser generator system written entirely in a portable subset of Pascal. The system, LRSYS, includes a grammar analyzer program (LR) which reads a context-free (BNF) grammar as input and produces LR(1) parsing tables as output, a lexical analyzer generator (LEX) which reads regular expressions created by the REG process as input and produces lexical tables as output, and various parser skeletons that get merged with the tables to produce completemore » parsers (SMAKE). Current parser skeletons include Pascal, FORTRAN 77, and C. Other language skeletons can easily be added to the system. LRSYS is based on the LR program.« less

  16. Natural language parsing in a hybrid connectionist-symbolic architecture

    NASA Astrophysics Data System (ADS)

    Mueller, Adrian; Zell, Andreas

    1991-03-01

    Most connectionist parsers either cannot guarantee the correctness of their derivations or have to simulate a serial flow of control. In the first case, users have to restrict the tasks (e.g. parse less complex or shorter sentences) of the parser or they need to believe in the soundness of the result. In the second case, the resulting network has lost most of its attractivity because seriality needs to be hard-coded into the structure of the net. We here present a hybrid symbolic connectionist parser, which was designed to fulfill the following goals: (1) parsing of sentences without length restriction, (2) soundness and completeness for any context-free grammar, and (3) learning the applicability of parsing rules with a neural network. Our hybrid architecture consists of a serial parsing algorithm and a trainable net. BrainC (Backtracking and Backpropagation in C) combines the well known shift-reduce parsing technique with backtracking with a backpropagation network to learn and represent the typical properties of the trained natural language grammars. The system has been implemented as a subsystem of the Rochester Connectionist Simulator (RCS) on SUN- Workstations and was tested with several grammars for English and German. We discuss how BrainC reached its design goals and what results we observed.

  17. Natural language watermarking

    NASA Astrophysics Data System (ADS)

    Topkara, Mercan; Taskiran, Cuneyt M.; Delp, Edward J., III

    2005-03-01

    In this paper we discuss natural language watermarking, which uses the structure of the sentence constituents in natural language text in order to insert a watermark. This approach is different from techniques, collectively referred to as "text watermarking," which embed information by modifying the appearance of text elements, such as lines, words, or characters. We provide a survey of the current state of the art in natural language watermarking and introduce terminology, techniques, and tools for text processing. We also examine the parallels and differences of the two watermarking domains and outline how techniques from the image watermarking domain may be applicable to the natural language watermarking domain.

  18. La Description des langues naturelles en vue d'applications linguistiques: Actes du colloque (The Description of Natural Languages with a View to Linguistic Applications: Conference Papers). Publication K-10.

    ERIC Educational Resources Information Center

    Ouellon, Conrad, Comp.

    Presentations from a colloquium on applications of research on natural languages to computer science address the following topics: (1) analysis of complex adverbs; (2) parser use in computerized text analysis; (3) French language utilities; (4) lexicographic mapping of official language notices; (5) phonographic codification of Spanish; (6)…

  19. The parser generator as a general purpose tool

    NASA Technical Reports Server (NTRS)

    Noonan, R. E.; Collins, W. R.

    1985-01-01

    The parser generator has proven to be an extremely useful, general purpose tool. It can be used effectively by programmers having only a knowledge of grammars and no training at all in the theory of formal parsing. Some of the application areas for which a table-driven parser can be used include interactive, query languages, menu systems, translators, and programming support tools. Each of these is illustrated by an example grammar.

  20. Natural language generation

    NASA Astrophysics Data System (ADS)

    Maybury, Mark T.

    The goal of natural language generation is to replicate human writers or speakers: to generate fluent, grammatical, and coherent text or speech. Produced language, using both explicit and implicit means, must clearly and effectively express some intended message. This demands the use of a lexicon and a grammar together with mechanisms which exploit semantic, discourse and pragmatic knowledge to constrain production. Furthermore, special processors may be required to guide focus, extract presuppositions, and maintain coherency. As with interpretation, generation may require knowledge of the world, including information about the discourse participants as well as knowledge of the specific domain of discourse. All of these processes and knowledge sources must cooperate to produce well-written, unambiguous language. Natural language generation has received less attention than language interpretation due to the nature of language: it is important to interpret all the ways of expressing a message but we need to generate only one. Furthermore, the generative task can often be accomplished by canned text (e.g., error messages or user instructions). The advent of more sophisticated computer systems, however, has intensified the need to express multisentential English.

  1. A shallow parser based on closed-class words to capture relations in biomedical text.

    PubMed

    Leroy, Gondy; Chen, Hsinchun; Martinez, Jesse D

    2003-06-01

    Natural language processing for biomedical text currently focuses mostly on entity and relation extraction. These entities and relations are usually pre-specified entities, e.g., proteins, and pre-specified relations, e.g., inhibit relations. A shallow parser that captures the relations between noun phrases automatically from free text has been developed and evaluated. It uses heuristics and a noun phraser to capture entities of interest in the text. Cascaded finite state automata structure the relations between individual entities. The automata are based on closed-class English words and model generic relations not limited to specific words. The parser also recognizes coordinating conjunctions and captures negation in text, a feature usually ignored by others. Three cancer researchers evaluated 330 relations extracted from 26 abstracts of interest to them. There were 296 relations correctly extracted from the abstracts resulting in 90% precision of the relations and an average of 11 correct relations per abstract. PMID:14615225

  2. Natural language modeling

    SciTech Connect

    Sharp, J.K.

    1997-11-01

    This seminar describes a process and methodology that uses structured natural language to enable the construction of precise information requirements directly from users, experts, and managers. The main focus of this natural language approach is to create the precise information requirements and to do it in such a way that the business and technical experts are fully accountable for the results. These requirements can then be implemented using appropriate tools and technology. This requirement set is also a universal learning tool because it has all of the knowledge that is needed to understand a particular process (e.g., expense vouchers, project management, budget reviews, tax, laws, machine function).

  3. A Table Look-Up Parser in Online ILTS Applications

    ERIC Educational Resources Information Center

    Chen, Liang; Tokuda, Naoyuki; Hou, Pingkui

    2005-01-01

    A simple table look-up parser (TLUP) has been developed for parsing and consequently diagnosing syntactic errors in semi-free formatted learners' input sentences of an intelligent language tutoring system (ILTS). The TLUP finds a parse tree for a correct version of an input sentence, diagnoses syntactic errors of the learner by tracing and…

  4. A lex-based mad parser and its applications

    SciTech Connect

    Oleg Krivosheev et al.

    2001-07-03

    An embeddable and portable Lex-based MAD language parser has been developed. The parser consists of a front-end which reads a MAD file and keeps beam elements, beam line data and algebraic expressions in tree-like structures, and a back-end, which processes the front-end data to generate an input file or data structures compatible with user applications. Three working programs are described, namely, a MAD to C++ converter, a dynamic C++ object factory and a MAD-MARS beam line builder. Design and implementation issues are discussed.

  5. Left-corner unification-based natural language processing

    SciTech Connect

    Lytinen, S.L.; Tomuro, N.

    1996-12-31

    In this paper, we present an efficient algorithm for parsing natural language using unification grammars. The algorithm is an extension of left-corner parsing, a bottom-up algorithm which utilizes top-down expectations. The extension exploits unification grammar`s uniform representation of syntactic, semantic, and domain knowledge, by incorporating all types of grammatical knowledge into parser expectations. In particular, we extend the notion of the reachability table, which provides information as to whether or not a top-down expectation can be realized by a potential subconstituent, by including all types of grammatical information in table entries, rather than just phrase structure information. While our algorithm`s worst-case computational complexity is no better than that of many other algorithms, we present empirical testing in which average-case linear time performance is achieved. Our testing indicates this to be much improved average-case performance over previous leftcomer techniques.

  6. Programming Languages, Natural Languages, and Mathematics

    ERIC Educational Resources Information Center

    Naur, Peter

    1975-01-01

    Analogies are drawn between the social aspects of programming and similar aspects of mathematics and natural languages. By analogy with the history of auxiliary languages it is suggested that Fortran and Cobol will remain dominant. (Available from the Association of Computing Machinery, 1133 Avenue of the Americas, New York, NY 10036.) (Author/TL)

  7. A natural language interface for real-time dialogue in the flight domain

    NASA Technical Reports Server (NTRS)

    Ali, M.; Ai, C.-S.; Ferber, H. J.

    1986-01-01

    A flight expert system (FLES) is being developed to assist pilots in monitoring, diagnosisng and recovering from in-flight faults. To provide a communications interface between the flight crew and FLES, a natural language interface, has been implemented. Input to NALI is processed by three processors: (1) the semantic parser, (2) the knowledge retriever, and (3) the response generator. The architecture of NALI has been designed to process both temporal and nontemporal queries. Provisions have also been made to reduce the number of system modifications required for adapting NALI to other domains. This paper describes the architecture and implementation of NALI.

  8. The Nature of Language Learning.

    ERIC Educational Resources Information Center

    UWM Magazine, 1968

    1968-01-01

    Four behavioral scientists in a colloquium at the University of Wisconsin discussed various aspects of language learning. Concerned primarily with pre-high-school pupils and addressing their remarks to language teachers, the scientists offered these proposals: (1) language teaching is more effective if taught in a natural setting, (2)…

  9. Readings in natural language processing

    SciTech Connect

    Grosz, B.J.; Jones, K.S.; Webber, B.L.

    1986-01-01

    The book presents papers on natural language processing, focusing on the central issues of representation, reasoning, and recognition. The introduction discusses theoretical issues, historical developments, and current problems and approaches. The book presents work in syntactic models (parsing and grammars), semantic interpretation, discourse interpretation, language action and intentions, language generation, and systems.

  10. Distributed problem solving and natural language understanding models

    NASA Technical Reports Server (NTRS)

    Rieger, C.

    1980-01-01

    A theory of organization and control for a meaning-based language understanding system is mapped out. In this theory, words, rather than rules, are the units of knowledge, and assume the form of procedural entities which execute as generator-like coroutines. Parsing a sentence in context demands a control environment in wich experts can ask questions of each other, forward hints and suggestions to each other, and suspend. The theory is a cognitive theory of both language representation and parser control.

  11. The Nature of Natural Languages.

    ERIC Educational Resources Information Center

    Pierce, Joe E.

    A variety of types of evidence are examined to help determine the true nature of "deep structure" and what, if any, implications this has for linguistic theory as well as culture theory generally. The evidence accumulated over the past century on the nature of phonetic and phonemic systems is briefly discussed, and the following areas of analysis…

  12. Parser Combinators: a Practical Application for Generating Parsers for NMR Data.

    PubMed

    Fenwick, Matthew; Weatherby, Gerard; Ellis, Heidi Jc; Gryk, Michael R

    2013-01-01

    Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of a large data sets to the BMRB (Bio-Magnetic Resonance Data Bank). This data is stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. Translation of the grammatical model to an executable parser may be automatically accomplished. This paper will show how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional programming world known as "parser combinators". This paper demonstrates the effectiveness of a principled approach to file specification and parsing. This paper also builds upon our previous work [1], in that 1) it applies concepts from Functional Programming (which is relevant even though the implementation language, Java, is more mainstream than Functional Programming), and 2) all work and accomplishments from this project will be made available under standard open source licenses to provide the community with the opportunity to learn from our techniques and methods. PMID:24352525

  13. A Natural Language Graphics System.

    ERIC Educational Resources Information Center

    Brown, David, C.; Kwasny, Stan C.

    This report describes an experimental system for drawing simple pictures on a computer graphics terminal using natural language input. The system is capable of drawing lines, points, and circles on command from the user, as well as answering questions about system capabilities and objects on the screen. Erasures are permitted and language input…

  14. Advances in natural language processing.

    PubMed

    Hirschberg, Julia; Manning, Christopher D

    2015-07-17

    Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area. PMID:26185244

  15. Connectionist natural language parsing with BrainC

    NASA Astrophysics Data System (ADS)

    Mueller, Adrian; Zell, Andreas

    1991-08-01

    A close examination of pure neural parsers shows that they either could not guarantee the correctness of their derivations or had to hard-code seriality into the structure of the net. The authors therefore decided to use a hybrid architecture, consisting of a serial parsing algorithm and a trainable net. The system fulfills the following design goals: (1) parsing of sentences without length restriction, (2) soundness and completeness for any context-free language, and (3) learning the applicability of parsing rules with a neural network to increase the efficiency of the whole system. BrainC (backtracktacking and backpropagation in C) combines the well- known shift-reduce parsing technique with backtracking with a backpropagation network to learn and represent typical structures of the trained natural language grammars. The system has been implemented as a subsystem of the Rochester Connectionist Simulator (RCS) on SUN workstations and was tested with several grammars for English and German. The design of the system and then the results are discussed.

  16. Wild Words: Nature, Language and Outdoor Education.

    ERIC Educational Resources Information Center

    Meisner, Mark

    1993-01-01

    Discusses ways that language misrepresents nature, pointing out that frequently used metaphors and problematic language usage provide limited conceptual and emotional understanding of the natural world and contribute to a degraded view of nature. Discusses strategies for changing language as the first step in changing attitudes toward nature. (LP)

  17. Computational models of natural language processing

    SciTech Connect

    Bara, B.G.; Guida, G.

    1984-01-01

    The main concern in this work is the illustration of models for natural language processing, and the discussion of their role in the development of computational studies of language. Topics covered include the following: competence and performance in the design of natural language systems; planning and understanding speech acts by interpersonal games; a framework for integrating syntax and semantics; knowledge representation and natural language: extending the expressive power of proposition nodes; viewing parsing as word sense discrimination: a connectionist approach; a propositional language for text representation; from topic and focus of a sentence to linking in a text; language generation by computer; understanding the Chinese language; semantic primitives or meaning postulates: mental models of propositional representations; narrative complexity based on summarization algorithms; using focus to constrain language generation; and towards an integral model of language competence.

  18. Integration of speech with natural language understanding.

    PubMed Central

    Moore, R C

    1995-01-01

    The integration of speech recognition with natural language understanding raises issues of how to adapt natural language processing to the characteristics of spoken language; how to cope with errorful recognition output, including the use of natural language information to reduce recognition errors; and how to use information from the speech signal, beyond just the sequence of words, as an aid to understanding. This paper reviews current research addressing these questions in the Spoken Language Program sponsored by the Advanced Research Projects Agency (ARPA). I begin by reviewing some of the ways that spontaneous spoken language differs from standard written language and discuss methods of coping with the difficulties of spontaneous speech. I then look at how systems cope with errors in speech recognition and at attempts to use natural language information to reduce recognition errors. Finally, I discuss how prosodic information in the speech signal might be used to improve understanding. PMID:7479813

  19. Natural language processing: an introduction

    PubMed Central

    Ohno-Machado, Lucila; Chapman, Wendy W

    2011-01-01

    Objectives To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. Target audience This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. Scope We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field. PMID:21846786

  20. Multilingual environment and natural acquisition of language

    NASA Astrophysics Data System (ADS)

    Takano, Shunichi; Nakamura, Shigeru

    2000-06-01

    Language and human are not anything in the outside of nature. Not only babies, even adults can acquire new language naturally, if they have a natural multilingual environment around them. The reason it is possible would be that any human has an ability to grasp the whole of language, and at the same time, language has an order which is the easiest to acquire for humans. The process of this natural acquisition and a result of investigating the order of Japanese vowels are introduced. .

  1. Networks Generated from Natural Language Text

    NASA Astrophysics Data System (ADS)

    Biemann, Chris; Quasthoff, Uwe

    The study of large-scale characteristics of graphs that arise in natural language processing is an essential step in finding structural regularities. Structure discovery processes have to be designed with an awareness of these properties. Examining and contrasting the effects of processes that generate graph structures similar to those observed in language data sheds light on the structure of language and its evolution.

  2. Language Interventions in Natural Settings.

    ERIC Educational Resources Information Center

    Cavallaro, Claire C.

    1983-01-01

    Described is a milieu teaching approach to language development of young handicapped children. The method involves prompting and contingent delivery of reinforcers during normal language interactions in such classroom settings as free play, lunch, or instructional periods. (MC)

  3. Natural language interface for command and control

    NASA Technical Reports Server (NTRS)

    Shuler, Robert L., Jr.

    1986-01-01

    A working prototype of a flexible 'natural language' interface for command and control situations is presented. This prototype is analyzed from two standpoints. First is the role of natural language for command and control, its realistic requirements, and how well the role can be filled with current practical technology. Second, technical concepts for implementation are discussed and illustrated by their application in the prototype system. It is also shown how adaptive or 'learning' features can greatly ease the task of encoding language knowledge in the language processor.

  4. Introduction: Natural Language Processing and Information Retrieval.

    ERIC Educational Resources Information Center

    Smeaton, Alan F.

    1990-01-01

    Discussion of research into information and text retrieval problems highlights the work with automatic natural language processing (NLP) that is reported in this issue. Topics discussed include the occurrences of nominal compounds; anaphoric references; discontinuous language constructs; automatic back-of-the-book indexing; and full-text analysis.…

  5. Language and the Multisemiotic Nature of Mathematics

    ERIC Educational Resources Information Center

    de Oliveira, Luciana C.; Cheng, Dazhi

    2011-01-01

    This article explores how language and the multisemiotic nature of mathematics can present potential challenges for English language learners (ELLs). Based on two qualitative studies of the discourse of mathematics, we discuss some of the linguistic challenges of mathematics for ELLs in order to highlight the potential difficulties they may have…

  6. A Natural Language Interface to Databases

    NASA Technical Reports Server (NTRS)

    Ford, D. R.

    1990-01-01

    The development of a Natural Language Interface (NLI) is presented which is semantic-based and uses Conceptual Dependency representation. The system was developed using Lisp and currently runs on a Symbolics Lisp machine.

  7. Natural Language Description of Emotion

    ERIC Educational Resources Information Center

    Kazemzadeh, Abe

    2013-01-01

    This dissertation studies how people describe emotions with language and how computers can simulate this descriptive behavior. Although many non-human animals can express their current emotions as social signals, only humans can communicate about emotions symbolically. This symbolic communication of emotion allows us to talk about emotions that we…

  8. Natural Phonology Interference in Second Language Acquisition.

    ERIC Educational Resources Information Center

    Nathan, Geoffrey S.

    The natural phonology theory, related to European structuralism, makes two fundamental assumptions: (1) phonemes are mental images of the sounds of language, and (2) phonological processes represent subconscious mental substitutions of one sound or class of sounds for another that are the natural response to the relative difficulties of sound…

  9. Prediction During Natural Language Comprehension.

    PubMed

    Willems, Roel M; Frank, Stefan L; Nijhof, Annabel D; Hagoort, Peter; van den Bosch, Antal

    2016-06-01

    The notion of prediction is studied in cognitive neuroscience with increasing intensity. We investigated the neural basis of 2 distinct aspects of word prediction, derived from information theory, during story comprehension. We assessed the effect of entropy of next-word probability distributions as well as surprisal A computational model determined entropy and surprisal for each word in 3 literary stories. Twenty-four healthy participants listened to the same 3 stories while their brain activation was measured using fMRI. Reversed speech fragments were presented as a control condition. Brain areas sensitive to entropy were left ventral premotor cortex, left middle frontal gyrus, right inferior frontal gyrus, left inferior parietal lobule, and left supplementary motor area. Areas sensitive to surprisal were left inferior temporal sulcus ("visual word form area"), bilateral superior temporal gyrus, right amygdala, bilateral anterior temporal poles, and right inferior frontal sulcus. We conclude that prediction during language comprehension can occur at several levels of processing, including at the level of word form. Our study exemplifies the power of combining computational linguistics with cognitive neuroscience, and additionally underlines the feasibility of studying continuous spoken language materials with fMRI. PMID:25903464

  10. Natural Language Access to Medical Text *

    PubMed Central

    Walker, Donald E.; Hobbs, Jerry R.

    1981-01-01

    This paper describes research on the development of a methodology for representing the information in texts and of procedures for relating the linguistic structure of a request to the corresponding representations. The work is being done in the context of a prototype system that will allow physicians and other health professionals to access information in a computerized textbook of hepatitis through natural language dialogues. The interpretation of natural language queries is derived from DIAMOND/DIAGRAM, a linguistically motivated, domain-independent natural language interface developed at SRI. A text access component is being developed that uses representations of the propositional content of text passages and of the hierarchical structure of the text as a whole to retrieve relevant information.

  11. Porting a lexicalized-grammar parser to the biomedical domain.

    PubMed

    Rimell, Laura; Clark, Stephen

    2009-10-01

    This paper introduces a state-of-the-art, linguistically motivated statistical parser to the biomedical text mining community, and proposes a method of adapting it to the biomedical domain requiring only limited resources for data annotation. The parser was originally developed using the Penn Treebank and is therefore tuned to newspaper text. Our approach takes advantage of a lexicalized grammar formalism, Combinatory Categorial Grammar (ccg), to train the parser at a lower level of representation than full syntactic derivations. The ccg parser uses three levels of representation: a first level consisting of part-of-speech (pos) tags; a second level consisting of more fine-grained ccg lexical categories; and a third, hierarchical level consisting of ccg derivations. We find that simply retraining the pos tagger on biomedical data leads to a large improvement in parsing performance, and that using annotated data at the intermediate lexical category level of representation improves parsing accuracy further. We describe the procedure involved in evaluating the parser, and obtain accuracies for biomedical data in the same range as those reported for newspaper text, and higher than those previously reported for the biomedical resource on which we evaluate. Our conclusion is that porting newspaper parsers to the biomedical domain, at least for parsers which use lexicalized grammars, may not be as difficult as first thought. PMID:19141332

  12. Entropy analysis of natural language written texts

    NASA Astrophysics Data System (ADS)

    Papadimitriou, C.; Karamanos, K.; Diakonos, F. K.; Constantoudis, V.; Papageorgiou, H.

    2010-08-01

    The aim of the present work is to investigate the relative contribution of ordered and stochastic components in natural written texts and examine the influence of text category and language on these. To this end, a binary representation of written texts and the generated symbolic sequences are examined by the standard block entropy analysis and the Shannon and Kolmogorov entropies are obtained. It is found that both entropies are sensitive to both language and text category with the text category sensitivity to follow almost the same trends in both languages (English and Greek) considered. The values of these entropies are compared with those of stochastically generated symbolic sequences and the nature of correlations present in this representation of real written texts is identified.

  13. Two Interpretive Systems for Natural Language?

    ERIC Educational Resources Information Center

    Frazier, Lyn

    2015-01-01

    It is proposed that humans have available to them two systems for interpreting natural language. One system is familiar from formal semantics. It is a type based system that pairs a syntactic form with its interpretation using grammatical rules of composition. This system delivers both plausible and implausible meanings. The other proposed system…

  14. PBS: An Economical Natural Language Query Interpreter.

    ERIC Educational Resources Information Center

    Samstag-Schnock, Uwe; Meadow, Charles T.

    1993-01-01

    Reports on the design and implementation of PBS (Parsing, Boolean Recognition, Stemming), a software module used in conjunction with an intermediary program to interpret natural language queries used for online database searching. Results of a test of the initial version, which is designed for use with bibliographic files, are reported. (13…

  15. Enhanced Text Retrieval Using Natural Language Processing.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth D.

    1998-01-01

    Defines natural language processing (NLP); describes the use of NLP in information retrieval (IR); provides seven levels of linguistic analysis: phonological, morphological, lexical, syntactic, semantic, discourse, and pragmatic. Discusses the commercial use of NLP in IR with the example of DR-LINK (Document Retrieval using LINguistic Knowledge)…

  16. Natural Language Information Retrieval: Progress Report.

    ERIC Educational Resources Information Center

    Perez-Carballo, Jose; Strzalkowski, Tomek

    2000-01-01

    Reports on the progress of the natural language information retrieval project, a joint effort led by GE (General Electric) Research, and its evaluation at the sixth TREC (Text Retrieval Conference). Discusses stream-based information retrieval, which uses alternative methods of document indexing; advanced linguistic streams; weighting; and query…

  17. Brain readiness and the nature of language

    PubMed Central

    Bouchard, Denis

    2015-01-01

    To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements, are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their “representations” may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the

  18. Brain readiness and the nature of language.

    PubMed

    Bouchard, Denis

    2015-01-01

    To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements, are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their "representations" may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the unique

  19. Natural language processing, pragmatics, and verbal behavior

    PubMed Central

    Cherpas, Chris

    1992-01-01

    Natural Language Processing (NLP) is that part of Artificial Intelligence (AI) concerned with endowing computers with verbal and listener repertoires, so that people can interact with them more easily. Most attention has been given to accurately parsing and generating syntactic structures, although NLP researchers are finding ways of handling the semantic content of language as well. It is increasingly apparent that understanding the pragmatic (contextual and consequential) dimension of natural language is critical for producing effective NLP systems. While there are some techniques for applying pragmatics in computer systems, they are piecemeal, crude, and lack an integrated theoretical foundation. Unfortunately, there is little awareness that Skinner's (1957) Verbal Behavior provides an extensive, principled pragmatic analysis of language. The implications of Skinner's functional analysis for NLP and for verbal aspects of epistemology lead to a proposal for a “user expert”—a computer system whose area of expertise is the long-term computer user. The evolutionary nature of behavior suggests an AI technology known as genetic algorithms/programming for implementing such a system. ImagesFig. 1 PMID:22477052

  20. Current trends with natural language processing.

    PubMed

    Rassinoux, A M; Michel, P A; Wagner, J; Baud, R

    1995-01-01

    Natural Language Processing in the medical domain becomes more and more powerful, efficient, and ready to be used in daily practice. The needs for such tools are enormous in the medical field, due to the vast amount of written texts for medical records. In the authors' point of view, the Electronic Patient Record (EPR) is achieved neither with Information Systems of all kinds nor with commercially available word processing systems. Natural Language Processing (NLP) is one dimension of the EPR, as well as Image Processing and Decision Support Systems. Analysis of medical texts to facilitate indexing and retrieval is well known. The need for a generation tool is to produce progress notes from menu driven systems. The computer systems of tomorrow cannot miss any single dimension. Since 1988, we've been developing an NLP system; it is supported by the European program AIM (Advanced Informatics in Medicine) within the GALEN and HELIOS consortium and the CERS (Commission d'Encouragement á la Recherche Scientifique) in Switzerland. The main directions of development are: a medical language analyzer, a language generator, a query processor, and dictionary building tools to support the Medical Linguistic Knowledge Base (MLKB). The knowledge representation schema is essentially based on Sowa's conceptual graphs, and the MLKB is multilingual from its design phase; it currently incorporates the English and the French languages; it will also continue using German. The goal of this demonstration is to provide evidence of what exists today, what will be soon available, and what is planned for the long term. Complete sentences will be processed in real time, and the browsing capabilities of the MLKB will be exercised. In particular, the following features will be presented: Analysis of complete sentences with verbs and relatives, as extracted from clinical narratives, with special attention to the method of "proximity processing" as developed in our group and the rule based

  1. ACPYPE - AnteChamber PYthon Parser interfacE

    PubMed Central

    2012-01-01

    Background ACPYPE (or AnteChamber PYthon Parser interfacE) is a wrapper script around the ANTECHAMBER software that simplifies the generation of small molecule topologies and parameters for a variety of molecular dynamics programmes like GROMACS, CHARMM and CNS. It is written in the Python programming language and was developed as a tool for interfacing with other Python based applications such as the CCPN software suite (for NMR data analysis) and ARIA (for structure calculations from NMR data). ACPYPE is open source code, under GNU GPL v3, and is available as a stand-alone application at http://www.ccpn.ac.uk/acpype and as a web portal application at http://webapps.ccpn.ac.uk/acpype. Findings We verified the topologies generated by ACPYPE in three ways: by comparing with default AMBER topologies for standard amino acids; by generating and verifying topologies for a large set of ligands from the PDB; and by recalculating the structures for 5 protein–ligand complexes from the PDB. Conclusions ACPYPE is a tool that simplifies the automatic generation of topology and parameters in different formats for different molecular mechanics programmes, including calculation of partial charges, while being object oriented for integration with other applications. PMID:22824207

  2. Learning procedures from interactive natural language instructions

    NASA Technical Reports Server (NTRS)

    Huffman, Scott B.; Laird, John E.

    1994-01-01

    Despite its ubiquity in human learning, very little work has been done in artificial intelligence on agents that learn from interactive natural language instructions. In this paper, the problem of learning procedures from interactive, situated instruction is examined in which the student is attempting to perform tasks within the instructional domain, and asks for instruction when it is needed. Presented is Instructo-Soar, a system that behaves and learns in response to interactive natural language instructions. Instructo-Soar learns completely new procedures from sequences of instruction, and also learns how to extend its knowledge of previously known procedures to new situations. These learning tasks require both inductive and analytic learning. Instructo-Soar exhibits a multiple execution learning process in which initial learning has a rote, episodic flavor, and later executions allow the initially learned knowledge to be generalized properly.

  3. Automated database design from natural language input

    NASA Technical Reports Server (NTRS)

    Gomez, Fernando; Segami, Carlos; Delaune, Carl

    1995-01-01

    Users and programmers of small systems typically do not have the skills needed to design a database schema from an English description of a problem. This paper describes a system that automatically designs databases for such small applications from English descriptions provided by end-users. Although the system has been motivated by the space applications at Kennedy Space Center, and portions of it have been designed with that idea in mind, it can be applied to different situations. The system consists of two major components: a natural language understander and a problem-solver. The paper describes briefly the knowledge representation structures constructed by the natural language understander, and, then, explains the problem-solver in detail.

  4. Reference And Description In Natural Language

    NASA Astrophysics Data System (ADS)

    Steinberg, Alan N.

    1988-03-01

    We propose a theory for modeling the semantic and pragmatic properties of natural language expressions used to refer. The sorts of expressions to be discussed include proper names, definite noun phrases and personal pronouns. We will focus in this paper on such expressions in the singular, having discussed elsewhere procedures for extending the present sort of analysis to various plural uses of these expressions. Propositions involving referential expressions are formally redefined in a second order predicate calculus, in which various semantic and pragmatic factors involved in establishing and interpreting references are modeled as rules of inference. Uses of referential utterances are differentiated according to the means used for individuating the object referred to. Analyses are provided for anaphoric, contextual, demonstrative, introductory and citational individuative devices. We analyze sentences like 'The man [or John] is wise' as conditionals of the form 'Whatever is uniquely a man [or named "John"] relevant to the present discourse is wise'. So modeled, the presupposition of existence (which historically has concerned much logical analysis of such sentences) is represented as a conversational implicature of the sort which obtains from any proposition of the form '(P -> Q)' to the corresponding `P'. This formalization is intended to serve as part of an empirical theory of natural language phenomena. Being an empirical theory, ours will strive to model the greatest possible diversity of phenomena using a minimum of formal apparatus. Such a theory may provide a foundation for automatic systems to predict and replicate natural language phenomena for purposes of text understanding and synthesis.

  5. Natural Language Processing: Toward Large-Scale, Robust Systems.

    ERIC Educational Resources Information Center

    Haas, Stephanie W.

    1996-01-01

    Natural language processing (NLP) is concerned with getting computers to do useful things with natural language. Major applications include machine translation, text generation, information retrieval, and natural language interfaces. Reviews important developments since 1987 that have led to advances in NLP; current NLP applications; and problems…

  6. LCS: a natural language comprehension system

    NASA Astrophysics Data System (ADS)

    Trigano, Philippe; Talon, Benedicte; Baltazart, Didier; Demko, Christophe; Newstead, Emma

    1991-03-01

    LCS (Language Comprehension System) is a software package designed to improve man-machine communication with computer programs. Different simple structures and functions are available to build man-machine interfaces in natural language. A user may write a sentence in good English or in telegraphical style. The system used pattern matching techniques to detect misspelled words (or badly typed words) and to correct them. Several methods of analysis are available at any level (lexical, syntactic, semantic...). A special knowledge acquisition system is used to introduce new works by giving a description in natural language. A semantic network is extended to a representation close to a connexionist graph, for a better understanding of polysemic words and ambiguities. An application is currently used for a man-machine interface of an expert system in computer-aided education, for a better dialogue with the user during the explanation of reasoning phase. The object of this paper is to present the LCS system, especially at the lexical level, the knowledge representation and acquisition level, and the semantic level (for pronoun references and ambiguity).

  7. Attacks on lexical natural language steganography systems

    NASA Astrophysics Data System (ADS)

    Taskiran, Cuneyt M.; Topkara, Umut; Topkara, Mercan; Delp, Edward J.

    2006-02-01

    Text data forms the largest bulk of digital data that people encounter and exchange daily. For this reason the potential usage of text data as a covert channel for secret communication is an imminent concern. Even though information hiding into natural language text has started to attract great interest, there has been no study on attacks against these applications. In this paper we examine the robustness of lexical steganography systems.In this paper we used a universal steganalysis method based on language models and support vector machines to differentiate sentences modified by a lexical steganography algorithm from unmodified sentences. The experimental accuracy of our method on classification of steganographically modified sentences was 84.9%. On classification of isolated sentences we obtained a high recall rate whereas the precision was low.

  8. Building a Natural Language Interface for the ATNF Pulsar Database for Speeding up Execution of Complex Queries

    NASA Astrophysics Data System (ADS)

    Tang, Rupert; Jenet, F.; Rangel, S.; Dartez, L.

    2010-01-01

    Until now, there has been no available natural language interfaces (NLI's) for querying a database of pulsars (rotating neutron stars emitting radiation at regular intervals). Currently, pulsar records are retrieved through an HTML form accessible via the Australia Telescope National Facility (ATNF) website where one needs to be familiar with pulsar attributes used by the interface (e.g. BLC). Using a NLI relinquishes the need for learning form-specific formalism and allows execution of more powerful queries than those supported by the HTML form. Furthermore, on database access that requires comparison of attributes for all the pulsar records (e.g. what is the fastest pulsar?), using a NLI for retrieving answers to such complex questions is definitely much more efficient and less error-prone. This poster presents the first NLI ever created for the ATNF pulsar database (ATNF-Query) to facilitate database access using complex queries. ATNF-Query is built using a machine learning approach that induces a semantic parser from a question corpus; the innovative application is intended to provide pulsar researchers or laymen with an intelligent language understanding database system for friendly information access.

  9. Intelligent CAI: An Author Aid for a Natural Language Interface.

    ERIC Educational Resources Information Center

    Burton, Richard R.; Brown, John Seely

    This report addresses the problems of using natural language (English) as the communication language for advanced computer-based instructional systems. The instructional environment places requirements on a natural language understanding system that exceed the capabilities of all existing systems, including: (1) efficiency, (2) habitability, (3)…

  10. An Overview of Computer-Based Natural Language Processing.

    ERIC Educational Resources Information Center

    Gevarter, William B.

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial…

  11. Written Language Is as Natural as Spoken Language: A Biolinguistic Perspective

    ERIC Educational Resources Information Center

    Aaron, P. G.; Joshi, R. Malatesha

    2006-01-01

    A commonly held belief is that language is an aspect of the biological system since the capacity to acquire language is innate and evolved along Darwinian lines. Written language, on the other hand, is thought to be an artifact and a surrogate of speech; it is, therefore, neither natural nor biological. This disparaging view of written language,…

  12. Natural language processing and advanced information management

    NASA Technical Reports Server (NTRS)

    Hoard, James E.

    1989-01-01

    Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate the material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides an natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures that will satisfy the users' requests. The best candidate for an intelligent agent that can satisfactorily make use of RSs and transform documents (TDs) appears to be an object oriented data base (OODB). OODBs have, apparently, an inherent capacity to use the large numbers of RSs and TDs that will be required by an AIM system and an inherent capacity to use them in an effective way.

  13. An Evolutionary History of the Natural Language English and the Artificial Language FORTRAN.

    ERIC Educational Resources Information Center

    Koman, Joseph J., III

    1988-01-01

    Notes similarities between certain aspects of the development of the natural language English and the artificial language FORTRAN. Discusses evolutionary history, grammar, style, syntax, varieties, and attempts at standardization. Emphasizes modifications which natural and artificial languages have undergone. Suggests that some modifications were…

  14. Linguistic Analysis of Natural Language Communication with Computers.

    ERIC Educational Resources Information Center

    Thompson, Bozena Henisz

    Interaction with computers in natural language requires a language that is flexible and suited to the task. This study of natural dialogue was undertaken to reveal those characteristics which can make computer English more natural. Experiments were made in three modes of communication: face-to-face, terminal-to-terminal, and human-to-computer,…

  15. Understanding and representing natural language meaning

    NASA Astrophysics Data System (ADS)

    Waltz, D. L.; Maran, L. R.; Dorfman, M. H.; Dinitz, R.; Farwell, D.

    1982-12-01

    During this contract period the authors have: (1) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (2) written a parsing program which selects appropriate word and sentence meanings by a parallel process know as activation and inhibition; (3) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (4) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (5) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (6) constructed a general model for the representation of tense and aspect of verbs; (7) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.

  16. Understanding and representing natural language meaning

    SciTech Connect

    Waltz, D.L.; Maran, L.R.; Dorfman, M.H.; Dinitz, R.; Farwell, D.

    1982-12-01

    During this contract period the authors have: (a) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (b) written a parsing program which selects appropriate word and sentence meanings by a parallel process known as activation and inhibition; (c) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (d) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (e) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (f) constructed a general model for the representation of tense and aspect of verbs; (g) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.

  17. Understanding natural language for spacecraft sequencing

    NASA Technical Reports Server (NTRS)

    Katz, Boris; Brooks, Robert N., Jr.

    1987-01-01

    The paper describes a natural language understanding system, START, that translates English text into a knowledge base. The understanding and the generating modules of START share a Grammar which is built upon reversible transformations. Users can retrieve information by querying the knowledge base in English; the system then produces an English response. START can be easily adapted to many different domains. One such domain is spacecraft sequencing. A high-level overview of sequencing as it is practiced at JPL is presented in the paper, and three areas within this activity are identified for potential application of the START system. Examples are given of an actual dialog with START based on simulated data for the Mars Observer mission.

  18. The Parser Doesn't Ignore Intransitivity, after All

    ERIC Educational Resources Information Center

    Staub, Adrian

    2007-01-01

    Several previous studies (B. C. Adams, C. Clifton, & D. C. Mitchell, 1998; D. C. Mitchell, 1987; R. P. G. van Gompel & M. J. Pickering, 2001) have explored the question of whether the parser initially analyzes a noun phrase that follows an intransitive verb as the verb's direct object. Three eye-tracking experiments examined this issue in more…

  19. MGFp: an open Mascot Generic Format parser library implementation.

    PubMed

    Kirchner, Marc; Steen, Judith A J; Hamprecht, Fred A; Steen, Hanno

    2010-05-01

    Despite the efforts of the mass spectrometry (MS) community to migrate data representation toward modern file formats, legacy text formats still play an important role in MS data processing workflows. We provide a formal grammar and a portable, efficient C++ implementation for a Mascot Generic Format (MGF) parser. Software and technical documentation are available from http://software.steenlab.org/mgfp/. PMID:20334363

  20. Linking Parser Development to Acquisition of Syntactic Knowledge

    ERIC Educational Resources Information Center

    Omaki, Akira; Lidz, Jeffrey

    2015-01-01

    Traditionally, acquisition of syntactic knowledge and the development of sentence comprehension behaviors have been treated as separate disciplines. This article reviews a growing body of work on the development of incremental sentence comprehension mechanisms and discusses how a better understanding of the developing parser can shed light on two…

  1. The Linguistic Nature of Language and Communication.

    ERIC Educational Resources Information Center

    Zitlow, Connie S., Ed.

    2001-01-01

    Discusses five recent books about language that address issues that arise in classrooms with an increasing number of diverse dialects and varied home languages. Discusses the complexities of language, misunderstandings in the Ebonics controversy, socioeducational issues, and classroom ideas for teachers. Describes two web sites. (SR)

  2. Inferring heuristic classification hierarchies from natural language input

    NASA Technical Reports Server (NTRS)

    Hull, Richard; Gomez, Fernando

    1993-01-01

    A methodology for inferring hierarchies representing heuristic knowledge about the check out, control, and monitoring sub-system (CCMS) of the space shuttle launch processing system from natural language input is explained. Our method identifies failures explicitly and implicitly described in natural language by domain experts and uses those descriptions to recommend classifications for inclusion in the experts' heuristic hierarchies.

  3. Natural Language Processing in Game Studies Research: An Overview

    ERIC Educational Resources Information Center

    Zagal, Jose P.; Tomuro, Noriko; Shepitsen, Andriy

    2012-01-01

    Natural language processing (NLP) is a field of computer science and linguistics devoted to creating computer systems that use human (natural) language as input and/or output. The authors propose that NLP can also be used for game studies research. In this article, the authors provide an overview of NLP and describe some research possibilities…

  4. Tutorial on techniques and applications for natural language processing

    SciTech Connect

    Hayes, P.J.; Carbonell, J.G.

    1983-10-17

    Natural language communication with computers has long been a major goal of Artificial Intelligence both for what it can tell us about intelligence in general and for its practical utility - data bases, software packages, and Al-based expert systems all require flexible interfaces to a growing community of users who are not able or do not wish to communicate with computers in formal, artificial command languages. Whereas many of the fundamental problems of general natural language processing (NLP) by machine remain to be solved, the area has matured in recent years to the point where practical natural language interfaces to software systems can be constructed in many restricted, but nevertheless useful, circumstances. This tutorial is intended to survey the current state of applied natural language processing by presenting computationally effective NLP techniques, by discussing the range of capabilities these techniques provide for NLP systems, an by discussing their current limitations. Following the introduction, this document is divided into two major sections: the first on language recognition strategies at the single sentence level, and the second on language processing issues that arise during interactive dialogues. In both cases, we concentrate on those aspects of the problem appropriate for interactive natural language interfaces, but relate the techniques and systems discussed to more general work on natural language, independent of application domain.

  5. SISAL: streams and iteration in a single-assignment language. Language reference manual, Version 1. 1

    SciTech Connect

    McGraw, J.; Skedzielewski, S.; Allan, S.; Grit, D.; Oldehoeft, R.; Glauert, J.; Dobes, I.; Hohensee, P.

    1983-07-20

    Many multi-processor systems are currently under study by various groups around the world. The understanding and exploitation of parallelism in these systems is a primary goal of these studies. To facilitate the use and comparison of these systems, we proposed to define a common high-level language. The primary candidate was a single-assignment, applicative, dataflow language. In spite of remarkable diversity in hardware structures, all proposed systems could benefit from the functional semantics of such a language, and from other characteristics such as implicit parallelism, freedom from side effects, locality of effects, etc. A compiler would be produced having a single Front-end (language specific) parser and several Back-end (machine specific) code generators. An Intermediate Format is defined to serve as the interface between parser and code generators and interface between this system and other language systems. Benchmark programs were produced. SISAL is designed to express algorithms for execution on computers capable of highly concurrent operation. More specifically, the application area to be supported is numerical computation which strains the limits of high performance machines, and the primary targets for translation of SISAL programs are dataflow data-driven machines. Nevertheless, it has been our intention that the language not have idiosyncrasies reflecting the particular nature of the application area or target machine. It should be reasonable for SISAL to evolve into a general purpose language appropriate for writing programs to run on future general parallel computers.

  6. A Hybrid Architecture For Natural Language Understanding

    NASA Astrophysics Data System (ADS)

    Loatman, R. Bruce

    1987-05-01

    The PRC Adaptive Knowledge-based Text Understanding System (PAKTUS) is an environment for developing natural language understanding (NLU) systems. It uses a knowledge-based approach in an integrated hybrid architecture based on a factoring of the NLU problem into its lexi-cal, syntactic, conceptual, domain-specific, and pragmatic components. The goal is a robust system that benefits from the strengths of several NLU methodologies, each applied where most appropriate. PAKTUS employs a frame-based knowledge representation and associative networks throughout. The lexical component uses morphological knowledge and word experts. Syntactic knowledge is represented in an Augmented Transition Network (ATN) grammar that incorporates rule-based programming. Case grammar is used for canonical conceptual representation with constraints. Domain-specific templates represent knowledge about specific applications as patterns of the form used in logic programming. Pragmatic knowledge may augment any of the other types and is added wherever needed for a particular domain. The system has been constructed in an interactive graphic programming environment. It has been used successfully to build a prototype front end for an expert system. This integration of existing technologies makes limited but practical NLU feasible now for narrow, well-defined domains.

  7. Sex and Gender in Natural Language.

    ERIC Educational Resources Information Center

    Percival, W. Keith

    The relation between a real-world category (sex) and a linguistic category (gender) is examined. The gender system of Indo-European languages is discussed, and the way gender works in Greek, one of the older Indo-European languages, is examined at some length. The conclusion is that, but for the existence of separate gender-sensitive adjectival…

  8. Concepts and implementations of natural language query systems

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung

    1984-01-01

    The currently developed user language interfaces of information systems are generally intended for serious users. These interfaces commonly ignore potentially the largest user group, i.e., casual users. This project discusses the concepts and implementations of a natural query language system which satisfy the nature and information needs of casual users by allowing them to communicate with the system in the form of their native (natural) language. In addition, a framework for the development of such an interface is also introduced for the MADAM (Multics Approach to Data Access and Management) system at the University of Southwestern Louisiana.

  9. Storing files in a parallel computing system based on user-specified parser function

    DOEpatents

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron

    2014-10-21

    Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.

  10. Getting Answers to Natural Language Questions on the Web.

    ERIC Educational Resources Information Center

    Radev, Dragomir R.; Libner, Kelsey; Fan, Weiguo

    2002-01-01

    Describes a study that investigated the use of natural language questions on Web search engines. Highlights include query languages; differences in search engine syntax; and results of logistic regression and analysis of variance that showed aspects of questions that predicted significantly different performances, including the number of words,…

  11. Semantics of Context-Free Fragments of Natural Languages.

    ERIC Educational Resources Information Center

    Suppes, Patrick

    The objective of this paper is to combine the viewpoint of model-theoretic semantics and generative grammar, to define semantics for context-free languages, and to apply the results to some fragments of natural language. Following the introduction in the first section, Section 2 describes a simple artificial example to illustrate how a semantic…

  12. NLP Meets the Jabberwocky: Natural Language Processing in Information Retrieval.

    ERIC Educational Resources Information Center

    Feldman, Susan

    1999-01-01

    Focuses on natural language processing (NLP) in information retrieval. Defines the seven levels at which people extract meaning from text/spoken language. Discusses the stages of information processing; how an information retrieval system works; advantages to adding full NLP to information retrieval systems; and common problems with information…

  13. Survey of Natural Language Processing Techniques in Bioinformatics

    PubMed Central

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  14. Survey of Natural Language Processing Techniques in Bioinformatics.

    PubMed

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  15. Natural language watermarking: Challenges in building a practical system

    NASA Astrophysics Data System (ADS)

    Topkara, Mercan; Riccardi, Giuseppe; Hakkani-Tür, Dilek; Atallah, Mikhail J.

    2006-02-01

    This paper gives an overview of the research and implementation challenges we encountered in building an end-to-end natural language processing based watermarking system. With natural language watermarking, we mean embedding the watermark into a text document, using the natural language components as the carrier, in such a way that the modifications are imperceptible to the readers and the embedded information is robust against possible attacks. Of particular interest is using the structure of the sentences in natural language text in order to insert the watermark. We evaluated the quality of the watermarked text using an objective evaluation metric, the BLEU score. BLEU scoring is commonly used in the statistical machine translation community. Our current system prototype achieves 0.45 BLEU score on a scale [0,1].

  16. Natural Language Processing Neural Network Considering Deep Cases

    NASA Astrophysics Data System (ADS)

    Sagara, Tsukasa; Hagiwara, Masafumi

    In this paper, we propose a novel neural network considering deep cases. It can learn knowledge from natural language documents and can perform recall and inference. Various techniques of natural language processing using Neural Network have been proposed. However, natural language sentences used in these techniques consist of about a few words, and they cannot handle complicated sentences. In order to solve these problems, the proposed network divides natural language sentences into a sentence layer, a knowledge layer, ten kinds of deep case layers and a dictionary layer. It can learn the relations among sentences and among words by dividing sentences. The advantages of the method are as follows: (1) ability to handle complicated sentences; (2) ability to restructure sentences; (3) usage of the conceptual dictionary, Goi-Taikei, as the long term memory in a brain. Two kinds of experiments were carried out by using goo dictionary and Wikipedia as knowledge sources. Superior performance of the proposed neural network has been confirmed.

  17. A Natural Language Interface Concordant with a Knowledge Base

    PubMed Central

    Han, Yong-Jin; Park, Seong-Bae; Park, Se-Young

    2016-01-01

    The discordance between expressions interpretable by a natural language interface (NLI) system and those answerable by a knowledge base is a critical problem in the field of NLIs. In order to solve this discordance problem, this paper proposes a method to translate natural language questions into formal queries that can be generated from a graph-based knowledge base. The proposed method considers a subgraph of a knowledge base as a formal query. Thus, all formal queries corresponding to a concept or a predicate in the knowledge base can be generated prior to query time and all possible natural language expressions corresponding to each formal query can also be collected in advance. A natural language expression has a one-to-one mapping with a formal query. Hence, a natural language question is translated into a formal query by matching the question with the most appropriate natural language expression. If the confidence of this matching is not sufficiently high the proposed method rejects the question and does not answer it. Multipredicate queries are processed by regarding them as a set of collected expressions. The experimental results show that the proposed method thoroughly handles answerable questions from the knowledge base and rejects unanswerable ones effectively. PMID:26904105

  18. Geometry and Language--A Natural Connection.

    ERIC Educational Resources Information Center

    Pereira-Mendoza, Lionel

    1997-01-01

    Describes an activity which provides a starting point for exploring geometric concepts through language. The activity incorporates three general ideas: (1) relating to personal experience; (2) integrating with other disciplines; and (3) requiring students to express their ideas. A sample activity would include writing a story about what life would…

  19. Two Types of Definites in Natural Language

    ERIC Educational Resources Information Center

    Schwarz, Florian

    2009-01-01

    This thesis is concerned with the description and analysis of two semantically different types of definite articles in German. While the existence of distinct article paradigms in various Germanic dialects and other languages has been acknowledged in the descriptive literature for quite some time, the theoretical implications of their existence…

  20. FromTo-CLIR: Web-Based Natural Language Interface for Cross-Language Information Retrieval.

    ERIC Educational Resources Information Center

    Kim, Taewan; Sim, Chul-Min; Yuh, Sanghwa; Jung, Hanmin; Kim, Young-Kil; Choi, Sung-Kwon; Park, Dong-In; Choi, Key Sun

    1999-01-01

    Describes the implementation of FromTo-CLIR, a Web-based natural-language interface for cross-language information retrieval that was tested with Korean and Japanese. Proposes a method that uses a semantic category tree and collocation to resolve the ambiguity of query translation. (Author/LRW)

  1. Parent-Implemented Natural Language Paradigm to Increase Language and Play in Children with Autism

    ERIC Educational Resources Information Center

    Gillett, Jill N.; LeBlanc, Linda A.

    2007-01-01

    Three parents of children with autism were taught to implement the Natural Language Paradigm (NLP). Data were collected on parent implementation, multiple measures of child language, and play. The parents were able to learn to implement the NLP procedures quickly and accurately with beneficial results for their children. Increases in the overall…

  2. Natural Language Processing Techniques in Computer-Assisted Language Learning: Status and Instructional Issues.

    ERIC Educational Resources Information Center

    Holland, V. Melissa; Kaplan, Jonathan D.

    1995-01-01

    Describes the role of natural language processing (NLP) techniques, such as parsing and semantic analysis, within current language tutoring systems. Examines trends, design issues and tradeoffs, and potential contributions of NLP techniques with respect to instructional theory and educational practice. Addresses limitations and problems in using…

  3. Overview of Computer-based Natural Language Processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  4. Overview of computer-based natural language processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language-Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc. in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state-of-the-art of the technology, issues and research requirements, the major participants, and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and other who will be affected by this field as it unfolds.

  5. An overview of computer-based natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1983-01-01

    Computer based Natural Language Processing (NLP) is the key to enabling humans and their computer based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  6. Experience with a mixed semantic/syntactic parser.

    PubMed Central

    Haug, P. J.; Koehler, S.; Lau, L. M.; Wang, P.; Rocha, R.; Huff, S. M.

    1995-01-01

    The value of the computerized medical record is derived in part from the availability of medical information in a coded form accessible to manipulation by processes designed for automated decision support, medical research, and computer assistance in the management of health care delivery. To meet these needs medical reports captured and stored as natural language documents must be encoded. Below we discuss an ongoing formative process aimed at developing a natural language understanding system for chest x-ray reports. Comparative data showing the progress of this process is presented. PMID:8563286

  7. The redundancy of recursion and infinity for natural language.

    PubMed

    Luuk, Erkki; Luuk, Hendrik

    2011-02-01

    An influential line of thought claims that natural language and arithmetic processing require recursion, a putative hallmark of human cognitive processing (Chomsky in Evolution of human language: biolinguistic perspectives. Cambridge University Press, Cambridge, pp 45-61, 2010; Fitch et al. in Cognition 97(2):179-210, 2005; Hauser et al. in Science 298(5598):1569-1579, 2002). First, we question the need for recursion in human cognitive processing by arguing that a generally simpler and less resource demanding process--iteration--is sufficient to account for human natural language and arithmetic performance. We argue that the only motivation for recursion, the infinity in natural language and arithmetic competence, is equally approachable by iteration and recursion. Second, we submit that the infinity in natural language and arithmetic competence reduces to imagining infinite embedding or concatenation, which is completely independent from the ability to implement infinite processing, and thus, independent from both recursion and iteration. Furthermore, we claim that a property of natural language is physically uncountable finity and not discrete infinity. PMID:20652723

  8. The integration hypothesis of human language evolution and the nature of contemporary languages

    PubMed Central

    Miyagawa, Shigeru; Ojima, Shiro; Berwick, Robert C.; Okanoya, Kazuo

    2014-01-01

    How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa et al. (2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, that holds that human language is composed of two components, E for expressive, and L for lexical. Each component has an antecedent in nature: E as found, for example, in birdsong, and L in, for example, the alarm calls of monkeys. E and L integrated uniquely in humans to give rise to language. A challenge to the Integration Hypothesis is that while these non-human systems are finite-state in nature, human language is known to require characterization by a non-finite state grammar. Our claim is that E and L, taken separately, are in fact finite-state; when a grammatical process crosses the boundary between E and L, it gives rise to the non-finite state character of human language. We provide empirical evidence for the Integration Hypothesis by showing that certain processes found in contemporary languages that have been characterized as non-finite state in nature can in fact be shown to be finite-state. We also speculate on how human language actually arose in evolution through the lens of the Integration Hypothesis. PMID:24936195

  9. UMLS knowledge for biomedical language processing.

    PubMed Central

    McCray, A T; Aronson, A R; Browne, A C; Rindflesch, T C; Razi, A; Srinivasan, S

    1993-01-01

    This paper describes efforts to provide access to the free text in biomedical databases. The focus of the effort is the development of SPECIALIST, an experimental natural language processing system for the biomedical domain. The system includes a broad coverage parser supported by a large lexicon, modules that provide access to the extensive Unified Medical Language System (UMLS) Knowledge Sources, and a retrieval module that permits experiments in information retrieval. The UMLS Metathesaurus and Semantic Network provide a rich source of biomedical concepts and their interrelationships. Investigations have been conducted to determine the type of information required to effect a map between the language of queries and the language of relevant documents. Mappings are never straightforward and often involve multiple inferences. PMID:8472004

  10. Analyzing Learner Language: Towards a Flexible Natural Language Processing Architecture for Intelligent Language Tutors

    ERIC Educational Resources Information Center

    Amaral, Luiz; Meurers, Detmar; Ziai, Ramon

    2011-01-01

    Intelligent language tutoring systems (ILTS) typically analyze learner input to diagnose learner language properties and provide individualized feedback. Despite a long history of ILTS research, such systems are virtually absent from real-life foreign language teaching (FLT). Taking a step toward more closely linking ILTS research to real-life…

  11. Artificial intelligence, expert systems, computer vision, and natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  12. Natural language processing and the Now-or-Never bottleneck.

    PubMed

    Gómez-Rodríguez, Carlos

    2016-01-01

    Researchers, motivated by the need to improve the efficiency of natural language processing tools to handle web-scale data, have recently arrived at models that remarkably match the expected features of human language processing under the Now-or-Never bottleneck framework. This provides additional support for said framework and highlights the research potential in the interaction between applied computational linguistics and cognitive science. PMID:27561430

  13. Toward a Natural Language Interface for EHR Questions

    PubMed Central

    Roberts, Kirk; Demner-Fushman, Dina

    2015-01-01

    This paper presents a pilot study on the process of manually annotating natural language EHR questions with a formal meaning representation. This formal representation could then be used as a structured query as part of a natural language interface for electronic health records. This study analyzes the challenges of representing EHR questions as structured queries as well as the feasibility of creating a sufficiently large corpus of manually annotated structured queries for EHR questions. A set of 100 EHR questions, sampled from actual questions asked by ICU physicians[1], is used to perform the analysis. The ultimate goal of this research is to enable automatic methods for understanding EHR questions for use in a natural language EHR interface. PMID:26306260

  14. Software Development Of XML Parser Based On Algebraic Tools

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2011-12-01

    In this paper, is presented one software development and implementation of an algebraic method for XML data processing, which accelerates XML parsing process. Therefore, the proposed in this article nontraditional approach for fast XML navigation with algebraic tools contributes to advanced efforts in the making of an easier user-friendly API for XML transformations. Here the proposed software for XML documents processing (parser) is easy to use and can manage files with strictly defined data structure. The purpose of the presented algorithm is to offer a new approach for search and restructuring hierarchical XML data. This approach permits fast XML documents processing, using algebraic model developed in details in previous works of the same authors. So proposed parsing mechanism is easy accessible to the web consumer who is able to control XML file processing, to search different elements (tags) in it, to delete and to add a new XML content as well. The presented various tests show higher rapidity and low consumption of resources in comparison with some existing commercial parsers.

  15. What Is the Nature of Poststroke Language Recovery and Reorganization?

    PubMed Central

    Kiran, Swathi

    2012-01-01

    This review focuses on three main topics related to the nature of poststroke language recovery and reorganization. The first topic pertains to the nature of anatomical and physiological substrates in the infarcted hemisphere in poststroke aphasia, including the nature of the hemodynamic response in patients with poststroke aphasia, the nature of the peri-infarct tissue, and the neuronal plasticity potential in the infarcted hemisphere. The second section of the paper reviews the current neuroimaging evidence for language recovery in the acute, subacute, and chronic stages of recovery. The third and final section examines changes in connectivity as a function of recovery in poststroke aphasia, specifically in terms of changes in white matter connectivity, changes in functional effective connectivity, and changes in resting state connectivity after stroke. While much progress has been made in our understanding of language recovery, more work needs to be done. Future studies will need to examine whether reorganization of language in poststroke aphasia corresponds to a tighter, more coherent, and efficient network of residual and new regions in the brain. Answering these questions will go a long way towards being able to predict which patients are likely to recover and may benefit from future rehabilitation. PMID:23320190

  16. The Nature of Object Marking in American Sign Language

    ERIC Educational Resources Information Center

    Gokgoz, Kadir

    2013-01-01

    In this dissertation, I examine the nature of object marking in American Sign Language (ASL). I investigate object marking by means of directionality (the movement of the verb towards a certain location in signing space) and by means of handling classifiers (certain handshapes accompanying the verb). I propose that object marking in ASL is…

  17. Learning by Communicating in Natural Language with Conversational Agents

    ERIC Educational Resources Information Center

    Graesser, Arthur; Li, Haiying; Forsyth, Carol

    2014-01-01

    Learning is facilitated by conversational interactions both with human tutors and with computer agents that simulate human tutoring and ideal pedagogical strategies. In this article, we describe some intelligent tutoring systems (e.g., AutoTutor) in which agents interact with students in natural language while being sensitive to their cognitive…

  18. Orwell's 1984: Natural Language Searching and the Contemporary Metaphor.

    ERIC Educational Resources Information Center

    Dadlez, Eva M.

    1984-01-01

    Describes a natural language searching strategy for retrieving current material which has bearing on George Orwell's "1984," and identifies four main themes (technology, authoritarianism, press and psychological/linguistic implications of surveillance, political oppression) which have emerged from cross-database searches of the "Big Brother"…

  19. Principles of Organization in Young Children's Natural Language Hierarchies.

    ERIC Educational Resources Information Center

    Callanan, Maureen A.; Markman, Ellen M.

    1982-01-01

    When preschool children think of objects as organized into collections (e.g., forest, army) they solve certain problems better than when they think of the same objects as organized into classes (e.g., trees, soldiers). Present studies indicate preschool children occasionally distort natural language inclusion hierarchies (e.g., oak, tree) into the…

  20. Research at Yale in Natural Language Processing. Research Report #84.

    ERIC Educational Resources Information Center

    Schank, Roger C.

    This report summarizes the capabilities of five computer programs at Yale that do automatic natural language processing as of the end of 1976. For each program an introduction to its overall intent is given, followed by the input/output, a short discussion of the research underlying the program, and a prognosis for future development. The programs…

  1. Naturally Simplified Input, Comprehension, and Second Language Acquisition.

    ERIC Educational Resources Information Center

    Ellis, Rod

    This article examines the concept of simplification in second language (SL) learning, reviewing research on the simplified input that both naturalistic and classroom SL learners receive. Research indicates that simplified input, particularly if derived from naturally occurring interactions, does aid comprehension but has not been shown to…

  2. Dealing with Quantifier Scope Ambiguity in Natural Language Understanding

    ERIC Educational Resources Information Center

    Hafezi Manshadi, Mohammad

    2014-01-01

    Quantifier scope disambiguation (QSD) is one of the most challenging problems in deep natural language understanding (NLU) systems. The most popular approach for dealing with QSD is to simply leave the semantic representation (scope-) underspecified and to incrementally add constraints to filter out unwanted readings. Scope underspecification has…

  3. Design of Lexicons in Some Natural Language Systems.

    ERIC Educational Resources Information Center

    Cercone, Nick; Mercer, Robert

    1980-01-01

    Discusses an investigation of certain problems concerning the structural design of lexicons used in computational approaches to natural language understanding. Emphasizes three aspects of design: retrieval of relevant portions of lexicals items, storage requirements, and representation of meaning in the lexicon. (Available from ALLC, Dr. Rex Last,…

  4. Recurrent Artificial Neural Networks and Finite State Natural Language Processing.

    ERIC Educational Resources Information Center

    Moisl, Hermann

    It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…

  5. Analyzing Discourse Processing Using a Simple Natural Language Processing Tool

    ERIC Educational Resources Information Center

    Crossley, Scott A.; Allen, Laura K.; Kyle, Kristopher; McNamara, Danielle S.

    2014-01-01

    Natural language processing (NLP) provides a powerful approach for discourse processing researchers. However, there remains a notable degree of hesitation by some researchers to consider using NLP, at least on their own. The purpose of this article is to introduce and make available a "simple" NLP (SiNLP) tool. The overarching goal of…

  6. Proof-Theoretic Semantics for a Natural Language Fragment

    NASA Astrophysics Data System (ADS)

    Francez, Nissim; Dyckhoff, Roy

    We propose a Proof - Theoretic Semantics (PTS) for a (positive) fragment E+0 of Natural Language (NL) (English in this case). The semantics is intended [7] to be incorporated into actual grammars, within the framework of Type - Logical Grammar (TLG) [12]. Thereby, this semantics constitutes an alternative to the traditional model - theoretic semantics (MTS), originating in Montague's seminal work [11], used in TLG.

  7. Learning from a Computer Tutor with Natural Language Capabilities

    ERIC Educational Resources Information Center

    Michael, Joel; Rovick, Allen; Glass, Michael; Zhou, Yujian; Evens, Martha

    2003-01-01

    CIRCSIM-Tutor is a computer tutor designed to carry out a natural language dialogue with a medical student. Its domain is the baroreceptor reflex, the part of the cardiovascular system that is responsible for maintaining a constant blood pressure. CIRCSIM-Tutor's interaction with students is modeled after the tutoring behavior of two experienced…

  8. CITE NLM: Natural-Language Searching in an Online Catalog.

    ERIC Educational Resources Information Center

    Doszkocs, Tamas E.

    1983-01-01

    The National Library of Medicine's Current Information Transfer in English public access online catalog offers unique subject search capabilities--natural-language query input, automatic medical subject headings display, closest match search strategy, ranked document output, dynamic end user feedback for search refinement. References, description…

  9. Parsley: a Command-Line Parser for Astronomical Applications

    NASA Astrophysics Data System (ADS)

    Deich, William

    Parsley is a sophisticated keyword + value parser, packaged as a library of routines that offers an easy method for providing command-line arguments to programs. It makes it easy for the user to enter values, and it makes it easy for the programmer to collect and validate the user's entries. Parsley is tuned for astronomical applications: for example, dates entered in Julian, Modified Julian, calendar, or several other formats are all recognized without special effort by the user or by the programmer; angles can be entered using decimal degrees or dd:mm:ss; time-like intervals as decimal hours, hh:mm:ss, or a variety of other units. Vectors of data are accepted as readily as scalars.

  10. Processing of ICARTT Data Files Using Fuzzy Matching and Parser Combinators

    NASA Technical Reports Server (NTRS)

    Rutherford, Matthew T.; Typanski, Nathan D.; Wang, Dali; Chen, Gao

    2014-01-01

    In this paper, the task of parsing and matching inconsistent, poorly formed text data through the use of parser combinators and fuzzy matching is discussed. An object-oriented implementation of the parser combinator technique is used to allow for a relatively simple interface for adapting base parsers. For matching tasks, a fuzzy matching algorithm with Levenshtein distance calculations is implemented to match string pair, which are otherwise difficult to match due to the aforementioned irregularities and errors in one or both pair members. Used in concert, the two techniques allow parsing and matching operations to be performed which had previously only been done manually.

  11. Developing Formal Correctness Properties from Natural Language Requirements

    NASA Technical Reports Server (NTRS)

    Nikora, Allen P.

    2006-01-01

    This viewgraph presentation reviews the rationale of the program to transform natural language specifications into formal notation.Specifically, automate generation of Linear Temporal Logic (LTL)correctness properties from natural language temporal specifications. There are several reasons for this approach (1) Model-based techniques becoming more widely accepted, (2) Analytical verification techniques (e.g., model checking, theorem proving) significantly more effective at detecting types of specification design errors (e.g., race conditions, deadlock) than manual inspection, (3) Many requirements still written in natural language, which results in a high learning curve for specification languages, associated tools and increased schedule and budget pressure on projects reduce training opportunities for engineers, and (4) Formulation of correctness properties for system models can be a difficult problem. This has relevance to NASA in that it would simplify development of formal correctness properties, lead to more widespread use of model-based specification, design techniques, assist in earlier identification of defects and reduce residual defect content for space mission software systems. The presentation also discusses: potential applications, accomplishments and/or technological transfer potential and the next steps.

  12. Blurring the Inputs: A Natural Language Approach to Sensitivity Analysis

    NASA Technical Reports Server (NTRS)

    Kleb, William L.; Thompson, Richard A.; Johnston, Christopher O.

    2007-01-01

    To document model parameter uncertainties and to automate sensitivity analyses for numerical simulation codes, a natural-language-based method to specify tolerances has been developed. With this new method, uncertainties are expressed in a natural manner, i.e., as one would on an engineering drawing, namely, 5.25 +/- 0.01. This approach is robust and readily adapted to various application domains because it does not rely on parsing the particular structure of input file formats. Instead, tolerances of a standard format are added to existing fields within an input file. As a demonstration of the power of this simple, natural language approach, a Monte Carlo sensitivity analysis is performed for three disparate simulation codes: fluid dynamics (LAURA), radiation (HARA), and ablation (FIAT). Effort required to harness each code for sensitivity analysis was recorded to demonstrate the generality and flexibility of this new approach.

  13. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

    ERIC Educational Resources Information Center

    Jarman, Jay

    2011-01-01

    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  14. Using natural language processing techniques to inform research on nanotechnology

    PubMed Central

    Lewinski, Nastassja A

    2015-01-01

    Summary Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics. PMID:26199848

  15. Natural language processing and its future in medicine.

    PubMed

    Friedman, C; Hripcsak, G

    1999-08-01

    If accurate clinical information were available electronically, automated applications could be developed to use this information to improve patient care and lower costs. However, to be fully retrievable, clinical information must be structured or coded. Many online patient reports are not coded, but are recorded in natural-language text that cannot be reliably accessed. Natural language processing (NLP) can solve this problem by extracting and structuring text-based clinical information, making clinical data available for use. NLP systems are quite difficult to develop, as they require substantial amounts of knowledge, but progress has definitely been made. Some NLP systems have been developed and tested and have demonstrated promising performance in practical clinical applications; some of these systems have already been deployed. The authors provide background information about NLP, briefly describe some of the systems that have been recently developed, and discuss the future of NLP in medicine. PMID:10495728

  16. Recent advances in natural language processing for biomedical applications.

    PubMed

    Collier, Nigel; Nazarenko, Adeline; Baud, Robert; Ruch, Patrick

    2006-06-01

    We survey a set a recent advances in natural language processing applied to biomedical applications, which were presented in Geneva, Switzerland, in 2004 at an international workshop. While text mining applied to molecular biology and biomedical literature can report several interesting achievements, we observe that studies applied to clinical contents are still rare. In general, we argue that clinical corpora, including electronic patient records, must be made available to fill the gap between bioinformatics and medical informatics. PMID:16139564

  17. A statistical natural language processor for medical reports.

    PubMed Central

    Taira, R. K.; Soderland, S. G.

    1999-01-01

    Statistical natural language processors have been the focus of much research during the past decade. The main advantage of such an approach over grammatical rule-based approaches is its scalability to new domains. We present a statistical NLP for the domain of radiology and report on methods of knowledge acquisition, parsing, semantic interpretation, and evaluation. Preliminary performance data are given. A discussion of the perceived benefit, limitations and future work is presented. PMID:10566505

  18. A Codasyl-Type Schema for Natural Language Medical Records

    PubMed Central

    Sager, N.; Tick, L.; Story, G.; Hirschman, L.

    1980-01-01

    This paper describes a CODASYL (network) database schema for information derived from narrative clinical reports. The goal of this work is to create an automated process that accepts natural language documents as input and maps this information into a database of a type managed by existing database management systems. The schema described here represents the medical events and facts identified through the natural language processing. This processing decomposes each narrative into a set of elementary assertions, represented as MEDFACT records in the database. Each assertion in turn consists of a subject and a predicate classed according to a limited number of medical event types, e.g., signs/symptoms, laboratory tests, etc. The subject and predicate are represented by EVENT records which are owned by the MEDFACT record associated with the assertion. The CODASYL-type network structure was found to be suitable for expressing most of the relations needed to represent the natural language information. However, special mechanisms were developed for storing the time relations between EVENT records and for recording connections (such as causality) between certain MEDFACT records. This schema has been implemented using the UNIVAC DMS-1100 DBMS.

  19. Applications of Natural Language Processing in Biodiversity Science

    PubMed Central

    Thessen, Anne E.; Cui, Hong; Mozzherin, Dmitry

    2012-01-01

    Centuries of biological knowledge are contained in the massive body of scientific literature, written for human-readability but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science. PMID:22685456

  20. Natural Language Processing Technologies in Radiology Research and Clinical Applications.

    PubMed

    Cai, Tianrun; Giannopoulos, Andreas A; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K; Rybicki, Frank J; Mitsouras, Dimitrios

    2016-01-01

    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively "mine" these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. "Intelligent" search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. PMID:26761536

  1. Human task animation from performance models and natural language input

    NASA Technical Reports Server (NTRS)

    Esakov, Jeffrey; Badler, Norman I.; Jung, Moon

    1989-01-01

    Graphical manipulation of human figures is essential for certain types of human factors analyses such as reach, clearance, fit, and view. In many situations, however, the animation of simulated people performing various tasks may be based on more complicated functions involving multiple simultaneous reaches, critical timing, resource availability, and human performance capabilities. One rather effective means for creating such a simulation is through a natural language description of the tasks to be carried out. Given an anthropometrically-sized figure and a geometric workplace environment, various simple actions such as reach, turn, and view can be effectively controlled from language commands or standard NASA checklist procedures. The commands may also be generated by external simulation tools. Task timing is determined from actual performance models, if available, such as strength models or Fitts' Law. The resulting action specification are animated on a Silicon Graphics Iris workstation in real-time.

  2. Deviations in the Zipf and Heaps laws in natural languages

    NASA Astrophysics Data System (ADS)

    Bochkarev, Vladimir V.; Lerner, Eduard Yu; Shevlyakova, Anna V.

    2014-03-01

    This paper is devoted to verifying of the empirical Zipf and Hips laws in natural languages using Google Books Ngram corpus data. The connection between the Zipf and Heaps law which predicts the power dependence of the vocabulary size on the text size is discussed. In fact, the Heaps exponent in this dependence varies with the increasing of the text corpus. To explain it, the obtained results are compared with the probability model of text generation. Quasi-periodic variations with characteristic time periods of 60-100 years were also found.

  3. A natural language understanding system combining syntactic and semantic techniques.

    PubMed Central

    Haug, P.; Koehler, S.; Lau, L. M.; Wang, P.; Rocha, R.; Huff, S.

    1994-01-01

    A large proportion of the medical record currently available in computerized medical information systems is in the form of free text reports. While the accessibility of this source of data is improved through inclusion in the computerized record, it remains unavailable for automated decision support, medical research, and management of medical delivery systems. Natural language understanding systems (NLUS) designed to encode free text reports represent one approach to making this information available for these uses. Below we describe an experimental NLUS designed to parse the reports of chest radiographs and store the clinical data extracted in a medical data base. PMID:7949928

  4. Restricted natural language processing for case simulation tools.

    PubMed Central

    Lehmann, C. U.; Nguyen, B.; Kim, G. R.; Johnson, K. B.; Lehmann, H. P.

    1999-01-01

    For Interactive Patient II, a multimedia case simulation designed to improve history-taking skills, we created a new natural language interface called GRASP (General Recognition and Analysis of Sentences and Phrases) that allows students to interact with the program at a higher level of realism. Requirements included the ability to handle ambiguous word senses and to match user questions/queries to unique Canonical Phrases, which are used to identify case findings in our knowledge database. In a simulation of fifty user queries, some of which contained ambiguous words, this tool was 96% accurate in identifying concepts. PMID:10566424

  5. Natural Language Processing as a Discipline at LLNL

    SciTech Connect

    Firpo, M A

    2005-02-04

    The field of Natural Language Processing (NLP) is described as it applies to the needs of LLNL in handling free-text. The state of the practice is outlined with the emphasis placed on two specific aspects of NLP: Information Extraction and Discourse Integration. A brief description is included of the NLP applications currently being used at LLNL. A gap analysis provides a look at where the technology needs work in order to meet the needs of LLNL. Finally, recommendations are made to meet these needs.

  6. Description directed control: its implications for natural language generation

    SciTech Connect

    Mcdonald, D.D.

    1983-01-01

    This paper proposes a very specifically constrained virtual machine design for goal-directed natural language generation based on a refinement of the technique of data-directed control that the author has termed description-directed control. Important psycholinguistic properties of generation follow inescapably from the use of this control technique, including: efficient runtimes, bounded lookahead, indelible decisions, incremental production of the text, and inescapable adherence to gramaticality. The technique also provides a possible explanation for some well-known universal constraints, though this cannot be confirmed without further empirical investigation. 29 references.

  7. Query2Question: Translating Visualization Interaction into Natural Language.

    PubMed

    Nafari, Maryam; Weaver, Chris

    2015-06-01

    Richly interactive visualization tools are increasingly popular for data exploration and analysis in a wide variety of domains. Existing systems and techniques for recording provenance of interaction focus either on comprehensive automated recording of low-level interaction events or on idiosyncratic manual transcription of high-level analysis activities. In this paper, we present the architecture and translation design of a query-to-question (Q2Q) system that automatically records user interactions and presents them semantically using natural language (written English). Q2Q takes advantage of domain knowledge and uses natural language generation (NLG) techniques to translate and transcribe a progression of interactive visualization states into a visual log of styled text that complements and effectively extends the functionality of visualization tools. We present Q2Q as a means to support a cross-examination process in which questions rather than interactions are the focus of analytic reasoning and action. We describe the architecture and implementation of the Q2Q system, discuss key design factors and variations that effect question generation, and present several visualizations that incorporate Q2Q for analysis in a variety of knowledge domains. PMID:26357239

  8. Suicide Note Classification Using Natural Language Processing: A Content Analysis

    PubMed Central

    Pestian, John; Nasrallah, Henry; Matykiewicz, Pawel; Bennett, Aurora; Leenaars, Antoon

    2010-01-01

    Suicide is the second leading cause of death among 25–34 year olds and the third leading cause of death among 15–25 year olds in the United States. In the Emergency Department, where suicidal patients often present, estimating the risk of repeated attempts is generally left to clinical judgment. This paper presents our second attempt to determine the role of computational algorithms in understanding a suicidal patient’s thoughts, as represented by suicide notes. We focus on developing methods of natural language processing that distinguish between genuine and elicited suicide notes. We hypothesize that machine learning algorithms can categorize suicide notes as well as mental health professionals and psychiatric physician trainees do. The data used are comprised of suicide notes from 33 suicide completers and matched to 33 elicited notes from healthy control group members. Eleven mental health professionals and 31 psychiatric trainees were asked to decide if a note was genuine or elicited. Their decisions were compared to nine different machine-learning algorithms. The results indicate that trainees accurately classified notes 49% of the time, mental health professionals accurately classified notes 63% of the time, and the best machine learning algorithm accurately classified the notes 78% of the time. This is an important step in developing an evidence-based predictor of repeated suicide attempts because it shows that natural language processing can aid in distinguishing between classes of suicidal notes. PMID:21643548

  9. Natural Language Processing in Radiology: A Systematic Review.

    PubMed

    Pons, Ewoud; Braun, Loes M M; Hunink, M G Myriam; Kors, Jan A

    2016-05-01

    Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed. (©) RSNA, 2016 Online supplemental material is available for this article. PMID:27089187

  10. Does textual feedback hinder spoken interaction in natural language?

    PubMed

    Le Bigot, Ludovic; Terrier, Patrice; Jamet, Eric; Botherel, Valerie; Rouet, Jean-Francois

    2010-01-01

    The aim of the study was to determine the influence of textual feedback on the content and outcome of spoken interaction with a natural language dialogue system. More specifically, the assumption that textual feedback could disrupt spoken interaction was tested in a human-computer dialogue situation. In total, 48 adult participants, familiar with the system, had to find restaurants based on simple or difficult scenarios using a real natural language service system in a speech-only (phone), speech plus textual dialogue history (multimodal) or text-only (web) modality. The linguistic contents of the dialogues differed as a function of modality, but were similar whether the textual feedback was included in the spoken condition or not. These results add to burgeoning research efforts on multimodal feedback, in suggesting that textual feedback may have little or no detrimental effect on information searching with a real system. STATEMENT OF RELEVANCE: The results suggest that adding textual feedback to interfaces for human-computer dialogue could enhance spoken interaction rather than create interference. The literature currently suggests that adding textual feedback to tasks that depend on the visual sense benefits human-computer interaction. The addition of textual output when the spoken modality is heavily taxed by the task was investigated. PMID:20069480

  11. Event construal and temporal distance in natural language.

    PubMed

    Bhatia, Sudeep; Walasek, Lukasz

    2016-07-01

    Construal level theory proposes that events that are temporally proximate are represented more concretely than events that are temporally distant. We tested this prediction using two large natural language text corpora. In study 1 we examined posts on Twitter that referenced the future, and found that tweets mentioning temporally proximate dates used more concrete words than those mentioning distant dates. In study 2 we obtained all New York Times articles that referenced U.S. presidential elections between 1987 and 2007. We found that the concreteness of the words in these articles increased with the temporal proximity to their corresponding election. Additionally the reduction in concreteness after the election was much greater than the increase in concreteness leading up to the election, though both changes in concreteness were well described by an exponential function. We replicated this finding with New York Times articles referencing US public holidays. Overall, our results provide strong support for the predictions of construal level theory, and additionally illustrate how large natural language datasets can be used to inform psychological theory. PMID:27015347

  12. Neurolinguistics and psycholinguistics as a basis for computer acquisition of natural language

    SciTech Connect

    Powers, D.M.W.

    1983-04-01

    Research into natural language understanding systems for computers has concentrated on implementing particular grammars and grammatical models of the language concerned. This paper presents a rationale for research into natural language understanding systems based on neurological and psychological principles. Important features of the approach are that it seeks to place the onus of learning the language on the computer, and that it seeks to make use of the vast wealth of relevant psycholinguistic and neurolinguistic theory. 22 references.

  13. Solving problems on base of concepts formalization of language image and figurative meaning of the natural-language constructs

    NASA Astrophysics Data System (ADS)

    Bisikalo, Oleg V.; Cieszczyk, Sławomir; Yussupova, Gulbahar

    2015-12-01

    Building of "clever" thesaurus by algebraic means on base of concepts formalization of language image and figurative meaning of the natural-language constructs in the article are proposed. A formal theory based on a binary operator of directional associative relation is constructed and an understanding of an associative normal form of image constructions is introduced. A model of a commutative semigroup, which provides a presentation of a sentence as three components of an interrogative language image construction, is considered.

  14. Evaluation of two dependency parsers on biomedical corpus targeted at protein-protein interactions.

    PubMed

    Pyysalo, Sampo; Ginter, Filip; Pahikkala, Tapio; Boberg, Jorma; Järvinen, Jouni; Salakoski, Tapio

    2006-06-01

    We present an evaluation of Link Grammar and Connexor Machinese Syntax, two major broad-coverage dependency parsers, on a custom hand-annotated corpus consisting of sentences regarding protein-protein interactions. In the evaluation, we apply the notion of an interaction subgraph, which is the subgraph of a dependency graph expressing a protein-protein interaction. We measure the performance of the parsers for recovery of individual dependencies, fully correct parses, and interaction subgraphs. For Link Grammar, an open system that can be inspected in detail, we further perform a comprehensive failure analysis, report specific causes of error, and suggest potential modifications to the grammar. We find that both parsers perform worse on biomedical English than previously reported on general English. While Connexor Machinese Syntax significantly outperforms Link Grammar, the failure analysis suggests specific ways in which the latter could be modified for better performance in the domain. PMID:16099201

  15. TreeParser-Aided Klee Diagrams Display Taxonomic Clusters in DNA Barcode and Nuclear Gene Datasets

    PubMed Central

    Stoeckle, Mark Y.; Coffran, Cameron

    2013-01-01

    Indicator vector analysis of a nucleotide sequence alignment generates a compact heat map, called a Klee diagram, with potential insight into clustering patterns in evolution. However, so far this approach has examined only mitochondrial cytochrome c oxidase I (COI) DNA barcode sequences. To further explore, we developed TreeParser, a freely-available web-based program that sorts a sequence alignment according to a phylogenetic tree generated from the dataset. We applied TreeParser to nuclear gene and COI barcode alignments from birds and butterflies. Distinct blocks in the resulting Klee diagrams corresponded to species and higher-level taxonomic divisions in both groups, and this enabled graphic comparison of phylogenetic information in nuclear and mitochondrial genes. Our results demonstrate TreeParser-aided Klee diagrams objectively display taxonomic clusters in nucleotide sequence alignments. This approach may help establish taxonomy in poorly studied groups and investigate higher-level clustering which appears widespread but not well understood. PMID:24022383

  16. Emerging Approach of Natural Language Processing in Opinion Mining: A Review

    NASA Astrophysics Data System (ADS)

    Kim, Tai-Hoon

    Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages. This paper outlines a framework to use computer and natural language techniques for various levels of learners to learn foreign languages in Computer-based Learning environment. We propose some ideas for using the computer as a practical tool for learning foreign language where the most of courseware is generated automatically. We then describe how to build Computer Based Learning tools, discuss its effectiveness, and conclude with some possibilities using on-line resources.

  17. Breaking the Molds: Signed Languages and the Nature of Human Language

    ERIC Educational Resources Information Center

    Slobin, Dan I.

    2008-01-01

    Grammars of signed languages tend to be based on grammars established for written languages, particularly the written language in use in the surrounding hearing community of a sign language. Such grammars presuppose categories of discrete elements which are combined into various sorts of structures. Recent analyses of signed languages go beyond…

  18. Validation of clinical problems using a UMLS-based semantic parser.

    PubMed

    Goldberg, H S; Hsu, C; Law, V; Safran, C

    1998-01-01

    The capture and symbolization of data from the clinical problem list facilitates the creation of high-fidelity patient resumes for use in aggregate analysis and decision support. We report on the development of a UMLS-based semantic parser and present a preliminary evaluation of the parser in the recognition and validation of disease-related clinical problems. We randomly sampled 20% of the 26,858 unique non-dictionary clinical problems entered into OMR (Online Medical Record) between 1989 and August, 1997, and eliminated a series of qualified problem labels, e.g., history-of, to obtain a dataset of 4122 problem labels. Within this dataset, the authors identified 2810 labels (68.2%) as referring to a broad range of disease-related processes. The parser correctly recognized and validated 1398 of the 2810 disease-related labels (49.8 +/- 1.9%) and correctly excluded 1220 of 1312 non-disease-related labels (93.0 +/- 1.4%). 812 of the 1181 match failures (68.8%) were caused by terms either absent from UMLS or modifiers not accepted by the parser; 369 match failures (31.2%) were caused by labels having patterns not recognized by the parser. By enriching the UMLS lexicon with terms commonly found in provider-entered labels, it appears that performance of the parser can be significantly enhanced over a few subsequent iterations. This initial evaluation provides a foundation from which to make principled additions to the UMLS lexicon locally for use in symbolizing clinical data; further research is necessary to determine applicability to other health care settings. PMID:9929330

  19. Children with Language Disorders: Natural History and Academic Success.

    ERIC Educational Resources Information Center

    Bashir, Anthony S.; Scavuzzo, Annebelle

    1992-01-01

    This article addresses the academic difficulties of children with language disorders (including dyslexia) and suggests that their persistent academic vulnerability results from the lifelong need to acquire language, to learn with language, and to apply language knowledge for academic learning and social development. The need for continuing…

  20. Semantic Grammar: A Technique for Constructing Natural Language Interfaces to Instructional Systems.

    ERIC Educational Resources Information Center

    Burton, Richard R.; Brown, John Seely

    A major obstacle to the effective educational use of computers is the lack of a natural means of communication between the student and the computer. This report describes a technique for generating such natural language front-ends for advanced instructional systems. It discusses: (1) the essential properties of a natural language front-end, (2)…

  1. What baboons can (not) tell us about natural language grammars.

    PubMed

    Poletiek, Fenna H; Fitz, Hartmut; Bocanegra, Bruno R

    2016-06-01

    Rey et al. (2012) present data from a study with baboons that they interpret in support of the idea that center-embedded structures in human language have their origin in low level memory mechanisms and associative learning. Critically, the authors claim that the baboons showed a behavioral preference that is consistent with center-embedded sequences over other types of sequences. We argue that the baboons' response patterns suggest that two mechanisms are involved: first, they can be trained to associate a particular response with a particular stimulus, and, second, when faced with two conditioned stimuli in a row, they respond to the most recent one first, copying behavior they had been rewarded for during training. Although Rey et al. (2012) 'experiment shows that the baboons' behavior is driven by low level mechanisms, it is not clear how the animal behavior reported, bears on the phenomenon of Center Embedded structures in human syntax. Hence, (1) natural language syntax may indeed have been shaped by low level mechanisms, and (2) the baboons' behavior is driven by low level stimulus response learning, as Rey et al. propose. But is the second evidence for the first? We will discuss in what ways this study can and cannot give evidential value for explaining the origin of Center Embedded recursion in human grammar. More generally, their study provokes an interesting reflection on the use of animal studies in order to understand features of the human linguistic system. PMID:26026382

  2. Spatial and numerical abilities without a complete natural language.

    PubMed

    Hyde, Daniel C; Winkler-Rhoades, Nathan; Lee, Sang-Ah; Izard, Veronique; Shapiro, Kevin A; Spelke, Elizabeth S

    2011-04-01

    We studied the cognitive abilities of a 13-year-old deaf child, deprived of most linguistic input from late infancy, in a battery of tests designed to reveal the nature of numerical and geometrical abilities in the absence of a full linguistic system. Tests revealed widespread proficiency in basic symbolic and non-symbolic numerical computations involving the use of both exact and approximate numbers. Tests of spatial and geometrical abilities revealed an interesting patchwork of age-typical strengths and localized deficits. In particular, the child performed extremely well on navigation tasks involving geometrical or landmark information presented in isolation, but very poorly on otherwise similar tasks that required the combination of the two types of spatial information. Tests of number- and space-specific language revealed proficiency in the use of number words and deficits in the use of spatial terms. This case suggests that a full linguistic system is not necessary to reap the benefits of linguistic vocabulary on basic numerical tasks. Furthermore, it suggests that language plays an important role in the combination of mental representations of space. PMID:21168425

  3. What can Natural Language Processing do for Clinical Decision Support?

    PubMed Central

    Demner-Fushman, Dina; Chapman, Wendy W.; McDonald, Clement J.

    2009-01-01

    Computerized Clinical Decision Support (CDS) aims to aid decision making of health care providers and the public by providing easily accessible health-related information at the point and time it is needed. Natural Language Processing (NLP) is instrumental in using free-text information to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative. The early innovative NLP research of clinical narrative was followed by a period of stable research conducted at the major clinical centers and a shift of mainstream interest to biomedical NLP. This review primarily focuses on the recently renewed interest in development of fundamental NLP methods and advances in the NLP systems for CDS. The current solutions to challenges posed by distinct sublanguages, intended user groups, and support goals are discussed. PMID:19683066

  4. Literature-Based Knowledge Discovery using Natural Language Processing

    NASA Astrophysics Data System (ADS)

    Hristovski, D.; Friedman, C.; Rindflesch, T. C.; Peterlin, B.

    Literature-based discovery (LBD) is an emerging methodology for uncovering nonovert relationships in the online research literature. Making such relationships explicit supports hypothesis generation and discovery. Currently LBD systems depend exclusively on co-occurrence of words or concepts in target documents, regardless of whether relations actually exist between the words or concepts. We describe a method to enhance LBD through capture of semantic relations from the literature via use of natural language processing (NLP). This paper reports on an application of LBD that combines two NLP systems: BioMedLEE and SemRep, which are coupled with an LBD system called BITOLA. The two NLP systems complement each other to increase the types of information utilized by BITOLA. We also discuss issues associated with combining heterogeneous systems. Initial experiments suggest this approach can uncover new associations that were not possible using previous methods.

  5. Natural Language Processing Methods and Systems for Biomedical Ontology Learning

    PubMed Central

    Liu, Kaihong; Hogan, William R.; Crowley, Rebecca S.

    2010-01-01

    While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they must achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships as well as difficulty in updating the ontology as knowledge changes. Methodologies developed in the fields of natural language processing, information extraction, information retrieval and machine learning provide techniques for automating the enrichment of an ontology from free-text documents. In this article, we review existing methodologies and developed systems, and discuss how existing methods can benefit the development of biomedical ontologies. PMID:20647054

  6. Excess entropy in natural language: Present state and perspectives

    NASA Astrophysics Data System (ADS)

    Debowski, Łukasz

    2011-09-01

    We review recent progress in understanding the meaning of mutual information in natural language. Let us define words in a text as strings that occur sufficiently often. In a few previous papers, we have shown that a power-law distribution for so defined words (a.k.a. Herdan's law) is obeyed if there is a similar power-law growth of (algorithmic) mutual information between adjacent portions of texts of increasing length. Moreover, the power-law growth of information holds if texts describe a complicated infinite (algorithmically) random object in a highly repetitive way, according to an analogous power-law distribution. The described object may be immutable (like a mathematical or physical constant) or may evolve slowly in time (like cultural heritage). Here, we reflect on the respective mathematical results in a less technical way. We also discuss feasibility of deciding to what extent these results apply to the actual human communication.

  7. Towards a semantic lexicon for clinical natural language processing.

    PubMed

    Liu, Hongfang; Wu, Stephen T; Li, Dingcheng; Jonnalagadda, Siddhartha; Sohn, Sunghwan; Wagholikar, Kavishwar; Haug, Peter J; Huff, Stanley M; Chute, Christopher G

    2012-01-01

    A semantic lexicon which associates words and phrases in text to concepts is critical for extracting and encoding clinical information in free text and therefore achieving semantic interoperability between structured and unstructured data in Electronic Health Records (EHRs). Directly using existing standard terminologies may have limited coverage with respect to concepts and their corresponding mentions in text. In this paper, we analyze how tokens and phrases in a large corpus distribute and how well the UMLS captures the semantics. A corpus-driven semantic lexicon, MedLex, has been constructed where the semantics is based on the UMLS assisted with variants mined and usage information gathered from clinical text. The detailed corpus analysis of tokens, chunks, and concept mentions shows the UMLS is an invaluable source for natural language processing. Increasing the semantic coverage of tokens provides a good foundation in capturing clinical information comprehensively. The study also yields some insights in developing practical NLP systems. PMID:23304329

  8. From Web Directories to Ontologies: Natural Language Processing Challenges

    NASA Astrophysics Data System (ADS)

    Zaihrayeu, Ilya; Sun, Lei; Giunchiglia, Fausto; Pan, Wei; Ju, Qi; Chi, Mingmin; Huang, Xuanjing

    Hierarchical classifications are used pervasively by humans as a means to organize their data and knowledge about the world. One of their main advantages is that natural language labels, used to describe their contents, are easily understood by human users. However, at the same time, this is also one of their main disadvantages as these same labels are ambiguous and very hard to be reasoned about by software agents. This fact creates an insuperable hindrance for classifications to being embedded in the Semantic Web infrastructure. This paper presents an approach to converting classifications into lightweight ontologies, and it makes the following contributions: (i) it identifies the main NLP problems related to the conversion process and shows how they are different from the classical problems of NLP; (ii) it proposes heuristic solutions to these problems, which are especially effective in this domain; and (iii) it evaluates the proposed solutions by testing them on DMoz data.

  9. The Semantic Web as a Linguistic Resource: Opportunities for Natural Language Generation

    NASA Astrophysics Data System (ADS)

    Mellish, Chris; Sun, Xiantang

    This paper argues that, because the documents of the semantic web are created by human beings, they are actually much more like natural language documents than theory would have us believe. We present evidence that natural language words are used extensively and in complex ways in current ontologies. This leads to a number of dangers for the semantic web, but also opens up interesting new challenges for natural language processing. This is illustrated by our own work using natural language generation to present parts of ontologies.

  10. Tasking and sharing sensing assets using controlled natural language

    NASA Astrophysics Data System (ADS)

    Preece, Alun; Pizzocaro, Diego; Braines, David; Mott, David

    2012-06-01

    We introduce an approach to representing intelligence, surveillance, and reconnaissance (ISR) tasks at a relatively high level in controlled natural language. We demonstrate that this facilitates both human interpretation and machine processing of tasks. More specically, it allows the automatic assignment of sensing assets to tasks, and the informed sharing of tasks between collaborating users in a coalition environment. To enable automatic matching of sensor types to tasks, we created a machine-processable knowledge representation based on the Military Missions and Means Framework (MMF), and implemented a semantic reasoner to match task types to sensor types. We combined this mechanism with a sensor-task assignment procedure based on a well-known distributed protocol for resource allocation. In this paper, we re-formulate the MMF ontology in Controlled English (CE), a type of controlled natural language designed to be readable by a native English speaker whilst representing information in a structured, unambiguous form to facilitate machine processing. We show how CE can be used to describe both ISR tasks (for example, detection, localization, or identication of particular kinds of object) and sensing assets (for example, acoustic, visual, or seismic sensors, mounted on motes or unmanned vehicles). We show how these representations enable an automatic sensor-task assignment process. Where a group of users are cooperating in a coalition, we show how CE task summaries give users in the eld a high-level picture of ISR coverage of an area of interest. This allows them to make ecient use of sensing resources by sharing tasks.

  11. Constructing Concept Schemes From Astronomical Telegrams Via Natural Language Clustering

    NASA Astrophysics Data System (ADS)

    Graham, Matthew; Zhang, M.; Djorgovski, S. G.; Donalek, C.; Drake, A. J.; Mahabal, A.

    2012-01-01

    The rapidly emerging field of time domain astronomy is one of the most exciting and vibrant new research frontiers, ranging in scientific scope from studies of the Solar System to extreme relativistic astrophysics and cosmology. It is being enabled by a new generation of large synoptic digital sky surveys - LSST, PanStarrs, CRTS - that cover large areas of sky repeatedly, looking for transient objects and phenomena. One of the biggest challenges facing these is the automated classification of transient events, a process that needs machine-processible astronomical knowledge. Semantic technologies enable the formal representation of concepts and relations within a particular domain. ATELs (http://www.astronomerstelegram.org) are a commonly-used means for reporting and commenting upon new astronomical observations of transient sources (supernovae, stellar outbursts, blazar flares, etc). However, they are loose and unstructured and employ scientific natural language for description: this makes automated processing of them - a necessity within the next decade with petascale data rates - a challenge. Nevertheless they represent a potentially rich corpus of information that could lead to new and valuable insights into transient phenomena. This project lies in the cutting-edge field of astrosemantics, a branch of astroinformatics, which applies semantic technologies to astronomy. The ATELs have been used to develop an appropriate concept scheme - a representation of the information they contain - for transient astronomy using hierarchical clustering of processed natural language. This allows us to automatically organize ATELs based on the vocabulary used. We conclude that we can use simple algorithms to process and extract meaning from astronomical textual data.

  12. Human-Level Natural Language Understanding: False Progress and Real Challenges

    ERIC Educational Resources Information Center

    Bignoli, Perrin G.

    2013-01-01

    The field of Natural Language Processing (NLP) focuses on the study of how utterances composed of human-level languages can be understood and generated. Typically, there are considered to be three intertwined levels of structure that interact to create meaning in language: syntax, semantics, and pragmatics. Not only is a large amount of…

  13. Natural and Artificial Intelligence, Language, Consciousness, Emotion, and Anticipation

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2010-11-01

    The classical paradigm of the neural brain as the seat of human natural intelligence is too restrictive. This paper defends the idea that the neural ectoderm is the actual brain, based on the development of the human embryo. Indeed, the neural ectoderm includes the neural crest, given by pigment cells in the skin and ganglia of the autonomic nervous system, and the neural tube, given by the brain, the spinal cord, and motor neurons. So the brain is completely integrated in the ectoderm, and cannot work alone. The paper presents fundamental properties of the brain as follows. Firstly, Paul D. MacLean proposed the triune human brain, which consists to three brains in one, following the species evolution, given by the reptilian complex, the limbic system, and the neo-cortex. Secondly, the consciousness and conscious awareness are analysed. Thirdly, the anticipatory unconscious free will and conscious free veto are described in agreement with the experiments of Benjamin Libet. Fourthly, the main section explains the development of the human embryo and shows that the neural ectoderm is the whole neural brain. Fifthly, a conjecture is proposed that the neural brain is completely programmed with scripts written in biological low-level and high-level languages, in a manner similar to the programmed cells by the genetic code. Finally, it is concluded that the proposition of the neural ectoderm as the whole neural brain is a breakthrough in the understanding of the natural intelligence, and also in the future design of robots with artificial intelligence.

  14. The Nature of Spanish versus English Language Use at Home

    ERIC Educational Resources Information Center

    Branum-Martin, Lee; Mehta, Paras D.; Carlson, Coleen D.; Francis, David J.; Goldenberg, Claude

    2014-01-01

    Home language experiences are important for children's development of language and literacy. However, the home language context is complex, especially for Spanish-speaking children in the United States. A child's use of Spanish or English likely ranges along a continuum, influenced by preferences of particular people involved, such as parents,…

  15. Towards a continuous population model for natural language vowel shift.

    PubMed

    Shipman, Patrick D; Faria, Sérgio H; Strickland, Christopher

    2013-09-01

    The Great English Vowel Shift of 16th-19th centuries and the current Northern Cities Vowel Shift are two examples of collective language processes characterized by regular phonetic changes, that is, gradual changes in vowel pronunciation over time. Here we develop a structured population approach to modeling such regular changes in the vowel systems of natural languages, taking into account learning patterns and effects such as social trends. We treat vowel pronunciation as a continuous variable in vowel space and allow for a continuous dependence of vowel pronunciation in time and age of the speaker. The theory of mixtures with continuous diversity provides a framework for the model, which extends the McKendrick-von Foerster equation to populations with age and phonetic structures. We develop the general balance equations for such populations and propose explicit expressions for the factors that impact the evolution of the vowel pronunciation distribution. For illustration, we present two examples of numerical simulations. In the first one we study a stationary solution corresponding to a state of phonetic equilibrium, in which speakers of all ages share a similar phonetic profile. We characterize the variance of the phonetic distribution in terms of a parameter measuring a ratio of phonetic attraction to dispersion. In the second example we show how vowel shift occurs upon starting with an initial condition consisting of a majority pronunciation that is affected by an immigrant minority with a different vowel pronunciation distribution. The approach developed here for vowel systems may be applied also to other learning situations and other time-dependent processes of cognition in self-interacting populations, like opinions or perceptions. PMID:23624180

  16. Success story in software engineering using NIAM (Natural language Information Analysis Methodology)

    SciTech Connect

    Eaton, S.M.; Eaton, D.S.

    1995-10-01

    To create an information system, we employ NIAM (Natural language Information Analysis Methodology). NIAM supports the goals of both the customer and the analyst completely understanding the information. We use the customer`s own unique vocabulary, collect real examples, and validate the information in natural language sentences. Examples are discussed from a successfully implemented information system.

  17. Testing of a Natural Language Retrieval System for a Full Text Knowledge Base.

    ERIC Educational Resources Information Center

    Bernstein, Lionel M.; Williamson, Robert E.

    1984-01-01

    The Hepatitis Knowledge Base (text of prototype information system) was used for modifying and testing "A Navigator of Natural Language Organized (Textual) Data" (ANNOD), a retrieval system which combines probabilistic, linguistic, and empirical means to rank individual paragraphs of full text for similarity to natural language queries proposed by…

  18. One grammar or two? Sign Languages and the Nature of Human Language

    PubMed Central

    Lillo-Martin, Diane C; Gajewski, Jon

    2014-01-01

    Linguistic research has identified abstract properties that seem to be shared by all languages—such properties may be considered defining characteristics. In recent decades, the recognition that human language is found not only in the spoken modality but also in the form of sign languages has led to a reconsideration of some of these potential linguistic universals. In large part, the linguistic analysis of sign languages has led to the conclusion that universal characteristics of language can be stated at an abstract enough level to include languages in both spoken and signed modalities. For example, languages in both modalities display hierarchical structure at sub-lexical and phrasal level, and recursive rule application. However, this does not mean that modality-based differences between signed and spoken languages are trivial. In this article, we consider several candidate domains for modality effects, in light of the overarching question: are signed and spoken languages subject to the same abstract grammatical constraints, or is a substantially different conception of grammar needed for the sign language case? We look at differences between language types based on the use of space, iconicity, and the possibility for simultaneity in linguistic expression. The inclusion of sign languages does support some broadening of the conception of human language—in ways that are applicable for spoken languages as well. Still, the overall conclusion is that one grammar applies for human language, no matter the modality of expression. PMID:25013534

  19. Natural language processing and visualization in the molecular imaging domain.

    PubMed

    Tulipano, P Karina; Tao, Ying; Millar, William S; Zanzonico, Pat; Kolbert, Katherine; Xu, Hua; Yu, Hong; Chen, Lifeng; Lussier, Yves A; Friedman, Carol

    2007-06-01

    Molecular imaging is at the crossroads of genomic sciences and medical imaging. Information within the molecular imaging literature could be used to link to genomic and imaging information resources and to organize and index images in a way that is potentially useful to researchers. A number of natural language processing (NLP) systems are available to automatically extract information from genomic literature. One existing NLP system, known as BioMedLEE, automatically extracts biological information consisting of biomolecular substances and phenotypic data. This paper focuses on the adaptation, evaluation, and application of BioMedLEE to the molecular imaging domain. In order to adapt BioMedLEE for this domain, we extend an existing molecular imaging terminology and incorporate it into BioMedLEE. BioMedLEE's performance is assessed with a formal evaluation study. The system's performance, measured as recall and precision, is 0.74 (95% CI: [.70-.76]) and 0.70 (95% CI [.63-.76]), respectively. We adapt a JAVA viewer known as PGviewer for the simultaneous visualization of images with NLP extracted information. PMID:17084109

  20. Automatic retrieval of bone fracture knowledge using natural language processing.

    PubMed

    Do, Bao H; Wu, Andrew S; Maley, Joan; Biswal, Sandip

    2013-08-01

    Natural language processing (NLP) techniques to extract data from unstructured text into formal computer representations are valuable for creating robust, scalable methods to mine data in medical documents and radiology reports. As voice recognition (VR) becomes more prevalent in radiology practice, there is opportunity for implementing NLP in real time for decision-support applications such as context-aware information retrieval. For example, as the radiologist dictates a report, an NLP algorithm can extract concepts from the text and retrieve relevant classification or diagnosis criteria or calculate disease probability. NLP can work in parallel with VR to potentially facilitate evidence-based reporting (for example, automatically retrieving the Bosniak classification when the radiologist describes a kidney cyst). For these reasons, we developed and validated an NLP system which extracts fracture and anatomy concepts from unstructured text and retrieves relevant bone fracture knowledge. We implement our NLP in an HTML5 web application to demonstrate a proof-of-concept feedback NLP system which retrieves bone fracture knowledge in real time. PMID:23053906

  1. Crowdsourcing and curation: perspectives from biology and natural language processing

    PubMed Central

    Hirschman, Lynette; Fort, Karën; Boué, Stéphanie; Kyrpides, Nikos; Islamaj Doğan, Rezarta; Cohen, Kevin Bretonnel

    2016-01-01

    Crowdsourcing is increasingly utilized for performing tasks in both natural language processing and biocuration. Although there have been many applications of crowdsourcing in these fields, there have been fewer high-level discussions of the methodology and its applicability to biocuration. This paper explores crowdsourcing for biocuration through several case studies that highlight different ways of leveraging ‘the crowd’; these raise issues about the kind(s) of expertise needed, the motivations of participants, and questions related to feasibility, cost and quality. The paper is an outgrowth of a panel session held at BioCreative V (Seville, September 9–11, 2015). The session consisted of four short talks, followed by a discussion. In their talks, the panelists explored the role of expertise and the potential to improve crowd performance by training; the challenge of decomposing tasks to make them amenable to crowdsourcing; and the capture of biological data and metadata through community editing. Database URL: http://www.mitre.org/publications/technical-papers/crowdsourcing-and-curation-perspectives PMID:27504010

  2. Crowdsourcing and curation: perspectives from biology and natural language processing.

    PubMed

    Hirschman, Lynette; Fort, Karën; Boué, Stéphanie; Kyrpides, Nikos; Islamaj Doğan, Rezarta; Cohen, Kevin Bretonnel

    2016-01-01

    Crowdsourcing is increasingly utilized for performing tasks in both natural language processing and biocuration. Although there have been many applications of crowdsourcing in these fields, there have been fewer high-level discussions of the methodology and its applicability to biocuration. This paper explores crowdsourcing for biocuration through several case studies that highlight different ways of leveraging 'the crowd'; these raise issues about the kind(s) of expertise needed, the motivations of participants, and questions related to feasibility, cost and quality. The paper is an outgrowth of a panel session held at BioCreative V (Seville, September 9-11, 2015). The session consisted of four short talks, followed by a discussion. In their talks, the panelists explored the role of expertise and the potential to improve crowd performance by training; the challenge of decomposing tasks to make them amenable to crowdsourcing; and the capture of biological data and metadata through community editing.Database URL: http://www.mitre.org/publications/technical-papers/crowdsourcing-and-curation-perspectives. PMID:27504010

  3. Three-dimensional grammar in the brain: Dissociating the neural correlates of natural sign language and manually coded spoken language.

    PubMed

    Jednoróg, Katarzyna; Bola, Łukasz; Mostowski, Piotr; Szwed, Marcin; Boguszewski, Paweł M; Marchewka, Artur; Rutkowski, Paweł

    2015-05-01

    In several countries natural sign languages were considered inadequate for education. Instead, new sign-supported systems were created, based on the belief that spoken/written language is grammatically superior. One such system called SJM (system językowo-migowy) preserves the grammatical and lexical structure of spoken Polish and since 1960s has been extensively employed in schools and on TV. Nevertheless, the Deaf community avoids using SJM for everyday communication, its preferred language being PJM (polski język migowy), a natural sign language, structurally and grammatically independent of spoken Polish and featuring classifier constructions (CCs). Here, for the first time, we compare, with fMRI method, the neural bases of natural vs. devised communication systems. Deaf signers were presented with three types of signed sentences (SJM and PJM with/without CCs). Consistent with previous findings, PJM with CCs compared to either SJM or PJM without CCs recruited the parietal lobes. The reverse comparison revealed activation in the anterior temporal lobes, suggesting increased semantic combinatory processes in lexical sign comprehension. Finally, PJM compared with SJM engaged left posterior superior temporal gyrus and anterior temporal lobe, areas crucial for sentence-level speech comprehension. We suggest that activity in these two areas reflects greater processing efficiency for naturally evolved sign language. PMID:25858311

  4. Nature and Nurture in School-Based Second Language Achievement

    ERIC Educational Resources Information Center

    Dale, Philip S.; Harlaar, Nicole; Plomin, Robert

    2012-01-01

    Variability in achievement across learners is a hallmark of second language (L2) learning, especially in academic-based learning. The Twins Early Development Study (TEDS), based on a large, population-representative sample in the United Kingdom, provides the first opportunity to examine individual differences in second language achievement in a…

  5. Notes on the Nature of Bilingual Specific Language Impairment

    ERIC Educational Resources Information Center

    de Jong, Jan

    2010-01-01

    Johanne Paradis' Keynote Article can be read as a concise critical review of the research that focuses on the sometimes strained relationship between bilingualism and specific language impairment (SLI). In my comments I will add some thoughts based on our own research on the learning of Dutch as a second language (L2) by children with SLI.

  6. Teaching Language to Handicapped Children in Natural Settings.

    ERIC Educational Resources Information Center

    Cavallaro, Claire C.; Poulson, Claire L.

    1985-01-01

    An adaptation of incidental teaching procedures was used to teach individually defined language responses to four language-delayed children (4-8 years old). Data showed that frequency of each targeted response increased only during incidental teaching, with most targeted responses produced spontaneously; and the teachers correctly implemented the…

  7. Automation of a problem list using natural language processing

    PubMed Central

    Meystre, Stephane; Haug, Peter J

    2005-01-01

    Background The medical problem list is an important part of the electronic medical record in development in our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To alleviate this issue, we are building an environment where the problem list can be easily and effectively maintained. Methods For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system composed of two main components: a background and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries from the list of 80 targeted problems detected in the multiple free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list. Results The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms to detect first document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences. Conclusion The global aim of our project is to automate the process of creating and maintaining a problem list for hospitalized

  8. Natural language processing in an intelligent writing strategy tutoring system.

    PubMed

    McNamara, Danielle S; Crossley, Scott A; Roscoe, Rod

    2013-06-01

    The Writing Pal is an intelligent tutoring system that provides writing strategy training. A large part of its artificial intelligence resides in the natural language processing algorithms to assess essay quality and guide feedback to students. Because writing is often highly nuanced and subjective, the development of these algorithms must consider a broad array of linguistic, rhetorical, and contextual features. This study assesses the potential for computational indices to predict human ratings of essay quality. Past studies have demonstrated that linguistic indices related to lexical diversity, word frequency, and syntactic complexity are significant predictors of human judgments of essay quality but that indices of cohesion are not. The present study extends prior work by including a larger data sample and an expanded set of indices to assess new lexical, syntactic, cohesion, rhetorical, and reading ease indices. Three models were assessed. The model reported by McNamara, Crossley, and McCarthy (Written Communication 27:57-86, 2010) including three indices of lexical diversity, word frequency, and syntactic complexity accounted for only 6% of the variance in the larger data set. A regression model including the full set of indices examined in prior studies of writing predicted 38% of the variance in human scores of essay quality with 91% adjacent accuracy (i.e., within 1 point). A regression model that also included new indices related to rhetoric and cohesion predicted 44% of the variance with 94% adjacent accuracy. The new indices increased accuracy but, more importantly, afford the means to provide more meaningful feedback in the context of a writing tutoring system. PMID:23055164

  9. Of substance: the nature of language effects on entity construal.

    PubMed

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-06-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind [whisk; Imai, M., & Gentner, D. (1997). A cross-linguistic study of early word meaning: Universal ontology and linguistic influence. Cognition, 62, 169-200]. Five experiments replicated this language type effect on entity construal, extended it to quite different stimuli from those studied before, and extended it to a comparison between Mandarin speakers and English speakers. A sixth experiment, which did not involve interpreting the meaning of a noun or a pronoun that stands for a noun, failed to find any effect of language type on entity construal. Thus, the overall pattern of findings supports a non-Whorfian, language on language account, according to which sensitivity to lexical statistics in a count/mass language leads adults to assign a novel noun in neutral syntax the status of a count noun, influencing construal of ambiguous entities. The experiments also document and explore cross-linguistically universal factors that influence entity construal, and favor Prasada's [Prasada, S. (1999). Names for things and stuff: An Aristotelian perspective. In R. Jackendoff, P. Bloom, & K. Wynn (Eds.), Language, logic, and concepts (pp. 119-146). Cambridge, MA: MIT Press] hypothesis that features indicating non-accidentalness of an entity's form lead participants to a construal of object kind rather than substance kind. Finally, the experiments document the age at which the language type effect emerges in lexical projection. The details of the developmental pattern are consistent with the lexical statistics hypothesis, along with a universal increase in sensitivity to material kind. PMID:19230873

  10. Statistical Learning in a Natural Language by 8-Month-Old Infants

    PubMed Central

    Pelucchi, Bruna; Hay, Jessica F.; Saffran, Jenny R.

    2013-01-01

    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants’ ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition. PMID:19489896

  11. Statistical learning in a natural language by 8-month-old infants.

    PubMed

    Pelucchi, Bruna; Hay, Jessica F; Saffran, Jenny R

    2009-01-01

    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants' ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition. PMID:19489896

  12. On the neurolinguistic nature of language abnormalities in Huntington's disease.

    PubMed Central

    Wallesch, C W; Fehrenbach, R A

    1988-01-01

    Spontaneous language of 18 patients suffering from Huntington's disease and 15 dysarthric controls suffering from Friedreich's ataxia were investigated. In addition, language functions in various modalities were assessed with the Aachen Aphasia Test (AAT). The Huntington patients exhibited deficits in the syntactical complexity of spontaneous speech and in the Token Test, confrontation naming, and language comprehension subtests of the AAT, which are interpreted as resulting from their dementia. Errors affecting word access mechanisms and production of syntactical structures as such were not encountered. PMID:2452241

  13. Natural Language Query System Design for Interactive Information Storage and Retrieval Systems. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung

    1985-01-01

    The currently developed multi-level language interfaces of information systems are generally designed for experienced users. These interfaces commonly ignore the nature and needs of the largest user group, i.e., casual users. This research identifies the importance of natural language query system research within information storage and retrieval system development; addresses the topics of developing such a query system; and finally, proposes a framework for the development of natural language query systems in order to facilitate the communication between casual users and information storage and retrieval systems.

  14. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

    PubMed Central

    Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure. PMID:24982952

  15. Of Substance: The Nature of Language Effects on Entity Construal

    PubMed Central

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-01-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind (whisk; Imai and Gentner, 1997). Five experiments replicated this language type effect on entity construal, extended it to quite different stimuli from those studied before, and extended it to a comparison between Mandarin-speakers and English-speakers. A sixth experiment, which did not involve interpreting the meaning of a noun or a pronoun that stands for a noun, failed to find any effect of language type on entity construal. Thus, the overall pattern of findings supports a non-Whorfian, language on language account, according to which sensitivity to lexical statistics in a count/mass language leads adults to assign a novel noun in neutral syntax the status of a count noun, influencing construal of ambiguous entities. The experiments also document and explore cross-linguistically universal factors that influence entity construal, and favor Prasada's (1999) hypothesis that features indicating non-accidentalness of an entity's form lead participants to a construal of object-kind rather than substance-kind. Finally, the experiments document the age at which the language type effect emerges in lexical projection. The details of the developmental pattern are consistent with the lexical statistics hypothesis, along with a universal increase in sensitivity to material kind. PMID:19230873

  16. Of Substance: The Nature of Language Effects on Entity Construal

    ERIC Educational Resources Information Center

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-01-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind [whisk; Imai, M., & Gentner, D.…

  17. Dynamic changes in network activations characterize early learning of a natural language.

    PubMed

    Plante, Elena; Patterson, Dianne; Dailey, Natalie S; Kyle, R Almyrde; Fridriksson, Julius

    2014-09-01

    Those who are initially exposed to an unfamiliar language have difficulty separating running speech into individual words, but over time will recognize both words and the grammatical structure of the language. Behavioral studies have used artificial languages to demonstrate that humans are sensitive to distributional information in language input, and can use this information to discover the structure of that language. This is done without direct instruction and learning occurs over the course of minutes rather than days or months. Moreover, learners may attend to different aspects of the language input as their own learning progresses. Here, we examine processing associated with the early stages of exposure to a natural language, using fMRI. Listeners were exposed to an unfamiliar language (Icelandic) while undergoing four consecutive fMRI scans. The Icelandic stimuli were constrained in ways known to produce rapid learning of aspects of language structure. After approximately 4 min of exposure to the Icelandic stimuli, participants began to differentiate between correct and incorrect sentences at above chance levels, with significant improvement between the first and last scan. An independent component analysis of the imaging data revealed four task-related components, two of which were associated with behavioral performance early in the experiment, and two with performance later in the experiment. This outcome suggests dynamic changes occur in the recruitment of neural resources even within the initial period of exposure to an unfamiliar natural language. PMID:25058056

  18. Dynamic Changes in Network Activations Characterize Early Learning of a Natural Language

    PubMed Central

    Plante, Elena; Patterson, Dianne; Dailey, Natalie S.; Almyrde, Kyle, R.; Fridriksson, Julius

    2014-01-01

    Those who are initially exposed to an unfamiliar language have difficulty separating running speech into individual words, but over time will recognize both words and the grammatical structure of the language. Behavioral studies have used artificial languages to demonstrate that humans are sensitive to distributional information in language input, and can use this information to discover the structure of that language. This is done without direct instruction and learning occurs over the course of minutes rather than days or months. Moreover, learners may attend to different aspects of the language input as their own learning progresses. Here, we examine processing associated with the early stages of exposure to a natural language, using fMRI. Listeners were exposed to an unfamiliar language (Icelandic) while undergoing four consecutive fMRI scans. The Icelandic stimuli were constrained in ways known to produce rapid learning of aspects of language structure. After approximately 4 minutes of exposure to the Icelandic stimuli, participants began to differentiate between correct and incorrect sentences at above chance levels, with significant improvement between the first and last scan. An independent component analysis of the imaging data revealed four task-related components, two of which were associated with behavioral performance early in the experiment, and two with performance later in the experiment. This outcome suggests dynamic changes occur in the recruitment of neural resources even within the initial period of exposure to an unfamiliar natural language. PMID:25058056

  19. Computational Nonlinear Morphology with Emphasis on Semitic Languages. Studies in Natural Language Processing.

    ERIC Educational Resources Information Center

    Kiraz, George Anton

    This book presents a tractable computational model that can cope with complex morphological operations, especially in Semitic languages, and less complex morphological systems present in Western languages. It outlines a new generalized regular rewrite rule system that uses multiple finite-state automata to cater to root-and-pattern morphology,…

  20. The Nature of Chinese Language Classroom Learning Environments in Singapore Secondary Schools

    ERIC Educational Resources Information Center

    Chua, Siew Lian; Wong, Angela F. L.; Chen, Der-Thanq V.

    2011-01-01

    This article reports findings from a classroom environment study which was designed to investigate the nature of Chinese Language classroom environments in Singapore secondary schools. We used a perceptual instrument, the Chinese Language Classroom Environment Inventory, to investigate teachers' and students' perceptions towards their Chinese…

  1. Development and Evaluation of a Thai Learning System on the Web Using Natural Language Processing.

    ERIC Educational Resources Information Center

    Dansuwan, Suyada; Nishina, Kikuko; Akahori, Kanji; Shimizu, Yasutaka

    2001-01-01

    Describes the Thai Learning System, which is designed to help learners acquire the Thai word order system. The system facilitates the lessons on the Web using HyperText Markup Language and Perl programming, which interfaces with natural language processing by means of Prolog. (Author/VWL)

  2. Transfer of a Natural Language System for Problem-Solving in Physics to Other Domains.

    ERIC Educational Resources Information Center

    Oberem, Graham E.

    The limited language capability of CAI systems has made it difficult to personalize problem-solving instruction. The intelligent tutoring system, ALBERT, is a problem-solving monitor and coach that has been used with high school and college level physics students for several years; it uses a natural language system to understand kinematics…

  3. A Fuzzy Set Approach to Modifiers and Vagueness in Natural Language

    ERIC Educational Resources Information Center

    Hersh, Harry M.; Caramazza, Alfonso

    1976-01-01

    The proposition that natural language concepts are represented as fuzzy sets, a generalization of the traditional theory of sets, of meaning components and that language operators--adverbs, negative markers, and adjectives--can be considered as operators on fuzzy sets was assessed empirically. (Editor/RK)

  4. Using the Natural Language Paradigm (NLP) to Increase Vocalizations of Older Adults with Cognitive Impairments

    ERIC Educational Resources Information Center

    LeBlanc, Linda A.; Geiger, Kaneen B.; Sautter, Rachael A.; Sidener, Tina M.

    2007-01-01

    The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated…

  5. Paradigms of Evaluation in Natural Language Processing: Field Linguistics for Glass Box Testing

    ERIC Educational Resources Information Center

    Cohen, Kevin Bretonnel

    2010-01-01

    Although software testing has been well-studied in computer science, it has received little attention in natural language processing. Nonetheless, a fully developed methodology for glass box evaluation and testing of language processing applications already exists in the field methods of descriptive linguistics. This work lays out a number of…

  6. Semantic Grammar: An Engineering Technique for Constructing Natural Language Understanding Systems.

    ERIC Educational Resources Information Center

    Burton, Richard R.

    In an attempt to overcome the lack of natural means of communication between student and computer, this thesis addresses the problem of developing a system which can understand natural language within an educational problem-solving environment. The nature of the environment imposes efficiency, habitability, self-teachability, and awareness of…

  7. Prospects for Knowledge-Based Customization of Natural Language Query Systems.

    ERIC Educational Resources Information Center

    Damerau, Fred J.

    1988-01-01

    Discusses the potential sources of knowledge for customizing transportable natural language query systems, including sophisticated dictionaries, database content, and human database experts. A rough quantification of the importance of each source is provided. (17 references) (Author/CLB)

  8. GE FRST Evaluation Report: How Well Does a Statistically-Based Natural Language Processing System Score Natural Language Constructed-Responses?

    ERIC Educational Resources Information Center

    Burstein, Jill C.; Kaplan, Randy M.

    There is a considerable interest at Educational Testing Service (ETS) to include performance-based, natural language constructed-response items on standardized tests. Such items can be developed, but the projected time and costs required to have these items scored by human graders would be prohibitive. In order for ETS to include these types of…

  9. Evolutionary Developmental Linguistics: Naturalization of the Faculty of Language

    ERIC Educational Resources Information Center

    Locke, John L.

    2009-01-01

    Since language is a biological trait, it is necessary to investigate its evolution, development, and functions, along with the mechanisms that have been set aside, and are now recruited, for its acquisition and use. It is argued here that progress toward each of these goals can be facilitated by new programs of research, carried out within a new…

  10. Language Teaching in Literature Departments: Natural Partnership or Shotgun Marriage?

    ERIC Educational Resources Information Center

    Shumway, Nicolas

    1990-01-01

    It is noted that language teaching in literature departments in many colleges and universities does not confer the same benefits or offer the same rewards as teaching literature. The problems this imbalance creates in academic rigor and continuity and the questions addressing why this is imbalance exists are discussed. (GLR)

  11. Natural language modeling for phoneme-to-text transcription

    SciTech Connect

    Derouault, A.M.; Merialdo, B.

    1986-11-01

    This paper relates different kinds of language modeling methods that can be applied to the linguistic decoding part of a speech recognition system with a very large vocabulary. These models are studied experimentally on a pseudophonetic input arising from French stenotypy. The authors propose a model which combines the advantages of a statistical modeling with information theoretic tools, and those of a grammatical approach.

  12. Unit 1001: The Nature of Meaning in Language.

    ERIC Educational Resources Information Center

    Minnesota Univ., Minneapolis. Center for Curriculum Development in English.

    This 10th-grade unit in Minnesota's "language-centered" curriculum introduces the complexity of linguistic meaning by demonstrating the relationships among linguistic symbols, their referents, their interpreters, and the social milieu. The unit begins with a discussion of Ray Bradbury's "The Kilimanjaro Machine," which illustrates how an otherwise…

  13. Inferring Speaker Affect in Spoken Natural Language Communication

    ERIC Educational Resources Information Center

    Pon-Barry, Heather Roberta

    2013-01-01

    The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…

  14. Visual language recognition with a feed-forward network of spiking neurons

    SciTech Connect

    Rasmussen, Craig E; Garrett, Kenyan; Sottile, Matthew; Shreyas, Ns

    2010-01-01

    An analogy is made and exploited between the recognition of visual objects and language parsing. A subset of regular languages is used to define a one-dimensional 'visual' language, in which the words are translational and scale invariant. This allows an exploration of the viewpoint invariant languages that can be solved by a network of concurrent, hierarchically connected processors. A language family is defined that is hierarchically tiling system recognizable (HREC). As inspired by nature, an algorithm is presented that constructs a cellular automaton that recognizes strings from a language in the HREC family. It is demonstrated how a language recognizer can be implemented from the cellular automaton using a feed-forward network of spiking neurons. This parser recognizes fixed-length strings from the language in parallel and as the computation is pipelined, a new string can be parsed in each new interval of time. The analogy with formal language theory allows inferences to be drawn regarding what class of objects can be recognized by visual cortex operating in purely feed-forward fashion and what class of objects requires a more complicated network architecture.

  15. The Oscillopathic Nature of Language Deficits in Autism: From Genes to Language Evolution.

    PubMed

    Benítez-Burraco, Antonio; Murphy, Elliot

    2016-01-01

    Autism spectrum disorders (ASD) are pervasive neurodevelopmental disorders involving a number of deficits to linguistic cognition. The gap between genetics and the pathophysiology of ASD remains open, in particular regarding its distinctive linguistic profile. The goal of this article is to attempt to bridge this gap, focusing on how the autistic brain processes language, particularly through the perspective of brain rhythms. Due to the phenomenon of pleiotropy, which may take some decades to overcome, we believe that studies of brain rhythms, which are not faced with problems of this scale, may constitute a more tractable route to interpreting language deficits in ASD and eventually other neurocognitive disorders. Building on recent attempts to link neural oscillations to certain computational primitives of language, we show that interpreting language deficits in ASD as oscillopathic traits is a potentially fruitful way to construct successful endophenotypes of this condition. Additionally, we will show that candidate genes for ASD are overrepresented among the genes that played a role in the evolution of language. These genes include (and are related to) genes involved in brain rhythmicity. We hope that the type of steps taken here will additionally lead to a better understanding of the comorbidity, heterogeneity, and variability of ASD, and may help achieve a better treatment of the affected populations. PMID:27047363

  16. The Oscillopathic Nature of Language Deficits in Autism: From Genes to Language Evolution

    PubMed Central

    Benítez-Burraco, Antonio; Murphy, Elliot

    2016-01-01

    Autism spectrum disorders (ASD) are pervasive neurodevelopmental disorders involving a number of deficits to linguistic cognition. The gap between genetics and the pathophysiology of ASD remains open, in particular regarding its distinctive linguistic profile. The goal of this article is to attempt to bridge this gap, focusing on how the autistic brain processes language, particularly through the perspective of brain rhythms. Due to the phenomenon of pleiotropy, which may take some decades to overcome, we believe that studies of brain rhythms, which are not faced with problems of this scale, may constitute a more tractable route to interpreting language deficits in ASD and eventually other neurocognitive disorders. Building on recent attempts to link neural oscillations to certain computational primitives of language, we show that interpreting language deficits in ASD as oscillopathic traits is a potentially fruitful way to construct successful endophenotypes of this condition. Additionally, we will show that candidate genes for ASD are overrepresented among the genes that played a role in the evolution of language. These genes include (and are related to) genes involved in brain rhythmicity. We hope that the type of steps taken here will additionally lead to a better understanding of the comorbidity, heterogeneity, and variability of ASD, and may help achieve a better treatment of the affected populations. PMID:27047363

  17. The language faculty that wasn't: a usage-based account of natural language recursion

    PubMed Central

    Christiansen, Morten H.; Chater, Nick

    2015-01-01

    In the generative tradition, the language faculty has been shrinking—perhaps to include only the mechanism of recursion. This paper argues that even this view of the language faculty is too expansive. We first argue that a language faculty is difficult to reconcile with evolutionary considerations. We then focus on recursion as a detailed case study, arguing that our ability to process recursive structure does not rely on recursion as a property of the grammar, but instead emerges gradually by piggybacking on domain-general sequence learning abilities. Evidence from genetics, comparative work on non-human primates, and cognitive neuroscience suggests that humans have evolved complex sequence learning skills, which were subsequently pressed into service to accommodate language. Constraints on sequence learning therefore have played an important role in shaping the cultural evolution of linguistic structure, including our limited abilities for processing recursive structure. Finally, we re-evaluate some of the key considerations that have often been taken to require the postulation of a language faculty. PMID:26379567

  18. Concreteness and Psychological Distance in Natural Language Use

    PubMed Central

    Snefjella, Bryor; Kuperman, Victor

    2015-01-01

    Existing evidence shows that more abstract mental representations are formed, and more abstract language is used, to characterize phenomena which are more distant from self. Yet the precise form of the functional relationship between distance and linguistic abstractness has been unknown. In four studies, we test whether more abstract language is used in textual references to more geographically distant cities (Study 1), times further into the past or future (Study 2), references to more socially distant people (Study 3), and references to a specific topic (Study 4). Using millions of linguistic productions from thousands of social media users, we determine that linguistic concreteness is a curvilinear function of the logarithm of distance and discuss psychological underpinnings of the mathematical properties of the relationship. We also demonstrate that gradient curvilinear effects of geographic and temporal distance on concreteness are near-identical, suggesting uniformity in representation of abstractness along multiple dimensions. PMID:26239108

  19. Concreteness and Psychological Distance in Natural Language Use.

    PubMed

    Snefjella, Bryor; Kuperman, Victor

    2015-09-01

    Existing evidence shows that more abstract mental representations are formed and more abstract language is used to characterize phenomena that are more distant from the self. Yet the precise form of the functional relationship between distance and linguistic abstractness is unknown. In four studies, we tested whether more abstract language is used in textual references to more geographically distant cities (Study 1), time points further into the past or future (Study 2), references to more socially distant people (Study 3), and references to a specific topic (Study 4). Using millions of linguistic productions from thousands of social-media users, we determined that linguistic concreteness is a curvilinear function of the logarithm of distance, and we discuss psychological underpinnings of the mathematical properties of this relationship. We also demonstrated that gradient curvilinear effects of geographic and temporal distance on concreteness are nearly identical, which suggests uniformity in representation of abstractness along multiple dimensions. PMID:26239108

  20. Novels, Scripts, and Natural Discourse: Mediated Models of Language Use and Understanding.

    ERIC Educational Resources Information Center

    Soukup, Paul A.

    This paper explores the question of interpreting complex texts, particularly scripts and multimedia presentations. The paper first reviews the literature on natural discourse, noting that although interpreting spoken natural language seems rather straightforward, many scholars have discussed what is required to make sense of discourse. The paper…

  1. Rimac: A Natural-Language Dialogue System that Engages Students in Deep Reasoning Dialogues about Physics

    ERIC Educational Resources Information Center

    Katz, Sandra; Jordan, Pamela; Litman, Diane

    2011-01-01

    The natural-language tutorial dialogue system that the authors are developing will allow them to focus on the nature of interactivity during tutoring as a malleable factor. Specifically, it will serve as a research platform for studies that manipulate the frequency and types of verbal alignment processes that take place during tutoring, such as…

  2. For the People...Citizenship Education and Naturalization Information. An English as a Second Language Text.

    ERIC Educational Resources Information Center

    Short, Deborah J.; And Others

    A textbook for English-as-a-Second-Language (ESL) students presents lessons on U.S. citizenship education and naturalization information. The nine lessons cover the following topics: the U.S. system of government; the Bill of Rights; responsibilities and rights of citizens; voting; requirements for naturalization; the application process; the…

  3. Using Edit Distance to Analyse Errors in a Natural Language to Logic Translation Corpus

    ERIC Educational Resources Information Center

    Barker-Plummer, Dave; Dale, Robert; Cox, Richard; Romanczuk, Alex

    2012-01-01

    We have assembled a large corpus of student submissions to an automatic grading system, where the subject matter involves the translation of natural language sentences into propositional logic. Of the 2.3 million translation instances in the corpus, 286,000 (approximately 12%) are categorized as being in error. We want to understand the nature of…

  4. The Right Tool for the Job: Techniques for Analysis of Natural Language Use.

    ERIC Educational Resources Information Center

    Green, Georgia M.

    A variety of techniques for collecting and analyzing information about the natural use of natural languages is surveyed, emphasizing the importance of recognizing the properties of a research task that make a given technique more or less suitable to it rather than comparing techniques globally and ranking them absolutely. An initial goal is to…

  5. The feasibility of using natural language processing to extract clinical information from breast pathology reports

    PubMed Central

    Buckley, Julliette M.; Coopey, Suzanne B.; Sharko, John; Polubriaginof, Fernanda; Drohan, Brian; Belli, Ahmet K.; Kim, Elizabeth M. H.; Garber, Judy E.; Smith, Barbara L.; Gadd, Michele A.; Specht, Michelle C.; Roche, Constance A.; Gudewicz, Thomas M.; Hughes, Kevin S.

    2012-01-01

    Objective: The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. Approach and Procedure: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. Results: There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. Conclusion: We have demonstrated how a large body of free text medical information such as seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task. PMID:22934236

  6. Deciphering the language of nature: cryptography, secrecy, and alterity in Francis Bacon.

    PubMed

    Clody, Michael C

    2011-01-01

    The essay argues that Francis Bacon's considerations of parables and cryptography reflect larger interpretative concerns of his natural philosophic project. Bacon describes nature as having a language distinct from those of God and man, and, in so doing, establishes a central problem of his natural philosophy—namely, how can the language of nature be accessed through scientific representation? Ultimately, Bacon's solution relies on a theory of differential and duplicitous signs that conceal within them the hidden voice of nature, which is best recognized in the natural forms of efficient causality. The "alphabet of nature"—those tables of natural occurrences—consequently plays a central role in his program, as it renders nature's language susceptible to a process and decryption that mirrors the model of the bilateral cipher. It is argued that while the writing of Bacon's natural philosophy strives for literality, its investigative process preserves a space for alterity within scientific representation, that is made accessible to those with the interpretative key. PMID:22371983

  7. Integrated pathology reporting, indexing, and retrieval system using natural language diagnoses.

    PubMed

    Moore, G W; Boitnott, J K; Miller, R E; Eggleston, J C; Hutchins, G M

    1988-01-01

    Pathology computer systems are making increasing use of natural language diagnoses. The Johns Hopkins Medical Institutions integrated pathology reporting system, a commercial product with extensive, locally added enhancements, covers all information management functions within autopsy and surgical pathology divisions and has on-line linkages to clinical laboratory reports and the medical library's Mini-MEDLINE system. All diagnoses are written in natural language, using a word processor and spelling checker. A security system with personal passwords and different levels of access for different staff members allows reports to be signed out with an electronic signature. The system produces financial reports, overdue case reports, and Boolean searches of the database. Our experience with 128,790 consecutively entered pathology reports suggests that the greater precision of natural language diagnoses makes them the most suitable vehicle for follow-up, retrieval, and systems development functions in pathology. PMID:3070549

  8. QATT: a Natural Language Interface for QPE. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    White, Douglas Robert-Graham

    1989-01-01

    QATT, a natural language interface developed for the Qualitative Process Engine (QPE) system is presented. The major goal was to evaluate the use of a preexisting natural language understanding system designed to be tailored for query processing in multiple domains of application. The other goal of QATT is to provide a comfortable environment in which to query envisionments in order to gain insight into the qualitative behavior of physical systems. It is shown that the use of the preexisting system made possible the development of a reasonably useful interface in a few months.

  9. SWAN: An expert system with natural language interface for tactical air capability assessment

    NASA Technical Reports Server (NTRS)

    Simmons, Robert M.

    1987-01-01

    SWAN is an expert system and natural language interface for assessing the war fighting capability of Air Force units in Europe. The expert system is an object oriented knowledge based simulation with an alternate worlds facility for performing what-if excursions. Responses from the system take the form of generated text, tables, or graphs. The natural language interface is an expert system in its own right, with a knowledge base and rules which understand how to access external databases, models, or expert systems. The distinguishing feature of the Air Force expert system is its use of meta-knowledge to generate explanations in the frame and procedure based environment.

  10. Unifying syntactic theory and sentence processing difficulty through a connectionist minimalist parser.

    PubMed

    Gerth, Sabrina; Beim Graben, Peter

    2009-12-01

    Syntactic theory provides a rich array of representational assumptions about linguistic knowledge and processes. Such detailed and independently motivated constraints on grammatical knowledge ought to play a role in sentence comprehension. However most grammar-based explanations of processing difficulty in the literature have attempted to use grammatical representations and processes per se to explain processing difficulty. They did not take into account that the description of higher cognition in mind and brain encompasses two levels: on the one hand, at the macrolevel, symbolic computation is performed, and on the other hand, at the microlevel, computation is achieved through processes within a dynamical system. One critical question is therefore how linguistic theory and dynamical systems can be unified to provide an explanation for processing effects. Here, we present such a unification for a particular account to syntactic theory: namely a parser for Stabler's Minimalist Grammars, in the framework of Smolensky's Integrated Connectionist/Symbolic architectures. In simulations we demonstrate that the connectionist minimalist parser produces predictions which mirror global empirical findings from psycholinguistic research. PMID:19795221

  11. Spatial and Numerical Abilities without a Complete Natural Language

    ERIC Educational Resources Information Center

    Hyde, Daniel C.; Winkler-Rhoades, Nathan; Lee, Sang-Ah; Izard, Veronique; Shapiro, Kevin A.; Spelke, Elizabeth S.

    2011-01-01

    We studied the cognitive abilities of a 13-year-old deaf child, deprived of most linguistic input from late infancy, in a battery of tests designed to reveal the nature of numerical and geometrical abilities in the absence of a full linguistic system. Tests revealed widespread proficiency in basic symbolic and non-symbolic numerical computations…

  12. Stochastic Model for the Vocabulary Growth in Natural Languages

    NASA Astrophysics Data System (ADS)

    Gerlach, Martin; Altmann, Eduardo G.

    2013-04-01

    We propose a stochastic model for the number of different words in a given database which incorporates the dependence on the database size and historical changes. The main feature of our model is the existence of two different classes of words: (i) a finite number of core words, which have higher frequency and do not affect the probability of a new word to be used, and (ii) the remaining virtually infinite number of noncore words, which have lower frequency and, once used, reduce the probability of a new word to be used in the future. Our model relies on a careful analysis of the Google Ngram database of books published in the last centuries, and its main consequence is the generalization of Zipf’s and Heaps’ law to two-scaling regimes. We confirm that these generalizations yield the best simple description of the data among generic descriptive models and that the two free parameters depend only on the language but not on the database. From the point of view of our model, the main change on historical time scales is the composition of the specific words included in the finite list of core words, which we observe to decay exponentially in time with a rate of approximately 30 words per year for English.

  13. Grammar as a Programming Language. Artificial Intelligence Memo 391.

    ERIC Educational Resources Information Center

    Rowe, Neil

    Student projects that involve writing generative grammars in the computer language, "LOGO," are described in this paper, which presents a grammar-running control structure that allows students to modify and improve the grammar interpreter itself while learning how a simple kind of computer parser works. Included are procedures for programing a…

  14. Natural language processing with dynamic classification improves P300 speller accuracy and bit rate

    NASA Astrophysics Data System (ADS)

    Speier, William; Arnold, Corey; Lu, Jessica; Taira, Ricky K.; Pouratian, Nader

    2012-02-01

    The P300 speller is an example of a brain-computer interface that can restore functionality to victims of neuromuscular disorders. Although the most common application of this system has been communicating language, the properties and constraints of the linguistic domain have not to date been exploited when decoding brain signals that pertain to language. We hypothesized that combining the standard stepwise linear discriminant analysis with a Naive Bayes classifier and a trigram language model would increase the speed and accuracy of typing with the P300 speller. With integration of natural language processing, we observed significant improvements in accuracy and 40-60% increases in bit rate for all six subjects in a pilot study. This study suggests that integrating information about the linguistic domain can significantly improve signal classification.

  15. Language Revitalization.

    ERIC Educational Resources Information Center

    Hinton, Leanne

    2003-01-01

    Surveys developments in language revitalization and language death. Focusing on indigenous languages, discusses the role and nature of appropriate linguistic documentation, possibilities for bilingual education, and methods of promoting oral fluency and intergenerational transmission in affected languages. (Author/VWL)

  16. The Exploring Nature of Definitions and Classifications of Language Learning Strategies (LLSs) in the Current Studies of Second/Foreign Language Learning

    ERIC Educational Resources Information Center

    Fazeli, Seyed Hossein

    2011-01-01

    This study aims to explore the nature of definitions and classifications of Language Learning Strategies (LLSs) in the current studies of second/foreign language learning in order to show the current problems regarding such definitions and classifications. The present study shows that there is not a universal agreeable definition and…

  17. Drawing Dynamic Geometry Figures Online with Natural Language for Junior High School Geometry

    ERIC Educational Resources Information Center

    Wong, Wing-Kwong; Yin, Sheng-Kai; Yang, Chang-Zhe

    2012-01-01

    This paper presents a tool for drawing dynamic geometric figures by understanding the texts of geometry problems. With the tool, teachers and students can construct dynamic geometric figures on a web page by inputting a geometry problem in natural language. First we need to build the knowledge base for understanding geometry problems. With the…

  18. Verification Processes in Recognition Memory: The Role of Natural Language Mediators

    ERIC Educational Resources Information Center

    Marshall, Philip H.; Smith, Randolph A. S.

    1977-01-01

    The existence of verification processes in recognition memory was confirmed in the context of Adams' (Adams & Bray, 1970) closed-loop theory. Subjects' recognition was tested following a learning session. The expectation was that data would reveal consistent internal relationships supporting the position that natural language mediation plays an…

  19. Discrimination of Coronal Stops by Bilingual Adults: The Timing and Nature of Language Interaction

    ERIC Educational Resources Information Center

    Sundara, Megha; Polka, Linda

    2008-01-01

    The current study was designed to investigate the timing and nature of interaction between the two languages of bilinguals. For this purpose, we compared discrimination of Canadian French and Canadian English coronal stops by simultaneous bilingual, monolingual and advanced early L2 learners of French and English. French /d/ is phonetically…

  20. BIT BY BIT: A Game Simulating Natural Language Processing in Computers

    ERIC Educational Resources Information Center

    Kato, Taichi; Arakawa, Chuichi

    2008-01-01

    BIT BY BIT is an encryption game that is designed to improve students' understanding of natural language processing in computers. Participants encode clear words into binary code using an encryption key and exchange them in the game. BIT BY BIT enables participants who do not understand the concept of binary numbers to perform the process of…

  1. Art Related Experiences for Social Science, Natural Science, and Language Arts.

    ERIC Educational Resources Information Center

    Mack, Edward B.

    This booklet is intended to serve as an introduction to art experiences that relate to studies in social science, natural science, and language arts. It is designed to develop a better understanding of the dynamics of interaction of the abiotic, biotic, and cultural factors of the total environment as manifest in art forms. Each section, presented…

  2. Speech perception and reading: two parallel modes of understanding language and implications for acquiring literacy naturally.

    PubMed

    Massaro, Dominic W

    2012-01-01

    I review 2 seminal research reports published in this journal during its second decade more than a century ago. Given psychology's subdisciplines, they would not normally be reviewed together because one involves reading and the other speech perception. The small amount of interaction between these domains might have limited research and theoretical progress. In fact, the 2 early research reports revealed common processes involved in these 2 forms of language processing. Their illustration of the role of Wundt's apperceptive process in reading and speech perception anticipated descriptions of contemporary theories of pattern recognition, such as the fuzzy logical model of perception. Based on the commonalities between reading and listening, one can question why they have been viewed so differently. It is commonly believed that learning to read requires formal instruction and schooling, whereas spoken language is acquired from birth onward through natural interactions with people who talk. Most researchers and educators believe that spoken language is acquired naturally from birth onward and even prenatally. Learning to read, on the other hand, is not possible until the child has acquired spoken language, reaches school age, and receives formal instruction. If an appropriate form of written text is made available early in a child's life, however, the current hypothesis is that reading will also be learned inductively and emerge naturally, with no significant negative consequences. If this proposal is true, it should soon be possible to create an interactive system, Technology Assisted Reading Acquisition, to allow children to acquire literacy naturally. PMID:22953690

  3. Cross-Linguistic Evidence for the Nature of Age Effects in Second Language Acquisition

    ERIC Educational Resources Information Center

    Dekeyser, Robert; Alfi-Shabtay, Iris; Ravid, Dorit

    2010-01-01

    Few researchers would doubt that ultimate attainment in second language grammar is negatively correlated with age of acquisition, but considerable controversy remains about the nature of this relationship: the exact shape of the age-attainment function and its interpretation. This article presents two parallel studies with native speakers of…

  4. Dimensions of Difficulty in Translating Natural Language into First Order Logic

    ERIC Educational Resources Information Center

    Barker-Plummer, Dave; Cox, Richard; Dale, Robert

    2009-01-01

    In this paper, we present a study of a large corpus of student logic exercises in which we explore the relationship between two distinct measures of difficulty: the proportion of students whose initial attempt at a given natural language to first-order logic translation is incorrect, and the average number of attempts that are required in order to…

  5. The Rape of Mother Nature? Women in the Language of Environmental Discourse.

    ERIC Educational Resources Information Center

    Berman, Tzeporah

    1994-01-01

    Argues that the structure of language reflects and reproduces the dominant model, and reinforces many of the dualistic assumptions which underlie the separation of male and female, nature and culture, mind from body, emotion from reason, and intuition from fact. (LZ)

  6. The Contemporary Thesaurus of Social Science Terms and Synonyms: A Guide for Natural Language Computer Searching.

    ERIC Educational Resources Information Center

    Knapp, Sara D., Comp.

    This book is designed primarily to help users find meaningful words for natural language, or free-text, computer searching of bibliographic and textual databases in the social and behavioral sciences. Additionally, it covers many socially relevant and technical topics not covered by the usual literary thesaurus, therefore it may also be useful for…

  7. Effectiveness and Efficiency in Natural Language Processing for Large Amounts of Text.

    ERIC Educational Resources Information Center

    Ruge, Gerda; And Others

    1991-01-01

    Describes a system that was developed in Germany for natural language processing (NLP) to improve free text analysis for information retrieval. Techniques from empirical linguistics are discussed, system architecture is explained, and rules for dealing with conjunctions in dependency analysis for free text processing are proposed. (13 references)…

  8. You Are Your Words: Modeling Students' Vocabulary Knowledge with Natural Language Processing Tools

    ERIC Educational Resources Information Center

    Allen, Laura K.; McNamara, Danielle S.

    2015-01-01

    The current study investigates the degree to which the lexical properties of students' essays can inform stealth assessments of their vocabulary knowledge. In particular, we used indices calculated with the natural language processing tool, TAALES, to predict students' performance on a measure of vocabulary knowledge. To this end, two corpora were…

  9. Introduction to Special Issue: Understanding the Nature-Nurture Interactions in Language and Learning Differences.

    ERIC Educational Resources Information Center

    Berninger, Virginia Wise

    2001-01-01

    The introduction to this special issue on nature-nurture interactions notes that the following articles represent five biologically oriented research approaches which each provide a tutorial on the investigator's major research tool, a summary of current research understandings regarding language and learning differences, and a discussion of…

  10. AutoTutor and Family: A Review of 17 Years of Natural Language Tutoring

    ERIC Educational Resources Information Center

    Nye, Benjamin D.; Graesser, Arthur C.; Hu, Xiangen

    2014-01-01

    AutoTutor is a natural language tutoring system that has produced learning gains across multiple domains (e.g., computer literacy, physics, critical thinking). In this paper, we review the development, key research findings, and systems that have evolved from AutoTutor. First, the rationale for developing AutoTutor is outlined and the advantages…