Science.gov

Sample records for natural language parsers

  1. Policy-Based Management Natural Language Parser

    NASA Technical Reports Server (NTRS)

    James, Mark

    2009-01-01

    The Policy-Based Management Natural Language Parser (PBEM) is a rules-based approach to enterprise management that can be used to automate certain management tasks. This parser simplifies the management of a given endeavor by establishing policies to deal with situations that are likely to occur. Policies are operating rules that can be referred to as a means of maintaining order, security, consistency, or other ways of successfully furthering a goal or mission. PBEM provides a way of managing configuration of network elements, applications, and processes via a set of high-level rules or business policies rather than managing individual elements, thus switching the control to a higher level. This software allows unique management rules (or commands) to be specified and applied to a cross-section of the Global Information Grid (GIG). This software embodies a parser that is capable of recognizing and understanding conversational English. Because all possible dialect variants cannot be anticipated, a unique capability was developed that parses based on conversational intent rather than the exact way the words are used. This software can increase productivity by enabling a user to converse with the system in conversational English to define network policies. PBEM can be used in both manned and unmanned science-gathering programs. Because policy statements can be domain-independent, this software can be applied equally to a wide variety of applications.

  2. Application of a rules-based natural language parser to critical value reporting in anatomic pathology.

    PubMed

    Owens, Scott R; Balis, Ulysses G J; Lucas, David R; Myers, Jeffrey L

    2012-03-01

    Critical values in anatomic pathology are rare occurrences and difficult to define with precision. Nevertheless, accrediting institutions require effective and timely communication of all critical values generated by clinical and anatomic laboratories. Provisional gating criteria for potentially critical anatomic diagnoses have been proposed, with some success in their implementation reported in the literature. Ensuring effective communication is challenging, however, making the case for programmatic implementation of a turnkey-style integrated information technology solution. To address this need, we developed a generically deployable laboratory information system-based tool, using a tiered natural language processing predicate calculus inference engine to identify qualifying cases that meet criteria for critical diagnoses but lack an indication in the electronic medical record for an appropriate clinical discussion with the ordering physician of record. Using this tool, we identified an initial cohort of 13,790 cases over a 49-month period, which were further explored by reviewing the available electronic medical record for each patient. Of these cases, 35 (0.3%) were judged to require intervention in the form of direct communication between the attending pathologist and the clinical physician of record. In 8 of the 35 cases, this intervention resulted in the conveyance of new information to the requesting physician and/or a change in the patient's clinical plan. The very low percentage of such cases (0.058%) illustrates their rarity in daily practice, making it unlikely that manual identification/notification approaches alone can reliably manage them. The automated turnkey system was useful in avoiding missed handoffs of significant, clinically actionable diagnoses.

  3. COD::CIF::Parser: an error-correcting CIF parser for the Perl language

    PubMed Central

    Merkys, Andrius; Vaitkus, Antanas; Butkus, Justas; Okulič-Kazarinas, Mykolas; Kairys, Visvaldas; Gražulis, Saulius

    2016-01-01

    A syntax-correcting CIF parser, COD::CIF::Parser, is presented that can parse CIF 1.1 files and accurately report the position and the nature of the discovered syntactic problems. In addition, the parser is able to automatically fix the most common and the most obvious syntactic deficiencies of the input files. Bindings for Perl, C and Python programming environments are available. Based on COD::CIF::Parser, the cod-tools package for manipulating the CIFs in the Crystallography Open Database (COD) has been developed. The cod-tools package has been successfully used for continuous updates of the data in the automated COD data deposition pipeline, and to check the validity of COD data against the IUCr data validation guidelines. The performance, capabilities and applications of different parsers are compared. PMID:26937241

  4. The Accelerator Markup Language and the Universal Accelerator Parser

    SciTech Connect

    Sagan, D.; Forster, M.; Bates, D.A.; Wolski, A.; Schmidt, F.; Walker, N.J.; Larrieu, T.; Roblin, Y.; Pelaia, T.; Tenenbaum, P.; Woodley, M.; Reiche, S.; /UCLA

    2006-10-06

    A major obstacle to collaboration on accelerator projects has been the sharing of lattice description files between modeling codes. To address this problem, a lattice description format called Accelerator Markup Language (AML) has been created. AML is based upon the standard eXtensible Markup Language (XML) format; this provides the flexibility for AML to be easily extended to satisfy changing requirements. In conjunction with AML, a software library, called the Universal Accelerator Parser (UAP), is being developed to speed the integration of AML into any program. The UAP is structured to make it relatively straightforward (by giving appropriate specifications) to read and write lattice files in any format. This will allow programs that use the UAP code to read a variety of different file formats. Additionally, this will greatly simplify conversion of files from one format to another. Currently, besides AML, the UAP supports the MAD lattice format.

  5. Natural-Language Parser for PBEM

    NASA Technical Reports Server (NTRS)

    James, Mark

    2010-01-01

    A computer program called "Hunter" accepts, as input, a colloquial-English description of a set of policy-based-management rules, and parses that description into a form useable by policy-based enterprise management (PBEM) software. PBEM is a rules-based approach suitable for automating some management tasks. PBEM simplifies the management of a given enterprise through establishment of policies addressing situations that are likely to occur. Hunter was developed to have a unique capability to extract the intended meaning instead of focusing on parsing the exact ways in which individual words are used.

  6. An Introductory Lisp Parser.

    ERIC Educational Resources Information Center

    Loritz, Donald

    1987-01-01

    Gives a short grammar of the Lisp computer language. Presents an introductory English parser (Simparse) as an example of how to write a parser in Lisp. Lists references for further explanation. Intended as preparation for teachers who may use computer-assisted language instruction in the future. (LMO)

  7. Speed up of XML parsers with PHP language implementation

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2012-11-01

    In this paper, the authors introduce PHP5's XML implementation and show how to read, parse, and write a short and uncomplicated XML file using SimpleXML in a PHP environment. The possibilities for combining the PHP5 language with the XML standard are described. The details of the parsing process with SimpleXML are also clarified. A practical project, PHP-XML-MySQL, presents the advantages of XML implementation in PHP modules. This approach allows comparatively simple search of hierarchical XML data by means of PHP software tools. The proposed project includes a database, which can be extended with new data and new XML parsing functions.

  8. A natural language interface to databases

    NASA Technical Reports Server (NTRS)

    Ford, D. R.

    1988-01-01

    The development of a Natural Language Interface which is semantic-based and uses Conceptual Dependency representation is presented. The system was developed using Lisp and currently runs on a Symbolics Lisp machine. A key point is that the parser handles morphological analysis, which expands its capabilities of understanding more words.

  9. Expression and cut parser for CMS event data

    NASA Astrophysics Data System (ADS)

    Lista, Luca; Jones, Christopher D.; Petrucciani, Giovanni

    2010-04-01

    We present a parser to evaluate expressions and Boolean selections that is applied on CMS event data for event filtering and analysis purposes. The parser is based on Boost Spirit grammar definition, and uses Reflex dictionaries for class introspection. The parser allows for a natural definition of expressions and cuts in users' configurations, and provides good runtime performance compared to other existing parsers.
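
    The idea of evaluating user-written expressions and Boolean cuts against an object's attributes can be sketched in Python. This is only an illustration under stated assumptions, not the CMS parser: it uses Python's own expression syntax and the standard `ast` module instead of a Boost Spirit grammar, plain attribute lookup stands in for Reflex introspection, and the `muon` object with its `pt` and `eta` attributes is hypothetical.

```python
import ast
import operator
from types import SimpleNamespace

# Whitelisted operators; any other syntax in the expression is rejected.
_BIN = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}
_CMP = {ast.Lt: operator.lt, ast.LtE: operator.le, ast.Gt: operator.gt,
        ast.GtE: operator.ge, ast.Eq: operator.eq, ast.NotEq: operator.ne}


def evaluate(expr, obj):
    """Evaluate an arithmetic or Boolean expression against attributes of obj."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            # Attribute lookup stands in for class introspection.
            return getattr(obj, node.id)
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN:
            return _BIN[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not walk(node.operand)
        if isinstance(node, ast.BoolOp):
            values = [walk(v) for v in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.Compare):
            # Handles chained comparisons such as -2.4 < eta < 2.4.
            left = walk(node.left)
            for op, right_node in zip(node.ops, node.comparators):
                if type(op) not in _CMP:
                    raise ValueError("unsupported comparison")
                right = walk(right_node)
                if not _CMP[type(op)](left, right):
                    return False
                left = right
            return True
        raise ValueError("unsupported syntax: " + ast.dump(node))
    return walk(ast.parse(expr, mode="eval"))


# Hypothetical candidate object used only for demonstration.
muon = SimpleNamespace(pt=25.0, eta=-1.2)
selected = evaluate("pt > 20 and -2.4 < eta < 2.4", muon)
```

    Walking a whitelisted syntax tree, rather than calling `eval`, keeps a user's configuration string from running arbitrary code.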

  10. Parsing clinical text: how good are the state-of-the-art parsers?

    PubMed Central

    2015-01-01

    Background Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain. Methods In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) A Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which is developed based on pathology notes and clinical notes, containing 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using the 10-fold cross validation method. Finally we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Results Our results showed that the original parsers achieved lower performance in clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebank, all parsers achieved better performance, with the best performance from the Stanford parser that reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the combined clinical Treebanks and Penn Treebank were used, of the three parsers, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F

  11. NEWCAT: Parsing natural language using left-associative grammar

    SciTech Connect

    Hausser, R.

    1986-01-01

    This book shows that constituent structure analysis induces an irregular order of linear composition which is the direct cause of extreme computational inefficiency. It proposes an alternative left-associative grammar which operates with a regular order of linear compositions. Left-associative grammar is based on building up and cancelling valencies. Left-associative parsers differ from all other systems in that the history of the parse doubles as the linguistic analysis. Left-associative grammar is illustrated with two left-associative natural language parsers: one for German and one for English.

  12. Toward a theory of distributed word expert natural language parsing

    NASA Technical Reports Server (NTRS)

    Rieger, C.; Small, S.

    1981-01-01

    An approach to natural language meaning-based parsing in which the unit of linguistic knowledge is the word rather than the rewrite rule is described. In the word expert parser, knowledge about language is distributed across a population of procedural experts, each representing a word of the language, and each an expert at diagnosing that word's intended usage in context. The parser is structured around a coroutine control environment in which the generator-like word experts ask questions and exchange information in coming to collective agreement on sentence meaning. The word expert theory is advanced as a better cognitive model of human language expertise than the traditional rule-based approach. The technical discussion is organized around examples taken from the prototype LISP system which implements parts of the theory.

  13. Errors and Intelligence in Computer-Assisted Language Learning: Parsers and Pedagogues. Routledge Studies in Computer Assisted Language Learning

    ERIC Educational Resources Information Center

    Heift, Trude; Schulze, Mathias

    2012-01-01

    This book provides the first comprehensive overview of theoretical issues, historical developments and current trends in ICALL (Intelligent Computer-Assisted Language Learning). It assumes a basic familiarity with Second Language Acquisition (SLA) theory and teaching, CALL and linguistics. It is of interest to upper undergraduate and/or graduate…

  14. PASCAL LR(1) Parser Generator System

    1988-05-04

    LRSYS is a complete LR(1) parser generator system written entirely in a portable subset of Pascal. The system, LRSYS, includes a grammar analyzer program (LR) which reads a context-free (BNF) grammar as input and produces LR(1) parsing tables as output, a lexical analyzer generator (LEX) which reads regular expressions created by the REG process as input and produces lexical tables as output, and various parser skeletons that get merged with the tables to produce complete parsers (SMAKE). Current parser skeletons include Pascal, FORTRAN 77, and C. Other language skeletons can easily be added to the system. LRSYS is based on the LR program.

  15. Natural Language Processing.

    ERIC Educational Resources Information Center

    Chowdhury, Gobinda G.

    2003-01-01

    Discusses issues related to natural language processing, including theoretical developments; natural language understanding; tools and techniques; natural language text processing systems; abstracting; information extraction; information retrieval; interfaces; software; Internet, Web, and digital library applications; machine translation for…

  16. A Morphological Parser for Linguistic Exploration.

    ERIC Educational Resources Information Center

    Weber, David

    The computerized morphological parser, AMPLE, grew out of work in computer assisted dialect adaptation. AMPLE contains no language-specific code, but is controlled entirely through external, user-written files, the notations of which were designed for linguists. AMPLE's constructs are linguistic: e.g., allomorph, morpheme, conditioning…

  17. Natural language parsing in a hybrid connectionist-symbolic architecture

    NASA Astrophysics Data System (ADS)

    Mueller, Adrian; Zell, Andreas

    1991-03-01

    Most connectionist parsers either cannot guarantee the correctness of their derivations or have to simulate a serial flow of control. In the first case, users have to restrict the tasks (e.g. parse less complex or shorter sentences) of the parser or they need to believe in the soundness of the result. In the second case, the resulting network has lost most of its attractiveness because seriality needs to be hard-coded into the structure of the net. We here present a hybrid symbolic connectionist parser, which was designed to fulfill the following goals: (1) parsing of sentences without length restriction, (2) soundness and completeness for any context-free grammar, and (3) learning the applicability of parsing rules with a neural network. Our hybrid architecture consists of a serial parsing algorithm and a trainable net. BrainC (Backtracking and Backpropagation in C) combines the well-known shift-reduce parsing technique with backtracking with a backpropagation network to learn and represent the typical properties of the trained natural language grammars. The system has been implemented as a subsystem of the Rochester Connectionist Simulator (RCS) on SUN workstations and was tested with several grammars for English and German. We discuss how BrainC reached its design goals and what results we observed.
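
    The serial half of such a hybrid, shift-reduce parsing with backtracking, can be sketched in Python. This is a minimal recognizer under stated assumptions, not BrainC itself: the toy grammar is hypothetical, reductions are tried in a fixed order where BrainC would consult its trained network, and the grammar is assumed to have no empty rules and no cycles of unit rules (otherwise the search may not terminate).

```python
def recognize(tokens, grammar, start="S"):
    """Backtracking shift-reduce recognizer.

    tokens  : tuple of category symbols (already tagged words)
    grammar : list of (lhs, rhs) pairs, rhs being a non-empty tuple of symbols
    """
    def step(stack, i):
        # Accept when all input is consumed and only the start symbol remains.
        if i == len(tokens) and stack == (start,):
            return True
        # Try every reduction whose right-hand side matches the stack top.
        for lhs, rhs in grammar:
            n = len(rhs)
            if n <= len(stack) and stack[-n:] == rhs:
                if step(stack[:-n] + (lhs,), i):
                    return True
        # If no reduction led to success, backtrack into a shift.
        if i < len(tokens):
            return step(stack + (tokens[i],), i + 1)
        return False

    return step((), 0)


# Hypothetical toy grammar over part-of-speech categories.
TOY_GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
]
```

    A learned ordering of the candidate reductions would not change which sentences are accepted, only how much backtracking is needed to find a derivation.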

  18. The parser generator as a general purpose tool

    NASA Technical Reports Server (NTRS)

    Noonan, R. E.; Collins, W. R.

    1985-01-01

    The parser generator has proven to be an extremely useful, general purpose tool. It can be used effectively by programmers having only a knowledge of grammars and no training at all in the theory of formal parsing. Some of the application areas for which a table-driven parser can be used include interactive query languages, menu systems, translators, and programming support tools. Each of these is illustrated by an example grammar.

  19. Natural language generation

    NASA Astrophysics Data System (ADS)

    Maybury, Mark T.

    The goal of natural language generation is to replicate human writers or speakers: to generate fluent, grammatical, and coherent text or speech. Produced language, using both explicit and implicit means, must clearly and effectively express some intended message. This demands the use of a lexicon and a grammar together with mechanisms which exploit semantic, discourse and pragmatic knowledge to constrain production. Furthermore, special processors may be required to guide focus, extract presuppositions, and maintain coherency. As with interpretation, generation may require knowledge of the world, including information about the discourse participants as well as knowledge of the specific domain of discourse. All of these processes and knowledge sources must cooperate to produce well-written, unambiguous language. Natural language generation has received less attention than language interpretation due to the nature of language: it is important to interpret all the ways of expressing a message but we need to generate only one. Furthermore, the generative task can often be accomplished by canned text (e.g., error messages or user instructions). The advent of more sophisticated computer systems, however, has intensified the need to express multisentential English.

  20. Natural language modeling

    SciTech Connect

    Sharp, J.K.

    1997-11-01

    This seminar describes a process and methodology that uses structured natural language to enable the construction of precise information requirements directly from users, experts, and managers. The main focus of this natural language approach is to create the precise information requirements and to do it in such a way that the business and technical experts are fully accountable for the results. These requirements can then be implemented using appropriate tools and technology. This requirement set is also a universal learning tool because it has all of the knowledge that is needed to understand a particular process (e.g., expense vouchers, project management, budget reviews, tax, laws, machine function).

  1. A Table Look-Up Parser in Online ILTS Applications

    ERIC Educational Resources Information Center

    Chen, Liang; Tokuda, Naoyuki; Hou, Pingkui

    2005-01-01

    A simple table look-up parser (TLUP) has been developed for parsing and consequently diagnosing syntactic errors in semi-free formatted learners' input sentences of an intelligent language tutoring system (ILTS). The TLUP finds a parse tree for a correct version of an input sentence, diagnoses syntactic errors of the learner by tracing and…

  2. Left-corner unification-based natural language processing

    SciTech Connect

    Lytinen, S.L.; Tomuro, N.

    1996-12-31

    In this paper, we present an efficient algorithm for parsing natural language using unification grammars. The algorithm is an extension of left-corner parsing, a bottom-up algorithm which utilizes top-down expectations. The extension exploits unification grammar's uniform representation of syntactic, semantic, and domain knowledge, by incorporating all types of grammatical knowledge into parser expectations. In particular, we extend the notion of the reachability table, which provides information as to whether or not a top-down expectation can be realized by a potential subconstituent, by including all types of grammatical information in table entries, rather than just phrase structure information. While our algorithm's worst-case computational complexity is no better than that of many other algorithms, we present empirical testing in which average-case linear time performance is achieved. Our testing indicates this to be much improved average-case performance over previous left-corner techniques.
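
    The plain phrase-structure reachability table that the abstract takes as its starting point can be sketched in Python. This is a minimal illustration, not the authors' extended table: it records only category symbols (no semantic or domain features), the toy grammar is hypothetical, and every rule is assumed to have a non-empty right-hand side.

```python
def reachability(grammar):
    """Map each category to the set of categories that can begin it.

    grammar : dict mapping a left-hand side to a list of right-hand sides,
              each a non-empty list of symbols.
    The result is the reflexive-transitive closure of the left-corner relation.
    """
    symbols = set(grammar)
    for rules in grammar.values():
        for rhs in rules:
            symbols.add(rhs[0])
    reach = {s: {s} for s in symbols}

    changed = True
    while changed:
        changed = False
        for lhs, rules in grammar.items():
            for rhs in rules:
                # Whatever can begin the first daughter can begin the mother.
                new = reach[rhs[0]] - reach[lhs]
                if new:
                    reach[lhs] |= new
                    changed = True
    return reach


# Hypothetical toy grammar.
TOY = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Name"]],
    "VP": [["V", "NP"]],
}
TABLE = reachability(TOY)
```

    A left-corner parser expecting an `S` can then discard a bottom-up constituent such as `V` immediately, because `V` is not in `TABLE["S"]`.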

  3. Syntactic dependency parsers for biomedical-NLP.

    PubMed

    Cohen, Raphael; Elhadad, Michael

    2012-01-01

    Syntactic parsers have made a leap in accuracy and speed in recent years. The high order structural information provided by dependency parsers is useful for a variety of NLP applications. We present a biomedical model for the EasyFirst parser, a fast and accurate parser for creating Stanford Dependencies. We evaluate the models trained in the biomedical domains of EasyFirst and Clear-Parser in a number of task oriented metrics. Both parsers provide state-of-the-art speed and accuracy in the Genia of over 89%. We show that Clear-Parser excels at tasks relating to negation identification while EasyFirst excels at tasks relating to Named Entities and is more robust to changes in domain.

  4. Programming Languages, Natural Languages, and Mathematics

    ERIC Educational Resources Information Center

    Naur, Peter

    1975-01-01

    Analogies are drawn between the social aspects of programming and similar aspects of mathematics and natural languages. By analogy with the history of auxiliary languages it is suggested that Fortran and Cobol will remain dominant. (Available from the Association of Computing Machinery, 1133 Avenue of the Americas, New York, NY 10036.) (Author/TL)

  5. Interactive Cohort Identification of Sleep Disorder Patients Using Natural Language Processing and i2b2

    PubMed Central

    Chen, W.; Kowatch, R.; Lin, S.; Splaingard, M.

    2015-01-01

    Summary Nationwide Children’s Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semi-structured sleep study reports. The system was shown to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. Objective We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. Methods We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and an NLP-based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. Results 26,550 concepts were extracted with 99% being textual concepts. 1.01 million facts were extracted from sleep study documents such as demographic information, sleep study lab results, medications, procedures, diagnoses, among others. The average accuracy of terminology parsing was over 83% when comparing against those by experts. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly from a few weeks to a few seconds. Conclusion Natural language processing was shown to be powerful for quickly converting large amounts of semi-structured or unstructured clinical data into discrete concepts, which in combination with intuitive domain-specific ontologies, allows fast and effective interactive cohort identification through the i2b2 platform for research and clinical use. PMID:26171080
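
    A regular expression parser for numeric concepts, of the general kind the abstract mentions, might look like the following Python sketch. Everything specific here is an assumption made for illustration: the `label: value unit` line layout, the field names, and the sample report are hypothetical and are not taken from the study's documents.

```python
import re

# One "label: number unit" pair per line; the unit is optional.
LINE = re.compile(
    r"\s*(?P<name>[^:]+?)\s*:\s*"
    r"(?P<value>-?\d+(?:\.\d+)?)\s*"
    r"(?P<unit>\S*)\s*"
)


def extract_numeric(report):
    """Return {concept name: (numeric value, unit)} for lines that match."""
    facts = {}
    for line in report.splitlines():
        match = LINE.fullmatch(line)
        if match:
            facts[match.group("name")] = (
                float(match.group("value")),
                match.group("unit"),
            )
    return facts


# Hypothetical report fragment.
SAMPLE = """Total sleep time: 412.5 min
Sleep efficiency: 87 %
Impression: mild obstructive sleep apnea"""
FACTS = extract_numeric(SAMPLE)
```

    Lines whose value is free text, such as the impression above, do not match and would be left to a separate parser for textual concepts.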

  6. A natural language interface for real-time dialogue in the flight domain

    NASA Technical Reports Server (NTRS)

    Ali, M.; Ai, C.-S.; Ferber, H. J.

    1986-01-01

    A flight expert system (FLES) is being developed to assist pilots in monitoring, diagnosing and recovering from in-flight faults. To provide a communications interface between the flight crew and FLES, a natural language interface (NALI) has been implemented. Input to NALI is processed by three processors: (1) the semantic parser, (2) the knowledge retriever, and (3) the response generator. The architecture of NALI has been designed to process both temporal and nontemporal queries. Provisions have also been made to reduce the number of system modifications required for adapting NALI to other domains. This paper describes the architecture and implementation of NALI.

  7. Readings in natural language processing

    SciTech Connect

    Grosz, B.J.; Jones, K.S.; Webber, B.L.

    1986-01-01

    The book presents papers on natural language processing, focusing on the central issues of representation, reasoning, and recognition. The introduction discusses theoretical issues, historical developments, and current problems and approaches. The book presents work in syntactic models (parsing and grammars), semantic interpretation, discourse interpretation, language action and intentions, language generation, and systems.

  8. Parser Combinators: a Practical Application for Generating Parsers for NMR Data.

    PubMed

    Fenwick, Matthew; Weatherby, Gerard; Ellis, Heidi Jc; Gryk, Michael R

    2013-01-01

    Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of large data sets to the BMRB (Bio-Magnetic Resonance Data Bank). This data is stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. Translation of the grammatical model to an executable parser may be automatically accomplished. This paper will show how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional programming world known as "parser combinators". This paper demonstrates the effectiveness of a principled approach to file specification and parsing. This paper also builds upon our previous work [1], in that 1) it applies concepts from Functional Programming (which is relevant even though the implementation language, Java, is more mainstream than Functional Programming), and 2) all work and accomplishments from this project will be made available under standard open source licenses to provide the community with the opportunity to learn from our techniques and methods.
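
    The parser-combinator method can be illustrated with a short Python sketch. It is not the authors' Java implementation and does not cover the NMR-Star grammar: the combinator names (`token`, `seq`, `many`, `fmap`) and the toy `_tag value` grammar are assumptions chosen for illustration. Each parser is a function from `(text, pos)` to either `(value, new_pos)` or `None`, and larger parsers are built by composing smaller ones in the shape of the BNF rules.

```python
import re


def token(pattern):
    """Parser for a regular expression, skipping leading whitespace."""
    rx = re.compile(r"\s*(" + pattern + ")")

    def parse(text, pos):
        m = rx.match(text, pos)
        return (m.group(1), m.end()) if m else None
    return parse


def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def many(parser):
    """Apply a parser zero or more times (it must consume input)."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None:
                return values, pos
            value, pos = result
            values.append(value)
    return parse


def fmap(func, parser):
    """Transform the value produced by a parser."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, pos = result
        return func(value), pos
    return parse


# Toy grammar:  items ::= item* ;  item ::= tag value
tag = token(r"_[A-Za-z0-9_.]+")
value = token(r"[^\s_]\S*")
item = fmap(tuple, seq(tag, value))
items = fmap(dict, many(item))

TEXT = "_Entry.ID 15000\n_Entry.Title example"
PARSED, END = items(TEXT, 0)
```

    Because each grammar rule maps onto one combinator expression, the executable parser stays close to the written grammar, which is the property the abstract emphasizes.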

  9. Parser Combinators: a Practical Application for Generating Parsers for NMR Data

    PubMed Central

    Fenwick, Matthew; Weatherby, Gerard; Ellis, Heidi JC; Gryk, Michael R.

    2013-01-01

    Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of large data sets to the BMRB (Bio-Magnetic Resonance Data Bank). This data is stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. Translation of the grammatical model to an executable parser may be automatically accomplished. This paper will show how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional programming world known as “parser combinators”. This paper demonstrates the effectiveness of a principled approach to file specification and parsing. This paper also builds upon our previous work [1], in that 1) it applies concepts from Functional Programming (which is relevant even though the implementation language, Java, is more mainstream than Functional Programming), and 2) all work and accomplishments from this project will be made available under standard open source licenses to provide the community with the opportunity to learn from our techniques and methods. PMID:24352525

  10. Distributed problem solving and natural language understanding models

    NASA Technical Reports Server (NTRS)

    Rieger, C.

    1980-01-01

    A theory of organization and control for a meaning-based language understanding system is mapped out. In this theory, words, rather than rules, are the units of knowledge, and assume the form of procedural entities which execute as generator-like coroutines. Parsing a sentence in context demands a control environment in which experts can ask questions of each other, forward hints and suggestions to each other, and suspend. The theory is a cognitive theory of both language representation and parser control.

  11. A Natural Language Graphics System.

    ERIC Educational Resources Information Center

    Brown, David C.; Kwasny, Stan C.

    This report describes an experimental system for drawing simple pictures on a computer graphics terminal using natural language input. The system is capable of drawing lines, points, and circles on command from the user, as well as answering questions about system capabilities and objects on the screen. Erasures are permitted and language input…

  12. Advances in natural language processing.

    PubMed

    Hirschberg, Julia; Manning, Christopher D

    2015-07-17

    Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area.

  13. Connectionist natural language parsing with BrainC

    NASA Astrophysics Data System (ADS)

    Mueller, Adrian; Zell, Andreas

    1991-08-01

    A close examination of pure neural parsers shows that they either could not guarantee the correctness of their derivations or had to hard-code seriality into the structure of the net. The authors therefore decided to use a hybrid architecture, consisting of a serial parsing algorithm and a trainable net. The system fulfills the following design goals: (1) parsing of sentences without length restriction, (2) soundness and completeness for any context-free language, and (3) learning the applicability of parsing rules with a neural network to increase the efficiency of the whole system. BrainC (backtracking and backpropagation in C) combines the well-known shift-reduce parsing technique with backtracking with a backpropagation network to learn and represent typical structures of the trained natural language grammars. The system has been implemented as a subsystem of the Rochester Connectionist Simulator (RCS) on SUN workstations and was tested with several grammars for English and German. The design of the system and then the results are discussed.

  14. Readings in natural language processing

    SciTech Connect

    Grosz, B.J.; Jones, K.S.; Webber, B.L.

    1986-01-01

    The papers assembled fall naturally into six groups dealing respectively with parsing and grammars, semantic interpretation, discourse interpretation (covering, for example, anaphor resolution), language actions and the intentions underlying them, language generation, and systems (notably interface systems). The chapter headings are treated broadly and are not taken to imply either that the authors are adopting a particular position about the way processing, and particularly input processing, should be done, or that problems and solutions assigned to one category have no relevance elsewhere. Many individual papers, placed in their most appropriate categories, also contribute to other areas.

  15. Models of natural language understanding.

    PubMed Central

    Bates, M

    1995-01-01

    This paper surveys some of the fundamental problems in natural language (NL) understanding (syntax, semantics, pragmatics, and discourse) and the current approaches to solving them. Some recent developments in NL processing include increased emphasis on corpus-based rather than example- or intuition-based work, attempts to measure the coverage and effectiveness of NL systems, dealing with discourse and dialogue phenomena, and attempts to use both analytic and stochastic knowledge. Critical areas for the future include grammars that are appropriate to processing large amounts of real language; automatic (or at least semi-automatic) methods for deriving models of syntax, semantics, and pragmatics; self-adapting systems; and integration with speech processing. Of particular importance are techniques that can be tuned to such requirements as full versus partial understanding and spoken language versus text. Portability (the ease with which one can configure an NL system for a particular application) is one of the largest barriers to application of this technology. PMID:7479812

  16. Unsupervised learning of natural languages

    PubMed Central

    Solan, Zach; Horn, David; Ruppin, Eytan; Edelman, Shimon

    2005-01-01

    We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The ADIOS (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics. PMID:16087885

  17. Unsupervised learning of natural languages.

    PubMed

    Solan, Zach; Horn, David; Ruppin, Eytan; Edelman, Shimon

    2005-08-16

    We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The ADIOS (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics. PMID:16087885

  18. Computational models of natural language processing

    SciTech Connect

    Bara, B.G.; Guida, G.

    1984-01-01

    The main concern in this work is the illustration of models for natural language processing, and the discussion of their role in the development of computational studies of language. Topics covered include the following: competence and performance in the design of natural language systems; planning and understanding speech acts by interpersonal games; a framework for integrating syntax and semantics; knowledge representation and natural language: extending the expressive power of proposition nodes; viewing parsing as word sense discrimination: a connectionist approach; a propositional language for text representation; from topic and focus of a sentence to linking in a text; language generation by computer; understanding the Chinese language; semantic primitives or meaning postulates: mental models of propositional representations; narrative complexity based on summarization algorithms; using focus to constrain language generation; and towards an integral model of language competence.

  19. Comparing Natural Language Retrieval: WIN and FREESTYLE.

    ERIC Educational Resources Information Center

    Pritchard-Schoch, Teresa

    1995-01-01

    Compares two natural language processing search engines, WIN (WESTLAW Is Natural) and FREESTYLE, developed by LEXIS. Legal issues in natural language queries were presented to identical libraries in both systems. Results showed that the editorials enhanced relevance; a search would be more thorough using both databases; and if only one system were…

  20. Natural language processing: an introduction

    PubMed Central

    Ohno-Machado, Lucila; Chapman, Wendy W

    2011-01-01

    Objectives: To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. Target audience: This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. Scope: We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field. PMID:21846786

  1. Knowledge representation and natural language processing

    SciTech Connect

    Weischedel, R.M.

    1986-07-01

    In principle, natural language and knowledge representation are closely related. This paper investigates this by demonstrating how several natural language phenomena, such as definite reference, ambiguity, ellipsis, ill-formed input, figures of speech, and vagueness, require diverse knowledge sources and reasoning. The breadth of kinds of knowledge needed to represent morphology, syntax, semantics, and pragmatics is surveyed. Furthermore, several current issues in knowledge representation, such as logic versus semantic nets, general-purpose versus special-purpose reasoners, adequacy of first-order logic, wait-and-see strategies, and default reasoning, are illustrated in terms of their relation to natural language processing and how natural language impacts the issues.

  2. Multilingual environment and natural acquisition of language

    NASA Astrophysics Data System (ADS)

    Takano, Shunichi; Nakamura, Shigeru

    2000-06-01

    Language and humans do not exist outside of nature. Not only babies but also adults can acquire a new language naturally if they are surrounded by a natural multilingual environment. The likely reason is that every human has the ability to grasp language as a whole and, at the same time, language has an order in which it is easiest for humans to acquire. The process of this natural acquisition and the results of an investigation into the order of Japanese vowels are introduced.

  3. Natural Artificial Languages: Low-Level Processes.

    ERIC Educational Resources Information Center

    Perlman, Gary

    This paper explores languages for communicating precise ideas within limited domains, which include mathematical notation and general purpose and high level computer programming languages. Low-level properties of such natural artificial languages are discussed, with emphasis on those in which names are chosen for concepts and symbols are chosen…

  4. The Natural Method of Language Learning: Systematized.

    ERIC Educational Resources Information Center

    Hobson, Arline B.

    In this monograph, the language and pedagogical concepts embodied in the Tucson Early Education Model are used to develop a systematized method of natural language learning. It is hypothesized that young children in school continually resystematize their language, and that conscious and systematic modeling by the teacher should accelerate this…

  5. Neural Network Computing and Natural Language Processing.

    ERIC Educational Resources Information Center

    Borchardt, Frank

    1988-01-01

    Considers the application of neural network concepts to traditional natural language processing and demonstrates that neural network computing architecture can: (1) learn from actual spoken language; (2) observe rules of pronunciation; and (3) reproduce sounds from the patterns derived by its own processes. (Author/CB)

  6. Thought beyond language: neural dissociation of algebra and natural language.

    PubMed

    Monti, Martin M; Parsons, Lawrence M; Osherson, Daniel N

    2012-08-01

    A central question in cognitive science is whether natural language provides combinatorial operations that are essential to diverse domains of thought. In the study reported here, we addressed this issue by examining the role of linguistic mechanisms in forging the hierarchical structures of algebra. In a 3-T functional MRI experiment, we showed that processing of the syntax-like operations of algebra does not rely on the neural mechanisms of natural language. Our findings indicate that processing the syntax of language elicits the known substrate of linguistic competence, whereas algebraic operations recruit bilateral parietal brain regions previously implicated in the representation of magnitude. This double dissociation argues against the view that language provides the structure of thought across all cognitive domains.

  7. A Natural Language Interface to Databases

    NASA Technical Reports Server (NTRS)

    Ford, D. R.

    1990-01-01

    The development of a Natural Language Interface (NLI) is presented which is semantic-based and uses Conceptual Dependency representation. The system was developed using Lisp and currently runs on a Symbolics Lisp machine.

  8. Evolution, brain, and the nature of language.

    PubMed

    Berwick, Robert C; Friederici, Angela D; Chomsky, Noam; Bolhuis, Johan J

    2013-02-01

    Language serves as a cornerstone for human cognition, yet much about its evolution remains puzzling. Recent research on this question parallels Darwin's attempt to explain both the unity of all species and their diversity. What has emerged from this research is that the unified nature of human language arises from a shared, species-specific computational ability. This ability has identifiable correlates in the brain and has remained fixed since the origin of language approximately 100 thousand years ago. Although songbirds share with humans a vocal imitation learning ability, with a similar underlying neural organization, language is uniquely human.

  9. Natural Language Description of Emotion

    ERIC Educational Resources Information Center

    Kazemzadeh, Abe

    2013-01-01

    This dissertation studies how people describe emotions with language and how computers can simulate this descriptive behavior. Although many non-human animals can express their current emotions as social signals, only humans can communicate about emotions symbolically. This symbolic communication of emotion allows us to talk about emotions that we…

  10. Porting a lexicalized-grammar parser to the biomedical domain.

    PubMed

    Rimell, Laura; Clark, Stephen

    2009-10-01

    This paper introduces a state-of-the-art, linguistically motivated statistical parser to the biomedical text mining community, and proposes a method of adapting it to the biomedical domain requiring only limited resources for data annotation. The parser was originally developed using the Penn Treebank and is therefore tuned to newspaper text. Our approach takes advantage of a lexicalized grammar formalism, Combinatory Categorial Grammar (CCG), to train the parser at a lower level of representation than full syntactic derivations. The CCG parser uses three levels of representation: a first level consisting of part-of-speech (POS) tags; a second level consisting of more fine-grained CCG lexical categories; and a third, hierarchical level consisting of CCG derivations. We find that simply retraining the POS tagger on biomedical data leads to a large improvement in parsing performance, and that using annotated data at the intermediate lexical category level of representation improves parsing accuracy further. We describe the procedure involved in evaluating the parser, and obtain accuracies for biomedical data in the same range as those reported for newspaper text, and higher than those previously reported for the biomedical resource on which we evaluate. Our conclusion is that porting newspaper parsers to the biomedical domain, at least for parsers which use lexicalized grammars, may not be as difficult as first thought. PMID:19141332

  11. Incremental Bayesian Category Learning from Natural Language

    ERIC Educational Resources Information Center

    Frermann, Lea; Lapata, Mirella

    2016-01-01

    Models of category learning have been extensively studied in cognitive science and primarily tested on perceptual abstractions or artificial stimuli. In this paper, we focus on categories acquired from natural language stimuli, that is, words (e.g., "chair" is a member of the furniture category). We present a Bayesian model that, unlike…

  12. Enhanced Text Retrieval Using Natural Language Processing.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth D.

    1998-01-01

    Defines natural language processing (NLP); describes the use of NLP in information retrieval (IR); provides seven levels of linguistic analysis: phonological, morphological, lexical, syntactic, semantic, discourse, and pragmatic. Discusses the commercial use of NLP in IR with the example of DR-LINK (Document Retrieval using LINguistic Knowledge)…

  13. Natural Language Information Retrieval: Progress Report.

    ERIC Educational Resources Information Center

    Perez-Carballo, Jose; Strzalkowski, Tomek

    2000-01-01

    Reports on the progress of the natural language information retrieval project, a joint effort led by GE (General Electric) Research, and its evaluation at the sixth TREC (Text Retrieval Conference). Discusses stream-based information retrieval, which uses alternative methods of document indexing; advanced linguistic streams; weighting; and query…

  14. Natural Language interactions with artificial experts

    SciTech Connect

    Finin, T.W.; Joshi, A.K.; Webber, B.F.

    1986-07-01

    The aim of this paper is to justify why Natural Language (NL) interaction, of a very rich functionality, is critical to the effective use of Expert Systems and to describe what is needed and what has been done to support such interaction. Interactive functions discussed here include defining terms, paraphrasing, correcting misconceptions, avoiding misconceptions, and modifying questions.

  15. Two Interpretive Systems for Natural Language?

    ERIC Educational Resources Information Center

    Frazier, Lyn

    2015-01-01

    It is proposed that humans have available to them two systems for interpreting natural language. One system is familiar from formal semantics. It is a type based system that pairs a syntactic form with its interpretation using grammatical rules of composition. This system delivers both plausible and implausible meanings. The other proposed system…

  16. Brain readiness and the nature of language.

    PubMed

    Bouchard, Denis

    2015-01-01

    To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements, are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their "representations" may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. 
    Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the unique

  17. Brain readiness and the nature of language

    PubMed Central

    Bouchard, Denis

    2015-01-01

    To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements, are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their “representations” may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the

  18. Brain readiness and the nature of language.

    PubMed

    Bouchard, Denis

    2015-01-01

    To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements, are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their "representations" may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. 
    Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the unique

  19. Natural language processing, pragmatics, and verbal behavior

    PubMed Central

    Cherpas, Chris

    1992-01-01

    Natural Language Processing (NLP) is that part of Artificial Intelligence (AI) concerned with endowing computers with verbal and listener repertoires, so that people can interact with them more easily. Most attention has been given to accurately parsing and generating syntactic structures, although NLP researchers are finding ways of handling the semantic content of language as well. It is increasingly apparent that understanding the pragmatic (contextual and consequential) dimension of natural language is critical for producing effective NLP systems. While there are some techniques for applying pragmatics in computer systems, they are piecemeal, crude, and lack an integrated theoretical foundation. Unfortunately, there is little awareness that Skinner's (1957) Verbal Behavior provides an extensive, principled pragmatic analysis of language. The implications of Skinner's functional analysis for NLP and for verbal aspects of epistemology lead to a proposal for a “user expert,” a computer system whose area of expertise is the long-term computer user. The evolutionary nature of behavior suggests an AI technology known as genetic algorithms/programming for implementing such a system. PMID:22477052

  20. Natural language processing, pragmatics, and verbal behavior.

    PubMed

    Cherpas, C

    1992-01-01

    Natural Language Processing (NLP) is that part of Artificial Intelligence (AI) concerned with endowing computers with verbal and listener repertoires, so that people can interact with them more easily. Most attention has been given to accurately parsing and generating syntactic structures, although NLP researchers are finding ways of handling the semantic content of language as well. It is increasingly apparent that understanding the pragmatic (contextual and consequential) dimension of natural language is critical for producing effective NLP systems. While there are some techniques for applying pragmatics in computer systems, they are piecemeal, crude, and lack an integrated theoretical foundation. Unfortunately, there is little awareness that Skinner's (1957) Verbal Behavior provides an extensive, principled pragmatic analysis of language. The implications of Skinner's functional analysis for NLP and for verbal aspects of epistemology lead to a proposal for a "user expert," a computer system whose area of expertise is the long-term computer user. The evolutionary nature of behavior suggests an AI technology known as genetic algorithms/programming for implementing such a system.

  1. Linear separability in superordinate natural language concepts.

    PubMed

    Ruts, Wim; Storms, Gert; Hampton, James

    2004-01-01

    Two experiments are reported in which linear separability was investigated in superordinate natural language concept pairs (e.g., toiletry-sewing gear). Representations of the exemplars of semantically related concept pairs were derived in two to five dimensions using multidimensional scaling (MDS) of similarities based on possession of the concept features. Next, category membership, obtained from an exemplar generation study (in Experiment 1) and from a forced-choice classification task (in Experiment 2) was predicted from the coordinates of the MDS representation using log linear analysis. The results showed that all natural kind concept pairs were perfectly linearly separable, whereas artifact concept pairs showed several violations. Clear linear separability of natural language concept pairs is in line with independent cue models. The violations in the artifact pairs, however, yield clear evidence against the independent cue models.

  2. Natural language understanding and logic programming

    SciTech Connect

    Dahl, V.; Saint-Dizier, P.

    1985-01-01

    Logic programming has been used in many natural language understanding applications, mainly in the areas of analysis, metagrammatical formalisms, logical treatment of linguistic problems, and meaning representations for natural language. The particular methods and formal systems developed in this context usually exhibit attractive features of logic while remaining in the more pragmatic area of programming: conciseness, modularity, a declarative meaning that is independent from machine behaviour, and logical inference. All of these features, common to logic programming and to logic metagrammars, have been made possible through a chaining of various fundamental ideas. Outstanding among these are the resolution principle; Prolog itself; and the interpretation of logic as a programming language. The machines of a relatively near future are likely to incorporate many related capabilities while increasing their speed manyfold. The Japanese Fifth Generation Computer project has triggered efforts towards future generations of computer systems based on these concepts. The potential in understanding natural language through logic programming is growing rapidly, and it might be wise to integrate the various theoretical and practical aspects involved, rather than yielding to the temptation of using all the extra power for programming ad-hoc systems. This conference is an effort toward such an integration.

  3. Learning procedures from interactive natural language instructions

    NASA Technical Reports Server (NTRS)

    Huffman, Scott B.; Laird, John E.

    1994-01-01

    Despite its ubiquity in human learning, very little work has been done in artificial intelligence on agents that learn from interactive natural language instructions. In this paper, the problem of learning procedures from interactive, situated instruction is examined in which the student is attempting to perform tasks within the instructional domain, and asks for instruction when it is needed. Presented is Instructo-Soar, a system that behaves and learns in response to interactive natural language instructions. Instructo-Soar learns completely new procedures from sequences of instruction, and also learns how to extend its knowledge of previously known procedures to new situations. These learning tasks require both inductive and analytic learning. Instructo-Soar exhibits a multiple execution learning process in which initial learning has a rote, episodic flavor, and later executions allow the initially learned knowledge to be generalized properly.

  4. Automated database design from natural language input

    NASA Technical Reports Server (NTRS)

    Gomez, Fernando; Segami, Carlos; Delaune, Carl

    1995-01-01

    Users and programmers of small systems typically do not have the skills needed to design a database schema from an English description of a problem. This paper describes a system that automatically designs databases for such small applications from English descriptions provided by end-users. Although the system has been motivated by the space applications at Kennedy Space Center, and portions of it have been designed with that idea in mind, it can be applied to different situations. The system consists of two major components: a natural language understander and a problem-solver. The paper describes briefly the knowledge representation structures constructed by the natural language understander, and, then, explains the problem-solver in detail.

  5. An expert system for natural language processing

    NASA Technical Reports Server (NTRS)

    Hennessy, John F.

    1988-01-01

    A solution to the natural language processing problem is proposed that uses a rule-based system, written in OPS5, to replace the traditional parsing method. The advantages of using a rule-based system are explored. Specifically, the extensibility of a rule-based solution is discussed, as well as the value of maintaining rules that function independently. Finally, the power of using semantics to supplement the syntactic analysis of a sentence is considered.

  6. Natural Language Processing: Toward Large-Scale, Robust Systems.

    ERIC Educational Resources Information Center

    Haas, Stephanie W.

    1996-01-01

    Natural language processing (NLP) is concerned with getting computers to do useful things with natural language. Major applications include machine translation, text generation, information retrieval, and natural language interfaces. Reviews important developments since 1987 that have led to advances in NLP; current NLP applications; and problems…

  7. LCS: a natural language comprehension system

    NASA Astrophysics Data System (ADS)

    Trigano, Philippe; Talon, Benedicte; Baltazart, Didier; Demko, Christophe; Newstead, Emma

    1991-03-01

    LCS (Language Comprehension System) is a software package designed to improve man-machine communication with computer programs. Different simple structures and functions are available to build man-machine interfaces in natural language. A user may write a sentence in good English or in telegraphic style. The system uses pattern-matching techniques to detect misspelled (or mistyped) words and to correct them. Several methods of analysis are available at each level (lexical, syntactic, semantic, ...). A special knowledge acquisition system is used to introduce new words by giving a description in natural language. A semantic network is extended to a representation close to a connectionist graph, for a better understanding of polysemic words and ambiguities. An application is currently used for a man-machine interface of an expert system in computer-aided education, for a better dialogue with the user during the explanation of reasoning phase. The object of this paper is to present the LCS system, especially at the lexical level, the knowledge representation and acquisition level, and the semantic level (for pronoun references and ambiguity).
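
    Editorial illustration (not part of the record): the abstract does not say which pattern-matching technique LCS applies at the lexical level. As one plausible stand-in, the sketch below corrects mistyped words by approximate string matching against a known vocabulary, using the Python standard library's difflib. The vocabulary, the cutoff value, and the function names are all assumptions made for this sketch.

```python
import difflib

# Hypothetical vocabulary for a tutoring dialogue (an assumption, not from LCS).
VOCABULARY = ["explain", "reasoning", "student", "lesson", "answer"]


def correct_word(word: str, vocabulary=VOCABULARY, cutoff: float = 0.75) -> str:
    """Return the closest vocabulary entry, or the word unchanged if none is close."""
    lowered = word.lower()
    if lowered in vocabulary:
        return lowered
    # get_close_matches ranks candidates by a similarity ratio in [0, 1].
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_sentence(sentence: str) -> str:
    """Correct each whitespace-separated word independently."""
    return " ".join(correct_word(word) for word in sentence.split())
```

    Words that are not close to any vocabulary entry are passed through unchanged, so a later stage (for example, a knowledge acquisition step like the one the abstract mentions) could still treat them as new words.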

  8. Building a Natural Language Interface for the ATNF Pulsar Database for Speeding up Execution of Complex Queries

    NASA Astrophysics Data System (ADS)

    Tang, Rupert; Jenet, F.; Rangel, S.; Dartez, L.

    2010-01-01

    Until now, there have been no natural language interfaces (NLIs) available for querying a database of pulsars (rotating neutron stars emitting radiation at regular intervals). Currently, pulsar records are retrieved through an HTML form accessible via the Australia Telescope National Facility (ATNF) website, where one needs to be familiar with the pulsar attributes used by the interface (e.g. BLC). Using an NLI removes the need to learn form-specific formalism and allows execution of more powerful queries than those supported by the HTML form. Furthermore, for database access that requires comparing attributes across all the pulsar records (e.g. what is the fastest pulsar?), using an NLI to retrieve answers to such complex questions is much more efficient and less error-prone. This poster presents the first NLI ever created for the ATNF pulsar database (ATNF-Query) to facilitate database access using complex queries. ATNF-Query is built using a machine learning approach that induces a semantic parser from a question corpus; the innovative application is intended to provide pulsar researchers or laymen with an intelligent language understanding database system for friendly information access.

  9. Two interpretive systems for natural language?

    PubMed

    Frazier, Lyn

    2015-02-01

    It is proposed that humans have available to them two systems for interpreting natural language. One system is familiar from formal semantics. It is a type based system that pairs a syntactic form with its interpretation using grammatical rules of composition. This system delivers both plausible and implausible meanings. The other proposed system is one that uses the grammar together with knowledge of how the human production system works. It is token based and only delivers plausible meanings, including meanings based on a repaired input when the input might have been produced as a speech error.

  10. Intelligent CAI: An Author Aid for a Natural Language Interface.

    ERIC Educational Resources Information Center

    Burton, Richard R.; Brown, John Seely

    This report addresses the problems of using natural language (English) as the communication language for advanced computer-based instructional systems. The instructional environment places requirements on a natural language understanding system that exceed the capabilities of all existing systems, including: (1) efficiency, (2) habitability, (3)…

  11. An Overview of Computer-Based Natural Language Processing.

    ERIC Educational Resources Information Center

    Gevarter, William B.

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial…

  12. MGFp: an open Mascot Generic Format parser library implementation.

    PubMed

    Kirchner, Marc; Steen, Judith A J; Hamprecht, Fred A; Steen, Hanno

    2010-05-01

    Despite the efforts of the mass spectrometry (MS) community to migrate data representation toward modern file formats, legacy text formats still play an important role in MS data processing workflows. We provide a formal grammar and a portable, efficient C++ implementation for a Mascot Generic Format (MGF) parser. Software and technical documentation are available from http://software.steenlab.org/mgfp/. PMID:20334363
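
    For readers unfamiliar with the format, the following is a minimal sketch of reading MGF text, not the MGFp grammar or its C++ implementation. It assumes only the commonly used layout: spectra delimited by BEGIN IONS and END IONS, KEY=VALUE parameter lines, and whitespace-separated m/z and intensity pairs. Treating lines that start with "#" as comments is an assumption, and the function name is hypothetical.

```python
def parse_mgf(text):
    """Parse MGF-style text into a list of spectra (illustrative sketch only)."""
    spectra = []
    current = None  # the spectrum being read, or None outside a BEGIN/END block
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line:
                # Parameter line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                current["params"][key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z and intensity; any further columns are ignored
                fields = line.split()
                if len(fields) >= 2:
                    current["peaks"].append((float(fields[0]), float(fields[1])))
    return spectra

SAMPLE = """BEGIN IONS
TITLE=example spectrum
PEPMASS=500.25
CHARGE=2+
100.0 10.0
200.5 20.0
END IONS
"""

if __name__ == "__main__":
    for spectrum in parse_mgf(SAMPLE):
        print(spectrum["params"], spectrum["peaks"])
```

    Parameter values are kept as strings here; a fuller parser would validate them against the format's grammar, which is what the abstract says MGFp provides.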

  13. The Parser Doesn't Ignore Intransitivity, after All

    ERIC Educational Resources Information Center

    Staub, Adrian

    2007-01-01

    Several previous studies (B. C. Adams, C. Clifton, & D. C. Mitchell, 1998; D. C. Mitchell, 1987; R. P. G. van Gompel & M. J. Pickering, 2001) have explored the question of whether the parser initially analyzes a noun phrase that follows an intransitive verb as the verb's direct object. Three eye-tracking experiments examined this issue in more…

  14. Linking Parser Development to Acquisition of Syntactic Knowledge

    ERIC Educational Resources Information Center

    Omaki, Akira; Lidz, Jeffrey

    2015-01-01

    Traditionally, acquisition of syntactic knowledge and the development of sentence comprehension behaviors have been treated as separate disciplines. This article reviews a growing body of work on the development of incremental sentence comprehension mechanisms and discusses how a better understanding of the developing parser can shed light on two…

  15. Natural language processing and advanced information management

    NASA Technical Reports Server (NTRS)

    Hoard, James E.

    1989-01-01

    Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate the material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides a natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures that will satisfy the users' requests. The best candidate for an intelligent agent that can satisfactorily make use of RSs and transform documents (TDs) appears to be an object oriented data base (OODB). OODBs have, apparently, an inherent capacity to use the large numbers of RSs and TDs that will be required by an AIM system and an inherent capacity to use them in an effective way.

  16. Written Language Is as Natural as Spoken Language: A Biolinguistic Perspective

    ERIC Educational Resources Information Center

    Aaron, P. G.; Joshi, R. Malatesha

    2006-01-01

    A commonly held belief is that language is an aspect of the biological system since the capacity to acquire language is innate and evolved along Darwinian lines. Written language, on the other hand, is thought to be an artifact and a surrogate of speech; it is, therefore, neither natural nor biological. This disparaging view of written language,…

  17. An Evolutionary History of the Natural Language English and the Artificial Language FORTRAN.

    ERIC Educational Resources Information Center

    Koman, Joseph J., III

    1988-01-01

    Notes similarities between certain aspects of the development of the natural language English and the artificial language FORTRAN. Discusses evolutionary history, grammar, style, syntax, varieties, and attempts at standardization. Emphasizes modifications which natural and artificial languages have undergone. Suggests that some modifications were…

  18. Linguistic Analysis of Natural Language Communication with Computers.

    ERIC Educational Resources Information Center

    Thompson, Bozena Henisz

    Interaction with computers in natural language requires a language that is flexible and suited to the task. This study of natural dialogue was undertaken to reveal those characteristics which can make computer English more natural. Experiments were made in three modes of communication: face-to-face, terminal-to-terminal, and human-to-computer,…

  19. Understanding and representing natural language meaning

    SciTech Connect

    Waltz, D.L.; Maran, L.R.; Dorfman, M.H.; Dinitz, R.; Farwell, D.

    1982-12-01

    During this contract period the authors have: (a) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (b) written a parsing program which selects appropriate word and sentence meanings by a parallel process known as activation and inhibition; (c) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (d) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (e) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (f) constructed a general model for the representation of tense and aspect of verbs; (g) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.

  20. Understanding and representing natural language meaning

    NASA Astrophysics Data System (ADS)

    Waltz, D. L.; Maran, L. R.; Dorfman, M. H.; Dinitz, R.; Farwell, D.

    1982-12-01

    During this contract period the authors have: (1) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (2) written a parsing program which selects appropriate word and sentence meanings by a parallel process known as activation and inhibition; (3) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (4) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (5) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (6) constructed a general model for the representation of tense and aspect of verbs; (7) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.

  1. Understanding natural language for spacecraft sequencing

    NASA Technical Reports Server (NTRS)

    Katz, Boris; Brooks, Robert N., Jr.

    1987-01-01

    The paper describes a natural language understanding system, START, that translates English text into a knowledge base. The understanding and the generating modules of START share a Grammar which is built upon reversible transformations. Users can retrieve information by querying the knowledge base in English; the system then produces an English response. START can be easily adapted to many different domains. One such domain is spacecraft sequencing. A high-level overview of sequencing as it is practiced at JPL is presented in the paper, and three areas within this activity are identified for potential application of the START system. Examples are given of an actual dialog with START based on simulated data for the Mars Observer mission.

  2. Inferring heuristic classification hierarchies from natural language input

    NASA Technical Reports Server (NTRS)

    Hull, Richard; Gomez, Fernando

    1993-01-01

    A methodology for inferring hierarchies representing heuristic knowledge about the check out, control, and monitoring sub-system (CCMS) of the space shuttle launch processing system from natural language input is explained. Our method identifies failures explicitly and implicitly described in natural language by domain experts and uses those descriptions to recommend classifications for inclusion in the experts' heuristic hierarchies.

  3. Natural Language Processing in Game Studies Research: An Overview

    ERIC Educational Resources Information Center

    Zagal, Jose P.; Tomuro, Noriko; Shepitsen, Andriy

    2012-01-01

    Natural language processing (NLP) is a field of computer science and linguistics devoted to creating computer systems that use human (natural) language as input and/or output. The authors propose that NLP can also be used for game studies research. In this article, the authors provide an overview of NLP and describe some research possibilities…

  4. Tutorial on techniques and applications for natural language processing

    SciTech Connect

    Hayes, P.J.; Carbonell, J.G.

    1983-10-17

    Natural language communication with computers has long been a major goal of Artificial Intelligence both for what it can tell us about intelligence in general and for its practical utility - data bases, software packages, and AI-based expert systems all require flexible interfaces to a growing community of users who are not able or do not wish to communicate with computers in formal, artificial command languages. Whereas many of the fundamental problems of general natural language processing (NLP) by machine remain to be solved, the area has matured in recent years to the point where practical natural language interfaces to software systems can be constructed in many restricted, but nevertheless useful, circumstances. This tutorial is intended to survey the current state of applied natural language processing by presenting computationally effective NLP techniques, by discussing the range of capabilities these techniques provide for NLP systems, and by discussing their current limitations. Following the introduction, this document is divided into two major sections: the first on language recognition strategies at the single sentence level, and the second on language processing issues that arise during interactive dialogues. In both cases, we concentrate on those aspects of the problem appropriate for interactive natural language interfaces, but relate the techniques and systems discussed to more general work on natural language, independent of application domain.

  5. Storing files in a parallel computing system based on user-specified parser function

    DOEpatents

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron

    2014-10-21

    Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.

  6. A Hybrid Architecture For Natural Language Understanding

    NASA Astrophysics Data System (ADS)

    Loatman, R. Bruce

    1987-05-01

    The PRC Adaptive Knowledge-based Text Understanding System (PAKTUS) is an environment for developing natural language understanding (NLU) systems. It uses a knowledge-based approach in an integrated hybrid architecture based on a factoring of the NLU problem into its lexical, syntactic, conceptual, domain-specific, and pragmatic components. The goal is a robust system that benefits from the strengths of several NLU methodologies, each applied where most appropriate. PAKTUS employs a frame-based knowledge representation and associative networks throughout. The lexical component uses morphological knowledge and word experts. Syntactic knowledge is represented in an Augmented Transition Network (ATN) grammar that incorporates rule-based programming. Case grammar is used for canonical conceptual representation with constraints. Domain-specific templates represent knowledge about specific applications as patterns of the form used in logic programming. Pragmatic knowledge may augment any of the other types and is added wherever needed for a particular domain. The system has been constructed in an interactive graphic programming environment. It has been used successfully to build a prototype front end for an expert system. This integration of existing technologies makes limited but practical NLU feasible now for narrow, well-defined domains.

  7. Natural language metaphors covertly influence reasoning.

    PubMed

    Thibodeau, Paul H; Boroditsky, Lera

    2013-01-01

    Metaphors pervade discussions of social issues like climate change, the economy, and crime. We ask how natural language metaphors shape the way people reason about such social issues. In previous work, we showed that describing crime metaphorically as a beast or a virus led people to generate different solutions to a city's crime problem. In the current series of studies, instead of asking people to generate a solution on their own, we provided them with a selection of possible solutions and asked them to choose the best ones. We found that metaphors influenced people's reasoning even when they had a set of options available to compare and select among. These findings suggest that metaphors can influence not just what solution comes to mind first, but also which solution people think is best, even when given the opportunity to explicitly compare alternatives. Further, we tested whether participants were aware of the metaphor. We found that very few participants thought the metaphor played an important part in their decision. Further, participants who had no explicit memory of the metaphor were just as much affected by the metaphor as participants who were able to remember the metaphorical frame. These findings suggest that metaphors can act covertly in reasoning. Finally, we examined the role of political affiliation on reasoning about crime. The results confirm our previous findings that Republicans are more likely to generate enforcement and punishment solutions for dealing with crime, and are less swayed by metaphor than are Democrats or Independents.

  8. Natural language metaphors covertly influence reasoning.

    PubMed

    Thibodeau, Paul H; Boroditsky, Lera

    2013-01-01

    Metaphors pervade discussions of social issues like climate change, the economy, and crime. We ask how natural language metaphors shape the way people reason about such social issues. In previous work, we showed that describing crime metaphorically as a beast or a virus led people to generate different solutions to a city's crime problem. In the current series of studies, instead of asking people to generate a solution on their own, we provided them with a selection of possible solutions and asked them to choose the best ones. We found that metaphors influenced people's reasoning even when they had a set of options available to compare and select among. These findings suggest that metaphors can influence not just what solution comes to mind first, but also which solution people think is best, even when given the opportunity to explicitly compare alternatives. Further, we tested whether participants were aware of the metaphor. We found that very few participants thought the metaphor played an important part in their decision. Further, participants who had no explicit memory of the metaphor were just as much affected by the metaphor as participants who were able to remember the metaphorical frame. These findings suggest that metaphors can act covertly in reasoning. Finally, we examined the role of political affiliation on reasoning about crime. The results confirm our previous findings that Republicans are more likely to generate enforcement and punishment solutions for dealing with crime, and are less swayed by metaphor than are Democrats or Independents. PMID:23301009

  9. Concepts and implementations of natural language query systems

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung

    1984-01-01

    The currently developed user language interfaces of information systems are generally intended for serious users. These interfaces commonly ignore potentially the largest user group, i.e., casual users. This project discusses the concepts and implementations of a natural query language system which satisfies the nature and information needs of casual users by allowing them to communicate with the system in the form of their native (natural) language. In addition, a framework for the development of such an interface is also introduced for the MADAM (Multics Approach to Data Access and Management) system at the University of Southwestern Louisiana.

  10. Semantics of Context-Free Fragments of Natural Languages.

    ERIC Educational Resources Information Center

    Suppes, Patrick

    The objective of this paper is to combine the viewpoint of model-theoretic semantics and generative grammar, to define semantics for context-free languages, and to apply the results to some fragments of natural language. Following the introduction in the first section, Section 2 describes a simple artificial example to illustrate how a semantic…

  11. NLP Meets the Jabberwocky: Natural Language Processing in Information Retrieval.

    ERIC Educational Resources Information Center

    Feldman, Susan

    1999-01-01

    Focuses on natural language processing (NLP) in information retrieval. Defines the seven levels at which people extract meaning from text/spoken language. Discusses the stages of information processing; how an information retrieval system works; advantages to adding full NLP to information retrieval systems; and common problems with information…

  12. Natural Chunks of Language: Teaching Speech Through Speech.

    ERIC Educational Resources Information Center

    Henry, Alex

    1996-01-01

    Outlines a teaching approach for oral English for special purposes that inputs chunks of natural language orally. These "chunks" can be segmented by the learner into fixed and variable units, while simultaneously making the learner aware of the paradigmatic, syntagmatic, and phonological aspects of the language being taught. This approach uses…

  13. Survey of Natural Language Processing Techniques in Bioinformatics.

    PubMed

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.

  14. Survey of Natural Language Processing Techniques in Bioinformatics

    PubMed Central

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  15. Natural Language Processing Neural Network Considering Deep Cases

    NASA Astrophysics Data System (ADS)

    Sagara, Tsukasa; Hagiwara, Masafumi

    In this paper, we propose a novel neural network considering deep cases. It can learn knowledge from natural language documents and can perform recall and inference. Various natural language processing techniques using neural networks have been proposed. However, the natural language sentences used in these techniques consist of only a few words, and the techniques cannot handle complicated sentences. In order to solve these problems, the proposed network divides natural language sentences into a sentence layer, a knowledge layer, ten kinds of deep case layers and a dictionary layer. It can learn the relations among sentences and among words by dividing sentences. The advantages of the method are as follows: (1) ability to handle complicated sentences; (2) ability to restructure sentences; (3) usage of the conceptual dictionary, Goi-Taikei, as the long term memory in a brain. Two kinds of experiments were carried out by using goo dictionary and Wikipedia as knowledge sources. Superior performance of the proposed neural network has been confirmed.

  16. A Natural Language Interface Concordant with a Knowledge Base.

    PubMed

    Han, Yong-Jin; Park, Seong-Bae; Park, Se-Young

    2016-01-01

    The discordance between expressions interpretable by a natural language interface (NLI) system and those answerable by a knowledge base is a critical problem in the field of NLIs. In order to solve this discordance problem, this paper proposes a method to translate natural language questions into formal queries that can be generated from a graph-based knowledge base. The proposed method considers a subgraph of a knowledge base as a formal query. Thus, all formal queries corresponding to a concept or a predicate in the knowledge base can be generated prior to query time and all possible natural language expressions corresponding to each formal query can also be collected in advance. A natural language expression has a one-to-one mapping with a formal query. Hence, a natural language question is translated into a formal query by matching the question with the most appropriate natural language expression. If the confidence of this matching is not sufficiently high, the proposed method rejects the question and does not answer it. Multipredicate queries are processed by regarding them as a set of collected expressions. The experimental results show that the proposed method thoroughly handles answerable questions from the knowledge base and rejects unanswerable ones effectively. PMID:26904105
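
    The match-or-reject step described in this abstract can be illustrated with a small sketch. The abstract does not specify the matching model or how confidence is computed, so the code below substitutes a simple token-overlap (Jaccard) score; the expression table, the formal query strings, the threshold, and all names are hypothetical.

```python
import re

# Hypothetical table of pre-collected expressions, each mapped one-to-one
# to a formal query. Both sides are invented for illustration.
EXPRESSIONS = {
    "who wrote the book": "QUERY_AUTHOR_OF_BOOK",
    "where was the author born": "QUERY_BIRTHPLACE_OF_AUTHOR",
    "when was the company founded": "QUERY_FOUNDING_DATE_OF_COMPANY",
}

def tokens(text):
    """Lowercase word tokens with punctuation removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def confidence(question, expression):
    """Jaccard overlap of token sets; a stand-in for the paper's confidence."""
    q, e = tokens(question), tokens(expression)
    if not q or not e:
        return 0.0
    return len(q & e) / len(q | e)

def translate(question, threshold=0.5):
    """Return the formal query of the best-matching expression, or None
    (reject the question) when the best confidence is below the threshold."""
    best = max(EXPRESSIONS, key=lambda expr: confidence(question, expr))
    if confidence(question, best) < threshold:
        return None
    return EXPRESSIONS[best]

if __name__ == "__main__":
    print(translate("Who wrote this book?"))
    print(translate("What is the fastest pulsar?"))
```

    Returning None for a low-confidence match mirrors the rejection behavior the abstract describes: a question that the knowledge base cannot answer is declined rather than mapped to the nearest query.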

  17. A Natural Language Interface Concordant with a Knowledge Base

    PubMed Central

    Han, Yong-Jin; Park, Seong-Bae; Park, Se-Young

    2016-01-01

    The discordance between expressions interpretable by a natural language interface (NLI) system and those answerable by a knowledge base is a critical problem in the field of NLIs. In order to solve this discordance problem, this paper proposes a method to translate natural language questions into formal queries that can be generated from a graph-based knowledge base. The proposed method considers a subgraph of a knowledge base as a formal query. Thus, all formal queries corresponding to a concept or a predicate in the knowledge base can be generated prior to query time and all possible natural language expressions corresponding to each formal query can also be collected in advance. A natural language expression has a one-to-one mapping with a formal query. Hence, a natural language question is translated into a formal query by matching the question with the most appropriate natural language expression. If the confidence of this matching is not sufficiently high, the proposed method rejects the question and does not answer it. Multipredicate queries are processed by regarding them as a set of collected expressions. The experimental results show that the proposed method thoroughly handles answerable questions from the knowledge base and rejects unanswerable ones effectively. PMID:26904105

  18. A Natural Language Interface Concordant with a Knowledge Base.

    PubMed

    Han, Yong-Jin; Park, Seong-Bae; Park, Se-Young

    2016-01-01

    The discordance between expressions interpretable by a natural language interface (NLI) system and those answerable by a knowledge base is a critical problem in the field of NLIs. In order to solve this discordance problem, this paper proposes a method to translate natural language questions into formal queries that can be generated from a graph-based knowledge base. The proposed method considers a subgraph of a knowledge base as a formal query. Thus, all formal queries corresponding to a concept or a predicate in the knowledge base can be generated prior to query time and all possible natural language expressions corresponding to each formal query can also be collected in advance. A natural language expression has a one-to-one mapping with a formal query. Hence, a natural language question is translated into a formal query by matching the question with the most appropriate natural language expression. If the confidence of this matching is not sufficiently high, the proposed method rejects the question and does not answer it. Multipredicate queries are processed by regarding them as a set of collected expressions. The experimental results show that the proposed method thoroughly handles answerable questions from the knowledge base and rejects unanswerable ones effectively.

  19. Two Types of Definites in Natural Language

    ERIC Educational Resources Information Center

    Schwarz, Florian

    2009-01-01

    This thesis is concerned with the description and analysis of two semantically different types of definite articles in German. While the existence of distinct article paradigms in various Germanic dialects and other languages has been acknowledged in the descriptive literature for quite some time, the theoretical implications of their existence…

  20. Software Development Of XML Parser Based On Algebraic Tools

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2011-12-01

    This paper presents the development and implementation of software based on an algebraic method for XML data processing, which accelerates the XML parsing process. The nontraditional approach proposed here for fast XML navigation with algebraic tools contributes to ongoing efforts to create an easier, more user-friendly API for XML transformations. The proposed software for processing XML documents (the parser) is easy to use and can manage files with a strictly defined data structure. The purpose of the presented algorithm is to offer a new approach for searching and restructuring hierarchical XML data. This approach permits fast processing of XML documents, using an algebraic model developed in detail in previous works by the same authors. The proposed parsing mechanism is easily accessible to the web consumer, who is able to control XML file processing, to search for different elements (tags) in it, and to delete and add new XML content as well. The various tests presented show higher speed and lower resource consumption in comparison with some existing commercial parsers.

  1. Parent-Implemented Natural Language Paradigm to Increase Language and Play in Children with Autism

    ERIC Educational Resources Information Center

    Gillett, Jill N.; LeBlanc, Linda A.

    2007-01-01

    Three parents of children with autism were taught to implement the Natural Language Paradigm (NLP). Data were collected on parent implementation, multiple measures of child language, and play. The parents were able to learn to implement the NLP procedures quickly and accurately with beneficial results for their children. Increases in the overall…

  2. Natural Language Processing Techniques in Computer-Assisted Language Learning: Status and Instructional Issues.

    ERIC Educational Resources Information Center

    Holland, V. Melissa; Kaplan, Jonathan D.

    1995-01-01

    Describes the role of natural language processing (NLP) techniques, such as parsing and semantic analysis, within current language tutoring systems. Examines trends, design issues and tradeoffs, and potential contributions of NLP techniques with respect to instructional theory and educational practice. Addresses limitations and problems in using…

  3. Overview of Computer-based Natural Language Processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  4. Overview of computer-based natural language processing

    SciTech Connect

    Gevarter, W.B.

    1983-04-01

    Computer-based Natural Language-Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc. in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state-of-the-art of the technology, issues and research requirements, the major participants, and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  5. An overview of computer-based natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1983-01-01

    Computer based Natural Language Processing (NLP) is the key to enabling humans and their computer based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  6. The redundancy of recursion and infinity for natural language.

    PubMed

    Luuk, Erkki; Luuk, Hendrik

    2011-02-01

    An influential line of thought claims that natural language and arithmetic processing require recursion, a putative hallmark of human cognitive processing (Chomsky in Evolution of human language: biolinguistic perspectives. Cambridge University Press, Cambridge, pp 45-61, 2010; Fitch et al. in Cognition 97(2):179-210, 2005; Hauser et al. in Science 298(5598):1569-1579, 2002). First, we question the need for recursion in human cognitive processing by arguing that a generally simpler and less resource demanding process--iteration--is sufficient to account for human natural language and arithmetic performance. We argue that the only motivation for recursion, the infinity in natural language and arithmetic competence, is equally approachable by iteration and recursion. Second, we submit that the infinity in natural language and arithmetic competence reduces to imagining infinite embedding or concatenation, which is completely independent from the ability to implement infinite processing, and thus, independent from both recursion and iteration. Furthermore, we claim that a property of natural language is physically uncountable finity and not discrete infinity. PMID:20652723

  7. The redundancy of recursion and infinity for natural language.

    PubMed

    Luuk, Erkki; Luuk, Hendrik

    2011-02-01

    An influential line of thought claims that natural language and arithmetic processing require recursion, a putative hallmark of human cognitive processing (Chomsky in Evolution of human language: biolinguistic perspectives. Cambridge University Press, Cambridge, pp 45-61, 2010; Fitch et al. in Cognition 97(2):179-210, 2005; Hauser et al. in Science 298(5598):1569-1579, 2002). First, we question the need for recursion in human cognitive processing by arguing that a generally simpler and less resource demanding process--iteration--is sufficient to account for human natural language and arithmetic performance. We argue that the only motivation for recursion, the infinity in natural language and arithmetic competence, is equally approachable by iteration and recursion. Second, we submit that the infinity in natural language and arithmetic competence reduces to imagining infinite embedding or concatenation, which is completely independent from the ability to implement infinite processing, and thus, independent from both recursion and iteration. Furthermore, we claim that a property of natural language is physically uncountable finity and not discrete infinity.

  8. The integration hypothesis of human language evolution and the nature of contemporary languages

    PubMed Central

    Miyagawa, Shigeru; Ojima, Shiro; Berwick, Robert C.; Okanoya, Kazuo

    2014-01-01

    How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa et al. (2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, that holds that human language is composed of two components, E for expressive, and L for lexical. Each component has an antecedent in nature: E as found, for example, in birdsong, and L in, for example, the alarm calls of monkeys. E and L integrated uniquely in humans to give rise to language. A challenge to the Integration Hypothesis is that while these non-human systems are finite-state in nature, human language is known to require characterization by a non-finite state grammar. Our claim is that E and L, taken separately, are in fact finite-state; when a grammatical process crosses the boundary between E and L, it gives rise to the non-finite state character of human language. We provide empirical evidence for the Integration Hypothesis by showing that certain processes found in contemporary languages that have been characterized as non-finite state in nature can in fact be shown to be finite-state. We also speculate on how human language actually arose in evolution through the lens of the Integration Hypothesis. PMID:24936195

  9. The integration hypothesis of human language evolution and the nature of contemporary languages.

    PubMed

    Miyagawa, Shigeru; Ojima, Shiro; Berwick, Robert C; Okanoya, Kazuo

    2014-01-01

    How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa et al. (2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, that holds that human language is composed of two components, E for expressive, and L for lexical. Each component has an antecedent in nature: E as found, for example, in birdsong, and L in, for example, the alarm calls of monkeys. E and L integrated uniquely in humans to give rise to language. A challenge to the Integration Hypothesis is that while these non-human systems are finite-state in nature, human language is known to require characterization by a non-finite state grammar. Our claim is that E and L, taken separately, are in fact finite-state; when a grammatical process crosses the boundary between E and L, it gives rise to the non-finite state character of human language. We provide empirical evidence for the Integration Hypothesis by showing that certain processes found in contemporary languages that have been characterized as non-finite state in nature can in fact be shown to be finite-state. We also speculate on how human language actually arose in evolution through the lens of the Integration Hypothesis.

  10. Analyzing Learner Language: Towards a Flexible Natural Language Processing Architecture for Intelligent Language Tutors

    ERIC Educational Resources Information Center

    Amaral, Luiz; Meurers, Detmar; Ziai, Ramon

    2011-01-01

    Intelligent language tutoring systems (ILTS) typically analyze learner input to diagnose learner language properties and provide individualized feedback. Despite a long history of ILTS research, such systems are virtually absent from real-life foreign language teaching (FLT). Taking a step toward more closely linking ILTS research to real-life…

  11. Natural language processing and the Now-or-Never bottleneck.

    PubMed

    Gómez-Rodríguez, Carlos

    2016-01-01

    Researchers, motivated by the need to improve the efficiency of natural language processing tools to handle web-scale data, have recently arrived at models that remarkably match the expected features of human language processing under the Now-or-Never bottleneck framework. This provides additional support for said framework and highlights the research potential in the interaction between applied computational linguistics and cognitive science. PMID:27561430

  12. Artificial intelligence, expert systems, computer vision, and natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  13. Quicky location determination based on geographic keywords of natural language

    NASA Astrophysics Data System (ADS)

    Guo, Danhuai; Cui, Weihong

    2007-06-01

    In location determination based on natural language, a location is commonly found by describing the relationship between the undetermined position and one or several determined positions. This indicates that the uncertainty of location determination derives from the uncertainty of natural language processing, of spatial position description, and of spatial relationship description. Most current research and regular GIS software take certainty as a prerequisite and try to avoid uncertainty and its influence. The research reported in this paper is an attempt to create a new method combining Artificial Intelligence (AI), fuzzy set theory, and spatial information science, named Quickly Location Determination based on Geographic Keywords (QLDGK), to rise to the challenge of location searching based on natural language. QLDGK has two key technical elements. The first is a geographic keywords library and a special natural-language-separation model library, which increase language processing efficiency. The second is a fuzzy-theory-based definition of spatial relationship, spatial metric, and spatial orientation, which extends the searching scope and assigns different confidences to different search outcomes. QLDGK aims for both higher query efficiency and a lower omission rate. The method has been shown to be workable and efficient by a QLDGK prototype system, which was tested on about 12000 emergency call reports from K-city, Southwest of China, and achieved 78% accuracy at the highest confidence level and an 8% omission rate.

  14. Processing of ICARTT Data Files Using Fuzzy Matching and Parser Combinators

    NASA Technical Reports Server (NTRS)

    Rutherford, Matthew T.; Typanski, Nathan D.; Wang, Dali; Chen, Gao

    2014-01-01

    In this paper, the task of parsing and matching inconsistent, poorly formed text data through the use of parser combinators and fuzzy matching is discussed. An object-oriented implementation of the parser combinator technique is used to allow for a relatively simple interface for adapting base parsers. For matching tasks, a fuzzy matching algorithm with Levenshtein distance calculations is implemented to match string pairs that are otherwise difficult to match due to the aforementioned irregularities and errors in one or both pair members. Used in concert, the two techniques allow parsing and matching operations to be performed which had previously only been done manually.
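
The two techniques named here can be sketched together in a few lines. This is an illustrative sketch, not the paper's implementation: it uses plain functions rather than the object-oriented design the abstract mentions, and the header line, the `KNOWN` name list, and the `max_distance` threshold are all assumptions.

```python
import re
from typing import Callable, List, Optional, Tuple

def levenshtein(a: str, b: str) -> int:
    """Edit distance by dynamic programming (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def best_match(name: str, candidates: List[str],
               max_distance: int = 2) -> Optional[str]:
    """Closest candidate (case-insensitive), or None if none is close enough."""
    if not candidates:
        return None
    best = min(candidates, key=lambda c: levenshtein(name.lower(), c.lower()))
    return best if levenshtein(name.lower(), best.lower()) <= max_distance else None

# A parser maps (text, position) to (value, new_position), or None on failure.
Parser = Callable[[str, int], Optional[Tuple[object, int]]]

def regex(pattern: str) -> Parser:
    """Base parser: match a regular expression at the current position."""
    compiled = re.compile(pattern)
    def parse(text: str, pos: int):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return parse

def sep_by(item: Parser, sep: Parser) -> Parser:
    """Combinator: one or more `item`s separated by `sep`, as a list."""
    def parse(text: str, pos: int):
        first = item(text, pos)
        if first is None:
            return None
        values, pos = [first[0]], first[1]
        while True:
            s = sep(text, pos)
            nxt = item(text, s[1]) if s is not None else None
            if nxt is None:
                return values, pos
            values.append(nxt[0])
            pos = nxt[1]
    return parse

# Build a header-line parser from the base parsers.
name = regex(r"[A-Za-z_][A-Za-z0-9_]*")
comma = regex(r"\s*,\s*")
header = sep_by(name, comma)

KNOWN = ["Time_Start", "Latitude", "Longitude"]  # hypothetical canonical names
parsed, _ = header("Time_Strt, latitude ,Longitud", 0)
matched = [best_match(n, KNOWN) for n in parsed]
```

The combinators tolerate irregular spacing around separators, and the fuzzy matcher then maps each misspelled field name onto its nearest canonical name.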

  15. Dealing with Quantifier Scope Ambiguity in Natural Language Understanding

    ERIC Educational Resources Information Center

    Hafezi Manshadi, Mohammad

    2014-01-01

    Quantifier scope disambiguation (QSD) is one of the most challenging problems in deep natural language understanding (NLU) systems. The most popular approach for dealing with QSD is to simply leave the semantic representation (scope-) underspecified and to incrementally add constraints to filter out unwanted readings. Scope underspecification has…

  16. The Nature of Object Marking in American Sign Language

    ERIC Educational Resources Information Center

    Gokgoz, Kadir

    2013-01-01

    In this dissertation, I examine the nature of object marking in American Sign Language (ASL). I investigate object marking by means of directionality (the movement of the verb towards a certain location in signing space) and by means of handling classifiers (certain handshapes accompanying the verb). I propose that object marking in ASL is…

  17. Orwell's 1984: Natural Language Searching and the Contemporary Metaphor.

    ERIC Educational Resources Information Center

    Dadlez, Eva M.

    1984-01-01

    Describes a natural language searching strategy for retrieving current material which has bearing on George Orwell's "1984," and identifies four main themes (technology, authoritarianism, press and psychological/linguistic implications of surveillance, political oppression) which have emerged from cross-database searches of the "Big Brother"…

  18. Principles of Organization in Young Children's Natural Language Hierarchies.

    ERIC Educational Resources Information Center

    Callanan, Maureen A.; Markman, Ellen M.

    1982-01-01

    When preschool children think of objects as organized into collections (e.g., forest, army) they solve certain problems better than when they think of the same objects as organized into classes (e.g., trees, soldiers). Present studies indicate preschool children occasionally distort natural language inclusion hierarchies (e.g., oak, tree) into the…

  19. Proof-Theoretic Semantics for a Natural Language Fragment

    NASA Astrophysics Data System (ADS)

    Francez, Nissim; Dyckhoff, Roy

    We propose a Proof-Theoretic Semantics (PTS) for a (positive) fragment E+0 of Natural Language (NL) (English in this case). The semantics is intended [7] to be incorporated into actual grammars, within the framework of Type-Logical Grammar (TLG) [12]. Thereby, this semantics constitutes an alternative to the traditional model-theoretic semantics (MTS), originating in Montague's seminal work [11], used in TLG.

  20. Recurrent Artificial Neural Networks and Finite State Natural Language Processing.

    ERIC Educational Resources Information Center

    Moisl, Hermann

    It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…

  1. Design of Lexicons in Some Natural Language Systems.

    ERIC Educational Resources Information Center

    Cercone, Nick; Mercer, Robert

    1980-01-01

    Discusses an investigation of certain problems concerning the structural design of lexicons used in computational approaches to natural language understanding. Emphasizes three aspects of design: retrieval of relevant portions of lexical items, storage requirements, and representation of meaning in the lexicon. (Available from ALLC, Dr. Rex Last,…

  2. Learning from a Computer Tutor with Natural Language Capabilities

    ERIC Educational Resources Information Center

    Michael, Joel; Rovick, Allen; Glass, Michael; Zhou, Yujian; Evens, Martha

    2003-01-01

    CIRCSIM-Tutor is a computer tutor designed to carry out a natural language dialogue with a medical student. Its domain is the baroreceptor reflex, the part of the cardiovascular system that is responsible for maintaining a constant blood pressure. CIRCSIM-Tutor's interaction with students is modeled after the tutoring behavior of two experienced…

  3. Analyzing Discourse Processing Using a Simple Natural Language Processing Tool

    ERIC Educational Resources Information Center

    Crossley, Scott A.; Allen, Laura K.; Kyle, Kristopher; McNamara, Danielle S.

    2014-01-01

    Natural language processing (NLP) provides a powerful approach for discourse processing researchers. However, there remains a notable degree of hesitation by some researchers to consider using NLP, at least on their own. The purpose of this article is to introduce and make available a "simple" NLP (SiNLP) tool. The overarching goal of…

  4. Research at Yale in Natural Language Processing. Research Report #84.

    ERIC Educational Resources Information Center

    Schank, Roger C.

    This report summarizes the capabilities of five computer programs at Yale that do automatic natural language processing as of the end of 1976. For each program an introduction to its overall intent is given, followed by the input/output, a short discussion of the research underlying the program, and a prognosis for future development. The programs…

  5. Learning by Communicating in Natural Language with Conversational Agents

    ERIC Educational Resources Information Center

    Graesser, Arthur; Li, Haiying; Forsyth, Carol

    2014-01-01

    Learning is facilitated by conversational interactions both with human tutors and with computer agents that simulate human tutoring and ideal pedagogical strategies. In this article, we describe some intelligent tutoring systems (e.g., AutoTutor) in which agents interact with students in natural language while being sensitive to their cognitive…

  6. Spinoza II: Conceptual Case-Based Natural Language Analysis.

    ERIC Educational Resources Information Center

    Schank, Roger C.; And Others

    This paper presents the theoretical changes that have developed in Conceptual Dependency Theory and their ramifications in computer analysis of natural language. The major items of concern are: the elimination of reliance on "grammar rules" for parsing with the emphasis given to conceptual rule based parsing; the development of a conceptual case…

  7. CITE NLM: Natural-Language Searching in an Online Catalog.

    ERIC Educational Resources Information Center

    Doszkocs, Tamas E.

    1983-01-01

    The National Library of Medicine's Current Information Transfer in English public access online catalog offers unique subject search capabilities--natural-language query input, automatic medical subject headings display, closest match search strategy, ranked document output, dynamic end user feedback for search refinement. References, description…

  8. What Is the Nature of Poststroke Language Recovery and Reorganization?

    PubMed Central

    Kiran, Swathi

    2012-01-01

    This review focuses on three main topics related to the nature of poststroke language recovery and reorganization. The first topic pertains to the nature of anatomical and physiological substrates in the infarcted hemisphere in poststroke aphasia, including the nature of the hemodynamic response in patients with poststroke aphasia, the nature of the peri-infarct tissue, and the neuronal plasticity potential in the infarcted hemisphere. The second section of the paper reviews the current neuroimaging evidence for language recovery in the acute, subacute, and chronic stages of recovery. The third and final section examines changes in connectivity as a function of recovery in poststroke aphasia, specifically in terms of changes in white matter connectivity, changes in functional effective connectivity, and changes in resting state connectivity after stroke. While much progress has been made in our understanding of language recovery, more work needs to be done. Future studies will need to examine whether reorganization of language in poststroke aphasia corresponds to a tighter, more coherent, and efficient network of residual and new regions in the brain. Answering these questions will go a long way towards being able to predict which patients are likely to recover and may benefit from future rehabilitation. PMID:23320190

  9. Natural language understanding and speech recognition for industrial vision systems

    NASA Astrophysics Data System (ADS)

    Batchelor, Bruce G.

    1992-11-01

    The accepted method of programming machine vision systems for a new application is to incorporate sub-routines from a standard library into code, written specially for the given task. Typical programming languages that might be used here are Pascal, C, and assembly code, although other 'conventional' (i.e., imperative) languages are often used instead. The representation of an algorithm to recognize a certain object, in the form of, say, a C language program is clumsy and unnatural, compared to the alternative process of describing the object itself and leaving the software to search for it. The latter method, known as declarative programming, is used extensively both when programming in Prolog and when people talk to one another in English, or other natural languages. Programs to understand a limited sub-set of a natural language can also be written conveniently in Prolog. The article considers the prospects for talking to an image processing system, using only slightly constrained English. Moderately priced speech recognition devices, which interface to a standard desk-top computer and provide a limited repertoire (200 words) as well as the ability to identify isolated words, are already available commercially. At the moment, the goal of talking in English to a computer is incompletely fulfilled. Yet, sufficient progress has been made to encourage greater effort in this direction.

  10. Developing Formal Correctness Properties from Natural Language Requirements

    NASA Technical Reports Server (NTRS)

    Nikora, Allen P.

    2006-01-01

    This viewgraph presentation reviews the rationale of the program to transform natural language specifications into formal notation, specifically, to automate generation of Linear Temporal Logic (LTL) correctness properties from natural language temporal specifications. There are several reasons for this approach: (1) model-based techniques are becoming more widely accepted; (2) analytical verification techniques (e.g., model checking, theorem proving) are significantly more effective at detecting certain types of specification design errors (e.g., race conditions, deadlock) than manual inspection; (3) many requirements are still written in natural language, which results in a high learning curve for specification languages and associated tools, while increased schedule and budget pressure on projects reduces training opportunities for engineers; and (4) formulation of correctness properties for system models can be a difficult problem. This has relevance to NASA in that it would simplify development of formal correctness properties, lead to more widespread use of model-based specification and design techniques, assist in earlier identification of defects, and reduce residual defect content for space mission software systems. The presentation also discusses potential applications, accomplishments and/or technological transfer potential, and the next steps.
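
To make the target notation concrete, here is one textbook-style illustration of a natural language temporal requirement and a corresponding LTL property. The requirement and the proposition names `request` and `grant` are hypothetical and are not taken from the presentation.

```latex
% Requirement (hypothetical): "Whenever a request is issued,
% a grant eventually follows."
% G = "globally" (at every point in time), F = "finally" (at some future point).
\[
  \mathbf{G}\,\bigl(\mathit{request} \rightarrow \mathbf{F}\,\mathit{grant}\bigr)
\]
```

A model checker can then test this property against a system model, which is the kind of analytical verification the abstract contrasts with manual inspection.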

  11. Natural Language Control of Resources for Experimental Data Acquisition Systems

    PubMed Central

    Harbort, Robert A.; Franklin, David; Spencer, James H.

    1980-01-01

    This presentation outlines the results of research into providing a “friendly interface” between a medical scientist and a medical data acquisition system for doing clinical research. The intended user of the system is presumed to have no knowledge of programming languages. The research has emphasized outlining the needs of such a user in terms of hardware configuration, developing specifications for meeting these needs dynamically, and creating a natural language control structure for setting up experiments without the help of a programmer or electronics technician.

  12. Blurring the Inputs: A Natural Language Approach to Sensitivity Analysis

    NASA Technical Reports Server (NTRS)

    Kleb, William L.; Thompson, Richard A.; Johnston, Christopher O.

    2007-01-01

    To document model parameter uncertainties and to automate sensitivity analyses for numerical simulation codes, a natural-language-based method to specify tolerances has been developed. With this new method, uncertainties are expressed in a natural manner, i.e., as one would on an engineering drawing, namely, 5.25 +/- 0.01. This approach is robust and readily adapted to various application domains because it does not rely on parsing the particular structure of input file formats. Instead, tolerances of a standard format are added to existing fields within an input file. As a demonstration of the power of this simple, natural language approach, a Monte Carlo sensitivity analysis is performed for three disparate simulation codes: fluid dynamics (LAURA), radiation (HARA), and ablation (FIAT). Effort required to harness each code for sensitivity analysis was recorded to demonstrate the generality and flexibility of this new approach.
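
The idea of adding tolerances to existing input fields, without parsing the rest of the file format, can be sketched as follows. This is a minimal sketch under stated assumptions: the uniform sampling distribution, the `blur` function name, and the sample input text are all hypothetical, since the abstract does not specify them.

```python
import random
import re

# Matches a nominal value and a tolerance, e.g. "5.25 +/- 0.01".
NUMBER = r"\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
TOLERANCE = re.compile(
    rf"(?P<nominal>[-+]?{NUMBER})\s*\+/-\s*(?P<tol>{NUMBER})"
)

def blur(text: str, rng: random.Random) -> str:
    """Replace every 'x +/- t' with one random draw from [x - t, x + t].

    Everything else in the text is passed through unchanged, so the
    function does not need to understand the input file's structure.
    Uniform sampling is an assumption made for this sketch.
    """
    def sample(match: re.Match) -> str:
        nominal = float(match.group("nominal"))
        tol = float(match.group("tol"))
        return repr(rng.uniform(nominal - tol, nominal + tol))
    return TOLERANCE.sub(sample, text)

# Hypothetical input file contents; only the first two fields carry tolerances.
template = (
    "wall_temperature = 300.0 +/- 5.0\n"
    "emissivity = 0.89 +/- 0.02\n"
    "n_steps = 100\n"
)

# Each Monte Carlo realization is one blurred copy of the input file.
rng = random.Random(0)
realizations = [blur(template, rng) for _ in range(3)]
```

Each realization would then be handed to the simulation code as an ordinary input file, and the spread of the outputs gives the sensitivity estimate.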

  13. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

    ERIC Educational Resources Information Center

    Jarman, Jay

    2011-01-01

    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  14. Using natural language processing techniques to inform research on nanotechnology

    PubMed Central

    Lewinski, Nastassja A

    2015-01-01

    Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics. PMID:26199848

  15. Conclusiveness of natural languages and recognition of images

    SciTech Connect

    Wojcik, Z.M.

    1983-01-01

    The conclusiveness is investigated using recognition processes and one-one correspondence between expressions of a natural language and graphs representing events. The graphs, as conceived in psycholinguistics, are obtained as a result of perception processes. It is possible to generate and process the graphs automatically, using computers and then to convert the resulting graphs into expressions of a natural language. Correctness and conclusiveness of the graphs and sentences are investigated using the fundamental condition for events representation processes. Some consequences of the conclusiveness are discussed, e.g. undecidability of arithmetic, human brain asymmetry, correctness of statistical calculations and operations research. It is suggested that the group theory should be imposed on mathematical models of any real system. Proof of the fundamental condition is also presented. 14 references.

  16. Elicitation of natural language representations of uncertainty using computer technology

    SciTech Connect

    Tonn, B.; Goeltz, R.; Travis, C. (Tennessee Univ., Knoxville, TN)

    1989-01-01

    Knowledge elicitation is an important aspect of risk analysis. Knowledge about risks must be accurately elicited from experts for use in risk assessments. Knowledge and perceptions of risks must also be accurately elicited from the public in order to intelligently perform policy analysis and develop and implement programs. Oak Ridge National Laboratory is developing computer technology to effectively and efficiently elicit knowledge from experts and the public. This paper discusses software developed to elicit natural language representations of uncertainty. The software is written in Common Lisp and resides on VAX Computers System and Symbolics Lisp machines. The software has three goals: to determine preferences for using natural language terms for representing uncertainty; likelihood rankings of the terms; and how likelihood estimates are combined to form new terms. The first two goals relate to providing useful results for those interested in risk communication. The third relates to providing cognitive data to further our understanding of people's decision making under uncertainty. The software is used to elicit natural language terms used to express the likelihood of various agents causing cancer in humans and cancer resulting in various maladies, and the likelihood of everyday events. 6 refs., 4 figs., 4 tabs.

  17. A Codasyl-Type Schema for Natural Language Medical Records

    PubMed Central

    Sager, N.; Tick, L.; Story, G.; Hirschman, L.

    1980-01-01

    This paper describes a CODASYL (network) database schema for information derived from narrative clinical reports. The goal of this work is to create an automated process that accepts natural language documents as input and maps this information into a database of a type managed by existing database management systems. The schema described here represents the medical events and facts identified through the natural language processing. This processing decomposes each narrative into a set of elementary assertions, represented as MEDFACT records in the database. Each assertion in turn consists of a subject and a predicate classed according to a limited number of medical event types, e.g., signs/symptoms, laboratory tests, etc. The subject and predicate are represented by EVENT records which are owned by the MEDFACT record associated with the assertion. The CODASYL-type network structure was found to be suitable for expressing most of the relations needed to represent the natural language information. However, special mechanisms were developed for storing the time relations between EVENT records and for recording connections (such as causality) between certain MEDFACT records. This schema has been implemented using the UNIVAC DMS-1100 DBMS.
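
The ownership structure described above can be sketched as a small in-memory data model. This is only an illustration of the relationships named in the abstract (a MEDFACT record owning subject and predicate EVENT records, plus separate time relations and connections); the Python class and field names, the `"diagnosis"` event type, and the sample assertions are hypothetical, and the sketch does not reproduce the actual DMS-1100 schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Event:
    role: str        # "subject" or "predicate"
    event_type: str  # e.g. "sign/symptom", "laboratory test"
    text: str

@dataclass
class MedFact:
    """One elementary assertion; it owns its subject and predicate events."""
    subject: Event
    predicate: Event

@dataclass
class ClinicalDatabase:
    medfacts: List[MedFact] = field(default_factory=list)
    # Special mechanisms kept outside the owner/member structure:
    time_relations: List[Tuple[Event, str, Event]] = field(default_factory=list)
    connections: List[Tuple[MedFact, str, MedFact]] = field(default_factory=list)

    def add_assertion(self, subject: str, predicate: str,
                      event_type: str) -> MedFact:
        """Store one assertion as a MedFact with two owned Event records."""
        fact = MedFact(Event("subject", event_type, subject),
                       Event("predicate", event_type, predicate))
        self.medfacts.append(fact)
        return fact

    def facts_of_type(self, event_type: str) -> List[MedFact]:
        """Return all assertions whose predicate has the given event type."""
        return [f for f in self.medfacts
                if f.predicate.event_type == event_type]

# Made-up assertions, as might be derived from one narrative report.
db = ClinicalDatabase()
pneumonia = db.add_assertion("patient", "pneumonia", "diagnosis")
fever = db.add_assertion("patient", "fever", "sign/symptom")
db.time_relations.append((pneumonia.predicate, "before", fever.predicate))
db.connections.append((pneumonia, "causes", fever))
symptoms = db.facts_of_type("sign/symptom")
```

Keeping time relations and causal connections in their own lists mirrors the abstract's point that these links needed special handling beyond the basic network structure.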

  18. Natural Language Processing Technologies in Radiology Research and Clinical Applications.

    PubMed

    Cai, Tianrun; Giannopoulos, Andreas A; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K; Rybicki, Frank J; Mitsouras, Dimitrios

    2016-01-01

    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively "mine" these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. "Intelligent" search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. PMID:26761536

  19. Applications of Natural Language Processing in Biodiversity Science

    PubMed Central

    Thessen, Anne E.; Cui, Hong; Mozzherin, Dmitry

    2012-01-01

    Centuries of biological knowledge are contained in the massive body of scientific literature, written for human-readability but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science. PMID:22685456

  20. Human task animation from performance models and natural language input

    NASA Technical Reports Server (NTRS)

    Esakov, Jeffrey; Badler, Norman I.; Jung, Moon

    1989-01-01

    Graphical manipulation of human figures is essential for certain types of human factors analyses such as reach, clearance, fit, and view. In many situations, however, the animation of simulated people performing various tasks may be based on more complicated functions involving multiple simultaneous reaches, critical timing, resource availability, and human performance capabilities. One rather effective means for creating such a simulation is through a natural language description of the tasks to be carried out. Given an anthropometrically-sized figure and a geometric workplace environment, various simple actions such as reach, turn, and view can be effectively controlled from language commands or standard NASA checklist procedures. The commands may also be generated by external simulation tools. Task timing is determined from actual performance models, if available, such as strength models or Fitts' Law. The resulting action specifications are animated on a Silicon Graphics Iris workstation in real time.
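
    To make the reference to Fitts' Law concrete, a reach duration could be estimated along the following lines. This is a sketch only: it uses the original Fitts formulation, MT = a + b * log2(2D / W), and the default constants a and b are placeholders rather than values from the paper (in practice they are fitted to measured data).

```python
import math


def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Estimate movement time with Fitts' Law: MT = a + b * log2(2D / W).

    distance: distance to the target centre (same unit as width)
    width:    target width along the axis of motion
    a, b:     empirical constants; the defaults are placeholders only
    """
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    index_of_difficulty = math.log2(2.0 * distance / width)
    return a + b * index_of_difficulty


# A farther or smaller target yields a longer estimated reach time.
near = fitts_movement_time(distance=10.0, width=5.0)
far = fitts_movement_time(distance=40.0, width=5.0)
```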

  1. Applying semantic-based probabilistic context-free grammar to medical language processing--a preliminary study on parsing medication sentences.

    PubMed

    Xu, Hua; AbdelRahman, Samir; Lu, Yanxin; Denny, Joshua C; Doan, Son

    2011-12-01

    Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently result in two or more parse trees. One possible solution, which has not been extensively explored previously, is to augment productions in medical sublanguage grammars with probabilities to resolve the ambiguity. In this study, we associated probabilities with production rules in a semantic-based grammar for medication findings and evaluated its performance on reducing parsing ambiguity. Using the existing data set from 2009 i2b2 NLP (Natural Language Processing) challenge for medication extraction, we developed a semantic-based CFG (Context Free Grammar) for parsing medication sentences and manually created a Treebank of 4564 medication sentences from discharge summaries. Using the Treebank, we derived a semantic-based PCFG (Probabilistic Context Free Grammar) for parsing medication sentences. Our evaluation using a 10-fold cross validation showed that the PCFG parser dramatically improved parsing performance when compared to the CFG parser.

  2. Applying semantic-based probabilistic context-free grammar to medical language processing--a preliminary study on parsing medication sentences.

    PubMed

    Xu, Hua; AbdelRahman, Samir; Lu, Yanxin; Denny, Joshua C; Doan, Son

    2011-12-01

    Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently result in two or more parse trees. One possible solution, which has not been extensively explored previously, is to augment productions in medical sublanguage grammars with probabilities to resolve the ambiguity. In this study, we associated probabilities with production rules in a semantic-based grammar for medication findings and evaluated its performance on reducing parsing ambiguity. Using the existing data set from 2009 i2b2 NLP (Natural Language Processing) challenge for medication extraction, we developed a semantic-based CFG (Context Free Grammar) for parsing medication sentences and manually created a Treebank of 4564 medication sentences from discharge summaries. Using the Treebank, we derived a semantic-based PCFG (Probabilistic Context Free Grammar) for parsing medication sentences. Our evaluation using a 10-fold cross validation showed that the PCFG parser dramatically improved parsing performance when compared to the CFG parser. PMID:21856440
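
    The idea of using rule probabilities to choose among competing parse trees can be sketched as follows. The abstract does not say how the probabilities were derived, so this example assumes the usual relative-frequency estimate from treebank counts; the grammar symbols (MED, DRUG, DOSE, FREQ) and the toy treebank are invented for illustration and are not taken from the study.

```python
from collections import Counter
from math import prod


def estimate_rule_probs(treebank):
    """Relative-frequency estimate: P(A -> rhs) = count(A -> rhs) / count(A).

    treebank: list of parses, each given as a list of (lhs, rhs) productions.
    """
    rule_counts = Counter(rule for tree in treebank for rule in tree)
    lhs_counts = Counter()
    for (lhs, _rhs), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


def tree_prob(tree, probs):
    """Probability of a parse = product of the probabilities of its rules."""
    return prod(probs.get(rule, 0.0) for rule in tree)


def best_parse(candidates, probs):
    """Resolve ambiguity by keeping the most probable candidate parse."""
    return max(candidates, key=lambda tree: tree_prob(tree, probs))


# Two competing analyses of a "number + unit" phrase after a drug name.
AS_DOSE = [("MED", ("DRUG", "DOSE")), ("DOSE", ("NUM", "UNIT"))]
AS_FREQ = [("MED", ("DRUG", "FREQ")), ("FREQ", ("NUM", "UNIT"))]

# Toy treebank in which the dose reading was annotated three times out of four.
toy_treebank = [AS_DOSE, AS_DOSE, AS_DOSE, AS_FREQ]
probs = estimate_rule_probs(toy_treebank)
chosen = best_parse([AS_FREQ, AS_DOSE], probs)
```

    A plain CFG would return both analyses; with probabilities attached, the parser can rank them and keep the more likely one.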

  3. TreeParser-Aided Klee Diagrams Display Taxonomic Clusters in DNA Barcode and Nuclear Gene Datasets

    PubMed Central

    Stoeckle, Mark Y.; Coffran, Cameron

    2013-01-01

    Indicator vector analysis of a nucleotide sequence alignment generates a compact heat map, called a Klee diagram, with potential insight into clustering patterns in evolution. However, so far this approach has examined only mitochondrial cytochrome c oxidase I (COI) DNA barcode sequences. To further explore, we developed TreeParser, a freely-available web-based program that sorts a sequence alignment according to a phylogenetic tree generated from the dataset. We applied TreeParser to nuclear gene and COI barcode alignments from birds and butterflies. Distinct blocks in the resulting Klee diagrams corresponded to species and higher-level taxonomic divisions in both groups, and this enabled graphic comparison of phylogenetic information in nuclear and mitochondrial genes. Our results demonstrate TreeParser-aided Klee diagrams objectively display taxonomic clusters in nucleotide sequence alignments. This approach may help establish taxonomy in poorly studied groups and investigate higher-level clustering which appears widespread but not well understood. PMID:24022383

  4. TreeParser-aided Klee diagrams display taxonomic clusters in DNA barcode and nuclear gene datasets.

    PubMed

    Stoeckle, Mark Y; Coffran, Cameron

    2013-01-01

    Indicator vector analysis of a nucleotide sequence alignment generates a compact heat map, called a Klee diagram, with potential insight into clustering patterns in evolution. However, so far this approach has examined only mitochondrial cytochrome c oxidase I (COI) DNA barcode sequences. To further explore, we developed TreeParser, a freely-available web-based program that sorts a sequence alignment according to a phylogenetic tree generated from the dataset. We applied TreeParser to nuclear gene and COI barcode alignments from birds and butterflies. Distinct blocks in the resulting Klee diagrams corresponded to species and higher-level taxonomic divisions in both groups, and this enabled graphic comparison of phylogenetic information in nuclear and mitochondrial genes. Our results demonstrate TreeParser-aided Klee diagrams objectively display taxonomic clusters in nucleotide sequence alignments. This approach may help establish taxonomy in poorly studied groups and investigate higher-level clustering which appears widespread but not well understood.

  5. Description directed control: its implications for natural language generation

    SciTech Connect

    Mcdonald, D.D.

    1983-01-01

    This paper proposes a very specifically constrained virtual machine design for goal-directed natural language generation based on a refinement of the technique of data-directed control that the author has termed description-directed control. Important psycholinguistic properties of generation follow inescapably from the use of this control technique, including: efficient runtimes, bounded lookahead, indelible decisions, incremental production of the text, and inescapable adherence to grammaticality. The technique also provides a possible explanation for some well-known universal constraints, though this cannot be confirmed without further empirical investigation. 29 references.

  6. Natural Language Processing as a Discipline at LLNL

    SciTech Connect

    Firpo, M A

    2005-02-04

    The field of Natural Language Processing (NLP) is described as it applies to the needs of LLNL in handling free-text. The state of the practice is outlined with the emphasis placed on two specific aspects of NLP: Information Extraction and Discourse Integration. A brief description is included of the NLP applications currently being used at LLNL. A gap analysis provides a look at where the technology needs work in order to meet the needs of LLNL. Finally, recommendations are made to meet these needs.

  7. Natural Language Processing in Radiology: A Systematic Review.

    PubMed

    Pons, Ewoud; Braun, Loes M M; Hunink, M G Myriam; Kors, Jan A

    2016-05-01

    Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed. PMID:27089187

  8. A general natural-language text processor for clinical radiology.

    PubMed Central

    Friedman, C; Alderson, P O; Austin, J H; Cimino, J J; Johnson, S B

    1994-01-01

    OBJECTIVE: Development of a general natural-language processor that identifies clinical information in narrative reports and maps that information into a structured representation containing clinical terms. DESIGN: The natural-language processor provides three phases of processing, all of which are driven by different knowledge sources. The first phase performs the parsing. It identifies the structure of the text through use of a grammar that defines semantic patterns and a target form. The second phase, regularization, standardizes the terms in the initial target structure via a compositional mapping of multi-word phrases. The third phase, encoding, maps the terms to a controlled vocabulary. Radiology is the test domain for the processor and the target structure is a formal model for representing clinical information in that domain. MEASUREMENTS: The impression sections of 230 radiology reports were encoded by the processor. Results of an automated query of the resultant database for the occurrences of four diseases were compared with the analysis of a panel of three physicians to determine recall and precision. RESULTS: Without training specific to the four diseases, recall and precision of the system (combined effect of the processor and query generator) were 70% and 87%. Training of the query component increased recall to 85% without changing precision. PMID:7719797
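
    The recall and precision figures above compare the reports retrieved by the automated query with those identified by the physician panel. A minimal sketch of that computation, using invented report identifiers rather than the study's data, might look like this:

```python
def recall_precision(system_hits, reference_hits):
    """Compute (recall, precision) of a system against a reference standard.

    recall    = true positives / all reference positives
    precision = true positives / all system positives
    """
    system_hits = set(system_hits)
    reference_hits = set(reference_hits)
    true_pos = len(system_hits & reference_hits)
    recall = true_pos / len(reference_hits) if reference_hits else 0.0
    precision = true_pos / len(system_hits) if system_hits else 0.0
    return recall, precision


# Invented example: reports flagged for one disease.
panel = {"r1", "r2", "r3", "r4"}   # reference standard (physician panel)
system = {"r1", "r2", "r3", "r9"}  # retrieved by the automated query
recall, precision = recall_precision(system, panel)
```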

  9. Suicide Note Classification Using Natural Language Processing: A Content Analysis

    PubMed Central

    Pestian, John; Nasrallah, Henry; Matykiewicz, Pawel; Bennett, Aurora; Leenaars, Antoon

    2010-01-01

    Suicide is the second leading cause of death among 25–34 year olds and the third leading cause of death among 15–25 year olds in the United States. In the Emergency Department, where suicidal patients often present, estimating the risk of repeated attempts is generally left to clinical judgment. This paper presents our second attempt to determine the role of computational algorithms in understanding a suicidal patient’s thoughts, as represented by suicide notes. We focus on developing methods of natural language processing that distinguish between genuine and elicited suicide notes. We hypothesize that machine learning algorithms can categorize suicide notes as well as mental health professionals and psychiatric physician trainees do. The data used are comprised of suicide notes from 33 suicide completers and matched to 33 elicited notes from healthy control group members. Eleven mental health professionals and 31 psychiatric trainees were asked to decide if a note was genuine or elicited. Their decisions were compared to nine different machine-learning algorithms. The results indicate that trainees accurately classified notes 49% of the time, mental health professionals accurately classified notes 63% of the time, and the best machine learning algorithm accurately classified the notes 78% of the time. This is an important step in developing an evidence-based predictor of repeated suicide attempts because it shows that natural language processing can aid in distinguishing between classes of suicidal notes. PMID:21643548

  10. Intelligent agents as a basis for natural language interfaces

    SciTech Connect

    Chin, D.N.

    1987-01-01

    Typical natural-language interfaces respond passively to the user's commands and queries. They cannot volunteer information, correct user misconceptions, or reject unethical requests. In order to do these things, a system must be an intelligent agent. UC (UNIX Consultant), a natural language system that helps the user solve problems in using the UNIX operating system, is such an intelligent agent. The agent component of UC is UCEgo. UCEgo provides UC with its own goals and plans. By adopting different goals in different situations, UCEgo creates and executes different plans, enabling it to interact appropriately with the user. UCEgo adopts goals from its themes, adopts subgoals during planning, and adopts metagoals for dealing with goal interactions. It also adopts goals when it notices that the user either lacks necessary knowledge, or has incorrect beliefs. In these cases, UCEgo plans to volunteer information or correct the user's misconception as appropriate. The user's knowledge and beliefs are modeled by the KNOME (KNOwledge Model of Expertise) component of UC. KNOME is a double-stereotype system which categorizes users by expertise and categorizes UNIX facts by difficulty.
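
    The double-stereotype idea can be illustrated with a very small sketch. Everything below is a simplification assumed for this example: the level names, the three-step scales, the fact table, and the simple threshold rule are hypothetical and are not taken from KNOME or UCEgo.

```python
from enum import IntEnum


class Expertise(IntEnum):
    # Hypothetical user stereotype levels.
    NOVICE = 1
    INTERMEDIATE = 2
    EXPERT = 3


class Difficulty(IntEnum):
    # Hypothetical fact stereotype levels.
    SIMPLE = 1
    MODERATE = 2
    HARD = 3


# Illustrative fact table: each UNIX fact is tagged with a difficulty.
FACT_DIFFICULTY = {
    "ls lists files": Difficulty.SIMPLE,
    "find -exec runs a command per match": Difficulty.HARD,
}


def likely_knows(user_level, fact):
    """Assume a user knows a fact whose difficulty does not exceed their level."""
    return int(user_level) >= int(FACT_DIFFICULTY[fact])


def should_volunteer(user_level, fact):
    """An agent might adopt the goal of volunteering a fact the user likely lacks."""
    return not likely_knows(user_level, fact)
```

    Pairing one stereotype for users with another for facts lets the system guess what a user knows from a single classification instead of tracking every fact individually.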

  11. Event construal and temporal distance in natural language.

    PubMed

    Bhatia, Sudeep; Walasek, Lukasz

    2016-07-01

    Construal level theory proposes that events that are temporally proximate are represented more concretely than events that are temporally distant. We tested this prediction using two large natural language text corpora. In study 1 we examined posts on Twitter that referenced the future, and found that tweets mentioning temporally proximate dates used more concrete words than those mentioning distant dates. In study 2 we obtained all New York Times articles that referenced U.S. presidential elections between 1987 and 2007. We found that the concreteness of the words in these articles increased with the temporal proximity to their corresponding election. Additionally the reduction in concreteness after the election was much greater than the increase in concreteness leading up to the election, though both changes in concreteness were well described by an exponential function. We replicated this finding with New York Times articles referencing US public holidays. Overall, our results provide strong support for the predictions of construal level theory, and additionally illustrate how large natural language datasets can be used to inform psychological theory.

  12. Neurolinguistics and psycholinguistics as a basis for computer acquisition of natural language

    SciTech Connect

    Powers, D.M.W.

    1983-04-01

    Research into natural language understanding systems for computers has concentrated on implementing particular grammars and grammatical models of the language concerned. This paper presents a rationale for research into natural language understanding systems based on neurological and psychological principles. Important features of the approach are that it seeks to place the onus of learning the language on the computer, and that it seeks to make use of the vast wealth of relevant psycholinguistic and neurolinguistic theory. 22 references.

  13. Solving problems on base of concepts formalization of language image and figurative meaning of the natural-language constructs

    NASA Astrophysics Data System (ADS)

    Bisikalo, Oleg V.; Cieszczyk, Sławomir; Yussupova, Gulbahar

    2015-12-01

    The article proposes building a "clever" thesaurus by algebraic means, based on formalizing the concepts of language image and figurative meaning of natural-language constructs. A formal theory based on a binary operator of directional associative relation is constructed, and the notion of an associative normal form of image constructions is introduced. A model of a commutative semigroup, which represents a sentence as three components of an interrogative language image construction, is considered.

  14. MASCOT HTML and XML parser: an implementation of a novel object model for protein identification data.

    PubMed

    Yang, Chunguang G; Granite, Stephen J; Van Eyk, Jennifer E; Winslow, Raimond L

    2006-11-01

    Protein identification using MS is an important technique in proteomics as well as a major generator of proteomics data. We have designed the protein identification data object model (PDOM) and developed a parser based on this model to facilitate the analysis and storage of these data. The parser works with HTML or XML files saved or exported from MASCOT MS/MS ions search in peptide summary report or MASCOT PMF search in protein summary report. The program creates PDOM objects, eliminates redundancy in the input file, and has the capability to output any PDOM object to a relational database. This program facilitates additional analysis of MASCOT search results and aids the storage of protein identification information. The implementation is extensible and can serve as a template to develop parsers for other search engines. The parser can be used as a stand-alone application or can be driven by other Java programs. It is currently being used as the front end for a system that loads HTML and XML result files of MASCOT searches into a relational database. The source code is freely available at http://www.ccbm.jhu.edu and the program uses only free and open-source Java libraries.

  15. Emerging Approach of Natural Language Processing in Opinion Mining: A Review

    NASA Astrophysics Data System (ADS)

    Kim, Tai-Hoon

    Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages. This paper outlines a framework for using computer and natural language techniques to help learners at various levels learn foreign languages in a computer-based learning environment. We propose some ideas for using the computer as a practical tool for learning a foreign language, where most of the courseware is generated automatically. We then describe how to build computer-based learning tools, discuss their effectiveness, and conclude with some possibilities for using on-line resources.

  16. Second-language instinct and instruction effects: nature and nurture in second-language acquisition.

    PubMed

    Yusa, Noriaki; Koizumi, Masatoshi; Kim, Jungho; Kimura, Naoki; Uchida, Shinya; Yokoyama, Satoru; Miura, Naoki; Kawashima, Ryuta; Hagiwara, Hiroko

    2011-10-01

    Adults seem to have greater difficulties than children in acquiring a second language (L2) because of the alleged "window of opportunity" around puberty. Postpuberty Japanese participants learned a new English rule with simplex sentences during one month of instruction, and then they were tested on "uninstructed complex sentences" as well as "instructed simplex sentences." The behavioral data show that they can acquire more knowledge than is instructed, suggesting the interweaving of nature (universal principles of grammar, UG) and nurture (instruction) in L2 acquisition. The comparison in the "uninstructed complex sentences" between post-instruction and pre-instruction using functional magnetic resonance imaging reveals a significant activation in Broca's area. Thus, this study provides new insight into Broca's area, where nature and nurture cooperate to produce L2 learners' rich linguistic knowledge. It also shows neural plasticity of adult L2 acquisition, arguing against a critical period hypothesis, at least in the domain of UG.

  17. An intelligent simulation generator with a natural language interface

    SciTech Connect

    Ford, D.R.

    1988-01-01

    What the decision maker needs is the ability to construct computer simulation models in less time and with fewer resources. In order to accomplish this, more intelligence needs to be added to programs that construct simulation programs. This is where knowledge-based systems techniques can contribute greatly. In this research, a knowledge-based systems approach to generating simulation code is suggested. Also, a prototype system is built which is called the Intelligent Simulation Generator (ISG). This system uses knowledge of how simulation models are formulated and knowledge about the SIMAN simulation language in producing computer simulation models from a constrained, natural language description. The methodology is detailed and two test scenarios are given to substantiate the ability of the ISG. While this approach has proven fruitful, there are some severe limitations that must be overcome in order to advance the capabilities of the system. In essence, a better understanding of model formulation is necessary and then this knowledge must be captured and incorporated into the ISG.

  18. What can Natural Language Processing do for Clinical Decision Support?

    PubMed Central

    Demner-Fushman, Dina; Chapman, Wendy W.; McDonald, Clement J.

    2009-01-01

    Computerized Clinical Decision Support (CDS) aims to aid decision making of health care providers and the public by providing easily accessible health-related information at the point and time it is needed. Natural Language Processing (NLP) is instrumental in using free-text information to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative. The early innovative NLP research of clinical narrative was followed by a period of stable research conducted at the major clinical centers and a shift of mainstream interest to biomedical NLP. This review primarily focuses on the recently renewed interest in development of fundamental NLP methods and advances in the NLP systems for CDS. The current solutions to challenges posed by distinct sublanguages, intended user groups, and support goals are discussed. PMID:19683066

  19. Natural Language Processing Methods and Systems for Biomedical Ontology Learning

    PubMed Central

    Liu, Kaihong; Hogan, William R.; Crowley, Rebecca S.

    2010-01-01

    While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they must achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships as well as difficulty in updating the ontology as knowledge changes. Methodologies developed in the fields of natural language processing, information extraction, information retrieval and machine learning provide techniques for automating the enrichment of an ontology from free-text documents. In this article, we review existing methodologies and developed systems, and discuss how existing methods can benefit the development of biomedical ontologies. PMID:20647054

  20. Detection of Blood Culture Bacterial Contamination using Natural Language Processing

    PubMed Central

    Matheny, Michael E.; FitzHenry, Fern; Speroff, Theodore; Hathaway, Jacob; Murff, Harvey J.; Brown, Steven H.; Fielstein, Elliot M.; Dittus, Robert S.; Elkin, Peter L.

    2009-01-01

    Microbiology results are reported in semi-structured formats and have a high content of useful patient information. We developed and validated a hybrid regular expression and natural language processing solution for processing blood culture microbiology reports. Multi-center Veterans Affairs training and testing data sets were randomly extracted and manually reviewed to determine the culture and sensitivity as well as contamination results. The tool was iteratively developed for both outcomes using a training dataset, and then evaluated on the test dataset to determine antibiotic susceptibility data extraction and contamination detection performance. Our algorithm had a sensitivity of 84.8% and a positive predictive value of 96.0% for mapping the antibiotics and bacteria with appropriate sensitivity findings in the test data. The bacterial contamination detection algorithm had a sensitivity of 83.3% and a positive predictive value of 81.8%. PMID:20351890
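
    The regular-expression half of such a hybrid approach can be sketched as follows. The report layout, the pattern, and the sample text are all assumptions made for illustration (an antibiotic name followed by an S, I, or R code on its own line); they do not reproduce the Veterans Affairs report format or the authors' tool, and the natural language processing half is not shown.

```python
import re

# Hypothetical line format: "<antibiotic name>   <S|I|R>".
SUSCEPTIBILITY_LINE = re.compile(
    r"^[ \t]*(?P<drug>[A-Za-z][A-Za-z/\- ]*?)[ \t]+(?P<result>[SIR])[ \t]*$",
    re.MULTILINE,
)


def extract_susceptibilities(report):
    """Map each antibiotic found in the report to its S/I/R code."""
    return {
        match.group("drug").strip().lower(): match.group("result")
        for match in SUSCEPTIBILITY_LINE.finditer(report)
    }


# Invented sample report, for illustration only.
sample_report = """ORGANISM: STAPHYLOCOCCUS AUREUS
OXACILLIN      S
VANCOMYCIN     S
ERYTHROMYCIN   R
"""

findings = extract_susceptibilities(sample_report)
```

    Lines that do not fit the pattern, such as the organism header, are simply skipped here; in a hybrid system those would be candidates for the language-processing component.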

  1. Detection of blood culture bacterial contamination using natural language processing.

    PubMed

    Matheny, Michael E; Fitzhenry, Fern; Speroff, Theodore; Hathaway, Jacob; Murff, Harvey J; Brown, Steven H; Fielstein, Elliot M; Dittus, Robert S; Elkin, Peter L

    2009-11-14

    Microbiology results are reported in semi-structured formats and have a high content of useful patient information. We developed and validated a hybrid regular expression and natural language processing solution for processing blood culture microbiology reports. Multi-center Veterans Affairs training and testing data sets were randomly extracted and manually reviewed to determine the culture and sensitivity as well as contamination results. The tool was iteratively developed for both outcomes using a training dataset, and then evaluated on the test dataset to determine antibiotic susceptibility data extraction and contamination detection performance. Our algorithm had a sensitivity of 84.8% and a positive predictive value of 96.0% for mapping the antibiotics and bacteria with appropriate sensitivity findings in the test data. The bacterial contamination detection algorithm had a sensitivity of 83.3% and a positive predictive value of 81.8%.

  2. Natural language acquisition in large scale neural semantic networks

    NASA Astrophysics Data System (ADS)

    Ealey, Douglas

    This thesis puts forward the view that a purely signal-based approach to natural language processing is both plausible and desirable. By questioning the veracity of symbolic representations of meaning, it argues for a unified, non-symbolic model of knowledge representation that is both biologically plausible and, potentially, highly efficient. Processes to generate a grounded, neural form of this model, dubbed the semantic filter, are discussed. The combined effects of local neural organisation, coincident with perceptual maturation, are used to hypothesise its nature. This theoretical model is then validated in light of a number of fundamental neurological constraints and milestones. The mechanisms of semantic and episodic development that the model predicts are then used to explain linguistic properties, such as propositions and verbs, syntax and scripting. To mimic the growth of locally densely connected structures upon an unbounded neural substrate, a system is developed that can grow arbitrarily large, data-dependent structures composed of individual self-organising neural networks. The maturational nature of the data used results in a structure in which the perception of concepts is refined by the networks, but demarcated by subsequent structure. As a consequence, the overall structure shows significant memory and computational benefits, as predicted by the cognitive and neural models. Furthermore, the localised nature of the neural architecture also avoids the increasing error sensitivity and redundancy of traditional systems as the training domain grows. The semantic and episodic filters have been demonstrated to perform as well, or better, than more specialist networks, whilst using significantly larger vocabularies, more complex sentence forms and more natural corpora.

  3. Human-Level Natural Language Understanding: False Progress and Real Challenges

    ERIC Educational Resources Information Center

    Bignoli, Perrin G.

    2013-01-01

    The field of Natural Language Processing (NLP) focuses on the study of how utterances composed of human-level languages can be understood and generated. Typically, there are considered to be three intertwined levels of structure that interact to create meaning in language: syntax, semantics, and pragmatics. Not only is a large amount of…

  4. Natural and Artificial Intelligence, Language, Consciousness, Emotion, and Anticipation

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2010-11-01

    The classical paradigm of the neural brain as the seat of human natural intelligence is too restrictive. This paper defends the idea that the neural ectoderm is the actual brain, based on the development of the human embryo. Indeed, the neural ectoderm includes the neural crest, given by pigment cells in the skin and ganglia of the autonomic nervous system, and the neural tube, given by the brain, the spinal cord, and motor neurons. So the brain is completely integrated in the ectoderm, and cannot work alone. The paper presents fundamental properties of the brain as follows. Firstly, Paul D. MacLean proposed the triune human brain, which consists of three brains in one, following the evolution of species, given by the reptilian complex, the limbic system, and the neo-cortex. Secondly, consciousness and conscious awareness are analysed. Thirdly, the anticipatory unconscious free will and conscious free veto are described in agreement with the experiments of Benjamin Libet. Fourthly, the main section explains the development of the human embryo and shows that the neural ectoderm is the whole neural brain. Fifthly, a conjecture is proposed that the neural brain is completely programmed with scripts written in biological low-level and high-level languages, in a manner similar to the programming of cells by the genetic code. Finally, it is concluded that the proposition of the neural ectoderm as the whole neural brain is a breakthrough in the understanding of natural intelligence, and also in the future design of robots with artificial intelligence.

  5. The Nature of Spanish versus English Language Use at Home

    ERIC Educational Resources Information Center

    Branum-Martin, Lee; Mehta, Paras D.; Carlson, Coleen D.; Francis, David J.; Goldenberg, Claude

    2014-01-01

    Home language experiences are important for children's development of language and literacy. However, the home language context is complex, especially for Spanish-speaking children in the United States. A child's use of Spanish or English likely ranges along a continuum, influenced by preferences of particular people involved, such as parents,…

  6. ONE GRAMMAR OR TWO? Sign Languages and the Nature of Human Language.

    PubMed

    Lillo-Martin, Diane C; Gajewski, Jon

    2014-01-01

    Linguistic research has identified abstract properties that seem to be shared by all languages - such properties may be considered defining characteristics. In recent decades, the recognition that human language is found not only in the spoken modality, but also in the form of sign languages, has led to a reconsideration of some of these potential linguistic universals. In large part, the linguistic analysis of sign languages has led to the conclusion that universal characteristics of language can be stated at an abstract enough level to include languages in both spoken and signed modalities. For example, languages in both modalities display hierarchical structure at sub-lexical and phrasal level, and recursive rule application. However, this does not mean that modality-based differences between signed and spoken languages are trivial. In this article, we consider several candidate domains for modality effects, in light of the overarching question: are signed and spoken languages subject to the same abstract grammatical constraints, or is a substantially different conception of grammar needed for the sign language case? We look at differences between language types based on the use of space, iconicity, and the possibility for simultaneity in linguistic expression. The inclusion of sign languages does support some broadening of the conception of human language - in ways that are applicable for spoken languages as well. Still, the overall conclusion is that one grammar applies for human language, no matter the modality of expression. PMID:25013534

  8. Success story in software engineering using NIAM (Natural language Information Analysis Methodology)

    SciTech Connect

    Eaton, S.M.; Eaton, D.S.

    1995-10-01

    To create an information system, we employ NIAM (Natural language Information Analysis Methodology). NIAM supports the goals of both the customer and the analyst completely understanding the information. We use the customer's own unique vocabulary, collect real examples, and validate the information in natural language sentences. Examples are discussed from a successfully implemented information system.

  9. Testing of a Natural Language Retrieval System for a Full Text Knowledge Base.

    ERIC Educational Resources Information Center

    Bernstein, Lionel M.; Williamson, Robert E.

    1984-01-01

    The Hepatitis Knowledge Base (text of prototype information system) was used for modifying and testing "A Navigator of Natural Language Organized (Textual) Data" (ANNOD), a retrieval system which combines probabilistic, linguistic, and empirical means to rank individual paragraphs of full text for similarity to natural language queries proposed by…

  10. One grammar or two? Sign Languages and the Nature of Human Language

    PubMed Central

    Lillo-Martin, Diane C; Gajewski, Jon

    2014-01-01

    Linguistic research has identified abstract properties that seem to be shared by all languages—such properties may be considered defining characteristics. In recent decades, the recognition that human language is found not only in the spoken modality but also in the form of sign languages has led to a reconsideration of some of these potential linguistic universals. In large part, the linguistic analysis of sign languages has led to the conclusion that universal characteristics of language can be stated at an abstract enough level to include languages in both spoken and signed modalities. For example, languages in both modalities display hierarchical structure at sub-lexical and phrasal level, and recursive rule application. However, this does not mean that modality-based differences between signed and spoken languages are trivial. In this article, we consider several candidate domains for modality effects, in light of the overarching question: are signed and spoken languages subject to the same abstract grammatical constraints, or is a substantially different conception of grammar needed for the sign language case? We look at differences between language types based on the use of space, iconicity, and the possibility for simultaneity in linguistic expression. The inclusion of sign languages does support some broadening of the conception of human language—in ways that are applicable for spoken languages as well. Still, the overall conclusion is that one grammar applies for human language, no matter the modality of expression. PMID:25013534

  11. Crowdsourcing and curation: perspectives from biology and natural language processing

    PubMed Central

    Hirschman, Lynette; Fort, Karën; Boué, Stéphanie; Kyrpides, Nikos; Islamaj Doğan, Rezarta; Cohen, Kevin Bretonnel

    2016-01-01

    Crowdsourcing is increasingly utilized for performing tasks in both natural language processing and biocuration. Although there have been many applications of crowdsourcing in these fields, there have been fewer high-level discussions of the methodology and its applicability to biocuration. This paper explores crowdsourcing for biocuration through several case studies that highlight different ways of leveraging ‘the crowd’; these raise issues about the kind(s) of expertise needed, the motivations of participants, and questions related to feasibility, cost and quality. The paper is an outgrowth of a panel session held at BioCreative V (Seville, September 9–11, 2015). The session consisted of four short talks, followed by a discussion. In their talks, the panelists explored the role of expertise and the potential to improve crowd performance by training; the challenge of decomposing tasks to make them amenable to crowdsourcing; and the capture of biological data and metadata through community editing. Database URL: http://www.mitre.org/publications/technical-papers/crowdsourcing-and-curation-perspectives PMID:27504010

  12. Facilitating cancer research using natural language processing of pathology reports.

    PubMed

    Xu, Hua; Anderson, Kristin; Grann, Victor R; Friedman, Carol

    2004-01-01

    Many ongoing clinical research projects, such as projects involving studies associated with cancer, involve manual capture of information in surgical pathology reports so that the information can be used to determine the eligibility of recruited patients for the study and to provide other information, such as cancer prognosis. Natural language processing (NLP) systems offer an alternative to automated coding, but pathology reports have certain features that are difficult for NLP systems. This paper describes how a preprocessor was integrated with an existing NLP system (MedLEE) in order to reduce modification to the NLP system and to improve performance. The work was done in conjunction with an ongoing clinical research project that assesses disparities and risks of developing breast cancer for minority women. An evaluation of the system was performed using manually coded data from the research project's database as a gold standard. The evaluation outcome showed that the extended NLP system had a sensitivity of 90.6% and a precision of 91.6%. Results indicated that this system performed satisfactorily for capturing information for the cancer research project.

  13. Crowdsourcing and curation: perspectives from biology and natural language processing.

    PubMed

    Hirschman, Lynette; Fort, Karën; Boué, Stéphanie; Kyrpides, Nikos; Islamaj Doğan, Rezarta; Cohen, Kevin Bretonnel

    2016-01-01

    Crowdsourcing is increasingly utilized for performing tasks in both natural language processing and biocuration. Although there have been many applications of crowdsourcing in these fields, there have been fewer high-level discussions of the methodology and its applicability to biocuration. This paper explores crowdsourcing for biocuration through several case studies that highlight different ways of leveraging 'the crowd'; these raise issues about the kind(s) of expertise needed, the motivations of participants, and questions related to feasibility, cost and quality. The paper is an outgrowth of a panel session held at BioCreative V (Seville, September 9-11, 2015). The session consisted of four short talks, followed by a discussion. In their talks, the panelists explored the role of expertise and the potential to improve crowd performance by training; the challenge of decomposing tasks to make them amenable to crowdsourcing; and the capture of biological data and metadata through community editing. Database URL: http://www.mitre.org/publications/technical-papers/crowdsourcing-and-curation-perspectives. PMID:27504010

  14. Automatic retrieval of bone fracture knowledge using natural language processing.

    PubMed

    Do, Bao H; Wu, Andrew S; Maley, Joan; Biswal, Sandip

    2013-08-01

    Natural language processing (NLP) techniques to extract data from unstructured text into formal computer representations are valuable for creating robust, scalable methods to mine data in medical documents and radiology reports. As voice recognition (VR) becomes more prevalent in radiology practice, there is opportunity for implementing NLP in real time for decision-support applications such as context-aware information retrieval. For example, as the radiologist dictates a report, an NLP algorithm can extract concepts from the text and retrieve relevant classification or diagnosis criteria or calculate disease probability. NLP can work in parallel with VR to potentially facilitate evidence-based reporting (for example, automatically retrieving the Bosniak classification when the radiologist describes a kidney cyst). For these reasons, we developed and validated an NLP system which extracts fracture and anatomy concepts from unstructured text and retrieves relevant bone fracture knowledge. We implement our NLP in an HTML5 web application to demonstrate a proof-of-concept feedback NLP system which retrieves bone fracture knowledge in real time. PMID:23053906

  16. A common type system for clinical natural language processing

    PubMed Central

    2013-01-01

    Background: One challenge in reusing clinical data stored in electronic medical records is that these data are heterogeneous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. Results: We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. Conclusions: We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogeneous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types. PMID:23286462

  17. A grammar-based semantic similarity algorithm for natural language sentences.

    PubMed

    Lee, Ming Che; Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, as opposed to "artificial language" such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of co-occurring terms/words, may not always find the best match when there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome these problems. Experiments on two well-known benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.

  18. ImageParser: a tool for finite element generation from three-dimensional medical images

    PubMed Central

    Yin, HM; Sun, LZ; Wang, G; Yamada, T; Wang, J; Vannier, MW

    2004-01-01

    Background: The finite element method (FEM) is a powerful mathematical tool to simulate and visualize the mechanical deformation of tissues and organs during medical examinations or interventions. It remains a challenge to build an FEM mesh directly from a volumetric image, partly because the regions (or structures) of interest (ROIs) may be irregular and fuzzy. Methods: A software package, ImageParser, is developed to generate an FEM mesh from 3-D tomographic medical images. This software uses a semi-automatic method to detect ROIs from the context of the image, including neighboring tissues and organs, completes segmentation of different tissues, and meshes the organ into elements. Results: ImageParser is shown to build an FEM model for simulating the mechanical responses of the breast based on 3-D CT images. The breast is compressed by two plate paddles under an overall displacement as large as 20% of the initial distance between the paddles. The strain and tangential Young's modulus distributions are specified for the biomechanical analysis of breast tissues. Conclusion: ImageParser can successfully extract the geometry of ROIs from a complex medical image and generate the FEM mesh with customer-defined segmentation information. PMID:15461787

  19. Nature and Nurture in School-Based Second Language Achievement

    ERIC Educational Resources Information Center

    Dale, Philip S.; Harlaar, Nicole; Plomin, Robert

    2012-01-01

    Variability in achievement across learners is a hallmark of second language (L2) learning, especially in academic-based learning. The Twins Early Development Study (TEDS), based on a large, population-representative sample in the United Kingdom, provides the first opportunity to examine individual differences in second language achievement in a…

  20. Notes on the Nature of Bilingual Specific Language Impairment

    ERIC Educational Resources Information Center

    de Jong, Jan

    2010-01-01

    Johanne Paradis' Keynote Article can be read as a concise critical review of the research that focuses on the sometimes strained relationship between bilingualism and specific language impairment (SLI). In my comments I will add some thoughts based on our own research on the learning of Dutch as a second language (L2) by children with SLI.

  1. Three-dimensional grammar in the brain: Dissociating the neural correlates of natural sign language and manually coded spoken language.

    PubMed

    Jednoróg, Katarzyna; Bola, Łukasz; Mostowski, Piotr; Szwed, Marcin; Boguszewski, Paweł M; Marchewka, Artur; Rutkowski, Paweł

    2015-05-01

    In several countries natural sign languages were considered inadequate for education. Instead, new sign-supported systems were created, based on the belief that spoken/written language is grammatically superior. One such system, called SJM (system językowo-migowy), preserves the grammatical and lexical structure of spoken Polish and since the 1960s has been extensively employed in schools and on TV. Nevertheless, the Deaf community avoids using SJM for everyday communication, its preferred language being PJM (polski język migowy), a natural sign language, structurally and grammatically independent of spoken Polish and featuring classifier constructions (CCs). Here, for the first time, we compare, using fMRI, the neural bases of natural vs. devised communication systems. Deaf signers were presented with three types of signed sentences (SJM and PJM with/without CCs). Consistent with previous findings, PJM with CCs compared to either SJM or PJM without CCs recruited the parietal lobes. The reverse comparison revealed activation in the anterior temporal lobes, suggesting increased semantic combinatory processes in lexical sign comprehension. Finally, PJM compared with SJM engaged left posterior superior temporal gyrus and anterior temporal lobe, areas crucial for sentence-level speech comprehension. We suggest that activity in these two areas reflects greater processing efficiency for naturally evolved sign language. PMID:25858311

  3. Natural language processing in an intelligent writing strategy tutoring system.

    PubMed

    McNamara, Danielle S; Crossley, Scott A; Roscoe, Rod

    2013-06-01

    The Writing Pal is an intelligent tutoring system that provides writing strategy training. A large part of its artificial intelligence resides in the natural language processing algorithms to assess essay quality and guide feedback to students. Because writing is often highly nuanced and subjective, the development of these algorithms must consider a broad array of linguistic, rhetorical, and contextual features. This study assesses the potential for computational indices to predict human ratings of essay quality. Past studies have demonstrated that linguistic indices related to lexical diversity, word frequency, and syntactic complexity are significant predictors of human judgments of essay quality but that indices of cohesion are not. The present study extends prior work by including a larger data sample and an expanded set of indices to assess new lexical, syntactic, cohesion, rhetorical, and reading ease indices. Three models were assessed. The model reported by McNamara, Crossley, and McCarthy (Written Communication 27:57-86, 2010) including three indices of lexical diversity, word frequency, and syntactic complexity accounted for only 6% of the variance in the larger data set. A regression model including the full set of indices examined in prior studies of writing predicted 38% of the variance in human scores of essay quality with 91% adjacent accuracy (i.e., within 1 point). A regression model that also included new indices related to rhetoric and cohesion predicted 44% of the variance with 94% adjacent accuracy. The new indices increased accuracy but, more importantly, afford the means to provide more meaningful feedback in the context of a writing tutoring system.
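
    The "adjacent accuracy" figure above counts a predicted essay score as correct when it falls within 1 point of the human rating. A minimal sketch of that computation follows; the score lists are hypothetical and the function name is an assumption made for illustration, not part of the Writing Pal system.

```python
def adjacent_accuracy(predicted, human, tolerance=1):
    """Share of predictions within `tolerance` points of the human score."""
    if len(predicted) != len(human):
        raise ValueError("score lists must have the same length")
    hits = sum(abs(p - h) <= tolerance for p, h in zip(predicted, human))
    return hits / len(human)


# Hypothetical essay scores (assumption, not data from the study)
model_scores = [3, 4, 2, 5, 1]
human_scores = [3, 5, 4, 5, 2]

adjacent = adjacent_accuracy(model_scores, human_scores)            # within 1 point
exact = adjacent_accuracy(model_scores, human_scores, tolerance=0)  # exact match
print(f"adjacent={adjacent:.2f}, exact={exact:.2f}")
```

    Setting the tolerance to 0 gives exact agreement, which makes clear how much more lenient the adjacent measure is.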

  4. Automation of a problem list using natural language processing

    PubMed Central

    Meystre, Stephane; Haug, Peter J

    2005-01-01

    Background: The medical problem list is an important part of the electronic medical record in development in our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To alleviate this issue, we are building an environment where the problem list can be easily and effectively maintained. Methods: For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system composed of two main components: a background and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries from the list of 80 targeted problems detected in the multiple free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list. Results: The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms to detect first document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences. Conclusion: The global aim of our project is to automate the process of creating and maintaining a problem list for hospitalized

  5. Evaluation of PHI Hunter in Natural Language Processing Research

    PubMed Central

    Redd, Andrew; Pickard, Steve; Meystre, Stephane; Scehnet, Jeffrey; Bolton, Dan; Heavirland, Julia; Weaver, Allison Lynn; Hope, Carol; Garvin, Jennifer Hornung

    2015-01-01

    Objectives: We introduce and evaluate a new, easily accessible tool using a common statistical analysis and business analytics software suite, SAS, which can be programmed to remove specific protected health information (PHI) from a text document. Removal of PHI is important because the quantity of text documents used for research with natural language processing (NLP) is increasing. When using existing data for research, an investigator must remove all PHI not needed for the research to comply with human subjects’ right to privacy. This process is similar, but not identical, to de-identification of a given set of documents. Materials and methods: PHI Hunter removes PHI from free-form text. It is a set of rules to identify and remove patterns in text. PHI Hunter was applied to 473 Department of Veterans Affairs (VA) text documents randomly drawn from a research corpus stored as unstructured text in VA files. Results: PHI Hunter performed well with PHI in the form of identification numbers such as Social Security numbers, phone numbers, and medical record numbers. The most commonly missed PHI items were names and locations. Incorrect removal of information occurred with text that looked like identification numbers. Discussion: PHI Hunter fills a niche role that is related to but not equal to the role of de-identification tools. It gives research staff a tool to reasonably increase patient privacy. It performs well for highly sensitive PHI categories that are rarely used in research, but still shows possible areas for improvement. More development for patterns of text and linked demographic tables from electronic health records (EHRs) would improve the program so that more precise identifiable information can be removed. Conclusions: PHI Hunter is an accessible tool that can flexibly remove PHI not needed for research. If it can be tailored to the specific data set via linked demographic tables, its performance will improve in each new document set. PMID:26807078
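
    The rule-based approach described above (patterns that identify and remove identifiers from free text) can be illustrated with a short sketch. PHI Hunter itself is written in SAS; the Python below is a stand-in, and the two patterns, the placeholder labels, and the sample note are all assumptions made for illustration rather than the tool's actual rules.

```python
import re

# Hypothetical patterns for illustration only; not PHI Hunter's actual rules.
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),            # e.g. 123-45-6789
    "PHONE": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),  # e.g. 555-123-4567
}


def scrub(text):
    """Replace every match of each pattern with a bracketed placeholder."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Made-up note text (assumption, not drawn from any corpus)
note = "Pt SSN 123-45-6789, call 555-123-4567. Seen by Dr. Smith."
clean = scrub(note)
print(clean)
```

    In this sketch the name "Dr. Smith" passes through untouched, which mirrors the abstract's observation that names and locations are the items most often missed by pattern rules; likewise, any number that merely resembles an identifier would be removed.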

  6. Of substance: the nature of language effects on entity construal.

    PubMed

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-06-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind [whisk; Imai, M., & Gentner, D. (1997). A cross-linguistic study of early word meaning: Universal ontology and linguistic influence. Cognition, 62, 169-200]. Five experiments replicated this language type effect on entity construal, extended it to quite different stimuli from those studied before, and extended it to a comparison between Mandarin speakers and English speakers. A sixth experiment, which did not involve interpreting the meaning of a noun or a pronoun that stands for a noun, failed to find any effect of language type on entity construal. Thus, the overall pattern of findings supports a non-Whorfian, language on language account, according to which sensitivity to lexical statistics in a count/mass language leads adults to assign a novel noun in neutral syntax the status of a count noun, influencing construal of ambiguous entities. The experiments also document and explore cross-linguistically universal factors that influence entity construal, and favor Prasada's [Prasada, S. (1999). Names for things and stuff: An Aristotelian perspective. In R. Jackendoff, P. Bloom, & K. Wynn (Eds.), Language, logic, and concepts (pp. 119-146). Cambridge, MA: MIT Press] hypothesis that features indicating non-accidentalness of an entity's form lead participants to a construal of object kind rather than substance kind. Finally, the experiments document the age at which the language type effect emerges in lexical projection. The details of the developmental pattern are consistent with the lexical statistics hypothesis, along with a universal increase in sensitivity to material kind.

  7. On the neurolinguistic nature of language abnormalities in Huntington's disease.

    PubMed Central

    Wallesch, C W; Fehrenbach, R A

    1988-01-01

    The spontaneous language of 18 patients suffering from Huntington's disease and 15 dysarthric controls suffering from Friedreich's ataxia was investigated. In addition, language functions in various modalities were assessed with the Aachen Aphasia Test (AAT). The Huntington patients exhibited deficits in the syntactical complexity of spontaneous speech and in the Token Test, confrontation naming, and language comprehension subtests of the AAT, which are interpreted as resulting from their dementia. Errors affecting word access mechanisms and production of syntactical structures as such were not encountered. PMID:2452241

  8. Statistical learning in a natural language by 8-month-old infants.

    PubMed

    Pelucchi, Bruna; Hay, Jessica F; Saffran, Jenny R

    2009-01-01

    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants' ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition. PMID:19489896

  9. Statistical Learning in a Natural Language by 8-Month-Old Infants

    PubMed Central

    Pelucchi, Bruna; Hay, Jessica F.; Saffran, Jenny R.

    2013-01-01

    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants’ ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition. PMID:19489896

  10. Natural Language Query System Design for Interactive Information Storage and Retrieval Systems. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung

    1985-01-01

    The currently developed multi-level language interfaces of information systems are generally designed for experienced users. These interfaces commonly ignore the nature and needs of the largest user group, i.e., casual users. This research identifies the importance of natural language query system research within information storage and retrieval system development; addresses the topics of developing such a query system; and finally, proposes a framework for the development of natural language query systems in order to facilitate the communication between casual users and information storage and retrieval systems.

  11. Of Substance: The Nature of Language Effects on Entity Construal

    ERIC Educational Resources Information Center

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-01-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind [whisk; Imai, M., & Gentner, D.…

  12. Learning and comprehension of BASIC and natural language computer programming by novices

    SciTech Connect

    Dyck, J.L.

    1987-01-01

    This study examined the effectiveness of teaching novices to program in Natural Language as a prerequisite for learning BASIC, and the learning and comprehension processes for Natural Language and BASIC computer-programming languages. Three groups of computer-naive subjects participated in five self-paced learning sessions; in each session, subjects solved a series of programming problems with immediate feedback. Twenty-four subjects learned to solve BASIC programming problems (BASIC group) for all five sessions, 23 subjects learned to solve corresponding Natural Language programming problems for all five sessions (Natural Language group), and 23 subjects learned to solve Natural Language programming problems for three sessions and then transferred to BASIC for the final two sessions (Transfer group). At the end of the fifth session, all subjects completed a post-test which required the subjects to use their programming knowledge in a new way. Results indicated that the Natural Language trained subjects had complete transfer to BASIC, as indicated by no overall difference in comprehension time or accuracy for final BASIC sessions (i.e., sessions four and five) for the Transfer and BASIC groups. In addition, there was an interaction between group and session on accuracy, in which the Transfer group increased its accuracy at a faster rate than the BASIC group.

  13. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

    PubMed Central

    Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in contrast to “artificial language” such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of co-occurring terms/words, may not always find the best match when there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome these problems. Experiments on two well-known benchmarks demonstrate that the proposed algorithm achieves a significant performance improvement on sentences/short texts with arbitrary syntax and structure. PMID:24982952

  14. The language-specific nature of grammatical development: evidence from bilingual language learners.

    PubMed

    Marchman, Virginia A; Martínez-Sussmann, Carmen; Dale, Philip S

    2004-04-01

    The fact that early lexical and grammatical acquisition are strongly correlated has been cited as evidence against the view that the language faculty is composed of dissociable and autonomous modules (Bates & Goodman, 1997). However, previous studies have not yet eliminated the possibility that lexical-grammar associations may be attributable to language-general individual differences (e.g. children who are good at learning words are good at learning grammar). Parent report assessments of toddlers who are simultaneously learning English and Spanish (n = 113) allow an examination of the specificity of lexical-grammar relationships while holding child factors constant. Within-language vocabulary-grammar associations were stronger than cross-language relationships, even after controlling for age, proportion of language exposure, general language skill and reporter bias. Similar patterns were found based on naturalistic language samples (n = 22), ruling out a methodological artifact. These results are consistent with the view that grammar learning is specifically tied to lexical progress in a given language and provide further support for strong lexical-grammatical continuity early in acquisition.

  15. Computational Nonlinear Morphology with Emphasis on Semitic Languages. Studies in Natural Language Processing.

    ERIC Educational Resources Information Center

    Kiraz, George Anton

    This book presents a tractable computational model that can cope with complex morphological operations, especially in Semitic languages, and less complex morphological systems present in Western languages. It outlines a new generalized regular rewrite rule system that uses multiple finite-state automata to cater to root-and-pattern morphology,…

  16. The Nature of Chinese Language Classroom Learning Environments in Singapore Secondary Schools

    ERIC Educational Resources Information Center

    Chua, Siew Lian; Wong, Angela F. L.; Chen, Der-Thanq V.

    2011-01-01

    This article reports findings from a classroom environment study which was designed to investigate the nature of Chinese Language classroom environments in Singapore secondary schools. We used a perceptual instrument, the Chinese Language Classroom Environment Inventory, to investigate teachers' and students' perceptions towards their Chinese…

  17. Structured Natural-Language Descriptions for Semantic Content Retrieval of Visual Materials.

    ERIC Educational Resources Information Center

    Tam, A. M.; Leung, C. H. C.

    2001-01-01

    Proposes a structure for natural language descriptions of the semantic content of visual materials that requires descriptions to be (modified) keywords, phrases, or simple sentences, with components that are grammatical relations common to many languages. This structure makes it easy to implement a collection's descriptions as a relational…

  18. A Fuzzy Set Approach to Modifiers and Vagueness in Natural Language

    ERIC Educational Resources Information Center

    Hersh, Harry M.; Caramazza, Alfonso

    1976-01-01

    The proposition that natural language concepts are represented as fuzzy sets of meaning components (fuzzy sets being a generalization of the traditional theory of sets), and that language operators (adverbs, negative markers, and adjectives) can be considered as operators on fuzzy sets, was assessed empirically. (Editor/RK)

  19. Using the Natural Language Paradigm (NLP) to Increase Vocalizations of Older Adults with Cognitive Impairments

    ERIC Educational Resources Information Center

    LeBlanc, Linda A.; Geiger, Kaneen B.; Sautter, Rachael A.; Sidener, Tina M.

    2007-01-01

    The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated…

  20. Paradigms of Evaluation in Natural Language Processing: Field Linguistics for Glass Box Testing

    ERIC Educational Resources Information Center

    Cohen, Kevin Bretonnel

    2010-01-01

    Although software testing has been well-studied in computer science, it has received little attention in natural language processing. Nonetheless, a fully developed methodology for glass box evaluation and testing of language processing applications already exists in the field methods of descriptive linguistics. This work lays out a number of…

  1. Digging in the Dictionary: Building a Relational Lexicon To Support Natural Language Processing Applications.

    ERIC Educational Resources Information Center

    Evens, Martha; And Others

    Advanced learners of second languages and natural language processing systems both demand much more detailed lexical information than conventional dictionaries provide. Text composition, whether by humans or machines, requires a thorough understanding of relationships between words, such as selectional restrictions, case patterns, factives, and…

  2. Transfer of a Natural Language System for Problem-Solving in Physics to Other Domains.

    ERIC Educational Resources Information Center

    Oberem, Graham E.

    The limited language capability of CAI systems has made it difficult to personalize problem-solving instruction. The intelligent tutoring system, ALBERT, is a problem-solving monitor and coach that has been used with high school and college level physics students for several years; it uses a natural language system to understand kinematics…

  3. Nine-Month-Olds Extract Structural Principles Required for Natural Language

    ERIC Educational Resources Information Center

    Gerken, LouAnn

    2004-01-01

    Infants' ability to rapidly extract properties of language-like systems during brief laboratory exposures has been taken as evidence about the innate linguistic state of humans. However, previous studies have focused on structural properties that are not central to descriptions of natural language. In the current study, infants were exposed to 3-…

  4. Semantic Grammar: An Engineering Technique for Constructing Natural Language Understanding Systems.

    ERIC Educational Resources Information Center

    Burton, Richard R.

    In an attempt to overcome the lack of natural means of communication between student and computer, this thesis addresses the problem of developing a system which can understand natural language within an educational problem-solving environment. The nature of the environment imposes efficiency, habitability, self-teachability, and awareness of…

  5. Approach to the organization of knowledge and its use in natural language recall tasks

    SciTech Connect

    Mccalla, G.I.

    1983-01-01

    The viewpoint espoused in this paper is that natural language understanding and production is the action of a number of highly integrated domain-specific specialists. Described first is an object oriented representation scheme which allows these specialists to be built. Discussed next is the organization of these specialists into a four-level goal hierarchy that enables the modelling of natural language conversation. It is shown how the representation and natural language structures can be used to facilitate the recall of earlier natural language conversations. Six specific kinds of recall tasks are outlined in terms of these structures and their occurrence in several legal dialogues is examined. Finally, the need for intelligent garbage collection of old episodic information is pointed out. 38 references.

  6. The natural order of events: How speakers of different languages represent events nonverbally

    PubMed Central

    Goldin-Meadow, Susan; So, Wing Chee; Özyürek, Aslı; Mylander, Carolyn

    2008-01-01

    To test whether the language we speak influences our behavior even when we are not speaking, we asked speakers of four languages differing in their predominant word orders (English, Turkish, Spanish, and Chinese) to perform two nonverbal tasks: a communicative task (describing an event by using gesture without speech) and a noncommunicative task (reconstructing an event with pictures). We found that the word orders speakers used in their everyday speech did not influence their nonverbal behavior. Surprisingly, speakers of all four languages used the same order, and did so on both nonverbal tasks. This order, actor–patient–act, is analogous to the subject–object–verb pattern found in many languages of the world and, importantly, in newly developing gestural languages. The findings provide evidence for a natural order that we impose on events when describing and reconstructing them nonverbally and exploit when constructing language anew. PMID:18599445

  7. GE FRST Evaluation Report: How Well Does a Statistically-Based Natural Language Processing System Score Natural Language Constructed-Responses?

    ERIC Educational Resources Information Center

    Burstein, Jill C.; Kaplan, Randy M.

    There is a considerable interest at Educational Testing Service (ETS) to include performance-based, natural language constructed-response items on standardized tests. Such items can be developed, but the projected time and costs required to have these items scored by human graders would be prohibitive. In order for ETS to include these types of…

  8. Integrating Corpus-Based Resources and Natural Language Processing.

    ERIC Educational Resources Information Center

    Cantos, Pascual

    2002-01-01

    Surveys computational linguistic tools that are presently available but whose potential has been neither fully considered nor fully exploited in modern computer assisted language learning (CALL). Discusses the rationale of data-driven learning (DDL) to engage learning, presenting typical DDL activities, DDL software, and potential extensions of…

  9. Natural language modeling for phoneme-to-text transcription

    SciTech Connect

    Derouault, A.M.; Merialdo, B.

    1986-11-01

    This paper relates different kinds of language modeling methods that can be applied to the linguistic decoding part of a speech recognition system with a very large vocabulary. These models are studied experimentally on a pseudophonetic input arising from French stenotypy. The authors propose a model which combines the advantages of a statistical modeling with information theoretic tools, and those of a grammatical approach.

  10. Evolutionary Developmental Linguistics: Naturalization of the Faculty of Language

    ERIC Educational Resources Information Center

    Locke, John L.

    2009-01-01

    Since language is a biological trait, it is necessary to investigate its evolution, development, and functions, along with the mechanisms that have been set aside, and are now recruited, for its acquisition and use. It is argued here that progress toward each of these goals can be facilitated by new programs of research, carried out within a new…

  11. Unit 1001: The Nature of Meaning in Language.

    ERIC Educational Resources Information Center

    Minnesota Univ., Minneapolis. Center for Curriculum Development in English.

    This 10th-grade unit in Minnesota's "language-centered" curriculum introduces the complexity of linguistic meaning by demonstrating the relationships among linguistic symbols, their referents, their interpreters, and the social milieu. The unit begins with a discussion of Ray Bradbury's "The Kilimanjaro Machine," which illustrates how an otherwise…

  12. Neural correlates for the acquisition of natural language syntax.

    PubMed

    Tettamanti, Marco; Alkadhi, Hatem; Moro, Andrea; Perani, Daniela; Kollias, Spyros; Weniger, Dorothea

    2002-10-01

    Some types of simple and logically possible syntactic rule never occur in human language grammars, leading to a distinction between grammatical and nongrammatical syntactic rules. Comparison of the neuroanatomical correlates underlying the acquisition of grammatical and nongrammatical rules can provide relevant evidence on the neural processes dedicated to language acquisition in a given developmental stage. Until now, no direct evidence on the neural mechanisms subserving language acquisition at any developmental stage has been supplied. We used fMRI to investigate the acquisition of grammatical and nongrammatical rules in the specified sense in 14 healthy adults. Grammatical rules compared with nongrammatical rules specifically activated a left hemispheric network including Broca's area, as shown by direct comparisons between the two rule types. The selective role of Broca's area was further confirmed by time × condition interactions and by proficiency effects, in that higher proficiency in grammatical rule usage, but not in usage of nongrammatical rules, led to higher levels of activation in this area. These findings provide evidence for the neural mechanisms underlying language acquisition in adults. PMID:12377145

  13. Inferring Speaker Affect in Spoken Natural Language Communication

    ERIC Educational Resources Information Center

    Pon-Barry, Heather Roberta

    2013-01-01

    The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…

  14. Visual language recognition with a feed-forward network of spiking neurons

    SciTech Connect

    Rasmussen, Craig E; Garrett, Kenyan; Sottile, Matthew; Shreyas, Ns

    2010-01-01

    An analogy is made and exploited between the recognition of visual objects and language parsing. A subset of regular languages is used to define a one-dimensional 'visual' language, in which the words are translational and scale invariant. This allows an exploration of the viewpoint invariant languages that can be solved by a network of concurrent, hierarchically connected processors. A language family is defined that is hierarchically tiling system recognizable (HREC). As inspired by nature, an algorithm is presented that constructs a cellular automaton that recognizes strings from a language in the HREC family. It is demonstrated how a language recognizer can be implemented from the cellular automaton using a feed-forward network of spiking neurons. This parser recognizes fixed-length strings from the language in parallel and as the computation is pipelined, a new string can be parsed in each new interval of time. The analogy with formal language theory allows inferences to be drawn regarding what class of objects can be recognized by visual cortex operating in purely feed-forward fashion and what class of objects requires a more complicated network architecture.

  15. The Oscillopathic Nature of Language Deficits in Autism: From Genes to Language Evolution

    PubMed Central

    Benítez-Burraco, Antonio; Murphy, Elliot

    2016-01-01

    Autism spectrum disorders (ASD) are pervasive neurodevelopmental disorders involving a number of deficits to linguistic cognition. The gap between genetics and the pathophysiology of ASD remains open, in particular regarding its distinctive linguistic profile. The goal of this article is to attempt to bridge this gap, focusing on how the autistic brain processes language, particularly through the perspective of brain rhythms. Due to the phenomenon of pleiotropy, which may take some decades to overcome, we believe that studies of brain rhythms, which are not faced with problems of this scale, may constitute a more tractable route to interpreting language deficits in ASD and eventually other neurocognitive disorders. Building on recent attempts to link neural oscillations to certain computational primitives of language, we show that interpreting language deficits in ASD as oscillopathic traits is a potentially fruitful way to construct successful endophenotypes of this condition. Additionally, we will show that candidate genes for ASD are overrepresented among the genes that played a role in the evolution of language. These genes include (and are related to) genes involved in brain rhythmicity. We hope that the type of steps taken here will additionally lead to a better understanding of the comorbidity, heterogeneity, and variability of ASD, and may help achieve a better treatment of the affected populations. PMID:27047363

  16. The language faculty that wasn't: a usage-based account of natural language recursion

    PubMed Central

    Christiansen, Morten H.; Chater, Nick

    2015-01-01

    In the generative tradition, the language faculty has been shrinking—perhaps to include only the mechanism of recursion. This paper argues that even this view of the language faculty is too expansive. We first argue that a language faculty is difficult to reconcile with evolutionary considerations. We then focus on recursion as a detailed case study, arguing that our ability to process recursive structure does not rely on recursion as a property of the grammar, but instead emerges gradually by piggybacking on domain-general sequence learning abilities. Evidence from genetics, comparative work on non-human primates, and cognitive neuroscience suggests that humans have evolved complex sequence learning skills, which were subsequently pressed into service to accommodate language. Constraints on sequence learning therefore have played an important role in shaping the cultural evolution of linguistic structure, including our limited abilities for processing recursive structure. Finally, we re-evaluate some of the key considerations that have often been taken to require the postulation of a language faculty. PMID:26379567

  17. The Oscillopathic Nature of Language Deficits in Autism: From Genes to Language Evolution.

    PubMed

    Benítez-Burraco, Antonio; Murphy, Elliot

    2016-01-01

    Autism spectrum disorders (ASD) are pervasive neurodevelopmental disorders involving a number of deficits to linguistic cognition. The gap between genetics and the pathophysiology of ASD remains open, in particular regarding its distinctive linguistic profile. The goal of this article is to attempt to bridge this gap, focusing on how the autistic brain processes language, particularly through the perspective of brain rhythms. Due to the phenomenon of pleiotropy, which may take some decades to overcome, we believe that studies of brain rhythms, which are not faced with problems of this scale, may constitute a more tractable route to interpreting language deficits in ASD and eventually other neurocognitive disorders. Building on recent attempts to link neural oscillations to certain computational primitives of language, we show that interpreting language deficits in ASD as oscillopathic traits is a potentially fruitful way to construct successful endophenotypes of this condition. Additionally, we will show that candidate genes for ASD are overrepresented among the genes that played a role in the evolution of language. These genes include (and are related to) genes involved in brain rhythmicity. We hope that the type of steps taken here will additionally lead to a better understanding of the comorbidity, heterogeneity, and variability of ASD, and may help achieve a better treatment of the affected populations. PMID:27047363

  18. Concreteness and Psychological Distance in Natural Language Use

    PubMed Central

    Snefjella, Bryor; Kuperman, Victor

    2015-01-01

    Existing evidence shows that more abstract mental representations are formed, and more abstract language is used, to characterize phenomena which are more distant from self. Yet the precise form of the functional relationship between distance and linguistic abstractness has been unknown. In four studies, we test whether more abstract language is used in textual references to more geographically distant cities (Study 1), times further into the past or future (Study 2), references to more socially distant people (Study 3), and references to a specific topic (Study 4). Using millions of linguistic productions from thousands of social media users, we determine that linguistic concreteness is a curvilinear function of the logarithm of distance and discuss psychological underpinnings of the mathematical properties of the relationship. We also demonstrate that gradient curvilinear effects of geographic and temporal distance on concreteness are near-identical, suggesting uniformity in representation of abstractness along multiple dimensions. PMID:26239108

  19. Concreteness and Psychological Distance in Natural Language Use.

    PubMed

    Snefjella, Bryor; Kuperman, Victor

    2015-09-01

    Existing evidence shows that more abstract mental representations are formed and more abstract language is used to characterize phenomena that are more distant from the self. Yet the precise form of the functional relationship between distance and linguistic abstractness is unknown. In four studies, we tested whether more abstract language is used in textual references to more geographically distant cities (Study 1), time points further into the past or future (Study 2), references to more socially distant people (Study 3), and references to a specific topic (Study 4). Using millions of linguistic productions from thousands of social-media users, we determined that linguistic concreteness is a curvilinear function of the logarithm of distance, and we discuss psychological underpinnings of the mathematical properties of this relationship. We also demonstrated that gradient curvilinear effects of geographic and temporal distance on concreteness are nearly identical, which suggests uniformity in representation of abstractness along multiple dimensions.

  20. Thermo-msf-parser: an open source Java library to parse and visualize Thermo Proteome Discoverer msf files.

    PubMed

    Colaert, Niklaas; Barsnes, Harald; Vaudel, Marc; Helsens, Kenny; Timmerman, Evy; Sickmann, Albert; Gevaert, Kris; Martens, Lennart

    2011-08-01

    The Thermo Proteome Discoverer program integrates both peptide identification and quantification into a single workflow for peptide-centric proteomics. Furthermore, its close integration with Thermo mass spectrometers has made it increasingly popular in the field. Here, we present a Java library to parse the msf files that constitute the output of Proteome Discoverer. The parser is also implemented as a graphical user interface allowing convenient access to the information found in the msf files, and in Rover, a program to analyze and validate quantitative proteomics information. All code, binaries, and documentation are freely available at http://thermo-msf-parser.googlecode.com.

  1. Using Edit Distance to Analyse Errors in a Natural Language to Logic Translation Corpus

    ERIC Educational Resources Information Center

    Barker-Plummer, Dave; Dale, Robert; Cox, Richard; Romanczuk, Alex

    2012-01-01

    We have assembled a large corpus of student submissions to an automatic grading system, where the subject matter involves the translation of natural language sentences into propositional logic. Of the 2.3 million translation instances in the corpus, 286,000 (approximately 12%) are categorized as being in error. We want to understand the nature of…

  2. For the People...Citizenship Education and Naturalization Information. An English as a Second Language Text.

    ERIC Educational Resources Information Center

    Short, Deborah J.; And Others

    A textbook for English-as-a-Second-Language (ESL) students presents lessons on U.S. citizenship education and naturalization information. The nine lessons cover the following topics: the U.S. system of government; the Bill of Rights; responsibilities and rights of citizens; voting; requirements for naturalization; the application process; the…

  3. The Right Tool for the Job: Techniques for Analysis of Natural Language Use.

    ERIC Educational Resources Information Center

    Green, Georgia M.

    A variety of techniques for collecting and analyzing information about the natural use of natural languages is surveyed, emphasizing the importance of recognizing the properties of a research task that make a given technique more or less suitable to it rather than comparing techniques globally and ranking them absolutely. An initial goal is to…

  4. Rimac: A Natural-Language Dialogue System that Engages Students in Deep Reasoning Dialogues about Physics

    ERIC Educational Resources Information Center

    Katz, Sandra; Jordan, Pamela; Litman, Diane

    2011-01-01

    The natural-language tutorial dialogue system that the authors are developing will allow them to focus on the nature of interactivity during tutoring as a malleable factor. Specifically, it will serve as a research platform for studies that manipulate the frequency and types of verbal alignment processes that take place during tutoring, such as…

  5. Deciphering the language of nature: cryptography, secrecy, and alterity in Francis Bacon.

    PubMed

    Clody, Michael C

    2011-01-01

    The essay argues that Francis Bacon's considerations of parables and cryptography reflect larger interpretative concerns of his natural philosophic project. Bacon describes nature as having a language distinct from those of God and man, and, in so doing, establishes a central problem of his natural philosophy: namely, how can the language of nature be accessed through scientific representation? Ultimately, Bacon's solution relies on a theory of differential and duplicitous signs that conceal within them the hidden voice of nature, which is best recognized in the natural forms of efficient causality. The "alphabet of nature" (those tables of natural occurrences) consequently plays a central role in his program, as it renders nature's language susceptible to a process of decryption that mirrors the model of the bilateral cipher. It is argued that while the writing of Bacon's natural philosophy strives for literality, its investigative process preserves a space for alterity within scientific representation that is made accessible to those with the interpretative key. PMID:22371983

  7. GazeParser: an open-source and multiplatform library for low-cost eye tracking and analysis.

    PubMed

    Sogo, Hiroyuki

    2013-09-01

    Eye movement analysis is an effective method for research on visual perception and cognition. However, recordings of eye movements present practical difficulties related to the cost of the recording devices and the programming of device controls for use in experiments. GazeParser is an open-source library for low-cost eye tracking and data analysis; it consists of a video-based eyetracker and libraries for data recording and analysis. The libraries are written in Python and can be used in conjunction with PsychoPy and VisionEgg experimental control libraries. Three eye movement experiments are reported on performance tests of GazeParser. These showed that the means and standard deviations for errors in sampling intervals were less than 1 ms. Spatial accuracy ranged from 0.7° to 1.2°, depending on participant. In gap/overlap tasks and antisaccade tasks, the latency and amplitude of the saccades detected by GazeParser agreed with those detected by a commercial eyetracker. These results showed that the GazeParser demonstrates adequate performance for use in psychological experiments.

  8. QATT: a Natural Language Interface for QPE. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    White, Douglas Robert-Graham

    1989-01-01

    QATT, a natural language interface developed for the Qualitative Process Engine (QPE) system, is presented. The major goal was to evaluate the use of a preexisting natural language understanding system designed to be tailored for query processing in multiple domains of application. The other goal of QATT is to provide a comfortable environment in which to query envisionments in order to gain insight into the qualitative behavior of physical systems. It is shown that the use of the preexisting system made possible the development of a reasonably useful interface in a few months.

  9. SWAN: An expert system with natural language interface for tactical air capability assessment

    NASA Technical Reports Server (NTRS)

    Simmons, Robert M.

    1987-01-01

    SWAN is an expert system and natural language interface for assessing the war fighting capability of Air Force units in Europe. The expert system is an object oriented knowledge based simulation with an alternate worlds facility for performing what-if excursions. Responses from the system take the form of generated text, tables, or graphs. The natural language interface is an expert system in its own right, with a knowledge base and rules which understand how to access external databases, models, or expert systems. The distinguishing feature of the Air Force expert system is its use of meta-knowledge to generate explanations in the frame and procedure based environment.

  10. Spatial and Numerical Abilities without a Complete Natural Language

    ERIC Educational Resources Information Center

    Hyde, Daniel C.; Winkler-Rhoades, Nathan; Lee, Sang-Ah; Izard, Veronique; Shapiro, Kevin A.; Spelke, Elizabeth S.

    2011-01-01

    We studied the cognitive abilities of a 13-year-old deaf child, deprived of most linguistic input from late infancy, in a battery of tests designed to reveal the nature of numerical and geometrical abilities in the absence of a full linguistic system. Tests revealed widespread proficiency in basic symbolic and non-symbolic numerical computations…

  11. A Goal-Oriented Model of Natural Language Interaction.

    ERIC Educational Resources Information Center

    Moore, James A.; And Others

    This report summarizes a research program in modeling human communication ability. The methodology involved selecting a single, naturally occurring dialogue, instructing a human observer to extract certain aspects relative to its comprehension, and then using these aspects to guide the construction and verification of the model. The model assumes…

  12. Grammar as a Programming Language. Artificial Intelligence Memo 391.

    ERIC Educational Resources Information Center

    Rowe, Neil

    Student projects that involve writing generative grammars in the computer language, "LOGO," are described in this paper, which presents a grammar-running control structure that allows students to modify and improve the grammar interpreter itself while learning how a simple kind of computer parser works. Included are procedures for programing a…

  13. Stochastic Model for the Vocabulary Growth in Natural Languages

    NASA Astrophysics Data System (ADS)

    Gerlach, Martin; Altmann, Eduardo G.

    2013-04-01

    We propose a stochastic model for the number of different words in a given database which incorporates the dependence on the database size and historical changes. The main feature of our model is the existence of two different classes of words: (i) a finite number of core words, which have higher frequency and do not affect the probability of a new word to be used, and (ii) the remaining virtually infinite number of noncore words, which have lower frequency and, once used, reduce the probability of a new word to be used in the future. Our model relies on a careful analysis of the Google Ngram database of books published in the last centuries, and its main consequence is the generalization of Zipf’s and Heaps’ law to two-scaling regimes. We confirm that these generalizations yield the best simple description of the data among generic descriptive models and that the two free parameters depend only on the language but not on the database. From the point of view of our model, the main change on historical time scales is the composition of the specific words included in the finite list of core words, which we observe to decay exponentially in time with a rate of approximately 30 words per year for English.
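
    The vocabulary-growth measurement that this abstract builds on can be sketched as follows. This is a minimal illustration of the classical single-regime Heaps' law fit (V ~ K * N**beta), not the authors' two-class model of core and noncore words. The function names and the synthetic Zipf-distributed tokens are assumptions made for illustration; the abstract's own analysis uses the Google Ngram database.

```python
import math
import random


def vocabulary_growth(tokens):
    """Return [(tokens_seen, distinct_words_seen), ...] after each token."""
    seen = set()
    curve = []
    for n, token in enumerate(tokens, start=1):
        seen.add(token)
        curve.append((n, len(seen)))
    return curve


def fit_heaps_exponent(curve):
    """Least-squares slope of log V against log N, i.e. beta in V ~ K * N**beta."""
    xs = [math.log(n) for n, _ in curve]
    ys = [math.log(v) for _, v in curve]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def zipf_tokens(n_tokens, vocab_size, exponent, seed=0):
    """Hypothetical test data: word ranks drawn with probability ~ rank**-exponent."""
    rng = random.Random(seed)
    weights = [rank ** -exponent for rank in range(1, vocab_size + 1)]
    return rng.choices(range(vocab_size), weights=weights, k=n_tokens)


# Synthetic corpus (an assumption, standing in for real text).
tokens = zipf_tokens(n_tokens=20000, vocab_size=50000, exponent=1.2)
curve = vocabulary_growth(tokens)
beta = fit_heaps_exponent(curve)
print(f"distinct words: {curve[-1][1]}, fitted Heaps exponent: {beta:.2f}")
```

    A single fitted exponent is the simplest description. The abstract's claim is that two scaling regimes describe real data better, which this sketch does not attempt to reproduce.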

  14. Natural language processing with dynamic classification improves P300 speller accuracy and bit rate

    NASA Astrophysics Data System (ADS)

    Speier, William; Arnold, Corey; Lu, Jessica; Taira, Ricky K.; Pouratian, Nader

    2012-02-01

    The P300 speller is an example of a brain-computer interface that can restore functionality to victims of neuromuscular disorders. Although the most common application of this system has been communicating language, the properties and constraints of the linguistic domain have not to date been exploited when decoding brain signals that pertain to language. We hypothesized that combining the standard stepwise linear discriminant analysis with a Naive Bayes classifier and a trigram language model would increase the speed and accuracy of typing with the P300 speller. With integration of natural language processing, we observed significant improvements in accuracy and 40-60% increases in bit rate for all six subjects in a pilot study. This study suggests that integrating information about the linguistic domain can significantly improve signal classification.
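
    The idea of letting a trigram language model inform character selection can be sketched as a Bayesian update. This is an illustrative sketch under stated assumptions, not the authors' implementation: the stepwise linear discriminant analysis scoring of brain signals is not shown, the per-character likelihood values are hypothetical placeholders for it, and the tiny training text and add-one smoothing are choices made here for brevity.

```python
import string
from collections import Counter, defaultdict

ALPHABET = string.ascii_lowercase + " "


def train_trigram(text):
    """Count character trigrams: counts[(a, b)][c] = times c followed the pair a, b."""
    text = "".join(ch for ch in text.lower() if ch in ALPHABET)
    counts = defaultdict(Counter)
    for a, b, c in zip(text, text[1:], text[2:]):
        counts[(a, b)][c] += 1
    return counts


def trigram_prior(counts, history):
    """P(next char | last two chars), with add-one smoothing over ALPHABET."""
    context = counts.get(tuple(history[-2:]), Counter())
    total = sum(context.values()) + len(ALPHABET)
    return {c: (context[c] + 1) / total for c in ALPHABET}


def posterior(prior, likelihood):
    """Bayes' rule: posterior is proportional to prior * likelihood, renormalised."""
    unnormalised = {c: prior[c] * likelihood[c] for c in prior}
    z = sum(unnormalised.values())
    return {c: p / z for c, p in unnormalised.items()}


counts = train_trigram("the cat sat on the mat the hat ")
prior = trigram_prior(counts, "th")

# Hypothetical signal evidence: taken alone it slightly favours 'a' over 'e'.
likelihood = {c: 0.1 for c in ALPHABET}
likelihood["a"] = 0.5
likelihood["e"] = 0.4

post = posterior(prior, likelihood)
best_without_lm = max(likelihood, key=likelihood.get)
best_with_lm = max(post, key=post.get)
print(best_without_lm, best_with_lm)
```

    In this toy case the language model prior after "th" outweighs the small difference in signal evidence, so the selected character changes. That is the kind of effect the abstract attributes to integrating linguistic information.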

  15. The nature and origins of ambient language influence on infant vocal production and early words.

    PubMed

    Vihman, M M; de Boysson-Bardies, B

    1994-01-01

    Phonological structure may be seen as emerging in ontogeny from the combined effects of performance constraints rooted in the neuromotor and perceptual systems, individual lexical development and the influence of the particular ambient language. We review here the nature and origins of the earliest ambient language influences. Global effects within the first year of life include both (1) loss of early appearing phonetic gestures not supported by the ambient language and (2) positive effects, reflecting infant attention to prosody and to cues available in the visual as well as the auditory modality. In the course of early lexical development more specific effects become manifest as individual children pursue less common phonetic paths to which the ambient language provides 'sufficient exposure'.

  16. Dimensions of Difficulty in Translating Natural Language into First Order Logic

    ERIC Educational Resources Information Center

    Barker-Plummer, Dave; Cox, Richard; Dale, Robert

    2009-01-01

    In this paper, we present a study of a large corpus of student logic exercises in which we explore the relationship between two distinct measures of difficulty: the proportion of students whose initial attempt at a given natural language to first-order logic translation is incorrect, and the average number of attempts that are required in order to…

  17. An Evaluation of Help Mechanisms in Natural Language Information Retrieval Systems.

    ERIC Educational Resources Information Center

    Kreymer, Oleg

    2002-01-01

    Evaluates the current state of natural language processing information retrieval systems from the user's point of view, focusing on the structure and components of the systems' help mechanisms. Topics include user/system interaction; semantic parsing; syntactic parsing; semantic mapping; and concept matching. (Author/LRW)

  18. The International English Language Testing System (IELTS): Its Nature and Development.

    ERIC Educational Resources Information Center

    Ingram, D. E.

    The nature and development of the recently released International English Language Testing System (IELTS) instrument are described. The test is the result of a joint Australian-British project to develop a new test for use with foreign students planning to study in English-speaking countries. It is expected that the modular instrument will become…

  19. You Are Your Words: Modeling Students' Vocabulary Knowledge with Natural Language Processing Tools

    ERIC Educational Resources Information Center

    Allen, Laura K.; McNamara, Danielle S.

    2015-01-01

    The current study investigates the degree to which the lexical properties of students' essays can inform stealth assessments of their vocabulary knowledge. In particular, we used indices calculated with the natural language processing tool, TAALES, to predict students' performance on a measure of vocabulary knowledge. To this end, two corpora were…

  20. Speech perception and reading: two parallel modes of understanding language and implications for acquiring literacy naturally.

    PubMed

    Massaro, Dominic W

    2012-01-01

    I review 2 seminal research reports published in this journal during its second decade more than a century ago. Given psychology's subdisciplines, they would not normally be reviewed together because one involves reading and the other speech perception. The small amount of interaction between these domains might have limited research and theoretical progress. In fact, the 2 early research reports revealed common processes involved in these 2 forms of language processing. Their illustration of the role of Wundt's apperceptive process in reading and speech perception anticipated descriptions of contemporary theories of pattern recognition, such as the fuzzy logical model of perception. Based on the commonalities between reading and listening, one can question why they have been viewed so differently. It is commonly believed that learning to read requires formal instruction and schooling, whereas spoken language is acquired from birth onward through natural interactions with people who talk. Most researchers and educators believe that spoken language is acquired naturally from birth onward and even prenatally. Learning to read, on the other hand, is not possible until the child has acquired spoken language, reaches school age, and receives formal instruction. If an appropriate form of written text is made available early in a child's life, however, the current hypothesis is that reading will also be learned inductively and emerge naturally, with no significant negative consequences. If this proposal is true, it should soon be possible to create an interactive system, Technology Assisted Reading Acquisition, to allow children to acquire literacy naturally. PMID:22953690

  1. Drawing Dynamic Geometry Figures Online with Natural Language for Junior High School Geometry

    ERIC Educational Resources Information Center

    Wong, Wing-Kwong; Yin, Sheng-Kai; Yang, Chang-Zhe

    2012-01-01

    This paper presents a tool for drawing dynamic geometric figures by understanding the texts of geometry problems. With the tool, teachers and students can construct dynamic geometric figures on a web page by inputting a geometry problem in natural language. First we need to build the knowledge base for understanding geometry problems. With the…

  2. A Qualitative Analysis Framework Using Natural Language Processing and Graph Theory

    ERIC Educational Resources Information Center

    Tierney, Patrick J.

    2012-01-01

    This paper introduces a method of extending natural language-based processing of qualitative data analysis with the use of a very quantitative tool--graph theory. It is not an attempt to convert qualitative research to a positivist approach with a mathematical black box, nor is it a "graphical solution". Rather, it is a method to help qualitative…

  3. An Analysis of Methods for Preparing a Large Natural Language Data Base.

    ERIC Educational Resources Information Center

    Porch, Ann

    Relative cost and effectiveness of techniques for preparing a computer compatible data base consisting of approximately one million words of natural language are outlined. Considered are dollar cost, ease of editing, and time consumption. Facility for insertion of identifying information within the text, and updating of a text by merging with…

  4. Verification Processes in Recognition Memory: The Role of Natural Language Mediators

    ERIC Educational Resources Information Center

    Marshall, Philip H.; Smith, Randolph A. S.

    1977-01-01

    The existence of verification processes in recognition memory was confirmed in the context of Adams' (Adams & Bray, 1970) closed-loop theory. Subjects' recognition was tested following a learning session. The expectation was that data would reveal consistent internal relationships supporting the position that natural language mediation plays an…

  5. Real English: A Translator to Enable Natural Language Man-Machine Conversation.

    ERIC Educational Resources Information Center

    Gautin, Harvey

    This dissertation presents a pragmatic interpreter/translator called Real English to serve as a natural language man-machine communication interface in a multi-mode on-line information retrieval system. This multi-mode feature affords the user a library-like searching tool by giving him access to a dictionary, lexicon, thesaurus, synonym table,…

  6. NLPIR: A Theoretical Framework for Applying Natural Language Processing to Information Retrieval.

    ERIC Educational Resources Information Center

    Zhou, Lina; Zhang, Dongsong

    2003-01-01

    Proposes a theoretical framework called NLPIR that integrates natural language processing (NLP) into information retrieval (IR) based on the assumption that there exists representation distance between queries and documents. Discusses problems in traditional keyword-based IR, including relevance, and describes some existing NLP techniques.…

  7. Construct Validity in TOEFL iBT Speaking Tasks: Insights from Natural Language Processing

    ERIC Educational Resources Information Center

    Kyle, Kristopher; Crossley, Scott A.; McNamara, Danielle S.

    2016-01-01

    This study explores the construct validity of speaking tasks included in the TOEFL iBT (e.g., integrated and independent speaking tasks). Specifically, advanced natural language processing (NLP) tools, MANOVA difference statistics, and discriminant function analyses (DFA) are used to assess the degree to which and in what ways responses to these…

  8. The Application of Natural Language Processing to Augmentative and Alternative Communication

    ERIC Educational Resources Information Center

    Higginbotham, D. Jeffery; Lesher, Gregory W.; Moulton, Bryan J.; Roark, Brian

    2012-01-01

    Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of next…

  9. The Development of Language and Abstract Concepts: The Case of Natural Number

    ERIC Educational Resources Information Center

    Condry, Kirsten F.; Spelke, Elizabeth S.

    2008-01-01

    What are the origins of abstract concepts such as "seven," and what role does language play in their development? These experiments probed the natural number words and concepts of 3-year-old children who can recite number words to ten but who can comprehend only one or two. Children correctly judged that a set labeled eight retains this label if…

  10. Art Related Experiences for Social Science, Natural Science, and Language Arts.

    ERIC Educational Resources Information Center

    Mack, Edward B.

    This booklet is intended to serve as an introduction to art experiences that relate to studies in social science, natural science, and language arts. It is designed to develop a better understanding of the dynamics of interaction of the abiotic, biotic, and cultural factors of the total environment as manifest in art forms. Each section, presented…

  11. AutoTutor and Family: A Review of 17 Years of Natural Language Tutoring

    ERIC Educational Resources Information Center

    Nye, Benjamin D.; Graesser, Arthur C.; Hu, Xiangen

    2014-01-01

    AutoTutor is a natural language tutoring system that has produced learning gains across multiple domains (e.g., computer literacy, physics, critical thinking). In this paper, we review the development, key research findings, and systems that have evolved from AutoTutor. First, the rationale for developing AutoTutor is outlined and the advantages…

  12. BIT BY BIT: A Game Simulating Natural Language Processing in Computers

    ERIC Educational Resources Information Center

    Kato, Taichi; Arakawa, Chuichi

    2008-01-01

    BIT BY BIT is an encryption game that is designed to improve students' understanding of natural language processing in computers. Participants encode clear words into binary code using an encryption key and exchange them in the game. BIT BY BIT enables participants who do not understand the concept of binary numbers to perform the process of…

  13. Introduction to Special Issue: Understanding the Nature-Nurture Interactions in Language and Learning Differences.

    ERIC Educational Resources Information Center

    Berninger, Virginia Wise

    2001-01-01

    The introduction to this special issue on nature-nurture interactions notes that the following articles represent five biologically oriented research approaches which each provide a tutorial on the investigator's major research tool, a summary of current research understandings regarding language and learning differences, and a discussion of…

  14. The Use of Natural Language Entry and Laser Videodisk Technology in CAI.

    ERIC Educational Resources Information Center

    Abdulla, Abdulla M.; And Others

    1984-01-01

    The use of an authoring system is described that incorporates student interaction with the computer by natural language entry at the keyboard and the use of the microcomputer to direct a random-access laser video-disk player. (Author/MLW)

  15. The Contemporary Thesaurus of Social Science Terms and Synonyms: A Guide for Natural Language Computer Searching.

    ERIC Educational Resources Information Center

    Knapp, Sara D., Comp.

    This book is designed primarily to help users find meaningful words for natural language, or free-text, computer searching of bibliographic and textual databases in the social and behavioral sciences. Additionally, it covers many socially relevant and technical topics not covered by the usual literary thesaurus, therefore it may also be useful for…

  17. The Exploring Nature of Definitions and Classifications of Language Learning Strategies (LLSs) in the Current Studies of Second/Foreign Language Learning

    ERIC Educational Resources Information Center

    Fazeli, Seyed Hossein

    2011-01-01

    This study aims to explore the nature of definitions and classifications of Language Learning Strategies (LLSs) in the current studies of second/foreign language learning in order to show the current problems regarding such definitions and classifications. The present study shows that there is no universally agreed-upon definition and…

  18. Reconceptualizing the Nature of Goals and Outcomes in Language/s Education

    ERIC Educational Resources Information Center

    Leung, Constant; Scarino, Angela

    2016-01-01

    Transformations associated with the increasing speed, scale, and complexity of mobilities, together with the information technology revolution, have changed the demography of most countries of the world and brought about accompanying social, cultural, and economic shifts (Heugh, 2013). This complex diversity has changed the very nature of…

  19. Language Revitalization.

    ERIC Educational Resources Information Center

    Hinton, Leanne

    2003-01-01

    Surveys developments in language revitalization and language death. Focusing on indigenous languages, discusses the role and nature of appropriate linguistic documentation, possibilities for bilingual education, and methods of promoting oral fluency and intergenerational transmission in affected languages. (Author/VWL)

  20. A Natural Language for AdS/CFT Correlators

    SciTech Connect

    Fitzpatrick, A.Liam; Kaplan, Jared; Penedones, Joao; Raju, Suvrat; van Rees, Balt C.; /YITP, Stony Brook

    2012-02-14

    We provide dramatic evidence that 'Mellin space' is the natural home for correlation functions in CFTs with weakly coupled bulk duals. In Mellin space, CFT correlators have poles corresponding to an OPE decomposition into 'left' and 'right' sub-correlators, in direct analogy with the factorization channels of scattering amplitudes. In the regime where these correlators can be computed by tree level Witten diagrams in AdS, we derive an explicit formula for the residues of Mellin amplitudes at the corresponding factorization poles, and we use the conformal Casimir to show that these amplitudes obey algebraic finite difference equations. By analyzing the recursive structure of our factorization formula we obtain simple diagrammatic rules for the construction of Mellin amplitudes corresponding to tree-level Witten diagrams in any bulk scalar theory. We prove the diagrammatic rules using our finite difference equations. Finally, we show that our factorization formula and our diagrammatic rules morph into the flat space S-Matrix of the bulk theory, reproducing the usual Feynman rules, when we take the flat space limit of AdS/CFT. Throughout we emphasize a deep analogy with the properties of flat space scattering amplitudes in momentum space, which suggests that the Mellin amplitude may provide a holographic definition of the flat space S-Matrix.

  1. Using the Natural Language Paradigm (NLP) to increase vocalizations of older adults with cognitive impairments.

    PubMed

    Leblanc, Linda A; Geiger, Kaneen B; Sautter, Rachael A; Sidener, Tina M

    2007-01-01

    The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated in a multiple baseline design across participants. Data were collected on appropriate and inappropriate vocalizations with appropriate vocalizations coded as prompted or unprompted during baseline and treatment sessions. All participants experienced increases in appropriate speech during NLP with variable response patterns. Additionally, the two participants with substantial inappropriate vocalizations showed decreases in inappropriate speech. Implications for intervention in day programs are discussed.

  2. The Nature of the Language Faculty and Its Implications for Evolution of Language (Reply to Fitch, Hauser, and Chomsky)

    ERIC Educational Resources Information Center

    Jackendoff, Ray; Pinker, Steven

    2005-01-01

    In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the ''narrow language faculty'') consists only of recursion, and that this part cannot be considered an adaptation to communication. We…

  3. Naturalism and Ideological Work: How Is Family Language Policy Renegotiated as Both Parents and Children Learn a Threatened Minority Language?

    ERIC Educational Resources Information Center

    Armstrong, Timothy Currie

    2014-01-01

    Parents who enroll their children to be educated through a threatened minority language frequently do not speak that language themselves and classes in the language are sometimes offered to parents in the expectation that this will help them to support their children's education and to use the minority language in the home. Providing…

  4. A Cognitive Neural Architecture Able to Learn and Communicate through Natural Language

    PubMed Central

    Golosio, Bruno; Cangelosi, Angelo; Gamotina, Olesya; Masala, Giovanni Luca

    2015-01-01

    Communicative interactions involve a kind of procedural knowledge that is used by the human brain for processing verbal and nonverbal inputs and for language production. Although considerable work has been done on modeling human language abilities, it has been difficult to bring them together into a comprehensive tabula rasa system compatible with current knowledge of how verbal information is processed in the brain. This work presents a cognitive system, entirely based on a large-scale neural architecture, which was developed to shed light on the procedural knowledge involved in language elaboration. The main component of this system is the central executive, which is a supervising system that coordinates the other components of the working memory. In our model, the central executive is a neural network that takes as input the neural activation states of the short-term memory and yields as output mental actions, which control the flow of information among the working memory components through neural gating mechanisms. The proposed system is capable of learning to communicate through natural language starting from tabula rasa, without any a priori knowledge of the structure of phrases, the meaning of words, or the role of the different classes of words, only by interacting with a human through a text-based interface, using an open-ended incremental learning process. It is able to learn nouns, verbs, adjectives, pronouns and other word classes, and to use them in expressive language. The model was validated on a corpus of 1587 input sentences, based on literature on early language assessment, at the level of a child about 4 years old, and produced 521 output sentences, expressing a broad range of language processing functionalities. PMID:26560154

  5. Adapting existing natural language processing resources for cardiovascular risk factors identification in clinical notes.

    PubMed

    Khalifa, Abdulrahman; Meystre, Stéphane

    2015-12-01

    The 2014 i2b2 natural language processing shared task focused on identifying cardiovascular risk factors such as high blood pressure, high cholesterol levels, obesity and smoking status, among other factors found in health records of diabetic patients. In addition, the task involved detecting medications and time information associated with the extracted data. This paper presents the development and evaluation of a natural language processing (NLP) application conceived for this i2b2 shared task. For increased efficiency, the application's main components were adapted from two existing NLP tools implemented in the Apache UIMA framework: Textractor (for dictionary-based lookup) and cTAKES (for preprocessing and smoking status detection). The application achieved a final (micro-averaged) F1-measure of 87.5% on the final evaluation test set. Our attempt was mostly based on existing tools adapted with minimal changes and allowed for satisfying performance with limited development efforts.
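
    As an illustration of the metric reported above, the following is a minimal sketch of a micro-averaged F1 computation. It is not the i2b2 evaluation script; the function name micro_f1, the note identifiers, and the toy labels are assumptions made for this example. Micro-averaging pools true positives, false positives, and false negatives over all documents before computing precision and recall.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-document label sets.

    gold, predicted: dict mapping a document id to a set of labels.
    (Hypothetical data layout chosen for this sketch.)
    """
    tp = fp = fn = 0
    for doc_id in set(gold) | set(predicted):
        g = gold.get(doc_id, set())
        p = predicted.get(doc_id, set())
        tp += len(g & p)   # labels found in both
        fp += len(p - g)   # predicted but not in gold
        fn += len(g - p)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
gold = {"note1": {"hypertension", "smoker"}, "note2": {"obesity"}}
pred = {"note1": {"hypertension"}, "note2": {"obesity", "smoker"}}
score = micro_f1(gold, pred)  # 2 TP, 1 FP, 1 FN
```

    With these toy labels, precision and recall are both 2/3, so the micro-averaged F1 is also 2/3.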

  6. Using Open Geographic Data to Generate Natural Language Descriptions for Hydrological Sensor Networks

    PubMed Central

    Molina, Martin; Sanchez-Soriano, Javier; Corcho, Oscar

    2015-01-01

    Providing descriptions of isolated sensors and sensor networks in natural language, understandable by the general public, is useful to help users find relevant sensors and analyze sensor data. In this paper, we discuss the feasibility of using geographic knowledge from public databases available on the Web (such as OpenStreetMap, Geonames, or DBpedia) to automatically construct such descriptions. We present a general method that uses such information to generate sensor descriptions in natural language. The results of the evaluation of our method in a national hydrologic sensor network showed that this approach is feasible and capable of generating adequate sensor descriptions with a lower development effort compared to other approaches. In the paper we also analyze certain problems that we found in public databases (e.g., heterogeneity, non-standard use of labels, or rigid search methods) and their impact on the generation of sensor descriptions. PMID:26151211

  7. New Trends in Computing Anticipatory Systems : Emergence of Artificial Conscious Intelligence with Machine Learning Natural Language

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2008-10-01

    This paper deals with the challenge of creating an Artificial Intelligence System with an Artificial Consciousness. For that, an introduction to computing anticipatory systems is presented, with the definitions of strong and weak anticipation. The quasi-anticipatory systems of Robert Rosen are linked to open-loop controllers. Then, some properties of the natural brain are presented in relation to the triune brain theory of Paul D. MacLean, and the mind time of Benjamin Libet, with his veto of the free will. The theory of the hyperincursive discrete anticipatory systems is recalled in order to introduce the concept of hyperincursive free will, which gives a similar veto mechanism: free will as unpredictable hyperincursive anticipation. The concepts of endo-anticipation and exo-anticipation are then defined. Finally, some ideas about artificial conscious intelligence with natural language are presented, in relation to the Turing Machine, Formal Language, Intelligent Agents and Multi-Agent System.

  8. Natural language processing-based COTS software and related technologies survey.

    SciTech Connect

    Stickland, Michael G.; Conrad, Gregory N.; Eaton, Shelley M.

    2003-09-01

    Natural language processing-based knowledge management software, traditionally developed for security organizations, is now becoming commercially available. An informal survey was conducted to discover and examine current NLP and related technologies and potential applications for information retrieval, information extraction, summarization, categorization, terminology management, link analysis, and visualization for possible implementation at Sandia National Laboratories. This report documents our current understanding of the technologies, lists software vendors and their products, and identifies potential applications of these technologies.

  9. Laboratory process control using natural language commands from a personal computer

    NASA Technical Reports Server (NTRS)

    Will, Herbert A.; Mackin, Michael A.

    1989-01-01

    PC software is described which provides flexible natural language process control capability with an IBM PC or compatible machine. Hardware requirements include the PC, and suitable hardware interfaces to all controlled devices. Software required includes the Microsoft Disk Operating System (MS-DOS) operating system, a PC-based FORTRAN-77 compiler, and user-written device drivers. Instructions for use of the software are given as well as a description of an application of the system.

  10. The effect of teachers' language on students' conceptions of the nature of science

    NASA Astrophysics Data System (ADS)

    Zeidler, Dana L.; Lederman, Norman G.

    Conveying an adequate conception of the nature of science to students is implicit in the broader context of what has come to be known as scientific literacy. However, it has previously been demonstrated that possession of valid conceptions of the nature of science does not necessarily result in the performance of those teaching behaviors that are related to improved student conceptions. The present study examines the possibility that the language teachers use to communicate science content may provide the context (Realist or Instrumentalist orientations) in which students come to formulate a world view of science. Eighteen high school biology teachers and one randomly selected class from each of their sections (n = 409 students) were administered pre- and posttests at the beginning and end of the fall term using the Nature of Scientific Knowledge Scale (NSKS). Composite scores of the student changes on the Testable, Developmental, and Creative subscales were used to compare those six classes that exhibited the greatest change with those six classes that had the least change on the NSKS. Intensive qualitative observations of each teacher were also conducted over the fall semester, resulting in complete transcripts of teacher-student interactions. Qualitative comparisons of classes with respect to six variables related to Realist and Instrumentalist conceptions of the nature of science were conducted. Teachers' ordinary language in the presentation of subject matter was found to have significant impact on students' conceptions of the nature of science. These variables represented different contexts (Realist-Instrumental) teachers used to express themselves, scientific information, and concepts. Determining the extent to which teachers' language has an impact on changes in students' conception of the nature of science has direct bearing on all preservice and inservice science teacher education programs.

  11. Modeling virtual organizations with Latent Dirichlet Allocation: a case for natural language processing.

    PubMed

    Gross, Alexander; Murthy, Dhiraj

    2014-10-01

    This paper explores a variety of methods for applying the Latent Dirichlet Allocation (LDA) automated topic modeling algorithm to the modeling of the structure and behavior of virtual organizations found within modern social media and social networking environments. As the field of Big Data reveals, an increase in the scale of social data available presents new challenges which are not tackled by merely scaling up hardware and software. Rather, they necessitate new methods and, indeed, new areas of expertise. Natural language processing provides one such method. This paper applies LDA to the study of scientific virtual organizations whose members employ social technologies. Because of the vast data footprint in these virtual platforms, we found that natural language processing was needed to 'unlock' and render visible latent, previously unseen conversational connections across large textual corpora (spanning profiles, discussion threads, forums, and other social media incarnations). We introduce variants of LDA and ultimately make the argument that natural language processing is a critical interdisciplinary methodology to make better sense of social 'Big Data' and we were able to successfully model nested discussion topics from forums and blog posts using LDA. Importantly, we found that LDA can move us beyond the state-of-the-art in conventional Social Network Analysis techniques. PMID:24930023

  13. Language.

    ERIC Educational Resources Information Center

    Gadlin, Barry; Nemanich, Donald

    1974-01-01

    An article and a bibliography constitute this issue of the "Illinois English Bulletin." In "Keep the Natives from Getting Restless," Barry Gadlin examines native language learning by children from infancy through high school and discusses the theories of several authors concerning the teaching of the native language. The "Bibliography of…

  14. Exploring culture, language and the perception of the nature of science

    NASA Astrophysics Data System (ADS)

    Sutherland, Dawn

    2002-01-01

    One dimension of early Canadian education is the attempt of the government to use the education system as an assimilative tool to integrate the First Nations and Métis people into Euro-Canadian society. Despite these attempts, many First Nations and Métis people retained their culture and their indigenous language. Few science educators have examined First Nations and Western scientific worldviews and the impact they may have on science learning. This study explored the views some First Nations (Cree) and Euro-Canadian Grade-7-level students in Manitoba had about the nature of science. Both qualitative (open-ended questions and interviews) and quantitative (a Likert-scale questionnaire) instruments were used to explore student views. A central hypothesis to this research programme is the possibility that the different world-views of two student populations, Cree and Euro-Canadian, are likely to influence their perceptions of science. This preliminary study explored a range of methodologies to probe the perceptions of the nature of science in these two student populations. It was found that the two cultural groups differed significantly between some of the tenets in a Nature of Scientific Knowledge Scale (NSKS). Cree students significantly differed from Euro-Canadian students on the developmental, testable and unified tenets of the nature of scientific knowledge scale. No significant differences were found in NSKS scores between language groups (Cree students who speak English in the home and those who speak English and Cree or Cree only). The differences found between language groups were primarily in the open-ended questions where preformulated responses were absent. Interviews about critical incidents provided more detailed accounts of the Cree students' perception of the nature of science. The implications of the findings of this study are discussed in relation to the challenges related to research methodology, further areas for investigation, science

  15. Computerized measurement of the content analysis of natural language for use in biomedical and neuropsychiatric research.

    PubMed

    Gottschalk, L A; Bechtel, R

    1995-07-01

    Over several decades, the senior author, with various colleagues, has developed an objective method of measuring the magnitude of commonly useful and pertinent neuropsychiatric and neuropsychological dimensions from the content and form analysis of verbal behavior and natural language. Extensive reliability and validation studies using this method have been published involving English, German, Spanish and many other languages, and which confirm that these Content Analysis Scales can be reliably scored cross-culturally and have construct validity. The validated measures include the Anxiety Scale (and six subscales), the Hostility Outward Scale (and two subscales), the Hostility In Scale, the Ambivalent Hostility Scale, the Social Alienation-Personal Disorganization Scale, the Cognitive Impairment Scale, the Depression Scale (and seven subscales), and the Hope Scale. Here, the authors report the development of artificial intelligence (LISP based) software that can reliably score these Content Analysis Scales, whose achievement facilitates the application of these measures to biomedical and neuropsychiatric research. PMID:7587159

  16. Ulisse Aldrovandi's Color Sensibility: Natural History, Language and the Lay Color Practices of Renaissance Virtuosi.

    PubMed

    Pugliano, Valentina

    2015-01-01

    Famed for his collection of drawings of naturalia and his thoughts on the relationship between painting and natural knowledge, it now appears that the Bolognese naturalist Ulisse Aldrovandi (1522-1605) also pondered specifically color and pigments, compiling not only lists and diagrams of color terms but also a full-length unpublished manuscript entitled De coloribus or Trattato dei colori. Introducing these writings for the first time, this article portrays a scholar not so much interested in the materiality of pigment production, as in the cultural history of hues. It argues that these writings constituted an effort to build a language of color, in the sense both of a standard nomenclature of hues and of a lexicon, a dictionary of their denotations and connotations as documented in the literature of ancients and moderns. This language would serve the naturalist in his artistic patronage and his natural historical studies, where color was considered one of the most reliable signs for the correct identification of specimens, and a guarantee of accuracy in their illustration. Far from being an exception, Aldrovandi's 'color sensibility' spoke of that of his university-educated nature-loving peers.

  17. A Comparison of Natural Language Processing Methods for Automated Coding of Motivational Interviewing.

    PubMed

    Tanana, Michael; Hallgren, Kevin A; Imel, Zac E; Atkins, David C; Srikumar, Vivek

    2016-06-01

    Motivational interviewing (MI) is an efficacious treatment for substance use disorders and other problem behaviors. Studies on MI fidelity and mechanisms of change typically use human raters to code therapy sessions, which requires considerable time, training, and financial costs. Natural language processing techniques have recently been utilized for coding MI sessions using machine learning techniques, rather than human coders, and preliminary results have suggested these methods hold promise. The current study extends this previous work by introducing two natural language processing models for automatically coding MI sessions via computer. The two models differ in the way they semantically represent session content, utilizing either (1) simple discrete sentence features (DSF model) or (2) more complex recursive neural networks (RNN model). Utterance- and session-level predictions from these models were compared to ratings provided by human coders using a large sample of MI sessions (N=341 sessions; 78,977 clinician and client talk turns) from 6 MI studies. Results show that the DSF model generally had slightly better performance compared to the RNN model. The DSF model had "good" or higher utterance-level agreement with human coders (Cohen's kappa>0.60) for open and closed questions, affirm, giving information, and follow/neutral (all therapist codes); considerably higher agreement was obtained for session-level indices, and many estimates were competitive with human-to-human agreement. However, there was poor agreement for client change talk, client sustain talk, and therapist MI-inconsistent behaviors. Natural language processing methods provide accurate representations of human-derived behavioral codes and could offer substantial improvements to the efficiency and scale at which MI mechanisms of change research and fidelity monitoring are conducted.
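
    To make the agreement statistic used above concrete, here is a minimal sketch of Cohen's kappa for two raters. It is not the study's analysis code; the function name cohens_kappa and the toy utterance codes are assumptions for illustration. Kappa compares the observed agreement with the agreement expected by chance from each rater's label frequencies.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty item list")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the marginal label frequencies of each rater.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy utterance codes, invented for illustration only.
human = ["open", "closed", "open", "affirm", "open", "closed"]
model = ["open", "closed", "closed", "affirm", "open", "closed"]
kappa = cohens_kappa(human, model)
```

    In this toy case the raters agree on 5 of 6 utterances, chance agreement is 13/36, and kappa works out to 17/23 (about 0.74), which would clear the 0.60 threshold quoted in the abstract.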

  18. Discovering novel causal patterns from biomedical natural-language texts using Bayesian nets.

    PubMed

    Atkinson, John; Rivas, Alejandro

    2008-11-01

    Most of the biomedicine text mining approaches do not deal with specific cause-effect patterns that may explain the discoveries. In order to fill this gap, this paper proposes an effective new model for text mining from biomedicine literature that helps to discover cause-effect hypotheses related to diseases, drugs, etc. The supervised approach combines Bayesian inference methods with natural-language processing techniques in order to generate simple and interesting patterns. The results of applying the model to biomedicine text databases and its comparison with other state-of-the-art methods are also discussed.

  19. Knowledge acquisition from natural language for expert systems based on classification problem-solving methods

    NASA Technical Reports Server (NTRS)

    Gomez, Fernando

    1989-01-01

    It is shown how certain kinds of domain independent expert systems based on classification problem-solving methods can be constructed directly from natural language descriptions by a human expert. The expert knowledge is not translated into production rules. Rather, it is mapped into conceptual structures which are integrated into long-term memory (LTM). The resulting system is one in which problem-solving, retrieval and memory organization are integrated processes. In other words, the same algorithm and knowledge representation structures are shared by these processes. As a result of this, the system can answer questions, solve problems or reorganize LTM.

  20. Evaluation of unsupervised semantic mapping of natural language with Leximancer concept mapping.

    PubMed

    Smith, Andrew E; Humphreys, Michael S

    2006-05-01

    The Leximancer system is a relatively new method for transforming lexical co-occurrence information from natural language into semantic patterns in an unsupervised manner. It employs two stages of co-occurrence information extraction (semantic and relational), using a different algorithm for each stage. The algorithms used are statistical, but they employ nonlinear dynamics and machine learning. This article is an attempt to validate the output of Leximancer, using a set of evaluation criteria taken from content analysis that are appropriate for knowledge discovery tasks. PMID:16956103
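
    The abstract does not describe Leximancer's algorithms, so the following is only a generic sketch of what "lexical co-occurrence information" can mean at its simplest: counting how often two words appear in the same sentence. The function name cooccurrence_counts and the toy sentences are assumptions for this example and are not taken from Leximancer.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count unordered word pairs that appear together in a sentence."""
    counts = Counter()
    for sentence in sentences:
        # Unique, sorted words so each pair is counted once per sentence
        # and always stored in the same (alphabetical) order.
        words = sorted(set(sentence.lower().split()))
        counts.update(combinations(words, 2))
    return counts


# Toy sentences, invented for illustration only.
sentences = [
    "language shapes thought",
    "thought shapes language",
    "language evolves",
]
pairs = cooccurrence_counts(sentences)
```

    Here the pair ("language", "shapes") is counted twice and ("evolves", "language") once; a real system would add tokenization, stop-word handling, and a statistical model on top of such counts.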

  1. Practical systems use natural languages and store human expertise (artificial intelligence)

    SciTech Connect

    Evanczuk, S.; Manuel, T.

    1983-12-01

    For earlier articles see T. Manuel et al., ibid., vol.56, no.22, p.127-37. This second part of a special report on commercial applications of artificial intelligence examines the milestones which mark this major new path for the software industry. It covers state-space search, the problem of ambiguity, augmented transition networks, early commercial products, current and expected personal computer software, natural-language interfaces, research projects, knowledge engineering, the workings of artificial-intelligence-based applications programs, LISP, attributes and object orientation.

  2. Role of PROLOG (Programming and Logic) in natural-language processing. Report for September-December 1987

    SciTech Connect

    McHale, M.L.

    1988-03-01

    The field of artificial intelligence strives to produce computer programs that exhibit intelligent behavior. One of the areas of interest is the processing of natural language. This report discusses the role of the computer language PROLOG in Natural Language Processing (NLP) from both theoretic and pragmatic viewpoints. The reasons for using PROLOG for NLP are numerous. First, linguists can write natural-language grammars almost directly as PROLOG programs; this allows fast prototyping of NLP systems and facilitates analysis of NLP theories. Second, semantic representations of natural-language texts that use logic formalisms are readily produced in PROLOG because of PROLOG's logical foundations. Third, PROLOG's built-in inferencing mechanisms are often sufficient for inferences on the logical forms produced by NLPs. Fourth, the logical, declarative nature of PROLOG may make it the language of choice for parallel computing systems. Finally, the fact that PROLOG has a de facto standard (Edinburgh) makes the porting of code from one computer system to another virtually trouble free. Perhaps the strongest tie one could make between NLP and PROLOG was stated by John Stuart Mill in his Inaugural Address at St. Andrews: The structure of every sentence is a lesson in logic.

  3. Validation of the "Chinese Language Classroom Learning Environment Inventory" for Investigating the Nature of Chinese Language Classrooms

    ERIC Educational Resources Information Center

    Lian, Chua Siew; Wong, Angela F. L.; Der-Thanq, Victor Chen

    2006-01-01

    The Chinese Language Classroom Environment Inventory (CLCEI) is a bilingual instrument developed for use in measuring students' and teachers' perceptions toward their Chinese Language classroom learning environments in Singapore secondary schools. The English version of the CLCEI was customised from the English version of the "What is happening in…

  4. Neural Network Processing of Natural Language: I. Sensitivity to Serial, Temporal, and Abstract Structure of Language in the Infant.

    ERIC Educational Resources Information Center

    Dominey, Peter Ford; Ramus, Franck

    2000-01-01

    Demonstrates how innate representational capabilities for serial and temporal structure of language could arise from a common neural architecture, distinct from that required for the representation of abstract structure, and provides a predictive testable model of the initial computational state of the language learner. (Author/VWL)

  5. A Principled Framework for Constructing Natural Language Interfaces To Temporal Databases

    NASA Astrophysics Data System (ADS)

    Androutsopoulos, Ion

    1996-09-01

    Most existing natural language interfaces to databases (NLIDBs) were designed to be used with "snapshot" database systems, that provide very limited facilities for manipulating time-dependent data. Consequently, most NLIDBs also provide very limited support for the notion of time. The database community is becoming increasingly interested in temporal database systems. These are intended to store and manipulate in a principled manner information not only about the present, but also about the past and future. This thesis develops a principled framework for constructing English NLIDBs for temporal databases (NLITDBs), drawing on research in tense and aspect theories, temporal logics, and temporal databases. I first explore temporal linguistic phenomena that are likely to appear in English questions to NLITDBs. Drawing on existing linguistic theories of time, I formulate an account for a large number of these phenomena that is simple enough to be embodied in practical NLITDBs. Exploiting ideas from temporal logics, I then define a temporal meaning representation language, TOP, and I show how the HPSG grammar theory can be modified to incorporate the tense and aspect account of this thesis, and to map a wide range of English questions involving time to appropriate TOP expressions. Finally, I present and prove the correctness of a method to translate from TOP to TSQL2, TSQL2 being a temporal extension of the SQL-92 database language. This way, I establish a sound route from English questions involving time to a general-purpose temporal database language, that can act as a principled framework for building NLITDBs. To demonstrate that this framework is workable, I employ it to develop a prototype NLITDB, implemented using ALE and Prolog.

  6. Language of the Earth: Exploring Natural Hazards through a Literary Anthology

    NASA Astrophysics Data System (ADS)

    Malamud, B. D.; Rhodes, F. H. T.

    2009-04-01

    This paper explores natural hazards teaching and communications through the use of a literary anthology of writings about the earth aimed at non-experts. Teaching natural hazards in high-school and university introductory Earth Science and Geography courses revolves mostly around lectures, examinations, and laboratory demonstrations/activities. Often the results of such a course are that a student 'memorizes' the answers, and is penalized when they miss a given fact [e.g., "You lost one point because you were off by 50 km/hr on the wind speed of an F5 tornado."] Although facts and general methodologies are certainly important when teaching natural hazards, a student's assimilation of, and enthusiasm for, this knowledge is strongly motivated when it is supplemented by writings about the Earth. In this paper, we discuss a literary anthology which we developed [Language of the Earth, Rhodes, Stone, Malamud, Wiley-Blackwell, 2008] which includes many descriptions about natural hazards. Using first- and second-hand accounts of landslides, earthquakes, tsunamis, floods and volcanic eruptions, through the writings of McPhee, Gaskill, Voltaire, Austin, Cloos, and many others, hazards become 'alive', and more than 'just' a compilation of facts and processes. Using short excerpts such as these, or other similar anthologies, of remarkably written accounts and discussions about natural hazards results in 'dry' facts becoming more than just facts. These often highly personal viewpoints of our catastrophic world provide a useful supplement to a student's understanding of the turbulent world in which we live.

  7. Comparative study on the customization of natural language interfaces to databases.

    PubMed

    Pazos R, Rodolfo A; Aguirre L, Marco A; González B, Juan J; Martínez F, José A; Pérez O, Joaquín; Verástegui O, Andrés A

    2016-01-01

    In the last decades the popularity of natural language interfaces to databases (NLIDBs) has increased, because in many cases information obtained from them is used for making important business decisions. Unfortunately, the complexity of their customization by database administrators makes them difficult to use. In order for an NLIDB to obtain a high percentage of correctly translated queries, it is necessary that it is correctly customized for the database to be queried. In most cases the performance reported in NLIDB literature is the highest possible; i.e., the performance obtained when the interfaces were customized by the implementers. However, for end users, what matters more is the performance that the interface can yield when the NLIDB is customized by someone different from the implementers. Unfortunately, there exist very few articles that report NLIDB performance when the NLIDBs are not customized by the implementers. This article presents a semantically-enriched data dictionary (which permits solving many of the problems that occur when translating from natural language to SQL) and an experiment in which two groups of undergraduate students customized our NLIDB and English language frontend (ELF), considered one of the best available commercial NLIDBs. The experimental results show that, when customized by the first group, our NLIDB correctly answered 44.69% of queries and ELF 11.83% for the ATIS database, and when customized by the second group, our NLIDB attained 77.05% and ELF 13.48%. The performance attained by our NLIDB when customized by ourselves was 90%. PMID:27190752

  8. Formal ontology for natural language processing and the integration of biomedical databases.

    PubMed

    Simon, Jonathan; Dos Santos, Mariana; Fielding, James; Smith, Barry

    2006-01-01

    The central hypothesis underlying this communication is that the methodology and conceptual rigor of a philosophically inspired formal ontology can bring significant benefits in the development and maintenance of application ontologies [A. Flett, M. Dos Santos, W. Ceusters, Some Ontology Engineering Procedures and their Supporting Technologies, EKAW2002, 2003]. This hypothesis has been tested in the collaboration between Language and Computing (L&C), a company specializing in software for supporting natural language processing especially in the medical field, and the Institute for Formal Ontology and Medical Information Science (IFOMIS), an academic research institution concerned with the theoretical foundations of ontology. In the course of this collaboration L&C's ontology, LinKBase, which is designed to integrate and support reasoning across a plurality of external databases, has been subjected to a thorough auditing on the basis of the principles underlying IFOMIS's Basic Formal Ontology (BFO) [B. Smith, Basic Formal Ontology, 2002. http://ontology.buffalo.edu/bfo]. The goal is to transform a large terminology-based ontology into one with the ability to support reasoning applications. Our general procedure has been the implementation of a meta-ontological definition space in which the definitions of all the concepts and relations in LinKBase are standardized in the framework of first-order logic. In this paper we describe how this principles-based standardization has led to a greater degree of internal coherence of the LinKBase structure, and how it has facilitated the construction of mappings between external databases using LinKBase as translation hub. We argue that the collaboration here described represents a new phase in the quest to solve the so-called "Tower of Babel" problem of ontology integration [F. Montayne, J. Flanagan, Formal Ontology: The Foundation for Natural Language Processing, 2003. http://www.landcglobal.com/].

  9. Knowledge-based machine indexing from natural language text: Knowledge base design, development, and maintenance

    NASA Technical Reports Server (NTRS)

    Genuardi, Michael T.

    1993-01-01

    One strategy for machine-aided indexing (MAI) is to provide a concept-level analysis of the textual elements of documents or document abstracts. In such systems, natural-language phrases are analyzed in order to identify and classify concepts related to a particular subject domain. The overall performance of these MAI systems is largely dependent on the quality and comprehensiveness of their knowledge bases. These knowledge bases function to (1) define the relations between a controlled indexing vocabulary and natural language expressions; (2) provide a simple mechanism for disambiguation and the determination of relevancy; and (3) allow the extension of concept-hierarchical structure to all elements of the knowledge file. After a brief description of the NASA Machine-Aided Indexing system, concerns related to the development and maintenance of MAI knowledge bases are discussed. Particular emphasis is given to statistically-based text analysis tools designed to aid the knowledge base developer. One such tool, the Knowledge Base Building (KBB) program, presents the domain expert with a well-filtered list of synonyms and conceptually-related phrases for each thesaurus concept. Another tool, the Knowledge Base Maintenance (KBM) program, functions to identify areas of the knowledge base affected by changes in the conceptual domain (for example, the addition of a new thesaurus term). An alternate use of the KBM as an aid in thesaurus construction is also discussed.

  10. Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records.

    PubMed

    Luo, Yuan; Szolovits, Peter

    2016-01-01

    In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing on narrative clinical notes in electronic medical records (EMRs) and efficiently retrieving such annotations that satisfy position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem as the interval query problem, for which the optimal query/update time is in general logarithmic. We next perform a tight time complexity analysis on the basic interval tree query algorithm and show its nonoptimality when applied to a collection of 13 query types from Allen's interval algebra. We then study two closely related state-of-the-art interval query algorithms, proposed query reformulations, and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic time stabbing-max query time complexity and solves the stabbing-interval query tasks on all of Allen's relations in logarithmic time, attaining the theoretic lower bound. Updating time is kept logarithmic and the space requirement is kept linear at the same time. We also discuss interval management in external memory models and higher dimensions. PMID:27478379
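
    The stand-off data model and a stabbing query can be illustrated with a minimal sketch. This is not the authors' logarithmic-time algorithm: it uses a plain linear scan, and the `Annotation` class, the labels, and the sample sentence are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-off annotation: offsets into the text, content kept apart."""
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str


def stabbing_query(annotations, position):
    """Return every annotation whose span covers `position` (naive O(n) scan)."""
    return [a for a in annotations if a.start <= position < a.end]


def finishes(a, b):
    """Allen's 'finishes' relation: a starts after b starts and both end together."""
    return a.start > b.start and a.end == b.end


# Toy clinical sentence; the text itself is never modified by the annotations.
text = "Patient denies chest pain."
notes = [
    Annotation(0, 7, "PERSON_ROLE"),       # "Patient"
    Annotation(15, 25, "SYMPTOM"),         # "chest pain"
    Annotation(8, 25, "NEGATED_FINDING"),  # "denies chest pain"
]

hits = stabbing_query(notes, 16)
print([a.label for a in hits])
print(finishes(notes[1], notes[2]))
```

    An interval tree or a similar index would replace the linear scan when the number of annotations is large; the sketch only shows what is being queried.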

  11. Linking sounds to meanings: Infant statistical learning in a natural language

    PubMed Central

    Hay, Jessica F.; Pelucchi, Bruna; Estes, Katharine Graf; Saffran, Jenny R.

    2011-01-01

    The processes of infant word segmentation and infant word learning have largely been studied separately. However, the ease with which potential word forms are segmented from fluent speech seems likely to influence subsequent mappings between words and their referents. To explore this process, we tested the link between the statistical coherence of sequences presented in fluent speech and infants’ subsequent use of those sequences as labels for novel objects. Notably, the materials were drawn from a natural language unfamiliar to the infants (Italian). The results of three experiments suggest that there is a close relationship between the statistics of the speech stream and subsequent mapping of labels to referents. Mapping was facilitated when the labels contained high transitional probabilities in the forward and/or backward direction (Experiment 1). When no transitional probability information was available (Experiment 2), or when the internal transitional probabilities of the labels were low in both directions (Experiment 3), infants failed to link the labels to their referents. Word learning appears to be strongly influenced by infants’ prior experience with the distribution of sounds that make up words in natural languages. PMID:21762650
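
    Forward and backward transitional probabilities can be computed from syllable counts, as in the sketch below. It assumes the usual definitions, which the abstract does not spell out: forward TP of a pair XY is count(XY) / count(X), and backward TP is count(XY) / count(Y). The syllable stream is a toy example, not the study's materials.

```python
from collections import Counter


def transitional_probabilities(syllables, x, y):
    """Return (forward, backward) transitional probability for the pair x -> y.

    Assumed definitions: forward = count(xy) / count(x),
    backward = count(xy) / count(y).
    """
    unigrams = Counter(syllables)
    bigrams = Counter(zip(syllables, syllables[1:]))
    pair = bigrams[(x, y)]
    forward = pair / unigrams[x] if unigrams[x] else 0.0
    backward = pair / unigrams[y] if unigrams[y] else 0.0
    return forward, backward


# Toy stream: "ca" occurs three times, twice followed by "sa";
# "sa" occurs twice, both times preceded by "ca".
stream = ["ca", "sa", "bi", "ci", "ca", "sa", "fu", "ga", "ca", "ro"]
fwd, bwd = transitional_probabilities(stream, "ca", "sa")
print(fwd, bwd)
```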

  12. Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records

    PubMed Central

    Luo, Yuan; Szolovits, Peter

    2016-01-01

    In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing on narrative clinical notes in electronic medical records (EMRs) and efficiently retrieving such annotations that satisfy position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem as the interval query problem, for which the optimal query/update time is in general logarithmic. We next perform a tight time complexity analysis on the basic interval tree query algorithm and show its nonoptimality when applied to a collection of 13 query types from Allen’s interval algebra. We then study two closely related state-of-the-art interval query algorithms, proposed query reformulations, and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic time stabbing-max query time complexity and solves the stabbing-interval query tasks on all of Allen’s relations in logarithmic time, attaining the theoretic lower bound. Updating time is kept logarithmic and the space requirement is kept linear at the same time. We also discuss interval management in external memory models and higher dimensions. PMID:27478379

  13. A study of the very high order natural user language (with AI capabilities) for the NASA space station common module

    NASA Technical Reports Server (NTRS)

    Gill, E. N.

    1986-01-01

    The requirements are identified for a very high order natural language to be used by crew members on board the Space Station. The hardware facilities, databases, realtime processes, and software support are discussed. The operations and capabilities that will be required in both normal (routine) and abnormal (nonroutine) situations are evaluated. A structure and syntax for an interface (front-end) language to satisfy the above requirements are recommended.

  14. Language.

    PubMed

    Cattaneo, Luigi

    2013-01-01

    Noninvasive focal brain stimulation by means of transcranial magnetic stimulation (TMS) has been used extensively in the past 20 years to investigate normal language functions. The picture emerging from this collection of empirical works is that of several independent modular functions mapped on left-lateralized temporofrontal circuits originating dorsally or ventrally to the auditory cortex. The identification of sounds as language (i.e., phonological transformations) is modulated by TMS applied over the posterior-superior temporal cortex and over the caudal inferior frontal gyrus/ventral premotor cortex complex. Conversely, attribution of semantics to words is modulated successfully by applying TMS to the rostral part of the inferior frontal gyrus. Speech production is typically interfered with by TMS applied to the left inferior frontal gyrus, onto the same cortical areas that also contain phonological representations. The cortical mapping of grammatical functions has been investigated with TMS mainly regarding the category of verbs, which seem to be represented in the left middle frontal gyrus. Most TMS studies have investigated the cortical processing of single words or sublexical elements. Conversely, complex elements of language such as syntax have not been investigated extensively, although a few studies have indicated a left temporal, frontal, and parietal system also involving the neocerebellar cortex. Finally, both the perception and production of nonlinguistic communicative properties of speech, such as prosody, have been mapped by TMS in the peri-Silvian region of the right hemisphere. PMID:24112933

  15. Neural substrates of figurative language during natural speech perception: an fMRI study

    PubMed Central

    Nagels, Arne; Kauschke, Christina; Schrauf, Judith; Whitney, Carin; Straube, Benjamin; Kircher, Tilo

    2013-01-01

    Many figurative expressions are fully conventionalized in everyday speech. Regarding the neural basis of figurative language processing, research has predominantly focused on metaphoric expressions in minimal semantic context. It remains unclear in how far metaphoric expressions during continuous text comprehension activate similar neural networks as isolated metaphors. We therefore investigated the processing of similes (figurative language, e.g., “He smokes like a chimney!”) occurring in a short story. Sixteen healthy, male, native German speakers listened to similes that came about naturally in a short story, while blood-oxygenation-level-dependent (BOLD) responses were measured with functional magnetic resonance imaging (fMRI). For the event-related analysis, similes were contrasted with non-figurative control sentences (CS). The stimuli differed with respect to figurativeness, while they were matched for frequency of words, number of syllables, plausibility, and comprehensibility. Similes contrasted with CS resulted in enhanced BOLD responses in the left inferior (IFG) and adjacent middle frontal gyrus. Concrete CS as compared to similes activated the bilateral middle temporal gyri as well as the right precuneus and the left middle frontal gyrus (LMFG). Activation of the left IFG for similes in a short story is consistent with results on single sentence metaphor processing. The findings strengthen the importance of the left inferior frontal region in the processing of abstract figurative speech during continuous, ecologically-valid speech comprehension; the processing of concrete semantic contents goes along with a down-regulation of bilateral temporal regions. PMID:24065897

  16. HUNTER-GATHERER: Three search techniques integrated for natural language semantics

    SciTech Connect

    Beale, S.; Nirenburg, S.; Mahesh, K.

    1996-12-31

    This work integrates three related AI search techniques - constraint satisfaction, branch-and-bound and solution synthesis - and applies the result to semantic processing in natural language (NL). We summarize the approach as "Hunter-Gatherer": (1) branch-and-bound and constraint satisfaction allow us to "hunt down" non-optimal and impossible solutions and prune them from the search space; (2) solution synthesis methods then "gather" all optimal solutions, avoiding exponential complexity. Each of the three techniques is briefly described, as well as their extensions and combinations used in our system. We focus on the combination of solution synthesis and branch-and-bound methods, which has enabled near-linear-time processing in our applications. Finally, we illustrate how the use of our technique in a large-scale MT project allowed a drastic reduction in search space.

  17. Extracting important information from Chinese Operation Notes with natural language processing methods.

    PubMed

    Wang, Hui; Zhang, Weide; Zeng, Qiang; Li, Zuofeng; Feng, Kaiyan; Liu, Lei

    2014-04-01

    Extracting information from unstructured clinical narratives is valuable for many clinical applications. Although natural language processing (NLP) methods have been studied extensively in electronic medical records (EMR), few studies have explored NLP for extracting information from Chinese clinical narratives. In this study, we report the development and evaluation of a method for extracting tumor-related information from operation notes of hepatic carcinomas written in Chinese. Using 86 operation notes manually annotated by physicians as the training set, we explored both rule-based and supervised machine-learning approaches. Evaluated on 29 unseen operation notes, our best approach yielded 69.6% precision, 58.3% recall and a 63.5% F-score.
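
    The reported figures are consistent with one another if the F-score is the balanced F1 (the harmonic mean of precision and recall); that reading is an assumption, since the abstract does not name the variant. A short check:

```python
def f_score(precision, recall, beta=1.0):
    """General F-beta score; beta=1.0 gives the balanced F1 (harmonic mean)."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract.
f1 = f_score(0.696, 0.583)
print(round(f1 * 100, 1))  # matches the reported 63.5
```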

  18. Towards symbiosis in knowledge representation and natural language processing for structuring clinical practice guidelines.

    PubMed

    Weng, Chunhua; Payne, Philip R O; Velez, Mark; Johnson, Stephen B; Bakken, Suzanne

    2014-01-01

    The successful adoption by clinicians of evidence-based clinical practice guidelines (CPGs) contained in clinical information systems requires efficient translation of free-text guidelines into computable formats. Natural language processing (NLP) has the potential to improve the efficiency of such translation. However, it is laborious to develop NLP to structure free-text CPGs using existing formal knowledge representations (KR). In response to this challenge, this vision paper discusses the value and feasibility of supporting symbiosis in text-based knowledge acquisition (KA) and KR. We compare two ontologies: (1) an ontology manually created by domain experts for CPG eligibility criteria and (2) an upper-level ontology derived from a semantic pattern-based approach for automatic KA from CPG eligibility criteria text. Then we discuss the strengths and limitations of interweaving KA and NLP for KR purposes and important considerations for achieving the symbiosis of KR and NLP for structuring CPGs to achieve evidence-based clinical practice.

  19. Workshop on using natural language processing applications for enhancing clinical decision making: an executive summary

    PubMed Central

    Pai, Vinay M; Rodgers, Mary; Conroy, Richard; Luo, James; Zhou, Ruixia; Seto, Belinda

    2014-01-01

    In April 2012, the National Institutes of Health organized a two-day workshop entitled ‘Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making’ (NLP-CDS). This report is a summary of the discussions during the second day of the workshop. Collectively, the workshop presenters and participants emphasized the need for unstructured clinical notes to be included in the decision making workflow and the need for individualized longitudinal data tracking. The workshop also discussed the need to: (1) combine evidence-based literature and patient records with machine-learning and prediction models; (2) provide trusted and reproducible clinical advice; (3) prioritize evidence and test results; and (4) engage healthcare professionals, caregivers, and patients. The overall consensus of the NLP-CDS workshop was that there are promising opportunities for NLP and CDS to deliver cognitive support for healthcare professionals, caregivers, and patients. PMID:23921193

  20. Knowledge Extraction from MEDLINE by Combining Clustering with Natural Language Processing

    PubMed Central

    Miñarro-Giménez, Jose A.; Kreuzthaler, Markus; Schulz, Stefan

    2015-01-01

    The identification of relevant predicates between co-occurring concepts in scientific literature databases like MEDLINE is crucial for using these sources for knowledge extraction, in order to obtain meaningful biomedical predications as subject-predicate-object triples. We consider the manually assigned MeSH indexing terms (main headings and subheadings) in MEDLINE records as a rich resource for extracting a broad range of domain knowledge. In this paper, we explore the combination of a clustering method for co-occurring concepts based on their related MeSH subheadings in MEDLINE with the use of SemRep, a natural language processing engine, which extracts predications from free text documents. As a result, we generated sets of clusters of co-occurring concepts and identified the most significant predicates for each cluster. The association of such predicates with the co-occurrences of the resulting clusters produces the list of predications, which were checked for relevance. PMID:26958228

  1. On application of image analysis and natural language processing for music search

    NASA Astrophysics Data System (ADS)

    Gwardys, Grzegorz

    2013-10-01

    In this paper, I investigate the problem of finding the most similar music tracks using techniques popular in natural language processing, such as TF-IDF and LDA. I define a document as a music track. Each music track is transformed into a spectrogram, which allows me to use well-known techniques to extract words from images. I used the SURF operation to detect characteristic points and a novel approach for their description. Standard k-means was used for clustering. Clustering here is equivalent to building a dictionary, so afterwards I can transform spectrograms into text documents and perform TF-IDF and LDA. Finally, I can make a query in the resulting vector space. The research was done on 16 music tracks for training and 336 for testing, which are split into four categories: Hiphop, Jazz, Metal and Pop. Although the technique used is completely unsupervised, the results are satisfactory and encourage further research.
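
    The final stage of such a pipeline (TF-IDF weighting followed by a similarity query) might look like the sketch below. It assumes the SURF detection and k-means clustering have already mapped each track to a list of cluster IDs ("visual words"); the IDs are invented for illustration, the TF-IDF variant is one common choice rather than the author's, and LDA is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Map each document (a list of visual-word IDs) to a sparse TF-IDF dict."""
    n = len(documents)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for doc in documents for word in set(doc))
    vectors = []
    for doc in documents:
        tf = Counter(doc)
        vectors.append(
            {w: (c / len(doc)) * math.log(n / df[w]) for w, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Hypothetical visual-word IDs per track (stand-ins for k-means cluster labels).
tracks = [
    [3, 3, 7, 1],  # track 0
    [3, 7, 7, 2],  # track 1: shares words 3 and 7 with track 0
    [5, 5, 6, 4],  # track 2: shares nothing with track 0
]
vecs = tfidf_vectors(tracks)
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))
```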

  2. HF-Explain: a natural language generation system for explaining a medical expert system.

    PubMed Central

    Lewin, H. C.

    1991-01-01

    Causal models have been used, with considerable success, to reason in the medical domain. While these systems typically have a robust reasoning mechanism and knowledge base about their specific area of expertise, their ability to satisfactorily explain their results in a meaningful, coherent and concise manner has been less impressive than their diagnostic capabilities. This paper describes a program, HF-Explain, that generates natural language explanations of one such system--the Heart Failure Program. HF-Explain is loosely based on work done by McKeown in the Text system, using augmented transition networks (ATN) as a formalism to guide the explanation process. The result is a coherent, concise, accurate and rich explanation of the Heart Failure Program's diagnostic hypotheses. PMID:1807682

  3. The Archaeotools project: faceted classification and natural language processing in an archaeological context.

    PubMed

    Jeffrey, S; Richards, J; Ciravegna, F; Waller, S; Chapman, S; Zhang, Z

    2009-06-28

    This paper describes 'Archaeotools', a major e-Science project in archaeology. The aim of the project is to use faceted classification and natural language processing to create an advanced infrastructure for archaeological research. The project aims to integrate over 1 × 10^6 structured database records referring to archaeological sites and monuments in the UK, with information extracted from semi-structured grey literature reports, and unstructured antiquarian journal accounts, in a single faceted browser interface. The project has illuminated the variable level of vocabulary control and standardization that currently exists within national and local monument inventories. Nonetheless, it has demonstrated that the relatively well-defined ontologies and thesauri that exist in archaeology mean that a high level of success can be achieved using information extraction techniques. This has great potential for unlocking and making accessible the information held in grey literature and antiquarian accounts, and has lessons for allied disciplines.

  4. Interset: A natural language interface for teleoperated robotic assembly of the EASE space structure

    NASA Technical Reports Server (NTRS)

    Boorsma, Daniel K.

    1989-01-01

    A teleoperated robot was used to assemble the Experimental Assembly of Structures in Extra-vehicular activity (EASE) space structure under neutral buoyancy conditions, simulating a telerobot performing structural assembly in the zero gravity of space. This previous work used a manually controlled teleoperator as a test bed for system performance evaluations. From these results several Artificial Intelligence options were proposed. One of these was further developed into a real time assembly planner. The interface for this system is effective in assembling EASE structures using windowed graphics and a set of networked menus. As the problem space becomes more complex and hence the set of control options increases, a natural language interface may prove to be beneficial to supplement the menu based control strategy. This strategy can be beneficial in situations such as: describing the local environment, maintaining a data base of task event histories, modifying a plan or a heuristic dynamically, summarizing a task in English, or operating in a novel situation.

  5. DBPQL: A view-oriented query language for the Intel Data Base Processor

    NASA Technical Reports Server (NTRS)

    Fishwick, P. A.

    1983-01-01

    An interactive query language (DBPQL) for the Intel Data Base Processor (DBP) is defined. DBPQL includes a parser generator package which permits the analyst to easily create and manipulate the query statement syntax and semantics. The prototype language, DBPQL, includes trace and performance commands to aid the analyst when implementing new commands and analyzing the execution characteristics of the DBP. The DBPQL grammar file and associated key procedures are included as an appendix to this report.

  6. Wikipedia and Medicine: Quantifying Readership, Editors, and the Significance of Natural Language

    PubMed Central

    West, Andrew G

    2015-01-01

    Background Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of health care information by both professionals and the lay public. Objective This paper quantifies the production and consumption of Wikipedia’s medical content along 4 dimensions. First, we measured the amount of medical content in both articles and bytes and, second, the citations that supported that content. Third, we analyzed the medical readership against that of other health care websites between Wikipedia’s natural language editions and its relationship with disease prevalence. Fourth, we surveyed the quantity/characteristics of Wikipedia’s medical contributors, including year-over-year participation trends and editor demographics. Methods Using a well-defined categorization infrastructure, we identified medically pertinent English-language Wikipedia articles and links to their foreign language equivalents. With these, Wikipedia can be queried to produce metadata and full texts for entire article histories. Wikipedia also makes available hourly reports that aggregate reader traffic at per-article granularity. An online survey was used to determine the background of contributors. Standard mining and visualization techniques (eg, aggregation queries, cumulative distribution functions, and/or correlation metrics) were applied to each of these datasets. Analysis focused on year-end 2013, but historical data permitted some longitudinal analysis. Results Wikipedia’s medical content (at the end of 2013) was made up of more than 155,000 articles and 1 billion bytes of text across more than 255 languages. This content was supported by more than 950,000 references. Content was viewed more than 4.88 billion times in 2013. This makes it one of if not the most viewed medical resource(s) globally. The core editor community numbered less than 300 and declined over the past 5 years. The members of this

  7. LABORATORY PROCESS CONTROLLER USING NATURAL LANGUAGE COMMANDS FROM A PERSONAL COMPUTER

    NASA Technical Reports Server (NTRS)

    Will, H.

    1994-01-01

    The complex environment of the typical research laboratory requires flexible process control. This program provides natural language process control from an IBM PC or compatible machine. Sometimes process control schedules require changes frequently, even several times per day. These changes may include adding, deleting, and rearranging steps in a process. This program sets up a process control system that can either run without an operator, or be run by workers with limited programming skills. The software system includes three programs. Two of the programs, written in FORTRAN77, record data and control research processes. The third program, written in Pascal, generates the FORTRAN subroutines used by the other two programs to identify the user commands with the user-written device drivers. The software system also includes an input data set which allows the user to define the user commands which are to be executed by the computer. To set the system up the operator writes device driver routines for all of the controlled devices. Once set up, this system requires only an input file containing natural language command lines which tell the system what to do and when to do it. The operator can make up custom commands for operating and taking data from external research equipment at any time of the day or night without the operator in attendance. This process control system requires a personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. The program requires a FORTRAN77 compiler and user-written device drivers. This program was developed in 1989 and has a memory requirement of about 62 Kbytes.

  8. Integrating natural language processing and web GIS for interactive knowledge domain visualization

    NASA Astrophysics Data System (ADS)

    Du, Fangming

    Recent years have seen a powerful shift towards data-rich environments throughout society. This has extended to a change in how the artifacts and products of scientific knowledge production can be analyzed and understood. Bottom-up approaches are on the rise that combine access to huge amounts of academic publications with advanced computer graphics and data processing tools, including natural language processing. Knowledge domain visualization is one of those multi-technology approaches, with its aim of turning domain-specific human knowledge into highly visual representations in order to better understand the structure and evolution of domain knowledge. For example, network visualizations built from co-author relations contained in academic publications can provide insight on how scholars collaborate with each other in one or multiple domains, and visualizations built from the text content of articles can help us understand the topical structure of knowledge domains. These knowledge domain visualizations need to support interactive viewing and exploration by users. Such spatialization efforts are increasingly looking to geography and GIS as a source of metaphors and practical technology solutions, even when non-georeferenced information is managed, analyzed, and visualized. When it comes to deploying spatialized representations online, web mapping and web GIS can provide practical technology solutions for interactive viewing of knowledge domain visualizations, from panning and zooming to the overlay of additional information. This thesis presents a novel combination of advanced natural language processing - in the form of topic modeling - with dimensionality reduction through self-organizing maps and the deployment of web mapping/GIS technology towards intuitive, GIS-like, exploration of a knowledge domain visualization. A complete workflow is proposed and implemented that processes any corpus of input text documents into a map form and leverages a web

  9. Automated access to a large medical dictionary: online assistance for research and application in natural language processing.

    PubMed

    McCray, A T; Srinivasan, S

    1990-04-01

    Online dictionaries can be important tools for research and application in natural language processing. This paper describes work with a machine-readable version of "Dorland's Illustrated Medical Dictionary". First the characteristics of the dictionary are briefly described, and then the complex process of converting the tape to an online interactive dictionary is discussed. The results of several experiments in automatically deriving information from the online dictionary are presented, and the paper ends with a discussion of the use of the online dictionary as a tool in the development of a natural language processing system designed for the biomedical domain.

  10. The Usual and the Unusual: Solving Remote Associates Test Tasks Using Simple Statistical Natural Language Processing Based on Language Use

    ERIC Educational Resources Information Center

    Klein, Ariel; Badia, Toni

    2015-01-01

    In this study we show how complex creative relations can arise from fairly frequent semantic relations observed in everyday language. By doing this, we reflect on some key cognitive aspects of linguistic and general creativity. In our experimentation, we automated the process of solving a battery of Remote Associates Test tasks. By applying…

  11. A Scheme For Assessing The Nature Of A Young Child's Language Competence

    ERIC Educational Resources Information Center

    McFetridge, Patricia A.

    1974-01-01

    Article considered recent research in language conducted by a teacher educator from St. Lucia, West Indies. Her specific focus was on the methods devised for collection and analysis of language samples. (Author/RK)

  12. Genes, language, and the nature of scientific explanations: the case of Williams syndrome.

    PubMed

    Musolino, Julien; Landau, Barbara

    2012-01-01

    In this article, we discuss two experiments of nature and their implications for the sciences of the mind. The first, Williams syndrome, bears on one of cognitive science's holy grails: the possibility of unravelling the causal chain between genes and cognition. We sketch the outline of a general framework to study the relationship between genes and cognition, focusing as our case study on the development of language in individuals with Williams syndrome. Our approach emphasizes the role of three key ingredients: the need to specify a clear level of analysis, the need to provide a theoretical account of the relevant cognitive structure at that level, and the importance of the (typical) developmental process itself. The promise offered by the case of Williams syndrome has also given rise to two strongly conflicting theoretical approaches-modularity and neuroconstructivism-themselves offshoots of a perennial debate between nativism and empiricism. We apply our framework to explore the tension created by these two conflicting perspectives. To this end, we discuss a second experiment of nature, which allows us to compare the two competing perspectives in what comes close to a controlled experimental setting. From this comparison, we conclude that the "meaningful debate assumption", a widespread assumption suggesting that neuroconstructivism and modularity address the same questions and represent genuine theoretical alternatives, rests on a fallacy.

  13. Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus.

    PubMed

    Comeau, Donald C; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W John

    2014-01-01

    BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net. PMID:24935050

  14. Statistical Learning in a Natural Language by 8-Month-Old Infants

    ERIC Educational Resources Information Center

    Pelucchi, Bruna; Hay, Jessica F.; Saffran, Jenny R.

    2009-01-01

    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real…

  15. A Feasibility Study of the Case Hierarchy Model for the Construction and Porting of Natural Language Interfaces.

    ERIC Educational Resources Information Center

    Haas, Stephanie W.

    1990-01-01

    Describes the Case Hierarchy, a model of the case system of unconstrained natural language, and ways in which the case system is specialized in a restricted domain. Results of a feasibility study which examined the utility of the Case Hierarchy and the Case Hierarchy Tool (an intelligent editor supporting the domain analysis process) are…

  16. An Examination of Natural Language as a Query Formation Tool for Retrieving Information on E-Health from Pub Med.

    ERIC Educational Resources Information Center

    Peterson, Gabriel M.; Su, Kuichun; Ries, James E.; Sievert, Mary Ellen C.

    2002-01-01

    Discussion of Internet use for information searches on health-related topics focuses on a study that examined complexity and variability of natural language in using search terms that express the concept of electronic health (e-health). Highlights include precision of retrieved information; shift in terminology; and queries using the Pub Med…

  17. Does It Really Matter whether Students' Contributions Are Spoken versus Typed in an Intelligent Tutoring System with Natural Language?

    ERIC Educational Resources Information Center

    D'Mello, Sidney K.; Dowell, Nia; Graesser, Arthur

    2011-01-01

    There is the question of whether learning differs when students speak versus type their responses when interacting with intelligent tutoring systems with natural language dialogues. Theoretical bases exist for three contrasting hypotheses. The "speech facilitation" hypothesis predicts that spoken input will "increase" learning, whereas the "text…

  18. Evaluation of Automated Natural Language Processing in the Further Development of Science Information Retrieval. String Program Reports No. 10.

    ERIC Educational Resources Information Center

    Sager, Naomi

    This investigation matches the emerging techniques in computerized natural language processing against emerging needs for such techniques in the information field to evaluate and extend such techniques for future applications and to establish a basis and direction for further research toward these goals. An overview describes developments in the…

  19. Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis

    PubMed Central

    Mowery, D.; South, B. R.; Kvist, M.; Dalianis, H.

    2015-01-01

    Summary. Objectives: We present a review of recent advances in clinical Natural Language Processing (NLP), with a focus on semantic analysis and key subtasks that support such analysis. Methods: We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant referenced publications from the included papers. Results: Significant articles published within this time-span were included and are discussed from the perspective of semantic analysis. Three key clinical NLP subtasks that enable such analysis were identified: 1) developing more efficient methods for corpus creation (annotation and de-identification), 2) generating building blocks for extracting meaning (morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical utility (NLP applications and infrastructure for clinical use cases). Finally, we provide a reflection upon most recent developments and potential areas of future NLP development and applications. Conclusions: There has been an increase of advances within key NLP subtasks that support semantic analysis. Performance of NLP semantic analysis is, in many cases, close to that of agreement between humans. The creation and release of corpora annotated with complex semantic information models has greatly supported the development of new tools and approaches. Research on non-English languages is continuously growing. NLP methods have sometimes been successfully employed in real-world clinical tasks. However, there is still a gap between the development of advanced resources and their utilization in clinical settings. A plethora of new clinical use cases are emerging due to established health care initiatives and additional patient-generated sources through the extensive use of social media and other devices. PMID:26293867

  20. Surmounting the Tower of Babel: Monolingual and bilingual 2-year-olds' understanding of the nature of foreign language words.

    PubMed

    Byers-Heinlein, Krista; Chen, Ke Heng; Xu, Fei

    2014-03-01

    Languages function as independent and distinct conventional systems, and so each language uses different words to label the same objects. This study investigated whether 2-year-old children recognize that speakers of their native language and speakers of a foreign language do not share the same knowledge. Two groups of children unfamiliar with Mandarin were tested: monolingual English-learning children (n=24) and bilingual children learning English and another language (n=24). An English speaker taught children the novel label fep. On English mutual exclusivity trials, the speaker asked for the referent of a novel label (wug) in the presence of the fep and a novel object. Both monolingual and bilingual children disambiguated the reference of the novel word using a mutual exclusivity strategy, choosing the novel object rather than the fep. On similar trials with a Mandarin speaker, children were asked to find the referent of a novel Mandarin label kuò. Monolinguals again chose the novel object rather than the object with the English label fep, even though the Mandarin speaker had no access to conventional English words. Bilinguals did not respond systematically to the Mandarin speaker, suggesting that they had enhanced understanding of the Mandarin speaker's ignorance of English words. The results indicate that monolingual children initially expect words to be conventionally shared across all speakers-native and foreign. Early bilingual experience facilitates children's discovery of the nature of foreign language words.

  1. Voice-enabled Knowledge Engine using Flood Ontology and Natural Language Processing

    NASA Astrophysics Data System (ADS)

    Sermet, M. Y.; Demir, I.; Krajewski, W. F.

    2015-12-01

    The Iowa Flood Information System (IFIS) is a web-based platform developed by the Iowa Flood Center (IFC) to provide access to flood inundation maps, real-time flood conditions, flood forecasts, flood-related data, information and interactive visualizations for communities in Iowa. The IFIS is designed for use by the general public, often people with no domain knowledge and a limited general science background. To improve communication with such an audience, we have introduced a voice-enabled knowledge engine on flood-related issues in IFIS. Instead of requiring users to navigate the many features and interfaces of the information system and web-based sources, the system provides dynamic computations based on a collection of built-in data, analysis, and methods. The IFIS Knowledge Engine connects to real-time stream gauges, in-house data sources, and analysis and visualization tools to answer natural language questions. Our goal is to systematize data and modeling results on flood-related issues in Iowa and to provide an interface for definitive answers to factual queries. The goal of the knowledge engine is to make all flood-related knowledge in Iowa easily accessible to everyone and to support voice-enabled natural language input. We aim to integrate and curate all flood-related data, implement analytical and visualization tools, and make it possible to compute answers from questions. The IFIS explicitly implements analytical methods and models as algorithms, and curates all flood-related data and resources so that all these resources are computable. The IFIS Knowledge Engine computes the answer by deriving it from its computational knowledge base. The knowledge engine processes the statement, accesses the data warehouse, runs complex database queries on the server side and returns outputs in various formats. This presentation provides an overview of the IFIS Knowledge Engine, its unique information interface and functionality as an educational tool, and discusses future plans.

  2. Toward a Theory-Based Natural Language Capability in Robots and Other Embodied Agents: Evaluating Hausser's SLIM Theory and Database Semantics

    ERIC Educational Resources Information Center

    Burk, Robin K.

    2010-01-01

    Computational natural language understanding and generation have been a goal of artificial intelligence since McCarthy, Minsky, Rochester and Shannon first proposed to spend the summer of 1956 studying this and related problems. Although statistical approaches dominate current natural language applications, two current research trends bring…

  3. Searching for Cancer Information on the Internet: Analyzing Natural Language Search Queries

    PubMed Central

    Theofanos, Mary Frances

    2003-01-01

    .7%), Skin (11.3%), and Genitourinary (10.5%). Additional subcategories of queries about specific cancer types varied, depending on user input. Queries that were not specific to a cancer type were also tracked and categorized. Conclusions: Natural-language searching affords users the opportunity to fully express their information needs and can aid users naïve to the content and vocabulary. The specific queries analyzed for this study reflect news and research studies reported during the study dates and would surely change with different study dates. Analyzing queries from search engines represents one way of knowing what kinds of content to provide to users of a given Web site. Users ask questions using whole sentences and keywords, often misspelling words. Providing the option for natural-language searching does not obviate the need for good information architecture, usability engineering, and user testing in order to optimize user experience. PMID:14713659

  4. Adapting Semantic Natural Language Processing Technology to Address Information Overload in Influenza Epidemic Management

    PubMed Central

    Keselman, Alla; Rosemblat, Graciela; Kilicoglu, Halil; Fiszman, Marcelo; Jin, Honglan; Shin, Dongwook; Rindflesch, Thomas C.

    2013-01-01

    Explosion of disaster health information results in information overload among response professionals. The objective of this project was to determine the feasibility of applying semantic natural language processing (NLP) technology to addressing this overload. The project characterizes concepts and relationships commonly used in disaster health-related documents on influenza pandemics, as the basis for adapting an existing semantic summarizer to the domain. Methods include human review and semantic NLP analysis of a set of relevant documents. This is followed by a pilot-test in which two information specialists use the adapted application for a realistic information seeking task. According to the results, the ontology of influenza epidemics management can be described via a manageable number of semantic relationships that involve concepts from a limited number of semantic types. Test users demonstrate several ways to engage with the application to obtain useful information. This suggests that existing semantic NLP algorithms can be adapted to support information summarization and visualization in influenza epidemics and other disaster health areas. However, additional research is needed in the areas of terminology development (as many relevant relationships and terms are not part of existing standardized vocabularies), NLP, and user interface design. PMID:24311971

  5. Natural Language Processing As an Alternative to Manual Reporting of Colonoscopy Quality Metrics

    PubMed Central

    RAJU, GOTTUMUKKALA S.; LUM, PHILLIP J.; SLACK, REBECCA; THIRUMURTHI, SELVI; LYNCH, PATRICK M.; MILLER, ETHAN; WESTON, BRIAN R.; DAVILA, MARTA L.; BHUTANI, MANOOP S.; SHAFI, MEHNAZ A.; BRESALIER, ROBERT S.; DEKOVICH, ALEXANDER A.; LEE, JEFFREY H.; GUHA, SUSHOVAN; PANDE, MALA; BLECHACZ, BORIS; RASHID, ASIF; ROUTBORT, MARK; SHUTTLESWORTH, GLADIS; MISHRA, LOPA; STROEHLEIN, JOHN R.; ROSS, WILLIAM A.

    2015-01-01

    BACKGROUND & AIMS: The adenoma detection rate (ADR) is a quality metric tied to interval colon cancer occurrence. However, manual extraction of data to calculate and track the ADR in clinical practice is labor-intensive. To overcome this difficulty, we developed a natural language processing (NLP) method to identify patients who underwent their first screening colonoscopy and to identify adenomas and sessile serrated adenomas (SSAs). We compared the NLP-generated results with those of manual data extraction to test the accuracy of NLP, and report on colonoscopy quality metrics using NLP. METHODS: Identification of screening colonoscopies using NLP was compared with that using the manual method for 12,748 patients who underwent colonoscopies from July 2010 to February 2013. Also, identification of adenomas and SSAs using NLP was compared with that using the manual method with 2259 matched patient records. Colonoscopy ADRs using these methods were generated for each physician. RESULTS: NLP correctly identified 91.3% of the screening examinations, whereas the manual method identified 87.8% of them. Both the manual method and NLP identified examinations of patients with adenomas and SSAs in the matched records almost perfectly. NLP and the manual method produced comparable ADR values for each endoscopist as well as for the group as a whole. CONCLUSIONS: NLP can correctly identify screening colonoscopies, accurately identify adenomas and SSAs in a pathology database, and provide real-time quality metrics for colonoscopy. PMID:25910665
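
    The per-physician ADR mentioned in this record can be illustrated with a minimal sketch. It assumes the common definition of ADR as the share of screening colonoscopies in which at least one adenoma was found; the record layout and the function name `adenoma_detection_rates` are hypothetical and are not taken from the study.

```python
from collections import defaultdict

def adenoma_detection_rates(exams):
    """Return {physician: ADR} from (physician, is_screening, adenoma_found) tuples.

    Hypothetical record layout, for illustration only. Non-screening exams
    are ignored, since ADR is defined over screening colonoscopies.
    """
    screened = defaultdict(int)   # screening exams per physician
    detected = defaultdict(int)   # screening exams with >= 1 adenoma
    for physician, is_screening, adenoma_found in exams:
        if not is_screening:
            continue
        screened[physician] += 1
        if adenoma_found:
            detected[physician] += 1
    return {p: detected[p] / screened[p] for p in screened}

# Made-up example data: physician A has two screening exams, one with an adenoma.
exams = [
    ("A", True, True),
    ("A", True, False),
    ("A", False, True),   # diagnostic exam, excluded from ADR
    ("B", True, True),
]
rates = adenoma_detection_rates(exams)
```

    In a setting like the one described, the `is_screening` and `adenoma_found` flags would come from the NLP step or from manual review; here they are simply given as inputs.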

  6. Semi-supervised learning of statistical models for natural language understanding.

    PubMed

    Zhou, Deyu; He, Yulan

    2014-01-01

    Natural language understanding aims to specify a computational model that maps sentences to their semantic meaning representations. In this paper, we propose a novel framework to train statistical models without using expensive fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied to two statistical models, conditional random fields (CRFs) and hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA Communicator data show that both CRFs and HM-SVMs outperform the baseline approach, the previously proposed hidden vector state (HVS) model, which is also trained on abstract semantic annotations. In addition, the proposed framework outperforms two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, with relative error reductions in F-measure of about 25% and 15%.

  7. Identifying Abdominal Aortic Aneurysm Cases and Controls using Natural Language Processing of Radiology Reports.

    PubMed

    Sohn, Sunghwan; Ye, Zi; Liu, Hongfang; Chute, Christopher G; Kullo, Iftikhar J

    2013-01-01

    Prevalence of abdominal aortic aneurysm (AAA) is increasing due to longer life expectancy and implementation of screening programs. Patient-specific longitudinal measurements of AAA are important to understand pathophysiology of disease development and modifiers of abdominal aortic size. In this paper, we applied natural language processing (NLP) techniques to process radiology reports and developed a rule-based algorithm to identify AAA patients and also extract the corresponding aneurysm size with the examination date. AAA patient cohorts were determined by a hierarchical approach that: 1) selected potential AAA reports using keywords; 2) classified reports into AAA-case vs. non-case using rules; and 3) determined the AAA patient cohort based on a report-level classification. Our system was built in an Unstructured Information Management Architecture framework that allows efficient use of existing NLP components. Our system produced an F-score of 0.961 for AAA-case report classification with an accuracy of 0.984 for aneurysm size extraction. PMID:24303276
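
    The three-step hierarchical approach described in this record (keyword selection, rule-based report classification, patient-level decision) can be sketched as follows. The regular expressions, the negation rule, and all function names are illustrative assumptions; the abstract does not give the authors' actual rules, and their system was built on UIMA rather than on plain regular expressions.

```python
import re

# Step 1 (assumed keywords): select reports that mention AAA at all.
KEYWORDS = re.compile(r"\b(abdominal aortic aneurysm|AAA)\b", re.IGNORECASE)
# Step 2 (assumed rule): a negation cue earlier in the same sentence marks a non-case.
NEGATION = re.compile(
    r"\b(no|without|negative for)\b[^.]*\b(abdominal aortic aneurysm|AAA|aneurysm)\b",
    re.IGNORECASE,
)
# Size mentions such as "4.2 cm" or "42 mm".
SIZE = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.IGNORECASE)

def classify_report(text):
    """Return None if the report is not a candidate, else True (case) or False (non-case)."""
    if not KEYWORDS.search(text):
        return None
    return not NEGATION.search(text)

def extract_size_cm(text):
    """Return the first size mention converted to centimetres, or None."""
    match = SIZE.search(text)
    if match is None:
        return None
    value = float(match.group(1))
    return value / 10 if match.group(2).lower() == "mm" else value

def patient_is_case(reports):
    """Step 3: a patient is a case if any of their reports is classified as a case."""
    return any(classify_report(r) is True for r in reports)
```

    A real system would also need to tie each size mention to the aneurysm it describes and to the examination date; this sketch only takes the first measurement in the text.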

  8. Automated extraction of BI-RADS final assessment categories from radiology reports with natural language processing.

    PubMed

    Sippo, Dorothy A; Warden, Graham I; Andriole, Katherine P; Lacson, Ronilda; Ikuta, Ichiro; Birdwell, Robyn L; Khorasani, Ramin

    2013-10-01

    The objective of this study is to evaluate a natural language processing (NLP) algorithm that determines American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) final assessment categories from radiology reports. This HIPAA-compliant study was granted institutional review board approval with waiver of informed consent. This cross-sectional study involved 1,165 breast imaging reports in the electronic medical record (EMR) from a tertiary care academic breast imaging center from 2009. Reports included screening mammography, diagnostic mammography, breast ultrasound, combined diagnostic mammography and breast ultrasound, and breast magnetic resonance imaging studies. Over 220 reports were included from each study type. The recall (sensitivity) and precision (positive predictive value) of an NLP algorithm to collect BI-RADS final assessment categories stated in the report final text were evaluated against a manual human review reference standard. For all breast imaging reports, the NLP algorithm demonstrated a recall of 100.0% (95% confidence interval (CI), 99.7, 100.0%) and a precision of 96.6% (95% CI, 95.4, 97.5%) for correct identification of BI-RADS final assessment categories. The NLP algorithm demonstrated high recall and precision for extraction of BI-RADS final assessment categories from the free text of breast imaging reports. NLP may provide an accurate, scalable data extraction mechanism from reports within EMRs to create databases to track breast imaging performance measures and facilitate optimal breast cancer population management strategies. PMID:23868515
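
    As a rough illustration of the kind of extraction and evaluation this record describes, the sketch below pulls a BI-RADS category (0 to 6) out of report text with a regular expression and scores the extractions against a manual reference. The pattern, the function names, and the scoring convention are assumptions made for this example, not the study's algorithm.

```python
import re

# Assumed pattern: "BI-RADS" or "BIRADS", optional "category"/"assessment", optional colon, digit 0-6.
BIRADS = re.compile(
    r"BI-?RADS\s*(?:category|assessment)?\s*:?\s*([0-6])\b", re.IGNORECASE
)

def extract_birads(text):
    """Return the last BI-RADS category stated in the text, or None if absent."""
    found = BIRADS.findall(text)
    return int(found[-1]) if found else None

def precision_recall(predicted, reference):
    """Score extractions against a manual reference (lists of int or None).

    Precision: correct extractions / extractions made.
    Recall: correct extractions / reports with a reference category.
    """
    pairs = list(zip(predicted, reference))
    made = sum(1 for p, _ in pairs if p is not None)
    stated = sum(1 for _, r in pairs if r is not None)
    correct = sum(1 for p, r in pairs if p is not None and p == r)
    precision = correct / made if made else 0.0
    recall = correct / stated if stated else 0.0
    return precision, recall
```

    Taking the last match is a design choice for this sketch, on the assumption that a final assessment tends to appear at the end of a report. Subcategories such as 4A are deliberately not handled.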

  9. Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning

    PubMed Central

    Cohen, Kevin Bretonnel; Glass, Benjamin; Greiner, Hansel M.; Holland-Bouley, Katherine; Standridge, Shannon; Arya, Ravindra; Faist, Robert; Morita, Diego; Mangano, Francesco; Connolly, Brian; Glauser, Tracy; Pestian, John

    2016-01-01

    Objective: We describe the development and evaluation of a system that uses machine learning and natural language processing techniques to identify potential candidates for surgical intervention for drug-resistant pediatric epilepsy. The data are comprised of free-text clinical notes extracted from the electronic health record (EHR). Both known clinical outcomes from the EHR and manual chart annotations provide gold standards for the patient’s status. The following hypotheses are then tested: 1) machine learning methods can identify epilepsy surgery candidates as well as physicians do and 2) machine learning methods can identify candidates earlier than physicians do. These hypotheses are tested by systematically evaluating the effects of the data source, amount of training data, class balance, classification algorithm, and feature set on classifier performance. The results support both hypotheses, with F-measures ranging from 0.71 to 0.82. The feature set, classification algorithm, amount of training data, class balance, and gold standard all significantly affected classification performance. It was further observed that classification performance was better than the highest agreement between two annotators, even at one year before documented surgery referral. The results demonstrate that such machine learning methods can contribute to predicting pediatric epilepsy surgery candidates and reducing lag time to surgery referral. PMID:27257386

  10. Bringing Chatbots into education: Towards Natural Language Negotiation of Open Learner Models

    NASA Astrophysics Data System (ADS)

    Kerlyl, Alice; Hall, Phil; Bull, Susan

    There is an extensive body of work on Intelligent Tutoring Systems: computer environments for education, teaching and training that adapt to the needs of the individual learner. Work on personalisation and adaptivity has included research into allowing the student user to enhance the system's adaptivity by improving the accuracy of the underlying learner model. Open Learner Modelling, where the system's model of the user's knowledge is revealed to the user, has been proposed to support student reflection on their learning. Increased accuracy of the learner model can be obtained by the student and system jointly negotiating the learner model. We present the initial investigations into a system to allow people to negotiate the model of their understanding of a topic in natural language. This paper discusses the development and capabilities of both conversational agents (or chatbots) and Intelligent Tutoring Systems, in particular Open Learner Modelling. We describe a Wizard-of-Oz experiment to investigate the feasibility of using a chatbot to support negotiation, and conclude that a fusion of the two fields can lead to developing negotiation techniques for chatbots and the enhancement of the Open Learner Model. This technology, if successful, could have widespread application in schools, universities and other training scenarios.

  11. A framework for the natural-language-perception-based creative control of unmanned ground vehicles

    NASA Astrophysics Data System (ADS)

    Ghaffari, Masoud; Liao, Xiaoqun; Hall, Ernest L.

    2004-09-01

    Mobile robots must often operate in an unstructured environment cluttered with obstacles and with many possible action paths. That is why mobile robotics problems are complex, with many unanswered questions. To reach a high degree of autonomous operation, a new level of learning is required. On the one hand, promising learning theories such as the adaptive critic and creative control have been proposed, while on the other hand the human brain's processing ability has amazed and inspired researchers in the area of Unmanned Ground Vehicles but has been difficult to emulate in practice. A new direction in fuzzy theory seeks to deal with the perceptions conveyed by natural language. This paper tries to combine these two fields and present a framework for autonomous robot navigation. The proposed creative controller, like the adaptive critic controller, has information stored in a dynamic database (DB), plus a dynamic task control center (TCC) that functions as a command center to decompose tasks into sub-tasks with different dynamic models and multi-criteria functions. The TCC module uses the computational theory of perceptions to deal with the high levels of task planning. The authors are currently trying to implement the model on a real mobile robot, and preliminary results are described in this paper.

  12. Extraction of CYP chemical interactions from biomedical literature using natural language processing methods.

    PubMed

    Jiao, Dazhi; Wild, David J

    2009-02-01

    This paper proposes a system that automatically extracts CYP protein and chemical interactions from journal article abstracts, using natural language processing (NLP) and text mining methods. In our system, we employ a maximum entropy based learning method, using results from syntactic, semantic, and lexical analysis of texts. We first present our system architecture and then discuss the data set for training our machine learning based models and the methods in building components in our system, such as part of speech (POS) tagging, Named Entity Recognition (NER), dependency parsing, and relation extraction. An evaluation of the system is conducted at the end, yielding very promising results: The POS, dependency parsing, and NER components in our system have achieved a very high level of accuracy as measured by precision, ranging from 85.9% to 98.5%, and the precision and the recall of the interaction extraction component are 76.0% and 82.6%, and for the overall system are 68.4% and 72.2%, respectively.

  13. Natural Language as a Tool for Analyzing the Proving Process: The Case of Plane Geometry Proof

    ERIC Educational Resources Information Center

    Robotti, Elisabetta

    2012-01-01

    In the field of human cognition, language plays a special role that is connected directly to thinking and mental development (e.g., Vygotsky, "1938"). Thanks to "verbal thought", language allows humans to go beyond the limits of immediately perceived information, to form concepts and solve complex problems (Luria, "1975"). So, it appears language…

  14. A Natural Language Processing Tool for Large-Scale Data Extraction from Echocardiography Reports

    PubMed Central

    Jonnalagadda, Siddhartha R.

    2016-01-01

    Large volumes of data are continuously generated from clinical notes and diagnostic studies catalogued in electronic health records (EHRs). Echocardiography is one of the most commonly ordered diagnostic tests in cardiology. This study sought to explore the feasibility and reliability of using natural language processing (NLP) for large-scale and targeted extraction of multiple data elements from echocardiography reports. An NLP tool, EchoInfer, was developed to automatically extract data pertaining to cardiovascular structure and function from heterogeneously formatted echocardiographic data sources. EchoInfer was applied to echocardiography reports (2004 to 2013) available from 3 different on-going clinical research projects. EchoInfer analyzed 15,116 echocardiography reports from 1684 patients, and extracted 59 quantitative and 21 qualitative data elements per report. EchoInfer achieved a precision of 94.06%, a recall of 92.21%, and an F1-score of 93.12% across all 80 data elements in 50 reports. Physician review of 400 reports demonstrated that EchoInfer achieved a recall of 92–99.9% and a precision of >97% in four data elements, including three quantitative and one qualitative data element. Failure of EchoInfer to correctly identify or reject reported parameters was primarily related to non-standardized reporting of echocardiography data. EchoInfer provides a powerful and reliable NLP-based approach for the large-scale, targeted extraction of information from heterogeneous data sources. The use of EchoInfer may have implications for the clinical management and research analysis of patients undergoing echocardiographic evaluation. PMID:27124000

  15. Measuring information acquisition from sensory input using automated scoring of natural-language descriptions.

    PubMed

    Saunders, Daniel R; Bex, Peter J; Rose, Dylan J; Woods, Russell L

    2014-01-01

    Information acquisition, the gathering and interpretation of sensory information, is a basic function of mobile organisms. We describe a new method for measuring this ability in humans, using free-recall responses to sensory stimuli which are scored objectively using a "wisdom of crowds" approach. As an example, we demonstrate this metric using perception of video stimuli. Immediately after viewing a 30 s video clip, subjects responded to a prompt to give a short description of the clip in natural language. These responses were scored automatically by comparison to a dataset of responses to the same clip by normally-sighted viewers (the crowd). In this case, the normative dataset consisted of responses to 200 clips by 60 subjects who were stratified by age (range 22 to 85 y) and viewed the clips in the lab, for 2,400 responses, and by 99 crowdsourced participants (age range 20 to 66 y) who viewed clips in their Web browser, for 4,000 responses. We compared different algorithms for computing these similarities and found that a simple count of the words in common had the best performance. It correctly matched 75% of the lab-sourced and 95% of crowdsourced responses to their corresponding clips. We validated the measure by showing that when the amount of information in the clip was degraded using defocus lenses, the shared word score decreased across the five predetermined visual-acuity levels, demonstrating a dose-response effect (N = 15). This approach, of scoring open-ended immediate free recall of the stimulus, is applicable not only to video, but also to other situations where a measure of the information that is successfully acquired is desirable. Information acquired will be affected by stimulus quality, sensory ability, and cognitive processes, so our metric can be used to assess each of these components when the others are controlled.
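
    The "simple count of the words in common" scoring described in this record might look like the sketch below. The tokenization, the averaging over crowd responses, and the function names are assumptions made for illustration; the abstract does not specify these details.

```python
import re

def words(text):
    """Lowercase the text and return its set of distinct words (assumed tokenization)."""
    return set(re.findall(r"[a-z']+", text.lower()))

def shared_word_score(response, crowd_responses):
    """Mean number of distinct words the response shares with each crowd response."""
    if not crowd_responses:
        return 0.0
    resp = words(response)
    return sum(len(resp & words(c)) for c in crowd_responses) / len(crowd_responses)

def best_matching_clip(response, crowd_by_clip):
    """Return the clip whose crowd responses share the most words with the response."""
    return max(
        crowd_by_clip,
        key=lambda clip: shared_word_score(response, crowd_by_clip[clip]),
    )

# Made-up normative data: clip id -> crowd descriptions of that clip.
crowd = {
    "dog": ["a dog runs on the beach", "dog playing in sand"],
    "car": ["a red car drives down a road", "car on highway"],
}
response = "a man walks his dog along the beach"
```

    With this toy data, `best_matching_clip(response, crowd)` picks the `"dog"` clip, because the response shares more words with those descriptions. A response "correctly matches" its clip, in the sense used above, when the clip it was written for receives the highest score.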

  16. A Natural Language Processing Tool for Large-Scale Data Extraction from Echocardiography Reports.

    PubMed

    Nath, Chinmoy; Albaghdadi, Mazen S; Jonnalagadda, Siddhartha R

    2016-01-01

    Large volumes of data are continuously generated from clinical notes and diagnostic studies catalogued in electronic health records (EHRs). Echocardiography is one of the most commonly ordered diagnostic tests in cardiology. This study sought to explore the feasibility and reliability of using natural language processing (NLP) for large-scale and targeted extraction of multiple data elements from echocardiography reports. An NLP tool, EchoInfer, was developed to automatically extract data pertaining to cardiovascular structure and function from heterogeneously formatted echocardiographic data sources. EchoInfer was applied to echocardiography reports (2004 to 2013) available from 3 different on-going clinical research projects. EchoInfer analyzed 15,116 echocardiography reports from 1684 patients, and extracted 59 quantitative and 21 qualitative data elements per report. EchoInfer achieved a precision of 94.06%, a recall of 92.21%, and an F1-score of 93.12% across all 80 data elements in 50 reports. Physician review of 400 reports demonstrated that EchoInfer achieved a recall of 92-99.9% and a precision of >97% in four data elements, including three quantitative and one qualitative data element. Failure of EchoInfer to correctly identify or reject reported parameters was primarily related to non-standardized reporting of echocardiography data. EchoInfer provides a powerful and reliable NLP-based approach for the large-scale, targeted extraction of information from heterogeneous data sources. The use of EchoInfer may have implications for the clinical management and research analysis of patients undergoing echocardiographic evaluation. PMID:27124000

  17. Identifying Repetitive Institutional Review Board Stipulations by Natural Language Processing and Network Analysis.

    PubMed

    Kury, Fabrício S P; Cimino, James J

    2015-01-01

    The corrections ("stipulations") to a proposed research study protocol produced by an institutional review board (IRB) can often be repetitive across many studies; however, there is no standard set of stipulations that could be used, for example, by researchers wishing to anticipate and correct problems in their research proposals prior to submitting to an IRB. The objective of the research was to computationally identify the most repetitive types of stipulations generated in the course of IRB deliberations. The text of each stipulation was normalized using natural language processing techniques. An undirected weighted network was constructed in which each stipulation was represented by a node, and each link, if present, had a weight corresponding to the TF-IDF cosine similarity of the stipulations. Network analysis software was then used to identify clusters in the network representing similar stipulations. The final results were correlated with additional data to produce further insights about the IRB workflow. From a corpus of 18,582 stipulations we identified 31 types of repetitive stipulations. Those types accounted for 3,870 stipulations (20.8% of the corpus) produced for 697 (88.7%) of all protocols in 392 (also 88.7%) of all the CNS IRB meetings with stipulations entered in our data source. A notable proportion of the corrections produced by the IRB can be considered highly repetitive. Our shareable method relied on a minimal manual analysis and provides an intuitive exploration with theoretically unbounded granularity. Finer granularity allowed for the insight that is anticipated to prevent the need for identifying the IRB panel expertise or any human supervision.
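
    The network construction described in this record can be sketched with the standard library alone. The sketch computes TF-IDF vectors, links pairs of texts whose cosine similarity reaches a threshold, and groups them by connected components. The smoothed IDF formula, the 0.5 threshold, and the use of connected components in place of the unnamed network analysis software are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using a smoothed IDF (assumed variant)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    return [
        {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in Counter(toks).items()}
        for toks in tokenized
    ]

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_network(docs, threshold=0.5):
    """Return {(i, j): similarity} for document pairs at or above the threshold."""
    vecs = tfidf_vectors(docs)
    edges = {}
    for i, j in combinations(range(len(docs)), 2):
        sim = cosine(vecs[i], vecs[j])
        if sim >= threshold:
            edges[(i, j)] = sim
    return edges

def clusters(n, edges):
    """Group node indices into connected components using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Made-up stipulation texts, for illustration only.
docs = [
    "Please add the consent form version date",
    "Add the version date to the consent form",
    "Clarify the data storage location",
]
edges = similarity_network(docs)
groups = clusters(len(docs), edges)
```

    Here the first two texts end up in one cluster and the third stands alone. Raising or lowering the threshold changes how fine-grained the clusters are, which is one way to read the "granularity" mentioned in the abstract.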

  18. The Complex Nature of Bilinguals' Language Usage Modulates Task-Switching Outcomes

    PubMed Central

    Yang, Hwajin; Hartanto, Andree; Yang, Sujin

    2016-01-01

    In view of inconsistent findings regarding bilingual advantages in executive functions (EF), we reviewed the literature to determine whether bilinguals' different language usage causes measurable changes in the shifting aspects of EF. By drawing on the theoretical framework of the adaptive control hypothesis, which postulates a critical link between bilinguals' varying demands on language control and adaptive cognitive control (Green and Abutalebi, 2013), we examined three factors that characterize bilinguals' language-switching experience: (a) the interactional context of conversational exchanges, (b) frequency of language switching, and (c) typology of code-switching. We also examined whether methodological variations in previous task-switching studies modulate task-specific demands on control processing and lead to inconsistencies in the literature. Our review demonstrates that not only methodological rigor but also a more finely grained, theory-based approach will be required to understand the cognitive consequences of bilinguals' varied linguistic practices in shifting EF. PMID:27199800

  19. The Complex Nature of Bilinguals' Language Usage Modulates Task-Switching Outcomes.

    PubMed

    Yang, Hwajin; Hartanto, Andree; Yang, Sujin

    2016-01-01

    In view of inconsistent findings regarding bilingual advantages in executive functions (EF), we reviewed the literature to determine whether bilinguals' different language usage causes measurable changes in the shifting aspects of EF. By drawing on the theoretical framework of the adaptive control hypothesis, which postulates a critical link between bilinguals' varying demands on language control and adaptive cognitive control (Green and Abutalebi, 2013), we examined three factors that characterize bilinguals' language-switching experience: (a) the interactional context of conversational exchanges, (b) frequency of language switching, and (c) typology of code-switching. We also examined whether methodological variations in previous task-switching studies modulate task-specific demands on control processing and lead to inconsistencies in the literature. Our review demonstrates that not only methodological rigor but also a more finely grained, theory-based approach will be required to understand the cognitive consequences of bilinguals' varied linguistic practices in shifting EF. PMID:27199800

  20. HPARSER: extracting formal patient data from free text history and physical reports using natural language processing software.

    PubMed

    Sponsler, J L

    2001-01-01

    A prototype, HPARSER, processes a patient history and physical report such that specific data are obtained and stored in a patient data record. HPARSER is a recursive transition network (RTN) parser, and includes English and medical grammar rules, lexicon, and database constraints. Medical grammar rules augment the grammar rule base and specify common phrases seen in patient reports (e.g., "pupils are equal and reactive"). Each database constraint associates a grammar rule with a database table and attribute. Constraint behavior is such that if a rule is satisfied, data is extracted from the parse tree and stored into the database. Control reports guided construction of grammar and constraint rules. Test reports were processed with the control rules. Of the test report sentences, 85% parsed, and a 60% data capture rate, compared to controls, was achieved. HPARSER demonstrates use of an RTN to parse patient reports, and database constraints to transfer formal data from parse trees into a database.
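    To make the idea of pairing grammar rules with database constraints concrete, here is a toy sketch in the spirit of the description above. It is not HPARSER's code: the lexicon, the networks (written as simple linear paths that may call sub-networks recursively), the table name `eye_exam`, and the attribute name `pupils` are all hypothetical, and a plain dictionary stands in for the database.

```python
# Hypothetical lexicon: word -> lexical category.
LEXICON = {"pupils": "NOUN", "are": "VERB", "equal": "ADJ",
           "reactive": "ADJ", "and": "CONJ"}

# Hypothetical networks: each name maps to alternative paths whose items are
# either lexical categories or the names of sub-networks (recursive calls).
NETWORKS = {
    "S": [["NP", "VERB", "ADJP"]],
    "NP": [["NOUN"]],
    "ADJP": [["ADJ", "CONJ", "ADJP"], ["ADJ"]],
}

# Hypothetical constraint: grammar rule -> (table, attribute).
CONSTRAINTS = {"ADJP": ("eye_exam", "pupils")}


def _follow(path, tokens, pos, children):
    """Yield (children, next_pos) for each way the rest of `path` can match."""
    if not path:
        yield children, pos
        return
    head, rest = path[0], path[1:]
    if head in NETWORKS:  # push into a sub-network
        for subtree, nxt in traverse(head, tokens, pos):
            yield from _follow(rest, tokens, nxt, children + [subtree])
    elif pos < len(tokens) and LEXICON.get(tokens[pos]) == head:
        yield from _follow(rest, tokens, pos + 1, children + [(head, tokens[pos])])


def traverse(net, tokens, pos):
    """Yield (tree, next_pos) for every way network `net` matches at `pos`."""
    for path in NETWORKS[net]:
        for children, nxt in _follow(path, tokens, pos, []):
            yield (net, children), nxt


def parse(sentence):
    """Return the first parse tree covering the whole sentence, or None."""
    tokens = sentence.lower().split()
    for tree, nxt in traverse("S", tokens, 0):
        if nxt == len(tokens):
            return tree
    return None


def leaves(tree):
    """Flatten a tree into its (category, word) leaves."""
    label, body = tree
    if isinstance(body, str):
        return [(label, body)]
    return [leaf for child in body for leaf in leaves(child)]


def apply_constraints(tree, record):
    """When a constrained rule appears in the tree, store its adjectives."""
    label, body = tree
    if isinstance(body, str):
        return
    if label in CONSTRAINTS:
        table, attribute = CONSTRAINTS[label]
        words = [word for cat, word in leaves(tree) if cat == "ADJ"]
        record.setdefault(table, {})[attribute] = words
        return  # do not fire again on nested instances of the same rule
    for child in body:
        apply_constraints(child, record)


record = {}
tree = parse("pupils are equal and reactive")
if tree is not None:
    apply_constraints(tree, record)
print(record)
```

    A sentence outside the toy grammar simply returns `None` from `parse`, which corresponds to the unparsed share of test sentences reported in the abstract.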

  1. HPARSER: extracting formal patient data from free text history and physical reports using natural language processing software.

    PubMed Central

    Sponsler, J. L.

    2001-01-01

    A prototype, HPARSER, processes a patient history and physical report such that specific data are obtained and stored in a patient data record. HPARSER is a recursive transition network (RTN) parser, and includes English and medical grammar rules, lexicon, and database constraints. Medical grammar rules augment the grammar rule base and specify common phrases seen in patient reports (e.g., "pupils are equal and reactive"). Each database constraint associates a grammar rule with a database table and attribute. Constraint behavior is such that if a rule is satisfied, data is extracted from the parse tree and stored into the database. Control reports guided construction of grammar and constraint rules. Test reports were processed with the control rules. Of the test report sentences, 85% parsed, and a 60% data capture rate, compared to controls, was achieved. HPARSER demonstrates use of an RTN to parse patient reports, and database constraints to transfer formal data from parse trees into a database. PMID:11825263

  2. Programming Languages.

    ERIC Educational Resources Information Center

    Tesler, Lawrence G.

    1984-01-01

    Discusses the nature of programing languages, considering the features of BASIC, LOGO, PASCAL, COBOL, FORTH, APL, and LISP. Also discusses machine/assembly codes, the operation of a compiler, and trends in the evolution of programing languages (including interest in notational systems called object-oriented languages). (JN)

  3. Investigating the Nature of "Interest" Reported by a Group of Postgraduate Students in an MA in English Language Teacher Education Programme

    ERIC Educational Resources Information Center

    Tin, Tan Bee

    2006-01-01

    "Interest" is a widely used term not only in language education but also in our everyday life. However, very little attempt has been made to investigate the nature of "interest" in language teaching and learning. This paper, using a definition of interest proposed in the field of educational psychology, reports on the findings of a study conducted…

  4. The complex of neural networks and probabilistic methods for mathematical modeling of the syntactic structure of a sentence of natural language

    NASA Astrophysics Data System (ADS)

    Sboev, A.; Rybka, R.; Moloshnikov, I.; Gudovskikh, D.

    2016-02-01

    A formalized model for constructing the syntactic structure of natural-language sentences is presented. Based on this model, a complex algorithm using neural networks was developed, built on data from the Russian National Corpus and a set of parameters extracted from these data. The resulting accuracy is presented, along with the accuracy that could theoretically be achieved with these parameters.

  5. Computing Accurate Grammatical Feedback in a Virtual Writing Conference for German-Speaking Elementary-School Children: An Approach Based on Natural Language Generation

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2009-01-01

    We built a natural language processing (NLP) system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary…

  6. PHENOGO: ASSIGNING PHENOTYPIC CONTEXT TO GENE ONTOLOGY ANNOTATIONS WITH NATURAL LANGUAGE PROCESSING

    PubMed Central

    LUSSIER, YVES; BORLAWSKY, TARA; RAPPAPORT, DANIEL; LIU, YANG; FRIEDMAN, CAROL

    2010-01-01

    Natural language processing (NLP) is a high throughput technology because it can process vast quantities of text within a reasonable time period. It has the potential to substantially facilitate biomedical research by extracting, linking, and organizing massive amounts of information that occur in biomedical journal articles as well as in textual fields of biological databases. Until recently, much of the work in biological NLP and text mining has revolved around recognizing the occurrence of biomolecular entities in articles, and in extracting particular relationships among the entities. Now, researchers have recognized a need to link the extracted information to ontologies or knowledge bases, which is a more difficult task. One such knowledge base is Gene Ontology annotations (GOA), which significantly increases semantic computations over the function, cellular components and processes of genes. For multicellular organisms, these annotations can be refined with phenotypic context, such as the cell type, tissue, and organ because establishing phenotypic contexts in which a gene is expressed is a crucial step for understanding the development and the molecular underpinning of the pathophysiology of diseases. In this paper, we propose a system, PhenoGO, which automatically augments annotations in GOA with additional context. PhenoGO utilizes an existing NLP system, called BioMedLEE, an existing knowledge-based phenotype organizer system (PhenOS) in conjunction with MeSH indexing and established biomedical ontologies. More specifically, PhenoGO adds phenotypic contextual information to existing associations between gene products and GO terms as specified in GOA. The system also maps the context to identifiers that are associated with different biomedical ontologies, including the UMLS, Cell Ontology, Mouse Anatomy, NCBI taxonomy, GO, and Mammalian Phenotype Ontology. In addition, PhenoGO was evaluated for coding of anatomical and cellular information and assigning

  7. Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search

    PubMed Central

    Smith, Sam; Sufi, Shoaib; Goble, Carole; Buchan, Iain

    2016-01-01

    significantly faster using the Web search interface (F 1,19=18.0, P<.001). There was also a main effect of task (F 2,38=4.1, P=.025, Greenhouse-Geisser correction applied). Overall, participants were asked to rate learnability, ease of use, and satisfaction. Paired mean comparisons showed that the Web search interface received significantly higher ratings than the traditional search interface for learnability (P=.002, 95% CI [0.6-2.4]), ease of use (P<.001, 95% CI [1.2-3.2]), and satisfaction (P<.001, 95% CI [1.8-3.5]). The results show superior cross-domain usability of Web search, which is consistent with its general familiarity and with enabling queries to be refined as the search proceeds, which treats serendipity as part of the refinement. Conclusions The results provide clear evidence that data science should adopt single-field natural language search interfaces for variable search supporting in particular: query reformulation; data browsing; faceted search; surrogates; relevance feedback; summarization, analytics, and visual presentation. PMID:26769334

  8. Zipf's word frequency law in natural language: a critical review and future directions.

    PubMed

    Piantadosi, Steven T

    2014-10-01

    The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf's law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf's law and are then used to evaluate many of the theoretical explanations of Zipf's law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf's law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data.
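    In its simplest form, Zipf's law says that the frequency of the word at rank r is roughly proportional to 1/r^α with α close to 1. The sketch below estimates α by a least-squares line on log rank versus log frequency; the function names and the synthetic corpus are assumptions made for illustration, and such a fit is only a rough diagnostic that says nothing about the finer structure the review discusses.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, frequency) pairs, rank 1 being the most frequent word."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(freqs, start=1))


def zipf_exponent(tokens):
    """Estimate alpha in f(r) ~ 1 / r**alpha by least squares in log-log space."""
    pairs = rank_frequency(tokens)
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(freq) for _, freq in pairs]
    n = len(pairs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope


# Synthetic corpus built to follow f(r) = 1000 / r for 50 word types.
corpus = [f"w{i}" for i in range(1, 51) for _ in range(round(1000 / i))]
print(f"estimated alpha: {zipf_exponent(corpus):.2f}")
```

    On real text, the same functions can be applied to any list of tokens, for example the output of `text.lower().split()`.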

  9. In silico Evolutionary Developmental Neurobiology and the Origin of Natural Language

    NASA Astrophysics Data System (ADS)

    Szathmáry, Eörs; Szathmáry, Zoltán; Ittzés, Péter; Orbán, Gergő; Zachár, István; Huszár, Ferenc; Fedor, Anna; Varga, Máté; Számadó, Szabolcs

    It is justified to assume that part of our genetic endowment contributes to our language skills, yet it is impossible to tell at this moment exactly how genes affect the language faculty. We complement experimental biological studies by an in silico approach in that we simulate the evolution of neuronal networks under selection for language-related skills. At the heart of this project is the Evolutionary Neurogenetic Algorithm (ENGA) that is deliberately biomimetic. The design of the system was inspired by important biological phenomena such as brain ontogenesis, neuron morphologies, and indirect genetic encoding. Neuronal networks were selected and were allowed to reproduce as a function of their performance in the given task. The selected neuronal networks in all scenarios were able to solve the communication problem they had to face. The most striking feature of the model is that it works with highly indirect genetic encoding--just as brains do.

  10. Medieval and Modern Views of Universal Grammar and the Nature of Second Language Learning.

    ERIC Educational Resources Information Center

    Thomas, Margaret

    1995-01-01

    Examines the relationship between ideas of universal grammar (UG) and second language (L2) teaching and learning in medieval Europe in the context of the 20th-century debate about the role of UG in L2 acquisition. The roles of generative linguistics on UG and L2 instruction and learning in the 20th century are discussed. (65 references) (MDM)

  11. Nature, Nurture, and Age in Language Acquisition: The Case of Speech Perception.

    ERIC Educational Resources Information Center

    Wode, Henning

    1994-01-01

    This paper reviews the research on speech perception and reassesses the contribution of innate capacities versus external stimulation in conjunction with age in first- and second-language acquisition. A developmental model of speech perception is then discussed in relation to neonatal auditory perception. (Contains 86 references.) (MDM)

  12. The Sentence Fairy: A Natural-Language Generation System to Support Children's Essay Writing

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2008-01-01

    We built an NLP system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary texts produced by pupils…

  13. School Meaning Systems: The Symbiotic Nature of Culture and "Language-In-Use"

    ERIC Educational Resources Information Center

    Abawi, Lindy

    2013-01-01

    Recent research has produced evidence to suggest a strong reciprocal link between school context-specific language constructions that reflect a school's vision and schoolwide pedagogy, and the way that meaning making occurs, and a school's culture is characterized. This research was conducted within three diverse settings: one school in…

  14. The Effectiveness of Stemming for Natural-Language Access to Slovene Textual Data.

    ERIC Educational Resources Information Center

    Popovic, Mirko; Willett, Peter

    1992-01-01

    Reports on the use of stemming for Slovene language documents and queries in free-text retrieval systems and demonstrates that an appropriate stemming algorithm results in an increase in retrieval effectiveness when compared with nonstemming processing. A comparison is made with stemming of English versions of the same documents and queries. (24…

  15. The Ability of Children with Language Impairment to Dissemble Emotions in Hypothetical Scenarios and Natural Situations

    ERIC Educational Resources Information Center

    Brinton, Bonnie; Fujiki, Martin; Hurst, Noel Quist; Jones, Emily Rowberry; Spackman, Matthew P.

    2015-01-01

    Purpose: This study examined the ability of children with language impairment (LI) to dissemble (hide) emotional reactions when socially appropriate to do so. Method: Twenty-two children with LI and their typically developing peers (7;1-10;11 [years;months]) participated in two tasks. First, participants were presented with hypothetical scenarios…

  16. Automatically Detecting Failures in Natural Language Processing Tools for Online Community Text

    PubMed Central

    Hartzler, Andrea L; Huh, Jina; McDonald, David W; Pratt, Wanda

    2015-01-01

    Background The prevalence and value of patient-generated health text are increasing, but processing such text remains problematic. Although existing biomedical natural language processing (NLP) tools are appealing, most were developed to process clinician- or researcher-generated text, such as clinical notes or journal articles. In addition to being constructed for different types of text, other challenges of using existing NLP include constantly changing technologies, source vocabularies, and characteristics of text. These continuously evolving challenges warrant the need for applying low-cost systematic assessment. However, the primarily accepted evaluation method in NLP, manual annotation, requires tremendous effort and time. Objective The primary objective of this study is to explore an alternative approach—using low-cost, automated methods to detect failures (eg, incorrect boundaries, missed terms, mismapped concepts) when processing patient-generated text with existing biomedical NLP tools. We first characterize common failures that NLP tools can make in processing online community text. We then demonstrate the feasibility of our automated approach in detecting these common failures using one of the most popular biomedical NLP tools, MetaMap. Methods Using 9657 posts from an online cancer community, we explored our automated failure detection approach in two steps: (1) to characterize the failure types, we first manually reviewed MetaMap’s commonly occurring failures, grouped the inaccurate mappings into failure types, and then identified causes of the failures through iterative rounds of manual review using open coding, and (2) to automatically detect these failure types, we then explored combinations of existing NLP techniques and dictionary-based matching for each failure cause. Finally, we manually evaluated the automatically detected failures. Results From our manual review, we characterized three types of failure: (1) boundary failures, (2) missed

  17. Crowdsourcing a Normative Natural Language Dataset: A Comparison of Amazon Mechanical Turk and In-Lab Data Collection

    PubMed Central

    Bex, Peter J; Woods, Russell L

    2013-01-01

    Background Crowdsourcing has become a valuable method for collecting medical research data. This approach, recruiting through open calls on the Web, is particularly useful for assembling large normative datasets. However, it is not known how natural language datasets collected over the Web differ from those collected under controlled laboratory conditions. Objective To compare the natural language responses obtained from a crowdsourced sample of participants with responses collected in a conventional laboratory setting from participants recruited according to specific age and gender criteria. Methods We collected natural language descriptions of 200 half-minute movie clips, from Amazon Mechanical Turk workers (crowdsourced) and 60 participants recruited from the community (lab-sourced). Crowdsourced participants responded to as many clips as they wanted and typed their responses, whereas lab-sourced participants gave spoken responses to 40 clips, and their responses were transcribed. The content of the responses was evaluated using a take-one-out procedure, which compared responses to other responses to the same clip and to other clips, with a comparison of the average number of shared words. Results In contrast to the 13 months of recruiting that was required to collect normative data from 60 lab-sourced participants (with specific demographic characteristics), only 34 days were needed to collect normative data from 99 crowdsourced participants (contributing a median of 22 responses). The majority of crowdsourced workers were female, and the median age was 35 years, lower than the lab-sourced median of 62 years but similar to the median age of the US population. The responses contributed by the crowdsourced participants were longer on average, that is, 33 words compared to 28 words (P<.001), and they used a less varied vocabulary. 
However, there was strong similarity in the words used to describe a particular clip between the two datasets, as a cross-dataset count

  18. A natural language query system for Hubble Space Telescope proposal selection

    NASA Technical Reports Server (NTRS)

    Hornick, Thomas; Cohen, William; Miller, Glenn

    1987-01-01

    The proposal selection process for the Hubble Space Telescope is assisted by a robust and easy-to-use query program (TACOS). The system parses an English subset language sentence regardless of the order of the keyword phrases, allowing the user greater flexibility than a standard command query language. Capabilities for macro and procedure definition are also integrated. The system was designed for flexibility in both use and maintenance. In addition, TACOS can be applied to any knowledge domain that can be expressed in terms of a single relation. The system was implemented mostly in Common LISP. The TACOS design is described in detail, with particular attention given to the implementation methods of sentence processing.
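    Order-independent parsing of keyword phrases can be illustrated with a small sketch. It is written in Python rather than Common LISP, and the keywords, slot names, and example sentences are hypothetical; the abstract does not give the actual TACOS vocabulary or algorithm.

```python
# Hypothetical keyword -> slot mapping (not the real TACOS vocabulary).
KEYWORDS = {"by": "investigator", "in": "category", "using": "instrument"}


def parse_query(sentence):
    """Collect the words that follow each keyword, whatever the phrase order."""
    slots = {}
    current = None
    for word in sentence.lower().rstrip(".?").split():
        if word in KEYWORDS:
            current = KEYWORDS[word]
            slots[current] = []
        elif current is not None:
            slots[current].append(word)
        # Words before the first keyword (e.g. "show proposals") are ignored.
    return {slot: " ".join(words) for slot, words in slots.items()}


first = parse_query("Show proposals by Smith in galaxies using the camera")
second = parse_query("Show proposals using the camera in galaxies by Smith")
print(first == second, first)
```

    Because the result is a mapping from slots to values, two sentences that differ only in phrase order produce the same query.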

  19. FMS: A Format Manipulation System for Automatic Production of Natural Language Documents, Second Edition. Final Report.

    ERIC Educational Resources Information Center

    Silver, Steven S.

    FMS/3 is a system for producing hard copy documentation at high speed from free format text and command input. The system was originally written in assembler language for a 12K IBM 360 model 20 using a high speed 1403 printer with the UCS-TN chain option (upper and lower case). Input was from an IBM 2560 Multi-function Card Machine. The model 20…

  20. Zipf’s word frequency law in natural language: A critical review and future directions

    PubMed Central

    2014-01-01

    The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf’s law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf’s law and are then used to evaluate many of the theoretical explanations of Zipf’s law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf’s law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data. PMID:24664880

  1. What Is a Programming Language?

    ERIC Educational Resources Information Center

    Wold, Allen L.

    1983-01-01

    The nature of programing languages is discussed, focusing on machine/assembly language and high-level languages. The latter includes systems (such as "Basic") in which an entire set of low-level instructions (in assembly/machine language) are combined. Also discusses the nature of other languages such as "Lisp" and list-processing languages. (JN)

  2. Research in knowledge representation for natural language communication and planning assistance. Final report, 18 March 1985-30 September 1988

    SciTech Connect

    Goodman, B.A.; Grosz, B.; Haas, A.; Litman, D.; Reinhardt, T.

    1988-11-01

    BBN's DARPA project in Knowledge Representation for Natural Language Communication and Planning Assistance has two primary objectives: 1) To perform research on aspects of the interaction between users who are making complex decisions and systems that are assisting them with their task. In particular, this research is focused on communication and the reasoning required for performing its underlying task of discourse processing, planning, and plan recognition and communication repair. 2) Based on the research objectives to build tools for communication, plan recognition, and planning assistance and for the representation of knowledge and reasoning that underlie all of these processes. This final report summarizes BBN's research activities performed under this contract in the areas of knowledge representation and speech and natural language. In particular, the report discusses the work in the areas of knowledge representation, planning, and discourse modeling. We describe a parallel truth maintenance system. We provide an extension to the sentential theory of propositional attitudes by adding a sentential semantics. The report also contains a description of our research in discourse modelling in the areas of planning and plan recognition.

  3. Identification of methicillin-resistant Staphylococcus aureus within the Nation’s Veterans Affairs Medical Centers using natural language processing

    PubMed Central

    2012-01-01

    Background Accurate information is needed to direct healthcare systems’ efforts to control methicillin-resistant Staphylococcus aureus (MRSA). Assembling complete and correct microbiology data is vital to understanding and addressing the multiple drug-resistant organisms in our hospitals. Methods Herein, we describe a system that securely gathers microbiology data from the Department of Veterans Affairs (VA) network of databases. Using natural language processing methods, we applied an information extraction process to extract organisms and susceptibilities from the free-text data. We then validated the extraction against independently derived electronic data and expert annotation. Results We estimate that the collected microbiology data are 98.5% complete and that methicillin-resistant Staphylococcus aureus was extracted accurately 99.7% of the time. Conclusions Applying natural language processing methods to microbiology records appears to be a promising way to extract accurate and useful nosocomial pathogen surveillance data. Both scientific inquiry and the data’s reliability will be dependent on the surveillance system’s capability to compare from multiple sources and circumvent systematic error. The dataset constructed and methods used for this investigation could contribute to a comprehensive infectious disease surveillance system or other pressing needs. PMID:22533507

  4. A Proposal of 3-dimensional Self-organizing Memory and Its Application to Knowledge Extraction from Natural Language

    NASA Astrophysics Data System (ADS)

    Sakakibara, Kai; Hagiwara, Masafumi

    In this paper, we propose a 3-dimensional self-organizing memory and describe its application to knowledge extraction from natural language. First, the proposed system extracts a relation between words by JUMAN (morpheme analysis system) and KNP (syntax analysis system), and stores it in short-term memory. In the short-term memory, the relations are attenuated with the passage of processing. However, the relations with high frequency of appearance are stored in the long-term memory without attenuation. The relations in the long-term memory are placed in the proposed 3-dimensional self-organizing memory. We used a new learning algorithm called "Potential Firing" in the learning phase. In the recall phase, the proposed system recalls relational knowledge from the learned knowledge based on the input sentence. We used a new recall algorithm called "Waterfall Recall" in the recall phase. We added a function to respond to questions in natural language with "yes/no" in order to confirm the validity of the proposed system by evaluating the quantity of correct answers.
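    The interplay between an attenuating short-term memory and a non-attenuating long-term memory can be sketched as follows. This is only a toy model of that one idea: the decay factor, the promotion count, and the forgetting threshold are invented parameters, and neither the 3-dimensional self-organizing memory nor the "Potential Firing" and "Waterfall Recall" algorithms are reproduced here.

```python
class RelationMemory:
    """Toy short-term/long-term store for word relations (assumed parameters)."""

    def __init__(self, decay=0.5, promote_after=3, forget_below=0.1):
        self.decay = decay                  # attenuation per processing step
        self.promote_after = promote_after  # appearances needed for promotion
        self.forget_below = forget_below    # strength below which a relation is dropped
        self.short_term = {}                # relation -> current strength
        self.counts = {}                    # relation -> number of appearances
        self.long_term = set()              # relations kept without attenuation

    def observe(self, relation):
        # Every processing step attenuates whatever sits in short-term memory.
        for rel in list(self.short_term):
            self.short_term[rel] *= self.decay
            if self.short_term[rel] < self.forget_below:
                del self.short_term[rel]
        if relation in self.long_term:
            return
        self.counts[relation] = self.counts.get(relation, 0) + 1
        if self.counts[relation] >= self.promote_after:
            # Frequent relations move to long-term memory and stop decaying.
            self.long_term.add(relation)
            self.short_term.pop(relation, None)
        else:
            self.short_term[relation] = 1.0


memory = RelationMemory()
frequent = ("dog", "is", "animal")
rare = ("cat", "likes", "fish")
for _ in range(3):
    memory.observe(frequent)
memory.observe(rare)
for filler in ["c", "d", "e", "f"]:
    memory.observe((filler, "is", "thing"))
print(frequent in memory.long_term, rare in memory.short_term)
```

    With these assumed settings, a relation seen three times is retained, while a relation seen once fades after four further processing steps.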

  5. L3 Interactive Data Language

    2006-09-05

    The L3 system is a computational steering environment for image processing and scientific computing. It consists of an interactive graphical language and interface. Its purpose is to help advanced users in controlling their computational software and assist in the management of data accumulated during numerical experiments. L3 provides a combination of features not found in other environments: textual and graphical construction of programs; persistence of programs and associated data; direct mapping between the scripts, the parameters, and the produced data; implicit hierarchical data organization; full programmability, including conditionals and functions; and incremental execution of programs. The software includes the l3 language and the graphical environment. The language is a single-assignment functional language; the implementation consists of lexer, parser, interpreter, storage handler, and editing support. The graphical environment is an event-driven nested list viewer/editor providing graphical elements corresponding to the language. These elements are both the representation of a user's program and active interfaces to the values computed by that program.

  6. A Requirements-Based Exploration of Open-Source Software Development Projects--Towards a Natural Language Processing Software Analysis Framework

    ERIC Educational Resources Information Center

    Vlas, Radu Eduard

    2012-01-01

    Open source projects do have requirements; they are, however, mostly informal, text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming, and for large projects,…

  7. A translator writing system for microcomputer high-level languages and assemblers

    NASA Technical Reports Server (NTRS)

    Collins, W. R.; Knight, J. C.; Noonan, R. E.

    1980-01-01

    In order to implement high-level languages whenever possible, a translator writing system of advanced design was developed. It is intended for routine production use by many programmers working on different projects. As well as a fairly conventional parser generator, it includes a system for the rapid generation of table-driven code generators. The parser generator was developed from a prototype version. The translator writing system includes various tools for the management of the source text of a compiler under construction. In addition, it supplies various default source code sections so that its output is always compilable and executable. The system thereby encourages iterative enhancement as a development methodology by ensuring an executable program from the earliest stages of a compiler development project. The translator writing system includes a PASCAL/48 compiler, three assemblers, and two compilers for a subset of HAL/S.

  8. On the nature and evolution of the neural bases of human language

    NASA Technical Reports Server (NTRS)

    Lieberman, Philip

    2002-01-01

    The traditional theory equating the brain bases of language with Broca's and Wernicke's neocortical areas is wrong. Neural circuits linking activity in anatomically segregated populations of neurons in subcortical structures and the neocortex throughout the human brain regulate complex behaviors such as walking, talking, and comprehending the meaning of sentences. When we hear or read a word, neural structures involved in the perception or real-world associations of the word are activated as well as posterior cortical regions adjacent to Wernicke's area. Many areas of the neocortex and subcortical structures support the cortical-striatal-cortical circuits that confer complex syntactic ability, speech production, and a large vocabulary. However, many of these structures also form part of the neural circuits regulating other aspects of behavior. For example, the basal ganglia, which regulate motor control, are also crucial elements in the circuits that confer human linguistic ability and abstract reasoning. The cerebellum, traditionally associated with motor control, is active in motor learning. The basal ganglia are also key elements in reward-based learning. Data from studies of Broca's aphasia, Parkinson's disease, hypoxia, focal brain damage, and a genetically transmitted brain anomaly (the putative "language gene," family KE), and from comparative studies of the brains and behavior of other species, demonstrate that the basal ganglia sequence the discrete elements that constitute a complete motor act, syntactic process, or thought process. Imaging studies of intact human subjects and electrophysiologic and tracer studies of the brains and behavior of other species confirm these findings. As Dobzhansky put it, "Nothing in biology makes sense except in the light of evolution" (cited in Mayr, 1982). That applies with as much force to the human brain and the neural bases of language as it does to the human foot or jaw.
The converse follows: the mark of evolution on

  9. On the nature and evolution of the neural bases of human language.

    PubMed

    Lieberman, Philip

    2002-01-01

    The traditional theory equating the brain bases of language with Broca's and Wernicke's neocortical areas is wrong. Neural circuits linking activity in anatomically segregated populations of neurons in subcortical structures and the neocortex throughout the human brain regulate complex behaviors such as walking, talking, and comprehending the meaning of sentences. When we hear or read a word, neural structures involved in the perception or real-world associations of the word are activated as well as posterior cortical regions adjacent to Wernicke's area. Many areas of the neocortex and subcortical structures support the cortical-striatal-cortical circuits that confer complex syntactic ability, speech production, and a large vocabulary. However, many of these structures also form part of the neural circuits regulating other aspects of behavior. For example, the basal ganglia, which regulate motor control, are also crucial elements in the circuits that confer human linguistic ability and abstract reasoning. The cerebellum, traditionally associated with motor control, is active in motor learning. The basal ganglia are also key elements in reward-based learning. Data from studies of Broca's aphasia, Parkinson's disease, hypoxia, focal brain damage, and a genetically transmitted brain anomaly (the putative "language gene," family KE), and from comparative studies of the brains and behavior of other species, demonstrate that the basal ganglia sequence the discrete elements that constitute a complete motor act, syntactic process, or thought process. Imaging studies of intact human subjects and electrophysiologic and tracer studies of the brains and behavior of other species confirm these findings. As Dobzhansky put it, "Nothing in biology makes sense except in the light of evolution" (cited in Mayr, 1982). That applies with as much force to the human brain and the neural bases of language as it does to the human foot or jaw. The converse follows: the mark of evolution on

  11. The Two Cultures of Science: On Language-Culture Incommensurability Concerning "Nature" and "Observation"

    ERIC Educational Resources Information Center

    Loo, Seng Piew

    2007-01-01

    Culture without nature is empty, nature without culture is deaf. Intercultural dialogue in higher education around the globe is needed to improve the theory, policy and practice of science and science education. The culture, cosmology and philosophy of "global" science as practiced today in all societies around the world are seemingly anchored in…

  12. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    PubMed Central

    2016-01-01

    According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system is to be introduced in the near future. In this context arises the problem of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for classifying medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results demonstrated that both machine learning methods performed successfully, with SVM slightly ahead. In the process of classification a “shrink” method, based on feature selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, at the second stage, classification into subclasses, 23% of all documents could not be linked to only one definite subclass (liver or biliary system) due to common features characterizing these subclasses. The overall results of the study were successful. PMID:27668260
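
    As a rough illustration of the K-Nearest Neighbor step named in this abstract, the following minimal sketch classifies short free-text records by majority vote among the most similar training records. Everything concrete here is an assumption made for illustration: the English toy records, the three labels, whitespace tokenization, and cosine similarity over word counts. It is not the authors' pipeline and does not include their "shrink" feature selection.

```python
from collections import Counter
import math

def bag_of_words(text):
    """Lower-case whitespace tokenization into word counts (an assumed, very simple tokenizer)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def knn_classify(text, training, k=3):
    """Return the majority label among the k training records most similar to text."""
    query = bag_of_words(text)
    ranked = sorted(
        training,
        key=lambda item: cosine(query, bag_of_words(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy records; the study itself used Georgian-language documents.
TRAINING = [
    ("ultrasound of liver shows normal echo structure", "ultrasonography"),
    ("ultrasound of kidney shows small cyst", "ultrasonography"),
    ("endoscopy of stomach shows mild gastritis", "endoscopy"),
    ("endoscopy of esophagus shows no lesion", "endoscopy"),
    ("chest x-ray shows clear lung fields", "x-ray"),
    ("x-ray of wrist shows no fracture", "x-ray"),
]

if __name__ == "__main__":
    print(knn_classify("ultrasound of liver and kidney", TRAINING))
```

    With records that share most of their vocabulary, the nearest neighbors can come from more than one class, which is one plausible reading of why some documents in the study could not be assigned to a single subclass.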

  13. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    PubMed Central

    2016-01-01

    According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system is to be introduced in the near future. In this context arises the problem of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for classifying medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results demonstrated that both machine learning methods performed successfully, with SVM slightly ahead. In the process of classification a “shrink” method, based on feature selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, at the second stage, classification into subclasses, 23% of all documents could not be linked to only one definite subclass (liver or biliary system) due to common features characterizing these subclasses. The overall results of the study were successful.

  14. Linguistics in Language Education

    ERIC Educational Resources Information Center

    Kumar, Rajesh; Yunus, Reva

    2014-01-01

    This article looks at the contribution of insights from theoretical linguistics to an understanding of language acquisition and the nature of language in terms of their potential benefit to language education. We examine the ideas of innateness and universal language faculty, as well as multilingualism and the language-society relationship. Modern…

  15. The development of a natural language interface to a geographical information system

    NASA Technical Reports Server (NTRS)

    Toledo, Sue Walker; Davis, Bruce

    1993-01-01

    This paper will discuss a two-and-a-half-year project undertaken to develop an English-language interface for the geographical information system GRASS. The work was carried out for NASA by a small business, Netrologic, based in San Diego, California, under Phase 1 and 2 Small Business Innovative Research contracts. We consider here the potential value of this system, whose current functionality addresses numerical, categorical and boolean raster layers and includes the display of point sets defined by constraints on one or more layers, answers yes/no and numerical questions, and creates statistical reports. It also handles complex queries and lexical ambiguities, and allows temporarily switching to UNIX or GRASS.

  16. Writing in science: Exploring teachers' and students' views of the nature of science in language enriched environments

    NASA Astrophysics Data System (ADS)

    Decoito, Isha

    Writing in science can be used to address some of the issues relevant to contemporary scientific literacy, such as the nature of science, which describes the scientific enterprise for science education. This has implications for the kinds of writing tasks students should attempt in the classroom, and for how students should understand the rationale and claims of these tasks. While scientific writing may train the mind to think scientifically in a disciplined and structured way, thus encouraging students to gain access to the public domain of scientific knowledge, the counter-argument is that students need to be able to express their thoughts freely in their own language. Writing activities must aim to promote philosophical and epistemological views of science that accurately portray contemporary science. This mixed-methods case study explored language-enriched environments, in this case secondary science classrooms with a focus on teacher-developed activities, involving diversified writing styles, that were directly linked to the science curriculum. The research foci included: teachers' implementation of these activities in their classrooms; how the activities reflected the teachers' nature of science views; common attributes between students' views of science and how they represented science in their writings; and if, and how, the activities influenced students' nature of science views. Teachers' and students' views of writing and the nature of science are illustrated through pre- and post-questionnaire responses; interviews; student work; and classroom observations. Results indicated that diversified writing activities have the potential to accurately portray science to students, personalize learning in science, improve students' overall attitude towards science, and enhance scientific literacy through learning science, learning about science, and doing science. Further research is necessary to develop an understanding of whether the choice of genre has an

  17. On the Dual Nature of the Functional Discourse Grammar Model: Context, the Language System/Language Use Distinction, and Indexical Reference in Discourse

    ERIC Educational Resources Information Center

    Cornish, Francis

    2013-01-01

    The Functional Discourse Grammar model has a twofold objective: on the one hand, to provide a descriptively, psychologically and pragmatically adequate account of the forms made available by a typologically diverse range of languages; and on the other, to provide a model of language which is set up to reflect, at one remove, certain of the stages…

  18. Why is combinatorial communication rare in the natural world, and why is language an exception to this trend?

    PubMed Central

    Scott-Phillips, Thomas C.; Blythe, Richard A.

    2013-01-01

    In a combinatorial communication system, some signals consist of the combinations of other signals. Such systems are more efficient than equivalent, non-combinatorial systems, yet despite this they are rare in nature. Why? Previous explanations have focused on the adaptive limits of combinatorial communication, or on its purported cognitive difficulties, but neither of these explains the full distribution of combinatorial communication in the natural world. Here, we present a nonlinear dynamical model of the emergence of combinatorial communication that, unlike previous models, considers how initially non-communicative behaviour evolves to take on a communicative function. We derive three basic principles about the emergence of combinatorial communication. We hence show that the interdependence of signals and responses places significant constraints on the historical pathways by which combinatorial signals might emerge, to the extent that anything other than the most simple form of combinatorial communication is extremely unlikely. We also argue that these constraints can be bypassed if individuals have the socio-cognitive capacity to engage in ostensive communication. Humans, but probably no other species, have this ability. This may explain why language, which is massively combinatorial, is such an extreme exception to nature's general trend for non-combinatorial communication. PMID:24047871

  19. Why is combinatorial communication rare in the natural world, and why is language an exception to this trend?

    PubMed

    Scott-Phillips, Thomas C; Blythe, Richard A

    2013-11-01

    In a combinatorial communication system, some signals consist of the combinations of other signals. Such systems are more efficient than equivalent, non-combinatorial systems, yet despite this they are rare in nature. Why? Previous explanations have focused on the adaptive limits of combinatorial communication, or on its purported cognitive difficulties, but neither of these explains the full distribution of combinatorial communication in the natural world. Here, we present a nonlinear dynamical model of the emergence of combinatorial communication that, unlike previous models, considers how initially non-communicative behaviour evolves to take on a communicative function. We derive three basic principles about the emergence of combinatorial communication. We hence show that the interdependence of signals and responses places significant constraints on the historical pathways by which combinatorial signals might emerge, to the extent that anything other than the most simple form of combinatorial communication is extremely unlikely. We also argue that these constraints can be bypassed if individuals have the socio-cognitive capacity to engage in ostensive communication. Humans, but probably no other species, have this ability. This may explain why language, which is massively combinatorial, is such an extreme exception to nature's general trend for non-combinatorial communication.

  20. Computer-Aided TRIZ Ideality and Level of Invention Estimation Using Natural Language Processing and Machine Learning

    NASA Astrophysics Data System (ADS)

    Adams, Christopher; Tate, Derrick

    Patent textual descriptions provide a wealth of information that can be used to understand the underlying design approaches that result in the generation of novel and innovative technology. This article will discuss a new approach for estimating Degree of Ideality and Level of Invention metrics from the theory of inventive problem solving (TRIZ) using patent textual information. Patent text includes information that can be used to model both the functions performed by a design and the associated costs and problems that affect a design’s value. The motivation of this research is to use patent data with calculation of TRIZ metrics to help designers understand which combinations of system components and functions result in creative and innovative design solutions. This article will discuss in detail methods to estimate these TRIZ metrics using natural language processing and machine learning with the use of neural networks.

  1. Classification of CT pulmonary angiography reports by presence, chronicity, and location of pulmonary embolism with natural language processing.

    PubMed

    Yu, Sheng; Kumamaru, Kanako K; George, Elizabeth; Dunne, Ruth M; Bedayat, Arash; Neykov, Matey; Hunsaker, Andetta R; Dill, Karin E; Cai, Tianxi; Rybicki, Frank J

    2014-12-01

    In this paper we describe an efficient tool based on natural language processing for classifying the detailed state of pulmonary embolism (PE) recorded in CT pulmonary angiography reports. The classification tasks include: PE present vs. absent, acute PE vs. others, central PE vs. others, and subsegmental PE vs. others. Statistical learning algorithms were trained with features extracted using the NLP tool and gold standard labels obtained via chart review from two radiologists. The areas under the receiver operating characteristic curves (AUC) for the four tasks were 0.998, 0.945, 0.987, and 0.986, respectively. We compared our classifiers with bag-of-words Naive Bayes classifiers, a standard text mining technology, which gave AUC 0.942, 0.765, 0.766, and 0.712, respectively. PMID:25117751
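
    For readers unfamiliar with the AUC figures quoted in this abstract, the sketch below computes the area under the ROC curve from binary labels and classifier scores, using the equivalent pairwise definition: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half. The function name and the toy labels and scores are assumptions for illustration and are unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 gold-standard labels
    scores: iterable of classifier scores (higher means "more likely positive")
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))

if __name__ == "__main__":
    # Hypothetical scores for four reports: two with PE (1), two without (0).
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

    The quadratic pairwise loop is chosen for clarity; a rank-based computation gives the same value more efficiently on large report sets.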

  2. An Evaluation of a Natural Language Processing Tool for Identifying and Encoding Allergy Information in Emergency Department Clinical Notes

    PubMed Central

    Goss, Foster R.; Plasek, Joseph M.; Lau, Jason J.; Seger, Diane L.; Chang, Frank Y.; Zhou, Li

    2014-01-01

    Emergency department (ED) visits due to allergic reactions are common. Allergy information is often recorded in free-text provider notes; however, this domain has not yet been widely studied by the natural language processing (NLP) community. We developed an allergy module built on the MTERMS NLP system to identify and encode food, drug, and environmental allergies and allergic reactions. The module included updates to our lexicon using standard terminologies, and novel disambiguation algorithms. We developed an annotation schema and annotated 400 ED notes that served as a gold standard for comparison to MTERMS output. MTERMS achieved an F-measure of 87.6% for the detection of allergen names and no known allergies, 90% for identifying true reactions in each allergy statement where true allergens were also identified, and 69% for linking reactions to their allergen. These preliminary results demonstrate the feasibility of using NLP to extract and encode allergy information from clinical notes. PMID:25954363
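
    The F-measure reported in this abstract combines precision and recall against a gold standard. The sketch below shows the standard computation from true-positive, false-positive, and false-negative counts; the function name and the counts in the usage line are hypothetical, and the abstract does not state which beta weighting was used, so the balanced case (beta = 1) is assumed.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """Return (precision, recall, F-beta) from counts against a gold standard.

    tp: items found by the system that are in the gold standard
    fp: items found by the system that are not in the gold standard
    fn: gold-standard items the system missed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f

if __name__ == "__main__":
    # Hypothetical counts: 8 allergens found correctly, 2 spurious, 2 missed.
    print(f_measure(tp=8, fp=2, fn=2))
```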

  3. The Study of Natural Sign Language in Eighteenth-Century France.

    ERIC Educational Resources Information Center

    Fischer, Renate

    2002-01-01

    Examines how Pierre Desloges--a deaf person who was interested in the aptness of "natural signs" to express complex concepts and who highlighted the community aspect of communication--described and categorized signs used by the Deaf community in Paris in 1779. Presents additional sources of information on communication of deaf people. (Author/VWL)

  4. Naturalization Language Testing and Its Basis in Ideologies of National Identity and Citizenship.

    ERIC Educational Resources Information Center

    Piller, Ingrid

    2001-01-01

    Examines the relationship of ideologies of national and linguistic identity and the ways in which they impact upon ideologies of citizenship. Describes current naturalization legislation in a number of countries and the ways in which it is based on these ideologies. Particular focus is on Germany. (Author/VWL)

  5. Psychological linguistics: A natural science approach to the study of language interactions

    PubMed Central

    Bijou, Sidney W.; Umbreit, John; Ghezzi, Patrick M.; Chao, Chia-Chen

    1986-01-01

    Kantor's theoretical analysis of “psychological linguistics” offers a natural science approach to the study of linguistic behavior and interactions. This paper includes brief descriptions of (a) some of the basic assumptions of the approach, (b) Kantor's conception of linguistic behavior and interactions, (c) a compatible research method and sample research data, and (d) some areas of research and application. PMID:22477507

  6. Neurolinguistic approach to natural language processing with applications to medical text analysis.

    PubMed

    Duch, Włodzisław; Matykiewicz, Paweł; Pestian, John

    2008-12-01

    Understanding written or spoken language presumably involves spreading neural activation in the brain. This process may be approximated by spreading activation in semantic networks, providing enhanced representations that involve concepts not found directly in the text. The approximation of this process is of great practical and theoretical interest. Although activations of neural circuits involved in representation of words rapidly change in time, snapshots of these activations spreading through associative networks may be captured in a vector model. Concepts of similar type activate larger clusters of neurons, priming areas in the left and right hemisphere. Analysis of recent brain imaging experiments shows the importance of the right hemisphere non-verbal clusterization. Medical ontologies enable development of a large-scale practical algorithm to re-create pathways of spreading neural activations. First, concepts of specific semantic type are identified in the text, and then all related concepts of the same type are added to the text, providing expanded representations. To avoid rapid growth of the extended feature space, after each step only the most useful features that increase document clusterization are retained. Short hospital discharge summaries are used to illustrate how this process works on real, very noisy data. Expanded texts show significantly improved clustering and may be classified with much higher accuracy. Although better approximations to the spreading of neural activations may be devised, the practical approach presented in this paper helps to discover pathways used by the brain to process specific concepts, and may be used in large-scale applications. PMID:18614334
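
    The expansion procedure described in this abstract (identify concepts, add related concepts, keep only the most useful additions) can be sketched as spreading activation over a small weighted graph. This is a toy under stated assumptions, not the authors' algorithm: the network, its weights, the decay factor, and the function names are hypothetical, and "keep the top-k most activated concepts" stands in for the paper's criterion of retaining features that improve document clustering.

```python
def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Propagate activation from seed concepts along weighted edges.

    graph: dict mapping concept -> {neighbor: edge weight}
    seeds: concepts found directly in the text (each starts with activation 1.0)
    """
    activation = {node: 1.0 for node in seeds}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for node, value in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                passed = value * weight * decay  # attenuate at every hop
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        for node, value in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + value
        frontier = next_frontier
    return activation

def expand_terms(graph, seeds, keep=3, **kwargs):
    """Return the `keep` most activated concepts that were not in the original text."""
    activation = spread_activation(graph, seeds, **kwargs)
    added = [(n, a) for n, a in activation.items() if n not in seeds]
    added.sort(key=lambda item: (-item[1], item[0]))  # strongest first, then by name
    return [n for n, _ in added[:keep]]

# Hypothetical fragment of a medical semantic network.
NETWORK = {
    "asthma": {"wheezing": 0.9, "bronchodilator": 0.8, "lung": 0.6},
    "wheezing": {"lung": 0.7},
    "bronchodilator": {"albuterol": 0.9},
}

if __name__ == "__main__":
    print(expand_terms(NETWORK, ["asthma"]))
```

    Capping the number of retained concepts after each expansion is what keeps the feature space from growing rapidly, which is the concern the abstract raises.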

  7. Object-oriented knowledge representation in a natural language understanding system of economic surveys

    NASA Astrophysics Data System (ADS)

    Planes, Jean-Christophe; Trigano, Philippe

    1992-03-01

    The HIRONDELLE research project of the Banque de France intends to summarize economic surveys giving statements about a specific economic domain. The principal goal is the detection of causal relations between economic events appearing in the texts. We will focus on knowledge representation, based on three distinct hierarchical structures. The first one concerns the lexical items and allows inheritance of syntactic properties. Descriptions of the application domains are achieved by a taxonomy based on attribute-value models and case relations, adapted to the economic sectors. The summarization goal of this system defines a set of primitives representing statements and causality meta-language. The semantic analysis of the texts is based on two phases. The first one leads to a propositional representation of the sentences through conceptual graphs formalization, taking into account the syntactic transformations of sentences. The second one is dedicated to the summarizing role of the system, detecting paraphrastic sentences by processing syntactic and semantic transformations like negation or metonymic constructions.

  8. Arbitrary symbolism in natural language revisited: when word forms carry meaning.

    PubMed

    Reilly, Jamie; Westbury, Chris; Kean, Jacob; Peelle, Jonathan E

    2012-01-01

    Cognitive science has a rich history of interest in the ways that languages represent abstract and concrete concepts (e.g., idea vs. dog). Until recently, this focus has centered largely on aspects of word meaning and semantic representation. However, recent corpora analyses have demonstrated that abstract and concrete words are also marked by phonological, orthographic, and morphological differences. These regularities in sound-meaning correspondence potentially allow listeners to infer certain aspects of semantics directly from word form. We investigated this relationship between form and meaning in a series of four experiments. In Experiments 1-2 we examined the role of metalinguistic knowledge in semantic decision by asking participants to make semantic judgments for aurally presented nonwords selectively varied by specific acoustic and phonetic parameters. Participants consistently associated increased word length and diminished wordlikeness with abstract concepts. In Experiment 3, participants completed a semantic decision task (i.e., abstract or concrete) for real words varied by length and concreteness. Participants were more likely to misclassify longer, inflected words (e.g., "apartment") as abstract and shorter uninflected abstract words (e.g., "fate") as concrete. In Experiment 4, we used a multiple regression to predict trial level naming data from a large corpus of nouns which revealed significant interaction effects between concreteness and word form. Together these results provide converging evidence for the hypothesis that listeners map sound to meaning through a non-arbitrary process using prior knowledge about statistical regularities in the surface forms of words.

  9. Neurolinguistic Approach to Natural Language Processing with Applications to Medical Text Analysis

    PubMed Central

    Matykiewicz, Paweł; Pestian, John

    2008-01-01

    Understanding written or spoken language presumably involves spreading neural activation in the brain. This process may be approximated by spreading activation in semantic networks, providing enhanced representations that involve concepts that are not found directly in the text. Approximation of this process is of great practical and theoretical interest. Although activations of neural circuits involved in representation of words rapidly change in time, snapshots of these activations spreading through associative networks may be captured in a vector model. Concepts of similar type activate larger clusters of neurons, priming areas in the left and right hemisphere. Analysis of recent brain imaging experiments shows the importance of the right hemisphere non-verbal clusterization. Medical ontologies enable development of a large-scale practical algorithm to re-create pathways of spreading neural activations. First, concepts of specific semantic type are identified in the text, and then all related concepts of the same type are added to the text, providing expanded representations. To avoid rapid growth of the extended feature space, after each step only the most useful features that increase document clusterization are retained. Short hospital discharge summaries are used to illustrate how this process works on real, very noisy data. Expanded texts show significantly improved clustering and may be classified with much higher accuracy. Although better approximations to the spreading of neural activations may be devised, the practical approach presented in this paper helps to discover pathways used by the brain to process specific concepts, and may be used in large-scale applications. PMID:18614334

  10. Arbitrary Symbolism in Natural Language Revisited: When Word Forms Carry Meaning

    PubMed Central

    Reilly, Jamie; Westbury, Chris; Kean, Jacob; Peelle, Jonathan E.

    2012-01-01

    Cognitive science has a rich history of interest in the ways that languages represent abstract and concrete concepts (e.g., idea vs. dog). Until recently, this focus has centered largely on aspects of word meaning and semantic representation. However, recent corpora analyses have demonstrated that abstract and concrete words are also marked by phonological, orthographic, and morphological differences. These regularities in sound-meaning correspondence potentially allow listeners to infer certain aspects of semantics directly from word form. We investigated this relationship between form and meaning in a series of four experiments. In Experiments 1–2 we examined the role of metalinguistic knowledge in semantic decision by asking participants to make semantic judgments for aurally presented nonwords selectively varied by specific acoustic and phonetic parameters. Participants consistently associated increased word length and diminished wordlikeness with abstract concepts. In Experiment 3, participants completed a semantic decision task (i.e., abstract or concrete) for real words varied by length and concreteness. Participants were more likely to misclassify longer, inflected words (e.g., “apartment”) as abstract and shorter uninflected abstract words (e.g., “fate”) as concrete. In Experiment 4, we used a multiple regression to predict trial level naming data from a large corpus of nouns which revealed significant interaction effects between concreteness and word form. Together these results provide converging evidence for the hypothesis that listeners map sound to meaning through a non-arbitrary process using prior knowledge about statistical regularities in the surface forms of words. PMID:22879931

  11. The nature of facilitation and interference in the multilingual language system: insights from treatment in a case of trilingual aphasia.

    PubMed

    Keane, Caitlin; Kiran, Swathi

    2015-01-01

    The rehabilitation study described here sets out to test the premise of Abutalebi and Green's neurocognitive model--specifically, that language selection and control are components of overall cognitive control. We follow a trilingual woman (first language, L1: Amharic; second language, L2: English; third language, L3: French) with damage to the left frontal lobe and left basal ganglia who presented with cognitive control and naming deficits, through two periods of semantic treatment (French, followed by English) to alleviate naming deficits. The results showed that while the participant improved on trained items, she did not show within- or cross-language generalization. In addition, error patterns revealed a substantial increase of interference of the currently trained language into the nontrained language during each of the two treatment phases. These results are consistent with Abutalebi and Green's neurocognitive model and support the claim that language selection and control are components of overall cognitive control.

  12. The nature of facilitation and interference in the multilingual language system: insights from treatment in a case of trilingual aphasia.

    PubMed

    Keane, Caitlin; Kiran, Swathi

    2015-01-01

    The rehabilitation study described here sets out to test the premise of Abutalebi and Green's neurocognitive model--specifically, that language selection and control are components of overall cognitive control. We follow a trilingual woman (first language, L1: Amharic; second language, L2: English; third language, L3: French) with damage to the left frontal lobe and left basal ganglia who presented with cognitive control and naming deficits, through two periods of semantic treatment (French, followed by English) to alleviate naming deficits. The results showed that while the participant improved on trained items, she did not show within- or cross-language generalization. In addition, error patterns revealed a substantial increase of interference of the currently trained language into the nontrained language during each of the two treatment phases. These results are consistent with Abutalebi and Green's neurocognitive model and support the claim that language selection and control are components of overall cognitive control. PMID:26377506

  13. Natural language query system design for interactive information storage and retrieval systems. Presentation visuals. M.S. Thesis Final Report, 1 Jul. 1985 - 31 Dec. 1987

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung

    1985-01-01

    This Working Paper Series entry represents a collection of presentation visuals associated with the companion report entitled Natural Language Query System Design for Interactive Information Storage and Retrieval Systems, USL/DBMS NASA/RECON Working Paper Series report number DBMS.NASA/RECON-17.

  14. SIMD-parallel understanding of natural language with application to magnitude-only optical parsing of text

    NASA Astrophysics Data System (ADS)

    Schmalz, Mark S.

    1992-08-01

    A novel parallel model of natural language (NL) understanding is presented which can realize high levels of semantic abstraction, and is designed for implementation on synchronous SIMD architectures and optical processors. Theory is expressed in terms of the Image Algebra (IA), a rigorous, concise, inherently parallel notation which unifies the design, analysis, and implementation of image processing algorithms. The IA has been implemented on numerous parallel architectures, and IA preprocessors and interpreters are available for the FORTRAN and Ada languages. In a previous study, we demonstrated the utility of IA for mapping MEA-conformable (Multiple Execution Array) algorithms to optical architectures. In this study, we extend our previous theory to map serial parsing algorithms to the synchronous SIMD paradigm. We initially derive a two-dimensional image that is based upon the adjacency matrix of a semantic graph. Via IA template mappings, the operations of bottom-up parsing, semantic disambiguation, and referential resolution are implemented as image-processing operations upon the adjacency matrix. Pixel-level operations are constrained to Hadamard addition and multiplication, thresholding, and row/column summation, which are available in magnitude-only optics. Assuming high parallelism in the parse rule base, the parsing of n input symbols with a grammar consisting of M rules of arity H, on an N-processor architecture, could exhibit time complexity of T(n)

  15. Language and human nature: Kurt Goldstein's neurolinguistic foundation of a holistic philosophy.

    PubMed

    Ludwig, David

    2012-01-01

    Holism in interwar Germany provides an excellent example for social and political influences on scientific developments. Deeply impressed by the ubiquitous invocation of a cultural crisis, biologists, physicians, and psychologists presented holistic accounts as an alternative to the "mechanistic worldview" of the nineteenth century. Although the ideological background of these accounts is often blatantly obvious, many holistic scientists did not content themselves with a general opposition to a mechanistic worldview but aimed at a rational foundation of their holistic projects. This article will discuss the work of Kurt Goldstein, who is known for both his groundbreaking contributions to neuropsychology and his holistic philosophy of human nature. By focusing on Goldstein's neurolinguistic research, I want to reconstruct the empirical foundations of his holistic program without ignoring its cultural background. In this sense, Goldstein's work provides a case study for the formation of a scientific theory through the complex interplay between specific empirical evidences and the general cultural developments of the Weimar Republic. PMID:25363384

  17. A perspective on the advancement of natural language processing tasks via topological analysis of complex networks. Comment on "Approaching human language with complex networks" by Cong and Liu

    NASA Astrophysics Data System (ADS)

    Amancio, Diego Raphael

    2014-12-01

    Concepts and methods of complex networks have been applied to probe the properties of a myriad of real systems [1]. The finding that written texts modeled as graphs share several properties of other completely different real systems has inspired the study of language as a complex system [2]. Actually, language can be represented as a complex network in its several levels of complexity. As a consequence, morphological, syntactical and semantical properties have been employed in the construction of linguistic networks [3]. Even the character level has been useful to unfold particular patterns [4,5]. In the review by Cong and Liu [6], the authors emphasize the need to use the topological information of complex networks modeling the various spheres of the language to better understand its origins, evolution and organization. In addition, the authors cite the use of networks in applications aiming at holistic typology and stylistic variations. In this context, I will discuss some possible directions that could be followed in future research directed towards the understanding of language via topological characterization of complex linguistic networks. In addition, I will comment on the use of network models for language processing applications. Additional prospects for future practical research lines will also be discussed in this comment.

  18. Coh-metrix: analysis of text on cohesion and language.

    PubMed

    Graesser, Arthur C; McNamara, Danielle S; Louwerse, Max M; Cai, Zhiqiang

    2004-05-01

    Advances in computational linguistics and discourse processing have made it possible to automate many language- and text-processing mechanisms. We have developed a computer tool called Coh-Metrix, which analyzes texts on over 200 measures of cohesion, language, and readability. Its modules use lexicons, part-of-speech classifiers, syntactic parsers, templates, corpora, latent semantic analysis, and other components that are widely used in computational linguistics. After the user enters an English text, Coh-Metrix returns measures requested by the user. In addition, a facility allows the user to store the results of these analyses in data files (such as Text, Excel, and SPSS). Standard text readability formulas scale texts on difficulty by relying on word length and sentence length, whereas Coh-Metrix is sensitive to cohesion relations, world knowledge, and language and discourse characteristics. PMID:15354684

  19. Neutrality in Language Policy

    ERIC Educational Resources Information Center

    Wee, Lionel

    2010-01-01

    The unavoidability of language makes it critical that language policies appeal to some notion of language neutrality as part of their rationale, in order to assuage concerns that the policies might otherwise be unduly discriminatory. However, the idea of language neutrality is deeply ideological in nature, since it is not only an attempt to treat…

  20. Informatics in radiology: RADTF: a semantic search-enabled, natural language processor-generated radiology teaching file.

    PubMed

    Do, Bao H; Wu, Andrew; Biswal, Sandip; Kamaya, Aya; Rubin, Daniel L

    2010-11-01

    Storing and retrieving radiology cases is an important activity for education and clinical research, but this process can be time-consuming. In the process of structuring reports and images into organized teaching files, incidental pathologic conditions not pertinent to the primary teaching point can be omitted, as when a user saves images of an aortic dissection case but disregards the incidental osteoid osteoma. An alternate strategy for identifying teaching cases is text search of reports in radiology information systems (RIS), but retrieved reports are unstructured, teaching-related content is not highlighted, and patient identifying information is not removed. Furthermore, searching unstructured reports requires sophisticated retrieval methods to achieve useful results. An open-source, RadLex®-compatible teaching file solution called RADTF, which uses natural language processing (NLP) methods to process radiology reports, was developed to create a searchable teaching resource from the RIS and the picture archiving and communication system (PACS). The NLP system extracts and de-identifies teaching-relevant statements from full reports to generate a stand-alone database, thus converting existing RIS archives into an on-demand source of teaching material. Using RADTF, the authors generated a semantic search-enabled, Web-based radiology archive containing over 700,000 cases with millions of images. RADTF combines a compact representation of the teaching-relevant content in radiology reports and a versatile search engine with the scale of the entire RIS-PACS collection of case material.

  1. Using natural language processing to improve efficiency of manual chart abstraction in research: the case of breast cancer recurrence.

    PubMed

    Carrell, David S; Halgrim, Scott; Tran, Diem-Thy; Buist, Diana S M; Chubak, Jessica; Chapman, Wendy W; Savova, Guergana

    2014-03-15

    The increasing availability of electronic health records (EHRs) creates opportunities for automated extraction of information from clinical text. We hypothesized that natural language processing (NLP) could substantially reduce the burden of manual abstraction in studies examining outcomes, like cancer recurrence, that are documented in unstructured clinical text, such as progress notes, radiology reports, and pathology reports. We developed an NLP-based system using open-source software to process electronic clinical notes from 1995 to 2012 for women with early-stage incident breast cancers to identify whether and when recurrences were diagnosed. We developed and evaluated the system using clinical notes from 1,472 patients receiving EHR-documented care in an integrated health care system in the Pacific Northwest. A separate study provided the patient-level reference standard for recurrence status and date. The NLP-based system correctly identified 92% of recurrences and estimated diagnosis dates within 30 days for 88% of these. Specificity was 96%. The NLP-based system overlooked 5 of 65 recurrences, 4 because electronic documents were unavailable. The NLP-based system identified 5 other recurrences incorrectly classified as nonrecurrent in the reference standard. If used in similar cohorts, NLP could reduce by 90% the number of EHR charts abstracted to identify confirmed breast cancer recurrence cases at a rate comparable to traditional abstraction.
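
    The reported sensitivity can be checked from the counts given in the abstract (65 reference-standard recurrences, 5 overlooked). The short sketch below is only that arithmetic; the function name is illustrative, and the abstract does not give the counts needed to recompute specificity.

```python
def sensitivity(true_positives, false_negatives):
    """Share of reference-standard positives that the system found."""
    return true_positives / (true_positives + false_negatives)

# Counts from the abstract: 65 recurrences in the reference standard,
# 5 of which the NLP-based system overlooked.
reference_recurrences = 65
missed = 5
found = reference_recurrences - missed  # 60

print(f"sensitivity = {sensitivity(found, missed):.1%}")
```

    The result, 60 of 65, rounds to the 92% stated in the abstract.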

  2. Tracking irregular morphophonological dependencies in natural language: evidence from the acquisition of subject-verb agreement in French.

    PubMed

    Nazzi, Thierry; Barrière, Isabelle; Goyet, Louise; Kresh, Sarah; Legendre, Géraldine

    2011-07-01

    This study examines French-learning infants' sensitivity to grammatical non-adjacent dependencies involving subject-verb agreement (e.g., le/les garçons lit/lisent 'the boy(s) read(s)') where number is audible on both the determiner of the subject DP and the agreeing verb, and the dependency spans two syntactic phrases. A further particularity of this subsystem of French subject-verb agreement is that number marking on the verb is phonologically highly irregular. Despite the challenge, the HPP results for 24- and 18-month-olds demonstrate knowledge of both number dependencies: between the singular determiner le and the non-adjacent singular verbal forms and between the plural determiner les and the non-adjacent plural verbal forms. A control experiment suggests that the infants are responding to known verb forms, not phonological regularities. Given the paucity of such forms in the adult input documented through a corpus study, these results are interpreted as evidence that 18-month-olds have the ability to extract complex patterns across a range of morphophonologically inconsistent and infrequent items in natural language.

  4. Combining natural language processing and network analysis to examine how advocacy organizations stimulate conversation on social media

    PubMed Central

    Bail, Christopher Andrew

    2016-01-01

    Social media sites are rapidly becoming one of the most important forums for public deliberation about advocacy issues. However, social scientists have not explained why some advocacy organizations produce social media messages that inspire far-ranging conversation among social media users, whereas the vast majority of them receive little or no attention. I argue that advocacy organizations are more likely to inspire comments from new social media audiences if they create “cultural bridges,” or produce messages that combine conversational themes within an advocacy field that are seldom discussed together. I use natural language processing, network analysis, and a social media application to analyze how cultural bridges shaped public discourse about autism spectrum disorders on Facebook over the course of 1.5 years, controlling for various characteristics of advocacy organizations, their social media audiences, and the broader social context in which they interact. I show that organizations that create substantial cultural bridges provoke 2.52 times more comments about their messages from new social media users than those that do not, controlling for these factors. This study thus offers a theory of cultural messaging and public deliberation and computational techniques for text analysis and application-based survey research. PMID:27694580

  5. Using Natural Language Processing to Improve Efficiency of Manual Chart Abstraction in Research: The Case of Breast Cancer Recurrence

    PubMed Central

    Carrell, David S.; Halgrim, Scott; Tran, Diem-Thy; Buist, Diana S. M.; Chubak, Jessica; Chapman, Wendy W.; Savova, Guergana

    2014-01-01

    The increasing availability of electronic health records (EHRs) creates opportunities for automated extraction of information from clinical text. We hypothesized that natural language processing (NLP) could substantially reduce the burden of manual abstraction in studies examining outcomes, like cancer recurrence, that are documented in unstructured clinical text, such as progress notes, radiology reports, and pathology reports. We developed an NLP-based system using open-source software to process electronic clinical notes from 1995 to 2012 for women with early-stage incident breast cancers to identify whether and when recurrences were diagnosed. We developed and evaluated the system using clinical notes from 1,472 patients receiving EHR-documented care in an integrated health care system in the Pacific Northwest. A separate study provided the patient-level reference standard for recurrence status and date. The NLP-based system correctly identified 92% of recurrences and estimated diagnosis dates within 30 days for 88% of these. Specificity was 96%. The NLP-based system overlooked 5 of 65 recurrences, 4 because electronic documents were unavailable. The NLP-based system identified 5 other recurrences incorrectly classified as nonrecurrent in the reference standard. If used in similar cohorts, NLP could reduce by 90% the number of EHR charts abstracted to identify confirmed breast cancer recurrence cases at a rate comparable to traditional abstraction. PMID:24488511

  6. Combining Speech Recognition/Natural Language Processing with 3D Online Learning Environments to Create Distributed Authentic and Situated Spoken Language Learning

    ERIC Educational Resources Information Center

    Jones, Greg; Squires, Todd; Hicks, Jeramie

    2008-01-01

    This article will describe research done at the National Institute of Multimedia in Education, Japan and the University of North Texas on the creation of a distributed Internet-based spoken language learning system that would provide more interactive and motivating learning than current multimedia and audiotape-based systems. The project combined…

  7. Second-language experience modulates first- and second-language word frequency effects: evidence from eye movement measures of natural paragraph reading.

    PubMed

    Whitford, Veronica; Titone, Debra

    2012-02-01

    We used eye movement measures of first-language (L1) and second-language (L2) paragraph reading to investigate whether the degree of current L2 exposure modulates the relative size of L1 and L2 frequency effects (FEs). The results showed that bilinguals displayed larger L2 than L1 FEs during both early- and late-stage eye movement measures, which are taken to reflect initial lexical access and postlexical access, respectively. Moreover, the magnitude of L2 FEs was inversely related to current L2 exposure, such that lower levels of L2 exposure led to larger L2 FEs. In contrast, during early-stage reading measures, bilinguals with higher levels of current L2 exposure showed larger L1 FEs than did bilinguals with lower levels of L2 exposure, suggesting that increased L2 experience modifies the earliest stages of L1 lexical access. Taken together, the findings are consistent with implicit learning accounts (e.g., Monsell, 1991), the weaker links hypothesis (Gollan, Montoya, Cera, Sandoval, Journal of Memory and Language, 58:787-814, 2008), and current bilingual visual word recognition models (e.g., the bilingual interactive activation model plus [BIA+]; Dijkstra & van Heuven, Bilingualism: Language and Cognition, 5:175-197, 2002). Thus, amount of current L2 exposure is a key determinant of FEs and, thus, lexical activation, in both the L1 and L2.

  8. Extracting noun phrases for all of MEDLINE.

    PubMed Central

    Bennett, N. A.; He, Q.; Powell, K.; Schatz, B. R.

    1999-01-01

    A natural language parser that could extract noun phrases for all medical texts would be of great utility in analyzing content for information retrieval. We discuss the extraction of noun phrases from MEDLINE, using a general parser not tuned specifically for any medical domain. The noun phrase extractor is made up of three modules: tokenization; part-of-speech tagging; noun phrase identification. Using our program, we extracted noun phrases from the entire MEDLINE collection, encompassing 9.3 million abstracts. Over 270 million noun phrases were generated, of which 45 million were unique. The quality of these phrases was evaluated by examining all phrases from a sample collection of abstracts. The precision and recall of the phrases from our general parser compared favorably with those from three other parsers we had previously evaluated. We are continuing to improve our parser and evaluate our claim that a generic parser can effectively extract all the different phrases across the entire medical literature. PMID:10566444
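
    The three-module structure named in the abstract (tokenization, part-of-speech tagging, noun phrase identification) can be sketched as a small pipeline. This is a toy illustration, not the authors' parser: the lexicon, the tag set, the fallback of tagging unknown words as nouns, and the "adjectives and nouns ending in a noun" pattern are all assumptions made for the example.

```python
import re

# Hypothetical toy lexicon; a real tagger would be trained on a corpus.
TOY_LEXICON = {
    "the": "DET", "a": "DET", "of": "PREP", "in": "PREP",
    "chronic": "ADJ", "renal": "ADJ", "acute": "ADJ",
    "failure": "NOUN", "patients": "NOUN", "treatment": "NOUN",
    "improves": "VERB", "reduces": "VERB",
}

def tokenize(text):
    """Module 1: split lower-cased text into word and number tokens."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*|\d+(?:\.\d+)?", text.lower())

def tag(tokens):
    """Module 2: look up a tag per token; unknown words default to NOUN."""
    return [(tok, TOY_LEXICON.get(tok, "NOUN")) for tok in tokens]

def noun_phrases(tagged):
    """Module 3: collect maximal runs of ADJ/NOUN tokens that end in a NOUN."""
    phrases, run = [], []

    def flush():
        # Drop trailing adjectives so every phrase ends in a noun.
        while run and run[-1][1] != "NOUN":
            run.pop()
        if run:
            phrases.append(" ".join(tok for tok, _ in run))
        run.clear()

    for tok, pos in tagged:
        if pos in ("ADJ", "NOUN"):
            run.append((tok, pos))
        else:
            flush()
    flush()
    return phrases

sentence = "Treatment of chronic renal failure reduces the risk in patients."
print(noun_phrases(tag(tokenize(sentence))))
```

    Keeping the stages as separate functions mirrors the modular design the abstract describes, so a stronger tagger could replace the lexicon lookup without touching the other two steps.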

  9. Longitudinal analysis of pain in patients with metastatic prostate cancer using natural language processing of medical record text

    PubMed Central

    Heintzelman, Norris H; Taylor, Robert J; Simonsen, Lone; Lustig, Roger; Anderko, Doug; Haythornthwaite, Jennifer A; Childs, Lois C; Bova, George Steven

    2013-01-01

    Objectives: To test the feasibility of using text mining to depict meaningfully the experience of pain in patients with metastatic prostate cancer, to identify novel pain phenotypes, and to propose methods for longitudinal visualization of pain status. Materials and methods: Text from 4409 clinical encounters for 33 men enrolled in a 15-year longitudinal clinical/molecular autopsy study of metastatic prostate cancer (Project to ELIminate lethal CANcer) was subjected to natural language processing (NLP) using Unified Medical Language System-based terms. A four-tiered pain scale was developed, and logistic regression analysis identified factors that correlated with experience of severe pain during each month. Results: NLP identified 6387 pain and 13 827 drug mentions in the text. Graphical displays revealed the pain ‘landscape’ described in the textual records and confirmed dramatically increasing levels of pain in the last years of life in all but two patients, all of whom died from metastatic cancer. Severe pain was associated with receipt of opioids (OR=6.6, p<0.0001) and palliative radiation (OR=3.4, p=0.0002). Surprisingly, no severe or controlled pain was detected in two of 33 subjects’ clinical records. Additionally, the NLP algorithm proved generalizable in an evaluation using a separate data source (889 Informatics for Integrating Biology and the Bedside (i2b2) discharge summaries). Discussion: Patterns in the pain experience, undetectable without the use of NLP to mine the longitudinal clinical record, were consistent with clinical expectations, suggesting that meaningful NLP-based pain status monitoring is feasible. Findings in this initial cohort suggest that ‘outlier’ pain phenotypes useful for probing the molecular basis of cancer pain may exist. Limitations: The results are limited by a small cohort size and use of proprietary NLP software. Conclusions: We have established the feasibility of tracking longitudinal patterns of pain by text mining

  10. The Acquisition of Written Language: Response and Revision. Writing Research: Multidisciplinary Inquiries into the Nature of Writing Series.

    ERIC Educational Resources Information Center

    Freedman, Sarah Warshauer, Ed.

    Viewing writing as both a form of language learning and an intellectual skill, this book presents essays on how writers acquire trusted inner voices and the roles schools and teachers can play in helping student writers in the learning process. The essays in the book focus on one of three topics: the language of instruction and how response and…

  11. Integrating Learner Corpora and Natural Language Processing: A Crucial Step towards Reconciling Technological Sophistication and Pedagogical Effectiveness

    ERIC Educational Resources Information Center

    Granger, Sylviane; Kraif, Olivier; Ponton, Claude; Antoniadis, Georges; Zampa, Virginie

    2007-01-01

    Learner corpora, electronic collections of spoken or written data from foreign language learners, offer unparalleled access to many hitherto uncovered aspects of learner language, particularly in their error-tagged format. This article aims to demonstrate the role that the learner corpus can play in CALL, particularly when used in conjunction with…

  12. The Importance of Natural Change in Planning School-Based Intervention for Children with Developmental Language Impairment (DLI)

    ERIC Educational Resources Information Center

    Botting, Nicola; Gaynor, Marguerite; Tucker, Katie; Orchard-Lisle, Ginnie

    2016-01-01

    Some reports suggest that there is an increase in the number of children identified as having developmental language impairment (Bercow, 2008), yet resource issues have meant that many speech and language therapy services have compromised provision in some way. Thus, efficient ways of identifying need and prioritizing intervention are required.…

  13. Language Arts Guide; Composition and Language Study. Junior High School.

    ERIC Educational Resources Information Center

    Dade County Board of Public Instruction, Miami, FL.

    GRADES OR AGES: Junior high school (grades 7, 8 and 9). SUBJECT MATTER: Language arts; composition and language study. ORGANIZATION AND PHYSICAL APPEARANCE: The guide has three main sections: 1) oral composition--individual presentations and group activities; 2) language study--the nature of language, varieties of language, history of the English…

  14. Improving performance of natural language processing part-of-speech tagging on clinical narratives through domain adaptation

    PubMed Central

    Ferraro, Jeffrey P; Daumé, Hal; DuVall, Scott L; Chapman, Wendy W; Harkema, Henk; Haug, Peter J

    2013-01-01

    Objective: Natural language processing (NLP) tasks are commonly decomposed into subtasks, chained together to form processing pipelines. The residual error produced in these subtasks propagates, adversely affecting the end objectives. Limited availability of annotated clinical data remains a barrier to reaching state-of-the-art operating characteristics using statistically based NLP tools in the clinical domain. Here we explore the unique linguistic constructions of clinical texts and demonstrate the loss in operating characteristics when out-of-the-box part-of-speech (POS) tagging tools are applied to the clinical domain. We test a domain adaptation approach integrating a novel lexical-generation probability rule used in a transformation-based learner to boost POS performance on clinical narratives. Methods: Two target corpora from independent healthcare institutions were constructed from high frequency clinical narratives. Four leading POS taggers with their out-of-the-box models trained from general English and biomedical abstracts were evaluated against these clinical corpora. A high performing domain adaptation method, Easy Adapt, was compared to our newly proposed method ClinAdapt. Results: The evaluated POS taggers drop in accuracy by 8.5–15% when tested on clinical narratives. The highest performing tagger reports an accuracy of 88.6%. Domain adaptation with Easy Adapt reports accuracies of 88.3–91.0% on clinical texts. ClinAdapt reports 93.2–93.9%. Conclusions: ClinAdapt successfully boosts POS tagging performance through domain adaptation requiring a modest amount of annotated clinical data. Improving the performance of critical NLP subtasks is expected to reduce pipeline error propagation leading to better overall results on complex processing tasks. PMID:23486109
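
    The abstract names Easy Adapt as the baseline but does not describe it. Assuming it refers to the feature-augmentation scheme usually associated with that name, each feature is copied into a shared version and a domain-specific version, so a learner can weight general-English and clinical evidence separately. The sketch below illustrates only that idea; the function name, key prefixes, and example features are hypothetical, and ClinAdapt itself is not shown.

```python
def easy_adapt(features, domain):
    """Return a feature dict with a shared copy and a domain-specific copy.

    `features` is a sparse mapping of feature name to value; names that are
    absent are implicit zeros, so the other domain's copy is simply omitted.
    """
    if domain not in ("source", "target"):
        raise ValueError("domain must be 'source' or 'target'")
    augmented = {}
    for name, value in features.items():
        augmented[f"shared:{name}"] = value    # seen by both domains
        augmented[f"{domain}:{name}"] = value  # seen by this domain only
    return augmented

# Hypothetical token features for a POS tagger.
general_english = easy_adapt({"word=patient": 1, "suffix=nt": 1}, "source")
clinical_note = easy_adapt({"word=pt": 1, "prev=the": 1}, "target")
print(sorted(clinical_note))
```

    Any standard classifier can then be trained on the augmented features from both corpora together, which is why the approach needs no change to the learner.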

  15. The Common Alerting Protocol (CAP) and Emergency Data Exchange Language (EDXL) - Application in Early Warning Systems for Natural Hazard

    NASA Astrophysics Data System (ADS)

    Lendholt, Matthias; Hammitzsch, Martin; Wächter, Joachim

    2010-05-01

    The Common Alerting Protocol (CAP) [1] is an XML-based data format for exchanging public warnings and emergencies between alerting technologies. In conjunction with the Emergency Data Exchange Language (EDXL) Distribution Element (-DE) [2] these data formats can be used for warning message dissemination in early warning systems for natural hazards. Application took place in the DEWS (Distance Early Warning System) [3] project where CAP serves as central message format containing both human readable warnings and structured data for automatic processing by message receivers. In particular the spatial reference capabilities are of paramount importance both in CAP and EDXL. Affected areas are addressable via geo codes like HASC (Hierarchical Administrative Subdivision Codes) [4] or UN/LOCODE [5] but also with arbitrary polygons that can be directly generated out of GML [6]. For each affected area standardized criticality values (urgency, severity and certainty) have to be set but also application specific key-value-pairs like estimated time of arrival or maximum inundation height can be specified. This enables - together with multilingualism, message aggregation and message conversion for different dissemination channels - the generation of user-specific tailored warning messages. [1] CAP, http://www.oasis-emergency.org/cap [2] EDXL-DE, http://docs.oasis-open.org/emergency/edxl-de/v1.0/EDXL-DE_Spec_v1.0.pdf [3] DEWS, http://www.dews-online.org [4] HASC, "Administrative Subdivisions of Countries: A Comprehensive World Reference, 1900 Through 1998" ISBN 0-7864-0729-8 [5] UN/LOCODE, http://www.unece.org/cefact/codesfortrade/codes_index.htm [6] GML, http://www.opengeospatial.org/standards/gml
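
    A CAP message of the kind described above might be assembled as follows. This is a minimal sketch assuming the CAP 1.2 namespace and element names; the identifier, sender, times, polygon, geocode value, and the parameter name `estimatedTimeOfArrival` are placeholders invented for illustration and are not taken from the DEWS project.

```python
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"  # assumption: CAP version 1.2

def build_alert():
    """Build a small CAP alert with one info block and one affected area."""
    ET.register_namespace("", CAP_NS)

    def el(parent, tag, text=None):
        node = ET.SubElement(parent, f"{{{CAP_NS}}}{tag}")
        if text is not None:
            node.text = text
        return node

    alert = ET.Element(f"{{{CAP_NS}}}alert")
    el(alert, "identifier", "EXAMPLE-0001")            # placeholder values
    el(alert, "sender", "warning-centre@example.org")
    el(alert, "sent", "2010-05-01T12:00:00+00:00")
    el(alert, "status", "Exercise")
    el(alert, "msgType", "Alert")
    el(alert, "scope", "Public")

    info = el(alert, "info")
    el(info, "category", "Geo")
    el(info, "event", "Tsunami")
    # Criticality values; CAP carries these on the info block.
    el(info, "urgency", "Immediate")
    el(info, "severity", "Extreme")
    el(info, "certainty", "Likely")
    # Application-specific key-value pair (hypothetical name).
    param = el(info, "parameter")
    el(param, "valueName", "estimatedTimeOfArrival")
    el(param, "value", "2010-05-01T12:45:00+00:00")

    area = el(info, "area")
    el(area, "areaDesc", "Example coastal district")
    # Polygon as "lat,lon" pairs; the first and last points coincide.
    el(area, "polygon", "5.0,95.0 5.0,96.0 6.0,96.0 5.0,95.0")
    geocode = el(area, "geocode")
    el(geocode, "valueName", "HASC")
    el(geocode, "value", "XX.YY")                      # placeholder code
    return alert

alert = build_alert()
print(ET.tostring(alert, encoding="unicode"))
```

    Because the human-readable fields and the structured values sit in the same document, a receiver can display the warning or process it automatically without a second message format.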

  16. First Language Acquisition and Teaching

    ERIC Educational Resources Information Center

    Cruz-Ferreira, Madalena

    2011-01-01

    "First language acquisition" commonly means the acquisition of a single language in childhood, regardless of the number of languages in a child's natural environment. Language acquisition is variously viewed as predetermined, wondrous, a source of concern, and as developing through formal processes. "First language teaching" concerns schooling in…

  17. A Study To Identify and Analyze the Perceived Nature and Causes of the English Language-Based Problems and the Coping Strategies of the Indonesian and Malaysian Students Studying in an American University.

    ERIC Educational Resources Information Center

    Ali, M. Solaiman

    A study investigated the nature and causes of the English language-based problems and the coping strategies of 44 Indonesian and 57 Malaysian students studying at Indiana University, Bloomington. The Indonesian and Malaysian student groups represented non-Commonwealth and Commonwealth students sharing the same native language roots but differed in…

  18. jmzReader: A Java parser library to process and visualize multiple text and XML-based mass spectrometry data formats.

    PubMed

    Griss, Johannes; Reisinger, Florian; Hermjakob, Henning; Vizcaíno, Juan Antonio

    2012-03-01

    We here present the jmzReader library: a collection of Java application programming interfaces (APIs) to parse the most commonly used peak list and XML-based mass spectrometry (MS) data formats: DTA, MS2, MGF, PKL, mzXML, mzData, and mzML (based on the already existing API jmzML). The library is optimized to be used in conjunction with mzIdentML, the recently released standard data format for reporting protein and peptide identifications, developed by the HUPO proteomics standards initiative (PSI). mzIdentML files do not contain spectra data but contain references to different kinds of external MS data files. As a key functionality, all parsers implement a common interface that supports the various methods used by mzIdentML to reference external spectra. Thus, when developing software for mzIdentML, programmers no longer have to support multiple MS data file formats but only this one interface. The library (which includes a viewer) is open source and, together with detailed documentation, can be downloaded from http://code.google.com/p/jmzreader/.

  19. Variation in the application of natural processes: language-dependent constraints in the phonological acquisition of bilingual children.

    PubMed

    Faingold, E D

    1996-09-01

    This paper studies phonological processes and constraints on early phonological and lexical development, as well as the strategies employed by a young Spanish-, Portuguese-, and Hebrew-speaking child, Nurit (the author's niece), in the construction of her early lexicon. Nurit's linguistic development is compared to that of another Spanish-, Portuguese-, and Hebrew-speaking child, Noam (the author's son). Noam and Nurit's linguistic development is contrasted with that of Berman's (1977) English- and Hebrew-speaking daughter (Shelli). The simultaneous acquisition of similar (closely related) languages such as Spanish and Portuguese versus that of nonrelated languages such as English and Hebrew yields different results: children acquiring similar languages seem to prefer maintenance as a strategy for the construction of their early lexicon, while children exposed to nonrelated languages appear to prefer reduction to a large extent (Faingold, 1990). The Spanish- and Portuguese-speaking children's high accuracy stems from a wider choice of target words, where the diachronic development of two closely related languages provides a simplified model lexicon to the child. PMID:8865623

  20. Introducing a gender-neutral pronoun in a natural gender language: the influence of time on attitudes and behavior

    PubMed Central

    Gustafsson Sendén, Marie; Bäck, Emma A.; Lindqvist, Anna

    2015-01-01

    The implementation of gender-fair language is often associated with negative reactions and hostile attacks on people who propose a change. This was also the case in Sweden in 2012 when a third gender-neutral pronoun hen was proposed as an addition to the already existing Swedish pronouns for she (hon) and he (han). The pronoun hen can be used both generically, when gender is unknown or irrelevant, and as a transgender pronoun for people who categorize themselves outside the gender dichotomy. In this article we review the process from 2012 to 2015. No other language has so far added a third gender-neutral pronoun, existing in parallel with two gendered pronouns, that has actually reached the broader population of language users. This makes the situation in Sweden unique. We present data on attitudes toward hen during the past 4 years and analyze how time is associated with the attitudes in the process of introducing hen to the Swedish language. In 2012 the majority of the Swedish population was negative toward the word, but already in 2014 there was a significant shift to more positive attitudes. Time was one of the strongest predictors of attitudes, even when other relevant factors were controlled for. The actual use of the word also increased, although to a lesser extent than the attitudes shifted. We conclude that new words challenging the binary gender system evoke hostile and negative reactions, but also that attitudes can normalize rather quickly. We see this finding as very positive and hope it could motivate language amendments and initiatives for gender-fair language, although the first responses may be negative. PMID:26191016

  1. Language, the Forgotten Content.

    ERIC Educational Resources Information Center

    Kelly, Patricia P., Ed.; Small, Robert C., Jr., Ed.

    1987-01-01

    The ways that students can learn about the nature of the English language and develop a sense of excitement about their language are explored in this focused journal issue. The titles of the essays and their authors are as follows: (1) "Language, the Forgotten Content" (R. Small and P. P. Kelly); (2) "What Should English Teachers Know about…

  2. Language, Gesture, and Space.

    ERIC Educational Resources Information Center

    Emmorey, Karen, Ed.; Reilly, Judy S., Ed.

    A collection of papers addresses a variety of issues regarding the nature and structure of sign language, gesture, and gesture systems. Articles include: "Theoretical Issues Relating Language, Gesture, and Space: An Overview" (Karen Emmorey, Judy S. Reilly); "Real, Surrogate, and Token Space: Grammatical Consequences in ASL American Sign Language"…

  3. The Compensatory Nature of Discipline-Related Knowledge and English-Language Proficiency in Reading English for Academic Purposes

    ERIC Educational Resources Information Center

    Uso-Juan, Esther

    2006-01-01

    The purpose of this study is twofold: first, to estimate the contribution of discipline-related knowledge and English-language proficiency to reading comprehension in English for academic purposes (EAP) and, second, to specify the levels at which the compensatory effect between the two variables takes place for successful EAP reading. The…

  4. The Nature and Impact of Changes in Home Learning Environment on Development of Language and Academic Skills in Preschool Children

    ERIC Educational Resources Information Center

    Son, Seung-Hee; Morrison, Frederick J.

    2010-01-01

    In this study, we examined changes in the early home learning environment as children approached school entry and whether these changes predicted the development of children's language and academic skills. Findings from a national sample of the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development…

  5. Language-Dependent Pitch Encoding Advantage in the Brainstem Is Not Limited to Acceleration Rates that Occur in Natural Speech

    ERIC Educational Resources Information Center

    Krishnan, Ananthanarayan; Gandour, Jackson T.; Smalt, Christopher J.; Bidelman, Gavin M.

    2010-01-01

    Experience-dependent enhancement of neural encoding of pitch in the auditory brainstem has been observed for only specific portions of native pitch contours exhibiting high rates of pitch acceleration, irrespective of speech or nonspeech contexts. This experiment allows us to determine whether this language-dependent advantage transfers to…

  6. Advanced computer languages

    SciTech Connect

    Bryce, H.

    1984-05-03

    If software is to become an equal partner in the so-called fifth generation of computers (which of course it must), programming languages and the human interface will need to clear some high hurdles. Again, the solutions being sought turn to cerebral emulation: here, the way that human beings understand language. The result would be natural or English-like languages that would allow a person to communicate with a computer much as he or she does with another person. In the discussion the authors look at fourth-level and fifth-level languages used in meeting the goal of AI. The higher-level languages aim to be nonprocedural. Applications of LISP and Forth to natural language interfaces are described, as well as programs such as the natural link technology package, written in C.

  7. An English language interface for constrained domains

    NASA Technical Reports Server (NTRS)

    Page, Brenda J.

    1989-01-01

    The Multi-Satellite Operations Control Center (MSOCC) Jargon Interpreter (MJI) demonstrates an English language interface for a constrained domain. A constrained domain is defined as one with a small and well delineated set of actions and objects. The set of actions chosen for the MJI is from the domain of MSOCC Applications Executive (MAE) Systems Test and Operations Language (STOL) directives and contains directives for signing a cathode ray tube (CRT) on or off, calling up or clearing a display page, starting or stopping a procedure, and controlling history recording. The set of objects chosen consists of CRTs, display pages, STOL procedures, and history files. Translation from English sentences to STOL directives is done in two phases. In the first phase, an augmented transition net (ATN) parser and dictionary are used for determining grammatically correct parsings of input sentences. In the second phase, grammatically typed sentences are submitted to a forward-chaining rule-based system for interpretation and translation into equivalent MAE STOL directives. Tests of the MJI show that it is able to translate individual clearly stated sentences into the subset of directives selected for the prototype. This approach to an English language interface may be used for similarly constrained situations by modifying the MJI's dictionary and rules to reflect the change of domain.
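
    The two-phase translation described in this record can be illustrated with a minimal sketch. It is not the MJI itself: the dictionary entries, the network states, the rules, and the directive strings (START, STOP, PAGE, CLEAR) are all hypothetical, chosen only to show a register-filling transition network followed by forward chaining. The network here is deterministic and omits the backtracking and tests a full ATN would have.

```python
# Phase 1: a tiny transition network with registers (all entries hypothetical).
DICTIONARY = {
    "start": "VERB", "stop": "VERB", "clear": "VERB", "show": "VERB",
    "the": "DET",
    "procedure": "NOUN", "page": "NOUN", "display": "NOUN",
}

# state -> list of (word category, next state, register filled on this arc)
NETWORK = {
    "S0": [("VERB", "S1", "action")],
    "S1": [("DET", "S1", None), ("NOUN", "S2", "object")],
    "S2": [("NOUN", "S2", "object"), ("NAME", "S3", "name")],
    "S3": [],
}
FINAL_STATES = {"S2", "S3"}


def parse(sentence):
    """Return the filled registers, or None if the sentence is not accepted."""
    state, registers = "S0", {}
    for token in sentence.lower().split():
        category = DICTIONARY.get(token, "NAME")  # unknown words are treated as names
        for arc_category, next_state, register in NETWORK[state]:
            if arc_category == category:
                if register is not None:
                    registers[register] = token  # a later noun overwrites an earlier one
                state = next_state
                break
        else:
            return None  # no arc matched this token
    return registers if state in FINAL_STATES else None


# Phase 2: forward chaining over facts derived from the parse (rules hypothetical).
RULES = [
    ({("action", "start"), ("object", "procedure")}, ("directive", "START")),
    ({("action", "stop"), ("object", "procedure")}, ("directive", "STOP")),
    ({("action", "show"), ("object", "page")}, ("directive", "PAGE")),
    ({("action", "clear"), ("object", "page")}, ("directive", "CLEAR")),
]


def forward_chain(facts):
    """Fire rules repeatedly until no rule adds a new fact."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


def translate(sentence):
    """Translate an English sentence into a (hypothetical) directive string."""
    registers = parse(sentence)
    if registers is None:
        return None
    facts = forward_chain(registers.items())
    directives = [value for key, value in facts if key == "directive"]
    if not directives:
        return None
    name = registers.get("name")
    return f"{directives[0]} {name.upper()}" if name else directives[0]


print(translate("start the procedure alpha"))
print(translate("clear the display page"))
```

    Changing the domain in this sketch means editing only DICTIONARY and RULES, which mirrors the record's remark that the dictionary and rules carry the domain knowledge.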

  8. Individual biases, cultural evolution, and the statistical nature of language universals: the case of colour naming systems.

    PubMed

    Baronchelli, Andrea; Loreto, Vittorio; Puglisi, Andrea

    2015-01-01

    Language universals have long been attributed to an innate Universal Grammar. An alternative explanation states that linguistic universals emerged independently in every language in response to shared cognitive or perceptual biases. A computational model has recently shown how this could be the case, focusing on the paradigmatic example of the universal properties of colour naming patterns, and producing results in quantitative agreement with the experimental data. Here we investigate the role of an individual perceptual bias in the framework of the model. We study how, and to what extent, the structure of the bias influences the corresponding linguistic universal patterns. We show that the cultural history of a group of speakers introduces population-specific constraints that act against the pressure for uniformity arising from the individual bias, and we clarify the interplay between these two forces. PMID:26018391

  9. Individual Biases, Cultural Evolution, and the Statistical Nature of Language Universals: The Case of Colour Naming Systems

    PubMed Central

    Baronchelli, Andrea; Loreto, Vittorio; Puglisi, Andrea

    2015-01-01

    Language universals have long been attributed to an innate Universal Grammar. An alternative explanation states that linguistic universals emerged independently in every language in response to shared cognitive or perceptual biases. A computational model has recently shown how this could be the case, focusing on the paradigmatic example of the universal properties of colour naming patterns, and producing results in quantitative agreement with the experimental data. Here we investigate the role of an individual perceptual bias in the framework of the model. We study how, and to what extent, the structure of the bias influences the corresponding linguistic universal patterns. We show that the cultural history of a group of speakers introduces population-specific constraints that act against the pressure for uniformity arising from the individual bias, and we clarify the interplay between these two forces. PMID:26018391

  10. Simultaneous natural speech and AAC interventions for children with childhood apraxia of speech: lessons from a speech-language pathologist focus group.

    PubMed

    Oommen, Elizabeth R; McCarthy, John W

    2015-03-01

    In childhood apraxia of speech (CAS), children exhibit varying levels of speech intelligibility depending on the nature of errors in articulation and prosody. Augmentative and alternative communication (AAC) strategies are beneficial, and commonly adopted with children with CAS. This study focused on the decision-making process and strategies adopted by speech-language pathologists (SLPs) when simultaneously implementing interventions that focused on natural speech and AAC. Eight SLPs, with significant clinical experience in CAS and AAC interventions, participated in an online focus group. Thematic analysis revealed eight themes: key decision-making factors; treatment history and rationale; benefits; challenges; therapy strategies and activities; collaboration with team members; recommendations; and other comments. Results are discussed along with clinical implications and directions for future research.

  11. Gendered Language in Interactive Discourse

    ERIC Educational Resources Information Center

    Hussey, Karen A.; Katz, Albert N.; Leith, Scott A.

    2015-01-01

    Over two studies, we examined the nature of gendered language in interactive discourse. In the first study, we analyzed gendered language from a chat corpus to see whether tokens of gendered language proposed in the gender-as-culture hypothesis (Maltz and Borker in "Language and social identity." Cambridge University Press, Cambridge, pp…

  12. Language as a Liberal Art.

    ERIC Educational Resources Information Center

    Stein, Jack M.

    Language, considered as a liberal art, is examined in the light of other philosophical viewpoints concerning the nature of language in relation to second language instruction in this paper. Critical of an earlier mechanistic audio-lingual learning theory, translation approaches to language learning, vocabulary list-oriented courses, graduate…

  13. Language Contact.

    ERIC Educational Resources Information Center

    Nelde, Peter Hans

    1995-01-01

    Examines the phenomenon of language contact and recent trends in linguistic contact research, which focuses on language use, language users, and language spheres. Also discusses the role of linguistic and cultural conflicts in language contact situations. (13 references) (MDM)

  14. Social Network Development, Language Use, and Language Acquisition during Study Abroad: Arabic Language Learners' Perspectives

    ERIC Educational Resources Information Center

    Dewey, Dan P.; Belnap, R. Kirk; Hillstrom, Rebecca

    2013-01-01

    Language learners and educators have subscribed to the belief that those who go abroad will have many opportunities to use the target language and will naturally become proficient. They also assume that language learners will develop relationships with native speakers allowing them to use the language and become more fluent, an assumption…

  15. Linguistic Unification and Language Rights.

    ERIC Educational Resources Information Center

    Akinnaso, F. Niyi

    1994-01-01

    This paper examines the tension between linguistic unification and language rights in Nigeria and assesses the nature, causes, and implications of the tension against the backgrounds of the country's history, political development, and language situation. (Contains 116 references.) (MDM)

  16. Modeling Coevolution between Language and Memory Capacity during Language Origin.

    PubMed

    Gong, Tao; Shuai, Lan

    2015-01-01

    Memory is essential to many cognitive tasks including language. Apart from empirical studies of memory effects on language acquisition and use, there is a lack of evolutionary explorations of whether a high level of memory capacity is a prerequisite for language and whether language origin could influence memory capacity. In line with evolutionary theories that natural selection refined language-related cognitive abilities, we advocated a coevolution scenario between language and memory capacity, which incorporated the genetic transmission of individual memory capacity, cultural transmission of idiolects, and natural and cultural selections on individual reproduction and language teaching. To illustrate the coevolution dynamics, we adopted a multi-agent computational model simulating the emergence of lexical items and simple syntax through iterated communications. Simulations showed that, along with the origin of a communal language, an initially low memory capacity for acquired linguistic knowledge was boosted; that this coherent increase in linguistic understandability and memory capacities reflected a language-memory coevolution; and that this coevolution stopped once memory capacities became sufficient for language communications. Statistical analyses revealed that the coevolution was realized mainly by natural selection based on individual communicative success in cultural transmissions. This work elaborated the biology-culture parallelism of language evolution, demonstrated the driving force of culturally-constituted factors for natural selection of individual cognitive abilities, and suggested that the degree difference in language-related cognitive abilities between humans and nonhuman animals could result from a coevolution with language.

  17. Modeling Coevolution between Language and Memory Capacity during Language Origin

    PubMed Central

    Gong, Tao; Shuai, Lan

    2015-01-01

    Memory is essential to many cognitive tasks including language. Apart from empirical studies of memory effects on language acquisition and use, there is a lack of evolutionary explorations of whether a high level of memory capacity is a prerequisite for language and whether language origin could influence memory capacity. In line with evolutionary theories that natural selection refined language-related cognitive abilities, we advocated a coevolution scenario between language and memory capacity, which incorporated the genetic transmission of individual memory capacity, cultural transmission of idiolects, and natural and cultural selections on individual reproduction and language teaching. To illustrate the coevolution dynamics, we adopted a multi-agent computational model simulating the emergence of lexical items and simple syntax through iterated communications. Simulations showed that, along with the origin of a communal language, an initially low memory capacity for acquired linguistic knowledge was boosted; that this coherent increase in linguistic understandability and memory capacities reflected a language-memory coevolution; and that this coevolution stopped once memory capacities became sufficient for language communications. Statistical analyses revealed that the coevolution was realized mainly by natural selection based on individual communicative success in cultural transmissions. This work elaborated the biology-culture parallelism of language evolution, demonstrated the driving force of culturally-constituted factors for natural selection of individual cognitive abilities, and suggested that the degree difference in language-related cognitive abilities between humans and nonhuman animals could result from a coevolution with language. PMID:26544876

  18. Three design principles of language: the search for parsimony in redundancy.

    PubMed

    Beekhuizen, Barend; Bod, Rens; Zuidema, Willem

    2013-09-01

    In this paper we present three design principles of language (experience, heterogeneity and redundancy) and present recent developments in a family of models incorporating them, namely Data-Oriented Parsing/Unsupervised Data-Oriented Parsing. Although the idea of some form of redundant storage has become part and parcel of parsing technologies and usage-based linguistic approaches alike, the question of how much of it is cognitively realistic and/or computationally optimally efficient is an open one. We argue that a segmentation-based approach (Bayesian Model Merging) combined with an all-subtrees approach reduces the number of rules needed to achieve an optimal performance, thus making the parser more efficient. At the same time, starting from unsegmented wholes comes closer to the acquisitional situation of a language learner, and thus adds to the cognitive plausibility of the model.

  19. Null subjects: a problem for parameter-setting models of language acquisition.

    PubMed

    Valian, V

    1990-05-01

    Some languages, like English, require overt surface subjects, while others, like Italian and Spanish, allow "null" subjects. How does the young child determine whether or not her language allows null subjects? Modern parameter-setting theory has proposed a solution, in which the child begins acquisition with the null subject parameter set for either the English-like value or the Italian-like value. Incoming data, or the absence thereof, force a resetting of the parameter if the original value was incorrect. This paper argues that the single-value solution cannot work, no matter which value is chosen as the initial one, because of inherent limitations in the child's parser, and because of the presence of misleading input. An alternative dual-value solution is proposed, in which the child begins acquisition with both values available, and uses theory-confirmation procedures to decide which value is best supported by the available data.

  20. SOL - SIZING AND OPTIMIZATION LANGUAGE COMPILER

    NASA Technical Reports Server (NTRS)

    Scotti, S. J.

    1994-01-01

    each variable was used. The listings summarize all optimizations, listing the objective functions, design variables, and constraints. The compiler offers error-checking specific to optimization problems, so that simple mistakes will not cost hours of debugging time. The optimization engine used by and included with the SOL compiler is a version of Vanderplaats' ADS system (Version 1.1) modified specifically to work with the SOL compiler. SOL allows the use of the over 100 ADS optimization choices such as Sequential Quadratic Programming, Modified Feasible Directions, interior and exterior penalty function and variable metric methods. Default choices of the many control parameters of ADS are made for the user; however, the user can override any of the ADS control parameters desired for each individual optimization. The SOL language and compiler were developed with an advanced compiler-generation system to ensure correctness and simplify program maintenance. Thus, SOL's syntax was defined precisely by a LALR(1) grammar and the SOL compiler's parser was generated automatically from the LALR(1) grammar with a parser-generator. Hence, unlike ad hoc, manually coded interfaces, the SOL compiler's lexical analysis ensures that the SOL compiler recognizes all legal SOL programs, can recover from and correct for many errors, and can report the location of errors to the user. This version of the SOL compiler has been implemented on VAX/VMS computer systems and requires 204 KB of virtual memory to execute. Since the SOL compiler produces FORTRAN code, it requires the VAX FORTRAN compiler to produce an executable program. The SOL compiler consists of 13,000 lines of Pascal code. It was developed in 1986 and last updated in 1988. The ADS and other utility subroutines amount to 14,000 lines of FORTRAN code and were also updated in 1988.

  1. Language Program Evaluation

    ERIC Educational Resources Information Center

    Norris, John M.

    2016-01-01

    Language program evaluation is a pragmatic mode of inquiry that illuminates the complex nature of language-related interventions of various kinds, the factors that foster or constrain them, and the consequences that ensue. Program evaluation enables a variety of evidence-based decisions and actions, from designing programs and implementing…

  2. Imaginative Language: What Event-Related Potentials have Revealed about the Nature and Source of Concreteness Effects*

    PubMed Central

    Huang, Hsu-Wen; Federmeier, Kara D.

    2016-01-01

    Behavioral and neuropsychological evidence suggest that abstract and concrete concepts may be represented, retrieved, and processed differently in the human brain. As reviewed in this paper, data using event-related potential measures, some in combination with visual half-field presentation methods, have offered a detailed picture of the nature and source of concreteness effects. In particular, the results provide strong evidence for multiple mechanisms underlying the behavioral processing differences that have long been noted for concrete and abstract words and, further, suggest an intriguing, unique role for the right hemisphere in associating words with sensory imagery. PMID:27559305

  3. Language Development and Language Disorders.

    ERIC Educational Resources Information Center

    Bloom, Lois; Lahey, Margaret

    This book provides a synthesis of research findings in normal language development as well as a practical approach to the evaluation and treatment of children with language disorders. Its 21 chapters are divided into six topical sections: language description, normal language development, deviant language development, goals of language learning…

  4. Developmental Changes in the Nature of Language Proficiency and Reading Fluency Paint a More Complex View of Reading Comprehension in ELL and EL1

    ERIC Educational Resources Information Center

    Geva, Esther; Farnia, Fataneh

    2012-01-01

    We examined theoretical issues concerning the development of reading fluency and language proficiency in 390 English Language Learners (ELLs,) and 149 monolingual, English-as-a-first language (EL1) students. The extent to which performance on these constructs in Grade 5 (i.e., concurrent predictors) contributes to reading comprehension in the…

  5. Foreign Language Aptitude and Intelligence.

    ERIC Educational Resources Information Center

    Wesche, Marjorie; And Others

    1982-01-01

    Provides a partial characterization of the nature of language aptitude through correlations and factor analyses of the Modern Language Aptitude Test and Primary Mental Abilities Test. Also discusses whether second-language learning ability is better conceptualized as a unitary or a composite factor. (EKN)

  6. Programmed Instruction and Language Teaching

    ERIC Educational Resources Information Center

    Littlewood, W. T.

    1974-01-01

    This article first takes some characteristics of language and suggests that the nature of language makes it, intrinsically, unsuitable to treatment by a fully programmed course. Second, it takes programming and suggests what aspects of language might be assigned to programmed instruction. (Author/LG)

  7. Liuds'ka vdacha: uchnivs'kyi zoshyt (Human Nature: Student Activity Book) [and] Liuds'ka vdacha: vidpovidi do uchnivs'koho zoshyta (Human Nature: Answer Key to Student Activity Book). Collage 2: A Ukrainian Language Development Series.

    ERIC Educational Resources Information Center

    Boruszczak, Bohdan, Comp.; And Others

    One of four intermediate- to advanced-level activity books in a series, this student workbook offers a selection of exercises, vocabulary builders, dialogs, and writing exercises for language skill development. It is intended for use in the instruction of native speakers, heritage language learners, or second language learners of Ukrainian. Also…

  8. Conceptual Complexity and Apparent Contradictions in Mathematics Language

    ERIC Educational Resources Information Center

    Gough, John

    2007-01-01

    Mathematics is like a language, although technically it is not a natural or informal human language, but a formal, that is, artificially constructed language. Importantly, educators use their natural everyday language to teach the formal language of mathematics. At times, however, instructors encounter problems when the technical words they use,…

  9. An AdaBoost Using a Weak-Learner Generating Several Weak-Hypotheses for Large Training Data of Natural Language Processing

    NASA Astrophysics Data System (ADS)

    Iwakura, Tomoya; Okamoto, Seishi; Asakawa, Kazuo

    AdaBoost is a method to create a final hypothesis by repeatedly generating a weak hypothesis in each training iteration with a given weak learner. AdaBoost-based algorithms are successfully applied to several tasks such as Natural Language Processing (NLP), OCR, and so on. However, learning on training data consisting of a large number of samples and features requires a long training time. We propose a fast AdaBoost-based algorithm for learning rules represented by combinations of features. Our algorithm constructs a final hypothesis by learning several weak-hypotheses at each iteration. We assign a confidence-rated value to each weak-hypothesis while ensuring a reduction in the theoretical upper bound of the training error of AdaBoost. We evaluate our methods with English POS tagging and text chunking. The experimental results show that the training speed of our algorithm is about 25 times faster than an AdaBoost-based learner and about 50 times faster than Support Vector Machines with a polynomial kernel on average, while maintaining state-of-the-art accuracy.
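
    For readers unfamiliar with the baseline, the following is a minimal sketch of standard discrete AdaBoost with one decision stump per iteration. It is not the authors' algorithm, which learns several confidence-rated weak hypotheses per iteration; the function names, the stump weak learner, and the toy data are assumptions made for illustration.

```python
import math


def train_stump(X, y, w):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] >= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best


def stump_predict(stump, row):
    """Predict +1 or -1 for one sample with a single stump."""
    _, j, thr, polarity = stump
    return polarity if row[j] >= thr else -polarity


def adaboost(X, y, rounds=10):
    """Train on labels in {-1, +1}; return a list of (alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n  # start from uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        if stump[0] >= 0.5:
            break  # weak learner is no better than chance
        err = max(stump[0], 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Final hypothesis: sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy 1-D data that no single stump can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
print([predict(model, row) for row in X])
```

    Each round here scans every feature and threshold, which hints at why training time grows with the number of samples and features, the cost the record's method aims to reduce.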

  10. Comparison of a semi-automatic annotation tool and a natural language processing application for the generation of clinical statement entries

    PubMed Central

    Lin, Ching-Heng; Wu, Nai-Yuan; Lai, Wei-Shao; Liou, Der-Ming

    2015-01-01

    Background and objective: Electronic medical records with encoded entries should enhance the semantic interoperability of document exchange. However, it remains a challenge to encode the narrative concept and to transform the coded concepts into a standard entry-level document. This study aimed to use a novel approach for the generation of entry-level interoperable clinical documents. Methods: Using HL7 clinical document architecture (CDA) as the example, we developed three pipelines to generate entry-level CDA documents. The first approach was a semi-automatic annotation pipeline (SAAP), the second was a natural language processing (NLP) pipeline, and the third merged the above two pipelines. We randomly selected 50 test documents from the i2b2 corpora to evaluate the performance of the three pipelines. Results: The 50 randomly selected test documents contained 9365 words, including 588 Observation terms and 123 Procedure terms. For the Observation terms, the merged pipeline had a significantly higher F-measure than the NLP pipeline (0.89 vs 0.80, p<0.0001), but a similar F-measure to that of the SAAP (0.89 vs 0.87). For the Procedure terms, the F-measure was not significantly different among the three pipelines. Conclusions: The combination of a semi-automatic annotation approach and the NLP application seems to be a solution for generating entry-level interoperable clinical documents. PMID:25332357

  11. Facilitating Surveillance of Pulmonary Invasive Mold Diseases in Patients with Haematological Malignancies by Screening Computed Tomography Reports Using Natural Language Processing

    PubMed Central

    Ananda-Rajah, Michelle R.; Martinez, David; Slavin, Monica A.; Cavedon, Lawrence; Dooley, Michael; Cheng, Allen; Thursky, Karin A.

    2014-01-01

    Purpose: Prospective surveillance of invasive mold diseases (IMDs) in haematology patients should be standard of care but is hampered by the absence of a reliable laboratory prompt and the difficulty of manual surveillance. We used a high throughput technology, natural language processing (NLP), to develop a classifier based on machine learning techniques to screen computed tomography (CT) reports supportive for IMDs. Patients and Methods: We conducted a retrospective case-control study of CT reports from the clinical encounter and up to 12-weeks after, from a random subset of 79 of 270 case patients with 33 probable/proven IMDs by international definitions, and 68 of 257 uninfected-control patients identified from 3 tertiary haematology centres. The classifier was trained and tested on a reference standard of 449 physician annotated reports including a development subset (n = 366), from a total of 1880 reports, using 10-fold cross validation, comparing binary and probabilistic predictions to the reference standard to generate sensitivity, specificity and area under the receiver-operating-curve (ROC). Results: For the development subset, sensitivity/specificity was 91% (95%CI 86% to 94%)/79% (95%CI 71% to 84%) and ROC area was 0.92 (95%CI 89% to 94%). Of 25 (5.6%) missed notifications, only 4 (0.9%) reports were regarded as clinically significant. Conclusion: CT reports are a readily available and timely resource that may be exploited by NLP to facilitate continuous prospective IMD surveillance with translational benefits beyond surveillance alone. PMID:25250675

  12. Novel Use of Natural Language Processing (NLP) to Predict Suicidal Ideation and Psychiatric Symptoms in a Text-Based Mental Health Intervention in Madrid

    PubMed Central

    Progovac, Ana M.; Chen, Pei; Mullin, Brian; Hou, Sherry

    2016-01-01

    Natural language processing (NLP) and machine learning were used to predict suicidal ideation and heightened psychiatric symptoms among adults recently discharged from psychiatric inpatient or emergency room settings in Madrid, Spain. Participants responded to structured mental and physical health instruments at multiple follow-up points. Outcome variables of interest were suicidal ideation and psychiatric symptoms (GHQ-12). Predictor variables included structured items (e.g., relating to sleep and well-being) and responses to one unstructured question, “how do you feel today?” We compared NLP-based models using the unstructured question with logistic regression prediction models using structured data. The PPV, sensitivity, and specificity for NLP-based models of suicidal ideation were 0.61, 0.56, and 0.57, respectively, compared to 0.73, 0.76, and 0.62 of structured data-based models. The PPV, sensitivity, and specificity for NLP-based models of heightened psychiatric symptoms (GHQ-12 ≥ 4) were 0.56, 0.59, and 0.60, respectively, compared to 0.79, 0.79, and 0.85 in structured models. NLP-based models were able to generate relatively high predictive values based solely on responses to a simple general mood question. These models have promise for rapidly identifying persons at risk of suicide or psychological distress and could provide a low-cost screening alternative in settings where lengthy structured item surveys are not feasible. PMID:27752278

  13. Language Endangerment and Language Revival.

    ERIC Educational Resources Information Center

    Muhlhausler, Peter

    2003-01-01

    Reviews and discusses the following books: "Language Death," by David Crystal; "The Green Book of Language Revitalization in Practice," by Leanne Hinton; and "Vanishing Voices of the World's Languages," by Daniel Nettle. (Author/VWL)

  14. Using Natural Language Processing to Enable In-depth Analysis of Clinical Messages Posted to an Internet Mailing List: A Feasibility Study

    PubMed Central

    Kreinacke, Marcos; Spallek, Heiko; Song, Mei; O'Donnell, Jean A

    2011-01-01

    Background: An Internet mailing list may be characterized as a virtual community of practice that serves as an information hub with easy access to expert advice and opportunities for social networking. We are interested in mining messages posted to a list for dental practitioners to identify clinical topics. Once we understand the topical domain, we can study dentists’ real information needs and the nature of their shared expertise, and can avoid delivering useless content at the point of care in future informatics applications. However, a necessary first step involves developing procedures to identify messages that are worth studying given our resources for planned, labor-intensive research. Objectives: The primary objective of this study was to develop a workflow for finding a manageable number of clinically relevant messages from a much larger corpus of messages posted to an Internet mailing list, and to demonstrate the potential usefulness of our procedures for investigators by retrieving a set of messages tailored to the research question of a qualitative research team. Methods: We mined 14,576 messages posted to an Internet mailing list from April 2008 to May 2009. The list has about 450 subscribers, mostly dentists from North America interested in clinical practice. After extensive preprocessing, we used the Natural Language Toolkit to identify clinical phrases and keywords in the messages. Two academic dentists classified collocated phrases in an iterative, consensus-based process to describe the topics discussed by dental practitioners who subscribe to the list. We then consulted with qualitative researchers regarding their research question to develop a plan for targeted retrieval. We used selected phrases and keywords as search strings to identify clinically relevant messages and delivered the messages in a reusable database. Results: About half of the subscribers (245/450, 54.4%) posted messages. Natural language processing (NLP) yielded 279

  15. On Teaching Strategies in Second Language Acquisition

    ERIC Educational Resources Information Center

    Yang, Hong

    2008-01-01

    How to acquire a second language is a question of obvious importance to teachers and language learners, and how to teach a second language has also become a matter of concern, given linguists' interest in the nature of primary linguistic data. Starting with the development stages of second language acquisition and Stephen Krashen's theory, this…

  16. Why Johnny Should Learn Foreign Languages.

    ERIC Educational Resources Information Center

    Huebener, Theodore

    A case for the study of foreign languages by pupils in the United States is presented in this book. The polyglot nature of America and the history of its language education is described, and language programs in foreign countries and in the United States are compared. Also discussed are the need for adequate language education for international…

  17. Language Play.

    ERIC Educational Resources Information Center

    Schwartz, Judy I.

    This paper discusses kinds and characteristics of language play, explores the relationship of such play to wider domains of language and play, and speculates on the possible contributions of language play for language mastery and cognitive development. Jump rope chants and ritual insults ("Off my case, potato face") and other expressive language…

  18. Language Transfer in Language Learning. Language Acquisition & Language Disorders 5.

    ERIC Educational Resources Information Center

    Gass, Susan M., Ed.; Selinker, Larry, Ed.

    The study of native language influence in Second Language Acquisition has undergone significant changes over the past few decades. This book, which includes 12 chapters by distinguished researchers in the field of second language acquisition, traces the conceptual history of language transfer from its early role within a Contrastive Analysis…

  19. Abstraction and natural language semantics.

    PubMed Central

    Kayser, Daniel

    2003-01-01

    According to the traditional view, a word prototypically denotes a class of objects sharing similar features, i.e. it results from an abstraction based on the detection of common properties in perceived entities. I explore here another idea: words result from abstraction of common premises in the rules governing our actions. I first argue that taking 'inference', instead of 'reference', as the basic issue in semantics does matter. I then discuss two phenomena that are, in my opinion, particularly difficult to analyse within the scope of traditional semantic theories: systematic polysemy and plurals. I conclude by a discussion of my approach, and by a summary of its main features. PMID:12903662

  20. Performance of a Natural Language Processing (NLP) Tool to Extract Pulmonary Function Test (PFT) Reports from Structured and Semistructured Veteran Affairs (VA) Data

    PubMed Central

    Sauer, Brian C.; Jones, Barbara E.; Globe, Gary; Leng, Jianwei; Lu, Chao-Chin; He, Tao; Teng, Chia-Chen; Sullivan, Patrick; Zeng, Qing

    2016-01-01

    Introduction/Objective: Pulmonary function tests (PFTs) are objective estimates of lung function, but are not reliably stored within the Veteran Health Affairs data systems as structured data. The aim of this study was to validate the natural language processing (NLP) tool we developed—which extracts spirometric values and responses to bronchodilator administration—against expert review, and to estimate the number of additional spirometric tests identified beyond the structured data. Methods: All patients at seven Veteran Affairs Medical Centers with a diagnostic code for asthma Jan 1, 2006–Dec 31, 2012 were included. Evidence of spirometry with a bronchodilator challenge (BDC) was extracted from structured data as well as clinical documents. NLP’s performance was compared against a human reference standard using a random sample of 1,001 documents. Results: In the validation set NLP demonstrated a precision of 98.9 percent (95 percent confidence intervals (CI): 93.9 percent, 99.7 percent), recall of 97.8 percent (95 percent CI: 92.2 percent, 99.7 percent), and an F-measure of 98.3 percent for the forced vital capacity pre- and post pairs and precision of 100 percent (95 percent CI: 96.6 percent, 100 percent), recall of 100 percent (95 percent CI: 96.6 percent, 100 percent), and an F-measure of 100 percent for the forced expiratory volume in one second pre- and post pairs for bronchodilator administration. Application of the NLP increased the proportion identified with complete bronchodilator challenge by 25 percent. Discussion/Conclusion: This technology can improve identification of PFTs for epidemiologic research. Caution must be taken in assuming that a single domain of clinical data can completely capture the scope of a disease, treatment, or clinical test. PMID:27376095
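
    As an illustrative aside, the F-measure quoted in this abstract can be checked from the reported precision and recall, assuming it is the balanced (F1) form, i.e. the harmonic mean of the two. The function name `f_measure` is ours and does not come from the study.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Reported values for the forced vital capacity pre/post pairs.
fvc_f = f_measure(0.989, 0.978)
print(round(100 * fvc_f, 1))  # 98.3

# Reported values for the forced expiratory volume pairs (both 100 percent).
fev1_f = f_measure(1.0, 1.0)
```

    Under that assumption the computed value agrees with the 98.3 percent stated for the forced vital capacity pairs.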

  1. Light at Night Markup Language (LANML): XML Technology for Light at Night Monitoring Data

    NASA Astrophysics Data System (ADS)

    Craine, B. L.; Craine, E. R.; Craine, E. M.; Crawford, D. L.

    2013-05-01

    Light at Night Markup Language (LANML) is a standard, based upon XML, useful in acquiring, validating, transporting, archiving and analyzing multi-dimensional light at night (LAN) datasets of any size. The LANML standard can accommodate a variety of measurement scenarios including single spot measures, static time-series, web-based monitoring networks, mobile measurements, and airborne measurements. LANML is human-readable, machine-readable, and does not require a dedicated parser. In addition, LANML is flexible, ensuring that future extensions of the format will remain backward compatible with analysis software. XML technology is at the heart of communicating over the internet and can be equally useful at the desktop level, making this standard particularly attractive for web-based applications, educational outreach and efficient collaboration between research groups.
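
    To illustrate the point that an XML-based format needs no dedicated parser, the sketch below reads a small LANML-like document with Python's standard XML library. The element and attribute names (`lanml`, `measurement`, `brightness`, `time`, `unit`) and the sample values are invented for this example and are not taken from the LANML specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical LANML-like document: names and values are illustrative only
# and are NOT drawn from the actual LANML standard.
SAMPLE = """<lanml>
  <measurement time="2013-05-01T03:00:00Z">
    <brightness unit="mag/arcsec2">19.8</brightness>
  </measurement>
  <measurement time="2013-05-01T03:05:00Z">
    <brightness unit="mag/arcsec2">19.9</brightness>
  </measurement>
</lanml>"""

def read_brightness(xml_text):
    """Return (timestamp, brightness) pairs using a generic XML parser."""
    root = ET.fromstring(xml_text)
    return [
        (m.get("time"), float(m.find("brightness").text))
        for m in root.iter("measurement")
    ]

readings = read_brightness(SAMPLE)
```

    Because the parser looks up only the elements it knows about, additional elements added in a later version of such a format would simply be ignored, which is one way backward compatibility can be preserved.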

  2. Language evolution and human-computer interaction

    NASA Technical Reports Server (NTRS)

    Grudin, Jonathan; Norman, Donald A.

    1991-01-01

    Many of the issues that confront designers of interactive computer systems also appear in natural language evolution. Natural languages and human-computer interfaces share as their primary mission the support of extended 'dialogues' between responsive entities. Because in each case one participant is a human being, some of the pressures operating on natural languages, causing them to evolve in order to better support such dialogue, also operate on human-computer 'languages' or interfaces. This does not necessarily push interfaces in the direction of natural language - since one entity in this dialogue is not a human, this is not to be expected. Nonetheless, by discerning where the pressures that guide natural language evolution also appear in human-computer interaction, we can contribute to the design of computer systems and obtain a new perspective on natural languages.

  3. Evolutionary biology of language.

    PubMed Central

    Nowak, M A

    2000-01-01

    Language is the most important evolutionary invention of the last few million years. It was an adaptation that helped our species to exchange information, make plans, express new ideas and totally change the appearance of the planet. How human language evolved from animal communication is one of the most challenging questions for evolutionary biology. The aim of this paper is to outline the major principles that guided language evolution in terms of mathematical models of evolutionary dynamics and game theory. I will discuss how natural selection can lead to the emergence of arbitrary signs, the formation of words and syntactic communication. PMID:11127907

  4. Factors Influencing Sensitivity to Lexical Tone in an Artificial Language: Implications for Second Language Learning

    ERIC Educational Resources Information Center

    Caldwell-Harris, Catherine L.; Lancaster, Alia; Ladd, D. Robert; Dediu, Dan; Christiansen, Morten H.

    2015-01-01

    This study examined whether musical training, ethnicity, and experience with a natural tone language influenced sensitivity to tone while listening to an artificial tone language. The language was designed with three tones, modeled after level-tone African languages. Participants listened to a 15-min random concatenation of six 3-syllable words.…

  5. Sentence Repetition in Deaf Children with Specific Language Impairment in British Sign Language

    ERIC Educational Resources Information Center

    Marshall, Chloë; Mason, Kathryn; Rowley, Katherine; Herman, Rosalind; Atkinson, Joanna; Woll, Bencie; Morgan, Gary

    2015-01-01

    Children with specific language impairment (SLI) perform poorly on sentence repetition tasks across different spoken languages, but until now, this methodology has not been investigated in children who have SLI in a signed language. Users of a natural sign language encode different sentence meanings through their choice of signs and by altering…

  6. Using Machine Learning and Natural Language Processing Algorithms to Automate the Evaluation of Clinical Decision Support in Electronic Medical Record Systems

    PubMed Central

    Szlosek, Donald A; Ferrett, Jonathan

    2016-01-01

    Introduction: As the number of clinical decision support systems (CDSSs) incorporated into electronic medical records (EMRs) increases, so does the need to evaluate their effectiveness. The use of medical record review and similar manual methods for evaluating decision rules is laborious and inefficient. We used machine learning and Natural Language Processing (NLP) algorithms to accurately evaluate a clinical decision support rule through an EMR system, and we compared it against manual evaluation. Methods: Modeled after the EMR system EPIC at Maine Medical Center, we developed a dummy data set containing physician notes in free text for 3,621 artificial patient records undergoing a head computed tomography (CT) scan for mild traumatic brain injury after the incorporation of an electronic best practice approach. We validated the accuracy of the Best Practice Advisories (BPA) using three machine learning algorithms, namely C-Support Vector Classification (SVC), Decision Tree Classifier (DecisionTreeClassifier), and k-nearest neighbors classifier (KNeighborsClassifier), by comparing their accuracy for adjudicating the occurrence of a mild traumatic brain injury against manual review. We then used the best of the three algorithms to evaluate the effectiveness of the BPA, and we compared the algorithm’s evaluation of the BPA to that of manual review. Results: The electronic best practice approach was found to have a sensitivity of 98.8 percent (96.83–100.0), specificity of 10.3 percent, PPV = 7.3 percent, and NPV = 99.2 percent when reviewed manually by abstractors. Though all the machine learning algorithms were observed to have a high level of prediction, the SVC displayed the highest with a sensitivity of 93.33 percent (92.49–98.84), specificity of 97.62 percent (96.53–98.38), PPV = 50.00, NPV = 99.83. The SVC algorithm was observed to have a sensitivity of 97.9 percent (94.7–99.86), specificity 10.30 percent, PPV 7.25 percent, and NPV 99.2 percent for
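
    For readers unfamiliar with the four measures reported in this abstract, the sketch below shows how sensitivity, specificity, PPV and NPV follow from a 2x2 confusion matrix. The function name `diagnostic_metrics` and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard screening measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among actual positives
        "specificity": tn / (tn + fp),  # true negatives among actual negatives
        "ppv": tp / (tp + fp),          # true positives among predicted positives
        "npv": tn / (tn + fn),          # true negatives among predicted negatives
    }

# Hypothetical counts, NOT taken from the study.
metrics = diagnostic_metrics(tp=8, fp=2, fn=2, tn=88)
```

    The example also hints at why a rule can combine high sensitivity with a low PPV: when the condition is rare, even a modest false-positive rate produces many more false alarms than true cases.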

  7. Using Data Mining and Natural Language Processing to Reveal Institutional Water Management Structures in Four Urban Areas in the US Southwest

    NASA Astrophysics Data System (ADS)

    Murphy, J.; Ozik, J.; Altaweel, M.; Lammers, R. B.; Collier, N. T.; Kliskey, A.; Alessa, L.; Williams, P.; Cason, D.

    2013-12-01

    Water management in urban settings is often under the control of multiple entities and institutions that may exist at different scales, have varying aims and capabilities, and serve different ends. The impact of water management structure on a given area's ability to respond to short- and long-term water challenges is an open question. Public perception is an important aspect of this response: public knowledge of both water management structure and water issues is key to motivating and shaping individual and institutional adaptive responses to challenges of water supply or shortage, water quality, and other problems. Our study asks how public perception and discourse capture and reflect local water management institutional structure. We examine four study areas in the Colorado Basin for which several years of newspaper articles (100,000+ documents) are available for data mining and where water management is an important issue: Las Vegas, NV; Tucson, AZ; Flagstaff, AZ; and the cities in the Grand Valley, CO. These four areas experienced different historical trajectories that have influenced different water management structures, both in terms of physical infrastructure and social and institutional arrangements. We present a method and software for performing Natural Language Processing to extract the names of water management entities from readily available sources. Standard techniques for discovering proper nouns are used, then specific internal and contextual criteria are applied that identify likely names of institutions. Documents in the corpus are scored based on the frequency of occurrence of water keywords. Institutions are then scored according to their association with water-related documents. The result is a list of highly water-related regional and local institutions. The resulting list is used to create a network, with edges between any two institutions established and weighted by the count of the documents in which both institutions are discussed
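
    The network-building step described in this abstract (an edge between two institutions, weighted by the number of documents that mention both) can be sketched as follows. This is a minimal illustration, not the authors' software: the function name `cooccurrence_edges` and the institution names are hypothetical, and the name-extraction step is assumed to have already been done.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(doc_institutions):
    """Weight each institution pair by the number of documents naming both."""
    edges = Counter()
    for names in doc_institutions:
        # Deduplicate within a document so each document counts once per pair.
        for a, b in combinations(sorted(set(names)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical per-document lists of already-extracted institution names.
docs = [
    ["Water Authority", "City Council"],
    ["Water Authority", "City Council", "Irrigation District"],
    ["Irrigation District"],
]
edges = cooccurrence_edges(docs)
```

    Sorting the names within each document gives every pair a single canonical key, so the same two institutions are never counted under two different orderings.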

  9. An Examination of Western Influences on Indigenous Language Teaching.

    ERIC Educational Resources Information Center

    Mellow, J. Dean

    To examine the influence of Western perspectives on indigenous language teaching, a two-dimensional framework of approaches to language teaching is presented. A horizontal continuum concerning the nature of language ranges between form and function, and a vertical continuum concerning the nature of language learning ranges between construction and…

  11. How could language have evolved?

    PubMed

    Bolhuis, Johan J; Tattersall, Ian; Chomsky, Noam; Berwick, Robert C

    2014-08-01

    The evolution of the faculty of language largely remains an enigma. In this essay, we ask why. Language's evolutionary analysis is complicated because it has no equivalent in any nonhuman species. There is also no consensus regarding the essential nature of the language "phenotype." According to the "Strong Minimalist Thesis," the key distinguishing feature of language (and what evolutionary theory must explain) is hierarchical syntactic structure. The faculty of language is likely to have emerged quite recently in evolutionary terms, some 70,000-100,000 years ago, and does not seem to have undergone modification since then, though individual languages do of course change over time, operating within this basic framework. The recent emergence of language and its stability are both consistent with the Strong Minimalist Thesis, which has at its core a single repeatable operation that takes exactly two syntactic elements a and b and assembles them to form the set {a, b}.
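
    The single operation described at the end of this abstract, which takes two syntactic elements a and b and forms the set {a, b}, can be illustrated with a short sketch. Representing lexical items as strings and syntactic objects as Python `frozenset` values is an assumption made for this example, as is the function name `merge`.

```python
def merge(a, b):
    """Combine exactly two syntactic objects into the unordered set {a, b}."""
    return frozenset({a, b})

# Repeated application yields hierarchical structure: {they, {read, books}}.
vp = merge("read", "books")
clause = merge("they", vp)
```

    Because the result of one application can be an input to the next, repeated use of the same operation builds nested, hierarchical structure rather than a flat string of words.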

  12. Language evolution in the laboratory.

    PubMed

    Scott-Phillips, Thomas C; Kirby, Simon

    2010-09-01

    The historical origins of natural language cannot be observed directly. We can, however, study systems that support language and we can also develop models that explore the plausibility of different hypotheses about how language emerged. More recently, evolutionary linguists have begun to conduct language evolution experiments in the laboratory, where the emergence of new languages used by human participants can be observed directly. This enables researchers to study both the cognitive capacities necessary for language and the ways in which languages themselves emerge. One theme that runs through this work is how individual-level behaviours result in population-level linguistic phenomena. A central challenge for the future will be to explore how different forms of information transmission affect this process. PMID:20675183

  13. How Much Language Is Enough? Some Immigrant Language Lessons from Canada and Germany. Discussion Paper.

    ERIC Educational Resources Information Center

    DeVoretz, Don J.; Hinte, Holger; Werner, Christiane

    Germany and Canada are at opposite ends of the debate over language integration and ascension to citizenship. German naturalization contains an explicit language criterion. The first German immigration act will not only concentrate on control aspects but also focus on language as a criterion for legal immigration. Canada does…

  14. Language Development and Early Encounters with Written Language.

    ERIC Educational Resources Information Center

    Baghban, Marcia

    The language development of one child was examined from birth to three years of age in order to map the similarities and differences in the acquisition of oral language, reading, and writing skills. The study also sought to provide insight into why learning to read and write is not as naturally easy as learning to talk. Data were collected by…

  15. Second Language Acquisition, Teacher Education and Language Pedagogy

    ERIC Educational Resources Information Center

    Ellis, Rod

    2010-01-01

    Various positions regarding the Second Language Acquisition (SLA)-Language Pedagogy (LP) nexus have been advanced. Taking these as a starting point, this article will examine the nature of the SLA/LP relationship both more generally and more concretely. First, it will place the debates evident in the different positions regarding the relationship…

  16. Genotype Specification Language.

    PubMed

    Wilson, Erin H; Sagawa, Shiori; Weis, James W; Schubert, Max G; Bissell, Michael; Hawthorne, Brian; Reeves, Christopher D; Dean, Jed; Platt, Darren

    2016-06-17

    We describe here the Genotype Specification Language (GSL), a language that facilitates the rapid design of large and complex DNA constructs used to engineer genomes. The GSL compiler implements a high-level language based on traditional genetic notation, as well as a set of low-level DNA manipulation primitives. The language allows facile incorporation of parts from a library of cloned DNA constructs and from the "natural" library of parts in fully sequenced and annotated genomes. GSL was designed to engage genetic engineers in their native language while providing a framework for higher level abstract tooling. To this end we define four language levels, Level 0 (literal DNA sequence) through Level 3, with increasing abstraction of part selection and construction paths. GSL targets an intermediate language based on DNA slices that translates efficiently into a wide range of final output formats, such as FASTA and GenBank, and includes formats that specify instructions and materials such as oligonucleotide primers to allow the physical construction of the GSL designs by individual strain engineers or an automated DNA assembly core facility. PMID:26886161

  18. The language of quality.

    PubMed

    Loughlin, M

    1996-05-01

    Management theorists have developed a language which, they claim, can be used to evaluate many diverse practices, including practices in health care. This language embodies conceptualizations of practice and an approach to evaluation which treat the concept of quality as foundational and which have links with free market ideology. Despite an extensive literature which attempts to apply this language to various areas of life, its fundamental conceptual assumptions remain largely unexamined. Without adequate philosophical arguments in support of these assumptions, the value of this language and the validity of the approach to practice that it embodies are unproven. Its imposition in the absence of such arguments therefore represents a form of intellectual imperialism. To understand and develop adequate responses to this situation, it is necessary to look at the broad political picture which affects the nature of debates in specific areas of practice, such as the health service, and to question the dominant paradigm governing practical debate in contemporary society.

  19. Language Acquisition and Language Revitalization

    ERIC Educational Resources Information Center

    O'Grady, William; Hattori, Ryoko

    2016-01-01

    Intergenerational transmission, the ultimate goal of language revitalization efforts, can only be achieved by (re)establishing the conditions under which an imperiled language can be acquired by the community's children. This paper presents a tutorial survey of several key points relating to language acquisition and maintenance in children,…

  20. Language Switching and Language Competition

    ERIC Educational Resources Information Center

    Macizo, Pedro; Bajo, Teresa; Paolieri, Daniela

    2012-01-01

    This study examined the asymmetrical language switching cost in a word reading task (Experiment 1) and in a categorization task (Experiments 2 and 3). In Experiment 1, Spanish-English bilinguals named words in first language (L1) and second language (L2) in a switching paradigm. They were slower to switch from their weaker L2 to their more dominant…

  1. Language shift, bilingualism and the future of Britain's Celtic languages.

    PubMed

    Kandler, Anne; Unger, Roman; Steele, James

    2010-12-12

    'Language shift' is the process whereby members of a community in which more than one language is spoken abandon their original vernacular language in favour of another. The historical shifts to English by Celtic language speakers of Britain and Ireland are particularly well-studied examples for which good census data exist for the most recent 100-120 years in many areas where Celtic languages were once the prevailing vernaculars. We model the dynamics of language shift as a competition process in which the numbers of speakers of each language (both monolingual and bilingual) vary as a function both of internal recruitment (as the net outcome of birth, death, immigration and emigration rates of native speakers), and of gains and losses owing to language shift. We examine two models: a basic model in which bilingualism is simply the transitional state for households moving between alternative monolingual states, and a diglossia model in which there is an additional demand for the endangered language as the preferred medium of communication in some restricted sociolinguistic domain, superimposed on the basic shift dynamics. Fitting our models to census data, we successfully reproduce the demographic trajectories of both languages over the past century. We estimate the rates of recruitment of new Scottish Gaelic speakers that would be required each year (for instance, through school education) to counteract the 'natural wastage' as households with one or more Gaelic speakers fail to transmit the language to the next generation informally, for different rates of loss during informal intergenerational transmission.
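
    The idea of bilingualism as a transitional state between two monolingual states can be illustrated with a deliberately simplified toy simulation. This sketch does not reproduce the authors' equations: it ignores internal recruitment and any diglossia term, and the function name `shift_step` and both rate parameters are hypothetical values chosen only to show the direction of flow.

```python
def shift_step(mono_celtic, bilingual, mono_english, to_bilingual, to_english):
    """Advance one time step: Celtic monolinguals become bilingual,
    and bilingual households shift to English monolingualism."""
    moved_ab = to_bilingual * mono_celtic
    moved_bc = to_english * bilingual
    return (
        mono_celtic - moved_ab,
        bilingual + moved_ab - moved_bc,
        mono_english + moved_bc,
    )

# Hypothetical starting shares and per-step shift rates (not fitted to data).
state = (0.6, 0.3, 0.1)
history = [state]
for _ in range(10):
    state = shift_step(*state, to_bilingual=0.10, to_english=0.05)
    history.append(state)
```

    In this toy version the population shares always sum to one, and the endangered-language monolingual share can only decline, which is why a model of revitalization needs an explicit recruitment term of the kind the abstract describes.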

  2. The Functional Properties of Language.

    ERIC Educational Resources Information Center

    Garvin, Paul L.

    This paper asks whether the imprecision and complexity of natural language, as opposed to the language of science or logic, represent flaws or essential functional properties. It is argued that ambiguity can be manipulated by the speaker through environmentally derived characteristics. A discussion follows on the study of the functions of language…

  3. Grammar and the Spoken Language.

    ERIC Educational Resources Information Center

    Carter, Ronald; McCarthy, Michael

    1995-01-01

    Argues that second language teaching that aims to foster speaking skills and natural spoken interaction should be based upon the grammar of spoken language, not on grammars that mainly reflect written norms. Using evidence from a mini-corpus of conversational English, it is shown that popular pedagogical grammars are deficient in conversational…

  4. Grammar and the Spoken Language.

    ERIC Educational Resources Information Center

    Carter, Ronald; McCarthy, Michael

    This paper argues that second language instruction that aims to foster speaking skills and natural spoken interaction should be based upon the grammar of the spoken language, and not on grammars that reflect written norms. Using evidence from a corpus of conversational English, this examination focuses on how four grammatical features that occur…

  5. Neural Network Processing of Natural Language: II. Towards a Unified Model of Corticostriatal Function in Learning Sentence Comprehension and Non-Linguistic Sequencing

    ERIC Educational Resources Information Center

    Dominey, Peter Ford; Inui, Toshio; Hoen, Michel

    2009-01-01

    A central issue in cognitive neuroscience today concerns how distributed neural networks in the brain that are used in language learning and processing can be involved in non-linguistic cognitive sequence learning. This issue is informed by a wealth of functional neurophysiology studies of sentence comprehension, along with a number of recent…

  6. Cross-Disciplinary Dialogue about the Nature of Oral and Written Language Problems in the Context of Developmental, Academic, and Phenotypic Profiles

    ERIC Educational Resources Information Center

    Silliman, Elaine R.; Berninger, Virginia W.

    2011-01-01

    Professionals across disciplines who assess and teach students with language problems should develop their own standards for best professional practices to improve the diagnostic and treatment (instructional) services in schools and nonschool settings rather than assessing only for eligibility for categories of special education services according…

  7. Bedtime Stories in English: Field-Testing Comprehensible Input Materials for Natural Second-Language Acquisition in Japanese Pre-School Children

    ERIC Educational Resources Information Center

    Hamilton, Robert

    2014-01-01

    In this study, the prototype of a new type of bilingual picture book was field-tested with two sets of mother-son subject pairs. This picture book was designed as a possible tool for providing children with comprehensible input during their critical period for second language acquisition. Context is provided by visual cues and both Japanese and…

  8. Exploring the Ancestral Roots of American Sign Language: Lexical Borrowing from Cistercian Sign Language and French Sign Language

    ERIC Educational Resources Information Center

    Cagle, Keith Martin

    2010-01-01

    American Sign Language (ASL) is the natural and preferred language of the Deaf community in both the United States and Canada. Woodward (1978) estimated that approximately 60% of the ASL lexicon is derived from early 19th century French Sign Language, which is known as "langue des signes francaise" (LSF). The lexicon of LSF and ASL may be derived…

  9. Spoken Language Phonotactics.

    ERIC Educational Resources Information Center

    Hieke, A. E.

    The transformation that language undergoes when it becomes speech is examined in English. Statistical analysis of a representative sample of natural, informal speech reveals a number of characteristics of dynamic speech that distinguish it from static (citation form or pre-dynamic) linguistic form. It appears that in running speech, vowels and…

  10. Language & Literature. Curriculum Handbook.

    ERIC Educational Resources Information Center

    Livonia Public Schools, MI.

    The global education curriculum presented in this booklet is offered as a model of integrated, interdisciplinary English studies that involves participants in cultural, scientific, ecological, and economic issues while promoting student awareness of the nature and development of world literature, languages, the arts, and their…

  11. Language Teaching and Acquisition of Communication.

    ERIC Educational Resources Information Center

    Sajavaara, Kari; Lehtonen, Jaakko

    A theoretical linguistic model is insufficient to deal with the problems of language teaching because of the complexity of the phenomena concerned and the dynamic nature of language acquisition and communication. Most linguistic models neglect the fact that, in communicative situations, language users construct the prerequisites of communicative…

  12. Rank and Sparsity in Language Processing

    ERIC Educational Resources Information Center

    Hutchinson, Brian

    2013-01-01

    Language modeling is one of many problems in language processing that have to grapple with naturally high ambient dimensions. Even in large datasets, the number of unseen sequences is overwhelmingly larger than the number of observed ones, posing clear challenges for estimation. Although existing methods for building smooth language models tend to…
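
    The sparsity claim in this abstract can be made concrete with a toy count. For a vocabulary of V word types there are V^n possible n-grams, yet a corpus attests only a small share of them. The sketch below is illustrative only: the helper `ngram_coverage` and the sample sentence are assumptions, not material from the cited work.

```python
from collections import Counter


def ngram_coverage(tokens, n):
    """Return (observed, possible) counts of distinct n-grams for a token list."""
    vocab = set(tokens)
    # Count every contiguous window of n tokens.
    observed = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    # Every ordered combination of vocabulary items is a possible n-gram.
    possible = len(vocab) ** n
    return len(observed), possible


corpus = "the cat sat on the mat and the dog sat on the rug".split()
for n in (1, 2, 3):
    seen, possible = ngram_coverage(corpus, n)
    print(f"n={n}: {seen} observed of {possible} possible")
```

    Even in this tiny example the gap widens quickly as n grows, which is the estimation problem that smoothing methods try to address.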

  13. Mirror Neurons and the Evolution of Language

    ERIC Educational Resources Information Center

    Corballis, Michael C.

    2010-01-01

    The mirror system provided a natural platform for the subsequent evolution of language. In nonhuman primates, the system provides for the understanding of biological action, and possibly for imitation, both prerequisites for language. I argue that language evolved from manual gestures, initially as a system of pantomime, but with gestures…

  14. Influences of Foreign Language Study on English

    ERIC Educational Resources Information Center

    Boyd, Rachel

    1977-01-01

    Discussion of a study to determine the correlation between better performance of foreign language students in English and natural talent for languages, transfer of foreign language skills to English, or both talent and transfer. The procedure and results are described. Support for the third hypothesis (talent and transfer) was gained. (AMH)

  15. Facilitating Second Language Learning with Music

    ERIC Educational Resources Information Center

    Bae, Su-Young

    2006-01-01

    The use of music in facilitating second language (as well as first language) learning is supported by evidence that points to the musical nature of even preverbal infants. Music and language have been found to develop similarly, and researchers have noted advantages to using song in learning. The author observed her Korean 21-month-old for …

  16. Content-based Instruction for African Languages.

    ERIC Educational Resources Information Center

    Moshi, Lioba

    2001-01-01

    Examines content-based instruction for African languages and considers Schleicher's (2000) expatiation of goal-based instruction for African languages. Focuses on the parameters for content-based instruction, the profile of a content-based instructional program, the nature of content-based instruction, the first steps for African languages, and…

  17. Arthroscopy language.

    PubMed

    Zahiri, H; Brazina, G; Zahiri, C A

    1994-09-01

    The authors have devised an "arthroscopy language" to make orthopaedic surgeons' intraoperative communication clear, comprehensive, and concise. This language specifically eliminates surgeons' "freestyle" conversation at the most crucial moments of their procedure, when concentration and the coordinated work of two surgeons are essential. The language uses current arthroscopic terminology and new words that have been adapted by the authors to describe all the basic maneuvers that are used during any arthroscopic procedure. The authors believe the language brings the necessary scientific sophistication into arthroscopic surgeons' speech in the operating theater. PMID:7800401

  18. Complexity in language acquisition.

    PubMed

    Clark, Alexander; Lappin, Shalom

    2013-01-01

    Learning theory has frequently been applied to language acquisition, but discussion has largely focused on information-theoretic problems, in particular on the absence of direct negative evidence. Such arguments typically neglect the probabilistic nature of cognition and learning in general. We argue first that these arguments, and analyses based on them, suffer from a major flaw: they systematically conflate the hypothesis class and the learnable concept class. As a result, they do not allow one to draw significant conclusions about the learner. Second, we claim that the real problem for language learning is the computational complexity of constructing a hypothesis from input data. Studying this problem allows for a more direct approach to the object of study (the language acquisition device) rather than the learnable class of languages, which is epiphenomenal and possibly hard to characterize. The learnability results informed by complexity studies are much more insightful. They strongly suggest that target grammars need to be objective, in the sense that the primitive elements of these grammars are based on objectively definable properties of the language itself. These considerations support the view that language acquisition proceeds primarily through data-driven learning of some form.

  19. Programming languages for synthetic biology.

    PubMed

    Umesh, P; Naveen, F; Rao, Chanchala Uma Maheswara; Nair, Achuthsankar S

    2010-12-01

    Against the backdrop of accelerated efforts to create synthetic organisms, the nature and scope of an ideal programming language for scripting synthetic organisms in silico have been receiving increasing attention. A few programming languages for synthetic biology capable of defining, constructing, networking, editing and delivering genome-scale models of cellular processes have recently been attempted. All these represent important points in a spectrum of possibilities. This paper introduces Kera, a state-of-the-art programming language for synthetic biology which is arguably ahead of similar languages or tools such as GEC, Antimony and GenoCAD. Kera is a full-fledged object-oriented programming language that is tempered by a biopart rule library named Samhita, which captures knowledge regarding the interaction of genome components and catalytic molecules. Prominent features of the language are demonstrated through a toy example, and the road map for the future development of Kera is also presented. PMID:22132053
