Sample records for natural language parsers

  1. Policy-Based Management Natural Language Parser

    NASA Technical Reports Server (NTRS)

    James, Mark

    2009-01-01

    The Policy-Based Management Natural Language Parser (PBEM) is a rules-based approach to enterprise management that can be used to automate certain management tasks. This parser simplifies the management of a given endeavor by establishing policies to deal with situations that are likely to occur. Policies are operating rules that can be referred to as a means of maintaining order, security, consistency, or other ways of successfully furthering a goal or mission. PBEM provides a way of managing configuration of network elements, applications, and processes via a set of high-level rules or business policies rather than managing individual elements, thus switching the control to a higher level. This software allows unique management rules (or commands) to be specified and applied to a cross-section of the Global Information Grid (GIG). This software embodies a parser that is capable of recognizing and understanding conversational English. Because all possible dialect variants cannot be anticipated, a unique capability was developed that parses based on conversational intent rather than the exact way the words are used. This software can increase productivity by enabling a user to converse with the system in conversational English to define network policies. PBEM can be used in both manned and unmanned science-gathering programs. Because policy statements can be domain-independent, this software can be applied equally to a wide variety of applications.

  2. Benchmarking natural-language parsers for biological applications using dependency graphs

    PubMed Central

    Clegg, Andrew B; Shepherd, Adrian J

    2007-01-01

    Background Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. Results Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. Conclusion Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques. PMID:17254351

  3. Recognizing noun phrases in medical discharge summaries: an evaluation of two natural language parsers.

    PubMed Central

    Spackman, K. A.; Hersh, W. R.

    1996-01-01

    We evaluated the ability of two natural language parsers, CLARIT and the Xerox Tagger, to identify simple noun phrases in medical discharge summaries. In twenty randomly selected discharge summaries, there were 1909 unique simple noun phrases. CLARIT and the Xerox Tagger exactly identified 77.0% and 68.7% of the phrases, respectively, and partially identified 85.7% and 80.8% of the phrases. Neither system had been specially modified or tuned to the medical domain. These results suggest that it is possible to apply existing natural language processing (NLP) techniques to large bodies of medical text, in order to empirically identify the terminology used in medicine. Virtually all the noun phrases could be regarded as having special medical connotation and would be candidates for entry into a controlled medical vocabulary. PMID:8947647

  4. CoNLL 2008: Proceedings of the 12th Conference on Computational Natural Language Learning, pages 198-202 Manchester, August 2008

    E-print Network

    -based DepParser (MST Parser) Graph-based DepParser Deep Linguistic Parser (ERG/PET) Predicate Identification CoNLL 2008: Proceedings of the 12th Conference on Computational Natural Language Learning, pages 198-202 Manchester, August 2008 Hybrid Learning of Dependency Structures from Heterogeneous Linguistic

  5. Building a Parser for ATC language in the project seminar

    E-print Network

    Ladkin, Peter B.

    the ATC Parser using Flex and Bison; 3 Generating the Flex and Bison input files; 3.1 The Flex input file; 3.2 The Bison input file; 3.3.5 The class BisonRule; 4

  6. Building a Parser for ATC language in the project seminar

    E-print Network

    Ladkin, Peter B.

    Generating the ATC Parser using Flex and Bison; 3 Generating the Flex and Bison input files; 3.1 The Flex input file; 3.2 The Bison input; 3.3.5 The class BisonRule; 4

  7. Wait-and-See Strategies for Parsing Natural Language

    E-print Network

    Marcus, Mitchell P.

    The intent of this paper is to convey one idea central to the structure of a natural language parser currently under development, the notion of wait-and-see strategies. This notion will hopefully allow the recognition of ...

  8. A natural language interface to databases

    NASA Technical Reports Server (NTRS)

    Ford, D. R.

    1988-01-01

    The development of a Natural Language Interface which is semantic-based and uses Conceptual Dependency representation is presented. The system was developed using Lisp and currently runs on a Symbolics Lisp machine. A key point is that the parser handles morphological analysis, which extends the range of words it can understand.

  9. Automatic Prediction of Parser Accuracy Sujith Ravi and Kevin Knight

    E-print Network

    Knight, Kevin

    @languageweaver.com Abstract Statistical parsers have become increasingly accurate, to the point where they are useful in many on any given domain. 1 Introduction Statistical natural language parsers have recently become more given in the large literature on parsing to date is to have human annotators build parse trees

  10. An Introductory Lisp Parser.

    ERIC Educational Resources Information Center

    Loritz, Donald

    1987-01-01

    Gives a short grammar of the Lisp computer language. Presents an introductory English parser (Simparse) as an example of how to write a parser in Lisp. Lists references for further explanation. Intended as preparation for teachers who may use computer-assisted language instruction in the future. (LMO)

  11. Speed up of XML parsers with PHP language implementation

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2012-11-01

    In this paper, the authors introduce PHP5's XML implementation and show how to read, parse, and write a short and uncomplicated XML file using SimpleXML in a PHP environment. The possibilities for using the PHP5 language together with the XML standard are described. The details of the parsing process with SimpleXML are also clarified. A practical project, PHP-XML-MySQL, presents the advantages of XML implementation in PHP modules. This approach allows comparatively simple search of XML hierarchical data by means of PHP software tools. The proposed project includes a database, which can be extended with new data and new XML parsing functions.

  12. The Fourth Symposium on Natural Language Processing 2000 AN HPSG FORMALISM FOR IMPLEMENTING A JAVA

    E-print Network

    Keselj, Vlado

    PARSER Vlado Keselj Department of Computer Science, University of Waterloo, Waterloo, ON N2L 3G1 Grammars (HPSGs), named Stefy. The parser is used in an Internet information retrieval system. The specifi... to implement a natural language processing (NLP) system for Internet information retrieval (IR). This larger

  13. An initial study of full parsing of clinical text using the Stanford Parser

    Microsoft Academic Search

    Hua Xu; Samir AbdelRahman; Min Jiang; Jung-wei Fan; Yang Huang

    2011-01-01

    Full parsing recognizes a sentence and generates a syntactic structure of it (a parse tree), which is useful for many natural language processing (NLP) applications. The Stanford Parser is one of the state-of-the-art parsers in the general English domain. However, there is no formal evaluation of its performance in clinical text that often contains ungrammatical structures. In this study, we

  14. Natural Language Processing.

    ERIC Educational Resources Information Center

    Chowdhury, Gobinda G.

    2003-01-01

    Discusses issues related to natural language processing, including theoretical developments; natural language understanding; tools and techniques; natural language text processing systems; abstracting; information extraction; information retrieval; interfaces; software; Internet, Web, and digital library applications; machine translation for…

  15. Bootstrapping parsers via syntactic projection across parallel texts

    Microsoft Academic Search

    REBECCA HWA; PHILIP RESNIK; AMY WEINBERG; CLARA CABEZAS; OKAN KOLAK

    2005-01-01

    Broad coverage, high quality parsers are available for only a handful of languages. A prerequisite for developing broad coverage parsers for more languages is the annotation of text with the desired linguistic representations (also known as ...

  16. Processing Natural Language without Natural Language Processing

    Microsoft Academic Search

    Eric Brill

    2003-01-01

    We can still create computer programs displaying only the most rudimentary natural language processing capabilities. One of the greatest barriers to advanced natural language processing is our inability to overcome the linguistic knowledge acquisition bottleneck. In this paper, we describe recent work in a number of areas, including grammar checker development, automatic question answering, and language modeling, where state of

  17. Parsing clinical text: how good are the state-of-the-art parsers?

    PubMed Central

    2015-01-01

    Background Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain. Methods In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which is developed based on pathology notes and clinical notes, containing 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using the 10-fold cross validation method. Finally we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Results Our results showed that the original parsers achieved lower performance in clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebank, all parsers achieved better performance, with the best performance from the Stanford parser that reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the combined clinical Treebanks and Penn Treebank were used, of the three parsers, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F-measure of 84.15% on the MiPACQ corpus. Conclusions Our study demonstrates that re-training using clinical Treebanks is critical for improving general English parsers' performance on clinical text, and combining clinical and open domain corpora might achieve optimal performance for parsing clinical text. PMID:26045009
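
    The Bracketing F-measure reported in this record can be illustrated with a minimal sketch. It assumes a PARSEVAL-style labeled bracket comparison; the abstract does not name the scoring tool or its conventions (for example, punctuation handling), so the tree encoding and the function names below are hypothetical and for illustration only.

```python
from collections import Counter

def labeled_brackets(tree, start=0):
    """Collect (label, start, end) for every constituent of a tree.

    A tree is a (label, children) tuple; leaves are plain word strings.
    This encoding is an assumption made for the sketch.
    """
    label, children = tree
    brackets = Counter()
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1  # a word advances the position by one token
        else:
            child_brackets, pos = labeled_brackets(child, pos)
            brackets += child_brackets
    brackets[(label, start, pos)] += 1
    return brackets, pos

def bracketing_f_measure(gold_tree, predicted_tree):
    """Harmonic mean of labeled bracket precision and recall."""
    gold, _ = labeled_brackets(gold_tree)
    pred, _ = labeled_brackets(predicted_tree)
    matched = sum((gold & pred).values())  # multiset intersection
    if matched == 0:
        return 0.0
    precision = matched / sum(pred.values())
    recall = matched / sum(gold.values())
    return 2 * precision * recall / (precision + recall)

# Toy clinical-style sentence: "Patient denies chest pain"
gold = ("S", [("NP", ["Patient"]),
              ("VP", ["denies", ("NP", ["chest", "pain"])])])
pred = ("S", [("NP", ["Patient"]),
              ("VP", ["denies", ("NP", ["chest"]), ("NP", ["pain"])])])
score = bracketing_f_measure(gold, pred)
```

    In this toy case three of the five predicted brackets match three of the four gold brackets, so precision is 0.6, recall is 0.75, and the F-measure is 2/3.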

  18. Natural language processing.

    PubMed

    Joshi, A K

    1991-09-13

    Natural language processing (NLP) is the study of mathematical and computational modeling of various aspects of language and the development of a wide range of systems. These include spoken language systems that integrate speech and natural language; cooperative interfaces to databases and knowledge bases that model aspects of human-human interaction; multilingual interfaces; machine translation; and message-understanding systems, among others. Research in NLP is highly interdisciplinary, involving concepts in computer science, linguistics, logic, and psychology. NLP has a special role in computer science because many aspects of the field deal with linguistic features of computation and NLP seeks to model language computationally. PMID:17831443

  19. Persian language understanding using a two-step extended hidden vector state parser

    Microsoft Academic Search

    Fattaneh Jabbari; Hossein Sameti; Mohammad Hadi Bokaei

    2011-01-01

    The key element of a spoken dialogue system is a spoken language understanding (SLU) unit. Hidden Vector State (HVS) is one of the most popular statistical approaches employed to implement the SLU unit. This paper presents a two-step approach for Persian language understanding. First, a goal detector is used to identify the main goal of the input utterance. Second, after

  20. Natural Language Interface

    NSDL National Science Digital Library

    Lane, David M.

    This case study, by David M. Lane of Rice University, assesses the question, "Is it easier to learn to use computer software that uses natural language commands?" Main concepts are analysis of covariance, adjusted means, and boxplots. The experimental design, descriptive statistics, inferential statistics, and raw data are all given.

  1. Connectionism and Determinism in a Syntactic Parser STAN C. KWASNY & KANAAN A. FAISAL

    E-print Network

    Faisal, Kanaan Abed

    11 9 -Chapter 7 Connectionism and Determinism in a Syntactic Parser STAN C. KWASNY & KANAAN A: Connectionism, determinism, learning, natural language processing, neural networks, parsing. 1. Introduction as symbolic approaches, stand as an important counterpoint. In connectionism, there is the promise of robust

  2. La Description des langues naturelles en vue d'applications linguistiques: Actes du colloque (The Description of Natural Languages with a View to Linguistic Applications: Conference Papers). Publication K-10.

    ERIC Educational Resources Information Center

    Ouellon, Conrad, Comp.

    Presentations from a colloquium on applications of research on natural languages to computer science address the following topics: (1) analysis of complex adverbs; (2) parser use in computerized text analysis; (3) French language utilities; (4) lexicographic mapping of official language notices; (5) phonographic codification of Spanish; (6)…

  3. A Concept-Centric Framework for Building Natural Language Interfaces

    NASA Astrophysics Data System (ADS)

    Funakoshi, Kotaro; Nakano, Mikio; Hasegawa, Yuji; Tsujino, Hiroshi

    Natural language interfaces are expected to come into practical use in many situations. It is, however, not practical to expect to achieve a universal interface because language use is so diverse. To that end, not only advancements in speech and language technologies but also well-designed development frameworks are required so that developers can build domain-specific interfaces rapidly and easily. This paper proposes KNOLU, a framework for building natural language interfaces for a broad range of applications. Developers using this framework can easily build an interface capable of understanding subsets of natural language expressions just by providing an ontology (a concept hierarchy with semantic frames and a lexicon), an onomasticon (a set of instances and their names) and API functions that provide procedural knowledge required to connect the interface to a target application. To develop an interface using KNOLU, first developers define a concept hierarchy for a target domain. Then they provide other declarative and procedural knowledge components with these knowledge components associated to the hierarchy. This developmental flow affords an unobstructed view both for development and maintenance. KNOLU uses an existing general-purpose parser and requires neither grammar rules nor expression patterns. It does not require rules to generate semantic interpretations from parsing results, either. Therefore, developers can build an interface without deep knowledge and experience of natural language processing. We applied KNOLU to two applications and confirmed the effectiveness.

  4. Programming Languages, Natural Languages, and Mathematics

    ERIC Educational Resources Information Center

    Naur, Peter

    1975-01-01

    Analogies are drawn between the social aspects of programming and similar aspects of mathematics and natural languages. By analogy with the history of auxiliary languages it is suggested that Fortran and Cobol will remain dominant. (Available from the Association for Computing Machinery, 1133 Avenue of the Americas, New York, NY 10036.) (Author/TL)

  5. Readings in natural language processing

    SciTech Connect

    Grosz, B.J.; Jones, K.S.; Webber, B.L.

    1986-01-01

    The book presents papers on natural language processing, focusing on the central issues of representation, reasoning, and recognition. The introduction discusses theoretical issues, historical developments, and current problems and approaches. The book presents work in syntactic models (parsing and grammars), semantic interpretation, discourse interpretation, language action and intentions, language generation, and systems.

  6. A Natural Language Graphics System.

    ERIC Educational Resources Information Center

    Brown, David, C.; Kwasny, Stan C.

    This report describes an experimental system for drawing simple pictures on a computer graphics terminal using natural language input. The system is capable of drawing lines, points, and circles on command from the user, as well as answering questions about system capabilities and objects on the screen. Erasures are permitted and language input…

  7. Toward understanding natural language directions

    E-print Network

    Kollar, Thomas Fleming

    Speaking using unconstrained natural language is an intuitive and flexible way for humans to interact with robots. Understanding this kind of linguistic input is challenging because diverse words and phrases must be mapped ...

  8. Advances in natural language processing.

    PubMed

    Hirschberg, Julia; Manning, Christopher D

    2015-07-17

    Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area. PMID:26185244

  9. A natural language interface for real-time dialogue in the flight domain

    NASA Technical Reports Server (NTRS)

    Ali, M.; Ai, C.-S.; Ferber, H. J.

    1986-01-01

    A flight expert system (FLES) is being developed to assist pilots in monitoring, diagnosing and recovering from in-flight faults. To provide a communications interface between the flight crew and FLES, a natural language interface (NALI) has been implemented. Input to NALI is processed by three processors: (1) the semantic parser, (2) the knowledge retriever, and (3) the response generator. The architecture of NALI has been designed to process both temporal and nontemporal queries. Provisions have also been made to reduce the number of system modifications required for adapting NALI to other domains. This paper describes the architecture and implementation of NALI.

  10. Natural language processing Laboratory Plan: An essential for Persian language

    Microsoft Academic Search

    Mohammad Azadnia; Sina Rezagholizadeh; Alireza Yari

    2010-01-01

    The prevalent use of human language in computer systems and in language-oriented document processing, especially on the web, has created a need to design mechanisms for developing natural language processing. In this paper, a survey of natural language processing laboratories is carried out and a framework for their development is proposed. Achieving this goal, the specifications, structures and activities of

  11. The parser generator as a general purpose tool

    NASA Technical Reports Server (NTRS)

    Noonan, R. E.; Collins, W. R.

    1985-01-01

    The parser generator has proven to be an extremely useful, general purpose tool. It can be used effectively by programmers having only a knowledge of grammars and no training at all in the theory of formal parsing. Some of the application areas for which a table-driven parser can be used include interactive query languages, menu systems, translators, and programming support tools. Each of these is illustrated by an example grammar.
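
    The idea of a table-driven parser for an interactive query language can be sketched as follows. This is a hand-written LL(1) table for a toy grammar invented for this illustration (S -> "show" id W; W -> "where" id "=" id | empty); it is not one of the report's example grammars, and a parser generator would normally derive such a table from the grammar automatically.

```python
# Parse table: (nonterminal, lookahead token) -> right-hand side to expand.
# The grammar and token names are assumptions made for this sketch.
TABLE = {
    ("S", "show"): ["show", "id", "W"],
    ("W", "where"): ["where", "id", "=", "id"],
    ("W", "$"): [],  # W may be empty at end of input
}
NONTERMINALS = {"S", "W"}
KEYWORDS = {"show", "where", "="}

def tokenize(text):
    """Map each whitespace-separated word to a token kind, then add the end marker."""
    return [w if w in KEYWORDS else "id" for w in text.split()] + ["$"]

def accepts(text):
    """Return True if the text is a sentence of the toy query grammar."""
    tokens = tokenize(text)
    stack = ["$", "S"]  # end marker at the bottom, start symbol on top
    pos = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            rule = TABLE.get((top, lookahead))
            if rule is None:
                return False  # no table entry: syntax error
            stack.extend(reversed(rule))  # leftmost symbol ends up on top
        elif top == lookahead:
            pos += 1  # terminal matched, consume the token
        else:
            return False
    return pos == len(tokens)
```

    The driver loop never changes; only TABLE, NONTERMINALS and the tokenizer depend on the grammar, which is what lets the same engine serve query languages, menu systems and similar front ends.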

  12. Danica D. Damljanovic Natural Language

    E-print Network

    Stevenson, Mark

    Danica D. Damljanović Natural Language Interfaces to Conceptual Models Submitted in partial difficulties for non-expert users. One way to lower the learning overhead and make ontology queries more through all difficult moments that existed while working on this thesis and helping out with all those

  13. Database semantics for natural language

    Microsoft Academic Search

    Roland Hausser

    2001-01-01

    This paper presents a formal 'fragment' of database semantics as a declarative model of a cognitive agent. It is called a SLIM machine and functionally integrates the procedures of natural language interpretation, conceptualization, and production as well as query and inference. Each of these functions is illustrated explicitly by a corresponding LA-grammar. In addition, a control structure based on the

  14. Models of natural language understanding.

    PubMed Central

    Bates, M

    1995-01-01

    This paper surveys some of the fundamental problems in natural language (NL) understanding (syntax, semantics, pragmatics, and discourse) and the current approaches to solving them. Some recent developments in NL processing include increased emphasis on corpus-based rather than example- or intuition-based work, attempts to measure the coverage and effectiveness of NL systems, dealing with discourse and dialogue phenomena, and attempts to use both analytic and stochastic knowledge. Critical areas for the future include grammars that are appropriate to processing large amounts of real language; automatic (or at least semi-automatic) methods for deriving models of syntax, semantics, and pragmatics; self-adapting systems; and integration with speech processing. Of particular importance are techniques that can be tuned to such requirements as full versus partial understanding and spoken language versus text. Portability (the ease with which one can configure an NL system for a particular application) is one of the largest barriers to application of this technology. PMID:7479812

  15. GEMINI: A Natural Language System for Spoken-Language Understanding

    Microsoft Academic Search

    John Dowding; Jean Mark Gawron; Douglas E. Appelt; John Bear; Lynn Cherny; Robert C. Moore; Douglas B. Moran

    1993-01-01

    Gemini is a natural language understanding system developed for spoken language applications. This paper describes the details of the system, and includes relevant measurements of size, efficiency, and performance of each of its sub-components in detail.

  16. Disambiguating the species of biomedical named entities using natural language parsers

    Microsoft Academic Search

    Xinglong Wang; Jun-ichi Tsujii; Sophia Ananiadou

    2010-01-01

    Motivation: Text mining technologies have been shown to reduce the laborious work involved in organising the vast amount of information hidden in the literature. One challenge in text mining is linking ambiguous word forms to unambiguous biological concepts. This paper reports on a comprehensive study on resolving the ambiguity in mentions of biomedical named entities with respect to model

  17. Towards Designing Natural Language Interfaces

    Microsoft Academic Search

    Svetlana Sheremetyeva

    2003-01-01

    The paper addresses issues of designing natural language interfaces that guide users towards expert ways of thinking. It attempts to contribute to an interface methodology with a case study, the AutoPat interface, an application for authoring technical documents, such as patent claims. Content and composition support is provided through access to domain models, words and phrases as well as to the application

  18. MIA -Master on Artificial Intelligence Advanced Natural Language Processing

    E-print Network

    Ageno, Alicia

    Advanced Natural Language Processing Machine Learning Review MIA - Master on Artificial Intelligence Advanced Natural Language Processing Advanced Natural Language Processing Machine Learning . . . Advanced Natural Language Processing Machine Learning Review Introduction Other relevant

  19. Unsupervised learning of natural languages.

    PubMed

    Solan, Zach; Horn, David; Ruppin, Eytan; Edelman, Shimon

    2005-08-16

    We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The adios (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics. PMID:16087885

  20. Unsupervised learning of natural languages

    PubMed Central

    Solan, Zach; Horn, David; Ruppin, Eytan; Edelman, Shimon

    2005-01-01

    We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The adios (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics. PMID:16087885

  1. Natural Language of Application Domains versus Domain Specific Programming Languages

    E-print Network

    Rus, Teodor

    and implement programs using programming languages. Hence, programming is done by computer experts Natural Language of Application Domains versus Domain Specific Programming Languages Cuong Bui to be a programmer, that is, to be a computer expert. We believe programming can be made easier for the computer user

  2. Natural Language Processing by Computer and Language Teaching.

    ERIC Educational Resources Information Center

    Cook, V. J.; Fass, D.

    1986-01-01

    Outlines some applications of Natural Language Processing, the capacity of computers to process human language, to language teaching. The role of syntactic parsing (the assigning of grammatical structure to sentences by computer) and semantically based processing is examined. (Author/SED)

  3. NATURAL LANGUAGE PROCESSING FOR REQUIREMENTS ENGINEERING

    E-print Network

    NATURAL LANGUAGE PROCESSING FOR REQUIREMENTS ENGINEERING: APPLICABILITY TO LARGE REQUIREMENTS a case study on application of natural language processing in very early stages of software development language processing (NLP) is not ripe enough to be used in requirements engineering, we can nevertheless

  4. Designing a Constraint Based Parser for Sanskrit

    NASA Astrophysics Data System (ADS)

    Kulkarni, Amba; Pokar, Sheetal; Shukl, Devanand

    Verbal understanding (śābdabodha) of any utterance requires the knowledge of how words in that utterance are related to each other. Such knowledge is usually available in the form of cognition of grammatical relations. Generative grammars describe how a language codes these relations. Thus the knowledge of what information various grammatical relations convey is available from the generation point of view and not the analysis point of view. In order to develop a parser based on any grammar one should then know precisely the semantic content of the grammatical relations expressed in a language string, the clues for extracting these relations and finally whether these relations are expressed explicitly or implicitly. Based on the design principles that emerge from this knowledge, we model the parser as finding a directed tree, given a graph with nodes representing the words and edges representing the possible relations between them. Further, we also use the Mīmāṃsā constraint of ākāṅkṣā (expectancy) to rule out non-solutions and sannidhi (proximity) to prioritize the solutions. We have implemented a parser based on these principles and its performance was found to be satisfactory, giving us confidence to extend its functionality to handle complex sentences.
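
    The formulation in this record (choose a directed tree from a graph of candidate relations, discard candidates that violate expectancy, rank the rest by proximity) can be sketched with a brute-force search. Everything concrete here is an assumption for illustration: expectancy is modeled as "a head takes at most one dependent per relation label", proximity as the sum of head-dependent distances, and the example sentence and relation labels are not taken from the paper.

```python
from itertools import product

def is_tree(choice):
    """Every word must reach the root by following heads, without cycles."""
    for node in range(len(choice)):
        seen = set()
        while choice[node] is not None:
            if node in seen:
                return False
            seen.add(node)
            node = choice[node][0]
    return True

def expectancy_ok(choice):
    """Assumed constraint: each (head, relation) slot is filled at most once."""
    filled = [c for c in choice if c is not None]
    return len(filled) == len(set(filled))

def best_parse(n_words, candidate_edges):
    """Pick one incoming edge per word; None marks the root.

    candidate_edges is a list of (head, dependent, relation) triples
    over word indices 0..n_words-1.
    """
    incoming = {d: [None] for d in range(n_words)}
    for head, dep, rel in candidate_edges:
        incoming[dep].append((head, rel))
    best, best_cost = None, None
    for choice in product(*(incoming[d] for d in range(n_words))):
        if sum(c is None for c in choice) != 1:  # exactly one root
            continue
        if not is_tree(choice) or not expectancy_ok(choice):
            continue
        # Proximity: prefer heads that are close to their dependents.
        cost = sum(abs(c[0] - d) for d, c in enumerate(choice) if c is not None)
        if best_cost is None or cost < best_cost:
            best, best_cost = choice, cost
    return best

# Words: 0 "ramah", 1 "vanam", 2 "gacchati" (verb). "vanam" is ambiguous
# between agent and object, so both candidate edges are listed.
edges = [(2, 0, "karta"), (2, 1, "karta"), (2, 1, "karma")]
parse = best_parse(3, edges)
```

    Here the reading in which both nouns fill the agent slot of the verb is ruled out by the expectancy check, leaving the agent/object analysis. The exhaustive enumeration is exponential in sentence length and only serves to make the constraints visible.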

  5. Ambiguity resolution analysis in incremental parsing of natural language.

    PubMed

    Costa, Fabrizio; Frasconi, Paolo; Lombardo, Vincenzo; Sturt, Patrick; Soda, Giovanni

    2005-07-01

    Incremental parsing gains its importance in natural language processing and psycholinguistics because of its cognitive plausibility. Modeling the associated cognitive data structures, and their dynamics, can lead to a better understanding of the human parser. In earlier work, we have introduced a recursive neural network (RNN) capable of performing syntactic ambiguity resolution in incremental parsing. In this paper, we report a systematic analysis of the behavior of the network that allows us to gain important insights about the kind of information that is exploited to resolve different forms of ambiguity. In attachment ambiguities, in which a new phrase can be attached at more than one point in the syntactic left context, we found that learning from examples allows us to predict the location of the attachment point with high accuracy, while the discrimination amongst alternative syntactic structures with the same attachment point is slightly better than making a decision purely based on frequencies. We also introduce several new ideas to enhance the architectural design, obtaining significant improvements of prediction accuracy, up to 25% error reduction on the same dataset used in previous work. Finally, we report large scale experiments on the entire Wall Street Journal section of the Penn Treebank. The best prediction accuracy of the model on this large dataset is 87.6%, a relative error reduction larger than 50% compared to previous results. PMID:16121736

  6. Integration of speech with natural language understanding.

    PubMed Central

    Moore, R C

    1995-01-01

    The integration of speech recognition with natural language understanding raises issues of how to adapt natural language processing to the characteristics of spoken language; how to cope with errorful recognition output, including the use of natural language information to reduce recognition errors; and how to use information from the speech signal, beyond just the sequence of words, as an aid to understanding. This paper reviews current research addressing these questions in the Spoken Language Program sponsored by the Advanced Research Projects Agency (ARPA). I begin by reviewing some of the ways that spontaneous spoken language differs from standard written language and discuss methods of coping with the difficulties of spontaneous speech. I then look at how systems cope with errors in speech recognition and at attempts to use natural language information to reduce recognition errors. Finally, I discuss how prosodic information in the speech signal might be used to improve understanding. PMID:7479813

  7. Lagrangian relaxation for natural language decoding

    E-print Network

    Rush, Alexander M. (Alexander Matthew)

    2014-01-01

    The major success story of natural language processing over the last decade has been the development of high-accuracy statistical methods for a wide-range of language applications. The availability of large textual data ...

  8. Arabic natural language processing A. Belad, Loria

    E-print Network

    Paris-Sud XI, Université de

    language of Pakistan and is closely related to Hindi, though a lot of Urdu vocabulary comes from Persian1 Arabic natural language processing A. Belaïd, Loria Introduction The automatic recognition is a Semitic language spoken and understood in various forms by millions of people throughout the Middle East

  9. Symbolic connectionism in natural language disambiguation

    Microsoft Academic Search

    Samuel W. K. Chan; James Franklin

    1998-01-01

    Natural language understanding involves the simultaneous consideration of a large number of different sources of information. Traditional methods employed in language analysis have focused on developing powerful formalisms to represent syntactic or semantic structures along with rules for transforming language into these formalisms. However, they make use of only small subsets of knowledge. This article will describe how to

  10. Java Mathematical Expression Parser

    NSDL National Science Digital Library

    Funk, Nathan.

    The Java Mathematical Expression Parser (JEP) is a handy tool "for parsing and evaluating mathematical expressions." It is a no-frills package that incorporates several important features, including user-definable functions and implicit multiplication for easy use. JEP can be downloaded as a complete application, or a couple of its features can be used online as applets. There is a separate page of documentation and installation instructions. Also available on this Web site is the AutoAbacus, which allows users to input a system of equations and obtain the solutions instantaneously.

  11. Deep Learning for Natural Language Processing

    E-print Network

    Collobert, Ronan

    Deep Learning for Natural Language Processing Ronan Collobert Jason Weston NEC Labs America;Deep Learning for Natural Language Processing Ronan Collobert Jason Weston NEC Labs America, Princeton Learning As with the history of the world, machine learning has a history of and exploration exploitation

  12. Commonsense Reasoning in and over Natural Language

    E-print Network

    . Structured as a network of semi-structured natural language fragments, ConceptNet presently consists of overCommonsense Reasoning in and over Natural Language Hugo Liu and Push Singh Media Laboratory Massachusetts Institute of Technology Cambridge, MA 02139, USA {hugo,push}@media.mit.edu Abstract. Concept

  13. Natural Language Processing on the Web

    E-print Network

    Natural Language Processing on the Web Guy Lapalme RALI-DIRO, Université de Montréal http://www.iro.umontreal.ca/~lapalme Overview · What is Natural Language Processing (NLP) · NLP for the Web · The Web for NLP recognition http://rali.iro.umontreal.ca NLP for the syntactic Web search engines · NLP saved

  14. Natural language and spatial reasoning

    E-print Network

    Tellex, Stefanie, 1980-

    2010-01-01

    Making systems that understand language has long been a dream of artificial intelligence. This thesis develops a model for understanding language about space and movement in realistic situations. The system understands ...

  15. A Table Look-Up Parser in Online ILTS Applications

    ERIC Educational Resources Information Center

    Chen, Liang; Tokuda, Naoyuki; Hou, Pingkui

    2005-01-01

    A simple table look-up parser (TLUP) has been developed for parsing and consequently diagnosing syntactic errors in semi-free formatted learners' input sentences of an intelligent language tutoring system (ILTS). The TLUP finds a parse tree for a correct version of an input sentence, diagnoses syntactic errors of the learner by tracing and…

  16. Foundations of Statistical Natural Language Processing

    Microsoft Academic Search

    Christopher D. Manning; Hinrich Schütze

    1999-01-01

    Abstract: this paper as "the first clear demonstration of a probabilistic parser outperforming a trigram model" (pg. 457), it does not discuss what features of the algorithm lead to its superior results

  17. A lex-based mad parser and its applications

    SciTech Connect

    Oleg Krivosheev et al.

    2001-07-03

    An embeddable and portable Lex-based MAD language parser has been developed. The parser consists of a front-end which reads a MAD file and keeps beam elements, beam line data and algebraic expressions in tree-like structures, and a back-end, which processes the front-end data to generate an input file or data structures compatible with user applications. Three working programs are described, namely, a MAD to C++ converter, a dynamic C++ object factory and a MAD-MARS beam line builder. Design and implementation issues are discussed.

  18. Natural language interface for command and control

    NASA Technical Reports Server (NTRS)

    Shuler, Robert L., Jr.

    1986-01-01

    A working prototype of a flexible 'natural language' interface for command and control situations is presented. This prototype is analyzed from two standpoints. First is the role of natural language for command and control, its realistic requirements, and how well the role can be filled with current practical technology. Second, technical concepts for implementation are discussed and illustrated by their application in the prototype system. It is also shown how adaptive or 'learning' features can greatly ease the task of encoding language knowledge in the language processor.

  19. Language and the Multisemiotic Nature of Mathematics

    ERIC Educational Resources Information Center

    de Oliveira, Luciana C.; Cheng, Dazhi

    2011-01-01

    This article explores how language and the multisemiotic nature of mathematics can present potential challenges for English language learners (ELLs). Based on two qualitative studies of the discourse of mathematics, we discuss some of the linguistic challenges of mathematics for ELLs in order to highlight the potential difficulties they may have…

  20. Visual Tools for Natural Language Processing

    Microsoft Academic Search

    Robert J. Gaizauskas; Peter J. Rodgers; Kevin Humphreys

    2001-01-01

    We describe GATE, the General Architecture for Text Engineering, an integrated visual development environment to support the visual assembly, execution and analysis of modular natural language processing systems. The visual model is an executable data flow program graph, automatically synthesised from data dependency declarations of language processing modules. The graph is then directly executable: modules are run interactively in the

  1. Introduction: Natural Language Processing and Information Retrieval.

    ERIC Educational Resources Information Center

    Smeaton, Alan F.

    1990-01-01

    Discussion of research into information and text retrieval problems highlights the work with automatic natural language processing (NLP) that is reported in this issue. Topics discussed include the occurrences of nominal compounds; anaphoric references; discontinuous language constructs; automatic back-of-the-book indexing; and full-text analysis.…

  2. THE NATURAL-LANGUAGE APPROACH TO PSYCHOMETRICS.

    ERIC Educational Resources Information Center

    HELM, CARL E.

    A COMPUTER PROGRAMING SYSTEM HAS BEEN DEVISED THAT WILL ALLOW THE RESEARCHER TO SPECIFY ANY OR ALL VARIABLES ENTERING INTO A PARTICULAR SIMULATION, ALONG WITH THE FUNCTIONS WHICH DEFINE THE RELATIONSHIPS BETWEEN VARIABLES. THE SYSTEM USES A "SPECIAL-PURPOSE-PROGRAMING LANGUAGE" BASED ON THE NATURAL LANGUAGE DESCRIPTION THE SCIENTIST USES TO…

  3. Natural language search of structured documents

    E-print Network

    Oney, Stephen W

    2008-01-01

    This thesis focuses on techniques with which natural language can be used to search for specific elements in a structured document, such as an XML file. The goal is to create a system capable of being trained to identify ...

  4. Natural Language Tools for Information Extraction

    E-print Network

    Shapiro, Stuart C.

    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 4 Tools that automatically process text 34 4.1 AeroText (Lockheed MartinNatural Language Tools for Information Extraction for Soft Target Exploitation and Fusion Final

  5. Decoding algorithms for complex natural language tasks

    E-print Network

    Deshpande, Pawan

    2007-01-01

    This thesis focuses on developing decoding techniques for complex Natural Language Processing (NLP) tasks. The goal of decoding is to find an optimal or near optimal solution given a model that defines the goodness of a ...

  6. Phonetic symbolism in natural languages

    Microsoft Academic Search

    Roger W. Brown; Abraham H. Black; Arnold E. Horowitz

    1955-01-01

    Three separate investigations, using three lists of English words and six foreign languages, have shown superior to chance agreement and accuracy in the translation of unfamiliar tongues. The agreement can be explained as the result of a \\

  7. Natural Language Introduction to NLP

    E-print Network

    Inkpen, Diana

    pages, medical records, financial filings, etc.) 2. Conversational agents are becoming an important form 1/11/2014 Speech and Language Processing - Jurafsky and Martin 6 Text Analytics · Data-mining

  8. Graphical law beneath each written natural language

    E-print Network

    Anindya Kumar Biswas

    2013-10-08

    We study twenty-four written natural languages. We plot, on a log scale, the number of words starting with a letter against the rank of the letter, both normalised. We find that all the graphs are of a similar type. The graphs are tantalisingly close to the curves of reduced magnetisation vs reduced temperature for magnetic materials. We make a weak conjecture that a curve of magnetisation underlies a written natural language.
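
    The plotted quantity described in this abstract can be sketched as follows. The abstract does not say how the two axes are normalised, so dividing each count by the largest count and each rank by the largest rank is an assumption made here, and `letter_rank_curve` is a hypothetical name.

```python
import math
from collections import Counter

def letter_rank_curve(words):
    """Return (log normalised rank, log normalised count) points.

    Counts how many words start with each letter, ranks letters by that
    count (rank 1 = most common), then normalises rank by the number of
    letters seen and count by the largest count (assumed normalisation).
    """
    counts = Counter(w[0].lower() for w in words if w and w[0].isalpha())
    ordered = sorted(counts.values(), reverse=True)
    if not ordered:
        return []
    top, n = ordered[0], len(ordered)
    return [(math.log(rank / n), math.log(count / top))
            for rank, count in enumerate(ordered, start=1)]

curve = letter_rank_curve(
    ["apple", "avocado", "apricot", "banana", "blueberry", "cherry"])
```

    Each point can then be plotted directly; with this normalisation the most frequent letter sits at height 0 and the least frequent letter at horizontal position 0.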

  9. Edinburgh Research Explorer Historical Post Office Directory Parser (POD Parser) Software

    E-print Network

    Millar, Andrew J.

    Edinburgh Research Explorer Historical Post Office Directory Parser (POD Parser) Software From, 'Historical Post Office Directory Parser (POD Parser) Software From the AddressingHistory Project' Journal Office Directories(PODs) with contemporaneous historical maps. TPODs emerged during the late

  10. Combining Semantic Wikis and Controlled Natural Language

    E-print Network

    Kuhn, Tobias

    2008-01-01

    We demonstrate AceWiki that is a semantic wiki using the controlled natural language Attempto Controlled English (ACE). The goal is to enable easy creation and modification of ontologies through the web. Texts in ACE can automatically be translated into first-order logic and other languages, for example OWL. Previous evaluation showed that ordinary people are able to use AceWiki without being instructed.

  11. Enabling Software Analysis using PetitParser and Michael Rufenacht

    E-print Network

    Nierstrasz, Oscar

    1 Enabling Software Analysis using PetitParser and Moose Michael Rüfenacht Software Composition--Static software analysis is an important process in software quality assurance. Building suitable tools e, software analysis tools have a pretty static nature, concerning both the source code parsing and analysis

  12. Evaluating healthcare quality using natural language processing.

    PubMed

    Baldwin, Karen Brandt

    2008-01-01

    Consistent monitoring for quality indicators such as adverse events or missed screening opportunities remains a difficult proposition for most healthcare organizations. Much of the clinical data needed for quality reports is imbedded in narrative reports in the electronic health record. Narrative data most often require costly retrieval by manual data extraction. NUD*IST, a qualitative research computer program, was used as an automated natural language processing tool to extract and code data for analysis of screening and treatment for breast cancer. The study method demonstrated acceptable levels of precision and recall compared to large-scale natural language processing programs. PMID:18680924

  13. Connection Science, Vol. 2, Nos 1 & 2, 1990 63 Connectionism and Determinism in a Syntactic Parser

    E-print Network

    Faisal, Kanaan Abed

    Connection Science, Vol. 2, Nos 1 & 2, 1990 63 Connectionism and Determinism in a Syntactic Parser, ungrammatical and lexically ambiguous sentences. KEYWORDS: Connectionism, determinism, learning, natural counterpoint. In connectionism, there is the promise of robust decision making, generalization, and other

  14. The Universal Parser Architecture for Knowledge-based Machine Translation

    Microsoft Academic Search

    Masaru Tomita; Jaime G. Carbonell

    1987-01-01

    Machine translation should be semantically accurate, linguistically principled, user-interactive, and extensible to multiple languages and domains. This paper presents the universal parser architecture that strives to meet these objectives. In essence, linguistic knowledge bases (syntactic, semantic, lexical, pragmatic), encoded in theoretically-motivated formalisms such as lexical-functional grammars, are unified and precompiled into fast run-time grammars for parsing and generation. Thus, the universal

  15. Entropy analysis of natural language written texts

    NASA Astrophysics Data System (ADS)

    Papadimitriou, C.; Karamanos, K.; Diakonos, F. K.; Constantoudis, V.; Papageorgiou, H.

    2010-08-01

    The aim of the present work is to investigate the relative contribution of ordered and stochastic components in natural written texts and to examine the influence of text category and language on these. To this end, a binary representation of written texts and the generated symbolic sequences are examined by the standard block entropy analysis, and the Shannon and Kolmogorov entropies are obtained. It is found that both entropies are sensitive to both language and text category, with the text category sensitivity following almost the same trends in both languages considered (English and Greek). The values of these entropies are compared with those of stochastically generated symbolic sequences, and the nature of correlations present in this representation of real written texts is identified.
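
    As a rough illustration of block entropy analysis on a binary sequence, the sketch below computes the Shannon entropy of overlapping length-n blocks. The abstract does not specify the binary representation used, so `to_bits` (the bits of the UTF-8 encoding) is only an assumed stand-in, and the Kolmogorov entropy estimate mentioned in the abstract is not covered.

```python
import math
from collections import Counter

def to_bits(text):
    """Assumed binary representation: the bits of the UTF-8 bytes."""
    return [int(b) for byte in text.encode("utf-8") for b in format(byte, "08b")]

def block_entropy(symbols, n):
    """Shannon entropy, in bits, of the overlapping length-n blocks."""
    blocks = [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]
    total = len(blocks)
    counts = Counter(blocks)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_rate_estimate(symbols, n):
    """Difference H(n+1) - H(n), a common estimate of entropy per symbol."""
    return block_entropy(symbols, n + 1) - block_entropy(symbols, n)

bits = to_bits("natural language")
profile = [block_entropy(bits, n) for n in range(1, 6)]
```

    Comparing such a profile against the one obtained from a shuffled or randomly generated sequence of the same length is one way to see how much of the structure comes from correlations rather than symbol frequencies.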

  16. The Rhetorical Parsing of Natural Language Texts

    Microsoft Academic Search

    Daniel Marcu

    1997-01-01

    We derive the rhetorical structures of texts by means of two new, surface-form-based algorithms: one that identifies discourse usages of cue phrases and breaks sentences into clauses, and one that produces valid rhetorical structure trees for unrestricted natural language texts. The algorithms use information that was derived from a corpus analysis of cue phrases.

  17. Natural Language Annotations for the Semantic Web

    E-print Network

    Massachusetts Institute of Technology (MIT), Computer Science and Artificial Intelligence Laboratory, InfoLab

    Natural Language Annotations for the Semantic Web Boris Katz 1 , Jimmy Lin 1 , and Dennis Quan 2 1. Because the ultimate purpose of the Semantic Web is to help users locate, organize, and process of the Semantic Web, was designed to be easily processed by computers, not humans. To render RDF friendlier

  18. Natural Language Annotations for the Semantic Web

    E-print Network

    Lin, Jimmy

    Natural Language Annotations for the Semantic Web Boris Katz1 , Jimmy Lin1 , and Dennis Quan2 1 MIT. Because the ultimate purpose of the Semantic Web is to help users locate, organize, and process of the Semantic Web, was designed to be easily processed by computers, not humans. To render RDF friendlier

  19. Field-effect natural language semantic mapping

    Microsoft Academic Search

    Stuart H. Rubin; Shu-Ching Chen; Mei-Ling Shyu

    2003-01-01

    This paper addresses the problem of mapping natural language to its semantics. It presupposes that the input is in random (compressed) form and proceeds to detail a methodology for extracting the semantics from that normal form. The idea is to enumerate contextual cues and learn to associate those cues with meaning. The process is inherently fuzzy and for this reason

  20. Natural Language Information Retrieval: Progress Report.

    ERIC Educational Resources Information Center

    Perez-Carballo, Jose; Strzalkowski, Tomek

    2000-01-01

    Reports on the progress of the natural language information retrieval project, a joint effort led by GE (General Electric) Research, and its evaluation at the sixth TREC (Text Retrieval Conference). Discusses stream-based information retrieval, which uses alternative methods of document indexing; advanced linguistic streams; weighting; and query…

  1. A search engine for natural language applications

    Microsoft Academic Search

    Michael J. Cafarella; Oren Etzioni

    2005-01-01

    Many modern natural language-processing applications utilize search engines to locate large numbers of Web documents or to compute statistics over the Web corpus. Yet Web search engines are designed and optimized for simple human queries---they are not well suited to support such applications. As a result, these applications are forced to issue millions of successive queries resulting in unnecessary search

  2. On natural language dialogue with assistive robots

    Microsoft Academic Search

    Vladimir A. Kulyukin

    2006-01-01

    This paper examines the appropriateness of natural language dialogue (NLD) with assistive robots. Assistive robots are defined in terms of an existing human-robot interaction taxonomy. A decision support procedure is outlined for assistive technology researchers and practitioners to evaluate the appropriateness of NLD in assistive robots. Several conjectures are made on when NLD may be appropriate as a human-robot interaction

  3. Enhanced Text Retrieval Using Natural Language Processing.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth D.

    1998-01-01

    Defines natural language processing (NLP); describes the use of NLP in information retrieval (IR); provides seven levels of linguistic analysis: phonological, morphological, lexical, syntactic, semantic, discourse, and pragmatic. Discusses the commercial use of NLP in IR with the example of DR-LINK (Document Retrieval using LINguistic Knowledge)…

  4. Attacks on Lexical Natural Language Steganography Systems

    Microsoft Academic Search

    Cuneyt M. Taskiran; Umut Topkara; Mercan Topkara; Edward J. Delp

    ABSTRACT Text data forms the largest bulk of digital data that people encounter and exchange daily. For this reason the potential usage of text data as a covert channel for secret communication is an imminent concern. Even though information hiding into natural language text has started to attract great interest, there has been no study on attacks against these applications.

  5. Vector-based Natural Language Call Routing

    Microsoft Academic Search

    Jennifer Chu-Carroll; Bob Carpenter

    1999-01-01

    This paper describes a domain-independent, automatically trained natural language call router for directing incoming calls in a call center. Our call router directs customer calls based on their response to an open-ended query. Based on n-gram terms extracted from the caller's request, the caller is 1) routed to the appropriate destination, 2) transferred to a human operator, or 3) asked a disambiguation question.
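
    The three-way routing decision described in this abstract can be sketched with a simple vector comparison. This is a simplified illustration, not the paper's system: it uses raw n-gram counts and cosine similarity with no term weighting, and the `accept` and `margin` thresholds, the function names, and the two example destinations are all hypothetical.

```python
import math
from collections import Counter

def ngram_terms(text, max_n=2):
    """Count the word n-grams (n = 1..max_n) in a piece of text."""
    tokens = text.lower().split()
    terms = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            terms[" ".join(tokens[i:i + n])] += 1
    return terms

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def route(request, destinations, accept=0.3, margin=0.1):
    """Return ("route", name), ("disambiguate", names) or ("operator", None)."""
    query = ngram_terms(request)
    scored = sorted(((cosine(query, vec), name)
                     for name, vec in destinations.items()), reverse=True)
    best_score, best = scored[0]
    if best_score < accept:            # nothing matches well enough
        return ("operator", None)
    close = [name for score, name in scored if best_score - score <= margin]
    if len(close) > 1:                 # several destinations are too close
        return ("disambiguate", sorted(close))
    return ("route", best)

# Hypothetical destination vectors built from example phrases.
destinations = {
    "loans": ngram_terms("car loan home loan mortgage rate"),
    "accounts": ngram_terms("checking account savings account balance"),
}
decision = route("I want a car loan", destinations)
```

    In practice the destination vectors would be built from a corpus of previously routed calls, and the thresholds would be tuned on held-out data.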

  6. Lexical Knowledge Representation and Natural Language Processing

    Microsoft Academic Search

    James Pustejovsky; Branimir Boguraev

    1993-01-01

    Pustejovsky, J. and B. Boguraev, Lexical knowledge representation and natural language processing, Artificial Intelligence 63 (1993) 193-223. Traditionally, semantic information in computational lexicons is limited to notions such as selectional restrictions or domain-specific constraints, encoded in a \\

  7. A Natural Language Translation Neural Network

    Microsoft Academic Search

    Nenad KONCAR; Gregory GUTHRIE; Connectionist NLP

    proper translation by a user without any expert knowledge of how the computer stores and represents rules. This paper demonstrates the utility of neural networks in precisely this area on a small scale translation problem. We have tested the ability of neural networks to perform natural language translation. Our results have shown a greatly improved translation accuracy in comparison to

  8. Towards a Bio-computational Model of Natural Language Learning

    E-print Network

    Boyer, Edmond

    Towards a Bio-computational Model of Natural Language Learning Leonor Becerra-Bonache Laboratoire in natural language learning. 1 Introduction Children, independently of their culture and the language motivated research in formal models of language learning [14,13]. Such mod- els can allow us to address

  9. Natural language processing, pragmatics, and verbal behavior

    PubMed Central

    Cherpas, Chris

    1992-01-01

    Natural Language Processing (NLP) is that part of Artificial Intelligence (AI) concerned with endowing computers with verbal and listener repertoires, so that people can interact with them more easily. Most attention has been given to accurately parsing and generating syntactic structures, although NLP researchers are finding ways of handling the semantic content of language as well. It is increasingly apparent that understanding the pragmatic (contextual and consequential) dimension of natural language is critical for producing effective NLP systems. While there are some techniques for applying pragmatics in computer systems, they are piecemeal, crude, and lack an integrated theoretical foundation. Unfortunately, there is little awareness that Skinner's (1957) Verbal Behavior provides an extensive, principled pragmatic analysis of language. The implications of Skinner's functional analysis for NLP and for verbal aspects of epistemology lead to a proposal for a “user expert”—a computer system whose area of expertise is the long-term computer user. The evolutionary nature of behavior suggests an AI technology known as genetic algorithms/programming for implementing such a system. PMID:22477052

  10. Natural Language Processing and User Modeling: Synergies and Limitations

    Microsoft Academic Search

    Ingrid Zukerman; Diane J. Litman

    2001-01-01

    The fields of user modeling and natural language processing have been closely linked since the early days of user modeling. Natural language systems consult user models in order to improve their understanding of users' requirements and to generate appropriate and relevant responses. At the same time, the information natural language systems obtain from their users is expected to

  11. Representing Requirements in Natural Language as Concept Lattices

    E-print Network

    Richards, Debbie

    Representing Requirements in Natural Language as Concept Lattices Debbie Richards and Kathrin on the translation of natural language into crosstables to allow generation of concept lattices using FCA. 2 Introducing the Process and the Foundational Concepts To translate use cases in natural language into concept

  12. Natural Language Processing: Toward Large-Scale, Robust Systems.

    ERIC Educational Resources Information Center

    Haas, Stephanie W.

    1996-01-01

    Natural language processing (NLP) is concerned with getting computers to do useful things with natural language. Major applications include machine translation, text generation, information retrieval, and natural language interfaces. Reviews important developments since 1987 that have led to advances in NLP; current NLP applications; and problems…

  13. Symbolic Natural Language 3.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

    E-print Network

    Paris-Sud XI, Université de

    155 CHAPTER 3 Symbolic Natural Language Processing 3.0 Introduction Combinatorics on Words, Lothaire (Ed.) (2005) 164-209" #12;156 Symbolic Natural Language Processing some . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 3.0. Introduction Fundamental notions of combinatorics on words underlie natural language pro

  14. Automated database design from natural language input

    NASA Technical Reports Server (NTRS)

    Gomez, Fernando; Segami, Carlos; Delaune, Carl

    1995-01-01

    Users and programmers of small systems typically do not have the skills needed to design a database schema from an English description of a problem. This paper describes a system that automatically designs databases for such small applications from English descriptions provided by end-users. Although the system has been motivated by the space applications at Kennedy Space Center, and portions of it have been designed with that idea in mind, it can be applied to different situations. The system consists of two major components: a natural language understander and a problem-solver. The paper describes briefly the knowledge representation structures constructed by the natural language understander, and, then, explains the problem-solver in detail.

  15. Learning procedures from interactive natural language instructions

    NASA Technical Reports Server (NTRS)

    Huffman, Scott B.; Laird, John E.

    1994-01-01

    Despite its ubiquity in human learning, very little work has been done in artificial intelligence on agents that learn from interactive natural language instructions. In this paper, the problem of learning procedures from interactive, situated instruction is examined in which the student is attempting to perform tasks within the instructional domain, and asks for instruction when it is needed. Presented is Instructo-Soar, a system that behaves and learns in response to interactive natural language instructions. Instructo-Soar learns completely new procedures from sequences of instruction, and also learns how to extend its knowledge of previously known procedures to new situations. These learning tasks require both inductive and analytic learning. Instructo-Soar exhibits a multiple execution learning process in which initial learning has a rote, episodic flavor, and later executions allow the initially learned knowledge to be generalized properly.

  16. Linear separability in superordinate natural language concepts

    Microsoft Academic Search

    Wim Ruts; Gert Storms; James Hampton

    2004-01-01

    Two experiments are reported in which linear separability was investigated in superordinate natural language concept pairs (e.g., toiletry-sewing gear). Representations of the exemplars of semantically related concept pairs were derived in two to five dimensions using multidimensional scaling (MDS) of similarities based on possession of the concept features. Next, category membership, obtained from an exemplar generation study (in Experiment 1) and

  17. An expert system for natural language processing

    NASA Technical Reports Server (NTRS)

    Hennessy, John F.

    1988-01-01

    A solution to the natural language processing problem that uses a rule-based system, written in OPS5, to replace the traditional parsing method is proposed. The advantages of using a rule-based system are explored. Specifically, the extensibility of a rule-based solution is discussed as well as the value of maintaining rules that function independently. Finally, the power of using semantics to supplement the syntactic analysis of a sentence is considered.

  18. Robust natural language dialogues for instruction tasks

    NASA Astrophysics Data System (ADS)

    Scheutz, Matthias

    2010-04-01

    Being able to understand and carry out spoken natural instructions even in limited domains is extremely challenging for current robots. The difficulties are multifarious, ranging from problems with speech recognizers to difficulties with parsing disfluent speech or resolving references based on perceptual or task-based knowledge. In this paper, we present our efforts at starting to address these problems with an integrated natural language understanding system implemented in our DIARC architecture on a robot that can handle fairly unconstrained spoken ungrammatical and incomplete instructions reliably in a limited domain.

  19. Symbolic connectionism in natural language disambiguation.

    PubMed

    Chan, S K; Franklin, J

    1998-01-01

    Natural language understanding involves the simultaneous consideration of a large number of different sources of information. Traditional methods employed in language analysis have focused on developing powerful formalisms to represent syntactic or semantic structures along with rules for transforming language into these formalisms. However, they make use of only small subsets of knowledge. This article will describe how to use the whole range of information through a neurosymbolic architecture which is a hybridization of a symbolic network and subsymbol vectors generated from a connectionist network. Besides initializing the symbolic network with prior knowledge, the subsymbol vectors are used to enhance the system's capability in disambiguation and provide flexibility in sentence understanding. The model captures a diversity of information including word associations, syntactic restrictions, case-role expectations, semantic rules and context. It attains highly interactive processing by representing knowledge in an associative network on which actual semantic inferences are performed. An integrated use of previously analyzed sentences in understanding is another important feature of our model. The model dynamically selects one hypothesis among multiple hypotheses. This notion is supported by three simulations which show the degree of disambiguation relies both on the amount of linguistic rules and the semantic-associative information available to support the inference processes in natural language understanding. Unlike many similar systems, our hybrid system is more sophisticated in tackling language disambiguation problems by using linguistic clues from disparate sources as well as modeling context effects into the sentence analysis. It is potentially more powerful than any systems relying on one processing paradigm. PMID:18255763

  20. An Overview of Computer-Based Natural Language Processing.

    ERIC Educational Resources Information Center

    Gevarter, William B.

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial…

  1. ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING IN A CHINESE LANGUAGE LEARNING GAME

    E-print Network

    Miles, Will

    ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING IN A CHINESE LANGUAGE LEARNING GAME to learn as a second language. Not only is the vocabulary considerably different from that of English the difficulty of learning this language. These sentences will then be used in a game that will aid the learning

  2. A computational model to connect gestalt perception and natural language

    E-print Network

    Dhande, Sheel Sanjay, 1979-

    2003-01-01

    We present a computational model that connects gestalt visual perception and language. The model grounds the meaning of natural language words and phrases in terms of the perceptual properties of visually salient groups. ...

  3. Fuzzy Modeling and Natural Language Processing for Panini's Sanskrit Grammar

    E-print Network

    Reddy, P Venkata Subba

    2010-01-01

    Indian languages have a long history among the world's natural languages. Panini was the first to define a grammar for the Sanskrit language, with about 4000 rules, in the fifth century. These rules contain uncertain information, and computer processing of Sanskrit is not possible with uncertain information. In this paper, fuzzy logic and fuzzy reasoning are proposed to eliminate uncertain information when reasoning with Sanskrit grammar. Sanskrit language processing is also discussed.

  4. Language Model and Sentence Structure Manipulations for Natural Language Application Systems

    E-print Network

    of natural language expressions plays a central role; natural language manipulation facilities are indis, translation, etc., are indispensable for developing and maintaining natural language application systems or sentential conceptual unit. The sentence analysis is carried out as a CF instantiation process, in which several CFs are

  5. Natural language processing and advanced information management

    NASA Technical Reports Server (NTRS)

    Hoard, James E.

    1989-01-01

    Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate the material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides a natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures that will satisfy the users' requests. The best candidate for an intelligent agent that can satisfactorily make use of RSs and transform documents (TDs) appears to be an object-oriented database (OODB). OODBs apparently have an inherent capacity to use the large numbers of RSs and TDs that will be required by an AIM system, and to use them in an effective way.

  6. Understanding and representing natural language meaning

    NASA Astrophysics Data System (ADS)

    Waltz, D. L.; Maran, L. R.; Dorfman, M. H.; Dinitz, R.; Farwell, D.

    1982-12-01

    During this contract period the authors have: (1) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (2) written a parsing program which selects appropriate word and sentence meanings by a parallel process known as activation and inhibition; (3) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (4) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (5) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (6) constructed a general model for the representation of tense and aspect of verbs; (7) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.

  7. Outline Introduction Parsing natural mathematical language FMathL and GF Conclusion Formal Mathematical Language

    E-print Network

    Neumaier, Arnold

    Kevin Kofler University of Vienna, Austria Faculty of Mathematics FMathL Formal Mathematical Language, Austria Faculty of Mathematics FMathL Formal Mathematical Language Outline Introduction Parsing Outline Introduction Parsing natural mathematical language FMathL and GF Conclusion FMathL Formal

  8. Automatic Induction of N-Gram Language Models from a Natural Language Grammar

    E-print Network

    Automatic Induction of N-Gram Language Models from a Natural Language Grammar Stephanie Seneff work in developing a technique which can automatically generate class n-gram language models from the standard class n-gram framework for computational efficiency. Moreover, both the n-gram classes and train

  9. Machine Learning in Natural Language Georgios P. Petasis

    E-print Network

    Kouroupetroglou, Georgios

    Machine Learning in Natural Language Processing Georgios P. Petasis Software and Knowledge@iit.demokritos.gr Abstract. This thesis examines the use of machine learning techniques in various tasks of natural language-entity recognition, and b) the creation of a new machine learning algorithm and its assessment on synthetic

  10. Natural Language Processing in Game Studies Research: An Overview

    ERIC Educational Resources Information Center

    Zagal, Jose P.; Tomuro, Noriko; Shepitsen, Andriy

    2012-01-01

    Natural language processing (NLP) is a field of computer science and linguistics devoted to creating computer systems that use human (natural) language as input and/or output. The authors propose that NLP can also be used for game studies research. In this article, the authors provide an overview of NLP and describe some research possibilities…

  11. Ants for Natural Language Processing Mathieu Lafourcade1

    E-print Network

    Paris-Sud XI, Université de

    Language Processing (NLP) applications like Word Sense Disambiguation (WSD) and Thematic Analysis. Learning Ants for Natural Language Processing Mathieu Lafourcade1 , Frédéric Guinand2 1 LIRMM (CNRS.guinand@univ-lehavre.fr Abstract The conceptual vector model aims at representing word meanings by concept activations for Natural

  12. Natural Language Processing with Distributional Compositional Models Jean Maillard

    E-print Network

    Natural Language Processing with Distributional Compositional Models Jean Maillard Supervised by: Dr Stephen Clark, Computer Laboratory, University of Cambridge Natural Language Processing know a word by the company it keeps". The principle is that the meaning of a word can be captured

  13. Cognitive Emotion Modeling in Natural Language Communication 1

    E-print Network

    Bari, Università degli Studi di

    will be considered in particular (Cowie, 2006): - Personality traits: stable, dynamic and organized set Cognitive Emotion Modeling in Natural Language Communication 1 Valeria Carofiglio, Fiorella de Introduction Computer science recently began, with success, to endow natural language dialogues with emotions

  14. Predicting Garden Path Sentences Based on Natural Language Understanding System

    Microsoft Academic Search

    DU Jia-li; YU Ping-fang

    2012-01-01

    Natural language understanding (NLU), which focuses on machine reading comprehension, is a branch of natural language processing (NLP). The domain of a developing NLU system ranges from sentence decoding to text understanding, and the automatic decoding of garden path (GP) sentences belongs to this domain. The GP sentence is a special linguistic phenomenon in which processing breakdown and backtracking are two

  15. MENELAS: an access system for medical records using natural language

    Microsoft Academic Search

    Pierre Zweigenbaum

    1994-01-01

    The overall goal of Menelas is to provide better access to the information contained in natural language patient discharge summaries (PDSs), through the design and implementation of a pilot system able to access medical reports through natural languages. A first, experimental version of the Menelas indexing prototype for French has been assembled. Its function is to encode free-text PDSs into

  16. Hunting for Smells in Natural Language Tests Benedikt Hauptmann

    E-print Network

    : A human tester executes test cases written in natural language by interacting with the system under test Hunting for Smells in Natural Language Tests Benedikt Hauptmann Maximilian Junker, Sebastian Eder, Germany Peter Braun Validas AG, Germany Abstract--Tests are central artifacts of software systems and play

  17. Generating Natural Language Description of Human Behavior from Video Images

    Microsoft Academic Search

    Atsuhiro Kojima; Masao Izumi; Takeshi Tamura; Kunio Fukunaga

    2000-01-01

    In visual surveillance applications, it is becoming popular to perceive video images and to interpret them using natural language concepts. We propose an approach to generating a natural language description of human behavior appearing in real video images. First, a head region of a human, on behalf of the whole body, is extracted from each frame. Using a model based

  18. The Role of Propositions in Natural Language Semantics

    Microsoft Academic Search

    Peter Bosch

    The purpose of this paper is twofold: I want to show that (a) the currently most generally applied explication of the notion of proposition in terms of truth-definitional semantics is inapplicable to the notion as it is needed for natural language semantics, and (b) that the intuitive notion of propositions, which arises from our use of natural language, rests on

  19. The Linguistic Nature of Language and Communication.

    ERIC Educational Resources Information Center

    Zitlow, Connie S., Ed.

    2001-01-01

    Discusses five recent books about language that address issues that arise in classrooms with an increasing number of diverse dialects and varied home languages. Discusses the complexities of language, misunderstandings in the Ebonics controversy, socioeducational issues, and classroom ideas for teachers. Describes two web sites. (SR)

  20. Towards more natural functional programming languages

    Microsoft Academic Search

    Brad A. Myers

    2002-01-01

    Programming languages are the way for a person to express a mental plan in a way that the computer can understand. Therefore, it is appropriate to consider properties of people when designing new programming languages. In our research, we are investigating how people think about algorithms, and how programming languages can be made easier to learn and more effective for

  1. MyProLang - My Programming Language: A Template-Driven Automatic Natural Programming Language

    E-print Network

    Bassil, Youssef

    2012-01-01

    Modern computer programming languages are governed by complex syntactic rules. They are unlike natural languages; they require extensive manual work and a significant amount of learning and practicing for an individual to become skilled at and to write correct programs. Computer programming is a difficult, complicated, unfamiliar, non-automated, and challenging discipline for everyone, especially for students, new programmers, and end-users. This paper proposes a new programming language and an environment for writing computer applications based on source-code generation. It is mainly a template-driven automatic natural imperative programming language called MyProLang. It harnesses GUI templates to generate proprietary natural language source-code, instead of having computer programmers write the code manually. MyProLang is a blend of five elements: a proprietary natural programming language with unsophisticated grammatical rules and expressive syntax; automation templates that automate the generation of in...

  2. Linguistics is the scientific study of language. Linguists seek to understand the nature of the human language

    E-print Network

    Saldin, Dilano

    of the human language faculty by examining the formal properties of natural-language grammars and the process Linguistics is the scientific study of language. Linguists seek to understand the nature of language acquisition. Given the central importance of language to both cognition and culture, linguistics

  3. Building a Natural Language Interface for the ATNF Pulsar Database for Speeding up Execution of Complex Queries

    NASA Astrophysics Data System (ADS)

    Tang, Rupert; Jenet, F.; Rangel, S.; Dartez, L.

    2010-01-01

    Until now, there have been no natural language interfaces (NLIs) available for querying a database of pulsars (rotating neutron stars emitting radiation at regular intervals). Currently, pulsar records are retrieved through an HTML form accessible via the Australia Telescope National Facility (ATNF) website, where one needs to be familiar with the pulsar attributes used by the interface (e.g. BLC). Using an NLI removes the need to learn form-specific formalism and allows execution of more powerful queries than those supported by the HTML form. Furthermore, for database access that requires comparing attributes across all the pulsar records (e.g. what is the fastest pulsar?), using an NLI to retrieve answers to such complex questions is much more efficient and less error-prone. This poster presents the first NLI ever created for the ATNF pulsar database (ATNF-Query) to facilitate database access using complex queries. ATNF-Query is built using a machine learning approach that induces a semantic parser from a question corpus; the application is intended to provide pulsar researchers and laymen with an intelligent language understanding database system for friendly information access.

  4. Concepts and implementations of natural language query systems

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (editor); Liu, I-Hsiung

    1984-01-01

    The currently developed user language interfaces of information systems are generally intended for serious users. These interfaces commonly ignore potentially the largest user group, i.e., casual users. This project discusses the concepts and implementations of a natural query language system which satisfies the nature and information needs of casual users by allowing them to communicate with the system in the form of their native (natural) language. In addition, a framework for the development of such an interface is also introduced for the MADAM (Multics Approach to Data Access and Management) system at the University of Southwestern Louisiana.

  5. MICA: A Probabilistic Dependency Parser Based on Tree Insertion Grammars

    E-print Network

    Boyer, Edmond

    MICA: A Probabilistic Dependency Parser Based on Tree Insertion Grammars Application Note Srinivas.nasr@lif.univ-mrs.fr rambow@ccls.columbia.edu benoit.sagot@inria.fr Abstract MICA is a dependency parser which returns deep This application note presents a freely available parser, MICA (Marseille-INRIA-Columbia-AT&T). MICA has

  6. An Annotated Bibliography of Affective Natural Language Generation

    E-print Network

    Piwek, Paul

    -strictly rational aspects of the Hearer. De Rosis and Grasso consider `personality traits, emotions and highly An Annotated Bibliography of Affective Natural Language Generation Paul Piwek ITRI - University: Systems and Computational Theories 3 3 Affect in Language: Linguistic, Descriptive and Empirical Work 7 1

  7. Getting Answers to Natural Language Questions on the Web.

    ERIC Educational Resources Information Center

    Radev, Dragomir R.; Libner, Kelsey; Fan, Weiguo

    2002-01-01

    Describes a study that investigated the use of natural language questions on Web search engines. Highlights include query languages; differences in search engine syntax; and results of logistic regression and analysis of variance that showed aspects of questions that predicted significantly different performances, including the number of words,…

  8. A Multilingual Natural Language Interface for E-Commerce Applications

    Microsoft Academic Search

    Werner Winiwarter; Ismail Khalil Ibrahim

    2000-01-01

    In this paper we present a multilingual natural language interface architecture, which can be used for accessing on line product catalogs and lets users formulate their queries in their native languages. In our interface architecture a rule based machine- learning module replaces an elaborate semantic analysis component. The learning module learns the correct mappings of a user's input to the

  9. On the Representation of Physical Quantities in Natural Language Text

    E-print Network

    Forbus, Kenneth D.

    language. Our focus is on physical quantities found in descriptions of physical processes that water will eventually boil if you heat it on a stove, that a ball placed at the top of a steep ramp continuous properties can appear in written natural language. Our focus is on physical quantities found

  10. NLP Meets the Jabberwocky: Natural Language Processing in Information Retrieval.

    ERIC Educational Resources Information Center

    Feldman, Susan

    1999-01-01

    Focuses on natural language processing (NLP) in information retrieval. Defines the seven levels at which people extract meaning from text/spoken language. Discusses the stages of information processing; how an information retrieval system works; advantages to adding full NLP to information retrieval systems; and common problems with information…

  11. Towards Surveillance Video Search by Natural Language Query

    E-print Network

    Roy, Deb

    Towards Surveillance Video Search by Natural Language Query Stefanie Tellex MIT Media Lab 20 Ames of surveillance video for clips that match spatial language queries such as "along the hallway" and "across they are looking for in video collections. We are building an interface that finds video clips in surveillance

  12. The Nature of Symbols in the Language of Thought

    Microsoft Academic Search

    Susan Schneider

    2009-01-01

    The core of the language of thought (LOT) program is the claim that thinking is the manipulation of symbols according to rules. Yet LOT has said little about symbol natures, and existing accounts are highly controversial. This is a major flaw at the heart of the LOT program: LOT requires an account of symbol natures to naturalize intentionality, to determine whether the

  13. Natural Language Generation for the Semantic Web: Unsupervised template extraction 

    E-print Network

    Duma, Daniel

    2012-11-28

    I propose an architecture for a Natural Language Generation system that automatically learns sentence templates, together with statistical document planning, from parallel RDF data and text. To this end, I design, build ...

  14. Information extraction to facilitate translation of natural language legislation

    E-print Network

    Wang, Samuel (Samuel Siyue)

    2011-01-01

    There is a large body of existing legislation and policies that govern how government organizations and corporations can share information. Since these rules are generally expressed in natural language, it is difficult and ...

  15. Natural language watermarking: Challenges in building a practical system

    NASA Astrophysics Data System (ADS)

    Topkara, Mercan; Riccardi, Giuseppe; Hakkani-Tür, Dilek; Atallah, Mikhail J.

    2006-02-01

    This paper gives an overview of the research and implementation challenges we encountered in building an end-to-end natural language processing based watermarking system. By natural language watermarking, we mean embedding the watermark into a text document, using the natural language components as the carrier, in such a way that the modifications are imperceptible to the readers and the embedded information is robust against possible attacks. Of particular interest is using the structure of the sentences in natural language text in order to insert the watermark. We evaluated the quality of the watermarked text using an objective evaluation metric, the BLEU score. BLEU scoring is commonly used in the statistical machine translation community. Our current system prototype achieves a BLEU score of 0.45 on a scale of [0,1].
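
    To make the [0,1] scale of the BLEU metric mentioned above concrete, the following is a minimal sketch of a simplified, single-reference, sentence-level BLEU. It is an illustration only, not the evaluation code used by the authors; the function names `ngrams` and `bleu` and the example sentences are hypothetical, and the sketch applies no smoothing, so any n-gram order with zero matches gives a score of 0.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU in [0, 1] (no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


original = "the quick brown fox jumps over the lazy dog"
watermarked = "the quick brown fox leaps over the lazy dog"
score = bleu(watermarked, original)
print(f"BLEU = {score:.3f}")
```

    An unchanged sentence scores 1, and a sentence sharing no words with the original scores 0, so a watermarked text that stays close to its source lands between the two.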

  16. Mixed-Initiative Natural Language Dialogue with Variable Communicative Modes 

    E-print Network

    Ishizaki, Masato

    As speech and natural language processing technology advance, it now reaches a stage where the dialogue control or initiative can be studied to realise usable and friendly human computer interface programs such as computer ...

  17. Natural language processing for unmanned aerial vehicle guidance interfaces

    E-print Network

    Craparo, Emily M. (Emily Marie), 1980-

    2004-01-01

    In this thesis, the opportunities and challenges involved in applying natural language processing techniques to the control of unmanned aerial vehicles (UAVs) are addressed. The problem of controlling an unmanned aircraft ...

  18. Natural language command of an autonomous micro-air vehicle

    E-print Network

    Huang, Albert S.

    Natural language is a flexible and intuitive modality for conveying directions and commands to a robot but presents a number of computational challenges. Diverse words and phrases must be mapped into structures that the ...

  19. Natural language processing using spreading activation and lateral inhibition

    SciTech Connect

    Pollack, J.

    1982-08-01

    The knowledge needed to process natural language comes from many sources. While the knowledge itself may be broken up modularly, into knowledge of syntax, semantics, etc., the actual processing should be completely integrated. This form of processing is not easily amenable to the type of processing done by serial von Neumann computers. This work in progress is an investigation of the use of a highly parallel, spreading activation and lateral inhibition network as a mechanism for integrated natural language processing.
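
    The mechanism named in this abstract can be illustrated with a toy network. The sketch below is an assumption-laden example, not the model from the report: the node names, link weights, update rule, and the function `run_network` are all hypothetical. Excitatory links spread activation from a context word to a compatible word sense, while the two competing senses inhibit each other.

```python
def run_network(activation, excite, inhibit, steps=20, rate=0.2):
    """Update all nodes in parallel; activations are clamped to [0, 1]."""
    act = dict(activation)
    for _ in range(steps):
        new = {}
        for node, a in act.items():
            # Activation received over excitatory links into this node.
            gain = sum(act[src] * w
                       for (src, dst), w in excite.items() if dst == node)
            # Activation removed by inhibitory links into this node.
            loss = sum(act[src] * w
                       for (src, dst), w in inhibit.items() if dst == node)
            new[node] = min(1.0, max(0.0, a + rate * (gain - loss)))
        act = new
    return act


# Hypothetical example: the word "bank" with two competing senses,
# disambiguated by the context word "deposit".
start = {"deposit": 1.0, "bank_money": 0.5, "bank_river": 0.5}
excite = {("deposit", "bank_money"): 1.0}
inhibit = {("bank_money", "bank_river"): 1.0,
           ("bank_river", "bank_money"): 1.0}

final = run_network(start, excite, inhibit)
print(final)
```

    Both senses start with equal activation; the context word tips the balance, and mutual inhibition then suppresses the losing sense, which is the integrated, parallel style of processing the abstract contrasts with serial computation.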

  20. Design and Implementation of a Parser Solver for SDPs with Matrix Structure

    E-print Network

    problems maxdet-problems. In engineering applications these problems usually have matrix structure, i-problems with matrix structure. The parser solver parses a problem specification close to its natural mathematical de-problem 1 reduces to the semidefinite programming SDP problem: minimize c^T x subject to F_i(x) ≥ 0, i = 1, ..., L

  1. MyProLang - My Programming Language A Template-Driven Automatic Natural Programming Language

    Microsoft Academic Search

    Youssef Bassil; Aziz M. Barbar

    2008-01-01

    Modern computer programming languages are governed by complex syntactic rules. They are unlike natural languages; they require extensive manual work and a significant amount of learning and practicing for an individual to become skilled at and to write correct programs. Computer programming is a difficult, complicated, unfamiliar, non-automated, and challenging discipline for everyone, especially for students, new programmers

  2. Parent-Implemented Natural Language Paradigm to Increase Language and Play in Children with Autism

    ERIC Educational Resources Information Center

    Gillett, Jill N.; LeBlanc, Linda A.

    2007-01-01

    Three parents of children with autism were taught to implement the Natural Language Paradigm (NLP). Data were collected on parent implementation, multiple measures of child language, and play. The parents were able to learn to implement the NLP procedures quickly and accurately with beneficial results for their children. Increases in the overall…

  3. Natural Language Processing Techniques in Computer-Assisted Language Learning: Status and Instructional Issues.

    ERIC Educational Resources Information Center

    Holland, V. Melissa; Kaplan, Jonathan D.

    1995-01-01

    Describes the role of natural language processing (NLP) techniques, such as parsing and semantic analysis, within current language tutoring systems. Examines trends, design issues and tradeoffs, and potential contributions of NLP techniques with respect to instructional theory and educational practice. Addresses limitations and problems in using…

  4. FromTo-CLIR: Web-Based Natural Language Interface for Cross-Language Information Retrieval.

    ERIC Educational Resources Information Center

    Kim, Taewan; Sim, Chul-Min; Yuh, Sanghwa; Jung, Hanmin; Kim, Young-Kil; Choi, Sung-Kwon; Park, Dong-In; Choi, Key Sun

    1999-01-01

    Describes the implementation of FromTo-CLIR, a Web-based natural-language interface for cross-language information retrieval that was tested with Korean and Japanese. Proposes a method that uses a semantic category tree and collocation to resolve the ambiguity of query translation. (Author/LRW)

  5. An overview of computer-based natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1983-01-01

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market, and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and, finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  6. Errors in the Compositions of Second-Year German Students: An Empirical Study for Parser-Based ICALI.

    ERIC Educational Resources Information Center

    Juozulynas, Vilius

    1994-01-01

    Presents an analysis of errors in a 400-page corpus of German essays by American college students in second-year language courses showing that syntax is the most problematic area, followed by morphology. This study indicates that 80% of student errors are not of semantic origin and are potentially recognizable by a syntactic parser. (six…

  7. The integration hypothesis of human language evolution and the nature of contemporary languages

    PubMed Central

    Miyagawa, Shigeru; Ojima, Shiro; Berwick, Robert C.; Okanoya, Kazuo

    2014-01-01

    How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa et al. (2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, that holds that human language is composed of two components, E for expressive, and L for lexical. Each component has an antecedent in nature: E as found, for example, in birdsong, and L in, for example, the alarm calls of monkeys. E and L integrated uniquely in humans to give rise to language. A challenge to the Integration Hypothesis is that while these non-human systems are finite-state in nature, human language is known to require characterization by a non-finite state grammar. Our claim is that E and L, taken separately, are in fact finite-state; when a grammatical process crosses the boundary between E and L, it gives rise to the non-finite state character of human language. We provide empirical evidence for the Integration Hypothesis by showing that certain processes found in contemporary languages that have been characterized as non-finite state in nature can in fact be shown to be finite-state. We also speculate on how human language actually arose in evolution through the lens of the Integration Hypothesis. PMID:24936195

  8. A Natural Interface for Sign Language Mathematics

    Microsoft Academic Search

    Nicoletta Adamo-Villani; Bedrich Benes; Matt Brisbin; Bryce Hyland

    2006-01-01

    The general goal of our research is the creation of a natural and intuitive interface for input and recognition of American Sign Language (ASL) math signs. The specific objective of this work is the development of two new interfaces for the Mathsigner™ application. Mathsigner™ is an interactive, 3D animation-based game designed to increase the mathematical skills of deaf children.

  9. Antonymic phonetic symbolism in three natural languages

    Microsoft Academic Search

    Dan I. Slobin

    1968-01-01

    American Ss matched English antonym pairs with antonym pairs from Thai, Kanarese, and Yoruba. The pairs represented the 3 major dimensions of the semantic differential, and referred to both sensible and nonsensible continua. Correct translations were made from all languages and in all semantic domains sampled, indicating that phonetic symbolism is not restricted to terms denoting magnitude and its common

  10. Natural Language and Spatial Reasoning Stefanie Tellex

    E-print Network

    Tellex, Stefanie

    that connects symbols to real-world paths and movements of people. Landau and Jackendoff (1993), Talmy (2005 and integrate many disparate abilities. But many of these abilities seem to have nothing to do with language. However there has been less work towards integrating results from different subfields into a consistent

  11. A KNOWLEDGE ENGINEERING APPROACH TO NATURAL LANGUAGE UNDERSTANDING

    E-print Network

    Shapiro, Stuart C.

    , representation, and use of linguistic knowledge. The computer system is rule-based and utilizes a semantic tracing facility is also being developed as a part of the rule-based system with output in natural other rule-based natural language processing systems such as that of Pereira and Warren [9] and Robinson

  12. Natural Language Names Recognition Method - Fault Tolerant to the

    E-print Network

    Mustakerov, Ivan

    Natural Language Names Recognition Method - Fault Tolerant to the Most Common Mistypings Introduction The paper considers recognition of character strings and, in particular, recognition of natural a character string of substantial length, the method is tolerant to most common typist errors. To this end

  13. Implicit learning of natural language syntax

    E-print Network

    Rebuschat, Patrick

    2009-10-13

    , 2003; Lewicki, Hill, & Czyzewska, 1992; Perruchet & Pacton, 2006; Reber, 1993). Everyday life offers many examples of implicit learning. Language acquisition (Berry & Dienes, 1993; Winter & Reber, 1994), socialization (Lewicki, 1986), music... , planning and producing complex utterances in less than a second – but nonetheless remain largely unaware of how the system actually works. For example, every native speaker of English will know intuitively that sentence (1) “John has a cup of coffee...

  14. Analyzing Learner Language: Towards a Flexible Natural Language Processing Architecture for Intelligent Language Tutors

    ERIC Educational Resources Information Center

    Amaral, Luiz; Meurers, Detmar; Ziai, Ramon

    2011-01-01

    Intelligent language tutoring systems (ILTS) typically analyze learner input to diagnose learner language properties and provide individualized feedback. Despite a long history of ILTS research, such systems are virtually absent from real-life foreign language teaching (FLT). Taking a step toward more closely linking ILTS research to real-life…

  15. Grid-Enabling Natural Language Engineering By Stealth

    Microsoft Academic Search

    Baden Hughes; Steven Bird

    2003-01-01

    We describe a proposal for an extensible, component-based software architecture for natural language engineering applications. Our model leverages existing linguistic resource description and discovery mechanisms based on extended Dublin Core metadata. In addition, the application design is flexible, allowing disparate components to be combined to suit the overall application functionality. An application specification language provides abstraction from the programming environment

  16. Artificial intelligence, expert systems, computer vision, and natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  17. The Rhetorical Parsing of Natural Language Texts

    E-print Network

    Marcu, Daniel

    The Rhetorical Parsing of Natural Language Texts Daniel Marcu Department of Computer Science University of Toronto Toronto, Ontario Canada M5S 3G4 marcu@cs.toronto.edu Abstract We derive the rhetorical structures of texts by means of two new, surface-form-based algorithms: one that identifies discourse usages

  18. Ambiguity Resolution in Search Engine Using Natural Language Web Application

    Microsoft Academic Search

    Azeez Nureni Ayofe; Azeez Raheem Ajetola; Ade Stanley Oyewole

    2009-01-01

    Our aim is to create an NLWA technique that can retrieve resources from a knowledge base more efficiently, in response to the ambiguity problem that occurs when searching with a search engine. The system was implemented using the fundamental concepts of Natural Language Processing (NLP), whereby it differentiates the similar meaning (synonyms) or

  19. Natural Language Processing: A Human-Computer Interaction Perspective

    E-print Network

    Manaris, Bill

    Natural Language Processing: A Human-Computer Interaction Perspective BILL MANARIS Computer Science to the field of human-computer interaction in terms of theoretical results and practical applications in human-human interaction, its significance and potential in human-computer interaction should

  20. Building Natural Language Interfaces for Rule-based Expert Systems

    Microsoft Academic Search

    Galina Datskovsky Moerdler; Kathleen Mckeown; J. Robert Ensor

    1987-01-01

    In this paper we discuss a semantics for translating natural language statements into facts of an underlying expert system, replacing the more conventional menu interface for gathering data from the user. We describe two issues that must be considered when building such an interface for an expert system. These issues are semantic processing of the user statements and the design

  1. Design of Lexicons in Some Natural Language Systems.

    ERIC Educational Resources Information Center

    Cercone, Nick; Mercer, Robert

    1980-01-01

    Discusses an investigation of certain problems concerning the structural design of lexicons used in computational approaches to natural language understanding. Emphasizes three aspects of design: retrieval of relevant portions of lexical items, storage requirements, and representation of meaning in the lexicon. (Available from ALLC, Dr. Rex Last,…

  2. A framework for large scalable natural language call routing systems

    Microsoft Academic Search

    Cheng Wu; David Lubensky; Juan Huerta; Xiang Li; Hong-Kwang Jeff Kuo

    2003-01-01

    A framework is proposed for enterprise automated call routing system development and large scalable natural language call routing application deployment based on IBM's speech recognition and NLU application engagement practices in recent years. To facilitate employing different call classification algorithms in an easy integration manner, this framework architecture provides a plug & play environment for evaluating promising call routing algorithms

  3. Coping with Ambiguity in Knowledge-based Natural Language Analysis

    E-print Network

    Shamos, Michael I.

    analysis component of the KANT Knowledge-based Machine Translation system to cope with ambiguity. 1 INTRODUCTION The KANT system [Nyberg and Mitamura, 1992] is a Knowledge-based Machine Translation Coping with Ambiguity in Knowledge-based Natural Language Analysis Kathryn L. Baker, Alexander M

  4. Natural Language Question Answering Over Triple Knowledge Bases

    E-print Network

    Sanner, Scott

    for translating questions into queries over the knowledge base. This report describes a question answering system Natural Language Question Answering Over Triple Knowledge Bases Aaron Defazio Supervisor: Scott answering system built over a triple knowledge base. Typical document retrieval systems and some question

  5. Coping with Ambiguity in Knowledge-based Natural Language Analysis

    E-print Network

    Shamos, Michael I.

    analysis component of the KANT Knowledge-based Machine Translation system to cope with ambiguity. 1 INTRODUCTION The KANT system [Nyberg and Mitamura, 1992] is a Knowledge-based Machine Translation Coping with Ambiguity in Knowledge-based Natural Language Analysis Kathryn L. Baker, Alexander M

  6. Analyzing Discourse Processing Using a Simple Natural Language Processing Tool

    ERIC Educational Resources Information Center

    Crossley, Scott A.; Allen, Laura K.; Kyle, Kristopher; McNamara, Danielle S.

    2014-01-01

    Natural language processing (NLP) provides a powerful approach for discourse processing researchers. However, there remains a notable degree of hesitation by some researchers to consider using NLP, at least on their own. The purpose of this article is to introduce and make available a "simple" NLP (SiNLP) tool. The overarching goal of…

  7. INTERFACING ACOUSTIC MODELS WITH NATURAL LANGUAGE PROCESSING SYSTEMS

    E-print Network

    Johnson, Michael T.

    . In addition, since word graphs can be made arbitrarily large by using lengthy acoustic processing with little INTERFACING ACOUSTIC MODELS WITH NATURAL LANGUAGE PROCESSING SYSTEMS Michael T. Johnson, Mary P on implementation and efficiency issues associated with the use of word graphs for interfacing acoustic speech

  8. Recurrent Artificial Neural Networks and Finite State Natural Language Processing.

    ERIC Educational Resources Information Center

    Moisl, Hermann

    It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…

  9. Principles of Organization in Young Children's Natural Language Hierarchies.

    ERIC Educational Resources Information Center

    Callanan, Maureen A.; Markman, Ellen M.

    1982-01-01

    When preschool children think of objects as organized into collections (e.g., forest, army) they solve certain problems better than when they think of the same objects as organized into classes (e.g., trees, soldiers). Present studies indicate preschool children occasionally distort natural language inclusion hierarchies (e.g., oak, tree) into the…

  10. Representing Requirements in Natural Language as Concept Lattices

    Microsoft Academic Search

    D. Richards; K. Boettger

    2002-01-01

    Abstract We have developed a viewpoint development approach to identify and reconcile differences between stakeholder requirements. The initial phase in our approach seeks to provide a formal solution to the problem of converting requirements descriptions in natural language into a computer processable representation. After the group brainstorms the functional requirements in the form of use

  11. Anaphora in Natural Language Processing and Information Retrieval.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth DuRoss

    1990-01-01

    Describes the linguistic phenomenon of anaphora; surveys the approaches to anaphora undertaken in theoretical linguistics and natural language processing (NLP); presents results of research conducted at Syracuse University on anaphora in information retrieval; and discusses the future of anaphora research in regard to information retrieval tasks.…

  12. Attacks on Lexical Natural Language Steganography Systems Cuneyt M. Taskiran

    E-print Network

    Topkara, Mercan

    Attacks on Lexical Natural Language Steganography Systems Cuneyt M. Taskiran, Umut Topkara. In this paper we examine the robustness of lexical steganography systems. In this paper we used a universal by a lexical steganography algorithm from unmodified sentences. The experimental accuracy of our method

  13. Describing Complex Charts in Natural Language: A Caption Generation System

    E-print Network

    Carenini, Giuseppe

    . A number of research groups have developed systems that can automatically design sophisticated Describing Complex Charts in Natural Language: A Caption Generation System Vibhu O. Mittal* Johanna D. Moore University of Pittsburgh University of Pittsburgh Giuseppe Carenini Steven Roth§ University

  14. CITE NLM: Natural-Language Searching in an Online Catalog.

    ERIC Educational Resources Information Center

    Doszkocs, Tamas E.

    1983-01-01

    The National Library of Medicine's Current Information Transfer in English public access online catalog offers unique subject search capabilities--natural-language query input, automatic medical subject headings display, closest match search strategy, ranked document output, dynamic end user feedback for search refinement. References, description…

  15. Natural Language Processing in the Medical and Biological Domains

    E-print Network

    Zweigenbaum, Pierre

    Natural Language Processing As in the medical domain Note : Text Mining Data mining from text [Hearst] OR "genetics"[Subheading] Approximation of BioNLP : 1 & 2 Manual check Also examine text mining[all fields] 6 bigrams reveal common apparition of ontology, text mining, SVM Here, focus on NLP + much simpler study

  16. Learning to Disambiguate Natural Language Using World Knowledge

    E-print Network

    Collobert, Ronan

    general than the traditional tasks like word-sense disambiguation, co-reference resolution, and named-entity Learning to Disambiguate Natural Language Using World Knowledge Antoine Bordes, Nicolas Usunier LIP with the unique physical entity (e.g. person, object or location) or abstract concept it refers to. Our method

  17. Natural Language Grammatical Inference with Recurrent Neural Networks

    E-print Network

    Fong, Sandiway

    Natural Language Grammatical Inference with Recurrent Neural Networks Steve Lawrence, Member, IEEE of a complex grammar with neural networks; specifically, the task considered is that of training a network-and-Binding theory. Neural networks are trained, without the division into learned vs. innate components assumed

  18. A Natural Language Query Interface to Structured Information

    E-print Network

    Bontcheva, Kalina

    A Natural Language Query Interface to Structured Information Valentin Tablan, Danica Damljanovic interface for accessing structured information, that is domain independent and easy to use without training. It aims to bring the simplicity of Google's search interface to conceptual retrieval by automatically

  19. Learning from a Computer Tutor with Natural Language Capabilities

    ERIC Educational Resources Information Center

    Michael, Joel; Rovick, Allen; Glass, Michael; Zhou, Yujian; Evens, Martha

    2003-01-01

    CIRCSIM-Tutor is a computer tutor designed to carry out a natural language dialogue with a medical student. Its domain is the baroreceptor reflex, the part of the cardiovascular system that is responsible for maintaining a constant blood pressure. CIRCSIM-Tutor's interaction with students is modeled after the tutoring behavior of two experienced…

  20. Extracting Phenotypic Information from the Literature via Natural Language Processing

    Microsoft Academic Search

    Lifeng Chen; Carol Friedman

    2004-01-01

    In recent years, the amount of biomedical knowledge has been increasing exponentially. Several Natural Language Processing (NLP) systems have been developed to help researchers extract, encode and organize new information automatically from textual literature or narrative reports. Some of these systems focus on extracting biological entities or molecular interactions while others retrieve and encode clinical information. To exploit gene functions

  1. Natural Language Processing: A Terminological And Statistical Approach

    Microsoft Academic Search

    Gabriella Pardelli; Manuela Sassi; Sara Goggi; Paola Orsolini

    The aim of this article is to provide a statistical representation of significant terms used in the field of Natural Language Processing from the 1960s to the present, in order to survey the most significant research trends in that period. By retrieving these keywords it should be possible to highlight the ebb and flow of some thematic topics.

  2. Some Aspects of Optimality in Natural Language Interpretation

    E-print Network

    Blutner, Reinhard

    the form of OT as used in phonology, morphology and syntax on the one hand and its form as used in semantics on the other hand. Whereas in the first case OT takes the point of view of the speaker's (1981) idea of balancing between informativeness and efficiency in natural language processing

  3. Natural language understanding and speech recognition for industrial vision systems

    NASA Astrophysics Data System (ADS)

    Batchelor, Bruce G.

    1992-11-01

    The accepted method of programming machine vision systems for a new application is to incorporate sub-routines from a standard library into code, written specially for the given task. Typical programming languages that might be used here are Pascal, C, and assembly code, although other `conventional' (i.e., imperative) languages are often used instead. The representation of an algorithm to recognize a certain object, in the form of, say, a C language program is clumsy and unnatural, compared to the alternative process of describing the object itself and leaving the software to search for it. The latter method, known as declarative programming, is used extensively both when programming in Prolog and when people talk to one another in English, or other natural languages. Programs to understand a limited sub-set of a natural language can also be written conveniently in Prolog. The article considers the prospects for talking to an image processing system, using only slightly constrained English. Moderately priced speech recognition devices, which interface to a standard desk-top computer and provide a limited repertoire (200 words) as well as the ability to identify isolated words, are already available commercially. At the moment, the goal of talking in English to a computer is incompletely fulfilled. Yet, sufficient progress has been made to encourage greater effort in this direction.

  4. Controlled natural language interfaces (extended abstract): the best of three worlds

    Microsoft Academic Search

    Eva-Martin Mueckstein

    1985-01-01

    This paper will discuss the problem of designing user-friendly interfaces for computer applications. In particular, we will describe an interface that is based on mapping formal into natural languages in a controlled and structured way. The basic approaches for designing interfaces range from formal or natural language to menu driven ones. Formal language interfaces such as query or programming languages are

  5. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

    ERIC Educational Resources Information Center

    Jarman, Jay

    2011-01-01

    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  6. Natural Language Processing in aid of FlyBase curators

    E-print Network

    Karamanis, Nikiforos; Seal, Ruth; Lewin, Ian; McQuilton, Peter; Vlachos, Andreas; Gasperin, Caroline; Drysdale, Rachel; Briscoe, Ted

    2008-04-14

    curate in a similar way as FlyBase, this study is likely to have far-reaching implications. Availability and Requirements  Project name: FlySlip  Project website: http://www.wiki.cl.cam.ac.uk/rowiki/NaturalLanguage/FlySlip  Programming language: Java... 1.4.2 or above.  Restrictions: PaperBrowser is implemented on top of Mozilla Gecko and JREX. It runs on 32-bit Linux Fedora Core 3 and is freely available for non-commercial use from the Resources section of the FlySlip website. The NLP pipeline...

  7. Proving Memory Safety of the ANI Windows Image Parser

    E-print Network

    Rajamani, Sriram K.

    Proving Memory Safety of the ANI Windows Image Parser using Compositional Exhaustive Testing Maria memory safety of a complex Windows image parser written in low-level C in only three months of work CUTE [33], SAGE [21], Pex [36], KLEE [8], BitBlaze [34], and Apollo [2] to name a few. These tools vary

  8. A Comparative Study of Bing Web N-gram Language Models for Web Search and Natural Language Processing

    E-print Network

    Rajamani, Sriram K.

    A Comparative Study of Bing Web N-gram Language Models for Web Search and Natural Language}@microsoft.com ABSTRACT This paper presents a comparative study of the recently released Microsoft Web N-gram Language web services, called Microsoft Web N-gram Services, are much more accessible and easier to use than

  9. Towards Natural Language Processing: A Well-Formed Substring Table Approach to Understanding Garden Path Sentence

    Microsoft Academic Search

    Jia-li Du; Ping-fang Yu

    2010-01-01

    As computers have become more affordable and accessible, the theories and techniques of natural language processing (NLP) are increasingly used as a means for automatically decoding natural language. The well-formed substring table (WFST) is an efficient parsing algorithm used to decode natural language. The form (START, FINISH, LABEL → FOUND . TO FIND) is accepted by the system as its basic model, and its
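
    The edge form quoted in this abstract (a start position, a finish position, a label, the constituents already found, and those still to find) can be sketched in code. The following is a minimal illustration that assumes the standard dotted-edge reading used in chart parsing; the class name Edge, the function combine, and the example sentence positions are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Edge:
    """One table entry: (START, FINISH, LABEL -> FOUND . TO_FIND)."""
    start: int                  # index of the first word covered
    finish: int                 # index just past the last word covered
    label: str                  # category being built, e.g. "S"
    found: Tuple[str, ...]      # constituents recognised so far
    to_find: Tuple[str, ...]    # constituents still needed

    def is_complete(self) -> bool:
        # An edge is complete when nothing remains to be found.
        return not self.to_find


def combine(active: Edge, complete: Edge) -> Optional[Edge]:
    """Extend an incomplete edge with an adjacent complete edge.

    Returns the extended edge, or None when the two edges do not fit.
    """
    if (
        active.to_find
        and complete.is_complete()
        and active.finish == complete.start
        and active.to_find[0] == complete.label
    ):
        return Edge(
            start=active.start,
            finish=complete.finish,
            label=active.label,
            found=active.found + (complete.label,),
            to_find=active.to_find[1:],
        )
    return None


# Hypothetical example: an S edge that has found an NP over words 0-1
# and still needs a VP, combined with a complete VP over words 1-3.
s_edge = Edge(0, 1, "S", ("NP",), ("VP",))
vp_edge = Edge(1, 3, "VP", ("V", "NP"), ())
result = combine(s_edge, vp_edge)
```

    Keeping the edges immutable means the table can hold several competing analyses of the same span side by side, which is what a garden path sentence requires when an early analysis has to be abandoned.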

  10. SEMILAR: A Semantic Similarity Toolkit For Assessing Students' Natural Language Inputs

    E-print Network

    Rus, Vasile

    in conversational ITSs. First, there is need for advanced natural language algorithms to interpret the meaning-of-the-art conversational ITSs and in other mainstream natural language processing applications such as Question Answering SEMILAR: A Semantic Similarity Toolkit For Assessing Students' Natural Language Inputs Vasile Rus

  11. The Role of Natural Language in Advanced Knowledge-Based Systems

    E-print Network

    Wahlster, Wolfgang - Deutsche Forschungszentrum für Künstliche Intelligenz & FR 6.2

    : Natural language processing is a prerequisite for advanced knowledge-based systems since the ability The Role of Natural Language in Advanced Knowledge-Based Systems Wolfgang Wahlster Department performance in face-to-face communication. 1. Introduction Natural language processing is a prerequisite

  12. Natural Language Watermarking Mercan Topkara Cuneyt M. Taskiran Edward J. Delp

    E-print Network

    Topkara, Mercan

    Natural Language Watermarking Mercan Topkara Cuneyt M. Taskiran Edward J. Delp Center for Education Lafayette, Indiana, 47907 ABSTRACT In this paper we discuss natural language watermarking, which uses the structure of the sentence constituents in natural language text in order to insert a watermark

  13. Using natural language processing techniques to inform research on nanotechnology

    PubMed Central

    Lewinski, Nastassja A

    2015-01-01

    Summary Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics. PMID:26199848

  14. Using natural language processing techniques to inform research on nanotechnology.

    PubMed

    Lewinski, Nastassja A; McInnes, Bridget T

    2015-01-01

    Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics. PMID:26199848

  15. CS 536 Fall 2002 Java CUP is a parser-generation tool,

    E-print Network

    Fischer, Charles N.

    CS 536 Fall 2002 Java CUP Java CUP is a parser-generation tool, similar to Yacc. CUP builds). CUP generates a Java source file parser.java. It contains a class parser, with a method Symbol parse, Exception() is thrown by the parser. CUP and Yacc accept exactly the same class of grammars--all LL(1

  16. Research Paper: Natural Language Processing Framework to Assess Clinical Conditions

    Microsoft Academic Search

    Henry Ware; Charles J. Mullett; V. Jagannathan

    2009-01-01

    Objective: The authors developed a natural language processing (NLP) framework that could be used to extract clinical findings and diagnoses from dictated physician documentation. Design: De-identified documentation was made available by i2b2 Bio-informatics research group as a part of their NLP challenge focusing on obesity and its co-morbidities. The authors describe their approach, which used a combination of concept detection, context validation, and

  17. Recent advances in natural language processing for biomedical applications.

    PubMed

    Collier, Nigel; Nazarenko, Adeline; Baud, Robert; Ruch, Patrick

    2006-06-01

    We survey a set of recent advances in natural language processing applied to biomedical applications, which were presented in Geneva, Switzerland, in 2004 at an international workshop. While text mining applied to molecular biology and biomedical literature can report several interesting achievements, we observe that studies applied to clinical contents are still rare. In general, we argue that clinical corpora, including electronic patient records, must be made available to fill the gap between bioinformatics and medical informatics. PMID:16139564

  18. The Role of Contrast Categories in Natural Language Concepts

    Microsoft Academic Search

    Timothy Verbeemen; Veerle Vanoverberghe; Gert Storms; Wim Ruts

    2001-01-01

    In this paper, seven experiments are described in which the effect of contrast categories on the within-category structure of superordinate and basic level natural language concepts was studied. Intension-based and extension-based predictors originating from both the target category and a contrast category were used to predict typicality ratings and response times in two different speeded categorization tasks. Virtually no evidence

  19. Applications of Weighted Automata in Natural Language Processing

    Microsoft Academic Search

    Kevin Knight; Jonathan May

    2009-01-01

    We explain why weighted automata are an attractive knowledge representation for natural language problems. We first trace the close historical ties between the two fields, then present two complex real-world problems, transliteration and translation. These problems are usefully decomposed into a pipeline of weighted transducers, and weights can be set to maximize the likelihood of a training corpus using standard

  20. Knowledge discovery and data mining to assist natural language understanding.

    PubMed Central

    Wilcox, A.; Hripcsak, G.

    1998-01-01

    As natural language processing systems become more frequent in clinical use, methods for interpreting the output of these programs become increasingly important. These methods require the effort of a domain expert, who must build specific queries and rules for interpreting the processor output. Knowledge discovery and data mining tools can be used instead of a domain expert to automatically generate these queries and rules. C5.0, a decision tree generator, was used to create a rule base for a natural language understanding system. A general-purpose natural language processor using this rule base was tested on a set of 200 chest radiograph reports. When a small set of reports, classified by physicians, was used as the training set, the generated rule base performed as well as lay persons, but worse than physicians. When a larger set of reports, using ICD9 coding to classify the set, was used for training the system, the rule base performed worse than the physicians and lay persons. It appears that a larger, more accurate training set is needed to increase performance of the method. PMID:9929336

  1. Elicitation of natural language representations of uncertainty using computer technology

    SciTech Connect

    Tonn, B.; Goeltz, R.; Travis, C. (Oak Ridge National Lab., TN (USA); Tennessee Univ., Knoxville, TN (USA))

    1989-01-01

    Knowledge elicitation is an important aspect of risk analysis. Knowledge about risks must be accurately elicited from experts for use in risk assessments. Knowledge and perceptions of risks must also be accurately elicited from the public in order to intelligently perform policy analysis and develop and implement programs. Oak Ridge National Laboratory is developing computer technology to effectively and efficiently elicit knowledge from experts and the public. This paper discusses software developed to elicit natural language representations of uncertainty. The software is written in Common Lisp and resides on VAX Computers System and Symbolics Lisp machines. The software has three goals: to determine preferences for using natural language terms to represent uncertainty, the likelihood rankings of the terms, and how likelihood estimates are combined to form new terms. The first two goals relate to providing useful results for those interested in risk communication. The third relates to providing cognitive data to further our understanding of people's decision making under uncertainty. The software is used to elicit natural language terms used to express the likelihood of various agents causing cancer in humans and cancer resulting in various maladies, and the likelihood of everyday events. 6 refs., 4 figs., 4 tabs.

  2. Porting a lexicalized-grammar parser to the biomedical domain.

    PubMed

    Rimell, Laura; Clark, Stephen

    2009-10-01

    This paper introduces a state-of-the-art, linguistically motivated statistical parser to the biomedical text mining community, and proposes a method of adapting it to the biomedical domain requiring only limited resources for data annotation. The parser was originally developed using the Penn Treebank and is therefore tuned to newspaper text. Our approach takes advantage of a lexicalized grammar formalism, Combinatory Categorial Grammar (ccg), to train the parser at a lower level of representation than full syntactic derivations. The ccg parser uses three levels of representation: a first level consisting of part-of-speech (pos) tags; a second level consisting of more fine-grained ccg lexical categories; and a third, hierarchical level consisting of ccg derivations. We find that simply retraining the pos tagger on biomedical data leads to a large improvement in parsing performance, and that using annotated data at the intermediate lexical category level of representation improves parsing accuracy further. We describe the procedure involved in evaluating the parser, and obtain accuracies for biomedical data in the same range as those reported for newspaper text, and higher than those previously reported for the biomedical resource on which we evaluate. Our conclusion is that porting newspaper parsers to the biomedical domain, at least for parsers which use lexicalized grammars, may not be as difficult as first thought. PMID:19141332

  3. Evaluation of Natural Language Tools for Italian: EVALITA 2007 B. Magnini

    E-print Network

    Mazzei, Alessandro

    Evaluation of Natural Language Tools for Italian: EVALITA 2007 B. Magnini, A. Cappelli, F is to promote the development of language technologies for the Italian language, by providing a shared framework/lir/lir11/ HAREM8 , EVALITA concentrates specifically on one single language, i.e. Italian. Organized

  4. Applications of Natural Language Processing in Biodiversity Science

    PubMed Central

    Thessen, Anne E.; Cui, Hong; Mozzherin, Dmitry

    2012-01-01

    Centuries of biological knowledge are contained in the massive body of scientific literature, written for human-readability but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science. PMID:22685456

  5. Human task animation from performance models and natural language input

    NASA Technical Reports Server (NTRS)

    Esakov, Jeffrey; Badler, Norman I.; Jung, Moon

    1989-01-01

    Graphical manipulation of human figures is essential for certain types of human factors analyses such as reach, clearance, fit, and view. In many situations, however, the animation of simulated people performing various tasks may be based on more complicated functions involving multiple simultaneous reaches, critical timing, resource availability, and human performance capabilities. One rather effective means for creating such a simulation is through a natural language description of the tasks to be carried out. Given an anthropometrically-sized figure and a geometric workplace environment, various simple actions such as reach, turn, and view can be effectively controlled from language commands or standard NASA checklist procedures. The commands may also be generated by external simulation tools. Task timing is determined from actual performance models, if available, such as strength models or Fitts' Law. The resulting action specifications are animated on a Silicon Graphics Iris workstation in real-time.

  6. Natural Language Interfaces to Prepared for Knowledge Engineering Review, Special Issue on the

    E-print Network

    Haddadi, Hamed

    on the Applications of Natural Language Processing Techniques, in press. Ann Copestake and Karen Sparck Jones Computer is on the central process of translating a natural language question into a database query, but other supporting will depend on the one hand on general advances in natural language processing, and on the other

  7. A broad-coverage natural language processing system.

    PubMed Central

    Friedman, C.

    2000-01-01

    Natural language processing systems (NLP) that extract clinical information from textual reports were shown to be effective for limited domains and for particular applications. Because an NLP system typically requires substantial resources to develop, it is beneficial if it is designed to be easily extendible to multiple domains and applications. This paper describes multiple extensions of an NLP system called MedLEE, which was originally developed for the domain of radiological reports of the chest, but has subsequently been extended to mammography, discharge summaries, all of radiology, electrocardiography, echocardiography, and pathology. PMID:11079887

  8. Natural Language Processing Framework to Assess Clinical Conditions

    Microsoft Academic Search

    Henry Ware; Charles J. Mullett; V. Jagannathan

    Abstract Objective: The authors developed a natural language processing (NLP) framework that could be used to extract clinical findings and diagnoses from dictated physician documentation. Design: De-identified documentation was made available by i2b2 Bio-informatics research group as a part of their NLP challenge focusing on obesity and its co-morbidities. The authors describe their approach, which used a combination of concept detection, context validation, and the application of a variety of rules to conclude patient diagnoses. Results: The

  9. Deviations in the Zipf and Heaps laws in natural languages

    NASA Astrophysics Data System (ADS)

    Bochkarev, Vladimir V.; Lerner, Eduard Yu; Shevlyakova, Anna V.

    2014-03-01

    This paper is devoted to verifying the empirical Zipf and Heaps laws in natural languages using Google Books Ngram corpus data. The connection between Zipf's law and Heaps' law, which predicts a power-law dependence of vocabulary size on text size, is discussed. In fact, the Heaps exponent in this dependence varies as the text corpus grows. To explain this, the obtained results are compared with a probabilistic model of text generation. Quasi-periodic variations with characteristic time periods of 60-100 years were also found.
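
    As an illustration of the power-law dependence mentioned in this abstract, Heaps' law is commonly written V(n) = K * n^beta, where V is the vocabulary size and n the text size. The Python sketch below is not the authors' procedure; it assumes a plain list of tokens and estimates K and beta by least squares in log-log space. The function names vocabulary_growth and fit_heaps are hypothetical.

```python
import math


def vocabulary_growth(tokens):
    """Return (n, V(n)) pairs: text size and count of distinct words seen so far."""
    seen = set()
    points = []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        points.append((n, len(seen)))
    return points


def fit_heaps(points):
    """Least-squares fit of log V = log K + beta * log n; returns (K, beta)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(v) for _, v in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    beta = sxy / sxx
    K = math.exp(my - beta * mx)
    return K, beta
```

    A single global fit like this hides the variation of the exponent with corpus size that the abstract reports; fitting over successive ranges of n would be one way to expose it.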

  10. Restricted natural language processing for case simulation tools.

    PubMed Central

    Lehmann, C. U.; Nguyen, B.; Kim, G. R.; Johnson, K. B.; Lehmann, H. P.

    1999-01-01

    For Interactive Patient II, a multimedia case simulation designed to improve history-taking skills, we created a new natural language interface called GRASP (General Recognition and Analysis of Sentences and Phrases) that allows students to interact with the program at a higher level of realism. Requirements included the ability to handle ambiguous word senses and to match user questions/queries to unique Canonical Phrases, which are used to identify case findings in our knowledge database. In a simulation of fifty user queries, some of which contained ambiguous words, this tool was 96% accurate in identifying concepts. PMID:10566424

  11. Neurolinguistics and psycholinguistics as a basis for computer acquisition of natural language

    SciTech Connect

    Powers, D.M.W.

    1983-04-01

    Research into natural language understanding systems for computers has concentrated on implementing particular grammars and grammatical models of the language concerned. This paper presents a rationale for research into natural language understanding systems based on neurological and psychological principles. Important features of the approach are that it seeks to place the onus of learning the language on the computer, and that it seeks to make use of the vast wealth of relevant psycholinguistic and neurolinguistic theory. 22 references.

  12. SQUALL: a Controlled Natural Language for Querying and Updating RDF Graphs

    E-print Network

    Paris-Sud XI, Université de

    of controlled natural languages (CNL) is to reconcile the high-level and natural syntax of natural languages Data, CNL could not only allow more people to contribute by abstracting from the low-level details abstracts from low-level notions such as bindings and relational algebra. We formally define the syntax

  13. `Ideal learning' of natural language: Positive results about learning from positive

    E-print Network

    Chater, Nick

    1 `Ideal learning' of natural language: Positive results about learning from positive evidence Nick knowledge of language, often termed "universal grammar," that the child brings to bear on the learning. If it were possible to show that this ideal learner is unable to learn language from the specific linguistic

  14. A Compositional Natural Semantics and Hoare Logic for Low-Level Languages 1

    E-print Network

    Uustalu, Tarmo

    A Compositional Natural Semantics and Hoare Logic for Low-Level Languages 1 Ando Saabas 2 and Tarmo of proof-carrying code has generated significant interest in reasoning about low-level languages. It is widely believed that low-level languages with jumps must be difficult to reason about by being inherently

  15. A Classification of Sentences Used in Natural Language Processing in the Military Services.

    ERIC Educational Resources Information Center

    Wittrock, Merlin C.

    Concepts in cognitive psychology are applied to the language used in military situations, and a sentence classification system for use in analyzing military language is outlined. The system is designed to be used, in part, in conjunction with a natural language query system that allows a user to access a database. The discussion of military…

  16. Neural network processing of natural language: I. Sensitivity to serial, temporal and abstract structure

    E-print Network

    Dominey, Peter F.

    Neural network processing of natural language: I. Sensitivity to serial, temporal and abstract structure of language in the infant Peter Ford Dominey Institut des Sciences Cognitives, Bron, France Franck of rhythmic or temporal structure of a new language within 5-10 minutes of exposure (Nazzi et al., 1998). All

  17. Spatial and numerical abilities without a complete natural language

    PubMed Central

    Hyde, Daniel C.; Winkler-Rhoades, Nathan; Lee, Sang-Ah; Izard, Veronique; Shapiro, Kevin A.; Spelke, Elizabeth S.

    2011-01-01

    We studied the cognitive abilities of a 13-year-old deaf child, deprived of most linguistic input from late infancy, in a battery of tests designed to reveal the nature of numerical and geometrical abilities in the absence of a full linguistic system. Tests revealed widespread proficiency in basic symbolic and non-symbolic numerical computations involving the use of both exact and approximate numbers. Tests of spatial and geometrical abilities revealed an interesting patchwork of age-typical strengths and localized deficits. In particular, the child performed extremely well on navigation tasks involving geometrical or landmark information presented in isolation, but very poorly on otherwise similar tasks that required the combination of the two types of spatial information. Tests of number- and space-specific language revealed proficiency in the use of number words and deficits in the use of spatial terms. This case suggests that a full linguistic system is not necessary to reap the benefits of linguistic vocabulary on basic numerical tasks. Furthermore, it suggests that language plays an important role in the combination of mental representations of space. PMID:21168425

  18. Spatial and numerical abilities without a complete natural language.

    PubMed

    Hyde, Daniel C; Winkler-Rhoades, Nathan; Lee, Sang-Ah; Izard, Veronique; Shapiro, Kevin A; Spelke, Elizabeth S

    2011-04-01

    We studied the cognitive abilities of a 13-year-old deaf child, deprived of most linguistic input from late infancy, in a battery of tests designed to reveal the nature of numerical and geometrical abilities in the absence of a full linguistic system. Tests revealed widespread proficiency in basic symbolic and non-symbolic numerical computations involving the use of both exact and approximate numbers. Tests of spatial and geometrical abilities revealed an interesting patchwork of age-typical strengths and localized deficits. In particular, the child performed extremely well on navigation tasks involving geometrical or landmark information presented in isolation, but very poorly on otherwise similar tasks that required the combination of the two types of spatial information. Tests of number- and space-specific language revealed proficiency in the use of number words and deficits in the use of spatial terms. This case suggests that a full linguistic system is not necessary to reap the benefits of linguistic vocabulary on basic numerical tasks. Furthermore, it suggests that language plays an important role in the combination of mental representations of space. PMID:21168425

  19. What can Natural Language Processing do for Clinical Decision Support?

    PubMed Central

    Demner-Fushman, Dina; Chapman, Wendy W.; McDonald, Clement J.

    2009-01-01

    Computerized Clinical Decision Support (CDS) aims to aid decision making of health care providers and the public by providing easily accessible health-related information at the point and time it is needed. Natural Language Processing (NLP) is instrumental in using free-text information to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative. The early innovative NLP research of clinical narrative was followed by a period of stable research conducted at the major clinical centers and a shift of mainstream interest to biomedical NLP. This review primarily focuses on the recently renewed interest in development of fundamental NLP methods and advances in the NLP systems for CDS. The current solutions to challenges posed by distinct sublanguages, intended user groups, and support goals are discussed. PMID:19683066

  20. Natural Language Processing Methods and Systems for Biomedical Ontology Learning

    PubMed Central

    Liu, Kaihong; Hogan, William R.; Crowley, Rebecca S.

    2010-01-01

    While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they must achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships as well as difficulty in updating the ontology as knowledge changes. Methodologies developed in the fields of natural language processing, information extraction, information retrieval and machine learning provide techniques for automating the enrichment of an ontology from free-text documents. In this article, we review existing methodologies and developed systems, and discuss how existing methods can benefit the development of biomedical ontologies. PMID:20647054

  1. Building Gold Standard Corpora for Medical Natural Language Processing Tasks

    PubMed Central

    Deleger, Louise; Li, Qi; Lingren, Todd; Kaiser, Megan; Molnar, Katalin; Stoutenborough, Laura; Kouril, Michal; Marsolo, Keith; Solti, Imre

    2012-01-01

    We present the construction of three annotated corpora to serve as gold standards for medical natural language processing (NLP) tasks. Clinical notes from the medical record, clinical trial announcements, and FDA drug labels are annotated. We report high inter-annotator agreements (overall F-measures between 0.8467 and 0.9176) for the annotation of Personal Health Information (PHI) elements for a de-identification task and of medications, diseases/disorders, and signs/symptoms for an information extraction (IE) task. The annotated corpora of clinical trials and FDA labels will be publicly released, and to facilitate translational NLP tasks that require cross-corpora interoperability (e.g. clinical trial eligibility screening), their annotation schemas are aligned with a large-scale, NIH-funded clinical text annotation project. PMID:23304283

  2. What can natural language processing do for clinical decision support?

    PubMed

    Demner-Fushman, Dina; Chapman, Wendy W; McDonald, Clement J

    2009-10-01

    Computerized clinical decision support (CDS) aims to aid decision making of health care providers and the public by providing easily accessible health-related information at the point and time it is needed. Natural language processing (NLP) is instrumental in using free-text information to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative. The early innovative NLP research of clinical narrative was followed by a period of stable research conducted at the major clinical centers and a shift of mainstream interest to biomedical NLP. This review primarily focuses on the recently renewed interest in development of fundamental NLP methods and advances in the NLP systems for CDS. The current solutions to challenges posed by distinct sublanguages, intended user groups, and support goals are discussed. PMID:19683066

  3. Real-world natural language interfaces to expert systems

    SciTech Connect

    Cullingford, R.E.; Selfridge, M.

    1983-01-01

    ACE (academic counseling experiment) is a natural-language text processing system currently under development at the University of Connecticut as a testbed for work in real-world conversational interaction with rule-based expert systems. ACE is designed to perform the tasks of a faculty advisor of undergraduate engineering students who intend to be computer science majors at the university. The key problem for a conversational system of this sort is robust understanding, the ability to cope with ungrammatical, ellipsed, and otherwise variant, but responsive, input. The paper outlines ACE's current status and the progress toward testing it with real users. The authors believe it represents a technology which can be applied to a wide variety of rule-based expert systems. 22 references.

  4. Creation of structured documentation templates using Natural Language Processing techniques.

    PubMed

    Kashyap, Vipul; Turchin, Alexander; Morin, Laura; Chang, Frank; Li, Qi; Hongsermeier, Tonya

    2006-01-01

    Structured Clinical Documentation is a fundamental component of the healthcare enterprise, linking both clinical (e.g., electronic health record, clinical decision support) and administrative functions (e.g., evaluation and management coding, billing). One of the challenges in creating good quality documentation templates has been the inability to address specialized clinical disciplines and adapt to local clinical practices. A one-size-fits-all approach leads to poor adoption and inefficiencies in the documentation process. On the other hand, the cost associated with manual generation of documentation templates is significant. Consequently there is a need for at least partial automation of the template generation process. We propose an approach and methodology for the creation of structured documentation templates for diabetes using Natural Language Processing (NLP). PMID:17238596

  5. Formalism-Independent Parser Evaluation with CCG and DepBank Stephen Clark

    E-print Network

    Koehn, Philipp

    the RASP parser, outperforming RASP by over 5% overall and on the majority of dependency types. 1, obtaining impressive results on DepBank and outperforming the RASP parser (Briscoe et al., 2006) by over 5

  6. Formalism-Independent Parser Evaluation with CCG and DepBank Stephen Clark

    E-print Network

    Curran, James R.

    the RASP parser, outperforming RASP by over 5% overall and on the majority of dependency types. 1, obtaining impressive results on DepBank and outperforming the RASP parser (Briscoe et al., 2006) by over 5

  7. Moving Toward a Unified Effort to Understand the Nature and Causes of Language Disorders

    E-print Network

    Rice, Mabel L.; Warren, Steven F.

    2005-01-01

    Applied Psycholinguistics 26 (2005), 3–6 Printed in the United States of America DOI: 10.1017.S0142716405050022 EDITORIAL Moving toward a unified effort to understand the nature and causes of language disorders MABEL L. RICE and STEVEN F. WARREN... University of Kansas ADDRESS FOR CORRESPONDENCE Mabel L. Rice, University of Kansas, Child Language Doctoral Program, 1000 Sunnyside Avenue, 3031 Dole Center, Lawrence, KS 66045-7555. E-mail: mabel@ku.edu The nature and causes of language disorders...

  8. Integrating case-based learning and cognitive biases for machine learning of natural language

    E-print Network

    Cardie, Claire

    Integrating case-based learning and cognitive biases for machine learning of natural language@cs.cornell.edu Running head: Integrating CBL and cognitive biases August 9, 1999 1 Integrating case-based learning and cognitive biases for machine learning of natural language Abstract This paper shows that psychological

  9. Affective Natural Language Generation Fiorella de Rosis 1 and Floriana Grasso 2

    E-print Network

    Bari, Università degli Studi di

    of ``attitudes'') include personality traits, emotions and highly-placed values. We argue for the need Affective Natural Language Generation Fiorella de Rosis 1 and Floriana Grasso 2 1 Dipartimento di of Liverpool, UK floriana@csc.liv.ac.uk 1 Introduction The automatic generation of natural language messages

  10. An Application of Natural Language Processing to Domain Modelling Two Case Studies

    E-print Network

    An Application of Natural Language Processing to Domain Modelling - Two Case Studies Leonid Kof] that natural language processing (NLP) is not ripe enough to be used in requirements engineering engineer to detect such omissions. So, an incomplete extracted model would be an indicator for some

  11. Ludics and Its Applications to Natural Language Semantics Alain Lecomte1

    E-print Network

    Paris-Sud XI, Université de

    ) - interpretation. This conception has been used in philosophy, linguistics and mathematics. In Natural Language Ludics and Its Applications to Natural Language Semantics Alain Lecomte1 and Myriam Quatrini2 1 UMR interaction. In this aim, we shall develop many concepts of Ludics like designs (which generalize proofs

  12. Testing of a Natural Language Retrieval System for a Full Text Knowledge Base.

    ERIC Educational Resources Information Center

    Bernstein, Lionel M.; Williamson, Robert E.

    1984-01-01

    The Hepatitis Knowledge Base (text of prototype information system) was used for modifying and testing "A Navigator of Natural Language Organized (Textual) Data" (ANNOD), a retrieval system which combines probabilistic, linguistic, and empirical means to rank individual paragraphs of full text for similarity to natural language queries proposed by…

  13. Planning in AI and Text Planning in Natural Language Jong-Gyun Lim

    E-print Network

    1 Planning in AI and Text Planning in Natural Language Generation Jong-Gyun Lim Columbia University the content and structure of the natural language text and that of other AI planning tasks. The problem of text planning and other AI planning problems have been studied separately from each other, and while

  14. Natural and Artificial Intelligence, Language, Consciousness, Emotion, and Anticipation

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2010-11-01

    The classical paradigm of the neural brain as the seat of human natural intelligence is too restrictive. This paper defends the idea that the neural ectoderm is the actual brain, based on the development of the human embryo. Indeed, the neural ectoderm includes the neural crest, given by pigment cells in the skin and ganglia of the autonomic nervous system, and the neural tube, given by the brain, the spinal cord, and motor neurons. So the brain is completely integrated in the ectoderm, and cannot work alone. The paper presents fundamental properties of the brain as follows. Firstly, Paul D. MacLean proposed the triune human brain, which consists of three brains in one, following the species evolution, given by the reptilian complex, the limbic system, and the neo-cortex. Secondly, consciousness and conscious awareness are analysed. Thirdly, the anticipatory unconscious free will and conscious free veto are described in agreement with the experiments of Benjamin Libet. Fourthly, the main section explains the development of the human embryo and shows that the neural ectoderm is the whole neural brain. Fifthly, a conjecture is proposed that the neural brain is completely programmed with scripts written in biological low-level and high-level languages, in a manner similar to the programming of cells by the genetic code. Finally, it is concluded that the proposition of the neural ectoderm as the whole neural brain is a breakthrough in the understanding of natural intelligence, and also in the future design of robots with artificial intelligence.

  15. The Nature of Spanish versus English Language Use at Home

    ERIC Educational Resources Information Center

    Branum-Martin, Lee; Mehta, Paras D.; Carlson, Coleen D.; Francis, David J.; Goldenberg, Claude

    2014-01-01

    Home language experiences are important for children's development of language and literacy. However, the home language context is complex, especially for Spanish-speaking children in the United States. A child's use of Spanish or English likely ranges along a continuum, influenced by preferences of particular people involved, such as parents,…

  16. A Cache-Based Natural Language Model for Speech Recognition

    Microsoft Academic Search

    Roland Kuhn; Renato De Mori

    1990-01-01

    Speech-recognition systems must often decide between competing ways of breaking up the acoustic input into strings of words. Since the possible strings may be acoustically similar, a language model is required; given a word string, the model returns its linguistic probability. Several Markov language models are discussed. A novel kind of language model which reflects short-term patterns of word use
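
    To make concrete the idea that a language model, given a word string, returns its probability, the Python sketch below implements a first-order Markov (bigram) model with add-one smoothing. It is only an assumed baseline of the kind the abstract calls a Markov language model; the cache component described in the paper is not reproduced here, and the class name BigramModel and the smoothing choice are illustrative.

```python
from collections import Counter


class BigramModel:
    """First-order Markov language model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.context_counts = Counter()  # how often each word occurs as a context
        self.bigram_counts = Counter()   # how often each (previous, current) pair occurs
        self.vocab = set()
        for words in sentences:
            padded = ["<s>"] + list(words) + ["</s>"]
            self.vocab.update(padded)
            for prev, cur in zip(padded, padded[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def prob(self, words):
        """Return the smoothed probability of a whole word string."""
        padded = ["<s>"] + list(words) + ["</s>"]
        v = len(self.vocab)
        p = 1.0
        for prev, cur in zip(padded, padded[1:]):
            p *= (self.bigram_counts[(prev, cur)] + 1) / (self.context_counts[prev] + v)
        return p
```

    A recognizer could call prob on each competing word string and prefer the one with the higher value when the acoustic evidence is similar.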

  17. A grammar-based semantic similarity algorithm for natural language sentences.

    PubMed

    Lee, Ming Che; Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to "artificial language", such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure. PMID:24982952

  18. Applying Semantic-based Probabilistic Context-Free Grammar to Medical Language Processing – A Preliminary Study on Parsing Medication Sentences

    PubMed Central

    Xu, Hua; AbdelRahman, Samir; Lu, Yanxin; Denny, Joshua C.; Doan, Son

    2011-01-01

    Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently result in two or more parse trees. One possible solution, which has not been extensively explored previously, is to augment productions in medical sublanguage grammars with probabilities to resolve the ambiguity. In this study, we associated probabilities with production rules in a semantic-based grammar for medication findings and evaluated its performance on reducing parsing ambiguity. Using the existing data set from 2009 i2b2 NLP (Natural Language Processing) challenge for medication extraction, we developed a semantic-based CFG (Context Free Grammar) for parsing medication sentences and manually created a Treebank of 4,564 medication sentences from discharge summaries. Using the Treebank, we derived a semantic-based PCFG (probabilistic Context Free Grammar) for parsing medication sentences. Our evaluation using a 10-fold cross validation showed that the PCFG parser dramatically improved parsing performance when compared to the CFG parser. PMID:21856440
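
    The idea of attaching probabilities to productions so that competing parse trees can be ranked can be sketched in a few lines of Python. This is a minimal illustration, not the authors' system: it assumes trees are nested tuples of the form (label, child, ...), estimates each rule probability by relative frequency in a treebank, and scores a tree as the product of its rule probabilities. The function names and the toy medication labels used with it are hypothetical.

```python
from collections import Counter


def rules(tree):
    """Yield (lhs, rhs) productions of a tree given as (label, child, ...); leaves are strings."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for c in children:
        if not isinstance(c, str):
            yield from rules(c)


def estimate_pcfg(treebank):
    """Relative-frequency estimate: P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter()
    lhs_counts = Counter()
    for tree in treebank:
        for lhs, rhs in rules(tree):
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


def tree_prob(tree, pcfg):
    """Probability of a tree: product of its rule probabilities (0.0 for unseen rules)."""
    p = 1.0
    for rule in rules(tree):
        p *= pcfg.get(rule, 0.0)
    return p


def best_parse(candidates, pcfg):
    """Resolve ambiguity by keeping the most probable candidate tree."""
    return max(candidates, key=lambda t: tree_prob(t, pcfg))
```

    When a plain CFG returns two or more trees for the same sentence, best_parse keeps the one whose productions were more frequent in the treebank.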

  19. A common type system for clinical natural language processing

    PubMed Central

    2013-01-01

    Background One challenge in reusing clinical data stored in electronic medical records is that these data are heterogeneous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. Results We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. Conclusions We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogeneous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types. PMID:23286462

  20. Automating curation using a natural language processing pipeline

    PubMed Central

    Alex, Beatrice; Grover, Claire; Haddow, Barry; Kabadjov, Mijail; Klein, Ewan; Matthews, Michael; Tobin, Richard; Wang, Xinglong

    2008-01-01

    Background: The tasks in BioCreative II were designed to approximate some of the laborious work involved in curating biomedical research papers. The approach to these tasks taken by the University of Edinburgh team was to adapt and extend the existing natural language processing (NLP) system that we have developed as part of a commercial curation assistant. Although this paper concentrates on using NLP to assist with curation, the system can be equally employed to extract types of information from the literature that is immediately relevant to biologists in general. Results: Our system was among the highest performing on the interaction subtasks, and competitive performance on the gene mention task was achieved with minimal development effort. For the gene normalization task, a string matching technique that can be quickly applied to new domains was shown to perform close to average. Conclusion: The technologies being developed were shown to be readily adapted to the BioCreative II tasks. Although high performance may be obtained on individual tasks such as gene mention recognition and normalization, and document classification, tasks in which a number of components must be combined, such as detection and normalization of interacting protein pairs, are still challenging for NLP systems. PMID:18834488

  1. Automatic retrieval of bone fracture knowledge using natural language processing.

    PubMed

    Do, Bao H; Wu, Andrew S; Maley, Joan; Biswal, Sandip

    2013-08-01

    Natural language processing (NLP) techniques to extract data from unstructured text into formal computer representations are valuable for creating robust, scalable methods to mine data in medical documents and radiology reports. As voice recognition (VR) becomes more prevalent in radiology practice, there is opportunity for implementing NLP in real time for decision-support applications such as context-aware information retrieval. For example, as the radiologist dictates a report, an NLP algorithm can extract concepts from the text and retrieve relevant classification or diagnosis criteria or calculate disease probability. NLP can work in parallel with VR to potentially facilitate evidence-based reporting (for example, automatically retrieving the Bosniak classification when the radiologist describes a kidney cyst). For these reasons, we developed and validated an NLP system which extracts fracture and anatomy concepts from unstructured text and retrieves relevant bone fracture knowledge. We implement our NLP in an HTML5 web application to demonstrate a proof-of-concept feedback NLP system which retrieves bone fracture knowledge in real time. PMID:23053906

  2. Natural Language Processing Framework to Assess Clinical Conditions

    PubMed Central

    Ware, Henry; Mullett, Charles J.; Jagannathan, V.

    2009-01-01

    Objective The authors developed a natural language processing (NLP) framework that could be used to extract clinical findings and diagnoses from dictated physician documentation. Design De-identified documentation was made available by i2b2 Bio-informatics research group as a part of their NLP challenge focusing on obesity and its co-morbidities. The authors describe their approach, which used a combination of concept detection, context validation, and the application of a variety of rules to conclude patient diagnoses. Results The framework was successful at correctly identifying diagnoses as judged by NLP challenge organizers when compared with a gold standard of physician annotations. The authors' overall kappa values for agreement with the gold standard were 0.92 for explicit textual results and 0.91 for intuited results. The NLP framework compared favorably with those of the other entrants, placing third in textual results and fourth in intuited results in the i2b2 competition. Conclusions The framework and approach used to detect clinical conditions were reasonably successful at extracting 16 diagnoses related to obesity. The system and methodology merit further development, targeting clinically useful applications. PMID:19390100

  3. Advanced Language Technologies

    E-print Network

    Erjavec, Tomaž

    (statistical Natural Language Processing) Characteristics of a Advanced Language Technologies · History · Slovene language corpora A corpus is: a large collection of texts

  4. Linking Parser Development to Acquisition of Syntactic Knowledge

    ERIC Educational Resources Information Center

    Omaki, Akira; Lidz, Jeffrey

    2015-01-01

    Traditionally, acquisition of syntactic knowledge and the development of sentence comprehension behaviors have been treated as separate disciplines. This article reviews a growing body of work on the development of incremental sentence comprehension mechanisms and discusses how a better understanding of the developing parser can shed light on two…

  5. The Universal Parser Architecture for Knowledge-Based Machine Translation

    E-print Network

    Carbonell, Jaime

    The Universal Parser Architecture for Knowledge-Based Machine Translation. Masaru Tomita and Jaime G. Carbonell, Center for Machine Translation, Carnegie Mellon University, Pittsburgh, PA 15213. This paper does not attempt to revisit the ample rationale for the knowledge-based machine translation. Abstract Machine

  6. The Parser Doesn't Ignore Intransitivity, after All

    ERIC Educational Resources Information Center

    Staub, Adrian

    2007-01-01

    Several previous studies (B. C. Adams, C. Clifton, & D. C. Mitchell, 1998; D. C. Mitchell, 1987; R. P. G. van Gompel & M. J. Pickering, 2001) have explored the question of whether the parser initially analyzes a noun phrase that follows an intransitive verb as the verb's direct object. Three eye-tracking experiments examined this issue in more…

  7. Natural Language Query System Design for Interactive Information Storage and Retrieval Systems. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (editor); Liu, I-Hsiung

    1985-01-01

    The currently developed multi-level language interfaces of information systems are generally designed for experienced users. These interfaces commonly ignore the nature and needs of the largest user group, i.e., casual users. This research identifies the importance of natural language query system research within information storage and retrieval system development; addresses the topics of developing such a query system; and finally, proposes a framework for the development of natural language query systems in order to facilitate the communication between casual users and information storage and retrieval systems.

  8. Of substance: The nature of language effects on entity construal

    Microsoft Academic Search

    Peggy Li; Yarrow Dunham; Susan Carey

    2009-01-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind [whisk; Imai, M., & Gentner, D.

  9. Experimenting natural-language dictation with a 20000-word speech recognizer

    Microsoft Academic Search

    P. Alto; M. Brandetti; M. Ferretti; G. Maltese; S. Scarci

    1989-01-01

    The authors describe a newly developed real-time large-vocabulary speech recognizer for the Italian language and some preliminary experiments on its usage. Some of these experiments are aimed at evaluating voice versus keyboard as a means for entry and editing of texts. The experiments made use of a dictating-machine prototype for the Italian language, which recognizes in real time natural-language sentences

  10. Dynamic changes in network activations characterize early learning of a natural language.

    PubMed

    Plante, Elena; Patterson, Dianne; Dailey, Natalie S; Kyle, R Almyrde; Fridriksson, Julius

    2014-09-01

    Those who are initially exposed to an unfamiliar language have difficulty separating running speech into individual words, but over time will recognize both words and the grammatical structure of the language. Behavioral studies have used artificial languages to demonstrate that humans are sensitive to distributional information in language input, and can use this information to discover the structure of that language. This is done without direct instruction and learning occurs over the course of minutes rather than days or months. Moreover, learners may attend to different aspects of the language input as their own learning progresses. Here, we examine processing associated with the early stages of exposure to a natural language, using fMRI. Listeners were exposed to an unfamiliar language (Icelandic) while undergoing four consecutive fMRI scans. The Icelandic stimuli were constrained in ways known to produce rapid learning of aspects of language structure. After approximately 4 min of exposure to the Icelandic stimuli, participants began to differentiate between correct and incorrect sentences at above chance levels, with significant improvement between the first and last scan. An independent component analysis of the imaging data revealed four task-related components, two of which were associated with behavioral performance early in the experiment, and two with performance later in the experiment. This outcome suggests dynamic changes occur in the recruitment of neural resources even within the initial period of exposure to an unfamiliar natural language. PMID:25058056

  11. Evaluation of Machine Learning Methods for Natural Language Processing. Walter Daelemans, Véronique Hoste

    E-print Network

    Hoste, Véronique

    in use for comparing symbolic supervised learning methods applied to human language technology tasks an annotated corpus). The reason why these methods are researched intensively is that, like statistical Evaluation of Machine Learning Methods for Natural Language Processing Tasks Walter Daelemans, V

  12. Logic-Based Rhetorical Structuring for Natural Language Generation in Human-Computer Dialogue

    Microsoft Academic Search

    Vladimir Popescu; Jean Caelen; Corneliu Burileanu

    2007-01-01

    Rhetorical structuring is a field approached mostly by research in natural language (pragmatic) interpretation. However, in natural language generation (NLG) the rhetorical structure plays an important part, in monologues and dialogues as well. Hence, several approaches in this direction exist. In most of these, the rhetorical structure is calculated and built in the framework of Rhetorical Structure Theory (RST),

  13. Using the Natural Language Paradigm (NLP) to Increase Vocalizations of Older Adults with Cognitive Impairments

    ERIC Educational Resources Information Center

    LeBlanc, Linda A.; Geiger, Kaneen B.; Sautter, Rachael A.; Sidener, Tina M.

    2007-01-01

    The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated…

  14. The Preservation and Use of Our Languages: Respecting the Natural Order of the Creator.

    ERIC Educational Resources Information Center

    Kirkness, Verna J.

    As a world community, Indigenous peoples are faced with many common challenges in their attempts to maintain the vitality of their respective languages and to honor the "natural order of the Creator." Ten strategies are discussed that are critical to the task of renewing and maintaining Indigenous languages. These strategies are: (1) banking…

  15. A natural language interface plug-in for cooperative query answering in biological databases

    PubMed Central

    2012-01-01

    Background One of the many unique features of biological databases is that the mere existence of a ground data item is not always a precondition for a query response. It may be argued that from a biologist's standpoint, queries are not always best posed using a structured language. By this we mean that approximate and flexible responses to natural language like queries are well suited for this domain. This is partly due to biologists' tendency to seek simpler interfaces and partly due to the fact that questions in biology involve high level concepts that are open to interpretations computed using sophisticated tools. In such highly interpretive environments, rigidly structured databases do not always perform well. In this paper, our goal is to propose a semantic correspondence plug-in to aid natural language query processing over arbitrary biological database schema with an aim to providing cooperative responses to queries tailored to users' interpretations. Results Natural language interfaces for databases are generally effective when they are tuned to the underlying database schema and its semantics. Therefore, changes in database schema become impossible to support, or a substantial reorganization cost must be absorbed to reflect any change. We leverage developments in natural language parsing, rule languages and ontologies, and data integration technologies to assemble a prototype query processor that is able to transform a natural language query into a semantically equivalent structured query over the database. We allow knowledge rules and their frequent modifications as part of the underlying database schema. The approach we adopt in our plug-in overcomes some of the serious limitations of many contemporary natural language interfaces, including support for schema modifications and independence from underlying database schema. 
    Conclusions The plug-in introduced in this paper is generic and facilitates connecting user selected natural language interfaces to arbitrary databases using a semantic description of the intended application. We demonstrate the feasibility of our approach with a practical example. PMID:22759613

  16. Of substance: the nature of language effects on entity construal.

    PubMed

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-06-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind [whisk; Imai, M., & Gentner, D. (1997). A cross-linguistic study of early word meaning: Universal ontology and linguistic influence. Cognition, 62, 169-200]. Five experiments replicated this language type effect on entity construal, extended it to quite different stimuli from those studied before, and extended it to a comparison between Mandarin speakers and English speakers. A sixth experiment, which did not involve interpreting the meaning of a noun or a pronoun that stands for a noun, failed to find any effect of language type on entity construal. Thus, the overall pattern of findings supports a non-Whorfian, language on language account, according to which sensitivity to lexical statistics in a count/mass language leads adults to assign a novel noun in neutral syntax the status of a count noun, influencing construal of ambiguous entities. The experiments also document and explore cross-linguistically universal factors that influence entity construal, and favor Prasada's [Prasada, S. (1999). Names for things and stuff: An Aristotelian perspective. In R. Jackendoff, P. Bloom, & K. Wynn (Eds.), Language, logic, and concepts (pp. 119-146). Cambridge, MA: MIT Press] hypothesis that features indicating non-accidentalness of an entity's form lead participants to a construal of object kind rather than substance kind. Finally, the experiments document the age at which the language type effect emerges in lexical projection. The details of the developmental pattern are consistent with the lexical statistics hypothesis, along with a universal increase in sensitivity to material kind. PMID:19230873

  17. Resolution of linear entity and path geometries expressed via partially-geospatial natural language

    E-print Network

    Marrero, John Javier

    2010-01-01

    When conveying geospatial information via natural language, people typically combine implicit, commonsense knowledge with explicitly-stated information. Usually, much of this is contextual and relies on establishing locations ...

  18. Computational Nonlinear Morphology with Emphasis on Semitic Languages. Studies in Natural Language Processing.

    ERIC Educational Resources Information Center

    Kiraz, George Anton

    This book presents a tractable computational model that can cope with complex morphological operations, especially in Semitic languages, and less complex morphological systems present in Western languages. It outlines a new generalized regular rewrite rule system that uses multiple finite-state automata to cater to root-and-pattern morphology,…

  19. Natural language technology and query expansion: issues, state-of-the-art and perspectives

    Microsoft Academic Search

    Bhawani Selvaretnam; Mohammed Belkhatir

    The availability of an abundance of knowledge sources has spurred a large amount of effort in the development and enhancement of Information Retrieval techniques. Users’ information needs are expressed in natural language and successful retrieval is very much dependent on the effective communication of the intended purpose. Natural language queries consist of multiple linguistic features which serve to represent the

  20. Exemplars and prototypes in natural language concepts: A typicality-based evaluation

    Microsoft Academic Search

    Wouter Voorspoels; Wolf Vanpaemel; Gert Storms

    2008-01-01

    Are natural language categories represented by instances of the category or by a summary representation? We used an exemplar model and a prototype model, both derived within the framework of the generalized context model (Nosofsky, 1984, 1986), to predict typicality ratings for 12 superordinate natural language concepts. The models were fitted to typicality ratings averaged across participants and to the

  1. Automation of Software System Development Using Natural Language Processing and Two-Level Grammar

    Microsoft Academic Search

    Beum-seuk Lee; Barrett R. Bryant

    2002-01-01

    In software engineering, even with recent active research on formal methods and automated tools, users’ involvement is inevitable and crucial throughout the software development lifecycle. Automation of these manual tasks would assist the developers throughout the development. Our project goal is to help the engineers to resolve ambiguity in natural language (NL) using Natural Language Processing and to overcome different

  2. Of Substance: The Nature of Language Effects on Entity Construal

    PubMed Central

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-01-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind (whisk; Imai and Gentner, 1997). Five experiments replicated this language type effect on entity construal, extended it to quite different stimuli from those studied before, and extended it to a comparison between Mandarin-speakers and English-speakers. A sixth experiment, which did not involve interpreting the meaning of a noun or a pronoun that stands for a noun, failed to find any effect of language type on entity construal. Thus, the overall pattern of findings supports a non-Whorfian, language on language account, according to which sensitivity to lexical statistics in a count/mass language leads adults to assign a novel noun in neutral syntax the status of a count noun, influencing construal of ambiguous entities. The experiments also document and explore cross-linguistically universal factors that influence entity construal, and favor Prasada's (1999) hypothesis that features indicating non-accidentalness of an entity's form lead participants to a construal of object-kind rather than substance-kind. Finally, the experiments document the age at which the language type effect emerges in lexical projection. The details of the developmental pattern are consistent with the lexical statistics hypothesis, along with a universal increase in sensitivity to material kind. PMID:19230873

  3. 2 Evolution in Language and Elsewhere It is a natural principle that the script and the sounds

    E-print Network

    fossil remains reveal brain and vocal tract structures suggesting that the modern human language faculty. The study of language change over the past 5,000-7,000 years assumes a mature human language faculty. 2 Evolution in Language and Elsewhere It is a natural principle that the script and the sounds

  4. Entropy analysis of word-length series of natural language texts: Effects of text language and genre

    E-print Network

    Kalimeri, Maria; Papadimitriou, Constantinos; Karamanos, Kostantinos; Diakonos, Fotis K; Papageorgiou, Haris

    2014-01-01

    We estimate the $n$-gram entropies of natural language texts in word-length representation and find that these are sensitive to text language and genre. We attribute this sensitivity to changes in the probability distribution of the lengths of single words and emphasize the crucial role of the uniformity of probabilities of having words with length between five and ten. Furthermore, comparison with the entropies of shuffled data reveals the impact of word length correlations on the estimated $n$-gram entropies.
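
    The word-length representation described in this record can be illustrated with a short sketch: each word is replaced by its length, and the Shannon entropy of the resulting length n-grams is estimated from their relative frequencies. The tokenization rule, the function names, and the plug-in (frequency-count) estimator are assumptions made for illustration; the abstract does not specify the estimator the authors used.

```python
import math
import re
from collections import Counter


def word_lengths(text):
    """Map a text to its series of word lengths (words = runs of letters; an assumption)."""
    return [len(word) for word in re.findall(r"[^\W\d_]+", text)]


def ngram_entropy(series, n):
    """Plug-in Shannon entropy (in bits) of the n-grams occurring in a symbol series."""
    grams = [tuple(series[i:i + n]) for i in range(len(series) - n + 1)]
    if not grams:
        raise ValueError("series is shorter than n")
    total = len(grams)
    counts = Counter(grams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


lengths = word_lengths("The cat sat on the mat.")
h1 = ngram_entropy(lengths, 1)  # entropy of single word lengths
h2 = ngram_entropy(lengths, 2)  # entropy of pairs of consecutive word lengths
```

    Applying the same function to a shuffled copy of the length series (for example via `random.shuffle`) gives the kind of baseline against which correlations between consecutive word lengths can be judged.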

  5. The natural order of events: How speakers of different languages represent events nonverbally

    PubMed Central

    Goldin-Meadow, Susan; So, Wing Chee; Özyürek, Aslı; Mylander, Carolyn

    2008-01-01

    To test whether the language we speak influences our behavior even when we are not speaking, we asked speakers of four languages differing in their predominant word orders (English, Turkish, Spanish, and Chinese) to perform two nonverbal tasks: a communicative task (describing an event by using gesture without speech) and a noncommunicative task (reconstructing an event with pictures). We found that the word orders speakers used in their everyday speech did not influence their nonverbal behavior. Surprisingly, speakers of all four languages used the same order and on both nonverbal tasks. This order, actor–patient–act, is analogous to the subject–object–verb pattern found in many languages of the world and, importantly, in newly developing gestural languages. The findings provide evidence for a natural order that we impose on events when describing and reconstructing them nonverbally and exploit when constructing language anew. PMID:18599445

  6. Heavy NP shift is the parser’s last resort: Evidence from eye movements

    PubMed Central

    Staub, Adrian; Clifton, Charles; Frazier, Lyn

    2006-01-01

    Two eye movement experiments explored the roles of verbal subcategorization possibilities and transitivity biases in the processing of heavy NP shift sentences in which the verb’s direct object appears to the right of a post-verbal phrase. In Experiment 1, participants read sentences in which a prepositional phrase immediately followed the verb, which was either obligatorily transitive or had a high transitivity bias (e.g., Jack praised/watched from the stands his daughter’s attempt to shoot a basket). Experiment 2 compared unshifted sentences to sentences in which an adverb intervened between the verb and its object, and obligatorily transitive verbs to optionally transitive verbs with widely varying transitivity biases. In both experiments, evidence of processing difficulty appeared on the material that intervened between the verb and its object when the verb was obligatorily transitive, and on the shifted direct object when the verb was optionally transitive, regardless of transitivity bias. We conclude that the parser adopts the heavy NP shift analysis only when it is forced to by the grammar, which we interpret in terms of a preference for immediate incremental interpretation. PMID:17047731

  7. Proceedings of Recent Advances in Natural Language Processing, pages 275–281, Hissar, Bulgaria, 12-14 September 2011.

    E-print Network

    Paris-Sud XI, Université de

    Proceedings of Recent Advances in Natural Language Processing, pages 275–281, Hissar, Bulgaria, 12 number of natural language applications (e.g. information extraction, question answering, automatic engineering (what to annotate?) and the other concerning language engineering (how to deal

  8. Role of PROLOG (Programming and Logic) in natural-language processing. Report for September-December 1987

    Microsoft Academic Search

    McHale

    1988-01-01

    The field of artificial intelligence strives to produce computer programs that exhibit intelligent behavior. One of the areas of interest is the processing of natural language. This report discusses the role of the computer language PROLOG in Natural Language Processing (NLP) both from theoretic and pragmatic viewpoints. The reasons for using PROLOG for NLP are numerous. First, linguists can write

  9. The 6th Conference on Natural Language Learning 2002 (CoNLL-2002). Timothy Baldwin, Aline: Extracting the Unextractable: A Case Study on

    E-print Network

    on Natural Language Learning 2002 (CoNLL-2002) Timothy Baldwin, Aline: Extracting the Unextractable: A Case, David Yarowsky: Inducing Translation Lexicons via Diverse Similarity Measures and Bridge Languages S. H

  10. Proceedings of the 5th International Joint Conference on Natural Language Processing, pages 129–137, Chiang Mai, Thailand, November 8–13, 2011. © 2011 AFNLP

    E-print Network

    Fraser, Alexander M.

    class vocabulary from Persian, Arabic, Turkish and Sanskrit. Both languages have lived together Proceedings of the 5th International Joint Conference on Natural Language Processing, pages 129 Institute for Natural Language Processing University of Stuttgart {sajjad

  11. Evolutionary developmental linguistics: Naturalization of the faculty of language

    Microsoft Academic Search

    John L. Locke

    2009-01-01

    Since language is a biological trait, it is necessary to investigate its evolution, development, and functions, along with the mechanisms that have been set aside, and are now recruited, for its acquisition and use. It is argued here that progress toward each of these goals can be facilitated by new programs of research, carried out within a new theoretical framework—one

  12. Evolutionary Developmental Linguistics: Naturalization of the Faculty of Language

    ERIC Educational Resources Information Center

    Locke, John L.

    2009-01-01

    Since language is a biological trait, it is necessary to investigate its evolution, development, and functions, along with the mechanisms that have been set aside, and are now recruited, for its acquisition and use. It is argued here that progress toward each of these goals can be facilitated by new programs of research, carried out within a new…

  13. Domain Driven Technologies for Natural Language Processing Alfio Massimiliano Gliozzo

    E-print Network

    Baeza-Yates, Ricardo

    -irst Abstract: Semantic Domains are a matter of recent interest in Computational Linguistics. Domain Models. Semantic Domains shows many interesting properties: lexical ambiguity inside a domain is sensibly reduced languages. These properties have been exploited to develop innovative technologies for a wide range

  14. Early understanding of emotion: Evidence from natural language

    Microsoft Academic Search

    Henry M. Wellman; Paul L. Harris; Mita Banerjee; Anna Sinclair

    1995-01-01

    Young children's early understanding of emotion was investigated by examining their use of emotion terms such as happy, sad, mad, and cry. Five children's emotion language was examined longitudinally from the age of 2 to 5 years, and as a comparison their reference to pains via such terms as burn, sting, and hurt was also examined. In Phase 1 we

  15. The 2014 Conference on Empirical Methods In Natural Language Processing

    E-print Network

    from a learning sciences or social psychological perspective. What is needed are new methodologies for development and interpretation of models that bridge expertise from machine learning and language technologies on one side and learning sciences, sociolinguistics, and social psychology on the other side. The field

  16. Book Review Natural Language Processing for Historical Texts

    E-print Network

    Boyer, Edmond

    developments in language technology may help digital humanities projects to be aware of the current state and provides an overview of the reasons why NLP has such an entrenched position in digital humanities at large. Chapter 2 in particular ("NLP and digital humanities") could be read as an autonomous position paper

  17. Book Review Natural Language Processing for Historical Texts

    E-print Network

    developments in language technology may help digital humanities projects to be aware of the current state) has such an entrenched position in digital humanities at large and the study of historical text and Digital Humanities") could be read as an autonomous position paper, which, independently of the following

  18. Children as Models for Computers: Natural Language Acquisition for Machine Learning

    E-print Network

    Paris-Sud XI, Université de

    Children as Models for Computers: Natural Language Acquisition for Machine Learning Leonor Becerra Tarragona, Spain mariadolores.jimenez@urv.cat Abstract. This paper focuses on a subfield of machine learning. Nevertheless, what has not been achieved yet is that machines learn to speak. It is a truism that natural

  19. Rimac: A Natural-Language Dialogue System that Engages Students in Deep Reasoning Dialogues about Physics

    ERIC Educational Resources Information Center

    Katz, Sandra; Jordan, Pamela; Litman, Diane

    2011-01-01

    The natural-language tutorial dialogue system that the authors are developing will allow them to focus on the nature of interactivity during tutoring as a malleable factor. Specifically, it will serve as a research platform for studies that manipulate the frequency and types of verbal alignment processes that take place during tutoring, such as…

  20. Using Edit Distance to Analyse Errors in a Natural Language to Logic Translation Corpus

    ERIC Educational Resources Information Center

    Barker-Plummer, Dave; Dale, Robert; Cox, Richard; Romanczuk, Alex

    2012-01-01

    We have assembled a large corpus of student submissions to an automatic grading system, where the subject matter involves the translation of natural language sentences into propositional logic. Of the 2.3 million translation instances in the corpus, 286,000 (approximately 12%) are categorized as being in error. We want to understand the nature of…
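
    As background for the edit-distance analysis named in this record's title, the following is a minimal sketch of the standard Levenshtein distance, applied here to token sequences. It is not necessarily the distance variant the authors used, and the tokenized formulas are made-up examples.

```python
def edit_distance(source, target):
    """Levenshtein distance: minimum number of insertions, deletions,
    and substitutions needed to turn `source` into `target`."""
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j items of `target`.
    previous = list(range(len(target) + 1))
    for i, s_item in enumerate(source, start=1):
        current = [i]
        for j, t_item in enumerate(target, start=1):
            substitution = previous[j - 1] + (0 if s_item == t_item else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# Hypothetical student answer versus reference translation, as token lists.
student = ["A", "&", "B"]
reference = ["A", "|", "B"]
distance = edit_distance(student, reference)
```

    Working on token lists rather than raw characters means that replacing one connective with another counts as a single substitution, which is usually the more meaningful unit when comparing logical formulas.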

  1. Natural language processing for information assurance and security: an overview and implementations

    Microsoft Academic Search

    Mikhail J. Atallah; Craig J. Mcdonough; Victor Raskin; Sergei Nirenburg

    2001-01-01

    This paper explores a promising interface between natural language processing (NLP) and information assurance and security (IAS). More specifically, it is devoted to possible applications of the accumulated considerable resources in NLP to IAS. The paper is of a mixed theoretical and empirical nature. Of the four possible venues of applications, (i) memorizing randomly generated passwords with the help of automatically generated funny jingles,

  2. Menelas AIM Project A2023 An Access System for Medical Records using Natural Language

    E-print Network

    Zweigenbaum, Pierre

    Menelas AIM Project A2023 An Access System for Medical Records using Natural Language Final Report WP3.3 10/7/1995 Deliverable 17 (4/3) Nature: R Type: P Menelas AIM Project A2023 An Access of a pilot system able to analyse medical texts. This report summarises the developments performed

  3. Menelas AIM Project A2023 An Access System for Medical Records using Natural Language

    E-print Network

    Zweigenbaum, Pierre

    Menelas AIM Project A2023 An Access System for Medical Records using Natural Language Final Report WP3.3 10/7/1995 Deliverable 17 (4/3) Nature: R Type: P Menelas AIM Project A2023 An Access and implementation of a pilot system able to analyse medical texts. This report summarises the developments performed

  4. Incremental Syntactic Parsing of Natural Language Corpora with Simple Synchrony Networks

    Microsoft Academic Search

    James B. Henderson

    2001-01-01

    This article explores the use of Simple Synchrony Networks (SSNs) for learning to parse English sentences drawn from a corpus of naturally occurring text. Parsing natural language sentences requires taking a sequence of words and outputting a hierarchical structure representing how those words fit together to form constituents. Feed-forward and Simple Recurrent Networks have had great difficulty with this task,

  5. Natural Language Processing (NLP) as an Instrument of Raising the Language Awareness of Learners of English as a Second Language

    ERIC Educational Resources Information Center

    Dodigovic, Marina

    2003-01-01

    Based on the statistical regularity of certain error types, an interlanguage grammar could be devised and applied to develop an intelligent computer tool, capable not only of identifying the typical errors in L2 student writing, but also of making adequate corrections. The purpose of the corrections is to make the student aware of the language

  6. The feasibility of using natural language processing to extract clinical information from breast pathology reports

    PubMed Central

    Buckley, Julliette M.; Coopey, Suzanne B.; Sharko, John; Polubriaginof, Fernanda; Drohan, Brian; Belli, Ahmet K.; Kim, Elizabeth M. H.; Garber, Judy E.; Smith, Barbara L.; Gadd, Michele A.; Specht, Michelle C.; Roche, Constance A.; Gudewicz, Thomas M.; Hughes, Kevin S.

    2012-01-01

    Objective: The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. Approach and Procedure: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. Results: There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. Conclusion: We have demonstrated how a large body of free text medical information such as seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task. PMID:22934236
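
    The sensitivity and specificity figures in this record compare the NLP output against expert human coders. A minimal sketch of how the two measures are computed from paired binary decisions follows; the function name and the sample data are illustrative and do not come from the study.

```python
def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) for paired boolean decisions.

    `predicted` holds the system's decisions and `reference` the expert
    coders' decisions, with True meaning the diagnosis is present.
    """
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    tp = fp = tn = fn = 0
    for system_says, expert_says in zip(predicted, reference):
        if expert_says and system_says:
            tp += 1  # diagnosis present and found
        elif expert_says and not system_says:
            fn += 1  # diagnosis present but missed
        elif not expert_says and system_says:
            fp += 1  # diagnosis absent but reported
        else:
            tn += 1  # diagnosis absent and not reported
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference needs at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Five hypothetical reports: the system flags one report the experts did not.
nlp_output = [True, True, False, False, True]
expert_codes = [True, False, False, False, True]
sensitivity, specificity = sensitivity_specificity(nlp_output, expert_codes)
```

    In this made-up sample both true cases are found, giving a sensitivity of 1.0, while one of the three negative reports is wrongly flagged, giving a specificity of 2/3.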

  7. n-Gram Statistics for Natural Language Understanding and Text Processing

    Microsoft Academic Search

    Ching Y. Suen

    1979-01-01

    n-gram (n = 1 to 5) statistics and other properties of the English language were derived for applications in natural language understanding and text processing. They were computed from a well-known corpus composed of 1 million word samples. Similar properties were also derived from the most frequent 1000 words of three other corpuses. The positional distributions of n-grams obtained in
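
    As a rough illustration of what computing n-gram statistics for n = 1 to 5 involves, the sketch below counts contiguous n-grams over a token sequence. The helper name and the sample sentence are assumptions made for this example; the abstract does not say whether the original statistics were taken over letters or words, and the same function works for either.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count contiguous n-grams in a sequence of tokens (words or letters)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical sample text, used only to exercise the function.
words = "the cat saw the cat".split()
for n in range(1, 6):
    print(n, ngram_counts(words, n).most_common(2))
```

    Passing a string instead of a word list (for example `ngram_counts("parser", 2)`) yields letter n-grams.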

  8. QATT: a Natural Language Interface for QPE. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    White, Douglas Robert-Graham

    1989-01-01

    QATT, a natural language interface developed for the Qualitative Process Engine (QPE) system, is presented. The major goal was to evaluate the use of a preexisting natural language understanding system designed to be tailored for query processing in multiple domains of application. The other goal of QATT is to provide a comfortable environment in which to query envisionments in order to gain insight into the qualitative behavior of physical systems. It is shown that the use of the preexisting system made possible the development of a reasonably useful interface in a few months.

  9. SWAN: An expert system with natural language interface for tactical air capability assessment

    NASA Technical Reports Server (NTRS)

    Simmons, Robert M.

    1987-01-01

    SWAN is an expert system and natural language interface for assessing the war fighting capability of Air Force units in Europe. The expert system is an object oriented knowledge based simulation with an alternate worlds facility for performing what-if excursions. Responses from the system take the form of generated text, tables, or graphs. The natural language interface is an expert system in its own right, with a knowledge base and rules which understand how to access external databases, models, or expert systems. The distinguishing feature of the Air Force expert system is its use of meta-knowledge to generate explanations in the frame and procedure based environment.

  10. BISON(1): bison - GNU Project parser generator (yacc replacement)

    E-print Network

    Stark, Ian

    BISON(1) BISON(1) NAME bison - GNU Project parser generator (yacc replacement) SYNOPSIS bison [ -b ] [ --yacc ] [ -h ] [ --help ] [ --fixed-output-files ] file DESCRIPTION Bison is a parser generator.tab.c. This description of the options that can be given to bison is adapted from the node Invocation in the bison

  11. Corretto: A CUP of Java with Grappa A Tool for Parser Generation

    E-print Network

    van Breugel, Franck

    Corretto: A CUP of Java with Grappa A Tool for Parser Generation Laura Apostoloiu York University, Department of Computer Science 4700 Keele Street, Toronto, Canada M3J 1P3 November 26, 2002 Abstract CUP as action code, can be included in a CUP specification such that the generated parser also builds parse trees

  12. INTERFACING A CDG PARSER WITH AN HMM WORD RECOGNIZER USING WORD GRAPHS

    E-print Network

    Johnson, Michael T.

    the recognition lattice of the speech recognizer. Word graphs support an aggregate processing view as well INTERFACING A CDG PARSER WITH AN HMM WORD RECOGNIZER USING WORD GRAPHS M. P. Harper, M. T. Johnson component based on hidden Markov models with a constraint dependency grammar (CDG) parser using a word

  13. A Parser from Antiquity1 Aravind K. Joshi and Phil Hopely

    E-print Network

    Plotkin, Joshua B.

    of this program have a close relationship to some of the recent work on finite state transducers. 1 Introduction of the parser in some detail and also briefly discuss several aspects of the parser that have a close relationship to some of the recent work on finite state transducers. An illustrative example is provided

  14. The Unification Space implemented as a localist neural net: predictions and error-tolerance in a constraint-based parser.

    PubMed

    Vosse, Theo; Kempen, Gerard

    2009-12-01

    We introduce a novel computer implementation of the Unification-Space parser (Vosse and Kempen in Cognition 75:105-143, 2000) in the form of a localist neural network whose dynamics is based on interactive activation and inhibition. The wiring of the network is determined by Performance Grammar (Kempen and Harbusch in Verb constructions in German and Dutch. Benjamins, Amsterdam, 2003), a lexicalist formalism with feature unification as binding operation. While the network is processing input word strings incrementally, the evolving shape of parse trees is represented in the form of changing patterns of activation in nodes that code for syntactic properties of words and phrases, and for the grammatical functions they fulfill. The system is capable, at least qualitatively and rudimentarily, of simulating several important dynamic aspects of human syntactic parsing, including garden-path phenomena and reanalysis, effects of complexity (various types of clause embeddings), fault-tolerance in case of unification failures and unknown words, and predictive parsing (expectation-based analysis, surprisal effects). English is the target language of the parser described. PMID:19784798

  15. Natural language processing with dynamic classification improves P300 speller accuracy and bit rate

    NASA Astrophysics Data System (ADS)

    Speier, William; Arnold, Corey; Lu, Jessica; Taira, Ricky K.; Pouratian, Nader

    2012-02-01

    The P300 speller is an example of a brain-computer interface that can restore functionality to victims of neuromuscular disorders. Although the most common application of this system has been communicating language, the properties and constraints of the linguistic domain have not to date been exploited when decoding brain signals that pertain to language. We hypothesized that combining the standard stepwise linear discriminant analysis with a Naive Bayes classifier and a trigram language model would increase the speed and accuracy of typing with the P300 speller. With integration of natural language processing, we observed significant improvements in accuracy and 40-60% increases in bit rate for all six subjects in a pilot study. This study suggests that integrating information about the linguistic domain can significantly improve signal classification.

  16. The Exploring Nature of Definitions and Classifications of Language Learning Strategies (LLSs) in the Current Studies of Second/Foreign Language Learning

    ERIC Educational Resources Information Center

    Fazeli, Seyed Hossein

    2011-01-01

    This study aims to explore the nature of definitions and classifications of Language Learning Strategies (LLSs) in the current studies of second/foreign language learning in order to show the current problems regarding such definitions and classifications. The present study shows that there is not a universal agreeable definition and…

  17. Crowdsourcing Research Opportunities: Lessons from Natural Language Processing

    E-print Network

    Bontcheva, Kalina

    1. INTRODUCTION The notion of citizen science, "a form of collaboration that involves the public significantly lowered the cost of user participation and lead to citizen science projects that are entirely citizen science projects. Although crowdsourcing of scientific work is a natural continuation of citizen

  18. Spatial and Numerical Abilities without a Complete Natural Language

    ERIC Educational Resources Information Center

    Hyde, Daniel C.; Winkler-Rhoades, Nathan; Lee, Sang-Ah; Izard, Veronique; Shapiro, Kevin A.; Spelke, Elizabeth S.

    2011-01-01

    We studied the cognitive abilities of a 13-year-old deaf child, deprived of most linguistic input from late infancy, in a battery of tests designed to reveal the nature of numerical and geometrical abilities in the absence of a full linguistic system. Tests revealed widespread proficiency in basic symbolic and non-symbolic numerical computations…

  19. Patterns of Natural Language Use: Disclosure, Personality, and Social Integration

    Microsoft Academic Search

    James W. Pennebaker; Anna Graybeal

    2001-01-01

    When people write about their deepest thoughts and feelings about an emotionally significant event, numerous benefits in many domains (e.g., health, achievement, and well-being) result. As one step in understanding how writing achieves these effects, we have developed a computer program that provides a “fingerprint” of the words people use in writing or in natural settings. Analyses of text samples

  20. Visual language recognition with a feed-forward network of spiking neurons

    SciTech Connect

    Rasmussen, Craig E [Los Alamos National Laboratory; Garrett, Kenyan [Los Alamos National Laboratory; Sottile, Matthew [GALOIS; Shreyas, Ns [INDIANA UNIV.

    2010-01-01

    An analogy is made and exploited between the recognition of visual objects and language parsing. A subset of regular languages is used to define a one-dimensional 'visual' language, in which the words are translational and scale invariant. This allows an exploration of the viewpoint invariant languages that can be solved by a network of concurrent, hierarchically connected processors. A language family is defined that is hierarchically tiling system recognizable (HREC). As inspired by nature, an algorithm is presented that constructs a cellular automaton that recognizes strings from a language in the HREC family. It is demonstrated how a language recognizer can be implemented from the cellular automaton using a feed-forward network of spiking neurons. This parser recognizes fixed-length strings from the language in parallel and as the computation is pipelined, a new string can be parsed in each new interval of time. The analogy with formal language theory allows inferences to be drawn regarding what class of objects can be recognized by visual cortex operating in purely feed-forward fashion and what class of objects requires a more complicated network architecture.

  1. Natural Language Generation for Nature Conservation: Automating Feedback to help Volunteers identify

    E-print Network

    Siddharthan, Advaith

    Language Generation, Educational Application, Bumblebee Conservation, Citizen Science, Generating, including the use of websites and social media, to increase participation in "citizen science", which Science, University of Aberdeen, U.K. (2) Aberdeen Centre for Environmental Sustainability (ACES

  2. Drawing Dynamic Geometry Figures Online with Natural Language for Junior High School Geometry

    ERIC Educational Resources Information Center

    Wong, Wing-Kwong; Yin, Sheng-Kai; Yang, Chang-Zhe

    2012-01-01

    This paper presents a tool for drawing dynamic geometric figures by understanding the texts of geometry problems. With the tool, teachers and students can construct dynamic geometric figures on a web page by inputting a geometry problem in natural language. First we need to build the knowledge base for understanding geometry problems. With the…

  3. Teaching the Tacit Knowledge of Programming to Novices with Natural Language Tutoring

    ERIC Educational Resources Information Center

    Lane, H. Chad; VanLehn, Kurt

    2005-01-01

    For beginning programmers, inadequate problem solving and planning skills are among the most salient of their weaknesses. In this paper, we test the efficacy of natural language tutoring to teach and scaffold acquisition of these skills. We describe ProPL (Pro-PELL), a dialogue-based intelligent tutoring system that elicits goal decompositions and…

  4. Real English: A Translator to Enable Natural Language Man-Machine Conversation.

    ERIC Educational Resources Information Center

    Gautin, Harvey

    This dissertation presents a pragmatic interpreter/translator called Real English to serve as a natural language man-machine communication interface in a multi-mode on-line information retrieval system. This multi-mode feature affords the user a library-like searching tool by giving him access to a dictionary, lexicon, thesaurus, synonym table,…

  5. The Linguistic Correlates of Conversational Deception: Comparing Natural Language Processing Technologies

    ERIC Educational Resources Information Center

    Duran, Nicholas D.; Hall, Charles; McCarthy, Philip M.; McNamara, Danielle S.

    2010-01-01

    The words people use and the way they use them can reveal a great deal about their mental states when they attempt to deceive. The challenge for researchers is how to reliably distinguish the linguistic features that characterize these hidden states. In this study, we use a natural language processing tool called Coh-Metrix to evaluate deceptive…

  6. To Catch a Predator: A Natural Language Approach for Eliciting Malicious Payloads

    Microsoft Academic Search

    Sam Small; Joshua Mason; Fabian Monrose; Niels Provos; Adam Stubblefield

    2008-01-01

    We present an automated, scalable, method for crafting dynamic responses to real-time network requests. Specifically, we provide a flexible technique based on natural language processing and string alignment techniques for intelligently interacting with protocols trained directly from raw network traffic. We demonstrate the utility of our approach by creating a low-interaction web-based honeypot capable of luring attacks

  7. The Contemporary Thesaurus of Social Science Terms and Synonyms: A Guide for Natural Language Computer Searching.

    ERIC Educational Resources Information Center

    Knapp, Sara D., Comp.

    This book is designed primarily to help users find meaningful words for natural language, or free-text, computer searching of bibliographic and textual databases in the social and behavioral sciences. Additionally, it covers many socially relevant and technical topics not covered by the usual literary thesaurus, therefore it may also be useful for…

  8. Naturally-Occurring Comprehension Strategies Instruction in 9th-Grade Language Arts Classrooms

    Microsoft Academic Search

    Øistein Anmarkrud; Ivar Bråten

    2011-01-01

    In this descriptive classroom study, we used video-based observations supplemented with teacher interviews to provide precise information about the instruction of comprehension strategies that naturally occurred in 4 Norwegian lower-secondary language arts classrooms while students worked with expository texts. The results showed that the teachers varied vastly with respect to the amount of comprehension strategies instruction, that the repertoire of

  9. FASTUS: A Cascaded Finite-State Transducer for Extracting Information from Natural-Language Text

    Microsoft Academic Search

    Jerry R. Hobbs; Douglas E. Appelt; John Bear; David J. Israel; Megumi Kameyama; Mark E. Stickel; Mabry Tyson

    1997-01-01

    Abstract FASTUS is a system for extracting information from natural language text for entry into a database and for other applications. It works essentially as a cascaded, nondeterministic finite-state automaton. There are five stages in the operation of FASTUS. In Stage 1, names and other fixed form expressions are recognized. In Stage 2, basic noun groups, verb groups, and prepositions

  10. The Utility of Affect Expression in Natural Language Interactions in Joint Human-Robot Tasks

    E-print Network

    Treuille, Adrien

    and cognitive mechanisms. We then present the details of the human-robot team experiment, in which The Utility of Affect Expression in Natural Language Interactions in Joint Human-Robot Tasks.nd.edu ABSTRACT Recognizing and responding to human affect is important in collaborative tasks in joint human

  11. Integration of an XML electronic dictionary with linguistic tools for natural language processing

    Microsoft Academic Search

    Octavio Santana Suárez; Francisco J. Carreras Riudavets; Zenón José Hernández Figueroa; Antonio C. González Cabrera

    2007-01-01

    This study proposes the codification of lexical information in electronic dictionaries, in accordance with a generic and extendable XML scheme model, and its conjunction with linguistic tools for the processing of natural language. Our approach is different from other similar studies in that we propose XML coding of those items from a dictionary of meanings that are less related to

  12. TWO PH.D. STUDENTSHIPS IN NATURAL LANGUAGE PROCESSING SAPIENZA UNIVERSITY OF ROME (ITALY)

    E-print Network

    Navigli, Roberto

    TWO PH.D. STUDENTSHIPS IN NATURAL LANGUAGE PROCESSING SAPIENZA UNIVERSITY OF ROME (ITALY) The Department of Computer Science of the Sapienza University of Rome invites applications for TWO FULLY the position reference LCL-PHD-2011 in the subject line. ABOUT LA SAPIENZA The Sapienza University of Rome

  13. Integrating existing natural language processing tools for medication extraction from discharge summaries

    Microsoft Academic Search

    Son Doan; Lisa Bastarache; Sergio Klimkowski; Joshua C. Denny; Hua Xu

    2010-01-01

    OBJECTIVE: To develop an automated system to extract medications and related information from discharge summaries as part of the 2009 i2b2 natural language processing (NLP) challenge. This task required accurate recognition of medication name, dosage, mode, frequency, duration, and reason for drug administration. DESIGN: We developed an integrated system using several existing NLP components developed at Vanderbilt University Medical Center,

  14. i2b2 Workshop on Natural Language Processing Challenges for Clinical Records

    Microsoft Academic Search

    Ozlem Uzuner; Peter Szolovits; Isaac Kohane

    This workshop aims to bring together computational linguists and medical informaticians interested in automatic linguistic processing of clinical records such as medical discharge summaries and radiology reports. Lack of a publicly available and standardized data set has been one of the biggest barriers to systematic progress of Natural Language Processing techniques for clinical data. Within the framework of the i2b2

  15. The Role of Non-Ambiguous Words in Natural Language Disambiguation Rada Mihalcea

    E-print Network

    Mihalcea, Rada

    sense disambiguation (English, Romanian). Classifiers trained on automatically constructed corpora of Computer Science and Engineering University of North Texas rada@cs.unt.edu Abstract This paper describes an unsupervised approach for natural language disambiguation, applicable to ambiguity problems where classes

  17. Automatic recognition and understanding of spoken language - a first step toward natural human-machine communication

    Microsoft Academic Search

    Biing-Hwang Juang; Sadaoki Furui

    2000-01-01

    The promise of a powerful computing device to help people in productivity as well as in recreation can only be realized with proper human-machine communication. Automatic recognition and understanding of spoken language is the first step toward natural human-machine interaction. Research in this field has produced remarkable results, leading to many exciting expectations and new challenges. We summarize the development

  18. Zipf's word frequency law in natural language: A critical review

    E-print Network

    Makous, Walter

    THEORETICAL REVIEW Zipf's word frequency law in natural language: A critical review and future approximately follows a simple mathematical form known as Zipf's law. This article first shows that human law, although prior data visualization methods have obscured this fact. A number of empirical

  19. The Application of Natural Language Processing to Augmentative and Alternative Communication

    ERIC Educational Resources Information Center

    Higginbotham, D. Jeffery; Lesher, Gregory W.; Moulton, Bryan J.; Roark, Brian

    2012-01-01

    Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of next…

  20. Natural Language Processing for Lines and Devices in Portable Chest X-Rays

    E-print Network

    Rubin, Daniel L.

    /improving patient safety and reducing medical errors Natural Language Processing for Lines and Devices the frequency of infections to the presence or length of time medical devices are present in patients. We a variety of medical devices inserted as part of their course of clinical care. These patients frequently

  1. Dude, srsly?: The Surprisingly Formal Nature of Twitter's Language Yuheng Hu Kartik Talamadupula Subbarao Kambhampati

    E-print Network

    Kambhampati, Subbarao

    Dude, srsly?: The Surprisingly Formal Nature of Twitter's Language Yuheng Hu Kartik Talamadupula, krt, rao}@asu.edu Abstract Twitter has become the de facto information sharing and communication platform. Given the factors that influence language on Twitter - size limitation as well as communication

  2. POS Tagging of Dialectal Arabic: A Minimally Supervised Approach

    E-print Network

    Kirchhoff, Katrin

    small amounts of written dialectal material in e.g. plays, novels, chat rooms, etc., data can only POS Tagging of Dialectal Arabic: A Minimally Supervised Approach Abstract Natural language processing technology for the dialects of Arabic is still in its infancy, due to the problem of obtaining

  3. Discrimination of Coronal Stops by Bilingual Adults: The Timing and Nature of Language Interaction

    ERIC Educational Resources Information Center

    Sundara, Megha; Polka, Linda

    2008-01-01

    The current study was designed to investigate the timing and nature of interaction between the two languages of bilinguals. For this purpose, we compared discrimination of Canadian French and Canadian English coronal stops by simultaneous bilingual, monolingual and advanced early L2 learners of French and English. French /d/ is phonetically…

  4. Bayesian Inference with Tears a tutorial workbook for natural language researchers

    E-print Network

    Zhang, Yi

    1 Bayesian Inference with Tears a tutorial workbook for natural language researchers Kevin Knight point in my life, I downloaded a decision tree package, and I trained it on some data that I had. I didn's problems." Wow! I figured everybody should know about it, so I wrote "A Statistical MT Tutorial Workbook

  5. A Natural Language Intelligent Tutoring System for Training Pathologists: Implementation and Evaluation

    ERIC Educational Resources Information Center

    El Saadawi, Gilan M.; Tseytlin, Eugene; Legowski, Elizabeth; Jukic, Drazen; Castine, Melissa; Fine, Jeffrey; Gormley, Robert; Crowley, Rebecca S.

    2008-01-01

    Introduction: We developed and evaluated a Natural Language Interface (NLI) for an Intelligent Tutoring System (ITS) in Diagnostic Pathology. The system teaches residents to examine pathologic slides and write accurate pathology reports while providing immediate feedback on errors they make in their slide review and diagnostic reports. Residents…

  6. Natural Vs. Precise Concise Languages for Human Operation of Computers: Research Issues and Experimental Approaches

    Microsoft Academic Search

    Ben Shneiderman

    1980-01-01

    This paper raises concerns that natural language front ends for computer systems can limit a researcher's scope of thinking, yield inappropriately complex systems, and exaggerate public fear of computers. Alternative modes of computer use are suggested and the role of psychologically oriented controlled experimentation is emphasized. Research methods and recent experimental results are briefly reviewed.

  7. Pragmatic Issues in Handling Miscommunication: Observations of a Spoken Natural Language Dialog System

    E-print Network

    Smith, Ronnie W.

    Pragmatic Issues in Handling Miscommunication: Observations of a Spoken Natural Language Dialog miscommunications can safely be ignored; and (2) a strategy of repairing miscommunications based. Furthermore, this repair strategy can deal with miscommunication caused by misstatements by the human user

  8. Limitations of Co-Training for Natural Language Learning from Large Datasets

    Microsoft Academic Search

    David Pierce; Claire Cardie

    2001-01-01

    Co-Training is a weakly supervised learning paradigm in which the redundancy of the learning task is captured by training two classifiers using separate views of the same data. This enables bootstrapping from a small set of labeled training data via a large set of unlabeled data. This study examines the learning behavior of co-training on natural language

  9. NLPIR: A Theoretical Framework for Applying Natural Language Processing to Information Retrieval.

    ERIC Educational Resources Information Center

    Zhou, Lina; Zhang, Dongsong

    2003-01-01

    Proposes a theoretical framework called NLPIR that integrates natural language processing (NLP) into information retrieval (IR) based on the assumption that there exists representation distance between queries and documents. Discusses problems in traditional keyword-based IR, including relevance, and describes some existing NLP techniques.…

  10. Effectiveness and Efficiency in Natural Language Processing for Large Amounts of Text.

    ERIC Educational Resources Information Center

    Ruge, Gerda; And Others

    1991-01-01

    Describes a system that was developed in Germany for natural language processing (NLP) to improve free text analysis for information retrieval. Techniques from empirical linguistics are discussed, system architecture is explained, and rules for dealing with conjunctions in dependency analysis for free text processing are proposed. (13 references)…

  11. A Sublanguage Approach to Natural Language Processing for an Expert System.

    ERIC Educational Resources Information Center

    Liddy, Elizabeth D.; And Others

    1993-01-01

    Reports on the development of an NLP (natural language processing) component for processing the free-text comments on life insurance applications for evaluation by an underwriting expert system. A sublanguage grammar approach with strong reliance on semantic word classes is described. Highlights include lexical analysis, adjacency analysis, and…

  12. Generating Symbolic and Natural Language Partial Solutions for Inclusion in Medical Plans

    Microsoft Academic Search

    Sanjay Modgil; Peter Hammond

    2001-01-01

    We describe the generation of partial solutions to Prolog queries posed during the design of medical treatment plans. Given a set of Prolog encoded safety principles, the queries request advice on plan revisions to conform with safety requirements. The user unfolds queries interactively, navigating a path through the solution search space by interacting with natural language representations of the

  13. An Approach to Detecting Duplicate Bug Reports using Natural Language and Execution Information

    E-print Network

    Xie, Tao

    An Approach to Detecting Duplicate Bug Reports using Natural Language and Execution Information source project typically maintains an open bug repository so that bug reports from all over the world can be gathered. When a new bug report is submitted to the repository, a person, called a triager

  14. Distinguishing Natural Language Processes on the Basis of fMRI-Measured Brain Activation

    E-print Network

    in sentence processing, the Left Inferior Frontal Gyrus (LIFG), also known as Broca's area, and the Left Distinguishing Natural Language Processes on the Basis of fMRI-Measured Brain Activation Francisco of the underlying brain activation measured with fMRI. The method uses a classifier to learn to distinguish between

  15. The Development of a Natural Language Generation System For Personalized e-Health Information

    E-print Network

    DiMarco, Chrysanne

    Director, My CARE Source Patient Portal, Grand River Hospital, personal communication). An effective means The Development of a Natural Language Generation System For Personalized e-Health Information C. Di a David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada b

  16. Naturally-Occurring Comprehension Strategies Instruction in 9th-Grade Language Arts Classrooms

    ERIC Educational Resources Information Center

    Anmarkrud, Oistein; Braten, Ivar

    2012-01-01

    In this descriptive classroom study, we used video-based observations supplemented with teacher interviews to provide precise information about the instruction of comprehension strategies that naturally occurred in 4 Norwegian lower-secondary language arts classrooms while students worked with expository texts. The results showed that the teachers…

  17. Self-Regulated Learning in Learning Environments With Pedagogical Agents That Interact in Natural Language

    Microsoft Academic Search

    Arthur Graesser; Danielle McNamara

    2010-01-01

    This article discusses the occurrence and measurement of self-regulated learning (SRL) both in human tutoring and in computer tutors with agents that hold conversations with students in natural language and help them learn at deeper levels. One challenge in building these computer tutors is to accommodate, encourage, and scaffold SRL because these skills are not adequately developed for most students.

  18. AutoTutor and Family: A Review of 17 Years of Natural Language Tutoring

    ERIC Educational Resources Information Center

    Nye, Benjamin D.; Graesser, Arthur C.; Hu, Xiangen

    2014-01-01

    AutoTutor is a natural language tutoring system that has produced learning gains across multiple domains (e.g., computer literacy, physics, critical thinking). In this paper, we review the development, key research findings, and systems that have evolved from AutoTutor. First, the rationale for developing AutoTutor is outlined and the advantages…

  19. AI agents combining natural language interaction, task planning, and business ontologies can help

    E-print Network

    Fox, Mark S.

    AI agents combining natural language interaction, task planning, and business ontologies can help a new customer than to keep an existing one. How can AI help in addressing this problem? For several years we have built a domain-independent AI platform for creating conversational customer

  20. An Evaluation of Help Mechanisms in Natural Language Information Retrieval Systems.

    ERIC Educational Resources Information Center

    Kreymer, Oleg

    2002-01-01

    Evaluates the current state of natural language processing information retrieval systems from the user's point of view, focusing on the structure and components of the systems' help mechanisms. Topics include user/system interaction; semantic parsing; syntactic parsing; semantic mapping; and concept matching. (Author/LRW)

  1. Generalized Probabilistic LR Parsing of Natural Language (Corpora) with Unification-Based Grammars

    Microsoft Academic Search

    Ted Briscoe; John Carroll

    1993-01-01

    We describe work toward the construction of a very wide-coverage probabilistic parsing system for natural language (NL), based on LR parsing techniques. The system is intended to rank the large number of syntactic analyses produced by NL grammars according to the frequency of occurrence of the individual rules deployed in each analysis. We discuss a fully automatic procedure for constructing

  2. A Lisp-Language Mathematica-to-Lisp Translator

    E-print Network

    Fateman, Richard J.

    A Lisp-Language Mathematica-to-Lisp Translator Richard J. Fateman University of California basically as described in the Mathematica references. We describe a parser written in Common Lisp-precision integers, and tools for lexical scanning of languages and (3) Lisp is the host language for several

  3. Software Development Of XML Parser Based On Algebraic Tools

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2011-12-01

    This paper presents the development and implementation of software based on an algebraic method for XML data processing, which accelerates XML parsing. The nontraditional approach proposed here for fast XML navigation with algebraic tools contributes to ongoing efforts to build an easier, user-friendly API for XML transformations. The proposed software for processing XML documents (a parser) is easy to use and can manage files with a strictly defined data structure. The purpose of the presented algorithm is to offer a new approach for searching and restructuring hierarchical XML data. This approach permits fast processing of XML documents, using an algebraic model developed in detail in previous works by the same authors. The proposed parsing mechanism is easily accessible to the web consumer, who can control XML file processing, search for different elements (tags) in it, and delete or add XML content. The tests presented show higher speed and lower resource consumption in comparison with some existing commercial parsers.

  4. A prototype natural language interface to a large complex knowledge base, the Foundational Model of Anatomy.

    PubMed

    Distelhorst, Gregory; Srivastava, Vishrut; Rosse, Cornelius; Brinkley, James F

    2003-01-01

    We describe a constrained natural language interface to a large knowledge base, the Foundational Model of Anatomy (FMA). The interface, called GAPP, handles simple or nested questions that can be parsed to the form, subject-relation-object, where subject or object is unknown. With the aid of domain-specific dictionaries the parsed sentence is converted to queries in the StruQL graph-searching query language, then sent to a server we developed, called OQAFMA, that queries the FMA and returns output as XML. Preliminary evaluation shows that GAPP has the potential to be used in the evaluation of the FMA by domain experts in anatomy. PMID:14728162

  5. Psychological aspects of natural language use: our words, our selves.

    PubMed

    Pennebaker, James W; Mehl, Matthias R; Niederhoffer, Kate G

    2003-01-01

    The words people use in their daily lives can reveal important aspects of their social and psychological worlds. With advances in computer technology, text analysis allows researchers to reliably and quickly assess features of what people say as well as subtleties in their linguistic styles. Following a brief review of several text analysis programs, we summarize some of the evidence that links natural word use to personality, social and situational fluctuations, and psychological interventions. Of particular interest are findings that point to the psychological value of studying particles: parts of speech that include pronouns, articles, prepositions, conjunctives, and auxiliary verbs. Particles, which serve as the glue that holds nouns and regular verbs together, can serve as markers of emotional state, social identity, and cognitive styles. PMID:12185209

  6. Processing of ICARTT Data Files Using Fuzzy Matching and Parser Combinators

    NASA Technical Reports Server (NTRS)

    Rutherford, Matthew T.; Typanski, Nathan D.; Wang, Dali; Chen, Gao

    2014-01-01

    In this paper, the task of parsing and matching inconsistent, poorly formed text data through the use of parser combinators and fuzzy matching is discussed. An object-oriented implementation of the parser combinator technique is used to allow for a relatively simple interface for adapting base parsers. For matching tasks, a fuzzy matching algorithm with Levenshtein distance calculations is implemented to match string pairs, which are otherwise difficult to match due to the aforementioned irregularities and errors in one or both pair members. Used in concert, the two techniques allow parsing and matching operations to be performed which had previously only been done manually.
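
    As an illustration of the fuzzy matching step described in this record, the following is a minimal Python sketch of Levenshtein distance used to pick the closest known name for a misspelled one. It is not the authors' implementation: the function names, the `max_distance` threshold, and the sample variable names are assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def best_match(name, candidates, max_distance=2):
    """Return the closest candidate within max_distance (hypothetical cutoff), else None."""
    if not candidates:
        return None
    distance, candidate = min(
        (levenshtein(name.lower(), c.lower()), c) for c in candidates
    )
    return candidate if distance <= max_distance else None


# Hypothetical header names; a misspelled "Presure" is one edit from "Pressure".
print(best_match("Presure", ["Pressure", "Temperature"]))
```

    The threshold trades recall for precision: a larger `max_distance` tolerates more damaged strings but raises the chance of pairing unrelated names.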

  7. A Natural Language for AdS/CFT Correlators

    SciTech Connect

    Fitzpatrick, A.Liam; /Boston U.; Kaplan, Jared; /SLAC; Penedones, Joao; /Perimeter Inst. Theor. Phys.; Raju, Suvrat; /Harish-Chandra Res. Inst.; van Rees, Balt C.; /YITP, Stony Brook

    2012-02-14

    We provide dramatic evidence that 'Mellin space' is the natural home for correlation functions in CFTs with weakly coupled bulk duals. In Mellin space, CFT correlators have poles corresponding to an OPE decomposition into 'left' and 'right' sub-correlators, in direct analogy with the factorization channels of scattering amplitudes. In the regime where these correlators can be computed by tree level Witten diagrams in AdS, we derive an explicit formula for the residues of Mellin amplitudes at the corresponding factorization poles, and we use the conformal Casimir to show that these amplitudes obey algebraic finite difference equations. By analyzing the recursive structure of our factorization formula we obtain simple diagrammatic rules for the construction of Mellin amplitudes corresponding to tree-level Witten diagrams in any bulk scalar theory. We prove the diagrammatic rules using our finite difference equations. Finally, we show that our factorization formula and our diagrammatic rules morph into the flat space S-Matrix of the bulk theory, reproducing the usual Feynman rules, when we take the flat space limit of AdS/CFT. Throughout we emphasize a deep analogy with the properties of flat space scattering amplitudes in momentum space, which suggests that the Mellin amplitude may provide a holographic definition of the flat space S-Matrix.

  8. The Nature of the Language Faculty and Its Implications for Evolution of Language (Reply to Fitch, Hauser, and Chomsky)

    ERIC Educational Resources Information Center

    Jackendoff, Ray; Pinker, Steven

    2005-01-01

    In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the ''narrow language faculty'') consists only of recursion, and that this part cannot be considered an adaptation to communication. We…

  9. The nature of the language faculty and its implications for evolution of language (Reply to Fitch, Hauser, and Chomsky)

    Microsoft Academic Search

    Ray Jackendoff; Steven Pinker

    2005-01-01

    Abstract In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the “narrow language faculty”) consists only of recursion, and that this part cannot be considered an adaptation to communication. We argue that their characterization of the narrow

  10. Naturalism and Ideological Work: How Is Family Language Policy Renegotiated as Both Parents and Children Learn a Threatened Minority Language?

    ERIC Educational Resources Information Center

    Armstrong, Timothy Currie

    2014-01-01

    Parents who enroll their children to be educated through a threatened minority language frequently do not speak that language themselves and classes in the language are sometimes offered to parents in the expectation that this will help them to support their children's education and to use the minority language in the home. Providing…

  11. Dependency Parser-based Negation Detection in Clinical Narratives

    PubMed Central

    Sohn, Sunghwan; Wu, Stephen; Chute, Christopher G.

    2012-01-01

    Negation of clinical named entities is common in clinical documents and is a crucial factor in accurately compiling patients’ clinical conditions and in further supporting complex phenotype detection. In 2009, Mayo Clinic released the clinical Text Analysis and Knowledge Extraction System (cTAKES), which includes a negation annotator that identifies the negation status of a named entity by searching for negation words within a fixed word distance. However, this negation strategy is not sophisticated enough to correctly identify complicated patterns of negation. This paper aims to investigate whether the dependency structure from the cTAKES dependency parser can improve negation detection performance. Manually compiled negation rules, derived from dependency paths, were tested. Dependency negation rules do not limit the negation scope to word distance; instead, they are based on syntactic context. We found that dependency-based negation proved a superior alternative to the current cTAKES negation annotator. PMID:22779038
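
    The contrast drawn in this record, a fixed word-distance window versus syntactic context, can be sketched in Python as follows. This is a toy illustration, not cTAKES code or the paper's rule set: the negation word list, the window size, the hand-written head indices, and the single "negation word attached to the entity or one of its ancestors" rule are all assumptions.

```python
NEGATION_WORDS = {"no", "not", "denies", "without"}  # illustrative list only


def negated_by_window(tokens, entity_index, window=3):
    """Fixed-distance strategy: look for a negation word in the preceding tokens."""
    start = max(0, entity_index - window)
    return any(t.lower() in NEGATION_WORDS for t in tokens[start:entity_index])


def negated_by_dependency(tokens, heads, entity_index):
    """Crude dependency rule (assumed): a negation word is a direct dependent
    of the entity or of any of its ancestors in the tree."""
    node, seen = entity_index, set()
    while node != -1 and node not in seen:
        seen.add(node)
        for child, head in enumerate(heads):
            if head == node and tokens[child].lower() in NEGATION_WORDS:
                return True
        node = heads[node]  # climb one step toward the root (-1)
    return False


# Hand-annotated example; heads[i] is the index of token i's head, -1 for root.
tokens = ["No", "evidence", "of", "acute", "infiltrate", "effusion", "or", "pneumothorax"]
heads = [1, -1, 4, 4, 1, 4, 7, 4]

print(negated_by_window(tokens, 7))             # "No" lies outside the 3-token window
print(negated_by_dependency(tokens, heads, 7))  # "No" attaches to an ancestor
```

    The dependency rule here is deliberately over-permissive; a realistic rule set would restrict which dependency paths carry negation.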

  12. Using Open Geographic Data to Generate Natural Language Descriptions for Hydrological Sensor Networks.

    PubMed

    Molina, Martin; Sanchez-Soriano, Javier; Corcho, Oscar

    2015-01-01

    Providing descriptions of isolated sensors and sensor networks in natural language, understandable by the general public, is useful to help users find relevant sensors and analyze sensor data. In this paper, we discuss the feasibility of using geographic knowledge from public databases available on the Web (such as OpenStreetMap, Geonames, or DBpedia) to automatically construct such descriptions. We present a general method that uses such information to generate sensor descriptions in natural language. The results of the evaluation of our method in a hydrologic national sensor network showed that this approach is feasible and capable of generating adequate sensor descriptions with a lower development effort compared to other approaches. In the paper we also analyze certain problems that we found in public databases (e.g., heterogeneity, non-standard use of labels, or rigid search methods) and their impact in the generation of sensor descriptions. PMID:26151211

  13. Integrating existing natural language processing tools for medication extraction from discharge summaries

    Microsoft Academic Search

    Son Doan; Lisa Bastarache; Sergio Klimkowski; Joshua C. Denny; Hua Xu

    2010-01-01

    Objective To develop an automated system to extract medications and related information from discharge summaries as part of the 2009 i2b2 natural language processing (NLP) challenge. This task required accurate recognition of medication name, dosage, mode, frequency, duration, and reason for drug administration. Design We developed an integrated system using several existing NLP components developed at Vanderbilt University Medical Center, which included MedEx

  14. Using Shallow Natural Language Processing in a Just-In-Time Information Retrieval Assistant for Bloggers

    Microsoft Academic Search

    Ang Gao; Derek G. Bridge

    2009-01-01

    Just-In-Time Information Retrieval agents proactively retrieve information based on queries that are implicit in, and formulated from, the user’s current context, such as the blogpost she is writing. This paper compares five heuristics by which queries can be extracted from a user’s blogpost or other document. Four of the heuristics use shallow Natural Language Processing techniques, such as tagging and

  15. A Tool for Extension and Restructuring Natural Language Question Answering Domains

    Microsoft Academic Search

    Boris Galitsky

    2002-01-01

    In this report, we present the system that allows various forms of knowledge exchange for users of the natural language question answering system. The tool is also capable of performing the domain restructuring by domain experts to adjust it to a particular audience of customers. The tool is implemented for financial and legal advisors, where the information is extremely dynamic

  16. Symbolic Languages and Natural Structures a Mathematician’s Account of Empiricism

    Microsoft Academic Search

    Hermann G. W. Burchard

    2005-01-01

    The ancient dualism of a sensible and an intelligible world important in Neoplatonic and medieval philosophy, down to Descartes and Kant, would seem to be supplanted today by a scientific view of mind-in-nature. Here, we revive the old dualism in a modified form, and describe mind as a symbolic language, founded in linguistic recursive computation according to the Church-Turing thesis,

  17. The Use of Natural Language as an Intuitive Semantic Integration System Interface

    Microsoft Academic Search

    Stanisław Kozielski; Michał Świderski; Małgorzata Bach

    This paper describes the need for intuitive interfaces to complex systems that take their origin from the concepts of Semantic Web. Then it shows how Semantic Integration System HILLS can benefit from being merged with Pseudo Natural Language layer of Metalog system. The cooperation of these two systems is not perfect though - second part of the paper shows guidelines

  18. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 655–665, Jeju Island, Korea, 12–14 July 2012. © 2012 Association for Computational Linguistics

    E-print Network

    and Computational Natural Language Learning, pages 655–665, Jeju Island, Korea, 12–14 July 2012. © 2012 Association similarity. Our interest in graph algorithms is two-fold. First, graph-based domain representations

  19. De-Centering English: Highlighting the Dynamic Nature of the English Language to Promote the Teaching of Code-Switching

    ERIC Educational Resources Information Center

    White, John W.

    2011-01-01

    Embracing the dynamic nature of English language can help students learn more about all forms of English. To fully engage students, teachers should not adhere to an anachronistic and static view of English. Instead, they must acknowledge, accept, and even use different language forms within the classroom to make that classroom dynamic, inclusive,…

  20. Computerized measurement of the content analysis of natural language for use in biomedical and neuropsychiatric research.

    PubMed

    Gottschalk, L A; Bechtel, R

    1995-07-01

    Over several decades, the senior author, with various colleagues, has developed an objective method of measuring the magnitude of commonly useful and pertinent neuropsychiatric and neuropsychological dimensions from the content and form analysis of verbal behavior and natural language. Extensive reliability and validation studies using this method have been published involving English, German, Spanish and many other languages, and which confirm that these Content Analysis Scales can be reliably scored cross-culturally and have construct validity. The validated measures include the Anxiety Scale (and six subscales), the Hostility Outward Scale (and two subscales), the Hostility In Scale, the Ambivalent Hostility Scale, the Social Alienation-Personal Disorganization Scale, the Cognitive Impairment Scale, the Depression Scale (and seven subscales), and the Hope Scale. Here, the authors report the development of artificial intelligence (LISP based) software that can reliably score these Content Analysis Scales, whose achievement facilitates the application of these measures to biomedical and neuropsychiatric research. PMID:7587159

  1. Morphological Parsing of Tone: An Experiment with Two-Level Morphology on the Ha Language

    Microsoft Academic Search

    Lotta Harjula

    2005-01-01

    Morphological parsers are typically developed for languages without contrastive tonal systems. Ha, a typical Bantu language of Western Tanzania, poses a challenge to these parsers with both lexical and grammatical pitch-accent that would, in order to describe the tonal phenomena, seem to require an approach with a separate level for the tones. However, since the Two-Level Morphology (Koskenniemi 1983) has

  2. Evaluation of unsupervised semantic mapping of natural language with Leximancer concept mapping.

    PubMed

    Smith, Andrew E; Humphreys, Michael S

    2006-05-01

    The Leximancer system is a relatively new method for transforming lexical co-occurrence information from natural language into semantic patterns in an unsupervised manner. It employs two stages of co-occurrence information extraction, semantic and relational, using a different algorithm for each stage. The algorithms used are statistical, but they employ nonlinear dynamics and machine learning. This article is an attempt to validate the output of Leximancer, using a set of evaluation criteria taken from content analysis that are appropriate for knowledge discovery tasks. PMID:16956103

  3. Practical systems use natural languages and store human expertise (artificial intelligence)

    SciTech Connect

    Evanczuk, S.; Manuel, T.

    1983-12-01

    For earlier articles see T. Manuel et al., ibid., vol.56, no.22, p.127-37. This second part of a special report on commercial applications of artificial intelligence examines the milestones which mark this major new path for the software industry. It covers state-space search, the problem of ambiguity, augmented transition networks, early commercial products, current and expected personal computer software, natural-language interfaces, research projects, knowledge engineering, the workings of artificial-intelligence-based applications programs, LISP, attributes and object orientation.

  4. On the Simultaneous Interpretation of Real World Image Sequences and their Natural Language Description: The System Soccer

    Microsoft Academic Search

    Elisabeth André; Gerd Herzog; Thomas Rist

    1988-01-01

    The aim of previous attempts at connecting vision systems and natural language systems has been to provide a retrospective description of the analysed image sequence. The step from such an a posteriori approach towards simultaneous natural language description reveals a problem which has not yet been dealt with in generation systems. Automatic generation of simultaneous descriptions calls for

  5. Natural Language Processing and Machine Translation Encyclopedia of Language and Linguistics, 2nd ed. (ELL2). Machine Translation: Interlingual Methods

    Microsoft Academic Search

    Bonnie J. Dorr; Eduard H. Hovy; Lori S. Levin

    An interlingua is a notation for representing the content of a text that abstracts away from the characteristics of the language itself and focuses on the meaning (semantics) alone. Interlinguas are typically used as pivot representations in machine translation, allowing the contents of a source text to be generated in many different target languages. Due to the complexities involved, few

  6. Language of the Earth: Exploring Natural Hazards through a Literary Anthology

    NASA Astrophysics Data System (ADS)

    Malamud, B. D.; Rhodes, F. H. T.

    2009-04-01

    This paper explores natural hazards teaching and communications through the use of a literary anthology of writings about the earth aimed at non-experts. Teaching natural hazards in high-school and university introductory Earth Science and Geography courses revolves mostly around lectures, examinations, and laboratory demonstrations/activities. Often the results of such a course are that a student 'memorizes' the answers, and is penalized when they miss a given fact [e.g., "You lost one point because you were off by 50 km/hr on the wind speed of an F5 tornado."] Although facts and general methodologies are certainly important when teaching natural hazards, a student's assimilation of, and enthusiasm for, this knowledge is strongly motivated when the teaching is supplemented by writings about the Earth. In this paper, we discuss a literary anthology which we developed [Language of the Earth, Rhodes, Stone, Malamud, Wiley-Blackwell, 2008] which includes many descriptions about natural hazards. Using first- and second-hand accounts of landslides, earthquakes, tsunamis, floods and volcanic eruptions, through the writings of McPhee, Gaskill, Voltaire, Austin, Cloos, and many others, hazards become 'alive', and more than 'just' a compilation of facts and processes. Using short excerpts such as these, or other similar anthologies, of remarkably written accounts and discussions about natural hazards results in 'dry' facts becoming more than just facts. These often highly personal viewpoints of our catastrophic world provide a useful supplement to a student's understanding of the turbulent world in which we live.

  7. a straightforward rewriting: whenever a righthand side matches a subgraph, replace it (destructively) with the lefthand side. Bamji's parser does not try to obtain all possible

    E-print Network

    Wills, Linda Mary

    it (destructively) with the left-hand side. Bamji's parser does not try to obtain all possible parses, just one there has been more interest in developing graph parsers. Bamji [8, 9] developed a special case of a chart parser for graphs equivalent to Lutz's flow graphs. The interesting aspect of Bamji's graph grammar

  8. Knowledge-based machine indexing from natural language text: Knowledge base design, development, and maintenance

    NASA Technical Reports Server (NTRS)

    Genuardi, Michael T.

    1993-01-01

    One strategy for machine-aided indexing (MAI) is to provide a concept-level analysis of the textual elements of documents or document abstracts. In such systems, natural-language phrases are analyzed in order to identify and classify concepts related to a particular subject domain. The overall performance of these MAI systems is largely dependent on the quality and comprehensiveness of their knowledge bases. These knowledge bases function to (1) define the relations between a controlled indexing vocabulary and natural language expressions; (2) provide a simple mechanism for disambiguation and the determination of relevancy; and (3) allow the extension of concept-hierarchical structure to all elements of the knowledge file. After a brief description of the NASA Machine-Aided Indexing system, concerns related to the development and maintenance of MAI knowledge bases are discussed. Particular emphasis is given to statistically-based text analysis tools designed to aid the knowledge base developer. One such tool, the Knowledge Base Building (KBB) program, presents the domain expert with a well-filtered list of synonyms and conceptually-related phrases for each thesaurus concept. Another tool, the Knowledge Base Maintenance (KBM) program, functions to identify areas of the knowledge base affected by changes in the conceptual domain (for example, the addition of a new thesaurus term). An alternate use of the KBM as an aid in thesaurus construction is also discussed.

  9. Gesture language use in natural UI: pen-based sketching in conceptual design

    NASA Astrophysics Data System (ADS)

    Ma, Cuixia; Dai, Guozhong

    2003-04-01

    The natural user interface is an important form of next-generation interaction. Computers are no longer tools only for specialists or particular fields but for most people. Ubiquitous computing makes the world more magical and more comfortable. In the design domain, current systems, which require detailed information, cannot conveniently support conceptual design in the early phase. Pen and paper are natural and simple tools in daily life, especially in the design domain. Gestures are a useful and natural mode of pen-based interaction. In a natural UI, gestures can be introduced and used in a mode similar to the existing interaction resources. But gestures are always defined beforehand without regard to the users' intention, and are recognized to represent something in certain applications without being transferable to others. We provide the gesture description language (GDL) to make useful gestures conveniently available to applications. It can be used as an independent control resource, like menus or icons in applications. We present the idea from two perspectives: the application-dependent point of view and the application-independent point of view.

  10. Voice-Dictated versus Typed-in Clinician Notes: Linguistic Properties and the Potential Implications on Natural Language Processing

    PubMed Central

    Zheng, Kai; Mei, Qiaozhu; Yang, Lei; Manion, Frank J.; Balis, Ulysses J.; Hanauer, David A.

    2011-01-01

    In this study, we comparatively examined the linguistic properties of narrative clinician notes created through voice dictation versus those directly entered by clinicians via a computer keyboard. Intuitively, the nature of voice-dictated notes would resemble that of natural language, while typed-in notes may demonstrate distinctive language features for reasons such as intensive usage of acronyms. The study analyses were based on an empirical dataset retrieved from our institutional electronic health records system. The dataset contains 30,000 voice-dictated notes and 30,000 notes that were entered manually; both were encounter notes generated in ambulatory care settings. The results suggest that between the narrative clinician notes created via these two different methods, there exists a considerable amount of lexical and distributional differences. Such differences could have a significant impact on the performance of natural language processing tools, necessitating these two different types of documents being differentially treated. PMID:22195229

  11. Machine Learning and the Cognitive Basis of Natural Language Shalom Lappin

    E-print Network

    Lappin, Shalom

    can be achieved through general learning procedures, and that a richly articulated language faculty language faculty may be sufficient to support language acquisition and interpretation. In Section 2 I briefly language faculty. Section 3 reviews major developments in the application

  12. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1391–1395, October 25-29, 2014, Doha, Qatar. © 2014 Association for Computational Linguistics

    E-print Network

    Languages spoken by immigrants change due to contact with the local languages. Capturing these changes Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP Predicting Dialect Variation in Immigrant Contexts Using Light Verb Constructions A. Seza Dogru

  13. Disclosure Control of Natural Language Information to Enable Secure and Enjoyable Communication over the Internet

    NASA Astrophysics Data System (ADS)

    Kataoka, Haruno; Utsumi, Akira; Hirose, Yuki; Yoshiura, Hiroshi

    Disclosure control of natural language information (DCNL), which we are trying to realize, is described. DCNL will be used for securing human communications over the internet, such as through blogs and social network services. Before sentences in the communications are disclosed, they are checked by DCNL and any phrases that could reveal sensitive information are transformed or omitted so that they are no longer revealing. DCNL checks not only phrases that directly represent sensitive information but also those that indirectly suggest it. Combinations of phrases are also checked. DCNL automatically learns the knowledge of sensitive phrases and the suggestive relations between phrases by using co-occurrence analysis and Web retrieval. The users' burden is therefore minimized, i.e., they do not need to define many disclosure control rules. DCNL complements traditional access control in fields where reliability needs to be balanced with enjoyment and object classes for the access control cannot be predefined.

  14. Adapting a natural language processing tool to facilitate clinical trial curation for personalized cancer therapy.

    PubMed

    Zeng, Jia; Wu, Yonghui; Bailey, Ann; Johnson, Amber; Holla, Vijaykumar; Bernstam, Elmer V; Xu, Hua; Meric-Bernstam, Funda

    2014-01-01

    The design of personalized cancer therapy based upon patients' molecular profile requires an enormous amount of effort to review, analyze and integrate molecular, pharmacological, clinical and patient-specific information. The vast size, rapid expansion and non-standardized formats of the relevant information sources make it difficult for oncologists to gather pertinent information that can support routine personalized treatment. In this paper, we introduce informatics tools that assist the retrieval and curation of cancer-related clinical trials involving targeted therapies. Particularly, we adapted and extended an existing natural language processing tool, and explored its applicability in facilitating our annotation efforts. The system was evaluated using a gold standard of 539 curated clinical trials, demonstrating promising performance and good generalizability (81% accuracy in predicting genotype-selected trials and an average recall of 0.85 in predicting specific selection criteria). PMID:25717412

  15. Question Answering in a Natural Language Understanding System Based on Object-Oriented Semantics

    E-print Network

    Ostapov, Yuriy

    2011-01-01

    Algorithms for question answering in a computer system oriented toward input and logical processing of text information are presented. The knowledge domain under consideration is the social behavior of a person. The database of the system includes an internal representation of natural language sentences and supplemental information. The answer Yes or No is formed for a general question. A special question containing an interrogative word or group of interrogative words makes it possible to find a subject, object, place, time, cause, purpose and way of action or event. Answer generation is based on identification algorithms for persons, organizations, machines, things, places, and times. The proposed question answering algorithms can be realized in information systems closely connected with text processing (criminology, operation of business, medicine, document systems).

  16. Epidemiology of angina pectoris: role of natural language processing of the medical record

    PubMed Central

    Pakhomov, Serguei; Hemingway, Harry; Weston, Susan A.; Jacobsen, Steven J.; Rodeheffer, Richard; Roger, Véronique L.

    2007-01-01

    Background The diagnosis of angina is challenging as it relies on symptom descriptions. Natural language processing (NLP) of the electronic medical record (EMR) can provide access to such information contained in free text that may not be fully captured by conventional diagnostic coding. Objective To test the hypothesis that NLP of the EMR improves angina pectoris (AP) ascertainment over diagnostic codes. Methods Billing records of in- and out-patients were searched for ICD-9 codes for AP, chronic ischemic heart disease and chest pain. EMR clinical reports were searched electronically for 50 specific non-negated natural language synonyms to these ICD-9 codes. The two methods were compared to a standardized assessment of angina by Rose questionnaire for three diagnostic levels: unspecified chest pain, exertional chest pain, and Rose angina. Results Compared to the Rose questionnaire, the true positive rate of EMR-NLP for unspecified chest pain was 62% (95%CI:55–67) vs. 51% (95%CI:44–58) for diagnostic codes (p<0.001). For exertional chest pain, the EMR-NLP true positive rate was 71% (95%CI:61–80) vs. 62% (95%CI:52–73) for diagnostic codes (p=0.10). Both approaches had 88% (95%CI:65–100) true positive rate for Rose angina. The EMR-NLP method consistently identified more patients with exertional chest pain over 28-month follow-up. Conclusion EMR-NLP method improves the detection of unspecified and exertional chest pain cases compared to diagnostic codes. These findings have implications for epidemiological and clinical studies of angina pectoris. PMID:17383310

  17. LABORATORY PROCESS CONTROLLER USING NATURAL LANGUAGE COMMANDS FROM A PERSONAL COMPUTER

    NASA Technical Reports Server (NTRS)

    Will, H.

    1994-01-01

    The complex environment of the typical research laboratory requires flexible process control. This program provides natural language process control from an IBM PC or compatible machine. Sometimes process control schedules require changes frequently, even several times per day. These changes may include adding, deleting, and rearranging steps in a process. This program sets up a process control system that can either run without an operator, or be run by workers with limited programming skills. The software system includes three programs. Two of the programs, written in FORTRAN77, record data and control research processes. The third program, written in Pascal, generates the FORTRAN subroutines used by the other two programs to identify the user commands with the user-written device drivers. The software system also includes an input data set which allows the user to define the user commands which are to be executed by the computer. To set the system up the operator writes device driver routines for all of the controlled devices. Once set up, this system requires only an input file containing natural language command lines which tell the system what to do and when to do it. The operator can make up custom commands for operating and taking data from external research equipment at any time of the day or night without the operator in attendance. This process control system requires a personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. The program requires a FORTRAN77 compiler and user-written device drivers. This program was developed in 1989 and has a memory requirement of about 62 Kbytes.

  18. Integrating natural language processing and web GIS for interactive knowledge domain visualization

    NASA Astrophysics Data System (ADS)

    Du, Fangming

    Recent years have seen a powerful shift towards data-rich environments throughout society. This has extended to a change in how the artifacts and products of scientific knowledge production can be analyzed and understood. Bottom-up approaches are on the rise that combine access to huge amounts of academic publications with advanced computer graphics and data processing tools, including natural language processing. Knowledge domain visualization is one of those multi-technology approaches, with its aim of turning domain-specific human knowledge into highly visual representations in order to better understand the structure and evolution of domain knowledge. For example, network visualizations built from co-author relations contained in academic publications can provide insight on how scholars collaborate with each other in one or multiple domains, and visualizations built from the text content of articles can help us understand the topical structure of knowledge domains. These knowledge domain visualizations need to support interactive viewing and exploration by users. Such spatialization efforts are increasingly looking to geography and GIS as a source of metaphors and practical technology solutions, even when non-georeferenced information is managed, analyzed, and visualized. When it comes to deploying spatialized representations online, web mapping and web GIS can provide practical technology solutions for interactive viewing of knowledge domain visualizations, from panning and zooming to the overlay of additional information. This thesis presents a novel combination of advanced natural language processing - in the form of topic modeling - with dimensionality reduction through self-organizing maps and the deployment of web mapping/GIS technology towards intuitive, GIS-like, exploration of a knowledge domain visualization. 
    A complete workflow is proposed and implemented that processes any corpus of input text documents into a map form and leverages a web application framework to let users explore knowledge domain maps interactively. This workflow is implemented and demonstrated for a data set of more than 66,000 conference abstracts.

  19. Ambiguity Detection for Programming Language Grammars

    Microsoft Academic Search

    H. J. S. Basten

    2011-01-01

    Context-free grammars are the most suitable and most widely used method for describing the syntax of programming languages. They can be used to generate parsers, which transform a piece of source code into a tree-shaped representation of the code's syntactic structure. These parse trees can then be used for further processing or analysis of the source text. In this sense,

  20. The YACC-compatible Parser Generator December 1992, Bison Version 1.20

    E-print Network

    Miller, Jeffrey A.

    Bison The YACC-compatible Parser Generator December 1992, Bison Version 1.20 by Charles Donnelly, provided also that the sections entitled ``GNU General Public License'' and ``Conditions for Using Bison versions, except that the sections entitled ``GNU General Public License'', ``Conditions for Using Bison

  1. Testing Grammars For Top-Down Parsers A.M. Paracha and F. Franek

    E-print Network

    Franek, Frantisek

    Testing Grammars For Top-Down Parsers A.M. Paracha and F. Franek Dept. of Computing and Software Mc processing tools, software modification tools, and software analysis tools. Testing a grammar to make sure's algorithm to produce test data automatically for testing the MACS 1 grammar ( an LL(1) grammar

  2. Reachability Analysis of the HTML5 Parser Specification and its Application to

    E-print Network

    Minamide, Yasuhiko

    Reachability Analysis of the HTML5 Parser Specification and its Application to Compatibility for HTML, HTML5, includes the detailed specification of the parsing algorithm for HTML5 documents, includ of HTML5 and automatically generate HTML documents to test compatibilities of Web browsers. The set

  3. Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus

    PubMed Central

    Comeau, Donald C.; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W. John

    2014-01-01

    BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net PMID:24935050

  4. The Usual and the Unusual: Solving Remote Associates Test Tasks Using Simple Statistical Natural Language Processing Based on Language Use

    ERIC Educational Resources Information Center

    Klein, Ariel; Badia, Toni

    2015-01-01

    In this study we show how complex creative relations can arise from fairly frequent semantic relations observed in everyday language. By doing this, we reflect on some key cognitive aspects of linguistic and general creativity. In our experimentation, we automated the process of solving a battery of Remote Associates Test tasks. By applying…

  5. Research in knowledge representation for natural language communication and planning assistance. Final report, 18 March 1985-30 September 1988

    Microsoft Academic Search

    B. A. Goodman; B. Grosz; A. Haas; D. Litman; T. Reinhardt

    1988-01-01

    BBN's DARPA project in Knowledge Representation for Natural Language Communication and Planning Assistance has two primary objectives: 1) To perform research on aspects of the interaction between users who are making complex decisions and systems that are assisting them with their task. In particular, this research is focused on communication and the reasoning required for performing its underlying task of

  6. Coding Neuroradiology Reports for the Northern Manhattan Stroke Study: A Comparison of Natural Language Processing and Manual Review

    Microsoft Academic Search

    Jacob S. Elkins; Carol Friedman; Bernadette Boden-Albala; Ralph L. Sacco; George Hripcsak

    2000-01-01

    Automated systems using natural language processing may greatly speed chart review tasks for clinical research, but their accuracy in this setting is unknown. The objective of this study was to compare the accuracy of automated and manual coding in the data acquisition tasks of an ongoing clinical research study, the Northern Manhattan Stroke Study (NOMASS). We identified 471 neuroradiology reports of

  7. Does It Really Matter whether Students' Contributions Are Spoken versus Typed in an Intelligent Tutoring System with Natural Language?

    ERIC Educational Resources Information Center

    D'Mello, Sidney K.; Dowell, Nia; Graesser, Arthur

    2011-01-01

    There is the question of whether learning differs when students speak versus type their responses when interacting with intelligent tutoring systems with natural language dialogues. Theoretical bases exist for three contrasting hypotheses. The "speech facilitation" hypothesis predicts that spoken input will "increase" learning, whereas the "text…

  8. Proceedings of the Fourth International Natural Language Generation Conference, pages 89-91, Sydney, July 2006. © 2006 Association for Computational Linguistics

    E-print Network

    Ritchie, Graeme

    Science University of Aberdeen Aberdeen AB24 3UE, U.K. {ikhan,gritchie,kvdeemte}@csd.abdn.ac.uk Abstract by a University of Aberdeen Sixth Century Studentship, and the TUNA project (EPSRC, UK) under grant number GRProceedings of the Fourth International Natural Language Generation Conference, pages 89-91, Sydney

  9. The Common Alerting Protocol (CAP) and Emergency Data Exchange Language (EDXL) - Application in Early Warning Systems for Natural Hazard

    Microsoft Academic Search

    Matthias Lendholt; Martin Hammitzsch; Joachim Wächter

    2010-01-01

    The Common Alerting Protocol (CAP) [1] is an XML-based data format for exchanging public warnings and emergencies between alerting technologies. In conjunction with the Emergency Data Exchange Language (EDXL) Distribution Element (-DE) [2] these data formats can be used for warning message dissemination in early warning systems for natural hazards. Application took place in the DEWS (Distance Early Warning System)

  10. A VORONOI-BASED PIVOT REPRESENTATION OF SPATIAL CONCEPTS AND ITS APPLICATION TO ROUTE DESCRIPTIONS EXPRESSED IN NATURAL LANGUAGE

    Microsoft Academic Search

    G. Edwards; G. Ligozat; A. Gryl; L. Fraczak

    1996-01-01

    Different representations of space are not in general equivalent. This point is clearly illustrated in research on the generation of sketches from route descriptions given in natural language: many linguistic expressions determine only partially a spatial situation. This article explores the role played by a pivot representation based on the Voronoi diagram. We study the use of this model in

  11. Excavating grey literature: A case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources

    Microsoft Academic Search

    Andreas Vlachidis; Ceri Binding; Douglas Tudhope; Keith May

    2010-01-01

    Purpose – This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist “rich” semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware “rich” indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with

  12. Language in Nature: On the Evolutionary Roots of a Cultural Phenomenon

    NASA Astrophysics Data System (ADS)

    Zuidema, Willem

    What could an evolutionary explanation for language look like? Here I review relevant evidence from linguistics, comparative biology, evolutionary theory and the fossil record, which suggest vocal imitation and hierarchical compositionality as the essential and uniquely human biological foundations of language. I also outline a plausible scenario for how human language evolved, and propose that language preceded, and facilitated the development of, other cognitive domains such as reasoning, the ability to plan, and consciousness.

  13. Natural Language as a Tool for Analyzing the Proving Process: The Case of Plane Geometry Proof

    ERIC Educational Resources Information Center

    Robotti, Elisabetta

    2012-01-01

    In the field of human cognition, language plays a special role that is connected directly to thinking and mental development (e.g., Vygotsky, "1938"). Thanks to "verbal thought", language allows humans to go beyond the limits of immediately perceived information, to form concepts and solve complex problems (Luria, "1975"). So, it appears language

  14. A Mandarin Dictation Machine Based Upon a Hierarchical Recognition Approach and Chinese Natural Language Analysis

    Microsoft Academic Search

    Lin-shan Lee; Chiu-yu Tseng; Keh-jiann Chen; James Huang; Chia-hwa Hwang; Pei-yih Ting; LONG-JI LIN; C. C. Chen

    1990-01-01

    An experimental Mandarin dictation machine for inputting Mandarin speech (spoken Chinese language) into computers is described. Because of the special characteristics of the Chinese language, syllables are chosen as the basic units for dictation. The machine is designed based on a hierarchical language recognition approach in which acoustic signals are first recognized as a sequence of syllables, possible word hypotheses

  15. Programming Languages.

    ERIC Educational Resources Information Center

    Tesler, Lawrence G.

    1984-01-01

    Discusses the nature of programing languages, considering the features of BASIC, LOGO, PASCAL, COBOL, FORTH, APL, and LISP. Also discusses machine/assembly codes, the operation of a compiler, and trends in the evolution of programing languages (including interest in notational systems called object-oriented languages). (JN)

  16. PS1-15: Pre-filling Breast MRI Abstraction Forms Using Natural Language Processing

    PubMed Central

    Gao, Hongyuan; Wernli, Karen

    2014-01-01

    Background/Aims Information in breast MRI reports is valuable for breast cancer research, but these data are only available in free-text reports and require resource-intensive manual abstraction. We developed and tested a Natural Language Processing (NLP) algorithm to extract information and pre-fill abstraction forms from free-text breast MRI reports. Methods We identified 465 reports for women receiving breast MRI at Group Health between 2010 and 2012. We developed an NLP algorithm in SAS v9.2. The algorithm extracts the reading radiologist, laterality, parenchymal enhancement, whether a computer-aided technique was used, comparison exams, clinical indications, and assessment from breast MRI reports. The NLP results were compared with manual abstraction by an experienced abstractor. Results The algorithm correctly extracts the reading radiologist, laterality, and whether a computer-aided technique was used for all 465 breast MRI reports, except 1 report that itself contains inconsistent information on laterality. It correctly extracts the assessment for the right breast in 83% of the 465 reports and for the left breast in 92%. An unstable gold standard impedes the performance of the NLP algorithm in extracting parenchymal enhancement and clinical indications. There is not yet a gold standard against which to measure NLP performance for comparison exams. Conclusions This NLP algorithm holds promise for rapid, accurate extraction of information from free-text breast MRI reports. Manual review will be faster and more accurate due to the pre-filling of the abstraction form.

  17. Semi-supervised learning of statistical models for natural language understanding.

    PubMed

    Zhou, Deyu; He, Yulan

    2014-01-01

    Natural language understanding aims to specify a computational model that maps sentences to their semantic meaning representations. In this paper, we propose a novel framework to train the statistical models without using expensive fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied to two statistical models, the conditional random fields (CRFs) and the hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA communicator data show that both CRFs and HM-SVMs outperform the baseline approach, the previously proposed hidden vector state (HVS) model, which is also trained on abstract semantic annotations. In addition, the proposed framework outperforms two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, achieving relative error reductions of about 25% and 15% in F-measure. PMID:25152899

  18. Identifying Abdominal Aortic Aneurysm Cases and Controls using Natural Language Processing of Radiology Reports

    PubMed Central

    Sohn, Sunghwan; Ye, Zi; Liu, Hongfang; Chute, Christopher G.; Kullo, Iftikhar J.

    Prevalence of abdominal aortic aneurysm (AAA) is increasing due to longer life expectancy and implementation of screening programs. Patient-specific longitudinal measurements of AAA are important to understand pathophysiology of disease development and modifiers of abdominal aortic size. In this paper, we applied natural language processing (NLP) techniques to process radiology reports and developed a rule-based algorithm to identify AAA patients and also extract the corresponding aneurysm size with the examination date. AAA patient cohorts were determined by a hierarchical approach that: 1) selected potential AAA reports using keywords; 2) classified reports into AAA-case vs. non-case using rules; and 3) determined the AAA patient cohort based on a report-level classification. Our system was built in an Unstructured Information Management Architecture framework that allows efficient use of existing NLP components. Our system produced an F-score of 0.961 for AAA-case report classification with an accuracy of 0.984 for aneurysm size extraction. PMID:24303276
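
    The three-step hierarchy described in the abstract above (keyword selection, rule-based report classification, patient-level decision) plus size extraction can be sketched roughly as follows. This is only an illustrative sketch, not the authors' system: the keyword list, the negation pattern, the size pattern, and all function names are hypothetical, and the actual system ran inside an Unstructured Information Management Architecture framework rather than plain Python.

```python
import re
from typing import Iterable, Optional

# Hypothetical keyword list and patterns, chosen only for illustration.
KEYWORDS = ("aneurysm", "aaa")
NEGATION = re.compile(r"\b(no|without|negative for)\b[^.]*\baneurysm", re.I)
SIZE = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.I)


def is_candidate(report: str) -> bool:
    """Step 1: keep only reports that mention a keyword."""
    text = report.lower()
    return any(keyword in text for keyword in KEYWORDS)


def is_case_report(report: str) -> bool:
    """Step 2: a candidate report is a case unless the mention is negated."""
    return is_candidate(report) and NEGATION.search(report) is None


def extract_size_cm(report: str) -> Optional[float]:
    """Return the first size mentioned in the report, converted to cm."""
    match = SIZE.search(report)
    if match is None:
        return None
    value = float(match.group(1))
    return value / 10 if match.group(2).lower() == "mm" else value


def patient_is_case(reports: Iterable[str]) -> bool:
    """Step 3: a patient is a case if any of their reports is a case."""
    return any(is_case_report(report) for report in reports)


reports = [
    "No evidence of abdominal aortic aneurysm.",
    "Infrarenal abdominal aortic aneurysm measuring 4.2 cm.",
]
print(patient_is_case(reports), extract_size_cm(reports[1]))
```

    A real rule set would need far more careful handling of negation scope, historical mentions, and measurements unrelated to the aorta; the sketch only shows how the report-level and patient-level decisions are layered.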

  19. Semi-Supervised Learning of Statistical Models for Natural Language Understanding

    PubMed Central

    He, Yulan

    2014-01-01

    Natural language understanding aims to specify a computational model that maps sentences to their semantic meaning representations. In this paper, we propose a novel framework to train the statistical models without using expensive fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied to two statistical models, the conditional random fields (CRFs) and the hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA communicator data show that both CRFs and HM-SVMs outperform the baseline approach, the previously proposed hidden vector state (HVS) model, which is also trained on abstract semantic annotations. In addition, the proposed framework outperforms two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, achieving relative error reductions of about 25% and 15% in F-measure. PMID:25152899

  20. A Natural Language Processing Algorithm to define a Venous Thromboembolism Phenotype

    PubMed Central

    McPeek Hinz, Eugenia R.; Bastarache, Lisa; Denny, Joshua C

    2013-01-01

    Deep venous thrombosis and pulmonary embolism are diseases associated with significant morbidity and mortality. Known risk factors account for only a slight majority of venous thromboembolic disease (VTE), with the remainder of the risk presumably related to unidentified genetic factors. We designed a general-purpose Natural Language Processing (NLP) algorithm to retrospectively capture both acute and historical cases of thromboembolic disease in a de-identified electronic health record. Applying the NLP algorithm to a separate evaluation set yielded a positive predictive value of 84.7% and a sensitivity of 95.3%, for an F-measure of 0.897, similar to the 0.925 obtained on the training set. Use of the same algorithm on problem lists only, in patients without VTE ICD-9 codes, was found to be the best means of capturing historical cases, with a PPV of 83%. NLP of VTE ICD-9 positive cases and non-ICD-9 positive problem lists provides an effective means for capture of both acute and historical cases of venous thromboembolic disease. PMID:24551388
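
    As a quick check on the figures in the abstract above, the balanced F-measure is the harmonic mean of positive predictive value (precision) and sensitivity (recall). The short sketch below assumes the abstract's F-measure is this balanced form (F1); the function name is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the evaluation set.
ppv = 0.847          # positive predictive value (precision)
sensitivity = 0.953  # sensitivity (recall)

print(round(f_measure(ppv, sensitivity), 3))  # 0.897
```

    The result agrees with the F-measure of 0.897 reported for the evaluation set.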

  1. Extraction of CYP chemical interactions from biomedical literature using natural language processing methods.

    PubMed

    Jiao, Dazhi; Wild, David J

    2009-02-01

    This paper proposes a system that automatically extracts CYP protein and chemical interactions from journal article abstracts, using natural language processing (NLP) and text mining methods. In our system, we employ a maximum entropy based learning method, using results from syntactic, semantic, and lexical analysis of texts. We first present our system architecture and then discuss the data set for training our machine learning based models and the methods in building components in our system, such as part of speech (POS) tagging, Named Entity Recognition (NER), dependency parsing, and relation extraction. An evaluation of the system is conducted at the end, yielding very promising results: The POS, dependency parsing, and NER components in our system have achieved a very high level of accuracy as measured by precision, ranging from 85.9% to 98.5%, and the precision and the recall of the interaction extraction component are 76.0% and 82.6%, and for the overall system are 68.4% and 72.2%, respectively. PMID:19434828

  2. Natural-Language Syntax as Procedures for Interpretation: The Dynamics of Ellipsis Construal

    NASA Astrophysics Data System (ADS)

    Kempson, Ruth; Gregoromichelaki, Eleni; Meyer-Viol, Wilfried; Purver, Matthew; White, Graham; Cann, Ronnie

    In this paper we set out the preliminaries needed for a formal theory of context, relative to a linguistic framework in which natural-language syntax is defined as procedures for context-dependent interpretation. Dynamic Syntax provides a formalism where both representations of content and context are defined dynamically and structurally, with time-linear monotonic growth across sequences of partial trees as the core structure-inducing notion. The primary data involve elliptical fragments, as these provide less familiar evidence of the requisite concept of context than anaphora, but equally central. As part of our sketch of the framework, we show how apparent anomalies for a time-linear basis for interpretation can be straightforwardly characterised once we adopt a new perspective on syntax as the dynamics of transitions between parse-states. We then take this as the basis for providing an integrated account of ellipsis construal. And, as a bonus, we will show how this intrinsically dynamic perspective extends in a seamless way to dialogue exchanges with free shifting of role between speaking and hearing (split-utterances). We shall argue that what is required to explain such dialogue phenomena is for contexts, as representations of content, to include not merely partial structures but also the sequence of actions that led to such structures.

  3. Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing

    PubMed Central

    Wu, Stephen; Miller, Timothy; Masanz, James; Coarr, Matt; Halgrim, Scott; Carrell, David; Clark, Cheryl

    2014-01-01

    A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been “solved.” This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP. PMID:25393544

  4. A comparative study of current clinical natural language processing systems on handling abbreviations in discharge summaries

    PubMed Central

    Wu, Yonghui; Denny, Joshua C.; Rosenbloom, S. Trent; Miller, Randolph A.; Giuse, Dario A.; Xu, Hua

    2012-01-01

    Clinical Natural Language Processing (NLP) systems extract clinical information from narrative clinical texts in many settings. Previous research mentions the challenges of handling abbreviations in clinical texts, but provides little insight into how well current NLP systems correctly recognize and interpret abbreviations. In this paper, we compared performance of three existing clinical NLP systems in handling abbreviations: MetaMap, MedLEE, and cTAKES. The evaluation used an expert-annotated gold standard set of clinical documents (derived from 32 de-identified patient discharge summaries) containing 1,112 abbreviations. The existing NLP systems achieved suboptimal performance in abbreviation identification, with F-scores ranging from 0.165 to 0.601. MedLEE achieved the best F-score of 0.601 for all abbreviations and 0.705 for clinically relevant abbreviations. This study suggested that accurate identification of clinical abbreviations is a challenging task and that more advanced abbreviation recognition modules might improve existing clinical NLP systems. PMID:23304375

  5. A Compositional Natural Semantics and Hoare Logic for Low-Level Languages

    Microsoft Academic Search

    Ando Saabas; Tarmo Uustalu

    2006-01-01

    The advent of proof-carrying code has generated significant interest in reasoning about low-level languages. It is widely believed that low-level languages with jumps must be difficult to reason about by being inherently non-modular. We argue that this is untrue. We take it seriously that, differently from statements of a high-level language, pieces of low-level code are multiple-entry and multiple-exit. And

  6. The South African Sign Language Machine Translation Project: Issues on Non-manual Sign Generation

    E-print Network

    van Zijl, Lynette

    -adjoining grammar parser approach in order to generate non-manual signs and construct a suitable signing space. We-manual Signs, Machine Translation, Sign Language 1. INTRODUCTION The South African Deaf community of an effort to develop assistive technologies to bridge the communication gap between the hearing and the Deaf

  7. IGES/RIM Parser/Converter users guide. [Initial Graphics Exchange Specification/Relational Information Manager

    SciTech Connect

    Isler, R.E.

    1985-05-01

    Sandia National Laboratories has been assigned Lead Lab responsibility by the Department of Energy (DOE) for integrating the communications among computer-aided design/computer-aided manufacturing (CAD/CAM) activities throughout DOE's Nuclear Weapons Complex (NWC). A primary objective is to provide a capability for the exchange of digital data between dissimilar CAD systems within the NWC. A subset of the Initial Graphics Exchange Specification (IGES) will be the data exchange format. The IGES/RIM Parser/Converter is the first in a series of programs being developed within the NWC to carry out this automated exchange. The Parser/Converter program converts an IGES file into a file of input commands to a Relational Information Manager (RIM) database.

  8. Advanced Language Technologies

    E-print Network

    Erjavec, Tomaž

    . Computer processing of natural language 2. Some history 3. Applications 4. Levels of linguistic analysis I. Computer processing of natural language · Computational Linguistics: · a branch of computer/understand language · Natural Language Processing: · a subfield of CL, dealing with specific computational methods

  9. Facilitating pharmacogenetic studies using electronic health records and natural-language processing: a case study of warfarin

    Microsoft Academic Search

    Hua Xu; Min Jiang; Matt Oetjens; Erica A Bowton; Andrea H Ramirez; Janina M Jeff; Melissa A Basford; Jill M Pulley; James D Cowan; Xiaoming Wang; Marylyn D Ritchie; Daniel R Masys; Dan M Roden; Dana C Crawford; Joshua C Denny

    2011-01-01

    Objective DNA biobanks linked to comprehensive electronic health records systems are potentially powerful resources for pharmacogenetic studies. This study sought to develop natural-language-processing algorithms to extract drug-dose information from clinical text, and to assess the capabilities of such tools to automate the data-extraction process for pharmacogenetic studies. Materials and methods A manually validated warfarin pharmacogenetic study identified a cohort of 1125 patients with

  10. Automated Identification of Surveillance Colonoscopy in Inflammatory Bowel Disease Using Natural Language Processing

    PubMed Central

    Hou, Jason K.; Chang, Mimi; Nguyen, Thien; Kramer, Jennifer R.; Richardson, Peter; Sansgiry, Shubhada; D’Avolio, Leonard W.; El-Serag, Hashem B.

    2014-01-01

    Background Differentiating surveillance from non-surveillance colonoscopy for colorectal cancer in patients with inflammatory bowel disease (IBD) using electronic medical records (EMR) is important for practice improvement and research purposes, but diagnosis code algorithms are lacking. The automated retrieval console (ARC) is natural language processing (NLP)-based software that allows text-based document-level classification. Aims The purpose of this study was to test the feasibility and accuracy of ARC in identifying surveillance and non-surveillance colonoscopy in IBD using EMR. Methods We performed a split validation study of electronic reports of colonoscopy pathology for patients with IBD from the Michael E. DeBakey VA Medical Center. A gastroenterologist manually classified pathology reports as either derived from surveillance or non-surveillance colonoscopy. Pathology reports were randomly split into two sets: 70% for algorithm derivation and 30% for validation. An ARC-generated classification model was applied to the validation set of pathology reports. The performance of the model was compared with manual classification for surveillance and non-surveillance colonoscopy. Results A total of 575 colonoscopy pathology reports were available on 195 IBD patients, of which 400 reports were designated as training and 175 as testing sets. Within the testing set, a total of 69 pathology reports were classified as surveillance by manual review, whereas the ARC model classified 66 reports as surveillance for a recall of 0.77, precision of 0.80, and specificity of 0.88. Conclusions ARC was able to identify surveillance colonoscopy for IBD without customized software programming. NLP-based document-level classification may be used to differentiate surveillance from non-surveillance colonoscopy in IBD. PMID:23086115
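
    The recall, precision, and specificity in the abstract above follow from a 2x2 confusion matrix over the 175 test reports. The abstract gives the row and column totals (69 manually classified as surveillance, 66 classified as surveillance by the model) but not the overlap. The sketch below assumes 53 true positives, a value that is not stated in the abstract and is chosen here only because it is consistent with the three reported rates.

```python
def rates(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Recall, precision, and specificity from confusion-matrix counts."""
    return {
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
    }


total = 175       # test-set reports (from the abstract)
manual_pos = 69   # surveillance by manual review (from the abstract)
model_pos = 66    # surveillance by the ARC model (from the abstract)
tp = 53           # ASSUMPTION: overlap not reported in the abstract

fn = manual_pos - tp          # surveillance reports the model missed
fp = model_pos - tp           # non-surveillance reports flagged by the model
tn = total - tp - fn - fp     # non-surveillance reports correctly rejected

result = {name: round(value, 2) for name, value in rates(tp, fp, fn, tn).items()}
print(result)
```

    Under that assumption the rounded values come out to 0.77, 0.80, and 0.88, matching the reported recall, precision, and specificity.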

  11. Identifying primary and recurrent cancers using a SAS-based natural language processing algorithm

    PubMed Central

    Strauss, Justin A; Chao, Chun R; Kwan, Marilyn L; Ahmed, Syed A; Schottinger, Joanne E

    2013-01-01

    Objective Significant limitations exist in the timely and complete identification of primary and recurrent cancers for clinical and epidemiologic research. A SAS-based coding, extraction, and nomenclature tool (SCENT) was developed to address this problem. Materials and methods SCENT employs hierarchical classification rules to identify and extract information from electronic pathology reports. Reports are analyzed and coded using a dictionary of clinical concepts and associated SNOMED codes. To assess the accuracy of SCENT, validation was conducted using manual review of pathology reports from a random sample of 400 breast and 400 prostate cancer patients diagnosed at Kaiser Permanente Southern California. Trained abstractors classified the malignancy status of each report. Results Classifications of SCENT were highly concordant with those of abstractors, achieving κ of 0.96 and 0.95 in the breast and prostate cancer groups, respectively. SCENT identified 51 of 54 new primary and 60 of 61 recurrent cancer cases across both groups, with only three false positives in 792 true benign cases. Measures of sensitivity, specificity, positive predictive value, and negative predictive value exceeded 94% in both cancer groups. Discussion Favorable validation results suggest that SCENT can be used to identify, extract, and code information from pathology report text. Consequently, SCENT has wide applicability in research and clinical care. Further assessment will be needed to validate performance with other clinical text sources, particularly those with greater linguistic variability. Conclusion SCENT is proof of concept for SAS-based natural language processing applications that can be easily shared between institutions and used to support clinical and epidemiologic research. PMID:22822041

  12. PS2-25: Using Natural Language Processing to Extract Findings from Mammography Reports

    PubMed Central

    Gao, Hongyuan; Bowles, Erin Aiello; Carrell, David; Buist, Diana

    2013-01-01

    Background/Aims Mammographic findings such as a mass may be associated with breast cancer risk, but these data are only available in free-text reports and require resource-intensive manual abstraction. We developed and tested a Natural Language Processing (NLP) algorithm to extract mammographic findings (mass, calcification, asymmetric density, and architectural distortion) from free-text mammography reports. Methods We identified 92,947 reports for women receiving screening and diagnostic mammography at Group Health between 2007–2008. We developed an NLP algorithm based on Perl Regular Expressions in SAS v9.2. The algorithm identifies words indicating mammography findings (mass, distortion, asymmetry and calcification) and their related words denoting laterality, negation, family history, personal history and uncertainty. Three flags are made indicating possible errors of the NLP algorithm. An experienced abstractor manually reviewed a random sample of 50 mammography reports to test and refine the NLP algorithm. Results The algorithm correctly identified a mass on 46/50 reports, calcifications on 48/50 reports, asymmetric density on 50/50 reports, and architectural distortion on 48/50 reports. The NLP algorithm misinterprets sentences such as, “there are calcifications with no other asymmetry.” The NLP algorithm incorrectly associated the negation word “No” with the key word “calcifications.” Building more refined rules on association between negation words and key words will improve the accuracy. Conclusions This NLP algorithm holds promise for accurate and fast identification of findings from free-text mammography reports. It can be shared across institutions and is an example of what can be done with free-text radiology reports, in addition to mammography. Manual review may still be necessary for some reports with a high probability of error, depending on resources available.
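
    The negation error described in this abstract can be illustrated with a small sketch. It uses Python's `re` module rather than the Perl regular expressions in SAS v9.2 used in the study, and the keyword list, the negation list, and both rules are hypothetical, not the study's actual rules. A naive rule that treats every finding as negated whenever a negation word occurs anywhere in the sentence reproduces the reported mistake; a rule that lets a negation word apply only to findings that follow it avoids the mistake for this sentence.

```python
import re

# Hypothetical keyword and negation lists, for illustration only.
FINDINGS = ["mass", "calcification", "asymmetry", "distortion"]
NEGATION = re.compile(r"\b(no|without|negative for)\b", re.IGNORECASE)


def naive_negated(sentence):
    """Mark every finding as negated if any negation word occurs in the sentence."""
    has_negation = NEGATION.search(sentence) is not None
    return {
        f: has_negation
        for f in FINDINGS
        if re.search(rf"\b{f}s?\b", sentence, re.IGNORECASE)
    }


def scoped_negated(sentence):
    """Mark a finding as negated only if a negation word precedes it."""
    result = {}
    for f in FINDINGS:
        match = re.search(rf"\b{f}s?\b", sentence, re.IGNORECASE)
        if match:
            preceding_text = sentence[:match.start()]
            result[f] = NEGATION.search(preceding_text) is not None
    return result


sentence = "There are calcifications with no other asymmetry."
print(naive_negated(sentence))   # calcification wrongly marked as negated
print(scoped_negated(sentence))  # only asymmetry marked as negated
```

    The scoped rule is still crude. A sentence such as "no mass, but calcifications are present" would defeat it, because the negation word precedes both findings. This is the kind of case for which the abstract says more refined association rules are needed.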

  13. Automated Detection of Adverse Events Using Natural Language Processing of Discharge Summaries

    PubMed Central

    Melton, Genevieve B.; Hripcsak, George

    2005-01-01

    Objective: To determine whether natural language processing (NLP) can effectively detect adverse events defined in the New York Patient Occurrence Reporting and Tracking System (NYPORTS) using discharge summaries. Design: An adverse event detection system for discharge summaries using the NLP system MedLEE was constructed to identify 45 NYPORTS event types. The system was first applied to a random sample of 1,000 manually reviewed charts. The system then processed all inpatient cases with electronic discharge summaries for two years. All system-identified events were reviewed, and performance was compared with traditional reporting. Measurements: System sensitivity, specificity, and predictive value, with manual review serving as the gold standard. Results: The system correctly identified 16 of 65 events in 1,000 charts. Of 57,452 total electronic discharge summaries, the system identified 1,590 events in 1,461 cases, and manual review verified 704 events in 652 cases, resulting in an overall sensitivity of 0.28 (95% confidence interval [CI]: 0.17–0.42), specificity of 0.985 (CI: 0.984–0.986), and positive predictive value of 0.45 (CI: 0.42–0.47) for detecting cases with events and an average specificity of 0.9996 (CI: 0.9996–0.9997) per event type. Traditional event reporting detected 322 events during the period (sensitivity 0.09), of which the system identified 110 as well as 594 additional events missed by traditional methods. Conclusion: NLP is an effective technique for detecting a broad range of adverse events in text documents and outperformed traditional and previous automated adverse event detection methods. PMID:15802475

  14. Temporal reasoning with medical data--a review with emphasis on medical natural language processing.

    PubMed

    Zhou, Li; Hripcsak, George

    2007-04-01

    Temporal information is crucial in electronic medical records and biomedical information systems. Processing temporal information in medical narrative data is a very challenging area. It lies at the intersection of temporal representation and reasoning (TRR) in artificial intelligence and medical natural language processing (MLP). Some fundamental concepts and important issues in relation to TRR have previously been discussed, mainly in the context of processing structured data in biomedical informatics; however, it is important that these concepts be re-examined in the context of processing narrative data using MLP. Theoretical and methodological TRR studies in biomedical informatics can be classified into three main categories: category 1 applies theories and models from temporal reasoning in AI; category 2 defines frameworks that meet needs from clinical applications; category 3 resolves issues such as temporal granularity and uncertainty. Currently, most MLP systems are not designed with a formal representation of time, and their ability to reason about temporal relations among medical events is limited. Previous work in processing time with clinical narrative data includes processing time in clinical reports, modeling textual temporal expressions in clinical databases, processing time in clinical guidelines, and building time standards for data exchange and integration. In addition to common problems in MLP, there are challenges specific to TRR in medical text, which occur at each level of linguistic structure and analysis. Despite advances in temporal reasoning in biomedical informatics, processing time in medical text deserves more attention. 
Besides the need for more research in temporal granularity, fuzzy time, temporal contradiction, intermittent events and uncertainty, broad areas for future research include enhancing functions of current MLP systems on processing temporal information, incorporating medical knowledge into temporal reasoning systems, resolving coreference, integrating narrative data with structured data and evaluating these systems. PMID:17317332

  15. Validation of natural language processing to extract breast cancer pathology procedures and results

    PubMed Central

    Wieneke, Arika E.; Bowles, Erin J. A.; Cronkite, David; Wernli, Karen J.; Gao, Hongyuan; Carrell, David; Buist, Diana S. M.

    2015-01-01

    Background: Pathology reports typically require manual review to abstract research data. We developed a natural language processing (NLP) system to automatically interpret free-text breast pathology reports with limited assistance from manual abstraction. Methods: We used an iterative approach of machine learning algorithms and constructed groups of related findings to identify breast-related procedures and results from free-text pathology reports. We evaluated the NLP system using an all-or-nothing approach to determine which reports could be processed entirely using NLP and which reports needed manual review beyond NLP. We divided 3234 reports for development (2910, 90%), and evaluation (324, 10%) purposes using manually reviewed pathology data as our gold standard. Results: NLP correctly coded 12.7% of the evaluation set, flagged 49.1% of reports for manual review, incorrectly coded 30.8%, and correctly omitted 7.4% from the evaluation set due to irrelevancy (i.e. not breast-related). Common procedures and results were identified correctly (e.g. invasive ductal with 95.5% precision and 94.0% sensitivity), but entire reports were flagged for manual review because of rare findings and substantial variation in pathology report text. Conclusions: The NLP system we developed did not perform sufficiently for abstracting entire breast pathology reports. The all-or-nothing approach resulted in too broad of a scope of work and limited our flexibility to identify breast pathology procedures and results. Our NLP system was also limited by the lack of the gold standard data on rare findings and wide variation in pathology text. Focusing on individual, common elements and improving pathology text report standardization may improve performance.

  16. Using rule-based natural language processing to improve disease normalization in biomedical text

    PubMed Central

    Kang, Ning; Singh, Bharat; Afzal, Zubair; van Mulligen, Erik M; Kors, Jan A

    2013-01-01

    Background and objective In order for computers to extract useful information from unstructured text, a concept normalization system is needed to link relevant concepts in a text to sources that contain further information about the concept. Popular concept normalization tools in the biomedical field are dictionary-based. In this study we investigate the usefulness of natural language processing (NLP) as an adjunct to dictionary-based concept normalization. Methods We compared the performance of two biomedical concept normalization systems, MetaMap and Peregrine, on the Arizona Disease Corpus, with and without the use of a rule-based NLP module. Performance was assessed for exact and inexact boundary matching of the system annotations with those of the gold standard and for concept identifier matching. Results Without the NLP module, MetaMap and Peregrine attained F-scores of 61.0% and 63.9%, respectively, for exact boundary matching, and 55.1% and 56.9% for concept identifier matching. With the aid of the NLP module, the F-scores of MetaMap and Peregrine improved to 73.3% and 78.0% for boundary matching, and to 66.2% and 69.8% for concept identifier matching. For inexact boundary matching, performances further increased to 85.5% and 85.4%, and to 73.6% and 73.3% for concept identifier matching. Conclusions We have shown the added value of NLP for the recognition and normalization of diseases with MetaMap and Peregrine. The NLP module is general and can be applied in combination with any concept normalization system. Whether its use for concept types other than disease is equally advantageous remains to be investigated. PMID:23043124

  17. Proceedings of the Fifteenth Conference on Computational Natural Language Learning, pages 106–114, Portland, Oregon, USA, 23–24 June 2011. © 2011 Association for Computational Linguistics

    E-print Network

    Proceedings of the Fifteenth Conference on Computational Natural Language Learning, pages 106–114, Portland, Oregon, USA, 23–24 June 2011. © 2011 Association for Computational Linguistics Assessing Benefit Institute School of Computer Science Carnegie Mellon University shilpaa@cs.cmu.edu Eric Nyberg Language

  18. The Development of Bilingual Proficiency. Final Report. Volume I: The Nature of Language Proficiency, Volume II: Classroom Treatment, Volume III: Social Context and Age.

    ERIC Educational Resources Information Center

    Harley, Birgit; And Others

    The Development of Bilingual Proficiency is a large-scale, five-year research project begun in 1981. The final report contains three volumes, each concentrating on specific issues investigated in the research: (1) the nature of language proficiency, including second language lexical proficiency and the development and growth of metaphor…

  19. Computing Accurate Grammatical Feedback in a Virtual Writing Conference for German-Speaking Elementary-School Children: An Approach Based on Natural Language Generation

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2009-01-01

    We built a natural language processing (NLP) system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary…

  20. What Is a Programming Language?

    ERIC Educational Resources Information Center

    Wold, Allen L.

    1983-01-01

    The nature of programming languages is discussed, focusing on machine/assembly language and high-level languages. The latter includes systems (such as "Basic") in which an entire set of low-level instructions (in assembly/machine language) are combined. Also discusses the nature of other languages such as "Lisp" and list-processing languages. (JN)

  1. The Natural History of Human Language: Bridging the Gaps without Magic

    Microsoft Academic Search

    Bjorn Merker; Kazuo Okanoya

    2007-01-01

    Human languages are quintessentially historical phenomena. Every known aspect of linguistic form and content is subject to change in historical time (Lehmann, 1995; Bybee, 2004). Many facts of language, syntactic no less than semantic, find their explanation in the historical processes that generated them. If adpositions were once verbs, then the fact that they tend to occur on the same

  2. Applying semantic-based probabilistic context-free grammar to medical language processing – A preliminary study on parsing medication sentences

    Microsoft Academic Search

    Hua Xu; Samir AbdelRahman; Yanxin Lu; Joshua C. Denny; Son Doan

    Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently result in two or more parse trees. One possible solution, which has not been extensively explored previously,

  3. Oral and visual language are not processed in like fashion: Constraints on the SOC Christophe Parisse and Henri Cohen

    E-print Network

    Paris-Sud XI, Université de

    Oral and visual language are not processed in like fashion: Constraints on the SOC framework Christophe Parisse and Henri Cohen ABSTRACT The SOC framework does not take into account the fact by the existence of the Self-Organizing Consciousness (SOC), the principles of which are exemplified in PARSER

  4. DBPQL: A view-oriented query language for the Intel Data Base Processor

    NASA Technical Reports Server (NTRS)

    Fishwick, P. A.

    1983-01-01

    An interactive query language (DBPQL) for the Intel Data Base Processor (DBP) is defined. DBPQL includes a parser generator package which permits the analyst to easily create and manipulate the query statement syntax and semantics. The prototype language, DBPQL, includes trace and performance commands to aid the analyst when implementing new commands and analyzing the execution characteristics of the DBP. The DBPQL grammar file and associated key procedures are included as an appendix to this report.

  5. Linguistics in Language Education

    ERIC Educational Resources Information Center

    Kumar, Rajesh; Yunus, Reva

    2014-01-01

    This article looks at the contribution of insights from theoretical linguistics to an understanding of language acquisition and the nature of language in terms of their potential benefit to language education. We examine the ideas of innateness and universal language faculty, as well as multilingualism and the language-society relationship. Modern…

  6. In silico Evolutionary Developmental Neurobiology and the Origin of Natural Language

    NASA Astrophysics Data System (ADS)

    Szathmáry, Eörs; Szathmáry, Zoltán; Ittzés, Péter; Orbán, Gergő; Zachár, István; Huszár, Ferenc; Fedor, Anna; Varga, Máté; Számadó, Szabolcs

    It is justified to assume that part of our genetic endowment contributes to our language skills, yet it is impossible to tell at this moment exactly how genes affect the language faculty. We complement experimental biological studies by an in silico approach in that we simulate the evolution of neuronal networks under selection for language-related skills. At the heart of this project is the Evolutionary Neurogenetic Algorithm (ENGA) that is deliberately biomimetic. The design of the system was inspired by important biological phenomena such as brain ontogenesis, neuron morphologies, and indirect genetic encoding. Neuronal networks were selected and were allowed to reproduce as a function of their performance in the given task. The selected neuronal networks in all scenarios were able to solve the communication problem they had to face. The most striking feature of the model is that it works with highly indirect genetic encoding--just as brains do.

  7. A comparison of speech versus keyboard input and scrolling versus nonscrolling menus on a menu-based natural language interface

    E-print Network

    Armstrong, Mark Edward

    1987-01-01

    -based natural language interface. It was predicted that speech input would result in faster entry times and fewer errors than keyboard input for menu selection. The study also compared the use of non-scrolling menus, where the required selections are... shown on screen, versus scrolling menus, where some selections are hidden and must be scrolled to to be seen. The non-scrolling menu format was predicted to have faster entry times and fewer errors than the scrolling format. The objectives of...

  8. School Meaning Systems: The Symbiotic Nature of Culture and "Language-In-Use"

    ERIC Educational Resources Information Center

    Abawi, Lindy

    2013-01-01

    Recent research has produced evidence to suggest a strong reciprocal link between school context-specific language constructions that reflect a school's vision and schoolwide pedagogy, and the way that meaning making occurs, and a school's culture is characterized. This research was conducted within three diverse settings: one school in…

  9. Signed or Spoken, Children need Natural Languages Daphne Bavelier, Elissa L. Newport, and Ted Supalla

    E-print Network

    DeAngelis, Gregory

    will those children learn to communicate--and at what pace, with what success, and with what implications for later education? New systems and new technology offer these children some additional alternatives --but as English, Mandarin Chinese, and Navajo, share striking similarities. All spoken languages draw their sounds

  10. Nature, Nurture, and Age in Language Acquisition: The Case of Speech Perception.

    ERIC Educational Resources Information Center

    Wode, Henning

    1994-01-01

    This paper reviews the research on speech perception and reassesses the contribution of innate capacities versus external stimulation in conjunction with age in first- and second-language acquisition. A developmental model of speech perception is then discussed in relation to neonatal auditory perception. (Contains 86 references.) (MDM)

  11. The Sentence Fairy: A Natural-Language Generation System to Support Children's Essay Writing

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2008-01-01

    We built an NLP system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary texts produced by pupils…

  12. Coconstructing Learning: The Dynamic Nature of Foreign Language Pedagogy in a CMC Environment

    Microsoft Academic Search

    NELLEKE VAN DEUSEN-SCHOLL; CHRISTINA FREI; EDWARD DIXON

    Recent innovations in technology allow foreign language learners and their instructors to interact both inside and beyond the classroom using a variety of communicative tools. As a consequence, the classroom has been transformed into an extended learning environment which has had a profound effect on both student and teacher roles. However, the theoretical and pedagogical issues emerging from these

  13. From Object-Process Diagrams to a Natural Object-Process Language

    E-print Network

    Peleg, Mor

    serve as a basis for automatic conversion into executable code. Nevertheless, the task of specifying at an elevator, the action Checking begins its execution within 1 minute" in three different logic languages. ... resembles a sentence in English. x [@(REPAIRMAN_ARRIVED_AT_ELEVATOR, x ) @(CHECKING, x) @(CHECKING, x

  14. Grounding natural language quantifiers in visual attention. Kenny R. Coventry1

    E-print Network

    Cangelosi, Angelo

    for Thinking and Language, School of Psychology, & 2 Adaptive Behaviour and Cognitive Group, School the accuracy of judgments of number in visual scenes under conditions of time pressure. We present the results to describe a number of objects in a visual scene, and show corresponding effects on judgments of number using

  15. The Preliminary Results of a Mandarin Dictation Machine Based Upon Chinese Natural Language Analysis

    Microsoft Academic Search

    Lin-shan Lee; Chiu-yu Tseng; Keh-jiann Chen; James Huang

    1987-01-01

    This paper describes the preliminary results of the first research effort toward a Mandarin dictation machine in the world for the input of Chinese characters to computers. Considering the special characteristics of Chinese language, syllables are chosen as the basic units for dictation. The machine is divided into two subsystems. The first is to recognize the syllables using speech signal

  16. A Requirements-Based Exploration of Open-Source Software Development Projects--Towards a Natural Language Processing Software Analysis Framework

    ERIC Educational Resources Information Center

    Vlas, Radu Eduard

    2012-01-01

    Open source projects do have requirements; they are, however, mostly informal, text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming, and for large projects,…

  17. Advanced Language Technologies

    E-print Network

    Erjavec, Tomaž

    Levels of linguistic analysis ... I. Computer processing of natural language ... Advanced Language Technologies, Information and Communication Technologies Research Area ... that enables us to produce/understand language ... Natural Language

  18. Research in knowledge representation for natural language communication and planning assistance. Final report, 18 March 1985-30 September 1988

    SciTech Connect

    Goodman, B.A.; Grosz, B.; Haas, A.; Litman, D.; Reinhardt, T.

    1988-11-01

    BBN's DARPA project in Knowledge Representation for Natural Language Communication and Planning Assistance has two primary objectives: 1) To perform research on aspects of the interaction between users who are making complex decisions and systems that are assisting them with their task. In particular, this research is focused on communication and the reasoning required for performing its underlying task of discourse processing, planning, and plan recognition and communication repair. 2) Based on the research objectives to build tools for communication, plan recognition, and planning assistance and for the representation of knowledge and reasoning that underlie all of these processes. This final report summarizes BBN's research activities performed under this contract in the areas of knowledge representation and speech and natural language. In particular, the report discusses the work in the areas of knowledge representation, planning, and discourse modeling. We describe a parallel truth maintenance system. We provide an extension to the sentential theory of propositional attitudes by adding a sentential semantics. The report also contains a description of our research in discourse modelling in the areas of planning and plan recognition.

  19. A natural language query system for Hubble Space Telescope proposal selection

    NASA Technical Reports Server (NTRS)

    Hornick, Thomas; Cohen, William; Miller, Glenn

    1987-01-01

    The proposal selection process for the Hubble Space Telescope is assisted by a robust and easy to use query program (TACOS). The system parses an English subset language sentence regardless of the order of the keyword phrases, allowing the user a greater flexibility than a standard command query language. Capabilities for macro and procedure definition are also integrated. The system was designed for flexibility in both use and maintenance. In addition, TACOS can be applied to any knowledge domain that can be expressed in terms of a single relation. The system was implemented mostly in Common LISP. The TACOS design is described in detail, with particular attention given to the implementation methods of sentence processing.

  20. The nature of the visual environment induces implicit biases during language-mediated visual search

    Microsoft Academic Search

    Falk Huettig; James M. McQueen

    2011-01-01

    Four eyetracking experiments examined whether semantic and visual-shape representations are routinely retrieved from printed word displays and used during language-mediated visual search. Participants listened to sentences containing target words that were similar semantically or in shape to concepts invoked by concurrently displayed printed words. In Experiment 1, the displays contained semantic and shape competitors of the targets along with two

  1. HPARSER: extracting formal patient data from free text history and physical reports using natural language processing software.

    PubMed Central

    Sponsler, J. L.

    2001-01-01

    A prototype, HPARSER, processes a patient history and physical report such that specific data are obtained and stored in a patient data record. HPARSER is a recursive transition network (RTN) parser, and includes English and medical grammar rules, lexicon, and database constraints. Medical grammar rules augment the grammar rule base and specify common phrases seen in patient reports (e.g., "pupils are equal and reactive"). Each database constraint associates a grammar rule with a database table and attribute. Constraint behavior is such that if a rule is satisfied, data are extracted from the parse tree and stored in the database. Control reports guided construction of grammar and constraint rules. Test reports were processed with the control rules. Of the test report sentences, 85% parsed, and a 60% data capture rate, compared to controls, was achieved. HPARSER demonstrates use of an RTN to parse patient reports, and database constraints to transfer formal data from parse trees into a database. PMID:11825263
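
    The combination of an RTN parser and database constraints described in this abstract can be sketched as follows. This is a toy illustration, not HPARSER's implementation: the lexicon, the two networks, the table name `physical_exam`, and the attribute name `pupils` are all hypothetical. Each network is a set of states with labelled arcs. An arc label is either a word category, the name of another network (a recursive call), or `None` for an empty arc.

```python
# Hypothetical lexicon mapping words to categories.
LEXICON = {"pupils": "NOUN", "are": "VERB", "equal": "ADJ",
           "and": "CONJ", "reactive": "ADJ"}

# Hypothetical networks: {state: [(arc_label, next_state), ...]}; "end" is final.
NETWORKS = {
    "S": {0: [("NOUN", 1)], 1: [("VERB", 2)], 2: [("ADJP", "end")]},
    "ADJP": {0: [("ADJ", 1)],
             1: [("CONJ", 2), (None, "end")],
             2: [("ADJP", "end")]},
}

# Hypothetical constraint: grammar rule -> (database table, attribute).
CONSTRAINTS = {"S": ("physical_exam", "pupils")}


def traverse(net, state, words, pos):
    """Yield (next_pos, children) for each path from state to the final state."""
    if state == "end":
        yield pos, []
        return
    for label, nxt in NETWORKS[net][state]:
        if label is None:                      # empty arc, consumes nothing
            yield from traverse(net, nxt, words, pos)
        elif label in NETWORKS:                # recursive call into a sub-network
            for mid, sub in traverse(label, 0, words, pos):
                for end, rest in traverse(net, nxt, words, mid):
                    yield end, [(label, sub)] + rest
        elif pos < len(words) and LEXICON.get(words[pos]) == label:
            for end, rest in traverse(net, nxt, words, pos + 1):
                yield end, [(label, words[pos])] + rest


def parse(sentence):
    """Return a parse tree covering the whole sentence, or None."""
    words = sentence.lower().rstrip(".").split()
    for end, children in traverse("S", 0, words, 0):
        if end == len(words):
            return ("S", children)
    return None


def leaves(node, category):
    """Collect the words of one category from a parse (sub)tree."""
    label, content = node
    if isinstance(content, str):
        return [content] if label == category else []
    return [w for child in content for w in leaves(child, category)]


def apply_constraints(tree, record):
    """If the tree's rule has a constraint, store extracted data in the record."""
    table, attribute = CONSTRAINTS[tree[0]]
    record.setdefault(table, {})[attribute] = leaves(tree, "ADJ")
    return record


tree = parse("Pupils are equal and reactive.")
print(apply_constraints(tree, {}))
```

    A nested dictionary stands in for the database here. A sentence that no network path covers (for example "pupils equal") yields no parse, so no data would be captured for it.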

  2. IEEE TRANSACTIONS ON JOURNAL NAME, MANUSCRIPT ID 1 BibPro: A Citation Parser Based on Sequence

    E-print Network

    Yang, Kai-Hsiang

    automatically parse citations scattered over the Internet [1]. Potential problems include data entry errors, common author names, abbreviations of publication venues and large-scale citation data. In this paper... IEEE TRANSACTIONS ON JOURNAL NAME, MANUSCRIPT ID 1 BibPro: A Citation Parser Based on Sequence

  3. An O(n^3) Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free Grammars

    E-print Network

    Manning, Christopher

    An O(n^3) Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free Grammars Dan Klein and Christopher D. Manning Computer Science Department Stanford University Stanford, CA 94305-9040 fklein Agenda-based active chart parsing (Kay, 1973; Kay, 1980; Pereira and Shieber, 1987) provides an elegant

  4. On the nature and evolution of the neural bases of human language

    NASA Technical Reports Server (NTRS)

    Lieberman, Philip

    2002-01-01

    The traditional theory equating the brain bases of language with Broca's and Wernicke's neocortical areas is wrong. Neural circuits linking activity in anatomically segregated populations of neurons in subcortical structures and the neocortex throughout the human brain regulate complex behaviors such as walking, talking, and comprehending the meaning of sentences. When we hear or read a word, neural structures involved in the perception or real-world associations of the word are activated as well as posterior cortical regions adjacent to Wernicke's area. Many areas of the neocortex and subcortical structures support the cortical-striatal-cortical circuits that confer complex syntactic ability, speech production, and a large vocabulary. However, many of these structures also form part of the neural circuits regulating other aspects of behavior. For example, the basal ganglia, which regulate motor control, are also crucial elements in the circuits that confer human linguistic ability and abstract reasoning. The cerebellum, traditionally associated with motor control, is active in motor learning. The basal ganglia are also key elements in reward-based learning. Data from studies of Broca's aphasia, Parkinson's disease, hypoxia, focal brain damage, and a genetically transmitted brain anomaly (the putative "language gene," family KE), and from comparative studies of the brains and behavior of other species, demonstrate that the basal ganglia sequence the discrete elements that constitute a complete motor act, syntactic process, or thought process. Imaging studies of intact human subjects and electrophysiologic and tracer studies of the brains and behavior of other species confirm these findings. As Dobzhansky put it, "Nothing in biology makes sense except in the light of evolution" (cited in Mayr, 1982). That applies with as much force to the human brain and the neural bases of language as it does to the human foot or jaw. 
The converse follows: the mark of evolution on the brains of human beings and other species provides insight into the evolution of the brain bases of human language. The neural substrate that regulated motor control in the common ancestor of apes and humans most likely was modified to enhance cognitive and linguistic ability. Speech communication played a central role in this process. However, the process that ultimately resulted in the human brain may have started when our earliest hominid ancestors began to walk.

  5. Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 129-136, Vancouver, October 2005. © 2005 Association for Computational Linguistics

    E-print Network

    Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural are described by relevant seed features. Our method introduces two unsupervised steps that improve as initial seeds. 1 Introduction Supervised classification is the task of assigning category labels, taken

  6. Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 395-402, Vancouver, October 2005. © 2005 Association for Computational Linguistics

    E-print Network

    Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural "Bootstrapping" methods for learning require a small amount of supervision to seed the learning process. We show that it is sometimes possible to eliminate this last bit of supervision, by trying many candidate seeds and selecting

  7. Writing in science: Exploring teachers' and students' views of the nature of science in language enriched environments

    NASA Astrophysics Data System (ADS)

    Decoito, Isha

    Writing in science can be used to address some of the issues relevant to contemporary scientific literacy, such as the nature of science, which describes the scientific enterprise for science education. This has implications for the kinds of writing tasks students should attempt in the classroom, and for how students should understand the rationale and claims of these tasks. While scientific writing may train the mind to think scientifically in a disciplined and structured way, thus encouraging students to gain access to the public domain of scientific knowledge, the counter-argument is that students need to be able to express their thoughts freely in their own language. Writing activities must aim to promote philosophical and epistemological views of science that accurately portray contemporary science. This mixed-methods case study explored language-enriched environments, in this case, secondary science classrooms with a focus on teacher-developed activities, involving diversified writing styles, that were directly linked to the science curriculum. The research foci included: teachers' implementation of these activities in their classrooms; how the activities reflected the teachers' nature of science views; common attributes between students' views of science and how they represented science in their writings; and if, and how, the activities influenced students' nature of science views. Teachers' and students' views of writing and the nature of science are illustrated through pre- and post-questionnaire responses; interviews; student work; and classroom observations. Results indicated that diversified writing activities have the potential to accurately portray science to students, personalize learning in science, improve students' overall attitude towards science, and enhance scientific literacy through learning science, learning about science, and doing science.
Further research is necessary to develop an understanding of whether the choice of genre has an influence on meaning construction and understanding in science. Finally, this study concluded that the relationship between students' views of the nature of science and writing in science is complex and is dependent on several factors including the teachers' influence and attitude towards student writing in science.

  8. Why is combinatorial communication rare in the natural world, and why is language an exception to this trend?

    PubMed Central

    Scott-Phillips, Thomas C.; Blythe, Richard A.

    2013-01-01

    In a combinatorial communication system, some signals consist of the combinations of other signals. Such systems are more efficient than equivalent, non-combinatorial systems, yet despite this they are rare in nature. Why? Previous explanations have focused on the adaptive limits of combinatorial communication, or on its purported cognitive difficulties, but neither of these explains the full distribution of combinatorial communication in the natural world. Here, we present a nonlinear dynamical model of the emergence of combinatorial communication that, unlike previous models, considers how initially non-communicative behaviour evolves to take on a communicative function. We derive three basic principles about the emergence of combinatorial communication. We hence show that the interdependence of signals and responses places significant constraints on the historical pathways by which combinatorial signals might emerge, to the extent that anything other than the most simple form of combinatorial communication is extremely unlikely. We also argue that these constraints can be bypassed if individuals have the socio-cognitive capacity to engage in ostensive communication. Humans, but probably no other species, have this ability. This may explain why language, which is massively combinatorial, is such an extreme exception to nature's general trend for non-combinatorial communication. PMID:24047871

  9. On the Dual Nature of the Functional Discourse Grammar Model: Context, the Language System/Language Use Distinction, and Indexical Reference in Discourse

    ERIC Educational Resources Information Center

    Cornish, Francis

    2013-01-01

    The Functional Discourse Grammar model has a twofold objective: on the one hand, to provide a descriptively, psychologically and pragmatically adequate account of the forms made available by a typologically diverse range of languages; and on the other, to provide a model of language which is set up to reflect, at one remove, certain of the stages…

  10. Computer based extraction of phenoptypic features of human congenital anomalies from the digital literature with natural language processing techniques.

    PubMed

    Karakülah, Gökhan; Dicle, Özgün; Koşaner, Ozgün; Suner, Aslı; Birant, Çağdaş Can; Berber, Tolga; Canbek, Sezin

    2014-01-01

    The lack of laboratory tests for the diagnosis of most congenital anomalies renders the physical examination of the case crucial for the diagnosis of the anomaly, and the cases in the diagnostic phase are mostly being evaluated in the light of the literature knowledge. In this respect, for accurate diagnosis, it is of great importance to provide the decision maker with decision support by presenting the literature knowledge about a particular case. Here, we demonstrated a methodology for automated scanning and determination of the phenotypic features from the case reports related to congenital anomalies in the literature with text and natural language processing methods, and we created a framework of an information source for a potential diagnostic decision support system for congenital anomalies. PMID:25160250

  11. Computer-Aided TRIZ Ideality and Level of Invention Estimation Using Natural Language Processing and Machine Learning

    NASA Astrophysics Data System (ADS)

    Adams, Christopher; Tate, Derrick

    Patent textual descriptions provide a wealth of information that can be used to understand the underlying design approaches that result in the generation of novel and innovative technology. This article will discuss a new approach for estimating Degree of Ideality and Level of Invention metrics from the theory of inventive problem solving (TRIZ) using patent textual information. Patent text includes information that can be used to model both the functions performed by a design and the associated costs and problems that affect a design’s value. The motivation of this research is to use patent data with calculation of TRIZ metrics to help designers understand which combinations of system components and functions result in creative and innovative design solutions. This article will discuss in detail methods to estimate these TRIZ metrics using natural language processing and machine learning with the use of neural networks.

  12. An Evaluation of a Natural Language Processing Tool for Identifying and Encoding Allergy Information in Emergency Department Clinical Notes

    PubMed Central

    Goss, Foster R.; Plasek, Joseph M.; Lau, Jason J.; Seger, Diane L.; Chang, Frank Y.; Zhou, Li

    2014-01-01

    Emergency department (ED) visits due to allergic reactions are common. Allergy information is often recorded in free-text provider notes; however, this domain has not yet been widely studied by the natural language processing (NLP) community. We developed an allergy module built on the MTERMS NLP system to identify and encode food, drug, and environmental allergies and allergic reactions. The module included updates to our lexicon using standard terminologies, and novel disambiguation algorithms. We developed an annotation schema and annotated 400 ED notes that served as a gold standard for comparison to MTERMS output. MTERMS achieved an F-measure of 87.6% for the detection of allergen names and no known allergies, 90% for identifying true reactions in each allergy statement where true allergens were also identified, and 69% for linking reactions to their allergen. These preliminary results demonstrate the feasibility of using NLP to extract and encode allergy information from clinical notes. PMID:25954363

  13. The development of a natural language interface to a geographical information system

    NASA Technical Reports Server (NTRS)

    Toledo, Sue Walker; Davis, Bruce

    1993-01-01

    This paper will discuss a two-and-a-half-year project undertaken to develop an English-language interface for the geographical information system GRASS. The work was carried out for NASA by a small business, Netrologic, based in San Diego, California, under Phase 1 and 2 Small Business Innovative Research contracts. We consider here the potential value of this system, whose current functionality addresses numerical, categorical and boolean raster layers and includes the display of point sets defined by constraints on one or more layers, answers yes/no and numerical questions, and creates statistical reports. It also handles complex queries and lexical ambiguities, and allows temporarily switching to UNIX or GRASS.

  14. FASTUS: A Cascaded Finite-State Transducer for Extracting Information from Natural-Language Text

    Microsoft Academic Search

    Jerry R. Hobbs; Douglas Appelt; John Bear

    1996-01-01

    FASTUS is a system for extracting information from natural language text for entry into a database and for other applications. It works essentially as a cascaded, nondeterministic finite-state automaton. There are five stages in the operation of FASTUS. In Stage 1, names and other fixed form expressions are recognized. In Stage 2, basic noun groups, verb groups, and

  15. Using Natural Language Generation Technology to Improve Information Flows in Intensive Care Units

    E-print Network

    Paris-Sud XI, Université de

    and nursing staff to assimilate what is important. It has been demonstrated that data summarization in natural generated summaries showed that the decisions made by medical and nursing staff after reading the summaries and nursing staff, particularly when integrated with the currently available graphical presentations. The main

  16. Semi-Supervised and Latent-Variable Models of Natural Language Semantics

    E-print Network

    Eskenazi, Maxine

    parsing task. We work within the framework of graph-based semi-supervised learning, a powerful method that are absent in annotated data. We also present a family of novel objective functions for graph-based learning that result in sparse probability measures over graph vertices, a desirable property for natural

  17. Psychological linguistics: A natural science approach to the study of language interactions

    PubMed Central

    Bijou, Sidney W.; Umbreit, John; Ghezzi, Patrick M.; Chao, Chia-Chen

    1986-01-01

    Kantor's theoretical analysis of “psychological linguistics” offers a natural science approach to the study of linguistic behavior and interactions. This paper includes brief descriptions of (a) some of the basic assumptions of the approach, (b) Kantor's conception of linguistic behavior and interactions, (c) a compatible research method and sample research data, and (d) some areas of research and application. PMID:22477507

  18. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2082-2093, October 25-29, 2014, Doha, Qatar. © 2014 Association for Computational Linguistics

    E-print Network

    Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP anaphors (The five astronauts and touchdown) and the antecedent (The space shuttle Atlantis) establish (local) entity coherence. (1) The space shuttle Atlantis landed at a desert air strip at Edwards Air

  19. Proceedings of the Second Workshop on Natural Language Processing for Social Media (SocialNLP), pages 28-37, Dublin, Ireland, August 24 2014.

    E-print Network

    Proceedings of the Second Workshop on Natural Language Processing for Social Media (Social Sentiment analysis is a rapidly growing research field that has attracted both academia and industry accuracy for both datasets. 1 Introduction In opinion mining, different levels of granularity analysis have

  20. Report on Workshop on High Performance Computing and Communications for Grand Challenge Applications: Computer Vision, Speech and Natural Language Processing, and Artificial Intelligence

    Microsoft Academic Search

    Benjamin W. Wah; Thomas S. Huang; Aravind K. Joshi; Dan I. Moldovan; Yiannis Aloimonos; Ruzena Bajcsy; Dana H. Ballard; Doug Degroot; Kenneth Dejong; Charles R. Dyer; Scott E. Fahlman; Ralph Grishman; Lynette Hirschman; Richard E. Korf; Stephen E. Levinson; Daniel P. Miranker; N. H. Morgan; Sergei Nirenburg; Tomaso Poggio; Edward M. Riseman; Craig Stanfil; Salvatore J. Stolfo; Steven L. Tanimoto; Charles C. Weems

    1993-01-01

    The findings of a workshop, the goals of which were to identify applications, research problems, and designs of high performance computing and communications (HPCC) systems for supporting applications are discussed. In computer vision, the main scientific issues are machine learning, surface reconstruction, inverse optics and integration, model acquisition, and perception and action. In speech and natural language processing (SNLP), issues

  1. Generating a 3D Simulation of a Car Accident from a Written Description in Natural Language: the CarSim System

    E-print Network

    Nugues, Pierre

    Generating a 3D Simulation of a Car Accident from a Written Description in Natural Language: the CarSim System Sylvain DUPUY, Arjan EGGES, Vincent LEGENDRE, and Pierre NUGUES GREYC laboratory - ISMRA from car accident reports, written in French. The problem of generating such a 3D simulation can

  2. Proceedings of the 8th International Natural Language Generation Conference, pages 26-34, Philadelphia, Pennsylvania, 19-21 June 2014. © 2014 Association for Computational Linguistics

    E-print Network

    , Germany harth@kit.edu Abstract With the rise of the Semantic Web more and more data become available encoded using the Semantic Web standard RDF. RDF is faced towards machines: designed to be easily articles and DBpedia data for English and German. 1 Introduction Natural Language Generation (NLG) systems

  3. Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL-X), pages 216-220, New York City, June 2006. © 2006 Association for Computational Linguistics

    E-print Network

    developments in dependency parsing strategies. Dependency graphs also encode much of the deep syntactic Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL-X), pages 216 the output from the first and labels all the edges in the dependency graph with appropriate syntactic

  4. 5th International Joint Conference on Natural Language Processing, Chiang Mai, Thailand, 2011 Regularizing Mono-and Bi-Word Models for Word Alignment

    E-print Network

    Lunds Universitet

    5th International Joint Conference on Natural Language Processing, Chiang Mai, Thailand, 2011 Regularizing Mono- and Bi-Word Models for Word Alignment Thomas Schoenemann Lund University, Sweden Abstract Conditional probabilistic models for word alignment are popular due to the elegant way of handling them

  5. Proceedings of the 8th International Natural Language Generation Conference, pages 93-94, Philadelphia, Pennsylvania, 19-21 June 2014. © 2014 Association for Computational Linguistics

    E-print Network

    NLG for Brazilian Portuguese realisation Rodrigo de Oliveira Department of Computing Science University of Aberdeen University of Aberdeen Aberdeen, UK, AB24 3UE yaji.sripada@abdn.ac.uk Abstract This paper describes Proceedings of the 8th International Natural Language Generation Conference, pages 93

  6. Proceedings of the 8th International Natural Language Generation Conference, pages 16-25, Philadelphia, Pennsylvania, 19-21 June 2014. © 2014 Association for Computational Linguistics

    E-print Network

    Computing Science University of Aberdeen, UK angroshmandya@abdn.ac.uk Advaith Siddharthan Computing Science University of Aberdeen, UK advaith@abdn.ac.uk Abstract We present an approach to text simplification based Proceedings of the 8th International Natural Language Generation Conference, pages 16

  7. Proceedings of the 5th International Joint Conference on Natural Language Processing, pages 947-955, Chiang Mai, Thailand, November 8-13, 2011. © 2011 AFNLP

    E-print Network

    Tomkins, Andrew

    is ambiguous in its intent. The user may want restaurants in New York, or want to know the tipping practice be articulated specifically in the form of natural language. For instance, keywords such as New York restaurant at New York restaurants, or something else under the myriad other possible interpretations

  8. High-School Projects at the Laboratory for Laser Energetics (2011) Brandon Avila (Allendale Columbia) researched Natural Language Processing (NLP) for extracting information from LLE

    E-print Network

    Portman, Douglas

    2011-01-01

    Columbia) researched Natural Language Processing (NLP) for extracting information from LLE documentation Boyce (McQuaid) carried out experiments to measure the rate at which tritium is removed from metal to measure the activity of the tritium and measured the dependence of the rate of tritium removal

  9. Arbitrary Symbolism in Natural Language Revisited: When Word Forms Carry Meaning

    PubMed Central

    Reilly, Jamie; Westbury, Chris; Kean, Jacob; Peelle, Jonathan E.

    2012-01-01

    Cognitive science has a rich history of interest in the ways that languages represent abstract and concrete concepts (e.g., idea vs. dog). Until recently, this focus has centered largely on aspects of word meaning and semantic representation. However, recent corpora analyses have demonstrated that abstract and concrete words are also marked by phonological, orthographic, and morphological differences. These regularities in sound-meaning correspondence potentially allow listeners to infer certain aspects of semantics directly from word form. We investigated this relationship between form and meaning in a series of four experiments. In Experiments 1–2 we examined the role of metalinguistic knowledge in semantic decision by asking participants to make semantic judgments for aurally presented nonwords selectively varied by specific acoustic and phonetic parameters. Participants consistently associated increased word length and diminished wordlikeness with abstract concepts. In Experiment 3, participants completed a semantic decision task (i.e., abstract or concrete) for real words varied by length and concreteness. Participants were more likely to misclassify longer, inflected words (e.g., “apartment”) as abstract and shorter uninflected abstract words (e.g., “fate”) as concrete. In Experiment 4, we used a multiple regression to predict trial level naming data from a large corpus of nouns which revealed significant interaction effects between concreteness and word form. Together these results provide converging evidence for the hypothesis that listeners map sound to meaning through a non-arbitrary process using prior knowledge about statistical regularities in the surface forms of words. PMID:22879931

  10. Neurolinguistic Approach to Natural Language Processing with Applications to Medical Text Analysis

    PubMed Central

    Matykiewicz, Paweł; Pestian, John

    2008-01-01

    Understanding written or spoken language presumably involves spreading neural activation in the brain. This process may be approximated by spreading activation in semantic networks, providing enhanced representations that involve concepts that are not found directly in the text. Approximation of this process is of great practical and theoretical interest. Although activations of neural circuits involved in representation of words rapidly change in time, snapshots of these activations spreading through associative networks may be captured in a vector model. Concepts of similar type activate larger clusters of neurons, priming areas in the left and right hemisphere. Analysis of recent brain imaging experiments shows the importance of the right hemisphere non-verbal clusterization. Medical ontologies enable development of a large-scale practical algorithm to re-create pathways of spreading neural activations. First, concepts of specific semantic type are identified in the text, and then all related concepts of the same type are added to the text, providing expanded representations. To avoid rapid growth of the extended feature space, after each step only the most useful features that increase document clusterization are retained. Short hospital discharge summaries are used to illustrate how this process works on real, very noisy data. Expanded texts show significantly improved clustering and may be classified with much higher accuracy. Although better approximations to the spreading of neural activations may be devised, the practical approach presented in this paper helps to discover pathways used by the brain to process specific concepts, and may be used in large-scale applications. PMID:18614334

  11. SIMD-parallel understanding of natural language with application to magnitude-only optical parsing of text

    NASA Astrophysics Data System (ADS)

    Schmalz, Mark S.

    1992-08-01

    A novel parallel model of natural language (NL) understanding is presented which can realize high levels of semantic abstraction, and is designed for implementation on synchronous SIMD architectures and optical processors. Theory is expressed in terms of the Image Algebra (IA), a rigorous, concise, inherently parallel notation which unifies the design, analysis, and implementation of image processing algorithms. The IA has been implemented on numerous parallel architectures, and IA preprocessors and interpreters are available for the FORTRAN and Ada languages. In a previous study, we demonstrated the utility of IA for mapping MEA-conformable (Multiple Execution Array) algorithms to optical architectures. In this study, we extend our previous theory to map serial parsing algorithms to the synchronous SIMD paradigm. We initially derive a two-dimensional image that is based upon the adjacency matrix of a semantic graph. Via IA template mappings, the operations of bottom-up parsing, semantic disambiguation, and referential resolution are implemented as image-processing operations upon the adjacency matrix. Pixel-level operations are constrained to Hadamard addition and multiplication, thresholding, and row/column summation, which are available in magnitude-only optics. Assuming high parallelism in the parse rule base, the parsing of n input symbols with a grammar consisting of M rules of arity H, on an N-processor architecture, could exhibit time complexity of T(n)

  12. First Language Acquisition and Teaching

    ERIC Educational Resources Information Center

    Cruz-Ferreira, Madalena

    2011-01-01

    "First language acquisition" commonly means the acquisition of a single language in childhood, regardless of the number of languages in a child's natural environment. Language acquisition is variously viewed as predetermined, wondrous, a source of concern, and as developing through formal processes. "First language teaching" concerns schooling in…

  13. Disfluencies and human language comprehension.

    PubMed

    Ferreira, Fernanda; Bailey, Karl G D

    2004-05-01

    Spoken language contains disfluencies, which include editing terms such as uh and um as well as repeats and corrections. In less than ten years the question of how disfluencies are handled by the human sentence comprehension system has gone from virtually ignored to a topic of major interest in computational linguistics and psycholinguistics. We discuss relevant empirical findings and describe a computational model that captures how disfluencies influence parsing and comprehension. The research reviewed shows that the parser, which presumably evolved to handle conversations, deals with disfluencies in a way that is efficient and linguistically principled. The success of this research program reinforces the current trend in cognitive science to view cognitive mechanisms as adaptations to real-world constraints and challenges. PMID:15120682

  14. This Language-Learning Business.

    ERIC Educational Resources Information Center

    Palmer, Harold E.; Redman, H. Vere

    This compilation of writings, first published in 1932, provides a historical overview of early thought concerning the nature of language and language instruction. The authors, pioneers in the field of "natural" language learning, consider language as: (1) code, (2) literature, (3) conversation, (4) communication, (5) sounds, and (6) speech. In a…

  15. An Introduction to Natural Language Processing: How You Can Get More From Those Electronic Notes You Are Generating.

    PubMed

    Kimia, Amir A; Savova, Guergana; Landschaft, Assaf; Harper, Marvin B

    2015-07-01

    Electronically stored clinical documents may contain both structured data and unstructured data. The use of structured clinical data varies by facility, but clinicians are familiar with coded data such as International Classification of Diseases, Ninth Revision, Systematized Nomenclature of Medicine-Clinical Terms codes, and commonly other data including patient chief complaints or laboratory results. Most electronic health records have much more clinical information stored as unstructured data, for example, clinical narrative such as history of present illness, procedure notes, and clinical decision making are stored as unstructured data. Despite the importance of this information, electronic capture or retrieval of unstructured clinical data has been challenging. The field of natural language processing (NLP) is undergoing rapid development, and existing tools can be successfully used for quality improvement, research, healthcare coding, and even billing compliance. In this brief review, we provide examples of successful uses of NLP using emergency medicine physician visit notes for various projects and the challenges of retrieving specific data and finally present practical methods that can run on a standard personal computer as well as high-end state-of-the-art funded processes run by leading NLP informatics researchers. PMID:26148107

  16. Combining Speech Recognition/Natural Language Processing with 3D Online Learning Environments to Create Distributed Authentic and Situated Spoken Language Learning

    ERIC Educational Resources Information Center

    Jones, Greg; Squires, Todd; Hicks, Jeramie

    2008-01-01

    This article will describe research done at the National Institute of Multimedia in Education, Japan and the University of North Texas on the creation of a distributed Internet-based spoken language learning system that would provide more interactive and motivating learning than current multimedia and audiotape-based systems. The project combined…

  17. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 568-578, Jeju Island, Korea, 12-14 July 2012. © 2012 Association for Computational Linguistics

    E-print Network

    Ng, Hwee Tou

    instantaneous accurate feedback to language learners, e.g., learners of English as a Second Language (ESL for Computational Linguistics A Beam-Search Decoder for Grammatical Error Correction Daniel Dahlmeier and Hwee Tou-search decoder for grammatical error correction. The decoder iteratively generates new hypothesis corrections

  18. MeSH Speller + askMEDLINE: auto-completes MeSH terms then searches MEDLINE/PubMed via free-text, natural language queries.

    PubMed

    Fontelo, Paul; Liu, Fang; Ackerman, Michael

    2005-01-01

    Medical terminology is challenging even for healthcare personnel. Spelling errors can make searching MEDLINE/PubMed ineffective. We developed a utility that provides MeSH term and Specialist Lexicon Vocabulary suggestions as it is typed on a search page. The correctly spelled term can be incorporated into a free-text, natural language search or used as a clinical queries search. PMID:16779244

  19. Integrating Learner Corpora and Natural Language Processing: A Crucial Step towards Reconciling Technological Sophistication and Pedagogical Effectiveness

    ERIC Educational Resources Information Center

    Granger, Sylviane; Kraif, Olivier; Ponton, Claude; Antoniadis, Georges; Zampa, Virginie

    2007-01-01

    Learner corpora, electronic collections of spoken or written data from foreign language learners, offer unparalleled access to many hitherto uncovered aspects of learner language, particularly in their error-tagged format. This article aims to demonstrate the role that the learner corpus can play in CALL, particularly when used in conjunction with…

  20. The Acquisition of Written Language: Response and Revision. Writing Research: Multidisciplinary Inquiries into the Nature of Writing Series.

    ERIC Educational Resources Information Center

    Freedman, Sarah Warshauer, Ed.

    Viewing writing as both a form of language learning and an intellectual skill, this book presents essays on how writers acquire trusted inner voices and the roles schools and teachers can play in helping student writers in the learning process. The essays in the book focus on one of three topics: the language of instruction and how response and…

  1. Language, Gesture, and Space.

    ERIC Educational Resources Information Center

    Emmorey, Karen, Ed.; Reilly, Judy S., Ed.

    A collection of papers addresses a variety of issues regarding the nature and structure of sign language, gesture, and gesture systems. Articles include: "Theoretical Issues Relating Language, Gesture, and Space: An Overview" (Karen Emmorey, Judy S. Reilly); "Real, Surrogate, and Token Space: Grammatical Consequences in ASL American Sign Language"…

  2. Creativity, Grammar and the Language Teacher.

    ERIC Educational Resources Information Center

    Di Pietro, Robert J.

    1971-01-01

    Aspects of language instruction which "derive from the nature of language itself" are discussed in this study. The notion that language teachers should teach grammar exclusively is disputed. This position is based on the following generalizations presented in an analysis of the nature of grammar and language: (1) language comprises more than what…

  3. A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools

    PubMed Central

    2012-01-01

    Background We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. Results Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. Conclusions The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications. PMID:22901054

  4. Common data model for natural language processing based on two existing standard information models: CDA+GrAF.

    PubMed

    Meystre, Stéphane M; Lee, Sanghoon; Jung, Chai Young; Chevrier, Raphaël D

    2012-08-01

    An increasing need for collaboration and resources sharing in the Natural Language Processing (NLP) research and development community motivates efforts to create and share a common data model and a common terminology for all information annotated and extracted from clinical text. We have combined two existing standards: the HL7 Clinical Document Architecture (CDA), and the ISO Graph Annotation Format (GrAF; in development), to develop such a data model entitled "CDA+GrAF". We experimented with several methods to combine these existing standards, and eventually selected a method wrapping separate CDA and GrAF parts in a common standoff annotation (i.e., separate from the annotated text) XML document. Two use cases, clinical document sections, and the 2010 i2b2/VA NLP Challenge (i.e., problems, tests, and treatments, with their assertions and relations), were used to create examples of such standoff annotation documents, and were successfully validated with the XML schemata provided with both standards. We developed a tool to automatically translate annotation documents from the 2010 i2b2/VA NLP Challenge format to GrAF, and automatically generated 50 annotation documents using this tool, all successfully validated. Finally, we adapted the XSL stylesheet provided with HL7 CDA to allow viewing annotation XML documents in a web browser, and plan to adapt existing tools for translating annotation documents between CDA+GrAF and the UIMA and GATE frameworks. This common data model may ease directly comparing NLP tools and applications, combining their output, transforming and "translating" annotations between different NLP applications, and eventually "plug-and-play" of different modules in NLP applications. PMID:22197801

  5. The Common Alerting Protocol (CAP) and Emergency Data Exchange Language (EDXL) - Application in Early Warning Systems for Natural Hazard

    NASA Astrophysics Data System (ADS)

    Lendholt, Matthias; Hammitzsch, Martin; Wächter, Joachim

    2010-05-01

    The Common Alerting Protocol (CAP) [1] is an XML-based data format for exchanging public warnings and emergencies between alerting technologies. In conjunction with the Emergency Data Exchange Language (EDXL) Distribution Element (-DE) [2] these data formats can be used for warning message dissemination in early warning systems for natural hazards. Application took place in the DEWS (Distance Early Warning System) [3] project where CAP serves as central message format containing both human readable warnings and structured data for automatic processing by message receivers. In particular the spatial reference capabilities are of paramount importance both in CAP and EDXL. Affected areas are addressable via geo codes like HASC (Hierarchical Administrative Subdivision Codes) [4] or UN/LOCODE [5] but also with arbitrary polygons that can be directly generated out of GML [6]. For each affected area standardized criticality values (urgency, severity and certainty) have to be set but also application specific key-value-pairs like estimated time of arrival or maximum inundation height can be specified. This enables - together with multilingualism, message aggregation and message conversion for different dissemination channels - the generation of user-specific tailored warning messages. [1] CAP, http://www.oasis-emergency.org/cap [2] EDXL-DE, http://docs.oasis-open.org/emergency/edxl-de/v1.0/EDXL-DE_Spec_v1.0.pdf [3] DEWS, http://www.dews-online.org [4] HASC, "Administrative Subdivisions of Countries: A Comprehensive World Reference, 1900 Through 1998" ISBN 0-7864-0729-8 [5] UN/LOCODE, http://www.unece.org/cefact/codesfortrade/codes_index.htm [6] GML, http://www.opengeospatial.org/standards/gml

  6. Improving performance of natural language processing part-of-speech tagging on clinical narratives through domain adaptation

    PubMed Central

    Ferraro, Jeffrey P; Daumé, Hal; DuVall, Scott L; Chapman, Wendy W; Harkema, Henk; Haug, Peter J

    2013-01-01

    Objective Natural language processing (NLP) tasks are commonly decomposed into subtasks, chained together to form processing pipelines. The residual error produced in these subtasks propagates, adversely affecting the end objectives. Limited availability of annotated clinical data remains a barrier to reaching state-of-the-art operating characteristics using statistically based NLP tools in the clinical domain. Here we explore the unique linguistic constructions of clinical texts and demonstrate the loss in operating characteristics when out-of-the-box part-of-speech (POS) tagging tools are applied to the clinical domain. We test a domain adaptation approach integrating a novel lexical-generation probability rule used in a transformation-based learner to boost POS performance on clinical narratives. Methods Two target corpora from independent healthcare institutions were constructed from high frequency clinical narratives. Four leading POS taggers with their out-of-the-box models trained from general English and biomedical abstracts were evaluated against these clinical corpora. A high performing domain adaptation method, Easy Adapt, was compared to our newly proposed method ClinAdapt. Results The evaluated POS taggers drop in accuracy by 8.5–15% when tested on clinical narratives. The highest performing tagger reports an accuracy of 88.6%. Domain adaptation with Easy Adapt reports accuracies of 88.3–91.0% on clinical texts. ClinAdapt reports 93.2–93.9%. Conclusions ClinAdapt successfully boosts POS tagging performance through domain adaptation requiring a modest amount of annotated clinical data. Improving the performance of critical NLP subtasks is expected to reduce pipeline error propagation leading to better overall results on complex processing tasks. PMID:23486109

  7. Advanced computer languages

    SciTech Connect

    Bryce, H.

    1984-05-03

    If software is to become an equal partner in the so-called fifth generation of computers (which of course it must), programming languages and the human interface will need to clear some high hurdles. Again, the solutions being sought turn to cerebral emulation: here, the way that human beings understand language. The result would be natural or English-like languages that would allow a person to communicate with a computer much as he or she does with another person. In the discussion the authors look at fourth level languages and fifth level languages, used in meeting the goal of AI. The higher level languages aim to be nonprocedural. Applications of LISP and Forth to natural language interfaces are described, as well as programs such as the natural link technology package, written in C.

  8. Psycholinguistic studies on bilingualism frequently explore the nature of the relationship between first and second language knowledge at both the conceptual level and in the lexica (e.g., Basnight-Brown & Altarriba 2007; Dijkstra &

    E-print Network

    Habib, Ayman

    Psycholinguistic studies on bilingualism frequently explore the nature of the relationship between-language/cross-dialect paradigm. Independent groups of English (L1)-Māori (L2) bilingual New Zealanders participated in a short

  9. Natural Language Spatial Reasoning

    E-print Network

    Tellex, Stefanie

    , water, until, always, away, public, something, fact, less, through, far, put, head, think, called, set, looked, ever, become, best, need, within, felt, along, children, saw, church, light, power, least, family, taken, anything, field, having, seen, word, car, experience, I'm, money, real

  10. Study and examination regulations for English language Master's degree course "Advanced Materials" offered by the Faculties for Natural Science, Engineering

    E-print Network

    Pfeifer, Holger

    1 Study and examination regulations for English language Master's degree course "Advanced Materials I. General Regulations § 1 Applicability § 2 Course objectives, academic degrees § 3 Commencement Organisation of module examinations § 11 Related study courses § 12 Regulations on module Master's thesis

  11. Whole Language Strategies for ESL Students. Language and Literacy Series.

    ERIC Educational Resources Information Center

    Heald-Taylor, Gail

    This handbook outlines learning strategies in language arts for children in kindergarten to third grade learning English as a second language (ESL). They are designed for the Whole Language or Natural Approach. Although reading and writing are the key language components emphasized, listening, speaking, drama, and visual arts activities have been…

  12. Programming for the Language Laboratory.

    ERIC Educational Resources Information Center

    Turner, John D., Ed.

    The present book is an attempt to stimulate thinking on the nature of the problems involved in writing material for language laboratory use in relation to the teaching of five languages widely taught in Britain today. All the contributors to this volume are language teachers currently using the language laboratory in their work. The editor notes…

  13. Cultural Perspectives Toward Language Learning

    ERIC Educational Resources Information Center

    Lin, Li-Li

    2008-01-01

    Cultural conflicts may be derived from using inappropriate language. Appropriate linguistic-pragmatic competence may also be produced by providing various and multicultural backgrounds. Culture and language are linked together naturally, unconsciously, and closely in daily social lives. Culture affects language and language affects culture through…

  14. The Tao of Whole Language.

    ERIC Educational Resources Information Center

    Zola, Meguido

    1989-01-01

    Uses the philosophy of Taoism as a metaphor in describing the whole language approach to language arts instruction. The discussion covers the key principles that inform the whole language approach, the resulting holistic nature of language programs, and the role of the teacher in this approach. (16 references) (CLB)

  15. Social Network Development, Language Use, and Language Acquisition during Study Abroad: Arabic Language Learners' Perspectives

    ERIC Educational Resources Information Center

    Dewey, Dan P.; Belnap, R. Kirk; Hillstrom, Rebecca

    2013-01-01

    Language learners and educators have subscribed to the belief that those who go abroad will have many opportunities to use the target language and will naturally become proficient. They also assume that language learners will develop relationships with native speakers allowing them to use the language and become more fluent, an assumption…

  16. Coh-metrix: analysis of text on cohesion and language.

    PubMed

    Graesser, Arthur C; McNamara, Danielle S; Louwerse, Max M; Cai, Zhiqiang

    2004-05-01

    Advances in computational linguistics and discourse processing have made it possible to automate many language- and text-processing mechanisms. We have developed a computer tool called Coh-Metrix, which analyzes texts on over 200 measures of cohesion, language, and readability. Its modules use lexicons, part-of-speech classifiers, syntactic parsers, templates, corpora, latent semantic analysis, and other components that are widely used in computational linguistics. After the user enters an English text, Coh-Metrix returns measures requested by the user. In addition, a facility allows the user to store the results of these analyses in data files (such as Text, Excel, and SPSS). Standard text readability formulas scale texts on difficulty by relying on word length and sentence length, whereas Coh-Metrix is sensitive to cohesion relations, world knowledge, and language and discourse characteristics. PMID:15354684

  17. Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 190–198, Prague, June 2007. © 2007 Association for Computational Linguistics

    E-print Network

    Linguistics Towards Robust Unsupervised Personal Name Disambiguation Ying Chen Center for Spoken Language University of Colorado at Boulder James.Martin@colorado.edu Abstract The increasing use of large open presented here, therefore, addresses the problem of automatically problem of automatically separating sets

  18. Introducing a gender-neutral pronoun in a natural gender language: the influence of time on attitudes and behavior

    PubMed Central

    Gustafsson Sendén, Marie; Bäck, Emma A.; Lindqvist, Anna

    2015-01-01

    The implementation of gender-fair language is often associated with negative reactions and hostile attacks on people who propose a change. This was also the case in Sweden in 2012 when a third gender-neutral pronoun hen was proposed as an addition to the already existing Swedish pronouns for she (hon) and he (han). The pronoun hen can be used both generically, when gender is unknown or irrelevant, and as a transgender pronoun for people who categorize themselves outside the gender dichotomy. In this article we review the process from 2012 to 2015. No other language has so far added a third gender-neutral pronoun, existing in parallel with two gendered pronouns, that has actually reached the broader population of language users. This makes the situation in Sweden unique. We present data on attitudes toward hen during the past 4 years and analyze how time is associated with the attitudes in the process of introducing hen to the Swedish language. In 2012 the majority of the Swedish population was negative toward the word, but already in 2014 there was a significant shift to more positive attitudes. Time was one of the strongest predictors for attitudes also when other relevant factors were controlled for. The actual use of the word also increased, although to a lesser extent than the attitudes shifted. We conclude that new words challenging the binary gender system evoke hostile and negative reactions, but also that attitudes can normalize rather quickly. We see this finding as very positive and hope it could motivate language amendments and initiatives for gender-fair language, although the first responses may be negative.

  19. Grounding language models in spatiotemporal context

    E-print Network

    Roy, Brandon C.

    Natural language is rich and varied, but also highly structured. The rules of grammar are a primary source of linguistic regularity, but there are many other factors that govern patterns of language use. Language models ...

  20. Description of a Rule-based System for the i2b2 Challenge in Natural Language Processing for Clinical Data

    Microsoft Academic Search

    KIMBERLY M. KOWALSKI; ROBERT J. TAYLOR

    Abstract The Obesity Challenge, sponsored by Informatics for Integrating Biology and the Bedside (i2b2), a National Center for Biomedical Computing, asked participants to build software systems that could “read” a patient’s clinical discharge summary and replicate the judgments of physicians in evaluating presence or absence of obesity and 15 comorbidities. The authors describe their methodology and discuss the results of applying Lockheed Martin’s rule-based natural language processing (NLP) capability, ClinREAD. We tailored