ERIC Educational Resources Information Center
Chowdhury, Gobinda G.
2003-01-01
Discusses issues related to natural language processing, including theoretical developments; natural language understanding; tools and techniques; natural language text processing systems; abstracting; information extraction; information retrieval; interfaces; software; Internet, Web, and digital library applications; machine translation for…
Natural Language Processing: Toward Large-Scale, Robust Systems.
ERIC Educational Resources Information Center
Haas, Stephanie W.
1996-01-01
Natural language processing (NLP) is concerned with getting computers to do useful things with natural language. Major applications include machine translation, text generation, information retrieval, and natural language interfaces. Reviews important developments since 1987 that have led to advances in NLP; current NLP applications; and problems…
ERIC Educational Resources Information Center
Snyder, Robin M.
2015-01-01
In 2014, in conjunction with doing research in natural language processing and attending a global conference on computational linguistics, the author decided to learn a new foreign language, Greek, that uses a non-English character set. This paper/session will present/discuss an overview of the current state of natural language processing and…
Survey of Natural Language Processing Techniques in Bioinformatics.
Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling
2015-01-01
Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.
An Overview of Computer-Based Natural Language Processing.
ERIC Educational Resources Information Center
Gevarter, William B.
Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial…
Applying language technology to nursing documents: pros and cons with a focus on ethics.
Suominen, Hanna; Lehtikunnas, Tuija; Back, Barbro; Karsten, Helena; Salakoski, Tapio; Salanterä, Sanna
2007-10-01
The present study discusses ethics in building and using applications based on natural language processing in electronic nursing documentation. Specifically, we first focus on the question of how patient confidentiality can be ensured in developing language technology for the nursing documentation domain. Then, we identify and theoretically analyze the ethical outcomes which arise when using natural language processing to support clinical judgement and decision-making. In total, we put forward and justify 10 claims related to ethics in applying language technology to nursing documents. A review of recent scientific articles related to ethics in electronic patient records or in the utilization of large databases was conducted. Then, the results were compared with ethical guidelines for nurses and the Finnish legislation covering health care and processing of personal data. Finally, the practical experiences of the authors in applying the methods of natural language processing to nursing documents were appended. Patient records supplemented with natural language processing capabilities may help nurses give better, more efficient and more individualized care for their patients. In addition, language technology may facilitate patients' possibility to receive truthful information about their health and improve the nature of narratives. Because of these benefits, research about the use of language technology in narratives should be encouraged. In contrast, privacy-sensitive health care documentation brings specific ethical concerns and difficulties to the natural language processing of nursing documents. Therefore, when developing natural language processing tools, patient confidentiality must be ensured. While using the tools, health care personnel should always be responsible for the clinical judgement and decision-making. One should also consider that the use of language technology in nursing narratives may threaten patients' rights by using documentation collected for other purposes. Applying language technology to nursing documents may, on the one hand, contribute to the quality of care, but, on the other hand, threaten patient confidentiality. As an overall conclusion, natural language processing of nursing documents holds the promise of great benefits if the potential risks are taken into consideration.
Emerging Approach of Natural Language Processing in Opinion Mining: A Review
NASA Astrophysics Data System (ADS)
Kim, Tai-Hoon
Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages. This paper outlines a framework to use computer and natural language techniques for various levels of learners to learn foreign languages in Computer-based Learning environment. We propose some ideas for using the computer as a practical tool for learning foreign language where the most of courseware is generated automatically. We then describe how to build Computer Based Learning tools, discuss its effectiveness, and conclude with some possibilities using on-line resources.
Semi-Automated Methods for Refining a Domain-Specific Terminology Base
2011-02-01
only as a resource for written and oral translation, but also for Natural Language Processing ( NLP ) applications, text retrieval, document indexing...Natural Language Processing ( NLP ) applications, text retrieval, document indexing, and other knowledge management tasks. The objective of this...also for Natural Language Processing ( NLP ) applications, text retrieval (1), document indexing, and other knowledge management tasks. The National
Integrated Intelligence: Robot Instruction via Interactive Grounded Learning
2016-02-14
ADDRESS (ES) U.S. Army Research Office P.O. Box 12211 Research Triangle Park, NC 27709-2211 Robotics; Natural Language Processing ; Grounded Language ...Logical Forms for Referring Expression Generation, Emperical Methods in Natural Language Processing (EMNLP). 18-OCT-13, . : , Tom Kwiatkowska, Eunsol...Choi, Yoav Artzi, Luke Zettlemoyer. Scaling Semantic Parsers with On-the-fly Ontology Matching, Emperical Methods in Natural Langauge Processing
Neural Network Computing and Natural Language Processing.
ERIC Educational Resources Information Center
Borchardt, Frank
1988-01-01
Considers the application of neural network concepts to traditional natural language processing and demonstrates that neural network computing architecture can: (1) learn from actual spoken language; (2) observe rules of pronunciation; and (3) reproduce sounds from the patterns derived by its own processes. (Author/CB)
Natural Language Processing in Game Studies Research: An Overview
ERIC Educational Resources Information Center
Zagal, Jose P.; Tomuro, Noriko; Shepitsen, Andriy
2012-01-01
Natural language processing (NLP) is a field of computer science and linguistics devoted to creating computer systems that use human (natural) language as input and/or output. The authors propose that NLP can also be used for game studies research. In this article, the authors provide an overview of NLP and describe some research possibilities…
Paradigms of Evaluation in Natural Language Processing: Field Linguistics for Glass Box Testing
ERIC Educational Resources Information Center
Cohen, Kevin Bretonnel
2010-01-01
Although software testing has been well-studied in computer science, it has received little attention in natural language processing. Nonetheless, a fully developed methodology for glass box evaluation and testing of language processing applications already exists in the field methods of descriptive linguistics. This work lays out a number of…
Integration of Speech and Natural Language
1988-04-01
major activities: • Development of the syntax and semantics components for natural language processing. • Integration of the developed syntax and...evaluating the performance of speech recognition algonthms developed K» under the Strategic Computing Program. grs Our work on natural language processing...included the developement of a grammar (syntax) that uses the Uiuficanon gnmmaj formaMsm (an augmented context free formalism). The Unification
Perceptual Decoding Processes for Language in a Visual Mode and for Language in an Auditory Mode.
ERIC Educational Resources Information Center
Myerson, Rosemarie Farkas
The purpose of this paper is to gain insight into the nature of the reading process through an understanding of the general nature of sensory processing mechanisms which reorganize and restructure input signals for central recognition, and an understanding of how the grammar of the language functions in defining the set of possible sentences in…
A Large-Scale Analysis of Variance in Written Language.
Johns, Brendan T; Jamieson, Randall K
2018-01-22
The collection of very large text sources has revolutionized the study of natural language, leading to the development of several models of language learning and distributional semantics that extract sophisticated semantic representations of words based on the statistical redundancies contained within natural language (e.g., Griffiths, Steyvers, & Tenenbaum, ; Jones & Mewhort, ; Landauer & Dumais, ; Mikolov, Sutskever, Chen, Corrado, & Dean, ). The models treat knowledge as an interaction of processing mechanisms and the structure of language experience. But language experience is often treated agnostically. We report a distributional semantic analysis that shows written language in fiction books varies appreciably between books from the different genres, books from the same genre, and even books written by the same author. Given that current theories assume that word knowledge reflects an interaction between processing mechanisms and the language environment, the analysis shows the need for the field to engage in a more deliberate consideration and curation of the corpora used in computational studies of natural language processing. Copyright © 2018 Cognitive Science Society, Inc.
Natural language processing and the Now-or-Never bottleneck.
Gómez-Rodríguez, Carlos
2016-01-01
Researchers, motivated by the need to improve the efficiency of natural language processing tools to handle web-scale data, have recently arrived at models that remarkably match the expected features of human language processing under the Now-or-Never bottleneck framework. This provides additional support for said framework and highlights the research potential in the interaction between applied computational linguistics and cognitive science.
Development and Evaluation of a Thai Learning System on the Web Using Natural Language Processing.
ERIC Educational Resources Information Center
Dansuwan, Suyada; Nishina, Kikuko; Akahori, Kanji; Shimizu, Yasutaka
2001-01-01
Describes the Thai Learning System, which is designed to help learners acquire the Thai word order system. The system facilitates the lessons on the Web using HyperText Markup Language and Perl programming, which interfaces with natural language processing by means of Prolog. (Author/VWL)
Automatic Item Generation via Frame Semantics: Natural Language Generation of Math Word Problems.
ERIC Educational Resources Information Center
Deane, Paul; Sheehan, Kathleen
This paper is an exploration of the conceptual issues that have arisen in the course of building a natural language generation (NLG) system for automatic test item generation. While natural language processing techniques are applicable to general verbal items, mathematics word problems are particularly tractable targets for natural language…
DOE Office of Scientific and Technical Information (OSTI.GOV)
McHale, M.L.
The field of artificial Intelligence strives to produce computer programs that exhibit intelligent behavior. One of the areas of interest is the processing of natural language. This report discusses the role of the computer language PROLOG in Natural Language Processing (NLP) both from theoretic and pragmatic viewpoints. The reasons for using PROLOG for NLP are numerous. First, linguists can write natural-language grammars almost directly as PROLOG programs; this allows fast-prototyping of NLP systems and facilitates analysis of NLP theories. Second, semantic representations of natural-language texts that use logic formalisms are readily produced in PROLOG because of PROLOG's logical foundations. Third,more » PROLOG's built-in inferencing mechanisms are often sufficient for inferences on the logical forms produced by NLPs. Fourth, the logical, declarative nature of PROLOG may make it the language of choice for parallel computing systems. Finally, the fact that PROLOG has a de facto standard (Edinburgh) makes the porting of code from one computer system to another virtually trouble free. Perhaps the strongest tie one could make between NLP and PROLOG was stated by John Stuart Mill in his inaugural Address at St. Andrews: The structure of every sentence is a lesson in logic.« less
Clinical Natural Language Processing in languages other than English: opportunities and challenges.
Névéol, Aurélie; Dalianis, Hercules; Velupillai, Sumithra; Savova, Guergana; Zweigenbaum, Pierre
2018-03-30
Natural language processing applied to clinical text or aimed at a clinical outcome has been thriving in recent years. This paper offers the first broad overview of clinical Natural Language Processing (NLP) for languages other than English. Recent studies are summarized to offer insights and outline opportunities in this area. We envision three groups of intended readers: (1) NLP researchers leveraging experience gained in other languages, (2) NLP researchers faced with establishing clinical text processing in a language other than English, and (3) clinical informatics researchers and practitioners looking for resources in their languages in order to apply NLP techniques and tools to clinical practice and/or investigation. We review work in clinical NLP in languages other than English. We classify these studies into three groups: (i) studies describing the development of new NLP systems or components de novo, (ii) studies describing the adaptation of NLP architectures developed for English to another language, and (iii) studies focusing on a particular clinical application. We show the advantages and drawbacks of each method, and highlight the appropriate application context. Finally, we identify major challenges and opportunities that will affect the impact of NLP on clinical practice and public health studies in a context that encompasses English as well as other languages.
Video to Text (V2T) in Wide Area Motion Imagery
2015-09-01
microtext) or a document (e.g., using Sphinx or Apache NLP ) as an automated approach [102]. Previous work in natural language full-text searching...language processing ( NLP ) based module. The heart of the structured text processing module includes the following seven key word banks...Features Tracker MHT Multiple Hypothesis Tracking MIL Multiple Instance Learning NLP Natural Language Processing OAB Online AdaBoost OF Optic Flow
ERIC Educational Resources Information Center
Jarman, Jay
2011-01-01
This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…
Dutta, Sayon; Long, William J; Brown, David F M; Reisner, Andrew T
2013-08-01
As use of radiology studies increases, there is a concurrent increase in incidental findings (eg, lung nodules) for which the radiologist issues recommendations for additional imaging for follow-up. Busy emergency physicians may be challenged to carefully communicate recommendations for additional imaging not relevant to the patient's primary evaluation. The emergence of electronic health records and natural language processing algorithms may help address this quality gap. We seek to describe recommendations for additional imaging from our institution and develop and validate an automated natural language processing algorithm to reliably identify recommendations for additional imaging. We developed a natural language processing algorithm to detect recommendations for additional imaging, using 3 iterative cycles of training and validation. The third cycle used 3,235 radiology reports (1,600 for algorithm training and 1,635 for validation) of discharged emergency department (ED) patients from which we determined the incidence of discharge-relevant recommendations for additional imaging and the frequency of appropriate discharge documentation. The test characteristics of the 3 natural language processing algorithm iterations were compared, using blinded chart review as the criterion standard. Discharge-relevant recommendations for additional imaging were found in 4.5% (95% confidence interval [CI] 3.5% to 5.5%) of ED radiology reports, but 51% (95% CI 43% to 59%) of discharge instructions failed to note those findings. The final natural language processing algorithm had 89% (95% CI 82% to 94%) sensitivity and 98% (95% CI 97% to 98%) specificity for detecting recommendations for additional imaging. For discharge-relevant recommendations for additional imaging, sensitivity improved to 97% (95% CI 89% to 100%). Recommendations for additional imaging are common, and failure to document relevant recommendations for additional imaging in ED discharge instructions occurs frequently. The natural language processing algorithm's performance improved with each iteration and offers a promising error-prevention tool. Copyright © 2013 American College of Emergency Physicians. Published by Mosby, Inc. All rights reserved.
Storytelling, behavior planning, and language evolution in context.
McBride, Glen
2014-01-01
An attempt is made to specify the structure of the hominin bands that began steps to language. Storytelling could evolve without need for language yet be strongly subject to natural selection and could provide a major feedback process in evolving language. A storytelling model is examined, including its effects on the evolution of consciousness and the possible timing of language evolution. Behavior planning is presented as a model of language evolution from storytelling. The behavior programming mechanism in both directions provide a model of creating and understanding behavior and language. Culture began with societies, then family evolution, family life in troops, but storytelling created a culture of experiences, a final step in the long process of achieving experienced adults by natural selection. Most language evolution occurred in conversations where evolving non-verbal feedback ensured mutual agreements on understanding. Natural language evolved in conversations with feedback providing understanding of changes.
Storytelling, behavior planning, and language evolution in context
McBride, Glen
2014-01-01
An attempt is made to specify the structure of the hominin bands that began steps to language. Storytelling could evolve without need for language yet be strongly subject to natural selection and could provide a major feedback process in evolving language. A storytelling model is examined, including its effects on the evolution of consciousness and the possible timing of language evolution. Behavior planning is presented as a model of language evolution from storytelling. The behavior programming mechanism in both directions provide a model of creating and understanding behavior and language. Culture began with societies, then family evolution, family life in troops, but storytelling created a culture of experiences, a final step in the long process of achieving experienced adults by natural selection. Most language evolution occurred in conversations where evolving non-verbal feedback ensured mutual agreements on understanding. Natural language evolved in conversations with feedback providing understanding of changes. PMID:25360123
Three Dimensions of Reproducibility in Natural Language Processing.
Cohen, K Bretonnel; Xia, Jingbo; Zweigenbaum, Pierre; Callahan, Tiffany J; Hargraves, Orin; Goss, Foster; Ide, Nancy; Névéol, Aurélie; Grouin, Cyril; Hunter, Lawrence E
2018-05-01
Despite considerable recent attention to problems with reproducibility of scientific research, there is a striking lack of agreement about the definition of the term. That is a problem, because the lack of a consensus definition makes it difficult to compare studies of reproducibility, and thus to have even a broad overview of the state of the issue in natural language processing. This paper proposes an ontology of reproducibility in that field. Its goal is to enhance both future research and communication about the topic, and retrospective meta-analyses. We show that three dimensions of reproducibility, corresponding to three kinds of claims in natural language processing papers, can account for a variety of types of research reports. These dimensions are reproducibility of a conclusion , of a finding , and of a value. Three biomedical natural language processing papers by the authors of this paper are analyzed with respect to these dimensions.
Thought beyond language: neural dissociation of algebra and natural language.
Monti, Martin M; Parsons, Lawrence M; Osherson, Daniel N
2012-08-01
A central question in cognitive science is whether natural language provides combinatorial operations that are essential to diverse domains of thought. In the study reported here, we addressed this issue by examining the role of linguistic mechanisms in forging the hierarchical structures of algebra. In a 3-T functional MRI experiment, we showed that processing of the syntax-like operations of algebra does not rely on the neural mechanisms of natural language. Our findings indicate that processing the syntax of language elicits the known substrate of linguistic competence, whereas algebraic operations recruit bilateral parietal brain regions previously implicated in the representation of magnitude. This double dissociation argues against the view that language provides the structure of thought across all cognitive domains.
Extraction of UMLS® Concepts Using Apache cTAKES™ for German Language.
Becker, Matthias; Böckmann, Britta
2016-01-01
Automatic information extraction of medical concepts and classification with semantic standards from medical reports is useful for standardization and for clinical research. This paper presents an approach for an UMLS concept extraction with a customized natural language processing pipeline for German clinical notes using Apache cTAKES. The objectives are, to test the natural language processing tool for German language if it is suitable to identify UMLS concepts and map these with SNOMED-CT. The German UMLS database and German OpenNLP models extended the natural language processing pipeline, so the pipeline can normalize to domain ontologies such as SNOMED-CT using the German concepts. For testing, the ShARe/CLEF eHealth 2013 training dataset translated into German was used. The implemented algorithms are tested with a set of 199 German reports, obtaining a result of average 0.36 F1 measure without German stemming, pre- and post-processing of the reports.
Buckley, Julliette M; Coopey, Suzanne B; Sharko, John; Polubriaginof, Fernanda; Drohan, Brian; Belli, Ahmet K; Kim, Elizabeth M H; Garber, Judy E; Smith, Barbara L; Gadd, Michele A; Specht, Michelle C; Roche, Constance A; Gudewicz, Thomas M; Hughes, Kevin S
2012-01-01
The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. APPROACH AND PROCEDURE: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. We have demonstrated how a large body of free text medical information such as seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task.
Analyzing Discourse Processing Using a Simple Natural Language Processing Tool
ERIC Educational Resources Information Center
Crossley, Scott A.; Allen, Laura K.; Kyle, Kristopher; McNamara, Danielle S.
2014-01-01
Natural language processing (NLP) provides a powerful approach for discourse processing researchers. However, there remains a notable degree of hesitation by some researchers to consider using NLP, at least on their own. The purpose of this article is to introduce and make available a "simple" NLP (SiNLP) tool. The overarching goal of…
Deciphering the language of nature: cryptography, secrecy, and alterity in Francis Bacon.
Clody, Michael C
2011-01-01
The essay argues that Francis Bacon's considerations of parables and cryptography reflect larger interpretative concerns of his natural philosophic project. Bacon describes nature as having a language distinct from those of God and man, and, in so doing, establishes a central problem of his natural philosophy—namely, how can the language of nature be accessed through scientific representation? Ultimately, Bacon's solution relies on a theory of differential and duplicitous signs that conceal within them the hidden voice of nature, which is best recognized in the natural forms of efficient causality. The "alphabet of nature"—those tables of natural occurrences—consequently plays a central role in his program, as it renders nature's language susceptible to a process and decryption that mirrors the model of the bilateral cipher. It is argued that while the writing of Bacon's natural philosophy strives for literality, its investigative process preserves a space for alterity within scientific representation, that is made accessible to those with the interpretative key.
Using Language Learning Conditions in Mathematics. PEN 68.
ERIC Educational Resources Information Center
Stoessiger, Rex
This pamphlet reports on a project in Tasmania exploring whether the "natural learning conditions" approach to language learning could be adapted for mathematics. The connections between language and mathematics, as well as the natural learning processes of language learning are described in the pamphlet. The project itself is…
An overview of computer-based natural language processing
NASA Technical Reports Server (NTRS)
Gevarter, W. B.
1983-01-01
Computer based Natural Language Processing (NLP) is the key to enabling humans and their computer based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.
The Promise of NLP and Speech Processing Technologies in Language Assessment
ERIC Educational Resources Information Center
Chapelle, Carol A.; Chung, Yoo-Ree
2010-01-01
Advances in natural language processing (NLP) and automatic speech recognition and processing technologies offer new opportunities for language testing. Despite their potential uses on a range of language test item types, relatively little work has been done in this area, and it is therefore not well understood by test developers, researchers or…
Boguslav, Mayla; Cohen, Kevin Bretonnel
2017-01-01
Human-annotated data is a fundamental part of natural language processing system development and evaluation. The quality of that data is typically assessed by calculating the agreement between the annotators. It is widely assumed that this agreement between annotators is the upper limit on system performance in natural language processing: if humans can't agree with each other about the classification more than some percentage of the time, we don't expect a computer to do any better. We trace the logical positivist roots of the motivation for measuring inter-annotator agreement, demonstrate the prevalence of the widely-held assumption about the relationship between inter-annotator agreement and system performance, and present data that suggest that inter-annotator agreement is not, in fact, an upper bound on language processing system performance.
CONSTRUCT: In Search of a Theory of Meaning. Technical Report No. 238.
ERIC Educational Resources Information Center
Smith, R. L.; And Others
A new language-processing system, CONSTRUCT, is described and defined as a question-answering system for elementary mathematical language using natural language input. The primary goal is said to be an attempt to reach a better understanding of the relationship between syntactic and semantic components of natural language. The "meaning…
Human-Level Natural Language Understanding: False Progress and Real Challenges
ERIC Educational Resources Information Center
Bignoli, Perrin G.
2013-01-01
The field of Natural Language Processing (NLP) focuses on the study of how utterances composed of human-level languages can be understood and generated. Typically, there are considered to be three intertwined levels of structure that interact to create meaning in language: syntax, semantics, and pragmatics. Not only is a large amount of…
Inferring heuristic classification hierarchies from natural language input
NASA Technical Reports Server (NTRS)
Hull, Richard; Gomez, Fernando
1993-01-01
A methodology for inferring hierarchies representing heuristic knowledge about the check out, control, and monitoring sub-system (CCMS) of the space shuttle launch processing system from natural language input is explained. Our method identifies failures explicitly and implicitly described in natural language by domain experts and uses those descriptions to recommend classifications for inclusion in the experts' heuristic hierarchies.
Linguistics and Information Science
ERIC Educational Resources Information Center
Montgomery, Christine A.
1972-01-01
This paper defines the relationship between linguistics and information science in terms of a common interest in natural language. The concept of a natural language information system is introduced as a framework for reviewing automated language processing efforts by computational linguists and information scientists. (96 references) (Author)
QATT: a Natural Language Interface for QPE. M.S. Thesis
NASA Technical Reports Server (NTRS)
White, Douglas Robert-Graham
1989-01-01
QATT, a natural language interface developed for the Qualitative Process Engine (QPE) system is presented. The major goal was to evaluate the use of a preexisting natural language understanding system designed to be tailored for query processing in multiple domains of application. The other goal of QATT is to provide a comfortable environment in which to query envisionments in order to gain insight into the qualitative behavior of physical systems. It is shown that the use of the preexisting system made possible the development of a reasonably useful interface in a few months.
BIT BY BIT: A Game Simulating Natural Language Processing in Computers
ERIC Educational Resources Information Center
Kato, Taichi; Arakawa, Chuichi
2008-01-01
BIT BY BIT is an encryption game that is designed to improve students' understanding of natural language processing in computers. Participants encode clear words into binary code using an encryption key and exchange them in the game. BIT BY BIT enables participants who do not understand the concept of binary numbers to perform the process of…
Eliminating Unpredictable Variation through Iterated Learning
ERIC Educational Resources Information Center
Smith, Kenny; Wonnacott, Elizabeth
2010-01-01
Human languages may be shaped not only by the (individual psychological) processes of language acquisition, but also by population-level processes arising from repeated language learning and use. One prevalent feature of natural languages is that they avoid unpredictable variation. The current work explores whether linguistic predictability might…
Automatic Requirements Specification Extraction from Natural Language (ARSENAL)
2014-10-01
designers, implementers) involved in the design of software systems. However, natural language descriptions can be informal, incomplete, imprecise...communication of technical descriptions between the various stakeholders (e.g., customers, designers, imple- menters) involved in the design of software systems...the accuracy of the natural language processing stage, the degree of automation, and robustness to noise. 1 2 Introduction Software systems operate in
Bibliography of Research in Natural Language Generation
1993-11-01
on 1397] Barbara J. Gross Focuing and description in Artifcial Intelligence (GWAI-88), Geseke, West natural language dialogues, In Joshi et al. (557...Proceedings of the Fifth Canadian Conference from information in a frame structure. Data and on Artificial Intelligence , pages Ŕ-24, London, Knowledge...generation workshops (IWNLGS, ENLGWS), natural language processing conferences (ANLP, TINLAP, SPEECH), artificial intelligence conferences (AAAI, SCA
Trombert-Paviot, B; Rodrigues, J M; Rogers, J E; Baud, R; van der Haring, E; Rassinoux, A M; Abrial, V; Clavel, L; Idir, H
2000-09-01
Generalised architecture for languages, encyclopedia and nomenclatures in medicine (GALEN) has developed a new generation of terminology tools based on a language independent model describing the semantics and allowing computer processing and multiple reuses as well as natural language understanding systems applications to facilitate the sharing and maintaining of consistent medical knowledge. During the European Union 4 Th. framework program project GALEN-IN-USE and later on within two contracts with the national health authorities we applied the modelling and the tools to the development of a new multipurpose coding system for surgical procedures named CCAM in a minority language country, France. On one hand, we contributed to a language independent knowledge repository and multilingual semantic dictionaries for multicultural Europe. On the other hand, we support the traditional process for creating a new coding system in medicine which is very much labour consuming by artificial intelligence tools using a medically oriented recursive ontology and natural language processing. We used an integrated software named CLAW (for classification workbench) to process French professional medical language rubrics produced by the national colleges of surgeons domain experts into intermediate dissections and to the Grail reference ontology model representation. From this language independent concept model representation, on one hand, we generate with the LNAT natural language generator controlled French natural language to support the finalization of the linguistic labels (first generation) in relation with the meanings of the conceptual system structure. On the other hand, the Claw classification manager proves to be very powerful to retrieve the initial domain experts rubrics list with different categories of concepts (second generation) within a semantic structured representation (third generation) bridge to the electronic patient record detailed terminology.
Rank and Sparsity in Language Processing
ERIC Educational Resources Information Center
Hutchinson, Brian
2013-01-01
Language modeling is one of many problems in language processing that have to grapple with naturally high ambient dimensions. Even in large datasets, the number of unseen sequences is overwhelmingly larger than the number of observed ones, posing clear challenges for estimation. Although existing methods for building smooth language models tend to…
Combining Machine Learning and Natural Language Processing to Assess Literary Text Comprehension
ERIC Educational Resources Information Center
Balyan, Renu; McCarthy, Kathryn S.; McNamara, Danielle S.
2017-01-01
This study examined how machine learning and natural language processing (NLP) techniques can be leveraged to assess the interpretive behavior that is required for successful literary text comprehension. We compared the accuracy of seven different machine learning classification algorithms in predicting human ratings of student essays about…
Verification Processes in Recognition Memory: The Role of Natural Language Mediators
ERIC Educational Resources Information Center
Marshall, Philip H.; Smith, Randolph A. S.
1977-01-01
The existence of verification processes in recognition memory was confirmed in the context of Adams' (Adams & Bray, 1970) closed-loop theory. Subjects' recognition was tested following a learning session. The expectation was that data would reveal consistent internal relationships supporting the position that natural language mediation plays…
Net-centric ACT-R-Based Cognitive Architecture with DEVS Unified Process
2011-04-01
effort has been spent in analyzing various forms of requirement specifications, viz, state-based, Natural Language based, UML-based, Rule- based, BPMN ...requirement specifications in one of the chosen formats such as BPMN , DoDAF, Natural Language Processing (NLP) based, UML- based, DSL or simply
NLPIR: A Theoretical Framework for Applying Natural Language Processing to Information Retrieval.
ERIC Educational Resources Information Center
Zhou, Lina; Zhang, Dongsong
2003-01-01
Proposes a theoretical framework called NLPIR that integrates natural language processing (NLP) into information retrieval (IR) based on the assumption that there exists representation distance between queries and documents. Discusses problems in traditional keyword-based IR, including relevance, and describes some existing NLP techniques.…
Generating and Executing Complex Natural Language Queries across Linked Data.
Hamon, Thierry; Mougin, Fleur; Grabar, Natalia
2015-01-01
With the recent and intensive research in the biomedical area, the knowledge accumulated is disseminated through various knowledge bases. Links between these knowledge bases are needed in order to use them jointly. Linked Data, SPARQL language, and interfaces in Natural Language question-answering provide interesting solutions for querying such knowledge bases. We propose a method for translating natural language questions in SPARQL queries. We use Natural Language Processing tools, semantic resources, and the RDF triples description. The method is designed on 50 questions over 3 biomedical knowledge bases, and evaluated on 27 questions. It achieves 0.78 F-measure on the test set. The method for translating natural language questions into SPARQL queries is implemented as Perl module available at http://search.cpan.org/ thhamon/RDF-NLP-SPARQLQuery.
Apprentissage naturel et apprentissage guide (Natural Learning and Guided Learning).
ERIC Educational Resources Information Center
Veronique, Daniel
1984-01-01
Although second language pedagogy has tended increasingly toward simulation, role-playing, and natural communication, it has not profited from existing research on natural learning in second languages. The emphasis should be on understanding how the processes of guided learning and natural learning differ, psychologically and sociologically, and…
Categorization of Survey Text Utilizing Natural Language Processing and Demographic Filtering
2017-09-01
SURVEY TEXT UTILIZING NATURAL LANGUAGE PROCESSING AND DEMOGRAPHIC FILTERING by Christine M. Cairoli September 2017 Thesis Advisor: Lyn...DATE September 2017 3. REPORT TYPE AND DATES COVERED Master’s thesis 4. TITLE AND SUBTITLE CATEGORIZATION OF SURVEY TEXT UTILIZING NATURAL...words) Thousands of Navy survey free text comments are overlooked every year because reading and interpreting comments is expensive, time consuming
Advances in natural language processing.
Hirschberg, Julia; Manning, Christopher D
2015-07-17
Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area. Copyright © 2015, American Association for the Advancement of Science.
Do neural nets learn statistical laws behind natural language?
Takahashi, Shuntaro; Tanaka-Ishii, Kumiko
2017-01-01
The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf's law and Heaps' law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks.
Do neural nets learn statistical laws behind natural language?
Takahashi, Shuntaro
2017-01-01
The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf’s law and Heaps’ law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf’s law and Heaps’ law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks. PMID:29287076
Friedman, Carol; Hripcsak, George; Shagina, Lyuda; Liu, Hongfang
1999-01-01
Objective: To design a document model that provides reliable and efficient access to clinical information in patient reports for a broad range of clinical applications, and to implement an automated method using natural language processing that maps textual reports to a form consistent with the model. Methods: A document model that encodes structured clinical information in patient reports while retaining the original contents was designed using the extensible markup language (XML), and a document type definition (DTD) was created. An existing natural language processor (NLP) was modified to generate output consistent with the model. Two hundred reports were processed using the modified NLP system, and the XML output that was generated was validated using an XML validating parser. Results: The modified NLP system successfully processed all 200 reports. The output of one report was invalid, and 199 reports were valid XML forms consistent with the DTD. Conclusions: Natural language processing can be used to automatically create an enriched document that contains a structured component whose elements are linked to portions of the original textual report. This integrated document model provides a representation where documents containing specific information can be accurately and efficiently retrieved by querying the structured components. If manual review of the documents is desired, the salient information in the original reports can also be identified and highlighted. Using an XML model of tagging provides an additional benefit in that software tools that manipulate XML documents are readily available. PMID:9925230
Natural language information retrieval in digital libraries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strzalkowski, T.; Perez-Carballo, J.; Marinescu, M.
In this paper we report on some recent developments in joint NYU and GE natural language information retrieval system. The main characteristic of this system is the use of advanced natural language processing to enhance the effectiveness of term-based document retrieval. The system is designed around a traditional statistical backbone consisting of the indexer module, which builds inverted index files from pre-processed documents, and a retrieval engine which searches and ranks the documents in response to user queries. Natural language processing is used to (1) preprocess the documents in order to extract content-carrying terms, (2) discover inter-term dependencies and buildmore » a conceptual hierarchy specific to the database domain, and (3) process user`s natural language requests into effective search queries. This system has been used in NIST-sponsored Text Retrieval Conferences (TREC), where we worked with approximately 3.3 GBytes of text articles including material from the Wall Street Journal, the Associated Press newswire, the Federal Register, Ziff Communications`s Computer Library, Department of Energy abstracts, U.S. Patents and the San Jose Mercury News, totaling more than 500 million words of English. The system have been designed to facilitate its scalability to deal with ever increasing amounts of data. In particular, a randomized index-splitting mechanism has been installed which allows the system to create a number of smaller indexes that can be independently and efficiently searched.« less
Conclusiveness of natural languages and recognition of images
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wojcik, Z.M.
1983-01-01
The conclusiveness is investigated using recognition processes and one-one correspondence between expressions of a natural language and graphs representing events. The graphs, as conceived in psycholinguistics, are obtained as a result of perception processes. It is possible to generate and process the graphs automatically, using computers and then to convert the resulting graphs into expressions of a natural language. Correctness and conclusiveness of the graphs and sentences are investigated using the fundamental condition for events representation processes. Some consequences of the conclusiveness are discussed, e.g. undecidability of arithmetic, human brain assymetry, correctness of statistical calculations and operations research. It ismore » suggested that the group theory should be imposed on mathematical models of any real system. Proof of the fundamental condition is also presented. 14 references.« less
First Language Acquisition and Teaching
ERIC Educational Resources Information Center
Cruz-Ferreira, Madalena
2011-01-01
"First language acquisition" commonly means the acquisition of a single language in childhood, regardless of the number of languages in a child's natural environment. Language acquisition is variously viewed as predetermined, wondrous, a source of concern, and as developing through formal processes. "First language teaching" concerns schooling in…
State of the Art of Natural Language Processing
1987-11-15
work of Chomsky , Hewlett-Packard, Generalized Phase Structure Grammar . D. Lunar, DARPA speech understanding, Schank’s Conceptual Dependency Theory...of computers that a machine which understood natural languages was highly desirable. It also was evident from the work of Chomsky * and others that...computers. ♦Noam Chomsky , Aspects of the Theory of Syntax (Cambridge, Mass.: MIT Press, 1965). -A- One of the earliest attempts at Natural Language
ERIC Educational Resources Information Center
Ziegler, Nicole; Meurers, Detmar; Rebuschat, Patrick; Ruiz, Simón; Moreno-Vega, José L.; Chinkina, Maria; Li, Wenjing; Grey, Sarah
2017-01-01
Despite the promise of research conducted at the intersection of computer-assisted language learning (CALL), natural language processing, and second language acquisition, few studies have explored the potential benefits of using intelligent CALL systems to deepen our understanding of the process and products of second language (L2) learning. The…
Sources of Difficulty in the Processing of Written Language. Report Series 4.3.
ERIC Educational Resources Information Center
Chafe, Wallace
Ease of language processing varies with the nature of the language involved. Ordinary spoken language is the easiest kind to produce and understand, while writing is a relatively new development. On thoughtful inspection, the readability of writing has shown itself to be a complex topic requiring insights from many academic disciplines and…
ERIC Educational Resources Information Center
Kamio, Yoko; Robins, Diana; Kelley, Elizabeth; Swainson, Brook; Fein, Deborah
2007-01-01
Although autism is associated with impaired language functions, the nature of semantic processing in high-functioning pervasive developmental disorders (HFPDD) without a history of early language delay has been debated. In this study, we aimed to examine whether the automatic lexical/semantic aspect of language is impaired or intact in these…
Artificial intelligence, expert systems, computer vision, and natural language processing
NASA Technical Reports Server (NTRS)
Gevarter, W. B.
1984-01-01
An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.
Revealing the Naturalization of Language and Literacy: The Common Sense of Text Complexity
ERIC Educational Resources Information Center
Newhouse, Erica H.
2017-01-01
This article illustrates the process and obstacles encountered when applying the Common Core's three-part model of determining text complexity to an urban literature text. This analysis revealed how the model privileges language and literacy practices that limit the range of texts used in classrooms through a process of naturalization and by…
Research at Yale in Natural Language Processing. Research Report #84.
ERIC Educational Resources Information Center
Schank, Roger C.
This report summarizes the capabilities of five computer programs at Yale that do automatic natural language processing as of the end of 1976. For each program an introduction to its overall intent is given, followed by the input/output, a short discussion of the research underlying the program, and a prognosis for future development. The programs…
Recurrent Artificial Neural Networks and Finite State Natural Language Processing.
ERIC Educational Resources Information Center
Moisl, Hermann
It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…
The Application of Natural Language Processing to Augmentative and Alternative Communication
ERIC Educational Resources Information Center
Higginbotham, D. Jeffery; Lesher, Gregory W.; Moulton, Bryan J.; Roark, Brian
2012-01-01
Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of next…
Construct Validity in TOEFL iBT Speaking Tasks: Insights from Natural Language Processing
ERIC Educational Resources Information Center
Kyle, Kristopher; Crossley, Scott A.; McNamara, Danielle S.
2016-01-01
This study explores the construct validity of speaking tasks included in the TOEFL iBT (e.g., integrated and independent speaking tasks). Specifically, advanced natural language processing (NLP) tools, MANOVA difference statistics, and discriminant function analyses (DFA) are used to assess the degree to which and in what ways responses to these…
Advanced Natural Language Processing and Temporal Mining for Clinical Discovery
ERIC Educational Resources Information Center
Mehrabi, Saeed
2016-01-01
There has been vast and growing amount of healthcare data especially with the rapid adoption of electronic health records (EHRs) as a result of the HITECH act of 2009. It is estimated that around 80% of the clinical information resides in the unstructured narrative of an EHR. Recently, natural language processing (NLP) techniques have offered…
You Are Your Words: Modeling Students' Vocabulary Knowledge with Natural Language Processing Tools
ERIC Educational Resources Information Center
Allen, Laura K.; McNamara, Danielle S.
2015-01-01
The current study investigates the degree to which the lexical properties of students' essays can inform stealth assessments of their vocabulary knowledge. In particular, we used indices calculated with the natural language processing tool, TAALES, to predict students' performance on a measure of vocabulary knowledge. To this end, two corpora were…
Divide and Recombine for Large Complex Data
2017-12-01
Empirical Methods in Natural Language Processing , October 2014 Keywords Enter keywords for the publication. URL Enter the URL...low-latency data processing systems. Declarative Languages for Interactive Visualization: The Reactive Vega Stack Another thread of XDATA research...for array processing operations embedded in the R programming language . Vector virtual machines work well for long vectors. One of the most
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sharp, J.K.
1997-11-01
This seminar describes a process and methodology that uses structured natural language to enable the construction of precise information requirements directly from users, experts, and managers. The main focus of this natural language approach is to create the precise information requirements and to do it in such a way that the business and technical experts are fully accountable for the results. These requirements can then be implemented using appropriate tools and technology. This requirement set is also a universal learning tool because it has all of the knowledge that is needed to understand a particular process (e.g., expense vouchers, projectmore » management, budget reviews, tax, laws, machine function).« less
Restrictions on biological adaptation in language evolution.
Chater, Nick; Reali, Florencia; Christiansen, Morten H
2009-01-27
Language acquisition and processing are governed by genetic constraints. A crucial unresolved question is how far these genetic constraints have coevolved with language, perhaps resulting in a highly specialized and species-specific language "module," and how much language acquisition and processing redeploy preexisting cognitive machinery. In the present work, we explored the circumstances under which genes encoding language-specific properties could have coevolved with language itself. We present a theoretical model, implemented in computer simulations, of key aspects of the interaction of genes and language. Our results show that genes for language could have coevolved only with highly stable aspects of the linguistic environment; a rapidly changing linguistic environment does not provide a stable target for natural selection. Thus, a biological endowment could not coevolve with properties of language that began as learned cultural conventions, because cultural conventions change much more rapidly than genes. We argue that this rules out the possibility that arbitrary properties of language, including abstract syntactic principles governing phrase structure, case marking, and agreement, have been built into a "language module" by natural selection. The genetic basis of human language acquisition and processing did not coevolve with language, but primarily predates the emergence of language. As suggested by Darwin, the fit between language and its underlying mechanisms arose because language has evolved to fit the human brain, rather than the reverse.
NASA Technical Reports Server (NTRS)
Dominick, Wayne D. (Editor); Triantafyllopoulos, Spiros
1985-01-01
A collection of presentation visuals associated with the companion report entitled KARL: A Knowledge-Assisted Retrieval Language, is presented. Information is given on data retrieval, natural language database front ends, generic design objectives, processing capababilities and the query processing cycle.
Assessment of Language and Literacy: A Process of Hypothesis Testing for Individual Differences
ERIC Educational Resources Information Center
Scott, Cheryl M.
2011-01-01
Purpose: Older school-aged children and adolescents with persistent language and literacy impairments vary in their individual profiles of linguistic strengths and weaknesses. Given the multidimensional nature and complexity of language, designing an assessment protocol capable of uncovering linguistic variation is challenging. A process of…
Towards Automatic Treatment of Natural Language.
ERIC Educational Resources Information Center
Lonsdale, Deryle
1984-01-01
Because automated natural language processing relies heavily on the still developing fields of linguistics, knowledge representation, and computational linguistics, no system is capable of mimicking human linguistic capabilities. For the present, interactive systems may be used to augment today's technology. (MSE)
The application of natural language processing to augmentative and alternative communication.
Higginbotham, D Jeffery; Lesher, Gregory W; Moulton, Bryan J; Roark, Brian
2011-01-01
Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of next generation of AAC technology.
A Qualitative Analysis Framework Using Natural Language Processing and Graph Theory
ERIC Educational Resources Information Center
Tierney, Patrick J.
2012-01-01
This paper introduces a method of extending natural language-based processing of qualitative data analysis with the use of a very quantitative tool--graph theory. It is not an attempt to convert qualitative research to a positivist approach with a mathematical black box, nor is it a "graphical solution". Rather, it is a method to help qualitative…
ERIC Educational Resources Information Center
Duran, Nicholas D.; Hall, Charles; McCarthy, Philip M.; McNamara, Danielle S.
2010-01-01
The words people use and the way they use them can reveal a great deal about their mental states when they attempt to deceive. The challenge for researchers is how to reliably distinguish the linguistic features that characterize these hidden states. In this study, we use a natural language processing tool called Coh-Metrix to evaluate deceptive…
ERIC Educational Resources Information Center
Sager, Naomi
This investigation matches the emerging techniques in computerized natural language processing against emerging needs for such techniques in the information field to evaluate and extend such techniques for future applications and to establish a basis and direction for further research toward these goals. An overview describes developments in the…
2012-01-01
Background We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. Results Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. Conclusions The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications. PMID:22901054
Jednoróg, Katarzyna; Bola, Łukasz; Mostowski, Piotr; Szwed, Marcin; Boguszewski, Paweł M; Marchewka, Artur; Rutkowski, Paweł
2015-05-01
In several countries natural sign languages were considered inadequate for education. Instead, new sign-supported systems were created, based on the belief that spoken/written language is grammatically superior. One such system called SJM (system językowo-migowy) preserves the grammatical and lexical structure of spoken Polish and since 1960s has been extensively employed in schools and on TV. Nevertheless, the Deaf community avoids using SJM for everyday communication, its preferred language being PJM (polski język migowy), a natural sign language, structurally and grammatically independent of spoken Polish and featuring classifier constructions (CCs). Here, for the first time, we compare, with fMRI method, the neural bases of natural vs. devised communication systems. Deaf signers were presented with three types of signed sentences (SJM and PJM with/without CCs). Consistent with previous findings, PJM with CCs compared to either SJM or PJM without CCs recruited the parietal lobes. The reverse comparison revealed activation in the anterior temporal lobes, suggesting increased semantic combinatory processes in lexical sign comprehension. Finally, PJM compared with SJM engaged left posterior superior temporal gyrus and anterior temporal lobe, areas crucial for sentence-level speech comprehension. We suggest that activity in these two areas reflects greater processing efficiency for naturally evolved sign language. Copyright © 2015 Elsevier Ltd. All rights reserved.
Exploring Social Meaning in Online Bilingual Text through Social Network Analysis
2015-09-01
p. 1). 30 GATE development began in 1995. As techniques for natural language processing ( NLP ) are investigated by the research community and...become part of the NLP repetoire, developers incorporate them with wrappers, which allow the output from GATE processes to be recognized as input by...University NEE Named Entity Extraction NLP natural language processing OSD Office of the Secretary of Defense POS parts of speech SBIR Small Business
The language of gene ontology: a Zipf's law analysis.
Kalankesh, Leila Ranandeh; Stevens, Robert; Brass, Andy
2012-06-07
Most major genome projects and sequence databases provide a GO annotation of their data, either automatically or through human annotators, creating a large corpus of data written in the language of GO. Texts written in natural language show a statistical power law behaviour, Zipf's law, the exponent of which can provide useful information on the nature of the language being used. We have therefore explored the hypothesis that collections of GO annotations will show similar statistical behaviours to natural language. Annotations from the Gene Ontology Annotation project were found to follow Zipf's law. Surprisingly, the measured power law exponents were consistently different between annotation captured using the three GO sub-ontologies in the corpora (function, process and component). On filtering the corpora using GO evidence codes we found that the value of the measured power law exponent responded in a predictable way as a function of the evidence codes used to support the annotation. Techniques from computational linguistics can provide new insights into the annotation process. GO annotations show similar statistical behaviours to those seen in natural language with measured exponents that provide a signal which correlates with the nature of the evidence codes used to support the annotations, suggesting that the measured exponent might provide a signal regarding the information content of the annotation.
A Codasyl-Type Schema for Natural Language Medical Records
Sager, N.; Tick, L.; Story, G.; Hirschman, L.
1980-01-01
This paper describes a CODASYL (network) database schema for information derived from narrative clinical reports. The goal of this work is to create an automated process that accepts natural language documents as input and maps this information into a database of a type managed by existing database management systems. The schema described here represents the medical events and facts identified through the natural language processing. This processing decomposes each narrative into a set of elementary assertions, represented as MEDFACT records in the database. Each assertion in turn consists of a subject and a predicate classed according to a limited number of medical event types, e.g., signs/symptoms, laboratory tests, etc. The subject and predicate are represented by EVENT records which are owned by the MEDFACT record associated with the assertion. The CODASYL-type network structure was found to be suitable for expressing most of the relations needed to represent the natural language information. However, special mechanisms were developed for storing the time relations between EVENT records and for recording connections (such as causality) between certain MEDFACT records. This schema has been implemented using the UNIVAC DMS-1100 DBMS.
Topaz, Maxim; Radhakrishnan, Kavita; Blackley, Suzanne; Lei, Victor; Lai, Kenneth; Zhou, Li
2016-09-14
This study developed an innovative natural language processing algorithm to automatically identify heart failure (HF) patients with ineffective self-management status (in the domains of diet, physical activity, medication adherence, and adherence to clinician appointments) from narrative discharge summary notes. We also analyzed the association between self-management status and preventable 30-day hospital readmissions. Our natural language system achieved relatively high accuracy (F-measure = 86.3%; precision = 95%; recall = 79.2%) on a testing sample of 300 notes annotated by two human reviewers. In a sample of 8,901 HF patients admitted to our healthcare system, 14.4% (n = 1,282) had documentation of ineffective HF self-management. Adjusted regression analyses indicated that presence of any skill-related self-management deficit (odds ratio [OR] = 1.3, 95% confidence interval [CI] = [1.1, 1.6]) and non-specific ineffective self-management (OR = 1.5, 95% CI = [1.2, 2]) was significantly associated with readmissions. We have demonstrated the feasibility of identifying ineffective HF self-management from electronic discharge summaries with natural language processing. © The Author(s) 2016.
Honda, Masayuki; Matsumoto, Takehiro
2017-01-01
Several kinds of event log data produced in daily clinical activities have yet to be used for secure and efficient improvement of hospital activities. Data Warehouse systems in Hospital Information Systems used for the analysis of structured data such as disease, lab-tests, and medications, have also shown efficient outcomes. This article is focused on two kinds of essential functions: process mining using log data and non-structured data analysis via Natural Language Processing.
Assessing Group Interaction with Social Language Network Analysis
NASA Astrophysics Data System (ADS)
Scholand, Andrew J.; Tausczik, Yla R.; Pennebaker, James W.
In this paper we discuss a new methodology, social language network analysis (SLNA), that combines tools from social language processing and network analysis to assess socially situated working relationships within a group. Specifically, SLNA aims to identify and characterize the nature of working relationships by processing artifacts generated with computer-mediated communication systems, such as instant message texts or emails. Because social language processing is able to identify psychological, social, and emotional processes that individuals are not able to fully mask, social language network analysis can clarify and highlight complex interdependencies between group members, even when these relationships are latent or unrecognized.
Using natural language processing techniques to inform research on nanotechnology.
Lewinski, Nastassja A; McInnes, Bridget T
2015-01-01
Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics.
ERIC Educational Resources Information Center
Amaral, Luiz; Meurers, Detmar; Ziai, Ramon
2011-01-01
Intelligent language tutoring systems (ILTS) typically analyze learner input to diagnose learner language properties and provide individualized feedback. Despite a long history of ILTS research, such systems are virtually absent from real-life foreign language teaching (FLT). Taking a step toward more closely linking ILTS research to real-life…
Interactive natural language acquisition in a multi-modal recurrent neural architecture
NASA Astrophysics Data System (ADS)
Heinrich, Stefan; Wermter, Stefan
2018-01-01
For the complex human brain that enables us to communicate in natural language, we gathered good understandings of principles underlying language acquisition and processing, knowledge about sociocultural conditions, and insights into activity patterns in the brain. However, we were not yet able to understand the behavioural and mechanistic characteristics for natural language and how mechanisms in the brain allow to acquire and process language. In bridging the insights from behavioural psychology and neuroscience, the goal of this paper is to contribute a computational understanding of appropriate characteristics that favour language acquisition. Accordingly, we provide concepts and refinements in cognitive modelling regarding principles and mechanisms in the brain and propose a neurocognitively plausible model for embodied language acquisition from real-world interaction of a humanoid robot with its environment. In particular, the architecture consists of a continuous time recurrent neural network, where parts have different leakage characteristics and thus operate on multiple timescales for every modality and the association of the higher level nodes of all modalities into cell assemblies. The model is capable of learning language production grounded in both, temporal dynamic somatosensation and vision, and features hierarchical concept abstraction, concept decomposition, multi-modal integration, and self-organisation of latent representations.
Sound Evidence: The Missing Piece of the Jigsaw in Formulaic Language Research
ERIC Educational Resources Information Center
Lin, Phoebe M. S.
2012-01-01
With the ever increasing number of studies on formulaic language, we are beginning to learn more about the processing of formulaic language (e.g. Ellis et al. 2008; Siyanova et al. 2011), its use in speech (e.g. Aijmer 1996; Wood 2012) and writing (e.g. Hyland 2008a, 2008b) and its application in natural language processing (e.g. Tschichold 2000).…
Natural language processing, pragmatics, and verbal behavior
Cherpas, Chris
1992-01-01
Natural Language Processing (NLP) is that part of Artificial Intelligence (AI) concerned with endowing computers with verbal and listener repertoires, so that people can interact with them more easily. Most attention has been given to accurately parsing and generating syntactic structures, although NLP researchers are finding ways of handling the semantic content of language as well. It is increasingly apparent that understanding the pragmatic (contextual and consequential) dimension of natural language is critical for producing effective NLP systems. While there are some techniques for applying pragmatics in computer systems, they are piecemeal, crude, and lack an integrated theoretical foundation. Unfortunately, there is little awareness that Skinner's (1957) Verbal Behavior provides an extensive, principled pragmatic analysis of language. The implications of Skinner's functional analysis for NLP and for verbal aspects of epistemology lead to a proposal for a “user expert”—a computer system whose area of expertise is the long-term computer user. The evolutionary nature of behavior suggests an AI technology known as genetic algorithms/programming for implementing such a system. ImagesFig. 1 PMID:22477052
A Natural Language Interface Concordant with a Knowledge Base.
Han, Yong-Jin; Park, Seong-Bae; Park, Se-Young
2016-01-01
The discordance between expressions interpretable by a natural language interface (NLI) system and those answerable by a knowledge base is a critical problem in the field of NLIs. In order to solve this discordance problem, this paper proposes a method to translate natural language questions into formal queries that can be generated from a graph-based knowledge base. The proposed method considers a subgraph of a knowledge base as a formal query. Thus, all formal queries corresponding to a concept or a predicate in the knowledge base can be generated prior to query time and all possible natural language expressions corresponding to each formal query can also be collected in advance. A natural language expression has a one-to-one mapping with a formal query. Hence, a natural language question is translated into a formal query by matching the question with the most appropriate natural language expression. If the confidence of this matching is not sufficiently high the proposed method rejects the question and does not answer it. Multipredicate queries are processed by regarding them as a set of collected expressions. The experimental results show that the proposed method thoroughly handles answerable questions from the knowledge base and rejects unanswerable ones effectively.
Data-Driven Approaches for Paraphrasing across Language Variations
ERIC Educational Resources Information Center
Xu, Wei
2014-01-01
Our language changes very rapidly, accompanying political, social and cultural trends, as well as the evolution of science and technology. The Internet, especially the social media, has accelerated this process of change. This poses a severe challenge for both human beings and natural language processing (NLP) systems, which usually only model a…
Comeau, Donald C.; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W. John
2014-01-01
BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net PMID:24935050
ERIC Educational Resources Information Center
Vlas, Radu Eduard
2012-01-01
Open source projects do have requirements; they are, however, mostly informal, text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming, and for large projects,…
Review of Knowledge Enhanced Electronic Logic (KEEL) Technology
2016-09-01
compiled. Two KEEL Engine processing models are available for most languages : The “Normal Model” processes information as if it was processed on an... language also makes it easy to “see” the functional relationships and the dynamic (interactive) nature of the language , allows one to interact with...for the Accelerated Processing Model ( Patent number 7,512,581 (3/31/2009)). In June 2006, application US 11/446/801 was submitted to support
Danforth, Kim N; Early, Megan I; Ngan, Sharon; Kosco, Anne E; Zheng, Chengyi; Gould, Michael K
2012-08-01
Lung nodules are commonly encountered in clinical practice, yet little is known about their management in community settings. An automated method for identifying patients with lung nodules would greatly facilitate research in this area. Using members of a large, community-based health plan from 2006 to 2010, we developed a method to identify patients with lung nodules, by combining five diagnostic codes, four procedural codes, and a natural language processing algorithm that performed free text searches of radiology transcripts. An experienced pulmonologist reviewed a random sample of 116 radiology transcripts, providing a reference standard for the natural language processing algorithm. With the use of an automated method, we identified 7112 unique members as having one or more incident lung nodules. The mean age of the patients was 65 years (standard deviation 14 years). There were slightly more women (54%) than men, and Hispanics and non-whites comprised 45% of the lung nodule cohort. Thirty-six percent were never smokers whereas 11% were current smokers. Fourteen percent of the patients were subsequently diagnosed with lung cancer. The sensitivity and specificity of the natural language processing algorithm for identifying the presence of lung nodules were 96% and 86%, respectively, compared with clinician review. Among the true positive transcripts in the validation sample, only 35% were solitary and unaccompanied by one or more associated findings, and 56% measured 8 to 30 mm in diameter. A combination of diagnostic codes, procedural codes, and a natural language processing algorithm for free text searching of radiology reports can accurately and efficiently identify patients with incident lung nodules, many of whom are subsequently diagnosed with lung cancer.
ERIC Educational Resources Information Center
Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine
2009-01-01
We built a natural language processing (NLP) system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary…
NASA Astrophysics Data System (ADS)
Gómez-Rodríguez, Carlos
2017-07-01
Liu et al. [1] provide a comprehensive account of research on dependency distance in human languages. While the article is a very rich and useful report on this complex subject, here I will expand on a few specific issues where research in computational linguistics (specifically natural language processing) can inform DDM research, and vice versa. These aspects have not been explored much in [1] or elsewhere, probably due to the little overlap between both research communities, but they may provide interesting insights for improving our understanding of the evolution of human languages, the mechanisms by which the brain processes and understands language, and the construction of effective computer systems to achieve this goal.
Zheng, Kai; Mei, Qiaozhu; Yang, Lei; Manion, Frank J.; Balis, Ulysses J.; Hanauer, David A.
2011-01-01
In this study, we comparatively examined the linguistic properties of narrative clinician notes created through voice dictation versus those directly entered by clinicians via a computer keyboard. Intuitively, the nature of voice-dictated notes would resemble that of natural language, while typed-in notes may demonstrate distinctive language features for reasons such as intensive usage of acronyms. The study analyses were based on an empirical dataset retrieved from our institutional electronic health records system. The dataset contains 30,000 voice-dictated notes and 30,000 notes that were entered manually; both were encounter notes generated in ambulatory care settings. The results suggest that between the narrative clinician notes created via these two different methods, there exists a considerable amount of lexical and distributional differences. Such differences could have a significant impact on the performance of natural language processing tools, necessitating these two different types of documents being differentially treated. PMID:22195229
Using natural language processing techniques to inform research on nanotechnology
Lewinski, Nastassja A
2015-01-01
Summary Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics. PMID:26199848
Generating structure from experience: A retrieval-based model of language processing.
Johns, Brendan T; Jones, Michael N
2015-09-01
Standard theories of language generally assume that some abstraction of linguistic input is necessary to create higher level representations of linguistic structures (e.g., a grammar). However, the importance of individual experiences with language has recently been emphasized by both usage-based theories (Tomasello, 2003) and grounded and situated theories (e.g., Zwaan & Madden, 2005). Following the usage-based approach, we present a formal exemplar model that stores instances of sentences across a natural language corpus, applying recent advances from models of semantic memory. In this model, an exemplar memory is used to generate expectations about the future structure of sentences, using a mechanism for prediction in language processing (Altmann & Mirković, 2009). The model successfully captures a broad range of behavioral effects-reduced relative clause processing (Reali & Christiansen, 2007), the role of contextual constraint (Rayner & Well, 1996), and event knowledge activation (Ferretti, Kutas, & McRae, 2007), among others. We further demonstrate how perceptual knowledge could be integrated into this exemplar-based framework, with the goal of grounding language processing in perception. Finally, we illustrate how an exemplar memory system could have been used in the cultural evolution of language. The model provides evidence that an impressive amount of language processing may be bottom-up in nature, built on the storage and retrieval of individual linguistic experiences. (c) 2015 APA, all rights reserved).
Automatic Selection of Suitable Sentences for Language Learning Exercises
ERIC Educational Resources Information Center
Pilán, Ildikó; Volodina, Elena; Johansson, Richard
2013-01-01
In our study we investigated second and foreign language (L2) sentence readability, an area little explored so far in the case of several languages, including Swedish. The outcome of our research consists of two methods for sentence selection from native language corpora based on Natural Language Processing (NLP) and machine learning (ML)…
2017-03-01
Warfare. 14. SUBJECT TERMS data mining, natural language processing, machine learning, algorithm design , information warfare, propaganda 15. NUMBER OF...Speech Tags. Adapted from [12]. CC Coordinating conjunction PRP$ Possessive pronoun CD Cardinal number RB Adverb DT Determiner RBR Adverb, comparative ... comparative UH Interjection JJS Adjective, superlative VB Verb, base form LS List item marker VBD Verb, past tense MD Modal VBG Verb, gerund or
Cascading Air Power Effects Simulation (CAPES)
2010-05-01
governments using autmated natural language processing techniques. New data on the attitudes of the masses and Arabic mass media were also collected using...environmental contexts. To meet this objective, new data was collected on the behavior of groups and governments using automated natural language processing...illustrate one fruitful avenue for future research. We show that we can model the first, second , and third order effects of US actions on group violence
ERIC Educational Resources Information Center
Rahimi, Zahra; Litman, Diane; Correnti, Richard; Wang, Elaine; Matsumura, Lindsay Clare
2017-01-01
This paper presents an investigation of score prediction based on natural language processing for two targeted constructs within analytic text-based writing: 1) students' effective use of evidence and, 2) their organization of ideas and evidence in support of their claim. With the long-term goal of producing feedback for students and teachers, we…
Usability Evaluation of NLP-PIER: A Clinical Document Search Engine for Researchers.
Hultman, Gretchen; McEwan, Reed; Pakhomov, Serguei; Lindemann, Elizabeth; Skube, Steven; Melton, Genevieve B
2017-01-01
NLP-PIER (Natural Language Processing - Patient Information Extraction for Research) is a self-service platform with a search engine for clinical researchers to perform natural language processing (NLP) queries using clinical notes. We conducted user-centered testing of NLP-PIER's usability to inform future design decisions. Quantitative and qualitative data were analyzed. Our findings will be used to improve the usability of NLP-PIER.
Gross, Alexander; Murthy, Dhiraj
2014-10-01
This paper explores a variety of methods for applying the Latent Dirichlet Allocation (LDA) automated topic modeling algorithm to the modeling of the structure and behavior of virtual organizations found within modern social media and social networking environments. As the field of Big Data reveals, an increase in the scale of social data available presents new challenges which are not tackled by merely scaling up hardware and software. Rather, they necessitate new methods and, indeed, new areas of expertise. Natural language processing provides one such method. This paper applies LDA to the study of scientific virtual organizations whose members employ social technologies. Because of the vast data footprint in these virtual platforms, we found that natural language processing was needed to 'unlock' and render visible latent, previously unseen conversational connections across large textual corpora (spanning profiles, discussion threads, forums, and other social media incarnations). We introduce variants of LDA and ultimately make the argument that natural language processing is a critical interdisciplinary methodology to make better sense of social 'Big Data' and we were able to successfully model nested discussion topics from forums and blog posts using LDA. Importantly, we found that LDA can move us beyond the state-of-the-art in conventional Social Network Analysis techniques. Copyright © 2014 Elsevier Ltd. All rights reserved.
Comeau, Donald C; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W John
2014-01-01
BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net. © The Author(s) 2014. Published by Oxford University Press.
Language Design in the Processing of Non-Restrictive Relative Clauses in French as a Second Language
ERIC Educational Resources Information Center
Lorente Lapole, Amandine
2012-01-01
Recent years have witnessed a lively debate on the nature of learners' morphological competence and use. Some argue that a breakdown in acquisition of second-language (L2) is expected whenever features required for the analysis of L2 input are not present in the L1. Others argue that features have the same nature and etiology in first…
The ALICE System: A Workbench for Learning and Using Language.
ERIC Educational Resources Information Center
Levin, Lori; And Others
1991-01-01
ALICE, a multimedia framework for intelligent computer-assisted language instruction (ICALI) at Carnegie Mellon University (PA), consists of a set of tools for building a number of different types of ICALI programs in any language. Its Natural Language Processing tools for syntactic error detection, morphological analysis, and generation of…
A Look at Natural Language Retrieval Systems
ERIC Educational Resources Information Center
Townley, Helen M.
1971-01-01
Natural language systems are seen as falling into two classes - those which process and analyse the input and store it in an ordered fashion, and those which employ controls at the output stage. A variety of systems of both types is reviewed, and their respective features are discussed. (12 references) (Author/NH)
Natural Language Processing and Game-Based Practice in iSTART
ERIC Educational Resources Information Center
Jackson, G. Tanner; Boonthum-Denecke, Chutima; McNamara, Danielle S.
2015-01-01
Intelligent Tutoring Systems (ITSs) are situated in a potential struggle between effective pedagogy and system enjoyment and engagement. iSTART, a reading strategy tutoring system in which students practice generating self-explanations and using reading strategies, employs two devices to engage the user. The first is natural language processing…
Linguistically Motivated Features for CCG Realization Ranking
ERIC Educational Resources Information Center
Rajkumar, Rajakrishnan
2012-01-01
Natural Language Generation (NLG) is the process of generating natural language text from an input, which is a communicative goal and a database or knowledge base. Informally, the architecture of a standard NLG system consists of the following modules (Reiter and Dale, 2000): content determination, sentence planning (or microplanning) and surface…
Leveraging Code Comments to Improve Software Reliability
ERIC Educational Resources Information Center
Tan, Lin
2009-01-01
Commenting source code has long been a common practice in software development. This thesis, consisting of three pieces of work, made novel use of the code comments written in natural language to improve software reliability. Our solution combines Natural Language Processing (NLP), Machine Learning, Statistics, and Program Analysis techniques to…
First Toronto Conference on Database Users. Systems that Enhance User Performance.
ERIC Educational Resources Information Center
Doszkocs, Tamas E.; Toliver, David
1987-01-01
The first of two papers discusses natural language searching as a user performance enhancement tool, focusing on artificial intelligence applications for information retrieval and problems with natural language processing. The second presents a conceptual framework for further development and future design of front ends to online bibliographic…
Learning a Foreign Language: A New Path to Enhancement of Cognitive Functions
ERIC Educational Resources Information Center
Shoghi Javan, Sara; Ghonsooly, Behzad
2018-01-01
The complicated cognitive processes involved in natural (primary) bilingualism lead to significant cognitive development. Executive functions as a fundamental component of human cognition are deemed to be affected by language learning. To date, a large number of studies have investigated how natural (primary) bilingualism influences executive…
A role for relaxed selection in the evolution of the language capacity
Deacon, Terrence W.
2010-01-01
Explaining the extravagant complexity of the human language and our competence to acquire it has long posed challenges for natural selection theory. To answer his critics, Darwin turned to sexual selection to account for the extreme development of language. Many contemporary evolutionary theorists have invoked incredibly lucky mutation or some variant of the assimilation of acquired behaviors to innate predispositions in an effort to explain it. Recent evodevo approaches have identified developmental processes that help to explain how complex functional synergies can evolve by Darwinian means. Interestingly, many of these developmental mechanisms bear a resemblance to aspects of Darwin's mechanism of natural selection, often differing only in one respect (e.g., form of duplication, kind of variation, competition/cooperation). A common feature is an interplay between processes of stabilizing selection and processes of relaxed selection at different levels of organism function. These may play important roles in the many levels of evolutionary process contributing to language. Surprisingly, the relaxation of selection at the organism level may have been a source of many complex synergistic features of the human language capacity, and may help explain why so much language information is “inherited” socially. PMID:20445088
An expert system for natural language processing
NASA Technical Reports Server (NTRS)
Hennessy, John F.
1988-01-01
A solution to the natural language processing problem that uses a rule based system, written in OPS5, to replace the traditional parsing method is proposed. The advantage to using a rule based system are explored. Specifically, the extensibility of a rule based solution is discussed as well as the value of maintaining rules that function independently. Finally, the power of using semantics to supplement the syntactic analysis of a sentence is considered.
A Tutorial on Techniques and Applications for Natural Language Processing
1983-10-17
mentioned above as specific to context-free grammars were tackled by linguists, in particular Chomsky [21, 221 through Transformational Grammar . As shown...DTIC e, C 17 October 1983 MAY 1,5 1990 DEPARTMENT of COMPUTER SCIENCE Approved for pu ]3 -- ,. " Carnegie-Mellon University . . . - -A.,,Anm m n n n n ln...A Tutorial on Techniques and Applications for Natural Language Processing Philip J. Hayes and Jaime G. Carbonell Carnegie-Mellon University 17
Topaz, Maxim; Lai, Kenneth; Dowding, Dawn; Lei, Victor J; Zisberg, Anna; Bowles, Kathryn H; Zhou, Li
2016-12-01
Electronic health records are being increasingly used by nurses with up to 80% of the health data recorded as free text. However, only a few studies have developed nursing-relevant tools that help busy clinicians to identify information they need at the point of care. This study developed and validated one of the first automated natural language processing applications to extract wound information (wound type, pressure ulcer stage, wound size, anatomic location, and wound treatment) from free text clinical notes. First, two human annotators manually reviewed a purposeful training sample (n=360) and random test sample (n=1100) of clinical notes (including 50% discharge summaries and 50% outpatient notes), identified wound cases, and created a gold standard dataset. We then trained and tested our natural language processing system (known as MTERMS) to process the wound information. Finally, we assessed our automated approach by comparing system-generated findings against the gold standard. We also compared the prevalence of wound cases identified from free-text data with coded diagnoses in the structured data. The testing dataset included 101 notes (9.2%) with wound information. The overall system performance was good (F-measure is a compiled measure of system's accuracy=92.7%), with best results for wound treatment (F-measure=95.7%) and poorest results for wound size (F-measure=81.9%). Only 46.5% of wound notes had a structured code for a wound diagnosis. The natural language processing system achieved good performance on a subset of randomly selected discharge summaries and outpatient notes. In more than half of the wound notes, there were no coded wound diagnoses, which highlight the significance of using natural language processing to enrich clinical decision making. Our future steps will include expansion of the application's information coverage to other relevant wound factors and validation of the model with external data. Copyright © 2016 Elsevier Ltd. All rights reserved.
Reconciliation of ontology and terminology to cope with linguistics.
Baud, Robert H; Ceusters, Werner; Ruch, Patrick; Rassinoux, Anne-Marie; Lovis, Christian; Geissbühler, Antoine
2007-01-01
To discuss the relationships between ontologies, terminologies and language in the context of Natural Language Processing (NLP) applications in order to show the negative consequences of confusing them. The viewpoints of the terminologist and (computational) linguist are developed separately, and then compared, leading to the presentation of reconciliation among these points of view, with consideration of the role of the ontologist. In order to encourage appropriate usage of terminologies, guidelines are presented advocating the simultaneous publication of pragmatic vocabularies supported by terminological material based on adequate ontological analysis. Ontologies, terminologies and natural languages each have their own purpose. Ontologies support machine understanding, natural languages support human communication, and terminologies should form the bridge between them. Therefore, future terminology standards should be based on sound ontology and do justice to the diversities in natural languages. Moreover, they should support local vocabularies, in order to be easily adaptable to local needs and practices.
CPP-TRS(C): On using visual cognitive symbols to enhance communication effectiveness
NASA Technical Reports Server (NTRS)
Tonfoni, Graziella
1994-01-01
Communicative Positioning Program/Text Representation Systems (CPP-TRS) is a visual language based on a system of 12 canvasses, 10 signals and 14 symbols. CPP-TRS is based on the fact that every communication action is the result of a set of cognitive processes and the whole system is based on the concept that you can enhance communication by visually perceiving text. With a simple syntax, CPP-TRS is capable of representing meaning and intention as well as communication functions visually. Those are precisely invisible aspects of natural language that are most relevant to getting the global meaning of a text. CPP-TRS reinforces natural language in human machine interaction systems. It complements natural language by adding certain important elements that are not represented by natural language by itself. These include communication intention and function of the text expressed by the sender, as well as the role the reader is supposed to play. The communication intention and function of a text and the reader's role are invisible in natural language because neither specific words nor punctuation conveys them sufficiently and unambiguously; they are therefore non-transparent.
An ontology model for nursing narratives with natural language generation technology.
Min, Yul Ha; Park, Hyeoun-Ae; Jeon, Eunjoo; Lee, Joo Yun; Jo, Soo Jung
2013-01-01
The purpose of this study was to develop an ontology model to generate nursing narratives as natural as human language from the entity-attribute-value triplets of a detailed clinical model using natural language generation technology. The model was based on the types of information and documentation time of the information along the nursing process. The typesof information are data characterizing the patient status, inferences made by the nurse from the patient data, and nursing actions selected by the nurse to change the patient status. This information was linked to the nursing process based on the time of documentation. We describe a case study illustrating the application of this model in an acute-care setting. The proposed model provides a strategy for designing an electronic nursing record system.
ERIC Educational Resources Information Center
Vilaseca, R.M.; Del Rio, M-J.
2004-01-01
Many child language studies emphasize the value of verbal and social support, of 'scaffolding' processes and mutual adjustments that naturally occur in adult-child interactions in everyday contexts. Based on such theories, this study attempted to improve the language and communication skills in children with special educational needs through…
ERIC Educational Resources Information Center
Kiraz, George Anton
This book presents a tractable computational model that can cope with complex morphological operations, especially in Semitic languages, and less complex morphological systems present in Western languages. It outlines a new generalized regular rewrite rule system that uses multiple finite-state automata to cater to root-and-pattern morphology,…
An Intelligent Computer Assisted Language Learning System for Arabic Learners
ERIC Educational Resources Information Center
Shaalan, Khaled F.
2005-01-01
This paper describes the development of an intelligent computer-assisted language learning (ICALL) system for learning Arabic. This system could be used for learning Arabic by students at primary schools or by learners of Arabic as a second or foreign language. It explores the use of Natural Language Processing (NLP) techniques for learning…
Natural Language Processing: A Tutorial. Revision
1990-01-01
English in word-for-word language translations. An oft-repeated (although fictional) anecdote illustrates the ... English by a language translation program, became: " The vodka is strong but 3 the steak is rotten." The point made is that vast amounts of knowledge...are required for effective language translations. The initial goal for Language Translation was "fully-automatic high-quality translation" (FAHOT).
Encoding Standards for Linguistic Corpora.
ERIC Educational Resources Information Center
Ide, Nancy
The demand for extensive reusability of large language text collections for natural languages processing research requires development of standardized encoding formats. Such formats must be capable of representing different kinds of information across the spectrum of text types and languages, capable of representing different levels of…
ERIC Educational Resources Information Center
Klein, Ariel; Badia, Toni
2015-01-01
In this study we show how complex creative relations can arise from fairly frequent semantic relations observed in everyday language. By doing this, we reflect on some key cognitive aspects of linguistic and general creativity. In our experimentation, we automated the process of solving a battery of Remote Associates Test tasks. By applying…
Laboratory process control using natural language commands from a personal computer
NASA Technical Reports Server (NTRS)
Will, Herbert A.; Mackin, Michael A.
1989-01-01
PC software is described which provides flexible natural language process control capability with an IBM PC or compatible machine. Hardware requirements include the PC, and suitable hardware interfaces to all controlled devices. Software required includes the Microsoft Disk Operating System (MS-DOS) operating system, a PC-based FORTRAN-77 compiler, and user-written device drivers. Instructions for use of the software are given as well as a description of an application of the system.
Natural language processing-based COTS software and related technologies survey.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stickland, Michael G.; Conrad, Gregory N.; Eaton, Shelley M.
Natural language processing-based knowledge management software, traditionally developed for security organizations, is now becoming commercially available. An informal survey was conducted to discover and examine current NLP and related technologies and potential applications for information retrieval, information extraction, summarization, categorization, terminology management, link analysis, and visualization for possible implementation at Sandia National Laboratories. This report documents our current understanding of the technologies, lists software vendors and their products, and identifies potential applications of these technologies.
Chapman, Wendy W; Christensen, Lee M; Wagner, Michael M; Haug, Peter J; Ivanov, Oleg; Dowling, John N; Olszewski, Robert T
2005-01-01
Develop and evaluate a natural language processing application for classifying chief complaints into syndromic categories for syndromic surveillance. Much of the input data for artificial intelligence applications in the medical field are free-text patient medical records, including dictated medical reports and triage chief complaints. To be useful for automated systems, the free-text must be translated into encoded form. We implemented a biosurveillance detection system from Pennsylvania to monitor the 2002 Winter Olympic Games. Because input data was in free-text format, we used a natural language processing text classifier to automatically classify free-text triage chief complaints into syndromic categories used by the biosurveillance system. The classifier was trained on 4700 chief complaints from Pennsylvania. We evaluated the ability of the classifier to classify free-text chief complaints into syndromic categories with a test set of 800 chief complaints from Utah. The classifier produced the following areas under the ROC curve: Constitutional = 0.95; Gastrointestinal = 0.97; Hemorrhagic = 0.99; Neurological = 0.96; Rash = 1.0; Respiratory = 0.99; Other = 0.96. Using information stored in the system's semantic model, we extracted from the Respiratory classifications lower respiratory complaints and lower respiratory complaints with fever with a precision of 0.97 and 0.96, respectively. Results suggest that a trainable natural language processing text classifier can accurately extract data from free-text chief complaints for biosurveillance.
Trébuchon-Da Fonseca, Agnès; Bénar, Christian-G; Bartoloméi, Fabrice; Régis, Jean; Démonet, Jean-François; Chauvel, Patrick; Liégeois-Chauvel, Catherine
2009-03-01
Regions involved in language processing have been observed in the inferior part of the left temporal lobe. Although collectively labelled 'the Basal Temporal Language Area' (BTLA), these territories are functionally heterogeneous and are involved in language perception (i.e. reading or semantic task) or language production (speech arrest after stimulation). The objective of this study was to clarify the role of BTLA in the language network in an epileptic patient who displayed jargonaphasia. Intracerebral evoked related potentials to verbal and non-verbal stimuli in auditory and visual modalities were recorded from BTLA. Time-frequency analysis was performed during ictal events. Evoked potentials and induced gamma-band activity provided direct evidence that BTLA is sensitive to language stimuli in both modalities, 350 ms after stimulation. In addition, spontaneous gamma-band discharges were recorded from this region during which we observed phonological jargon. The findings emphasize the multimodal nature of this region in speech perception. In the context of transient dysfunction, the patient's lexical semantic processing network is disrupted, reducing spoken output to meaningless phoneme combinations. This rare opportunity to study the BTLA "in vivo" demonstrates its pivotal role in lexico-semantic processing for speech production and its multimodal nature in speech perception.
Massaro, Dominic W
2012-01-01
I review 2 seminal research reports published in this journal during its second decade more than a century ago. Given psychology's subdisciplines, they would not normally be reviewed together because one involves reading and the other speech perception. The small amount of interaction between these domains might have limited research and theoretical progress. In fact, the 2 early research reports revealed common processes involved in these 2 forms of language processing. Their illustration of the role of Wundt's apperceptive process in reading and speech perception anticipated descriptions of contemporary theories of pattern recognition, such as the fuzzy logical model of perception. Based on the commonalities between reading and listening, one can question why they have been viewed so differently. It is commonly believed that learning to read requires formal instruction and schooling, whereas spoken language is acquired from birth onward through natural interactions with people who talk. Most researchers and educators believe that spoken language is acquired naturally from birth onward and even prenatally. Learning to read, on the other hand, is not possible until the child has acquired spoken language, reaches school age, and receives formal instruction. If an appropriate form of written text is made available early in a child's life, however, the current hypothesis is that reading will also be learned inductively and emerge naturally, with no significant negative consequences. If this proposal is true, it should soon be possible to create an interactive system, Technology Assisted Reading Acquisition, to allow children to acquire literacy naturally.
Caregiver communication to the child as moderator and mediator of genes for language.
Onnis, Luca
2017-05-15
Human language appears to be unique among natural communication systems, and such uniqueness impinges on both nature and nurture. Human babies are endowed with cognitive abilities that predispose them to learn language, and this process cannot operate in an impoverished environment. To be effectively complete the acquisition of human language in human children requires highly socialised forms of learning, scaffolded over years of prolonged and intense caretaker-child interactions. How genes and environment operate in shaping language is unknown. These two components have traditionally been considered as independent, and often pitted against each other in terms of the nature versus nurture debate. This perspective article considers how innate abilities and experience might instead work together. In particular, it envisages potential scenarios for research, in which early caregiver verbal and non-verbal attachment practices may mediate or moderate the expression of human genetic systems for language. Copyright © 2017 Elsevier B.V. All rights reserved.
The Naturalization Process in New Mexico. A Guide for ESL Teachers and Advocates.
ERIC Educational Resources Information Center
Irvine, Patricia, Ed.; And Others
This guide provides an overview of the naturalization process and what it means to Hispanic immigrants, describes techniques for integrating English-as-a-Second-Language (ESL) and civics/history content in multilevel classes, offers directions for filling out the naturalization forms and completing the legal steps to naturalization, and provides…
Evolution: Language Use and the Evolution of Languages
NASA Astrophysics Data System (ADS)
Croft, William
Language change can be understood as an evolutionary process. Language change occurs at two different timescales, corresponding to the two steps of the evolutionary process. The first timescale is very short, namely, the production of an utterance: this is where linguistic structures are replicated and language variation is generated. The second timescale is (or can be) very long, namely, the propagation of linguistic variants in the speech community: this is where certain variants are selected over others. At both timescales, the evolutionary process is driven by social interaction and the role language plays in it. An understanding of social interaction at the micro-level—face-to-face interactions—and at the macro-level—the structure of speech communities—gives us the basis for understanding the generation and propagation of language structures, and understanding the nature of language itself.
Artificial Intelligence and CALL.
ERIC Educational Resources Information Center
Underwood, John H.
The potential application of artificial intelligence (AI) to computer-assisted language learning (CALL) is explored. Two areas of AI that hold particular interest to those who deal with language meaning--knowledge representation and expert systems, and natural-language processing--are described and examples of each are presented. AI contribution…
What's So Hard about Understanding Language?
ERIC Educational Resources Information Center
Read, Walter; And Others
A discussion of the application of artificial intelligence to natural language processing looks at several problems in language comprehension, involving semantic ambiguity, anaphoric reference, and metonymy. Examples of these problems are cited, and the importance of the computational approach in analyzing them is explained. The approach applies…
Learning a Foreign Language through the Media. CLCS Occasional Paper No. 18.
ERIC Educational Resources Information Center
Devitt, Sean M.
The use of mass media as a means of learning a foreign language from the beginning of language study is discussed. Using the media enables many of the features of the natural language acquisition process to be brought into play in a way that much current language teaching material does not. This position is supported by recent research into the…
ERIC Educational Resources Information Center
Bilgin, Sezen Seymen
2016-01-01
Code switching involves the interplay of two languages and as well as serving linguistic functions, it has social and psychological implications. In the context of English language teaching, these psychological implications reveal themselves as teachers' thought processes. While the nature of code switching in language classrooms has been widely…
Natural Language Processing as a Discipline at LLNL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Firpo, M A
The field of Natural Language Processing (NLP) is described as it applies to the needs of LLNL in handling free-text. The state of the practice is outlined with the emphasis placed on two specific aspects of NLP: Information Extraction and Discourse Integration. A brief description is included of the NLP applications currently being used at LLNL. A gap analysis provides a look at where the technology needs work in order to meet the needs of LLNL. Finally, recommendations are made to meet these needs.
Trombert-Paviot, B; Rodrigues, J M; Rogers, J E; Baud, R; van der Haring, E; Rassinoux, A M; Abrial, V; Clavel, L; Idir, H
1999-01-01
GALEN has developed a new generation of terminology tools based on a language independent concept reference model using a compositional formalism allowing computer processing and multiple reuses. During the 4th framework program project Galen-In-Use we applied the modelling and the tools to the development of a new multipurpose coding system for surgical procedures (CCAM) in France. On one hand we contributed to a language independent knowledge repository for multicultural Europe. On the other hand we support the traditional process for creating a new coding system in medicine which is very much labour consuming by artificial intelligence tools using a medically oriented recursive ontology and natural language processing. We used an integrated software named CLAW to process French professional medical language rubrics produced by the national colleges of surgeons into intermediate dissections and to the Grail reference ontology model representation. From this language independent concept model representation on one hand we generate controlled French natural language to support the finalization of the linguistic labels in relation with the meanings of the conceptual system structure. On the other hand the classification manager of third generation proves to be very powerful to retrieve the initial professional rubrics with different categories of concepts within a semantic network.
The Relevance of Second Language Acquisition Theory to the Written Error Correction Debate
ERIC Educational Resources Information Center
Polio, Charlene
2012-01-01
The controversies surrounding written error correction can be traced to Truscott (1996) in his polemic against written error correction. He claimed that empirical studies showed that error correction was ineffective and that this was to be expected "given the nature of the correction process and "the nature of language learning" (p. 328, emphasis…
NASA Astrophysics Data System (ADS)
Rustamov, Samir; Mustafayev, Elshan; Clements, Mark A.
2018-04-01
The context analysis of customer requests in a natural language call routing problem is investigated in the paper. One of the most significant problems in natural language call routing is a comprehension of client request. With the aim of finding a solution to this issue, the Hybrid HMM and ANFIS models become a subject to an examination. Combining different types of models (ANFIS and HMM) can prevent misunderstanding by the system for identification of user intention in dialogue system. Based on these models, the hybrid system may be employed in various language and call routing domains due to nonusage of lexical or syntactic analysis in classification process.
Flexible Processing and the Design of Grammar
ERIC Educational Resources Information Center
Sag, Ivan A.; Wasow, Thomas
2015-01-01
We explore the consequences of letting the incremental and integrative nature of language processing inform the design of competence grammar. What emerges is a view of grammar as a system of local monotonic constraints that provide a direct characterization of the signs (the form-meaning correspondences) of a given language. This…
Rodrigues, J M; Trombert-Paviot, B; Baud, R; Wagner, J; Meusnier-Carriot, F
1998-01-01
GALEN has developed a language independent common reference model based on a medically oriented ontology and practical tools and techniques for managing healthcare terminology including natural language processing. GALEN-IN-USE is the current phase which applied the modelling and the tools to the development or the updating of coding systems for surgical procedures in different national coding centers co-operating within the European Federation of Coding Centre (EFCC) to create a language independent knowledge repository for multicultural Europe. We used an integrated set of artificial intelligence terminology tools named CLAssification Manager workbench to process French professional medical language rubrics into intermediate dissections and to the Grail reference ontology model representation. From this language independent concept model representation we generate controlled French natural language. The French national coding centre is then able to retrieve the initial professional rubrics with different categories of concepts, to compare the professional language proposed by expert clinicians to the French generated controlled vocabulary and to finalize the linguistic labels of the coding system in relation with the meanings of the conceptual system structure.
Vivona, Jeanine M
2013-12-01
Like psychoanalysis, poetry is possible because of the nature of verbal language, particularly its potentials to evoke the sensations of lived experience. These potentials are vestiges of the personal relational context in which language is learned, without which there would be no poetry and no psychoanalysis. Such a view of language infuses psychoanalytic writings on poetry, yet has not been fully elaborated. To further that elaboration, a poem by Billy Collins is presented to illustrate the sensorial and imagistic potentials of words, after which the interpersonal processes of language development are explored in an attempt to elucidate the original nature of words as imbued with personal meaning, embodied resonance, and emotion. This view of language and the verbal form allows a fuller understanding of the therapeutic processes of speech and conversation at the heart of psychoanalysis, including the relational potentials of speech between present individuals, which are beyond the reach of poetry. In one sense, the work of the analyst is to create language that mobilizes the experiential, memorial, and relational potentials of words, and in so doing to make a poet out of the patient so that she too can create such language.
Sharma, Vivekanand; Law, Wayne; Balick, Michael J; Sarkar, Indra Neil
2017-01-01
The growing amount of data describing historical medicinal uses of plants from digitization efforts provides the opportunity to develop systematic approaches for identifying potential plant-based therapies. However, the task of cataloguing plant use information from natural language text is a challenging task for ethnobotanists. To date, there have been only limited adoption of informatics approaches used for supporting the identification of ethnobotanical information associated with medicinal uses. This study explored the feasibility of using biomedical terminologies and natural language processing approaches for extracting relevant plant-associated therapeutic use information from historical biodiversity literature collection available from the Biodiversity Heritage Library. The results from this preliminary study suggest that there is potential utility of informatics methods to identify medicinal plant knowledge from digitized resources as well as highlight opportunities for improvement.
Sharma, Vivekanand; Law, Wayne; Balick, Michael J.; Sarkar, Indra Neil
2017-01-01
The growing amount of data describing historical medicinal uses of plants from digitization efforts provides the opportunity to develop systematic approaches for identifying potential plant-based therapies. However, the task of cataloguing plant use information from natural language text is a challenging task for ethnobotanists. To date, there have been only limited adoption of informatics approaches used for supporting the identification of ethnobotanical information associated with medicinal uses. This study explored the feasibility of using biomedical terminologies and natural language processing approaches for extracting relevant plant-associated therapeutic use information from historical biodiversity literature collection available from the Biodiversity Heritage Library. The results from this preliminary study suggest that there is potential utility of informatics methods to identify medicinal plant knowledge from digitized resources as well as highlight opportunities for improvement. PMID:29854223
NASA Technical Reports Server (NTRS)
Gill, E. N.
1986-01-01
The requirements are identified for a very high order natural language to be used by crew members on board the Space Station. The hardware facilities, databases, realtime processes, and software support are discussed. The operations and capabilities that will be required in both normal (routine) and abnormal (nonroutine) situations are evaluated. A structure and syntax for an interface (front-end) language to satisfy the above requirements are recommended.
Khalifa, Abdulrahman; Meystre, Stéphane
2015-12-01
The 2014 i2b2 natural language processing shared task focused on identifying cardiovascular risk factors such as high blood pressure, high cholesterol levels, obesity and smoking status among other factors found in health records of diabetic patients. In addition, the task involved detecting medications, and time information associated with the extracted data. This paper presents the development and evaluation of a natural language processing (NLP) application conceived for this i2b2 shared task. For increased efficiency, the application main components were adapted from two existing NLP tools implemented in the Apache UIMA framework: Textractor (for dictionary-based lookup) and cTAKES (for preprocessing and smoking status detection). The application achieved a final (micro-averaged) F1-measure of 87.5% on the final evaluation test set. Our attempt was mostly based on existing tools adapted with minimal changes and allowed for satisfying performance with limited development efforts. Copyright © 2015 Elsevier Inc. All rights reserved.
Gender, Identity and Intercultural Transformation in Second Language Socialisation
ERIC Educational Resources Information Center
Shi, Xingsong
2006-01-01
In L2 learners' second language socialisation process, males and females from different sociocultural backgrounds have diverse attitudes and access to second language acquisition. In this study, informed by feminist poststructuralist theory, we can see the highly context-sensitive nature of the gendered practices and the corresponding outcomes of…
Data-Informed Language Learning
ERIC Educational Resources Information Center
Godwin-Jones, Robert
2017-01-01
Although data collection has been used in language learning settings for some time, it is only in recent decades that large corpora have become available, along with efficient tools for their use. Advances in natural language processing (NLP) have enabled rich tagging and annotation of corpus data, essential for their effective use in language…
ERIC Educational Resources Information Center
Alexopoulou, Theodora; Michel, Marije; Murakami, Akira; Meurers, Detmar
2017-01-01
Large-scale learner corpora collected from online language learning platforms, such as the EF-Cambridge Open Language Database (EFCAMDAT), provide opportunities to analyze learner data at an unprecedented scale. However, interpreting the learner language in such corpora requires a precise understanding of tasks: How does the prompt and input of a…
ERIC Educational Resources Information Center
Khany, Reza; Amiri, Majid
2018-01-01
Theoretical developments in second or foreign language motivation research have led to a better understanding of the convoluted nature of motivation in the process of language acquisition. Among these theories, action control theory has recently shown a good deal of explanatory power in second language learning contexts and in the presence of…
A natural language interface plug-in for cooperative query answering in biological databases.
Jamil, Hasan M
2012-06-11
One of the many unique features of biological databases is that the mere existence of a ground data item is not always a precondition for a query response. It may be argued that from a biologist's standpoint, queries are not always best posed using a structured language. By this we mean that approximate and flexible responses to natural language like queries are well suited for this domain. This is partly due to biologists' tendency to seek simpler interfaces and partly due to the fact that questions in biology involve high level concepts that are open to interpretations computed using sophisticated tools. In such highly interpretive environments, rigidly structured databases do not always perform well. In this paper, our goal is to propose a semantic correspondence plug-in to aid natural language query processing over arbitrary biological database schema with an aim to providing cooperative responses to queries tailored to users' interpretations. Natural language interfaces for databases are generally effective when they are tuned to the underlying database schema and its semantics. Therefore, changes in database schema become impossible to support, or a substantial reorganization cost must be absorbed to reflect any change. We leverage developments in natural language parsing, rule languages and ontologies, and data integration technologies to assemble a prototype query processor that is able to transform a natural language query into a semantically equivalent structured query over the database. We allow knowledge rules and their frequent modifications as part of the underlying database schema. The approach we adopt in our plug-in overcomes some of the serious limitations of many contemporary natural language interfaces, including support for schema modifications and independence from underlying database schema. The plug-in introduced in this paper is generic and facilitates connecting user selected natural language interfaces to arbitrary databases using a semantic description of the intended application. We demonstrate the feasibility of our approach with a practical example.
Ahlberg, Daniela Katharina; Bischoff, Heike; Strozyk, Jessica Vanessa; Bryant, Doreen; Kaup, Barbara
2018-01-01
While much support is found for embodied language processing in a first language (L1), evidence for embodiment in second language (L2) processing is rather sparse. In a recent study, we found support for L2 embodiment, but also an influence of L1 on L2 processing in adult learners. In the present study, we compared bilingual schoolchildren who speak German as one of their languages with monolingual German schoolchildren. We presented the German prepositions auf (on), über (above), and unter (under) in a Stroop-like task. Upward or downward responses were made depending on the font colour, resulting in compatible and incompatible trials. We found compatibility effects for all children, but in contrast to the adult sample, there were no processing differences between the children depending on the nature of their other language, suggesting that the processing of German prepositions of bilingual children is embodied in a similar way as in monolingual German children.
Bischoff, Heike; Strozyk, Jessica Vanessa; Bryant, Doreen; Kaup, Barbara
2018-01-01
While much support is found for embodied language processing in a first language (L1), evidence for embodiment in second language (L2) processing is rather sparse. In a recent study, we found support for L2 embodiment, but also an influence of L1 on L2 processing in adult learners. In the present study, we compared bilingual schoolchildren who speak German as one of their languages with monolingual German schoolchildren. We presented the German prepositions auf (on), über (above), and unter (under) in a Stroop-like task. Upward or downward responses were made depending on the font colour, resulting in compatible and incompatible trials. We found compatibility effects for all children, but in contrast to the adult sample, there were no processing differences between the children depending on the nature of their other language, suggesting that the processing of German prepositions of bilingual children is embodied in a similar way as in monolingual German children. PMID:29538404
A Portable Natural Language Interface.
1987-09-01
regrets. - 27 - BIBLIOGRAPHY Bayer, Samuel. "A Theory of Linearization in Relational Grammar ," Senior essay, Yale University , 1984. Dyer, Michael. In... Grammar 1. Chicago: University Chicago Press, 1983. Rustin, R., ed., Natural Language Processing. New York: Algorithmics Press, 1973. Wasow, Tom...most notably, the theory of relational grammar developed by Perlmutter and his associates, and the theory of discourse developed by Barbara Grosz
Communicative Discourse in Second Language Classrooms: From Building Skills to Becoming Skillful
ERIC Educational Resources Information Center
Suleiman, Mahmoud
2013-01-01
The dynamics of the communicative discourse is a natural process that requires an application of a wide range of skills and strategies. In particular, linguistic discourse and the interaction process have a huge impact on promoting literacy and academic skills in all students especially English language learners (ELLs). Using interactive…
The Function of Semantics in Automated Language Processing.
ERIC Educational Resources Information Center
Pacak, Milos; Pratt, Arnold W.
This paper is a survey of some of the major semantic models that have been developed for automated semantic analysis of natural language. Current approaches to semantic analysis and logical interference are based mainly on models of human cognitive processes such as Quillian's semantic memory, Simmon's Protosynthex III and others. All existing…
ERIC Educational Resources Information Center
Liou, Hsien-Chin; Chang, Jason S; Chen, Hao-Jan; Lin, Chih-Cheng; Liaw, Meei-Ling; Gao, Zhao-Ming; Jang, Jyh-Shing Roger; Yeh, Yuli; Chuang, Thomas C.; You, Geeng-Neng
2006-01-01
This paper describes the development of an innovative web-based environment for English language learning with advanced data-driven and statistical approaches. The project uses various corpora, including a Chinese-English parallel corpus ("Sinorama") and various natural language processing (NLP) tools to construct effective English…
Sučević, Jelena; Savić, Andrej M; Popović, Mirjana B; Styles, Suzy J; Ković, Vanja
2015-01-01
There is something about the sound of a pseudoword like takete that goes better with a spiky, than a curvy shape (Köhler, 1929:1947). Yet despite decades of research into sound symbolism, the role of this effect on real words in the lexicons of natural languages remains controversial. We report one behavioural and one ERP study investigating whether sound symbolism is active during normal language processing for real words in a speaker's native language, in the same way as for novel word forms. The results indicate that sound-symbolic congruence has a number of influences on natural language processing: Written forms presented in a congruent visual context generate more errors during lexical access, as well as a chain of differences in the ERP. These effects have a very early onset (40-80 ms, 100-160 ms, 280-320 ms) and are later overshadowed by familiar types of semantic processing, indicating that sound symbolism represents an early sensory-co-activation effect. Copyright © 2015 Elsevier Inc. All rights reserved.
Policy Process Editor for P3BM Software
NASA Technical Reports Server (NTRS)
James, Mark; Chang, Hsin-Ping; Chow, Edward T.; Crichton, Gerald A.
2010-01-01
A computer program enables generation, in the form of graphical representations of process flows with embedded natural-language policy statements, input to a suite of policy-, process-, and performance-based management (P3BM) software. This program (1) serves as an interface between users and the Hunter software, which translates the input into machine-readable form; and (2) enables users to initialize and monitor the policy-implementation process. This program provides an intuitive graphical interface for incorporating natural-language policy statements into business-process flow diagrams. Thus, the program enables users who dictate policies to intuitively embed their intended process flows as they state the policies, reducing the likelihood of errors and reducing the time between declaration and execution of policy.
Neural correlates of fixation duration in natural reading: Evidence from fixation-related fMRI.
Henderson, John M; Choi, Wonil; Luke, Steven G; Desai, Rutvik H
2015-10-01
A key assumption of current theories of natural reading is that fixation duration reflects underlying attentional, language, and cognitive processes associated with text comprehension. The neurocognitive correlates of this relationship are currently unknown. To investigate this relationship, we compared neural activation associated with fixation duration in passage reading and a pseudo-reading control condition. The results showed that fixation duration was associated with activation in oculomotor and language areas during text reading. Fixation duration during pseudo-reading, on the other hand, showed greater involvement of frontal control regions, suggesting flexibility and task dependency of the eye movement network. Consistent with current models, these results provide support for the hypothesis that fixation duration in reading reflects attentional engagement and language processing. The results also demonstrate that fixation-related fMRI provides a method for investigating the neurocognitive bases of natural reading. Copyright © 2015 Elsevier Inc. All rights reserved.
Language Analysis Package (L.A.P.) Version I System Design.
ERIC Educational Resources Information Center
Porch, Ann
To permit researchers to use the speed and versatility of the computer to process natural language text as well as numerical data without undergoing special training in programing or computer operations, a language analysis package has been developed partially based on several existing programs. An overview of the design is provided and system…
Enhancing Grammatical Structures in Web-Based Texts
ERIC Educational Resources Information Center
Zilio, Leonardo; Wilkens, Rodrigo; Fairon, Cédrick
2017-01-01
Presentation of raw text to language learners is not enough to ensure learning. Thus, we present the Smart and Immersive Language Learning Environment (SMILLE), a system that uses Natural Language Processing (NLP) for enhancing grammatical information in texts chosen by a given user. The enhancements, carried out by means of text highlighting, are…
A Morphological Analyzer for Vocalized or Not Vocalized Arabic Language
NASA Astrophysics Data System (ADS)
El Amine Abderrahim, Med; Breksi Reguig, Fethi
This research has been to show the realization of a morphological analyzer of the Arabic language (vocalized or not vocalized). This analyzer is based upon our object model for the Arabic Natural Language Processing (NLP) and can be exploited by NLP applications such as translation machine, orthographical correction and the search for information.
The "Anchor" Method: Principle and Practice.
ERIC Educational Resources Information Center
Selgin, Paul
This report discusses the "anchor" language learning method that is based upon derivation rather than construction, using Italian as an example of a language to be learned. This method borrows from the natural process of language learning as it asks the student to remember whole expressions that serve as vehicles for learning both words…
Defining the Language Assessment Literacy Gap: Evidence from a Parliamentary Inquiry
ERIC Educational Resources Information Center
Pill, John; Harding, Luke
2013-01-01
This study identifies a unique context for exploring lay understandings of language testing and, by extension, for characterizing the nature of language assessment literacy among non-practitioners, stemming from data in an inquiry into the registration processes and support for overseas trained doctors by the Australian House of Representatives…
Integrated Processing in Planning and Understanding.
1986-12-01
to language analysis seemed necessary. The second observation was the rather commonsense one that it is easier to understand a foreign language ...syntactic analysis Probably the most widely employed method for natural language analysis is augmea ted transition network parsing, or ATNs (Thorne, Bratley...accomplished. It is for this reason that the programming language Prolog, which implements that general method , has proven so well-stilted to writing ATN
Marques, J Frederico; Canessa, Nicola; Cappa, Stefano
2009-06-01
The inquiry on the nature of truth in language comprehension has a long history of opposite perspectives. These perspectives either consider that there are qualitative differences in the processing of true and false statements, or that these processes are fundamentally the same and only differ in quantitative terms. The present study evaluated the processing nature of true and false statements in terms of patterns of brain activity using event-related functional-Magnetic-Resonance-Imaging (fMRI). We show that when true and false concept-feature statements are controlled for relation strength/ambiguity, their processing is associated to qualitatively different processes. Verifying true statements activates the left inferior parietal cortex and the caudate nucleus, a neural correlate compatible with an extended search and matching process for particular stored information. In contrast, verifying false statements activates the fronto-polar cortex and is compatible with a reasoning process of finding and evaluating a contradiction between the sentence information and stored knowledge.
ERIC Educational Resources Information Center
Commissaire, Eva; Duncan, Lynne G.; Casalis, Severine
2011-01-01
This study explores the nature of orthographic processing skills among French-speaking children in Grades 6 and 8 who are learning English at school as a second language (L2). Two aspects of orthographic processing skills are thought to form a convergent construct in monolingual beginning readers: word-specific knowledge (e.g. "rain-rane") and…
TOWARDS A MULTI-SCALE AGENT-BASED PROGRAMMING LANGUAGE METHODOLOGY
Somogyi, Endre; Hagar, Amit; Glazier, James A.
2017-01-01
Living tissues are dynamic, heterogeneous compositions of objects, including molecules, cells and extra-cellular materials, which interact via chemical, mechanical and electrical process and reorganize via transformation, birth, death and migration processes. Current programming language have difficulty describing the dynamics of tissues because: 1: Dynamic sets of objects participate simultaneously in multiple processes, 2: Processes may be either continuous or discrete, and their activity may be conditional, 3: Objects and processes form complex, heterogeneous relationships and structures, 4: Objects and processes may be hierarchically composed, 5: Processes may create, destroy and transform objects and processes. Some modeling languages support these concepts, but most cannot translate models into executable simulations. We present a new hybrid executable modeling language paradigm, the Continuous Concurrent Object Process Methodology (CCOPM) which naturally expresses tissue models, enabling users to visually create agent-based models of tissues, and also allows computer simulation of these models. PMID:29282379
TOWARDS A MULTI-SCALE AGENT-BASED PROGRAMMING LANGUAGE METHODOLOGY.
Somogyi, Endre; Hagar, Amit; Glazier, James A
2016-12-01
Living tissues are dynamic, heterogeneous compositions of objects , including molecules, cells and extra-cellular materials, which interact via chemical, mechanical and electrical process and reorganize via transformation, birth, death and migration processes . Current programming language have difficulty describing the dynamics of tissues because: 1: Dynamic sets of objects participate simultaneously in multiple processes, 2: Processes may be either continuous or discrete, and their activity may be conditional, 3: Objects and processes form complex, heterogeneous relationships and structures, 4: Objects and processes may be hierarchically composed, 5: Processes may create, destroy and transform objects and processes. Some modeling languages support these concepts, but most cannot translate models into executable simulations. We present a new hybrid executable modeling language paradigm, the Continuous Concurrent Object Process Methodology ( CCOPM ) which naturally expresses tissue models, enabling users to visually create agent-based models of tissues, and also allows computer simulation of these models.
Innateness and culture in the evolution of language
Kirby, Simon; Dowman, Mike; Griffiths, Thomas L.
2007-01-01
Human language arises from biological evolution, individual learning, and cultural transmission, but the interaction of these three processes has not been widely studied. We set out a formal framework for analyzing cultural transmission, which allows us to investigate how innate learning biases are related to universal properties of language. We show that cultural transmission can magnify weak biases into strong linguistic universals, undermining one of the arguments for strong innate constraints on language learning. As a consequence, the strength of innate biases can be shielded from natural selection, allowing these genes to drift. Furthermore, even when there is no natural selection, cultural transmission can produce apparent adaptations. Cultural transmission thus provides an alternative to traditional nativist and adaptationist explanations for the properties of human languages. PMID:17360393
Odean, Rosalie; Nazareth, Alina; Pruden, Shannon M.
2015-01-01
Developmental systems theory posits that development cannot be segmented by influences acting in isolation, but should be studied through a scientific lens that highlights the complex interactions between these forces over time (Overton, 2013a). This poses a unique challenge for developmental psychologists studying complex processes like language development. In this paper, we advocate for the combining of highly sophisticated data collection technologies in an effort to move toward a more systemic approach to studying language development. We investigate the efficiency and appropriateness of combining eye-tracking technology and the LENA (Language Environment Analysis) system, an automated language analysis tool, in an effort to explore the relation between language processing in early development, and external dynamic influences like parent and educator language input in the home and school environments. Eye-tracking allows us to study language processing via eye movement analysis; these eye movements have been linked to both conscious and unconscious cognitive processing, and thus provide one means of evaluating cognitive processes underlying language development that does not require the use of subjective parent reports or checklists. The LENA system, on the other hand, provides automated language output that describes a child’s language-rich environment. In combination, these technologies provide critical information not only about a child’s language processing abilities but also about the complexity of the child’s language environment. Thus, when used in conjunction these technologies allow researchers to explore the nature of interacting systems involved in language development. PMID:26379591
Efficient Caption-Based Retrieval of Multimedia Information
1993-10-09
in the design of transportable natural language interfaces. Artifcial Intelligence , 32 (1987), 173-243. - 13- (101 Jones, M. and Eisner, J. A...systems for multimedia data . They exploit captions on the data and perform natural-language processing of them and English retrieval requests. Some...content analysis of the data is also performed to obtain additional descriptive information. The key to getting this approach to work is sufficiently
The Now-or-Never bottleneck: A fundamental constraint on language.
Christiansen, Morten H; Chater, Nick
2016-01-01
Memory is fleeting. New material rapidly obliterates previous material. How, then, can the brain deal successfully with the continual deluge of linguistic input? We argue that, to deal with this "Now-or-Never" bottleneck, the brain must compress and recode linguistic input as rapidly as possible. This observation has strong implications for the nature of language processing: (1) the language system must "eagerly" recode and compress linguistic input; (2) as the bottleneck recurs at each new representational level, the language system must build a multilevel linguistic representation; and (3) the language system must deploy all available information predictively to ensure that local linguistic ambiguities are dealt with "Right-First-Time"; once the original input is lost, there is no way for the language system to recover. This is "Chunk-and-Pass" processing. Similarly, language learning must also occur in the here and now, which implies that language acquisition is learning to process, rather than inducing, a grammar. Moreover, this perspective provides a cognitive foundation for grammaticalization and other aspects of language change. Chunk-and-Pass processing also helps explain a variety of core properties of language, including its multilevel representational structure and duality of patterning. This approach promises to create a direct relationship between psycholinguistics and linguistic theory. More generally, we outline a framework within which to integrate often disconnected inquiries into language processing, language acquisition, and language change and evolution.
ERIC Educational Resources Information Center
Hawson, Anne
1997-01-01
Three threshold hypotheses proposed by Cummins (1976) and Diaz (1985) as explanations of data on the cognitive consequences of bilingualism are examined in depth and compared to one another. A neuroscientifically updated information-processing perspective on the interaction of second-language comprehension and visual-processing ability is…
Quantifiable and objective approach to organizational performance enhancement.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Scholand, Andrew Joseph; Tausczik, Yla R.
This report describes a new methodology, social language network analysis (SLNA), that combines tools from social language processing and network analysis to identify socially situated relationships between individuals which, though subtle, are highly influential. Specifically, SLNA aims to identify and characterize the nature of working relationships by processing artifacts generated with computer-mediated communication systems, such as instant message texts or emails. Because social language processing is able to identify psychological, social, and emotional processes that individuals are not able to fully mask, social language network analysis can clarify and highlight complex interdependencies between group members, even when these relationships aremore » latent or unrecognized. This report outlines the philosophical antecedents of SLNA, the mechanics of preprocessing, processing, and post-processing stages, and some example results obtained by applying this approach to a 15-month corporate discussion archive.« less
Koizumi, Masatoshi; Imamura, Satoshi
2017-02-01
The effects of syntactic and information structures on sentence processing load were investigated using two reading comprehension experiments in Japanese, a head-final SOV language. In the first experiment, we discovered the main effects of syntactic and information structures, as well as their interaction, showing that interaction of these two factors is not restricted to head-initial languages. The second experiment revealed that the interaction between syntactic structure and information structure occurs at the second NP (O of SOV and S of OSV), which, crucially, is a pre-head position, suggesting the incremental nature of the processing of both syntactic structure and information structure in head-final languages.
NASA Astrophysics Data System (ADS)
Knoeferle, Pia
2016-03-01
In his review article [19], Arbib outlines an ambitious research agenda: to accommodate within a unified framework the evolution, the development, and the processing of language in natural settings (implicating other systems such as vision). He does so with neuro-computationally explicit modeling in mind [1,2] and inspired by research on the mirror neuron system in primates. Similar research questions have received substantial attention also among other scientists [3,4,12].
Integrating language models into classifiers for BCI communication: a review
NASA Astrophysics Data System (ADS)
Speier, W.; Arnold, C.; Pouratian, N.
2016-06-01
Objective. The present review systematically examines the integration of language models to improve classifier performance in brain-computer interface (BCI) communication systems. Approach. The domain of natural language has been studied extensively in linguistics and has been used in the natural language processing field in applications including information extraction, machine translation, and speech recognition. While these methods have been used for years in traditional augmentative and assistive communication devices, information about the output domain has largely been ignored in BCI communication systems. Over the last few years, BCI communication systems have started to leverage this information through the inclusion of language models. Main results. Although this movement began only recently, studies have already shown the potential of language integration in BCI communication and it has become a growing field in BCI research. BCI communication systems using language models in their classifiers have progressed down several parallel paths, including: word completion; signal classification; integration of process models; dynamic stopping; unsupervised learning; error correction; and evaluation. Significance. Each of these methods have shown significant progress, but have largely been addressed separately. Combining these methods could use the full potential of language model, yielding further performance improvements. This integration should be a priority as the field works to create a BCI system that meets the needs of the amyotrophic lateral sclerosis population.
Integrating language models into classifiers for BCI communication: a review.
Speier, W; Arnold, C; Pouratian, N
2016-06-01
The present review systematically examines the integration of language models to improve classifier performance in brain-computer interface (BCI) communication systems. The domain of natural language has been studied extensively in linguistics and has been used in the natural language processing field in applications including information extraction, machine translation, and speech recognition. While these methods have been used for years in traditional augmentative and assistive communication devices, information about the output domain has largely been ignored in BCI communication systems. Over the last few years, BCI communication systems have started to leverage this information through the inclusion of language models. Although this movement began only recently, studies have already shown the potential of language integration in BCI communication and it has become a growing field in BCI research. BCI communication systems using language models in their classifiers have progressed down several parallel paths, including: word completion; signal classification; integration of process models; dynamic stopping; unsupervised learning; error correction; and evaluation. Each of these methods have shown significant progress, but have largely been addressed separately. Combining these methods could use the full potential of language model, yielding further performance improvements. This integration should be a priority as the field works to create a BCI system that meets the needs of the amyotrophic lateral sclerosis population.
ERIC Educational Resources Information Center
Kolodny, Oren; Lotem, Arnon; Edelman, Shimon
2015-01-01
We introduce a set of biologically and computationally motivated design choices for modeling the learning of language, or of other types of sequential, hierarchically structured experience and behavior, and describe an implemented system that conforms to these choices and is capable of unsupervised learning from raw natural-language corpora. Given…
NLPReViz: an interactive tool for natural language processing on clinical text.
Trivedi, Gaurav; Pham, Phuong; Chapman, Wendy W; Hwa, Rebecca; Wiebe, Janyce; Hochheiser, Harry
2018-01-01
The gap between domain experts and natural language processing expertise is a barrier to extracting understanding from clinical text. We describe a prototype tool for interactive review and revision of natural language processing models of binary concepts extracted from clinical notes. We evaluated our prototype in a user study involving 9 physicians, who used our tool to build and revise models for 2 colonoscopy quality variables. We report changes in performance relative to the quantity of feedback. Using initial training sets as small as 10 documents, expert review led to final F1scores for the "appendiceal-orifice" variable between 0.78 and 0.91 (with improvements ranging from 13.26% to 29.90%). F1for "biopsy" ranged between 0.88 and 0.94 (-1.52% to 11.74% improvements). The average System Usability Scale score was 70.56. Subjective feedback also suggests possible design improvements. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
A Guide to IRUS-II Application Development
1989-09-01
Stallard (editors). Research and Develo; nent in Natural Language b’nderstan,;ng as Part of t/i Strategic Computing Program . chapter 3, pages 27-34...Development in Natural Language Processing in the Strategic Computing Program . Compi-nrional Linguistics 12(2):132-136. April-June, 1986. [24] Sidner. C.L...assist developers interested in adapting IRUS-11 to new application domains Chapter 2 provides a general introduction and overviev ,. Chapter 3 describes
Direct brain recordings reveal hippocampal rhythm underpinnings of language processing.
Piai, Vitória; Anderson, Kristopher L; Lin, Jack J; Dewar, Callum; Parvizi, Josef; Dronkers, Nina F; Knight, Robert T
2016-10-04
Language is classically thought to be supported by perisylvian cortical regions. Here we provide intracranial evidence linking the hippocampal complex to linguistic processing. We used direct recordings from the hippocampal structures to investigate whether theta oscillations, pivotal in memory function, track the amount of contextual linguistic information provided in sentences. Twelve participants heard sentences that were either constrained ("She locked the door with the") or unconstrained ("She walked in here with the") before presentation of the final word ("key"), shown as a picture that participants had to name. Hippocampal theta power increased for constrained relative to unconstrained contexts during sentence processing, preceding picture presentation. Our study implicates hippocampal theta oscillations in a language task using natural language associations that do not require memorization. These findings reveal that the hippocampal complex contributes to language in an active fashion, relating incoming words to stored semantic knowledge, a necessary process in the generation of sentence meaning.
Creation of structured documentation templates using Natural Language Processing techniques.
Kashyap, Vipul; Turchin, Alexander; Morin, Laura; Chang, Frank; Li, Qi; Hongsermeier, Tonya
2006-01-01
Structured Clinical Documentation is a fundamental component of the healthcare enterprise, linking both clinical (e.g., electronic health record, clinical decision support) and administrative functions (e.g., evaluation and management coding, billing). One of the challenges in creating good quality documentation templates has been the inability to address specialized clinical disciplines and adapt to local clinical practices. A one-size-fits-all approach leads to poor adoption and inefficiencies in the documentation process. On the other hand, the cost associated with manual generation of documentation templates is significant. Consequently there is a need for at least partial automation of the template generation process. We propose an approach and methodology for the creation of structured documentation templates for diabetes using Natural Language Processing (NLP).
ERIC Educational Resources Information Center
Gallas, Karen
Noting children's natural proclivity to interpret language freely and use that potential to expand and develop as learners, this book offers a new approach to understanding how young children communicate their knowledge of the world and how that understanding can transform the educative process. The book also describes the process of conducting…
A Cognitive Neural Architecture Able to Learn and Communicate through Natural Language.
Golosio, Bruno; Cangelosi, Angelo; Gamotina, Olesya; Masala, Giovanni Luca
2015-01-01
Communicative interactions involve a kind of procedural knowledge that is used by the human brain for processing verbal and nonverbal inputs and for language production. Although considerable work has been done on modeling human language abilities, it has been difficult to bring them together to a comprehensive tabula rasa system compatible with current knowledge of how verbal information is processed in the brain. This work presents a cognitive system, entirely based on a large-scale neural architecture, which was developed to shed light on the procedural knowledge involved in language elaboration. The main component of this system is the central executive, which is a supervising system that coordinates the other components of the working memory. In our model, the central executive is a neural network that takes as input the neural activation states of the short-term memory and yields as output mental actions, which control the flow of information among the working memory components through neural gating mechanisms. The proposed system is capable of learning to communicate through natural language starting from tabula rasa, without any a priori knowledge of the structure of phrases, meaning of words, role of the different classes of words, only by interacting with a human through a text-based interface, using an open-ended incremental learning process. It is able to learn nouns, verbs, adjectives, pronouns and other word classes, and to use them in expressive language. The model was validated on a corpus of 1587 input sentences, based on literature on early language assessment, at the level of about 4-years old child, and produced 521 output sentences, expressing a broad range of language processing functionalities.
Neural substrates of interactive musical improvisation: an FMRI study of 'trading fours' in jazz.
Donnay, Gabriel F; Rankin, Summer K; Lopez-Gonzalez, Monica; Jiradejvong, Patpong; Limb, Charles J
2014-01-01
Interactive generative musical performance provides a suitable model for communication because, like natural linguistic discourse, it involves an exchange of ideas that is unpredictable, collaborative, and emergent. Here we show that interactive improvisation between two musicians is characterized by activation of perisylvian language areas linked to processing of syntactic elements in music, including inferior frontal gyrus and posterior superior temporal gyrus, and deactivation of angular gyrus and supramarginal gyrus, brain structures directly implicated in semantic processing of language. These findings support the hypothesis that musical discourse engages language areas of the brain specialized for processing of syntax but in a manner that is not contingent upon semantic processing. Therefore, we argue that neural regions for syntactic processing are not domain-specific for language but instead may be domain-general for communication.
Neural Substrates of Interactive Musical Improvisation: An fMRI Study of ‘Trading Fours’ in Jazz
Donnay, Gabriel F.; Rankin, Summer K.; Lopez-Gonzalez, Monica; Jiradejvong, Patpong; Limb, Charles J.
2014-01-01
Interactive generative musical performance provides a suitable model for communication because, like natural linguistic discourse, it involves an exchange of ideas that is unpredictable, collaborative, and emergent. Here we show that interactive improvisation between two musicians is characterized by activation of perisylvian language areas linked to processing of syntactic elements in music, including inferior frontal gyrus and posterior superior temporal gyrus, and deactivation of angular gyrus and supramarginal gyrus, brain structures directly implicated in semantic processing of language. These findings support the hypothesis that musical discourse engages language areas of the brain specialized for processing of syntax but in a manner that is not contingent upon semantic processing. Therefore, we argue that neural regions for syntactic processing are not domain-specific for language but instead may be domain-general for communication. PMID:24586366
Music and language perception: expectations, structural integration, and cognitive sequencing.
Tillmann, Barbara
2012-10-01
Music can be described as sequences of events that are structured in pitch and time. Studying music processing provides insight into how complex event sequences are learned, perceived, and represented by the brain. Given the temporal nature of sound, expectations, structural integration, and cognitive sequencing are central in music perception (i.e., which sounds are most likely to come next and at what moment should they occur?). This paper focuses on similarities in music and language cognition research, showing that music cognition research provides insight into the understanding of not only music processing but also language processing and the processing of other structured stimuli. The hypothesis of shared resources between music and language processing and of domain-general dynamic attention has motivated the development of research to test music as a means to stimulate sensory, cognitive, and motor processes. Copyright © 2012 Cognitive Science Society, Inc.
ERIC Educational Resources Information Center
Marchman, Virginia A.; Fernald, Anne
2008-01-01
The nature of predictive relations between early language and later cognitive function is a fundamental question in research on human cognition. In a longitudinal study assessing speed of language processing in infancy, Fernald, Perfors and Marchman (2006 ) found that reaction time at 25 months was strongly related to lexical and grammatical…
ERIC Educational Resources Information Center
Crossley, Scott A.
2013-01-01
This paper provides an agenda for replication studies focusing on second language (L2) writing and the use of natural language processing (NLP) tools and machine learning algorithms. Specifically, it introduces a range of the available NLP tools and machine learning algorithms and demonstrates how these could be used to replicate seminal studies…
ERIC Educational Resources Information Center
Ozturk, Mustafa; Yildirim, Ali
2012-01-01
This study aimed to investigate the nature of the induction process of English as a foreign language (EFL) teachers teaching at tertiary level through individual interviews. In order to gather intended data, fifteen novice instructors teaching at four different public universities in Ankara were interviewed on a basis of two criteria: (a) having 1…
ERIC Educational Resources Information Center
Granger, Sylviane; Kraif, Olivier; Ponton, Claude; Antoniadis, Georges; Zampa, Virginie
2007-01-01
Learner corpora, electronic collections of spoken or written data from foreign language learners, offer unparalleled access to many hitherto uncovered aspects of learner language, particularly in their error-tagged format. This article aims to demonstrate the role that the learner corpus can play in CALL, particularly when used in conjunction with…
Natural Language as a Tool for Analyzing the Proving Process: The Case of Plane Geometry Proof
ERIC Educational Resources Information Center
Robotti, Elisabetta
2012-01-01
In the field of human cognition, language plays a special role that is connected directly to thinking and mental development (e.g., Vygotsky, "1938"). Thanks to "verbal thought", language allows humans to go beyond the limits of immediately perceived information, to form concepts and solve complex problems (Luria, "1975"). So, it appears language…
Relaxation of selection, niche construction, and the Baldwin effect in language evolution.
Yamauchi, Hajime; Hashimoto, Takashi
2010-01-01
Deacon has suggested that one of the key factors of language evolution is not characterized by an increase in genetic contribution, often known as the Baldwin effect, but rather by a decrease. This process effectively increases linguistic learning capability by organizing a novel synergy of multiple lower-order functions previously irrelevant to the process of language acquisition. Deacon posits that this transition is not caused by natural selection. Rather, it is due to the relaxation of natural selection. While there are some cases in which relaxation caused by some external factors indeed induces the transition, we do not know what kind of relaxation has worked in language evolution. In this article, a genetic-algorithm-based computer simulation is used to investigate how the niche-constructing aspect of linguistic behavior may trigger the degradation of genetic predisposition related to language learning. The results show that agents initially increase their genetic predisposition for language learning—the Baldwin effect. They create a highly uniform sociolinguistic environment—a linguistic niche construction. This means that later generations constantly receive very similar inputs from adult agents, and subsequently the selective pressure to retain the genetic predisposition is relaxed.
Verbal Counting in Bilingual Contexts
ERIC Educational Resources Information Center
Donevska-Todorova, Ana
2015-01-01
Informal experiences in mathematics often include playful competitions among young children in counting numbers in as many as possible different languages. Can these enjoyable experiences result with excellence in the formal processes of education? This article discusses connections between mathematical achievements and natural languages within…
Culture and biology in the origins of linguistic structure.
Kirby, Simon
2017-02-01
Language is systematically structured at all levels of description, arguably setting it apart from all other instances of communication in nature. In this article, I survey work over the last 20 years that emphasises the contributions of individual learning, cultural transmission, and biological evolution to explaining the structural design features of language. These 3 complex adaptive systems exist in a network of interactions: individual learning biases shape the dynamics of cultural evolution; universal features of linguistic structure arise from this cultural process and form the ultimate linguistic phenotype; the nature of this phenotype affects the fitness landscape for the biological evolution of the language faculty; and in turn this determines individuals' learning bias. Using a combination of computational simulation, laboratory experiments, and comparison with real-world cases of language emergence, I show that linguistic structure emerges as a natural outcome of cultural evolution once certain minimal biological requirements are in place.
The bridge of iconicity: from a world of experience to the experience of language.
Perniss, Pamela; Vigliocco, Gabriella
2014-09-19
Iconicity, a resemblance between properties of linguistic form (both in spoken and signed languages) and meaning, has traditionally been considered to be a marginal, irrelevant phenomenon for our understanding of language processing, development and evolution. Rather, the arbitrary and symbolic nature of language has long been taken as a design feature of the human linguistic system. In this paper, we propose an alternative framework in which iconicity in face-to-face communication (spoken and signed) is a powerful vehicle for bridging between language and human sensori-motor experience, and, as such, iconicity provides a key to understanding language evolution, development and processing. In language evolution, iconicity might have played a key role in establishing displacement (the ability of language to refer beyond what is immediately present), which is core to what language does; in ontogenesis, iconicity might play a critical role in supporting referentiality (learning to map linguistic labels to objects, events, etc., in the world), which is core to vocabulary development. Finally, in language processing, iconicity could provide a mechanism to account for how language comes to be embodied (grounded in our sensory and motor systems), which is core to meaningful communication.
The bridge of iconicity: from a world of experience to the experience of language
Perniss, Pamela; Vigliocco, Gabriella
2014-01-01
Iconicity, a resemblance between properties of linguistic form (both in spoken and signed languages) and meaning, has traditionally been considered to be a marginal, irrelevant phenomenon for our understanding of language processing, development and evolution. Rather, the arbitrary and symbolic nature of language has long been taken as a design feature of the human linguistic system. In this paper, we propose an alternative framework in which iconicity in face-to-face communication (spoken and signed) is a powerful vehicle for bridging between language and human sensori-motor experience, and, as such, iconicity provides a key to understanding language evolution, development and processing. In language evolution, iconicity might have played a key role in establishing displacement (the ability of language to refer beyond what is immediately present), which is core to what language does; in ontogenesis, iconicity might play a critical role in supporting referentiality (learning to map linguistic labels to objects, events, etc., in the world), which is core to vocabulary development. Finally, in language processing, iconicity could provide a mechanism to account for how language comes to be embodied (grounded in our sensory and motor systems), which is core to meaningful communication. PMID:25092668
Greenwald, Jeffrey L; Cronin, Patrick R; Carballo, Victoria; Danaei, Goodarz; Choy, Garry
2017-03-01
With the increasing focus on reducing hospital readmissions in the United States, numerous readmissions risk prediction models have been proposed, mostly developed through analyses of structured data fields in electronic medical records and administrative databases. Three areas that may have an impact on readmission but are poorly captured using structured data sources are patients' physical function, cognitive status, and psychosocial environment and support. The objective of the study was to build a discriminative model using information germane to these 3 areas to identify hospitalized patients' risk for 30-day all cause readmissions. We conducted clinician focus groups to identify language used in the clinical record regarding these 3 areas. We then created a dataset including 30,000 inpatients, 10,000 from each of 3 hospitals, and searched those records for the focus group-derived language using natural language processing. A 30-day readmission prediction model was developed on 75% of the dataset and validated on the other 25% and also on hospital specific subsets. Focus group language was aggregated into 35 variables. The final model had 16 variables, a validated C-statistic of 0.74, and was well calibrated. Subset validation of the model by hospital yielded C-statistics of 0.70-0.75. Deriving a 30-day readmission risk prediction model through identification of physical, cognitive, and psychosocial issues using natural language processing yielded a model that performs similarly to the better performing models previously published with the added advantage of being based on clinically relevant factors and also automated and scalable. Because of the clinical relevance of the variables in the model, future research may be able to test if targeting interventions to identified risks results in reductions in readmissions.
Learning for Semantic Parsing with Kernels under Various Forms of Supervision
2007-08-01
natural language sentences to their formal executable meaning representations. This is a challenging problem and is critical for developing computing...sentences are semantically tractable. This indi- cates that Geoquery is more challenging domain for semantic parsing than ATIS. In the past, there have been a...Combining parsers. In Proceedings of the Conference on Em- pirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/ VLC -99), pp. 187–194
Tanana, Michael; Hallgren, Kevin A; Imel, Zac E; Atkins, David C; Srikumar, Vivek
2016-06-01
Motivational interviewing (MI) is an efficacious treatment for substance use disorders and other problem behaviors. Studies on MI fidelity and mechanisms of change typically use human raters to code therapy sessions, which requires considerable time, training, and financial costs. Natural language processing techniques have recently been utilized for coding MI sessions using machine learning techniques, rather than human coders, and preliminary results have suggested these methods hold promise. The current study extends this previous work by introducing two natural language processing models for automatically coding MI sessions via computer. The two models differ in the way they semantically represent session content, utilizing either 1) simple discrete sentence features (DSF model) and 2) more complex recursive neural networks (RNN model). Utterance- and session-level predictions from these models were compared to ratings provided by human coders using a large sample of MI sessions (N=341 sessions; 78,977 clinician and client talk turns) from 6 MI studies. Results show that the DSF model generally had slightly better performance compared to the RNN model. The DSF model had "good" or higher utterance-level agreement with human coders (Cohen's kappa>0.60) for open and closed questions, affirm, giving information, and follow/neutral (all therapist codes); considerably higher agreement was obtained for session-level indices, and many estimates were competitive with human-to-human agreement. However, there was poor agreement for client change talk, client sustain talk, and therapist MI-inconsistent behaviors. Natural language processing methods provide accurate representations of human derived behavioral codes and could offer substantial improvements to the efficiency and scale in which MI mechanisms of change research and fidelity monitoring are conducted. Copyright © 2016 Elsevier Inc. All rights reserved.
Chen, W; Kowatch, R; Lin, S; Splaingard, M; Huang, Y
2015-01-01
Nationwide Children's Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semistructured sleep study reports. The system showed to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and a NLP based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. 26,550 concepts were extracted with 99% being textual concepts. 1.01 million facts were extracted from sleep study documents such as demographic information, sleep study lab results, medications, procedures, diagnoses, among others. The average accuracy of terminology parsing was over 83% when comparing against those by experts. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly from a few weeks to a few seconds. Natural language processing was shown to be powerful for quickly converting large amount of semi-structured or unstructured clinical data into discrete concepts, which in combination of intuitive domain specific ontologies, allows fast and effective interactive cohort identification through the i2b2 platform for research and clinical use.
Chen, W.; Kowatch, R.; Lin, S.; Splaingard, M.
2015-01-01
Summary Nationwide Children’s Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semistructured sleep study reports. The system showed to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. Objective We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. Methods We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and a NLP based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. Results 26,550 concepts were extracted with 99% being textual concepts. 1.01 million facts were extracted from sleep study documents such as demographic information, sleep study lab results, medications, procedures, diagnoses, among others. The average accuracy of terminology parsing was over 83% when comparing against those by experts. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly from a few weeks to a few seconds. Conclusion Natural language processing was shown to be powerful for quickly converting large amount of semi-structured or unstructured clinical data into discrete concepts, which in combination of intuitive domain specific ontologies, allows fast and effective interactive cohort identification through the i2b2 platform for research and clinical use. PMID:26171080
BioC: a minimalist approach to interoperability for biomedical text processing
Comeau, Donald C.; Islamaj Doğan, Rezarta; Ciccarese, Paolo; Cohen, Kevin Bretonnel; Krallinger, Martin; Leitner, Florian; Lu, Zhiyong; Peng, Yifan; Rinaldi, Fabio; Torii, Manabu; Valencia, Alfonso; Verspoor, Karin; Wiegers, Thomas C.; Wu, Cathy H.; Wilbur, W. John
2013-01-01
A vast amount of scientific information is encoded in natural language text, and the quantity of such text has become so great that it is no longer economically feasible to have a human as the first step in the search process. Natural language processing and text mining tools have become essential to facilitate the search for and extraction of information from text. This has led to vigorous research efforts to create useful tools and to create humanly labeled text corpora, which can be used to improve such tools. To encourage combining these efforts into larger, more powerful and more capable systems, a common interchange format to represent, store and exchange the data in a simple manner between different language processing systems and text mining tools is highly desirable. Here we propose a simple extensible mark-up language format to share text documents and annotations. The proposed annotation approach allows a large number of different annotations to be represented including sentences, tokens, parts of speech, named entities such as genes or diseases and relationships between named entities. In addition, we provide simple code to hold this data, read it from and write it back to extensible mark-up language files and perform some sample processing. We also describe completed as well as ongoing work to apply the approach in several directions. Code and data are available at http://bioc.sourceforge.net/. Database URL: http://bioc.sourceforge.net/ PMID:24048470
ERIC Educational Resources Information Center
Dominey, Peter Ford; Inui, Toshio; Hoen, Michel
2009-01-01
A central issue in cognitive neuroscience today concerns how distributed neural networks in the brain that are used in language learning and processing can be involved in non-linguistic cognitive sequence learning. This issue is informed by a wealth of functional neurophysiology studies of sentence comprehension, along with a number of recent…
ERIC Educational Resources Information Center
North, Brian; Piccardo, Enrica
2016-01-01
The notion of mediation has been the object of growing interest in second language education in recent years. The increasing awareness of the complex nature of the process of learning--and teaching--stretches our collective reflection towards less explored areas. In mediation, the immediate focus is on the role of language in processes like…
The KIT Motion-Language Dataset.
Plappert, Matthias; Mandery, Christian; Asfour, Tamim
2016-12-01
Linking human motion and natural language is of great interest for the generation of semantic representations of human activities as well as for the generation of robot activities based on natural language input. However, although there have been years of research in this area, no standardized and openly available data set exists to support the development and evaluation of such systems. We, therefore, propose the Karlsruhe Institute of Technology (KIT) Motion-Language Dataset, which is large, open, and extensible. We aggregate data from multiple motion capture databases and include them in our data set using a unified representation that is independent of the capture system or marker set, making it easy to work with the data regardless of its origin. To obtain motion annotations in natural language, we apply a crowd-sourcing approach and a web-based tool that was specifically build for this purpose, the Motion Annotation Tool. We thoroughly document the annotation process itself and discuss gamification methods that we used to keep annotators motivated. We further propose a novel method, perplexity-based selection, which systematically selects motions for further annotation that are either under-represented in our data set or that have erroneous annotations. We show that our method mitigates the two aforementioned problems and ensures a systematic annotation process. We provide an in-depth analysis of the structure and contents of our resulting data set, which, as of October 10, 2016, contains 3911 motions with a total duration of 11.23 hours and 6278 annotations in natural language that contain 52,903 words. We believe this makes our data set an excellent choice that enables more transparent and comparable research in this important area.
Spanish language generation engine to enhance the syntactic quality of AAC systems
NASA Astrophysics Data System (ADS)
Narváez A., Cristian; Sastoque H., Sebastián.; Iregui G., Marcela
2015-12-01
People with Complex Communication Needs (CCN) face difficulties to communicate their ideas, feelings and needs. Augmentative and Alternative Communication (AAC) approaches aim to provide support to enhance socialization of these individuals. However, there are many limitations in current applications related with systems operation, target scenarios and language consistency. This work presents an AAC approach to enhance produced messages by applying elements of Natural Language Generation. Specifically, a Spanish language engine, composed of a grammar ontology and a set of linguistic rules, is proposed to improve the naturalness in the communication process, when persons with CCN tell stories about their daily activities to non-disabled receivers. The assessment of the proposed method confirms the validity of the model to improve messages quality.
ERIC Educational Resources Information Center
Serna Dimas, Hector Manuel
2013-01-01
Literacy is one of the most fundamental processes in the life of people. It is complex enough when people develop these processes in their first language, and the nature of the task becomes even more challenging when it is developed with students in a second language within the context of a bilingual setting. Bilingual education has been based on…
Corina, David P; Knapp, Heather Patterson
2008-12-01
In the quest to further understand the neural underpinning of human communication, researchers have turned to studies of naturally occurring signed languages used in Deaf communities. The comparison of the commonalities and differences between spoken and signed languages provides an opportunity to determine core neural systems responsible for linguistic communication independent of the modality in which a language is expressed. The present article examines such studies, and in addition asks what we can learn about human languages by contrasting formal visual-gestural linguistic systems (signed languages) with more general human action perception. To understand visual language perception, it is important to distinguish the demands of general human motion processing from the highly task-dependent demands associated with extracting linguistic meaning from arbitrary, conventionalized gestures. This endeavor is particularly important because theorists have suggested close homologies between perception and production of actions and functions of human language and social communication. We review recent behavioral, functional imaging, and neuropsychological studies that explore dissociations between the processing of human actions and signed languages. These data suggest incomplete overlap between the mirror-neuron systems proposed to mediate human action and language.
Dynamic changes in network activations characterize early learning of a natural language.
Plante, Elena; Patterson, Dianne; Dailey, Natalie S; Kyle, R Almyrde; Fridriksson, Julius
2014-09-01
Those who are initially exposed to an unfamiliar language have difficulty separating running speech into individual words, but over time will recognize both words and the grammatical structure of the language. Behavioral studies have used artificial languages to demonstrate that humans are sensitive to distributional information in language input, and can use this information to discover the structure of that language. This is done without direct instruction and learning occurs over the course of minutes rather than days or months. Moreover, learners may attend to different aspects of the language input as their own learning progresses. Here, we examine processing associated with the early stages of exposure to a natural language, using fMRI. Listeners were exposed to an unfamiliar language (Icelandic) while undergoing four consecutive fMRI scans. The Icelandic stimuli were constrained in ways known to produce rapid learning of aspects of language structure. After approximately 4 min of exposure to the Icelandic stimuli, participants began to differentiate between correct and incorrect sentences at above chance levels, with significant improvement between the first and last scan. An independent component analysis of the imaging data revealed four task-related components, two of which were associated with behavioral performance early in the experiment, and two with performance later in the experiment. This outcome suggests dynamic changes occur in the recruitment of neural resources even within the initial period of exposure to an unfamiliar natural language. Copyright © 2014 Elsevier Ltd. All rights reserved.
On a Possible Relationship between Linguistic Expertise and EEG Gamma Band Phase Synchrony
Reiterer, Susanne; Pereda, Ernesto; Bhattacharya, Joydeep
2011-01-01
Recent research has shown that extensive training in and exposure to a second language can modify the language organization in the brain by causing both structural and functional changes. However it is not yet known how these changes are manifested by the dynamic brain oscillations and synchronization patterns subserving the language networks. In search for synchronization correlates of proficiency and expertise in second language acquisition, multivariate EEG signals were recorded from 44 high and low proficiency bilinguals during processing of natural language in their first and second languages. Gamma band (30–45 Hz) phase synchronization (PS) was calculated mainly by two recently developed methods: coarse-graining of Markov chains (estimating global phase synchrony, measuring the degree of PS between one electrode and all other electrodes), and phase lag index (PLI; estimating bivariate phase synchrony, measuring the degree of PS between a pair of electrodes). On comparing second versus first language processing, global PS by coarse-graining Markov chains indicated that processing of the second language needs significantly higher synchronization strength than first language. On comparing the proficiency groups, bivariate PS measure (i.e., PLI) revealed that during second language processing the low proficiency group showed stronger and broader network patterns than the high proficiency group, with interconnectivities between a left fronto-parietal network. Mean phase coherence analysis also indicated that the network activity was globally stronger in the low proficiency group during second language processing. PMID:22125542
Kreimeyer, Kory; Foster, Matthew; Pandey, Abhishek; Arya, Nina; Halford, Gwendolyn; Jones, Sandra F; Forshee, Richard; Walderhaug, Mark; Botsis, Taxiarchis
2017-09-01
We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers. A total of 7149 records (after removing duplicates) were retrieved and screened, and 86 were determined to fit the review criteria. These papers contained information about 71 different clinical NLP systems, which were then analyzed. The NLP systems address a wide variety of important clinical and research tasks. Certain tasks are well addressed by the existing systems, while others remain as open challenges that only a small number of systems attempt, such as extraction of temporal information or normalization of concepts to standard terminologies. This review has identified many NLP systems capable of processing clinical free text and generating structured output, and the information collected and evaluated here will be important for prioritizing development of new approaches for clinical NLP. Copyright © 2017 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Vasishth, Shravan
2017-07-01
This interesting and informative review by Liu and colleagues [17] in this issue covers the full spectrum of research on the idea that in natural language, dependency distance tends to be small. The authors discuss two distinct research threads: experimental work from psycholinguistics on online processes in comprehension and production, and text-corpus studies of dependency length distributions.
1974-07-01
iiWU -immmemmmmm This document was generated by the Stanford Artificial Intelligence Laboratory’s document compiler, "PUB" and reproducec’ on a...for more sophisticated artificial (programming) languages. The new issues became those of how to represent a grammar as precise syntactic structures...challenge lies in discovering - either by synthesis of an artificial system, or by analysis of a natural one - the underlying logical (a. opposed to
Memory Operations and Structures in Sentence Comprehension: Evidence from Ellipsis
ERIC Educational Resources Information Center
Martin, Andrea Eyleen
2010-01-01
Natural language often contains dependencies that span words, phrases, or even sentences. Thus, language comprehension relies on recovering recently processed information from memory for subsequent interpretation. This dissertation investigates the memory operations that subserve dependency resolution through the lens of "verb-phrase ellipsis"…
A Novel Approach to Creating Disambiguated Multilingual Dictionaries
ERIC Educational Resources Information Center
Boguslavsky, Igor; Cardenosa, Jesus; Gallardo, Carolina
2009-01-01
Multilingual lexicons are needed in various applications, such as cross-lingual information retrieval, machine translation, and some others. Often, these applications suffer from the ambiguity of dictionary items, especially when an intermediate natural language is involved in the process of the dictionary construction, since this language adds…
NASA Technical Reports Server (NTRS)
Gomez, Fernando
1989-01-01
It is shown how certain kinds of domain independent expert systems based on classification problem-solving methods can be constructed directly from natural language descriptions by a human expert. The expert knowledge is not translated into production rules. Rather, it is mapped into conceptual structures which are integrated into long-term memory (LTM). The resulting system is one in which problem-solving, retrieval and memory organization are integrated processes. In other words, the same algorithm and knowledge representation structures are shared by these processes. As a result of this, the system can answer questions, solve problems or reorganize LTM.
Hu, Weiming; Tian, Guodong; Kang, Yongxin; Yuan, Chunfeng; Maybank, Stephen
2017-09-25
In this paper, a new nonparametric Bayesian model called the dual sticky hierarchical Dirichlet process hidden Markov model (HDP-HMM) is proposed for mining activities from a collection of time series data such as trajectories. All the time series data are clustered. Each cluster of time series data, corresponding to a motion pattern, is modeled by an HMM. Our model postulates a set of HMMs that share a common set of states (topics in an analogy with topic models for document processing), but have unique transition distributions. For the application to motion trajectory modeling, topics correspond to motion activities. The learnt topics are clustered into atomic activities which are assigned predicates. We propose a Bayesian inference method to decompose a given trajectory into a sequence of atomic activities. On combining the learnt sources and sinks, semantic motion regions, and the learnt sequence of atomic activities, the action represented by the trajectory can be described in natural language in as automatic a way as possible. The effectiveness of our dual sticky HDP-HMM is validated on several trajectory datasets. The effectiveness of the natural language descriptions for motions is demonstrated on the vehicle trajectories extracted from a traffic scene.
Hampton Wray, Amanda; Weber-Fox, Christine
2013-07-01
The neural activity mediating language processing in young children is characterized by large individual variability that is likely related in part to individual strengths and weakness across various cognitive abilities. The current study addresses the following question: How does proficiency in specific cognitive and language functions impact neural indices mediating language processing in children? Thirty typically developing seven- and eight-year-olds were divided into high-normal and low-normal proficiency groups based on performance on nonverbal IQ, auditory word recall, and grammatical morphology tests. Event-related brain potentials (ERPs) were elicited by semantic anomalies and phrase structure violations in naturally spoken sentences. The proficiency for each of the specific cognitive and language tasks uniquely contributed to specific aspects (e.g., timing and/or resource allocation) of neural indices underlying semantic (N400) and syntactic (P600) processing. These results suggest that distinct aptitudes within broader domains of cognition and language, even within the normal range, influence the neural signatures of semantic and syntactic processing. Furthermore, the current findings have important implications for the design and interpretation of developmental studies of ERPs indexing language processing, and they highlight the need to take into account cognitive abilities both within and outside the classic language domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
2001-01-01
In this contract, which is a component of a larger contract that we plan to submit in the coming months, we plan to study the preprocessing issues which arise in applying natural language processing techniques to NASA-KSC problem reports. The goals of this work will be to deal with the issues of: a) automatically obtaining the problem reports from NASA-KSC data bases, b) the format of these reports and c) the conversion of these reports to a format that will be adequate for our natural language software. At the end of this contract, we expect that these problems will be solved and that we will be ready to apply our natural language software to a text database of over 1000 KSC problem reports.
Neurophysiological mechanisms involved in language learning in adults
Rodríguez-Fornells, Antoni; Cunillera, Toni; Mestres-Missé, Anna; de Diego-Balaguer, Ruth
2009-01-01
Little is known about the brain mechanisms involved in word learning during infancy and in second language acquisition and about the way these new words become stable representations that sustain language processing. In several studies we have adopted the human simulation perspective, studying the effects of brain-lesions and combining different neuroimaging techniques such as event-related potentials and functional magnetic resonance imaging in order to examine the language learning (LL) process. In the present article, we review this evidence focusing on how different brain signatures relate to (i) the extraction of words from speech, (ii) the discovery of their embedded grammatical structure, and (iii) how meaning derived from verbal contexts can inform us about the cognitive mechanisms underlying the learning process. We compile these findings and frame them into an integrative neurophysiological model that tries to delineate the major neural networks that might be involved in the initial stages of LL. Finally, we propose that LL simulations can help us to understand natural language processing and how the recovery from language disorders in infants and adults can be accomplished. PMID:19933142
Somogyi, Endre; Glazier, James A.
2017-01-01
Biological cells are the prototypical example of active matter. Cells sense and respond to mechanical, chemical and electrical environmental stimuli with a range of behaviors, including dynamic changes in morphology and mechanical properties, chemical uptake and secretion, cell differentiation, proliferation, death, and migration. Modeling and simulation of such dynamic phenomena poses a number of computational challenges. A modeling language describing cellular dynamics must naturally represent complex intra and extra-cellular spatial structures and coupled mechanical, chemical and electrical processes. Domain experts will find a modeling language most useful when it is based on concepts, terms and principles native to the problem domain. A compiler must then be able to generate an executable model from this physically motivated description. Finally, an executable model must efficiently calculate the time evolution of such dynamic and inhomogeneous phenomena. We present a spatial hybrid systems modeling language, compiler and mesh-free Lagrangian based simulation engine which will enable domain experts to define models using natural, biologically motivated constructs and to simulate time evolution of coupled cellular, mechanical and chemical processes acting on a time varying number of cells and their environment. PMID:29303160
Somogyi, Endre; Glazier, James A
2017-04-01
Biological cells are the prototypical example of active matter. Cells sense and respond to mechanical, chemical and electrical environmental stimuli with a range of behaviors, including dynamic changes in morphology and mechanical properties, chemical uptake and secretion, cell differentiation, proliferation, death, and migration. Modeling and simulation of such dynamic phenomena poses a number of computational challenges. A modeling language describing cellular dynamics must naturally represent complex intra and extra-cellular spatial structures and coupled mechanical, chemical and electrical processes. Domain experts will find a modeling language most useful when it is based on concepts, terms and principles native to the problem domain. A compiler must then be able to generate an executable model from this physically motivated description. Finally, an executable model must efficiently calculate the time evolution of such dynamic and inhomogeneous phenomena. We present a spatial hybrid systems modeling language, compiler and mesh-free Lagrangian based simulation engine which will enable domain experts to define models using natural, biologically motivated constructs and to simulate time evolution of coupled cellular, mechanical and chemical processes acting on a time varying number of cells and their environment.
Selecting the Best Mobile Information Service with Natural Language User Input
NASA Astrophysics Data System (ADS)
Feng, Qiangze; Qi, Hongwei; Fukushima, Toshikazu
Information services accessed via mobile phones provide information directly relevant to subscribers’ daily lives and are an area of dynamic market growth worldwide. Although many information services are currently offered by mobile operators, many of the existing solutions require a unique gateway for each service, and it is inconvenient for users to have to remember a large number of such gateways. Furthermore, the Short Message Service (SMS) is very popular in China and Chinese users would prefer to access these services in natural language via SMS. This chapter describes a Natural Language Based Service Selection System (NL3S) for use with a large number of mobile information services. The system can accept user queries in natural language and navigate it to the required service. Since it is difficult for existing methods to achieve high accuracy and high coverage and anticipate which other services a user might want to query, the NL3S is developed based on a Multi-service Ontology (MO) and Multi-service Query Language (MQL). The MO and MQL provide semantic and linguistic knowledge, respectively, to facilitate service selection for a user query and to provide adaptive service recommendations. Experiments show that the NL3S can achieve 75-95% accuracies and 85-95% satisfactions for processing various styles of natural language queries. A trial involving navigation of 30 different mobile services shows that the NL3S can provide a viable commercial solution for mobile operators.
Automatic generation of the index of productive syntax for child language transcripts.
Hassanali, Khairun-nisa; Liu, Yang; Iglesias, Aquiles; Solorio, Thamar; Dollaghan, Christine
2014-03-01
The index of productive syntax (IPSyn; Scarborough (Applied Psycholinguistics 11:1-22, 1990) is a measure of syntactic development in child language that has been used in research and clinical settings to investigate the grammatical development of various groups of children. However, IPSyn is mostly calculated manually, which is an extremely laborious process. In this article, we describe the AC-IPSyn system, which automatically calculates the IPSyn score for child language transcripts using natural language processing techniques. Our results show that the AC-IPSyn system performs at levels comparable to scores computed manually. The AC-IPSyn system can be downloaded from www.hlt.utdallas.edu/~nisa/ipsyn.html .
Language Supports for Journal Abstract Writing across Disciplines
ERIC Educational Resources Information Center
Liou, H.-C.; Yang, P.-C.; Chang, J. S.
2012-01-01
Various writing assistance tools have been developed through efforts in the areas of natural language processing with different degrees of success of curriculum integration depending on their functional rigor and pedagogical designs. In this paper, we developed a system, WriteAhead, that provides six types of suggestions when non-native graduate…
Inferring Speaker Affect in Spoken Natural Language Communication
ERIC Educational Resources Information Center
Pon-Barry, Heather Roberta
2013-01-01
The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…
English Complex Verb Constructions: Identification and Inference
ERIC Educational Resources Information Center
Tu, Yuancheng
2012-01-01
The fundamental problem faced by automatic text understanding in Natural Language Processing (NLP) is to identify semantically related pieces of text and integrate them together to compute the meaning of the whole text. However, the principle of compositionality runs into trouble very quickly when real language is examined with its frequent…
Process for selecting engineering tools : applied to selecting a SysML tool.
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Spain, Mark J.; Post, Debra S.; Taylor, Jeffrey L.
2011-02-01
Process for Selecting Engineering Tools outlines the process and tools used to select a SysML (Systems Modeling Language) tool. The process is general in nature and users could use the process to select most engineering tools and software applications.
Acquiring and processing verb argument structure: distributional learning in a miniature language.
Wonnacott, Elizabeth; Newport, Elissa L; Tanenhaus, Michael K
2008-05-01
Adult knowledge of a language involves correctly balancing lexically-based and more language-general patterns. For example, verb argument structures may sometimes readily generalize to new verbs, yet with particular verbs may resist generalization. From the perspective of acquisition, this creates significant learnability problems, with some researchers claiming a crucial role for verb semantics in the determination of when generalization may and may not occur. Similarly, there has been debate regarding how verb-specific and more generalized constraints interact in sentence processing and on the role of semantics in this process. The current work explores these issues using artificial language learning. In three experiments using languages without semantic cues to verb distribution, we demonstrate that learners can acquire both verb-specific and verb-general patterns, based on distributional information in the linguistic input regarding each of the verbs as well as across the language as a whole. As with natural languages, these factors are shown to affect production, judgments and real-time processing. We demonstrate that learners apply a rational procedure in determining their usage of these different input statistics and conclude by suggesting that a Bayesian perspective on statistical learning may be an appropriate framework for capturing our findings.
Sequence Memory Constraints Give Rise to Language-Like Structure through Iterated Learning
Cornish, Hannah; Dale, Rick; Kirby, Simon; Christiansen, Morten H.
2017-01-01
Human language is composed of sequences of reusable elements. The origins of the sequential structure of language is a hotly debated topic in evolutionary linguistics. In this paper, we show that sets of sequences with language-like statistical properties can emerge from a process of cultural evolution under pressure from chunk-based memory constraints. We employ a novel experimental task that is non-linguistic and non-communicative in nature, in which participants are trained on and later asked to recall a set of sequences one-by-one. Recalled sequences from one participant become training data for the next participant. In this way, we simulate cultural evolution in the laboratory. Our results show a cumulative increase in structure, and by comparing this structure to data from existing linguistic corpora, we demonstrate a close parallel between the sets of sequences that emerge in our experiment and those seen in natural language. PMID:28118370
Sequence Memory Constraints Give Rise to Language-Like Structure through Iterated Learning.
Cornish, Hannah; Dale, Rick; Kirby, Simon; Christiansen, Morten H
2017-01-01
Human language is composed of sequences of reusable elements. The origins of the sequential structure of language is a hotly debated topic in evolutionary linguistics. In this paper, we show that sets of sequences with language-like statistical properties can emerge from a process of cultural evolution under pressure from chunk-based memory constraints. We employ a novel experimental task that is non-linguistic and non-communicative in nature, in which participants are trained on and later asked to recall a set of sequences one-by-one. Recalled sequences from one participant become training data for the next participant. In this way, we simulate cultural evolution in the laboratory. Our results show a cumulative increase in structure, and by comparing this structure to data from existing linguistic corpora, we demonstrate a close parallel between the sets of sequences that emerge in our experiment and those seen in natural language.
ERIC Educational Resources Information Center
Freedman, Sarah Warshauer, Ed.
Viewing writing as both a form of language learning and an intellectual skill, this book presents essays on how writers acquire trusted inner voices and the roles schools and teachers can play in helping student writers in the learning process. The essays in the book focus on one of three topics: the language of instruction and how response and…
Indo-European languages tree by Levenshtein distance
NASA Astrophysics Data System (ADS)
Serva, M.; Petroni, F.
2008-03-01
The evolution of languages closely resembles the evolution of haploid organisms. This similarity has been recently exploited (Gray R. D. and Atkinson Q. D., Nature, 426 (2003) 435; Gray R. D. and Jordan F. M., Nature, 405 (2000) 1052) to construct language trees. The key point is the definition of a distance among all pairs of languages which is the analogous of a genetic distance. Many methods have been proposed to define these distances; one of these, used by glottochronology, computes the distance from the percentage of shared "cognates". Cognates are words inferred to have a common historical origin, and subjective judgment plays a relevant role in the identification process. Here we push closer the analogy with evolutionary biology and we introduce a genetic distance among language pairs by considering a renormalized Levenshtein distance among words with same meaning and averaging on all words contained in a Swadesh list (Swadesh M., Proc. Am. Philos. Soc., 96 (1952) 452). The subjectivity of process is consistently reduced and the reproducibility is highly facilitated. We test our method against the Indo-European group considering fifty different languages and the two hundred words of the Swadesh list for any of them. We find out a tree which closely resembles the one published in Gray and Atkinson (2003), with some significant differences.
Christiansen, Morten H.; Conway, Christopher M.; Onnis, Luca
2011-01-01
We used event-related potentials (ERPs) to investigate the time course and distribution of brain activity while adults performed (a) a sequential learning task involving complex structured sequences, and (b) a language processing task. The same positive ERP deflection, the P600 effect, typically linked to difficult or ungrammatical syntactic processing, was found for structural incongruencies in both sequential learning as well as natural language, and with similar topographical distributions. Additionally, a left anterior negativity (LAN) was observed for language but not for sequential learning. These results are interpreted as an indication that the P600 provides an index of violations and the cost of integration of expectations for upcoming material when processing complex sequential structure. We conclude that the same neural mechanisms may be recruited for both syntactic processing of linguistic stimuli and sequential learning of structured sequence patterns more generally. PMID:23678205
The neurobiology of syntax: beyond string sets.
Petersson, Karl Magnus; Hagoort, Peter
2012-07-19
The human capacity to acquire language is an outstanding scientific challenge to understand. Somehow our language capacities arise from the way the human brain processes, develops and learns in interaction with its environment. To set the stage, we begin with a summary of what is known about the neural organization of language and what our artificial grammar learning (AGL) studies have revealed. We then review the Chomsky hierarchy in the context of the theory of computation and formal learning theory. Finally, we outline a neurobiological model of language acquisition and processing based on an adaptive, recurrent, spiking network architecture. This architecture implements an asynchronous, event-driven, parallel system for recursive processing. We conclude that the brain represents grammars (or more precisely, the parser/generator) in its connectivity, and its ability for syntax is based on neurobiological infrastructure for structured sequence processing. The acquisition of this ability is accounted for in an adaptive dynamical systems framework. Artificial language learning (ALL) paradigms might be used to study the acquisition process within such a framework, as well as the processing properties of the underlying neurobiological infrastructure. However, it is necessary to combine and constrain the interpretation of ALL results by theoretical models and empirical studies on natural language processing. Given that the faculty of language is captured by classical computational models to a significant extent, and that these can be embedded in dynamic network architectures, there is hope that significant progress can be made in understanding the neurobiology of the language faculty.
The neurobiology of syntax: beyond string sets
Petersson, Karl Magnus; Hagoort, Peter
2012-01-01
The human capacity to acquire language is an outstanding scientific challenge to understand. Somehow our language capacities arise from the way the human brain processes, develops and learns in interaction with its environment. To set the stage, we begin with a summary of what is known about the neural organization of language and what our artificial grammar learning (AGL) studies have revealed. We then review the Chomsky hierarchy in the context of the theory of computation and formal learning theory. Finally, we outline a neurobiological model of language acquisition and processing based on an adaptive, recurrent, spiking network architecture. This architecture implements an asynchronous, event-driven, parallel system for recursive processing. We conclude that the brain represents grammars (or more precisely, the parser/generator) in its connectivity, and its ability for syntax is based on neurobiological infrastructure for structured sequence processing. The acquisition of this ability is accounted for in an adaptive dynamical systems framework. Artificial language learning (ALL) paradigms might be used to study the acquisition process within such a framework, as well as the processing properties of the underlying neurobiological infrastructure. However, it is necessary to combine and constrain the interpretation of ALL results by theoretical models and empirical studies on natural language processing. Given that the faculty of language is captured by classical computational models to a significant extent, and that these can be embedded in dynamic network architectures, there is hope that significant progress can be made in understanding the neurobiology of the language faculty. PMID:22688633
Are Nonadjacent Collocations Processed Faster?
ERIC Educational Resources Information Center
Vilkaite, Laura
2016-01-01
Numerous studies have shown processing advantages for collocations, but they only investigated processing of adjacent collocations (e.g., "provide information"). However, in naturally occurring language, nonadjacent collocations ("provide" some of the "information") are equally, if not more frequent. This raises the…
Guimaraes, Carolina V; Grzeszczuk, Robert; Bisset, George S; Donnelly, Lane F
2018-03-01
When implementing or monitoring department-sanctioned standardized radiology reports, feedback about individual faculty performance has been shown to be a useful driver of faculty compliance. Most commonly, these data are derived from manual audit, which can be both time-consuming and subject to sampling error. The purpose of this study was to evaluate whether a software program using natural language processing and machine learning could accurately audit radiologist compliance with the use of standardized reports compared with performed manual audits. Radiology reports from a 1-month period were loaded into such a software program, and faculty compliance with use of standardized reports was calculated. For that same period, manual audits were performed (25 reports audited for each of 42 faculty members). The mean compliance rates calculated by automated auditing were then compared with the confidence interval of the mean rate by manual audit. The mean compliance rate for use of standardized reports as determined by manual audit was 91.2% with a confidence interval between 89.3% and 92.8%. The mean compliance rate calculated by automated auditing was 92.0%, within that confidence interval. This study shows that by use of natural language processing and machine learning algorithms, an automated analysis can accurately define whether reports are compliant with use of standardized report templates and language, compared with manual audits. This may avoid significant labor costs related to conducting the manual auditing process. Copyright © 2017 American College of Radiology. Published by Elsevier Inc. All rights reserved.
Intelligent communication assistant for databases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jakobson, G.; Shaked, V.; Rowley, S.
1983-01-01
An intelligent communication assistant for databases, called FRED (front end for databases) is explored. FRED is designed to facilitate access to database systems by users of varying levels of experience. FRED is a second generation of natural language front-ends for databases and intends to solve two critical interface problems existing between end-users and databases: connectivity and communication problems. The authors report their experiences in developing software for natural language query processing, dialog control, and knowledge representation, as well as the direction of future work. 10 references.
Bilateral parietal contributions to spatial language.
Conder, Julie; Fridriksson, Julius; Baylis, Gordon C; Smith, Cameron M; Boiteau, Timothy W; Almor, Amit
2017-01-01
It is commonly held that language is largely lateralized to the left hemisphere in most individuals, whereas spatial processing is associated with right hemisphere regions. In recent years, a number of neuroimaging studies have yielded conflicting results regarding the role of language and spatial processing areas in processing language about space (e.g., Carpenter, Just, Keller, Eddy, & Thulborn, 1999; Damasio et al., 2001). In the present study, we used sparse scanning event-related functional magnetic resonance imaging (fMRI) to investigate the neural correlates of spatial language, that is; language used to communicate the spatial relationship of one object to another. During scanning, participants listened to sentences about object relationships that were either spatial or non-spatial in nature (color or size relationships). Sentences describing spatial relationships elicited more activation in the superior parietal lobule and precuneus bilaterally in comparison to sentences describing size or color relationships. Activation of the precuneus suggests that spatial sentences elicit spatial-mental imagery, while the activation of the SPL suggests sentences containing spatial language involve integration of two distinct sets of information - linguistic and spatial. Copyright © 2016 Elsevier Inc. All rights reserved.
[Big data, medical language and biomedical terminology systems].
Schulz, Stefan; López-García, Pablo
2015-08-01
A variety of rich terminology systems, such as thesauri, classifications, nomenclatures and ontologies support information and knowledge processing in health care and biomedical research. Nevertheless, human language, manifested as individually written texts, persists as the primary carrier of information, in the description of disease courses or treatment episodes in electronic medical records, and in the description of biomedical research in scientific publications. In the context of the discussion about big data in biomedicine, we hypothesize that the abstraction of the individuality of natural language utterances into structured and semantically normalized information facilitates the use of statistical data analytics to distil new knowledge out of textual data from biomedical research and clinical routine. Computerized human language technologies are constantly evolving and are increasingly ready to annotate narratives with codes from biomedical terminology. However, this depends heavily on linguistic and terminological resources. The creation and maintenance of such resources is labor-intensive. Nevertheless, it is sensible to assume that big data methods can be used to support this process. Examples include the learning of hierarchical relationships, the grouping of synonymous terms into concepts and the disambiguation of homonyms. Although clear evidence is still lacking, the combination of natural language technologies, semantic resources, and big data analytics is promising.
The semantic web and computer vision: old AI meets new AI
NASA Astrophysics Data System (ADS)
Mundy, J. L.; Dong, Y.; Gilliam, A.; Wagner, R.
2018-04-01
There has been vast process in linking semantic information across the billions of web pages through the use of ontologies encoded in the Web Ontology Language (OWL) based on the Resource Description Framework (RDF). A prime example is the Wikipedia where the knowledge contained in its more than four million pages is encoded in an ontological database called DBPedia http://wiki.dbpedia.org/. Web-based query tools can retrieve semantic information from DBPedia encoded in interlinked ontologies that can be accessed using natural language. This paper will show how this vast context can be used to automate the process of querying images and other geospatial data in support of report changes in structures and activities. Computer vision algorithms are selected and provided with context based on natural language requests for monitoring and analysis. The resulting reports provide semantically linked observations from images and 3D surface models.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Azevedo, S.G.; Fitch, J.P.
1987-10-21
Conventional software interfaces that use imperative computer commands or menu interactions are often restrictive environments when used for researching new algorithms or analyzing processed experimental data. We found this to be true with current signal-processing software (SIG). As an alternative, ''functional language'' interfaces provide features such as command nesting for a more natural interaction with the data. The Image and Signal LISP Environment (ISLE) is an example of an interpreted functional language interface based on common LISP. Advantages of ISLE include multidimensional and multiple data-type independence through dispatching functions, dynamic loading of new functions, and connections to artificial intelligence (AI)more » software. 10 refs.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Azevedo, S.G.; Fitch, J.P.
1987-05-01
Conventional software interfaces which utilize imperative computer commands or menu interactions are often restrictive environments when used for researching new algorithms or analyzing processed experimental data. We found this to be true with current signal processing software (SIG). Existing ''functional language'' interfaces provide features such as command nesting for a more natural interaction with the data. The Image and Signal Lisp Environment (ISLE) will be discussed as an example of an interpreted functional language interface based on Common LISP. Additional benefits include multidimensional and multiple data-type independence through dispatching functions, dynamic loading of new functions, and connections to artificial intelligencemore » software.« less
LABORATORY PROCESS CONTROLLER USING NATURAL LANGUAGE COMMANDS FROM A PERSONAL COMPUTER
NASA Technical Reports Server (NTRS)
Will, H.
1994-01-01
The complex environment of the typical research laboratory requires flexible process control. This program provides natural language process control from an IBM PC or compatible machine. Sometimes process control schedules require changes frequently, even several times per day. These changes may include adding, deleting, and rearranging steps in a process. This program sets up a process control system that can either run without an operator, or be run by workers with limited programming skills. The software system includes three programs. Two of the programs, written in FORTRAN77, record data and control research processes. The third program, written in Pascal, generates the FORTRAN subroutines used by the other two programs to identify the user commands with the user-written device drivers. The software system also includes an input data set which allows the user to define the user commands which are to be executed by the computer. To set the system up the operator writes device driver routines for all of the controlled devices. Once set up, this system requires only an input file containing natural language command lines which tell the system what to do and when to do it. The operator can make up custom commands for operating and taking data from external research equipment at any time of the day or night without the operator in attendance. This process control system requires a personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. The program requires a FORTRAN77 compiler and user-written device drivers. This program was developed in 1989 and has a memory requirement of about 62 Kbytes.
Natural language processing and advanced information management
NASA Technical Reports Server (NTRS)
Hoard, James E.
1989-01-01
Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate the material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides an natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures that will satisfy the users' requests. The best candidate for an intelligent agent that can satisfactorily make use of RSs and transform documents (TDs) appears to be an object oriented data base (OODB). OODBs have, apparently, an inherent capacity to use the large numbers of RSs and TDs that will be required by an AIM system and an inherent capacity to use them in an effective way.
Perniss, Pamela; Özyürek, Asli; Morgan, Gary
2015-01-01
For humans, the ability to communicate and use language is instantiated not only in the vocal modality but also in the visual modality. The main examples of this are sign languages and (co-speech) gestures. Sign languages, the natural languages of Deaf communities, use systematic and conventionalized movements of the hands, face, and body for linguistic expression. Co-speech gestures, though non-linguistic, are produced in tight semantic and temporal integration with speech and constitute an integral part of language together with speech. The articles in this issue explore and document how gestures and sign languages are similar or different and how communicative expression in the visual modality can change from being gestural to grammatical in nature through processes of conventionalization. As such, this issue contributes to our understanding of how the visual modality shapes language and the emergence of linguistic structure in newly developing systems. Studying the relationship between signs and gestures provides a new window onto the human ability to recruit multiple levels of representation (e.g., categorical, gradient, iconic, abstract) in the service of using or creating conventionalized communicative systems. Copyright © 2015 Cognitive Science Society, Inc.
Proposal: A Hybrid Dictionary Modelling Approach for Malay Tweet Normalization
NASA Astrophysics Data System (ADS)
Muhamad, Nor Azlizawati Binti; Idris, Norisma; Arshi Saloot, Mohammad
2017-02-01
Malay Twitter message presents a special deviation from the original language. Malay Tweet widely used currently by Twitter users, especially at Malaya archipelago. Thus, it is important to make a normalization system which can translated Malay Tweet language into the standard Malay language. Some researchers have conducted in natural language processing which mainly focuses on normalizing English Twitter messages, while few studies have been done for normalize Malay Tweets. This paper proposes an approach to normalize Malay Twitter messages based on hybrid dictionary modelling methods. This approach normalizes noisy Malay twitter messages such as colloquially language, novel words, and interjections into standard Malay language. This research will be used Language Model and N-grams model.
Gaining insights from social media language: Methodologies and challenges.
Kern, Margaret L; Park, Gregory; Eichstaedt, Johannes C; Schwartz, H Andrew; Sap, Maarten; Smith, Laura K; Ungar, Lyle H
2016-12-01
Language data available through social media provide opportunities to study people at an unprecedented scale. However, little guidance is available to psychologists who want to enter this area of research. Drawing on tools and techniques developed in natural language processing, we first introduce psychologists to social media language research, identifying descriptive and predictive analyses that language data allow. Second, we describe how raw language data can be accessed and quantified for inclusion in subsequent analyses, exploring personality as expressed on Facebook to illustrate. Third, we highlight challenges and issues to be considered, including accessing and processing the data, interpreting effects, and ethical issues. Social media has become a valuable part of social life, and there is much we can learn by bringing together the tools of computer science with the theories and insights of psychology. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Large, David R; Clark, Leigh; Quandt, Annie; Burnett, Gary; Skrypchuk, Lee
2017-09-01
Given the proliferation of 'intelligent' and 'socially-aware' digital assistants embodying everyday mobile technology - and the undeniable logic that utilising voice-activated controls and interfaces in cars reduces the visual and manual distraction of interacting with in-vehicle devices - it appears inevitable that next generation vehicles will be embodied by digital assistants and utilise spoken language as a method of interaction. From a design perspective, defining the language and interaction style that a digital driving assistant should adopt is contingent on the role that they play within the social fabric and context in which they are situated. We therefore conducted a qualitative, Wizard-of-Oz study to explore how drivers might interact linguistically with a natural language digital driving assistant. Twenty-five participants drove for 10 min in a medium-fidelity driving simulator while interacting with a state-of-the-art, high-functioning, conversational digital driving assistant. All exchanges were transcribed and analysed using recognised linguistic techniques, such as discourse and conversation analysis, normally reserved for interpersonal investigation. Language usage patterns demonstrate that interactions with the digital assistant were fundamentally social in nature, with participants affording the assistant equal social status and high-level cognitive processing capability. For example, participants were polite, actively controlled turn-taking during the conversation, and used back-channelling, fillers and hesitation, as they might in human communication. Furthermore, participants expected the digital assistant to understand and process complex requests mitigated with hedging words and expressions, and peppered with vague language and deictic references requiring shared contextual information and mutual understanding. Findings are presented in six themes which emerged during the analysis - formulating responses; turn-taking; back-channelling, fillers and hesitation; vague language; mitigating requests and politeness and praise. The results can be used to inform the design of future in-vehicle natural language systems, in particular to help manage the tension between designing for an engaging dialogue (important for technology acceptance) and designing for an effective dialogue (important to minimise distraction in a driving context). Copyright © 2017 Elsevier Ltd. All rights reserved.
Appendix Y. The Integrated Communications Experiment (ICE) Summary.
ERIC Educational Resources Information Center
Coffin, Robert
This appendix describes the Integrated Communications Experiment (ICE), a comprehensive computer software capability developed for the ComField Project. Each major characteristic of the data processing system is treated separately: natural language processing, flexibility, noninterference with the educational process, multipurposeness,…
Learning procedures from interactive natural language instructions
NASA Technical Reports Server (NTRS)
Huffman, Scott B.; Laird, John E.
1994-01-01
Despite its ubiquity in human learning, very little work has been done in artificial intelligence on agents that learn from interactive natural language instructions. In this paper, the problem of learning procedures from interactive, situated instruction is examined in which the student is attempting to perform tasks within the instructional domain, and asks for instruction when it is needed. Presented is Instructo-Soar, a system that behaves and learns in response to interactive natural language instructions. Instructo-Soar learns completely new procedures from sequences of instruction, and also learns how to extend its knowledge of previously known procedures to new situations. These learning tasks require both inductive and analytic learning. Instructo-Soar exhibits a multiple execution learning process in which initial learning has a rote, episodic flavor, and later executions allow the initially learned knowledge to be generalized properly.
Natural Language Processing Methods and Systems for Biomedical Ontology Learning
Liu, Kaihong; Hogan, William R.; Crowley, Rebecca S.
2010-01-01
While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they must achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships as well as difficulty in updating the ontology as knowledge changes. Methodologies developed in the fields of natural language processing, information extraction, information retrieval and machine learning provide techniques for automating the enrichment of an ontology from free-text documents. In this article, we review existing methodologies and developed systems, and discuss how existing methods can benefit the development of biomedical ontologies. PMID:20647054
Language evolution in the laboratory.
Scott-Phillips, Thomas C; Kirby, Simon
2010-09-01
The historical origins of natural language cannot be observed directly. We can, however, study systems that support language and we can also develop models that explore the plausibility of different hypotheses about how language emerged. More recently, evolutionary linguists have begun to conduct language evolution experiments in the laboratory, where the emergence of new languages used by human participants can be observed directly. This enables researchers to study both the cognitive capacities necessary for language and the ways in which languages themselves emerge. One theme that runs through this work is how individual-level behaviours result in population-level linguistic phenomena. A central challenge for the future will be to explore how different forms of information transmission affect this process. 2010 Elsevier Ltd. All rights reserved.
Toward a unified account of comprehension and production in language development.
McCauley, Stewart M; Christiansen, Morten H
2013-08-01
Although Pickering & Garrod (P&G) argue convincingly for a unified system for language comprehension and production, they fail to explain how such a system might develop. Using a recent computational model of language acquisition as an example, we sketch a developmental perspective on the integration of comprehension and production. We conclude that only through development can we fully understand the intertwined nature of comprehension and production in adult processing.
2004-03-01
and current work is that most developers see unstructured language input as a useful complement to a Socratic tutoring approach. The general...demonstrating the possibility of Socratic tactical tutoring, led us , during the second half of our Phase II effort, away from unrestricted natural language ... language use . This often leads to faster, more useful processing, that is robust in the face of real-world input (ungrammatical, misspelled, or
Multilingual Metalanguage, or the Way Multilinguals Talk about Their Languages
ERIC Educational Resources Information Center
Jessner, Ulrike
2005-01-01
The increase of multilingualism in both natural and formal contexts has provoked a number of studies which have concentrated on providing evidence of multilingual processing and finding out about the differences and similarities between second and third language learning. This paper deals with the use of metalanguage in multilingual students in an…
Info/Information Theory: Speakers Choose Shorter Words in Predictive Contexts
ERIC Educational Resources Information Center
Mahowald, Kyle; Fedorenko, Evelina; Piantadosi, Steven T.; Gibson, Edward
2013-01-01
A major open question in natural language research is the role of communicative efficiency in the origin and on-line processing of language structures. Here, we use word pairs like "chimp/chimpanzee", which differ in length but have nearly identical meanings, to investigate the communicative properties of lexical systems and the communicative…
What Artificial Grammar Learning Reveals about the Neurobiology of Syntax
ERIC Educational Resources Information Center
Petersson, Karl-Magnus; Folia, Vasiliki; Hagoort, Peter
2012-01-01
In this paper we examine the neurobiological correlates of syntax, the processing of structured sequences, by comparing FMRI results on artificial and natural language syntax. We discuss these and similar findings in the context of formal language and computability theory. We used a simple right-linear unification grammar in an implicit artificial…
Automated Error Detection for Developing Grammar Proficiency of ESL Learners
ERIC Educational Resources Information Center
Feng, Hui-Hsien; Saricaoglu, Aysel; Chukharev-Hudilainen, Evgeny
2016-01-01
Thanks to natural language processing technologies, computer programs are actively being used not only for holistic scoring, but also for formative evaluation of writing. CyWrite is one such program that is under development. The program is built upon Second Language Acquisition theories and aims to assist ESL learners in higher education by…
SOME DIFFERENCES IN ENCODING AND DECODING MESSAGES.
ERIC Educational Resources Information Center
BICKLEY, A. C.; WEAVER, WENDELL W.
LANGUAGE ENCODING AND DECODING PROCESSES WERE EXAMINED BY DETERMINING THE ABILITY OF SUBJECTS TO PREDICT OMISSIONS FROM A NATURAL LANGUAGE TEXT WHICH THEY HAD PREVIOUSLY PRODUCED THEMSELVES, AND BY COMPARING THIS PERFORMANCE WITH THAT OF OTHER SUBJECTS TO PREDICT OMISSIONS FROM THESE SAME TEXTS WHICH THE SECOND GROUP READ AT THE TIME OF…
Computer-Mediated Communication as an Autonomy-Enhancement Tool for Advanced Learners of English
ERIC Educational Resources Information Center
Wach, Aleksandra
2012-01-01
This article examines the relevance of modern technology for the development of learner autonomy in the process of learning English as a foreign language. Computer-assisted language learning and computer-mediated communication (CMC) appear to be particularly conducive to fostering autonomous learning, as they naturally incorporate many elements of…
Patton, Desmond Upton; MacBeth, Jamie; Schoenebeck, Sarita; Shear, Katherine; McKeown, Kathleen
2018-01-01
There is a dearth of research investigating youths' experience of grief and mourning after the death of close friends or family. Even less research has explored the question of how youth use social media sites to engage in the grieving process. This study employs qualitative analysis and natural language processing to examine tweets that follow 2 deaths. First, we conducted a close textual read on a sample of tweets by Gakirah Barnes, a gang-involved teenaged girl in Chicago, and members of her Twitter network, over a 19-day period in 2014 during which 2 significant deaths occurred: that of Raason "Lil B" Shaw and Gakirah's own death. We leverage the grief literature to understand the way Gakirah and her peers express thoughts, feelings, and behaviors at the time of these deaths. We also present and explain the rich and complex style of online communication among gang-involved youth, one that has been overlooked in prior research. Next, we overview the natural language processing output for expressions of loss and grief in our data set based on qualitative findings and present an error analysis on its output for grief. We conclude with a call for interdisciplinary research that analyzes online and offline behaviors to help understand physical and emotional violence and other problematic behaviors prevalent among marginalized communities.
Patton, Desmond Upton; MacBeth, Jamie; Schoenebeck, Sarita; Shear, Katherine; McKeown, Kathleen
2018-01-01
There is a dearth of research investigating youths’ experience of grief and mourning after the death of close friends or family. Even less research has explored the question of how youth use social media sites to engage in the grieving process. This study employs qualitative analysis and natural language processing to examine tweets that follow 2 deaths. First, we conducted a close textual read on a sample of tweets by Gakirah Barnes, a gang-involved teenaged girl in Chicago, and members of her Twitter network, over a 19-day period in 2014 during which 2 significant deaths occurred: that of Raason “Lil B” Shaw and Gakirah’s own death. We leverage the grief literature to understand the way Gakirah and her peers express thoughts, feelings, and behaviors at the time of these deaths. We also present and explain the rich and complex style of online communication among gang-involved youth, one that has been overlooked in prior research. Next, we overview the natural language processing output for expressions of loss and grief in our data set based on qualitative findings and present an error analysis on its output for grief. We conclude with a call for interdisciplinary research that analyzes online and offline behaviors to help understand physical and emotional violence and other problematic behaviors prevalent among marginalized communities. PMID:29636619
Developing tools and resources for the biomedical domain of the Greek language.
Vagelatos, Aristides; Mantzari, Elena; Pantazara, Mavina; Tsalidis, Christos; Kalamara, Chryssoula
2011-06-01
This paper presents the design and implementation of terminological and specialized textual resources that were produced in the framework of the Greek research project "IATROLEXI". The aim of the project was to create the critical infrastructure for the Greek language, i.e. linguistic resources and tools for use in high level Natural Language Processing (NLP) applications in the domain of biomedicine. The project was built upon existing resources developed by the project partners and further enhanced within its framework, i.e. a Greek morphological lexicon of about 100,000 words, and language processing tools such as a lemmatiser and a morphosyntactic tagger. Christos Tsalidis, Additionally, it developed new assets, such as a specialized corpus of biomedical texts and an ontology of medical terminology.
The Natural History of Human Language: Bridging the Gaps without Magic
NASA Astrophysics Data System (ADS)
Merker, Bjorn; Okanoya, Kazuo
Human languages are quintessentially historical phenomena. Every known aspect of linguistic form and content is subject to change in historical time (Lehmann, 1995; Bybee, 2004). Many facts of language, syntactic no less than semantic, find their explanation in the historical processes that generated them. If adpositions were once verbs, then the fact that they tend to occur on the same side of their arguments as do verbs ("cross-category harmony": Hawkins, 1983) is a matter of historical contingency rather than a reflection of inherent structural constraints on human language (Delancey, 1993).
NASA Astrophysics Data System (ADS)
Delgado, Francisco
2017-12-01
Quantum information processing should be generated through control of quantum evolution for physical systems being used as resources, such as superconducting circuits, spinspin couplings in ions and artificial anyons in electronic gases. They have a quantum dynamics which should be translated into more natural languages for quantum information processing. On this terrain, this language should let to establish manipulation operations on the associated quantum information states as classical information processing does. This work shows how a kind of processing operations can be settled and implemented for quantum states design and quantum processing for systems fulfilling a SU(2) reduction in their dynamics.
Semantic biomedical resource discovery: a Natural Language Processing framework.
Sfakianaki, Pepi; Koumakis, Lefteris; Sfakianakis, Stelios; Iatraki, Galatia; Zacharioudakis, Giorgos; Graf, Norbert; Marias, Kostas; Tsiknakis, Manolis
2015-09-30
A plethora of publicly available biomedical resources do currently exist and are constantly increasing at a fast rate. In parallel, specialized repositories are been developed, indexing numerous clinical and biomedical tools. The main drawback of such repositories is the difficulty in locating appropriate resources for a clinical or biomedical decision task, especially for non-Information Technology expert users. In parallel, although NLP research in the clinical domain has been active since the 1960s, progress in the development of NLP applications has been slow and lags behind progress in the general NLP domain. The aim of the present study is to investigate the use of semantics for biomedical resources annotation with domain specific ontologies and exploit Natural Language Processing methods in empowering the non-Information Technology expert users to efficiently search for biomedical resources using natural language. A Natural Language Processing engine which can "translate" free text into targeted queries, automatically transforming a clinical research question into a request description that contains only terms of ontologies, has been implemented. The implementation is based on information extraction techniques for text in natural language, guided by integrated ontologies. Furthermore, knowledge from robust text mining methods has been incorporated to map descriptions into suitable domain ontologies in order to ensure that the biomedical resources descriptions are domain oriented and enhance the accuracy of services discovery. The framework is freely available as a web application at ( http://calchas.ics.forth.gr/ ). For our experiments, a range of clinical questions were established based on descriptions of clinical trials from the ClinicalTrials.gov registry as well as recommendations from clinicians. Domain experts manually identified the available tools in a tools repository which are suitable for addressing the clinical questions at hand, either individually or as a set of tools forming a computational pipeline. The results were compared with those obtained from an automated discovery of candidate biomedical tools. For the evaluation of the results, precision and recall measurements were used. Our results indicate that the proposed framework has a high precision and low recall, implying that the system returns essentially more relevant results than irrelevant. There are adequate biomedical ontologies already available, sufficiency of existing NLP tools and quality of biomedical annotation systems for the implementation of a biomedical resources discovery framework, based on the semantic annotation of resources and the use on NLP techniques. The results of the present study demonstrate the clinical utility of the application of the proposed framework which aims to bridge the gap between clinical question in natural language and efficient dynamic biomedical resources discovery.
Visual sign phonology: insights into human reading and language from a natural soundless phonology.
Petitto, L A; Langdon, C; Stone, A; Andriola, D; Kartheiser, G; Cochran, C
2016-11-01
Among the most prevailing assumptions in science and society about the human reading process is that sound and sound-based phonology are critical to young readers. The child's sound-to-letter decoding is viewed as universal and vital to deriving meaning from print. We offer a different view. The crucial link for early reading success is not between segmental sounds and print. Instead the human brain's capacity to segment, categorize, and discern linguistic patterning makes possible the capacity to segment all languages. This biological process includes the segmentation of languages on the hands in signed languages. Exposure to natural sign language in early life equally affords the child's discovery of silent segmental units in visual sign phonology (VSP) that can also facilitate segmental decoding of print. We consider powerful biological evidence about the brain, how it builds sound and sign phonology, and why sound and sign phonology are equally important in language learning and reading. We offer a testable theoretical account, reading model, and predictions about how VSP can facilitate segmentation and mapping between print and meaning. We explain how VSP can be a powerful facilitator of all children's reading success (deaf and hearing)-an account with profound transformative impact on learning to read in deaf children with different language backgrounds. The existence of VSP has important implications for understanding core properties of all human language and reading, challenges assumptions about language and reading as being tied to sound, and provides novel insight into a remarkable biological equivalence in signed and spoken languages. WIREs Cogn Sci 2016, 7:366-381. doi: 10.1002/wcs.1404 For further resources related to this article, please visit the WIREs website. © 2016 Wiley Periodicals, Inc.
The language faculty that wasn't: a usage-based account of natural language recursion
Christiansen, Morten H.; Chater, Nick
2015-01-01
In the generative tradition, the language faculty has been shrinking—perhaps to include only the mechanism of recursion. This paper argues that even this view of the language faculty is too expansive. We first argue that a language faculty is difficult to reconcile with evolutionary considerations. We then focus on recursion as a detailed case study, arguing that our ability to process recursive structure does not rely on recursion as a property of the grammar, but instead emerges gradually by piggybacking on domain-general sequence learning abilities. Evidence from genetics, comparative work on non-human primates, and cognitive neuroscience suggests that humans have evolved complex sequence learning skills, which were subsequently pressed into service to accommodate language. Constraints on sequence learning therefore have played an important role in shaping the cultural evolution of linguistic structure, including our limited abilities for processing recursive structure. Finally, we re-evaluate some of the key considerations that have often been taken to require the postulation of a language faculty. PMID:26379567
The language faculty that wasn't: a usage-based account of natural language recursion.
Christiansen, Morten H; Chater, Nick
2015-01-01
In the generative tradition, the language faculty has been shrinking-perhaps to include only the mechanism of recursion. This paper argues that even this view of the language faculty is too expansive. We first argue that a language faculty is difficult to reconcile with evolutionary considerations. We then focus on recursion as a detailed case study, arguing that our ability to process recursive structure does not rely on recursion as a property of the grammar, but instead emerges gradually by piggybacking on domain-general sequence learning abilities. Evidence from genetics, comparative work on non-human primates, and cognitive neuroscience suggests that humans have evolved complex sequence learning skills, which were subsequently pressed into service to accommodate language. Constraints on sequence learning therefore have played an important role in shaping the cultural evolution of linguistic structure, including our limited abilities for processing recursive structure. Finally, we re-evaluate some of the key considerations that have often been taken to require the postulation of a language faculty.
Utilizing a Health Impact Assessment (HIA) to Connect Natural Resource Management and Community
Marrying scientific and health research with natural resource management should be a straightforward process. However, differences in purpose, goals, language, levels of detail and implementation authority between the scientists who conduct research and resource managers who plan...
Marrying scientific and health research with natural resource management should be a straightforward process. However, differences in purpose, goals, language, levels of detail and implementation authority between the scientists who conduct research and resource managers who plan...
Language related differences of the sustained response evoked by natural speech sounds.
Fan, Christina Siu-Dschu; Zhu, Xingyu; Dosch, Hans Günter; von Stutterheim, Christiane; Rupp, André
2017-01-01
In tonal languages, such as Mandarin Chinese, the pitch contour of vowels discriminates lexical meaning, which is not the case in non-tonal languages such as German. Recent data provide evidence that pitch processing is influenced by language experience. However, there are still many open questions concerning the representation of such phonological and language-related differences at the level of the auditory cortex (AC). Using magnetoencephalography (MEG), we recorded transient and sustained auditory evoked fields (AEF) in native Chinese and German speakers to investigate language related phonological and semantic aspects in the processing of acoustic stimuli. AEF were elicited by spoken meaningful and meaningless syllables, by vowels, and by a French horn tone. Speech sounds were recorded from a native speaker and showed frequency-modulations according to the pitch-contours of Mandarin. The sustained field (SF) evoked by natural speech signals was significantly larger for Chinese than for German listeners. In contrast, the SF elicited by a horn tone was not significantly different between groups. Furthermore, the SF of Chinese subjects was larger when evoked by meaningful syllables compared to meaningless ones, but there was no significant difference regarding whether vowels were part of the Chinese phonological system or not. Moreover, the N100m gave subtle but clear evidence that for Chinese listeners other factors than purely physical properties play a role in processing meaningful signals. These findings show that the N100 and the SF generated in Heschl's gyrus are influenced by language experience, which suggests that AC activity related to specific pitch contours of vowels is influenced in a top-down fashion by higher, language related areas. Such interactions are in line with anatomical findings and neuroimaging data, as well as with the dual-stream model of language of Hickok and Poeppel that highlights the close and reciprocal interaction between superior temporal gyrus and sulcus.
Open Source Clinical NLP - More than Any Single System.
Masanz, James; Pakhomov, Serguei V; Xu, Hua; Wu, Stephen T; Chute, Christopher G; Liu, Hongfang
2014-01-01
The number of Natural Language Processing (NLP) tools and systems for processing clinical free-text has grown as interest and processing capability have surged. Unfortunately any two systems typically cannot simply interoperate, even when both are built upon a framework designed to facilitate the creation of pluggable components. We present two ongoing activities promoting open source clinical NLP. The Open Health Natural Language Processing (OHNLP) Consortium was originally founded to foster a collaborative community around clinical NLP, releasing UIMA-based open source software. OHNLP's mission currently includes maintaining a catalog of clinical NLP software and providing interfaces to simplify the interaction of NLP systems. Meanwhile, Apache cTAKES aims to integrate best-of-breed annotators, providing a world-class NLP system for accessing clinical information within free-text. These two activities are complementary. OHNLP promotes open source clinical NLP activities in the research community and Apache cTAKES bridges research to the health information technology (HIT) practice.
Gundlapalli, Adi V; Divita, Guy; Redd, Andrew; Carter, Marjorie E; Ko, Danette; Rubin, Michael; Samore, Matthew; Strymish, Judith; Krein, Sarah; Gupta, Kalpana; Sales, Anne; Trautner, Barbara W
2017-07-01
To develop a natural language processing pipeline to extract positively asserted concepts related to the presence of an indwelling urinary catheter in hospitalized patients from the free text of the electronic medical note. The goal is to assist infection preventionists and other healthcare professionals in determining whether a patient has an indwelling urinary catheter when a catheter-associated urinary tract infection is suspected. Currently, data on indwelling urinary catheters is not consistently captured in the electronic medical record in structured format and thus cannot be reliably extracted for clinical and research purposes. We developed a lexicon of terms related to indwelling urinary catheters and urinary symptoms based on domain knowledge, prior experience in the field, and review of medical notes. A reference standard of 1595 randomly selected documents from inpatient admissions was annotated by human reviewers to identify all positively and negatively asserted concepts related to indwelling urinary catheters. We trained a natural language processing pipeline based on the V3NLP framework using 1050 documents and tested on 545 documents to determine agreement with the human reference standard. Metrics reported are positive predictive value and recall. The lexicon contained 590 terms related to the presence of an indwelling urinary catheter in various categories including insertion, care, change, and removal of urinary catheters and 67 terms for urinary symptoms. Nursing notes were the most frequent inpatient note titles in the reference standard document corpus; these also yielded the highest number of positively asserted concepts with respect to urinary catheters. Comparing the performance of the natural language processing pipeline against the human reference standard, the overall recall was 75% and positive predictive value was 99% on the training set; on the testing set, the recall was 72% and positive predictive value was 98%. The performance on extracting urinary symptoms (including fever) was high with recall and precision greater than 90%. We have shown that it is possible to identify the presence of an indwelling urinary catheter and urinary symptoms from the free text of electronic medical notes from inpatients using natural language processing. These are two key steps in developing automated protocols to assist humans in large-scale review of patient charts for catheter-associated urinary tract infection. The challenges associated with extracting indwelling urinary catheter-related concepts also inform the design of electronic medical record templates to reliably and consistently capture data on indwelling urinary catheters. Published by Elsevier Inc.
Hardjojo, Antony; Gunachandran, Arunan; Pang, Long; Abdullah, Mohammed Ridzwan Bin; Wah, Win; Chong, Joash Wen Chen; Goh, Ee Hui; Teo, Sok Huang; Lim, Gilbert; Lee, Mong Li; Hsu, Wynne; Lee, Vernon; Chen, Mark I-Cheng; Wong, Franco; Phang, Jonathan Siung King
2018-06-11
Free-text clinical records provide a source of information that complements traditional disease surveillance. To electronically harness these records, they need to be transformed into codified fields by natural language processing algorithms. The aim of this study was to develop, train, and validate Clinical History Extractor for Syndromic Surveillance (CHESS), an natural language processing algorithm to extract clinical information from free-text primary care records. CHESS is a keyword-based natural language processing algorithm to extract 48 signs and symptoms suggesting respiratory infections, gastrointestinal infections, constitutional, as well as other signs and symptoms potentially associated with infectious diseases. The algorithm also captured the assertion status (affirmed, negated, or suspected) and symptom duration. Electronic medical records from the National Healthcare Group Polyclinics, a major public sector primary care provider in Singapore, were randomly extracted and manually reviewed by 2 human reviewers, with a third reviewer as the adjudicator. The algorithm was evaluated based on 1680 notes against the human-coded result as the reference standard, with half of the data used for training and the other half for validation. The symptoms most commonly present within the 1680 clinical records at the episode level were those typically present in respiratory infections such as cough (744/7703, 9.66%), sore throat (591/7703, 7.67%), rhinorrhea (552/7703, 7.17%), and fever (928/7703, 12.04%). At the episode level, CHESS had an overall performance of 96.7% precision and 97.6% recall on the training dataset and 96.0% precision and 93.1% recall on the validation dataset. Symptoms suggesting respiratory and gastrointestinal infections were all detected with more than 90% precision and recall. CHESS correctly assigned the assertion status in 97.3%, 97.9%, and 89.8% of affirmed, negated, and suspected signs and symptoms, respectively (97.6% overall accuracy). Symptom episode duration was correctly identified in 81.2% of records with known duration status. We have developed an natural language processing algorithm dubbed CHESS that achieves good performance in extracting signs and symptoms from primary care free-text clinical records. In addition to the presence of symptoms, our algorithm can also accurately distinguish affirmed, negated, and suspected assertion statuses and extract symptom durations. ©Antony Hardjojo, Arunan Gunachandran, Long Pang, Mohammed Ridzwan Bin Abdullah, Win Wah, Joash Wen Chen Chong, Ee Hui Goh, Sok Huang Teo, Gilbert Lim, Mong Li Lee, Wynne Hsu, Vernon Lee, Mark I-Cheng Chen, Franco Wong, Jonathan Siung King Phang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 11.06.2018.
Friedman, Carol; Rindflesch, Thomas C; Corn, Milton
2013-10-01
Natural language processing (NLP) is crucial for advancing healthcare because it is needed to transform relevant information locked in text into structured data that can be used by computer processes aimed at improving patient care and advancing medicine. In light of the importance of NLP to health, the National Library of Medicine (NLM) recently sponsored a workshop to review the state of the art in NLP focusing on text in English, both in biomedicine and in the general language domain. Specific goals of the NLM-sponsored workshop were to identify the current state of the art, grand challenges and specific roadblocks, and to identify effective use and best practices. This paper reports on the main outcomes of the workshop, including an overview of the state of the art, strategies for advancing the field, and obstacles that need to be addressed, resulting in recommendations for a research agenda intended to advance the field. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Diminished Auditory Responses during NREM Sleep Correlate with the Hierarchy of Language Processing
Furman-Haran, Edna; Arzi, Anat; Levkovitz, Yechiel; Malach, Rafael
2016-01-01
Natural sleep provides a powerful model system for studying the neuronal correlates of awareness and state changes in the human brain. To quantitatively map the nature of sleep-induced modulations in sensory responses we presented participants with auditory stimuli possessing different levels of linguistic complexity. Ten participants were scanned using functional magnetic resonance imaging (fMRI) during the waking state and after falling asleep. Sleep staging was based on heart rate measures validated independently on 20 participants using concurrent EEG and heart rate measurements and the results were confirmed using permutation analysis. Participants were exposed to three types of auditory stimuli: scrambled sounds, meaningless word sentences and comprehensible sentences. During non-rapid eye movement (NREM) sleep, we found diminishing brain activation along the hierarchy of language processing, more pronounced in higher processing regions. Specifically, the auditory thalamus showed similar activation levels during sleep and waking states, primary auditory cortex remained activated but showed a significant reduction in auditory responses during sleep, and the high order language-related representation in inferior frontal gyrus (IFG) cortex showed a complete abolishment of responses during NREM sleep. In addition to an overall activation decrease in language processing regions in superior temporal gyrus and IFG, those areas manifested a loss of semantic selectivity during NREM sleep. Our results suggest that the decreased awareness to linguistic auditory stimuli during NREM sleep is linked to diminished activity in high order processing stations. PMID:27310812
Diminished Auditory Responses during NREM Sleep Correlate with the Hierarchy of Language Processing.
Wilf, Meytal; Ramot, Michal; Furman-Haran, Edna; Arzi, Anat; Levkovitz, Yechiel; Malach, Rafael
2016-01-01
Natural sleep provides a powerful model system for studying the neuronal correlates of awareness and state changes in the human brain. To quantitatively map the nature of sleep-induced modulations in sensory responses we presented participants with auditory stimuli possessing different levels of linguistic complexity. Ten participants were scanned using functional magnetic resonance imaging (fMRI) during the waking state and after falling asleep. Sleep staging was based on heart rate measures validated independently on 20 participants using concurrent EEG and heart rate measurements and the results were confirmed using permutation analysis. Participants were exposed to three types of auditory stimuli: scrambled sounds, meaningless word sentences and comprehensible sentences. During non-rapid eye movement (NREM) sleep, we found diminishing brain activation along the hierarchy of language processing, more pronounced in higher processing regions. Specifically, the auditory thalamus showed similar activation levels during sleep and waking states, primary auditory cortex remained activated but showed a significant reduction in auditory responses during sleep, and the high order language-related representation in inferior frontal gyrus (IFG) cortex showed a complete abolishment of responses during NREM sleep. In addition to an overall activation decrease in language processing regions in superior temporal gyrus and IFG, those areas manifested a loss of semantic selectivity during NREM sleep. Our results suggest that the decreased awareness to linguistic auditory stimuli during NREM sleep is linked to diminished activity in high order processing stations.
Getting it right by getting it wrong: When learners change languages
Hudson Kam, Carla L.; Newport, Elissa L.
2009-01-01
When natural language input contains grammatical forms that are used probabilistically and inconsistently, learners will sometimes reproduce the inconsistencies; but sometimes they will instead regularize the use of these forms, introducing consistency in the language that was not present in the input. In this paper we ask what produces such regularization. We conducted three artificial language experiments, varying the use of determiners in the types of inconsistency with which they are used, and also comparing adult and child learners. In Experiment 1 we presented adult learners with scattered inconsistency – the use of multiple determiners varying in frequency in the same context – and found that adults will reproduce these inconsistencies at low levels of scatter, but at very high levels of scatter will regularize the determiner system, producing the most frequent determiner form almost all the time. In Experiment 2 we showed that this is not merely the result of frequency: when determiners are used with low frequencies but in consistent contexts, adults will learn all of the determiners veridically. In Experiment 3 we compared adult and child learners, finding that children will almost always regularize inconsistent forms, whereas adult learners will only regularize the most complex inconsistencies. Taken together, these results suggest that regularization processes in natural language learning, such as those seen in the acquisition of language from non-native speakers or in the formation of young languages, may depend crucially on the nature of language learning by young children. PMID:19324332
The Influence of Formulaic Language on L2 Listener Decoding in Extended Discourse
ERIC Educational Resources Information Center
Yeldham, Michael
2018-01-01
This study investigated the effect of formulaic language on L2 learners' ability to decode words in listening texts. One possibility was that formulas would facilitate listening by reducing the need to process every word of the sequences. However, a contrasting possibility was that the commonly reduced nature of formulaic words would hinder…
ERIC Educational Resources Information Center
Walsh, Irene P.
2008-01-01
Background: Some people with schizophrenia are considered to have communication difficulties because of concomitant language impairment and/or because of suppressed or "unusual" communication skills due to the often-chronic nature and manifestation of the illness process. Conversations with a person with schizophrenia pose many pragmatic…
Syntax and Meaning as Sensuous, Visual, Historical Forms of Algebraic Thinking
ERIC Educational Resources Information Center
Radford, Luis; Puig, Luis
2007-01-01
Before the advent of symbolism, i.e. before the end of the 16th Century, algebraic calculations were made using natural language. Through a kind of metaphorical process, a few terms from everyday life (e.g. thing, root) acquired a technical mathematical status and constituted the specialized language of algebra. The introduction of letters and…
ERIC Educational Resources Information Center
Davidovitch, Nitza; Eckhaus, Eyal
2018-01-01
This study deals with immigrant scientists integrated in academia in Israel. Studies on the subject indicate the contribution of immigrant scientists to research. The current study focuses on the influence of scientists' birth country on selecting destinations for academic conferences, as well as on the influence of one's native language on the…
The Effectiveness of Stemming for Natural-Language Access to Slovene Textual Data.
ERIC Educational Resources Information Center
Popovic, Mirko; Willett, Peter
1992-01-01
Reports on the use of stemming for Slovene language documents and queries in free-text retrieval systems and demonstrates that an appropriate stemming algorithm results in an increase in retrieval effectiveness when compared with nonstemming processing. A comparison is made with stemming of English versions of the same documents and queries. (24…
ERIC Educational Resources Information Center
Zajenkowski, Marcin; Styla, Rafal; Szymanik, Jakub
2011-01-01
We compared the processing of natural language quantifiers in a group of patients with schizophrenia and a healthy control group. In both groups, the difficulty of the quantifiers was consistent with computational predictions, and patients with schizophrenia took more time to solve the problems. However, they were significantly less accurate only…
A Study of Readability of Texts in Bangla through Machine Learning Approaches
ERIC Educational Resources Information Center
Sinha, Manjira; Basu, Anupam
2016-01-01
In this work, we have investigated text readability in Bangla language. Text readability is an indicator of the suitability of a given document with respect to a target reader group. Therefore, text readability has huge impact on educational content preparation. The advances in the field of natural language processing have enabled the automatic…
Anatomy of Language Impairments in Primary Progressive Aphasia
Rogalski, Emily; Cobia, Derin; Harrison, Theresa M.; Wieneke, Christina; Thompson, Cynthia K; Weintraub, Sandra; Mesulam, M.-Marsel
2011-01-01
Primary progressive aphasia (PPA) is a clinical dementia syndrome characterized by progressive decline in language function but relative sparing of other cognitive domains. There are three recognized PPA variants: agrammatic, semantic, and logopenic. Although each PPA subtype is characterized by the nature of the principal deficit, individual patients frequently display subtle impairments in additional language domains. The present study investigated the distribution of atrophy related to performance in specific language domains (i.e., grammatical processing, semantic processing, fluency, and sentence repetition) across PPA variants to better understand the anatomical substrates of language. Results showed regionally specific relationships, primarily in the left hemisphere, between atrophy and impairments in language performance. Most notable was the neuroanatomical distinction between fluency and grammatical processing. Poor fluency was associated with regions dorsal to the traditional boundaries of Broca’s area in the inferior frontal sulcus and the posterior middle frontal gyrus, whereas grammatical processing was associated with more widespread atrophy, including the inferior frontal gyrus and supramarginal gyrus. Repetition performance was correlated with atrophy in the posterior superior temporal gyrus. The correlation of atrophy with semantic processing impairment was localized to the anterior temporal poles. Atrophy patterns were more closely correlated with domain-specific performance than with subtype. These results show that PPA reflects a selective disruption of the language network as a whole, with no rigid boundaries between subtypes. Further, these atrophy patterns reveal anatomical correlates of language that could not have been surmised in patients with aphasia resulting from cerebrovascular lesions. PMID:21368046
Anatomy of language impairments in primary progressive aphasia.
Rogalski, Emily; Cobia, Derin; Harrison, Theresa M; Wieneke, Christina; Thompson, Cynthia K; Weintraub, Sandra; Mesulam, M-Marsel
2011-03-02
Primary progressive aphasia (PPA) is a clinical dementia syndrome characterized by progressive decline in language function but relative sparing of other cognitive domains. There are three recognized PPA variants: agrammatic, semantic, and logopenic. Although each PPA subtype is characterized by the nature of the principal deficit, individual patients frequently display subtle impairments in additional language domains. The present study investigated the distribution of atrophy related to performance in specific language domains (i.e., grammatical processing, semantic processing, fluency, and sentence repetition) across PPA variants to better understand the anatomical substrates of language. Results showed regionally specific relationships, primarily in the left hemisphere, between atrophy and impairments in language performance. Most notable was the neuroanatomical distinction between fluency and grammatical processing. Poor fluency was associated with regions dorsal to the traditional boundaries of Broca's area in the inferior frontal sulcus and the posterior middle frontal gyrus, whereas grammatical processing was associated with more widespread atrophy, including the inferior frontal gyrus and supramarginal gyrus. Repetition performance was correlated with atrophy in the posterior superior temporal gyrus. The correlation of atrophy with semantic processing impairment was localized to the anterior temporal poles. Atrophy patterns were more closely correlated with domain-specific performance than with subtype. These results show that PPA reflects a selective disruption of the language network as a whole, with no rigid boundaries between subtypes. Further, these atrophy patterns reveal anatomical correlates of language that could not have been surmised in patients with aphasia resulting from cerebrovascular lesions.
Natural language from artificial life.
Kirby, Simon
2002-01-01
This article aims to show that linguistics, in particular the study of the lexico-syntactic aspects of language, provides fertile ground for artificial life modeling. A survey of the models that have been developed over the last decade and a half is presented to demonstrate that ALife techniques have a lot to offer an explanatory theory of language. It is argued that this is because much of the structure of language is determined by the interaction of three complex adaptive systems: learning, culture, and biological evolution. Computational simulation, informed by theoretical linguistics, is an appropriate response to the challenge of explaining real linguistic data in terms of the processes that underpin human language.
Chen, Elizabeth S.; Stetson, Peter D.; Lussier, Yves A.; Markatou, Marianthi; Hripcsak, George; Friedman, Carol
2007-01-01
Clinical knowledge, best evidence, and practice patterns evolve over time. The ability to track these changes and study practice trends may be valuable for performance measurement and quality improvement efforts. The goal of this study was to assess the feasibility and validity of methods to generate and compare trends in biomedical literature and clinical narrative. We focused on the challenge of detecting trends in medication usage over time for two diseases: HIV/AIDS and asthma. Information about disease-specific medications in published randomized control trials and discharge summaries at New York-Presbyterian Hospital over a ten-year period were extracted using Natural Language Processing. This paper reports on the ability of our semi-automated process to discover disease-drug practice pattern trends and interpretation of findings across the biomedical and clinical text sources. PMID:18693810
Automated speech understanding: the next generation
NASA Astrophysics Data System (ADS)
Picone, J.; Ebel, W. J.; Deshmukh, N.
1995-04-01
Modern speech understanding systems merge interdisciplinary technologies from Signal Processing, Pattern Recognition, Natural Language, and Linguistics into a unified statistical framework. These systems, which have applications in a wide range of signal processing problems, represent a revolution in Digital Signal Processing (DSP). Once a field dominated by vector-oriented processors and linear algebra-based mathematics, the current generation of DSP-based systems rely on sophisticated statistical models implemented using a complex software paradigm. Such systems are now capable of understanding continuous speech input for vocabularies of several thousand words in operational environments. The current generation of deployed systems, based on small vocabularies of isolated words, will soon be replaced by a new technology offering natural language access to vast information resources such as the Internet, and provide completely automated voice interfaces for mundane tasks such as travel planning and directory assistance.
Natural Resource Information System, design analysis
NASA Technical Reports Server (NTRS)
1972-01-01
The computer-based system stores, processes, and displays map data relating to natural resources. The system was designed on the basis of requirements established in a user survey and an analysis of decision flow. The design analysis effort is described, and the rationale behind major design decisions, including map processing, cell vs. polygon, choice of classification systems, mapping accuracy, system hardware, and software language is summarized.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-08-22
... what types of quality measures should a combination of natural language processing and structured data... collection, analysis, processing, and its ability to facilitate information exchange among and across care...
Montgomery, James W; Leonard, Laurence B
2006-12-01
This study reports the findings of an investigation designed to examine the effects of acoustic enhancement on the processing of low-phonetic-substance inflections (e.g., 3rd-person singular -s, possessive -s) versus a high-phonetic-substance inflection (e.g., present progressive -ing) by children with specific language impairment (SLI) in a word recognition, reaction time (RT) processing task. The effects of acoustic enhancement on the processing of the same morphemes as well as an additional morpheme (comparative -er) were examined in an offline grammaticality judgment task. The grammatical function of 1 of the higher-phonetic-substance inflections, -ing, was presumed to be hypothesized relatively early by children; the function of the other, -er, was presumed to be hypothesized relatively late. Sixteen children with SLI (age(M) = 9 years;0 months) and 16 chronological age (CA; age(M) = 8;11) children participated. For both tasks, children listened to sentences containing the target morphemes as they were produced naturally (natural condition) or with acoustic enhancement (enhanced condition). On the RT task, the children with SLI demonstrated RT sensitivity only to the presence of the high-substance inflection, irrespective of whether it was produced naturally or with enhancement. Acoustic enhancement had no effect on these children's processing of low-substance inflections. The CA children, by contrast, showed sensitivity to low-substance inflections when they were produced naturally and with acoustic enhancement. These children also showed sensitivity to the high-substance inflection in the natural condition, but in the enhanced condition they demonstrated significantly slower RT. On the grammaticality judgment task, the children with SLI performed worse than the CA children overall and showed especially poor performance on low-substance inflections. Acoustic enhancement had a beneficial effect on the inflectional processing of the children with SLI, but it had no effect on CA children. The findings are interpreted to suggest that the reduced language processing capacity of children with SLI constrains their ability to process low-substance grammatical material in real time. This factor should be considered along with any difficulty that might be attributable to the grammatical function of the inflection.
Natural Language Processing Technologies in Radiology Research and Clinical Applications.
Cai, Tianrun; Giannopoulos, Andreas A; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K; Rybicki, Frank J; Mitsouras, Dimitrios
2016-01-01
The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively "mine" these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. "Intelligent" search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016.
Natural Language Processing Technologies in Radiology Research and Clinical Applications
Cai, Tianrun; Giannopoulos, Andreas A.; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K.; Rybicki, Frank J.
2016-01-01
The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively “mine” these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. “Intelligent” search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016 PMID:26761536
Plazzotta, Fernando; Otero, Carlos; Luna, Daniel; de Quiros, Fernan Gonzalez Bernaldo
2013-01-01
Physicians do not always keep the problem list accurate, complete and updated. To analyze natural language processing (NLP) techniques and inference rules as strategies to maintain completeness and accuracy of the problem list in EHRs. Non systematic literature review in PubMed, in the last 10 years. Strategies to maintain the EHRs problem list were analyzed in two ways: inputting and removing problems from the problem list. NLP and inference rules have acceptable performance for inputting problems into the problem list. No studies using these techniques for removing problems were published Conclusion: Both tools, NLP and inference rules have had acceptable results as tools for maintain the completeness and accuracy of the problem list.
Wang, Hui; Zhang, Weide; Zeng, Qiang; Li, Zuofeng; Feng, Kaiyan; Liu, Lei
2014-04-01
Extracting information from unstructured clinical narratives is valuable for many clinical applications. Although natural Language Processing (NLP) methods have been profoundly studied in electronic medical records (EMR), few studies have explored NLP in extracting information from Chinese clinical narratives. In this study, we report the development and evaluation of extracting tumor-related information from operation notes of hepatic carcinomas which were written in Chinese. Using 86 operation notes manually annotated by physicians as the training set, we explored both rule-based and supervised machine-learning approaches. Evaluating on unseen 29 operation notes, our best approach yielded 69.6% in precision, 58.3% in recall and 63.5% F-score. Copyright © 2014 Elsevier Inc. All rights reserved.
Building gold standard corpora for medical natural language processing tasks.
Deleger, Louise; Li, Qi; Lingren, Todd; Kaiser, Megan; Molnar, Katalin; Stoutenborough, Laura; Kouril, Michal; Marsolo, Keith; Solti, Imre
2012-01-01
We present the construction of three annotated corpora to serve as gold standards for medical natural language processing (NLP) tasks. Clinical notes from the medical record, clinical trial announcements, and FDA drug labels are annotated. We report high inter-annotator agreements (overall F-measures between 0.8467 and 0.9176) for the annotation of Personal Health Information (PHI) elements for a de-identification task and of medications, diseases/disorders, and signs/symptoms for information extraction (IE) task. The annotated corpora of clinical trials and FDA labels will be publicly released and to facilitate translational NLP tasks that require cross-corpora interoperability (e.g. clinical trial eligibility screening) their annotation schemas are aligned with a large scale, NIH-funded clinical text annotation project.
Neural signatures of response planning occur midway through an incoming question in conversation
Bögels, Sara; Magyari, Lilla; Levinson, Stephen C.
2015-01-01
A striking puzzle about language use in everyday conversation is that turn-taking latencies are usually very short, whereas planning language production takes much longer. This implies overlap between language comprehension and production processes, but the nature and extent of such overlap has never been studied directly. Combining an interactive quiz paradigm with EEG measurements in an innovative way, we show that production planning processes start as soon as possible, that is, within half a second after the answer to a question can be retrieved (up to several seconds before the end of the question). Localization of ERP data shows early activation even of brain areas related to late stages of production planning (e.g., syllabification). Finally, oscillation results suggest an attention switch from comprehension to production around the same time frame. This perspective from interactive language use throws new light on the performance characteristics that language competence involves. PMID:26242909
Iverson, Paul; Wagner, Anita; Rosen, Stuart
2016-04-01
Cross-language differences in speech perception have traditionally been linked to phonological categories, but it has become increasingly clear that language experience has effects beginning at early stages of perception, which blurs the accepted distinctions between general and speech-specific processing. The present experiments explored this distinction by playing stimuli to English and Japanese speakers that manipulated the acoustic form of English /r/ and /l/, in order to determine how acoustically natural and phonologically identifiable a stimulus must be for cross-language discrimination differences to emerge. Discrimination differences were found for stimuli that did not sound subjectively like speech or /r/ and /l/, but overall they were strongly linked to phonological categorization. The results thus support the view that phonological categories are an important source of cross-language differences, but also show that these differences can extend to stimuli that do not clearly sound like speech.
Learning a Foreign Language: A New Path to Enhancement of Cognitive Functions.
Shoghi Javan, Sara; Ghonsooly, Behzad
2018-02-01
The complicated cognitive processes involved in natural (primary) bilingualism lead to significant cognitive development. Executive functions as a fundamental component of human cognition are deemed to be affected by language learning. To date, a large number of studies have investigated how natural (primary) bilingualism influences executive functions; however, the way acquired (secondary) bilingualism manipulates executive functions is poorly understood. To fill this gap, controlling for age, gender, IQ, and socio-economic status, the researchers compared 60 advanced learners of English as a foreign language (EFL) to 60 beginners on measures of executive functions involving Stroop, Wisconsin Card Sorting Task (WCST) and Wechsler's digit span tasks. The results suggested that mastering English as a foreign language causes considerable enhancement in two components of executive functions, namely cognitive flexibility and working memory. However, no significant difference was observed in inhibitory control between the advanced EFL learners and beginners.
Linking sounds to meanings: infant statistical learning in a natural language.
Hay, Jessica F; Pelucchi, Bruna; Graf Estes, Katharine; Saffran, Jenny R
2011-09-01
The processes of infant word segmentation and infant word learning have largely been studied separately. However, the ease with which potential word forms are segmented from fluent speech seems likely to influence subsequent mappings between words and their referents. To explore this process, we tested the link between the statistical coherence of sequences presented in fluent speech and infants' subsequent use of those sequences as labels for novel objects. Notably, the materials were drawn from a natural language unfamiliar to the infants (Italian). The results of three experiments suggest that there is a close relationship between the statistics of the speech stream and subsequent mapping of labels to referents. Mapping was facilitated when the labels contained high transitional probabilities in the forward and/or backward direction (Experiment 1). When no transitional probability information was available (Experiment 2), or when the internal transitional probabilities of the labels were low in both directions (Experiment 3), infants failed to link the labels to their referents. Word learning appears to be strongly influenced by infants' prior experience with the distribution of sounds that make up words in natural languages. Copyright © 2011 Elsevier Inc. All rights reserved.
Natural Language Processing in Radiology: A Systematic Review.
Pons, Ewoud; Braun, Loes M M; Hunink, M G Myriam; Kors, Jan A
2016-05-01
Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed. (©) RSNA, 2016 Online supplemental material is available for this article.
Formal Definition of Measures for BPMN Models
NASA Astrophysics Data System (ADS)
Reynoso, Luis; Rolón, Elvira; Genero, Marcela; García, Félix; Ruiz, Francisco; Piattini, Mario
Business process models are currently attaining more relevance, and more attention is therefore being paid to their quality. This situation led us to define a set of measures for the understandability of BPMN models, which is shown in a previous work. We focus on understandability since a model must be well understood before any changes are made to it. These measures were originally informally defined in natural language. As is well known, natural language is ambiguous and may lead to misunderstandings and a misinterpretation of the concepts captured by a measure and the way in which the measure value is obtained. This has motivated us to provide the formal definition of the proposed measures using OCL (Object Constraint Language) upon the BPMN (Business Process Modeling Notation) metamodel presented in this paper. The main advantages and lessons learned (which were obtained both from the current work and from previous works carried out in relation to the formal definition of other measures) are also summarized.
Zhang, Xingyu; Kim, Joyce; Patzer, Rachel E; Pitts, Stephen R; Patzer, Aaron; Schrager, Justin D
2017-10-26
To describe and compare logistic regression and neural network modeling strategies to predict hospital admission or transfer following initial presentation to Emergency Department (ED) triage with and without the addition of natural language processing elements. Using data from the National Hospital Ambulatory Medical Care Survey (NHAMCS), a cross-sectional probability sample of United States EDs from 2012 and 2013 survey years, we developed several predictive models with the outcome being admission to the hospital or transfer vs. discharge home. We included patient characteristics immediately available after the patient has presented to the ED and undergone a triage process. We used this information to construct logistic regression (LR) and multilayer neural network models (MLNN) which included natural language processing (NLP) and principal component analysis from the patient's reason for visit. Ten-fold cross validation was used to test the predictive capacity of each model and receiver operating curves (AUC) were then calculated for each model. Of the 47,200 ED visits from 642 hospitals, 6,335 (13.42%) resulted in hospital admission (or transfer). A total of 48 principal components were extracted by NLP from the reason for visit fields, which explained 75% of the overall variance for hospitalization. In the model including only structured variables, the AUC was 0.824 (95% CI 0.818-0.830) for logistic regression and 0.823 (95% CI 0.817-0.829) for MLNN. Models including only free-text information generated AUC of 0.742 (95% CI 0.731- 0.753) for logistic regression and 0.753 (95% CI 0.742-0.764) for MLNN. When both structured variables and free text variables were included, the AUC reached 0.846 (95% CI 0.839-0.853) for logistic regression and 0.844 (95% CI 0.836-0.852) for MLNN. The predictive accuracy of hospital admission or transfer for patients who presented to ED triage overall was good, and was improved with the inclusion of free text data from a patient's reason for visit regardless of modeling approach. Natural language processing and neural networks that incorporate patient-reported outcome free text may increase predictive accuracy for hospital admission.
Baneyx, Audrey; Charlet, Jean; Jaulent, Marie-Christine
2007-01-01
Pathologies and acts are classified in thesauri to help physicians to code their activity. In practice, the use of thesauri is not sufficient to reduce variability in coding and thesauri are not suitable for computer processing. We think the automation of the coding task requires a conceptual modeling of medical items: an ontology. Our task is to help lung specialists code acts and diagnoses with software that represents medical knowledge of this concerned specialty by an ontology. The objective of the reported work was to build an ontology of pulmonary diseases dedicated to the coding process. To carry out this objective, we develop a precise methodological process for the knowledge engineer in order to build various types of medical ontologies. This process is based on the need to express precisely in natural language the meaning of each concept using differential semantics principles. A differential ontology is a hierarchy of concepts and relationships organized according to their similarities and differences. Our main research hypothesis is to apply natural language processing tools to corpora to develop the resources needed to build the ontology. We consider two corpora, one composed of patient discharge summaries and the other being a teaching book. We propose to combine two approaches to enrich the ontology building: (i) a method which consists of building terminological resources through distributional analysis and (ii) a method based on the observation of corpus sequences in order to reveal semantic relationships. Our ontology currently includes 1550 concepts and the software implementing the coding process is still under development. Results show that the proposed approach is operational and indicates that the combination of these methods and the comparison of the resulting terminological structures give interesting clues to a knowledge engineer for the building of an ontology.
Language discrimination without language: Experiments on tamarin monkeys
NASA Astrophysics Data System (ADS)
Tincoff, Ruth; Hauser, Marc; Spaepen, Geertrui; Tsao, Fritz; Mehler, Jacques
2002-05-01
Human newborns can discriminate spoken languages differing on prosodic characteristics such as the timing of rhythmic units [T. Nazzi et al., JEP:HPP 24, 756-766 (1998)]. Cotton-top tamarins have also demonstrated a similar ability to discriminate a morae- (Japanese) vs a stress-timed (Dutch) language [F. Ramus et al., Science 288, 349-351 (2000)]. The finding that tamarins succeed in this task when either natural or synthesized utterances are played in a forward direction, but fail on backward utterances which disrupt the rhythmic cues, suggests that sensitivity to language rhythm may rely on general processes of the primate auditory system. However, the rhythm hypothesis also predicts that tamarins would fail to discriminate languages from the same rhythm class, such as English and Dutch. To assess the robustness of this ability, tamarins were tested on a different-rhythm-class distinction, Polish vs Japanese, and a new same-rhythm-class distinction, English vs Dutch. The stimuli were natural forward utterances produced by multiple speakers. As predicted by the rhythm hypothesis, tamarins discriminated between Polish and Japanese, but not English and Dutch. These findings strengthen the claim that discriminating the rhythmic cues of language does not require mechanisms specialized for human speech. [Work supported by NSF.
Patterson, Olga V; Freiberg, Matthew S; Skanderson, Melissa; J Fodeh, Samah; Brandt, Cynthia A; DuVall, Scott L
2017-06-12
In order to investigate the mechanisms of cardiovascular disease in HIV infected and uninfected patients, an analysis of echocardiogram reports is required for a large longitudinal multi-center study. A natural language processing system using a dictionary lookup, rules, and patterns was developed to extract heart function measurements that are typically recorded in echocardiogram reports as measurement-value pairs. Curated semantic bootstrapping was used to create a custom dictionary that extends existing terminologies based on terms that actually appear in the medical record. A novel disambiguation method based on semantic constraints was created to identify and discard erroneous alternative definitions of the measurement terms. The system was built utilizing a scalable framework, making it available for processing large datasets. The system was developed for and validated on notes from three sources: general clinic notes, echocardiogram reports, and radiology reports. The system achieved F-scores of 0.872, 0.844, and 0.877 with precision of 0.936, 0.982, and 0.969 for each dataset respectively averaged across all extracted values. Left ventricular ejection fraction (LVEF) is the most frequently extracted measurement. The precision of extraction of the LVEF measure ranged from 0.968 to 1.0 across different document types. This system illustrates the feasibility and effectiveness of a large-scale information extraction on clinical data. New clinical questions can be addressed in the domain of heart failure using retrospective clinical data analysis because key heart function measurements can be successfully extracted using natural language processing.
Experiments in concept modeling for radiographic image reports.
Bell, D S; Pattison-Gordon, E; Greenes, R A
1994-01-01
OBJECTIVE: Development of methods for building concept models to support structured data entry and image retrieval in chest radiography. DESIGN: An organizing model for chest-radiographic reporting was built by analyzing manually a set of natural-language chest-radiograph reports. During model building, clinician-informaticians judged alternative conceptual structures according to four criteria: content of clinically relevant detail, provision for semantic constraints, provision for canonical forms, and simplicity. The organizing model was applied in representing three sample reports in their entirety. To explore the potential for automatic model discovery, the representation of one sample report was compared with the noun phrases derived from the same report by the CLARIT natural-language processing system. RESULTS: The organizing model for chest-radiographic reporting consists of 62 concept types and 17 relations, arranged in an inheritance network. The broadest types in the model include finding, anatomic locus, procedure, attribute, and status. Diagnoses are modeled as a subtype of finding. Representing three sample reports in their entirety added 79 narrower concept types. Some CLARIT noun phrases suggested valid associations among subtypes of finding, status, and anatomic locus. CONCLUSIONS: A manual modeling process utilizing explicitly stated criteria for making modeling decisions produced an organizing model that showed consistency in early testing. A combination of top-down and bottom-up modeling was required. Natural-language processing may inform model building, but algorithms that would replace manual modeling were not discovered. Further progress in modeling will require methods for objective model evaluation and tools for formalizing the model-building process. PMID:7719807
Tasking and sharing sensing assets using controlled natural language
NASA Astrophysics Data System (ADS)
Preece, Alun; Pizzocaro, Diego; Braines, David; Mott, David
2012-06-01
We introduce an approach to representing intelligence, surveillance, and reconnaissance (ISR) tasks at a relatively high level in controlled natural language. We demonstrate that this facilitates both human interpretation and machine processing of tasks. More specically, it allows the automatic assignment of sensing assets to tasks, and the informed sharing of tasks between collaborating users in a coalition environment. To enable automatic matching of sensor types to tasks, we created a machine-processable knowledge representation based on the Military Missions and Means Framework (MMF), and implemented a semantic reasoner to match task types to sensor types. We combined this mechanism with a sensor-task assignment procedure based on a well-known distributed protocol for resource allocation. In this paper, we re-formulate the MMF ontology in Controlled English (CE), a type of controlled natural language designed to be readable by a native English speaker whilst representing information in a structured, unambiguous form to facilitate machine processing. We show how CE can be used to describe both ISR tasks (for example, detection, localization, or identication of particular kinds of object) and sensing assets (for example, acoustic, visual, or seismic sensors, mounted on motes or unmanned vehicles). We show how these representations enable an automatic sensor-task assignment process. Where a group of users are cooperating in a coalition, we show how CE task summaries give users in the eld a high-level picture of ISR coverage of an area of interest. This allows them to make ecient use of sensing resources by sharing tasks.
ERIC Educational Resources Information Center
Higgins, Derrick; Futagi, Yoko; Deane, Paul
2005-01-01
This paper reports on the process of modifying the ModelCreator item generation system to produce output in multiple languages. In particular, Japanese and Spanish are now supported in addition to English. The addition of multilingual functionality was considerably facilitated by the general formulation of our natural language generation system,…
ERIC Educational Resources Information Center
Wood, Peter
2011-01-01
"QuickAssist," the program presented in this paper, uses natural language processing (NLP) technologies. It places a range of NLP tools at the disposal of learners, intended to enable them to independently read and comprehend a German text of their choice while they extend their vocabulary, learn about different uses of particular words,…
ERIC Educational Resources Information Center
Meisel, Jurgen M.
2011-01-01
Children acquiring their first languages are frequently regarded as the principal agents of diachronic change. The causes and the precise nature of the processes of change are, however, far from clear. The following discussion focuses on possible changes of core properties of grammars which, in terms of the theory of Universal Grammar, can be…
Language evolution and human-computer interaction
NASA Technical Reports Server (NTRS)
Grudin, Jonathan; Norman, Donald A.
1991-01-01
Many of the issues that confront designers of interactive computer systems also appear in natural language evolution. Natural languages and human-computer interfaces share as their primary mission the support of extended 'dialogues' between responsive entities. Because in each case one participant is a human being, some of the pressures operating on natural languages, causing them to evolve in order to better support such dialogue, also operate on human-computer 'languages' or interfaces. This does not necessarily push interfaces in the direction of natural language - since one entity in this dialogue is not a human, this is not to be expected. Nonetheless, by discerning where the pressures that guide natural language evolution also appear in human-computer interaction, we can contribute to the design of computer systems and obtain a new perspective on natural languages.
Steinhauer, Karsten; DePriest, John; Koelsch, Stefan
2016-01-01
The processing of prosodic phrase boundaries in language is immediately reflected by a specific event-related potential component called the Closure Positive Shift (CPS). A component somewhat reminiscent of the CPS in language has also been reported for musical phrases (i.e., the so-called ‘music CPS’). However, in previous studies the quantification of the music-CPS as well as its morphology and timing differed substantially from the characteristics of the language-CPS. Therefore, the degree of correspondence between cognitive mechanisms of phrasing in music and in language has remained questionable. Here, we probed the shared nature of mechanisms underlying musical and prosodic phrasing by (1) investigating whether the music-CPS is present at phrase boundary positions where the language-CPS has been originally reported (i.e., at the onset of the pause between phrases), and (2) comparing the CPS in music and in language in non-musicians and professional musicians. For the first time, we report a positive shift at the onset of musical phrase boundaries that strongly resembles the language-CPS and argue that the post-boundary ‘music-CPS’ of previous studies may be an entirely distinct ERP component. Moreover, the language-CPS in musicians was found to be less prominent than in non-musicians, suggesting more efficient processing of prosodic phrases in language as a result of higher musical expertise. PMID:27192560
Glushko, Anastasia; Steinhauer, Karsten; DePriest, John; Koelsch, Stefan
2016-01-01
The processing of prosodic phrase boundaries in language is immediately reflected by a specific event-related potential component called the Closure Positive Shift (CPS). A component somewhat reminiscent of the CPS in language has also been reported for musical phrases (i.e., the so-called 'music CPS'). However, in previous studies the quantification of the music-CPS as well as its morphology and timing differed substantially from the characteristics of the language-CPS. Therefore, the degree of correspondence between cognitive mechanisms of phrasing in music and in language has remained questionable. Here, we probed the shared nature of mechanisms underlying musical and prosodic phrasing by (1) investigating whether the music-CPS is present at phrase boundary positions where the language-CPS has been originally reported (i.e., at the onset of the pause between phrases), and (2) comparing the CPS in music and in language in non-musicians and professional musicians. For the first time, we report a positive shift at the onset of musical phrase boundaries that strongly resembles the language-CPS and argue that the post-boundary 'music-CPS' of previous studies may be an entirely distinct ERP component. Moreover, the language-CPS in musicians was found to be less prominent than in non-musicians, suggesting more efficient processing of prosodic phrases in language as a result of higher musical expertise.
Temporal pattern processing in songbirds.
Comins, Jordan A; Gentner, Timothy Q
2014-10-01
Understanding how the brain perceives, organizes and uses patterned information is directly related to the neurobiology of language. Given the present limitations, such knowledge at the scale of neurons, neural circuits and neural populations can only come from non-human models, focusing on shared capacities that are relevant to language processing. Here we review recent advances in the behavioral and neural basis of temporal pattern processing of natural auditory communication signals in songbirds, focusing on European starlings. We suggest a general inhibitory circuit for contextual modulation that can act to control sensory representations based on patterning rules. Copyright © 2014. Published by Elsevier Ltd.
Constructing Concept Schemes From Astronomical Telegrams Via Natural Language Clustering
NASA Astrophysics Data System (ADS)
Graham, Matthew; Zhang, M.; Djorgovski, S. G.; Donalek, C.; Drake, A. J.; Mahabal, A.
2012-01-01
The rapidly emerging field of time domain astronomy is one of the most exciting and vibrant new research frontiers, ranging in scientific scope from studies of the Solar System to extreme relativistic astrophysics and cosmology. It is being enabled by a new generation of large synoptic digital sky surveys - LSST, PanStarrs, CRTS - that cover large areas of sky repeatedly, looking for transient objects and phenomena. One of the biggest challenges facing these is the automated classification of transient events, a process that needs machine-processible astronomical knowledge. Semantic technologies enable the formal representation of concepts and relations within a particular domain. ATELs (http://www.astronomerstelegram.org) are a commonly-used means for reporting and commenting upon new astronomical observations of transient sources (supernovae, stellar outbursts, blazar flares, etc). However, they are loose and unstructured and employ scientific natural language for description: this makes automated processing of them - a necessity within the next decade with petascale data rates - a challenge. Nevertheless they represent a potentially rich corpus of information that could lead to new and valuable insights into transient phenomena. This project lies in the cutting-edge field of astrosemantics, a branch of astroinformatics, which applies semantic technologies to astronomy. The ATELs have been used to develop an appropriate concept scheme - a representation of the information they contain - for transient astronomy using hierarchical clustering of processed natural language. This allows us to automatically organize ATELs based on the vocabulary used. We conclude that we can use simple algorithms to process and extract meaning from astronomical textual data.
A natural language query system for Hubble Space Telescope proposal selection
NASA Technical Reports Server (NTRS)
Hornick, Thomas; Cohen, William; Miller, Glenn
1987-01-01
The proposal selection process for the Hubble Space Telescope is assisted by a robust and easy to use query program (TACOS). The system parses an English subset language sentence regardless of the order of the keyword phases, allowing the user a greater flexibility than a standard command query language. Capabilities for macro and procedure definition are also integrated. The system was designed for flexibility in both use and maintenance. In addition, TACOS can be applied to any knowledge domain that can be expressed in terms of a single reaction. The system was implemented mostly in Common LISP. The TACOS design is described in detail, with particular attention given to the implementation methods of sentence processing.
Luo, Yuan; Szolovits, Peter
2016-01-01
In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing on narrative clinical notes in electronic medical records (EMRs) and efficiently retrieving such annotations that satisfy position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem into the interval query problem, for which optimal query/update time is in general logarithm. We next perform a tight time complexity analysis on the basic interval tree query algorithm and show its nonoptimality when being applied to a collection of 13 query types from Allen's interval algebra. We then study two closely related state-of-the-art interval query algorithms, proposed query reformulations, and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic time stabbing-max query time complexity and solves the stabbing-interval query tasks on all of Allen's relations in logarithmic time, attaining the theoretic lower bound. Updating time is kept logarithmic and the space requirement is kept linear at the same time. We also discuss interval management in external memory models and higher dimensions.
Luo, Yuan; Szolovits, Peter
2016-01-01
In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing on narrative clinical notes in electronic medical records (EMRs) and efficiently retrieving such annotations that satisfy position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem into the interval query problem, for which optimal query/update time is in general logarithm. We next perform a tight time complexity analysis on the basic interval tree query algorithm and show its nonoptimality when being applied to a collection of 13 query types from Allen’s interval algebra. We then study two closely related state-of-the-art interval query algorithms, proposed query reformulations, and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic time stabbing-max query time complexity and solves the stabbing-interval query tasks on all of Allen’s relations in logarithmic time, attaining the theoretic lower bound. Updating time is kept logarithmic and the space requirement is kept linear at the same time. We also discuss interval management in external memory models and higher dimensions. PMID:27478379
NASA Astrophysics Data System (ADS)
Liu, Bingli; Chen, Xinying
2017-07-01
In the target article [1], Liu et al. provide an informative introduction to the dependency distance studies and proclaim that language syntactic patterns, that relate to the dependency distance, are associated with human cognitive mechanisms, such as limited working memory and syntax processing. Therefore, such syntactic patterns are probably 'human-driven' language universals. Sufficient evidence based on big data analysis is also given in the article for supporting this idea. The hypotheses generally seem very convincing yet still need further tests from various perspectives. Diachronic linguistic study based on authentic language data, on our opinion, can be one of those 'further tests'.
Natural language indicators of differential gene regulation in the human immune system.
Mehl, Matthias R; Raison, Charles L; Pace, Thaddeus W W; Arevalo, Jesusa M G; Cole, Steve W
2017-11-21
Adverse social conditions have been linked to a conserved transcriptional response to adversity (CTRA) in circulating leukocytes that may contribute to social gradients in disease. However, the CNS mechanisms involved remain obscure, in part because CTRA gene-expression profiles often track external social-environmental variables more closely than they do self-reported internal affective states such as stress, depression, or anxiety. This study examined the possibility that variations in patterns of natural language use might provide more sensitive indicators of the automatic threat-detection and -response systems that proximally regulate autonomic induction of the CTRA. In 22,627 audio samples of natural speech sampled from the daily interactions of 143 healthy adults, both total language output and patterns of function-word use covaried with CTRA gene expression. These language features predicted CTRA gene expression substantially better than did conventional self-report measures of stress, depression, and anxiety and did so independently of demographic and behavioral factors (age, sex, race, smoking, body mass index) and leukocyte subset distributions. This predictive relationship held when language and gene expression were sampled more than a week apart, suggesting that associations reflect stable individual differences or chronic life circumstances. Given the observed relationship between personal expression and gene expression, patterns of natural language use may provide a useful behavioral indicator of nonconsciously evaluated well-being (implicit safety vs. threat) that is distinct from conscious affective experience and more closely tracks the neurobiological processes involved in peripheral gene regulation. Copyright © 2017 the Author(s). Published by PNAS.
Text Information Extraction System (TIES) | Informatics Technology for Cancer Research (ITCR)
TIES is a service based software system for acquiring, deidentifying, and processing clinical text reports using natural language processing, and also for querying, sharing and using this data to foster tissue and image based research, within and between institutions.
RGSS-ID: an approach to new radiologic reporting system.
Ikeda, M; Sakuma, S; Maruyama, K
1990-01-01
RGSS-ID is a developmental computer system that applies artificial intelligence (AI) methods to a reporting system. The representation scheme called Generalized Finding Representation (GFR) is proposed to bridge the gap between natural language expressions in the radiology report and AI methods. The entry process of RGSS-ID is made mainly by selecting items; our system allows a radiologist to compose a sentence which can be completely parsed by the computer. Further RGSS-ID encodes findings into the expression corresponding to GFR, and stores this expression into the knowledge data base. The final printed report is made in the natural language.
Native language shapes automatic neural processing of speech.
Intartaglia, Bastien; White-Schwoch, Travis; Meunier, Christine; Roman, Stéphane; Kraus, Nina; Schön, Daniele
2016-08-01
The development of the phoneme inventory is driven by the acoustic-phonetic properties of one's native language. Neural representation of speech is known to be shaped by language experience, as indexed by cortical responses, and recent studies suggest that subcortical processing also exhibits this attunement to native language. However, most work to date has focused on the differences between tonal and non-tonal languages that use pitch variations to convey phonemic categories. The aim of this cross-language study is to determine whether subcortical encoding of speech sounds is sensitive to language experience by comparing native speakers of two non-tonal languages (French and English). We hypothesized that neural representations would be more robust and fine-grained for speech sounds that belong to the native phonemic inventory of the listener, and especially for the dimensions that are phonetically relevant to the listener such as high frequency components. We recorded neural responses of American English and French native speakers, listening to natural syllables of both languages. Results showed that, independently of the stimulus, American participants exhibited greater neural representation of the fundamental frequency compared to French participants, consistent with the importance of the fundamental frequency to convey stress patterns in English. Furthermore, participants showed more robust encoding and more precise spectral representations of the first formant when listening to the syllable of their native language as compared to non-native language. These results align with the hypothesis that language experience shapes sensory processing of speech and that this plasticity occurs as a function of what is meaningful to a listener. Copyright © 2016 Elsevier Ltd. All rights reserved.
More Than Words: The Role of Multiword Sequences in Language Learning and Use.
Christiansen, Morten H; Arnon, Inbal
2017-07-01
The ability to convey our thoughts using an infinite number of linguistic expressions is one of the hallmarks of human language. Understanding the nature of the psychological mechanisms and representations that give rise to this unique productivity is a fundamental goal for the cognitive sciences. A long-standing hypothesis is that single words and rules form the basic building blocks of linguistic productivity, with multiword sequences being treated as units only in peripheral cases such as idioms. The new millennium, however, has seen a shift toward construing multiword linguistic units not as linguistic rarities, but as important building blocks for language acquisition and processing. This shift-which originated within theoretical approaches that emphasize language learning and use-has far-reaching implications for theories of language representation, processing, and acquisition. Incorporating multiword units as integral building blocks blurs the distinction between grammar and lexicon; calls for models of production and comprehension that can accommodate and give rise to the effect of multiword information on processing; and highlights the importance of such units to learning. In this special topic, we bring together cutting-edge work on multiword sequences in theoretical linguistics, first-language acquisition, psycholinguistics, computational modeling, and second-language learning to present a comprehensive overview of the prominence and importance of such units in language, their possible role in explaining differences between first- and second-language learning, and the challenges the combined findings pose for theories of language. Copyright © 2017 Cognitive Science Society, Inc.
Natural language processing: an introduction.
Nadkarni, Prakash M; Ohno-Machado, Lucila; Chapman, Wendy W
2011-01-01
To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field.
What can Natural Language Processing do for Clinical Decision Support?
Demner-Fushman, Dina; Chapman, Wendy W.; McDonald, Clement J.
2009-01-01
Computerized Clinical Decision Support (CDS) aims to aid decision making of health care providers and the public by providing easily accessible health-related information at the point and time it is needed. Natural Language Processing (NLP) is instrumental in using free-text information to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative. The early innovative NLP research of clinical narrative was followed by a period of stable research conducted at the major clinical centers and a shift of mainstream interest to biomedical NLP. This review primarily focuses on the recently renewed interest in development of fundamental NLP methods and advances in the NLP systems for CDS. The current solutions to challenges posed by distinct sublanguages, intended user groups, and support goals are discussed. PMID:19683066
Natural language processing: an introduction
Ohno-Machado, Lucila; Chapman, Wendy W
2011-01-01
Objectives To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. Target audience This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. Scope We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field. PMID:21846786
Early Childhood Stuttering and Electrophysiological Indices of Language Processing
Weber-Fox, Christine; Wray, Amanda Hampton; Arnold, Hayley
2013-01-01
We examined neural activity mediating semantic and syntactic processing in 27 preschool-age children who stutter (CWS) and 27 preschool-age children who do not stutter (CWNS) matched for age, nonverbal IQ and language abilities. All participants displayed language abilities and nonverbal IQ within the normal range. Event-related brain potentials (ERPs) were elicited while participants watched a cartoon video and heard naturally spoken sentences that were either correct or contained semantic or syntactic (phrase structure) violations. ERPs in CWS, compared to CWNS, were characterized by longer N400 peak latencies elicited by semantic processing. In the CWS, syntactic violations elicited greater negative amplitudes for the early time window (150–350 ms) over medial sites compared to CWNS. Additionally, the amplitude of the P600 elicited by syntactic violations relative to control words was significant over the left hemisphere for the CWNS but showed the reverse pattern in CWS, a robust effect only over the right hemisphere. Both groups of preschoolage children demonstrated marked and differential effects for neural processes elicited by semantic and phrase structure violations; however, a significant proportion of young CWS exhibit differences in the neural functions mediating language processing compared to CWNS despite normal language abilities. These results are the first to show that differences in event-related brain potentials reflecting language processing occur as early as the preschool years in CWS and provide the first evidence that atypical lateralization of hemispheric speech/language functions previously observed in the brains of adults who stutter begin to emerge near the onset of developmental stuttering. PMID:23773672
Alt, Mary; Gutmann, Michelle L
2009-01-01
This study was designed to test the word learning abilities of adults with typical language abilities, those with a history of disorders of spoken or written language (hDSWL), and hDSWL plus attention deficit hyperactivity disorder (+ADHD). Sixty-eight adults were required to associate a novel object with a novel label, and then recognize semantic features of the object and phonological features of the label. Participants were tested for overt ability (accuracy) and covert processing (reaction time). The +ADHD group was less accurate at mapping semantic features and slower to respond to lexical labels than both other groups. Different factors correlated with word learning performance for each group. Adults with language and attention deficits are more impaired at word learning than adults with language deficits only. Despite behavioral profiles like typical peers, adults with hDSWL may use different processing strategies than their peers. Readers will be able to: (1) recognize the influence of a dual disability (hDSWL and ADHD) on word learning outcomes; (2) identify factors that may contribute to word learning in adults in terms of (a) the nature of the words to be learned and (b) the language processing of the learner.
Weber-Fox, Christine; Hart, Laura J; Spruill, John E
2006-07-01
This study examined how school-aged children process different grammatical categories. Event-related brain potentials elicited by words in visually presented sentences were analyzed according to seven grammatical categories with naturally varying characteristics of linguistic functions, semantic features, and quantitative attributes of length and frequency. The categories included nouns, adjectives, verbs, pronouns, conjunctions, prepositions, and articles. The findings indicate that by the age of 9-10 years, children exhibit robust neural indicators differentiating grammatical categories; however, it is also evident that development of language processing is not yet adult-like at this age. The current findings are consistent with the hypothesis that for beginning readers a variety of cues and characteristics interact to affect processing of different grammatical categories and indicate the need to take into account linguistic functions, prosodic salience, and grammatical complexity as they relate to the development of language abilities.
Soysal, Ergin; Wang, Jingqi; Jiang, Min; Wu, Yonghui; Pakhomov, Serguei; Liu, Hongfang; Xu, Hua
2017-11-24
Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Language in Comparative Perspective. Chapter 11
NASA Technical Reports Server (NTRS)
Rumbaugh, Duane M.; Savage-Rumbaugh, E. Sue
1994-01-01
The twentieth century will be noted for a wide variety of scientific and technological advancements, including powered flight, antibiotics, space travel, and the breaking of the genetic code. It also should be noted as the century in which major psychological, as well as biological, continuities between animal and human have been defined. Charles Darwin (1859) was quite right when he anticipated continuity in mental processes, some of which provide for language. Though none will argue that any animal has the full capacity of humans for language, none should deny that at least some animals have quite impressive competencies for language skills, including speech comprehension. The finding that the language skills in the bonobo and the chimpanzee are likely more fully and efficiently developed as a result of early rearing than by formal training at a later age declares a continuity even stronger than that defined by the language acquisition potential of the ape. To clarify, because early rearing facilitates the emergence of language in ape as well as in child, a naturalness to the familiar course of language acquisition, whereby comprehension precedes production, is also corroborated. In turn, the continuity and the shared naturalness of language acquisition serve jointly to define an advanced and critical point of linkage between the genera Pan and Homo - and, as concluded by Domjan (1993), one worthy of contributing to the series of reconceptions of ourselves as anticipated by Ploog and Melnechuk (1971).
Formal ontology for natural language processing and the integration of biomedical databases.
Simon, Jonathan; Dos Santos, Mariana; Fielding, James; Smith, Barry
2006-01-01
The central hypothesis underlying this communication is that the methodology and conceptual rigor of a philosophically inspired formal ontology can bring significant benefits in the development and maintenance of application ontologies [A. Flett, M. Dos Santos, W. Ceusters, Some Ontology Engineering Procedures and their Supporting Technologies, EKAW2002, 2003]. This hypothesis has been tested in the collaboration between Language and Computing (L&C), a company specializing in software for supporting natural language processing especially in the medical field, and the Institute for Formal Ontology and Medical Information Science (IFOMIS), an academic research institution concerned with the theoretical foundations of ontology. In the course of this collaboration L&C's ontology, LinKBase, which is designed to integrate and support reasoning across a plurality of external databases, has been subjected to a thorough auditing on the basis of the principles underlying IFOMIS's Basic Formal Ontology (BFO) [B. Smith, Basic Formal Ontology, 2002. http://ontology.buffalo.edu/bfo]. The goal is to transform a large terminology-based ontology into one with the ability to support reasoning applications. Our general procedure has been the implementation of a meta-ontological definition space in which the definitions of all the concepts and relations in LinKBase are standardized in the framework of first-order logic. In this paper we describe how this principles-based standardization has led to a greater degree of internal coherence of the LinKBase structure, and how it has facilitated the construction of mappings between external databases using LinKBase as translation hub. We argue that the collaboration here described represents a new phase in the quest to solve the so-called "Tower of Babel" problem of ontology integration [F. Montayne, J. Flanagan, Formal Ontology: The Foundation for Natural Language Processing, 2003. http://www.landcglobal.com/].
Willems, Roel M.; Hagoort, Peter
2016-01-01
Many studies have revealed shared music–language processing resources by finding an influence of music harmony manipulations on concurrent language processing. However, the nature of the shared resources has remained ambiguous. They have been argued to be syntax specific and thus due to shared syntactic integration resources. An alternative view regards them as related to general attention and, thus, not specific to syntax. The present experiments evaluated these accounts by investigating the influence of language on music. Participants were asked to provide closure judgements on harmonic sequences in order to assess the appropriateness of sequence endings. At the same time participants read syntactic garden-path sentences. Closure judgements revealed a change in harmonic processing as the result of reading a syntactically challenging word. We found no influence of an arithmetic control manipulation (experiment 1) or semantic garden-path sentences (experiment 2). Our results provide behavioural evidence for a specific influence of linguistic syntax processing on musical harmony judgements. A closer look reveals that the shared resources appear to be needed to hold a harmonic key online in some form of syntactic working memory or unification workspace related to the integration of chords and words. Overall, our results support the syntax specificity of shared music–language processing resources. PMID:26998339
ERIC Educational Resources Information Center
Russell, Dale W.
An obstacle in Natural Language understanding is the existence of lexical gaps, i.e. words or word senses that are not in the lexicon of the system. This thesis describes the implementation of MURRAY, a learning mechanism which infers the properties of a new lexical item from its syntactical environment and infers its meaning based on context and…
ERIC Educational Resources Information Center
Renaud, Claire
2010-01-01
Current second language (L2) research focuses on the level of features--that is, the core elements of languages in the Minimalist Program framework. These features, involved in computations, are further divided into two types: those that indicate to which category a word belongs (i.e., interpretable features) versus those that constrain the type…
Processing structure in language and music: a case for shared reliance on cognitive control.
Slevc, L Robert; Okada, Brooke M
2015-06-01
The relationship between structural processing in music and language has received increasing interest in the past several years, spurred by the influential Shared Syntactic Integration Resource Hypothesis (SSIRH; Patel, Nature Neuroscience, 6, 674-681, 2003). According to this resource-sharing framework, music and language rely on separable syntactic representations but recruit shared cognitive resources to integrate these representations into evolving structures. The SSIRH is supported by findings of interactions between structural manipulations in music and language. However, other recent evidence suggests that such interactions also can arise with nonstructural manipulations, and some recent neuroimaging studies report largely nonoverlapping neural regions involved in processing musical and linguistic structure. These conflicting results raise the question of exactly what shared (and distinct) resources underlie musical and linguistic structural processing. This paper suggests that one shared resource is prefrontal cortical mechanisms of cognitive control, which are recruited to detect and resolve conflict that occurs when expectations are violated and interpretations must be revised. By this account, musical processing involves not just the incremental processing and integration of musical elements as they occur, but also the incremental generation of musical predictions and expectations, which must sometimes be overridden and revised in light of evolving musical input.
Open Source Clinical NLP – More than Any Single System
Masanz, James; Pakhomov, Serguei V.; Xu, Hua; Wu, Stephen T.; Chute, Christopher G.; Liu, Hongfang
2014-01-01
The number of Natural Language Processing (NLP) tools and systems for processing clinical free-text has grown as interest and processing capability have surged. Unfortunately any two systems typically cannot simply interoperate, even when both are built upon a framework designed to facilitate the creation of pluggable components. We present two ongoing activities promoting open source clinical NLP. The Open Health Natural Language Processing (OHNLP) Consortium was originally founded to foster a collaborative community around clinical NLP, releasing UIMA-based open source software. OHNLP’s mission currently includes maintaining a catalog of clinical NLP software and providing interfaces to simplify the interaction of NLP systems. Meanwhile, Apache cTAKES aims to integrate best-of-breed annotators, providing a world-class NLP system for accessing clinical information within free-text. These two activities are complementary. OHNLP promotes open source clinical NLP activities in the research community and Apache cTAKES bridges research to the health information technology (HIT) practice. PMID:25954581
CE-SAM: a conversational interface for ISR mission support
NASA Astrophysics Data System (ADS)
Pizzocaro, Diego; Parizas, Christos; Preece, Alun; Braines, Dave; Mott, David; Bakdash, Jonathan Z.
2013-05-01
There is considerable interest in natural language conversational interfaces. These allow for complex user interactions with systems, such as fulfilling information requirements in dynamic environments, without requiring extensive training or a technical background (e.g. in formal query languages or schemas). To leverage the advantages of conversational interactions we propose CE-SAM (Controlled English Sensor Assignment to Missions), a system that guides users through refining and satisfying their information needs in the context of Intelligence, Surveillance, and Reconnaissance (ISR) operations. The rapidly-increasing availability of sensing assets and other information sources poses substantial challenges to effective ISR resource management. In a coalition context, the problem is even more complex, because assets may be "owned" by different partners. We show how CE-SAM allows a user to refine and relate their ISR information needs to pre-existing concepts in an ISR knowledge base, via conversational interaction implemented on a tablet device. The knowledge base is represented using Controlled English (CE) - a form of controlled natural language that is both human-readable and machine processable (i.e. can be used to implement automated reasoning). Users interact with the CE-SAM conversational interface using natural language, which the system converts to CE for feeding-back to the user for confirmation (e.g. to reduce misunderstanding). We show that this process not only allows users to access the assets that can support their mission needs, but also assists them in extending the CE knowledge base with new concepts.
Independent Research and Independent Exploratory Development Annual Report Fiscal Year 1976 and FYTQ
1976-10-01
Command Control Natural Language Processing; Network Study Small Ship C2 System Display Studies Communication HF-Propagation Signal Processing Theory...faster than their natur tl horizontally polarized light. This is passed through mechanical resonances. the wire grid polar’izer and mixes with the ZOOM...present in the coronary care unit Borkat, FR, Kataoka, RW, and Martin, JI, "Digital would contribute considerably to the high level of Cardiotachometer
[Hygiene requirements for the level of intellectual intensivity loads in foreign language learning].
Ivashchenko, S N
2013-01-01
The material of this article provides information about the basics of hygiene conditions and nature of intellectual loads of students of secondary schools in the perception of information in a foreign language. Are the most favorable conditions for the successful training of perception and assimilation of information supplied in the course of the learning process in one foreign language or some more different ones? It was found that the process of perception and assimilation of educational information in foreign languages is associated with some degree of mental and emotional stress of students. At the same time, the effectiveness of the learning process depends on the degree of stress. Certain parameters of the psychological and emotional stress students usually have a stimulating effect on their central nervous system. Another level, the psychological and emotional stress of students on the contrary, causes a braking effect of functional activity of the relevant structures of the central nervous system of students and reduces the effectiveness of training.
Keep meaning in conversational coordination
Cuffari, Elena C.
2014-01-01
Coordination is a widely employed term across recent quantitative and qualitative approaches to intersubjectivity, particularly approaches that give embodiment and enaction central explanatory roles. With a focus on linguistic and bodily coordination in conversational contexts, I review the operational meaning of coordination in recent empirical research and related theorizing of embodied intersubjectivity. This discussion articulates what must be involved in treating linguistic meaning as dynamic processes of coordination. The coordination approach presents languaging as a set of dynamic self-organizing processes and actions on multiple timescales and across multiple modalities that come about and work in certain domains (those jointly constructed in social, interactive, high-order sense-making). These processes go beyond meaning at the level that is available to first-person experience. I take one crucial consequence of this to be the ubiquitously moral nature of languaging with others. Languaging coordinates experience, among other levels of behavior and event. Ethical effort is called for by the automatic autonomy-influencing forces of languaging as coordination. PMID:25520693
Computational Natural Language Inference: Robust and Interpretable Question Answering
ERIC Educational Resources Information Center
Sharp, Rebecca Reynolds
2017-01-01
We address the challenging task of "computational natural language inference," by which we mean bridging two or more natural language texts while also providing an explanation of how they are connected. In the context of question answering (i.e., finding short answers to natural language questions), this inference connects the question…
Cross-Language Distributions of High Frequency and Phonetically Similar Cognates
Schepens, Job; Dijkstra, Ton; Grootjen, Franc; van Heuven, Walter J. B.
2013-01-01
The coinciding form and meaning similarity of cognates, e.g. ‘flamme’ (French), ‘Flamme’ (German), ‘vlam’ (Dutch), meaning ‘flame’ in English, facilitates learning of additional languages. The cross-language frequency and similarity distributions of cognates vary according to evolutionary change and language contact. We compare frequency and orthographic (O), phonetic (P), and semantic similarity of cognates, automatically identified in semi-complete lexicons of six widely spoken languages. Comparisons of P and O similarity reveal inconsistent mappings in language pairs with deep orthographies. The frequency distributions show that cognate frequency is reduced in less closely related language pairs as compared to more closely related languages (e.g., French-English vs. German-English). These frequency and similarity patterns may support a better understanding of cognate processing in natural and experimental settings. The automatically identified cognates are available in the supplementary materials, including the frequency and similarity measurements. PMID:23675449
Chen, Ao; Peter, Varghese; Wijnen, Frank; Schnack, Hugo; Burnham, Denis
2018-04-21
Language experience shapes musical and speech pitch processing. We investigated whether speaking a lexical tone language natively modulates neural processing of pitch in language and music as well as their correlation. We tested tone language (Mandarin Chinese), and non-tone language (Dutch) listeners in a passive oddball paradigm measuring mismatch negativity (MMN) for (i) Chinese lexical tones and (ii) three-note musical melodies with similar pitch contours. For lexical tones, Chinese listeners showed a later MMN peak than the non-tone language listeners, whereas for MMN amplitude there were no significant differences between groups. Dutch participants also showed a late discriminative negativity (LDN). In the music condition two MMNs, corresponding to the two notes that differed between the standard and the deviant were found for both groups, and an LDN were found for both the Dutch and the Chinese listeners. The music MMNs were significantly right lateralized. Importantly, significant correlations were found between the lexical tone and the music MMNs for the Dutch but not the Chinese participants. The results suggest that speaking a tone language natively does not necessarily enhance neural responses to pitch either in language or in music, but that it does change the nature of neural pitch processing: non-tone language speakers appear to perceive lexical tones as musical, whereas for tone language speakers, lexical tones and music may activate different neural networks. Neural resources seem to be assigned differently for the lexical tones and for musical melodies, presumably depending on the presence or absence of long-term phonological memory traces. Copyright © 2018 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Nash-Webber, Bonnie; Reiter, Raymond
This paper describes a computational approach to certain problems of anaphora in natural language and argues in favor of formal meaning representation languages (MRLs) for natural language. After presenting arguments in favor of formal meaning representation languages, appropriate MRLs are discussed. Minimal requirements include provisions for…
Ting, Simon Kang Seng; Chia, Pei Shi; Kwek, Kevin; Tan, Wilnard; Hameed, Shahul
2016-10-01
Number processing disorder is an acquired deficit in mathematical skills commonly observed in Alzheimer's disease (AD), usually as a consequence of neurological dysfunction. Common impairments include syntactic errors (800012 instead of 8012) and intrusion errors (8 thousand and 12 instead of eight thousand and twelve) in number transcoding tasks. This study aimed to understand the characterization of AD-related number processing disorder within an alphabetic language (English) and ideographical language (Chinese), and to investigate the differences between alphabetic and ideographic language processing. Chinese-speaking AD patients were hypothesized to make significantly more intrusion errors than English-speaking ones, due to the ideographical nature of both Chinese characters and Arabic numbers. A simplified number transcoding test derived from EC301 battery was administered to AD patients. Chinese-speaking AD patients made significantly more intrusion errors (p = 0.001) than English speakers. This demonstrates that number processing in an alphabetic language such as English does not function in the same manner as in Chinese. The impaired inhibition capability likely contributes to such observations due to its competitive lexical representation in brain for Chinese speakers.
Modelling of internal architecture of kinesin nanomotor as a machine language.
Khataee, H R; Ibrahim, M Y
2012-09-01
Kinesin is a protein-based natural nanomotor that transports molecular cargoes within cells by walking along microtubules. Kinesin nanomotor is considered as a bio-nanoagent which is able to sense the cell through its sensors (i.e. its heads and tail), make the decision internally and perform actions on the cell through its actuator (i.e. its motor domain). The study maps the agent-based architectural model of internal decision-making process of kinesin nanomotor to a machine language using an automata algorithm. The applied automata algorithm receives the internal agent-based architectural model of kinesin nanomotor as a deterministic finite automaton (DFA) model and generates a regular machine language. The generated regular machine language was acceptable by the architectural DFA model of the nanomotor and also in good agreement with its natural behaviour. The internal agent-based architectural model of kinesin nanomotor indicates the degree of autonomy and intelligence of the nanomotor interactions with its cell. Thus, our developed regular machine language can model the degree of autonomy and intelligence of kinesin nanomotor interactions with its cell as a language. Modelling of internal architectures of autonomous and intelligent bio-nanosystems as machine languages can lay the foundation towards the concept of bio-nanoswarms and next phases of the bio-nanorobotic systems development.
Computer Aided Management for Information Processing Projects.
ERIC Educational Resources Information Center
Akman, Ibrahim; Kocamustafaogullari, Kemal
1995-01-01
Outlines the nature of information processing projects and discusses some project management programming packages. Describes an in-house interface program developed to utilize a selected project management package (TIMELINE) by using Oracle Data Base Management System tools and Pascal programming language for the management of information system…
Space Station Mission Planning System (MPS) development study. Volume 2
NASA Technical Reports Server (NTRS)
Klus, W. J.
1987-01-01
The process and existing software used for Spacelab payload mission planning were studied. A complete baseline definition of the Spacelab payload mission planning process was established, along with a definition of existing software capabilities for potential extrapolation to the Space Station. This information was used as a basis for defining system requirements to support Space Station mission planning. The Space Station mission planning concept was reviewed for the purpose of identifying areas where artificial intelligence concepts might offer substantially improved capability. Three specific artificial intelligence concepts were to be investigated for applicability: natural language interfaces; expert systems; and automatic programming. The advantages and disadvantages of interfacing an artificial intelligence language with existing FORTRAN programs or of converting totally to a new programming language were identified.
Nayor, Jennifer; Borges, Lawrence F; Goryachev, Sergey; Gainer, Vivian S; Saltzman, John R
2018-07-01
ADR is a widely used colonoscopy quality indicator. Calculation of ADR is labor-intensive and cumbersome using current electronic medical databases. Natural language processing (NLP) is a method used to extract meaning from unstructured or free text data. (1) To develop and validate an accurate automated process for calculation of adenoma detection rate (ADR) and serrated polyp detection rate (SDR) on data stored in widely used electronic health record systems, specifically Epic electronic health record system, Provation ® endoscopy reporting system, and Sunquest PowerPath pathology reporting system. Screening colonoscopies performed between June 2010 and August 2015 were identified using the Provation ® reporting tool. An NLP pipeline was developed to identify adenomas and sessile serrated polyps (SSPs) on pathology reports corresponding to these colonoscopy reports. The pipeline was validated using a manual search. Precision, recall, and effectiveness of the natural language processing pipeline were calculated. ADR and SDR were then calculated. We identified 8032 screening colonoscopies that were linked to 3821 pathology reports (47.6%). The NLP pipeline had an accuracy of 100% for adenomas and 100% for SSPs. Mean total ADR was 29.3% (range 14.7-53.3%); mean male ADR was 35.7% (range 19.7-62.9%); and mean female ADR was 24.9% (range 9.1-51.0%). Mean total SDR was 4.0% (0-9.6%). We developed and validated an NLP pipeline that accurately and automatically calculates ADRs and SDRs using data stored in Epic, Provation ® and Sunquest PowerPath. This NLP pipeline can be used to evaluate colonoscopy quality parameters at both individual and practice levels.
Musical Experience Influences Statistical Learning of a Novel Language
Shook, Anthony; Marian, Viorica; Bartolotti, James; Schroeder, Scott R.
2014-01-01
Musical experience may benefit learning a new language by enhancing the fidelity with which the auditory system encodes sound. In the current study, participants with varying degrees of musical experience were exposed to two statistically-defined languages consisting of auditory Morse-code sequences which varied in difficulty. We found an advantage for highly-skilled musicians, relative to less-skilled musicians, in learning novel Morse-code based words. Furthermore, in the more difficult learning condition, performance of lower-skilled musicians was mediated by their general cognitive abilities. We suggest that musical experience may lead to enhanced processing of statistical information and that musicians’ enhanced ability to learn statistical probabilities in a novel Morse-code language may extend to natural language learning. PMID:23505962
Speech Processing and Recognition (SPaRe)
2011-01-01
results in the areas of automatic speech recognition (ASR), speech processing, machine translation (MT), natural language processing ( NLP ), and...Processing ( NLP ), Information Retrieval (IR) 16. SECURITY CLASSIFICATION OF: UNCLASSIFED 17. LIMITATION OF ABSTRACT 18. NUMBER OF PAGES 19a. NAME...Figure 9, the IOC was only expected to provide document submission and search; automatic speech recognition (ASR) for English, Spanish, Arabic , and
Reading Process and Practice: From Socio-Psycholinguistics to Whole Language.
ERIC Educational Resources Information Center
Weaver, Constance
Based on the thesis that reading is not a passive process by which readers soak up words and information from the page, but an active process by which they predict, sample, and confirm or correct their hypotheses about the written text, this book is an introduction to the theories of the psycholinguistic nature of the reading process and reading…
The crustal dynamics intelligent user interface anthology
NASA Technical Reports Server (NTRS)
Short, Nicholas M., Jr.; Campbell, William J.; Roelofs, Larry H.; Wattawa, Scott L.
1987-01-01
The National Space Science Data Center (NSSDC) has initiated an Intelligent Data Management (IDM) research effort which has, as one of its components, the development of an Intelligent User Interface (IUI). The intent of the IUI is to develop a friendly and intelligent user interface service based on expert systems and natural language processing technologies. The purpose of such a service is to support the large number of potential scientific and engineering users that have need of space and land-related research and technical data, but have little or no experience in query languages or understanding of the information content or architecture of the databases of interest. This document presents the design concepts, development approach and evaluation of the performance of a prototype IUI system for the Crustal Dynamics Project Database, which was developed using a microcomputer-based expert system tool (M. 1), the natural language query processor THEMIS, and the graphics software system GSS. The IUI design is based on a multiple view representation of a database from both the user and database perspective, with intelligent processes to translate between the views.
Unsupervised learning of natural languages
Solan, Zach; Horn, David; Ruppin, Eytan; Edelman, Shimon
2005-01-01
We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The adios (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics. PMID:16087885
Unsupervised learning of natural languages.
Solan, Zach; Horn, David; Ruppin, Eytan; Edelman, Shimon
2005-08-16
We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The adios (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics.
Dediu, Dan
2011-04-01
It is generally accepted that the relationship between human genes and language is very complex and multifaceted. This has its roots in the “regular” complexity governing the interplay among genes and between genes and environment for most phenotypes, but with the added layer of supraontogenetic and supra-individual processes defining culture. At the coarsest level, focusing on the species, it is clear that human-specific--but not necessarily faculty-specific--genetic factors subtend our capacity for language and a currently very productive research program is aiming at uncovering them. At the other end of the spectrum, it is uncontroversial that individual-level variations in different aspects related to speech and language have an important genetic component and their discovery and detailed characterization have already started to revolutionize the way we think about human nature. However, at the intermediate, glossogenetic/population level, the relationship becomes controversial, partly due to deeply ingrained beliefs about language acquisition and universality and partly because of confusions with a different type of gene-languages correlation due to shared history. Nevertheless, conceptual, mathematical and computational models--and, recently, experimental evidence from artificial languages and songbirds--have repeatedly shown that genetic biases affecting the acquisition or processing of aspects of language and speech can be amplified by population-level intergenerational cultural processes and made manifest either as fixed “universal” properties of language or as structured linguistic diversity. Here, I review several such models as well as the recently proposed case of a causal relationship between the distribution of tone languages and two genes related to brain growth and development, ASPM and Microcephalin, and I discuss the relevance of such genetic biasing for language evolution, change, and diversity.
The native-language benefit for talker identification is robust in 7.5-month-old infants.
Fecher, Natalie; Johnson, Elizabeth K
2018-04-26
Adults recognize talkers better when the talkers speak a familiar language than when they speak an unfamiliar language. This language familiarity effect (LFE) demonstrates the inseparable nature of linguistic and indexical information in adult spoken language processing. Relatively little is known about children's integration of linguistic and indexical information in speech. For example, to date, only one study has explored the LFE in infants. Here, we sought to better understand the maturation of speech processing abilities in infants by replicating this earlier study using a more stringent experimental design (eliminating a potential voice-language confound), a different test population (English- rather than Dutch-learning infants), and a new language pairing (English vs. Polish rather than Dutch vs. Italian or Japanese). Furthermore, we explored the language exposure conditions required for infants to develop an LFE for a formerly unfamiliar language. We hypothesized based on previous studies (including the perceptual narrowing literature) that infants might develop an LFE more readily than would adults. Although our findings replicate those of the earlier study-demonstrating that the LFE is robust in 7.5-month-olds-we found no evidence that infants need less language exposure than do adults to develop an LFE. We concluded that both infants and adults need extensive (potentially live) exposure to an unfamiliar language before talker identification in that language improves. Moreover, our study suggests that the LFE is likely rooted in early emerging phonology rather than shared lexical knowledge and that infants already closely resemble adults in their processing of linguistic and indexical information. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Reliable Electronic Text: The Elusive Prerequisite for a Host of Human Language Technologies
2010-09-30
is not always the case—for example, ligatures in Latin-fonts, and glyphs in Arabic fonts (King, 2008; Carrier, 2009). This complexity, and others...such effects can render electronic text useless for natural language processing ( NLP ). Typically, file converters do not expose the details of the...the many component NLP technologies typically used inside information extraction and text categorization applications, such as tokenization, part-of
Grammatical Analysis as a Distributed Neurobiological Function
Bozic, Mirjana; Fonteneau, Elisabeth; Su, Li; Marslen-Wilson, William D
2015-01-01
Language processing engages large-scale functional networks in both hemispheres. Although it is widely accepted that left perisylvian regions have a key role in supporting complex grammatical computations, patient data suggest that some aspects of grammatical processing could be supported bilaterally. We investigated the distribution and the nature of grammatical computations across language processing networks by comparing two types of combinatorial grammatical sequences—inflectionally complex words and minimal phrases—and contrasting them with grammatically simple words. Novel multivariate analyses revealed that they engage a coalition of separable subsystems: inflected forms triggered left-lateralized activation, dissociable into dorsal processes supporting morphophonological parsing and ventral, lexically driven morphosyntactic processes. In contrast, simple phrases activated a consistently bilateral pattern of temporal regions, overlapping with inflectional activations in L middle temporal gyrus. These data confirm the role of the left-lateralized frontotemporal network in supporting complex grammatical computations. Critically, they also point to the capacity of bilateral temporal regions to support simple, linear grammatical computations. This is consistent with a dual neurobiological framework where phylogenetically older bihemispheric systems form part of the network that supports language function in the modern human, and where significant capacities for language comprehension remain intact even following severe left hemisphere damage. PMID:25421880
Frequency effects in monolingual and bilingual natural reading.
Cop, Uschi; Keuleers, Emmanuel; Drieghe, Denis; Duyck, Wouter
2015-10-01
This paper presents the first systematic examination of the monolingual and bilingual frequency effect (FE) during natural reading. We analyzed single fixation durations on content words for participants reading an entire novel. Unbalanced bilinguals and monolinguals show a similarly sized FE in their mother tongue (L1), but for bilinguals the FE is considerably larger in their second language (L2) than in their L1. The FE in both L1 and L2 reading decreased with increasing L1 proficiency, but it was not affected by L2 proficiency. Our results are consistent with an account of bilingual language processing that assumes an integrated mental lexicon with exposure as the main determiner for lexical entrenchment. This means that no qualitative difference in language processing between monolingual, bilingual L1, or bilingual L2 is necessary to explain reading behavior. We present this account and argue that not all groups of bilinguals necessarily have lower L1 exposure than monolinguals do and, in line with Kuperman and Van Dyke (Journal of Experimental Psychology, 39 (3), 802-823, 2013), that individual vocabulary size and language exposure change the accuracy of the relative corpus word frequencies and thereby determine the size of the FEs in the same way for all participants.
Endara, Lorena; Cui, Hong; Burleigh, J Gordon
2018-03-01
Phenotypic data sets are necessary to elucidate the genealogy of life, but assembling phenotypic data for taxa across the tree of life can be technically challenging and prohibitively time consuming. We describe a semi-automated protocol to facilitate and expedite the assembly of phenotypic character matrices of plants from formal taxonomic descriptions. This pipeline uses new natural language processing (NLP) techniques and a glossary of over 9000 botanical terms. Our protocol includes the Explorer of Taxon Concepts (ETC), an online application that assembles taxon-by-character matrices from taxonomic descriptions, and MatrixConverter, a Java application that enables users to evaluate and discretize the characters extracted by ETC. We demonstrate this protocol using descriptions from Araucariaceae. The NLP pipeline unlocks the phenotypic data found in taxonomic descriptions and makes them usable for evolutionary analyses.
Data Discovery with IBM Watson
NASA Astrophysics Data System (ADS)
Fessler, J.
2016-12-01
BM Watson is a cognitive computing system that uses machine learning, statistical analysis, and natural language processing to find and understand the clues in questions posed to it. Watson was made famous when it bested two champions on TV's Jeopardy! show. Since then, Watson has evolved into a platform of cognitive services that can be trained on very granular fields up study. Watson is being used to support a number of subject domains, such as cancer research, public safety, engineering, and the intelligence community. IBM will be providing a presentation and demonstration on the Watson technology and will discuss its capabilities including Natural Language Processing, text analytics and enterprise search, as well as cognitive computing with deep Q&A. The team will also be giving examples of how IBM Watson technology is being used to support real-world problems across a number of public sector agencies
Karakülah, Gökhan; Dicle, Oğuz; Koşaner, Ozgün; Suner, Aslı; Birant, Çağdaş Can; Berber, Tolga; Canbek, Sezin
2014-01-01
The lack of laboratory tests for the diagnosis of most of the congenital anomalies renders the physical examination of the case crucial for the diagnosis of the anomaly; and the cases in the diagnostic phase are mostly being evaluated in the light of the literature knowledge. In this respect, for accurate diagnosis, ,it is of great importance to provide the decision maker with decision support by presenting the literature knowledge about a particular case. Here, we demonstrated a methodology for automated scanning and determining of the phenotypic features from the case reports related to congenital anomalies in the literature with text and natural language processing methods, and we created a framework of an information source for a potential diagnostic decision support system for congenital anomalies.
Toledo, Cíntia Matsuda; Cunha, Andre; Scarton, Carolina; Aluísio, Sandra
2014-01-01
Discourse production is an important aspect in the evaluation of brain-injured individuals. We believe that studies comparing the performance of brain-injured subjects with that of healthy controls must use groups with compatible education. A pioneering application of machine learning methods using Brazilian Portuguese for clinical purposes is described, highlighting education as an important variable in the Brazilian scenario. The aims were to describe how to:(i) develop machine learning classifiers using features generated by natural language processing tools to distinguish descriptions produced by healthy individuals into classes based on their years of education; and(ii) automatically identify the features that best distinguish the groups. The approach proposed here extracts linguistic features automatically from the written descriptions with the aid of two Natural Language Processing tools: Coh-Metrix-Port and AIC. It also includes nine task-specific features (three new ones, two extracted manually, besides description time; type of scene described - simple or complex; presentation order - which type of picture was described first; and age). In this study, the descriptions by 144 of the subjects studied in Toledo 18 were used,which included 200 healthy Brazilians of both genders. A Support Vector Machine (SVM) with a radial basis function (RBF) kernel is the most recommended approach for the binary classification of our data, classifying three of the four initial classes. CfsSubsetEval (CFS) is a strong candidate to replace manual feature selection methods.
Research in Knowledge Representation for Natural Language Understanding.
1984-09-01
TYPE OF REPORT & PERIOO COVERED RESEARCH IN KNOWLEDGE REPRESENTATION Annual Report FOR NATURAL LANGUAGE UNDERSTANDING 9/1/83 - 8/31/84 S. PERFORMING...nhaber) Artificial intelligence, natural language understanding , knowledge representation, semantics, semantic networks, KL-TWO, NIKL, belief and...attempting to understand and react to a complex, evolving situation. This report summarizes our research in knowledge representation and natural language
Language shift, bilingualism and the future of Britain's Celtic languages.
Kandler, Anne; Unger, Roman; Steele, James
2010-12-12
'Language shift' is the process whereby members of a community in which more than one language is spoken abandon their original vernacular language in favour of another. The historical shifts to English by Celtic language speakers of Britain and Ireland are particularly well-studied examples for which good census data exist for the most recent 100-120 years in many areas where Celtic languages were once the prevailing vernaculars. We model the dynamics of language shift as a competition process in which the numbers of speakers of each language (both monolingual and bilingual) vary as a function both of internal recruitment (as the net outcome of birth, death, immigration and emigration rates of native speakers), and of gains and losses owing to language shift. We examine two models: a basic model in which bilingualism is simply the transitional state for households moving between alternative monolingual states, and a diglossia model in which there is an additional demand for the endangered language as the preferred medium of communication in some restricted sociolinguistic domain, superimposed on the basic shift dynamics. Fitting our models to census data, we successfully reproduce the demographic trajectories of both languages over the past century. We estimate the rates of recruitment of new Scottish Gaelic speakers that would be required each year (for instance, through school education) to counteract the 'natural wastage' as households with one or more Gaelic speakers fail to transmit the language to the next generation informally, for different rates of loss during informal intergenerational transmission.
How Involved Are American L2 Learners of Spanish in Lexical Input Processing Tasks during Reading?
ERIC Educational Resources Information Center
Pulido, Diana
2009-01-01
This study examines the nature of the involvement load (Laufer & Hulstijn, 2001) in second language (L2) lexical input processing through reading by considering the effects of the reader-based factors of L2 reading proficiency and background knowledge. The lexical input processing aspects investigated were lexical inferencing (search), attentional…
On the Evolution of Human Language.
ERIC Educational Resources Information Center
Lieberman, Philip
Human linguistic ability depends, in part, on the gradual evolution of man's supralaryngeal vocal tract. The anatomic basis of human speech production is the result of a long evolutionary process in which the Darwinian process of natural selection acted to retain mutations. For auditory perception, the listener operates in terms of the acoustic…
The Linguistic and Embodied Nature of Conceptual Processing
ERIC Educational Resources Information Center
Louwerse, Max M.; Jeuniaux, Patrick
2010-01-01
Recent theories of cognition have argued that embodied experience is important for conceptual processing. Embodiment can be contrasted with linguistic factors such as the typical order in which words appear in language. Here, we report four experiments that investigated the conditions under which embodiment and linguistic factors determine…
A Simple Blueprint for Automatic Boolean Query Processing.
ERIC Educational Resources Information Center
Salton, G.
1988-01-01
Describes a new Boolean retrieval environment in which an extended soft Boolean logic is used to automatically construct queries from original natural language formulations provided by users. Experimental results that compare the retrieval effectiveness of this method to conventional Boolean and vector processing are discussed. (27 references)…
Processing Trade-Offs in Non-Native Learners' Performance of Narrative Tasks
ERIC Educational Resources Information Center
Ben Maad, Mohamed Ridha
2011-01-01
Exploring learners' processes of memory and analysis has captivated considerable attention among language-learning researchers due to the recent prevalence of key concepts from feeder disciplines such as cognitive psychology and phraseology. However, there has been little empirical effort to describe the nature of interaction between these two…
ERIC Educational Resources Information Center
Weber-Fox, Christine; Hart, Laura J.; Spruill, John E., III
2006-01-01
This study examined how school-aged children process different grammatical categories. Event-related brain potentials elicited by words in visually presented sentences were analyzed according to seven grammatical categories with naturally varying characteristics of linguistic functions, semantic features, and quantitative attributes of length and…
ERIC Educational Resources Information Center
Connor, Meryl
1992-01-01
An instructional technique that enlisted natural language processes by ensuring that grammatical markers made semantic sense was examined to determine its usefulness in helping adult Anglophone classroom learners to make more accurate online aspectual choices in past tense oral narrative. (32 references) (LB)
NASA Astrophysics Data System (ADS)
Stout, Dietrich
2016-03-01
Twenty-five years ago, Pinker and Bloom [1] helped reinvigorate research on language evolution by arguing that language ;shows signs of complex design for the communication of propositional structures, and the only explanation for the origin of organs with complex design is the process of natural selection.; Since then, empirical research has tested the assertions of (cross-cultural) universality, (cross-species) uniqueness, and (cross-domain) specificity underpinning this argument from design. Appearances aside, points of consensus have emerged. The existence of a core computational and neural substrate unique to language and/or humans is still debated, but it is widely agreed that: 1) human language performance overlaps with behaviors in other domains and species, and 2) such general, pre-existing capacities provided the context for language-specific evolution (e.g. [2]).
Programming languages for synthetic biology.
Umesh, P; Naveen, F; Rao, Chanchala Uma Maheswara; Nair, Achuthsankar S
2010-12-01
In the backdrop of accelerated efforts for creating synthetic organisms, the nature and scope of an ideal programming language for scripting synthetic organism in-silico has been receiving increasing attention. A few programming languages for synthetic biology capable of defining, constructing, networking, editing and delivering genome scale models of cellular processes have been recently attempted. All these represent important points in a spectrum of possibilities. This paper introduces Kera, a state of the art programming language for synthetic biology which is arguably ahead of similar languages or tools such as GEC, Antimony and GenoCAD. Kera is a full-fledged object oriented programming language which is tempered by biopart rule library named Samhita which captures the knowledge regarding the interaction of genome components and catalytic molecules. Prominent feature of the language are demonstrated through a toy example and the road map for the future development of Kera is also presented.
Linking language and categorization in infancy.
Ferguson, Brock; Waxman, Sandra
2017-05-01
Language exerts a powerful influence on our concepts. We review evidence documenting the developmental origins of a precocious link between language and object categories in very young infants. This collection of studies documents a cascading process in which early links between language and cognition provide the foundation for later, more precise ones. We propose that, early in life, language promotes categorization at least in part through its status as a social, communicative signal. But over the first year, infants home in on the referential power of language and, by their second year, begin teasing apart distinct kinds of names (e.g. nouns, adjectives) and their relation to distinct kinds of concepts (e.g. object categories, properties). To complement this proposal, we also relate this evidence to several alternative accounts of language's effect on categorization, appealing to similarity ('labels-as-features'), familiarity ('auditory overshadowing'), and communicative biases ('natural pedagogy').
Using hybridization networks to retrace the evolution of Indo-European languages.
Willems, Matthieu; Lord, Etienne; Laforest, Louise; Labelle, Gilbert; Lapointe, François-Joseph; Di Sciullo, Anna Maria; Makarenkov, Vladimir
2016-09-06
Curious parallels between the processes of species and language evolution have been observed by many researchers. Retracing the evolution of Indo-European (IE) languages remains one of the most intriguing intellectual challenges in historical linguistics. Most of the IE language studies use the traditional phylogenetic tree model to represent the evolution of natural languages, thus not taking into account reticulate evolutionary events, such as language hybridization and word borrowing which can be associated with species hybridization and horizontal gene transfer, respectively. More recently, implicit evolutionary networks, such as split graphs and minimal lateral networks, have been used to account for reticulate evolution in linguistics. Striking parallels existing between the evolution of species and natural languages allowed us to apply three computational biology methods for reconstruction of phylogenetic networks to model the evolution of IE languages. We show how the transfer of methods between the two disciplines can be achieved, making necessary methodological adaptations. Considering basic vocabulary data from the well-known Dyen's lexical database, which contains word forms in 84 IE languages for the meanings of a 200-meaning Swadesh list, we adapt a recently developed computational biology algorithm for building explicit hybridization networks to study the evolution of IE languages and compare our findings to the results provided by the split graph and galled network methods. We conclude that explicit phylogenetic networks can be successfully used to identify donors and recipients of lexical material as well as the degree of influence of each donor language on the corresponding recipient languages. We show that our algorithm is well suited to detect reticulate relationships among languages, and present some historical and linguistic justification for the results obtained. Our findings could be further refined if relevant syntactic, phonological and morphological data could be analyzed along with the available lexical data.
The Occurrence and the Success Rate of Self-Initiated Self-Repair
ERIC Educational Resources Information Center
Sato, Rintaro; Takatsuka, Shigenobu
2016-01-01
Errors naturally appear in spontaneous speeches and conversations. Particularly in a second or foreign language, it is only natural that mistakes happen as a part of the learning process. After an inappropriate expression is detected, it can be corrected. This act of correcting can be initiated either by the speaker (non-native speaker) or the…
White matter structure changes as adults learn a second language.
Schlegel, Alexander A; Rudelson, Justin J; Tse, Peter U
2012-08-01
Traditional models hold that the plastic reorganization of brain structures occurs mainly during childhood and adolescence, leaving adults with limited means to learn new knowledge and skills. Research within the last decade has begun to overturn this belief, documenting changes in the brain's gray and white matter as healthy adults learn simple motor and cognitive skills [Lövdén, M., Bodammer, N. C., Kühn, S., Kaufmann, J., Schütze, H., Tempelmann, C., et al. Experience-dependent plasticity of white-matter microstructure extends into old age. Neuropsychologia, 48, 3878-3883, 2010; Taubert, M., Draganski, B., Anwander, A., Müller, K., Horstmann, A., Villringer, A., et al. Dynamic properties of human brain structure: Learning-related changes in cortical areas and associated fiber connections. The Journal of Neuroscience, 30, 11670-11677, 2010; Scholz, J., Klein, M. C., Behrens, T. E. J., & Johansen-Berg, H. Training induces changes in white-matter architecture. Nature Neuroscience, 12, 1370-1371, 2009; Draganski, B., Gaser, C., Busch, V., Schuirer, G., Bogdahn, U., & May, A. Changes in grey matter induced by training. Nature, 427, 311-312, 2004]. Although the significance of these changes is not fully understood, they reveal a brain that remains plastic well beyond early developmental periods. Here we investigate the role of adult structural plasticity in the complex, long-term learning process of foreign language acquisition. We collected monthly diffusion tensor imaging scans of 11 English speakers who took a 9-month intensive course in written and spoken Modern Standard Chinese as well as from 16 control participants who did not study a language. We show that white matter reorganizes progressively across multiple sites as adults study a new language. Language learners exhibited progressive changes in white matter tracts associated with traditional left hemisphere language areas and their right hemisphere analogs. Surprisingly, the most significant changes occurred in frontal lobe tracts crossing the genu of the corpus callosum-a region not generally included in current neural models of language processing. These results indicate that plasticity of white matter plays an important role in adult language learning and additionally demonstrate the potential of longitudinal diffusion tensor imaging as a new tool to yield insights into cognitive processes.
Unity and diversity in human language
Fitch, W. Tecumseh
2011-01-01
Human language is both highly diverse—different languages have different ways of achieving the same functional goals—and easily learnable. Any language allows its users to express virtually any thought they can conceptualize. These traits render human language unique in the biological world. Understanding the biological basis of language is thus both extremely challenging and fundamentally interesting. I review the literature on linguistic diversity and language universals, suggesting that an adequate notion of ‘formal universals’ provides a promising way to understand the facts of language acquisition, offering order in the face of the diversity of human languages. Formal universals are cross-linguistic generalizations, often of an abstract or implicational nature. They derive from cognitive capacities to perceive and process particular types of structures and biological constraints upon integration of the multiple systems involved in language. Such formal universals can be understood on the model of a general solution to a set of differential equations; each language is one particular solution. An explicit formal conception of human language that embraces both considerable diversity and underlying biological unity is possible, and fully compatible with modern evolutionary theory. PMID:21199842
Understanding a technical language: A schema-based approach
NASA Technical Reports Server (NTRS)
Falzon, P.
1984-01-01
Workers in many job categories tend to develop technical languages, which are restricted subjects of natural language. A better knowledge of these retrictions provides guidelines for the design of the restricted languages of interactive systems. Accordingly, a technical language used by air-traffic controllers in their communications with pilots was studied. A method of analysis is presented that allows the schemata underlying each category of messages to be identified. This schematic knowledge was implemented in programs, which assume that the goal-oriented aspect of technical languages (and particularly the restricted domain of discourse) limits the processes and the data necessary in order to understand the messages (monosemy, limited vocabulary, evocation of the schemata by some command words, absence of syntax). The programs can interpret, and translate into sequences of action, the messages emitted by the controllers.
Intelligent CAI: An Author Aid for a Natural Language Interface.
ERIC Educational Resources Information Center
Burton, Richard R.; Brown, John Seely
This report addresses the problems of using natural language (English) as the communication language for advanced computer-based instructional systems. The instructional environment places requirements on a natural language understanding system that exceed the capabilities of all existing systems, including: (1) efficiency, (2) habitability, (3)…
Hierarchical processing in music, language, and action: Lashley revisited.
Fitch, W Tecumseh; Martins, Mauricio D
2014-05-01
Sixty years ago, Karl Lashley suggested that complex action sequences, from simple motor acts to language and music, are a fundamental but neglected aspect of neural function. Lashley demonstrated the inadequacy of then-standard models of associative chaining, positing a more flexible and generalized "syntax of action" necessary to encompass key aspects of language and music. He suggested that hierarchy in language and music builds upon a more basic sequential action system, and provided several concrete hypotheses about the nature of this system. Here, we review a diverse set of modern data concerning musical, linguistic, and other action processing, finding them largely consistent with an updated neuroanatomical version of Lashley's hypotheses. In particular, the lateral premotor cortex, including Broca's area, plays important roles in hierarchical processing in language, music, and at least some action sequences. Although the precise computational function of the lateral prefrontal regions in action syntax remains debated, Lashley's notion-that this cortical region implements a working-memory buffer or stack scannable by posterior and subcortical brain regions-is consistent with considerable experimental data. © 2014 The Authors. Annals of the New York Academy of Sciences published by Wiley Periodicals Inc. on behalf of The New York Academy of Sciences.
Processing of speech and non-speech stimuli in children with specific language impairment
NASA Astrophysics Data System (ADS)
Basu, Madhavi L.; Surprenant, Aimee M.
2003-10-01
Specific Language Impairment (SLI) is a developmental language disorder in which children demonstrate varying degrees of difficulties in acquiring a spoken language. One possible underlying cause is that children with SLI have deficits in processing sounds that are of short duration or when they are presented rapidly. Studies so far have compared their performance on speech and nonspeech sounds of unequal complexity. Hence, it is still unclear whether the deficit is specific to the perception of speech sounds or whether it more generally affects the auditory function. The current study aims to answer this question by comparing the performance of children with SLI on speech and nonspeech sounds synthesized from sine-wave stimuli. The children will be tested using the classic categorical perception paradigm that includes both the identification and discrimination of stimuli along a continuum. If there is a deficit in the performance on both speech and nonspeech tasks, it will show that these children have a deficit in processing complex sounds. Poor performance on only the speech sounds will indicate that the deficit is more related to language. The findings will offer insights into the exact nature of the speech perception deficits in children with SLI. [Work supported by ASHF.
Iconicity as structure mapping
Emmorey, Karen
2014-01-01
Linguistic and psycholinguistic evidence is presented to support the use of structure-mapping theory as a framework for understanding effects of iconicity on sign language grammar and processing. The existence of structured mappings between phonological form and semantic mental representations has been shown to explain the nature of metaphor and pronominal anaphora in sign languages. With respect to processing, it is argued that psycholinguistic effects of iconicity may only be observed when the task specifically taps into such structured mappings. In addition, language acquisition effects may only be observed when the relevant cognitive abilities are in place (e.g. the ability to make structural comparisons) and when the relevant conceptual knowledge has been acquired (i.e. information key to processing the iconic mapping). Finally, it is suggested that iconicity is better understood as a structured mapping between two mental representations than as a link between linguistic form and human experience. PMID:25092669
Steele, James; Ferrari, Pier Francesco; Fogassi, Leonardo
2012-01-01
The papers in this Special Issue examine tool use and manual gestures in primates as a window on the evolution of the human capacity for language. Neurophysiological research has supported the hypothesis of a close association between some aspects of human action organization and of language representation, in both phonology and semantics. Tool use provides an excellent experimental context to investigate analogies between action organization and linguistic syntax. Contributors report and contextualize experimental evidence from monkeys, great apes, humans and fossil hominins, and consider the nature and the extent of overlaps between the neural representations of tool use, manual gestures and linguistic processes. PMID:22106422
LSTM-CRF | Informatics Technology for Cancer Research (ITCR)
LSTM-CRF uses Natural Language Processing methods for detecting Adverse Drug Events, Drugname, Indication and other medically relevant information from Electronic Health Records. It implements Recurrent Neural Networks using several CRF based inference methods.
Automatic reconstruction of a bacterial regulatory network using Natural Language Processing
Rodríguez-Penagos, Carlos; Salgado, Heladia; Martínez-Flores, Irma; Collado-Vides, Julio
2007-01-01
Background Manual curation of biological databases, an expensive and labor-intensive process, is essential for high quality integrated data. In this paper we report the implementation of a state-of-the-art Natural Language Processing system that creates computer-readable networks of regulatory interactions directly from different collections of abstracts and full-text papers. Our major aim is to understand how automatic annotation using Text-Mining techniques can complement manual curation of biological databases. We implemented a rule-based system to generate networks from different sets of documents dealing with regulation in Escherichia coli K-12. Results Performance evaluation is based on the most comprehensive transcriptional regulation database for any organism, the manually-curated RegulonDB, 45% of which we were able to recreate automatically. From our automated analysis we were also able to find some new interactions from papers not already curated, or that were missed in the manual filtering and review of the literature. We also put forward a novel Regulatory Interaction Markup Language better suited than SBML for simultaneously representing data of interest for biologists and text miners. Conclusion Manual curation of the output of automatic processing of text is a good way to complement a more detailed review of the literature, either for validating the results of what has been already annotated, or for discovering facts and information that might have been overlooked at the triage or curation stages. PMID:17683642
Conceptual Complexity and Apparent Contradictions in Mathematics Language
ERIC Educational Resources Information Center
Gough, John
2007-01-01
Mathematics is like a language, although technically it is not a natural or informal human language, but a formal, that is, artificially constructed language. Importantly, educators use their natural everyday language to teach the formal language of mathematics. At times, however, instructors encounter problems when the technical words they use,…
Ivanova, Maria V; Isaev, Dmitry Yu; Dragoy, Olga V; Akinina, Yulia S; Petrushevskiy, Alexey G; Fedina, Oksana N; Shklovsky, Victor M; Dronkers, Nina F
2016-12-01
A growing literature is pointing towards the importance of white matter tracts in understanding the neural mechanisms of language processing, and determining the nature of language deficits and recovery patterns in aphasia. Measurements extracted from diffusion-weighted (DW) images provide comprehensive in vivo measures of local microstructural properties of fiber pathways. In the current study, we compared microstructural properties of major white matter tracts implicated in language processing in each hemisphere (these included arcuate fasciculus (AF), superior longitudinal fasciculus (SLF), inferior longitudinal fasciculus (ILF), inferior frontal-occipital fasciculus (IFOF), uncinate fasciculus (UF), and corpus callosum (CC), and corticospinal tract (CST) for control purposes) between individuals with aphasia and healthy controls and investigated the relationship between these neural indices and language deficits. Thirty-seven individuals with aphasia due to left hemisphere stroke and eleven age-matched controls were scanned using DW imaging sequences. Fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RD), axial diffusivity (AD) values for each major white matter tract were extracted from DW images using tract masks chosen from standardized atlases. Individuals with aphasia were also assessed with a standardized language test in Russian targeting comprehension and production at the word and sentence level. Individuals with aphasia had significantly lower FA values for left hemisphere tracts and significantly higher values of MD, RD and AD for both left and right hemisphere tracts compared to controls, all indicating profound impairment in tract integrity. Language comprehension was predominantly related to integrity of the left IFOF and left ILF, while language production was mainly related to integrity of the left AF. In addition, individual segments of these three tracts were differentially associated with language production and comprehension in aphasia. Our findings highlight the importance of fiber pathways in supporting different language functions and point to the importance of temporal tracts in language processing, in particular, comprehension. Copyright © 2016 Elsevier Ltd. All rights reserved.
Morrison, Frances P; Li, Li; Lai, Albert M; Hripcsak, George
2009-01-01
Electronic clinical documentation can be useful for activities such as public health surveillance, quality improvement, and research, but existing methods of de-identification may not provide sufficient protection of patient data. The general-purpose natural language processor MedLEE retains medical concepts while excluding the remaining text so, in addition to processing text into structured data, it may be able provide a secondary benefit of de-identification. Without modifying the system, the authors tested the ability of MedLEE to remove protected health information (PHI) by comparing 100 outpatient clinical notes with the corresponding XML-tagged output. Of 809 instances of PHI, 26 (3.2%) were detected in output as a result of processing and identification errors. However, PHI in the output was highly transformed, much appearing as normalized terms for medical concepts, potentially making re-identification more difficult. The MedLEE processor may be a good enhancement to other de-identification systems, both removing PHI and providing coded data from clinical text.
Semantic based man-machine interface for real-time communication
NASA Technical Reports Server (NTRS)
Ali, M.; Ai, C.-S.
1988-01-01
A flight expert system (FLES) was developed to assist pilots in monitoring, diagnosing and recovering from in-flight faults. To provide a communications interface between the flight crew and FLES, a natural language interface (NALI) was implemented. Input to NALI is processed by three processors: (1) the semantics parser; (2) the knowledge retriever; and (3) the response generator. First the semantic parser extracts meaningful words and phrases to generate an internal representation of the query. At this point, the semantic parser has the ability to map different input forms related to the same concept into the same internal representation. Then the knowledge retriever analyzes and stores the context of the query to aid in resolving ellipses and pronoun references. At the end of this process, a sequence of retrievel functions is created as a first step in generating the proper response. Finally, the response generator generates the natural language response to the query. The architecture of NALI was designed to process both temporal and nontemporal queries. The architecture and implementation of NALI are described.
Rabovsky, Milena; Schad, Daniel J; Abdel Rahman, Rasha
2016-01-01
Communicating meaningful messages is the ultimate goal of language production. Yet, verbal messages can differ widely in the complexity and richness of their semantic content, and such differences should strongly modulate conceptual and lexical encoding processes during speech planning. However, despite the crucial role of semantic content in language production, the influence of this variability is currently unclear. Here, we investigate influences of the number of associated semantic features and intercorrelational feature density on language production during picture naming. While the number of semantic features facilitated naming, intercorrelational feature density inhibited naming. Both effects follow naturally from the assumption of conceptual facilitation and simultaneous lexical competition. They are difficult to accommodate with language production theories dismissing lexical competition. Copyright © 2015 Elsevier B.V. All rights reserved.
Divergence Measures Tool:An Introduction with Brief Tutorial
2014-03-01
in detecting differences across a wide range of Arabic -language text files (they varied by genre, domain, spelling variation, size, etc.), our...other. 2 These measures have been put to many uses in natural language processing ( NLP ). In the evaluation of machine translation (MT...files uploaded into the tool must be .txt files in ASCII or UTF-8 format. • This tool has been tested on English and Arabic script**, but should
Intelligent Agents as a Basis for Natural Language Interfaces
1988-01-01
language analysis component of UC, which produces a semantic representa tion of the input. This representation is in the form of a KODIAK network (see...Appendix A). Next, UC’s Concretion Mechanism performs concretion inferences ([Wilensky, 1983] and [Norvig, 1983]) based on the semantic network...The first step in UC’s processing is done by UC’s parser/understander component which produces a KODIAK semantic network representa tion of
Big Data Quality Case Study Preliminary Findings, U.S. Army MEDCOM MODS
2013-09-01
captured in electronic form is relatively small, on the order of hundreds of thousands of health profiles at say around 500K per profile, or in the...in electronic form, then different language identification, handwriting recognition, and Natural Language Processing (NLP) techniques could be used...and patterns” [15]. Volume - The free text fields vary in length from say ten characters to several hundred characters. Other materials can be much
ERIC Educational Resources Information Center
Le Page, R. B.
A discussion on the nature of language argues the following: (1) the concept of a closed and finite rule system is inadequate for the description of natural languages; (2) as a consequence, the writing of variable rules to modify such rule systems so as to accommodate the properties of natural language is inappropriate; (3) the concept of such…
Gradient language dominance affects talker learning.
Bregman, Micah R; Creel, Sarah C
2014-01-01
Traditional conceptions of spoken language assume that speech recognition and talker identification are computed separately. Neuropsychological and neuroimaging studies imply some separation between the two faculties, but recent perceptual studies suggest better talker recognition in familiar languages than unfamiliar languages. A familiar-language benefit in talker recognition potentially implies strong ties between the two domains. However, little is known about the nature of this language familiarity effect. The current study investigated the relationship between speech and talker processing by assessing bilingual and monolingual listeners' ability to learn voices as a function of language familiarity and age of acquisition. Two effects emerged. First, bilinguals learned to recognize talkers in their first language (Korean) more rapidly than they learned to recognize talkers in their second language (English), while English-speaking participants showed the opposite pattern (learning English talkers faster than Korean talkers). Second, bilinguals' learning rate for talkers in their second language (English) correlated with age of English acquisition. Taken together, these results suggest that language background materially affects talker encoding, implying a tight relationship between speech and talker representations. Copyright © 2013 Elsevier B.V. All rights reserved.
A UMLS-based spell checker for natural language processing in vaccine safety.
Tolentino, Herman D; Matters, Michael D; Walop, Wikke; Law, Barbara; Tong, Wesley; Liu, Fang; Fontelo, Paul; Kohl, Katrin; Payne, Daniel C
2007-02-12
The Institute of Medicine has identified patient safety as a key goal for health care in the United States. Detecting vaccine adverse events is an important public health activity that contributes to patient safety. Reports about adverse events following immunization (AEFI) from surveillance systems contain free-text components that can be analyzed using natural language processing. To extract Unified Medical Language System (UMLS) concepts from free text and classify AEFI reports based on concepts they contain, we first needed to clean the text by expanding abbreviations and shortcuts and correcting spelling errors. Our objective in this paper was to create a UMLS-based spelling error correction tool as a first step in the natural language processing (NLP) pipeline for AEFI reports. We developed spell checking algorithms using open source tools. We used de-identified AEFI surveillance reports to create free-text data sets for analysis. After expansion of abbreviated clinical terms and shortcuts, we performed spelling correction in four steps: (1) error detection, (2) word list generation, (3) word list disambiguation and (4) error correction. We then measured the performance of the resulting spell checker by comparing it to manual correction. We used 12,056 words to train the spell checker and tested its performance on 8,131 words. During testing, sensitivity, specificity, and positive predictive value (PPV) for the spell checker were 74% (95% CI: 74-75), 100% (95% CI: 100-100), and 47% (95% CI: 46%-48%), respectively. We created a prototype spell checker that can be used to process AEFI reports. We used the UMLS Specialist Lexicon as the primary source of dictionary terms and the WordNet lexicon as a secondary source. We used the UMLS as a domain-specific source of dictionary terms to compare potentially misspelled words in the corpus. The prototype sensitivity was comparable to currently available tools, but the specificity was much superior. The slow processing speed may be improved by trimming it down to the most useful component algorithms. Other investigators may find the methods we developed useful for cleaning text using lexicons specific to their area of interest.
A UMLS-based spell checker for natural language processing in vaccine safety
Tolentino, Herman D; Matters, Michael D; Walop, Wikke; Law, Barbara; Tong, Wesley; Liu, Fang; Fontelo, Paul; Kohl, Katrin; Payne, Daniel C
2007-01-01
Background The Institute of Medicine has identified patient safety as a key goal for health care in the United States. Detecting vaccine adverse events is an important public health activity that contributes to patient safety. Reports about adverse events following immunization (AEFI) from surveillance systems contain free-text components that can be analyzed using natural language processing. To extract Unified Medical Language System (UMLS) concepts from free text and classify AEFI reports based on concepts they contain, we first needed to clean the text by expanding abbreviations and shortcuts and correcting spelling errors. Our objective in this paper was to create a UMLS-based spelling error correction tool as a first step in the natural language processing (NLP) pipeline for AEFI reports. Methods We developed spell checking algorithms using open source tools. We used de-identified AEFI surveillance reports to create free-text data sets for analysis. After expansion of abbreviated clinical terms and shortcuts, we performed spelling correction in four steps: (1) error detection, (2) word list generation, (3) word list disambiguation and (4) error correction. We then measured the performance of the resulting spell checker by comparing it to manual correction. Results We used 12,056 words to train the spell checker and tested its performance on 8,131 words. During testing, sensitivity, specificity, and positive predictive value (PPV) for the spell checker were 74% (95% CI: 74–75), 100% (95% CI: 100–100), and 47% (95% CI: 46%–48%), respectively. Conclusion We created a prototype spell checker that can be used to process AEFI reports. We used the UMLS Specialist Lexicon as the primary source of dictionary terms and the WordNet lexicon as a secondary source. We used the UMLS as a domain-specific source of dictionary terms to compare potentially misspelled words in the corpus. The prototype sensitivity was comparable to currently available tools, but the specificity was much superior. The slow processing speed may be improved by trimming it down to the most useful component algorithms. Other investigators may find the methods we developed useful for cleaning text using lexicons specific to their area of interest. PMID:17295907
Expressing Biomedical Ontologies in Natural Language for Expert Evaluation.
Amith, Muhammad; Manion, Frank J; Harris, Marcelline R; Zhang, Yaoyun; Xu, Hua; Tao, Cui
2017-01-01
We report on a study of our custom Hootation software for the purposes of assessing its ability to produce clear and accurate natural language phrases from axioms embedded in three biomedical ontologies. Using multiple domain experts and three discrete rating scales, we evaluated the tool on clarity of the natural language produced, fidelity of the natural language produced from the ontology to the axiom, and the fidelity of the domain knowledge represented by the axioms. Results show that Hootation provided relatively clear natural language equivalents for a select set of OWL axioms, although the clarity of statements hinges on the accuracy and representation of axioms in the ontology.
A critical and interpretive literature review of birthing women's non-elicited pain language.
Power, Stephanie; Bogossian, Fiona E; Sussex, Roland; Strong, Jenny
2017-10-01
Standardised pain assessment i.e. the McGill Pain Questionnaire provide an elicited pain language. Midwives observe spontaneous non-elicited pain language to guide their assessment of how a woman is coping with labour. This paper examined the labour pain experience using the questions: What type of pain language do women use? Do any of the words match the descriptors of standardised pain assessments? What type of information doverbal and non-verbal cues provide to the midwife? A literature search was conducted in 2013. Studies were included if they had pain as the primary outcome and examined non-elicited pain language from the maternal perspective. A total of 12 articles were included. The analysis revealed six categories in which labour pain can be viewed: 'positive', 'negative', 'physical', 'emotional', 'transcendent' and 'natural'. Women's language comprised i.e. prefixes and suffixes, which indicate the qualities of pain, and figurative language. Language indicated location of pain, gave insight into other life phenomena i.e. death, and shared similarities with standardised pain assessmentdescriptors. Labour cues were 'functional', 'dysfunctional,' or 'neutral' (part of the physiological childbirth process), and were verbal, non-verbal, emotional, psychological, physical behaviour or reactions, or tactile. Labour can bring about a spectrum of sensations and therefore emotions from happiness and pleasure to suffering and grief. Spontaneous pain language comprises verbal language and non-verbal behaviour. Narratives are an effective form of pain communication in that they provide details regarding the quality, nature and dimensions of pain, and details notcaptured in quantitative data. Copyright © 2017 Australian College of Midwives. Published by Elsevier Ltd. All rights reserved.
Incorporating advanced language models into the P300 speller using particle filtering
NASA Astrophysics Data System (ADS)
Speier, W.; Arnold, C. W.; Deshpande, A.; Knall, J.; Pouratian, N.
2015-08-01
Objective. The P300 speller is a common brain-computer interface (BCI) application designed to communicate language by detecting event related potentials in a subject’s electroencephalogram signal. Information about the structure of natural language can be valuable for BCI communication, but attempts to use this information have thus far been limited to rudimentary n-gram models. While more sophisticated language models are prevalent in natural language processing literature, current BCI analysis methods based on dynamic programming cannot handle their complexity. Approach. Sampling methods can overcome this complexity by estimating the posterior distribution without searching the entire state space of the model. In this study, we implement sequential importance resampling, a commonly used particle filtering (PF) algorithm, to integrate a probabilistic automaton language model. Main result. This method was first evaluated offline on a dataset of 15 healthy subjects, which showed significant increases in speed and accuracy when compared to standard classification methods as well as a recently published approach using a hidden Markov model (HMM). An online pilot study verified these results as the average speed and accuracy achieved using the PF method was significantly higher than that using the HMM method. Significance. These findings strongly support the integration of domain-specific knowledge into BCI classification to improve system performance.
ERIC Educational Resources Information Center
Alfonseca, Enrique; Rodriguez, Pilar; Perez, Diana
2007-01-01
This work describes a framework that combines techniques from Adaptive Hypermedia and Natural Language processing in order to create, in a fully automated way, on-line information systems from linear texts in electronic format, such as textbooks. The process is divided into two steps: an "off-line" processing step, which analyses the source text,…
SEMCARE: Multilingual Semantic Search in Semi-Structured Clinical Data.
López-García, Pablo; Kreuzthaler, Markus; Schulz, Stefan; Scherr, Daniel; Daumke, Philipp; Markó, Kornél; Kors, Jan A; van Mulligen, Erik M; Wang, Xinkai; Gonna, Hanney; Behr, Elijah; Honrado, Ángel
2016-01-01
The vast amount of clinical data in electronic health records constitutes a great potential for secondary use. However, most of this content consists of unstructured or semi-structured texts, which is difficult to process. Several challenges are still pending: medical language idiosyncrasies in different natural languages, and the large variety of medical terminology systems. In this paper we present SEMCARE, a European initiative designed to minimize these problems by providing a multi-lingual platform (English, German, and Dutch) that allows users to express complex queries and obtain relevant search results from clinical texts. SEMCARE is based on a selection of adapted biomedical terminologies, together with Apache UIMA and Apache Solr as open source state-of-the-art natural language pipeline and indexing technologies. SEMCARE has been deployed and is currently being tested at three medical institutions in the UK, Austria, and the Netherlands, showing promising results in a cardiology use case.
Robust Resilience of the Frontotemporal Syntax System to Aging
Samu, Dávid; Davis, Simon W.; Geerligs, Linda; Mustafa, Abdur; Tyler, Lorraine K.
2016-01-01
Brain function is thought to become less specialized with age. However, this view is largely based on findings of increased activation during tasks that fail to separate task-related processes (e.g., attention, decision making) from the cognitive process under examination. Here we take a systems-level approach to separate processes specific to language comprehension from those related to general task demands and to examine age differences in functional connectivity both within and between those systems. A large population-based sample (N = 111; 22–87 years) from the Cambridge Centre for Aging and Neuroscience (Cam-CAN) was scanned using functional MRI during two versions of an experiment: a natural listening version in which participants simply listened to spoken sentences and an explicit task version in which they rated the acceptability of the same sentences. Independent components analysis across the combined data from both versions showed that although task-free language comprehension activates only the auditory and frontotemporal (FTN) syntax networks, performing a simple task with the same sentences recruits several additional networks. Remarkably, functionality of the critical FTN is maintained across age groups, showing no difference in within-network connectivity or responsivity to syntactic processing demands despite gray matter loss and reduced connectivity to task-related networks. We found no evidence for reduced specialization or compensation with age. Overt task performance was maintained across the lifespan and performance in older, but not younger, adults related to crystallized knowledge, suggesting that decreased between-network connectivity may be compensated for by older adults' richer knowledge base. SIGNIFICANCE STATEMENT Understanding spoken language requires the rapid integration of information at many different levels of analysis. Given the complexity and speed of this process, it is remarkably well preserved with age. Although previous work claims that this preserved functionality is due to compensatory activation of regions outside the frontotemporal language network, we use a novel systems-level approach to show that these “compensatory” activations simply reflect age differences in response to experimental task demands. Natural, task-free language comprehension solely recruits auditory and frontotemporal networks, the latter of which is similarly responsive to language-processing demands across the lifespan. These findings challenge the conventional approach to neurocognitive aging by showing that the neural underpinnings of a given cognitive function depend on how you test it. PMID:27170120
Neural Tuning to Low-Level Features of Speech throughout the Perisylvian Cortex.
Berezutskaya, Julia; Freudenburg, Zachary V; Güçlü, Umut; van Gerven, Marcel A J; Ramsey, Nick F
2017-08-16
Despite a large body of research, we continue to lack a detailed account of how auditory processing of continuous speech unfolds in the human brain. Previous research showed the propagation of low-level acoustic features of speech from posterior superior temporal gyrus toward anterior superior temporal gyrus in the human brain (Hullett et al., 2016). In this study, we investigate what happens to these neural representations past the superior temporal gyrus and how they engage higher-level language processing areas such as inferior frontal gyrus. We used low-level sound features to model neural responses to speech outside of the primary auditory cortex. Two complementary imaging techniques were used with human participants (both males and females): electrocorticography (ECoG) and fMRI. Both imaging techniques showed tuning of the perisylvian cortex to low-level speech features. With ECoG, we found evidence of propagation of the temporal features of speech sounds along the ventral pathway of language processing in the brain toward inferior frontal gyrus. Increasingly coarse temporal features of speech spreading from posterior superior temporal cortex toward inferior frontal gyrus were associated with linguistic features such as voice onset time, duration of the formant transitions, and phoneme, syllable, and word boundaries. The present findings provide the groundwork for a comprehensive bottom-up account of speech comprehension in the human brain. SIGNIFICANCE STATEMENT We know that, during natural speech comprehension, a broad network of perisylvian cortical regions is involved in sound and language processing. Here, we investigated the tuning to low-level sound features within these regions using neural responses to a short feature film. We also looked at whether the tuning organization along these brain regions showed any parallel to the hierarchy of language structures in continuous speech. Our results show that low-level speech features propagate throughout the perisylvian cortex and potentially contribute to the emergence of "coarse" speech representations in inferior frontal gyrus typically associated with high-level language processing. These findings add to the previous work on auditory processing and underline a distinctive role of inferior frontal gyrus in natural speech comprehension. Copyright © 2017 the authors 0270-6474/17/377906-15$15.00/0.
Multiple Theory Formation in High-Level Perception. Technical Report No. 38.
ERIC Educational Resources Information Center
Woods, William A.
This paper is concerned with the process of human reading as a high-level perceptual task. Drawing on insights from artificial-intelligence research--specifically, research in natural language processing and continuous speech understanding--the paper attempts to present a fairly concrete picture of the kinds of hypothesis formation and inference…
Linguistic Complexity and Information Structure in Korean: Evidence from Eye-Tracking during Reading
ERIC Educational Resources Information Center
Lee, Yoonhyoung; Lee, Hanjung; Gordon, Peter C.
2007-01-01
The nature of the memory processes that support language comprehension and the manner in which information packaging influences online sentence processing were investigated in three experiments that used eye-tracking during reading to measure the ease of understanding complex sentences in Korean. All three experiments examined reading of embedded…
Computer Simulation of Reading.
ERIC Educational Resources Information Center
Leton, Donald A.
In recent years, coding and decoding have been claimed to be the processes for converting one language form to another. But there has been little effort to locate these processes in the human learner or to identify the nature of the internal codes. Computer simulation of reading is useful because the similarities in the human reception and…
ERIC Educational Resources Information Center
Proceedings of the ASIS Annual Meeting, 1994
1994-01-01
Includes abstracts of 18 special interest group (SIG) sessions. Highlights include natural language processing, information science and terminology science, classification, knowledge-intensive information systems, information value and ownership issues, economics and theories of information science, information retrieval interfaces, fuzzy thinking…
What Artificial Intelligence Is Doing for Training.
ERIC Educational Resources Information Center
Kirrane, Peter R.; Kirrane, Diane E.
1989-01-01
Discusses the three areas of research and application of artificial intelligence: (1) robotics, (2) natural language processing, and (3) knowledge-based or expert systems. Focuses on what expert systems can do, especially in the area of training. (JOW)
On application of image analysis and natural language processing for music search
NASA Astrophysics Data System (ADS)
Gwardys, Grzegorz
2013-10-01
In this paper, I investigate a problem of finding most similar music tracks using, popular in Natural Language Processing, techniques like: TF-IDF and LDA. I de ned document as music track. Each music track is transformed to spectrogram, thanks that, I can use well known techniques to get words from images. I used SURF operation to detect characteristic points and novel approach for their description. The standard kmeans was used for clusterization. Clusterization is here identical with dictionary making, so after that I can transform spectrograms to text documents and perform TF-IDF and LDA. At the final, I can make a query in an obtained vector space. The research was done on 16 music tracks for training and 336 for testing, that are splitted in four categories: Hiphop, Jazz, Metal and Pop. Although used technique is completely unsupervised, results are satisfactory and encouraging to further research.
Payne, Philip R O; Kwok, Alan; Dhaval, Rakesh; Borlawsky, Tara B
2009-03-01
The conduct of large-scale translational studies presents significant challenges related to the storage, management and analysis of integrative data sets. Ideally, the application of methodologies such as conceptual knowledge discovery in databases (CKDD) provides a means for moving beyond intuitive hypothesis discovery and testing in such data sets, and towards the high-throughput generation and evaluation of knowledge-anchored relationships between complex bio-molecular and phenotypic variables. However, the induction of such high-throughput hypotheses is non-trivial, and requires correspondingly high-throughput validation methodologies. In this manuscript, we describe an evaluation of the efficacy of a natural language processing-based approach to validating such hypotheses. As part of this evaluation, we will examine a phenomenon that we have labeled as "Conceptual Dissonance" in which conceptual knowledge derived from two or more sources of comparable scope and granularity cannot be readily integrated or compared using conventional methods and automated tools.
He, Qiwei; Veldkamp, Bernard P; Glas, Cees A W; de Vries, Theo
2017-03-01
Patients' narratives about traumatic experiences and symptoms are useful in clinical screening and diagnostic procedures. In this study, we presented an automated assessment system to screen patients for posttraumatic stress disorder via a natural language processing and text-mining approach. Four machine-learning algorithms-including decision tree, naive Bayes, support vector machine, and an alternative classification approach called the product score model-were used in combination with n-gram representation models to identify patterns between verbal features in self-narratives and psychiatric diagnoses. With our sample, the product score model with unigrams attained the highest prediction accuracy when compared with practitioners' diagnoses. The addition of multigrams contributed most to balancing the metrics of sensitivity and specificity. This article also demonstrates that text mining is a promising approach for analyzing patients' self-expression behavior, thus helping clinicians identify potential patients from an early stage.
Redman, Joseph S; Natarajan, Yamini; Hou, Jason K; Wang, Jingqi; Hanif, Muzammil; Feng, Hua; Kramer, Jennifer R; Desiderio, Roxanne; Xu, Hua; El-Serag, Hashem B; Kanwal, Fasiha
2017-10-01
Natural language processing is a powerful technique of machine learning capable of maximizing data extraction from complex electronic medical records. We utilized this technique to develop algorithms capable of "reading" full-text radiology reports to accurately identify the presence of fatty liver disease. Abdominal ultrasound, computerized tomography, and magnetic resonance imaging reports were retrieved from the Veterans Affairs Corporate Data Warehouse from a random national sample of 652 patients. Radiographic fatty liver disease was determined by manual review by two physicians and verified with an expert radiologist. A split validation method was utilized for algorithm development. For all three imaging modalities, the algorithms could identify fatty liver disease with >90% recall and precision, with F-measures >90%. These algorithms could be used to rapidly screen patient records to establish a large cohort to facilitate epidemiological and clinical studies and examine the clinic course and outcomes of patients with radiographic hepatic steatosis.
Extracting Sexual Trauma Mentions from Electronic Medical Notes Using Natural Language Processing.
Divita, Guy; Brignone, Emily; Carter, Marjorie E; Suo, Ying; Blais, Rebecca K; Samore, Matthew H; Fargo, Jamison D; Gundlapalli, Adi V
2017-01-01
Patient history of sexual trauma is of clinical relevance to healthcare providers as survivors face adverse health-related outcomes. This paper describes a method for identifying mentions of sexual trauma within the free text of electronic medical notes. A natural language processing pipeline for information extraction was developed and scaled to handle a large corpus of electronic medical notes used for this study from US Veterans Health Administration medical facilities. The tool was used to identify sexual trauma mentions and create snippets around every asserted mention based on a domain-specific lexicon developed for this purpose. All snippets were evaluated by trained human reviewers. An overall positive predictive value (PPV) of 0.90 for identifying sexual trauma mentions from the free text and a PPV of 0.71 at the patient level are reported. The metrics are superior for records from female patients.
Terminology model discovery using natural language processing and visualization techniques.
Zhou, Li; Tao, Ying; Cimino, James J; Chen, Elizabeth S; Liu, Hongfang; Lussier, Yves A; Hripcsak, George; Friedman, Carol
2006-12-01
Medical terminologies are important for unambiguous encoding and exchange of clinical information. The traditional manual method of developing terminology models is time-consuming and limited in the number of phrases that a human developer can examine. In this paper, we present an automated method for developing medical terminology models based on natural language processing (NLP) and information visualization techniques. Surgical pathology reports were selected as the testing corpus for developing a pathology procedure terminology model. The use of a general NLP processor for the medical domain, MedLEE, provides an automated method for acquiring semantic structures from a free text corpus and sheds light on a new high-throughput method of medical terminology model development. The use of an information visualization technique supports the summarization and visualization of the large quantity of semantic structures generated from medical documents. We believe that a general method based on NLP and information visualization will facilitate the modeling of medical terminologies.
Behind the scenes: A medical natural language processing project.
Wu, Joy T; Dernoncourt, Franck; Gehrmann, Sebastian; Tyler, Patrick D; Moseley, Edward T; Carlson, Eric T; Grant, David W; Li, Yeran; Welt, Jonathan; Celi, Leo Anthony
2018-04-01
Advancement of Artificial Intelligence (AI) capabilities in medicine can help address many pressing problems in healthcare. However, AI research endeavors in healthcare may not be clinically relevant, may have unrealistic expectations, or may not be explicit enough about their limitations. A diverse and well-functioning multidisciplinary team (MDT) can help identify appropriate and achievable AI research agendas in healthcare, and advance medical AI technologies by developing AI algorithms as well as addressing the shortage of appropriately labeled datasets for machine learning. In this paper, our team of engineers, clinicians and machine learning experts share their experience and lessons learned from their two-year-long collaboration on a natural language processing (NLP) research project. We highlight specific challenges encountered in cross-disciplinary teamwork, dataset creation for NLP research, and expectation setting for current medical AI technologies. Copyright © 2017. Published by Elsevier B.V.
Usability Evaluation of an Unstructured Clinical Document Query Tool for Researchers.
Hultman, Gretchen; McEwan, Reed; Pakhomov, Serguei; Lindemann, Elizabeth; Skube, Steven; Melton, Genevieve B
2018-01-01
Natural Language Processing - Patient Information Extraction for Researchers (NLP-PIER) was developed for clinical researchers for self-service Natural Language Processing (NLP) queries with clinical notes. This study was to conduct a user-centered analysis with clinical researchers to gain insight into NLP-PIER's usability and to gain an understanding of the needs of clinical researchers when using an application for searching clinical notes. Clinical researcher participants (n=11) completed tasks using the system's two existing search interfaces and completed a set of surveys and an exit interview. Quantitative data including time on task, task completion rate, and survey responses were collected. Interviews were analyzed qualitatively. Survey scores, time on task and task completion proportions varied widely. Qualitative analysis indicated that participants found the system to be useful and usable in specific projects. This study identified several usability challenges and our findings will guide the improvement of NLP-PIER 's interfaces.
From Web Directories to Ontologies: Natural Language Processing Challenges
NASA Astrophysics Data System (ADS)
Zaihrayeu, Ilya; Sun, Lei; Giunchiglia, Fausto; Pan, Wei; Ju, Qi; Chi, Mingmin; Huang, Xuanjing
Hierarchical classifications are used pervasively by humans as a means to organize their data and knowledge about the world. One of their main advantages is that natural language labels, used to describe their contents, are easily understood by human users. However, at the same time, this is also one of their main disadvantages as these same labels are ambiguous and very hard to be reasoned about by software agents. This fact creates an insuperable hindrance for classifications to being embedded in the Semantic Web infrastructure. This paper presents an approach to converting classifications into lightweight ontologies, and it makes the following contributions: (i) it identifies the main NLP problems related to the conversion process and shows how they are different from the classical problems of NLP; (ii) it proposes heuristic solutions to these problems, which are especially effective in this domain; and (iii) it evaluates the proposed solutions by testing them on DMoz data.
Cohen, K Bretonnel; Xia, Jingbo; Roeder, Christophe; Hunter, Lawrence E
2016-05-01
There is currently a crisis in science related to highly publicized failures to reproduce large numbers of published studies. The current work proposes, by way of case studies, a methodology for moving the study of reproducibility in computational work to a full stage beyond that of earlier work. Specifically, it presents a case study in attempting to reproduce the reports of two R libraries for doing text mining of the PubMed/MEDLINE repository of scientific publications. The main findings are that a rational paradigm for reproduction of natural language processing papers can be established; the advertised functionality was difficult, but not impossible, to reproduce; and reproducibility studies can produce additional insights into the functioning of the published system. Additionally, the work on reproducibility lead to the production of novel user-centered documentation that has been accessed 260 times since its publication-an average of once a day per library.
NASA Astrophysics Data System (ADS)
Adams, Christopher; Tate, Derrick
Patent textual descriptions provide a wealth of information that can be used to understand the underlying design approaches that result in the generation of novel and innovative technology. This article will discuss a new approach for estimating Degree of Ideality and Level of Invention metrics from the theory of inventive problem solving (TRIZ) using patent textual information. Patent text includes information that can be used to model both the functions performed by a design and the associated costs and problems that affect a design’s value. The motivation of this research is to use patent data with calculation of TRIZ metrics to help designers understand which combinations of system components and functions result in creative and innovative design solutions. This article will discuss in detail methods to estimate these TRIZ metrics using natural language processing and machine learning with the use of neural networks.
Pai, Vinay M; Rodgers, Mary; Conroy, Richard; Luo, James; Zhou, Ruixia; Seto, Belinda
2014-01-01
In April 2012, the National Institutes of Health organized a two-day workshop entitled ‘Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making’ (NLP-CDS). This report is a summary of the discussions during the second day of the workshop. Collectively, the workshop presenters and participants emphasized the need for unstructured clinical notes to be included in the decision making workflow and the need for individualized longitudinal data tracking. The workshop also discussed the need to: (1) combine evidence-based literature and patient records with machine-learning and prediction models; (2) provide trusted and reproducible clinical advice; (3) prioritize evidence and test results; and (4) engage healthcare professionals, caregivers, and patients. The overall consensus of the NLP-CDS workshop was that there are promising opportunities for NLP and CDS to deliver cognitive support for healthcare professionals, caregivers, and patients. PMID:23921193
Nouns Referring to Tools and Natural Objects Differentially Modulate the Motor System
ERIC Educational Resources Information Center
Gough, Patricia M.; Riggio, Lucia; Chersi, Fabian; Sato, Marc; Fogassi, Leonardo; Buccino, Giovanni
2012-01-01
While increasing evidence points to a critical role for the motor system in language processing, the focus of previous work has been on the linguistic category of verbs. Here we tested whether nouns are effective in modulating the motor system and further whether different kinds of nouns--those referring to artifacts or natural items, and items…
Social media based NPL system to find and retrieve ARM data: Concept paper
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, Ranjeet; Giansiracusa, Michael T.; Kumar, Jitendra
Information connectivity and retrieval has a role in our daily lives. The most pervasive source of online information is databases. The amount of data is growing at rapid rate and database technology is improving and having a profound effect. Almost all online applications are storing and retrieving information from databases. One challenge in supplying the public with wider access to informational databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it may notmore » be practical to make the public aware of the structure of the database. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide more intuitive method for generating database queries and delivering responses. Social media makes it possible to interact with a wide section of the population. Through this medium, and with the help of Natural Language Processing (NLP) we can make the data of the Atmospheric Radiation Measurement Data Center (ADC) more accessible to the public. We propose an architecture for using Apache Lucene/Solr [1], OpenML [2,3], and Kafka [4] to generate an automated query/response system with inputs from Twitter5, our Cassandra DB, and our log database. Using the Twitter API and NLP we can give the public the ability to ask questions of our database and get automated responses.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, Ranjeet; Giansiracusa, Michael T.; Kumar, Jitendra
Information connectivity and retrieval has a role in our daily lives. The most pervasive source of online information is databases. The amount of data is growing at rapid rate and database technology is improving and having a profound effect. Almost all online applications are storing and retrieving information from databases. One challenge in supplying the public with wider access to informational databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it may notmore » be practical to make the public aware of the structure of the database. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide more intuitive method for generating database queries and delivering responses. Social media makes it possible to interact with a wide section of the population. Through this medium, and with the help of Natural Language Processing (NLP) we can make the data of the Atmospheric Radiation Measurement Data Center (ADC) more accessible to the public. We propose an architecture for using Apache Lucene/Solr [1], OpenML [2,3], and Kafka [4] to generate an automated query/response system with inputs from Twitter5, our Cassandra DB, and our log database. Using the Twitter API and NLP we can give the public the ability to ask questions of our database and get automated responses.« less
Language shift, bilingualism and the future of Britain's Celtic languages
Kandler, Anne; Unger, Roman; Steele, James
2010-01-01
‘Language shift’ is the process whereby members of a community in which more than one language is spoken abandon their original vernacular language in favour of another. The historical shifts to English by Celtic language speakers of Britain and Ireland are particularly well-studied examples for which good census data exist for the most recent 100–120 years in many areas where Celtic languages were once the prevailing vernaculars. We model the dynamics of language shift as a competition process in which the numbers of speakers of each language (both monolingual and bilingual) vary as a function both of internal recruitment (as the net outcome of birth, death, immigration and emigration rates of native speakers), and of gains and losses owing to language shift. We examine two models: a basic model in which bilingualism is simply the transitional state for households moving between alternative monolingual states, and a diglossia model in which there is an additional demand for the endangered language as the preferred medium of communication in some restricted sociolinguistic domain, superimposed on the basic shift dynamics. Fitting our models to census data, we successfully reproduce the demographic trajectories of both languages over the past century. We estimate the rates of recruitment of new Scottish Gaelic speakers that would be required each year (for instance, through school education) to counteract the ‘natural wastage’ as households with one or more Gaelic speakers fail to transmit the language to the next generation informally, for different rates of loss during informal intergenerational transmission. PMID:21041210
Auditory scene analysis in school-aged children with developmental language disorders
Sussman, E.; Steinschneider, M.; Lee, W.; Lawson, K.
2014-01-01
Natural sound environments are dynamic, with overlapping acoustic input originating from simultaneously active sources. A key function of the auditory system is to integrate sensory inputs that belong together and segregate those that come from different sources. We hypothesized that this skill is impaired in individuals with phonological processing difficulties. There is considerable disagreement about whether phonological impairments observed in children with developmental language disorders can be attributed to specific linguistic deficits or to more general acoustic processing deficits. However, most tests of general auditory abilities have been conducted with a single set of sounds. We assessed the ability of school-aged children (7–15 years) to parse complex auditory non-speech input, and determined whether the presence of phonological processing impairments was associated with stream perception performance. A key finding was that children with language impairments did not show the same developmental trajectory for stream perception as typically developing children. In addition, children with language impairments required larger frequency separations between sounds to hear distinct streams compared to age-matched peers. Furthermore, phonological processing ability was a significant predictor of stream perception measures, but only in the older age groups. No such association was found in the youngest children. These results indicate that children with language impairments have difficulty parsing speech streams, or identifying individual sound events when there are competing sound sources. We conclude that language group differences may in part reflect fundamental maturational disparities in the analysis of complex auditory scenes. PMID:24548430
Query2Question: Translating Visualization Interaction into Natural Language.
Nafari, Maryam; Weaver, Chris
2015-06-01
Richly interactive visualization tools are increasingly popular for data exploration and analysis in a wide variety of domains. Existing systems and techniques for recording provenance of interaction focus either on comprehensive automated recording of low-level interaction events or on idiosyncratic manual transcription of high-level analysis activities. In this paper, we present the architecture and translation design of a query-to-question (Q2Q) system that automatically records user interactions and presents them semantically using natural language (written English). Q2Q takes advantage of domain knowledge and uses natural language generation (NLG) techniques to translate and transcribe a progression of interactive visualization states into a visual log of styled text that complements and effectively extends the functionality of visualization tools. We present Q2Q as a means to support a cross-examination process in which questions rather than interactions are the focus of analytic reasoning and action. We describe the architecture and implementation of the Q2Q system, discuss key design factors and variations that effect question generation, and present several visualizations that incorporate Q2Q for analysis in a variety of knowledge domains.
NASA Astrophysics Data System (ADS)
Sakakibara, Kai; Hagiwara, Masafumi
In this paper, we propose a 3-dimensional self-organizing memory and describe its application to knowledge extraction from natural language. First, the proposed system extracts a relation between words by JUMAN (morpheme analysis system) and KNP (syntax analysis system), and stores it in short-term memory. In the short-term memory, the relations are attenuated with the passage of processing. However, the relations with high frequency of appearance are stored in the long-term memory without attenuation. The relations in the long-term memory are placed to the proposed 3-dimensional self-organizing memory. We used a new learning algorithm called ``Potential Firing'' in the learning phase. In the recall phase, the proposed system recalls relational knowledge from the learned knowledge based on the input sentence. We used a new recall algorithm called ``Waterfall Recall'' in the recall phase. We added a function to respond to questions in natural language with ``yes/no'' in order to confirm the validity of proposed system by evaluating the quantity of correct answers.
Benchmarking natural-language parsers for biological applications using dependency graphs.
Clegg, Andrew B; Shepherd, Adrian J
2007-01-25
Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques.
Benchmarking natural-language parsers for biological applications using dependency graphs
Clegg, Andrew B; Shepherd, Adrian J
2007-01-01
Background Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. Results Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. Conclusion Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques. PMID:17254351
Struiksma, Marijn E.; Noordzij, Matthijs L.; Neggers, Sebastiaan F. W.; Bosker, Wendy M.; Postma, Albert
2011-01-01
Neuropsychological and imaging studies have shown that the left supramarginal gyrus (SMG) is specifically involved in processing spatial terms (e.g. above, left of), which locate places and objects in the world. The current fMRI study focused on the nature and specificity of representing spatial language in the left SMG by combining behavioral and neuronal activation data in blind and sighted individuals. Data from the blind provide an elegant way to test the supramodal representation hypothesis, i.e. abstract codes representing spatial relations yielding no activation differences between blind and sighted. Indeed, the left SMG was activated during spatial language processing in both blind and sighted individuals implying a supramodal representation of spatial and other dimensional relations which does not require visual experience to develop. However, in the absence of vision functional reorganization of the visual cortex is known to take place. An important consideration with respect to our finding is the amount of functional reorganization during language processing in our blind participants. Therefore, the participants also performed a verb generation task. We observed that only in the blind occipital areas were activated during covert language generation. Additionally, in the first task there was functional reorganization observed for processing language with a high linguistic load. As the visual cortex was not specifically active for spatial contents in the first task, and no reorganization was observed in the SMG, the latter finding further supports the notion that the left SMG is the main node for a supramodal representation of verbal spatial relations. PMID:21935391
Linking language and categorization in infancy
Ferguson, Brock; Waxman, Sandra
2017-01-01
Language exerts a powerful influence on our concepts. We review evidence documenting the developmental origins of a precocious link between language and object categories in very young infants. This collection of studies documents a cascading process in which early links between language and cognition provide the foundation for later, more precise ones. We propose that, early in life, language promotes categorization at least in part through its status as a social, communicative signal. But over the first year, infants home in on the referential power of language and, by their second year, begin teasing apart distinct kinds of names (e.g., nouns, adjectives) and their relation to distinct kinds of concepts (e.g., object categories, properties). To complement this proposal, we also relate this evidence to several alternative accounts of language’s effect on categorization, appealing to similarity (labels-as-features), familiarity (auditory overshadowing), and communicative biases (natural pedagogy). PMID:27830633
Automatically Expanding the Synonym Set of SNOMED CT using Wikipedia.
Schlegel, Daniel R; Crowner, Chris; Elkin, Peter L
2015-01-01
Clinical terminologies and ontologies are often used in natural language processing/understanding tasks as a method for semantically tagging text. One ontology commonly used for this task is SNOMED CT. Natural language is rich and varied: many different combinations of words may be used to express the same idea. It is therefore essential that ontologies and terminologies have a rich set of synonyms. One source of synonyms is Wikipedia. We examine methods for aligning concepts in SNOMED CT with articles in Wikipedia so that newly-found synonyms may be added to SNOMED CT. Our experiments show promising results and provide guidance to researchers who wish to use Wikipedia for similar tasks.
A grammar-based semantic similarity algorithm for natural language sentences.
Lee, Ming Che; Chang, Jia Wei; Hsieh, Tung Cheng
2014-01-01
This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to "artificial language", such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.
Perera, Gayan; Broadbent, Matthew; Callard, Felicity; Chang, Chin-Kuo; Downs, Johnny; Dutta, Rina; Fernandes, Andrea; Hayes, Richard D; Henderson, Max; Jackson, Richard; Jewell, Amelia; Kadra, Giouliana; Little, Ryan; Pritchard, Megan; Shetty, Hitesh; Tulloch, Alex; Stewart, Robert
2016-03-01
The South London and Maudsley National Health Service (NHS) Foundation Trust Biomedical Research Centre (SLaM BRC) Case Register and its Clinical Record Interactive Search (CRIS) application were developed in 2008, generating a research repository of real-time, anonymised, structured and open-text data derived from the electronic health record system used by SLaM, a large mental healthcare provider in southeast London. In this paper, we update this register's descriptive data, and describe the substantial expansion and extension of the data resource since its original development. Descriptive data were generated from the SLaM BRC Case Register on 31 December 2014. Currently, there are over 250,000 patient records accessed through CRIS. Since 2008, the most significant developments in the SLaM BRC Case Register have been the introduction of natural language processing to extract structured data from open-text fields, linkages to external sources of data, and the addition of a parallel relational database (Structured Query Language) output. Natural language processing applications to date have brought in new and hitherto inaccessible data on cognitive function, education, social care receipt, smoking, diagnostic statements and pharmacotherapy. In addition, through external data linkages, large volumes of supplementary information have been accessed on mortality, hospital attendances and cancer registrations. Coupled with robust data security and governance structures, electronic health records provide potentially transformative information on mental disorders and outcomes in routine clinical care. The SLaM BRC Case Register continues to grow as a database, with approximately 20,000 new cases added each year, in addition to extension of follow-up for existing cases. Data linkages and natural language processing present important opportunities to enhance this type of research resource further, achieving both volume and depth of data. However, research projects still need to be carefully tailored, so that they take into account the nature and quality of the source information. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
Ries, Stephanie K.; Dronkers, Nina F.; Knight, Robert T.
2015-01-01
Language is considered to be one of the most lateralized human brain functions. Left hemisphere dominance for language has been consistently confirmed in clinical and experimental settings and constitutes one of the main axioms of neurology and neuroscience. However, functional neuroimaging studies are finding that the right hemisphere also plays a role in diverse language functions. Critically, the right hemisphere may also compensate for the loss or degradation of language functions following extensive stroke-induced damage to the left hemisphere. Here, we review studies that focus on our ability to choose words as we speak. Although fluidly performed in individuals with intact language, this process is routinely compromised in aphasic patients. We suggest that parceling word retrieval into its sub-processes—lexical activation and lexical selection—and examining which of these can be compensated for after left hemisphere stroke can advance the understanding of the lateralization of word retrieval in speech production. In particular, the domain-general nature of the brain regions associated with each process may be a helpful indicator of the right hemisphere's propensity for compensation. PMID:26766393
Reading disorders and dyslexia.
Hulme, Charles; Snowling, Margaret J
2016-12-01
We review current knowledge about the nature of reading development and disorders, distinguishing between the processes involved in learning to decode print, and the processes involved in reading comprehension. Children with decoding difficulties/dyslexia experience deficits in phoneme awareness, letter-sound knowledge and rapid automatized naming in the preschool years and beyond. These phonological/language difficulties appear to be proximal causes of the problems in learning to decode print in dyslexia. We review data from a prospective study of children at high risk of dyslexia to show that being at family risk of dyslexia is a primary risk factor for poor reading and children with persistent language difficulties at school entry are more likely to develop reading problems. Early oral language difficulties are strong predictors of later difficulties in reading comprehension. There are two distinct forms of reading disorder in children: dyslexia (a difficulty in learning to translate print into speech) and reading comprehension impairment. Both forms of reading problem appear to be predominantly caused by deficits in underlying oral language skills. Implications for screening and for the delivery of robust interventions for language and reading are discussed.
Linguistic Analysis of Natural Language Communication with Computers.
ERIC Educational Resources Information Center
Thompson, Bozena Henisz
Interaction with computers in natural language requires a language that is flexible and suited to the task. This study of natural dialogue was undertaken to reveal those characteristics which can make computer English more natural. Experiments were made in three modes of communication: face-to-face, terminal-to-terminal, and human-to-computer,…
Toledo, Cíntia Matsuda; Cunha, Andre; Scarton, Carolina; Aluísio, Sandra
2014-01-01
Discourse production is an important aspect in the evaluation of brain-injured individuals. We believe that studies comparing the performance of brain-injured subjects with that of healthy controls must use groups with compatible education. A pioneering application of machine learning methods using Brazilian Portuguese for clinical purposes is described, highlighting education as an important variable in the Brazilian scenario. Objective The aims were to describe how to: (i) develop machine learning classifiers using features generated by natural language processing tools to distinguish descriptions produced by healthy individuals into classes based on their years of education; and (ii) automatically identify the features that best distinguish the groups. Methods The approach proposed here extracts linguistic features automatically from the written descriptions with the aid of two Natural Language Processing tools: Coh-Metrix-Port and AIC. It also includes nine task-specific features (three new ones, two extracted manually, besides description time; type of scene described – simple or complex; presentation order – which type of picture was described first; and age). In this study, the descriptions by 144 of the subjects studied in Toledo18 were used,which included 200 healthy Brazilians of both genders. Results and Conclusion A Support Vector Machine (SVM) with a radial basis function (RBF) kernel is the most recommended approach for the binary classification of our data, classifying three of the four initial classes. CfsSubsetEval (CFS) is a strong candidate to replace manual feature selection methods. PMID:29213908
Grammatical analysis as a distributed neurobiological function.
Bozic, Mirjana; Fonteneau, Elisabeth; Su, Li; Marslen-Wilson, William D
2015-03-01
Language processing engages large-scale functional networks in both hemispheres. Although it is widely accepted that left perisylvian regions have a key role in supporting complex grammatical computations, patient data suggest that some aspects of grammatical processing could be supported bilaterally. We investigated the distribution and the nature of grammatical computations across language processing networks by comparing two types of combinatorial grammatical sequences--inflectionally complex words and minimal phrases--and contrasting them with grammatically simple words. Novel multivariate analyses revealed that they engage a coalition of separable subsystems: inflected forms triggered left-lateralized activation, dissociable into dorsal processes supporting morphophonological parsing and ventral, lexically driven morphosyntactic processes. In contrast, simple phrases activated a consistently bilateral pattern of temporal regions, overlapping with inflectional activations in L middle temporal gyrus. These data confirm the role of the left-lateralized frontotemporal network in supporting complex grammatical computations. Critically, they also point to the capacity of bilateral temporal regions to support simple, linear grammatical computations. This is consistent with a dual neurobiological framework where phylogenetically older bihemispheric systems form part of the network that supports language function in the modern human, and where significant capacities for language comprehension remain intact even following severe left hemisphere damage. Copyright © 2014 The Authors Human Brain Mapping Published by Wiley Periodicals, Inc.
New Ways to Learn a Foreign Language.
ERIC Educational Resources Information Center
Hall, Robert A., Jr.
This text focuses on the nature of language learning in the light of modern linguistic analysis. Common linguistic problems encountered by students of eight major languages are examined--Latin, Greek, French, Spanish, Portuguese, Italian, German, and Russian. The text discusses the nature of language, building new language habits, overcoming…
Intelligent acoustic data fusion technique for information security analysis
NASA Astrophysics Data System (ADS)
Jiang, Ying; Tang, Yize; Lu, Wenda; Wang, Zhongfeng; Wang, Zepeng; Zhang, Luming
2017-08-01
Tone is an essential component of word formation in all tonal languages, and it plays an important role in the transmission of information in speech communication. Therefore, tones characteristics study can be applied into security analysis of acoustic signal by the means of language identification, etc. In speech processing, fundamental frequency (F0) is often viewed as representing tones by researchers of speech synthesis. However, regular F0 values may lead to low naturalness in synthesized speech. Moreover, F0 and tone are not equivalent linguistically; F0 is just a representation of a tone. Therefore, the Electroglottography (EGG) signal is collected for deeper tones characteristics study. In this paper, focusing on the Northern Kam language, which has nine tonal contours and five level tone types, we first collected EGG and speech signals from six natural male speakers of the Northern Kam language, and then achieved the clustering distributions of the tone curves. After summarizing the main characteristics of tones of Northern Kam, we analyzed the relationship between EGG and speech signal parameters, and laid the foundation for further security analysis of acoustic signal.
Cop, Uschi; Drieghe, Denis; Duyck, Wouter
2015-01-01
Introduction and Method This paper presents a corpus of sentence level eye movement parameters for unbalanced bilingual first language (L1) and second-language (L2) reading and monolingual reading of a complete novel (56 000 words). We present important sentence-level basic eye movement parameters of both bilingual and monolingual natural reading extracted from this large data corpus. Results and Conclusion Bilingual L2 reading patterns show longer sentence reading times (20%), more fixations (21%), shorter saccades (12%) and less word skipping (4.6%), than L1 reading patterns. Regression rates are the same for L1 and L2 reading. These results could indicate, analogous to a previous simulation with the E-Z reader model in the literature, that it is primarily the speeding up of lexical access that drives both L1 and L2 reading development. Bilingual L1 reading does not differ in any major way from monolingual reading. This contrasts with predictions made by the weaker links account, which predicts a bilingual disadvantage in language processing caused by divided exposure between languages. PMID:26287379
NASA Astrophysics Data System (ADS)
Julià, Pere
2001-06-01
Norbert Wiener (1948, 1950) subscribed to the traditional outlook on language as a formal object (L)—a system of forms and rules that can presumably be described in abstraction from speakers, listeners, and their interaction with the world and with themselves. Information and communication theories, which Wiener also pioneered, similarly reify the object of study in lieu of treating it as the natural phenomenon it obviously is. Terms like `input,' `output,' `message,' `transmission,' `processing,' etc. implicitly legitimize older views as to the use of language as a "tool" or "instrument" for expressing feelings, thoughts, ideas, etc.; more important yet, they stand in the way of a satisfactory account of self-referentiality, which finds its natural place within the broader context of self-regulation. Cybernetics may be said to have lost sight of the "kybernetes," which gave it its name. The ways in which language participates in—or even helps to integrate—the behavior of the total human individual in his/her adaptation to their milieu casts a new light on questions about self-organization, meaning, experience, and creativity.
Keijzer, Merel; de Bot, Kees
2018-01-01
Cognitive advantages for bilinguals have inconsistently been observed in different populations, with different operationalisations of bilingualism, cognitive performance, and the process by which language control transfers to cognitive control. This calls for studies investigating which aspects of multilingualism drive a cognitive advantage, in which populations and under which conditions. This study reports on two cognitive tasks coupled with an extensive background questionnaire on health, wellbeing, personality, language knowledge and language use, administered to 387 older adults in the northern Netherlands, a small but highly multilingual area. Using linear mixed effects regression modeling, we find that when different languages are used frequently in different contexts, enhanced attentional control is observed. Subsequently, a PLS regression model targeting also other influential factors yielded a two-component solution whereby only more sensitive measures of language proficiency and language usage in different social contexts were predictive of cognitive performance above and beyond the contribution of age, gender, income and education. We discuss these findings in light of previous studies that try to uncover more about the nature of bilingualism and the cognitive processes that may drive an advantage. With an unusually large sample size our study advocates for a move away from dichotomous, knowledge-based operationalisations of multilingualism and offers new insights for future studies at the individual level. PMID:29783764
Processable English: The Theory Behind the PENG System
2009-06-01
implicit - is often buried amongst masses of irrelevant data. Heralding from unstructured sources such as natural language documents, email, audio ...estimation and prediction, data-mining, social network analysis, and semantic search and visualisation . This report describes the theoretical
A parallelized binary search tree
USDA-ARS?s Scientific Manuscript database
PTTRNFNDR is an unsupervised statistical learning algorithm that detects patterns in DNA sequences, protein sequences, or any natural language texts that can be decomposed into letters of a finite alphabet. PTTRNFNDR performs complex mathematical computations and its processing time increases when i...
Brain signatures of early lexical and morphological learning of a new language.
Havas, Viktória; Laine, Matti; Rodríguez Fornells, Antoni
2017-07-01
Morphology is an important part of language processing but little is known about how adult second language learners acquire morphological rules. Using a word-picture associative learning task, we have previously shown that a brief exposure to novel words with embedded morphological structure (suffix for natural gender) is enough for language learners to acquire the hidden morphological rule. Here we used this paradigm to study the brain signatures of early morphological learning in a novel language in adults. Behavioural measures indicated successful lexical (word stem) and morphological (gender suffix) learning. A day after the learning phase, event-related brain potentials registered during a recognition memory task revealed enhanced N400 and P600 components for stem and suffix violations, respectively. An additional effect observed with combined suffix and stem violations was an enhancement of an early N2 component, most probably related to conflict-detection processes. Successful morphological learning was also evident in the ERP responses to the subsequent rule-generalization task with new stems, where violation of the morphological rule was associated with an early (250-400ms) and late positivity (750-900ms). Overall, these findings tend to converge with lexical and morphosyntactic violation effects observed in L1 processing, suggesting that even after a short exposure, adult language learners can acquire both novel words and novel morphological rules. Copyright © 2017 Elsevier Ltd. All rights reserved.
Sound-symbolism boosts novel word learning.
Lockwood, Gwilym; Dingemanse, Mark; Hagoort, Peter
2016-08-01
The existence of sound-symbolism (or a non-arbitrary link between form and meaning) is well-attested. However, sound-symbolism has mostly been investigated with nonwords in forced choice tasks, neither of which are representative of natural language. This study uses ideophones, which are naturally occurring sound-symbolic words that depict sensory information, to investigate how sensitive Dutch speakers are to sound-symbolism in Japanese in a learning task. Participants were taught 2 sets of Japanese ideophones; 1 set with the ideophones' real meanings in Dutch, the other set with their opposite meanings. In Experiment 1, participants learned the ideophones and their real meanings much better than the ideophones with their opposite meanings. Moreover, despite the learning rounds, participants were still able to guess the real meanings of the ideophones in a 2-alternative forced-choice test after they were informed of the manipulation. This shows that natural language sound-symbolism is robust beyond 2-alternative forced-choice paradigms and affects broader language processes such as word learning. In Experiment 2, participants learned regular Japanese adjectives with the same manipulation, and there was no difference between real and opposite conditions. This shows that natural language sound-symbolism is especially strong in ideophones, and that people learn words better when form and meaning match. The highlights of this study are as follows: (a) Dutch speakers learn real meanings of Japanese ideophones better than opposite meanings, (b) Dutch speakers accurately guess meanings of Japanese ideophones, (c) this sensitivity happens despite learning some opposite pairings, (d) no such learning effect exists for regular Japanese adjectives, and (e) this shows the importance of sound-symbolism in scaffolding language learning. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Memory Interference as a Determinant of Language Comprehension
Van Dyke, Julie A.; Johns, Clinton L.
2012-01-01
The parameters of the human memory system constrain the operation of language comprehension processes. In the memory literature, both decay and interference have been proposed as causes of forgetting; however, while there is a long history of research establishing the nature of interference effects in memory, the effects of decay are much more poorly supported. Nevertheless, research investigating the limitations of the human sentence processing mechanism typically focus on decay-based explanations, emphasizing the role of capacity, while the role of interference has received comparatively little attention. This paper reviews both accounts of difficulty in language comprehension by drawing direct connections to research in the memory domain. Capacity-based accounts are found to be untenable, diverging substantially from what is known about the operation of the human memory system. In contrast, recent research investigating comprehension difficulty using a retrieval-interference paradigm is shown to be wholly consistent with both behavioral and neuropsychological memory phenomena. The implications of adopting a retrieval-interference approach to investigating individual variation in language comprehension are discussed. PMID:22773927
A Computational Model of Linguistic Humor in Puns.
Kao, Justine T; Levy, Roger; Goodman, Noah D
2016-07-01
Humor plays an essential role in human interactions. Precisely what makes something funny, however, remains elusive. While research on natural language understanding has made significant advancements in recent years, there has been little direct integration of humor research with computational models of language understanding. In this paper, we propose two information-theoretic measures-ambiguity and distinctiveness-derived from a simple model of sentence processing. We test these measures on a set of puns and regular sentences and show that they correlate significantly with human judgments of funniness. Moreover, within a set of puns, the distinctiveness measure distinguishes exceptionally funny puns from mediocre ones. Our work is the first, to our knowledge, to integrate a computational model of general language understanding and humor theory to quantitatively predict humor at a fine-grained level. We present it as an example of a framework for applying models of language processing to understand higher level linguistic and cognitive phenomena. © 2015 The Authors. Cognitive Science published by Wiley Periodicals, Inc. on behalf of Cognitive Science Society.
Formalizing Knowledge in Multi-Scale Agent-Based Simulations
Somogyi, Endre; Sluka, James P.; Glazier, James A.
2017-01-01
Multi-scale, agent-based simulations of cellular and tissue biology are increasingly common. These simulations combine and integrate a range of components from different domains. Simulations continuously create, destroy and reorganize constituent elements causing their interactions to dynamically change. For example, the multi-cellular tissue development process coordinates molecular, cellular and tissue scale objects with biochemical, biomechanical, spatial and behavioral processes to form a dynamic network. Different domain specific languages can describe these components in isolation, but cannot describe their interactions. No current programming language is designed to represent in human readable and reusable form the domain specific knowledge contained in these components and interactions. We present a new hybrid programming language paradigm that naturally expresses the complex multi-scale objects and dynamic interactions in a unified way and allows domain knowledge to be captured, searched, formalized, extracted and reused. PMID:29338063
Standard model of knowledge representation
NASA Astrophysics Data System (ADS)
Yin, Wensheng
2016-09-01
Knowledge representation is the core of artificial intelligence research. Knowledge representation methods include predicate logic, semantic network, computer programming language, database, mathematical model, graphics language, natural language, etc. To establish the intrinsic link between various knowledge representation methods, a unified knowledge representation model is necessary. According to ontology, system theory, and control theory, a standard model of knowledge representation that reflects the change of the objective world is proposed. The model is composed of input, processing, and output. This knowledge representation method is not a contradiction to the traditional knowledge representation method. It can express knowledge in terms of multivariate and multidimensional. It can also express process knowledge, and at the same time, it has a strong ability to solve problems. In addition, the standard model of knowledge representation provides a way to solve problems of non-precision and inconsistent knowledge.
Formalizing Knowledge in Multi-Scale Agent-Based Simulations.
Somogyi, Endre; Sluka, James P; Glazier, James A
2016-10-01
Multi-scale, agent-based simulations of cellular and tissue biology are increasingly common. These simulations combine and integrate a range of components from different domains. Simulations continuously create, destroy and reorganize constituent elements causing their interactions to dynamically change. For example, the multi-cellular tissue development process coordinates molecular, cellular and tissue scale objects with biochemical, biomechanical, spatial and behavioral processes to form a dynamic network. Different domain specific languages can describe these components in isolation, but cannot describe their interactions. No current programming language is designed to represent in human readable and reusable form the domain specific knowledge contained in these components and interactions. We present a new hybrid programming language paradigm that naturally expresses the complex multi-scale objects and dynamic interactions in a unified way and allows domain knowledge to be captured, searched, formalized, extracted and reused.
Neurolinguistics and psycholinguistics as a basis for computer acquisition of natural language
DOE Office of Scientific and Technical Information (OSTI.GOV)
Powers, D.M.W.
1983-04-01
Research into natural language understanding systems for computers has concentrated on implementing particular grammars and grammatical models of the language concerned. This paper presents a rationale for research into natural language understanding systems based on neurological and psychological principles. Important features of the approach are that it seeks to place the onus of learning the language on the computer, and that it seeks to make use of the vast wealth of relevant psycholinguistic and neurolinguistic theory. 22 references.
Natural language interface for command and control
NASA Technical Reports Server (NTRS)
Shuler, Robert L., Jr.
1986-01-01
A working prototype of a flexible 'natural language' interface for command and control situations is presented. This prototype is analyzed from two standpoints. First is the role of natural language for command and control, its realistic requirements, and how well the role can be filled with current practical technology. Second, technical concepts for implementation are discussed and illustrated by their application in the prototype system. It is also shown how adaptive or 'learning' features can greatly ease the task of encoding language knowledge in the language processor.
A new programming metaphor for image processing procedures
NASA Technical Reports Server (NTRS)
Smirnov, O. M.; Piskunov, N. E.
1992-01-01
Most image processing systems, besides an Application Program Interface (API) which lets users write their own image processing programs, also feature a higher level of programmability. Traditionally, this is a command or macro language, which can be used to build large procedures (scripts) out of simple programs or commands. This approach, a legacy of the teletypewriter has serious drawbacks. A command language is clumsy when (and if! it attempts to utilize the capabilities of a multitasking or multiprocessor environment, it is but adequate for real-time data acquisition and processing, it has a fairly steep learning curve, and the user interface is very inefficient,. especially when compared to a graphical user interface (GUI) that systems running under Xll or Windows should otherwise be able to provide. ll these difficulties stem from one basic problem: a command language is not a natural metaphor for an image processing procedure. A more natural metaphor - an image processing factory is described in detail. A factory is a set of programs (applications) that execute separate operations on images, connected by pipes that carry data (images and parameters) between them. The programs function concurrently, processing images as they arrive along pipes, and querying the user for whatever other input they need. From the user's point of view, programming (constructing) factories is a lot like playing with LEGO blocks - much more intuitive than writing scripts. Focus is on some of the difficulties of implementing factory support, most notably the design of an appropriate API. It also shows that factories retain all the functionality of a command language (including loops and conditional branches), while suffering from none of the drawbacks outlined above. Other benefits of factory programming include self-tuning factories and the process of encapsulation, which lets a factory take the shape of a standard application both from the system and the user's point of view, and thus be used as a component of other factories. A bare-bones prototype of factory programming was implemented under the PcIPS image processing system, and a complete version (on a multitasking platform) is under development.
A meta-analysis of fMRI studies on Chinese orthographic, phonological, and semantic processing.
Wu, Chiao-Yi; Ho, Moon-Ho Ringo; Chen, Shen-Hsing Annabel
2012-10-15
A growing body of neuroimaging evidence has shown that Chinese character processing recruits differential activation from alphabetic languages due to its unique linguistic features. As more investigations on Chinese character processing have recently become available, we applied a meta-analytic approach to summarize previous findings and examined the neural networks for orthographic, phonological, and semantic processing of Chinese characters independently. The activation likelihood estimation (ALE) method was used to analyze eight studies in the orthographic task category, eleven in the phonological and fifteen in the semantic task categories. Converging activation among three language-processing components was found in the left middle frontal gyrus, the left superior parietal lobule and the left mid-fusiform gyrus, suggesting a common sub-network underlying the character recognition process regardless of the task nature. With increasing task demands, the left inferior parietal lobule and the right superior temporal gyrus were specialized for phonological processing, while the left middle temporal gyrus was involved in semantic processing. Functional dissociation was identified in the left inferior frontal gyrus, with the posterior dorsal part for phonological processing and the anterior ventral part for semantic processing. Moreover, bilateral involvement of the ventral occipito-temporal regions was found for both phonological and semantic processing. The results provide better understanding of the neural networks underlying Chinese orthographic, phonological, and semantic processing, and consolidate the findings of additional recruitment of the left middle frontal gyrus and the right fusiform gyrus for Chinese character processing as compared with the universal language network that has been based on alphabetic languages. Copyright © 2012 Elsevier Inc. All rights reserved.
Zimmerer, Vitor C; Varley, Rosemary A
2015-08-01
Processing of linear word order (linear configuration) is important for virtually all languages and essential to languages such as English which have little functional morphology. Damage to systems underpinning configurational processing may specifically affect word-order reliant sentence structures. We explore order processing in WR, a man with primary progressive aphasia (PPA). In a previous report, we showed how WR showed impaired processing of actives, which rely strongly on word order, but not passives where functional morphology signals thematic roles. Using the artificial grammar learning (AGL) paradigm, we examined WR's ability to process order in non-verbal, visual sequences and compared his profile to that of healthy controls, and aphasic participants with and without severe syntactic disorder. Results suggested that WR, like some other patients with severe syntactic impairment, was unable to detect linear configurational structure. The data are consistent with the notion that disruption of possibly domain-general linearization systems differentially affects processing of active and passive sentence structures. Further research is needed to test this account, and we suggest hypotheses for future studies. Copyright © 2015 Elsevier Ltd. All rights reserved.
Flexible processing and the design of grammar.
Sag, Ivan A; Wasow, Thomas
2015-02-01
We explore the consequences of letting the incremental and integrative nature of language processing inform the design of competence grammar. What emerges is a view of grammar as a system of local monotonic constraints that provide a direct characterization of the signs (the form-meaning correspondences) of a given language. This "sign-based" conception of grammar has provided precise solutions to the key problems long thought to motivate movement-based analyses, has supported three decades of computational research developing large-scale grammar implementations, and is now beginning to play a role in computational psycholinguistics research that explores the use of underspecification in the incremental computation of partial meanings.
Action and language integration: from humans to cognitive robots.
Borghi, Anna M; Cangelosi, Angelo
2014-07-01
The topic is characterized by a highly interdisciplinary approach to the issue of action and language integration. Such an approach, combining computational models and cognitive robotics experiments with neuroscience, psychology, philosophy, and linguistic approaches, can be a powerful means that can help researchers disentangle ambiguous issues, provide better and clearer definitions, and formulate clearer predictions on the links between action and language. In the introduction we briefly describe the papers and discuss the challenges they pose to future research. We identify four important phenomena the papers address and discuss in light of empirical and computational evidence: (a) the role played not only by sensorimotor and emotional information but also of natural language in conceptual representation; (b) the contextual dependency and high flexibility of the interaction between action, concepts, and language; (c) the involvement of the mirror neuron system in action and language processing; (d) the way in which the integration between action and language can be addressed by developmental robotics and Human-Robot Interaction. Copyright © 2014 Cognitive Science Society, Inc.
Using Eye-Tracking in Applied Linguistics and Second Language Research
ERIC Educational Resources Information Center
Conklin, Kathy; Pellicer-Sánchez, Ana
2016-01-01
With eye-tracking technology the eye is thought to give researchers a window into the mind. Importantly, eye-tracking has significant advantages over traditional online processing measures: chiefly that it allows for more "natural" processing as it does not require a secondary task, and that it provides a very rich moment-to-moment data…
ERIC Educational Resources Information Center
Galloway, Edward A.; Michalek, Gabrielle V.
1995-01-01
Discusses the conversion project of the congressional papers of Senator John Heinz into digital format and the provision of electronic access to these papers by Carnegie Mellon University. Topics include collection background, project team structure, document processing, scanning, use of optical character recognition software, verification…
ERIC Educational Resources Information Center
Fukuda, Makiko
2014-01-01
The present study revealed the dynamic process of speech development in a domestic immersion program by seven adult beginning learners of Japanese. The speech data were analyzed with fluency, accuracy, and complexity measurements at group, interindividual, and intraindividual levels. The results revealed the complex nature of language development…
Speier, William; Fried, Itzhak; Pouratian, Nader
2013-07-01
The P300 speller is a system designed to restore communication to patients with advanced neuromuscular disorders. This study was designed to explore the potential improvement from using electrocorticography (ECoG) compared to the more traditional usage of electroencephalography (EEG). We tested the P300 speller on two epilepsy patients with temporary subdural electrode arrays over the occipital and temporal lobes respectively. We then performed offline analysis to determine the accuracy and bit rate of the system and integrated spectral features into the classifier and used a natural language processing (NLP) algorithm to further improve the results. The subject with the occipital grid achieved an accuracy of 82.77% and a bit rate of 41.02, which improved to 96.31% and 49.47 respectively using a language model and spectral features. The temporal grid patient achieved an accuracy of 59.03% and a bit rate of 18.26 with an improvement to 75.81% and 27.05 respectively using a language model and spectral features. Spatial analysis of the individual electrodes showed best performance using signals generated and recorded near the occipital pole. Using ECoG and integrating language information and spectral features can improve the bit rate of a P300 speller system. This improvement is sensitive to the electrode placement and likely depends on visually evoked potentials. This study shows that there can be an improvement in BCI performance when using ECoG, but that it is sensitive to the electrode location. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Code of Federal Regulations, 2010 CFR
2010-01-01
... demonstrate that an alternative source(s) of repayment will be available in order for further processing of... Note,” has been revised so that the language will no longer be inserted as an addendum, but the... natural condition, such as drought, and (b) without action by the producer that destroys a natural wetland...
Code of Federal Regulations, 2014 CFR
2014-01-01
... demonstrate that an alternative source(s) of repayment will be available in order for further processing of... Note,” has been revised so that the language will no longer be inserted as an addendum, but the... natural condition, such as drought, and (b) without action by the producer that destroys a natural wetland...
Code of Federal Regulations, 2012 CFR
2012-01-01
... demonstrate that an alternative source(s) of repayment will be available in order for further processing of... Note,” has been revised so that the language will no longer be inserted as an addendum, but the... natural condition, such as drought, and (b) without action by the producer that destroys a natural wetland...
Code of Federal Regulations, 2011 CFR
2011-01-01
... demonstrate that an alternative source(s) of repayment will be available in order for further processing of... Note,” has been revised so that the language will no longer be inserted as an addendum, but the... natural condition, such as drought, and (b) without action by the producer that destroys a natural wetland...
Code of Federal Regulations, 2013 CFR
2013-01-01
... demonstrate that an alternative source(s) of repayment will be available in order for further processing of... Note,” has been revised so that the language will no longer be inserted as an addendum, but the... natural condition, such as drought, and (b) without action by the producer that destroys a natural wetland...
Discourse Understanding. Technical Report No. 391.
ERIC Educational Resources Information Center
Scha, R. J. H.; And Others
Artificial intelligence research on natural language understanding is discussed in this report using the notions that (1) natural language understanding systems must "see" sentences as elements whose significance resides in the contribution they make to the larger whole, and (2) a natural language understanding computer system must…
Modeling Coevolution between Language and Memory Capacity during Language Origin
Gong, Tao; Shuai, Lan
2015-01-01
Memory is essential to many cognitive tasks including language. Apart from empirical studies of memory effects on language acquisition and use, there lack sufficient evolutionary explorations on whether a high level of memory capacity is prerequisite for language and whether language origin could influence memory capacity. In line with evolutionary theories that natural selection refined language-related cognitive abilities, we advocated a coevolution scenario between language and memory capacity, which incorporated the genetic transmission of individual memory capacity, cultural transmission of idiolects, and natural and cultural selections on individual reproduction and language teaching. To illustrate the coevolution dynamics, we adopted a multi-agent computational model simulating the emergence of lexical items and simple syntax through iterated communications. Simulations showed that: along with the origin of a communal language, an initially-low memory capacity for acquired linguistic knowledge was boosted; and such coherent increase in linguistic understandability and memory capacities reflected a language-memory coevolution; and such coevolution stopped till memory capacities became sufficient for language communications. Statistical analyses revealed that the coevolution was realized mainly by natural selection based on individual communicative success in cultural transmissions. This work elaborated the biology-culture parallelism of language evolution, demonstrated the driving force of culturally-constituted factors for natural selection of individual cognitive abilities, and suggested that the degree difference in language-related cognitive abilities between humans and nonhuman animals could result from a coevolution with language. PMID:26544876
Modeling Coevolution between Language and Memory Capacity during Language Origin.
Gong, Tao; Shuai, Lan
2015-01-01
Memory is essential to many cognitive tasks including language. Apart from empirical studies of memory effects on language acquisition and use, there lack sufficient evolutionary explorations on whether a high level of memory capacity is prerequisite for language and whether language origin could influence memory capacity. In line with evolutionary theories that natural selection refined language-related cognitive abilities, we advocated a coevolution scenario between language and memory capacity, which incorporated the genetic transmission of individual memory capacity, cultural transmission of idiolects, and natural and cultural selections on individual reproduction and language teaching. To illustrate the coevolution dynamics, we adopted a multi-agent computational model simulating the emergence of lexical items and simple syntax through iterated communications. Simulations showed that: along with the origin of a communal language, an initially-low memory capacity for acquired linguistic knowledge was boosted; and such coherent increase in linguistic understandability and memory capacities reflected a language-memory coevolution; and such coevolution stopped till memory capacities became sufficient for language communications. Statistical analyses revealed that the coevolution was realized mainly by natural selection based on individual communicative success in cultural transmissions. This work elaborated the biology-culture parallelism of language evolution, demonstrated the driving force of culturally-constituted factors for natural selection of individual cognitive abilities, and suggested that the degree difference in language-related cognitive abilities between humans and nonhuman animals could result from a coevolution with language.
Linguistic Knowledge and Reasoning for Error Diagnosis and Feedback Generation.
ERIC Educational Resources Information Center
Delmonte, Rodolfo
2003-01-01
Presents four sets of natural language processing-based exercises for which error correction and feedback are produced by means of a rich database in which linguistic information is encoded either at the lexical or the grammatical level. (Author/VWL)