Sample records for statistical language modeling

  1. A Statistical-Physics Approach to Language Acquisition and Language Change

    NASA Astrophysics Data System (ADS)

    Cassandro, Marzio; Collet, Pierre; Galves, Antonio; Galves, Charlotte

    1999-02-01

    The aim of this paper is to explain why Statistical Physics can help in understanding two related linguistic questions. The first question is how to model first language acquisition by a child. The second question is how language change proceeds in time. Our approach is based on a Gibbsian model for the interface between syntax and prosody. We also present a simulated annealing model of language acquisition, which extends the Triggering Learning Algorithm recently introduced in the linguistic literature.

  2. Do neural nets learn statistical laws behind natural language?

    PubMed

    Takahashi, Shuntaro; Tanaka-Ishii, Kumiko

    2017-01-01

    The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf's law and Heaps' law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks.
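
    As an illustration of the two statistics named in this abstract, the following minimal sketch computes the inputs one would inspect for Zipf's law (word frequency by rank) and Heaps' law (vocabulary size as text length grows). The function names and the toy sentence are assumptions made for this example; they are not taken from the paper, and a real analysis would use a large corpus and a log-log fit.

```python
from collections import Counter

def rank_frequency(tokens):
    """Word frequencies sorted from most to least frequent (Zipf's law input)."""
    return sorted(Counter(tokens).values(), reverse=True)

def vocabulary_growth(tokens):
    """Number of distinct words seen after each token (Heaps' law input)."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth

# Toy text, purely illustrative.
tokens = "the cat sat on the mat and the dog sat on the log".split()
freqs = rank_frequency(tokens)      # frequency at rank 1, 2, 3, ...
growth = vocabulary_growth(tokens)  # vocabulary size after 1, 2, 3, ... tokens
```

    Plotting `freqs` against rank and `growth` against token position on log-log axes is the usual way to check whether generated text follows the two laws.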

  3. Do neural nets learn statistical laws behind natural language?

    PubMed Central

    Takahashi, Shuntaro

    2017-01-01

    The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf’s law and Heaps’ law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf’s law and Heaps’ law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks. PMID:29287076

  4. Co-occurrence statistics as a language-dependent cue for speech segmentation.

    PubMed

    Saksida, Amanda; Langus, Alan; Nespor, Marina

    2017-05-01

    To what extent can language acquisition be explained in terms of different associative learning mechanisms? It has been hypothesized that distributional regularities in spoken languages are strong enough to elicit statistical learning about dependencies among speech units. Distributional regularities could be a useful cue for word learning even without rich language-specific knowledge. However, it is not clear how strong and reliable the distributional cues are that humans might use to segment speech. We investigate cross-linguistic viability of different statistical learning strategies by analyzing child-directed speech corpora from nine languages and by modeling possible statistics-based speech segmentations. We show that languages vary as to which statistical segmentation strategies are most successful. The variability of the results can be partially explained by systematic differences between languages, such as rhythmical differences. The results confirm previous findings that different statistical learning strategies are successful in different languages and suggest that infants may have to primarily rely on non-statistical cues when they begin their process of speech segmentation. © 2016 John Wiley & Sons Ltd.

  5. The Use of a Context-Based Information Retrieval Technique

    DTIC Science & Technology

    2009-07-01

    provided in context. Latent Semantic Analysis (LSA) is a statistical technique for inferring contextual and structural information, and previous studies...WAIS). 10 DSTO-TR-2322 1.4.4 Latent Semantic Analysis LSA, which is also known as latent semantic indexing (LSI), uses a statistical and...1.4.6 Language Models In contrast, natural language models apply algorithms that combine statistical information with semantic information. Semantic

  6. A Role for Chunk Formation in Statistical Learning of Second Language Syntax

    ERIC Educational Resources Information Center

    Hamrick, Phillip

    2014-01-01

    Humans are remarkably sensitive to the statistical structure of language. However, different mechanisms have been proposed to account for such statistical sensitivities. The present study compared adult learning of syntax and the ability of two models of statistical learning to simulate human performance: Simple Recurrent Networks, which learn by…

  7. Stan: Statistical inference

    NASA Astrophysics Data System (ADS)

    Stan Development Team

    2018-01-01

    Stan facilitates statistical inference at the frontiers of applied statistics and provides both a modeling language for specifying complex statistical models and a library of statistical algorithms for computing inferences with those models. These components are exposed through interfaces in environments such as R, Python, and the command line.

  8. Improving Domain-specific Machine Translation by Constraining the Language Model

    DTIC Science & Technology

    2012-07-01

    performance. To make up for the lack of parallel training data, one assumption is that more monolingual target language data should be used in building the...target language model. Prior work on domain-specific MT has focused on training target language models with monolingual 2 domain-specific data...showed that the using a large dictionary extracted from medical domain documents in a statistical MT system to generalize the training data significantly

  9. A simple branching model that reproduces language family and language population distributions

    NASA Astrophysics Data System (ADS)

    Schwämmle, Veit; de Oliveira, Paulo Murilo Castro

    2009-07-01

    Human history leaves fingerprints in human languages. Little is known about language evolution and its study is of great importance. Here we construct a simple stochastic model and compare its results to statistical data of real languages. The model is based on the recent finding that language changes occur independently of the population size. We find agreement with the data additionally assuming that languages may be distinguished by having at least one among a finite, small number of different features. This finite set is also used in order to define the distance between two languages, similarly to linguistics tradition since Swadesh.

  10. Using Multilevel Modeling in Language Assessment Research: A Conceptual Introduction

    ERIC Educational Resources Information Center

    Barkaoui, Khaled

    2013-01-01

    This article critiques traditional single-level statistical approaches (e.g., multiple regression analysis) to examining relationships between language test scores and variables in the assessment setting. It highlights the conceptual, methodological, and statistical problems associated with these techniques in dealing with multilevel or nested…

  11. Assessing the Accuracy and Consistency of Language Proficiency Classification under Competing Measurement Models

    ERIC Educational Resources Information Center

    Zhang, Bo

    2010-01-01

    This article investigates how measurement models and statistical procedures can be applied to estimate the accuracy of proficiency classification in language testing. The paper starts with a concise introduction of four measurement models: the classical test theory (CTT) model, the dichotomous item response theory (IRT) model, the testlet response…

  12. A Large-Scale Analysis of Variance in Written Language

    ERIC Educational Resources Information Center

    Johns, Brendan T.; Jamieson, Randall K.

    2018-01-01

    The collection of very large text sources has revolutionized the study of natural language, leading to the development of several models of language learning and distributional semantics that extract sophisticated semantic representations of words based on the statistical redundancies contained within natural language (e.g., Griffiths, Steyvers,…

  13. Modeling Systematicity and Individuality in Nonlinear Second Language Development: The Case of English Grammatical Morphemes

    ERIC Educational Resources Information Center

    Murakami, Akira

    2016-01-01

    This article introduces two sophisticated statistical modeling techniques that allow researchers to analyze systematicity, individual variation, and nonlinearity in second language (L2) development. Generalized linear mixed-effects models can be used to quantify individual variation and examine systematic effects simultaneously, and generalized…

  14. Statistical learning of music- and language-like sequences and tolerance for spectral shifts.

    PubMed

    Daikoku, Tatsuya; Yatomi, Yutaka; Yumoto, Masato

    2015-02-01

    In our previous study (Daikoku, Yatomi, & Yumoto, 2014), we demonstrated that the N1m response could be a marker for the statistical learning process of pitch sequence, in which each tone was ordered by a Markov stochastic model. The aim of the present study was to investigate how the statistical learning of music- and language-like auditory sequences is reflected in the N1m responses based on the assumption that both language and music share domain generality. By using vowel sounds generated by a formant synthesizer, we devised music- and language-like auditory sequences in which higher-ordered transitional rules were embedded according to a Markov stochastic model by controlling fundamental (F0) and/or formant frequencies (F1-F2). In each sequence, F0 and/or F1-F2 were spectrally shifted in the last one-third of the tone sequence. Neuromagnetic responses to the tone sequences were recorded from 14 right-handed normal volunteers. In the music- and language-like sequences with pitch change, the N1m responses to the tones that appeared with higher transitional probability were significantly decreased compared with the responses to the tones that appeared with lower transitional probability within the first two-thirds of each sequence. Moreover, the amplitude difference was even retained within the last one-third of the sequence after the spectral shifts. However, in the language-like sequence without pitch change, no significant difference could be detected. The pitch change may facilitate the statistical learning in language and music. Statistically acquired knowledge may be appropriated to process altered auditory sequences with spectral shifts. The relative processing of spectral sequences may be a domain-general auditory mechanism that is innate to humans. Copyright © 2014 Elsevier Inc. All rights reserved.

  15. Evaluating pictogram prediction in a location-aware augmentative and alternative communication system.

    PubMed

    Garcia, Luís Filipe; de Oliveira, Luís Caldas; de Matos, David Martins

    2016-01-01

    This study compared the performance of two statistical location-aware pictogram prediction mechanisms, with an all-purpose (All) pictogram prediction mechanism, having no location knowledge. The All approach had a unique language model under all locations. One of the location-aware alternatives, the location-specific (Spec) approach, made use of specific language models for pictogram prediction in each location of interest. The other location-aware approach resulted from combining the Spec and the All approaches, and was designated the mixed approach (Mix). In this approach, the language models acquired knowledge from all locations, but a higher relevance was assigned to the vocabulary from the associated location. Results from simulations showed that the Mix and Spec approaches could only outperform the baseline in a statistically significant way if pictogram users reuse more than 50% and 75% of their sentences, respectively. Under low sentence reuse conditions there were no statistically significant differences between the location-aware approaches and the All approach. Under these conditions, the Mix approach performed better than the Spec approach in a statistically significant way.
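
    The abstract does not say how the Mix approach weights the associated location. One common way to combine a general and a specific language model is linear interpolation, sketched below under that assumption. The unigram models, the `weight` parameter, and the pictogram sequences are all hypothetical and are not the authors' implementation.

```python
from collections import Counter

def unigram_probs(symbols):
    """Maximum-likelihood unigram probabilities for a sequence of symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}

def mix_probs(p_all, p_spec, weight):
    """Linear interpolation; `weight` is the share given to the location-specific model."""
    vocab = set(p_all) | set(p_spec)
    return {
        s: weight * p_spec.get(s, 0.0) + (1.0 - weight) * p_all.get(s, 0.0)
        for s in vocab
    }

# Hypothetical pictogram logs from two locations.
kitchen = ["eat", "drink", "eat", "more"]
school = ["play", "read", "more", "read"]

p_all = unigram_probs(kitchen + school)   # "All": one model for every location
p_spec = unigram_probs(kitchen)           # "Spec": model for the current location
mixed = mix_probs(p_all, p_spec, weight=0.5)
```

    With this choice, pictograms used in the current location gain probability while pictograms seen only elsewhere remain predictable, which matches the qualitative description of the Mix approach.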

  16. Word recognition and phonetic structure acquisition: Possible relations

    NASA Astrophysics Data System (ADS)

    Morgan, James

    2002-05-01

    Several accounts of possible relations between the emergence of the mental lexicon and acquisition of native language phonological structure have been propounded. In one view, acquisition of word meanings guides infants' attention toward those contrasts that are linguistically significant in their language. In the opposing view, native language phonological categories may be acquired from statistical patterns of input speech, prior to and independent of learning at the lexical level. Here, a more interactive account will be presented, in which phonological structure is modeled as emerging consequentially from the self-organization of perceptual space underlying word recognition. A key prediction of this model is that early native language phonological categories will be highly context specific. Data bearing on this prediction will be presented which provide clues to the nature of infants' statistical analysis of input.

  17. Statistical analysis of water-quality data containing multiple detection limits: S-language software for regression on order statistics

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2005-01-01

    Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.

  18. Brain-computer interface with language model-electroencephalography fusion for locked-in syndrome.

    PubMed

    Oken, Barry S; Orhan, Umut; Roark, Brian; Erdogmus, Deniz; Fowler, Andrew; Mooney, Aimee; Peters, Betts; Miller, Meghan; Fried-Oken, Melanie B

    2014-05-01

    Some noninvasive brain-computer interface (BCI) systems are currently available for locked-in syndrome (LIS) but none have incorporated a statistical language model during text generation. The objective was to begin to address the communication needs of individuals with LIS using a noninvasive BCI that involves rapid serial visual presentation (RSVP) of symbols and a unique classifier with electroencephalography (EEG) and language model fusion. The RSVP Keyboard was developed with several unique features. Individual letters are presented at 2.5 per second. Computer classification of letters as targets or nontargets based on EEG is performed using machine learning that incorporates a language model for letter prediction via Bayesian fusion enabling targets to be presented only 1 to 4 times. Nine participants with LIS and 9 healthy controls were enrolled. After screening, subjects first calibrated the system, and then completed a series of balanced word generation mastery tasks that were designed with 5 incremental levels of difficulty, which increased by selecting phrases for which the utility of the language model decreased naturally. Six participants with LIS and 9 controls completed the experiment. All LIS participants successfully mastered spelling at level 1 and one subject achieved level 5. Six of 9 control participants achieved level 5. Individuals who have incomplete LIS may benefit from an EEG-based BCI system, which relies on EEG classification and a statistical language model. Steps to further improve the system are discussed.
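
    The Bayesian fusion mentioned in this abstract can be read as Bayes' rule applied per candidate letter: the language model supplies a prior and the EEG classifier supplies a likelihood. The sketch below shows that general idea only. The function name and all probability values are made up for illustration and do not come from the RSVP Keyboard.

```python
def bayesian_fusion(lm_prior, eeg_likelihood):
    """Posterior over letters: proportional to language-model prior times EEG likelihood."""
    unnormalized = {s: lm_prior[s] * eeg_likelihood[s] for s in lm_prior}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}

# Hypothetical values: the language model favors "a", the EEG evidence favors "b".
lm_prior = {"a": 0.5, "b": 0.3, "c": 0.2}
eeg_likelihood = {"a": 0.2, "b": 0.6, "c": 0.2}

posterior = bayesian_fusion(lm_prior, eeg_likelihood)
best_letter = max(posterior, key=posterior.get)
```

    When the language model is informative, fewer presentations are needed before one letter dominates the posterior, which is consistent with the abstract's remark that targets need be shown only 1 to 4 times.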

  19. Language learning, language use and the evolution of linguistic variation

    PubMed Central

    Perfors, Amy; Fehér, Olga; Samara, Anna; Swoboda, Kate; Wonnacott, Elizabeth

    2017-01-01

    Linguistic universals arise from the interaction between the processes of language learning and language use. A test case for the relationship between these factors is linguistic variation, which tends to be conditioned on linguistic or sociolinguistic criteria. How can we explain the scarcity of unpredictable variation in natural language, and to what extent is this property of language a straightforward reflection of biases in statistical learning? We review three strands of experimental work exploring these questions, and introduce a Bayesian model of the learning and transmission of linguistic variation along with a closely matched artificial language learning experiment with adult participants. Our results show that while the biases of language learners can potentially play a role in shaping linguistic systems, the relationship between biases of learners and the structure of languages is not straightforward. Weak biases can have strong effects on language structure as they accumulate over repeated transmission. But the opposite can also be true: strong biases can have weak or no effects. Furthermore, the use of language during interaction can reshape linguistic systems. Combining data and insights from studies of learning, transmission and use is therefore essential if we are to understand how biases in statistical learning interact with language transmission and language use to shape the structural properties of language. This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences’. PMID:27872370

  20. Language learning, language use and the evolution of linguistic variation.

    PubMed

    Smith, Kenny; Perfors, Amy; Fehér, Olga; Samara, Anna; Swoboda, Kate; Wonnacott, Elizabeth

    2017-01-05

    Linguistic universals arise from the interaction between the processes of language learning and language use. A test case for the relationship between these factors is linguistic variation, which tends to be conditioned on linguistic or sociolinguistic criteria. How can we explain the scarcity of unpredictable variation in natural language, and to what extent is this property of language a straightforward reflection of biases in statistical learning? We review three strands of experimental work exploring these questions, and introduce a Bayesian model of the learning and transmission of linguistic variation along with a closely matched artificial language learning experiment with adult participants. Our results show that while the biases of language learners can potentially play a role in shaping linguistic systems, the relationship between biases of learners and the structure of languages is not straightforward. Weak biases can have strong effects on language structure as they accumulate over repeated transmission. But the opposite can also be true: strong biases can have weak or no effects. Furthermore, the use of language during interaction can reshape linguistic systems. Combining data and insights from studies of learning, transmission and use is therefore essential if we are to understand how biases in statistical learning interact with language transmission and language use to shape the structural properties of language. This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Authors.

  1. Bayesian models: A statistical primer for ecologists

    USGS Publications Warehouse

    Hobbs, N. Thompson; Hooten, Mevin B.

    2015-01-01

    Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods, in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals. This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management. Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticians. Covers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and more. Deemphasizes computer coding in favor of basic principles. Explains how to write out properly factored statistical expressions representing Bayesian models.

  2. Combining MEDLINE and publisher data to create parallel corpora for the automatic translation of biomedical text

    PubMed Central

    2013-01-01

    Background: Most of the institutional and research information in the biomedical domain is available in the form of English text. Even in countries where English is an official language, such as the United States, language can be a barrier for accessing biomedical information for non-native speakers. Recent progress in machine translation suggests that this technique could help make English texts accessible to speakers of other languages. However, the lack of adequate specialized corpora needed to train statistical models currently limits the quality of automatic translations in the biomedical domain. Results: We show how a large-sized parallel corpus can automatically be obtained for the biomedical domain, using the MEDLINE database. The corpus generated in this work comprises article titles obtained from MEDLINE and abstract text automatically retrieved from journal websites, which substantially extends the corpora used in previous work. After assessing the quality of the corpus for two language pairs (English/French and English/Spanish) we use the Moses package to train a statistical machine translation model that outperforms previous models for automatic translation of biomedical text. Conclusions: We have built translation data sets in the biomedical domain that can easily be extended to other languages available in MEDLINE. These sets can successfully be applied to train statistical machine translation models. While further progress should be made by incorporating out-of-domain corpora and domain-specific lexicons, we believe that this work improves the automatic translation of biomedical texts. PMID:23631733

  3. LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUPERVISED WSD

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    VERSPOOR, KARIN; LIN, SHOU-DE

    An N-gram language model aims at capturing statistical syntactic word order information from corpora. Although the concept of language models has been applied extensively to handle a variety of NLP problems with reasonable success, the standard model does not incorporate semantic information, and consequently limits its applicability to semantic problems such as word sense disambiguation. We propose a framework that integrates semantic information into the language model schema, allowing a system to exploit both syntactic and semantic information to address NLP problems. Furthermore, acknowledging the limited availability of semantically annotated data, we discuss how the proposed model can be learned without annotated training examples. Finally, we report on a case study showing how the semantics-enhanced language model can be applied to unsupervised word sense disambiguation with promising results.
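
    For readers unfamiliar with the standard N-gram model this abstract starts from, here is a minimal bigram (N = 2) sketch using maximum-likelihood estimates. It covers only the plain word-order model, not the semantics-enhanced framework the authors propose. The function names, the `<s>` and `</s>` boundary markers, and the two training sentences are assumptions for illustration.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams, with sentence-boundary markers added."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sent in sentences:
        words = ["<s>"] + sent.split() + ["</s>"]
        context_counts.update(words[:-1])               # every word that has a successor
        bigram_counts.update(zip(words[:-1], words[1:]))
    return context_counts, bigram_counts

def bigram_prob(context_counts, bigram_counts, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for an unseen context."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]

sentences = ["the bank approved the loan", "the river bank flooded"]
contexts, bigrams = train_bigram(sentences)
p_bank_given_the = bigram_prob(contexts, bigrams, "the", "bank")
```

    The model assigns "bank" the same probability after "the" regardless of which sense is meant, which illustrates the limitation for word sense disambiguation that the abstract points out.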

  4. An Overview of R in Health Decision Sciences.

    PubMed

    Jalal, Hawre; Pechlivanoglou, Petros; Krijkamp, Eline; Alarid-Escudero, Fernando; Enns, Eva; Hunink, M G Myriam

    2017-10-01

    As the complexity of health decision science applications increases, high-level programming languages are increasingly adopted for statistical analyses and numerical computations. These programming languages facilitate sophisticated modeling, model documentation, and analysis reproducibility. Among the high-level programming languages, the statistical programming framework R is gaining increased recognition. R is freely available, cross-platform compatible, and open source. A large community of users who have generated an extensive collection of well-documented packages and functions supports it. These functions facilitate applications of health decision science methodology as well as the visualization and communication of results. Although R's popularity is increasing among health decision scientists, methodological extensions of R in the field of decision analysis remain isolated. The purpose of this article is to provide an overview of existing R functionality that is applicable to the various stages of decision analysis, including model design, input parameter estimation, and analysis of model outputs.

  5. Computational Modeling of Statistical Learning: Effects of Transitional Probability versus Frequency and Links to Word Learning

    ERIC Educational Resources Information Center

    Mirman, Daniel; Estes, Katharine Graf; Magnuson, James S.

    2010-01-01

    Statistical learning mechanisms play an important role in theories of language acquisition and processing. Recurrent neural network models have provided important insights into how these mechanisms might operate. We examined whether such networks capture two key findings in human statistical learning. In Simulation 1, a simple recurrent network…
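
    The contrast between transitional probability and frequency in this record's title can be made concrete with a small sketch. The forward transitional probability of a pair A, B is taken here as count(A followed by B) divided by count(A as a first element), which is the usual definition; the function name and the toy syllable stream are assumptions and are unrelated to the simulations in the paper.

```python
from collections import Counter

def pair_statistics(syllables):
    """Return raw pair frequencies and forward transitional probabilities."""
    pairs = list(zip(syllables[:-1], syllables[1:]))
    pair_freq = Counter(pairs)
    first_freq = Counter(first for first, _ in pairs)
    trans_prob = {pair: n / first_freq[pair[0]] for pair, n in pair_freq.items()}
    return pair_freq, trans_prob

# Toy stream: "pre-tty ba-by pre-tty dog-gy pre-tty ba-by"
stream = "pre tty ba by pre tty dog gy pre tty ba by".split()
pair_freq, trans_prob = pair_statistics(stream)
```

    In this stream the pair ("dog", "gy") occurs only once but has a transitional probability of 1.0, while ("tty", "ba") occurs twice with a lower transitional probability, so the two statistics can rank the same pairs differently.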

  6. Phonetic diversity, statistical learning, and acquisition of phonology.

    PubMed

    Pierrehumbert, Janet B

    2003-01-01

    In learning to perceive and produce speech, children master complex language-specific patterns. Daunting language-specific variation is found both in the segmental domain and in the domain of prosody and intonation. This article reviews the challenges posed by results in phonetic typology and sociolinguistics for the theory of language acquisition. It argues that categories are initiated bottom-up from statistical modes in use of the phonetic space, and sketches how exemplar theory can be used to model the updating of categories once they are initiated. It also argues that bottom-up initiation of categories is successful thanks to the perception-production loop operating in the speech community. The behavior of this loop means that the superficial statistical properties of speech available to the infant indirectly reflect the contrastiveness and discriminability of categories in the adult grammar. The article also argues that the developing system is refined using internal feedback from type statistics over the lexicon, once the lexicon is well-developed. The application of type statistics to a system initiated with surface statistics does not cause a fundamental reorganization of the system. Instead, it exploits confluences across levels of representation which characterize human language and make bootstrapping possible.

  7. Statistical Learning is Related to Early Literacy-Related Skills

    PubMed Central

    Spencer, Mercedes; Kaschak, Michael P.; Jones, John L.; Lonigan, Christopher J.

    2015-01-01

    It has been demonstrated that statistical learning, or the ability to use statistical information to learn the structure of one’s environment, plays a role in young children’s acquisition of linguistic knowledge. Although most research on statistical learning has focused on language acquisition processes, such as the segmentation of words from fluent speech and the learning of syntactic structure, some recent studies have explored the extent to which individual differences in statistical learning are related to literacy-relevant knowledge and skills. The present study extends on this literature by investigating the relations between two measures of statistical learning and multiple measures of skills that are critical to the development of literacy—oral language, vocabulary knowledge, and phonological processing—within a single model. Our sample included a total of 553 typically developing children from prekindergarten through second grade. Structural equation modeling revealed that statistical learning accounted for a unique portion of the variance in these literacy-related skills. Practical implications for instruction and assessment are discussed. PMID:26478658

  8. Rank Dynamics of Word Usage at Multiple Scales

    NASA Astrophysics Data System (ADS)

    Morales, José A.; Colman, Ewan; Sánchez, Sergio; Sánchez-Puig, Fernanda; Pineda, Carlos; Iñiguez, Gerardo; Cocho, Germinal; Flores, Jorge; Gershenson, Carlos

    2018-05-01

    The recent dramatic increase in online data availability has allowed researchers to explore human culture with unprecedented detail, such as the growth and diversification of language. In particular, it provides statistical tools to explore whether word use is similar across languages, and if so, whether these generic features appear at different scales of language structure. Here we use the Google Books N-grams dataset to analyze the temporal evolution of word usage in several languages. We apply measures proposed recently to study rank dynamics, such as the diversity of N-grams in a given rank, the probability that an N-gram changes rank between successive time intervals, the rank entropy, and the rank complexity. Using different methods, results show that there are generic properties for different languages at different scales, such as a core of words necessary to minimally understand a language. We also propose a null model to explore the relevance of linguistic structure across multiple scales, concluding that N-gram statistics cannot be reduced to word statistics. We expect our results to be useful in improving text prediction algorithms, as well as in shedding light on the large-scale features of language use, beyond linguistic and cultural differences across human populations.

  9. Huffman and linear scanning methods with statistical language models.

    PubMed

    Roark, Brian; Fried-Oken, Melanie; Gibbons, Chris

    2015-03-01

    Current scanning access methods for text generation in AAC devices are limited to relatively few options, most notably row/column variations within a matrix. We present Huffman scanning, a new method for applying statistical language models to binary-switch, static-grid typing AAC interfaces, and compare it to other scanning options under a variety of conditions. We present results for 16 adults without disabilities and one 36-year-old man with locked-in syndrome who presents with complex communication needs and uses AAC scanning devices for writing. Huffman scanning with a statistical language model yielded significant typing speedups for the 16 participants without disabilities versus any of the other methods tested, including two row/column scanning methods. A similar pattern of results was found with the individual with locked-in syndrome. Interestingly, faster typing speeds were obtained with Huffman scanning using a more leisurely scan rate than relatively fast individually calibrated scan rates. Overall, the results reported here demonstrate great promise for the usability of Huffman scanning as a faster alternative to row/column scanning.
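
    The idea of driving a binary-switch interface from a language model can be sketched with ordinary Huffman coding. This is a minimal illustration, not the authors' implementation: the next-letter probabilities are made up, and reading each code bit as one binary switch decision is an assumption for the sake of the example.

```python
import heapq
from itertools import count


def huffman_codes(probs):
    """Return a dict mapping each symbol to a binary code string."""
    tiebreak = count()  # unique counter so the heap never has to compare dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a lone symbol still needs a one-bit code
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing their codes with 0 and 1.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


def expected_length(probs, codes):
    """Average number of binary decisions per symbol under the given codes."""
    return sum(p * len(codes[s]) for s, p in probs.items())


# Hypothetical next-letter distribution from a language model.
next_letter_probs = {"e": 0.5, "t": 0.25, "a": 0.125, "o": 0.125}
codes = huffman_codes(next_letter_probs)
print(codes)
print(expected_length(next_letter_probs, codes))
```

    More probable letters receive shorter codes, so a sharper language model lowers the expected number of decisions per letter; a uniform distribution over these four letters would need two decisions for every letter.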

  10. A Novel Model for Predicting Rehospitalization Risk Incorporating Physical Function, Cognitive Status, and Psychosocial Support Using Natural Language Processing.

    PubMed

    Greenwald, Jeffrey L; Cronin, Patrick R; Carballo, Victoria; Danaei, Goodarz; Choy, Garry

    2017-03-01

    With the increasing focus on reducing hospital readmissions in the United States, numerous readmissions risk prediction models have been proposed, mostly developed through analyses of structured data fields in electronic medical records and administrative databases. Three areas that may have an impact on readmission but are poorly captured using structured data sources are patients' physical function, cognitive status, and psychosocial environment and support. The objective of the study was to build a discriminative model using information germane to these 3 areas to identify hospitalized patients' risk for 30-day all cause readmissions. We conducted clinician focus groups to identify language used in the clinical record regarding these 3 areas. We then created a dataset including 30,000 inpatients, 10,000 from each of 3 hospitals, and searched those records for the focus group-derived language using natural language processing. A 30-day readmission prediction model was developed on 75% of the dataset and validated on the other 25% and also on hospital specific subsets. Focus group language was aggregated into 35 variables. The final model had 16 variables, a validated C-statistic of 0.74, and was well calibrated. Subset validation of the model by hospital yielded C-statistics of 0.70-0.75. Deriving a 30-day readmission risk prediction model through identification of physical, cognitive, and psychosocial issues using natural language processing yielded a model that performs similarly to the better performing models previously published with the added advantage of being based on clinically relevant factors and also automated and scalable. Because of the clinical relevance of the variables in the model, future research may be able to test if targeting interventions to identified risks results in reductions in readmissions.
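
    The C-statistic reported above is the probability that a randomly chosen readmitted patient receives a higher predicted risk than a randomly chosen non-readmitted patient. A minimal sketch of that pairwise definition follows; the scores and outcomes are invented, and counting ties as one half is a common convention assumed here rather than something the abstract states.

```python
def c_statistic(scores, outcomes):
    """Concordance probability over all (event, non-event) pairs; ties count 0.5."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    concordant = sum(
        (e > n) + 0.5 * (e == n) for e in events for n in non_events
    )
    return concordant / (len(events) * len(non_events))


# Made-up predicted risks and 30-day readmission outcomes (1 = readmitted).
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
outcomes = [1, 0, 1, 0, 0]
print(c_statistic(scores, outcomes))
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking. This quadratic-time version is meant only to make the definition concrete.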

  11. Future perspectives - proposal for Oxford Physiome Project.

    PubMed

    Oku, Yoshitaka

    2010-01-01

    The Physiome Project is an effort to understand living creatures using an "analysis by synthesis" strategy, i.e., by reproducing their behaviors. In order to achieve its goal, sharing developed models between different computer languages and application programs to incorporate into integrated models is critical. To date, several XML-based markup languages have been developed for this purpose. However, source codes written with XML-based languages are very difficult to read and edit using text editors. An alternative way is to use an object-oriented meta-language, which can be translated to different computer languages and transplanted to different application programs. Object-oriented languages are suitable for describing structural organization by hierarchical classes and taking advantage of statistical properties to reduce the number of parameters while keeping the complexity of behaviors. Using object-oriented languages to describe each element and posting it to a public domain should be the next step to build up integrated models of the respiratory control system.

  12. Twice random, once mixed: applying mixed models to simultaneously analyze random effects of language and participants.

    PubMed

    Janssen, Dirk P

    2012-03-01

    Psychologists, psycholinguists, and other researchers using language stimuli have been struggling for more than 30 years with the problem of how to analyze experimental data that contain two crossed random effects (items and participants). The classical analysis of variance does not apply; alternatives have been proposed but have failed to catch on, and a statistically unsatisfactory procedure of using two approximations (known as F(1) and F(2)) has become the standard. A simple and elegant solution using mixed model analysis has been available for 15 years, and recent improvements in statistical software have made mixed models analysis widely available. The aim of this article is to increase the use of mixed models by giving a concise practical introduction and by giving clear directions for undertaking the analysis in the most popular statistical packages. The article also introduces the DJMIXED: add-on package for SPSS, which makes entering the models and reporting their results as straightforward as possible.
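
    For orientation, a mixed model with two crossed random effects is commonly written in the following generic textbook form. The notation is an assumption for illustration and is not taken from the article: i indexes participants, j indexes items, and x_{ij} is an experimental predictor.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_i + w_j + \varepsilon_{ij},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
w_j \sim \mathcal{N}(0, \sigma_w^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

    Here u_i and w_j are random intercepts for participant i and item j. Estimating both in a single model is what removes the need for the separate by-participant and by-item approximations mentioned above.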

  13. Linguistic steganography on Twitter: hierarchical language modeling with manual interaction

    NASA Astrophysics Data System (ADS)

    Wilson, Alex; Blunsom, Phil; Ker, Andrew D.

    2014-02-01

    This work proposes a natural language stegosystem for Twitter, modifying tweets as they are written to hide 4 bits of payload per tweet, which is a greater payload than previous systems have achieved. The system, CoverTweet, includes novel components, as well as some already developed in the literature. We believe that the task of transforming covers during embedding is equivalent to unilingual machine translation (paraphrasing), and we use this equivalence to define a distortion measure based on statistical machine translation methods. The system incorporates this measure of distortion to rank possible tweet paraphrases, using a hierarchical language model; we use human interaction as a second distortion measure to pick the best. The hierarchical language model is designed to model the specific language of the covers, which in this setting is the language of the Twitter user who is embedding. This is a change from previous work, where general-purpose language models have been used. We evaluate our system by testing the output against human judges, and show that humans are unable to distinguish stego tweets from cover tweets any better than random guessing.

  14. PharmML in Action: an Interoperable Language for Modeling and Simulation

    PubMed Central

    Bizzotto, R; Smith, G; Yvon, F; Kristensen, NR; Swat, MJ

    2017-01-01

    PharmML is an XML-based exchange format created with a focus on nonlinear mixed-effect (NLME) models used in pharmacometrics, but providing a very general framework that also allows describing mathematical and statistical models such as single-subject or nonlinear and multivariate regression models. This tutorial provides an overview of the structure of this language, brief suggestions on how to work with it, and use cases demonstrating its power and flexibility. PMID:28575551

  15. Learning the Language of Statistics: Challenges and Teaching Approaches

    ERIC Educational Resources Information Center

    Dunn, Peter K.; Carey, Michael D.; Richardson, Alice M.; McDonald, Christine

    2016-01-01

    Learning statistics requires learning the language of statistics. Statistics draws upon words from general English, mathematical English, discipline-specific English and words used primarily in statistics. This leads to many linguistic challenges in teaching statistics and the way in which the language is used in statistics creates an extra layer…

  16. Infant Statistical-Learning Ability Is Related to Real-Time Language Processing

    ERIC Educational Resources Information Center

    Lany, Jill; Shoaib, Amber; Thompson, Abbie; Estes, Katharine Graf

    2018-01-01

    Infants are adept at learning statistical regularities in artificial language materials, suggesting that the ability to learn statistical structure may support language development. Indeed, infants who perform better on statistical learning tasks tend to be more advanced in parental reports of infants' language skills. Work with adults suggests…

  17. Structural Equation Modeling: Possibilities for Language Learning Researchers

    ERIC Educational Resources Information Center

    Hancock, Gregory R.; Schoonen, Rob

    2015-01-01

    Although classical statistical techniques have been a valuable tool in second language (L2) research, L2 research questions have started to grow beyond those techniques' capabilities, and indeed are often limited by them. Questions about how complex constructs relate to each other or to constituent subskills, about longitudinal development in…

  18. A Complex Network Approach to Stylometry

    PubMed Central

    Amancio, Diego Raphael

    2015-01-01

    Statistical methods have been widely employed to study the fundamental properties of language. In recent years, methods from complex and dynamical systems proved useful to create several language models. Despite the large amount of studies devoted to represent texts with physical models, only a limited number of studies have shown how the properties of the underlying physical systems can be employed to improve the performance of natural language processing tasks. In this paper, I address this problem by devising complex networks methods that are able to improve the performance of current statistical methods. Using a fuzzy classification strategy, I show that the topological properties extracted from texts complement the traditional textual description. In several cases, the performance obtained with hybrid approaches outperformed the results obtained when only traditional or networked methods were used. Because the proposed model is generic, the framework devised here could be straightforwardly used to study similar textual applications where the topology plays a pivotal role in the description of the interacting agents. PMID:26313921

  19. Probability and Statistics in Sensor Performance Modeling

    DTIC Science & Technology

    2010-12-01

    language software program is called Environmental Awareness for Sensor and Emitter Employment. Some important numerical issues in the implementation...3 Statistical analysis for measuring sensor performance...complementary cumulative distribution function cdf cumulative distribution function DST decision-support tool EASEE Environmental Awareness of

  20. Evaluation of English Language Development Programs in the Santa Ana Unified School District. A Report on Data System Reliability and Statistical Modeling of Program Impacts.

    ERIC Educational Resources Information Center

    Mitchell, Douglas E.; Destino, Tom; Karam, Rita

    In response to concern about the effectiveness of programs for English-as-a-Second-Language students in California's schools, the Santa Ana Unified School District, in which over 80 percent of students are limited-English-proficient (LEP) conducted a study of both the operations and effectiveness of the district's language development program,…

  1. Evaluation of Theoretical and Empirical Characteristics of the Communication, Language, and Statistics Survey (CLASS)

    ERIC Educational Resources Information Center

    Wagler, Amy E.; Lesser, Lawrence M.

    2018-01-01

    The interaction between language and the learning of statistical concepts has been receiving increased attention. The Communication, Language, And Statistics Survey (CLASS) was developed in response to the need to focus on dynamics of language in light of the culturally and linguistically diverse environments of introductory statistics classrooms.…

  2. Quantum probabilistic logic programming

    NASA Astrophysics Data System (ADS)

    Balu, Radhakrishnan

    2015-05-01

    We describe a quantum mechanics based logic programming language that supports Horn clauses, random variables, and covariance matrices to express and solve problems in probabilistic logic. The Horn clauses of the language wrap random variables, including infinite valued, to express probability distributions and statistical correlations, a powerful feature to capture relationship between distributions that are not independent. The expressive power of the language is based on a mechanism to implement statistical ensembles and to solve the underlying SAT instances using quantum mechanical machinery. We exploit the fact that classical random variables have quantum decompositions to build the Horn clauses. We establish the semantics of the language in a rigorous fashion by considering an existing probabilistic logic language called PRISM with classical probability measures defined on the Herbrand base and extending it to the quantum context. In the classical case H-interpretations form the sample space and probability measures defined on them lead to consistent definition of probabilities for well formed formulae. In the quantum counterpart, we define probability amplitudes on H-interpretations facilitating the model generations and verifications via quantum mechanical superpositions and entanglements. We cast the well formed formulae of the language as quantum mechanical observables thus providing an elegant interpretation for their probabilities. We discuss several examples to combine statistical ensembles and predicates of first order logic to reason with situations involving uncertainty.

  3. A Large-Scale Analysis of Variance in Written Language.

    PubMed

    Johns, Brendan T; Jamieson, Randall K

    2018-01-22

    The collection of very large text sources has revolutionized the study of natural language, leading to the development of several models of language learning and distributional semantics that extract sophisticated semantic representations of words based on the statistical redundancies contained within natural language (e.g., Griffiths, Steyvers, & Tenenbaum; Jones & Mewhort; Landauer & Dumais; Mikolov, Sutskever, Chen, Corrado, & Dean). The models treat knowledge as an interaction of processing mechanisms and the structure of language experience. But language experience is often treated agnostically. We report a distributional semantic analysis that shows written language in fiction books varies appreciably between books from the different genres, books from the same genre, and even books written by the same author. Given that current theories assume that word knowledge reflects an interaction between processing mechanisms and the language environment, the analysis shows the need for the field to engage in a more deliberate consideration and curation of the corpora used in computational studies of natural language processing. Copyright © 2018 Cognitive Science Society, Inc.

  4. Statistical learning and language acquisition

    PubMed Central

    Romberg, Alexa R.; Saffran, Jenny R.

    2011-01-01

    Human learners, including infants, are highly sensitive to structure in their environment. Statistical learning refers to the process of extracting this structure. A major question in language acquisition in the past few decades has been the extent to which infants use statistical learning mechanisms to acquire their native language. There have been many demonstrations showing infants’ ability to extract structures in linguistic input, such as the transitional probability between adjacent elements. This paper reviews current research on how statistical learning contributes to language acquisition. Current research is extending the initial findings of infants’ sensitivity to basic statistical information in many different directions, including investigating how infants represent regularities, learn about different levels of language, and integrate information across situations. These current directions emphasize studying statistical language learning in context: within language, within the infant learner, and within the environment as a whole. PMID:21666883
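
    The transitional probability between adjacent elements mentioned above is usually taken to be the frequency of the pair XY divided by the frequency of X. A minimal sketch under that assumption follows; the syllable stream is invented in the style of artificial-language experiments and is not data from the review.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Map each adjacent pair (x, y) to count(x followed by y) / count(x as a first element)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # the final syllable starts no pair
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}


# Toy stream built from the made-up "words" bidaku, padoti, and golabu.
stream = "bi da ku pa do ti bi da ku go la bu pa do ti bi da ku".split()
tp = transitional_probabilities(stream)

print(tp[("bi", "da")])  # within-word transition
print(tp[("ku", "pa")])  # transition across a word boundary
```

    Within-word transitions in this stream have higher probability than transitions across word boundaries, which is the kind of cue a learner could use to locate boundaries.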

  5. Language acquisition and use: learning and applying probabilistic constraints.

    PubMed

    Seidenberg, M S

    1997-03-14

    What kinds of knowledge underlie the use of language and how is this knowledge acquired? Linguists equate knowing a language with knowing a grammar. Classic "poverty of the stimulus" arguments suggest that grammar identification is an intractable inductive problem and that acquisition is possible only because children possess innate knowledge of grammatical structure. An alternative view is emerging from studies of statistical and probabilistic aspects of language, connectionist models, and the learning capacities of infants. This approach emphasizes continuity between how language is acquired and how it is used. It retains the idea that innate capacities constrain language learning, but calls into question whether they include knowledge of grammatical structure.

  6. Bilinguals’ Existing Languages Benefit Vocabulary Learning in a Third Language

    PubMed Central

    Bartolotti, James; Marian, Viorica

    2017-01-01

    Learning a new language involves substantial vocabulary acquisition. Learners can accelerate this process by relying on words with native-language overlap, such as cognates. For bilingual third language learners, it is necessary to determine how their two existing languages interact during novel language learning. A scaffolding account predicts transfer from either language for individual words, whereas an accumulation account predicts cumulative transfer from both languages. To compare these accounts, twenty English-German bilingual adults were taught an artificial language containing 48 novel written words that varied orthogonally in English and German wordlikeness (neighborhood size and orthotactic probability). Wordlikeness in each language improved word production accuracy, and similarity to one language provided the same benefit as dual-language overlap. In addition, participants’ memory for novel words was affected by the statistical distributions of letters in the novel language. Results indicate that bilinguals utilize both languages during third language acquisition, supporting a scaffolding learning model. PMID:28781384

  7. Bilinguals' Existing Languages Benefit Vocabulary Learning in a Third Language.

    PubMed

    Bartolotti, James; Marian, Viorica

    2017-03-01

    Learning a new language involves substantial vocabulary acquisition. Learners can accelerate this process by relying on words with native-language overlap, such as cognates. For bilingual third language learners, it is necessary to determine how their two existing languages interact during novel language learning. A scaffolding account predicts transfer from either language for individual words, whereas an accumulation account predicts cumulative transfer from both languages. To compare these accounts, twenty English-German bilingual adults were taught an artificial language containing 48 novel written words that varied orthogonally in English and German wordlikeness (neighborhood size and orthotactic probability). Wordlikeness in each language improved word production accuracy, and similarity to one language provided the same benefit as dual-language overlap. In addition, participants' memory for novel words was affected by the statistical distributions of letters in the novel language. Results indicate that bilinguals utilize both languages during third language acquisition, supporting a scaffolding learning model.

  8. Linguistic Constraints on Statistical Word Segmentation: The Role of Consonants in Arabic and English

    ERIC Educational Resources Information Center

    Kastner, Itamar; Adriaans, Frans

    2018-01-01

    Statistical learning is often taken to lie at the heart of many cognitive tasks, including the acquisition of language. One particular task in which probabilistic models have achieved considerable success is the segmentation of speech into words. However, these models have mostly been tested against English data, and as a result little is known…

  9. Rage against the Machine: Evaluation Metrics in the 21st Century

    ERIC Educational Resources Information Center

    Yang, Charles

    2017-01-01

    I review the classic literature in generative grammar and Marr's three-level program for cognitive science to defend the Evaluation Metric as a psychological theory of language learning. Focusing on well-established facts of language variation, change, and use, I argue that optimal statistical principles embodied in Bayesian inference models are…

  10. Cognitive biases, linguistic universals, and constraint-based grammar learning.

    PubMed

    Culbertson, Jennifer; Smolensky, Paul; Wilson, Colin

    2013-07-01

    According to classical arguments, language learning is both facilitated and constrained by cognitive biases. These biases are reflected in linguistic typology (the distribution of linguistic patterns across the world's languages) and can be probed with artificial grammar experiments on child and adult learners. Beginning with a widely successful approach to typology (Optimality Theory), and adapting techniques from computational approaches to statistical learning, we develop a Bayesian model of cognitive biases and show that it accounts for the detailed pattern of results of artificial grammar experiments on noun-phrase word order (Culbertson, Smolensky, & Legendre, 2012). Our proposal has several novel properties that distinguish it from prior work in the domains of linguistic theory, computational cognitive science, and machine learning. This study illustrates how ideas from these domains can be synthesized into a model of language learning in which biases range in strength from hard (absolute) to soft (statistical), and in which language-specific and domain-general biases combine to account for data from the macro-level scale of typological distribution to the micro-level scale of learning by individuals. Copyright © 2013 Cognitive Science Society, Inc.

  11. The Impact of Language Experience on Language and Reading: A Statistical Learning Approach

    ERIC Educational Resources Information Center

    Seidenberg, Mark S.; MacDonald, Maryellen C.

    2018-01-01

    This article reviews the important role of statistical learning for language and reading development. Although statistical learning--the unconscious encoding of patterns in language input--has become widely known as a force in infants' early interpretation of speech, the role of this kind of learning for language and reading comprehension in…

  12. Translation of shuttle operations simulation from GPSS 2 to GPSS 1100

    NASA Technical Reports Server (NTRS)

    Marshall, A. J.

    1972-01-01

    A method has been developed which enables a programmer to convert the General Purpose Systems Simulator (GPSS) 2 simulation language into the GPSS 1100 language. To accomplish the conversion, a translator deck is used in addition to hand changes made by the analyst after translation. The conversion of a particular GPSS 2 program used at the Marshall Space Flight Center (MSFC) is reported and major changes required for compatibility of the two languages are summarized. Validation of the GPSS 1100 model was completed by comparing the results of the GPSS 2 statistics to the converted 1100 model.

  13. Using stochastic language models (SLM) to map lexical, syntactic, and phonological information processing in the brain.

    PubMed

    Lopopolo, Alessandro; Frank, Stefan L; van den Bosch, Antal; Willems, Roel M

    2017-01-01

    Language comprehension involves the simultaneous processing of information at the phonological, syntactic, and lexical level. We track these three distinct streams of information in the brain by using stochastic measures derived from computational language models to detect neural correlates of phoneme, part-of-speech, and word processing in an fMRI experiment. Probabilistic language models have proven to be useful tools for studying how language is processed as a sequence of symbols unfolding in time. Conditional probabilities between sequences of words are at the basis of probabilistic measures such as surprisal and perplexity which have been successfully used as predictors of several behavioural and neural correlates of sentence processing. Here we computed perplexity from sequences of words and their parts of speech, and their phonemic transcriptions. Brain activity time-locked to each word is regressed on the three model-derived measures. We observe that the brain keeps track of the statistical structure of lexical, syntactic and phonological information in distinct areas.
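
    Surprisal and perplexity can be made concrete with a short sketch. It assumes the standard definitions: surprisal is the negative base-2 logarithm of a unit's conditional probability, and perplexity is 2 raised to the mean surprisal of a sequence. The probabilities are invented, and the study's exact computation may differ.

```python
import math


def surprisal(prob):
    """Surprisal in bits of an event with conditional probability prob."""
    return -math.log2(prob)


def perplexity(probs):
    """Perplexity of a sequence given the model's conditional probability for each unit."""
    mean_surprisal = sum(surprisal(p) for p in probs) / len(probs)
    return 2 ** mean_surprisal


# Hypothetical conditional probabilities a language model assigns to four successive words.
word_probs = [0.5, 0.25, 0.125, 0.125]

print([surprisal(p) for p in word_probs])
print(perplexity(word_probs))
```

    The same two functions apply unchanged to part-of-speech tags or phonemes once a model supplies their conditional probabilities, which is how a single measure can be computed over the three streams described above.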

  14. Quantization, Frobenius and Bi algebras from the Categorical Framework of Quantum Mechanics to Natural Language Semantics

    NASA Astrophysics Data System (ADS)

    Sadrzadeh, Mehrnoosh

    2017-07-01

    Compact Closed categories and Frobenius and Bi algebras have been applied to model and reason about Quantum protocols. The same constructions have also been applied to reason about natural language semantics under the name "categorical distributional compositional" semantics, or in short, the "DisCoCat" model. This model combines the statistical vector models of word meaning with the compositional models of grammatical structure. It has been applied to natural language tasks such as disambiguation, paraphrasing and entailment of phrases and sentences. The passage from the grammatical structure to vectors is provided by a functor, similar to the Quantization functor of Quantum Field Theory. The original DisCoCat model only used compact closed categories. Later, Frobenius algebras were added to it to model long distance dependencies such as relative pronouns. Recently, bialgebras have been added to the pack to reason about quantifiers. This paper reviews these constructions and their application to natural language semantics. We go over the theory and present some of the core experimental results.

  15. Can Counter-Gang Models be Applied to Counter ISIS’s Internet Recruitment Campaign

    DTIC Science & Technology

    2016-06-10

    limitation that exists is the lack of reliable statistics from social media companies in regards to the quantity of ISIS-affiliated sites, which exist on... statistics, they have approximately 320-million monthly active users with thirty-five-plus languages supported and 77 percent of accounts located...Justice and Delinquency Prevention program. For deterrence-based models, the primary point of research is focused deterrence models with emphasis placed

  16. RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language

    PubMed Central

    Höhna, Sebastian; Landis, Michael J.

    2016-01-01

    Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com. [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.] PMID:27235697

  17. RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language.

    PubMed

    Höhna, Sebastian; Landis, Michael J; Heath, Tracy A; Boussau, Bastien; Lartillot, Nicolas; Moore, Brian R; Huelsenbeck, John P; Ronquist, Fredrik

    2016-07-01

    Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  18. Efficient Embedded Decoding of Neural Network Language Models in a Machine Translation System.

    PubMed

    Zamora-Martinez, Francisco; Castro-Bleda, Maria Jose

    2018-02-22

    Neural Network Language Models (NNLMs) are a successful approach to Natural Language Processing tasks, such as Machine Translation. We introduce in this work a Statistical Machine Translation (SMT) system which fully integrates NNLMs in the decoding stage, breaking the traditional approach based on N-best list rescoring. The neural net models (both language models (LMs) and translation models) are fully coupled in the decoding stage, allowing them to influence the translation quality more strongly. Computational issues were solved by using a novel idea based on memorization and smoothing of the softmax constants to avoid their computation, which introduces a trade-off between LM quality and computational cost. These ideas were studied in a machine translation task with different combinations of neural networks used both as translation models and as target LMs, comparing phrase-based and N-gram-based systems, showing that the integrated approach seems more promising for N-gram-based systems, even with nonfull-quality NNLMs.

  19. Statistical Models for Linguistic Variation in Online Media

    ERIC Educational Resources Information Center

    Kulkarni, Vivek

    2017-01-01

    Language on the Internet and social media varies due to time, geography, and social factors. For example, consider an online chat forum where people from different regions across the world interact. In such scenarios, it is important to track and detect regional variation in language. A person from the UK, who is in conversation with someone from…

  20. Modeling Statistical Insensitivity: Sources of Suboptimal Behavior

    ERIC Educational Resources Information Center

    Gagliardi, Annie; Feldman, Naomi H.; Lidz, Jeffrey

    2017-01-01

    Children acquiring languages with noun classes (grammatical gender) have ample statistical information available that characterizes the distribution of nouns into these classes, but their use of this information to classify novel nouns differs from the predictions made by an optimal Bayesian classifier. We use rational analysis to investigate the…

  1. What's statistical about learning? Insights from modelling statistical learning as a set of memory processes

    PubMed Central

    2017-01-01

    Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274, 1926–1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105, 2745–2750; Thiessen & Yee 2010 Child Development 81, 1287–1303; Saffran 2002 Journal of Memory and Language 47, 172–196; Misyak & Christiansen 2012 Language Learning 62, 302–331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39, 246–263; Thiessen et al. 2013 Psychological Bulletin 139, 792–814). 
From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik 2013 Cognitive Science 37, 310–343). This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences'. PMID:27872374

  2. What's statistical about learning? Insights from modelling statistical learning as a set of memory processes.

    PubMed

    Thiessen, Erik D

    2017-01-05

    Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274, 1926-1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105, 2745-2750; Thiessen & Yee 2010 Child Development 81, 1287-1303; Saffran 2002 Journal of Memory and Language 47, 172-196; Misyak & Christiansen 2012 Language Learning 62, 302-331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39, 246-263; Thiessen et al. 2013 Psychological Bulletin 139, 792-814). 
From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik 2013 Cognitive Science 37, 310-343). This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Author(s).

  3. Statistical Learning in a Natural Language by 8-Month-Old Infants

    PubMed Central

    Pelucchi, Bruna; Hay, Jessica F.; Saffran, Jenny R.

    2013-01-01

    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants’ ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition. PMID:19489896

  4. Statistical learning in a natural language by 8-month-old infants.

    PubMed

    Pelucchi, Bruna; Hay, Jessica F; Saffran, Jenny R

    2009-01-01

    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants' ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition.
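
    The transitional probability cue mentioned in this abstract is commonly defined as TP(x -> y) = count(xy) / count(x), the chance that syllable y follows syllable x. The following is a minimal sketch of that computation, not the authors' procedure; the syllable stream and the made-up "words" in it are hypothetical and chosen only for illustration.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Return {(x, y): P(y | x)} estimated from adjacent pairs in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor, so that the
    # probabilities of all continuations of x sum to 1.
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}

# Hypothetical stream built from two invented "words": "bi-da" and "ku-po-la".
stream = "bi da ku po la bi da bi da ku po la".split()
tp = transitional_probabilities(stream)

print(tp[("bi", "da")])  # transition inside a "word"
print(tp[("da", "ku")])  # transition across a "word" boundary
```

    In this toy stream the within-word transition has a higher probability than the transition that crosses a word boundary, which is the kind of contrast a learner could use to locate word boundaries.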

  5. Use of microcomputers for planning and managing silviculture habitat relationships.

    Treesearch

    B.G. Marcot; R.S. McNay; R.E. Page

    1988-01-01

    Microcomputers aid in monitoring, modeling, and decision support for integrating objectives of silviculture and wildlife habitat management. Spreadsheets, data bases, statistics, and graphics programs are described for use in monitoring. Stand growth models, modeling languages, area and geobased information systems, and optimization models are discussed for use in...

  6. System analysis for the Huntsville Operational Support Center distributed computer system

    NASA Technical Reports Server (NTRS)

    Ingels, E. M.

    1983-01-01

    A simulation model was developed and programmed in three languages: BASIC, PASCAL, and SLAM. Two of the programs are included in this report, the BASIC and the PASCAL language programs. SLAM is not supported by NASA/MSFC facilities and hence was not included. The statistical comparisons of simulations of the same HOSC system configurations are in good agreement with one another and with the operational statistics of HOSC that were obtained. Three variations of the most recent HOSC configuration were run and some conclusions drawn as to the system performance under these variations.

  7. PharmML in Action: an Interoperable Language for Modeling and Simulation.

    PubMed

    Bizzotto, R; Comets, E; Smith, G; Yvon, F; Kristensen, N R; Swat, M J

    2017-10-01

    PharmML is an XML-based exchange format created with a focus on nonlinear mixed-effect (NLME) models used in pharmacometrics, but providing a very general framework that also allows describing mathematical and statistical models such as single-subject or nonlinear and multivariate regression models. This tutorial provides an overview of the structure of this language, brief suggestions on how to work with it, and use cases demonstrating its power and flexibility. © 2017 The Authors CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.

  8. Survey of Native English Speakers and Spanish-Speaking English Language Learners in Tertiary Introductory Statistics

    ERIC Educational Resources Information Center

    Lesser, Lawrence M.; Wagler, Amy E.; Esquinca, Alberto; Valenzuela, M. Guadalupe

    2013-01-01

    The framework of linguistic register and case study research on Spanish-speaking English language learners (ELLs) learning statistics informed the construction of a quantitative instrument, the Communication, Language, And Statistics Survey (CLASS). CLASS aims to assess whether ELLs and non-ELLs approach the learning of statistics differently with…

  9. Infants with Williams syndrome detect statistical regularities in continuous speech.

    PubMed

    Cashon, Cara H; Ha, Oh-Ryeong; Graf Estes, Katharine; Saffran, Jenny R; Mervis, Carolyn B

    2016-09-01

    Williams syndrome (WS) is a rare genetic disorder associated with delays in language and cognitive development. The reasons for the language delay are unknown. Statistical learning is a domain-general mechanism recruited for early language acquisition. In the present study, we investigated whether infants with WS were able to detect the statistical structure in continuous speech. Eighteen 8- to 20-month-olds with WS were familiarized with 2min of a continuous stream of synthesized nonsense words; the statistical structure of the speech was the only cue to word boundaries. They were tested on their ability to discriminate statistically-defined "words" and "part-words" (which crossed word boundaries) in the artificial language. Despite significant cognitive and language delays, infants with WS were able to detect the statistical regularities in the speech stream. These findings suggest that an inability to track the statistical properties of speech is unlikely to be the primary basis for the delays in the onset of language observed in infants with WS. These results provide the first evidence of statistical learning by infants with developmental delays. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. What is India speaking? Exploring the "Hinglish" invasion

    NASA Astrophysics Data System (ADS)

    Parshad, Rana D.; Bhowmick, Suman; Chand, Vineeta; Kumari, Nitu; Sinha, Neha

    2016-05-01

    Language competition models help understand language shift dynamics, and have effectively captured how English has outcompeted various local languages, such as Scottish Gaelic in Scotland, and Mandarin in Singapore. India, with 125 million English speakers, boasts the second largest number of English speakers in the world, after the United States. The 1961-2001 Indian censuses report a sharp increase in Hindi/English Bilinguals, suggesting that English is on the rise in India. To the contrary, we claim, supported by field evidence, that these statistics are inaccurate, ignoring an emerging class who do not have full bilingual competence and switch between Hindi and English, communicating via a code popularly known as "Hinglish". Since current language competition models occlude hybrid practices and detailed local ecological factors, they are inappropriate to capture the current language dynamics in India. Expanding predator-prey and sociolinguistic theories, we draw on local Indian ecological factors to develop a novel three-species model of interaction between Monolingual Hindi speakers, Hindi/English Bilinguals and Hinglish speakers, and explore the long-time dynamics it predicts. The model also exhibits Turing instability, which is the first pattern formation result in language dynamics. These results challenge traditional assumptions of English encroachment in India. More broadly, the three-species model introduced here is a first step towards modeling the dynamics of hybrid language scenarios in other settings across the world.

  11. Value-Added Predictors of Expressive and Receptive Language Growth in Initially Nonverbal Preschoolers with Autism Spectrum Disorders

    ERIC Educational Resources Information Center

    Yoder, Paul; Watson, Linda R.; Lambert, Warren

    2015-01-01

    Eighty-seven preschoolers with autism spectrum disorders who were initially nonverbal (under 6 words in language sample and under 21 parent-reported words said) were assessed at five time points over 16 months. Statistical models that accounted for the intercorrelation among nine theoretically- and empirically-motivated predictors, as well as two…

  12. Statistics and Style. Mathematical Linguistics and Automatic Language Processing No. 6.

    ERIC Educational Resources Information Center

    Dolezel, Lubomir, Ed.; Bailey, Richard W., Ed.

    This collection of 17 articles concerning the application of mathematical models and techniques to the study of literary style is an attempt to overcome the communication barriers that exist between scholars in the various fields that find their meeting ground in statistical stylistics. The articles selected were chosen to represent the best…

  13. Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models

    DTIC Science & Technology

    2009-01-01

    4 Monolingually-Derived Phrasal Paraphrase Generation for Statistical Machine Translation; 4.1...; 4.4 Spanish-English (S2E) results; 4.5 Gains from using larger monolingual corpora for...; 4.2 Visual example of a phrasal distributional profile; 4.3 Monolingual corpus-based distributional

  14. Second Language Experience Facilitates Statistical Learning of Novel Linguistic Materials.

    PubMed

    Potter, Christine E; Wang, Tianlin; Saffran, Jenny R

    2017-04-01

    Recent research has begun to explore individual differences in statistical learning, and how those differences may be related to other cognitive abilities, particularly their effects on language learning. In this research, we explored a different type of relationship between language learning and statistical learning: the possibility that learning a new language may also influence statistical learning by changing the regularities to which learners are sensitive. We tested two groups of participants, Mandarin Learners and Naïve Controls, at two time points, 6 months apart. At each time point, participants performed two different statistical learning tasks: an artificial tonal language statistical learning task and a visual statistical learning task. Only the Mandarin-learning group showed significant improvement on the linguistic task, whereas both groups improved equally on the visual task. These results support the view that there are multiple influences on statistical learning. Domain-relevant experiences may affect the regularities that learners can discover when presented with novel stimuli. Copyright © 2016 Cognitive Science Society, Inc.

  15. Second language experience facilitates statistical learning of novel linguistic materials

    PubMed Central

    Potter, Christine E.; Wang, Tianlin; Saffran, Jenny R.

    2016-01-01

    Recent research has begun to explore individual differences in statistical learning, and how those differences may be related to other cognitive abilities, particularly their effects on language learning. In the present research, we explored a different type of relationship between language learning and statistical learning: the possibility that learning a new language may also influence statistical learning by changing the regularities to which learners are sensitive. We tested two groups of participants, Mandarin Learners and Naïve Controls, at two time points, six months apart. At each time point, participants performed two different statistical learning tasks: an artificial tonal language statistical learning task and a visual statistical learning task. Only the Mandarin-learning group showed significant improvement on the linguistic task, while both groups improved equally on the visual task. These results support the view that there are multiple influences on statistical learning. Domain-relevant experiences may affect the regularities that learners can discover when presented with novel stimuli. PMID:27988939

  16. Statistical Learning and Language: An Individual Differences Study

    ERIC Educational Resources Information Center

    Misyak, Jennifer B.; Christiansen, Morten H.

    2012-01-01

    Although statistical learning and language have been assumed to be intertwined, this theoretical presupposition has rarely been tested empirically. The present study investigates the relationship between statistical learning and language using a within-subject design embedded in an individual-differences framework. Participants were administered…

  17. Focal colors across languages are representative members of color categories.

    PubMed

    Abbott, Joshua T; Griffiths, Thomas L; Regier, Terry

    2016-10-04

    Focal colors, or best examples of color terms, have traditionally been viewed as either the underlying source of cross-language color-naming universals or derived from category boundaries that vary widely across languages. Existing data partially support and partially challenge each of these views. Here, we advance a position that synthesizes aspects of these two traditionally opposed positions and accounts for existing data. We do so by linking this debate to more general principles. We show that best examples of named color categories across 112 languages are well-predicted from category extensions by a statistical model of how representative a sample is of a distribution, independently shown to account for patterns of human inference. This model accounts for both universal tendencies and variation in focal colors across languages. We conclude that categorization in the contested semantic domain of color may be governed by principles that apply more broadly in cognition and that these principles clarify the interplay of universal and language-specific forces in color naming.

  18. Focal colors across languages are representative members of color categories

    PubMed Central

    Abbott, Joshua T.; Griffiths, Thomas L.; Regier, Terry

    2016-01-01

    Focal colors, or best examples of color terms, have traditionally been viewed as either the underlying source of cross-language color-naming universals or derived from category boundaries that vary widely across languages. Existing data partially support and partially challenge each of these views. Here, we advance a position that synthesizes aspects of these two traditionally opposed positions and accounts for existing data. We do so by linking this debate to more general principles. We show that best examples of named color categories across 112 languages are well-predicted from category extensions by a statistical model of how representative a sample is of a distribution, independently shown to account for patterns of human inference. This model accounts for both universal tendencies and variation in focal colors across languages. We conclude that categorization in the contested semantic domain of color may be governed by principles that apply more broadly in cognition and that these principles clarify the interplay of universal and language-specific forces in color naming. PMID:27647896

  19. Generating action descriptions from statistically integrated representations of human motions and sentences.

    PubMed

    Takano, Wataru; Kusajima, Ikuo; Nakamura, Yoshihiko

    2016-08-01

    It is desirable for robots to be able to linguistically understand human actions during human-robot interactions. Previous research has developed frameworks for encoding human full body motion into model parameters and for classifying motion into specific categories. For full understanding, the motion categories need to be connected to the natural language such that the robots can interpret human motions as linguistic expressions. This paper proposes a novel framework for integrating observation of human motion with that of natural language. This framework consists of two models; the first model statistically learns the relations between motions and their relevant words, and the second statistically learns sentence structures as word n-grams. Integration of these two models allows robots to generate sentences from human motions by searching for words relevant to the motion using the first model and then arranging these words in appropriate order using the second model. This allows making sentences that are the most likely to be generated from the motion. The proposed framework was tested on human full body motion measured by an optical motion capture system. In this, descriptive sentences were manually attached to the motions, and the validity of the system was demonstrated. Copyright © 2016 Elsevier Ltd. All rights reserved.
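
    The second model in this abstract arranges candidate words into a sentence using word n-grams. A minimal sketch of that idea follows, assuming a bigram model with add-one smoothing and an exhaustive search over permutations. The corpus, the function names, and the search strategy are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from itertools import permutations
import math

def train_bigrams(sentences):
    """Count bigrams and their left contexts, with sentence boundary markers."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
        vocab.update(tokens[1:])  # every token that can follow a context
    return contexts, bigrams, len(vocab)

def log_prob(words, contexts, bigrams, vocab_size, alpha=1.0):
    """Smoothed bigram log-probability of a word sequence."""
    padded = ["<s>"] + list(words) + ["</s>"]
    return sum(
        math.log((bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * vocab_size))
        for prev, cur in zip(padded, padded[1:])
    )

def best_order(words, contexts, bigrams, vocab_size):
    """Try every ordering (feasible only for a handful of words) and keep the likeliest."""
    return max(permutations(words),
               key=lambda p: log_prob(p, contexts, bigrams, vocab_size))

# Hypothetical training sentences standing in for manually attached descriptions.
corpus = ["a person walks", "a person runs", "a person waves a hand"]
contexts, bigrams, vocab_size = train_bigrams(corpus)

# Words assumed to have been proposed by a motion-to-word model, in arbitrary order.
ordered = best_order(["walks", "person", "a"], contexts, bigrams, vocab_size)
print(" ".join(ordered))
```

    Exhaustive permutation search grows factorially with the number of words, so a practical system would need a more efficient search; it is used here only to keep the sketch short.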

  20. Modeling Coevolution between Language and Memory Capacity during Language Origin

    PubMed Central

    Gong, Tao; Shuai, Lan

    2015-01-01

    Memory is essential to many cognitive tasks including language. Apart from empirical studies of memory effects on language acquisition and use, there is a lack of evolutionary explorations of whether a high level of memory capacity is a prerequisite for language and whether language origin could influence memory capacity. In line with evolutionary theories that natural selection refined language-related cognitive abilities, we advocated a coevolution scenario between language and memory capacity, which incorporated the genetic transmission of individual memory capacity, cultural transmission of idiolects, and natural and cultural selections on individual reproduction and language teaching. To illustrate the coevolution dynamics, we adopted a multi-agent computational model simulating the emergence of lexical items and simple syntax through iterated communications. Simulations showed that: along with the origin of a communal language, an initially-low memory capacity for acquired linguistic knowledge was boosted; such coherent increase in linguistic understandability and memory capacities reflected a language-memory coevolution; and such coevolution stopped once memory capacities became sufficient for language communications. Statistical analyses revealed that the coevolution was realized mainly by natural selection based on individual communicative success in cultural transmissions. This work elaborated the biology-culture parallelism of language evolution, demonstrated the driving force of culturally-constituted factors for natural selection of individual cognitive abilities, and suggested that the degree difference in language-related cognitive abilities between humans and nonhuman animals could result from a coevolution with language. PMID:26544876

  1. Modeling Coevolution between Language and Memory Capacity during Language Origin.

    PubMed

    Gong, Tao; Shuai, Lan

    2015-01-01

    Memory is essential to many cognitive tasks including language. Apart from empirical studies of memory effects on language acquisition and use, there is a lack of evolutionary explorations of whether a high level of memory capacity is a prerequisite for language and whether language origin could influence memory capacity. In line with evolutionary theories that natural selection refined language-related cognitive abilities, we advocated a coevolution scenario between language and memory capacity, which incorporated the genetic transmission of individual memory capacity, cultural transmission of idiolects, and natural and cultural selections on individual reproduction and language teaching. To illustrate the coevolution dynamics, we adopted a multi-agent computational model simulating the emergence of lexical items and simple syntax through iterated communications. Simulations showed that: along with the origin of a communal language, an initially-low memory capacity for acquired linguistic knowledge was boosted; such coherent increase in linguistic understandability and memory capacities reflected a language-memory coevolution; and such coevolution stopped once memory capacities became sufficient for language communications. Statistical analyses revealed that the coevolution was realized mainly by natural selection based on individual communicative success in cultural transmissions. This work elaborated the biology-culture parallelism of language evolution, demonstrated the driving force of culturally-constituted factors for natural selection of individual cognitive abilities, and suggested that the degree difference in language-related cognitive abilities between humans and nonhuman animals could result from a coevolution with language.

  2. Second Language Experience Facilitates Statistical Learning of Novel Linguistic Materials

    ERIC Educational Resources Information Center

    Potter, Christine E.; Wang, Tianlin; Saffran, Jenny R.

    2017-01-01

    Recent research has begun to explore individual differences in statistical learning, and how those differences may be related to other cognitive abilities, particularly their effects on language learning. In this research, we explored a different type of relationship between language learning and statistical learning: the possibility that learning…

  3. Automated Vocal Analysis of Children with Hearing Loss and Their Typical and Atypical Peers

    PubMed Central

    VanDam, Mark; Oller, D. Kimbrough; Ambrose, Sophie E.; Gray, Sharmistha; Richards, Jeffrey A.; Xu, Dongxin; Gilkerson, Jill; Silbert, Noah H.; Moeller, Mary Pat

    2014-01-01

    Objectives: This study investigated automatic assessment of vocal development in children with hearing loss as compared with children who are typically developing, have language delays, and autism spectrum disorder. Statistical models are examined for performance in a classification model and to predict age within the four groups of children. Design: The vocal analysis system analyzed over 1900 whole-day, naturalistic acoustic recordings from 273 toddlers and preschoolers comprising children who were typically developing, hard of hearing, language delayed, or autistic. Results: Samples from children who were hard-of-hearing patterned more similarly to those of typically-developing children than to the language-delayed or autistic samples. The statistical models were able to classify children from the four groups examined and estimate developmental age based on automated vocal analysis. Conclusions: This work shows a broad similarity between children with hearing loss and typically developing children, although children with hearing loss show some delay in their production of speech. Automatic acoustic analysis can now be used to quantitatively compare vocal development in children with and without speech-related disorders. The work may serve to better distinguish among various developmental disorders and ultimately contribute to improved intervention. PMID:25587667

  4. Learning a generative probabilistic grammar of experience: a process-level model of language acquisition.

    PubMed

    Kolodny, Oren; Lotem, Arnon; Edelman, Shimon

    2015-03-01

    We introduce a set of biologically and computationally motivated design choices for modeling the learning of language, or of other types of sequential, hierarchically structured experience and behavior, and describe an implemented system that conforms to these choices and is capable of unsupervised learning from raw natural-language corpora. Given a stream of linguistic input, our model incrementally learns a grammar that captures its statistical patterns, which can then be used to parse or generate new data. The grammar constructed in this manner takes the form of a directed weighted graph, whose nodes are recursively (hierarchically) defined patterns over the elements of the input stream. We evaluated the model in seventeen experiments, grouped into five studies, which examined, respectively, (a) the generative ability of grammar learned from a corpus of natural language, (b) the characteristics of the learned representation, (c) sequence segmentation and chunking, (d) artificial grammar learning, and (e) certain types of structure dependence. The model's performance largely vindicates our design choices, suggesting that progress in modeling language acquisition can be made on a broad front, ranging from issues of generativity to the replication of human experimental findings, by bringing biological and computational considerations, as well as lessons from prior efforts, to bear on the modeling approach. Copyright © 2014 Cognitive Science Society, Inc.

  5. Adaptive Communication: Languages with More Non-Native Speakers Tend to Have Fewer Word Forms

    PubMed Central

    Bentz, Christian; Verkerk, Annemarie; Kiela, Douwe; Hill, Felix; Buttery, Paula

    2015-01-01

    Explaining the diversity of languages across the world is one of the central aims of typological, historical, and evolutionary linguistics. We consider the effect of language contact (the number of non-native speakers a language has) on the way languages change and evolve. By analysing hundreds of languages within and across language families, regions, and text types, we show that languages with greater levels of contact typically employ fewer word forms to encode the same information content (a property we refer to as lexical diversity). Based on three types of statistical analyses, we demonstrate that this variance can in part be explained by the impact of non-native speakers on information encoding strategies. Finally, we argue that languages are information encoding systems shaped by the varying needs of their speakers. Language evolution and change should be modeled as the co-evolution of multiple intertwined adaptive systems: On one hand, the structure of human societies and human learning capabilities, and on the other, the structure of language. PMID:26083380
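
    One simple proxy for the number of distinct word forms used to express a given content is the type-token ratio, the number of distinct forms divided by the total number of words. The sketch below illustrates that proxy only. It is an assumption that this is a reasonable stand-in, it is not necessarily the measure used in the study, and the two toy texts are invented.

```python
def type_token_ratio(text):
    """Distinct word forms (types) divided by total word count (tokens)."""
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text contains no tokens")
    return len(set(tokens)) / len(tokens)

# Hypothetical toy texts of equal length; the second uses more inflected forms.
fewer_forms = "the dog saw the dog and the cat saw the cat"
more_forms = "the dog sees the dogs and the cats see the cat"

print(type_token_ratio(fewer_forms))
print(type_token_ratio(more_forms))
```

    The type-token ratio depends strongly on text length, so it is only meaningful when the texts being compared are of similar size, as they are here.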

  6. Robust model selection and the statistical classification of languages

    NASA Astrophysics Data System (ADS)

    García, J. E.; González-López, V. A.; Viola, M. L. L.

    2012-10-01

    In this paper we address the problem of model selection for the set of finite memory stochastic processes with finite alphabet, when the data is contaminated. We consider m independent samples, with more than half of them being realizations of the same stochastic process with law Q, which is the one we want to retrieve. We devise a model selection procedure such that for a sample size large enough, the selected process is the one with law Q. Our model selection strategy is based on estimating relative entropies to select a subset of samples that are realizations of the same law. Although the procedure is valid for any family of finite order Markov models, we will focus on the family of variable length Markov chain models, which includes the fixed order Markov chain model family. We define the asymptotic breakdown point (ABDP) for a model selection procedure, and we show the ABDP for our procedure. This means that if the proportion of contaminated samples is smaller than the ABDP, then, as the sample size grows, our procedure selects a model for the process with law Q. We also use our procedure in a setting where we have one sample formed by the concatenation of sub-samples of two or more stochastic processes, with most of the subsamples having law Q. We conducted a simulation study. In the application section we address the question of the statistical classification of languages according to their rhythmic features using speech samples. This is an important open problem in phonology. A persistent difficulty in this problem is that the speech samples correspond to several sentences produced by diverse speakers, corresponding to a mixture of distributions. The usual procedure to deal with this problem has been to choose a subset of the original sample which seems to best represent each language. The selection is made by listening to the samples. In our application we use the full dataset without any preselection of samples. We apply our robust methodology, estimating a model that represents the main law for each language. Our findings agree with the linguistic conjecture related to the rhythm of the languages included in our dataset.
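
    The idea of using estimated relative entropies to pick out the samples that share a common law can be illustrated with a small sketch. This is not the authors' procedure: it assumes first-order Markov chains rather than variable length Markov chains, uses add-alpha smoothing, and the helper names (transition_probs, relative_entropy, majority_samples) and the toy samples are hypothetical.

```python
import math
from collections import Counter
from statistics import median

def transition_probs(seq, alphabet, alpha=0.5):
    """Smoothed first-order transition probabilities P(b | a) for one sample."""
    counts = Counter(zip(seq, seq[1:]))
    probs = {}
    for a in alphabet:
        total = sum(counts[(a, b)] for b in alphabet) + alpha * len(alphabet)
        for b in alphabet:
            probs[(a, b)] = (counts[(a, b)] + alpha) / total
    return probs

def relative_entropy(p, q, alphabet):
    """KL divergence between conditional laws, averaged uniformly over contexts."""
    total = sum(p[(a, b)] * math.log(p[(a, b)] / q[(a, b)])
                for a in alphabet for b in alphabet)
    return total / len(alphabet)

def majority_samples(samples, alphabet):
    """Return indices of the samples that look like realizations of one common law."""
    models = [transition_probs(s, alphabet) for s in samples]
    m = len(models)

    def dist(i, j):
        # Symmetrized relative entropy between the estimated laws of samples i and j.
        return (relative_entropy(models[i], models[j], alphabet)
                + relative_entropy(models[j], models[i], alphabet))

    # The sample with the smallest median distance to the others is the reference.
    med = [median(dist(i, j) for j in range(m) if j != i) for i in range(m)]
    ref = min(range(m), key=med.__getitem__)
    # Keep a strict majority: the m // 2 + 1 samples closest to the reference.
    ranked = sorted(range(m), key=lambda j: dist(ref, j))
    return sorted(ranked[: m // 2 + 1])

# Hypothetical data: three samples alternate "ab"; two are "contaminated".
samples = ["ab" * 50, "ab" * 40, "ab" * 60, "aabb" * 25, "a" * 50 + "b" * 50]
kept = majority_samples(samples, "ab")
print(kept)
```

    Using the median distance makes the choice of reference sample insensitive to a minority of contaminated samples, which mirrors the assumption in the abstract that more than half of the samples share the law Q.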

  7. Modeling and Simulation with INS.

    ERIC Educational Resources Information Center

    Roberts, Stephen D.; And Others

    INS, the Integrated Network Simulation language, puts simulation modeling into a network framework and automatically performs such programming activities as placing the problem into a next event structure, coding events, collecting statistics, monitoring status, and formatting reports. To do this, INS provides a set of symbols (nodes and branches)…

  8. An empirical generative framework for computational modeling of language acquisition.

    PubMed

    Waterfall, Heidi R; Sandbank, Ben; Onnis, Luca; Edelman, Shimon

    2010-06-01

    This paper reports progress in developing a computer model of language acquisition in the form of (1) a generative grammar that is (2) algorithmically learnable from realistic corpus data, (3) viable in its large-scale quantitative performance and (4) psychologically real. First, we describe new algorithmic methods for unsupervised learning of generative grammars from raw CHILDES data and give an account of the generative performance of the acquired grammars. Next, we summarize findings from recent longitudinal and experimental work that suggests how certain statistically prominent structural properties of child-directed speech may facilitate language acquisition. We then present a series of new analyses of CHILDES data indicating that the desired properties are indeed present in realistic child-directed speech corpora. Finally, we suggest how our computational results, behavioral findings, and corpus-based insights can be integrated into a next-generation model aimed at meeting the four requirements of our modeling framework.

  9. Musical Experience Influences Statistical Learning of a Novel Language

    PubMed Central

    Shook, Anthony; Marian, Viorica; Bartolotti, James; Schroeder, Scott R.

    2014-01-01

    Musical experience may benefit learning a new language by enhancing the fidelity with which the auditory system encodes sound. In the current study, participants with varying degrees of musical experience were exposed to two statistically-defined languages consisting of auditory Morse-code sequences which varied in difficulty. We found an advantage for highly-skilled musicians, relative to less-skilled musicians, in learning novel Morse-code based words. Furthermore, in the more difficult learning condition, performance of lower-skilled musicians was mediated by their general cognitive abilities. We suggest that musical experience may lead to enhanced processing of statistical information and that musicians’ enhanced ability to learn statistical probabilities in a novel Morse-code language may extend to natural language learning. PMID:23505962

  10. Rate of language evolution is affected by population size

    PubMed Central

    Bromham, Lindell; Hua, Xia; Fitzpatrick, Thomas G.; Greenhill, Simon J.

    2015-01-01

    The effect of population size on patterns and rates of language evolution is controversial. Do languages with larger speaker populations change faster due to a greater capacity for innovation, or do smaller populations change faster due to more efficient diffusion of innovations? Do smaller populations suffer greater loss of language elements through founder effects or drift, or do languages with more speakers lose features due to a process of simplification? Revealing the influence of population size on the tempo and mode of language evolution not only will clarify underlying mechanisms of language change but also has practical implications for the way that language data are used to reconstruct the history of human cultures. Here, we provide, to our knowledge, the first empirical, statistically robust test of the influence of population size on rates of language evolution, controlling for the evolutionary history of the populations and formally comparing the fit of different models of language evolution. We compare rates of gain and loss of cognate words for basic vocabulary in Polynesian languages, an ideal test case with a well-defined history. We demonstrate that larger populations have higher rates of gain of new words whereas smaller populations have higher rates of word loss. These results show that demographic factors can influence rates of language evolution and that rates of gain and loss are affected differently. These findings are strikingly consistent with general predictions of evolutionary models. PMID:25646448

  11. Longitudinal Effects on Early Adolescent Language: A Twin Study

    PubMed Central

    DeThorne, Laura Segebart; Smith, Jamie Mahurin; Betancourt, Mariana Aparicio; Petrill, Stephen A.

    2016-01-01

    Purpose: We evaluated genetic and environmental contributions to individual differences in language skills during early adolescence, measured by both language sampling and standardized tests, and examined the extent to which these genetic and environmental effects are stable across time. Method: We used structural equation modeling on latent factors to estimate additive genetic, shared environmental, and nonshared environmental effects on variance in standardized language skills (i.e., Formal Language) and productive language-sample measures (i.e., Productive Language) in a sample of 527 twins across 3 time points (mean ages 10–12 years). Results: Individual differences in the Formal Language factor were influenced primarily by genetic factors at each age, whereas individual differences in the Productive Language factor were primarily due to nonshared environmental influences. For the Formal Language factor, the stability of genetic effects was high across all 3 time points. For the Productive Language factor, nonshared environmental effects showed low but statistically significant stability across adjacent time points. Conclusions: The etiology of language outcomes may differ substantially depending on assessment context. In addition, the potential mechanisms for nonshared environmental influences on language development warrant further investigation. PMID:27732720

  12. Improving DHH students' grammar through an individualized software program.

    PubMed

    Cannon, Joanna E; Easterbrooks, Susan R; Gagné, Phill; Beal-Alvarez, Jennifer

    2011-01-01

    The purpose of this study was to determine if the frequent use of a targeted, computer software grammar instruction program, used as an individualized classroom activity, would influence the comprehension of morphosyntax structures (determiners, tense, and complementizers) in deaf/hard-of-hearing (DHH) participants who use American Sign Language (ASL). Twenty-six students from an urban day school for the deaf participated in this study. Two hierarchical linear modeling growth curve analyses showed that the influence of LanguageLinks: Syntax Assessment and Intervention (LL) resulted in statistically significant gains in participants' comprehension of morphosyntax structures. Two dependent t tests revealed statistically significant results between the pre- and postintervention assessments on the Diagnostic Evaluation of Language Variation-Norm Referenced. The daily use of LL increased the morphosyntax comprehension of the participants in this study and may be a promising practice for DHH students who use ASL.

  13. Automated Assessment of Child Vocalization Development Using LENA.

    PubMed

    Richards, Jeffrey A; Xu, Dongxin; Gilkerson, Jill; Yapanel, Umit; Gray, Sharmistha; Paul, Terrance

    2017-07-12

    To produce a novel, efficient measure of children's expressive vocal development on the basis of automatic vocalization assessment (AVA), child vocalizations were automatically identified and extracted from audio recordings using Language Environment Analysis (LENA) System technology. Assessment was based on full-day audio recordings collected in a child's unrestricted, natural language environment. AVA estimates were derived using automatic speech recognition modeling techniques to categorize and quantify the sounds in child vocalizations (e.g., protophones and phonemes). These were expressed as phone and biphone frequencies, reduced to principal components, and inputted to age-based multiple linear regression models to predict independently collected criterion-expressive language scores. From these models, we generated vocal development AVA estimates as age-standardized scores and development age estimates. AVA estimates demonstrated strong statistical reliability and validity when compared with standard criterion expressive language assessments. Automated analysis of child vocalizations extracted from full-day recordings in natural settings offers a novel and efficient means to assess children's expressive vocal development. More research remains to identify specific mechanisms of operation.

  14. Statistical Learning of Two Artificial Languages Presented Successively: How Conscious?

    PubMed Central

    Franco, Ana; Cleeremans, Axel; Destrebecqz, Arnaud

    2011-01-01

    Statistical learning is assumed to occur automatically and implicitly, but little is known about the extent to which the representations acquired over training are available to conscious awareness. In this study, we focus on whether the knowledge acquired in a statistical learning situation is available to conscious control. Participants were first exposed to an artificial language presented auditorily. Immediately thereafter, they were exposed to a second artificial language. Both languages were composed of the same corpus of syllables and differed only in the transitional probabilities. We first determined that both languages were equally learnable (Experiment 1) and that participants could learn the two languages and differentiate between them (Experiment 2). Then, in Experiment 3, we used an adaptation of the Process-Dissociation Procedure (Jacoby, 1991) to explore whether participants could consciously manipulate the acquired knowledge. Results suggest that statistical information can be used to parse and differentiate between two different artificial languages, and that the resulting representations are available to conscious control. PMID:21960981
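
    The transitional probabilities that distinguish the two artificial languages can be estimated directly from a syllable stream. The sketch below is illustrative only: the syllables, the two "words" (tu-pi-ro and go-la-bu) and the function name are hypothetical and are not taken from the study's materials.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate forward transitional probabilities P(next | current)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor (all but the last position).
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# Hypothetical stream built from the words tu-pi-ro and go-la-bu.
stream = "tu pi ro go la bu tu pi ro tu pi ro go la bu".split()
tp = transitional_probabilities(stream)

print(tp[("tu", "pi")])  # within-word transition
print(tp[("ro", "go")])  # transition across a word boundary
```

    In this toy stream the within-word transition tu-pi always occurs, whereas ro is followed by go only two times out of three. Dips of this kind are the cue that statistical learning accounts assume listeners use to find word boundaries. A second language over the same syllables would differ only in which pairs carry the high probabilities.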

  15. The Effects of Various High School Scheduling Models on Student Achievement in Michigan

    ERIC Educational Resources Information Center

    Pickell, Russell E.

    2017-01-01

    This study reviews research and data to determine whether student achievement is affected by the high school scheduling model, and whether changes in scheduling models result in statistically significant changes in student achievement, as measured by the ACT Composite, ACT English Language Arts, and ACT Math scores. The high school scheduling…

  16. STATISTICAL RELATIONAL LEARNING AND SCRIPT INDUCTION FOR TEXTUAL INFERENCE

    DTIC Science & Technology

    2017-12-01

    E 23rd St Austin, TX 78712 8. PERFORMING ORGANIZATION REPORT NUMBER 9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES) Air Force Research ... Processing (EMNLP), Austin, TX, 2016. Pichotta, K. and Mooney, R. J., “Using Sentence-Level LSTM Language Models for Script Inference,” Proceedings of the ... on Uphill Battles in Language Processing, Austin, TX, 2016. Rajani, N., and Mooney, R. J., “Stacked Ensembles of Information Extractors for

  17. Semantic Coherence Facilitates Distributional Learning.

    PubMed

    Ouyang, Long; Boroditsky, Lera; Frank, Michael C

    2017-04-01

    Computational models have shown that purely statistical knowledge about words' linguistic contexts is sufficient to learn many properties of words, including syntactic and semantic category. For example, models can infer that "postman" and "mailman" are semantically similar because they have quantitatively similar patterns of association with other words (e.g., they both tend to occur with words like "deliver," "truck," "package"). In contrast to these computational results, artificial language learning experiments suggest that distributional statistics alone do not facilitate learning of linguistic categories. However, experiments in this paradigm expose participants to entirely novel words, whereas real language learners encounter input that contains some known words that are semantically organized. In three experiments, we show that (a) the presence of familiar semantic reference points facilitates distributional learning and (b) this effect crucially depends both on the presence of known words and the adherence of these known words to some semantic organization. Copyright © 2016 Cognitive Science Society, Inc.

  18. Implicit Statistical Learning and Language Skills in Bilingual Children

    ERIC Educational Resources Information Center

    Yim, Dongsun; Rudoy, John

    2013-01-01

    Purpose: Implicit statistical learning in 2 nonlinguistic domains (visual and auditory) was used to investigate (a) whether linguistic experience influences the underlying learning mechanism and (b) whether there are modality constraints in predicting implicit statistical learning with age and language skills. Method: Implicit statistical learning…

  19. An Application of Epidemiological Modeling to Information Diffusion

    NASA Astrophysics Data System (ADS)

    McCormack, Robert; Salter, William

    Messages often spread within a population through unofficial - particularly web-based - media. Such ideas have been termed "memes." To impede the flow of terrorist messages and to promote counter messages within a population, intelligence analysts must understand how messages spread. We used statistical language processing technologies to operationalize "memes" as latent topics in electronic text and applied epidemiological techniques to describe and analyze patterns of message propagation. We developed our methods and applied them to English-language newspapers and blogs in the Arab world. We found that a relatively simple epidemiological model can reproduce some dynamics of observed empirical relationships.
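
    As a rough illustration of what a "relatively simple epidemiological model" of message spread might look like, the sketch below assumes a discrete-time SIR (susceptible, infected, recovered) compartment model. The abstract does not say which model the authors used, and the function name and all parameter values here are hypothetical.

```python
def simulate_sir(beta, gamma, s0, i0, r0, steps, dt=1.0):
    """Euler-step SIR model: 'infected' stands for sources currently repeating a message."""
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, conserved by the updates below
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt  # susceptible sources picking up the message
        new_recoveries = gamma * i * dt         # sources that stop repeating it
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

# Hypothetical parameters: 1000 sources, 10 of them carrying the message at the start.
history = simulate_sir(beta=0.5, gamma=0.1, s0=990, i0=10, r0=0, steps=100)
peak_step = max(range(len(history)), key=lambda t: history[t][1])
print(peak_step, round(history[peak_step][1], 1))
```

    With beta larger than gamma, the number of active carriers rises to a peak and then decays, which is the qualitative rise-and-fall pattern such a model would be fitted against.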

  20. Teacher's Corner: Structural Equation Modeling with the Sem Package in R

    ERIC Educational Resources Information Center

    Fox, John

    2006-01-01

    R is free, open-source, cooperatively developed software that implements the S statistical programming language and computing environment. The current capabilities of R are extensive, and it is in wide use, especially among statisticians. The sem package provides basic structural equation modeling facilities in R, including the ability to fit…

  1. Modular Open-Source Software for Item Factor Analysis

    ERIC Educational Resources Information Center

    Pritikin, Joshua N.; Hunter, Micheal D.; Boker, Steven M.

    2015-01-01

    This article introduces an item factor analysis (IFA) module for "OpenMx," a free, open-source, and modular statistical modeling package that runs within the R programming environment on GNU/Linux, Mac OS X, and Microsoft Windows. The IFA module offers a novel model specification language that is well suited to programmatic generation…

  2. Optimizing DNA assembly based on statistical language modelling.

    PubMed

    Fang, Gang; Zhang, Shemin; Dong, Yafei

    2017-12-15

    By successively assembling genetic parts such as BioBrick according to grammatical models, complex genetic constructs composed of dozens of functional blocks can be built. However, usually every category of genetic parts includes a few or many parts. With increasing quantity of genetic parts, the process of assembling more than a few sets of these parts can be expensive, time consuming and error prone. At the last step of assembling it is somewhat difficult to decide which part should be selected. Based on statistical language model, which is a probability distribution P(s) over strings S that attempts to reflect how frequently a string S occurs as a sentence, the most commonly used parts will be selected. Then, a dynamic programming algorithm was designed to figure out the solution of maximum probability. The algorithm optimizes the results of a genetic design based on a grammatical model and finds an optimal solution. In this way, redundant operations can be reduced and the time and cost required for conducting biological experiments can be minimized. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
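
    The combination of a statistical language model with dynamic programming described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: P(s) is approximated by an add-alpha smoothed bigram model, and the part names, the toy "corpus" of designs and the function names are hypothetical.

```python
import math
from collections import Counter

def train_bigram(designs, alpha=1.0):
    """Return logp(a, b) = log P(b | a) from an add-alpha smoothed bigram model."""
    vocab = {part for design in designs for part in design}
    context_counts, bigram_counts = Counter(), Counter()
    for design in designs:
        seq = ["<s>"] + list(design)  # "<s>" marks the start of a construct
        for a, b in zip(seq, seq[1:]):
            context_counts[a] += 1
            bigram_counts[(a, b)] += 1
    v = len(vocab)

    def logp(a, b):
        return math.log((bigram_counts[(a, b)] + alpha) / (context_counts[a] + alpha * v))

    return logp

def best_assembly(slots, logp):
    """Viterbi-style DP: pick one part per slot to maximize total log-probability."""
    # best[p] = (score, path) for the best partial assembly ending in part p.
    best = {p: (logp("<s>", p), [p]) for p in slots[0]}
    for options in slots[1:]:
        new_best = {}
        for q in options:
            score, prev_path = max(
                ((s + logp(p, q), pth) for p, (s, pth) in best.items()),
                key=lambda t: t[0],
            )
            new_best[q] = (score, prev_path + [q])
        best = new_best
    return max(best.values(), key=lambda t: t[0])

# Hypothetical corpus of previously built constructs (promoter, RBS, CDS, terminator).
designs = [
    ["pA", "rbs1", "gfp", "t1"],
    ["pA", "rbs1", "gfp", "t1"],
    ["pB", "rbs2", "gfp", "t1"],
    ["pA", "rbs1", "rfp", "t2"],
]
slots = [["pA", "pB"], ["rbs1", "rbs2"], ["gfp", "rfp"], ["t1", "t2"]]

logp = train_bigram(designs)
score, path = best_assembly(slots, logp)
print(path, math.exp(score))
```

    The dynamic program keeps only the best partial assembly ending in each candidate part, so the cost grows with the number of slots times the square of the options per slot, rather than with the number of possible full assemblies.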

  3. The Examination of the Effects of Writing Strategy-Based Procedural Facilitative Environments on Students' English Foreign Language Writing Anxiety Levels

    PubMed Central

    Tsiriotakis, Ioanna K.; Vassilaki, Eleni; Spantidakis, Ioannis; Stavrou, Nektarios A. M.

    2017-01-01

    Empirical studies have shown that anxiety and negative emotion can hinder language acquisition. The present study implemented a writing instructional model so as to investigate its effects on the writing anxiety levels of English Foreign Language learners. The study was conducted with 177 participants, who were administered the Second Language Writing Anxiety Inventory (SLWAI; Cheng, 2004) that assesses somatic, cognitive and behavioral anxiety, both at baseline and following the implementation of a writing instructional model. The hypothesis stated that the participants' writing anxiety levels would lessen following the provision of a writing strategy-based procedural facilitative environment that fosters cognitive apprenticeship. The initial hypothesis was supported by the findings. Specifically, in the final measurement statistically significant differences appeared, with participants in the experimental group showing notably lower mean values on the three factors of anxiety, a result that can largely be attributed to the content of the intervention program applied to this specific group. The findings validate that Foreign Language writing anxiety negatively affects Foreign Language learning and performance. The findings also support the effectiveness of strategy-based procedural facilitative writing environments that foster cognitive apprenticeship, so as to enhance language skill development and reduce feelings of Foreign Language writing anxiety. PMID:28119658

  4. The Examination of the Effects of Writing Strategy-Based Procedural Facilitative Environments on Students' English Foreign Language Writing Anxiety Levels.

    PubMed

    Tsiriotakis, Ioanna K; Vassilaki, Eleni; Spantidakis, Ioannis; Stavrou, Nektarios A M

    2016-01-01

    Empirical studies have shown that anxiety and negative emotion can hinder language acquisition. The present study implemented a writing instructional model so as to investigate its effects on the writing anxiety levels of English Foreign Language learners. The study was conducted with 177 participants, who were administered the Second Language Writing Anxiety Inventory (SLWAI; Cheng, 2004) that assesses somatic, cognitive and behavioral anxiety, both at baseline and following the implementation of a writing instructional model. The hypothesis stated that the participants' writing anxiety levels would lessen following the provision of a writing strategy-based procedural facilitative environment that fosters cognitive apprenticeship. The initial hypothesis was supported by the findings. Specifically, in the final measurement statistically significant differences appeared, with participants in the experimental group showing notably lower mean values on the three factors of anxiety, a result that can largely be attributed to the content of the intervention program applied to this specific group. The findings validate that Foreign Language writing anxiety negatively affects Foreign Language learning and performance. The findings also support the effectiveness of strategy-based procedural facilitative writing environments that foster cognitive apprenticeship, so as to enhance language skill development and reduce feelings of Foreign Language writing anxiety.

  5. Re-visiting the electrophysiology of language.

    PubMed

    Obleser, Jonas

    2015-09-01

    This editorial accompanies a special issue of Brain and Language re-visiting old themes and new leads in the electrophysiology of language. The event-related potential (ERP) as a series of characteristic deflections ("components") over time and their distribution on the scalp has been exploited by speech and language researchers over decades to find support for diverse psycholinguistic models. Fortunately, methodological and statistical advances have allowed human neuroscience to move beyond some of the limitations imposed when looking at the ERP only. Most importantly, we currently witness a refined and refreshed look at "event-related" (in the literal sense) brain activity that relates itself more closely to the actual neurobiology of speech and language processes. It is this imminent change in handling and interpreting electrophysiological data of speech and language experiments that this special issue intends to capture. Copyright © 2015 Elsevier Inc. All rights reserved.

  6. Building out a Measurement Model to Incorporate Complexities of Testing in the Language Domain

    ERIC Educational Resources Information Center

    Wilson, Mark; Moore, Stephen

    2011-01-01

    This paper provides a summary of a novel and integrated way to think about the item response models (most often used in measurement applications in social science areas such as psychology, education, and especially testing of various kinds) from the viewpoint of the statistical theory of generalized linear and nonlinear mixed models. In addition,…

  7. Universality in eye movements and reading: A trilingual investigation.

    PubMed

    Liversedge, Simon P; Drieghe, Denis; Li, Xin; Yan, Guoli; Bai, Xuejun; Hyönä, Jukka

    2016-02-01

    Universality in language has been a core issue in the fields of linguistics and psycholinguistics for many years (e.g., Chomsky, 1965). Recently, Frost (2012) has argued that establishing universals of process is critical to the development of meaningful, theoretically motivated, cross-linguistic models of reading. In contrast, other researchers argue that there is no such thing as universals of reading (e.g., Coltheart & Crain, 2012). Reading is a complex, visually mediated psychological process, and eye movements are the behavioural means by which we encode the visual information required for linguistic processing. To investigate universality of representation and process across languages we examined eye movement behaviour during reading of very comparable stimuli in three languages, Chinese, English and Finnish. These languages differ in numerous respects (character based vs. alphabetic, visual density, informational density, word spacing, orthographic depth, agglutination, etc.). We used linear mixed modelling techniques to identify variables that captured common variance across languages. Despite fundamental visual and linguistic differences in the orthographies, statistical models of reading behaviour were strikingly similar in a number of respects, and thus, we argue that their composition might reflect universality of representation and process in reading. Copyright © 2015 Elsevier B.V. All rights reserved.

  8. Language workbench user interfaces for data analysis

    PubMed Central

    Benson, Victoria M.

    2015-01-01

    Biological data analysis is frequently performed with command line software. While this practice provides considerable flexibility for computationally savvy individuals, such as investigators trained in bioinformatics, this also creates a barrier to the widespread use of data analysis software by investigators trained as biologists and/or clinicians. Workflow systems such as Galaxy and Taverna have been developed to try to provide generic user interfaces that can wrap command line analysis software. These solutions are useful for problems that can be solved with workflows, and that do not require specialized user interfaces. However, some types of analyses can benefit from custom user interfaces. For instance, developing biomarker models from high-throughput data is a type of analysis that can be expressed more succinctly with specialized user interfaces. Here, we show how Language Workbench (LW) technology can be used to model the biomarker development and validation process. We developed a language that models the concepts of Dataset, Endpoint, Feature Selection Method and Classifier. These high-level language concepts map directly to abstractions that analysts who develop biomarker models are familiar with. We found that user interfaces developed in the Meta-Programming System (MPS) LW provide convenient means to configure a biomarker development project, to train models and view the validation statistics. We discuss several advantages of developing user interfaces for data analysis with an LW, including increased interface consistency, portability and extension by language composition. The language developed during this experiment is distributed as an MPS plugin (available at http://campagnelab.org/software/bdval-for-mps/). PMID:25755929

  9. Darwinian perspectives on the evolution of human languages.

    PubMed

    Pagel, Mark

    2017-02-01

    Human languages evolve by a process of descent with modification in which parent languages give rise to daughter languages over time and in a manner that mimics the evolution of biological species. Descent with modification is just one of many parallels between biological and linguistic evolution that, taken together, offer up a Darwinian perspective on how languages evolve. Combined with statistical methods borrowed from evolutionary biology, this Darwinian perspective has brought new opportunities to the study of the evolution of human languages. These include the statistical inference of phylogenetic trees of languages, the study of how linguistic traits evolve over thousands of years of language change, the reconstruction of ancestral or proto-languages, and using language change to date historical events.

  10. Multilingualism and fMRI: Longitudinal Study of Second Language Acquisition

    PubMed Central

    Andrews, Edna; Frigau, Luca; Voyvodic-Casabo, Clara; Voyvodic, James; Wright, John

    2013-01-01

    BOLD fMRI is often used for the study of human language. However, there are still very few attempts to conduct longitudinal fMRI studies in the study of language acquisition by measuring auditory comprehension and reading. The following paper is the first in a series concerning a unique longitudinal study devoted to the analysis of bi- and multilingual subjects who are: (1) already proficient in at least two languages; or (2) are acquiring Russian as a second/third language. The focus of the current analysis is to present data from the auditory sections of a set of three scans acquired from April, 2011 through April, 2012 on a five-person subject pool who are learning Russian during the study. All subjects were scanned using the same protocol for auditory comprehension on the same General Electric LX 3T Signa scanner in Duke University Hospital. Using a multivariate analysis of covariance (MANCOVA) for statistical analysis, proficiency measurements are shown to correlate significantly with scan results in the Russian conditions over time. The importance of both the left and right hemispheres in language processing is discussed. Special attention is devoted to the importance of contextualizing imaging data with corresponding behavioral and empirical testing data using a multivariate analysis of variance. This is the only study to date that includes: (1) longitudinal fMRI data with subject-based proficiency and behavioral data acquired in the same time frame; and (2) statistical modeling that demonstrates the importance of covariate language proficiency data for understanding imaging results of language acquisition. PMID:24961428

  11. Multilingualism and fMRI: Longitudinal Study of Second Language Acquisition.

    PubMed

    Andrews, Edna; Frigau, Luca; Voyvodic-Casabo, Clara; Voyvodic, James; Wright, John

    2013-05-28

    BOLD fMRI is often used for the study of human language. However, there are still very few attempts to conduct longitudinal fMRI studies in the study of language acquisition by measuring auditory comprehension and reading. The following paper is the first in a series concerning a unique longitudinal study devoted to the analysis of bi- and multilingual subjects who are: (1) already proficient in at least two languages; or (2) are acquiring Russian as a second/third language. The focus of the current analysis is to present data from the auditory sections of a set of three scans acquired from April, 2011 through April, 2012 on a five-person subject pool who are learning Russian during the study. All subjects were scanned using the same protocol for auditory comprehension on the same General Electric LX 3T Signa scanner in Duke University Hospital. Using a multivariate analysis of covariance (MANCOVA) for statistical analysis, proficiency measurements are shown to correlate significantly with scan results in the Russian conditions over time. The importance of both the left and right hemispheres in language processing is discussed. Special attention is devoted to the importance of contextualizing imaging data with corresponding behavioral and empirical testing data using a multivariate analysis of variance. This is the only study to date that includes: (1) longitudinal fMRI data with subject-based proficiency and behavioral data acquired in the same time frame; and (2) statistical modeling that demonstrates the importance of covariate language proficiency data for understanding imaging results of language acquisition.

  12. When experience meets language statistics: Individual variability in processing English compound words.

    PubMed

    Falkauskas, Kaitlin; Kuperman, Victor

    2015-11-01

    Statistical patterns of language use demonstrably affect language comprehension and language production. This study set out to determine whether the variable amount of exposure to such patterns leads to individual differences in reading behavior as measured via eye-movements. Previous studies have demonstrated that more proficient readers are less influenced by distributional biases in language (e.g., frequency, predictability, transitional probability) than poor readers. We hypothesized that a probabilistic bias that is characteristic of written but not spoken language would preferentially affect readers with greater exposure to printed materials in general and to the specific pattern engendering the bias. Readers of varying reading experience were presented with sentences including English compound words that can occur in 2 spelling formats with differing probabilities: concatenated (windowsill, used 40% of the time) or spaced (window sill, 60%). Linear mixed effects multiple regression models fitted to the eye-movement measures showed that the probabilistic bias toward the presented spelling had a stronger facilitatory effect on compounds that occurred more frequently (in any spelling) or belonged to larger morphological families, and on readers with higher scores on a test of exposure-to-print. Thus, the amount of support toward the compound's spelling is effectively exploited when reading, but only when the spelling patterns are entrenched in an individual's mental lexicon via overall exposure to print and to compounds with alternating spelling. We argue that research on the interplay of language use and structure is incomplete without proper characterization of how particular individuals, with varying levels of experience and skill, learn these language structures. (c) 2015 APA, all rights reserved.

  13. When experience meets language statistics: Individual variability in processing English compound words

    PubMed Central

    Falkauskas, Kaitlin; Kuperman, Victor

    2015-01-01

    Statistical patterns of language use demonstrably affect language comprehension and language production. This study set out to determine whether the variable amount of exposure to such patterns leads to individual differences in reading behaviour as measured via eye-movements. Previous studies have demonstrated that more proficient readers are less influenced by distributional biases in language (e.g. frequency, predictability, transitional probability) than poor readers. We hypothesized that a probabilistic bias that is characteristic of written but not spoken language would preferentially affect readers with greater exposure to printed materials in general and to the specific pattern engendering the bias. Readers of varying reading experience were presented with sentences including English compound words that can occur in two spelling formats with differing probabilities: concatenated (windowsill, used 40% of the time) or spaced (window sill, 60%). Linear mixed effects multiple regression models fitted to the eye-movement measures showed that the probabilistic bias towards the presented spelling had a stronger facilitatory effect on compounds that occurred more frequently (in any spelling) or belonged to larger morphological families, and on readers with higher scores on a test of exposure-to-print. Thus, the amount of support towards the compound’s spelling is effectively exploited when reading, but only when the spelling patterns are entrenched in an individual’s mental lexicon via overall exposure to print and to compounds with alternating spelling. We argue that research on the interplay of language use and structure is incomplete without proper characterization of how particular individuals, with varying levels of experience and skill, learn these language structures. PMID:26076328

  14. Rapid Expectation Adaptation during Syntactic Comprehension

    PubMed Central

    Fine, Alex B.; Jaeger, T. Florian; Farmer, Thomas A.; Qian, Ting

    2013-01-01

    When we read or listen to language, we are faced with the challenge of inferring intended messages from noisy input. This challenge is exacerbated by considerable variability between and within speakers. Focusing on syntactic processing (parsing), we test the hypothesis that language comprehenders rapidly adapt to the syntactic statistics of novel linguistic environments (e.g., speakers or genres). Two self-paced reading experiments investigate changes in readers’ syntactic expectations based on repeated exposure to sentences with temporary syntactic ambiguities (so-called “garden path sentences”). These sentences typically lead to a clear expectation violation signature when the temporary ambiguity is resolved to an a priori less expected structure (e.g., based on the statistics of the lexical context). We find that comprehenders rapidly adapt their syntactic expectations to converge towards the local statistics of novel environments. Specifically, repeated exposure to a priori unexpected structures can reduce, and even completely undo, their processing disadvantage (Experiment 1). The opposite is also observed: a priori expected structures become less expected (even eliciting garden paths) in environments where they are hardly ever observed (Experiment 2). Our findings suggest that, when changes in syntactic statistics are to be expected (e.g., when entering a novel environment), comprehenders can rapidly adapt their expectations, thereby overcoming the processing disadvantage that mistaken expectations would otherwise cause. Our findings take a step towards unifying insights from research in expectation-based models of language processing, syntactic priming, and statistical learning. PMID:24204909

  15. A Management Information System Model for Program Management. Ph.D. Thesis - Oklahoma State Univ.; [Computerized Systems Analysis

    NASA Technical Reports Server (NTRS)

    Shipman, D. L.

    1972-01-01

    The development of a model to simulate the information system of a program management type of organization is reported. The model statistically determines the following parameters: type of messages, destinations, delivery durations, type processing, processing durations, communication channels, outgoing messages, and priorities. The total management information system of the program management organization is considered, including formal and informal information flows and both facilities and equipment. The model is written in General Purpose System Simulation 2 computer programming language for use on the Univac 1108, Executive 8 computer. The model is simulated on a daily basis and collects queue and resource utilization statistics for each decision point. The statistics are then used by management to evaluate proposed resource allocations, to evaluate proposed changes to the system, and to identify potential problem areas. The model employs both empirical and theoretical distributions which are adjusted to simulate the information flow being studied.

  16. Functional language shift to the right hemisphere in patients with language-eloquent brain tumors.

    PubMed

    Krieg, Sandro M; Sollmann, Nico; Hauck, Theresa; Ille, Sebastian; Foerschler, Annette; Meyer, Bernhard; Ringel, Florian

    2013-01-01

    Language function is mainly located within the left hemisphere of the brain, especially in right-handed subjects. However, functional MRI (fMRI) has demonstrated changes of language organization in patients with left-sided perisylvian lesions to the right hemisphere. Because intracerebral lesions can impair fMRI, this study was designed to investigate human language plasticity with a virtual lesion model using repetitive navigated transcranial magnetic stimulation (rTMS). Fifteen patients with lesions of left-sided language-eloquent brain areas and 50 healthy and purely right-handed participants underwent bilateral rTMS language mapping via an object-naming task. All patients were proven to have left-sided language function during awake surgery. The rTMS-induced language errors were categorized into 6 different error types. The error ratio (induced errors/number of stimulations) was determined for each brain region on both hemispheres. A hemispheric dominance ratio was then defined for each region as the quotient of the error ratio (left/right) of the corresponding area of both hemispheres (ratio >1 = left dominant; ratio <1 = right dominant). Patients with language-eloquent lesions showed a statistically significantly lower ratio than healthy participants concerning "all errors" and "all errors without hesitations", which indicates a higher participation of the right hemisphere in language function. Yet, there was no cortical region with pronounced difference in language dominance compared to the whole hemisphere. This is the first study that shows by means of an anatomically accurate virtual lesion model that a shift of language function to the non-dominant hemisphere can occur.
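
    The two ratios defined in the abstract above can be written out as a short sketch. This is only an illustration of the arithmetic as the abstract describes it: the function names and the counts are hypothetical, and the handling of a zero right-hemisphere error ratio is an assumption of this sketch, not something the abstract specifies.

```python
def error_ratio(induced_errors: int, stimulations: int) -> float:
    """Error ratio for one brain region: induced errors / number of stimulations."""
    if stimulations <= 0:
        raise ValueError("stimulations must be positive")
    return induced_errors / stimulations


def dominance_ratio(left_errors: int, left_stims: int,
                    right_errors: int, right_stims: int) -> float:
    """Hemispheric dominance ratio: left error ratio divided by right error ratio.

    A value above 1 reads as left dominant, below 1 as right dominant.
    """
    left = error_ratio(left_errors, left_stims)
    right = error_ratio(right_errors, right_stims)
    if right == 0:
        # Assumption of this sketch: no right-sided errors is reported as infinity.
        return float("inf")
    return left / right


# Made-up counts for a single region, not data from the study.
ratio = dominance_ratio(left_errors=12, left_stims=60,
                        right_errors=6, right_stims=60)
label = "left dominant" if ratio > 1 else "right dominant"
print(f"dominance ratio = {ratio:.2f} ({label})")
```

    With these invented counts the left error ratio is twice the right one, so the region would be read as left dominant.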

  17. Development of a Mandarin-English Bilingual Speech Recognition System for Real World Music Retrieval

    NASA Astrophysics Data System (ADS)

    Zhang, Qingqing; Pan, Jielin; Lin, Yang; Shao, Jian; Yan, Yonghong

    In recent decades, there has been a great deal of research into the problem of bilingual speech recognition: developing a recognizer that can handle inter- and intra-sentential language switching between two languages. This paper presents our recent work on the development of a grammar-constrained, Mandarin-English bilingual Speech Recognition System (MESRS) for real world music retrieval. Two of the main difficulties in building bilingual speech recognition systems for real world applications are tackled in this paper. One is to balance the performance and the complexity of the bilingual speech recognition system; the other is to deal effectively with matrix language accents in the embedded language. In order to process intra-sentential language switching and reduce the amount of data required to robustly estimate statistical models, a compact single set of bilingual acoustic models derived by phone set merging and clustering is developed instead of using two separate monolingual models, one for each language. In our study, a novel Two-pass phone clustering method based on Confusion Matrix (TCM) is presented and compared with the log-likelihood measure method. Experiments show that TCM achieves better performance. Since the native language of potential system users is Mandarin, which is regarded as the matrix language in our application, their pronunciation of English as the embedded language usually carries a Mandarin accent. In order to deal with matrix language accents in the embedded language, different non-native adaptation approaches are investigated. Experiments show that the model retraining method outperforms other common adaptation methods such as Maximum A Posteriori (MAP). With the effective incorporation of approaches on phone clustering and non-native adaptation, the Phrase Error Rate (PER) of MESRS for English utterances was reduced by a relative 24.47% compared to the baseline monolingual English system, while the PER on Mandarin utterances was comparable to that of the baseline monolingual Mandarin system. Bilingual utterances achieved a 22.37% relative PER reduction.
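
    The relative PER reductions quoted above follow the usual definition of a relative error-rate reduction, (baseline - new) / baseline. A minimal sketch follows; the abstract does not report absolute PER values, so the two rates below are hypothetical numbers chosen only to illustrate the calculation.

```python
def relative_reduction(baseline_per: float, new_per: float) -> float:
    """Relative error-rate reduction: (baseline - new) / baseline."""
    if baseline_per <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline_per - new_per) / baseline_per


# Hypothetical absolute Phrase Error Rates in percent, not taken from the paper.
baseline = 10.0
improved = 7.553
reduction = relative_reduction(baseline, improved)
print(f"relative PER reduction: {reduction:.2%}")
```

    A relative figure says nothing about the absolute error rates on its own: the same relative reduction could correspond to a drop from 10% to about 7.6% or from 40% to about 30%.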

  18. Reducing the language content in ToM tests: A developmental scale.

    PubMed

    Burnel, Morgane; Perrone-Bertolotti, Marcela; Reboul, Anne; Baciu, Monica; Durrleman, Stephanie

    2018-02-01

    The goal of the current study was to statistically evaluate the reliable scalability of a set of tasks designed to assess Theory of Mind (ToM) without language as a confounding variable. This tool might be useful to study ToM in populations where language is impaired or to study links between language and ToM. Low verbal versions of the ToM tasks proposed by Wellman and Liu (2004) for their scale were tested in 234 children (2.5 years to 11.9 years). Results showed that 5 of the tasks formed a scale according to both Guttman and Rasch models whereas all 6 tasks could form a scale according to the Rasch model only. The main difference from the original scale was that the Explicit False Belief task could be included whereas the Knowledge Access (KA) task could not. The authors argue that the more verbal version of the KA task administered in previous studies could have measured language understanding rather than ToM. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  19. Metrological traceability in education: A practical online system for measuring and managing middle school mathematics instruction

    NASA Astrophysics Data System (ADS)

    Torres Irribarra, D.; Freund, R.; Fisher, W.; Wilson, M.

    2015-02-01

    Computer-based, online assessments modelled, designed, and evaluated for adaptively administered invariant measurement are uniquely suited to defining and maintaining traceability to standardized units in education. An assessment of this kind is embedded in the Assessing Data Modeling and Statistical Reasoning (ADM) middle school mathematics curriculum. Diagnostic information about middle school students' learning of statistics and modeling is provided via computer-based formative assessments for seven constructs that comprise a learning progression for statistics and modeling from late elementary through the middle school grades. The seven constructs are: Data Display, Meta-Representational Competence, Conceptions of Statistics, Chance, Modeling Variability, Theory of Measurement, and Informal Inference. The end product is a web-delivered system built with Ruby on Rails for use by curriculum development teams working with classroom teachers in designing, developing, and delivering formative assessments. The online accessible system allows teachers to accurately diagnose students' unique comprehension and learning needs in a common language of real-time assessment, logging, analysis, feedback, and reporting.

  20. Adaptive Statistical Language Modeling; A Maximum Entropy Approach

    DTIC Science & Technology

    1994-04-19

    models exploit the immediate past only. To extract information from further back in the document’s history, I use trigger pairs as the basic information... [Table of contents excerpt:] 2.2 Context-Free Estimation (Unigram); 2.3 Short-Term History (Conventional N-gram...); 2.4 Short-term Class History (Class-Based N-gram); 2.5 Intermediate Distance

  1. Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization

    ERIC Educational Resources Information Center

    Gelman, Andrew; Lee, Daniel; Guo, Jiqiang

    2015-01-01

    Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…

  2. The microcomputer scientific software series 3: general linear model--analysis of variance.

    Treesearch

    Harold M. Rauscher

    1985-01-01

    A BASIC language set of programs, designed for use on microcomputers, is presented. This set of programs will perform the analysis of variance for any statistical model describing either balanced or unbalanced designs. The program computes and displays the degrees of freedom, Type I sum of squares, and the mean square for the overall model, the error, and each factor...

  3. Spacecraft software training needs assessment research, appendices

    NASA Technical Reports Server (NTRS)

    Ratcliff, Shirley; Golas, Katharine

    1990-01-01

    The appendices to the previously reported study are presented: statistical data from task rating worksheets; SSD references; survey forms; fourth generation language, a powerful, long-term solution to maintenance cost; task list; methodology; SwRI's instructional systems development model; relevant research; and references.

  4. A Cultural Diffusion Model for the Rise and Fall of Programming Languages.

    PubMed

    Valverde, Sergi; Solé, Ricard V

    2015-07-01

    Our interaction with complex computing machines is mediated by programming languages (PLs), which constitute one of the major innovations in the evolution of technology. PLs allow flexible, scalable, and fast use of hardware and are largely responsible for shaping the history of information technology since the rise of computers in the 1950s. The rapid growth and impact of computers were followed closely by the development of PLs. As occurs with natural, human languages, PLs have emerged and gone extinct. There has been always a diversity of coexisting PLs that compete somewhat while occupying special niches. Here we show that the statistical patterns of language adoption, rise, and fall can be accounted for by a simple model in which a set of programmers can use several PLs, decide to use existing PLs used by other programmers, or decide not to use them. Our results highlight the influence of strong communities of practice in the diffusion of PL innovations.

  5. Learning across Languages: Bilingual Experience Supports Dual Language Statistical Word Segmentation

    ERIC Educational Resources Information Center

    Antovich, Dylan M.; Graf Estes, Katharine

    2018-01-01

    Bilingual acquisition presents learning challenges beyond those found in monolingual environments, including the need to segment speech in two languages. Infants may use statistical cues, such as syllable-level transitional probabilities, to segment words from fluent speech. In the present study we assessed monolingual and bilingual 14-month-olds'…

  6. The conceptual basis of mathematics in cardiology IV: statistics and model fitting.

    PubMed

    Bates, Jason H T; Sobel, Burton E

    2003-06-01

    This is the fourth in a series of four articles developed for the readers of Coronary Artery Disease. Without language ideas cannot be articulated. What may not be so immediately obvious is that they cannot be formulated either. One of the essential languages of cardiology is mathematics. Unfortunately, medical education does not emphasize, and in fact, often neglects empowering physicians to think mathematically. Reference to statistics, conditional probability, multicompartmental modeling, algebra, calculus and transforms is common but often without provision of genuine conceptual understanding. At the University of Vermont College of Medicine, Professor Bates developed a course designed to address these deficiencies. The course covered mathematical principles pertinent to clinical cardiovascular and pulmonary medicine and research. It focused on fundamental concepts to facilitate formulation and grasp of ideas. This series of four articles was developed to make the material available for a wider audience. The articles will be published sequentially in Coronary Artery Disease. Beginning with fundamental axioms and basic algebraic manipulations they address algebra, function and graph theory, real and complex numbers, calculus and differential equations, mathematical modeling, linear system theory and integral transforms and statistical theory. The principles and concepts they address provide the foundation needed for in-depth study of any of these topics. Perhaps of even more importance, they should empower cardiologists and cardiovascular researchers to utilize the language of mathematics in assessing the phenomena of immediate pertinence to diagnosis, pathophysiology and therapeutics. The presentations are interposed with queries (by Coronary Artery Disease abbreviated as CAD) simulating the nature of interactions that occurred during the course itself. Each article concludes with one or more examples illustrating application of the concepts covered to cardiovascular medicine and biology.

  7. An investigation of mathematics and science instruction in English and Spanish for English language learners

    NASA Astrophysics Data System (ADS)

    Rodriguez-Esquivel, Marina

    The contextual demands of language in the content areas are difficult for ELLs. Content in the native language furthers students' academic development and native language skills, while they are learning English. Content in English integrates pedagogical strategies for English acquisition with subject area instruction. The following models of curriculum content are provided in most Miami-Dade County Public Schools: (a) mathematics instruction in the native language with science instruction in English or (b) science instruction in the native language with mathematics instruction in English. The purpose of this study was to investigate which model of instruction is more contextually supportive for mathematics and science achievement. A pretest and posttest, nonequivalent group design was used with 94 fifth grade ELLs who received instruction in curriculum model (a) or (b). This allowed for statistical analysis that detected a difference in the means of .5 standard deviations with a power of .80 at the .05 level of significance. Pretreatment and post-treatment assessments of mathematics, reading, and science achievement were obtained through the administration of Aprenda-Segunda Edicion and the Florida Comprehensive Achievement Test. The results indicated that students receiving mathematics in English and science in Spanish scored higher on achievement tests in both mathematics and science than the students who received mathematics in Spanish and science in English. In addition, the mean score of students on the FCAT mathematics examination was higher than their mean score on the FCAT science examination regardless of the language of instruction.

  8. Statistical word learning in children with autism spectrum disorder and specific language impairment.

    PubMed

    Haebig, Eileen; Saffran, Jenny R; Ellis Weismer, Susan

    2017-11-01

    Word learning is an important component of language development that influences child outcomes across multiple domains. Despite the importance of word knowledge, word-learning mechanisms are poorly understood in children with specific language impairment (SLI) and children with autism spectrum disorder (ASD). This study examined underlying mechanisms of word learning, specifically, statistical learning and fast-mapping, in school-aged children with typical and atypical development. Statistical learning was assessed through a word segmentation task and fast-mapping was examined in an object-label association task. We also examined children's ability to map meaning onto newly segmented words in a third task that combined exposure to an artificial language and a fast-mapping task. Children with SLI had poorer performance on the word segmentation and fast-mapping tasks relative to the typically developing and ASD groups, who did not differ from one another. However, when children with SLI were exposed to an artificial language with phonemes used in the subsequent fast-mapping task, they successfully learned more words than in the isolated fast-mapping task. There was some evidence that word segmentation abilities are associated with word learning in school-aged children with typical development and ASD, but not SLI. Follow-up analyses also examined performance in children with ASD who did and did not have a language impairment. Children with ASD with language impairment evidenced intact statistical learning abilities, but subtle weaknesses in fast-mapping abilities. As the Procedural Deficit Hypothesis (PDH) predicts, children with SLI have impairments in statistical learning. However, children with SLI also have impairments in fast-mapping. Nonetheless, they are able to take advantage of additional phonological exposure to boost subsequent word-learning performance. In contrast to the PDH, children with ASD appear to have intact statistical learning, regardless of language status; however, fast-mapping abilities differ according to broader language skills. © 2017 Association for Child and Adolescent Mental Health.

  9. Statistical Analysis of the Indus Script Using n-Grams

    PubMed Central

    Yadav, Nisha; Joglekar, Hrishikesh; Rao, Rajesh P. N.; Vahia, Mayank N.; Adhikari, Ronojoy; Mahadevan, Iravatham

    2010-01-01

    The Indus script is one of the major undeciphered scripts of the ancient world. The small size of the corpus, the absence of bilingual texts, and the lack of definite knowledge of the underlying language have frustrated efforts at decipherment since the discovery of the remains of the Indus civilization. Building on previous statistical approaches, we apply the tools of statistical language processing, specifically n-gram Markov chains, to analyze the syntax of the Indus script. We find that unigrams follow a Zipf-Mandelbrot distribution. Text beginner and ender distributions are unequal, providing internal evidence for syntax. We see clear evidence of strong bigram correlations and extract significant pairs and triplets using a log-likelihood measure of association. Highly frequent pairs and triplets are not always highly significant. The model performance is evaluated using information-theoretic measures and cross-validation. The model can restore doubtfully read texts with an accuracy of about 75%. We find that a quadrigram Markov chain saturates information theoretic measures against a held-out corpus. Our work forms the basis for the development of a stochastic grammar which may be used to explore the syntax of the Indus script in greater detail. PMID:20333254
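
    The idea of using an n-gram Markov chain to restore a doubtfully read sign can be sketched with a bigram model. This is a toy illustration, not the authors' method: the add-one smoothing, the function names, and the four-text corpus of placeholder sign labels are all assumptions made for the example.

```python
from collections import Counter

START, END = "<s>", "</s>"  # text-beginner and text-ender boundary markers


def train_bigrams(texts):
    """Count unigrams and bigrams over sign sequences padded with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for text in texts:
        padded = [START, *text, END]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_prob(prev, cur, unigrams, bigrams):
    """P(cur | prev) with add-one smoothing (a choice made for this sketch)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)


def restore_sign(text, position, unigrams, bigrams):
    """Pick the sign that best fits between its left and right neighbours."""
    padded = [START, *text, END]
    i = position + 1  # shift by one for the START marker
    candidates = [s for s in unigrams if s not in (START, END)]
    return max(
        candidates,
        key=lambda s: bigram_prob(padded[i - 1], s, unigrams, bigrams)
        * bigram_prob(s, padded[i + 1], unigrams, bigrams),
    )


# Placeholder sign labels standing in for a real corpus.
corpus = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["E", "B", "C"]]
unigrams, bigrams = train_bigrams(corpus)
restored = restore_sign(["A", "?", "C"], 1, unigrams, bigrams)
print(restored)  # prints B
```

    A higher-order chain, such as the quadrigram model mentioned in the abstract, would condition on more neighbours in the same way, at the cost of sparser counts.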

  10. Assessing the relationship between computational speed and precision: a case study comparing an interpreted versus compiled programming language using a stochastic simulation model in diabetes care.

    PubMed

    McEwan, Phil; Bergenheim, Klas; Yuan, Yong; Tetlow, Anthony P; Gordon, Jason P

    2010-01-01

    Simulation techniques are well suited to modelling diseases yet can be computationally intensive. This study explores the relationship between modelled effect size, statistical precision, and efficiency gains achieved using variance reduction and an executable programming language. A published simulation model designed to model a population with type 2 diabetes mellitus based on the UKPDS 68 outcomes equations was coded in both Visual Basic for Applications (VBA) and C++. Efficiency gains due to the programming language were evaluated, as was the impact of antithetic variates to reduce variance, using predicted QALYs over a 40-year time horizon. The use of C++ provided a 75- and 90-fold reduction in simulation run time when using mean and sampled input values, respectively. For a series of 50 one-way sensitivity analyses, this would yield a total run time of 2 minutes when using C++, compared with 155 minutes for VBA when using mean input values. The use of antithetic variates typically resulted in a 53% reduction in the number of simulation replications and run time required. When drawing all input values to the model from distributions, the use of C++ and variance reduction resulted in a 246-fold improvement in computation time compared with VBA - for which the evaluation of 50 scenarios would correspondingly require 3.8 hours (C++) and approximately 14.5 days (VBA). The choice of programming language used in an economic model, as well as the methods for improving precision of model output can have profound effects on computation time. When constructing complex models, more computationally efficient approaches such as C++ and variance reduction should be considered; concerns regarding model transparency using compiled languages are best addressed via thorough documentation and model validation.
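
    The variance-reduction technique named above, antithetic variates, pairs each random draw u with its mirror image 1 - u so that the two simulation outputs are negatively correlated and their average is less noisy. The sketch below uses a toy monotone function in place of the diabetes model, which is not reproduced here; the function names, seed, and sample sizes are assumptions, and the resulting variance ratio is not meant to match the figures reported in the abstract.

```python
import math
import random
import statistics


def simulate(u: float) -> float:
    """Toy stand-in for one model run driven by a uniform draw u in [0, 1)."""
    return math.exp(u)


def plain_pairs(n_pairs: int, rng: random.Random) -> list:
    """Average of two independent runs per pair (the baseline estimator)."""
    return [(simulate(rng.random()) + simulate(rng.random())) / 2
            for _ in range(n_pairs)]


def antithetic_pairs(n_pairs: int, rng: random.Random) -> list:
    """Average of a run on u and a run on its antithetic partner 1 - u."""
    results = []
    for _ in range(n_pairs):
        u = rng.random()
        results.append((simulate(u) + simulate(1.0 - u)) / 2)
    return results


rng = random.Random(0)
plain = plain_pairs(5000, rng)
anti = antithetic_pairs(5000, rng)

print("plain mean      :", statistics.mean(plain))
print("antithetic mean :", statistics.mean(anti))
print("plain variance      :", statistics.variance(plain))
print("antithetic variance :", statistics.variance(anti))
```

    Both estimators target the same expected value (here e - 1), but the antithetic pairs vary far less from pair to pair, so fewer replications are needed for a given precision. The benefit depends on the model output being monotone in the underlying draws.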

  11. Statistical Significance Testing in Second Language Research: Basic Problems and Suggestions for Reform

    ERIC Educational Resources Information Center

    Norris, John M.

    2015-01-01

    Traditions of statistical significance testing in second language (L2) quantitative research are strongly entrenched in how researchers design studies, select analyses, and interpret results. However, statistical significance tests using "p" values are commonly misinterpreted by researchers, reviewers, readers, and others, leading to…

  12. Statistical Literacy among Applied Linguists and Second Language Acquisition Researchers

    ERIC Educational Resources Information Center

    Loewen, Shawn; Lavolette, Elizabeth; Spino, Le Anne; Papi, Mostafa; Schmidtke, Jens; Sterling, Scott; Wolff, Dominik

    2014-01-01

    The importance of statistical knowledge in applied linguistics and second language acquisition (SLA) research has been emphasized in recent publications. However, the last investigation of the statistical literacy of applied linguists occurred more than 25 years ago (Lazaraton, Riggenbach, & Ediger, 1987). The current study undertook a partial…

  13. A neural network model of metaphor understanding with dynamic interaction based on a statistical language analysis: targeting a human-like model.

    PubMed

    Terai, Asuka; Nakagawa, Masanori

    2007-08-01

    The purpose of this paper is to construct a model that represents the human process of understanding metaphors, focusing specifically on similes of the form "an A like B". Generally speaking, human beings are able to generate and understand many sorts of metaphors. This study constructs the model based on a probabilistic knowledge structure for concepts which is computed from a statistical analysis of a large-scale corpus. Consequently, this model is able to cover the many kinds of metaphors that human beings can generate. Moreover, the model implements the dynamic process of metaphor understanding by using a neural network with dynamic interactions. Finally, the validity of the model is confirmed by comparing model simulations with the results from a psychological experiment.

  14. Informatics technology mimics ecology: dense, mutualistic collaboration networks are associated with higher publication rates.

    PubMed

    Sorani, Marco D

    2012-01-01

    Information technology (IT) adoption enables biomedical research. Publications are an accepted measure of research output, and network models can describe the collaborative nature of publication. In particular, ecological networks can serve as analogies for publication and technology adoption. We constructed network models of adoption of bioinformatics programming languages and health IT (HIT) from the literature. We selected seven programming languages and four types of HIT. We performed PubMed searches to identify publications since 2001. We calculated summary statistics and analyzed spatiotemporal relationships. Then, we assessed ecological models of specialization, cooperativity, competition, evolution, biodiversity, and stability associated with publications. Adoption of HIT has been variable, while scripting languages have experienced rapid adoption. Hospital systems had the largest HIT research corpus, while Perl had the largest language corpus. Scripting languages represented the largest connected network components. The relationship between edges and nodes was linear, though Bioconductor had more edges than expected and Perl had fewer. Spatiotemporal relationships were weak. Most languages shared a bioinformatics specialization and appeared mutualistic or competitive. HIT specializations varied. Specialization was highest for Bioconductor and radiology systems. Specialization and cooperativity were positively correlated among languages but negatively correlated among HIT. Rates of language evolution were similar. Biodiversity among languages grew in the first half of the decade and stabilized, while diversity among HIT was variable but flat. Compared with publications in 2001, correlation with publications one year later was positive while correlation after ten years was weak and negative. Adoption of new technologies can be unpredictable. Spatiotemporal relationships facilitate adoption but are not sufficient. As with ecosystems, dense, mutualistic, specialized co-habitation is associated with faster growth. There are rapidly changing trends in external technological and macroeconomic influences. We propose that a better understanding of how technologies are adopted can facilitate their development.

  15. Functional Language Shift to the Right Hemisphere in Patients with Language-Eloquent Brain Tumors

    PubMed Central

    Krieg, Sandro M.; Sollmann, Nico; Hauck, Theresa; Ille, Sebastian; Foerschler, Annette; Meyer, Bernhard; Ringel, Florian

    2013-01-01

    Objectives Language function is mainly located within the left hemisphere of the brain, especially in right-handed subjects. However, functional MRI (fMRI) has demonstrated changes of language organization in patients with left-sided perisylvian lesions to the right hemisphere. Because intracerebral lesions can impair fMRI, this study was designed to investigate human language plasticity with a virtual lesion model using repetitive navigated transcranial magnetic stimulation (rTMS). Experimental design Fifteen patients with lesions of left-sided language-eloquent brain areas and 50 healthy and purely right-handed participants underwent bilateral rTMS language mapping via an object-naming task. All patients were proven to have left-sided language function during awake surgery. The rTMS-induced language errors were categorized into 6 different error types. The error ratio (induced errors/number of stimulations) was determined for each brain region on both hemispheres. A hemispheric dominance ratio was then defined for each region as the quotient of the error ratio (left/right) of the corresponding area of both hemispheres (ratio >1  =  left dominant; ratio <1  =  right dominant). Results Patients with language-eloquent lesions showed a statistically significantly lower ratio than healthy participants concerning “all errors” and “all errors without hesitations”, which indicates a higher participation of the right hemisphere in language function. Yet, there was no cortical region with pronounced difference in language dominance compared to the whole hemisphere. Conclusions This is the first study that shows by means of an anatomically accurate virtual lesion model that a shift of language function to the non-dominant hemisphere can occur. PMID:24069410
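
    The two ratios defined in this abstract can be illustrated with a minimal sketch. The function names, the handling of a zero right-side error ratio, and the label for a ratio of exactly 1 are assumptions made here for illustration; the abstract specifies only the two quotients and the >1 / <1 interpretation.

```python
import math


def error_ratio(induced_errors, stimulations):
    """Error ratio for one brain region: induced errors / number of stimulations."""
    if stimulations <= 0:
        raise ValueError("stimulations must be positive")
    return induced_errors / stimulations


def dominance_ratio(left_errors, left_stims, right_errors, right_stims):
    """Hemispheric dominance ratio: left error ratio / right error ratio."""
    left = error_ratio(left_errors, left_stims)
    right = error_ratio(right_errors, right_stims)
    if right == 0:
        # Assumption: the abstract does not say how a zero right-side
        # error ratio is treated; this sketch returns infinity.
        if left == 0:
            raise ValueError("no errors on either side; ratio undefined")
        return math.inf
    return left / right


def classify_dominance(ratio):
    """Interpretation from the abstract: >1 left dominant, <1 right dominant."""
    if ratio > 1:
        return "left dominant"
    if ratio < 1:
        return "right dominant"
    return "no dominance"  # assumption: the abstract does not label ratio == 1
```

    For example, a region with 12 errors in 60 left-sided stimulations and 6 errors in 60 right-sided stimulations would be classified as left dominant under this sketch.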

  16. Accuracy of Presurgical Functional MR Imaging for Language Mapping of Brain Tumors: A Systematic Review and Meta-Analysis.

    PubMed

    Weng, Hsu-Huei; Noll, Kyle R; Johnson, Jason M; Prabhu, Sujit S; Tsai, Yuan-Hsiung; Chang, Sheng-Wei; Huang, Yen-Chu; Lee, Jiann-Der; Yang, Jen-Tsung; Yang, Cheng-Ta; Tsai, Ying-Huang; Yang, Chun-Yuh; Hazle, John D; Schomer, Donald F; Liu, Ho-Ling

    2018-02-01

    Purpose To compare functional magnetic resonance (MR) imaging for language mapping (hereafter, language functional MR imaging) with direct cortical stimulation (DCS) in patients with brain tumors and to assess factors associated with its accuracy. Materials and Methods PubMed/MEDLINE and related databases were searched for research articles published between January 2000 and September 2016. Findings were pooled by using bivariate random-effects and hierarchic summary receiver operating characteristic curve models. Meta-regression and subgroup analyses were performed to evaluate whether publication year, functional MR imaging paradigm, magnetic field strength, statistical threshold, and analysis software affected classification accuracy. Results Ten articles with a total of 214 patients were included in the analysis. On a per-patient basis, the pooled sensitivity and specificity of functional MR imaging was 44% (95% confidence interval [CI]: 14%, 78%) and 80% (95% CI: 54%, 93%), respectively. On a per-tag basis (ie, each DCS stimulation site or "tag" was considered a separate data point across all patients), the pooled sensitivity and specificity were 67% (95% CI: 51%, 80%) and 55% (95% CI: 25%, 82%), respectively. The per-tag analysis showed significantly higher sensitivity for studies with shorter functional MR imaging session times (P = .03) and relaxed statistical threshold (P = .05). Significantly higher specificity was found when expressive language task (P = .02), longer functional MR imaging session times (P < .01), visual presentation of stimuli (P = .04), and stringent statistical threshold (P = .01) were used. Conclusion Results of this study showed moderate accuracy of language functional MR imaging when compared with intraoperative DCS, and the included studies displayed significant methodologic heterogeneity. © RSNA, 2017 Online supplemental material is available for this article.

  17. Conversational Language Use as a Predictor of Early Reading Development: Language History as a Moderating Variable

    PubMed Central

    DeThorne, Laura Segebart; Petrill, Stephen A.; Schatschneider, Chris; Cutting, Laurie

    2010-01-01

    Purpose The present study examined the nature of concurrent and predictive associations between conversational language use and reading development during early school-age years. Method Language and reading data from 380 twins in the Western Reserve Reading Project were examined via phenotypic correlations and multilevel modeling on exploratory latent factors. Results In the concurrent prediction of children’s early reading abilities, a significant interaction emerged between children’s conversational language abilities and their history of reported language difficulties. Specifically, conversational language concurrently predicted reading development above and beyond variance accounted for by formal vocabulary scores, but only in children with a history of reported language difficulties. A similar trend was noted in predicting reading skills 1 year later, but the interaction was not statistically significant. Conclusions Findings suggest a more nuanced view of the association between spoken language and early reading than is commonly proposed. One possibility is that children with and without a history of reported language difficulties rely on different skills, or the same skills to differing degrees, when completing early reading-related tasks. Future studies should examine the causal link between conversational language and early reading specifically in children with a history of reported language difficulties. PMID:20150410

  18. Language/culture/mind/brain. Progress at the margins between disciplines.

    PubMed

    Kuhl, P K; Tsao, F M; Liu, H M; Zhang, Y; De Boer, B

    2001-05-01

    At the forefront of research on language are new data demonstrating infants' strategies in the early acquisition of language. The data show that infants perceptually "map" critical aspects of ambient language in the first year of life before they can speak. Statistical and abstract properties of speech are picked up through exposure to ambient language. Moreover, linguistic experience alters infants' perception of speech, warping perception in a way that enhances native-language speech processing. Infants' strategies are unexpected and unpredicted by historical views. At the same time, research in three additional disciplines is contributing to our understanding of language and its acquisition by children. Cultural anthropologists are demonstrating the universality of adult speech behavior when addressing infants and children across cultures, and this is creating a new view of the role adult speakers play in bringing about language in the child. Neuroscientists, using the techniques of modern brain imaging, are revealing the temporal and structural aspects of language processing by the brain and suggesting new views of the critical period for language. Computer scientists, modeling the computational aspects of children's language acquisition, are meeting success using biologically inspired neural networks. Although a consilient view cannot yet be offered, the cross-disciplinary interaction now seen among scientists pursuing one of humans' greatest achievements, language, is quite promising.

  19. Experimental study on GMM-based speaker recognition

    NASA Astrophysics Data System (ADS)

    Ye, Wenxing; Wu, Dapeng; Nucci, Antonio

    2010-04-01

    Speaker recognition plays a very important role in the field of biometric security. In order to improve the recognition performance, many pattern recognition techniques have been explored in the literature. Among these techniques, the Gaussian Mixture Model (GMM) has proved to be an effective statistical model for speaker recognition and is used in most state-of-the-art speaker recognition systems. The GMM is used to represent the 'voice print' of a speaker through modeling the spectral characteristic of speech signals of the speaker. In this paper, we implement a speaker recognition system, which consists of preprocessing, Mel-Frequency Cepstrum Coefficients (MFCCs) based feature extraction, and GMM based classification. We test our system with the TIDIGITS data set (325 speakers) and our own recordings of more than 200 speakers; our system achieves 100% correct recognition rate. Moreover, we also test our system under the scenario that training samples are from one language but test samples are from a different language; our system also achieves 100% correct recognition rate, which indicates that our system is language independent.

  20. Constraints on Statistical Computations at 10 Months of Age: The Use of Phonological Features

    ERIC Educational Resources Information Center

    Gonzalez-Gomez, Nayeli; Nazzi, Thierry

    2015-01-01

    Recently, several studies have argued that infants capitalize on the statistical properties of natural languages to acquire the linguistic structure of their native language, but the kinds of constraints which apply to statistical computations remain largely unknown. Here we explored French-learning infants' perceptual preference for…

  1. A model for indexing medical documents combining statistical and symbolic knowledge.

    PubMed

    Avillach, Paul; Joubert, Michel; Fieschi, Marius

    2007-10-11

    To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context. We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs). The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases. The use of several terminologies leads to more precise indexing. The improvement achieved in the model's implementation performance as a result of using semantic relationships is encouraging.

  2. How language production shapes language form and comprehension

    PubMed Central

    MacDonald, Maryellen C.

    2012-01-01

    Language production processes can provide insight into how language comprehension works and language typology—why languages tend to have certain characteristics more often than others. Drawing on work in memory retrieval, motor planning, and serial order in action planning, the Production-Distribution-Comprehension (PDC) account links work in the fields of language production, typology, and comprehension: (1) faced with substantial computational burdens of planning and producing utterances, language producers implicitly follow three biases in utterance planning that promote word order choices that reduce these burdens, thereby improving production fluency. (2) These choices, repeated over many utterances and individuals, shape the distributions of utterance forms in language. The claim that language form stems in large degree from producers' attempts to mitigate utterance planning difficulty is contrasted with alternative accounts in which form is driven by language use more broadly, language acquisition processes, or producers' attempts to create language forms that are easily understood by comprehenders. (3) Language perceivers implicitly learn the statistical regularities in their linguistic input, and they use this prior experience to guide comprehension of subsequent language. In particular, they learn to predict the sequential structure of linguistic signals, based on the statistics of previously-encountered input. Thus, key aspects of comprehension behavior are tied to lexico-syntactic statistics in the language, which in turn derive from utterance planning biases promoting production of comparatively easy utterance forms over more difficult ones. This approach contrasts with classic theories in which comprehension behaviors are attributed to innate design features of the language comprehension system and associated working memory. The PDC instead links basic features of comprehension to a different source: production processes that shape language form. 
PMID:23637689

  3. Individual Biases, Cultural Evolution, and the Statistical Nature of Language Universals: The Case of Colour Naming Systems

    PubMed Central

    Baronchelli, Andrea; Loreto, Vittorio; Puglisi, Andrea

    2015-01-01

    Language universals have long been attributed to an innate Universal Grammar. An alternative explanation states that linguistic universals emerged independently in every language in response to shared cognitive or perceptual biases. A computational model has recently shown how this could be the case, focusing on the paradigmatic example of the universal properties of colour naming patterns, and producing results in quantitative agreement with the experimental data. Here we investigate the role of an individual perceptual bias in the framework of the model. We study how, and to what extent, the structure of the bias influences the corresponding linguistic universal patterns. We show that the cultural history of a group of speakers introduces population-specific constraints that act against the pressure for uniformity arising from the individual bias, and we clarify the interplay between these two forces. PMID:26018391

  4. Statistical analysis of water-quality data containing multiple detection limits II: S-language software for nonparametric distribution modeling and hypothesis testing

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2007-01-01

    Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data, perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
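
    How a K-M estimate applies to left-censored data with several detection limits can be shown with a minimal Python sketch. This is not the authors' S-language software; the function name and the input layout (a list of values plus a parallel list of censoring flags, where a flagged value is a detection limit meaning "less than this") are assumptions made for illustration. The sketch is the usual product-limit estimator run from the largest detected value downward, which is equivalent to flipping the data and applying the right-censored form.

```python
from collections import Counter


def km_left_censored(values, censored):
    """Kaplan-Meier estimate of P(X < x) at each distinct detected value x.

    values   -- measured concentrations, or detection limits for nondetects
    censored -- True where the entry is a nondetect ("< detection limit")
    Returns a dict mapping each detected value to the estimated P(X < x),
    ordered from the largest detected value to the smallest.
    """
    if len(values) != len(censored):
        raise ValueError("values and censored must have the same length")

    # Count detected (uncensored) observations at each distinct value.
    detected = Counter(v for v, c in zip(values, censored) if not c)

    prob_below = 1.0
    estimate = {}
    for x in sorted(detected, reverse=True):
        # "At risk" at x: every observation, detected or not, recorded at or
        # below x. A nondetect "< x" is known to lie below x, so it counts.
        at_risk = sum(1 for v in values if v <= x)
        prob_below *= 1.0 - detected[x] / at_risk
        estimate[x] = prob_below
    return estimate
```

    With the hypothetical data `<1, 2, <2, 5, 10`, the sketch gives estimates of about 0.8, 0.6 and 0.4 for P(X < 10), P(X < 5) and P(X < 2). Nothing is estimated below the smallest detected value, in line with the abstract's remark that K-M methods do not extrapolate.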

  5. Text generation from Taiwanese Sign Language using a PST-based language model for augmentative communication.

    PubMed

    Wu, Chung-Hsien; Chiu, Yu-Hsien; Guo, Chi-Shiang

    2004-12-01

    This paper proposes a novel approach to the generation of Chinese sentences from ill-formed Taiwanese Sign Language (TSL) for people with hearing impairments. First, a sign icon-based virtual keyboard is constructed to provide a visualized interface to retrieve sign icons from a sign database. A proposed language model (LM), based on a predictive sentence template (PST) tree, integrates a statistical variable n-gram LM and linguistic constraints to deal with the translation problem from ill-formed sign sequences to grammatical written sentences. The PST tree trained by a corpus collected from the deaf schools was used to model the correspondence between signed and written Chinese. In addition, a set of phrase formation rules, based on trigger pair category, was derived for sentence pattern expansion. These approaches improved the efficiency of text generation and the accuracy of word prediction and, therefore, improved the input rate. For the assessment of practical communication aids, a reading-comprehension training program with ten profoundly deaf students was undertaken in a deaf school in Tainan, Taiwan. Evaluation results show that the literacy aptitude test and subjective satisfactory level are significantly improved.

  6. Propositional idea density in women's written language over the lifespan: computerized analysis.

    PubMed

    Ferguson, Alison; Spencer, Elizabeth; Craig, Hugh; Colyvas, Kim

    2014-06-01

    The informativeness of written language, as measured by Propositional Idea Density (PD), has been shown to be a sensitive predictive index of language decline with age and dementia in previous research. The present study investigated the influence of age and education on the written language of three large cohorts of women from the general community, born between 1973 and 1978, 1946-51 and 1921-26. Written texts were obtained from the Australian Longitudinal Study on Women's Health in which participants were invited to respond to an open-ended question about their health. The informativeness of written comments of 10 words or more (90% of the total number of comments) was analyzed using the Computerized Propositional Idea Density Rater 3 (CPIDR-3). Over 2.5 million words used in 37,705 written responses from 19,512 respondents were analyzed. Based on a linear mixed model approach to statistical analysis with adjustment for several factors including number of comments per respondent and number of words per comment, a small but statistically significant effect of age was identified for the older cohort with mean age 78 years. The mean PD per word for this cohort was lower than the younger and mid-aged cohorts with mean age 27 and 53 years respectively, with mean reduction in PD 95% confidence interval (CI) of .006 (.003, .008) and .009 (.008, .011) respectively. This suggests that PD for this population of women was relatively more stable over the adult lifespan than has been reported previously even in late old age. There was no statistically significant effect of education level. Computerized analyses were found to greatly facilitate the study of informativeness of this large corpus of written language. Directions for further research are discussed in relation to the need for extended investigation of the variability of the measure for potential application to the identification of acquired language pathologies. Copyright © 2013 Elsevier Ltd. 
All rights reserved.

  7. Statistical physics of language dynamics

    NASA Astrophysics Data System (ADS)

    Loreto, Vittorio; Baronchelli, Andrea; Mukherjee, Animesh; Puglisi, Andrea; Tria, Francesca

    2011-04-01

    Language dynamics is a rapidly growing field that focuses on all processes related to the emergence, evolution, change and extinction of languages. Recently, the study of self-organization and evolution of language and meaning has led to the idea that a community of language users can be seen as a complex dynamical system, which collectively solves the problem of developing a shared communication framework through the back-and-forth signaling between individuals. We shall review some of the progress made in the past few years and highlight potential future directions of research in this area. In particular, the emergence of a common lexicon and of a shared set of linguistic categories will be discussed, as examples corresponding to the early stages of a language. The extent to which synthetic modeling is nowadays contributing to the ongoing debate in cognitive science will be pointed out. In addition, the burst of growth of the web is providing new experimental frameworks. It makes available a huge amount of resources, both as novel tools and data to be analyzed, allowing quantitative and large-scale analysis of the processes underlying the emergence of a collective information and language dynamics.

  8. Timing in turn-taking and its implications for processing models of language

    PubMed Central

    Levinson, Stephen C.; Torreira, Francisco

    2015-01-01

    The core niche for language use is in verbal interaction, involving the rapid exchange of turns at talking. This paper reviews the extensive literature about this system, adding new statistical analyses of behavioral data where they have been missing, demonstrating that turn-taking has the systematic properties originally noted by Sacks et al. (1974; hereafter SSJ). This system poses some significant puzzles for current theories of language processing: the gaps between turns are short (of the order of 200 ms), but the latencies involved in language production are much longer (over 600 ms). This seems to imply that participants in conversation must predict (or ‘project’ as SSJ have it) the end of the current speaker’s turn in order to prepare their response in advance. This in turn implies some overlap between production and comprehension despite their use of common processing resources. Collecting together what is known behaviorally and experimentally about the system, the space for systematic explanations of language processing for conversation can be significantly narrowed, and we sketch some first model of the mental processes involved for the participant preparing to speak next. PMID:26124727

  9. Bootstrapping language acquisition.

    PubMed

    Abend, Omri; Kwiatkowski, Tom; Smith, Nathaniel J; Goldwater, Sharon; Steedman, Mark

    2017-07-01

    The semantic bootstrapping hypothesis proposes that children acquire their native language through exposure to sentences of the language paired with structured representations of their meaning, whose component substructures can be associated with words and syntactic structures used to express these concepts. The child's task is then to learn a language-specific grammar and lexicon based on (probably contextually ambiguous, possibly somewhat noisy) pairs of sentences and their meaning representations (logical forms). Starting from these assumptions, we develop a Bayesian probabilistic account of semantically bootstrapped first-language acquisition in the child, based on techniques from computational parsing and interpretation of unrestricted text. Our learner jointly models (a) word learning: the mapping between components of the given sentential meaning and lexical words (or phrases) of the language, and (b) syntax learning: the projection of lexical elements onto sentences by universal construction-free syntactic rules. Using an incremental learning algorithm, we apply the model to a dataset of real syntactically complex child-directed utterances and (pseudo) logical forms, the latter including contextually plausible but irrelevant distractors. Taking the Eve section of the CHILDES corpus as input, the model simulates several well-documented phenomena from the developmental literature. In particular, the model exhibits syntactic bootstrapping effects (in which previously learned constructions facilitate the learning of novel words), sudden jumps in learning without explicit parameter setting, acceleration of word-learning (the "vocabulary spurt"), an initial bias favoring the learning of nouns over verbs, and one-shot learning of words and their meanings. The learner thus demonstrates how statistical learning over structured representations can provide a unified account for these seemingly disparate phenomena. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. The Interplay between Spoken Language and Informal Definitions of Statistical Concepts

    ERIC Educational Resources Information Center

    Lavy, Ilana; Mashiach-Eizenberg, Michal

    2009-01-01

    Various terms are used to describe mathematical concepts, in general, and statistical concepts, in particular. Regarding statistical concepts in the Hebrew language, some of these terms have the same meaning both in their everyday use and in mathematics, such as Mode; some of them have a different meaning, such as Expected value and Life…

  11. Inferring Action Structure and Causal Relationships in Continuous Sequences of Human Action

    DTIC Science & Technology

    2014-01-01

    language processing literature (e.g., Brent, 1999; Venkataraman, 2001), and which were also used by Goldwater et al. (2009). Precision (P) is the...trees in oriented linear graphs. Simon Stevin: Wis-en Natuurkundig Tijdschrift, 28, 203. Venkataraman, A. (2001). A statistical model for word discovery

  12. Monte Carlo Approach for Reliability Estimations in Generalizability Studies.

    ERIC Educational Resources Information Center

    Dimitrov, Dimiter M.

    A Monte Carlo approach is proposed, using the Statistical Analysis System (SAS) programming language, for estimating reliability coefficients in generalizability theory studies. Test scores are generated by a probabilistic model that considers the probability for a person with a given ability score to answer an item with a given difficulty…

  13. Automated speech understanding: the next generation

    NASA Astrophysics Data System (ADS)

    Picone, J.; Ebel, W. J.; Deshmukh, N.

    1995-04-01

    Modern speech understanding systems merge interdisciplinary technologies from Signal Processing, Pattern Recognition, Natural Language, and Linguistics into a unified statistical framework. These systems, which have applications in a wide range of signal processing problems, represent a revolution in Digital Signal Processing (DSP). Once a field dominated by vector-oriented processors and linear algebra-based mathematics, the current generation of DSP-based systems rely on sophisticated statistical models implemented using a complex software paradigm. Such systems are now capable of understanding continuous speech input for vocabularies of several thousand words in operational environments. The current generation of deployed systems, based on small vocabularies of isolated words, will soon be replaced by a new technology offering natural language access to vast information resources such as the Internet, and provide completely automated voice interfaces for mundane tasks such as travel planning and directory assistance.

  14. Patterns and Predictors of Language and Literacy Abilities 4-10 Years in the Longitudinal Study of Australian Children.

    PubMed

    Zubrick, Stephen R; Taylor, Catherine L; Christensen, Daniel

    2015-01-01

    Oral language is the foundation of literacy. Naturally, policies and practices to promote children's literacy begin in early childhood and have a strong focus on developing children's oral language, especially for children with known risk factors for low language ability. The underlying assumption is that children's progress along the oral to literate continuum is stable and predictable, such that low language ability foretells low literacy ability. This study investigated patterns and predictors of children's oral language and literacy abilities at 4, 6, 8 and 10 years. The study sample comprised 2,316 to 2,792 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Six developmental patterns were observed, a stable middle-high pattern, a stable low pattern, an improving pattern, a declining pattern, a fluctuating low pattern, and a fluctuating middle-high pattern. Most children (69%) fit a stable middle-high pattern. By contrast, less than 1% of children fit a stable low pattern. These results challenged the view that children's progress along the oral to literate continuum is stable and predictable. Multivariate logistic regression was used to investigate risks for low literacy ability at 10 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. Predictors were modelled as risk variables with the lowest level of risk as the reference category. In the multivariate model, substantial risks for low literacy ability at 10 years, in order of descending magnitude, were: low school readiness, Aboriginal and/or Torres Strait Islander status and low language ability at 8 years. Moderate risks were high temperamental reactivity, low language ability at 4 years, and low language ability at 6 years. 
The following risk factors were not statistically significant in the multivariate model: Low maternal consistency, low family income, health care card, child not read to at home, maternal smoking, maternal education, family structure, temperamental persistence, and socio-economic area disadvantage. The results of the sensitivity-specificity analysis showed that a well-fitted multivariate model featuring risks of substantive magnitude did not do particularly well in predicting low literacy ability at 10 years.

  15. Sampling Assumptions Affect Use of Indirect Negative Evidence in Language Learning.

    PubMed

    Hsu, Anne; Griffiths, Thomas L

    2016-01-01

    A classic debate in cognitive science revolves around understanding how children learn complex linguistic patterns, such as restrictions on verb alternations and contractions, without negative evidence. Recently, probabilistic models of language learning have been applied to this problem, framing it as a statistical inference from a random sample of sentences. These probabilistic models predict that learners should be sensitive to the way in which sentences are sampled. There are two main types of sampling assumptions that can operate in language learning: strong and weak sampling. Strong sampling, as assumed by probabilistic models, assumes the learning input is drawn from a distribution of grammatical samples from the underlying language and aims to learn this distribution. Thus, under strong sampling, the absence of a sentence construction from the input provides evidence that it has low or zero probability of grammaticality. Weak sampling does not make assumptions about the distribution from which the input is drawn, and thus the absence of a construction from the input is not used as evidence of its ungrammaticality. We demonstrate in a series of artificial language learning experiments that adults can produce behavior consistent with both sets of sampling assumptions, depending on how the learning problem is presented. These results suggest that people use information about the way in which linguistic input is sampled to guide their learning.
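
    The contrast between the two sampling assumptions can be made concrete with a small Bayesian sketch. It is an illustration under simplifying assumptions, not the model used in the paper: a learner compares a restrictive grammar with a permissive one that licenses a superset of its constructions, every observed sentence is consistent with both, and under strong sampling each grammar generates its constructions uniformly. The function and parameter names are hypothetical.

```python
def posterior_restrictive(n_observed, size_restrictive, size_permissive,
                          prior_restrictive=0.5, strong=True):
    """Posterior probability of the restrictive grammar after n_observed
    sentences, all of which both grammars can generate.

    strong=True  -- each sentence is drawn uniformly from the true grammar,
                    so its likelihood is 1 / (number of constructions).
    strong=False -- weak sampling: consistent sentences have the same
                    likelihood under both grammars.
    """
    if strong:
        like_restrictive = (1.0 / size_restrictive) ** n_observed
        like_permissive = (1.0 / size_permissive) ** n_observed
    else:
        like_restrictive = like_permissive = 1.0

    numerator = prior_restrictive * like_restrictive
    denominator = numerator + (1.0 - prior_restrictive) * like_permissive
    return numerator / denominator
```

    With 4 constructions in the restrictive grammar and 8 in the permissive one, five consistent sentences raise the posterior of the restrictive grammar from 0.5 to 32/33 under strong sampling, so constructions that never appear become implicit negative evidence. Under weak sampling the posterior stays at the prior however many sentences are observed.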

  16. Sampling Assumptions Affect Use of Indirect Negative Evidence in Language Learning

    PubMed Central

    2016-01-01

    A classic debate in cognitive science revolves around understanding how children learn complex linguistic patterns, such as restrictions on verb alternations and contractions, without negative evidence. Recently, probabilistic models of language learning have been applied to this problem, framing it as a statistical inference from a random sample of sentences. These probabilistic models predict that learners should be sensitive to the way in which sentences are sampled. There are two main types of sampling assumptions that can operate in language learning: strong and weak sampling. Strong sampling, as assumed by probabilistic models, assumes the learning input is drawn from a distribution of grammatical samples from the underlying language and aims to learn this distribution. Thus, under strong sampling, the absence of a sentence construction from the input provides evidence that it has low or zero probability of grammaticality. Weak sampling does not make assumptions about the distribution from which the input is drawn, and thus the absence of a construction from the input is not used as evidence of its ungrammaticality. We demonstrate in a series of artificial language learning experiments that adults can produce behavior consistent with both sets of sampling assumptions, depending on how the learning problem is presented. These results suggest that people use information about the way in which linguistic input is sampled to guide their learning. PMID:27310576

  17. A novel probabilistic framework for event-based speech recognition

    NASA Astrophysics Data System (ADS)

    Juneja, Amit; Espy-Wilson, Carol

    2003-10-01

    One of the reasons for unsatisfactory performance of the state-of-the-art automatic speech recognition (ASR) systems is the inferior acoustic modeling of low-level acoustic-phonetic information in the speech signal. An acoustic-phonetic approach to ASR, on the other hand, explicitly targets linguistic information in the speech signal, but such a system for continuous speech recognition (CSR) is not known to exist. A probabilistic and statistical framework for CSR based on the idea of the representation of speech sounds by bundles of binary valued articulatory phonetic features is proposed. Multiple probabilistic sequences of linguistically motivated landmarks are obtained using binary classifiers of manner phonetic features (syllabic, sonorant and continuant) and the knowledge-based acoustic parameters (APs) that are acoustic correlates of those features. The landmarks are then used for the extraction of knowledge-based APs for source and place phonetic features and their binary classification. Probabilistic landmark sequences are constrained using manner class language models for isolated or connected word recognition. The proposed method could overcome the disadvantages encountered by the early acoustic-phonetic knowledge-based systems that led the ASR community to switch to systems highly dependent on statistical pattern analysis methods and probabilistic language or grammar models.

  18. On representing the prognostic value of continuous gene expression biomarkers with the restricted mean survival curve.

    PubMed

    Eng, Kevin H; Schiller, Emily; Morrell, Kayla

    2015-11-03

    Researchers developing biomarkers for cancer prognosis from quantitative gene expression data are often faced with an odd methodological discrepancy: while Cox's proportional hazards model, the appropriate and popular technique, produces a continuous and relative risk score, it is hard to cast the estimate in clear clinical terms like median months of survival and percent of patients affected. To produce a familiar Kaplan-Meier plot, researchers commonly make the decision to dichotomize a continuous (often unimodal and symmetric) score. It is well known in the statistical literature that this procedure induces significant bias. We illustrate the liabilities of common techniques for categorizing a risk score and discuss alternative approaches. We promote the use of the restricted mean survival (RMS) and the corresponding RMS curve that may be thought of as an analog to the best fit line from simple linear regression. Continuous biomarker workflows should be modified to include the more rigorous statistical techniques and descriptive plots described in this article. All statistics discussed can be computed via standard functions in the Survival package of the R statistical programming language. Example R language code for the RMS curve is presented in the appendix.
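
    The restricted mean survival for a single group can be illustrated as the area under a Kaplan-Meier curve up to a cutoff tau. The sketch below is not the R code from the article's appendix; the function names and the toy follow-up data are hypothetical, and ties and censoring are handled in the simplest standard way.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def restricted_mean_survival(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


# Hypothetical follow-up times in months; the second subject is censored.
rms_censored = restricted_mean_survival([2, 3, 5], [1, 0, 1], tau=5)
# With no censoring and tau at the last event, RMS equals the sample mean.
rms_complete = restricted_mean_survival([1, 2, 3, 4], [1, 1, 1, 1], tau=4)
```

    Because the result is expressed in the same time units as the follow-up, it can be read directly as expected months of survival within the first tau months.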

  19. Non-invasive brain stimulation to investigate language production in healthy speakers: A meta-analysis.

    PubMed

    Klaus, Jana; Schutter, Dennis J L G

    2018-06-01

    Non-invasive brain stimulation (NIBS) has become a common method to study the interrelations between the brain and language functioning. This meta-analysis examined the efficacy of transcranial magnetic stimulation (TMS) and direct current stimulation (tDCS) in the study of language production in healthy volunteers. Forty-five effect sizes from 30 studies which investigated the effects of NIBS on picture naming or verbal fluency in healthy participants were meta-analysed. Further sub-analyses investigated potential influences of stimulation type, control, target site, task, online vs. offline application, and current density of the target electrode. Random effects modelling showed a small, but reliable effect of NIBS on language production. Subsequent analyses indicated larger weighted mean effect sizes for TMS as compared to tDCS studies. No statistical differences for the other sub-analyses were observed. We conclude that NIBS is a useful method for neuroscientific studies on language production in healthy volunteers. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
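
    As a rough illustration of random effects pooling, the sketch below uses the DerSimonian-Laird estimator of between-study variance. The abstract does not say which estimator the authors used, so this choice is an assumption, and the effect sizes and variances are made up for the example.

```python
def random_effects_pool(effects, variances):
    """Pool study effects with DerSimonian-Laird random-effects weights.

    Returns (pooled effect, standard error, between-study variance tau^2).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1 / sum(w_star)) ** 0.5
    return pooled, se, tau2


# Hypothetical effect sizes and sampling variances from two studies.
pooled, se, tau2 = random_effects_pool([0.0, 1.0], [0.1, 0.1])
```

    When the studies agree closely, tau^2 is truncated at zero and the result reduces to the ordinary inverse-variance average.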

  20. Cross-language Transfer of Metalinguistic Skills: Evidence from Spelling English Words by Korean Students in Grades 4, 5 and 6.

    PubMed

    Yeon, Sookkyung; Bae, Han Suk; Joshi, R Malatesha

    2017-11-01

    The present study examined unique and shared contributions of Korean (first language) phonological, orthographic and morphological awareness (PA, OA and MA, respectively) to English (second/foreign language) spelling among 287 fourth-grade, fifth-grade and sixth-grade Korean children. Korean measures of PA, OA and MA were administered, in addition to English vocabulary and spelling measures. Results from structural equation modelling showed that PA, OA and MA were caused by one common construct, metalinguistic awareness (META), and the contribution of Korean META to English spelling was statistically significant, controlling for English vocabulary. In particular, Korean MA and PA played unique roles in explaining English spelling; whereas Korean OA did not significantly contribute to English spelling. Findings from the present study provided empirical evidence of first language META transfer effect on second/foreign language spelling development. Educational implications and future research ideas are discussed. Copyright © 2017 John Wiley & Sons, Ltd.

  1. A Model for Indexing Medical Documents Combining Statistical and Symbolic Knowledge.

    PubMed Central

    Avillach, Paul; Joubert, Michel; Fieschi, Marius

    2007-01-01

    OBJECTIVES: To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context. METHODS: We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs). RESULTS: The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases. CONCLUSIONS: The use of several terminologies leads to more precise indexing. The improvement achieved in the model’s implementation performances as a result of using semantic relationships is encouraging. PMID:18693792

  2. Statistical Laws Governing Fluctuations in Word Use from Word Birth to Word Death

    PubMed Central

    Petersen, Alexander M.; Tenenbaum, Joel; Havlin, Shlomo; Stanley, H. Eugene

    2012-01-01

    We analyze the dynamic properties of 10^7 words recorded in English, Spanish and Hebrew over the period 1800–2008 in order to gain insight into the coevolution of language and culture. We report language independent patterns useful as benchmarks for theoretical models of language evolution. A significantly decreasing (increasing) trend in the birth (death) rate of words indicates a recent shift in the selection laws governing word use. For new words, we observe a peak in the growth-rate fluctuations around 40 years after introduction, consistent with the typical entry time into standard dictionaries and the human generational timescale. Pronounced changes in the dynamics of language during periods of war show that word correlations, occurring across time and between words, are largely influenced by coevolutionary social, technological, and political factors. We quantify cultural memory by analyzing the long-term correlations in the use of individual words using detrended fluctuation analysis. PMID:22423321

  3. Statistical Laws Governing Fluctuations in Word Use from Word Birth to Word Death

    NASA Astrophysics Data System (ADS)

    Petersen, Alexander M.; Tenenbaum, Joel; Havlin, Shlomo; Stanley, H. Eugene

    2012-03-01

    We analyze the dynamic properties of 10^7 words recorded in English, Spanish and Hebrew over the period 1800-2008 in order to gain insight into the coevolution of language and culture. We report language independent patterns useful as benchmarks for theoretical models of language evolution. A significantly decreasing (increasing) trend in the birth (death) rate of words indicates a recent shift in the selection laws governing word use. For new words, we observe a peak in the growth-rate fluctuations around 40 years after introduction, consistent with the typical entry time into standard dictionaries and the human generational timescale. Pronounced changes in the dynamics of language during periods of war show that word correlations, occurring across time and between words, are largely influenced by coevolutionary social, technological, and political factors. We quantify cultural memory by analyzing the long-term correlations in the use of individual words using detrended fluctuation analysis.

  4. Rank Diversity of Languages: Generic Behavior in Computational Linguistics

    PubMed Central

    Cocho, Germinal; Flores, Jorge; Gershenson, Carlos; Pineda, Carlos; Sánchez, Sergio

    2015-01-01

    Statistical studies of languages have focused on the rank-frequency distribution of words. Instead, we introduce here a measure of how word ranks change in time and call this distribution rank diversity. We calculate this diversity for books published in six European languages since 1800, and find that it follows a universal lognormal distribution. Based on the mean and standard deviation associated with the lognormal distribution, we define three different word regimes of languages: “heads” consist of words whose rank hardly changes over time, “bodies” are words of general use, while “tails” comprise context-specific words whose rank varies considerably over time. The heads and bodies reflect the size of language cores identified by linguists for basic communication. We propose a Gaussian random walk model which reproduces the rank variation of words in time and thus the diversity. Rank diversity of words can be understood as the result of random variations in rank, where the size of the variation depends on the rank itself. We find that the core size is similar for all languages studied. PMID:25849150
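
    A minimal sketch of a rank diversity computation follows. It assumes that the diversity at rank k is the number of distinct words seen at that rank across all time points, divided by the number of time points; the abstract does not spell out the formula, so this definition and the tiny yearly rankings are assumptions for illustration only.

```python
def rank_diversity(rankings):
    """Compute the assumed rank diversity d(k) for each rank k.

    rankings: one list per time point, most frequent word first.
    d(k) = (number of distinct words seen at rank k) / (number of time points)
    """
    n_points = len(rankings)
    # Only ranks present at every time point can be compared.
    depth = min(len(r) for r in rankings)
    return [len({r[k] for r in rankings}) / n_points for k in range(depth)]


# Hypothetical yearly rankings, most frequent word first.
yearly = [
    ["the", "of", "and", "war"],
    ["the", "of", "and", "radio"],
    ["the", "and", "of", "peace"],
    ["the", "of", "and", "computer"],
]
d = rank_diversity(yearly)
```

    In this toy data the top rank never changes hands (low diversity, a "head" word), while the fourth rank is occupied by a different word every year (diversity of 1, as in the "tails").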

  5. Rank diversity of languages: generic behavior in computational linguistics.

    PubMed

    Cocho, Germinal; Flores, Jorge; Gershenson, Carlos; Pineda, Carlos; Sánchez, Sergio

    2015-01-01

    Statistical studies of languages have focused on the rank-frequency distribution of words. Instead, we introduce here a measure of how word ranks change in time and call this distribution rank diversity. We calculate this diversity for books published in six European languages since 1800, and find that it follows a universal lognormal distribution. Based on the mean and standard deviation associated with the lognormal distribution, we define three different word regimes of languages: "heads" consist of words whose rank hardly changes over time, "bodies" are words of general use, while "tails" comprise context-specific words whose rank varies considerably over time. The heads and bodies reflect the size of language cores identified by linguists for basic communication. We propose a Gaussian random walk model which reproduces the rank variation of words in time and thus the diversity. Rank diversity of words can be understood as the result of random variations in rank, where the size of the variation depends on the rank itself. We find that the core size is similar for all languages studied.

  6. Word-level language modeling for P300 spellers based on discriminative graphical models

    NASA Astrophysics Data System (ADS)

    Delgado Saa, Jaime F.; de Pesters, Adriana; McFarland, Dennis; Çetin, Müjdat

    2015-04-01

    Objective. In this work we propose a probabilistic graphical model framework that uses language priors at the level of words as a mechanism to increase the performance of P300-based spellers. Approach. This paper is concerned with brain-computer interfaces based on P300 spellers. Motivated by P300 spelling scenarios involving communication based on a limited vocabulary, we propose a probabilistic graphical model framework and an associated classification algorithm that uses learned statistical models of language at the level of words. Exploiting such high-level contextual information helps reduce the error rate of the speller. Main results. Our experimental results demonstrate that the proposed approach offers several advantages over existing methods. Most importantly, it increases the classification accuracy while reducing the number of times the letters need to be flashed, increasing the communication rate of the system. Significance. The proposed approach models all the variables in the P300 speller in a unified framework and has the capability to correct errors in previous letters in a word, given the data for the current one. The structure of the model we propose allows the use of efficient inference algorithms, which in turn makes it possible to use this approach in real-time applications.

  7. Automatic Coding of Short Text Responses via Clustering in Educational Assessment

    ERIC Educational Resources Information Center

    Zehner, Fabian; Sälzer, Christine; Goldhammer, Frank

    2016-01-01

    Automatic coding of short text responses opens new doors in assessment. We implemented and integrated baseline methods of natural language processing and statistical modelling by means of software components that are available under open licenses. The accuracy of automatic text coding is demonstrated by using data collected in the "Programme…

  8. Bootstrapping in a Language of Thought: A Formal Model of Numerical Concept Learning

    ERIC Educational Resources Information Center

    Piantadosi, Steven T.; Tenenbaum, Joshua B.; Goodman, Noah D.

    2012-01-01

    In acquiring number words, children exhibit a qualitative leap in which they transition from understanding a few number words, to possessing a rich system of interrelated numerical concepts. We present a computational framework for understanding this inductive leap as the consequence of statistical inference over a sufficiently powerful…

  9. Age and experience shape developmental changes in the neural basis of language-related learning.

    PubMed

    McNealy, Kristin; Mazziotta, John C; Dapretto, Mirella

    2011-11-01

    Very little is known about the neural underpinnings of language learning across the lifespan and how these might be modified by maturational and experiential factors. Building on behavioral research highlighting the importance of early word segmentation (i.e. the detection of word boundaries in continuous speech) for subsequent language learning, here we characterize developmental changes in brain activity as this process occurs online, using data collected in a mixed cross-sectional and longitudinal design. One hundred and fifty-six participants, ranging from age 5 to adulthood, underwent functional magnetic resonance imaging (fMRI) while listening to three novel streams of continuous speech, which contained either strong statistical regularities, strong statistical regularities and speech cues, or weak statistical regularities providing minimal cues to word boundaries. All age groups displayed significant signal increases over time in temporal cortices for the streams with high statistical regularities; however, we observed a significant right-to-left shift in the laterality of these learning-related increases with age. Interestingly, only the 5- to 10-year-old children displayed significant signal increases for the stream with low statistical regularities, suggesting an age-related decrease in sensitivity to more subtle statistical cues. Further, in a sample of 78 10-year-olds, we examined the impact of proficiency in a second language and level of pubertal development on learning-related signal increases, showing that the brain regions involved in language learning are influenced by both experiential and maturational factors. 2011 Blackwell Publishing Ltd.

  10. An order statistics approach to the halo model for galaxies

    NASA Astrophysics Data System (ADS)

    Paul, Niladri; Paranjape, Aseem; Sheth, Ravi K.

    2017-04-01

    We use the halo model to explore the implications of assuming that galaxy luminosities in groups are randomly drawn from an underlying luminosity function. We show that even the simplest of such order statistics models - one in which this luminosity function p(L) is universal - naturally produces a number of features associated with previous analyses based on the 'central plus Poisson satellites' hypothesis. These include the monotonic relation of mean central luminosity with halo mass, the lognormal distribution around this mean and the tight relation between the central and satellite mass scales. In stark contrast to observations of galaxy clustering, however, this model predicts no luminosity dependence of large-scale clustering. We then show that an extended version of this model, based on the order statistics of a halo mass dependent luminosity function p(L|m), is in much better agreement with the clustering data as well as satellite luminosities, but systematically underpredicts central luminosities. This brings into focus the idea that central galaxies constitute a distinct population that is affected by different physical processes than are the satellites. We model this physical difference as a statistical brightening of the central luminosities, over and above the order statistics prediction. The magnitude gap between the brightest and second brightest group galaxy is predicted as a by-product, and is also in good agreement with observations. We propose that this order statistics framework provides a useful language in which to compare the halo model for galaxies with more physically motivated galaxy formation models.

  11. Domain General Constraints on Statistical Learning

    ERIC Educational Resources Information Center

    Thiessen, Erik D.

    2011-01-01

    All theories of language development suggest that learning is constrained. However, theories differ on whether these constraints arise from language-specific processes or have domain-general origins such as the characteristics of human perception and information processing. The current experiments explored constraints on statistical learning of…

  12. Effects of statistical learning on the acquisition of grammatical categories through Qur'anic memorization: A natural experiment.

    PubMed

    Zuhurudeen, Fathima Manaar; Huang, Yi Ting

    2016-03-01

    Empirical evidence for statistical learning comes from artificial language tasks, but it is unclear how these effects scale up outside of the lab. The current study turns to a real-world test case of statistical learning where native English speakers encounter the syntactic regularities of Arabic through memorization of the Qur'an. This unique input provides extended exposure to the complexity of a natural language, with minimal semantic cues. Memorizers were asked to distinguish unfamiliar nouns and verbs based on their co-occurrence with familiar pronouns in an Arabic language sample. Their performance was compared to that of classroom learners who had explicit knowledge of pronoun meanings and grammatical functions. Grammatical judgments were more accurate in memorizers compared to non-memorizers. No effects of classroom experience were found. These results demonstrate that real-world exposure to the statistical properties of a natural language facilitates the acquisition of grammatical categories. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Building flexible real-time systems using the Flex language

    NASA Technical Reports Server (NTRS)

    Kenny, Kevin B.; Lin, Kwei-Jay

    1991-01-01

    The design and implementation of a real-time programming language called Flex, which is a derivative of C++, are presented. It is shown how different types of timing requirements might be expressed and enforced in Flex, how they might be fulfilled in a flexible way using different program models, and how the programming environment can help in making binding and scheduling decisions. The timing constraint primitives in Flex are easy to use yet powerful enough to define both independent and relative timing constraints. Program models like imprecise computation and performance polymorphism can carry out flexible real-time programs. In addition, programmers can use a performance measurement tool that produces statistically correct timing models to predict the expected execution time of a program and to help make binding decisions. A real-time programming environment is also presented.

  14. Microsimulation Modeling for Health Decision Sciences Using R: A Tutorial.

    PubMed

    Krijkamp, Eline M; Alarid-Escudero, Fernando; Enns, Eva A; Jalal, Hawre J; Hunink, M G Myriam; Pechlivanoglou, Petros

    2018-04-01

    Microsimulation models are becoming increasingly common in the field of decision modeling for health. Because microsimulation models are computationally more demanding than traditional Markov cohort models, the use of computer programming languages in their development has become more common. R is a programming language that has gained recognition within the field of decision modeling. It has the capacity to perform microsimulation models more efficiently than software commonly used for decision modeling, incorporate statistical analyses within decision models, and produce more transparent models and reproducible results. However, no clear guidance for the implementation of microsimulation models in R exists. In this tutorial, we provide a step-by-step guide to build microsimulation models in R and illustrate the use of this guide on a simple, but transferable, hypothetical decision problem. We guide the reader through the necessary steps and provide generic R code that is flexible and can be adapted for other models. We also show how this code can be extended to address more complex model structures and provide an efficient microsimulation approach that relies on vectorization solutions.

  15. Informatics Technology Mimics Ecology: Dense, Mutualistic Collaboration Networks Are Associated with Higher Publication Rates

    PubMed Central

    Sorani, Marco D.

    2012-01-01

    Information technology (IT) adoption enables biomedical research. Publications are an accepted measure of research output, and network models can describe the collaborative nature of publication. In particular, ecological networks can serve as analogies for publication and technology adoption. We constructed network models of adoption of bioinformatics programming languages and health IT (HIT) from the literature. We selected seven programming languages and four types of HIT. We performed PubMed searches to identify publications since 2001. We calculated summary statistics and analyzed spatiotemporal relationships. Then, we assessed ecological models of specialization, cooperativity, competition, evolution, biodiversity, and stability associated with publications. Adoption of HIT has been variable, while scripting languages have experienced rapid adoption. Hospital systems had the largest HIT research corpus, while Perl had the largest language corpus. Scripting languages represented the largest connected network components. The relationship between edges and nodes was linear, though Bioconductor had more edges than expected and Perl had fewer. Spatiotemporal relationships were weak. Most languages shared a bioinformatics specialization and appeared mutualistic or competitive. HIT specializations varied. Specialization was highest for Bioconductor and radiology systems. Specialization and cooperativity were positively correlated among languages but negatively correlated among HIT. Rates of language evolution were similar. Biodiversity among languages grew in the first half of the decade and stabilized, while diversity among HIT was variable but flat. Compared with publications in 2001, correlation with publications one year later was positive while correlation after ten years was weak and negative. Adoption of new technologies can be unpredictable. Spatiotemporal relationships facilitate adoption but are not sufficient. As with ecosystems, dense, mutualistic, specialized co-habitation is associated with faster growth. There are rapidly changing trends in external technological and macroeconomic influences. We propose that a better understanding of how technologies are adopted can facilitate their development. PMID:22279593

  16. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    NASA Technical Reports Server (NTRS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior, whereas the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
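
    The two tests named in this abstract can be sketched on a nucleotide string as follows. This is an illustrative reconstruction rather than the authors' procedure: overlapping n-grams are assumed, and the redundancy is taken to be 1 - H(n) / (n log2 4), a common normalisation that the abstract itself does not state. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def ngram_counts(seq, n):
    """Count overlapping n-grams (n-tuples) in a sequence string."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def zipf_table(seq, n):
    """Rank n-grams by frequency: [(rank, n-gram, count)], rank 1 = most common."""
    ranked = ngram_counts(seq, n).most_common()
    return [(rank, gram, count) for rank, (gram, count) in enumerate(ranked, 1)]


def ngram_entropy(seq, n):
    """Shannon entropy (in bits) of the empirical n-gram distribution."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def ngram_redundancy(seq, n, alphabet_size=4):
    """Assumed definition: 1 - H(n) / (n * log2(alphabet_size))."""
    return 1 - ngram_entropy(seq, n) / (n * log2(alphabet_size))


# A uniform sequence has maximal entropy; a repeat has maximal redundancy.
h_uniform = ngram_entropy("ACGT", 1)
r_repeat = ngram_redundancy("AAAA", 1)
top = zipf_table("AACAAG", 1)[0]
```

    Plotting count against rank from `zipf_table` on log-log axes is the usual way to look for a power-law regime of the kind the abstract describes.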

  17. Cracking the Language Code: Neural Mechanisms Underlying Speech Parsing

    PubMed Central

    McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella

    2013-01-01

    Word segmentation, detecting word boundaries in continuous speech, is a critical aspect of language learning. Previous research in infants and adults demonstrated that a stream of speech can be readily segmented based solely on the statistical and speech cues afforded by the input. Using functional magnetic resonance imaging (fMRI), the neural substrate of word segmentation was examined on-line as participants listened to three streams of concatenated syllables, containing either statistical regularities alone, statistical regularities and speech cues, or no cues. Despite the participants’ inability to explicitly detect differences between the speech streams, neural activity differed significantly across conditions, with left-lateralized signal increases in temporal cortices observed only when participants listened to streams containing statistical regularities, particularly the stream containing speech cues. In a second fMRI study, designed to verify that word segmentation had implicitly taken place, participants listened to trisyllabic combinations that occurred with different frequencies in the streams of speech they just heard (“words,” 45 times; “partwords,” 15 times; “nonwords,” once). Reliably greater activity in left inferior and middle frontal gyri was observed when comparing words with partwords and, to a lesser extent, when comparing partwords with nonwords. Activity in these regions, taken to index the implicit detection of word boundaries, was positively correlated with participants’ rapid auditory processing skills. These findings provide a neural signature of on-line word segmentation in the mature brain and an initial model with which to study developmental changes in the neural architecture involved in processing speech cues during language learning. PMID:16855090

  18. The impact of science notebook writing on ELL and low-SES students' science language development and conceptual understanding

    NASA Astrophysics Data System (ADS)

    Huerta, Margarita

    This quantitative study explored the impact of literacy integration in a science inquiry classroom involving the use of science notebooks on the academic language development and conceptual understanding of students from diverse (i.e., English Language Learners, or ELLs) and low socio-economic status (low-SES) backgrounds. The study derived from a randomized, longitudinal, field-based NSF funded research project (NSF Award No. DRL - 0822343) targeting ELL and non-ELL students from low-SES backgrounds in a large urban school district in Southeast Texas. The study used a scoring rubric (modified and tested for validity and reliability) to analyze fifth-grade school students' science notebook entries. Scores for academic language quality (or, for brevity, language) were used to compare language growth over time across three time points (i.e., beginning, middle, and end of the school year) and to compare students across categories (ELL, former ELL, non-ELL, and gender) using descriptive statistics and mixed between-within subjects analysis of variance (ANOVA). Scores for conceptual understanding (or, for brevity, concept) were used to compare students across categories (ELL, former ELL, non-ELL, and gender) in three domains using descriptive statistics and ANOVA. A correlational analysis was conducted to explore the relationship, if any, between language scores and concept scores for each group. Students demonstrated statistically significant growth over time in their academic language as reflected by science notebook scores. While ELL students scored lower than former ELL and non-ELL students at the first two time points, they caught up to their peers by the third time point. Similarly, females outperformed males in language scores in the first two time points, but males caught up to females in the third time point. In analyzing conceptual scores, ELLs had statistically significantly lower scores than former-ELL and non-ELL students, and females outperformed males in the first two domains. These differences, however, were not statistically significant in the last domain. Last, correlations between language and concept scores were, overall, positive, large, and significant across domains and groups. The study presents a rubric useful for quantifying diverse students' science notebook entries, and findings add to the sparse research on the impact of writing in diverse students' language development and conceptual understanding in science.

  19. Statistical Methodology for the Analysis of Repeated Duration Data in Behavioral Studies.

    PubMed

    Letué, Frédérique; Martinez, Marie-José; Samson, Adeline; Vilain, Anne; Vilain, Coriandre

    2018-03-15

    Repeated duration data are frequently used in behavioral studies. Classical linear or log-linear mixed models are often inadequate to analyze such data, because they usually consist of nonnegative and skew-distributed variables. Therefore, we recommend use of a statistical methodology specific to duration data. We propose a methodology based on Cox mixed models and written under the R language. This semiparametric model is indeed flexible enough to fit duration data. To compare log-linear and Cox mixed models in terms of goodness-of-fit on real data sets, we also provide a procedure based on simulations and quantile-quantile plots. We present two examples from a data set of speech and gesture interactions, which illustrate the limitations of linear and log-linear mixed models, as compared to Cox models. The linear models are not validated on our data, whereas Cox models are. Moreover, in the second example, the Cox model exhibits a significant effect that the linear model does not. We provide methods to select the best-fitting models for repeated duration data and to compare statistical methodologies. In this study, we show that Cox models are best suited to the analysis of our data set.

  20. Probing the Statistical Properties of Unknown Texts: Application to the Voynich Manuscript

    PubMed Central

    Amancio, Diego R.; Altmann, Eduardo G.; Rybski, Diego; Oliveira, Osvaldo N.; Costa, Luciano da F.

    2013-01-01

    While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has been performed on the interdependence between syntactic and semantic factors. In this study we propose a framework for determining whether a text (e.g., written in an unknown alphabet) is compatible with a natural language and to which language it could belong. The approach is based on three types of statistical measurements, i.e. obtained from first-order statistics of word properties in a text, from the topology of complex networks representing texts, and from intermittency concepts where text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for keywords of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications. PMID:23844002

  1. Probing the statistical properties of unknown texts: application to the Voynich Manuscript.

    PubMed

    Amancio, Diego R; Altmann, Eduardo G; Rybski, Diego; Oliveira, Osvaldo N; Costa, Luciano da F

    2013-01-01

    While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has been performed on the interdependence between syntactic and semantic factors. In this study we propose a framework for determining whether a text (e.g., written in an unknown alphabet) is compatible with a natural language and to which language it could belong. The approach is based on three types of statistical measurements, i.e. obtained from first-order statistics of word properties in a text, from the topology of complex networks representing texts, and from intermittency concepts where text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for keywords of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications.

  2. Special Operations Forces Language and Culture Needs Assessment Project: Training Emphasis: Language and Culture

    DTIC Science & Technology

    2010-02-25

    gave significantly higher emphasis ratings (i.e., a statistically significant difference between SOF operators and SOF leaders). Responses were made...i.e., a statistically significant difference between SOF operators and SOF leaders). Responses were made on the following scale: 1 = No emphasis, 2...missions?” Means with an asterisk (*) indicate that the group gave significantly higher emphasis ratings (i.e., a statistically significant difference

  3. Familiar units prevail over statistical cues in word segmentation.

    PubMed

    Poulin-Charronnat, Bénédicte; Perruchet, Pierre; Tillmann, Barbara; Peereman, Ronald

    2017-09-01

    In language acquisition research, the prevailing position is that listeners exploit statistical cues, in particular transitional probabilities between syllables, to discover words of a language. However, other cues are also involved in word discovery. Assessing the weight learners give to these different cues leads to a better understanding of the processes underlying speech segmentation. The present study evaluated whether adult learners preferentially used known units or statistical cues for segmenting continuous speech. Before the exposure phase, participants were familiarized with part-words of a three-word artificial language. This design allowed the dissociation of the influence of statistical cues and familiar units, with statistical cues favoring word segmentation and familiar units favoring (nonoptimal) part-word segmentation. In Experiment 1, performance in a two-alternative forced choice (2AFC) task between words and part-words revealed part-word segmentation (even though part-words were less cohesive in terms of transitional probabilities and less frequent than words). By contrast, an unfamiliarized group exhibited word segmentation, as usually observed in standard conditions. Experiment 2 used a syllable-detection task to remove the likely contamination of performance by memory and strategy effects in the 2AFC task. Overall, the results suggest that familiar units overrode statistical cues, ultimately questioning the need for computation mechanisms of transitional probabilities (TPs) in natural language speech segmentation.
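    The transitional-probability cue discussed in this abstract can be illustrated with a minimal sketch. It assumes the usual forward definition, TP(b | a) = count(a followed by b) / count(a), and places a word boundary wherever the TP drops below a threshold. The three nonsense words, the stream order, and the 0.75 threshold are hypothetical choices for illustration and are not taken from the study.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate forward TPs, P(b | a), from adjacent syllable pairs."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # The final syllable has no successor, so it is excluded from the denominators.
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def segment(syllables, threshold):
    """Split the stream wherever the forward TP falls below the threshold."""
    tps = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Hypothetical three-word artificial language (illustrative only).
LEXICON = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["bi", "da", "ku"]}
ORDER = "ABCACBA"
stream = [syl for word in ORDER for syl in LEXICON[word]]

tps = transitional_probabilities(stream)
segmented = segment(stream, threshold=0.75)
```

    In this toy stream the within-word TPs are 1.0 and the between-word TPs are 0.5, so a purely statistical learner recovers the three words. The study's finding is that prior familiarity with part-words can override this cue.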

  4. Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks

    PubMed Central

    2017-01-01

    In de novo drug design, computational strategies are used to generate novel molecules with good affinity to the desired biological target. In this work, we show that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing. We demonstrate that the properties of the generated molecules correlate very well with the properties of the molecules used to train the model. In order to enrich libraries with molecules active toward a given biological target, we propose to fine-tune the model with small sets of molecules, which are known to be active against that target. Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test molecules that medicinal chemists designed, whereas against Plasmodium falciparum (Malaria), it reproduced 28% of 1240 test molecules. When coupled with a scoring function, our model can perform the complete de novo drug design cycle to generate large sets of novel molecules for drug discovery. PMID:29392184

  5. What can we learn from learning models about sensitivity to letter-order in visual word recognition?

    PubMed Central

    Lerner, Itamar; Armstrong, Blair C.; Frost, Ram

    2014-01-01

    Recent research on the effects of letter transposition in Indo-European Languages has shown that readers are surprisingly tolerant of these manipulations in a range of tasks. This evidence has motivated the development of new computational models of reading that regard flexibility in positional coding to be a core and universal principle of the reading process. Here we argue that such an approach does not capture cross-linguistic differences in transposed-letter effects, nor does it explain them. To address this issue, we investigated how a simple domain-general connectionist architecture performs in tasks such as letter-transposition and letter substitution when it had learned to process words in the context of different linguistic environments. The results show that in spite of the neurobiological noise involved in registering letter-position in all languages, flexibility and inflexibility in coding letter order is also shaped by the statistical orthographic properties of words in a language, such as the relative prevalence of anagrams. Our learning model also generated novel predictions for targeted empirical research, demonstrating a clear advantage of learning models for studying visual word recognition. PMID:25431521

  6. Conclusiveness of natural languages and recognition of images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wojcik, Z.M.

    1983-01-01

    The conclusiveness is investigated using recognition processes and one-one correspondence between expressions of a natural language and graphs representing events. The graphs, as conceived in psycholinguistics, are obtained as a result of perception processes. It is possible to generate and process the graphs automatically, using computers and then to convert the resulting graphs into expressions of a natural language. Correctness and conclusiveness of the graphs and sentences are investigated using the fundamental condition for events representation processes. Some consequences of the conclusiveness are discussed, e.g. undecidability of arithmetic, human brain asymmetry, correctness of statistical calculations and operations research. It is suggested that the group theory should be imposed on mathematical models of any real system. Proof of the fundamental condition is also presented. 14 references.

  7. The Metamorphosis of the Statistical Segmentation Output: Lexicalization during Artificial Language Learning

    ERIC Educational Resources Information Center

    Fernandes, Tania; Kolinsky, Regine; Ventura, Paulo

    2009-01-01

    This study combined artificial language learning (ALL) with conventional experimental techniques to test whether statistical speech segmentation outputs are integrated into adult listeners' mental lexicon. Lexicalization was assessed through inhibitory effects of novel neighbors (created by the parsing process) on auditory lexical decisions to…

  8. Statistical Learning in Specific Language Impairment: A Meta-Analysis

    ERIC Educational Resources Information Center

    Lammertink, Imme; Boersma, Paul; Wijnen, Frank; Rispens, Judith

    2017-01-01

    Purpose: The current meta-analysis provides a quantitative overview of published and unpublished studies on statistical learning in the auditory verbal domain in people with and without specific language impairment (SLI). The database used for the meta-analysis is accessible online and open to updates (Community-Augmented Meta-Analysis), which…

  9. Genetic Programming as Alternative for Predicting Development Effort of Individual Software Projects

    PubMed Central

    Chavoya, Arturo; Lopez-Martin, Cuauhtemoc; Andalon-Garcia, Irma R.; Meda-Campaña, M. E.

    2012-01-01

    Statistical and genetic programming techniques have been used to predict the software development effort of large software projects. In this paper, a genetic programming model was used for predicting the effort required in individually developed projects. Accuracy obtained from a genetic programming model was compared against one generated from the application of a statistical regression model. A sample of 219 projects developed by 71 practitioners was used for generating the two models, whereas another sample of 130 projects developed by 38 practitioners was used for validating them. The models used two kinds of lines of code as well as programming language experience as independent variables. Accuracy results from the model obtained with genetic programming suggest that it could be used to predict the software development effort of individual projects when these projects have been developed in a disciplined manner within a development-controlled environment. PMID:23226305

  10. Statistical Deviations From the Theoretical Only-SBU Model to Estimate MCU Rates in SRAMs

    NASA Astrophysics Data System (ADS)

    Franco, Francisco J.; Clemente, Juan Antonio; Baylac, Maud; Rey, Solenne; Villa, Francesca; Mecha, Hortensia; Agapito, Juan A.; Puchner, Helmut; Hubert, Guillaume; Velazco, Raoul

    2017-08-01

    This paper addresses a well-known problem that occurs when memories are exposed to radiation: determining whether a bit flip is isolated or belongs to a multiple event. As it is unusual to know the physical layout of the memory, this paper proposes to evaluate the statistical properties of the sets of corrupted addresses and to compare the results with a mathematical prediction model where all of the events are single bit upsets. A set of rules that is easy to implement in common programming languages can be iteratively applied if anomalies are observed, thus yielding a classification of errors much closer to reality (more than 80% accuracy in our experiments).

  11. InfoSyll: A Syllabary Providing Statistical Information on Phonological and Orthographic Syllables

    ERIC Educational Resources Information Center

    Chetail, Fabienne; Mathey, Stephanie

    2010-01-01

    There is now a growing body of evidence in various languages supporting the claim that syllables are functional units of visual word processing. In the perspective of modeling the processing of polysyllabic words and the activation of syllables, current studies investigate syllabic effects with subtle manipulations. We present here a syllabary of…

  12. Protocol Analysis of Man-Computer Languages: Design and Preliminary Findings

    DTIC Science & Technology

    1975-07-01

    describes a statistical model o propot of the exercise. It is one particular model used for analysis of variance which allows us to test the significance of... Body : Congressman Blake will be visiting Camp Smith to confer with J6, J612, and Col. Smith with regard to operation of the pilot project on

  13. Are Hispanic, Asian, Native American, or Language-Minority Children Overrepresented in Special Education?

    ERIC Educational Resources Information Center

    Morgan, Paul L.; Farkas, George; Cook, Michael; Strassfeld, Natasha M.; Hillemeier, Marianne M.; Pun, Wik Hung; Wang, Yangyang; Schussler, Deborah L.

    2018-01-01

    We conducted a best-evidence synthesis of 22 studies to examine whether systemic bias explained minority disproportionate overrepresentation in special education. Of the total regression model estimates, only 7/168 (4.2%), 14/208 (6.7%), 2/37 (5.4%), and 6/91 (6.6%) indicated statistically significant overrepresentation for Hispanic, Asian, Native…

  14. Using Statistical Techniques and Web Search to Correct ESL Errors

    ERIC Educational Resources Information Center

    Gamon, Michael; Leacock, Claudia; Brockett, Chris; Dolan, William B.; Gao, Jianfeng; Belenko, Dmitriy; Klementiev, Alexandre

    2009-01-01

    In this paper we present a system for automatic correction of errors made by learners of English. The system has two novel aspects. First, machine-learned classifiers trained on large amounts of native data and a very large language model are combined to optimize the precision of suggested corrections. Second, the user can access real-life web…

  15. THE LEXICOSTATISTICAL CLASSIFICATION OF THE AUSTRONESIAN LANGUAGES.

    ERIC Educational Resources Information Center

    DYEN, ISIDORE

    STATISTICAL DATA DEALING WITH BASIC VOCABULARY COMPARISONS AMONG A SIGNIFICANT GROUP OF AUSTRONESIAN LANGUAGES ARE PRESENTED. SOME OF THE LANGUAGES ARE CLASSIFIED INTO SUBGROUPS UNDER GEOGRAPHICAL DIVISIONS, AND OTHERS ARE REGARDED AS SUBGROUPS IN THEMSELVES. THE LANGUAGES COVERED IN THE STUDY STRETCH GEOGRAPHICALLY FROM MADAGASCAR TO EASTER…

  16. Comparison of Oral Language Usage among English Language Learners Diagnosed with a Learning Disability and Those in General Education

    ERIC Educational Resources Information Center

    Pray, Lisa

    2009-01-01

    The investigator compared the linguistic characteristics of Spanish and English language samples taken from English language learners (ELLs) diagnosed with an academic learning disability (LD) and ELLs in general education to determine if the errors and characteristics of their language use differ. There was a statistically significant difference…

  17. Interactions between statistical and semantic information in infant language development

    PubMed Central

    Lany, Jill; Saffran, Jenny R.

    2013-01-01

    Infants can use statistical regularities to form rudimentary word categories (e.g. noun, verb), and to learn the meanings common to words from those categories. Using an artificial language methodology, we probed the mechanisms by which two types of statistical cues (distributional and phonological regularities) affect word learning. Because linking distributional cues vs. phonological information to semantics makes different computational demands on learners, we also tested whether their use is related to language proficiency. We found that 22-month-old infants with smaller vocabularies generalized using phonological cues; however, infants with larger vocabularies showed the opposite pattern of results, generalizing based on distributional cues. These findings suggest that both phonological and distributional cues marking word categories promote early word learning. Moreover, while correlations between these cues are important to forming word categories, we found infants’ weighting of these cues in subsequent word-learning tasks changes over the course of early language development. PMID:21884336

  18. Rapid Statistical Learning Supporting Word Extraction From Continuous Speech.

    PubMed

    Batterink, Laura J

    2017-07-01

    The identification of words in continuous speech, known as speech segmentation, is a critical early step in language acquisition. This process is partially supported by statistical learning, the ability to extract patterns from the environment. Given that speech segmentation represents a potential bottleneck for language acquisition, patterns in speech may be extracted very rapidly, without extensive exposure. This hypothesis was examined by exposing participants to continuous speech streams composed of novel repeating nonsense words. Learning was measured on-line using a reaction time task. After merely one exposure to an embedded novel word, learners demonstrated significant learning effects, as revealed by faster responses to predictable than to unpredictable syllables. These results demonstrate that learners gained sensitivity to the statistical structure of unfamiliar speech on a very rapid timescale. This ability may play an essential role in early stages of language acquisition, allowing learners to rapidly identify word candidates and "break in" to an unfamiliar language.

  19. A unified statistical approach to non-negative matrix factorization and probabilistic latent semantic indexing

    PubMed Central

    Wang, Guoli; Ebrahimi, Nader

    2014-01-01

    Non-negative matrix factorization (NMF) is a powerful machine learning method for decomposing a high-dimensional nonnegative matrix V into the product of two nonnegative matrices, W and H, such that V ∼ W H. It has been shown to have a parts-based, sparse representation of the data. NMF has been successfully applied in a variety of areas such as natural language processing, neuroscience, information retrieval, image processing, speech recognition and computational biology for the analysis and interpretation of large-scale data. There has also been simultaneous development of a related statistical latent class modeling approach, namely, probabilistic latent semantic indexing (PLSI), for analyzing and interpreting co-occurrence count data arising in natural language processing. In this paper, we present a generalized statistical approach to NMF and PLSI based on Renyi's divergence between two non-negative matrices, stemming from the Poisson likelihood. Our approach unifies various competing models and provides a unique theoretical framework for these methods. We propose a unified algorithm for NMF and provide a rigorous proof of monotonicity of multiplicative updates for W and H. In addition, we generalize the relationship between NMF and PLSI within this framework. We demonstrate the applicability and utility of our approach as well as its superior performance relative to existing methods using real-life and simulated document clustering data. PMID:25821345
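    As a point of reference for the multiplicative updates mentioned in this abstract, the sketch below implements the classic Lee-Seung updates for the generalized Kullback-Leibler objective, which corresponds to a Poisson likelihood. It is not the Renyi-divergence algorithm the authors propose. The small count matrix, the rank, the iteration count, and all function names are hypothetical.

```python
import math
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH) = sum(V log(V/WH) - V + WH)."""
    total = 0.0
    for v_row, r_row in zip(V, WH):
        for v, r in zip(v_row, r_row):
            if v > 0:
                total += v * math.log(v / (r + eps))
            total += r - v
    return total


def nmf_kl(V, rank, iters=200, seed=0, eps=1e-12):
    """Factor V (n x m) into W (n x rank) and H (rank x m), all non-negative."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    history = []
    for _ in range(iters):
        WH = matmul(W, H)
        history.append(kl_divergence(V, WH))
        # Element-wise ratio V / (WH), computed from the current W and H.
        R = [[V[i][j] / (WH[i][j] + eps) for j in range(m)] for i in range(n)]
        # H_kj <- H_kj * sum_i(W_ik * R_ij) / sum_i(W_ik)
        for k in range(rank):
            col_sum = sum(W[i][k] for i in range(n))
            for j in range(m):
                H[k][j] *= sum(W[i][k] * R[i][j] for i in range(n)) / (col_sum + eps)
        WH = matmul(W, H)
        R = [[V[i][j] / (WH[i][j] + eps) for j in range(m)] for i in range(n)]
        # W_ik <- W_ik * sum_j(H_kj * R_ij) / sum_j(H_kj)
        for k in range(rank):
            row_sum = sum(H[k][j] for j in range(m))
            for i in range(n):
                W[i][k] *= sum(H[k][j] * R[i][j] for j in range(m)) / (row_sum + eps)
    history.append(kl_divergence(V, matmul(W, H)))
    return W, H, history


# Hypothetical document-by-term count matrix (illustrative only).
V = [[5, 3, 0, 1], [4, 0, 0, 1], [1, 1, 0, 5], [0, 1, 5, 4]]
W, H, history = nmf_kl(V, rank=2)
```

    The `history` list records the objective before each iteration and once after the last. For these updates the objective should not increase from one iteration to the next, which is the kind of monotonicity property the abstract refers to for its generalized updates.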

  20. A unified statistical approach to non-negative matrix factorization and probabilistic latent semantic indexing.

    PubMed

    Devarajan, Karthik; Wang, Guoli; Ebrahimi, Nader

    2015-04-01

    Non-negative matrix factorization (NMF) is a powerful machine learning method for decomposing a high-dimensional nonnegative matrix V into the product of two nonnegative matrices, W and H, such that V ∼ W H. It has been shown to have a parts-based, sparse representation of the data. NMF has been successfully applied in a variety of areas such as natural language processing, neuroscience, information retrieval, image processing, speech recognition and computational biology for the analysis and interpretation of large-scale data. There has also been simultaneous development of a related statistical latent class modeling approach, namely, probabilistic latent semantic indexing (PLSI), for analyzing and interpreting co-occurrence count data arising in natural language processing. In this paper, we present a generalized statistical approach to NMF and PLSI based on Renyi's divergence between two non-negative matrices, stemming from the Poisson likelihood. Our approach unifies various competing models and provides a unique theoretical framework for these methods. We propose a unified algorithm for NMF and provide a rigorous proof of monotonicity of multiplicative updates for W and H. In addition, we generalize the relationship between NMF and PLSI within this framework. We demonstrate the applicability and utility of our approach as well as its superior performance relative to existing methods using real-life and simulated document clustering data.

  1. Patterns and Predictors of Language and Literacy Abilities 4-10 Years in the Longitudinal Study of Australian Children

    PubMed Central

    Zubrick, Stephen R.; Taylor, Catherine L.; Christensen, Daniel

    2015-01-01

    Aims: Oral language is the foundation of literacy. Naturally, policies and practices to promote children’s literacy begin in early childhood and have a strong focus on developing children’s oral language, especially for children with known risk factors for low language ability. The underlying assumption is that children’s progress along the oral to literate continuum is stable and predictable, such that low language ability foretells low literacy ability. This study investigated patterns and predictors of children’s oral language and literacy abilities at 4, 6, 8 and 10 years. The study sample comprised 2,316 to 2,792 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Six developmental patterns were observed: a stable middle-high pattern, a stable low pattern, an improving pattern, a declining pattern, a fluctuating low pattern, and a fluctuating middle-high pattern. Most children (69%) fit a stable middle-high pattern. By contrast, less than 1% of children fit a stable low pattern. These results challenged the view that children’s progress along the oral to literate continuum is stable and predictable. Findings: Multivariate logistic regression was used to investigate risks for low literacy ability at 10 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. Predictors were modelled as risk variables with the lowest level of risk as the reference category. In the multivariate model, substantial risks for low literacy ability at 10 years, in order of descending magnitude, were: low school readiness, Aboriginal and/or Torres Strait Islander status and low language ability at 8 years. Moderate risks were high temperamental reactivity, low language ability at 4 years, and low language ability at 6 years. The following risk factors were not statistically significant in the multivariate model: Low maternal consistency, low family income, health care card, child not read to at home, maternal smoking, maternal education, family structure, temperamental persistence, and socio-economic area disadvantage. The results of the sensitivity-specificity analysis showed that a well-fitted multivariate model featuring risks of substantive magnitude did not do particularly well in predicting low literacy ability at 10 years. PMID:26352436

  2. Individual Differences in Statistical Learning Predict Children's Comprehension of Syntax

    ERIC Educational Resources Information Center

    Kidd, Evan; Arciuli, Joanne

    2016-01-01

    Variability in children's language acquisition is likely due to a number of cognitive and social variables. The current study investigated whether individual differences in statistical learning (SL), which has been implicated in language acquisition, independently predicted 6- to 8-year-olds' comprehension of syntax. Sixty-eight (N = 68)…

  3. The Role of Statistical Learning and Working Memory in L2 Speakers' Pattern Learning

    ERIC Educational Resources Information Center

    McDonough, Kim; Trofimovich, Pavel

    2016-01-01

    This study investigated whether second language (L2) speakers' morphosyntactic pattern learning was predicted by their statistical learning and working memory abilities. Across three experiments, Thai English as a Foreign Language (EFL) university students (N = 140) were exposed to either the transitive construction in Esperanto (e.g., "tauro…

  4. Inferential Statistics in "Language Teaching Research": A Review and Ways Forward

    ERIC Educational Resources Information Center

    Lindstromberg, Seth

    2016-01-01

    This article reviews all (quasi)experimental studies appearing in the first 19 volumes (1997-2015) of "Language Teaching Research" (LTR). Specifically, it provides an overview of how statistical analyses were conducted in these studies and of how the analyses were reported. The overall conclusion is that there has been a tight adherence…

  5. Seeking Temporal Predictability in Speech: Comparing Statistical Approaches on 18 World Languages.

    PubMed

    Jadoul, Yannick; Ravignani, Andrea; Thompson, Bill; Filippi, Piera; de Boer, Bart

    2016-01-01

    Temporal regularities in speech, such as interdependencies in the timing of speech events, are thought to scaffold early acquisition of the building blocks in speech. By providing on-line clues to the location and duration of upcoming syllables, temporal structure may aid segmentation and clustering of continuous speech into separable units. This hypothesis tacitly assumes that learners exploit predictability in the temporal structure of speech. Existing measures of speech timing tend to focus on first-order regularities among adjacent units, and are overly sensitive to idiosyncrasies in the data they describe. Here, we compare several statistical methods on a sample of 18 languages, testing whether syllable occurrence is predictable over time. Rather than looking for differences between languages, we aim to find across languages (using clearly defined acoustic, rather than orthographic, measures), temporal predictability in the speech signal which could be exploited by a language learner. First, we analyse distributional regularities using two novel techniques: a Bayesian ideal learner analysis, and a simple distributional measure. Second, we model higher-order temporal structure (regularities arising in an ordered series of syllable timings), testing the hypothesis that non-adjacent temporal structures may explain the gap between subjectively-perceived temporal regularities, and the absence of universally-accepted lower-order objective measures. Together, our analyses provide limited evidence for predictability at different time scales, though higher-order predictability is difficult to reliably infer. We conclude that temporal predictability in speech may well arise from a combination of individually weak perceptual cues at multiple structural levels, but is challenging to pinpoint.

  6. Seeking Temporal Predictability in Speech: Comparing Statistical Approaches on 18 World Languages

    PubMed Central

    Jadoul, Yannick; Ravignani, Andrea; Thompson, Bill; Filippi, Piera; de Boer, Bart

    2016-01-01

    Temporal regularities in speech, such as interdependencies in the timing of speech events, are thought to scaffold early acquisition of the building blocks in speech. By providing on-line clues to the location and duration of upcoming syllables, temporal structure may aid segmentation and clustering of continuous speech into separable units. This hypothesis tacitly assumes that learners exploit predictability in the temporal structure of speech. Existing measures of speech timing tend to focus on first-order regularities among adjacent units, and are overly sensitive to idiosyncrasies in the data they describe. Here, we compare several statistical methods on a sample of 18 languages, testing whether syllable occurrence is predictable over time. Rather than looking for differences between languages, we aim to find across languages (using clearly defined acoustic, rather than orthographic, measures), temporal predictability in the speech signal which could be exploited by a language learner. First, we analyse distributional regularities using two novel techniques: a Bayesian ideal learner analysis, and a simple distributional measure. Second, we model higher-order temporal structure—regularities arising in an ordered series of syllable timings—testing the hypothesis that non-adjacent temporal structures may explain the gap between subjectively-perceived temporal regularities, and the absence of universally-accepted lower-order objective measures. Together, our analyses provide limited evidence for predictability at different time scales, though higher-order predictability is difficult to reliably infer. We conclude that temporal predictability in speech may well arise from a combination of individually weak perceptual cues at multiple structural levels, but is challenging to pinpoint. PMID:27994544

  7. Orthographic influences on division of labor in learning to read Chinese and English: Insights from computational modeling

    PubMed Central

    Yang, Jianfeng; Shu, Hua; McCandliss, Bruce D.; Zevin, Jason D.

    2013-01-01

    Learning to read any language requires learning to map among print, sound and meaning. Writing systems differ in a number of factors that influence both the ease and rate with which reading skill can be acquired, as well as the eventual division of labor between phonological and semantic processes. Further, developmental reading disability manifests differently across writing systems, and may be related to different deficits in constitutive processes. Here we simulate some aspects of reading acquisition in Chinese and English using the same model architecture for both writing systems. The contribution of semantic and phonological processing to literacy acquisition in the two languages is simulated, including specific effects of phonological and semantic deficits. Further, we demonstrate that similar patterns of performance are observed when the same model is trained on both Chinese and English as an "early bilingual." The results are consistent with the view that reading skill is acquired by the application of statistical learning rules to mappings among print, sound and meaning, and that differences in the typical and disordered acquisition of reading skill between writing systems are driven by differences in the statistical patterns of the writing systems themselves, rather than differences in cognitive architecture of the learner. PMID:24587693

  8. The Struggles over African Languages

    ERIC Educational Resources Information Center

    Maseko, Pam; Vale, Peter

    2016-01-01

    In this interview, African Language expert Pam Maseko speaks of her own background and her first encounter with culture outside of her mother tongue, isiXhosa. A statistical breakdown of South African languages is provided as background. She discusses Western (originally missionary) codification of African languages and suggests that this approach…

  9. Fitting direct covariance structures by the MSTRUCT modeling language of the CALIS procedure.

    PubMed

    Yung, Yiu-Fai; Browne, Michael W; Zhang, Wei

    2015-02-01

    This paper demonstrates the usefulness and flexibility of the general structural equation modelling (SEM) approach to fitting direct covariance patterns or structures (as opposed to fitting implied covariance structures from functional relationships among variables). In particular, the MSTRUCT modelling language (or syntax) of the CALIS procedure (SAS/STAT version 9.22 or later: SAS Institute, 2010) is used to illustrate the SEM approach. The MSTRUCT modelling language supports a direct covariance pattern specification of each covariance element. It also supports the input of additional independent and dependent parameters. Model tests, fit statistics, estimates, and their standard errors are then produced under the general SEM framework. By using numerical and computational examples, the following tests of basic covariance patterns are illustrated: sphericity, compound symmetry, and multiple-group covariance patterns. Specification and testing of two complex correlation structures, the circumplex pattern and the composite direct product models with or without composite errors and scales, are also illustrated by the MSTRUCT syntax. It is concluded that the SEM approach offers a general and flexible modelling of direct covariance and correlation patterns. In conjunction with the use of SAS macros, the MSTRUCT syntax provides an easy-to-use interface for specifying and fitting complex covariance and correlation structures, even when the number of variables or parameters becomes large. © 2014 The British Psychological Society.

  10. Exploring the Relation Between Memory, Gestural Communication, and the Emergence of Language in Infancy: A Longitudinal Study

    PubMed Central

    Heimann, Mikael; Strid, Karin; Smith, Lars; Tjus, Tomas; Ulvund, Stein Erik; Meltzoff, Andrew N.

    2006-01-01

    The relationship between recall memory, visual recognition memory, social communication, and the emergence of language skills was measured in a longitudinal study. Thirty typically developing Swedish children were tested at 6, 9 and 14 months. The results showed that, in combination, visual recognition memory at 6 months, deferred imitation at 9 months and turn-taking skills at 14 months could explain 41% of the variance in the infants’ production of communicative gestures as measured by a Swedish variant of the MacArthur Communicative Development Inventories (CDI). In this statistical model, deferred imitation stood out as the strongest predictor. PMID:16886041

  11. Systematic individualized narrative language intervention on the personal narratives of children with autism.

    PubMed

    Petersen, Douglas B; Brown, Catherine L; Ukrainetz, Teresa A; Wise, Christine; Spencer, Trina D; Zebre, Jennifer

    2014-01-01

    The purpose of this study was to investigate the effect of an individualized, systematic language intervention on the personal narratives of children with autism. A single-subject, multiple-baseline design across participants and behaviors was used to examine the effect of the intervention on language features of personal narratives. Three 6- to 8-year-old boys with autism participated in 12 individual intervention sessions that targeted 2-3 story grammar elements (e.g., problem, plan) and 3-4 linguistic complexity elements (e.g., causal subordination, adverbs) selected from each participant's baseline performance. Intervention involved repeated retellings of customized model narratives and the generation of personal narratives with a systematic reduction of visual and verbal scaffolding. Independent personal narratives generated at the end of each baseline, intervention, and maintenance session were analyzed for presence and sophistication of targeted features. Graphical and statistical results showed immediate improvement in targeted language features as a function of intervention. There was mixed evidence of maintenance 2 and 7 weeks after intervention. Children with autism can benefit from an individualized, systematic intervention targeting specific narrative language features. Greater intensity of intervention may be needed to gain enduring effects for some language features.

  12. Visual statistical learning is related to natural language ability in adults: An ERP study.

    PubMed

    Daltrozzo, Jerome; Emerson, Samantha N; Deocampo, Joanne; Singh, Sonia; Freggens, Marjorie; Branum-Martin, Lee; Conway, Christopher M

    2017-03-01

    Statistical learning (SL) is believed to enable language acquisition by allowing individuals to learn regularities within linguistic input. However, neural evidence supporting a direct relationship between SL and language ability is scarce. We investigated whether there are associations between event-related potential (ERP) correlates of SL and language abilities while controlling for the general level of selective attention. Seventeen adults completed tests of visual SL, receptive vocabulary, grammatical ability, and sentence completion. Response times and ERPs showed that SL is related to receptive vocabulary and grammatical ability. ERPs indicated that the relationship between SL and grammatical ability was independent of attention while the association between SL and receptive vocabulary depended on attention. The implications of these dissociative relationships in terms of underlying mechanisms of SL and language are discussed. These results further elucidate the cognitive nature of the links between SL mechanisms and language abilities. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Visual statistical learning is related to natural language ability in adults: An ERP Study

    PubMed Central

    Daltrozzo, Jerome; Emerson, Samantha N.; Deocampo, Joanne; Singh, Sonia; Freggens, Marjorie; Branum-Martin, Lee; Conway, Christopher M.

    2017-01-01

    Statistical learning (SL) is believed to enable language acquisition by allowing individuals to learn regularities within linguistic input. However, neural evidence supporting a direct relationship between SL and language ability is scarce. We investigated whether there are associations between event-related potential (ERP) correlates of SL and language abilities while controlling for the general level of selective attention. Seventeen adults completed tests of visual SL, receptive vocabulary, grammatical ability, and sentence completion. Response times and ERPs showed that SL is related to receptive vocabulary and grammatical ability. ERPs indicated that the relationship between SL and grammatical ability was independent of attention while the association between SL and receptive vocabulary depended on attention. The implications of these dissociative relationships in terms of underlying mechanisms of SL and language are discussed. These results further elucidate the cognitive nature of the links between SL mechanisms and language abilities. PMID:28086142

  14. Statistical physics of vehicular traffic and some related systems

    NASA Astrophysics Data System (ADS)

    Chowdhury, Debashish; Santen, Ludger; Schadschneider, Andreas

    2000-05-01

    In the so-called “microscopic” models of vehicular traffic, attention is paid explicitly to each individual vehicle, each of which is represented by a “particle”; the nature of the “interactions” among these particles is determined by the way the vehicles influence each other’s movement. Therefore, vehicular traffic, modeled as a system of interacting “particles” driven far from equilibrium, offers the possibility to study various fundamental aspects of truly nonequilibrium systems which are of current interest in statistical physics. Analytical as well as numerical techniques of statistical physics are being used to study these models to understand the rich variety of physical phenomena exhibited by vehicular traffic. Some of these phenomena, observed in vehicular traffic under different circumstances, include transitions from one dynamical phase to another, criticality and self-organized criticality, metastability and hysteresis, phase-segregation, etc. In this critical review, written from the perspective of statistical physics, we explain the guiding principles behind all the main theoretical approaches. But we present detailed discussions on the results obtained mainly from the so-called “particle-hopping” models, particularly emphasizing those which have been formulated in recent years using the language of cellular automata.

  15. Reconciling statistical and systems science approaches to public health.

    PubMed

    Ip, Edward H; Rahmandad, Hazhir; Shoham, David A; Hammond, Ross; Huang, Terry T-K; Wang, Youfa; Mabry, Patricia L

    2013-10-01

    Although systems science has emerged as a set of innovative approaches to study complex phenomena, many topically focused researchers including clinicians and scientists working in public health are somewhat befuddled by this methodology that at times appears to be radically different from analytic methods, such as statistical modeling, to which the researchers are accustomed. There also appear to be conflicts between complex systems approaches and traditional statistical methodologies, both in terms of their underlying strategies and the languages they use. We argue that the conflicts are resolvable, and the sooner the better for the field. In this article, we show how statistical and systems science approaches can be reconciled, and how together they can advance solutions to complex problems. We do this by comparing the methods within a theoretical framework based on the work of population biologist Richard Levins. We present different types of models as representing different tradeoffs among the four desiderata of generality, realism, fit, and precision.

  16. Reconciling Statistical and Systems Science Approaches to Public Health

    PubMed Central

    Ip, Edward H.; Rahmandad, Hazhir; Shoham, David A.; Hammond, Ross; Huang, Terry T.-K.; Wang, Youfa; Mabry, Patricia L.

    2016-01-01

    Although systems science has emerged as a set of innovative approaches to study complex phenomena, many topically focused researchers including clinicians and scientists working in public health are somewhat befuddled by this methodology that at times appears to be radically different from analytic methods, such as statistical modeling, to which the researchers are accustomed. There also appear to be conflicts between complex systems approaches and traditional statistical methodologies, both in terms of their underlying strategies and the languages they use. We argue that the conflicts are resolvable, and the sooner the better for the field. In this article, we show how statistical and systems science approaches can be reconciled, and how together they can advance solutions to complex problems. We do this by comparing the methods within a theoretical framework based on the work of population biologist Richard Levins. We present different types of models as representing different tradeoffs among the four desiderata of generality, realism, fit, and precision. PMID:24084395

  17. The Grammar of Exchange: A Comparative Study of Reciprocal Constructions Across Languages

    PubMed Central

    Majid, Asifa; Evans, Nicholas; Gaby, Alice; Levinson, Stephen C.

    2010-01-01

    Cultures are built on social exchange. Most languages have dedicated grammatical machinery for expressing this. To demonstrate that statistical methods can also be applied to grammatical meaning, we here ask whether the underlying meanings of these grammatical constructions are based on shared common concepts. To explore this, we designed video stimuli of reciprocated actions (e.g., “giving to each other”) and symmetrical states (e.g., “sitting next to each other”), and with the help of a team of linguists collected responses from 20 languages around the world. Statistical analyses revealed that many languages do, in fact, share a common conceptual core for reciprocal meanings but that this is not a universally expressed concept. The recurrent pattern of conceptual packaging found across languages is compatible with the view that there is a shared non-linguistic understanding of reciprocation. But, nevertheless, there are considerable differences between languages in the exact extensional patterns, highlighting that even in the domain of grammar, semantics is highly language-specific. PMID:21713188

  18. Development of computer-assisted instruction application for statistical data analysis android platform as learning resource

    NASA Astrophysics Data System (ADS)

    Hendikawati, P.; Arifudin, R.; Zahid, M. Z.

    2018-03-01

    This study aims to design an Android Statistics Data Analysis application that can be accessed through mobile devices, making it easier for users to access. The Statistics Data Analysis application covers various basic statistics topics along with a parametric statistics data analysis application. The output of this application system is parametric statistics data analysis that can be used by students, lecturers, and other users who need the results of statistical calculations quickly and in an easily understood form. The Android application is developed in the Java programming language. The server side uses PHP with the CodeIgniter framework, and the database is MySQL. The system development methodology used is the Waterfall methodology, with the stages of analysis, design, coding, testing, implementation, and system maintenance. This statistical data analysis application is expected to support statistics lecturing activities and make it easier for students to understand statistical analysis on mobile devices.

  19. Words and possible words in early language acquisition.

    PubMed

    Marchetto, Erika; Bonatti, Luca L

    2013-11-01

    In order to acquire language, infants must extract its building blocks (words) and master the rules governing their legal combinations from speech. These two problems are not independent, however: words also have internal structure. Thus, infants must extract two kinds of information from the same speech input. They must find the actual words of their language. Furthermore, they must identify its possible words, that is, the sequences of sounds that, being morphologically well formed, could be words. Here, we show that infants' sensitivity to possible words appears to be more primitive and fundamental than their ability to find actual words. We expose 12- and 18-month-old infants to an artificial language containing a conflict between statistically coherent and structurally coherent items. We show that 18-month-olds can extract possible words when the familiarization stream contains marks of segmentation, but cannot do so when the stream is continuous. Yet, they can find actual words from a continuous stream by computing statistical relationships among syllables. By contrast, 12-month-olds can find possible words when familiarized with a segmented stream, but seem unable to extract statistically coherent items from a continuous stream that contains minimal conflicts between statistical and structural information. These results suggest that sensitivity to word structure is in place earlier than the ability to analyze distributional information. The ability to compute nontrivial statistical relationships becomes fully effective relatively late in development, when infants have already acquired a considerable amount of linguistic knowledge. Thus, mechanisms for structure extraction that do not rely on extensive sampling of the input are likely to have a much larger role in language acquisition than general-purpose statistical abilities. Copyright © 2013. Published by Elsevier Inc.

  20. Emergence of Scale-Free Syntax Networks

    NASA Astrophysics Data System (ADS)

    Corominas-Murtra, Bernat; Valverde, Sergi; Solé, Ricard V.

    The evolution of human language allowed the efficient propagation of nongenetic information, thus creating a new form of evolutionary change. Language development in children offers the opportunity of exploring the emergence of such a complex communication system and provides a window to understanding the transition from protolanguage to language. Here we present the first analysis of the emergence of syntax in terms of complex networks. A previously unreported, sharp transition is shown to occur around two years of age from a (pre-syntactic) tree-like structure to a scale-free, small world syntax network. The observed combinatorial patterns provide valuable data to understand the nature of the cognitive processes involved in the acquisition of syntax, introducing a new ingredient to understand the possible biological endowment of human beings which results in the emergence of complex language. We explore this problem by using a minimal, data-driven model that is able to capture several statistical traits, but some key features related to the emergence of syntactic complexity display important divergences.

  1. Propositional idea density in older men's written language: findings from the HIMS study using computerised analysis.

    PubMed

    Spencer, Elizabeth; Ferguson, Alison; Craig, Hugh; Colyvas, Kim; Hankey, Graeme J; Flicker, Leon

    2015-02-01

    Decline in linguistic function has been associated with decline in cognitive function in previous research. This research investigated the informativeness of written language samples of Australian men from the Health in Men's Study (HIMS) aged from 76 to 93 years using the Computerised Propositional Idea Density Rater (CPIDR 5.1). In total, 60,255 words in 1147 comments were analysed using a linear mixed model for statistical analysis. Results indicated no relationship with education level (p = 0.79). Participants for whom English was not their first learnt language showed Propositional Idea Density (PD) scores slightly lower (0.018 per 1 word). Mean PD per 1 word for those for whom English was their first language for comments below 60 words was 0.494 and above 60 words 0.526. Text length was found to have an effect (p < 0.0001). The mean PD was higher than previously reported for men and lower than previously reported for a similar cohort of Australian women.

  2. A flexible, interpretable framework for assessing sensitivity to unmeasured confounding.

    PubMed

    Dorie, Vincent; Harada, Masataka; Carnegie, Nicole Bohme; Hill, Jennifer

    2016-09-10

    When estimating causal effects, unmeasured confounding and model misspecification are both potential sources of bias. We propose a method to simultaneously address both issues in the form of a semi-parametric sensitivity analysis. In particular, our approach incorporates Bayesian Additive Regression Trees into a two-parameter sensitivity analysis strategy that assesses sensitivity of posterior distributions of treatment effects to choices of sensitivity parameters. This results in an easily interpretable framework for testing for the impact of an unmeasured confounder that also limits the number of modeling assumptions. We evaluate our approach in a large-scale simulation setting and with high blood pressure data taken from the Third National Health and Nutrition Examination Survey. The model is implemented as open-source software, integrated into the treatSens package for the R statistical programming language. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  3. Are Young Children with Cochlear Implants Sensitive to the Statistics of Words in the Ambient Spoken Language?

    ERIC Educational Resources Information Center

    Guo, Ling-Yu; McGregor, Karla K.; Spencer, Linda J.

    2015-01-01

    Purpose: The purpose of this study was to determine whether children with cochlear implants (CIs) are sensitive to statistical characteristics of words in the ambient spoken language, whether that sensitivity changes in expected ways as their spoken lexicon grows, and whether that sensitivity varies with unilateral or bilateral implantation.…

  4. Improving Data Analysis in Second Language Acquisition by Utilizing Modern Developments in Applied Statistics

    ERIC Educational Resources Information Center

    Larson-Hall, Jenifer; Herrington, Richard

    2010-01-01

    In this article we introduce language acquisition researchers to two broad areas of applied statistics that can improve the way data are analyzed. First we argue that visual summaries of information are as vital as numerical ones, and suggest ways to improve them. Specifically, we recommend choosing boxplots over barplots and adding locally…

  5. Influence of valproate on language functions in children with epilepsy.

    PubMed

    Doo, Jin Woong; Kim, Soon Chul; Kim, Sun Jun

    2018-01-01

    The aim of the current study was to assess the influences of valproate (VPA) on the language functions in newly diagnosed pediatric patients with epilepsy. We reviewed medical records of 53 newly diagnosed patients with epilepsy, who were being treated with VPA monotherapy (n=53; 22 male patients and 31 female patients). The subjects underwent standardized language tests, at least twice, before and after the initiation of VPA. The standardized language tests used were The Test of Language Problem Solving Abilities, a Korean version of The Expressive/Receptive Language Function Test, and the Urimal Test of Articulation and Phonology. Since all the patients analyzed spoke Korean as their first language, we used Korean language tests to reduce the bias within the data. All the language parameters of the Test of Language Problem Solving Abilities slightly improved after the initiation of VPA in the 53 pediatric patients with epilepsy (mean age: 11.6±3.2 years), but only "prediction" was statistically significant (determining cause, 14.9±5.1 to 15.5±4.3; making inference, 16.1±5.8 to 16.9±5.6; prediction, 11.1±4.9 to 11.9±4.2; total score of TOPS, 42.0±14.4 to 44.2±12.5). The patients treated with VPA also exhibited a small extension in mean length of utterance in words (MLU-w) when responding, but this was not statistically significant (determining cause, 5.4±2.0 to 5.7±1.6; making inference, 5.8±2.2 to 6.0±1.8; prediction, 5.9±2.5 to 5.9±2.1; total, 5.7±2.1 to 5.9±1.7). The administration of VPA led to a slight, but not statistically significant, improvement in the receptive language function (range: 144.7±41.1 to 148.2±39.7). Finally, there were no statistically significant changes in the percentage of articulation performance after taking VPA. Therefore, our data suggested that VPA did not have a negative impact on the language function, but rather slightly improved problem-solving abilities. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Computer programs for computing particle-size statistics of fluvial sediments

    USGS Publications Warehouse

    Stevens, H.H.; Hubbell, D.W.

    1986-01-01

    Two versions of computer programs for inputting data and computing particle-size statistics of fluvial sediments are presented. The FORTRAN 77 language versions are for use on the Prime computer, and the BASIC language versions are for use on microcomputers. The size-statistics programs compute Inman, Trask, and Folk statistical parameters from phi values and sizes determined for 10 specified percent-finer values from inputted size and percent-finer data. The programs also determine the percentage gravel, sand, silt, and clay, and the Meyer-Peter effective diameter. Documentation and listings for both versions of the programs are included. (Author's abstract)
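
    The kind of calculation this record describes can be illustrated with a minimal sketch. It is not the FORTRAN 77 or BASIC code from the report, whose exact formulas are not given in the abstract. It assumes the commonly published Inman and Folk-Ward graphic formulas for mean and sorting, and it assumes percentiles are keyed by cumulative percent coarser, so that phi increases with the percentile; percent-finer data would first need the percentile p mapped to 100 - p. The function names and the input dictionary are hypothetical.

```python
import math


def mm_to_phi(d_mm):
    """Convert a grain diameter in millimetres to the phi scale (phi = -log2 d)."""
    return -math.log2(d_mm)


def inman_stats(phi):
    """Inman graphic mean and sorting from a {percentile: phi} mapping.

    Assumes keys 16 and 84 are present (cumulative percent coarser).
    """
    mean = (phi[16] + phi[84]) / 2.0
    sorting = (phi[84] - phi[16]) / 2.0
    return mean, sorting


def folk_ward_stats(phi):
    """Folk-Ward graphic mean and inclusive graphic standard deviation.

    Assumes keys 5, 16, 50, 84 and 95 are present (cumulative percent coarser).
    """
    mean = (phi[16] + phi[50] + phi[84]) / 3.0
    sorting = (phi[84] - phi[16]) / 4.0 + (phi[95] - phi[5]) / 6.6
    return mean, sorting


# Hypothetical percentile values, for illustration only.
example = {5: -1.0, 16: 0.0, 50: 1.0, 84: 2.0, 95: 3.0}
inman_mean, inman_sorting = inman_stats(example)
folk_mean, folk_sorting = folk_ward_stats(example)
```

    Trask parameters and the Meyer-Peter effective diameter are left out of the sketch because the abstract does not state the forms the programs use.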

  7. Computational methods to extract meaning from text and advance theories of human cognition.

    PubMed

    McNamara, Danielle S

    2011-01-01

    Over the past two decades, researchers have made great advances in the area of computational methods for extracting meaning from text. This research has to a large extent been spurred by the development of latent semantic analysis (LSA), a method for extracting and representing the meaning of words using statistical computations applied to large corpora of text. Since the advent of LSA, researchers have developed and tested alternative statistical methods designed to detect and analyze meaning in text corpora. This research exemplifies how statistical models of semantics play an important role in our understanding of cognition and contribute to the field of cognitive science. Importantly, these models afford large-scale representations of human knowledge and allow researchers to explore various questions regarding knowledge, discourse processing, text comprehension, and language. This topic includes the latest progress by the leading researchers in the endeavor to go beyond LSA. Copyright © 2010 Cognitive Science Society, Inc.

  8. Implicit Language Learning: Adults' Ability to Segment Words in Norwegian

    ERIC Educational Resources Information Center

    Kittleson, Megan M.; Aguilar, Jessica M.; Tokerud, Gry Line; Plante, Elena; Asbjornsen, Arve E.

    2010-01-01

    Previous language learning research reveals that the statistical properties of the input offer sufficient information to allow listeners to segment words from fluent speech in an artificial language. The current pair of studies uses a natural language to test the ecological validity of these findings and to determine whether a listener's language…

  9. Comparability of a Paper-Based Language Test and a Computer-Based Language Test.

    ERIC Educational Resources Information Center

    Choi, Inn-Chull; Kim, Kyoung Sung; Boo, Jaeyool

    2003-01-01

    Utilizing the Test of English Proficiency, developed by Seoul National University (TEPS), examined comparability between the paper-based language test and the computer-based language test based on content and construct validation employing content analyses based on corpus linguistic techniques in addition to such statistical analyses as…

  10. Nuestra lengua en el Mundo (Our Language in the World)

    ERIC Educational Resources Information Center

    Rosenblat, Angel

    1975-01-01

    Reviews the spread of Spanish as a native, second or foreign language and shows that Latin America, rather than Spain, is now the center of gravity for that language. Gives a statistical distribution of the major languages throughout the world. The homogeneity of Spanish is mentioned. (Text is in Spanish.) (TL)

  11. Pedagogical Differences during a Science and Language Intervention for English Language Learners

    ERIC Educational Resources Information Center

    Garza, Tiberio; Huerta, Margarita; Lara-Alecio, Rafael; Irby, Beverly J.; Tong, Fuhui

    2018-01-01

    The purpose of this study was to compare and describe 8 fifth-grade classrooms by their teachers' pedagogy during a quasiexperimental, longitudinal, and field-based project focused on increasing English language learners' (ELLs') achievement in science and language. The larger study found statistically significant and positive intervention effects…

  12. STEMming the Tide: STEAMing Ahead by Including World Language Education

    ERIC Educational Resources Information Center

    Murphy-Judy, Kathryn

    2017-01-01

    The author argues for the inclusion of language in science, technology, engineering, mathematics (STEM) curriculum. She begins by examining the American Academy of Arts and Sciences (AAAS) statistical report of U.S. language study. Language instruction in public and private schools has been declining throughout the years while pressures from…

  13. Thinking About Multiword Constructions: Usage-Based Approaches to Acquisition and Processing.

    PubMed

    Ellis, Nick C; Ogden, Dave C

    2017-07-01

    Usage-based approaches to language hold that we learn multiword expressions as patterns of language from language usage, and that knowledge of these patterns underlies fluent language processing. This paper explores these claims by focusing upon verb-argument constructions (VACs) such as "V(erb) about n(oun phrase)." These are productive constructions that bind syntax, lexis, and semantics. It presents (a) analyses of usage patterns of English VACs in terms of their grammatical form, semantics, lexical constituency, and distribution patterns in large corpora; (b) patterns of VAC usage in child-directed speech and child language acquisition; and (c) investigations of VAC free-association and psycholinguistic studies of online processing. We conclude that VACs are highly patterned in usage, that this patterning drives language acquisition, and that language processing is sensitive to the forms of the syntagmatic construction and their distributional statistics, the contingency of their association with meaning, and spreading activation and prototypicality effects in semantic reference. Language users have rich implicit knowledge of the statistics of multiword sequences. Copyright © 2017 Cognitive Science Society, Inc.

  14. Language experience changes subsequent learning

    PubMed Central

    Onnis, Luca; Thiessen, Erik

    2013-01-01

    What are the effects of experience on subsequent learning? We explored the effects of language-specific word order knowledge on the acquisition of sequential conditional information. Korean and English adults were engaged in a sequence learning task involving three different sets of stimuli: auditory linguistic (nonsense syllables), visual non-linguistic (nonsense shapes), and auditory non-linguistic (pure tones). The forward and backward probabilities between adjacent elements generated two equally probable and orthogonal perceptual parses of the elements, such that any significant preference at test must be due to either general cognitive biases, or prior language-induced biases. We found that language modulated parsing preferences with the linguistic stimuli only. Intriguingly, these preferences are congruent with the dominant word order patterns of each language, as corroborated by corpus analyses, and are driven by probabilistic preferences. Furthermore, although the Korean individuals had received extensive formal explicit training in English and lived in an English-speaking environment, they exhibited statistical learning biases congruent with their native language. Our findings suggest that mechanisms of statistical sequential learning are implicated in language across the lifespan, and experience with language may affect cognitive processes and later learning. PMID:23200510
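
    The forward and backward probabilities between adjacent elements mentioned in this abstract can be estimated from a sequence by simple counting. The sketch below only illustrates those two definitions; it is not the authors' stimulus-generation or analysis code, and the function name and the toy sequence are hypothetical.

```python
from collections import Counter


def transitional_probabilities(sequence):
    """Return (forward, backward) probabilities for each adjacent pair.

    forward[(x, y)]  estimates P(y follows | x), i.e. count(xy) / count(x as first element)
    backward[(x, y)] estimates P(x precedes | y), i.e. count(xy) / count(y as second element)
    """
    pair_counts = Counter(zip(sequence, sequence[1:]))
    first_counts = Counter(sequence[:-1])   # elements that have a successor
    second_counts = Counter(sequence[1:])   # elements that have a predecessor
    forward = {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}
    backward = {(x, y): n / second_counts[y] for (x, y), n in pair_counts.items()}
    return forward, backward


# Toy sequence of nonsense syllables, for illustration only.
syllables = ["a", "b", "a", "c", "a", "b"]
forward, backward = transitional_probabilities(syllables)
```

    In this toy sequence the pair ("a", "b") has a forward probability of 2/3 but a backward probability of 1, which shows how the two measures can favour different parses of the same input.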

  15. Word lengths are optimized for efficient communication.

    PubMed

    Piantadosi, Steven T; Tily, Harry; Gibson, Edward

    2011-03-01

    We demonstrate a substantial improvement on one of the most celebrated empirical laws in the study of language, Zipf's 75-y-old theory that word length is primarily determined by frequency of use. In accord with rational theories of communication, we show across 10 languages that average information content is a much better predictor of word length than frequency. This indicates that human lexicons are efficiently structured for communication by taking into account interword statistical dependencies. Lexical systems result from an optimization of communicative pressures, coding meanings efficiently given the complex statistics of natural language use.

  16. The Intersection of Inquiry-Based Science and Language: Preparing Teachers for ELL Classrooms

    NASA Astrophysics Data System (ADS)

    Weinburgh, Molly; Silva, Cecilia; Smith, Kathy Horak; Groulx, Judy; Nettles, Jenesta

    2014-08-01

    As teacher educators, we are tasked with preparing prospective teachers to enter a field that has undergone significant changes in student population and policy since we were K-12 teachers. With the emphasis placed on connections, mathematics integration, and communication by the Next Generation Science Standards (NGSS) (Achieve in Next generation science standards, 2012), more research is needed on how teachers can accomplish this integration (Bunch in Rev Res Educ 37:298-341, 2013; Lee et al. in Educ Res 42(4):223-233, 2013). Science teacher educators, in response to the NGSS, recognize that it is necessary for pre-service and in-service teachers to know more about how instructional strategies in language and science can complement one another. Our purpose in this study was to explore a model of integration that can be used in classrooms. To do this, we examined the change in science content knowledge and academic vocabulary for English language learners (ELLs) as they engaged in inquiry-based science experience utilizing the 5R Instructional Model. Two units, erosion and wind turbines, were developed using the 5R Instructional Model and taught during two different years in a summer school program for ELLs. We analyzed data from interviews to assess change in conceptual understanding and science academic vocabulary over the 60 h of instruction. The statistics show a clear trend of growth supporting our claim that ELLs did construct more sophisticated understanding of the topics and use more language to communicate their knowledge. As science teacher educators seek ways to prepare elementary teachers to help preK-12 students to learn science and develop the language of science, the 5R Instructional Model is one pathway.

  17. The MSFC UNIVAC 1108 EXEC 8 simulation model

    NASA Technical Reports Server (NTRS)

    Williams, T. G.; Richards, F. M.; Weatherbee, J. E.; Paul, L. K.

    1972-01-01

    A model is presented which simulates the MSFC Univac 1108 multiprocessor system. The hardware/operating system is described to enable a good statistical measurement of the system behavior. The performance of the 1108 is evaluated by performing twenty-four different experiments designed to locate system bottlenecks and also to test the sensitivity of system throughput with respect to perturbation of the various Exec 8 scheduling algorithms. The model is implemented in the general purpose system simulation language and the techniques described can be used to assist in the design, development, and evaluation of multiprocessor systems.

  18. Language and Demographic Characteristics of the U.S. Population with Potential Need for Bilingual and Other Special Educational Programs, July 1975.

    ERIC Educational Resources Information Center

    Waggoner, Dorothy

    This report summarizes the language background information and certain demographic characteristics of language minorities in the United States. The data were derived from the Survey of Languages, a pilot study of the non-English-language background population aged four and older sponsored by the National Center for Education Statistics as part of…

  19. The language of gene ontology: a Zipf's law analysis.

    PubMed

    Kalankesh, Leila Ranandeh; Stevens, Robert; Brass, Andy

    2012-06-07

    Most major genome projects and sequence databases provide a GO annotation of their data, either automatically or through human annotators, creating a large corpus of data written in the language of GO. Texts written in natural language show a statistical power law behaviour, Zipf's law, the exponent of which can provide useful information on the nature of the language being used. We have therefore explored the hypothesis that collections of GO annotations will show similar statistical behaviours to natural language. Annotations from the Gene Ontology Annotation project were found to follow Zipf's law. Surprisingly, the measured power law exponents were consistently different between annotation captured using the three GO sub-ontologies in the corpora (function, process and component). On filtering the corpora using GO evidence codes we found that the value of the measured power law exponent responded in a predictable way as a function of the evidence codes used to support the annotation. Techniques from computational linguistics can provide new insights into the annotation process. GO annotations show similar statistical behaviours to those seen in natural language with measured exponents that provide a signal which correlates with the nature of the evidence codes used to support the annotations, suggesting that the measured exponent might provide a signal regarding the information content of the annotation.

  20. Acoustic-Emergent Phonology in the Amplitude Envelope of Child-Directed Speech

    PubMed Central

    Leong, Victoria; Goswami, Usha

    2015-01-01

    When acquiring language, young children may use acoustic spectro-temporal patterns in speech to derive phonological units in spoken language (e.g., prosodic stress patterns, syllables, phonemes). Children appear to learn acoustic-phonological mappings rapidly, without direct instruction, yet the underlying developmental mechanisms remain unclear. Across different languages, a relationship between amplitude envelope sensitivity and phonological development has been found, suggesting that children may make use of amplitude modulation (AM) patterns within the envelope to develop a phonological system. Here we present the Spectral Amplitude Modulation Phase Hierarchy (S-AMPH) model, a set of algorithms for deriving the dominant AM patterns in child-directed speech (CDS). Using Principal Components Analysis, we show that rhythmic CDS contains an AM hierarchy comprising 3 core modulation timescales. These timescales correspond to key phonological units: prosodic stress (Stress AM, ~2 Hz), syllables (Syllable AM, ~5 Hz) and onset-rime units (Phoneme AM, ~20 Hz). We argue that these AM patterns could in principle be used by naïve listeners to compute acoustic-phonological mappings without lexical knowledge. We then demonstrate that the modulation statistics within this AM hierarchy indeed parse the speech signal into a primitive hierarchically-organised phonological system comprising stress feet (proto-words), syllables and onset-rime units. We apply the S-AMPH model to two other CDS corpora, one spontaneous and one deliberately-timed. The model accurately identified 72–82% (freely-read CDS) and 90–98% (rhythmically-regular CDS) stress patterns, syllables and onset-rime units. This in-principle demonstration that primitive phonology can be extracted from speech AMs is termed Acoustic-Emergent Phonology (AEP) theory. 
    AEP theory provides a set of methods for examining how early phonological development is shaped by the temporal modulation structure of speech across languages. The S-AMPH model reveals a crucial developmental role for stress feet (AMs ~2 Hz). Stress feet underpin different linguistic rhythm typologies, and speech rhythm underpins language acquisition by infants in all languages. PMID:26641472

  1. Acoustic-Emergent Phonology in the Amplitude Envelope of Child-Directed Speech.

    PubMed

    Leong, Victoria; Goswami, Usha

    2015-01-01

    When acquiring language, young children may use acoustic spectro-temporal patterns in speech to derive phonological units in spoken language (e.g., prosodic stress patterns, syllables, phonemes). Children appear to learn acoustic-phonological mappings rapidly, without direct instruction, yet the underlying developmental mechanisms remain unclear. Across different languages, a relationship between amplitude envelope sensitivity and phonological development has been found, suggesting that children may make use of amplitude modulation (AM) patterns within the envelope to develop a phonological system. Here we present the Spectral Amplitude Modulation Phase Hierarchy (S-AMPH) model, a set of algorithms for deriving the dominant AM patterns in child-directed speech (CDS). Using Principal Components Analysis, we show that rhythmic CDS contains an AM hierarchy comprising 3 core modulation timescales. These timescales correspond to key phonological units: prosodic stress (Stress AM, ~2 Hz), syllables (Syllable AM, ~5 Hz) and onset-rime units (Phoneme AM, ~20 Hz). We argue that these AM patterns could in principle be used by naïve listeners to compute acoustic-phonological mappings without lexical knowledge. We then demonstrate that the modulation statistics within this AM hierarchy indeed parse the speech signal into a primitive hierarchically-organised phonological system comprising stress feet (proto-words), syllables and onset-rime units. We apply the S-AMPH model to two other CDS corpora, one spontaneous and one deliberately-timed. The model accurately identified 72-82% (freely-read CDS) and 90-98% (rhythmically-regular CDS) stress patterns, syllables and onset-rime units. This in-principle demonstration that primitive phonology can be extracted from speech AMs is termed Acoustic-Emergent Phonology (AEP) theory. 
    AEP theory provides a set of methods for examining how early phonological development is shaped by the temporal modulation structure of speech across languages. The S-AMPH model reveals a crucial developmental role for stress feet (AMs ~2 Hz). Stress feet underpin different linguistic rhythm typologies, and speech rhythm underpins language acquisition by infants in all languages.

  2. Electrocorticographic language mapping in children by high-gamma synchronization during spontaneous conversation: comparison with conventional electrical cortical stimulation.

    PubMed

    Arya, Ravindra; Wilson, J Adam; Vannest, Jennifer; Byars, Anna W; Greiner, Hansel M; Buroker, Jason; Fujiwara, Hisako; Mangano, Francesco T; Holland, Katherine D; Horn, Paul S; Crone, Nathan E; Rose, Douglas F

    2015-02-01

    This study describes development of a novel language mapping approach using high-γ modulation in electrocorticograph (ECoG) during spontaneous conversation, and its comparison with electrical cortical stimulation (ECS) in childhood-onset drug-resistant epilepsy. Patients undergoing invasive pre-surgical monitoring and able to converse with the investigator were eligible. ECoG signals and synchronized audio were acquired during quiet baseline and during natural conversation between investigator and the patient. Using Signal Modeling for Real-time Identification and Event Detection (SIGFRIED) procedure, a statistical model for baseline high-γ (70-116 Hz) power, and a single score for each channel representing the probability that the power features in the experimental signal window belonged to the baseline model, were calculated. Electrodes with significant high-γ responses (HGS) were plotted on the 3D cortical model. Sensitivity, specificity, positive and negative predictive values (PPV, NPV), and classification accuracy were calculated compared to ECS. Seven patients were included (4 males, mean age 10.28 ± 4.07 years). Significant high-γ responses were observed in classic language areas in the left hemisphere plus in some homologous right hemispheric areas. Compared with clinical standard ECS mapping, the sensitivity and specificity of HGS mapping was 88.89% and 63.64%, respectively, and PPV and NPV were 35.29% and 96.25%, with an overall accuracy of 68.24%. HGS mapping was able to correctly determine all ECS+ sites in 6 of 7 patients and all false-sites (ECS+, HGS- for visual naming, n = 3) were attributable to only 1 patient. This study supports the feasibility of language mapping with ECoG HGS during spontaneous conversation, and its accuracy compared to traditional ECS. 
    Given long-standing concerns about ecological validity of ECS mapping of cued language tasks, and difficulties encountered with its use in children, ECoG mapping of spontaneous language may provide a valid alternative for clinical use. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. Software For Computing Reliability Of Other Software

    NASA Technical Reports Server (NTRS)

    Nikora, Allen; Antczak, Thomas M.; Lyu, Michael

    1995-01-01

    Computer Aided Software Reliability Estimation (CASRE) computer program developed for use in measuring reliability of other software. Easier for non-specialists in reliability to use than many other currently available programs developed for same purpose. CASRE incorporates mathematical modeling capabilities of public-domain Statistical Modeling and Estimation of Reliability Functions for Software (SMERFS) computer program and runs in Windows software environment. Provides menu-driven command interface; enabling and disabling of menu options guides user through (1) selection of set of failure data, (2) execution of mathematical model, and (3) analysis of results from model. Written in C language.

  4. The Contribution of Language-Specific Knowledge in the Selection of Statistically-Coherent Word Candidates

    ERIC Educational Resources Information Center

    Toro, Juan M.; Pons, Ferran; Bion, Ricardo A. H.; Sebastian-Galles, Nuria

    2011-01-01

    Much research has explored the extent to which statistical computations account for the extraction of linguistic information. However, it remains to be studied how language-specific constraints are imposed over these computations. In the present study we investigated if the violation of a word-forming rule in Catalan (the presence of more than one…

  5. Neurophysiological Markers of Statistical Learning in Music and Language: Hierarchy, Entropy, and Uncertainty.

    PubMed

    Daikoku, Tatsuya

    2018-06-19

    Statistical learning (SL) is a method of learning based on the transitional probabilities embedded in sequential phenomena such as music and language. It has been considered an implicit and domain-general mechanism that is innate in the human brain and that functions independently of intention to learn and awareness of what has been learned. SL is an interdisciplinary notion that incorporates information technology, artificial intelligence, musicology, and linguistics, as well as psychology and neuroscience. A body of recent studies has suggested that SL can be reflected in neurophysiological responses based on the framework of information theory. This paper reviews a range of work on SL in adults and children that suggests overlapping and independent neural correlations in music and language, and that indicates disability of SL. Furthermore, this article discusses the relationships between the order of transitional probabilities (TPs) (i.e., hierarchy of local statistics) and entropy (i.e., global statistics) regarding SL strategies in the human brain; argues for the importance of information-theoretical approaches to understanding domain-general, higher-order, and global SL covering both real-world music and language; and proposes promising approaches for the application of therapy and pedagogy from various perspectives of psychology, neuroscience, computational studies, musicology, and linguistics.

  6. Using complex networks for text classification: Discriminating informative and imaginative documents

    NASA Astrophysics Data System (ADS)

    de Arruda, Henrique F.; Costa, Luciano da F.; Amancio, Diego R.

    2016-01-01

    Statistical methods have been widely employed in recent years to grasp many language properties. The application of such techniques has allowed an improvement of several linguistic applications, such as machine translation and document classification. In the latter, many approaches have emphasised the semantical content of texts, as is the case of bag-of-word language models. These approaches have certainly yielded reasonable performance. However, some potential features such as the structural organization of texts have been used only in a few studies. In this context, we probe how features derived from textual structure analysis can be effectively employed in a classification task. More specifically, we performed a supervised classification aiming at discriminating informative from imaginative documents. Using a networked model that describes the local topological/dynamical properties of function words, we achieved an accuracy rate of up to 95%, which is much higher than similar networked approaches. A systematic analysis of feature relevance revealed that symmetry and accessibility measurements are among the most prominent network measurements. Our results suggest that these measurements could be used in related language applications, as they play a complementary role in characterising texts.

  7. A random matrix approach to language acquisition

    NASA Astrophysics Data System (ADS)

    Nicolaidis, A.; Kosmidis, Kosmas; Argyrakis, Panos

    2009-12-01

    Since language is tied to cognition, we expect the linguistic structures to reflect patterns that we encounter in nature and are analyzed by physics. Within this realm we investigate the process of lexicon acquisition, using analytical and tractable methods developed within physics. A lexicon is a mapping between sounds and referents of the perceived world. This mapping is represented by a matrix and the linguistic interaction among individuals is described by a random matrix model. There are two essential parameters in our approach. The strength of the linguistic interaction β, which is considered as a genetically determined ability, and the number N of sounds employed (the lexicon size). Our model of linguistic interaction is analytically studied using methods of statistical physics and simulated by Monte Carlo techniques. The analysis reveals an intricate relationship between the innate propensity for language acquisition β and the lexicon size N, N~exp(β). Thus a small increase of the genetically determined β may lead to an incredible lexical explosion. Our approximate scheme offers an explanation for the biological affinity of different species and their simultaneous linguistic disparity.

  8. Native Language Influence in the Segmentation of a Novel Language

    ERIC Educational Resources Information Center

    Ordin, Mikhail; Nespor, Marina

    2016-01-01

    A major problem in second language acquisition (SLA) is the segmentation of fluent speech in the target language, i.e., detecting the boundaries of phonological constituents like words and phrases in the speech stream. To this end, among a variety of cues, people extensively use prosody and statistical regularities. We examined the role of pitch,…

  9. Laying the Foundations for Video-Game Based Language Instruction for the Teaching of EFL

    ERIC Educational Resources Information Center

    Galvis, Héctor Alejandro

    2015-01-01

    This paper introduces video-game based language instruction as a teaching approach catering to the different socio-economic and learning needs of English as a Foreign Language students. First, this paper reviews statistical data revealing the low participation of Colombian students in English as a second language programs abroad (U.S. context…

  10. Contribution of Implicit Sequence Learning to Spoken Language Processing: Some Preliminary Findings with Hearing Adults

    ERIC Educational Resources Information Center

    Conway, Christopher M.; Karpicke, Jennifer; Pisoni, David B.

    2007-01-01

    Spoken language consists of a complex, sequentially arrayed signal that contains patterns that can be described in terms of statistical relations among language units. Previous research has suggested that a domain-general ability to learn structured sequential patterns may underlie language acquisition. To test this prediction, we examined the…

  11. Influence of family environment on language outcomes in children with myelomeningocele.

    PubMed

    Vachha, B; Adams, R

    2005-09-01

    Previously, our studies demonstrated language differences impacting academic performance among children with myelomeningocele and shunted hydrocephalus (MMSH). This follow-up study considers the environmental facilitators within families (achievement orientation, intellectual-cultural orientation, active recreational orientation, independence) among a cohort of children with MMSH and their relationship to language performance. Fifty-eight monolingual, English-speaking children (36 females; mean age: 10.1 years; age range: 7-16 years) with MMSH were evaluated. Exclusionary criteria were prior shunt infection; seizure or shunt malfunction within the previous 3 months; uncorrected visual or auditory impairments; prior diagnoses of mental retardation or attention deficit disorder. The Comprehensive Assessment of Spoken Language (CASL) and the Wechsler Abbreviated Scale of Intelligence (WASI) were administered individually to all participants. The CASL measures four subsystems: lexical, syntactic, supralinguistic and pragmatic. Parents completed the Family Environment Scale (FES) questionnaire and provided background demographic information. Spearman correlation analyses and partial correlation analyses were performed. Mean intelligence scores for the MMSH group: full scale IQ 92.2 (SD = 11.9). The CASL revealed statistically significant difficulty for supralinguistic and pragmatic (or social) language tasks. FES scores fell within the average range for the group. Spearman correlation and partial correlation analyses revealed statistically significant positive relationships for the FES 'intellectual-cultural orientation' variable and performance within the four language subsystems. Socio-economic status (SES) characteristics were analyzed and did not discriminate language performance when the intellectual-cultural orientation factor was taken into account. The role of family facilitators on language skills in children with MMSH has not previously been described.
    The relationship between language performance and the families' value on intellectual/cultural activities seems both statistically and intuitively sound. Focused interest in the integration of family values and practices should assist developmental specialists in supporting families and children within their most natural environment.

  12. The statistical trade-off between word order and word structure – Large-scale evidence for the principle of least effort

    PubMed Central

    Koplenig, Alexander; Meyer, Peter; Wolfer, Sascha; Müller-Spitzer, Carolin

    2017-01-01

    Languages employ different strategies to transmit structural and grammatical information. While, for example, grammatical dependency relationships in sentences are mainly conveyed by the ordering of the words for languages like Mandarin Chinese, or Vietnamese, the word ordering is much less restricted for languages such as Inupiatun or Quechua, as these languages (also) use the internal structure of words (e.g. inflectional morphology) to mark grammatical relationships in a sentence. Based on a quantitative analysis of more than 1,500 unique translations of different books of the Bible in almost 1,200 different languages that are spoken as a native language by approximately 6 billion people (more than 80% of the world population), we present large-scale evidence for a statistical trade-off between the amount of information conveyed by the ordering of words and the amount of information conveyed by internal word structure: languages that rely more strongly on word order information tend to rely less on word structure information and vice versa. Or put differently, if less information is carried within the word, more information has to be spread among words in order to communicate successfully. In addition, we find that–despite differences in the way information is expressed–there is also evidence for a trade-off between different books of the biblical canon that recurs with little variation across languages: the more informative the word order of the book, the less informative its word structure and vice versa. We argue that this might suggest that, on the one hand, languages encode information in very different (but efficient) ways. On the other hand, content-related and stylistic features are statistically encoded in very similar ways. PMID:28282435

  13. Depression in non-Korean women residing in South Korea following marriage to Korean men.

    PubMed

    Kim, Hyun-Sil; Kim, Hun-Soo

    2013-06-01

    The purpose of the study was to examine the roles of acculturative stress, life satisfaction, and language literacy in depression in non-Korean women residing in South Korea following marriage to Korean men. A cross-sectional study was performed, using an anonymous, self-reporting questionnaire. A total of 173 women were selected using a proportional stratified random sampling method. The relation between acculturation, depression, language literacy, life satisfaction and socio-demographic variables and the predictors of depression among participants were analyzed. The analysis included descriptive statistics and hierarchical multiple regression. Of the participants, 9.2% had depression, which was almost twice the rate of depression found in the general Korean population. In hierarchical multiple regression analysis, acculturative stress (beta=-.325, P<.001) and life satisfaction (beta=-.282, P=.003) were significantly associated with the level of depression. This final model was statistically significant, and life satisfaction, acculturative stress, and language literacy accounted for 31.0% (adjusted R²) of the variance in the depression score (P<.001). Elevated acculturative stress and less life satisfaction were significantly associated with a higher level of depression in migrant wives in Korea. Implications for practice and research are discussed. Copyright © 2013 Elsevier Inc. All rights reserved.

  14. Ultraconserved words point to deep language ancestry across Eurasia.

    PubMed

    Pagel, Mark; Atkinson, Quentin D; S Calude, Andreea; Meade, Andrew

    2013-05-21

    The search for ever deeper relationships among the World's languages is bedeviled by the fact that most words evolve too rapidly to preserve evidence of their ancestry beyond 5,000 to 9,000 y. On the other hand, quantitative modeling indicates that some "ultraconserved" words exist that might be used to find evidence for deep linguistic relationships beyond that time barrier. Here we use a statistical model, which takes into account the frequency with which words are used in common everyday speech, to predict the existence of a set of such highly conserved words among seven language families of Eurasia postulated to form a linguistic superfamily that evolved from a common ancestor around 15,000 y ago. We derive a dated phylogenetic tree of this proposed superfamily with a time-depth of ~14,450 y, implying that some frequently used words have been retained in related forms since the end of the last ice age. Words used more than once per 1,000 in everyday speech were 7- to 10-times more likely to show deep ancestry on this tree. Our results suggest a remarkable fidelity in the transmission of some words and give theoretical justification to the search for features of language that might be preserved across wide spans of time and geography.

  15. Ultraconserved words point to deep language ancestry across Eurasia

    PubMed Central

    Pagel, Mark; Atkinson, Quentin D.; S. Calude, Andreea; Meade, Andrew

    2013-01-01

    The search for ever deeper relationships among the World’s languages is bedeviled by the fact that most words evolve too rapidly to preserve evidence of their ancestry beyond 5,000 to 9,000 y. On the other hand, quantitative modeling indicates that some “ultraconserved” words exist that might be used to find evidence for deep linguistic relationships beyond that time barrier. Here we use a statistical model, which takes into account the frequency with which words are used in common everyday speech, to predict the existence of a set of such highly conserved words among seven language families of Eurasia postulated to form a linguistic superfamily that evolved from a common ancestor around 15,000 y ago. We derive a dated phylogenetic tree of this proposed superfamily with a time-depth of ∼14,450 y, implying that some frequently used words have been retained in related forms since the end of the last ice age. Words used more than once per 1,000 in everyday speech were 7- to 10-times more likely to show deep ancestry on this tree. Our results suggest a remarkable fidelity in the transmission of some words and give theoretical justification to the search for features of language that might be preserved across wide spans of time and geography. PMID:23650390

  16. Further evidence for a parent-of-origin effect at the NOP9 locus on language-related phenotypes.

    PubMed

    Pettigrew, Kerry A; Frinton, Emily; Nudel, Ron; Chan, May T M; Thompson, Paul; Hayiou-Thomas, Marianna E; Talcott, Joel B; Stein, John; Monaco, Anthony P; Hulme, Charles; Snowling, Margaret J; Newbury, Dianne F; Paracchini, Silvia

    2016-01-01

    Specific language impairment (SLI) is a common neurodevelopmental disorder, observed in 5-10% of children. Family and twin studies suggest a strong genetic component, but relatively few candidate genes have been reported to date. A recent genome-wide association study (GWAS) described the first statistically significant association specifically for a SLI cohort between a missense variant (rs4280164) in the NOP9 gene and language-related phenotypes under a parent-of-origin model. Replications of these findings are particularly challenging because the availability of parental DNA is required. We used two independent family-based cohorts characterised with reading- and language-related traits: a longitudinal cohort (n = 106 informative families) including children with language and reading difficulties and a nuclear family cohort (n = 264 families) selected for dyslexia. We observed association with language-related measures when modelling for parent-of-origin effects at the NOP9 locus in both cohorts: minimum P = 0.001 for phonological awareness with a paternal effect in the first cohort and minimum P = 0.0004 for irregular word reading with a maternal effect in the second cohort. Allelic and parental trends were not consistent when compared to the original study. A parent-of-origin effect at this locus was detected in both cohorts, albeit with different trends. These findings contribute to interpreting the original GWAS report and support further investigations of the NOP9 locus and its role in language-related traits. A systematic evaluation of parent-of-origin effects in genetic association studies has the potential to reveal novel mechanisms underlying complex traits.

  17. Songs as an aid for language acquisition.

    PubMed

    Schön, Daniele; Boyer, Maud; Moreno, Sylvain; Besson, Mireille; Peretz, Isabelle; Kolinsky, Régine

    2008-02-01

    In previous research, Saffran and colleagues [Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928; Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606-621.] have shown that adults and infants can use the statistical properties of syllable sequences to extract words from continuous speech. They also showed that a similar learning mechanism operates with musical stimuli [Saffran, J. R., Johnson, R. E. K., Aslin, N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27-52.]. In this work we combined linguistic and musical information and we compared language learning based on speech sequences to language learning based on sung sequences. We hypothesized that, compared to speech sequences, a consistent mapping of linguistic and musical information would enhance learning. Results confirmed the hypothesis, showing a strong learning facilitation of song compared to speech. Most importantly, the present results show that learning a new language, especially in the first learning phase wherein one needs to segment new words, may benefit greatly from the motivational and structuring properties of music in song.

  18. Regional Earthquake Likelihood Models: A realm on shaky grounds?

    NASA Astrophysics Data System (ADS)

    Kossobokov, V.

    2005-12-01

    Seismology is juvenile and its appropriate statistical tools to date may have a "medieval flavor" for those who hurry up to apply a fuzzy language of a highly developed probability theory. To become "quantitatively probabilistic" earthquake forecasts/predictions must be defined with a scientific accuracy. Following the most popular objectivists' viewpoint on probability, we cannot claim "probabilities" adequate without a long series of "yes/no" forecast/prediction outcomes. Without "antiquated binary language" of "yes/no" certainty we cannot judge an outcome ("success/failure"), and, therefore, quantify objectively a forecast/prediction method performance. Likelihood scoring is one of the delicate tools of Statistics, which could be worthless or even misleading when inappropriate probability models are used. This is a basic loophole for a misuse of likelihood as well as other statistical methods in practice. The flaw could be avoided by an accurate verification of generic probability models on the empirical data. It is not an easy task in the frames of the Regional Earthquake Likelihood Models (RELM) methodology, which neither defines the forecast precision nor allows a means to judge the ultimate success or failure in specific cases. Hopefully, the RELM group realizes the problem and its members do their best to close the hole with an adequate, data supported choice. Regretfully, this is not the case with the erroneous choice of Gerstenberger et al., who started the public web site with forecasts of expected ground shaking for 'tomorrow' (Nature 435, 19 May 2005). Gerstenberger et al. have inverted the critical evidence of their study, i.e., the 15 years of recent seismic record accumulated just in one figure, which suggests rejecting with confidence above 97% "the generic California clustering model" used in automatic calculations. As a result, since the date of publication in Nature the United States Geological Survey website delivers to the public, emergency planners and the media, a forecast product, which is based on wrong assumptions that violate the best-documented earthquake statistics in California, whose accuracy was not investigated, and whose forecasts were not tested in a rigorous way.

  19. Factors determining access to oral health services among children aged less than 12 years in Peru.

    PubMed

    Azañedo, Diego; Hernández-Vásquez, Akram; Casas-Bendezú, Mixsi; Gutiérrez, César; Agudelo-Suárez, Andrés A; Cortés, Sandra

    2017-01-01

    Background: Understanding problems of access to oral health services requires knowledge of factors that determine access. This study aimed to evaluate factors that determine access to oral health services among children aged <12 years in Peru between 2014 and 2015. Methods: We performed a secondary data analysis of 71,614 Peruvian children aged <12 years and their caregivers. Data were obtained from the Survey on Demography and Family Health 2014-2015 (Encuesta Demográfica y de Salud Familiar - ENDES). Children's access to oral health services within the previous 6 months was used as the dependent variable (i.e. Yes/No), and the Andersen et al. model was used to select independent variables. Predisposing (e.g., language spoken by tutor or guardian, wealth level, caregivers' educational level, area of residence, natural region of residence, age, and sex) and enabling factors (e.g. type of health insurance) were considered. Descriptive statistics were calculated, and multivariate analysis was performed using generalized linear models (Poisson family). Results: Of all the children, 51% were males, 56% were aged <5 years, and 62.6% lived in urban areas. The most common type of health insurance was Integral Health Insurance (57.8%), and most respondents were in the first quintile of wealth (31.6%). Regarding caregivers, the most common educational level was high school (43.02%) and the most frequently spoken language was Spanish (88.4%). Univariate analysis revealed that all variables, except sex and primary educational level, were statistically significant. After adjustment, sex, area of residence, and language were insignificant, whereas the remaining variables were statistically significant. Conclusions: Wealth index, caregivers' education level, natural region of residence, age, and type of health insurance are factors that determine access to oral health services among children aged <12 years in Peru. These factors should be considered when devising strategies to mitigate inequities in access to oral health services.

  20. Factors determining access to oral health services among children aged less than 12 years in Peru

    PubMed Central

    Azañedo, Diego; Hernández-Vásquez, Akram; Casas-Bendezú, Mixsi; Gutiérrez, César; Agudelo-Suárez, Andrés A.; Cortés, Sandra

    2017-01-01

    Background: Understanding problems of access to oral health services requires knowledge of factors that determine access. This study aimed to evaluate factors that determine access to oral health services among children aged <12 years in Peru between 2014 and 2015. Methods: We performed a secondary data analysis of 71,614 Peruvian children aged <12 years and their caregivers. Data were obtained from the Survey on Demography and Family Health 2014-2015 (Encuesta Demográfica y de Salud Familiar - ENDES). Children’s access to oral health services within the previous 6 months was used as the dependent variable (i.e. Yes/No), and the Andersen et al. model was used to select independent variables. Predisposing (e.g., language spoken by tutor or guardian, wealth level, caregivers’ educational level, area of residence, natural region of residence, age, and sex) and enabling factors (e.g. type of health insurance) were considered. Descriptive statistics were calculated, and multivariate analysis was performed using generalized linear models (Poisson family). Results: Of all the children, 51% were males, 56% were aged <5 years, and 62.6% lived in urban areas. The most common type of health insurance was Integral Health Insurance (57.8%), and most respondents were in the first quintile of wealth (31.6%). Regarding caregivers, the most common educational level was high school (43.02%) and the most frequently spoken language was Spanish (88.4%). Univariate analysis revealed that all variables, except sex and primary educational level, were statistically significant. After adjustment, sex, area of residence, and language were insignificant, whereas the remaining variables were statistically significant. Conclusions: Wealth index, caregivers’ education level, natural region of residence, age, and type of health insurance are factors that determine access to oral health services among children aged <12 years in Peru. These factors should be considered when devising strategies to mitigate inequities in access to oral health services. PMID:29527289

  1. The Heuristics of Statistical Argumentation: Scaffolding at the Postsecondary Level

    ERIC Educational Resources Information Center

    Pardue, Teneal Messer

    2017-01-01

    Language plays a key role in statistics and, by extension, in statistics education. Enculturating students into the practice of statistics requires preparing them to communicate results of data analysis. Statistical argumentation is one way of providing structure to facilitate discourse in the statistics classroom. In this study, a teaching…

  2. Colloquium: Hierarchy of scales in language dynamics

    NASA Astrophysics Data System (ADS)

    Blythe, Richard A.

    2015-11-01

    Methods and insights from statistical physics are finding an increasing variety of applications where one seeks to understand the emergent properties of a complex interacting system. One such area concerns the dynamics of language at a variety of levels of description, from the behaviour of individual agents learning simple artificial languages from each other, up to changes in the structure of languages shared by large groups of speakers over historical timescales. In this Colloquium, we survey a hierarchy of scales at which language and linguistic behaviour can be described, along with the main progress in understanding that has been made at each of them - much of which has come from the statistical physics community. We argue that future developments may arise by linking the different levels of the hierarchy together in a more coherent fashion, in particular where this allows more effective use of rich empirical data sets.

  3. Balancing Effort and Information Transmission during Language Acquisition: Evidence from Word Order and Case Marking

    ERIC Educational Resources Information Center

    Fedzechkina, Maryia; Newport, Elissa L.; Jaeger, T. Florian

    2017-01-01

    Across languages of the world, some grammatical patterns have been argued to be more common than expected by chance. These are sometimes referred to as (statistical) "language universals." One such universal is the correlation between constituent order freedom and the presence of a case system in a language. Here, we explore whether this…

  4. Flipping between Languages? An Exploratory Analysis of the Usage by Spanish-Speaking English Language Learner Tertiary Students of a Bilingual Probability Applet

    ERIC Educational Resources Information Center

    Lesser, Lawrence M.; Wagler, Amy E.; Salazar, Berenice

    2016-01-01

    English language learners (ELLs) are a rapidly growing part of the student population in many countries. Studies on resources for language learners--especially Spanish-speaking ELLs--have focused on areas such as reading, writing, and mathematics, but not introductory probability and statistics. Semi-structured qualitative interviews investigated…

  5. Language experience changes subsequent learning.

    PubMed

    Onnis, Luca; Thiessen, Erik

    2013-02-01

    What are the effects of experience on subsequent learning? We explored the effects of language-specific word order knowledge on the acquisition of sequential conditional information. Korean and English adults were engaged in a sequence learning task involving three different sets of stimuli: auditory linguistic (nonsense syllables), visual non-linguistic (nonsense shapes), and auditory non-linguistic (pure tones). The forward and backward probabilities between adjacent elements generated two equally probable and orthogonal perceptual parses of the elements, such that any significant preference at test must be due to either general cognitive biases, or prior language-induced biases. We found that language modulated parsing preferences with the linguistic stimuli only. Intriguingly, these preferences are congruent with the dominant word order patterns of each language, as corroborated by corpus analyses, and are driven by probabilistic preferences. Furthermore, although the Korean individuals had received extensive formal explicit training in English and lived in an English-speaking environment, they exhibited statistical learning biases congruent with their native language. Our findings suggest that mechanisms of statistical sequential learning are implicated in language across the lifespan, and experience with language may affect cognitive processes and later learning. Copyright © 2012 Elsevier B.V. All rights reserved.
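
    The forward and backward probabilities between adjacent elements mentioned in this abstract can be illustrated with a minimal sketch. The function name and the toy syllable stream are hypothetical and are not the study's stimuli or analysis code; the sketch only shows how the two conditional probabilities differ for the same adjacent pair.

```python
from collections import Counter

def transitional_probabilities(sequence):
    """Return (forward, backward) transitional probabilities for adjacent pairs.

    forward[(a, b)]  = count(a, b) / count(pairs starting with a)
    backward[(a, b)] = count(a, b) / count(pairs ending with b)
    """
    pairs = list(zip(sequence, sequence[1:]))
    pair_counts = Counter(pairs)
    first_counts = Counter(a for a, _ in pairs)
    second_counts = Counter(b for _, b in pairs)
    forward = {p: n / first_counts[p[0]] for p, n in pair_counts.items()}
    backward = {p: n / second_counts[p[1]] for p, n in pair_counts.items()}
    return forward, backward

# Hypothetical toy stream of nonsense syllables (assumption, for illustration only).
stream = ["ba", "du", "ba", "ki", "to", "du", "ba", "du"]
forward, backward = transitional_probabilities(stream)

# The same pair can be weakly predicted forward but strongly predicted backward.
print(forward[("ba", "ki")], backward[("ba", "ki")])
```

    A pair with a low forward but high backward probability (or the reverse) is the kind of element for which the two parses described in the abstract would disagree.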

  6. A two-way interface between limited Systems Biology Markup Language and R.

    PubMed

    Radivoyevitch, Tomas

    2004-12-07

    Systems Biology Markup Language (SBML) is gaining broad usage as a standard for representing dynamical systems as data structures. The open source statistical programming environment R is widely used by biostatisticians involved in microarray analyses. An interface between SBML and R does not exist, though one might be useful to R users interested in SBML, and SBML users interested in R. A model structure that parallels SBML to a limited degree is defined in R. An interface between this structure and SBML is provided through two function definitions: write.SBML() which maps this R model structure to SBML level 2, and read.SBML() which maps a limited range of SBML level 2 files back to R. A published model of purine metabolism is provided in this SBML-like format and used to test the interface. The model reproduces published time course responses before and after its mapping through SBML. List infrastructure preexisting in R makes it well-suited for manipulating SBML models. Further developments of this SBML-R interface seem to be warranted.

  7. A two-way interface between limited Systems Biology Markup Language and R

    PubMed Central

    Radivoyevitch, Tomas

    2004-01-01

    Background: Systems Biology Markup Language (SBML) is gaining broad usage as a standard for representing dynamical systems as data structures. The open source statistical programming environment R is widely used by biostatisticians involved in microarray analyses. An interface between SBML and R does not exist, though one might be useful to R users interested in SBML, and SBML users interested in R. Results: A model structure that parallels SBML to a limited degree is defined in R. An interface between this structure and SBML is provided through two function definitions: write.SBML() which maps this R model structure to SBML level 2, and read.SBML() which maps a limited range of SBML level 2 files back to R. A published model of purine metabolism is provided in this SBML-like format and used to test the interface. The model reproduces published time course responses before and after its mapping through SBML. Conclusions: List infrastructure preexisting in R makes it well-suited for manipulating SBML models. Further developments of this SBML-R interface seem to be warranted. PMID:15585059

  8. Curriculum Boosters. Social Studies, Math, Language Arts.

    ERIC Educational Resources Information Center

    Reissman, Rose; And Others

    1994-01-01

    Presents three curriculum boosting activities for elementary classes. A social studies activity builds bridges to other cultures via literature. A math activity teaches students about percentages using baseball card statistics. A language arts activity helps students learn to appreciate the language of Shakespeare. A student page presents a…

  9. Corpus Approaches to Language Ideology

    ERIC Educational Resources Information Center

    Vessey, Rachelle

    2017-01-01

    This paper outlines how corpus linguistics--and more specifically the corpus-assisted discourse studies approach--can add useful dimensions to studies of language ideology. First, it is argued that the identification of words of high, low, and statistically significant frequency can help in the identification and exploration of language ideologies…

  10. Input and Intake in Language Acquisition

    ERIC Educational Resources Information Center

    Gagliardi, Ann C.

    2012-01-01

    This dissertation presents an approach for a productive way forward in the study of language acquisition, sealing the rift between claims of an innate linguistic hypothesis space and powerful domain general statistical inference. This approach breaks language acquisition into its component parts, distinguishing the input in the environment from…

  11. Language of publication restrictions in systematic reviews gave different results depending on whether the intervention was conventional or complementary.

    PubMed

    Pham, Ba'; Klassen, Terry P; Lawson, Margaret L; Moher, David

    2005-08-01

    To assess whether language of publication restrictions impact the estimates of an intervention's effectiveness, whether such impact is similar for conventional medicine and complementary medicine interventions, and whether the results are influenced by publication bias and statistical heterogeneity. We set out to examine the extent to which including reports of randomized controlled trials (RCTs) in languages other than English (LOE) influences the results of systematic reviews, using a broad dataset of 42 language-inclusive systematic reviews, involving 662 RCTs, including both conventional medicine (CM) and complementary and alternative medicine (CAM) interventions. For CM interventions, language-restricted systematic reviews, compared with language-inclusive ones, did not introduce biased results, in terms of estimates of intervention effectiveness (random effects ratio of odds ratios, ROR=1.02; 95% CI=0.83-1.26). For CAM interventions, however, language-restricted systematic reviews resulted in a 63% smaller protective effect estimate than language-inclusive reviews (random effects ROR=1.63; 95% CI=1.03-2.60). Language restrictions do not change the results of CM systematic reviews but do substantially alter the results of CAM systematic reviews. These findings are robust even after sensitivity analyses, and do not appear to be influenced by statistical heterogeneity and publication bias.
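
    The reported ratios of odds ratios can be read with a short arithmetic sketch. It assumes, which the abstract does not state explicitly, that the ROR divides the language-restricted pooled odds ratio by the language-inclusive one; the helper name is hypothetical. Only the ROR values and confidence limits come from the abstract.

```python
def ci_excludes_null(lower, upper, null_value=1.0):
    """True when a confidence interval does not contain the null value (ROR = 1)."""
    return not (lower <= null_value <= upper)

# Values reported in the abstract.
cm_ror, cm_ci = 1.02, (0.83, 1.26)     # conventional medicine
cam_ror, cam_ci = 1.63, (1.03, 2.60)   # complementary and alternative medicine

cm_differs = ci_excludes_null(*cm_ci)    # interval contains 1: no detectable shift
cam_differs = ci_excludes_null(*cam_ci)  # interval excludes 1: estimates shift

# An ROR of 1.63 corresponds to the 63% relative difference quoted in the abstract.
cam_percent = round((cam_ror - 1.0) * 100)
print(cm_differs, cam_differs, cam_percent)
```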

  12. Computational Software to Fit Seismic Data Using Epidemic-Type Aftershock Sequence Models and Modeling Performance Comparisons

    NASA Astrophysics Data System (ADS)

    Chu, A.

    2016-12-01

    Modern earthquake catalogs are often analyzed using spatial-temporal point process models such as the epidemic-type aftershock sequence (ETAS) models of Ogata (1998). My work implements three of the homogeneous ETAS models described in Ogata (1998). With a model's log-likelihood function, my software finds the Maximum-Likelihood Estimates (MLEs) of the model's parameters to estimate the homogeneous background rate and the temporal and spatial parameters that govern triggering effects. The EM algorithm is employed for its advantages of stability and robustness (Veen and Schoenberg, 2008). My work also presents comparisons among the three models in robustness, convergence speed, and implementations from theory to computing practice. Up-to-date regional seismic data of seismically active areas such as Southern California and Japan are used to demonstrate the comparisons. Data analysis has been done using the computer languages Java and R. Java has the advantages of strong typing and ease of controlling memory resources, while R has the advantage of numerous available functions for statistical computing. Comparisons are also made between the two programming languages in convergence and stability, computational speed, and ease of implementation. Issues that may affect convergence, such as spatial shapes, are discussed.
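
    As a rough illustration of the log-likelihood that such software maximizes, the sketch below evaluates a purely temporal ETAS model. This is a simplification chosen for brevity: the abstract concerns spatial-temporal models fitted with an EM algorithm, and neither is reproduced here. The function name and parameter names (mu, K, alpha, c, p, m0) are assumptions following common ETAS notation.

```python
import math

def etas_log_likelihood(times, mags, T, mu, K, alpha, c, p, m0):
    """Log-likelihood of a temporal ETAS model observed on [0, T].

    Conditional intensity (assumed form):
        lambda(t) = mu + sum over t_i < t of
                    K * exp(alpha * (m_i - m0)) / (t - t_i + c) ** p
    Assumes times are strictly ascending, lie in [0, T], and p != 1.
    """
    # Sum of log-intensities at the observed event times.
    log_sum = 0.0
    for j, t_j in enumerate(times):
        rate = mu
        for i in range(j):
            rate += K * math.exp(alpha * (mags[i] - m0)) / (t_j - times[i] + c) ** p
        log_sum += math.log(rate)

    # Integral of the intensity over [0, T]: background plus triggered parts.
    integral = mu * T
    for t_i, m_i in zip(times, mags):
        productivity = K * math.exp(alpha * (m_i - m0))
        integral += productivity * (c ** (1 - p) - (T - t_i + c) ** (1 - p)) / (p - 1)

    return log_sum - integral

# Hypothetical three-event catalog (assumption, for illustration only).
example = etas_log_likelihood([1.0, 2.0, 3.0], [3.0, 3.5, 3.2], T=10.0,
                              mu=0.5, K=0.1, alpha=1.0, c=0.01, p=1.2, m0=3.0)
print(example)
```

    With K set to zero the expression reduces to the homogeneous Poisson log-likelihood, n log(mu) - mu T, which gives a quick sanity check for any implementation.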

  13. Language development in preschool children born after asymmetrical intrauterine growth retardation.

    PubMed

    Simić Klarić, Andrea; Kolundžić, Zdravko; Galić, Slavka; Mejaški Bošnjak, Vlatka

    2012-03-01

    After intrauterine growth retardation, many minor neurodevelopmental disorders may occur, especially in the motor skills domain, language and speech development, and cognitive functions. This study assessed language development and the impact of postnatal head growth in preschool children born with asymmetrical intrauterine growth retardation. Examinees were born at term with birth weight below the 10th percentile for gestational age, parity and gender. Mean age at the time of study was six years and four months. The control group was matched according to chronological and gestational age, gender and maternal education with mean age six years and five months. There were 50 children with intrauterine growth retardation and 50 controls, 28 girls and 22 boys in each group. Language development was assessed with the Reynell Developmental Language Scale, the Naming test and the Mottier test. There were statistically significant differences (p < 0.05) in language comprehension, total expressive language (vocabulary, structure, content), naming skills and non-words repetition. Statistically significant positive correlations were found between relative growth of the head [(Actual head circumference - head circumference at birth)/(Body weight - birth weight)] and language outcome. Children with neonatal complications had lower results (p < 0.05) in language comprehension and total expressive language. Intrauterine growth retardation has a negative impact on language development which is evident in preschool years. Slow postnatal head growth is correlated with poorer language outcome. Neonatal complications were negatively correlated with language comprehension and total expressive language. Copyright © 2011 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.
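
    The relative head growth ratio given in brackets in this abstract can be written as a small function. The units (centimetres and kilograms), the function name, and the example measurements are assumptions for illustration; the abstract gives only the formula.

```python
def relative_head_growth(head_now, head_at_birth, weight_now, weight_at_birth):
    """(Actual head circumference - head circumference at birth)
    divided by (body weight - birth weight), as stated in the abstract."""
    weight_gain = weight_now - weight_at_birth
    if weight_gain == 0:
        raise ValueError("body weight equals birth weight; ratio is undefined")
    return (head_now - head_at_birth) / weight_gain

# Hypothetical measurements in cm and kg (not data from the study).
ratio = relative_head_growth(51.0, 33.0, 20.0, 2.5)
print(ratio)
```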

  14. Interview with Antony John Kunnan on Language Assessment

    ERIC Educational Resources Information Center

    Nimehchisalem, Vahid

    2015-01-01

    Antony John Kunnan is a language assessment specialist. His research interests are fairness of tests and testing practice, assessment literacy, research methods and statistics, ethics and standards, and language assessment policy. His most recent publications include a four-volume edited collection of 140 chapters titled "The Companion to…

  15. Statistical Word Learning in Children with Autism Spectrum Disorder and Specific Language Impairment

    ERIC Educational Resources Information Center

    Haebig, Eileen; Saffran, Jenny R.; Ellis Weismer, Susan

    2017-01-01

    Background: Word learning is an important component of language development that influences child outcomes across multiple domains. Despite the importance of word knowledge, word-learning mechanisms are poorly understood in children with specific language impairment (SLI) and children with autism spectrum disorder (ASD). This study examined…

  16. Students' Motivation toward Computer-Based Language Learning

    ERIC Educational Resources Information Center

    Genc, Gulten; Aydin, Selami

    2011-01-01

    The present article examined some factors affecting the motivation level of the preparatory school students in using a web-based computer-assisted language-learning course. The sample group of the study consisted of 126 English-as-a-foreign-language learners at a preparatory school of a state university. After performing statistical analyses…

  17. An Investigation of School Counselor Self-Efficacy with English Language Learners

    ERIC Educational Resources Information Center

    Johnson, Leonissa V.; Ziomek-Daigle, Jolie; Haskins, Natoya Hill; Paisley, Pamela O.

    2017-01-01

    This exploratory quantitative study described school counselors' self-efficacy with English language learners. Findings suggest that school counselors with exposure to and experiences with English language learners have higher levels of self-efficacy. Statistically significant and practical differences in self-efficacy were apparent by race, U.S.…

  18. Uterine Cancer Statistics

    MedlinePlus

    ... the most commonly diagnosed gynecologic cancer. U.S. Cancer Statistics Data Visualizations Tool The Data Visualizations tool makes ...

  19. HPV-Associated Cancers Statistics

    MedlinePlus

  20. Targeted Help for Spoken Dialogue Systems: Intelligent Feedback Improves Naive Users' Performance

    NASA Technical Reports Server (NTRS)

    Hockey, Beth Ann; Lemon, Oliver; Campana, Ellen; Hiatt, Laura; Aist, Gregory; Hieronymous, Jim; Gruenstein, Alexander; Dowding, John

    2003-01-01

    We present experimental evidence that providing naive users of a spoken dialogue system with immediate help messages related to their out-of-coverage utterances improves their success in using the system. A grammar-based recognizer and a Statistical Language Model (SLM) recognizer are run simultaneously. If the grammar-based recognizer succeeds, the less accurate SLM recognizer hypothesis is not used. When the grammar-based recognizer fails and the SLM recognizer produces a recognition hypothesis, this result is used by the Targeted Help agent to give the user feedback on what was recognized, a diagnosis of what was problematic about the utterance, and a related in-coverage example. The in-coverage example is intended to encourage alignment between user inputs and the language model of the system. We report on controlled experiments on a spoken dialogue system for command and control of a simulated robotic helicopter.
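
    The fallback between the two recognizers described in this abstract can be sketched as a small routing function. The function name, the action labels, and the use of None to signal a failed recognition are hypothetical; the abstract does not say what happens when both recognizers fail, so that branch is an assumption.

```python
from typing import Optional, Tuple

def route_utterance(grammar_result: Optional[str],
                    slm_hypothesis: Optional[str]) -> Tuple[str, Optional[str]]:
    """Decide which recognizer output to act on.

    None stands for "this recognizer produced no result" (an assumption).
    """
    if grammar_result is not None:
        # The grammar-based recognizer succeeded, so the SLM hypothesis is ignored.
        return ("execute", grammar_result)
    if slm_hypothesis is not None:
        # Out-of-coverage utterance: hand the SLM hypothesis to the Targeted Help agent.
        return ("targeted_help", slm_hypothesis)
    # Neither recognizer produced anything; this branch is not covered by the abstract.
    return ("no_recognition", None)

print(route_utterance(None, "fly over to the tower"))
```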

  1. Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

    PubMed

    Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald

    2006-11-01

    We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
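
    The two-step decision rule described in this abstract can be sketched once the three nested models have been fitted elsewhere. The sketch takes the fitted quantities as inputs and does not fit any regression itself. The function name, the 0.05 significance level, and the 10% change threshold are assumptions: the abstract speaks only of statistical significance and a "marked effect" on the ability coefficient.

```python
def classify_dif(interaction_p, beta_ability_without_group, beta_ability_with_group,
                 alpha=0.05, change_threshold=0.10):
    """Classify an item as showing nonuniform DIF, uniform DIF, or neither.

    interaction_p: p-value of the ability-by-group interaction term.
    beta_ability_*: ability coefficient in models without and with the group term.
    alpha and change_threshold are illustrative assumptions.
    """
    # Step 1: a significant interaction is consistent with nonuniform DIF.
    if interaction_p < alpha:
        return "nonuniform"
    # Step 2: a marked change in the ability coefficient indicates uniform DIF.
    relative_change = (abs(beta_ability_with_group - beta_ability_without_group)
                       / abs(beta_ability_without_group))
    if relative_change > change_threshold:
        return "uniform"
    return "none"

# Hypothetical fitted values for one item (not results from the study).
print(classify_dif(0.40, 1.00, 1.20))
```

    The abstract's closing remark applies directly here: changing alpha or change_threshold changes which items are flagged, independently of the regression technique.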

  2. An accurate behavioral model for single-photon avalanche diode statistical performance simulation

    NASA Astrophysics Data System (ADS)

    Xu, Yue; Zhao, Tingchen; Li, Ding

    2018-01-01

    An accurate behavioral model is presented to simulate important statistical performance of single-photon avalanche diodes (SPADs), such as dark count and after-pulsing noise. The derived simulation model takes into account all important generation mechanisms of the two kinds of noise. For the first time, thermal agitation, trap-assisted tunneling and band-to-band tunneling mechanisms are simultaneously incorporated in the simulation model to evaluate dark count behavior of SPADs fabricated in deep sub-micron CMOS technology. Meanwhile, a complete carrier trapping and de-trapping process is considered in the after-pulsing model and a simple analytical expression is derived to estimate after-pulsing probability. In particular, the key model parameters of avalanche triggering probability and electric field dependence of excess bias voltage are extracted from Geiger-mode TCAD simulation and this behavioral simulation model doesn't include any empirical parameters. The developed SPAD model is implemented in the Verilog-A behavioral hardware description language and successfully operated on the commercial Cadence Spectre simulator, showing good universality and compatibility. The model simulation results are in good accordance with the test data, validating high simulation accuracy.

  3. Reducing the Language Content in ToM Tests: A Developmental Scale

    ERIC Educational Resources Information Center

    Burnel, Morgane; Perrone-Bertolotti, Marcela; Reboul, Anne; Baciu, Monica; Durrleman, Stephanie

    2018-01-01

    The goal of the current study was to statistically evaluate the reliable scalability of a set of tasks designed to assess Theory of Mind (ToM) without language as a confounding variable. This tool might be useful to study ToM in populations where language is impaired or to study links between language and ToM. Low verbal versions of the ToM tasks…

  4. A Statistical Model for Multilingual Entity Detection and Tracking

    DTIC Science & Technology

    2004-01-01

    ...Automatic Content Extraction (ACE) evaluation achieved top-tier results in all three evaluation languages. 1 Introduction Detecting entities, whether named...of combining the detected mentions into groups of references to the same object. The work presented here is motivated by the ACE evaluation...Entropy (MaxEnt henceforth) (Berger et al., 1996) and Robust Risk Minimization (RRM henceforth) For a description of the ACE program see http

  5. CSMP Mathematics for the Intermediate Grades Part IV, Teacher's Guide. The Languages of Strings and Arrows. Geometry and Measurement. Probability and Statistics.

    ERIC Educational Resources Information Center

    CEMREL, Inc., St. Ann, MO.

    This guide represents the final experimental version of a pilot project which was conducted in the United States between 1973 and 1976. The ideas and the manner of presentation are based on the works of Georges and Frederique Papy. They are recognized for having introduced colored arrow drawings ("papygrams") and models of our numeration…

  6. CSMP Mathematics for the Intermediate Grades Part II, Teacher's Guide. The Languages of Strings and Arrows. Geometry and Measurement. Probability and Statistics. Experimental Version.

    ERIC Educational Resources Information Center

    CEMREL, Inc., St. Ann, MO.

    This guide represents the final experimental version of a pilot project which was conducted in the United States between 1973 and 1976. The ideas and the manner of presentation are based on the works of Georges and Frederique Papy. They are recognized for having introduced colored arrow drawings ("papygrams") and models of our numeration…

  7. CSMP Mathematics for the Intermediate Grades Part III, Teacher's Guide. The Languages of Strings and Arrows. Geometry and Measurement. Probability and Statistics. Experimental Version.

    ERIC Educational Resources Information Center

    CEMREL, Inc., St. Ann, MO.

    This guide represents the final experimental version of a pilot project which was conducted in the United States between 1973 and 1976. The ideas and the manner of presentation are based on the works of Georges and Frederique Papy. They are recognized for having introduced colored arrow drawings ("papygrams") and models of our numeration…

  8. Computer architecture evaluation for structural dynamics computations: Project summary

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1989-01-01

    The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies.

  9. A Cross-Lingual Mobile Medical Communication System Prototype for Foreigners and Subjects with Speech, Hearing, and Mental Disabilities Based on Pictograms

    PubMed Central

    Wołk, Agnieszka; Glinkowski, Wojciech

    2017-01-01

    People with speech, hearing, or mental impairment require special communication assistance, especially for medical purposes. Automatic solutions for speech recognition and voice synthesis from text are poor fits for communication in the medical domain because they are dependent on error-prone statistical models. Systems dependent on manual text input are insufficient. Recently introduced systems for automatic sign language recognition are dependent on statistical models as well as on image and gesture quality. Such systems remain in early development and are based mostly on minimal hand gestures unsuitable for medical purposes. Furthermore, solutions that rely on the Internet cannot be used after disasters that require humanitarian aid. We propose a high-speed, intuitive, Internet-free, voice-free, and text-free tool suited for emergency medical communication. Our solution is a pictogram-based application that provides easy communication for individuals who have speech or hearing impairment or mental health issues that impair communication, as well as foreigners who do not speak the local language. It provides support and clarification in communication by using intuitive icons and interactive symbols that are easy to use on a mobile device. Such pictogram-based communication can be quite effective and ultimately make people's lives happier, easier, and safer. PMID:29230254

  10. A Cross-Lingual Mobile Medical Communication System Prototype for Foreigners and Subjects with Speech, Hearing, and Mental Disabilities Based on Pictograms.

    PubMed

    Wołk, Krzysztof; Wołk, Agnieszka; Glinkowski, Wojciech

    2017-01-01

    People with speech, hearing, or mental impairment require special communication assistance, especially for medical purposes. Automatic solutions for speech recognition and voice synthesis from text are poor fits for communication in the medical domain because they are dependent on error-prone statistical models. Systems dependent on manual text input are insufficient. Recently introduced systems for automatic sign language recognition are dependent on statistical models as well as on image and gesture quality. Such systems remain in early development and are based mostly on minimal hand gestures unsuitable for medical purposes. Furthermore, solutions that rely on the Internet cannot be used after disasters that require humanitarian aid. We propose a high-speed, intuitive, Internet-free, voice-free, and text-free tool suited for emergency medical communication. Our solution is a pictogram-based application that provides easy communication for individuals who have speech or hearing impairment or mental health issues that impair communication, as well as foreigners who do not speak the local language. It provides support and clarification in communication by using intuitive icons and interactive symbols that are easy to use on a mobile device. Such pictogram-based communication can be quite effective and ultimately make people's lives happier, easier, and safer.

  11. Text mining by Tsallis entropy

    NASA Astrophysics Data System (ADS)

    Jamaati, Maryam; Mehri, Ali

    2018-01-01

    Long-range correlations between the elements of natural languages enable them to convey very complex information. The complex structure of human language, as a manifestation of natural languages, motivates us to apply nonextensive statistical mechanics in text mining. Tsallis entropy appropriately ranks the terms' relevance to document subject, taking advantage of their spatial correlation length. We apply this statistical concept as a new powerful word ranking metric in order to extract keywords of a single document. We carry out an experimental evaluation, which shows the capability of the presented method in keyword extraction. We find that Tsallis entropy has reliable word ranking performance, at the same level as the best previous ranking methods.

  12. Aging and the Statistical Learning of Grammatical Form Classes

    PubMed Central

    Schwab, Jessica F.; Schuler, Kathryn D.; Stillman, Chelsea M.; Newport, Elissa L.; Howard, James H.; Howard, Darlene V.

    2016-01-01

    Language learners must place unfamiliar words into categories, often with few explicit indicators about when and how that word can be used grammatically. Reeder, Newport, and Aslin (2013) showed that college students can learn grammatical form classes from an artificial language by relying solely on distributional information (i.e., contextual cues in the input). Here, two experiments revealed that healthy older adults also show such statistical learning, though they are poorer than young at distinguishing grammatical from ungrammatical strings. This finding expands knowledge of which aspects of learning vary with aging, with potential implications for second language learning in late adulthood. PMID:27294711

  13. Network Controllability in the Inferior Frontal Gyrus Relates to Controlled Language Variability and Susceptibility to TMS.

    PubMed

    Medaglia, John D; Harvey, Denise Y; White, Nicole; Kelkar, Apoorva; Zimmerman, Jared; Bassett, Danielle S; Hamilton, Roy H

    2018-06-08

    In language production, humans are confronted with considerable word selection demands. Often, we must select a word from among similar, acceptable, and competing alternative words in order to construct a sentence that conveys an intended meaning. In recent years, the left inferior frontal gyrus (LIFG) has been identified as critical to this ability. Despite a recent emphasis on network approaches to understanding language, how the LIFG interacts with the brain's complex networks to facilitate controlled language performance remains unknown. Here, we take a novel approach to understand word selection as a network control process in the brain. Using an anatomical brain network derived from high-resolution diffusion spectrum imaging (DSI), we computed network controllability underlying the site of transcranial magnetic stimulation in the LIFG between administrations of language tasks that vary in response (cognitive control) demands: open-response (word generation) vs. closed-response (number naming) tasks. We find that a statistic that quantifies the LIFG's theoretically predicted control of communication across modules in the human connectome explains TMS-induced changes in open-response language task performance only. Moreover, we find that a statistic that quantifies the LIFG's theoretically predicted control of difficult-to-reach states explains vulnerability to TMS in the closed-ended (but not open-ended) response task. These findings establish a link between network controllability, cognitive function, and TMS effects. SIGNIFICANCE STATEMENT This work illustrates that network control statistics applied to anatomical connectivity data demonstrate relationships with cognitive variability during controlled language tasks and TMS effects. Copyright © 2018 the authors.

  14. Sampling over Nonuniform Distributions: A Neural Efficiency Account of the Primacy Effect in Statistical Learning.

    PubMed

    Karuza, Elisabeth A; Li, Ping; Weiss, Daniel J; Bulgarelli, Federica; Zinszer, Benjamin D; Aslin, Richard N

    2016-10-01

    Successful knowledge acquisition requires a cognitive system that is both sensitive to statistical information and able to distinguish among multiple structures (i.e., to detect pattern shifts and form distinct representations). Extensive behavioral evidence has highlighted the importance of cues to structural change, demonstrating how, without them, learners fail to detect pattern shifts and are biased in favor of early experience. Here, we seek a neural account of the mechanism underpinning this primacy effect in learning. During fMRI scanning, adult participants were presented with two artificial languages: a familiar language (L1) on which they had been pretrained followed by a novel language (L2). The languages were composed of the same syllable inventory organized according to unique statistical structures. In the absence of cues to the transition between languages, posttest familiarity judgments revealed that learners on average more accurately segmented words from the familiar language compared with the novel one. Univariate activation and functional connectivity analyses showed that participants with the strongest learning of L1 had decreased recruitment of fronto-subcortical and posterior parietal regions, in addition to a dissociation between downstream regions and early auditory cortex. Participants with a strong new language learning capacity (i.e., higher L2 scores) showed the opposite trend. Thus, we suggest that a bias toward neural efficiency, particularly as manifested by decreased sampling from the environment, accounts for the primacy effect in learning. Potential implications of this hypothesis are discussed, including the possibility that "inefficient" learning systems may be more sensitive to structural changes in a dynamic environment.

  15. Mining heart disease risk factors in clinical text with named entity recognition and distributional semantic models.

    PubMed

    Urbain, Jay

    2015-12-01

    We present the design and analyze the performance of a multi-stage natural language processing system employing named entity recognition, Bayesian statistics, and rule logic to identify and characterize heart disease risk factor events in diabetic patients over time. The system was originally developed for the 2014 i2b2 Challenges in Natural Language in Clinical Data. The system's strengths included a high level of accuracy for identifying named entities associated with heart disease risk factor events. The system's primary weakness was due to inaccuracies when characterizing the attributes of some events, for example, determining the relative time of an event with respect to the record date, whether an event is attributable to the patient's history or the patient's family history, and differentiating between current and prior smoking status. We believe these inaccuracies were due in large part to the lack of an effective approach for integrating context into our event detection model. To address these inaccuracies, we explore the addition of a distributional semantic model for characterizing contextual evidence of heart disease risk factor events. Using this semantic model, we raise our initial 2014 i2b2 Challenges in Natural Language of Clinical data F1 score of 0.838 to 0.890 and increase precision by 10.3% without use of any lexicons that might bias our results. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. The critical period hypothesis in second language acquisition: a statistical critique and a reanalysis.

    PubMed

    Vanhove, Jan

    2013-01-01

    In second language acquisition research, the critical period hypothesis (CPH) holds that the function between learners' age and their susceptibility to second language input is non-linear. This paper revisits the indistinctness found in the literature with regard to this hypothesis's scope and predictions. Even when its scope is clearly delineated and its predictions are spelt out, however, empirical studies, with few exceptions, use analytical (statistical) tools that are irrelevant with respect to the predictions made. This paper discusses statistical fallacies common in CPH research and illustrates an alternative analytical method (piecewise regression) by means of a reanalysis of two datasets from a 2010 paper purporting to have found cross-linguistic evidence in favour of the CPH. This reanalysis reveals that the specific age patterns predicted by the CPH are not cross-linguistically robust. Applying the principle of parsimony, it is concluded that age patterns in second language acquisition are not governed by a critical period. To conclude, this paper highlights the role of confirmation bias in the scientific enterprise and appeals to second language acquisition researchers to reanalyse their old datasets using the methods discussed in this paper. The data and R commands that were used for the reanalysis are provided as supplementary materials.

  17. Establishing a learning foundation in a dynamically changing world: Insights from artificial language work

    NASA Astrophysics Data System (ADS)

    Gonzales, Kalim

    It is argued that infants build a foundation for learning about the world through their incidental acquisition of the spatial and temporal regularities surrounding them. A challenge is that learning occurs across multiple contexts whose statistics can greatly differ. Two artificial language studies with 12-month-olds demonstrate that infants come prepared to parse statistics across contexts using the temporal and perceptual features that distinguish one context from another. These results suggest that infants can organize their statistical input with a wider range of features than typically considered. Possible attention, decision making, and memory mechanisms are discussed.

  18. A stylistic classification of Russian-language texts based on the random walk model

    NASA Astrophysics Data System (ADS)

    Kramarenko, A. A.; Nekrasov, K. A.; Filimonov, V. V.; Zhivoderov, A. A.; Amieva, A. A.

    2017-09-01

    A formal approach to text analysis is suggested that is based on the random walk model. The frequencies and reciprocal positions of the vowel letters are matched up by a process of quasi-particle migration. Statistically significant difference in the migration parameters for the texts of different functional styles is found. Thus, a possibility of classification of texts using the suggested method is demonstrated. Five groups of the texts are singled out that can be distinguished from one another by the parameters of the quasi-particle migration process.

  19. Universal Entropy of Word Ordering Across Linguistic Families

    PubMed Central

    Montemurro, Marcelo A.; Zanette, Damián H.

    2011-01-01

    Background: The language faculty is probably the most distinctive feature of our species, and endows us with a unique ability to exchange highly structured information. In written language, information is encoded by the concatenation of basic symbols under grammatical and semantic constraints. As is also the case in other natural information carriers, the resulting symbolic sequences show a delicate balance between order and disorder. That balance is determined by the interplay between the diversity of symbols and by their specific ordering in the sequences. Here we used entropy to quantify the contribution of different organizational levels to the overall statistical structure of language. Methodology/Principal Findings: We computed a relative entropy measure to quantify the degree of ordering in word sequences from languages belonging to several linguistic families. While a direct estimation of the overall entropy of language yielded values that varied for the different families considered, the relative entropy quantifying word ordering presented an almost constant value for all those families. Conclusions/Significance: Our results indicate that despite the differences in the structure and vocabulary of the languages analyzed, the impact of word ordering in the structure of language is a statistical linguistic universal. PMID:21603637

  20. Acquiring and processing verb argument structure: distributional learning in a miniature language.

    PubMed

    Wonnacott, Elizabeth; Newport, Elissa L; Tanenhaus, Michael K

    2008-05-01

    Adult knowledge of a language involves correctly balancing lexically-based and more language-general patterns. For example, verb argument structures may sometimes readily generalize to new verbs, yet with particular verbs may resist generalization. From the perspective of acquisition, this creates significant learnability problems, with some researchers claiming a crucial role for verb semantics in the determination of when generalization may and may not occur. Similarly, there has been debate regarding how verb-specific and more generalized constraints interact in sentence processing and on the role of semantics in this process. The current work explores these issues using artificial language learning. In three experiments using languages without semantic cues to verb distribution, we demonstrate that learners can acquire both verb-specific and verb-general patterns, based on distributional information in the linguistic input regarding each of the verbs as well as across the language as a whole. As with natural languages, these factors are shown to affect production, judgments and real-time processing. We demonstrate that learners apply a rational procedure in determining their usage of these different input statistics and conclude by suggesting that a Bayesian perspective on statistical learning may be an appropriate framework for capturing our findings.

  1. Availability of Pre-Admission Information to Prospective Graduate Students in Speech-Language Pathology

    ERIC Educational Resources Information Center

    Tekieli Koay, Mary Ellen; Lass, Norman J.; Parrill, Madaline; Naeser, Danielle; Babin, Kelly; Bayer, Olivia; Cook, Megan; Elmore, Madeline; Frye, Rachel; Kerwood, Samantha

    2016-01-01

    An extensive Internet search was conducted to obtain pre-admission information and acceptance statistics from 260 graduate programmes in speech-language pathology accredited by the American Speech-Language-Hearing Association (ASHA) in the United States. ASHA is the national professional, scientific and credentialing association for members and…

  2. Language, Ethnicity and Education: Case Studies on Immigrant Minority Groups and Immigrant Minority Languages. Multilingual Matters 111.

    ERIC Educational Resources Information Center

    Broeder, Peter; Extra, Guus

    Immigrant minority groups and immigrant minority languages in Europe are viewed from three perspectives (demographic, sociolinguistic, and educational) through case studies. The first part, using a demographic approach, includes research on immigrant minority groups in population statistics of both European Union and English-dominant countries…

  3. Assessing Language Dominance with Functional MRI: The Role of Control Tasks and Statistical Analysis

    ERIC Educational Resources Information Center

    Dodoo-Schittko, Frank; Rosengarth, Katharina; Doenitz, Christian; Greenlee, Mark W.

    2012-01-01

    There is a discrepancy between the brain regions revealed by functional neuroimaging techniques and those brain regions where a loss of function, either by lesion or by electrocortical stimulation, induces language disorders. To differentiate between essential and non-essential language-related processes, we investigated the effects of linguistic…

  4. A Study of Public Awareness of Speech-Language Pathology in Amman

    ERIC Educational Resources Information Center

    Mahmoud, Hana; Aljazi, Aya; Alkhamra, Rana

    2014-01-01

    Background: Statistical levels of awareness and knowledge of speech-language pathology and of communication disorders are currently unknown among the public in the Middle East, including Jordan. Aims: This study reports the results of an investigation of public awareness and knowledge of speech-language pathology in Amman-Jordan. It also…

  5. Why Segmentation Matters: Experience-Driven Segmentation Errors Impair "Morpheme" Learning

    ERIC Educational Resources Information Center

    Finn, Amy S.; Hudson Kam, Carla L.

    2015-01-01

    We ask whether an adult learner's knowledge of their native language impedes statistical learning in a new language beyond just word segmentation (as previously shown). In particular, we examine the impact of native-language word-form phonotactics on learners' ability to segment words into their component morphemes and learn phonologically…

  6. Health and nutrient content claims in food advertisements on Hispanic and mainstream prime-time television.

    PubMed

    Abbatangelo-Gray, Jodie; Byrd-Bredbenner, Carol; Austin, S Bryn

    2008-01-01

    Characterize frequency and type of health and nutrient content claims in prime-time weeknight Spanish- and English-language television advertisements from programs shown in 2003 with a high viewership by women aged 18 to 35 years. Comparative content analysis design was used to analyze 95 hours of Spanish-language and 72 hours of English-language television programs (netting 269 and 543 food ads, respectively). A content analysis instrument was used to gather information on explicit health and nutrient content claims: nutrition information only; diet-disease; structure-function; processed food health outcome; good for one's health; health care provider endorsement. Chi-square statistics detected statistically significant differences between the groups. Compared to English-language television, Spanish-language television aired significantly more food advertisements containing nutrition information and health, processed food/health, and good for one's health claims. Samples did not differ in the rate of diet/disease, structure/function, or health care provider endorsement claims. Findings indicate that Spanish-language television advertisements provide viewers with significantly more nutrition information than English-language network advertisements. Potential links between the deteriorating health status of Hispanics acculturating into US mainstream culture and their exposure to the less nutrition-based messaging found in English-language television should be explored.

  7. What predicts successful literacy acquisition in a second language?

    PubMed Central

    Frost, Ram; Siegelman, Noam; Narkiss, Alona; Afek, Liron

    2013-01-01

    We examined whether success (or failure) in assimilating the structure of a second language could be predicted by general statistical learning abilities that are non-linguistic in nature. We employed a visual statistical learning (VSL) task, monitoring our participants’ implicit learning of the transitional probabilities of visual shapes. A pretest revealed that performance in the VSL task is not correlated with abilities related to a general G factor or working memory. We found that native speakers of English who picked up the implicit statistical structure embedded in the continuous stream of shapes, on average, better assimilated the Semitic structure of Hebrew words. Our findings thus suggest that languages and their writing systems are characterized by idiosyncratic correlations of form and meaning, and these are picked up in the process of literacy acquisition, as they are picked up in any other type of learning, for the purpose of making sense of the environment. PMID:23698615

  8. Cross-Language Measurement Equivalence of the Center for Epidemiologic Studies Depression (CES-D) Scale in Systemic Sclerosis: A Comparison of Canadian and Dutch Patients

    PubMed Central

    Kwakkenbos, Linda; Arthurs, Erin; van den Hoogen, Frank H. J.; Hudson, Marie; van Lankveld, Wim G. J. M.; Baron, Murray; van den Ende, Cornelia H. M.; Thombs, Brett D.

    2013-01-01

    Objectives: Increasingly, medical research involves patients who complete outcomes in different languages. This occurs in countries with more than one common language, such as Canada (French/English) or the United States (Spanish/English), as well as in international multi-centre collaborations, which are utilized frequently in rare diseases such as systemic sclerosis (SSc). In order to pool or compare outcomes, instruments should be measurement equivalent (invariant) across cultural or linguistic groups. This study provides an example of how to assess cross-language measurement equivalence by comparing the Center for Epidemiologic Studies Depression (CES-D) scale between English-speaking Canadian and Dutch SSc patients. Methods: The CES-D was completed by 922 English-speaking Canadian and 213 Dutch SSc patients. Confirmatory factor analysis (CFA) was used to assess the factor structure in both samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess the amount of differential item functioning (DIF). Results: A two-factor model (positive and negative affect) showed excellent fit in both samples. Statistically significant, but small-magnitude, DIF was found for 3 of 20 items on the CES-D. The English-speaking Canadian sample endorsed more feeling-related symptoms, whereas the Dutch sample endorsed more somatic/retarded activity symptoms. The overall estimate in depression scores between English and Dutch was not influenced substantively by DIF. Conclusions: CES-D scores from English-speaking Canadian and Dutch SSc patients can be compared and pooled without concern that measurement differences may substantively influence results. The importance of assessing cross-language measurement equivalence in rheumatology studies prior to pooling outcomes obtained in different languages should be emphasized. PMID:23326538

  9. On Religion and Language Evolutions Seen Through Mathematical and Agent Based Models

    NASA Astrophysics Data System (ADS)

    Ausloos, M.

    Religions and languages are social variables, like age, sex, wealth or political opinions, to be studied like any other organizational parameter. In fact, religiosity is one of the most important sociological aspects of populations. Languages are also obvious characteristics of the human species. Religions and languages appear, though they also disappear. All religions and languages evolve and survive when they adapt to the society developments. On the other hand, the number of adherents of a given religion, or the number of persons speaking a language, is not fixed in time, nor in space. Several questions can be raised. E.g. from a "macroscopic" point of view: How many religions/languages exist at a given time? What is their distribution? What is their life time? How do they evolve? From a "microscopic" view point: can one invent agent based models to describe macroscopic aspects? Do simple evolution equations exist? How complicated must a model be? These aspects are considered in the present note. Basic evolution equations are outlined and critically, though briefly, discussed. Similarities and differences between religions and languages are summarized. Cases can be illustrated with historical facts and data. It is stressed that characteristic time scales are different. It is emphasized that "external fields" are historically very relevant in the case of religions, rendering the study more "interesting" within a mechanistic approach based on parity and symmetry of clusters concepts. Yet the modern description of human societies through networks in reported simulations is still lacking some mandatory ingredients, i.e. the non scalar nature of the nodes, and the non binary aspects of nodes and links, though for the latter this is already often taken into account, including directions. From an analytical point of view one can consider a population independently of the others. It is intuitively accepted, but also found from the statistical analysis of the frequency distribution that an attachment process is the primary cause of the distribution evolution in the number of adepts: usually the initial religion/language is that of the mother. However later on, changes can occur either due to "heterogeneous agent interaction" processes or due to "external field" constraints, or both. In so doing one has to consider competition-like processes, in a general environment with different rates of reproduction. More general equations are thus proposed for future work.

  10. Native-likeness in second language lexical categorization reflects individual language history and linguistic community norms.

    PubMed

    Zinszer, Benjamin D; Malt, Barbara C; Ameel, Eef; Li, Ping

    2014-01-01

    Second language learners face a dual challenge in vocabulary learning: First, they must learn new names for the 100s of common objects that they encounter every day. Second, after some time, they discover that these names do not generalize according to the same rules used in their first language. Lexical categories frequently differ between languages (Malt et al., 1999), and successful language learning requires that bilinguals learn not just new words but new patterns for labeling objects. In the present study, Chinese learners of English with varying language histories and resident in two different language settings (Beijing, China and State College, PA, USA) named 67 photographs of common serving dishes (e.g., cups, plates, and bowls) in both Chinese and English. Participants' response patterns were quantified in terms of similarity to the responses of functionally monolingual native speakers of Chinese and English and showed the cross-language convergence previously observed in simultaneous bilinguals (Ameel et al., 2005). For English, bilinguals' names for each individual stimulus were also compared to the dominant name generated by the native speakers for the object. Using two statistical models, we disentangle the effects of several highly interactive variables from bilinguals' language histories and the naming norms of the native speaker community to predict inter-personal and inter-item variation in L2 (English) native-likeness. We find only a modest age of earliest exposure effect on L2 category native-likeness, but importantly, we find that classroom instruction in L2 negatively impacts L2 category native-likeness, even after significant immersion experience. We also identify a significant role of both L1 and L2 norms in bilinguals' L2 picture naming responses.

  11. Native-likeness in second language lexical categorization reflects individual language history and linguistic community norms

    PubMed Central

    Zinszer, Benjamin D.; Malt, Barbara C.; Ameel, Eef; Li, Ping

    2014-01-01

    Second language learners face a dual challenge in vocabulary learning: First, they must learn new names for the 100s of common objects that they encounter every day. Second, after some time, they discover that these names do not generalize according to the same rules used in their first language. Lexical categories frequently differ between languages (Malt et al., 1999), and successful language learning requires that bilinguals learn not just new words but new patterns for labeling objects. In the present study, Chinese learners of English with varying language histories and resident in two different language settings (Beijing, China and State College, PA, USA) named 67 photographs of common serving dishes (e.g., cups, plates, and bowls) in both Chinese and English. Participants’ response patterns were quantified in terms of similarity to the responses of functionally monolingual native speakers of Chinese and English and showed the cross-language convergence previously observed in simultaneous bilinguals (Ameel et al., 2005). For English, bilinguals’ names for each individual stimulus were also compared to the dominant name generated by the native speakers for the object. Using two statistical models, we disentangle the effects of several highly interactive variables from bilinguals’ language histories and the naming norms of the native speaker community to predict inter-personal and inter-item variation in L2 (English) native-likeness. We find only a modest age of earliest exposure effect on L2 category native-likeness, but importantly, we find that classroom instruction in L2 negatively impacts L2 category native-likeness, even after significant immersion experience. We also identify a significant role of both L1 and L2 norms in bilinguals’ L2 picture naming responses. PMID:25386149

  12. Perceptual context and individual differences in the language proficiency of preschool children.

    PubMed

    Banai, Karen; Yifat, Rachel

    2016-02-01

    Although the contribution of perceptual processes to language skills during infancy is well recognized, the role of perception in linguistic processing beyond infancy is not well understood. In the experiments reported here, we asked whether manipulating the perceptual context in which stimuli are presented across trials influences how preschool children perform visual (shape-size identification; Experiment 1) and auditory (syllable identification; Experiment 2) tasks. Another goal was to determine whether the sensitivity to perceptual context can explain part of the variance in oral language skills in typically developing preschool children. Perceptual context was manipulated by changing the relative frequency with which target visual (Experiment 1) and auditory (Experiment 2) stimuli were presented in arrays of fixed size, and identification of the target stimuli was tested. Oral language skills were assessed using vocabulary, word definition, and phonological awareness tasks. Changes in perceptual context influenced the performance of the majority of children on both identification tasks. Sensitivity to perceptual context accounted for 7% to 15% of the variance in language scores. We suggest that context effects are an outcome of a statistical learning process. Therefore, the current findings demonstrate that statistical learning can facilitate both visual and auditory identification processes in preschool children. Furthermore, consistent with previous findings in infants and in older children and adults, individual differences in statistical learning were found to be associated with individual differences in language skills of preschool children. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. A tutorial on aphasia test development in any language: Key substantive and psychometric considerations

    PubMed Central

    Ivanova, Maria V.; Hallowell, Brooke

    2013-01-01

    Background: There are a limited number of aphasia language tests in the majority of the world's commonly spoken languages. Furthermore, few aphasia tests in languages other than English have been standardized and normed, and few have supportive psychometric data pertaining to reliability and validity. The lack of standardized assessment tools across many of the world's languages poses serious challenges to clinical practice and research in aphasia. Aims: The current review addresses this lack of assessment tools by providing conceptual and statistical guidance for the development of aphasia assessment tools and establishment of their psychometric properties. Main Contribution: A list of aphasia tests in the 20 most widely spoken languages is included. The pitfalls of translating an existing test into a new language versus creating a new test are outlined. Factors to consider in determining test content are discussed. Further, a description of test items corresponding to different language functions is provided, with special emphasis on implementing important controls in test design. Next, a broad review of principal psychometric properties relevant to aphasia tests is presented, with specific statistical guidance for establishing psychometric properties of standardized assessment tools. Conclusions: This article may be used to help guide future work on developing, standardizing and validating aphasia language tests. The considerations discussed are also applicable to the development of standardized tests of other cognitive functions. PMID:23976813

  14. Detecting regular sound changes in linguistics as events of concerted evolution

    DOE PAGES

    Hruschka, Daniel J.; Branford, Simon; Smith, Eric D.; ...

    2014-12-18

    Background: Concerted evolution is normally used to describe parallel changes at different sites in a genome, but it is also observed in languages where a specific phoneme changes to the same other phoneme in many words in the lexicon—a phenomenon known as regular sound change. We develop a general statistical model that can detect concerted changes in aligned sequence data and apply it to study regular sound changes in the Turkic language family. Results: Linguistic evolution, unlike the genetic substitutional process, is dominated by events of concerted evolutionary change. Our model identified more than 70 historical events of regular sound change that occurred throughout the evolution of the Turkic language family, while simultaneously inferring a dated phylogenetic tree. Including regular sound changes yielded an approximately 4-fold improvement in the characterization of linguistic change over a simpler model of sporadic change, improved phylogenetic inference, and returned more reliable and plausible dates for events on the phylogenies. The historical timings of the concerted changes closely follow a Poisson process model, and the sound transition networks derived from our model mirror linguistic expectations. Conclusions: We demonstrate that a model with no prior knowledge of complex concerted or regular changes can nevertheless infer the historical timings and genealogical placements of events of concerted change from the signals left in contemporary data. Our model can be applied wherever discrete elements—such as genes, words, cultural trends, technologies, or morphological traits—can change in parallel within an organism or other evolving group.

  15. Detecting regular sound changes in linguistics as events of concerted evolution.

    PubMed

    Hruschka, Daniel J; Branford, Simon; Smith, Eric D; Wilkins, Jon; Meade, Andrew; Pagel, Mark; Bhattacharya, Tanmoy

    2015-01-05

    Concerted evolution is normally used to describe parallel changes at different sites in a genome, but it is also observed in languages where a specific phoneme changes to the same other phoneme in many words in the lexicon—a phenomenon known as regular sound change. We develop a general statistical model that can detect concerted changes in aligned sequence data and apply it to study regular sound changes in the Turkic language family. Linguistic evolution, unlike the genetic substitutional process, is dominated by events of concerted evolutionary change. Our model identified more than 70 historical events of regular sound change that occurred throughout the evolution of the Turkic language family, while simultaneously inferring a dated phylogenetic tree. Including regular sound changes yielded an approximately 4-fold improvement in the characterization of linguistic change over a simpler model of sporadic change, improved phylogenetic inference, and returned more reliable and plausible dates for events on the phylogenies. The historical timings of the concerted changes closely follow a Poisson process model, and the sound transition networks derived from our model mirror linguistic expectations. We demonstrate that a model with no prior knowledge of complex concerted or regular changes can nevertheless infer the historical timings and genealogical placements of events of concerted change from the signals left in contemporary data. Our model can be applied wherever discrete elements—such as genes, words, cultural trends, technologies, or morphological traits—can change in parallel within an organism or other evolving group. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  16. Detecting regular sound changes in linguistics as events of concerted evolution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hruschka, Daniel J.; Branford, Simon; Smith, Eric D.

    Background: Concerted evolution is normally used to describe parallel changes at different sites in a genome, but it is also observed in languages where a specific phoneme changes to the same other phoneme in many words in the lexicon—a phenomenon known as regular sound change. We develop a general statistical model that can detect concerted changes in aligned sequence data and apply it to study regular sound changes in the Turkic language family. Results: Linguistic evolution, unlike the genetic substitutional process, is dominated by events of concerted evolutionary change. Our model identified more than 70 historical events of regular sound change that occurred throughout the evolution of the Turkic language family, while simultaneously inferring a dated phylogenetic tree. Including regular sound changes yielded an approximately 4-fold improvement in the characterization of linguistic change over a simpler model of sporadic change, improved phylogenetic inference, and returned more reliable and plausible dates for events on the phylogenies. The historical timings of the concerted changes closely follow a Poisson process model, and the sound transition networks derived from our model mirror linguistic expectations. Conclusions: We demonstrate that a model with no prior knowledge of complex concerted or regular changes can nevertheless infer the historical timings and genealogical placements of events of concerted change from the signals left in contemporary data. Our model can be applied wherever discrete elements—such as genes, words, cultural trends, technologies, or morphological traits—can change in parallel within an organism or other evolving group.

  17. Generalising Ward's Method for Use with Manhattan Distances.

    PubMed

    Strauss, Trudie; von Maltitz, Michael Johan

    2017-01-01

    The claim that Ward's linkage algorithm in hierarchical clustering is limited to use with Euclidean distances is investigated. In this paper, Ward's clustering algorithm is generalised to use with l1 norm or Manhattan distances. We argue that the generalisation of Ward's linkage method to incorporate Manhattan distances is theoretically sound and provide an example of where this method outperforms the method using Euclidean distances. As an application, we perform statistical analyses on languages using methods normally applied to biology and genetic classification. We aim to quantify differences in character traits between languages and use a statistical language signature based on relative bi-gram (sequence of two letters) frequencies to calculate a distance matrix between 32 Indo-European languages. We then use Ward's method of hierarchical clustering to classify the languages, using the Euclidean distance and the Manhattan distance. Results obtained from using the different distance metrics are compared to show that the Ward's algorithm characteristic of minimising intra-cluster variation and maximising inter-cluster variation is not violated when using the Manhattan metric.
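
    The language signature described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the treatment of word boundaries, and the sample texts are all assumptions made for the example, and the generalised Ward clustering step itself is not reproduced here.

```python
from collections import Counter


def bigram_signature(text):
    """Relative frequencies of adjacent letter pairs (assumed: pairs within words only)."""
    counts = Counter()
    for word in text.lower().split():
        letters = [c for c in word if c.isalpha()]
        counts.update(a + b for a, b in zip(letters, letters[1:]))
    total = sum(counts.values())
    return {bg: n / total for bg, n in counts.items()} if total else {}


def manhattan(sig_a, sig_b):
    """l1 (Manhattan) distance between two bigram signatures."""
    keys = set(sig_a) | set(sig_b)
    return sum(abs(sig_a.get(k, 0.0) - sig_b.get(k, 0.0)) for k in keys)


def distance_matrix(samples):
    """Pairwise Manhattan distances for a dict {language name: sample text}."""
    names = sorted(samples)
    sigs = {name: bigram_signature(samples[name]) for name in names}
    matrix = [[manhattan(sigs[a], sigs[b]) for b in names] for a in names]
    return names, matrix


if __name__ == "__main__":
    # Hypothetical toy samples; a real signature would need far more text.
    samples = {
        "english": "the quick brown fox jumps over the lazy dog",
        "german": "der schnelle braune fuchs springt ueber den faulen hund",
    }
    names, matrix = distance_matrix(samples)
    for name, row in zip(names, matrix):
        print(name, [round(d, 3) for d in row])
```

    The resulting matrix is the kind of input a hierarchical clustering routine would consume; whether a given library's Ward linkage is valid for non-Euclidean distances should be checked against that library's documentation.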

  18. Does metacognitive strategy instruction improve impaired receptive cognitive-communication skills following acquired brain injury?

    PubMed

    Copley, Anna; Smith, Kathleen; Savill, Katelyn; Finch, Emma

    2015-01-01

    To investigate if metacognitive strategy instruction (MSI) improves the receptive language skills of adults with cognitive-communication disorders secondary to acquired brain injury (ABI). An ABA intervention programme was implemented with eight adults with ABI, aged 25-70 years. The Measure of Cognitive-Linguistic Abilities (MCLA) was administered at baseline and following treatment. The treatment employed in this study involved three components: individual goal-based therapy, group remediation therapy using self-instruction and home practice. No receptive language sub-tests of the MCLA reached statistical significance. However, participants' raw score improvements in receptive language sub-tests indicated that MSI may be effective at remediating CCDs following ABI. Preliminary findings indicate that MSI may be effective in improving receptive language skills in adults with CCDs following ABI. Further research involving a more rigorous study, a larger sample size and a more reliable outcome measure is necessary and may provide statistically significant evidence for the effectiveness of MSI for remediating receptive language disorders.

  19. The vocabulary profile of Slovak children with primary language impairment compared to typically developing Slovak children measured by LITMUS-CLT.

    PubMed

    Kapalková, Svetlana; Slančová, Daniela

    2017-01-01

    This study compared a sample of children with primary language impairment (PLI) and typically developing age-matched children using the crosslinguistic lexical tasks (CLT-SK). We also compared the PLI children with typically developing language-matched younger children who were matched on the basis of receptive vocabulary. Overall, statistical testing showed that the vocabulary of the PLI children was significantly different from the vocabulary of the age-matched children, but not statistically different from the younger children who were matched on the basis of their receptive vocabulary size. Qualitative analysis of the correct answers revealed that the PLI children showed higher rigidity compared to the younger language-matched children who are able to use more synonyms or derivations across word class in naming tasks. Similarly, an examination of the children's naming errors indicated that the language-matched children exhibited more semantic errors, whereas PLI children showed more associative errors.

  20. Diversity, competition, extinction: the ecophysics of language change.

    PubMed

    Solé, Ricard V; Corominas-Murtra, Bernat; Fortuny, Jordi

    2010-12-06

    As indicated early by Charles Darwin, languages behave and change very much like living species. They display high diversity, differentiate in space and time, emerge and disappear. A large body of literature has explored the role of information exchanges and communicative constraints in groups of agents under selective scenarios. These models have been very helpful in providing a rationale on how complex forms of communication emerge under evolutionary pressures. However, other patterns of large-scale organization can be described using mathematical methods ignoring communicative traits. These approaches consider shorter time scales and have been developed by exploiting both theoretical ecology and statistical physics methods. The models are reviewed here and include extinction, invasion, origination, spatial organization, coexistence and diversity as key concepts and are very simple in their defining rules. Such simplicity is used in order to catch the most fundamental laws of organization and those universal ingredients responsible for qualitative traits. The similarities between observed and predicted patterns indicate that an ecological theory of language is emerging, supporting (on a quantitative basis) its ecological nature, although key differences are also present. Here, we critically review some recent advances and outline their implications and limitations as well as highlight problems for future research.

  1. Diversity, competition, extinction: the ecophysics of language change

    PubMed Central

    Solé, Ricard V.; Corominas-Murtra, Bernat; Fortuny, Jordi

    2010-01-01

    As indicated early by Charles Darwin, languages behave and change very much like living species. They display high diversity, differentiate in space and time, emerge and disappear. A large body of literature has explored the role of information exchanges and communicative constraints in groups of agents under selective scenarios. These models have been very helpful in providing a rationale on how complex forms of communication emerge under evolutionary pressures. However, other patterns of large-scale organization can be described using mathematical methods ignoring communicative traits. These approaches consider shorter time scales and have been developed by exploiting both theoretical ecology and statistical physics methods. The models are reviewed here and include extinction, invasion, origination, spatial organization, coexistence and diversity as key concepts and are very simple in their defining rules. Such simplicity is used in order to catch the most fundamental laws of organization and those universal ingredients responsible for qualitative traits. The similarities between observed and predicted patterns indicate that an ecological theory of language is emerging, supporting (on a quantitative basis) its ecological nature, although key differences are also present. Here, we critically review some recent advances and outline their implications and limitations as well as highlight problems for future research. PMID:20591847

  2. Frustration in the pattern formation of polysyllabic words

    NASA Astrophysics Data System (ADS)

    Hayata, Kazuya

    2016-12-01

    A novel frustrated system is given for the analysis of (m + 1)-syllabled vocal sounds for languages with the m-vowel system, where the varieties of vowels are assumed to be m (m > 2). The necessary and sufficient condition for observing the sound frustration is that the configuration of m vowels in an m-syllabled word has a preference for the ‘repulsive’ type, in which there is no duplication of an identical vowel. For languages that meet this requirement, no (m + 1)-syllabled word can in principle select the present type because at most m different vowels are available and consequently the duplicated use of an identical vowel is inevitable. For languages showing a preference for the ‘attractive’ type, where an identical vowel aggregates in a word, there arises no such conflict. In this paper, we first elucidate for Arabic with m = 3 how to deal with the conflicting situation, where a statistical approach based on the chi-square testing is employed. In addition to the conventional three-vowel system, analyses are made also for Russian, where a polysyllabic word contains both a stressed and an indeterminate vowel. Through the statistical analyses the selection scheme for quadrisyllabic configurations is found to be strongly dependent on the parts of speech as well as the gender of nouns. In order to emphasize the relevance to the sound model of binary oppositions, analyzed results of Greek verbs are also given.

  3. Linguistic Strategies for Improving Informed Consent in Clinical Trials Among Low Health Literacy Patients.

    PubMed

    Krieger, Janice L; Neil, Jordan M; Strekalova, Yulia A; Sarge, Melanie A

    2017-03-01

    Improving informed consent to participate in randomized clinical trials (RCTs) is a key challenge in cancer communication. The current study examines strategies for enhancing randomization comprehension among patients with diverse levels of health literacy and identifies cognitive and affective predictors of intentions to participate in cancer RCTs. Using a post-test-only experimental design, cancer patients (n = 500) were randomly assigned to receive one of three message conditions for explaining randomization (ie, plain language condition, gambling metaphor, benign metaphor) or a control message. All statistical tests were two-sided. Health literacy was a statistically significant moderator of randomization comprehension (P = .03). Among participants with the lowest levels of health literacy, the benign metaphor resulted in greater comprehension of randomization as compared with plain language (P = .04) and control (P = .004) messages. Among participants with the highest levels of health literacy, the gambling metaphor resulted in greater randomization comprehension as compared with the benign metaphor (P = .04). A serial mediation model showed a statistically significant negative indirect effect of comprehension on behavioral intention through personal relevance of RCTs and anxiety associated with participation in RCTs (P < .001). The effectiveness of metaphors for explaining randomization depends on health literacy, with a benign metaphor being particularly effective for patients at the lower end of the health literacy spectrum. The theoretical model demonstrates the cognitive and affective predictors of behavioral intention to participate in cancer RCTs and offers guidance on how future research should employ communication strategies to improve the informed consent processes. © The Author 2016. Published by Oxford University Press.

  4. Linguistic Strategies for Improving Informed Consent in Clinical Trials Among Low Health Literacy Patients

    PubMed Central

    Neil, Jordan M.; Strekalova, Yulia A.; Sarge, Melanie A.

    2017-01-01

    Background: Improving informed consent to participate in randomized clinical trials (RCTs) is a key challenge in cancer communication. The current study examines strategies for enhancing randomization comprehension among patients with diverse levels of health literacy and identifies cognitive and affective predictors of intentions to participate in cancer RCTs. Methods: Using a post-test-only experimental design, cancer patients (n = 500) were randomly assigned to receive one of three message conditions for explaining randomization (ie, plain language condition, gambling metaphor, benign metaphor) or a control message. All statistical tests were two-sided. Results: Health literacy was a statistically significant moderator of randomization comprehension (P = .03). Among participants with the lowest levels of health literacy, the benign metaphor resulted in greater comprehension of randomization as compared with plain language (P = .04) and control (P = .004) messages. Among participants with the highest levels of health literacy, the gambling metaphor resulted in greater randomization comprehension as compared with the benign metaphor (P = .04). A serial mediation model showed a statistically significant negative indirect effect of comprehension on behavioral intention through personal relevance of RCTs and anxiety associated with participation in RCTs (P < .001). Conclusions: The effectiveness of metaphors for explaining randomization depends on health literacy, with a benign metaphor being particularly effective for patients at the lower end of the health literacy spectrum. The theoretical model demonstrates the cognitive and affective predictors of behavioral intention to participate in cancer RCTs and offers guidance on how future research should employ communication strategies to improve the informed consent processes. PMID:27794035

  5. The conceptual basis of mathematics in cardiology: (II). Calculus and differential equations.

    PubMed

    Bates, Jason H T; Sobel, Burton E

    2003-04-01

    This is the second in a series of four articles developed for the readers of Coronary Artery Disease. Without language ideas cannot be articulated. What may not be so immediately obvious is that they cannot be formulated either. One of the essential languages of cardiology is mathematics. Unfortunately, medical education does not emphasize, and in fact, often neglects empowering physicians to think mathematically. Reference to statistics, conditional probability, multicompartmental modeling, algebra, calculus and transforms is common but often without provision of genuine conceptual understanding. At the University of Vermont College of Medicine, Professor Bates developed a course designed to address these deficiencies. The course covered mathematical principles pertinent to clinical cardiovascular and pulmonary medicine and research. It focused on fundamental concepts to facilitate formulation and grasp of ideas. This series of four articles was developed to make the material available for a wider audience. The articles will be published sequentially in Coronary Artery Disease. Beginning with fundamental axioms and basic algebraic manipulations they address algebra, function and graph theory, real and complex numbers, calculus and differential equations, mathematical modeling, linear system theory and integral transforms and statistical theory. The principles and concepts they address provide the foundation needed for in-depth study of any of these topics. Perhaps of even more importance, they should empower cardiologists and cardiovascular researchers to utilize the language of mathematics in assessing the phenomena of immediate pertinence to diagnosis, pathophysiology and therapeutics. The presentations are interposed with queries (by Coronary Artery Disease abbreviated as CAD) simulating the nature of interactions that occurred during the course itself. Each article concludes with one or more examples illustrating application of the concepts covered to cardiovascular medicine and biology.

  6. The conceptual basis of mathematics in cardiology III: linear systems theory and integral transforms.

    PubMed

    Bates, Jason H T; Sobel, Burton E

    2003-05-01

    This is the third in a series of four articles developed for the readers of Coronary Artery Disease. Without language ideas cannot be articulated. What may not be so immediately obvious is that they cannot be formulated either. One of the essential languages of cardiology is mathematics. Unfortunately, medical education does not emphasize, and in fact, often neglects empowering physicians to think mathematically. Reference to statistics, conditional probability, multicompartmental modeling, algebra, calculus and transforms is common but often without provision of genuine conceptual understanding. At the University of Vermont College of Medicine, Professor Bates developed a course designed to address these deficiencies. The course covered mathematical principles pertinent to clinical cardiovascular and pulmonary medicine and research. It focused on fundamental concepts to facilitate formulation and grasp of ideas. This series of four articles was developed to make the material available for a wider audience. The articles will be published sequentially in Coronary Artery Disease. Beginning with fundamental axioms and basic algebraic manipulations they address algebra, function and graph theory, real and complex numbers, calculus and differential equations, mathematical modeling, linear system theory and integral transforms and statistical theory. The principles and concepts they address provide the foundation needed for in-depth study of any of these topics. Perhaps of even more importance, they should empower cardiologists and cardiovascular researchers to utilize the language of mathematics in assessing the phenomena of immediate pertinence to diagnosis, pathophysiology and therapeutics. The presentations are interposed with queries (by Coronary Artery Disease abbreviated as CAD) simulating the nature of interactions that occurred during the course itself. Each article concludes with one or more examples illustrating application of the concepts covered to cardiovascular medicine and biology.

  7. The conceptual basis of mathematics in cardiology: (I) algebra, functions and graphs.

    PubMed

    Bates, Jason H T; Sobel, Burton E

    2003-02-01

    This is the first in a series of four articles developed for the readers of Coronary Artery Disease. Without language ideas cannot be articulated. What may not be so immediately obvious is that they cannot be formulated either. One of the essential languages of cardiology is mathematics. Unfortunately, medical education does not emphasize, and in fact, often neglects empowering physicians to think mathematically. Reference to statistics, conditional probability, multicompartmental modeling, algebra, calculus and transforms is common but often without provision of genuine conceptual understanding. At the University of Vermont College of Medicine, Professor Bates developed a course designed to address these deficiencies. The course covered mathematical principles pertinent to clinical cardiovascular and pulmonary medicine and research. It focused on fundamental concepts to facilitate formulation and grasp of ideas. This series of four articles was developed to make the material available for a wider audience. The articles will be published sequentially in Coronary Artery Disease. Beginning with fundamental axioms and basic algebraic manipulations they address algebra, function and graph theory, real and complex numbers, calculus and differential equations, mathematical modeling, linear system theory and integral transforms and statistical theory. The principles and concepts they address provide the foundation needed for in-depth study of any of these topics. Perhaps of even more importance, they should empower cardiologists and cardiovascular researchers to utilize the language of mathematics in assessing the phenomena of immediate pertinence to diagnosis, pathophysiology and therapeutics. The presentations are interposed with queries (by Coronary Artery Disease, abbreviated as CAD) simulating the nature of interactions that occurred during the course itself. Each article concludes with one or more examples illustrating application of the concepts covered to cardiovascular medicine and biology.

  8. False-Belief Understanding and Language Ability Mediate the Relationship between Emotion Comprehension and Prosocial Orientation in Preschoolers.

    PubMed

    Ornaghi, Veronica; Pepe, Alessandro; Grazzani, Ilaria

    2016-01-01

    Emotion comprehension (EC) is known to be a key correlate and predictor of prosociality from early childhood. In the present study, we examined this relationship within the broad theoretical construct of social understanding which includes a number of socio-emotional skills, as well as cognitive and linguistic abilities. Theory of mind, especially false-belief understanding, has been found to be positively correlated with both EC and prosocial orientation. Similarly, language ability is known to play a key role in children's socio-emotional development. The combined contribution of false-belief understanding and language to explaining the relationship between EC and prosociality has yet to be investigated. Thus, in the current study, we conducted an in-depth exploration of how preschoolers' false-belief understanding and language ability each contribute to modeling the relationship between children's comprehension of emotion and their disposition to act prosocially toward others, after controlling for age and gender. Participants were 101 4- to 6-year-old children (54% boys), who were administered measures of language ability, false-belief understanding, EC and prosocial orientation. Multiple mediation analysis of the data suggested that false-belief understanding and language ability jointly and fully mediated the effect of preschoolers' EC on their prosocial orientation. Analysis of covariates revealed that gender exerted no statistically significant effect, while age had a trivial positive effect. Theoretical and practical implications of the findings are discussed.

  9. [Psychometric properties and diagnostic value of 'lexical screening for aphasias'].

    PubMed

    Pena-Chavez, R; Martinez-Jimenez, L; Lopez-Espinoza, M

    2014-09-16

    INTRODUCTION. Language assessment in persons with brain injury makes it possible to know whether they require language rehabilitation or not. Given the importance of a precise evaluation, assessment instruments must be valid and reliable, so as to avoid mistaken and subjective diagnoses. AIM. To validate 'lexical screening for aphasias' in a sample of 58 Chilean individuals. SUBJECTS AND METHODS. A screening-type language test, lasting 20 minutes and based on the lexical processing model devised by Patterson and Shewell (1987), was constructed. The sample was made up of two groups containing 29 aphasic subjects and 29 control subjects from different health centres in the regions of Biobio and Maule, Chile. Their ages ranged between 24 and 79 years, and they had between 0 and 17 years' schooling. Tests were carried out to determine discriminating validity, concurrent validity with the aphasia disorder assessment battery, reliability, sensitivity and specificity. RESULTS. The statistical analysis showed a high discriminating validity (p < 0.001), an acceptable mean concurrent validity with the aphasia disorder assessment battery (rs = 0.65), high mean reliability (alpha = 0.87), moderate mean sensitivity (69%) and high mean specificity (86%). CONCLUSION. 'Lexical screening for aphasias' is valid and reliable for assessing language in persons with aphasias; it is sensitive for detecting aphasic subjects and is specific for precluding language disorders in persons with normal language abilities.
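
    For readers unfamiliar with the two screening metrics reported above, the standard definitions can be sketched as below. The counts in the usage lines are hypothetical values chosen for illustration; they are not the study's data, and the function names are assumptions of this sketch.

```python
def sensitivity(true_pos, false_neg):
    """Share of affected subjects that the screening test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of unaffected subjects that the screening test correctly clears."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts: 10 affected subjects (8 flagged, 2 missed),
    # 10 controls (9 cleared, 1 wrongly flagged).
    print(f"sensitivity = {sensitivity(8, 2):.2f}")
    print(f"specificity = {specificity(9, 1):.2f}")
```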

  10. Statistical Learning Is Related to Early Literacy-Related Skills

    ERIC Educational Resources Information Center

    Spencer, Mercedes; Kaschak, Michael P.; Jones, John L.; Lonigan, Christopher J.

    2015-01-01

    It has been demonstrated that statistical learning, or the ability to use statistical information to learn the structure of one's environment, plays a role in young children's acquisition of linguistic knowledge. Although most research on statistical learning has focused on language acquisition processes, such as the segmentation of words from…

  11. Tool Mediation in Focus on Form Activities: Case Studies in a Grammar-Exploring Environment

    ERIC Educational Resources Information Center

    Karlstrom, Petter; Cerratto-Pargman, Teresa; Lindstrom, Henrik; Knutsson, Ola

    2007-01-01

    We present two case studies of two different pedagogical tasks in a Computer Assisted Language Learning environment called Grim. The main design principle in Grim is to support "Focus on Form" in second language pedagogy. Grim contains several language technology-based features for exploring linguistic forms (static, rule-based and statistical),…

  12. Early Language Stimulation of Down's Syndrome Babies: A Study on the Optimum Age To Begin.

    ERIC Educational Resources Information Center

    Aparicio, Maria Teresa Sanz; Balana, Javier Menendez

    2002-01-01

    Examined the marked delay in language acquisition suffered by babies with Down Syndrome and how early treatment affects the subsequent observed development among 36 subjects in Spain. Found statistically significant differences in language acquisitions in favor of newborns, compared with 90-day-old through 18-month-old infants who experienced…

  13. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

    ERIC Educational Resources Information Center

    Jarman, Jay

    2011-01-01

    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  14. The Future of Foreign Language Teaching on the North American Continent.

    ERIC Educational Resources Information Center

    Bouton, Charles P.

    Following a brief review of the history of interest in foreign languages in America, facts to be considered when interpreting falling enrollment statistics, such as a drop in the birth rate, are discussed. It is stressed that foreign language teaching cannot be neglected in a world having improved and extensive communication between people…

  15. Public Views of Minority Languages as Communication or Symbol: The Case of Gaelic in Scotland

    ERIC Educational Resources Information Center

    Paterson, Lindsay; O'Hanlon, Fiona

    2015-01-01

    Two social roles for language have been distinguished by Edwards--the communicative and the symbolic. Using data from a survey of public attitudes to Gaelic in Scotland, the article investigates the extent to which people's view of language may be characterised as relating to these roles. Respondents were grouped, using statistical cluster…

  16. Toward a Theory-Based Natural Language Capability in Robots and Other Embodied Agents: Evaluating Hausser's SLIM Theory and Database Semantics

    ERIC Educational Resources Information Center

    Burk, Robin K.

    2010-01-01

    Computational natural language understanding and generation have been a goal of artificial intelligence since McCarthy, Minsky, Rochester and Shannon first proposed to spend the summer of 1956 studying this and related problems. Although statistical approaches dominate current natural language applications, two current research trends bring…

  17. Language Policy in British Colonial Education: Evidence from Nineteenth-Century Hong Kong

    ERIC Educational Resources Information Center

    Evans, Stephen

    2006-01-01

    This article examines the evolution of language-in-education policy in Hong Kong during the first six decades of British rule (1842-1902). In particular, it analyses the changing roles and status of the English and Chinese languages during this formative period in the development of the colony's education system. The textual and statistical data…

  18. A Meta-Analytic Review of Gender Variations in Children's Language Use: Talkativeness, Affiliative Speech, and Assertive Speech

    ERIC Educational Resources Information Center

    Leaper, Campbell; Smith, Tara E.

    2004-01-01

    Three sets of meta-analyses examined gender effects on children's language use. Each set of analyses considered an aspect of speech that is considered to be gender typed: talkativeness, affiliative speech, and assertive speech. Statistically significant average effect sizes were obtained with all three language constructs. On average, girls were…

  19. Study Quality in SLA: An Assessment of Designs, Analyses, and Reporting Practices in Quantitative L2 Research

    ERIC Educational Resources Information Center

    Plonsky, Luke

    2013-01-01

    This study assesses research and reporting practices in quantitative second language (L2) research. A sample of 606 primary studies, published from 1990 to 2010 in "Language Learning and Studies in Second Language Acquisition," was collected and coded for designs, statistical analyses, reporting practices, and outcomes (i.e., effect…

  20. Investigating Foreign Language Learning Anxiety among Students Learning English in a Public Sector University, Pakistan

    ERIC Educational Resources Information Center

    Gopang, Illahi Bux; Bughio, Faraz Ali; Pathan, Habibullah

    2015-01-01

    The present study investigated foreign language anxiety among students of Lasbela University, Baluchistan, Pakistan. The study adopted the Foreign Language Classroom Anxiety Scale (Horwitz et al., 1986). There were 240 respondents (26 female and 214 male). The data was run through the Statistical Package for the Social Sciences (SPSS)…

  1. Aging and the statistical learning of grammatical form classes.

    PubMed

    Schwab, Jessica F; Schuler, Kathryn D; Stillman, Chelsea M; Newport, Elissa L; Howard, James H; Howard, Darlene V

    2016-08-01

    Language learners must place unfamiliar words into categories, often with few explicit indicators about when and how that word can be used grammatically. Reeder, Newport, and Aslin (2013) showed that college students can learn grammatical form classes from an artificial language by relying solely on distributional information (i.e., contextual cues in the input). Here, 2 experiments revealed that healthy older adults also show such statistical learning, though they are poorer than young at distinguishing grammatical from ungrammatical strings. This finding expands knowledge of which aspects of learning vary with aging, with potential implications for second language learning in late adulthood. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  2. Usage of fMRI for pre-surgical planning in brain tumor and vascular lesion patients: task and statistical threshold effects on language lateralization.

    PubMed

    Nadkarni, Tanvi N; Andreoli, Matthew J; Nair, Veena A; Yin, Peng; Young, Brittany M; Kundu, Bornali; Pankratz, Joshua; Radtke, Andrew; Holdsworth, Ryan; Kuo, John S; Field, Aaron S; Baskaya, Mustafa K; Moritz, Chad H; Meyerand, M Elizabeth; Prabhakaran, Vivek

    2015-01-01

    Functional magnetic resonance imaging (fMRI) is a non-invasive pre-surgical tool used to assess localization and lateralization of language function in brain tumor and vascular lesion patients in order to guide neurosurgeons as they devise a surgical approach to treat these lesions. We investigated the effect of varying the statistical thresholds as well as the type of language tasks on functional activation patterns and language lateralization. We hypothesized that language lateralization indices (LIs) would be threshold- and task-dependent. Imaging data were collected from brain tumor patients (n = 67, average age 48 years) and vascular lesion patients (n = 25, average age 43 years) who received pre-operative fMRI scanning. Both patient groups performed expressive (antonym and/or letter-word generation) and receptive (tumor patients performed text-reading; vascular lesion patients performed text-listening) language tasks. A control group (n = 25, average age 45 years) performed the letter-word generation task. Brain tumor patients showed left-lateralization during the antonym-word generation and text-reading tasks at high threshold values and bilateral activation during the letter-word generation task, irrespective of the threshold values. Vascular lesion patients showed left-lateralization during the antonym and letter-word generation, and text-listening tasks at high threshold values. Our results suggest that the type of task and the applied statistical threshold influence LI and that the threshold effects on LI may be task-specific. Thus identifying critical functional regions and computing LIs should be conducted on an individual subject basis, using a continuum of threshold values with different tasks to provide the most accurate information for surgical planning to minimize post-operative language deficits.
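
    The abstract above recommends computing lateralization indices over a continuum of thresholds but does not state the formula used. The sketch below assumes the commonly used definition LI = (L - R) / (L + R), where L and R are the numbers of supra-threshold voxels in the left and right hemisphere. The function names and the toy inputs are hypothetical and are not taken from the study.

```python
def lateralization_index(left_values, right_values, threshold):
    """Assumed definition: LI = (L - R) / (L + R), with L and R the counts of
    voxel statistics above `threshold` in each hemisphere."""
    left = sum(1 for v in left_values if v > threshold)
    right = sum(1 for v in right_values if v > threshold)
    if left + right == 0:
        # No supra-threshold voxels on either side: LI is undefined.
        return None
    return (left - right) / (left + right)


def li_curve(left_values, right_values, thresholds):
    """Evaluate the index at each threshold to expose threshold dependence."""
    return {t: lateralization_index(left_values, right_values, t)
            for t in thresholds}
```

    Under this convention, values near +1 indicate left-lateralized activation, values near -1 right-lateralized activation, and values near 0 bilateral activation; inspecting the whole curve rather than a single threshold is the point the abstract makes.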

  3. CDC Kerala 7: Effect of early language intervention among children 0-3 y with speech and language delay.

    PubMed

    Nair, M K C; Mini, A O; Leena, M L; George, Babu; Harikumaran Nair, G S; Bhaskaran, Deepa; Russell, Paul Swamidhas Sudhakar

    2014-12-01

    To assess the effect of systematic clinic and home based early language intervention program in children reporting to the early language intervention clinic with full partnership of specially trained developmental therapist and the parents. All babies between 0 and 3 y referred to Child Development Centre (CDC) Kerala for suspected speech/language delay were assessed and those without hearing impairment were screened first using Language Evaluation Scale Trivandrum (LEST) and assessed in detail using Receptive Expressive Emergent Language Scale (REELS). Those having language delay are enrolled into the early language intervention program for a period of 6 mo, 1 h at the CDC clinic once every month followed by home stimulation for rest of the month by the mother trained at CDC. Out of the total 455 children between 0 and 3 y, who successfully completed 6 mo intervention, the mean pre and post intervention language quotient (LQ) were 60.79 and 70.62 respectively and the observed 9.83 increase was statistically significant. The developmental diagnosis included developmental delay (62.4%), global developmental delay (18.5%), Trisomy and other chromosomal abnormalities (10.5%), microcephaly and other brain problems (9.9%), misarticulation (8.4%), autistic features (5.3%) and cleft palate and lip (3.3%) in the descending order. In the present study among 455 children between 0 and 3 y without hearing impairment, who successfully completed 6 mo early language intervention, the mean pre and post intervention LQ were 60.79 and 70.62 respectively and the observed 9.83 increase was statistically significant.

  4. Perceived discrimination among three groups of refugees resettled in the USA: associations with language, time in the USA, and continent of origin.

    PubMed

    Hadley, Craig; Patil, Crystal

    2009-12-01

    The objectives of this study were to assess the prevalence and predictors of discrimination among a community-based sample of refugees resettled in the USA. We sought to test whether language, gender, time in the USA and country of origin were associated with the experience of discrimination among individuals resettled in the USA as part of the refugee resettlement program. Perceived discrimination was assessed among individuals from East Africa (n = 92), West Africa (n = 74), and from Eastern Europe (n = 112) using a multi-item measure of discrimination. Bivariate associations revealed statistically significant associations between experiences of discrimination and time in the USA, language ability, and sending country. A logistic regression model revealed that refugees from African sending countries were more likely than Eastern European individuals to experience discrimination, even after controlling for potentially confounding factors. We interpret this finding as evidence of racism and discuss the implications for population health and resettlement practice.

  5. Utilitarian and Recreational Walking Among Spanish- and English-Speaking Latino Adults in Micropolitan US Towns.

    PubMed

    Doescher, Mark P; Lee, Chanam; Saelens, Brian E; Lee, Chunkuen; Berke, Ethan M; Adachi-Mejia, Anna M; Patterson, Davis G; Moudon, Anne Vernez

    2017-04-01

    Walking among Latinos in US Micropolitan towns may vary by language spoken. In 2011-2012, we collected telephone survey and built environment (BE) data from adults in six towns located within micropolitan counties from two states with sizable Latino populations. We performed mixed-effects logistic regression modeling to examine relationships between ethnicity-language group [Spanish-speaking Latinos (SSLs); English-speaking Latinos (ESLs); and English-speaking non-Latinos (ENLs)] and utilitarian walking and recreational walking, accounting for socio-demographic, lifestyle and BE characteristics. Low-income SSLs reported higher amounts of utilitarian walking than ENLs (p = 0.007), but utilitarian walking in this group decreased as income increased. SSLs reported lower amounts of recreational walking than ENLs (p = 0.004). ESL-ENL differences were not significant. We identified no statistically significant interactions between ethnicity-language group and BE characteristics. Approaches to increase walking in micropolitan towns with sizable SSL populations may need to account for this group's differences in walking behaviors.

  6. Statistical Signal Process in R Language in the Pharmacovigilance Programme of India.

    PubMed

    Kumar, Aman; Ahuja, Jitin; Shrivastava, Tarani Prakash; Kumar, Vipin; Kalaiselvan, Vivekanandan

    2018-05-01

    The Ministry of Health & Family Welfare, Government of India, initiated the Pharmacovigilance Programme of India (PvPI) in July 2010. The purpose of the PvPI is to collect data on adverse reactions due to medications, analyze it, and use the reference to recommend informed regulatory intervention, besides communicating the risk to health care professionals and the public. The goal of the present study was to apply statistical tools to find the relationship between drugs and adverse drug reactions (ADRs) for signal detection by R programming. Four statistical parameters were proposed for quantitative signal detection. These 4 parameters are IC_025, PRR and PRR_lb, chi-square, and N_11; we calculated these 4 values using R programming. We analyzed 78,983 drug-ADR combinations, and the total count of drug-ADR combinations was 4,20,060. During the calculation of the statistical parameters, we used 3 variables: (1) N_11 (number of counts), (2) N_1. (drug margin), and (3) N_.1 (ADR margin). The structure and calculation of these 4 statistical parameters in R language are easily understandable. On the basis of the IC value (IC value >0), out of the 78,983 drug-ADR combinations, we found 8,667 combinations to be significantly associated. The calculation of statistical parameters in R language is time saving and makes it easy to identify new signals in the Indian ICSR (Individual Case Safety Reports) database.
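
    The disproportionality measures named in the abstract above can be illustrated from a 2x2 table built from the three counts it lists (the drug-ADR count, the drug margin, and the ADR margin) plus the database total. The study used R; the following is a minimal Python sketch using standard textbook formulas, not the authors' code. The function and argument names are hypothetical, and the IC shown is the plain log2 observed-to-expected ratio, not the Bayesian shrinkage estimate or its lower credibility bound.

```python
import math


def disproportionality(n11, n1_dot, n_dot1, n_total):
    """Textbook disproportionality measures for one drug-ADR pair.

    n11     -- reports mentioning both the drug and the ADR
    n1_dot  -- all reports mentioning the drug (drug margin)
    n_dot1  -- all reports mentioning the ADR (ADR margin)
    n_total -- all reports in the database
    Assumes every cell of the 2x2 table is non-zero.
    """
    a = n11                                   # drug, ADR
    b = n1_dot - n11                          # drug, other ADRs
    c = n_dot1 - n11                          # other drugs, ADR
    d = n_total - n1_dot - n_dot1 + n11       # other drugs, other ADRs

    # Proportional reporting ratio and its approximate lower 95% bound.
    prr = (a / (a + b)) / (c / (c + d))
    se_log_prr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    prr_lb = math.exp(math.log(prr) - 1.96 * se_log_prr)

    # Pearson chi-square for the 2x2 table (no continuity correction).
    chi2 = n_total * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d))

    # Simplified information component: log2(observed / expected).
    ic = math.log2(n11 * n_total / (n1_dot * n_dot1))

    return {"prr": prr, "prr_lb": prr_lb, "chi2": chi2, "ic": ic}
```

    For example, `disproportionality(20, 100, 200, 10000)` gives a PRR of about 11 and a positive IC, the kind of pair that an "IC > 0" rule would flag for review.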

  7. Of Substance: The Nature of Language Effects on Entity Construal

    PubMed Central

    Li, Peggy; Dunham, Yarrow; Carey, Susan

    2009-01-01

    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind (whisk; Imai and Gentner, 1997). Five experiments replicated this language type effect on entity construal, extended it to quite different stimuli from those studied before, and extended it to a comparison between Mandarin-speakers and English-speakers. A sixth experiment, which did not involve interpreting the meaning of a noun or a pronoun that stands for a noun, failed to find any effect of language type on entity construal. Thus, the overall pattern of findings supports a non-Whorfian, language on language account, according to which sensitivity to lexical statistics in a count/mass language leads adults to assign a novel noun in neutral syntax the status of a count noun, influencing construal of ambiguous entities. The experiments also document and explore cross-linguistically universal factors that influence entity construal, and favor Prasada's (1999) hypothesis that features indicating non-accidentalness of an entity's form lead participants to a construal of object-kind rather than substance-kind. Finally, the experiments document the age at which the language type effect emerges in lexical projection. The details of the developmental pattern are consistent with the lexical statistics hypothesis, along with a universal increase in sensitivity to material kind. PMID:19230873

  8. Using Statistical Natural Language Processing for Understanding Complex Responses to Free-Response Tasks

    ERIC Educational Resources Information Center

    DeMark, Sarah F.; Behrens, John T.

    2004-01-01

    Whereas great advances have been made in the statistical sophistication of assessments in terms of evidence accumulation and task selection, relatively little statistical work has explored the possibility of applying statistical techniques to data for the purposes of determining appropriate domain understanding and to generate task-level scoring…

  9. A Critical Analysis of the Language Background Other than English (LBOTE) Category in the Australian National Testing System: A Foucauldian Perspective

    ERIC Educational Resources Information Center

    Creagh, Sue

    2016-01-01

    This article presents a Foucauldian analysis of the political rationalities of national testing and accountability practices in Australia, and their inconsistencies for students for whom English is a second or additional language. It focuses on a problem associated with the statistical data category "Language Background Other Than…

  10. Relationship between English Language Learners' Proficiency in Reading, Writing, Listening, and Speaking and Proficiency on Maryland School Assessments in Mathematics

    ERIC Educational Resources Information Center

    Johnson, C. Michael

    2013-01-01

    Mathematics proficiency of English language learners (ELLs) on the Maryland School Assessments (MSA) for mathematics continues to lag behind the proficiency level of students who are proficient English speakers. The purpose of this study was to determine if there is a statistically significant relationship between English language learner's…

  11. English and Chinese languages as weighted complex networks

    NASA Astrophysics Data System (ADS)

    Sheng, Long; Li, Chunguang

    2009-06-01

    In this paper, we analyze statistical properties of English and Chinese written human language within the framework of weighted complex networks. The two language networks are based on an English novel and a Chinese biography, respectively, and both of the networks are constructed in the same way. By comparing the intensity and density of connections between the two networks, we find that high weight connections in Chinese language networks prevail more than those in English language networks. Furthermore, some of the topological and weighted quantities are compared. The results display some differences in the structural organizations between the two language networks. These observations indicate that the two languages may have different linguistic mechanisms and different combinatorial natures.
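
    The abstract above does not say how the two language networks were constructed. One common construction, assumed here purely for illustration, links words that occur next to each other in the text and uses the number of co-occurrences as the edge weight; node strength is then the sum of the weights on a node's edges. All names in this sketch are hypothetical.

```python
from collections import Counter, defaultdict


def build_weighted_network(tokens):
    """Undirected word network: weight = number of adjacent co-occurrences."""
    edges = Counter()
    for left, right in zip(tokens, tokens[1:]):
        if left != right:                      # skip self-loops
            edges[frozenset((left, right))] += 1
    return edges


def node_strength(edges):
    """Sum of edge weights attached to each word."""
    strength = defaultdict(int)
    for pair, weight in edges.items():
        for node in pair:
            strength[node] += weight
    return dict(strength)
```

    Comparing the distribution of edge weights or node strengths between two such networks is one way to quantify the kind of structural difference the abstract reports.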

  12. Optical character recognition of handwritten Arabic using hidden Markov models

    NASA Astrophysics Data System (ADS)

    Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.; Olama, Mohammed M.

    2011-04-01

    The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.
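
    The decoding step described in the abstract above, finding the most probable hidden sequence given a transition matrix and an emission matrix, is the standard Viterbi algorithm. The sketch below is a generic textbook version on dictionaries, not the authors' custom implementation; in their setting the hidden states would correspond to characters or sub-words and the observations to extracted features, and the probabilities would come from their corpus. The function name and argument layout are assumptions.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most probable state path, its joint probability)."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None)
             for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability.
            prob, back = max((prev[r][0] * trans_p[r][s], r) for r in states)
            column[s] = (prob * emit_p[s][obs], back)
        best.append(column)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]
```

    For long sequences the products of probabilities underflow, so practical implementations usually sum log-probabilities instead; the plain products are kept here for readability.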

  13. Optical character recognition of handwritten Arabic using hidden Markov models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.

    2011-01-01

    The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.

  14. Dual language versus English-only support for bilingual children with hearing loss who use cochlear implants and hearing aids.

    PubMed

    Bunta, Ferenc; Douglas, Michael; Dickson, Hanna; Cantu, Amy; Wickesberg, Jennifer; Gifford, René H

    2016-07-01

    There is a critical need to understand better speech and language development in bilingual children learning two spoken languages who use cochlear implants (CIs) and hearing aids (HAs). The paucity of knowledge in this area poses a significant barrier to providing maximal communicative outcomes to a growing number of children who have a hearing loss (HL) and are learning multiple spoken languages. In fact, the number of bilingual individuals receiving CIs and HAs is rapidly increasing, and Hispanic children display a higher prevalence of HL than the general population of the United States. In order to serve better bilingual children with CIs and HAs, appropriate and effective therapy approaches need to be designed and tested, based on research findings. This study investigated the effects of supporting both the home language (Spanish) and the language of the majority culture (English) on language outcomes in bilingual children with HL who use CIs and HAs as compared to their bilingual peers who receive English-only support. Retrospective analyses of language measures were completed for two groups of Spanish- and English-speaking bilingual children with HL who use CIs and HAs matched on a range of demographic and socio-economic variables: those with dual-language support versus their peers with English-only support. Dependent variables included scores from the English version of the Preschool Language Scales, 4th Edition. Bilingual children who received dual-language support outperformed their peers who received English-only support at statistically significant levels as measured by Total Language and Expressive Communication as raw and language age scores. No statistically significant group differences were found on Auditory Comprehension scores. 
    In addition to providing support in English, encouraging home language use and providing treatment support in the first language may help rather than hinder development of both English and the home language in bilingual children with HL who use CIs and HAs. In fact, dual-language support may yield better overall and expressive English language outcomes than English-only support for this population. © 2016 Royal College of Speech and Language Therapists.

  15. Harnessing QbD, Programming Languages, and Automation for Reproducible Biology.

    PubMed

    Sadowski, Michael I; Grant, Chris; Fell, Tim S

    2016-03-01

    Building robust manufacturing processes from biological components is a task that is highly complex and requires sophisticated tools to describe processes, inputs, and measurements and administrate management of knowledge, data, and materials. We argue that for bioengineering to fully access biological potential, it will require application of statistically designed experiments to derive detailed empirical models of underlying systems. This requires execution of large-scale structured experimentation for which laboratory automation is necessary. This requires development of expressive, high-level languages that allow reusability of protocols, characterization of their reliability, and a change in focus from implementation details to functional properties. We review recent developments in these areas and identify what we believe is an exciting trend that promises to revolutionize biotechnology. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  16. Planning representation for automated exploratory data analysis

    NASA Astrophysics Data System (ADS)

    St. Amant, Robert; Cohen, Paul R.

    1994-03-01

    Igor is a knowledge-based system for exploratory statistical analysis of complex systems and environments. Igor has two related goals: to help automate the search for interesting patterns in data sets, and to help develop models that capture significant relationships in the data. We outline a language for Igor, based on techniques of opportunistic planning, which balances control and opportunism. We describe the application of Igor to the analysis of the behavior of Phoenix, an artificial intelligence planning system.

  17. Probability and Statistics: A Prelude.

    ERIC Educational Resources Information Center

    Goodman, A. F.; Blischke, W. R.

    Probability and statistics have become indispensable to scientific, technical, and management progress. They serve as essential dialects of mathematics, the classical language of science, and as instruments necessary for intelligent generation and analysis of information. A prelude to probability and statistics is presented by examination of the…

  18. A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation.

    PubMed

    Tran, Phuoc; Dinh, Dien; Nguyen, Hien T

    2016-01-01

    Chinese and Vietnamese are both isolating languages in which words are not delimited by spaces. In machine translation, word segmentation is often done first when translating from Chinese or Vietnamese into different languages (typically English) and vice versa. However, whether words should be segmented is an open question when translating between two languages in which spaces are not used between words, such as Chinese and Vietnamese. Since Chinese-Vietnamese is a low-resource language pair, the sparse data problem is evident in the translation system of this language pair. Therefore, the decision whether or not to segment becomes more important. In this paper, we propose a new method for translating Chinese to Vietnamese based on a combination of the advantages of character level and word level translation. A hybrid approach that combines statistics and rules is used to translate on the word level, and a statistical translation is used at the character level. The experimental results showed that our method improved the performance of machine translation over that of character or word level translation.

  19. Dual language versus English only support for bilingual children with hearing loss who use cochlear implants and hearing aids

    PubMed Central

    Bunta, Ferenc; Douglas, Michael; Dickson, Hanna; Cantu, Amy; Wickesberg, Jennifer; Gifford, René H.

    2015-01-01

    Background: There is a critical need to better understand speech and language development in bilingual children learning two spoken languages who use cochlear implants (CIs) and hearing aids (HAs). The paucity of knowledge in this area poses a significant barrier to providing maximal communicative outcomes to a growing number of children who have a hearing loss and are learning multiple spoken languages. In fact, the number of bilingual individuals receiving CIs and HAs is rapidly increasing, and Hispanic children display a higher prevalence of hearing loss than the general population of the United States (e.g., Mehra, Eavey, & Keamy, 2009). In order to better serve bilingual children with CIs and HAs, appropriate and effective therapy approaches need to be designed and tested, based on research findings. Aims: This study investigated the effects of supporting both the home language (Spanish) and the language of the majority culture (English) on language outcomes in bilingual children with hearing loss (HL) who use CIs and HAs as compared to their bilingual peers who receive English only support. Methods and Procedures: Retrospective analyses of language measures were completed for two groups of Spanish- and English-speaking bilingual children with HL who use CIs and HAs matched on a range of demographic and socio-economic variables: those with dual language support versus their peers with English only support. Dependent variables included scores from the English version of the Preschool Language Scales, 4th edition. Results: Bilingual children who received dual language support outperformed their peers who received English only support at statistically significant levels as measured by Total Language and Expressive Communication as raw and language age scores. No statistically significant group differences were found on Auditory Comprehension scores.
    Conclusions: In addition to providing support in English, encouraging home language use and providing treatment support in the first language may help rather than hinder development of both English and the home language in bilingual children with hearing loss who use CIs and HAs. In fact, dual language support may yield better overall and expressive English language outcomes than English only support for this population. PMID:27017913

  20. Diffusion of Lexical Change in Social Media

    PubMed Central

    Eisenstein, Jacob; O'Connor, Brendan; Smith, Noah A.; Xing, Eric P.

    2014-01-01

    Computer-mediated communication is driving fundamental changes in the nature of written language. We investigate these changes by statistical analysis of a dataset comprising 107 million Twitter messages (authored by 2.7 million unique user accounts). Using a latent vector autoregressive model to aggregate across thousands of words, we identify high-level patterns in diffusion of linguistic change over the United States. Our model is robust to unpredictable changes in Twitter's sampling rate, and provides a probabilistic characterization of the relationship of macro-scale linguistic influence to a set of demographic and geographic predictors. The results of this analysis offer support for prior arguments that focus on geographical proximity and population size. However, demographic similarity – especially with regard to race – plays an even more central role, as cities with similar racial demographics are far more likely to share linguistic influence. Rather than moving towards a single unified “netspeak” dialect, language evolution in computer-mediated communication reproduces existing fault lines in spoken American English. PMID:25409166

  1. An automated method to analyze language use in patients with schizophrenia and their first-degree relatives

    PubMed Central

    Elvevåg, Brita; Foltz, Peter W.; Rosenstein, Mark; DeLisi, Lynn E.

    2009-01-01

    Communication disturbances are prevalent in schizophrenia, and since it is a heritable illness these are likely present - albeit in a muted form - in the relatives of patients. Given the time-consuming, and often subjective nature of discourse analysis, these deviances are frequently not assayed in large scale studies. Recent work in computational linguistics and statistical-based semantic analysis has shown the potential and power of automated analysis of communication. We present an automated and objective approach to modeling discourse that detects very subtle deviations between probands, their first-degree relatives and unrelated healthy controls. Although these findings should be regarded as preliminary due to the limitations of the data at our disposal, we present a brief analysis of the models that best differentiate these groups in order to illustrate the utility of the method for future explorations of how language components are differentially affected by familial and illness related issues. PMID:20383310

  2. Performance Assessment and Translation of Physiologically Based Pharmacokinetic Models From acslX to Berkeley Madonna, MATLAB, and R Language: Oxytetracycline and Gold Nanoparticles As Case Examples.

    PubMed

    Lin, Zhoumeng; Jaberi-Douraki, Majid; He, Chunla; Jin, Shiqiang; Yang, Raymond S H; Fisher, Jeffrey W; Riviere, Jim E

    2017-07-01

    Many physiologically based pharmacokinetic (PBPK) models for environmental chemicals, drugs, and nanomaterials have been developed to aid risk and safety assessments using acslX. However, acslX has been discontinued since November 2015. Alternative modeling tools and tutorials are needed for future PBPK applications. This forum article aimed to: (1) demonstrate the performance of 4 PBPK modeling software packages (acslX, Berkeley Madonna, MATLAB, and R language) tested using 2 existing models (oxytetracycline and gold nanoparticles); (2) provide a tutorial of PBPK model code conversion from acslX to Berkeley Madonna, MATLAB, and R language; (3) discuss the advantages and disadvantages of each software package in the implementation of PBPK models in toxicology; and (4) share our perspective about future direction in this field. Simulation results of plasma/tissue concentrations/amounts of oxytetracycline and gold from different models were compared visually and statistically with linear regression analyses. Simulation results from the original models correlated well with results from the recoded models, with time-concentration/amount curves nearly superimposable and determination coefficients of 0.86-1.00. Step-by-step explanations of the recoding of the models in different software programs are provided in the Supplementary Data. In summary, this article presents a tutorial of PBPK model code conversion for a small molecule and a nanoparticle among 4 software packages, and a performance comparison of these software packages in PBPK model implementation. This tutorial helps beginners learn PBPK modeling, provides suggestions for selecting a suitable tool for future projects, and may lead to the transition from acslX to alternative modeling tools. © The Author 2017. Published by Oxford University Press on behalf of the Society of Toxicology. All rights reserved.

  3. The Usual and the Unusual: Solving Remote Associates Test Tasks Using Simple Statistical Natural Language Processing Based on Language Use

    ERIC Educational Resources Information Center

    Klein, Ariel; Badia, Toni

    2015-01-01

    In this study we show how complex creative relations can arise from fairly frequent semantic relations observed in everyday language. By doing this, we reflect on some key cognitive aspects of linguistic and general creativity. In our experimentation, we automated the process of solving a battery of Remote Associates Test tasks. By applying…

  4. A Sociolinguistic Survey of Araki: A Dying Language of Vanuatu

    ERIC Educational Resources Information Center

    Vari-Bogiri, Hannah

    2005-01-01

    Araki is one of around a hundred languages of the Republic of Vanuatu. It is a language spoken by the people of Araki, an islet situated near the south western part of Santo, in the north of Vanuatu. Linguistic statistics have shown a gradual decline in the number of speakers. This study presents evidence to show that Araki is a declining language…

  5. Relationship between affect and achievement in science and mathematics in Malaysia and Singapore

    NASA Astrophysics Data System (ADS)

    Thoe Ng, Khar; Fah Lay, Yoon; Areepattamannil, Shaljan; Treagust, David F.; Chandrasegaran, A. L.

    2012-11-01

    Background: The Trends in International Mathematics and Science Study (TIMSS) assesses the quality of the teaching and learning of science and mathematics among Grades 4 and 8 students across participating countries. Purpose: This study explored the relationship between positive affect towards science and mathematics and achievement in science and mathematics among Malaysian and Singaporean Grade 8 students. Sample: In total, 4466 Malaysian students and 4599 Singaporean students from Grade 8 who participated in TIMSS 2007 were involved in this study. Design and method: Students' achievement scores on eight items in the survey instrument that were reported in TIMSS 2007 were used as the dependent variable in the analysis. Students' scores on four items in the TIMSS 2007 survey instrument pertaining to students' affect towards science and mathematics together with students' gender, language spoken at home and parental education were used as the independent variables. Results: Positive affect towards science and mathematics indicated statistically significant predictive effects on achievement in the two subjects for both Malaysian and Singaporean Grade 8 students. There were statistically significant predictive effects on mathematics achievement for the students' gender, language spoken at home and parental education for both Malaysian and Singaporean students, with R² = 0.18 and 0.21, respectively. However, only parental education showed statistically significant predictive effects on science achievement for both countries. For Singapore, language spoken at home also demonstrated statistically significant predictive effects on science achievement, whereas gender did not. For Malaysia, neither gender nor language spoken at home had statistically significant predictive effects on science achievement. Conclusions: It is important for educators to consider implementing self-concept enhancement intervention programmes by incorporating 'affect' components of academic self-concept in order to develop students' talents and promote academic excellence in science and mathematics.

  6. False-Belief Understanding and Language Ability Mediate the Relationship between Emotion Comprehension and Prosocial Orientation in Preschoolers

    PubMed Central

    Ornaghi, Veronica; Pepe, Alessandro; Grazzani, Ilaria

    2016-01-01

    Emotion comprehension (EC) is known to be a key correlate and predictor of prosociality from early childhood. In the present study, we examined this relationship within the broad theoretical construct of social understanding which includes a number of socio-emotional skills, as well as cognitive and linguistic abilities. Theory of mind, especially false-belief understanding, has been found to be positively correlated with both EC and prosocial orientation. Similarly, language ability is known to play a key role in children’s socio-emotional development. The combined contribution of false-belief understanding and language to explaining the relationship between EC and prosociality has yet to be investigated. Thus, in the current study, we conducted an in-depth exploration of how preschoolers’ false-belief understanding and language ability each contribute to modeling the relationship between children’s comprehension of emotion and their disposition to act prosocially toward others, after controlling for age and gender. Participants were 101 4- to 6-year-old children (54% boys), who were administered measures of language ability, false-belief understanding, EC and prosocial orientation. Multiple mediation analysis of the data suggested that false-belief understanding and language ability jointly and fully mediated the effect of preschoolers’ EC on their prosocial orientation. Analysis of covariates revealed that gender exerted no statistically significant effect, while age had a trivial positive effect. Theoretical and practical implications of the findings are discussed. PMID:27774075

  7. Association between maternal acculturation and health beliefs related to oral health of Latino children.

    PubMed

    Tiwari, Tamanna; Mulvahill, Matthew; Wilson, Anne; Rai, Nayanjot; Albino, Judith

    2018-04-24

    This report presents the association between maternal acculturation, measured by preferred language, and oral health-related psychosocial measures in an urban Latino population. A cross-sectional survey was conducted with 100 mother-child dyads from the Dental Center at the Children's Hospital Colorado, the University of Colorado. A portion of the Basic Research Factors Questionnaire capturing information about parental dental knowledge, attitudes, behavior and psychosocial measures was used to collect data from the participating mothers. Descriptive statistics were calculated for demographics and psychosocial measures by acculturation. A univariate linear regression model was fitted for each measure by preferred language for the primary analysis, followed by a model adjusted for parent's education. The mean age of the children was 3.99 years (SD = 1.11), and that of the mothers was 29.54 years (SD = 9.62). Dental caries, measured as dmfs, was significantly higher in children of Spanish-speaking mothers compared to children of English-speaking mothers. English-speaking mothers had higher mean scores of oral health knowledge, oral health behaviors, knowledge on dental utilization, self-efficacy, and Oral Health Locus of Control as compared to Spanish-speaking mothers. Univariate analysis demonstrated a significant association of preference for the Spanish language with knowledge on dental utilization, maternal self-efficacy, perceived susceptibility and perceived barriers. The effect of language was attenuated, but remained significant, for each of these variables after adjusting for parent's education. This study reported that higher acculturation, measured by a preference for the English language, had a positive association with oral health outcomes in children. Spanish-speaking mothers perceived that their children were less susceptible to caries. Additionally, they perceived barriers in visiting the dentist for preventive visits.

  8. Reconciling phonological neighborhood effects in speech production through single trial analysis.

    PubMed

    Sadat, Jasmin; Martin, Clara D; Costa, Albert; Alario, F-Xavier

    2014-02-01

    A crucial step for understanding how lexical knowledge is represented is to describe the relative similarity of lexical items, and how it influences language processing. Previous studies of the effects of form similarity on word production have reported conflicting results, notably within and across languages. The aim of the present study was to clarify this empirical issue to provide specific constraints for theoretical models of language production. We investigated the role of phonological neighborhood density in a large-scale picture naming experiment using fine-grained statistical models. The results showed that increasing phonological neighborhood density has a detrimental effect on naming latencies, and re-analyses of independently obtained data sets provide supplementary evidence for this effect. Finally, we reviewed a large body of evidence concerning phonological neighborhood density effects in word production, and discussed the occurrence of facilitatory and inhibitory effects in accuracy measures. The overall pattern shows that phonological neighborhood generates two opposite forces, one facilitatory and one inhibitory. In cases where speech production is disrupted (e.g. certain aphasic symptoms), the facilitatory component may emerge, but inhibitory processes dominate in efficient naming by healthy speakers. These findings are difficult to accommodate in terms of monitoring processes, but can be explained within interactive activation accounts combining phonological facilitation and lexical competition. Copyright © 2013 Elsevier Inc. All rights reserved.

  9. Primary care physician supply and other key determinants of health care utilisation: the case of Switzerland

    PubMed Central

    Busato, André; Künzi, Beat

    2008-01-01

    Background: The Swiss government decided to freeze new accreditations for physicians in private practice in Switzerland based on the assumption that demand-induced health care spending may be cut by limiting care offers. This legislation initiated an ongoing controversial public debate in Switzerland. The aim of this study is therefore the determination of socio-demographic and health system-related factors of per capita consultation rates with primary care physicians in the multicultural population of Switzerland. Methods: The data were derived from the complete claims data of Swiss health insurers for 2004 and included 21.4 million consultations provided by 6564 Swiss primary care physicians on a fee-for-service basis. Socio-demographic data were obtained from the Swiss Federal Statistical Office. Utilisation-based health service areas were created and were used as observational units for statistical procedures. Multivariate and hierarchical models were applied to analyze the data. Results: Models within the study allowed the definition of 1018 primary care service areas with a median population of 3754 and an average per capita consultation rate of 2.95 per year. Statistical models yielded significant effects for various geographical, socio-demographic and cultural factors. The regional density of physicians in independent practice was also significantly associated with annual consultation rates and indicated an associated increase of 0.10 for each additional primary care physician in a population of 10,000 inhabitants. Considerable differences across Swiss language regions were observed with reference to the supply of ambulatory health resources provided either by primary care physicians, specialists, or hospital-based ambulatory care. Conclusion: The study documents a large small-area variation in utilisation and provision of health care resources in Switzerland. Effects of physician density appeared to be strongly related to Swiss language regions and may be rooted in the different cultural backgrounds of the served populations. PMID:18190705

  10. Primary care physician supply and other key determinants of health care utilisation: the case of Switzerland.

    PubMed

    Busato, André; Künzi, Beat

    2008-01-11

    The Swiss government decided to freeze new accreditations for physicians in private practice in Switzerland based on the assumption that demand-induced health care spending may be cut by limiting care offers. This legislation initiated an ongoing controversial public debate in Switzerland. The aim of this study is therefore the determination of socio-demographic and health system-related factors of per capita consultation rates with primary care physicians in the multicultural population of Switzerland. The data were derived from the complete claims data of Swiss health insurers for 2004 and included 21.4 million consultations provided by 6564 Swiss primary care physicians on a fee-for-service basis. Socio-demographic data were obtained from the Swiss Federal Statistical Office. Utilisation-based health service areas were created and were used as observational units for statistical procedures. Multivariate and hierarchical models were applied to analyze the data. Models within the study allowed the definition of 1018 primary care service areas with a median population of 3754 and an average per capita consultation rate of 2.95 per year. Statistical models yielded significant effects for various geographical, socio-demographic and cultural factors. The regional density of physicians in independent practice was also significantly associated with annual consultation rates and indicated an associated increase of 0.10 for each additional primary care physician in a population of 10,000 inhabitants. Considerable differences across Swiss language regions were observed with reference to the supply of ambulatory health resources provided either by primary care physicians, specialists, or hospital-based ambulatory care. The study documents a large small-area variation in utilisation and provision of health care resources in Switzerland. Effects of physician density appeared to be strongly related to Swiss language regions and may be rooted in the different cultural backgrounds of the served populations.

  11. Language extinction and linguistic fronts

    PubMed Central

    Isern, Neus; Fort, Joaquim

    2014-01-01

    Language diversity has become greatly endangered in the past centuries owing to processes of language shift from indigenous languages to other languages that are seen as socially and economically more advantageous, resulting in the death or doom of minority languages. In this paper, we define a new language competition model that can describe the historical decline of minority languages in competition with more advantageous languages. We then implement this non-spatial model as an interaction term in a reaction–diffusion system to model the evolution of the two competing languages. We use the results to estimate the speed at which the more advantageous language spreads geographically, resulting in the shrinkage of the area of dominance of the minority language. We compare the results from our model with the observed retreat in the area of influence of the Welsh language in the UK, obtaining a good agreement between the model and the observed data. PMID:24598207

  12. Statistical Learning of Phonetic Categories: Insights from a Computational Approach

    ERIC Educational Resources Information Center

    McMurray, Bob; Aslin, Richard N.; Toscano, Joseph C.

    2009-01-01

    Recent evidence (Maye, Werker & Gerken, 2002) suggests that statistical learning may be an important mechanism for the acquisition of phonetic categories in the infant's native language. We examined the sufficiency of this hypothesis and its implications for development by implementing a statistical learning mechanism in a computational model…

  13. The possibility of coexistence and co-development in language competition: ecology-society computational model and simulation.

    PubMed

    Yun, Jian; Shang, Song-Chao; Wei, Xiao-Dan; Liu, Shuang; Li, Zhi-Jie

    2016-01-01

    Language is characterized by both ecological properties and social properties, and competition is the basic form of language evolution. The rise and decline of one language is a result of competition between languages. Moreover, this rise and decline directly influences the diversity of human culture. Mathematical and computer modeling of language competition has been a popular topic in the fields of linguistics, mathematics, computer science, ecology, and other disciplines. Currently, there are several problems in the research on language competition modeling. First, comprehensive mathematical analysis is absent in most studies of language competition models. Second, most language competition models are based on the assumption that one language in the model is stronger than the other. These studies tend to ignore cases where there is a balance of power in the competition. The competition between two well-matched languages is more practical, because it can facilitate the co-development of two languages. Third, many studies arrive at an evolution result in which the weaker language inevitably goes extinct. From the integrated point of view of ecology and sociology, this paper improves the Lotka-Volterra model and the basic reaction-diffusion model to propose an "ecology-society" computational model for describing language competition. Furthermore, a strict and comprehensive mathematical analysis was made for the stability of the equilibria. Two languages in competition may be either well-matched or greatly different in strength, which was reflected in the experimental design. The results revealed that language coexistence, and even co-development, are likely to occur during language competition.
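
    As a rough illustration of the kind of competition dynamics this abstract refers to, the sketch below integrates the classical two-population Lotka-Volterra competition equations. It is not the paper's improved "ecology-society" model, which the abstract does not specify. The function name, parameter names (a12, a21, r1, r2, k1, k2) and all numeric values are assumptions chosen for illustration only.

```python
def simulate_competition(x0, y0, a12, a21, r1=1.0, r2=1.0, k1=1.0, k2=1.0,
                         dt=0.01, steps=20000):
    """Forward-Euler integration of the classical Lotka-Volterra competition model.

    x, y   : speaker populations of the two languages (hypothetical units)
    a12    : competitive pressure of language 2 on language 1
    a21    : competitive pressure of language 1 on language 2
    r1, r2 : intrinsic growth rates; k1, k2 : carrying capacities
    """
    x, y = x0, y0
    for _ in range(steps):
        dx = r1 * x * (1.0 - (x + a12 * y) / k1)
        dy = r2 * y * (1.0 - (y + a21 * x) / k2)
        x, y = x + dt * dx, y + dt * dy
    return x, y


if __name__ == "__main__":
    # Well-matched languages with weak mutual pressure: both persist.
    print(simulate_competition(0.1, 0.8, a12=0.5, a21=0.5))
    # Language 1 under strong pressure from language 2: language 1 dies out.
    print(simulate_competition(0.5, 0.5, a12=1.5, a21=0.5))
```

    With equal carrying capacities and a12 = a21 = 0.5, the coexistence equilibrium of this classical model is x = y = (1 - 0.5) / (1 - 0.25) = 2/3, which is the well-matched case the abstract emphasizes. Raising a12 above 1 removes that equilibrium and the first language declines toward zero.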

  14. Quantifying the driving factors for language shift in a bilingual region.

    PubMed

    Prochazka, Katharina; Vogl, Gero

    2017-04-25

    Many of the world's roughly 6,000 languages are in danger of disappearing as people give up use of a minority language in favor of the majority language in a process called language shift. Language shift can be monitored on a large scale through the use of mathematical models by way of differential equations, for example, reaction-diffusion equations. Here, we use a different approach: we propose a model for language dynamics based on the principles of cellular automata/agent-based modeling and combine it with very detailed empirical data. Our model makes it possible to follow language dynamics over space and time, whereas existing models based on differential equations average over space and consequently provide no information on local changes in language use. Additionally, cellular automata models can be used even in cases where models based on differential equations are not applicable, for example, in situations where one language has become dispersed and retreated to language islands. Using data from a bilingual region in Austria, we show that the most important factor in determining the spread and retreat of a language is the interaction with speakers of the same language. External factors like bilingual schools or parish language have only a minor influence.
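
    To make the cellular-automaton idea concrete, here is a minimal sketch under strong simplifying assumptions. It is not the authors' model and uses none of their data. Each grid cell holds one language (0 or 1) and switches when speakers of the other language outnumber same-language speakers among its immediate neighbours. The update rule, the grid sizes and the function names are hypothetical.

```python
def step(grid):
    """One synchronous update: a cell switches language when more of its
    (up to eight) neighbours speak the other language than its own."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            same = other = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        if grid[rr][cc] == grid[r][c]:
                            same += 1
                        else:
                            other += 1
            if other > same:
                new[r][c] = 1 - grid[r][c]
    return new


def make_island(size, lo, hi):
    """Majority language 0 everywhere, minority language 1 in rows/cols lo..hi."""
    return [[1 if lo <= r <= hi and lo <= c <= hi else 0 for c in range(size)]
            for r in range(size)]


if __name__ == "__main__":
    grid = make_island(7, 2, 4)  # a 3x3 "language island" in a 7x7 region
    for t in range(4):
        print(t, sum(map(sum, grid)), "minority cells")
        grid = step(grid)
```

    Under this toy rule a 3x3 island loses its corners first and disappears within three updates, because cells at the edge of the island have too few same-language neighbours. That loosely mirrors the abstract's point that contact with speakers of the same language is the dominant factor, although the real model is calibrated on empirical data and is considerably richer.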

  15. Implementation and Testing of Turbulence Models for the F18-HARV Simulation

    NASA Technical Reports Server (NTRS)

    Yeager, Jessie C.

    1998-01-01

    This report presents three methods of implementing the Dryden power spectral density model for atmospheric turbulence. Included are the equations which define the three methods and computer source code written in Advanced Continuous Simulation Language to implement the equations. Time-history plots and sample statistics of simulated turbulence results from executing the code in a test program are also presented. Power spectral densities were computed for sample sequences of turbulence and are plotted for comparison with the Dryden spectra. The three model implementations were installed in a nonlinear six-degree-of-freedom simulation of the High Alpha Research Vehicle airplane. Aircraft simulation responses to turbulence generated with the three implementations are presented as plots.

  16. Modeling the Development of Audiovisual Cue Integration in Speech Perception

    PubMed Central

    Getz, Laura M.; Nordeen, Elke R.; Vrabic, Sarah C.; Toscano, Joseph C.

    2017-01-01

    Adult speech perception is generally enhanced when information is provided from multiple modalities. In contrast, infants do not appear to benefit from combining auditory and visual speech information early in development. This is true despite the fact that both modalities are important to speech comprehension even at early stages of language acquisition. How then do listeners learn how to process auditory and visual information as part of a unified signal? In the auditory domain, statistical learning processes provide an excellent mechanism for acquiring phonological categories. Is this also true for the more complex problem of acquiring audiovisual correspondences, which require the learner to integrate information from multiple modalities? In this paper, we present simulations using Gaussian mixture models (GMMs) that learn cue weights and combine cues on the basis of their distributional statistics. First, we simulate the developmental process of acquiring phonological categories from auditory and visual cues, asking whether simple statistical learning approaches are sufficient for learning multi-modal representations. Second, we use this time course information to explain audiovisual speech perception in adult perceivers, including cases where auditory and visual input are mismatched. Overall, we find that domain-general statistical learning techniques allow us to model the developmental trajectory of audiovisual cue integration in speech, and in turn, allow us to better understand the mechanisms that give rise to unified percepts based on multiple cues. PMID:28335558
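
    The statistical learning step described here can be sketched with a small expectation-maximization (EM) fit of a two-component Gaussian mixture to a single acoustic cue. This is only an illustration of distributional category learning, not the authors' simulations, which learn cue weights and combine auditory and visual cues. The cue values (a voice-onset-time-like measure in milliseconds), the sample sizes and the function names are all assumptions.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(data, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture; returns (weights, means, variances)."""
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    weights = [0.5, 0.5]
    means = [min(data), max(data)]       # start the two categories far apart
    variances = [grand_var, grand_var]
    for _ in range(n_iter):
        # E-step: posterior probability of each category for every token.
        resp = []
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = joint[0] + joint[1]
            resp.append((joint[0] / total, joint[1] / total))
        # M-step: re-estimate mixing weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against a collapsing component
    return weights, means, variances


def sample_cue_values(seed=0, n_per_category=500):
    """Hypothetical cue tokens drawn from two categories (means 0 and 50, sd 10)."""
    rng = random.Random(seed)
    return ([rng.gauss(0.0, 10.0) for _ in range(n_per_category)]
            + [rng.gauss(50.0, 10.0) for _ in range(n_per_category)])


if __name__ == "__main__":
    weights, means, variances = fit_two_gaussians(sample_cue_values())
    print("weights:", weights)
    print("means:", means)
    print("variances:", variances)
```

    The learner is never told which category a token belongs to; the two categories emerge from the bimodal distribution of the cue alone, which is the sense in which such models are called unsupervised statistical learners.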

  17. Modeling the Development of Audiovisual Cue Integration in Speech Perception.

    PubMed

    Getz, Laura M; Nordeen, Elke R; Vrabic, Sarah C; Toscano, Joseph C

    2017-03-21

    Adult speech perception is generally enhanced when information is provided from multiple modalities. In contrast, infants do not appear to benefit from combining auditory and visual speech information early in development. This is true despite the fact that both modalities are important to speech comprehension even at early stages of language acquisition. How then do listeners learn how to process auditory and visual information as part of a unified signal? In the auditory domain, statistical learning processes provide an excellent mechanism for acquiring phonological categories. Is this also true for the more complex problem of acquiring audiovisual correspondences, which require the learner to integrate information from multiple modalities? In this paper, we present simulations using Gaussian mixture models (GMMs) that learn cue weights and combine cues on the basis of their distributional statistics. First, we simulate the developmental process of acquiring phonological categories from auditory and visual cues, asking whether simple statistical learning approaches are sufficient for learning multi-modal representations. Second, we use this time course information to explain audiovisual speech perception in adult perceivers, including cases where auditory and visual input are mismatched. Overall, we find that domain-general statistical learning techniques allow us to model the developmental trajectory of audiovisual cue integration in speech, and in turn, allow us to better understand the mechanisms that give rise to unified percepts based on multiple cues.

  18. From Sensory Signals to Modality-Independent Conceptual Representations: A Probabilistic Language of Thought Approach

    PubMed Central

    Erdogan, Goker; Yildirim, Ilker; Jacobs, Robert A.

    2015-01-01

    People learn modality-independent, conceptual representations from modality-specific sensory signals. Here, we hypothesize that any system that accomplishes this feat will include three components: a representational language for characterizing modality-independent representations, a set of sensory-specific forward models for mapping from modality-independent representations to sensory signals, and an inference algorithm for inverting forward models—that is, an algorithm for using sensory signals to infer modality-independent representations. To evaluate this hypothesis, we instantiate it in the form of a computational model that learns object shape representations from visual and/or haptic signals. The model uses a probabilistic grammar to characterize modality-independent representations of object shape, uses a computer graphics toolkit and a human hand simulator to map from object representations to visual and haptic features, respectively, and uses a Bayesian inference algorithm to infer modality-independent object representations from visual and/or haptic signals. Simulation results show that the model infers identical object representations when an object is viewed, grasped, or both. That is, the model’s percepts are modality invariant. We also report the results of an experiment in which different subjects rated the similarity of pairs of objects in different sensory conditions, and show that the model provides a very accurate account of subjects’ ratings. Conceptually, this research significantly contributes to our understanding of modality invariance, an important type of perceptual constancy, by demonstrating how modality-independent representations can be acquired and used. Methodologically, it provides an important contribution to cognitive modeling, particularly an emerging probabilistic language-of-thought approach, by showing how symbolic and statistical approaches can be combined in order to understand aspects of human perception. PMID:26554704

  19. Do infants retain the statistics of a statistical learning experience? Insights from a developmental cognitive neuroscience perspective

    PubMed Central

    Gómez, Rebecca L.

    2017-01-01

    Statistical structure abounds in language. Human infants show a striking capacity for using statistical learning (SL) to extract regularities in their linguistic environments, a process thought to bootstrap their knowledge of language. Critically, studies of SL test infants in the minutes immediately following familiarization, but long-term retention unfolds over hours and days, with almost no work investigating retention of SL. This creates a critical gap in the literature given that we know little about how single or multiple SL experiences translate into permanent knowledge. Furthermore, different memory systems with vastly different encoding and retention profiles emerge at different points in development, with the underlying memory system dictating the fidelity of the memory trace hours later. I describe the scant literature on retention of SL, the learning and retention properties of memory systems as they apply to SL, and the development of these memory systems. I propose that different memory systems support retention of SL in infant and adult learners, suggesting an explanation for the slow pace of natural language acquisition in infancy. I discuss the implications of developing memory systems for SL and suggest that we exercise caution in extrapolating from adult to infant properties of SL. This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences’. PMID:27872372

  20. Do infants retain the statistics of a statistical learning experience? Insights from a developmental cognitive neuroscience perspective.

    PubMed

    Gómez, Rebecca L

    2017-01-05

    Statistical structure abounds in language. Human infants show a striking capacity for using statistical learning (SL) to extract regularities in their linguistic environments, a process thought to bootstrap their knowledge of language. Critically, studies of SL test infants in the minutes immediately following familiarization, but long-term retention unfolds over hours and days, with almost no work investigating retention of SL. This creates a critical gap in the literature given that we know little about how single or multiple SL experiences translate into permanent knowledge. Furthermore, different memory systems with vastly different encoding and retention profiles emerge at different points in development, with the underlying memory system dictating the fidelity of the memory trace hours later. I describe the scant literature on retention of SL, the learning and retention properties of memory systems as they apply to SL, and the development of these memory systems. I propose that different memory systems support retention of SL in infant and adult learners, suggesting an explanation for the slow pace of natural language acquisition in infancy. I discuss the implications of developing memory systems for SL and suggest that we exercise caution in extrapolating from adult to infant properties of SL. This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Author(s).

  1. A distinct language and a historic pendulum: the evolution of the Diagnostic and Statistical Manual of Mental Disorders.

    PubMed

    Sanders, James L

    2011-12-01

    Historically, the Diagnostic and Statistical Manual of Mental Disorders (DSM) has met an important need in defining a common language of psychiatric diagnosis in North America. Understanding the development of the DSM can help researchers and practitioners better understand this diagnostic language. The history of the DSM, from its precursors to recent proposed revisions for its fifth edition, is reviewed and compared while avoiding the presentist bias. The development of DSM resembles a historic pendulum, from DSM-I emphasizing psychodynamics and causality to DSM-III and DSM-IV emphasizing empiricism and logical positivism. The proposed changes in etiological- and dimensional-based classification for DSM-V represent a slight backswing toward the center. © 2011 Elsevier Inc. All rights reserved.

  2. University Administrators as Forced Language Policy Agents. An Institutional Ethnography of Parallel Language Strategy and Practices at the University of Copenhagen

    ERIC Educational Resources Information Center

    Siiner, Maarja

    2016-01-01

    Nation states increasingly assign the responsibility for meeting the global competitiveness agenda to the universities themselves [Cirius, 2009, "Mobilitetsstatistik for de videregaaende uddannelser 2007/08" [Mobility statistics for higher education 2007/08

  3. Assessing Creative Problem-Solving with Automated Text Grading

    ERIC Educational Resources Information Center

    Wang, Hao-Chuan; Chang, Chun-Yen; Li, Tsai-Yen

    2008-01-01

    The work aims to improve the assessment of creative problem-solving in science education by employing language technologies and computational-statistical machine learning methods to grade students' natural language responses automatically. To evaluate constructs like creative problem-solving with validity, open-ended questions that elicit…

  4. 78 FR 68987 - Guides for Private Vocational and Distance Education Schools

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-18

    ... placement assistance and assistance overcoming language barriers or learning disabilities, it will provide... learning, the source of funding for student loans, and security policies and crime statistics. In response... misrepresentations relating to student financial assistance, assistance overcoming language barriers or learning...

  5. The words children hear: Picture books and the statistics for language learning

    PubMed Central

    Montag, Jessica L.; Jones, Michael N.; Smith, Linda B.

    2015-01-01

    Young children learn language from the speech they hear. Previous work suggests that the statistical diversity of words and of linguistic contexts is associated with better language outcomes. One potential source of lexical diversity is the text of picture books that caregivers read aloud to children. Many parents begin reading to their children shortly after birth, so this is potentially an important source of linguistic input for many children. We constructed a corpus of 100 children’s picture books and compared word type and token counts to a matched sample of child-directed speech. Overall, the picture books contained more unique word types than the child-directed speech. Further, individual picture books generally contained more unique word types than length-matched, child-directed conversations. The text of picture books may be an important source of vocabulary for young children, and these findings suggest a mechanism that underlies the language benefits associated with reading to children. PMID:26243292
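
    The type and token comparison described in this abstract can be illustrated with a minimal Python sketch. The tokenizer, the function names, and both sample texts are assumptions made up for illustration; they are not the authors' corpus or method. Comparing only the first n tokens of each text is one simple way to approximate a length-matched comparison.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_counts(text):
    """Return (number of word tokens, number of unique word types)."""
    tokens = tokenize(text)
    return len(tokens), len(set(tokens))

def types_in_first_n(text, n):
    """Count unique word types among the first n tokens (length-matched comparison)."""
    return len(set(tokenize(text)[:n]))

# Hypothetical samples, invented for this sketch.
book_text = "A small red fox hid under the old wooden bridge"
speech_text = "Look at the ball. Where is the ball? Get the ball!"

print(type_token_counts(book_text))
print(type_token_counts(speech_text))
print(types_in_first_n(book_text, 10), types_in_first_n(speech_text, 10))
```

    In this toy case the repetitive conversational sample yields fewer types per token than the book-like sample, which is the kind of contrast the abstract reports at corpus scale.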

  6. The Words Children Hear: Picture Books and the Statistics for Language Learning.

    PubMed

    Montag, Jessica L; Jones, Michael N; Smith, Linda B

    2015-09-01

    Young children learn language from the speech they hear. Previous work suggests that greater statistical diversity of words and of linguistic contexts is associated with better language outcomes. One potential source of lexical diversity is the text of picture books that caregivers read aloud to children. Many parents begin reading to their children shortly after birth, so this is potentially an important source of linguistic input for many children. We constructed a corpus of 100 children's picture books and compared word type and token counts in that sample and a matched sample of child-directed speech. Overall, the picture books contained more unique word types than the child-directed speech. Further, individual picture books generally contained more unique word types than length-matched, child-directed conversations. The text of picture books may be an important source of vocabulary for young children, and these findings suggest a mechanism that underlies the language benefits associated with reading to children. © The Author(s) 2015.

  7. MPTinR: analysis of multinomial processing tree models in R.

    PubMed

    Singmann, Henrik; Kellen, David

    2013-06-01

    We introduce MPTinR, a software package developed for the analysis of multinomial processing tree (MPT) models. MPT models represent a prominent class of cognitive measurement models for categorical data with applications in a wide variety of fields. MPTinR is the first software for the analysis of MPT models in the statistical programming language R, providing a modeling framework that is more flexible than standalone software packages. MPTinR also introduces important features such as (1) the ability to calculate the Fisher information approximation measure of model complexity for MPT models, (2) the ability to fit models for categorical data outside the MPT model class, such as signal detection models, (3) a function for model selection across a set of nested and nonnested candidate models (using several model selection indices), and (4) multicore fitting. MPTinR is available from the Comprehensive R Archive Network at http://cran.r-project.org/web/packages/MPTinR/ .

  8. Usage of fMRI for pre-surgical planning in brain tumor and vascular lesion patients: Task and statistical threshold effects on language lateralization

    PubMed Central

    Nadkarni, Tanvi N.; Andreoli, Matthew J.; Nair, Veena A.; Yin, Peng; Young, Brittany M.; Kundu, Bornali; Pankratz, Joshua; Radtke, Andrew; Holdsworth, Ryan; Kuo, John S.; Field, Aaron S.; Baskaya, Mustafa K.; Moritz, Chad H.; Meyerand, M. Elizabeth; Prabhakaran, Vivek

    2014-01-01

    Background and purpose: Functional magnetic resonance imaging (fMRI) is a non-invasive pre-surgical tool used to assess localization and lateralization of language function in brain tumor and vascular lesion patients in order to guide neurosurgeons as they devise a surgical approach to treat these lesions. We investigated the effect of varying the statistical thresholds as well as the type of language tasks on functional activation patterns and language lateralization. We hypothesized that language lateralization indices (LIs) would be threshold- and task-dependent. Materials and methods: Imaging data were collected from brain tumor patients (n = 67, average age 48 years) and vascular lesion patients (n = 25, average age 43 years) who received pre-operative fMRI scanning. Both patient groups performed expressive (antonym and/or letter-word generation) and receptive (tumor patients performed text-reading; vascular lesion patients performed text-listening) language tasks. A control group (n = 25, average age 45 years) performed the letter-word generation task. Results: Brain tumor patients showed left-lateralization during the antonym-word generation and text-reading tasks at high threshold values and bilateral activation during the letter-word generation task, irrespective of the threshold values. Vascular lesion patients showed left-lateralization during the antonym and letter-word generation, and text-listening tasks at high threshold values. Conclusion: Our results suggest that the type of task and the applied statistical threshold influence LI and that the threshold effects on LI may be task-specific. Thus, identifying critical functional regions and computing LIs should be conducted on an individual subject basis, using a continuum of threshold values with different tasks to provide the most accurate information for surgical planning to minimize post-operative language deficits. PMID:25685705
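
    The abstract does not state how the lateralization index was computed. As a rough sketch, assuming the commonly used definition LI = (L - R) / (L + R) over counts of suprathreshold voxels in each hemisphere, the threshold dependence can be illustrated as follows. The function names and the voxel statistics are hypothetical.

```python
def lateralization_index(left, right):
    """LI = (L - R) / (L + R); +1 is fully left-lateralized, -1 fully right-lateralized.

    Returns None when no voxel survives in either hemisphere.
    """
    total = left + right
    if total == 0:
        return None
    return (left - right) / total

def li_curve(left_stats, right_stats, thresholds):
    """Compute the LI at each threshold from per-voxel statistic values."""
    curve = []
    for t in thresholds:
        left = sum(1 for v in left_stats if v > t)    # suprathreshold voxels, left
        right = sum(1 for v in right_stats if v > t)  # suprathreshold voxels, right
        curve.append((t, lateralization_index(left, right)))
    return curve

# Hypothetical per-voxel statistic values for one subject and one task.
left_hemisphere = [2.1, 3.5, 4.2, 5.0, 1.0]
right_hemisphere = [2.5, 3.1, 1.2, 0.8]

for threshold, li in li_curve(left_hemisphere, right_hemisphere, [2.0, 3.0, 4.0]):
    print(threshold, li)
```

    With these made-up values the index rises as the threshold increases, which illustrates why a single threshold can give a misleading picture and why a continuum of thresholds is informative.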

  9. Effects of speech and language treatment on recovery from aphasia.

    PubMed

    Shewan, C M; Kertesz, A

    1984-11-01

    Language recovery in aphasic patients who received one of three types of speech and language treatment was compared with that in aphasic patients who received no treatment. One hundred aphasic patients were followed from 2 to 4 weeks postonset for 1 year or until recovery, using a standardized test battery administered at systematic intervals. Both treatment methods provided by trained speech-language pathologists were efficacious, while the method provided by trained nonprofessionals approached statistical significance. Small group size prevented resolution of the question of whether one type of treatment was superior to another.

  10. Dialect Density in Bilingual Puerto Rican Spanish-English Speaking Children

    PubMed Central

    Fabiano-Smith, Leah; Shuriff, Rebecca; Barlow, Jessica A.; Goldstein, Brian A.

    2014-01-01

    It is still largely unknown how the two phonological systems of bilingual children interact. In this exploratory study, we examine children's use of dialect features to determine how their speech sound systems interact. Six monolingual Puerto Rican Spanish-speaking children and 6 bilingual Puerto Rican Spanish-English speaking children, ages 5-7 years, were included in the current study. Children's single word productions were analyzed for (1) dialect density and (2) frequency of occurrence of dialect features (after Oetting & McDonald, 2002). Nonparametric statistical analyses were used to examine differences within and across language groups. Results indicated that monolinguals and bilinguals exhibited similar dialect density, but differed on the types of dialect features used. Findings are discussed within the theoretical framework of the Dual Systems Model (Paradis, 2001) of language acquisition in bilingual children. PMID:25009677

  11. Why segmentation matters: experience-driven segmentation errors impair “morpheme” learning

    PubMed Central

    Finn, Amy S.; Hudson Kam, Carla L.

    2015-01-01

    We ask whether an adult learner’s knowledge of their native language impedes statistical learning in a new language beyond just word segmentation (as previously shown). In particular, we examine the impact of native-language word-form phonotactics on learners’ ability to segment words into their component morphemes and learn phonologically triggered variation of morphemes. We find that learning is impaired when words and component morphemes are structured to conflict with a learner’s native-language phonotactic system, but not when native-language phonotactics do not conflict with morpheme boundaries in the artificial language. A learner’s native-language knowledge can therefore have a cascading impact affecting word segmentation and the morphological variation that relies upon proper segmentation. These results show that getting word segmentation right early in learning is deeply important for learning other aspects of language, even those (morphology) that are known to pose a great difficulty for adult language learners. PMID:25730305

  12. The emergence of Zipf's law - Spontaneous encoding optimization by users of a command language

    NASA Technical Reports Server (NTRS)

    Ellis, S. R.; Hitchcock, R. J.

    1986-01-01

    The distribution of commands issued by experienced users of a computer operating system allowing command customization tends to conform to Zipf's law. This result documents the emergence of a statistical property of natural language as users master an artificial language. Analysis of Zipf's law by Mandelbrot and Cherry shows that its emergence in the computer interaction of experienced users may be interpreted as evidence that these users optimize their encoding of commands. Accordingly, the extent to which users of a command language exhibit Zipf's law can provide a metric of the naturalness and efficiency with which that language is used.
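
    Zipf's law says that the frequency of the item at rank r is roughly proportional to 1/r^s, with s close to 1. A minimal sketch of how conformity to the law might be checked on a command log is to fit a straight line to log frequency against log rank and read off the slope. The least-squares fit, the function name, and the toy command log are assumptions for illustration, not the procedure used in the report.

```python
import math
from collections import Counter

def zipf_exponent(items):
    """Estimate the Zipf exponent s by least squares on log(frequency) vs log(rank)."""
    counts = sorted(Counter(items).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct items to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # negate the slope so that an ideal Zipf distribution gives s = 1

# Hypothetical command log whose counts are exactly 120 / rank.
command_log = ["ls"] * 120 + ["cd"] * 60 + ["cat"] * 40 + ["grep"] * 30
print(zipf_exponent(command_log))
```

    An exponent near 1 with a good linear fit would be consistent with Zipf's law; real logs are noisier, and a least-squares fit on log-log data is only a rough diagnostic.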

  13. Quantifying the driving factors for language shift in a bilingual region

    PubMed Central

    Prochazka, Katharina; Vogl, Gero

    2017-01-01

    Many of the world’s around 6,000 languages are in danger of disappearing as people give up use of a minority language in favor of the majority language in a process called language shift. Language shift can be monitored on a large scale through the use of mathematical models by way of differential equations, for example, reaction–diffusion equations. Here, we use a different approach: we propose a model for language dynamics based on the principles of cellular automata/agent-based modeling and combine it with very detailed empirical data. Our model makes it possible to follow language dynamics over space and time, whereas existing models based on differential equations average over space and consequently provide no information on local changes in language use. Additionally, cellular automata models can be used even in cases where models based on differential equations are not applicable, for example, in situations where one language has become dispersed and retreated to language islands. Using data from a bilingual region in Austria, we show that the most important factor in determining the spread and retreat of a language is the interaction with speakers of the same language. External factors like bilingual schools or parish language have only a minor influence. PMID:28298530
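
    To illustrate the general idea of a cellular automaton in which language use depends on interaction with nearby speakers of the same language, here is a deliberately simple toy in Python. It is not the authors' model: the grid, the four-cell neighbourhood, the synchronous update, and the switching rule (a cell switches when fewer than half of its neighbours share its language) are all assumptions chosen for brevity.

```python
def step(grid):
    """Apply one synchronous update to a grid of 'A' / 'B' speakers."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                grid[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            same = sum(1 for n in neighbours if n == grid[r][c])
            # Switch language when fewer than half of the neighbours share it.
            if same * 2 < len(neighbours):
                new_grid[r][c] = "B" if grid[r][c] == "A" else "A"
    return new_grid

# An isolated minority speaker is absorbed by the surrounding majority ...
isolated = [list("AAA"), list("ABA"), list("AAA")]
print(step(isolated))

# ... while a compact 2x2 "language island" is stable under this rule.
island = [list("AAAA"), list("ABBA"), list("ABBA"), list("AAAA")]
print(step(island) == island)
```

    Even this toy keeps track of where each language is spoken, which is the kind of local information that spatially averaged differential-equation models do not provide.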

  14. Distributional structure in language: Contributions to noun–verb difficulty differences in infant word recognition

    PubMed Central

    Willits, Jon A.; Seidenberg, Mark S.; Saffran, Jenny R.

    2014-01-01

    What makes some words easy for infants to recognize, and other words difficult? We addressed this issue in the context of prior results suggesting that infants have difficulty recognizing verbs relative to nouns. In this work, we highlight the role played by the distributional contexts in which nouns and verbs occur. Distributional statistics predict that English nouns should generally be easier to recognize than verbs in fluent speech. However, there are situations in which distributional statistics provide similar support for verbs. The statistics for verbs that occur with the English morpheme –ing, for example, should facilitate verb recognition. In two experiments with 7.5- and 9.5-month-old infants, we tested the importance of distributional statistics for word recognition by varying the frequency of the contextual frames in which verbs occur. The results support the conclusion that distributional statistics are utilized by infant language learners and contribute to noun–verb differences in word recognition. PMID:24908342

  15. A model of the mechanisms of language extinction and revitalization strategies to save endangered languages.

    PubMed

    Fernando, Chrisantha; Valijärvi, Riitta-Liisa; Goldstein, Richard A

    2010-02-01

    Why and how have languages died out? We have devised a mathematical model to help us understand how languages go extinct. We use the model to ask whether language extinction can be prevented in the future and why it may have occurred in the past. A growing number of mathematical models of language dynamics have been developed to study the conditions for language coexistence and death, yet their phenomenological approach compromises their ability to influence language revitalization policy. In contrast, here we model the mechanisms underlying language competition and look at how these mechanisms are influenced by specific language revitalization interventions, namely, private interventions to raise the status of the language and thus promote language learning at home, public interventions to increase the use of the minority language, and explicit teaching of the minority language in schools. Our model reveals that it is possible to preserve a minority language but that continued long-term interventions will likely be necessary. We identify the parameters that determine which interventions work best under certain linguistic and societal circumstances. In this way the efficacy of interventions of various types can be identified and predicted. Although there are qualitative arguments for these parameter values (e.g., the responsiveness of children to learning a language as a function of the proportion of conversations heard in that language, the relative importance of conversations heard in the family and elsewhere, and the amplification of spoken to heard conversations of the high-status language because of the media), extensive quantitative data are lacking in this field. We propose a way to measure these parameters, allowing our model, as well as other models in the field, to be validated.

  16. Linking Language Assessments: An Example in a Low Stakes Context.

    ERIC Educational Resources Information Center

    North, Brian

    2000-01-01

    Linking language assessments is a matter of greater concern with the advent of educational frameworks used to orient curricula and profile attainment. Outlines practical ways the principles of techniques recognized for linking separate assessments--equating, calibrating, statistical moderation, predicting (or benchmarking), and social moderation…

  17. Können Computer das Sprachproblem lösen (Can Computers Solve the Language Problem)?

    ERIC Educational Resources Information Center

    Zeilinger, Michael

    1972-01-01

    Various computer applications in linguistics, primarily speech synthesis and machine translation, are reviewed. Although the computer proves useful for statistics, dictionary building and programmed instruction, the promulgation of a world auxiliary language is considered a more human and practical solution to the international communication…

  18. Language Learning Strategy Use and Reading Achievement

    ERIC Educational Resources Information Center

    Ghafournia, Narjes

    2014-01-01

    The current study investigated the differences across the varying levels of EFL learners in the frequency and choice of learning strategies. Using a reading test, questionnaire, and parametric statistical analysis, the study found discrepancies among the participants in the implementation of language-learning strategies concerning their…

  19. The Evolution of the Language Laboratory: Changes During Fifteen Years of Operation

    ERIC Educational Resources Information Center

    Stack, Edward M.

    1977-01-01

    This article summarizes conditions and changes in language laboratories. Types of laboratories and equipment are listed; laboratory personnel include technicians, librarians and student assistants. Most maintenance was done by institution personnel; student use is outlined. Professional attitudes and equipment statistics are surveyed. (CHK)

  20. Real English Project Report.

    ERIC Educational Resources Information Center

    Cautin, Harvey; Regan, Edward

    Requirements are discussed for an information retrieval language that enables users to employ natural language sentences in interaction with computer-stored files. Anticipated modes of operation of the system are outlined. These are: the search mode, the dictionary mode, the tables mode, and the statistical mode. Analysis of sample sentences…

  1. Hemophilia Data and Statistics

    MedlinePlus

    ... genetic testing is done to diagnose hemophilia before birth. For the one-third ... rates and hospitalization rates for bleeding complications from hemophilia ...

  2. From Statistics to Meaning: Infants’ Acquisition of Lexical Categories

    PubMed Central

    Lany, Jill; Saffran, Jenny R.

    2013-01-01

    Infants are highly sensitive to statistical patterns in their auditory language input that mark word categories (e.g., noun and verb). However, it is unknown whether experience with these cues facilitates the acquisition of semantic properties of word categories. In a study testing this hypothesis, infants first listened to an artificial language in which word categories were reliably distinguished by statistical cues (experimental group) or in which these properties did not cue category membership (control group). Both groups were then trained on identical pairings between the words and pictures from two categories (animals and vehicles). Only infants in the experimental group learned the trained associations between specific words and pictures. Moreover, these infants generalized the pattern to include novel pairings. These results suggest that experience with statistical cues marking lexical categories sets the stage for learning the meanings of individual words and for generalizing meanings to new category members. PMID:20424058

  3. A Comparison and Evaluation of Real-Time Software Systems Modeling Languages

    NASA Technical Reports Server (NTRS)

    Evensen, Kenneth D.; Weiss, Kathryn Anne

    2010-01-01

    A model-driven approach to real-time software systems development enables the conceptualization of software, fostering a more thorough understanding of its often complex architecture and behavior while promoting the documentation and analysis of concerns common to real-time embedded systems such as scheduling, resource allocation, and performance. Several modeling languages have been developed to assist in the model-driven software engineering effort for real-time systems, and these languages are beginning to gain traction with practitioners throughout the aerospace industry. This paper presents a survey of several real-time software system modeling languages, namely the Architectural Analysis and Design Language (AADL), the Unified Modeling Language (UML), Systems Modeling Language (SysML), the Modeling and Analysis of Real-Time Embedded Systems (MARTE) UML profile, and the AADL for UML profile. Each language has its advantages and disadvantages, and in order to adequately describe a real-time software system's architecture, a complementary use of multiple languages is almost certainly necessary. This paper aims to explore these languages in the context of understanding the value each brings to the model-driven software engineering effort and to determine if it is feasible and practical to combine aspects of the various modeling languages to achieve more complete coverage in architectural descriptions. To this end, each language is evaluated with respect to a set of criteria such as scope, formalisms, and architectural coverage. An example is used to help illustrate the capabilities of the various languages.

  4. Electrophysiological Evidence of Heterogeneity in Visual Statistical Learning in Young Children with ASD

    ERIC Educational Resources Information Center

    Jeste, Shafali S.; Kirkham, Natasha; Senturk, Damla; Hasenstab, Kyle; Sugar, Catherine; Kupelian, Chloe; Baker, Elizabeth; Sanders, Andrew J.; Shimizu, Christina; Norona, Amanda; Paparella, Tanya; Freeman, Stephanny F. N.; Johnson, Scott P.

    2015-01-01

    Statistical learning is characterized by detection of regularities in one's environment without an awareness or intention to learn, and it may play a critical role in language and social behavior. Accordingly, in this study we investigated the electrophysiological correlates of visual statistical learning in young children with autism…

  5. Value Production in a Collaborative Environment. Sociophysical Studies of Wikipedia

    NASA Astrophysics Data System (ADS)

    Yasseri, Taha; Kertész, János

    2013-05-01

    We review some recent endeavors and add some new results to characterize and understand underlying mechanisms in Wikipedia (WP), the paradigmatic example of collaborative value production. We analyzed the statistics of editorial activity in different languages and observed typical circadian and weekly patterns, which enabled us to estimate the geographical origins of contributions to WPs in languages spoken in several time zones. Using a recently introduced measure we showed that the editorial activities have intrinsic dependencies in the burstiness of events. A comparison of the English and Simple English WPs revealed important aspects of language complexity and showed how peer cooperation solved the task of enhancing readability. One of our focus issues was characterizing the conflicts or edit wars in WPs, which helped us to automatically filter out controversial pages. When studying the temporal evolution of the controversiality of such pages we identified typical patterns and classified conflicts accordingly. Our quantitative analysis provides the basis for modeling conflicts and their resolution in collaborative environments and contributes to the understanding of this issue, which becomes increasingly important with the development of information communication technology.

  6. Expanding access to high-quality plain-language patient education information through context-specific hyperlinks

    PubMed Central

    Ancker, Jessica S.; Mauer, Elizabeth; Hauser, Diane; Calman, Neil

    2016-01-01

    Medical records, which are increasingly directly accessible to patients, contain highly technical terms unfamiliar to many patients. A federally qualified health center (FQHC) sought to help patients interpret their records by embedding context-specific hyperlinks to plain-language patient education materials in its portal. We assessed the impact of this innovation through a 3-year retrospective cohort study. A total of 12,877 (10% of all patients) in this safety net population had used the MPC links. Black patients, Latino patients comfortable using English, and patients covered by Medicaid were more likely to use the informational hyperlinks than other patients. The positive association with black race and Latino ethnicity remained statistically significant in multivariable models that controlled for insurance type. We conclude that many of the sociodemographic factors associated with the digital divide do not present barriers to accessing context-specific patient education information once in the portal. In fact, this type of highly convenient plain-language patient education may provide particular value to patients in traditionally disadvantaged groups. PMID:28269821

  7. A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation

    PubMed Central

    2016-01-01

    Chinese and Vietnamese are both isolating languages in which words are not delimited by spaces. In machine translation, word segmentation is often done first when translating from Chinese or Vietnamese into other languages (typically English) and vice versa. However, when translating between two languages that do not mark word boundaries with spaces, such as Chinese and Vietnamese, it is an open question whether words should be segmented at all. Since Chinese-Vietnamese is a low-resource language pair, the sparse data problem is evident in translation systems for this pair, which makes the choice of whether to segment more important. In this paper, we propose a new method for translating Chinese to Vietnamese that combines the advantages of character-level and word-level translation. At the word level, a hybrid approach that combines statistics and rules is used; at the character level, statistical translation is used. The experimental results showed that our method improved the performance of machine translation over that of character or word level translation. PMID:27446207

  8. Language learners privilege structured meaning over surface frequency

    PubMed Central

    Culbertson, Jennifer; Adger, David

    2014-01-01

    Although it is widely agreed that learning the syntax of natural languages involves acquiring structure-dependent rules, recent work on acquisition has nevertheless attempted to characterize the outcome of learning primarily in terms of statistical generalizations about surface distributional information. In this paper we investigate whether surface statistical knowledge or structural knowledge of English is used to infer properties of a novel language under conditions of impoverished input. We expose learners to artificial-language patterns that are equally consistent with two possible underlying grammars—one more similar to English in terms of the linear ordering of words, the other more similar on abstract structural grounds. We show that learners’ grammatical inferences overwhelmingly favor structural similarity over preservation of superficial order. Importantly, the relevant shared structure can be characterized in terms of a universal preference for isomorphism in the mapping from meanings to utterances. Whereas previous empirical support for this universal has been based entirely on data from cross-linguistic language samples, our results suggest it may reflect a deep property of the human cognitive system—a property that, together with other structure-sensitive principles, constrains the acquisition of linguistic knowledge. PMID:24706789

  9. Multicriteria framework for selecting a process modelling language

    NASA Astrophysics Data System (ADS)

    Scanavachi Moreira Campos, Ana Carolina; Teixeira de Almeida, Adiel

    2016-01-01

    The choice of process modelling language can affect business process management (BPM) since each modelling language shows different features of a given process and may limit the ways in which a process can be described and analysed. However, choosing the appropriate modelling language for process modelling has become a difficult task because of the availability of a large number of modelling languages and also due to the lack of guidelines on evaluating and comparing languages so as to assist in selecting the most appropriate one. This paper proposes a framework for selecting a modelling language in accordance with the purposes of modelling. This framework is based on the semiotic quality framework (SEQUAL) for evaluating process modelling languages and a multicriteria decision aid (MCDA) approach in order to select the most appropriate language for BPM. This study does not attempt to set out new forms of assessment and evaluation criteria, but does attempt to demonstrate how two existing approaches can be combined so as to solve the problem of selection of modelling language. The framework is described in this paper and then demonstrated by means of an example. Finally, the advantages and disadvantages of using SEQUAL and MCDA in an integrated manner are discussed.
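
    The abstract does not say which MCDA method the framework uses. As a minimal sketch of how quality scores on several criteria might be aggregated to rank candidate languages, the Python below uses a simple additive weighted sum, which is only one of many MCDA techniques and is an assumption here. The criterion names, weights, scores, and language labels are all hypothetical placeholders.

```python
def weighted_score(scores, weights):
    """Additive weighted sum of criterion scores (weights are expected to sum to 1)."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)

def rank_languages(candidates, weights):
    """Return (name, score) pairs sorted from best to worst overall score."""
    ranked = [(name, weighted_score(scores, weights)) for name, scores in candidates.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical weights reflecting the purposes of modelling.
weights = {"syntactic": 0.2, "semantic": 0.5, "pragmatic": 0.3}

# Hypothetical quality scores in [0, 1] for two placeholder languages.
candidates = {
    "language_A": {"syntactic": 0.8, "semantic": 0.6, "pragmatic": 0.5},
    "language_B": {"syntactic": 0.6, "semantic": 0.9, "pragmatic": 0.7},
}

for name, score in rank_languages(candidates, weights):
    print(name, round(score, 2))
```

    Changing the weights to reflect a different modelling purpose can change the ranking, which is the sense in which selection depends on the purposes of modelling.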

  10. The Effectiveness of Using Linguistic Classroom Activities in Teaching English Language in Developing the Skills of Oral Linguistic Performance and Decision Making Skill among Third Grade Intermediate Students in Makah

    ERIC Educational Resources Information Center

    Alshareef, Fahd Majed

    2016-01-01

    The study aimed to reveal the effectiveness of the use of certain classroom language activities in teaching English language in the development of oral linguistic performance and decision-making among intermediate third-grade students in Makah, and it revealed a statistically significant correlation relationship between the averages of the study…

  11. Representing spatial structure through maps and language: Lord of the Rings encodes the spatial structure of middle Earth.

    PubMed

    Louwerse, Max M; Benesh, Nick

    2012-01-01

    Spatial mental representations can be derived from linguistic and non-linguistic sources of information. This study tested whether these representations could be formed from statistical linguistic frequencies of city names, and to what extent participants differed in their performance when they estimated spatial locations from language or maps. In a computational linguistic study, we demonstrated that co-occurrences of cities in Tolkien's Lord of the Rings trilogy and The Hobbit predicted the authentic longitude and latitude of those cities in Middle Earth. In a human study, we showed that human spatial estimates of the location of cities were very similar regardless of whether participants read Tolkien's texts or memorized a map of Middle Earth. However, text-based location estimates obtained from statistical linguistic frequencies better predicted the human text-based estimates than the human map-based estimates. These findings suggest that language encodes spatial structure of cities, and that human cognitive map representations can come from implicit statistical linguistic patterns, from explicit non-linguistic perceptual information, or from both. Copyright © 2012 Cognitive Science Society, Inc.

  12. Bilingual Language Switching: Production vs. Recognition

    PubMed Central

    Mosca, Michela; de Bot, Kees

    2017-01-01

    This study aims at assessing how bilinguals select words in the appropriate language in production and recognition while minimizing interference from the non-appropriate language. Two prominent models are considered which assume that when one language is in use, the other is suppressed. The Inhibitory Control (IC) model suggests that, in both production and recognition, the amount of inhibition on the non-target language is greater for the stronger compared to the weaker language. In contrast, the Bilingual Interactive Activation (BIA) model proposes that, in language recognition, the amount of inhibition on the weaker language is stronger than otherwise. To investigate whether bilingual language production and recognition can be accounted for by a single model of bilingual processing, we tested a group of native speakers of Dutch (L1) who were advanced speakers of English (L2) in a bilingual recognition and production task. Specifically, language switching costs were measured while participants performed a lexical decision (recognition) and a picture naming (production) task involving language switching. Results suggest that while in language recognition the amount of inhibition applied to the non-appropriate language increases along with its dominance as predicted by the IC model, in production the amount of inhibition applied to the non-relevant language is not related to language dominance, but rather it may be modulated by speakers' unconscious strategies to foster the weaker language. This difference indicates that bilingual language recognition and production might rely on different processing mechanisms and cannot be accounted for within one of the existing models of bilingual language processing. PMID:28638361

  13. Bilingual Language Switching: Production vs. Recognition.

    PubMed

    Mosca, Michela; de Bot, Kees

    2017-01-01

    This study aims at assessing how bilinguals select words in the appropriate language in production and recognition while minimizing interference from the non-appropriate language. Two prominent models are considered which assume that when one language is in use, the other is suppressed. The Inhibitory Control (IC) model suggests that, in both production and recognition, the amount of inhibition on the non-target language is greater for the stronger compared to the weaker language. In contrast, the Bilingual Interactive Activation (BIA) model proposes that, in language recognition, the amount of inhibition on the weaker language is stronger than otherwise. To investigate whether bilingual language production and recognition can be accounted for by a single model of bilingual processing, we tested a group of native speakers of Dutch (L1) who were advanced speakers of English (L2) in a bilingual recognition and production task. Specifically, language switching costs were measured while participants performed a lexical decision (recognition) and a picture naming (production) task involving language switching. Results suggest that while in language recognition the amount of inhibition applied to the non-appropriate language increases along with its dominance as predicted by the IC model, in production the amount of inhibition applied to the non-relevant language is not related to language dominance, but rather it may be modulated by speakers' unconscious strategies to foster the weaker language. This difference indicates that bilingual language recognition and production might rely on different processing mechanisms and cannot be accounted for within one of the existing models of bilingual language processing.

  14. The efficacy of early language intervention in mainstream school settings: a randomized controlled trial.

    PubMed

    Fricke, Silke; Burgoyne, Kelly; Bowyer-Crane, Claudine; Kyriacou, Maria; Zosimidou, Alexandra; Maxwell, Liam; Lervåg, Arne; Snowling, Margaret J; Hulme, Charles

    2017-10-01

    Oral language skills are a critical foundation for literacy and more generally for educational success. The current study shows that oral language skills can be improved by providing suitable additional help to children with language difficulties in the early stages of formal education. We conducted a randomized controlled trial with 394 children in England, comparing a 30-week oral language intervention programme starting in nursery (N = 132) with a 20-week version of the same programme starting in Reception (N = 133). The intervention groups were compared to an untreated waiting control group (N = 129). The programmes were delivered by trained teaching assistants (TAs) working in the children's schools/nurseries. All testers were blind to group allocation. Both the 20- and 30-week programmes produced improvements on primary outcome measures of oral language skill compared to the untreated control group. Effect sizes were small to moderate (20-week programme: d = .21; 30-week programme: d = .30) immediately following the intervention and were maintained at follow-up 6 months later. The difference in improvement between the 20-week and 30-week programmes was not statistically significant. Neither programme produced statistically significant improvements in children's early word reading or reading comprehension skills (secondary outcome measures). This study provides further evidence that oral language interventions can be delivered successfully by trained TAs to children with oral language difficulties in nursery and Reception classes. The methods evaluated have potentially important policy implications for early education. © 2017 Association for Child and Adolescent Mental Health.

  15. Special Report on the English Language Arts.

    ERIC Educational Resources Information Center

    Freeman, Lawrence D.

    This report, based on statistics gathered from the first statewide census of Illinois public secondary school course offerings, enrollments, and cocurricular activities, focuses on English language arts courses. Among the highlights from the report are the following: (1) Illinois junior and senior high schools typically rely on general, grade…

  16. Judgmental and Statistical DIF Analyses of the PISA-2003 Mathematics Literacy Items

    ERIC Educational Resources Information Center

    Yildirim, Huseyin Husnu; Berberoglu, Giray

    2009-01-01

    Comparisons of human characteristics across different language groups and cultures become more important in today's educational assessment practices as evidenced by the increasing interest in international comparative studies. Within this context, the fairness of the results across different language and cultural groups draws the attention of…

  17. The Evolution of Organization Analysis in ASQ, 1959-1979.

    ERIC Educational Resources Information Center

    Daft, Richard L.

    1980-01-01

    During the period 1959-1979, a sharp trend toward low-variety statistical languages took place, which may represent an organizational mapping phase in which simple, quantifiable relationships were formally defined and measured. A broader scope of research languages will be needed in the future. (Author/IRT)

  18. Differences in Students' Reading Comprehension of International Financial Reporting Standards: A South African Case

    ERIC Educational Resources Information Center

    Coetzee, Stephen A.; Janse van Rensburg, Cecile; Schmulian, Astrid

    2016-01-01

    This study explores differences in students' reading comprehension of International Financial Reporting Standards in a South African financial reporting class with a heterogeneous student cohort. Statistically significant differences were identified for prior academic performance, language of instruction, first language and enrolment in the…

  19. Roots and Rogues in German Child Language

    ERIC Educational Resources Information Center

    Duffield, Nigel

    2008-01-01

    This article is concerned with the proper characterization of subject omission at a particular stage in German child language. It focuses on post-verbal null subjects in finite clauses, here termed Rogues. It is argued that the statistically significant presence of Rogues, in conjunction with their distinct developmental profile, speaks against a…

  20. Quantitative Investigations in Hungarian Phonotactics and Syllable Structure

    ERIC Educational Resources Information Center

    Grimes, Stephen M.

    2010-01-01

    This dissertation investigates statistical properties of segment collocation and syllable geometry of the Hungarian language. A corpus and dictionary based approach to studying language phonologies is outlined. In order to conduct research on Hungarian, a phonological lexicon was created by compiling existing dictionaries and corpora and using a…

  1. Leveraging Code Comments to Improve Software Reliability

    ERIC Educational Resources Information Center

    Tan, Lin

    2009-01-01

    Commenting source code has long been a common practice in software development. This thesis, consisting of three pieces of work, made novel use of the code comments written in natural language to improve software reliability. Our solution combines Natural Language Processing (NLP), Machine Learning, Statistics, and Program Analysis techniques to…

  2. Red Dirt Thinking on Educational Disadvantage

    ERIC Educational Resources Information Center

    Guenther, John; Bat, Melodie; Osborne, Sam

    2013-01-01

    When people talk about education of remote Aboriginal and Torres Strait Islander students, the language used is often replete with messages of failure and deficit, of disparity and problems. This language is reflected in statistics that on the surface seem unambiguous in their demonstration of poor outcomes for remote Aboriginal and Torres Strait…

  3. Capturing the Diversity in Lexical Diversity

    ERIC Educational Resources Information Center

    Jarvis, Scott

    2013-01-01

    The range, variety, or diversity of words found in learners' language use is believed to reflect the complexity of their vocabulary knowledge as well as the level of their language proficiency. Many indices of lexical diversity have been proposed, most of which involve statistical relationships between types and tokens, and which ultimately…
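
As a point of reference for the "statistical relationships between types and tokens" mentioned above, the simplest such index is the type-token ratio (TTR). The sketch below is illustrative only: the function name and the sample sentence are assumptions, and nothing here is claimed to be the index this record recommends.

```python
def type_token_ratio(tokens):
    """Distinct word forms (types) divided by the total number of tokens."""
    if not tokens:
        raise ValueError("need at least one token")
    return len(set(tokens)) / len(tokens)


# Hypothetical sample text: 11 tokens, 5 distinct types.
sample = "the cat saw the dog and the dog saw the cat".split()
ttr = type_token_ratio(sample)
print(f"types={len(set(sample))} tokens={len(sample)} TTR={ttr:.3f}")
```

TTR is known to fall as texts get longer, which is one reason many alternative indices have been proposed.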

  4. Swedish: The Swedish Language in Education in Finland. Regional Dossiers Series.

    ERIC Educational Resources Information Center

    Ostern, Anna Lena

    This regional dossier aims to provide concise, descriptive information and basic educational statistics about minority language education in a specific country of the European Union--Finland. Details are provided about the features of the educational system, recent educational policies, divisions of responsibilities, main actors, legal…

  5. Occitan: The Occitan Language in Education in France. Regional Dossiers Series.

    ERIC Educational Resources Information Center

    Berthoumieux, Michel; Willemsma, Adalgard

    This regional dossier aims to provide concise, descriptive information and basic educational statistics about minority language education in a specific region of the European Union--the South of France. Details are provided about the features of the educational system, recent educational policies, divisions of responsibilities, main actors, legal…

  6. Phonetic Diversity, Statistical Learning, and Acquisition of Phonology

    ERIC Educational Resources Information Center

    Pierrehumbert, Janet B.

    2003-01-01

    In learning to perceive and produce speech, children master complex language-specific patterns. Daunting language-specific variation is found both in the segmental domain and in the domain of prosody and intonation. This article reviews the challenges posed by results in phonetic typology and sociolinguistics for the theory of language…

  7. Language-Independent and Language-Specific Aspects of Early Literacy: An Evaluation of the Common Underlying Proficiency Model.

    PubMed

    Goodrich, J Marc; Lonigan, Christopher J

    2017-08-01

    According to the common underlying proficiency model (Cummins, 1981), as children acquire academic knowledge and skills in their first language, they also acquire language-independent information about those skills that can be applied when learning a second language. The purpose of this study was to evaluate the relevance of the common underlying proficiency model for the early literacy skills of Spanish-speaking language-minority children using confirmatory factor analysis. Eight hundred fifty-eight Spanish-speaking language-minority preschoolers (mean age = 60.83 months, 50.2% female) participated in this study. Results indicated that bifactor models that consisted of language-independent as well as language-specific early literacy factors provided the best fits to the data for children's phonological awareness and print knowledge skills. Correlated factors models that only included skills specific to Spanish and English provided the best fits to the data for children's oral language skills. Children's language-independent early literacy skills were significantly related across constructs and to language-specific aspects of early literacy. Language-specific aspects of early literacy skills were significantly related within but not across languages. These findings suggest that language-minority preschoolers have a common underlying proficiency for code-related skills but not language-related skills that may allow them to transfer knowledge across languages.

  8. Do infant vocabulary skills predict school-age language and literacy outcomes?

    PubMed

    Duff, Fiona J; Reen, Gurpreet; Plunkett, Kim; Nation, Kate

    2015-08-01

    Strong associations between infant vocabulary and school-age language and literacy skills would have important practical and theoretical implications: Preschool assessment of vocabulary skills could be used to identify children at risk of reading and language difficulties, and vocabulary could be viewed as a cognitive foundation for reading. However, evidence to date suggests predictive ability from infant vocabulary to later language and literacy is low. This study provides an investigation into, and interpretation of, the magnitude of such infant to school-age relationships. Three hundred British infants whose vocabularies were assessed by parent report in the 2nd year of life (between 16 and 24 months) were followed up on average 5 years later (ages ranged from 4 to 9 years), when their vocabulary, phonological and reading skills were measured. Structural equation modelling of age-regressed scores was used to assess the strength of longitudinal relationships. Infant vocabulary (a latent factor of receptive and expressive vocabulary) was a statistically significant predictor of later vocabulary, phonological awareness, reading accuracy and reading comprehension (accounting for between 4% and 18% of variance). Family risk for language or literacy difficulties explained additional variance in reading (approximately 10%) but not language outcomes. Significant longitudinal relationships between preliteracy vocabulary knowledge and subsequent reading support the theory that vocabulary is a cognitive foundation of both reading accuracy and reading comprehension. Importantly however, the stability of vocabulary skills from infancy to later childhood is too low to be sufficiently predictive of language outcomes at an individual level - a finding that fits well with the observation that the majority of 'late talkers' resolve their early language difficulties. For reading outcomes, prediction of future difficulties is likely to be improved when considering family history of language/literacy difficulties alongside infant vocabulary levels. © 2015 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd, on behalf of Association for Child and Adolescent Mental Health.

  9. Executive and intellectual functioning in school-aged children with specific language impairment.

    PubMed

    Kuusisto, Marika A; Nieminen, Pirkko E; Helminen, Mika T; Kleemola, Leenamaija

    2017-03-01

    Earlier research and clinical practice show that specific language impairment (SLI) is often associated with nonverbal cognitive deficits and weakened skills in executive functions (EFs). Executive deficits may have a remarkable influence on a child's everyday activities in the home and school environments. However, research on EFs in school-aged children with SLI is still limited and has mostly been conducted among English- and Dutch-speaking children. This study examined whether there are differences in EFs between Finnish-speaking children with SLI and typically developing (TD) peers at school age. EFs are compared between the groups with and without controlling for nonverbal intelligence. Parents and teachers of children with SLI (n = 22) and age- and gender-matched TD peers (n = 22) completed The Behavior Rating Inventory of Executive Functions (BRIEF). The mean age of the children was 8.2 years. BRIEF ratings of parents and teachers were compared between the children with SLI and their TD peers by paired analysis using conditional logistic regression models with and without controlling for nonverbal IQ. Intellectual functioning was assessed with the Wechsler Intelligence Scale for Children. Children with SLI had weaker scores in all parent and teacher BRIEF scales compared with TD peers. Statistically significant differences between the groups were found in BRIEF scales Shift, Emotional Control, Initiate, Working Memory, Plan/Organize and Monitor. Differences between the groups were also statistically significant in intellectual functioning. On BRIEF scales some group differences remained statistically significant after controlling for nonverbal IQ. This study provides additional evidence that Finnish-speaking school-aged children with SLI are also at risk of having deficits in EFs in daily life. EFs have been proposed to have an impact on developmental outcomes later in life. In clinical practice it is important to pay attention to EFs in school-aged children with SLI when making diagnostic evaluations and planning interventions for them. © 2016 Royal College of Speech and Language Therapists.

  10. Modeling the language learning strategies and English language proficiency of pre-university students in UMS: A case study

    NASA Astrophysics Data System (ADS)

    Kiram, J. J.; Sulaiman, J.; Swanto, S.; Din, W. A.

    2015-10-01

    This study aims to construct a mathematical model of the relationship between a student's Language Learning Strategy usage and English Language proficiency. Fifty-six pre-university students of University Malaysia Sabah participated in this study. A self-report questionnaire called the Strategy Inventory for Language Learning was administered to them to measure their language learning strategy preferences before they sat for the Malaysian University English Test (MUET), the results of which were utilised to measure their English language proficiency. We fitted a multiple linear regression model, with variables selected by stepwise regression, and assessed the resulting model using the global F-test, root mean square error and R-squared. The model obtained suggests that not all language learning strategies should be included in the model in an attempt to predict Language Proficiency.
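
The three assessment statistics named in this abstract can be computed from an ordinary least-squares fit as sketched below. This is a minimal illustration on synthetic data, not the authors' analysis: the number of predictors, the coefficients, and the noise level are all assumptions; only the sample size of 56 is taken from the abstract. Stepwise variable selection is omitted, and RMSE is computed here with the residual degrees of freedom (some texts divide by n instead).

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 56, 3  # n from the abstract; p = 3 predictors is an assumption

# Synthetic predictors and response (all values are made up for illustration).
X = rng.normal(size=(n, p))
y = 2.0 + X @ np.array([1.5, 0.0, -0.7]) + rng.normal(scale=0.5, size=n)

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

residuals = y - A @ beta
sse = float(residuals @ residuals)            # residual sum of squares
sst = float(((y - y.mean()) ** 2).sum())      # total sum of squares

r_squared = 1.0 - sse / sst
rmse = float(np.sqrt(sse / (n - p - 1)))      # residual standard error
f_stat = ((sst - sse) / p) / (sse / (n - p - 1))  # global F statistic

print(f"R^2={r_squared:.3f}  RMSE={rmse:.3f}  F={f_stat:.1f}")
```

The global F statistic tests whether all slope coefficients are zero at once; turning it into a p-value would need the F distribution with (p, n - p - 1) degrees of freedom.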

  11. Nonadjacent Dependency Learning in Cantonese-Speaking Children With and Without a History of Specific Language Impairment.

    PubMed

    Iao, Lai-Sang; Ng, Lai Yan; Wong, Anita Mei Yin; Lee, Oi Ting

    2017-03-01

    This study investigated nonadjacent dependency learning in Cantonese-speaking children with and without a history of specific language impairment (SLI) in an artificial linguistic context. Sixteen Cantonese-speaking children with a history of SLI and 16 Cantonese-speaking children with typical language development (TLD) were tested with a nonadjacent dependency learning task using artificial languages that mimic Cantonese. Children with TLD performed above chance and were able to discriminate between trained and untrained nonadjacent dependencies. However, children with a history of SLI performed at chance and were not able to differentiate trained versus untrained nonadjacent dependencies. These findings, together with previous findings from English-speaking adults and adolescents with language impairments, suggest that individuals with atypical language development, regardless of age, diagnostic status, language, and culture, show difficulties in learning nonadjacent dependencies. This study provides evidence for early impairments to statistical learning in individuals with atypical language development.

  12. Principles of parametric estimation in modeling language competition

    PubMed Central

    Zhang, Menghan; Gong, Tao

    2013-01-01

    It is generally difficult to define reasonable parameters and interpret their values in mathematical models of social phenomena. Rather than directly fitting abstract parameters against empirical data, we should define some concrete parameters to denote the sociocultural factors relevant for particular phenomena, and compute the values of these parameters based upon the corresponding empirical data. Taking the example of modeling studies of language competition, we propose a language diffusion principle and two language inheritance principles to compute two critical parameters, namely the impacts and inheritance rates of competing languages, in our language competition model derived from the Lotka–Volterra competition model in evolutionary biology. These principles assign explicit sociolinguistic meanings to those parameters and calculate their values from the relevant data of population censuses and language surveys. Using four examples of language competition, we illustrate that our language competition model with thus-estimated parameter values can reliably replicate and predict the dynamics of language competition, and it is especially useful in cases lacking direct competition data. PMID:23716678
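
For orientation, the textbook two-species Lotka-Volterra competition equations that this abstract names as its starting point can be simulated as follows. This is a sketch of the classical equations only: the authors' language competition model is derived from them and uses its own parameters (impacts and inheritance rates), which are not reproduced here. The function name, the parameter values, and the forward-Euler integration are assumptions chosen for illustration.

```python
def simulate_competition(x1, x2, r1, r2, k1, k2, a12, a21, dt=0.01, steps=20000):
    """Forward-Euler integration of the classical competition equations:

        dx1/dt = r1 * x1 * (1 - (x1 + a12 * x2) / k1)
        dx2/dt = r2 * x2 * (1 - (x2 + a21 * x1) / k2)
    """
    for _ in range(steps):
        dx1 = r1 * x1 * (1.0 - (x1 + a12 * x2) / k1)
        dx2 = r2 * x2 * (1.0 - (x2 + a21 * x1) / k2)
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2


# Hypothetical parameters: weak mutual competition, equal capacities.
x1_end, x2_end = simulate_competition(
    x1=0.1, x2=0.9, r1=1.0, r2=1.0, k1=1.0, k2=1.0, a12=0.5, a21=0.5
)
print(f"x1={x1_end:.4f}  x2={x2_end:.4f}")
```

With these values both populations settle at the coexistence equilibrium (k1 - a12*k2) / (1 - a12*a21) = 2/3; stronger competition coefficients would instead drive one population out.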

  13. Principles of parametric estimation in modeling language competition.

    PubMed

    Zhang, Menghan; Gong, Tao

    2013-06-11

    It is generally difficult to define reasonable parameters and interpret their values in mathematical models of social phenomena. Rather than directly fitting abstract parameters against empirical data, we should define some concrete parameters to denote the sociocultural factors relevant for particular phenomena, and compute the values of these parameters based upon the corresponding empirical data. Taking the example of modeling studies of language competition, we propose a language diffusion principle and two language inheritance principles to compute two critical parameters, namely the impacts and inheritance rates of competing languages, in our language competition model derived from the Lotka-Volterra competition model in evolutionary biology. These principles assign explicit sociolinguistic meanings to those parameters and calculate their values from the relevant data of population censuses and language surveys. Using four examples of language competition, we illustrate that our language competition model with thus-estimated parameter values can reliably replicate and predict the dynamics of language competition, and it is especially useful in cases lacking direct competition data.

  14. English vowel learning by speakers of Mandarin

    NASA Astrophysics Data System (ADS)

    Thomson, Ron I.

    2005-04-01

    One of the most influential models of second language (L2) speech perception and production [Flege, Speech Perception and Linguistic Experience (York, Baltimore, 1995) pp. 233-277] argues that during initial stages of L2 acquisition, perceptual categories sharing the same or nearly the same acoustic space as first language (L1) categories will be processed as members of that L1 category. Previous research has generally been limited to testing these claims on binary L2 contrasts, rather than larger portions of the perceptual space. This study examines the development of 10 English vowel categories by 20 Mandarin L1 learners of English. Imitations of English vowel stimuli by these learners were recorded at 6 data collection points over the course of one year. Using a statistical pattern recognition model, these productions were then assessed against native speaker norms. The degree to which the learners' perception/production shifted toward the target English vowels and the degree to which they matched L1 categories in ways predicted by theoretical models are discussed. The results of this experiment suggest that previous claims about perceptual assimilation of L2 categories to L1 categories may be too strong.

  15. Exploiting multiple sources of information in learning an artificial language: human data and modeling.

    PubMed

    Perruchet, Pierre; Tillmann, Barbara

    2010-03-01

    This study investigates the joint influences of three factors on the discovery of new word-like units in a continuous artificial speech stream: the statistical structure of the ongoing input, the initial word-likeness of parts of the speech flow, and the contextual information provided by the earlier emergence of other word-like units. Results of an experiment conducted with adult participants show that these sources of information have strong and interactive influences on word discovery. The authors then examine the ability of different models of word segmentation to account for these results. PARSER (Perruchet & Vinter, 1998) is compared to the view that word segmentation relies on the exploitation of transitional probabilities between successive syllables, and with the models based on the Minimum Description Length principle, such as INCDROP. The authors submit arguments suggesting that PARSER has the advantage of accounting for the whole pattern of data without ad-hoc modifications, while relying exclusively on general-purpose learning principles. This study strengthens the growing notion that nonspecific cognitive processes, mainly based on associative learning and memory principles, are able to account for a larger part of early language acquisition than previously assumed. Copyright © 2009 Cognitive Science Society, Inc.
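
The transitional-probability account that PARSER is compared against can be illustrated with a short computation: the forward transitional probability of syllable B given syllable A is count(A B) / count(A). The sketch below is an assumption-laden illustration, not the procedure used in this study; the syllable stream and the function name are invented.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Forward transitional probability P(next | current) for adjacent pairs."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor (all but the last).
    first_counts = Counter(syllables[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}


# Hypothetical stream built from the made-up "words" tupiro, golabu, padoti.
stream = "tu pi ro go la bu tu pi ro pa do ti go la bu tu pi ro".split()
tp = transitional_probabilities(stream)

for pair in [("tu", "pi"), ("pi", "ro"), ("ro", "go"), ("ro", "pa")]:
    print(pair, tp[pair])
```

Within-word transitions such as tu-pi come out at 1.0, while transitions across a word boundary such as ro-go come out lower, which is the cue a transitional-probability learner would use to posit boundaries.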

  16. Wordform Similarity Increases With Semantic Similarity: An Analysis of 100 Languages.

    PubMed

    Dautriche, Isabelle; Mahowald, Kyle; Gibson, Edward; Piantadosi, Steven T

    2017-11-01

    Although the mapping between form and meaning is often regarded as arbitrary, there are in fact well-known constraints on words which are the result of functional pressures associated with language use and its acquisition. In particular, languages have been shown to encode meaning distinctions in their sound properties, which may be important for language learning. Here, we investigate the relationship between semantic distance and phonological distance in the large-scale structure of the lexicon. We show evidence in 100 languages from a diverse array of language families that more semantically similar word pairs are also more phonologically similar. This suggests that there is an important statistical trend for lexicons to have semantically similar words be phonologically similar as well, possibly for functional reasons associated with language learning. Copyright © 2016 Cognitive Science Society, Inc.

  17. Rational integration of noisy evidence and prior semantic expectations in sentence interpretation.

    PubMed

    Gibson, Edward; Bergen, Leon; Piantadosi, Steven T

    2013-05-14

    Sentence processing theories typically assume that the input to our language processing mechanisms is an error-free sequence of words. However, this assumption is an oversimplification because noise is present in typical language use (for instance, due to a noisy environment, producer errors, or perceiver errors). A complete theory of human sentence comprehension therefore needs to explain how humans understand language given imperfect input. Indeed, like many cognitive systems, language processing mechanisms may even be "well designed"--in this case for the task of recovering intended meaning from noisy utterances. In particular, comprehension mechanisms may be sensitive to the types of information that an idealized statistical comprehender would be sensitive to. Here, we evaluate four predictions about such a rational (Bayesian) noisy-channel language comprehender in a sentence comprehension task: (i) semantic cues should pull sentence interpretation towards plausible meanings, especially if the wording of the more plausible meaning is close to the observed utterance in terms of the number of edits; (ii) this process should asymmetrically treat insertions and deletions due to the Bayesian "size principle"; such nonliteral interpretation of sentences should (iii) increase with the perceived noise rate of the communicative situation and (iv) decrease if semantically anomalous meanings are more likely to be communicated. These predictions are borne out, strongly suggesting that human language relies on rational statistical inference over a noisy channel.
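
The noisy-channel idea in this abstract amounts to Bayes' rule: the probability of an intended sentence given the perceived one is proportional to its prior plausibility times the probability that noise turned it into what was perceived. The sketch below shows that arithmetic only; the two candidate labels and every number in it are invented for illustration and are not estimates from the study.

```python
def posterior(prior, likelihood):
    """P(intended | perceived), proportional to P(intended) * P(perceived | intended)."""
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical numbers. The perceived sentence is implausible as worded:
# either the speaker meant it literally, or a plausible sentence was
# corrupted by a small edit on the way to the listener.
prior = {"plausible": 0.95, "implausible": 0.05}        # P(intended)
likelihood = {"plausible": 0.02, "implausible": 0.90}   # P(perceived | intended)

post = posterior(prior, likelihood)
print(post)
```

Raising the assumed noise rate (the 0.02 above) shifts the posterior toward the plausible reading, and raising the prior on implausible meanings shifts it away, which mirrors predictions (iii) and (iv) in the abstract.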

  18. Rational integration of noisy evidence and prior semantic expectations in sentence interpretation

    PubMed Central

    Gibson, Edward; Bergen, Leon; Piantadosi, Steven T.

    2013-01-01

    Sentence processing theories typically assume that the input to our language processing mechanisms is an error-free sequence of words. However, this assumption is an oversimplification because noise is present in typical language use (for instance, due to a noisy environment, producer errors, or perceiver errors). A complete theory of human sentence comprehension therefore needs to explain how humans understand language given imperfect input. Indeed, like many cognitive systems, language processing mechanisms may even be “well designed”–in this case for the task of recovering intended meaning from noisy utterances. In particular, comprehension mechanisms may be sensitive to the types of information that an idealized statistical comprehender would be sensitive to. Here, we evaluate four predictions about such a rational (Bayesian) noisy-channel language comprehender in a sentence comprehension task: (i) semantic cues should pull sentence interpretation towards plausible meanings, especially if the wording of the more plausible meaning is close to the observed utterance in terms of the number of edits; (ii) this process should asymmetrically treat insertions and deletions due to the Bayesian “size principle”; such nonliteral interpretation of sentences should (iii) increase with the perceived noise rate of the communicative situation and (iv) decrease if semantically anomalous meanings are more likely to be communicated. These predictions are borne out, strongly suggesting that human language relies on rational statistical inference over a noisy channel. PMID:23637344

  19. Mirror neurons, language, and embodied cognition.

    PubMed

    Perlovsky, Leonid I; Ilin, Roman

    2013-05-01

    Basic mechanisms of the mind, cognition, language, its semantic and emotional mechanisms are modeled using dynamic logic (DL). This cognitively and mathematically motivated model leads to a dual-model hypothesis of language and cognition. The paper emphasizes that abstract cognition cannot evolve without language. The developed model is consistent with a joint emergence of language and cognition from a mirror neuron system. The dual language-cognition model leads to the dual mental hierarchy. The nature of cognition embodiment in the hierarchy is analyzed. Future theoretical and experimental research is discussed. Published by Elsevier Ltd.

  20. R and Spatial Data

    EPA Science Inventory

    R is an open source language and environment for statistical computing and graphics that can also be used for both spatial analysis (i.e. geoprocessing and mapping of different types of spatial data) and spatial data analysis (i.e. the application of statistical descriptions and ...

  1. Incorporating linguistic knowledge for learning distributed word representations.

    PubMed

    Wang, Yan; Liu, Zhiyuan; Sun, Maosong

    2015-01-01

    Combined with neural language models, distributed word representations achieve significant advantages in computational linguistics and text mining. Most existing models estimate distributed word vectors from large-scale data in an unsupervised fashion, but do not take rich linguistic knowledge into consideration. Linguistic knowledge can be represented as either link-based knowledge or preference-based knowledge, and we propose knowledge regularized word representation models (KRWR) to incorporate this prior knowledge for learning distributed word representations. Experimental results demonstrate that our estimated word representation achieves better performance in the task of semantic relatedness ranking. This indicates that our methods can efficiently encode both prior knowledge from knowledge bases and statistical knowledge from large-scale text corpora into a unified word representation model, which will benefit many tasks in text mining.

  2. Incorporating Linguistic Knowledge for Learning Distributed Word Representations

    PubMed Central

    Wang, Yan; Liu, Zhiyuan; Sun, Maosong

    2015-01-01

    Combined with neural language models, distributed word representations achieve significant advantages in computational linguistics and text mining. Most existing models estimate distributed word vectors from large-scale data in an unsupervised fashion, but do not take rich linguistic knowledge into consideration. Linguistic knowledge can be represented as either link-based knowledge or preference-based knowledge, and we propose knowledge regularized word representation models (KRWR) to incorporate this prior knowledge for learning distributed word representations. Experimental results demonstrate that our estimated word representation achieves better performance in the task of semantic relatedness ranking. This indicates that our methods can efficiently encode both prior knowledge from knowledge bases and statistical knowledge from large-scale text corpora into a unified word representation model, which will benefit many tasks in text mining. PMID:25874581

  3. Approaches in highly parameterized inversion: TSPROC, a general time-series processor to assist in model calibration and result summarization

    USGS Publications Warehouse

    Westenbroek, Stephen M.; Doherty, John; Walker, John F.; Kelson, Victor A.; Hunt, Randall J.; Cera, Timothy B.

    2012-01-01

    The TSPROC (Time Series PROCessor) computer software uses a simple scripting language to process and analyze time series. It was developed primarily to assist in the calibration of environmental models. The software is designed to perform calculations on time-series data commonly associated with surface-water models, including calculation of flow volumes, transformation by means of basic arithmetic operations, and generation of seasonal and annual statistics and hydrologic indices. TSPROC can also be used to generate some of the key input files required to perform parameter optimization by means of the PEST (Parameter ESTimation) computer software. Through the use of TSPROC, the objective function for use in the model-calibration process can be focused on specific components of a hydrograph.

  4. A Multidimensional Curriculum Model for Heritage or International Language Instruction.

    ERIC Educational Resources Information Center

    Lazaruk, Wally

    1993-01-01

    Describes the Multidimensional Curriculum Model for developing a language curriculum and suggests a generic approach to selecting and sequencing learning objectives. Alberta Education used this model to design a new French-as-a-Second-Language program. The experience/communication, culture, language, and general language components at the beginning,…

  5. Entraining IDyOT: Timing in the Information Dynamics of Thinking

    PubMed Central

    Forth, Jamie; Agres, Kat; Purver, Matthew; Wiggins, Geraint A.

    2016-01-01

    We present a novel hypothetical account of entrainment in music and language, in the context of the Information Dynamics of Thinking model, IDyOT. The extended model affords an alternative view of entrainment, and its companion term, pulse, from earlier accounts. The model is based on hierarchical, statistical prediction, modeling expectations of both what an event will be and when it will happen. As such, it constitutes a kind of predictive coding, with a particular novel hypothetical implementation. Here, we focus on the model's mechanism for predicting when a perceptual event will happen, given an existing sequence of past events, which may be musical or linguistic. We propose a range of tests to validate or falsify the model, at various different levels of abstraction, and argue that computational modeling in general, and this model in particular, can offer a means of providing limited but useful evidence for evolutionary hypotheses. PMID:27803682

  6. Component Models for Semantic Web Languages

    NASA Astrophysics Data System (ADS)

    Henriksson, Jakob; Aßmann, Uwe

    Intelligent applications and agents on the Semantic Web typically need to be specified with, or interact with specifications written in, many different kinds of formal languages. Such languages include ontology languages, data and metadata query languages, as well as transformation languages. As learnt from years of experience in development of complex software systems, languages need to support some form of component-based development. Components enable higher software quality, better understanding and reusability of already developed artifacts. Any component approach contains an underlying component model, a description detailing what valid components are and how components can interact. With the multitude of languages developed for the Semantic Web, what are their underlying component models? Do we need to develop one for each language, or is a more general and reusable approach achievable? We present a language-driven component model specification approach. This means that a component model can be (automatically) generated from a given base language (actually, its specification, e.g. its grammar). As a consequence, we can provide components for different languages and simplify the development of software artifacts used on the Semantic Web.

  7. Modelling language evolution: Examples and predictions

    NASA Astrophysics Data System (ADS)

    Gong, Tao; Shuai, Lan; Zhang, Menghan

    2014-06-01

    We survey recent computer modelling research of language evolution, focusing on a rule-based model simulating the lexicon-syntax coevolution and an equation-based model quantifying the language competition dynamics. We discuss four predictions of these models: (a) correlation between domain-general abilities (e.g. sequential learning) and language-specific mechanisms (e.g. word order processing); (b) coevolution of language and relevant competences (e.g. joint attention); (c) effects of cultural transmission and social structure on linguistic understandability; and (d) commonalities between linguistic, biological, and physical phenomena. All these contribute significantly to our understanding of the evolutions of language structures, individual learning mechanisms, and relevant biological and socio-cultural factors. We conclude the survey by highlighting three future directions of modelling studies of language evolution: (a) adopting experimental approaches for model evaluation; (b) consolidating empirical foundations of models; and (c) multi-disciplinary collaboration among modelling, linguistics, and other relevant disciplines.

  8. Speaker gender identification based on majority vote classifiers

    NASA Astrophysics Data System (ADS)

    Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

    2017-03-01

    Speaker gender identification is considered among the most important tools in several multimedia applications, namely automatic speech recognition, interactive voice response systems and audio browsing systems. The performance of gender identification systems is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best performing classification method or searching for the optimum tuning of a single classifier's parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs as well as other temporal and frequency-domain descriptors. Five classification models were tested: decision tree, discriminant analysis, naive Bayes, support vector machine and k-nearest neighbor. The three best performing classifiers among the five contribute by majority voting between their scores. Experiments were performed on three different datasets spoken in three languages (English, German and Arabic) in order to validate the language independence of the proposed scheme. Results confirm that the presented system has reached a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.
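    The combination step described above can be sketched as a vote over the outputs of the three selected classifiers. The sketch below is a simplification that votes on hard class labels rather than on scores, and it is not the authors' code. The function name, the classifier names, and the label sequences are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label predictions by majority vote.

    predictions: list of label sequences, one per classifier, all of the
    same length. Returns one label per sample. With three classifiers and
    two classes, a tie cannot occur.
    """
    return [
        Counter(sample_votes).most_common(1)[0][0]
        for sample_votes in zip(*predictions)
    ]


# Hypothetical outputs of three classifiers on four utterances.
svm_out = ["F", "M", "M", "F"]
knn_out = ["F", "F", "M", "M"]
tree_out = ["M", "M", "M", "F"]

combined = majority_vote([svm_out, knn_out, tree_out])
print(combined)
```

    Each utterance receives the label that at least two of the three classifiers agree on, so a single classifier's error is outvoted whenever the other two are correct.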

  9. Deep bottleneck features for spoken language identification.

    PubMed

    Jiang, Bing; Song, Yan; Wei, Si; Liu, Jun-Hua; McLoughlin, Ian Vince; Dai, Li-Rong

    2014-01-01

    A key problem in spoken language identification (LID) is to design effective representations which are specific to language information. For example, in recent years, representations based on both phonotactic and acoustic features have proven their effectiveness for LID. Although advances in machine learning have led to significant improvements, LID performance is still lacking, especially for short duration speech utterances. With the hypothesis that language information is weak and represented only latently in speech, and is largely dependent on the statistical properties of the speech content, existing representations may be insufficient. Furthermore they may be susceptible to the variations caused by different speakers, specific content of the speech segments, and background noise. To address this, we propose using Deep Bottleneck Features (DBF) for spoken LID, motivated by the success of Deep Neural Networks (DNN) in speech recognition. We show that DBFs can form a low-dimensional compact representation of the original inputs with a powerful descriptive and discriminative capability. To evaluate the effectiveness of this, we design two acoustic models, termed DBF-TV and parallel DBF-TV (PDBF-TV), using a DBF based i-vector representation for each speech utterance. Results on NIST language recognition evaluation 2009 (LRE09) show significant improvements over state-of-the-art systems. By fusing the output of phonotactic and acoustic approaches, we achieve an EER of 1.08%, 1.89% and 7.01% for 30 s, 10 s and 3 s test utterances respectively. Furthermore, various DBF configurations have been extensively evaluated, and an optimal system proposed.

  10. [An expert system of aiding decision making in breast pathology connected to a clinical data base].

    PubMed

    Brunet, M; Durrleman, S; Ferber, J; Ganascia, J G; Hacene, K; Hirt, F; Jouniaux, F; Meeus, L

    1987-01-01

    The René Huguenin Cancer Center holds a medical file for each patient which is intended to store and process medical data. Beginning in 1970, we introduced computerization: a development plan was elaborated and simultaneously a statistical software package (Clotilde--GSI/CFRO) was selected. Thus, we now have access to a large database, structured according to medical rationale, and usable with methods of artificial intelligence towards three objectives: improved data acquisition, decision making and exploitation. The first application was to breast pathology, which represents one of the Center's primary activities. The structure of the patient data is, by all criteria, part of the medical knowledge. This information needs to be presented as well as processed with a suitable language. To this end, we chose an object-oriented language, Mering II, usable with Apple and IBM 4 micro-computers. This project has already produced an operational model.

  11. Standards-Based Procedural Phenotyping: The Arden Syntax on i2b2.

    PubMed

    Mate, Sebastian; Castellanos, Ixchel; Ganslandt, Thomas; Prokosch, Hans-Ulrich; Kraus, Stefan

    2017-01-01

    Phenotyping, or the identification of patient cohorts, is a recurring challenge in medical informatics. While there are open source tools such as i2b2 that address this problem by providing user-friendly querying interfaces, these platforms lack the semantic expressiveness to model complex phenotyping algorithms. The Arden Syntax provides procedural programming language constructs, designed specifically for medical decision support and knowledge transfer. In this work, we investigate how language constructs of the Arden Syntax can be used for generic phenotyping. We implemented a prototypical tool to integrate i2b2 with an open source Arden execution environment. To demonstrate the applicability of our approach, we used the tool together with an Arden-based phenotyping algorithm to derive statistics about ICU-acquired hypernatremia. Finally, we discuss how the combination of i2b2's user-friendly cohort pre-selection and Arden's procedural expressiveness could benefit phenotyping.

  12. Songs as an Aid for Language Acquisition

    ERIC Educational Resources Information Center

    Schon, Daniele; Boyer, Maud; Moreno, Sylvain; Besson, Mireille; Peretz, Isabelle; Kolinsky, Regine

    2008-01-01

    In previous research, Saffran and colleagues [Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928; Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. "Journal of Memory and Language," 35, 606-621.] have shown that adults…

  13. HIGH SCHOOL ENROLLMENTS IN LATIN, 1964-65.

    ERIC Educational Resources Information Center

    GOLDBERG, SAMUEL A.

    A MODERN LANGUAGE ASSOCIATION (MLA) STATISTICAL SURVEY SHOWS THE NUMBER OF STUDENTS STUDYING FRENCH, SPANISH, GERMAN, OR LATIN IN THE SECONDARY SCHOOLS DURING EACH SCHOOL YEAR FROM 1958-59 TO 1964-65, THE PERCENTAGE STUDYING EACH LANGUAGE IN RELATION TO THE TOTAL HIGH SCHOOL POPULATION, AND THE PERCENTAGE STUDYING LATIN IN RELATION TO THE TOTAL…

  14. Patterns and Trends in Michigan Migrant Education. JSRI Statistical Brief No. 8.

    ERIC Educational Resources Information Center

    Heiderson, Mazin A.; Leon, Edgar R.

    This report highlights trends of migrant education in Michigan from the late 1980s to the mid-1990s. Migrant education services include instruction in reading, math, oral language, English as a second language, and tutoring in other subjects. Support services include medical and dental screenings, career guidance, transportation, emergency…

  15. Serbian: The Serbian Language in Education in Hungary. Regional Dossiers Series

    ERIC Educational Resources Information Center

    Paulik, Anton, Comp.; Solymosi, Judit, Comp.

    2014-01-01

    This regional dossier aims at providing a concise description of and basic statistics on minority language education in a specific region of Europe--the territory of Magyarország (Hungary). Aspects that are addressed include features of the education system, recent educational policies, main actors, legal arrangements, and support structures, as…

  16. Implications of Timing of Maternal Depressive Symptoms for Early Cognitive and Language Development

    ERIC Educational Resources Information Center

    Sohr-Preston, Sara L.; Scaramella, Laura V.

    2006-01-01

    Statistically, women, particularly pregnant women and new mothers, are at heightened risk for depression. The present review describes the current state of the research linking maternal depressed mood and children's cognitive and language development. Exposure to maternal depressive symptoms, whether during the prenatal period, postpartum period,…

  17. Ethnic Minorities, Language Diversity, and Educational Implications: A Case Study on the Netherlands.

    ERIC Educational Resources Information Center

    Extra, Guus

    1990-01-01

    A discussion of the Dutch situation looks at how growing immigrant numbers and resulting second language groups have prompted a rethinking of traditional concepts of education. First, ethnic population trends across national boundaries in Western Europe are examined and basic statistics on ethnic minorities in the Netherlands are presented. The…

  18. Selected Bibliography of Educational Materials: Algeria, Libya, Morocco, Tunisia. Vol. 6, No. 3. 1972.

    ERIC Educational Resources Information Center

    Azzouz, Azzedine; And Others

    English language annotations of articles from 13 French language periodicals covering educational materials of interest to North Africans are included in this annotated bibliography. Citations are categorized by country. Topics touch on philosophy and theory of education, educational statistics, education organization by grade and type, adult…

  19. Anchors Aweigh: The Impact of Overlearning on Entrenchment Effects in Statistical Learning

    ERIC Educational Resources Information Center

    Bulgarelli, Federica; Weiss, Daniel J.

    2016-01-01

    Previous research has revealed that when learners encounter multiple artificial languages in succession only the first is learned, unless there are contextual cues correlating with the change in structure or if exposure to the second language is protracted. These experiments provided a fixed amount of exposure irrespective of when learning…

  20. Statistical, Practical, Clinical, and Personal Significance: Definitions and Applications in Speech-Language Pathology

    ERIC Educational Resources Information Center

    Bothe, Anne K.; Richardson, Jessica D.

    2011-01-01

    Purpose: To discuss constructs and methods related to assessing the magnitude and the meaning of clinical outcomes, with a focus on applications in speech-language pathology. Method: Professionals in medicine, allied health, psychology, education, and many other fields have long been concerned with issues referred to variously as practical…

  1. Corpora Processing and Computational Scaffolding for a Web-Based English Learning Environment: The CANDLE Project

    ERIC Educational Resources Information Center

    Liou, Hsien-Chin; Chang, Jason S; Chen, Hao-Jan; Lin, Chih-Cheng; Liaw, Meei-Ling; Gao, Zhao-Ming; Jang, Jyh-Shing Roger; Yeh, Yuli; Chuang, Thomas C.; You, Geeng-Neng

    2006-01-01

    This paper describes the development of an innovative web-based environment for English language learning with advanced data-driven and statistical approaches. The project uses various corpora, including a Chinese-English parallel corpus ("Sinorama") and various natural language processing (NLP) tools to construct effective English…

  2. Friulian: The Friulian Language in Education in Italy. Regional Dossiers Series

    ERIC Educational Resources Information Center

    Petris, Cinzia, Comp.

    2014-01-01

    This regional dossier aims to provide a concise description and basic statistics about minority language education in a specific region of Europe. Aspects that are addressed include features of the education system, recent educational policies, main actors, legal arrangements, and support structures, as well as quantitative aspects, such as the…

  3. Catalan: The Catalan Language in Education in Spain, 2nd Edition. Regional Dossiers Series

    ERIC Educational Resources Information Center

    Areny, Maria, Comp.; Mayans, Pere, Comp.; Forniès, David, Comp.

    2013-01-01

    Regional dossiers aim at providing a concise description and basic statistics about minority language education in a specific region of Europe. Aspects that are addressed include features of the education system, recent educational policies, main actors, legal arrangements, and support structures, as well as quantitative aspects, such as the…

  4. Construct Equivalence of a National Certification Examination that Uses Dual Languages and Audio Assistance

    ERIC Educational Resources Information Center

    Wang, Shudong; Wang, Ning; Hoadley, David

    2007-01-01

    This study used confirmatory factor analysis (CFA) to examine the comparability of the National Nurse Aide Assessment Program (NNAAP[TM]) test scores across language and administration condition groups for calibration and validation samples that were randomly drawn from the same population. Fit statistics supported both the calibration and…

  5. Language Models and the Teaching of English Language to Secondary School Students in Cameroon

    ERIC Educational Resources Information Center

    Ntongieh, Njwe Amah Eyovi

    2016-01-01

    This paper investigates Language models with an emphasis on an appraisal of the Competence Based Language Teaching Model (CBLT) employed in the teaching and learning of English language in Cameroon. Research endeavours at various levels combined with cumulative deficiencies experienced over the years have propelled educational policy makers to…

  6. Word Recognition Reflects Dimension-Based Statistical Learning

    ERIC Educational Resources Information Center

    Idemaru, Kaori; Holt, Lori L.

    2011-01-01

    Speech processing requires sensitivity to long-term regularities of the native language yet requires listeners to adapt flexibly to perturbations that arise from talker idiosyncrasies such as nonnative accent. The present experiments investigate whether listeners exhibit "dimension-based statistical learning" of correlations between acoustic…

  7. Integrated Model for E-Learning Acceptance

    NASA Astrophysics Data System (ADS)

    Ramadiani; Rodziah, A.; Hasan, S. M.; Rusli, A.; Noraini, C.

    2016-01-01

    E-learning will not work if the system does not match user needs. The user interface is very important in encouraging use of the application. Many theories discuss user interface usability evaluation and technology acceptance separately; this study asks why the two should not be correlated in order to enhance the e-learning process. An evaluation model for e-learning interface acceptance is therefore considered important to investigate. The aim of this study is to propose an integrated e-learning user interface acceptance evaluation model. The model combines several theories of e-learning interface measurement, such as user learning style, usability evaluation, and user benefit. We formulated them in questionnaires, which were distributed to 125 English Language School (ELS) students. The statistical analysis used structural equation modelling with LISREL v8.80 and MANOVA.

  8. Multilingual natural language generation as part of a medical terminology server.

    PubMed

    Wagner, J C; Solomon, W D; Michel, P A; Juge, C; Baud, R H; Rector, A L; Scherrer, J R

    1995-01-01

    Re-usable and sharable, and therefore language-independent concept models are of increasing importance in the medical domain. The GALEN project (Generalized Architecture for Languages Encyclopedias and Nomenclatures in Medicine) aims at developing language-independent concept representation systems as the foundations for the next generation of multilingual coding systems. For use within clinical applications, the content of the model has to be mapped to natural language. A so-called Multilingual Information Module (MM) establishes the link between the language-independent concept model and different natural languages. This text generation software must be versatile enough to cope at the same time with different languages and with different parts of a compositional model. It has to meet, on the one hand, the properties of the language as used in the medical domain and, on the other hand, the specific characteristics of the underlying model and its representation formalism. We propose a semantic-oriented approach to natural language generation that is based on linguistic annotations to a concept model. This approach is realized as an integral part of a Terminology Server, built around the concept model and offering different terminological services for clinical applications.

  9. Using statistical deformable models to reconstruct vocal tract shape from magnetic resonance images.

    PubMed

    Vasconcelos, M J M; Rua Ventura, S M; Freitas, D R S; Tavares, J M R S

    2010-10-01

    The mechanisms involved in speech production are complex and have thus been subject to growing attention by the scientific community. It has been demonstrated that magnetic resonance imaging (MRI) is a powerful means of understanding the morphology of the vocal tract. Over the last few years, statistical deformable models have been successfully used to identify and characterize bones and organs in medical images, and point distribution models (PDMs) have gained particular relevance. In this work, the suitability of these models has been studied to characterize and further reconstruct, with the aid of MR images, the shape of the vocal tract in the articulation of the speech sounds of European Portuguese (EP), one of the most spoken languages worldwide. To this end, a PDM has been built from a set of MR images acquired during the artificially sustained articulation of 25 EP speech sounds. Following this, the capacity of this statistical model to characterize the shape deformation of the vocal tract during the production of sounds was analysed. Next, the model was used to reconstruct five EP oral vowels and the EP fricative consonants. As far as studies on speech production are concerned, this study is considered to be the first approach to characterize and reconstruct the vocal tract shape from MR images by using PDMs. In addition, the findings permit one to conclude that this modelling technique enables an enhanced understanding of the dynamic speech events involved in sustained articulations based on MRI, which are of particular interest for speech rehabilitation and simulation.

  10. Linguistics: Modelling the dynamics of language death

    NASA Astrophysics Data System (ADS)

    Abrams, Daniel M.; Strogatz, Steven H.

    2003-08-01

    Thousands of the world's languages are vanishing at an alarming rate, with 90% of them being expected to disappear with the current generation. Here we develop a simple model of language competition that explains historical data on the decline of Welsh, Scottish Gaelic, Quechua (the most common surviving indigenous language in the Americas) and other endangered languages. A linguistic parameter that quantifies the threat of language extinction can be derived from the model and may be useful in the design and evaluation of language-preservation programmes.
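    The abstract does not state the model's equations. As recalled from the published paper, the fraction x of speakers of language X evolves as dx/dt = (1 - x) c x^a s - x c (1 - x)^a (1 - s), where s is the relative status of X and a is an exponent the authors estimate at roughly 1.31. Treat that functional form and the parameter values as assumptions of this sketch. The function name, the forward-Euler integration, and the step sizes are likewise illustrative choices rather than anything taken from the source.

```python
def simulate_language_competition(x0, s, a=1.31, c=1.0, dt=0.01, steps=20000):
    """Integrate the assumed two-language competition model with forward Euler.

    x0 : initial fraction of the population speaking language X
    s  : relative status of language X, between 0 and 1
    a  : exponent on the attracting population (assumed value)
    c  : rate constant (assumed value)
    Returns the fraction speaking X after steps * dt time units.
    """
    x = x0
    for _ in range(steps):
        y = 1.0 - x
        gain = y * c * (x ** a) * s          # Y speakers converting to X
        loss = x * c * (y ** a) * (1.0 - s)  # X speakers converting to Y
        x += dt * (gain - loss)
        x = min(1.0, max(0.0, x))            # keep the fraction in [0, 1]
    return x


# Same starting share, different status: low status declines, high status grows.
low_status = simulate_language_competition(x0=0.4, s=0.4)
high_status = simulate_language_competition(x0=0.4, s=0.7)
print(low_status, high_status)
```

    In this form the coexistence point is unstable, so one language eventually takes over. The status parameter s plays the role of the "linguistic parameter that quantifies the threat of language extinction" mentioned in the abstract.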

  11. Lexical diversity and omission errors as predictors of language ability in the narratives of sequential Spanish-English bilinguals: a cross-language comparison.

    PubMed

    Jacobson, Peggy F; Walden, Patrick R

    2013-08-01

    This study explored the utility of language sample analysis for evaluating language ability in school-age Spanish-English sequential bilingual children. Specifically, the relative potential of lexical diversity and word/morpheme omission as predictors of typical or atypical language status was evaluated. Narrative samples were obtained from 48 bilingual children in both of their languages using the suggested narrative retell protocol and coding conventions as per Systematic Analysis of Language Transcripts (SALT; Miller & Iglesias, 2008) software. An additional lexical diversity measure, VocD, was also calculated. A series of logistical hierarchical regressions explored the utility of the number of different words, VocD statistic, and word and morpheme omissions in each language for predicting language status. Omission errors turned out to be the best predictors of bilingual language impairment at all ages, and this held true across languages. Although lexical diversity measures did not predict typical or atypical language status, the measures were significantly related to oral language proficiency in English and Spanish. The results underscore the significance of omission errors in bilingual language impairment while simultaneously revealing the limitations of lexical diversity measures as indicators of impairment. The relationship between lexical diversity and oral language proficiency highlights the importance of considering relative language proficiency in bilingual assessment.

  12. Causal modelling applied to the risk assessment of a wastewater discharge.

    PubMed

    Paul, Warren L; Rokahr, Pat A; Webb, Jeff M; Rees, Gavin N; Clune, Tim S

    2016-03-01

    Bayesian networks (BNs), or causal Bayesian networks, have become quite popular in ecological risk assessment and natural resource management because of their utility as a communication and decision-support tool. Since their development in the field of artificial intelligence in the 1980s, however, Bayesian networks have evolved and merged with structural equation modelling (SEM). Unlike BNs, which are constrained to encode causal knowledge in conditional probability tables, SEMs encode this knowledge in structural equations, which is thought to be a more natural language for expressing causal information. This merger has clarified the causal content of SEMs and generalised the method such that it can now be performed using standard statistical techniques. As it was with BNs, the utility of this new generation of SEM in ecological risk assessment will need to be demonstrated with examples to foster an understanding and acceptance of the method. Here, we applied SEM to the risk assessment of a wastewater discharge to a stream, with a particular focus on the process of translating a causal diagram (conceptual model) into a statistical model which might then be used in the decision-making and evaluation stages of the risk assessment. The process of building and testing a spatial causal model is demonstrated using data from a spatial sampling design, and the implications of the resulting model are discussed in terms of the risk assessment. It is argued that a spatiotemporal causal model would have greater external validity than the spatial model, enabling broader generalisations to be made regarding the impact of a discharge, and greater value as a tool for evaluating the effects of potential treatment plant upgrades. Suggestions are made on how the causal model could be augmented to include temporal as well as spatial information, including suggestions for appropriate statistical models and analyses.

  13. Learning abstract visual concepts via probabilistic program induction in a Language of Thought.

    PubMed

    Overlan, Matthew C; Jacobs, Robert A; Piantadosi, Steven T

    2017-11-01

    The ability to learn abstract concepts is a powerful component of human cognition. It has been argued that variable binding is the key element enabling this ability, but the computational aspects of variable binding remain poorly understood. Here, we address this shortcoming by formalizing the Hierarchical Language of Thought (HLOT) model of rule learning. Given a set of data items, the model uses Bayesian inference to infer a probability distribution over stochastic programs that implement variable binding. Because the model makes use of symbolic variables as well as Bayesian inference and programs with stochastic primitives, it combines many of the advantages of both symbolic and statistical approaches to cognitive modeling. To evaluate the model, we conducted an experiment in which human subjects viewed training items and then judged which test items belong to the same concept as the training items. We found that the HLOT model provides a close match to human generalization patterns, significantly outperforming two variants of the Generalized Context Model, one variant based on string similarity and the other based on visual similarity using features from a deep convolutional neural network. Additional results suggest that variable binding happens automatically, implying that binding operations do not add complexity to people's hypothesized rules. Overall, this work demonstrates that a cognitive model combining symbolic variables with Bayesian inference and stochastic program primitives provides a new perspective for understanding people's patterns of generalization. Copyright © 2017 Elsevier B.V. All rights reserved.

  14. Influence of schooling on language abilities of adults without linguistic disorders.

    PubMed

    Soares, Ellen Cristina Siqueira; Ortiz, Karin Zazo

    2009-01-01

    In order to properly assess language, sociodemographic variables that can influence the linguistic performance of individuals with or without linguistic disorders need to be taken into account. The aim of this study was to evaluate the influence of schooling and age on the results from the Montreal Toulouse (Modified MT Beta-86) language assessment test among individuals without linguistic disorders. Cross-sectional study carried out between March 2006 and August 2007 in the Speech, Language and Hearing Pathology Department of Universidade Federal de São Paulo (Unifesp), São Paulo, Brazil. Eighty volunteers were selected. Schooling was stratified into three bands: A (1-4 years), B (5-8 years) and C (nine years and over). The age range was from 17 to 80 years. All the subjects underwent the Montreal Toulouse (Modified MT Beta-86) language assessment protocol. Statistically significant differences were found in relation to schooling levels, in the tasks of oral comprehension, reading, graphical comprehension, naming, lexical availability, dictation, graphical naming of actions and number reading. Statistically significant age-related differences in dictation and lexical availability tasks were observed. The Montreal Toulouse (Modified MT Beta-86) test seems to be sensitive to variations in schooling and age. These variables should be taken into account when this test is used for assessing patients with brain damage.

  15. Phonological deficits in specific language impairment and developmental dyslexia: towards a multidimensional model

    PubMed Central

    Ramus, Franck; Marshall, Chloe R.; Rosen, Stuart

    2013-01-01

    An on-going debate surrounds the relationship between specific language impairment and developmental dyslexia, in particular with respect to their phonological abilities. Are these distinct disorders? To what extent do they overlap? Which cognitive and linguistic profiles correspond to specific language impairment, dyslexia and comorbid cases? At least three different models have been proposed: the severity model, the additional deficit model and the component model. We address this issue by comparing children with specific language impairment only, those with dyslexia-only, those with specific language impairment and dyslexia and those with no impairment, using a broad test battery of language skills. We find that specific language impairment and dyslexia do not always co-occur, and that some children with specific language impairment do not have a phonological deficit. Using factor analysis, we find that language abilities across the four groups of children have at least three independent sources of variance: one for non-phonological language skills and two for distinct sets of phonological abilities (which we term phonological skills versus phonological representations). Furthermore, children with specific language impairment and dyslexia show partly distinct profiles of phonological deficit along these two dimensions. We conclude that a multiple-component model of language abilities best explains the relationship between specific language impairment and dyslexia and the different profiles of impairment that are observed. PMID:23413264

  16. Conceptual and non-conceptual repetition priming in category exemplar generation: Evidence from bilinguals.

    PubMed

    Francis, Wendy S; Fernandez, Norma P; Bjork, Robert A

    2010-10-01

    One measure of conceptual implicit memory is repetition priming in the generation of exemplars from a semantic category, but does such priming transfer across languages? That is, do the overlapping conceptual representations for translation equivalents provide a sufficient basis for such priming? In Experiment 1 (N=96) participants carried out a deep encoding task, and priming between languages was statistically reliable, but attenuated, relative to within-language priming. Experiment 2 (N=96) replicated the findings of Experiment 1 and assessed the contributions of conceptual and non-conceptual processes using a levels-of-processing manipulation. Words that underwent shallow encoding exhibited within-language, but not between-language, priming. Priming in shallow conditions cannot therefore be explained by incidental activation of the concept. Instead, part of the within-language priming effect, even under deep-encoding conditions, is due to increased availability of language-specific lemmas or phonological word forms.

  17. Conceptual and Non-conceptual Repetition Priming in Category Exemplar Generation: Evidence from Bilinguals

    PubMed Central

    Francis, Wendy S.; Fernandez, Norma P.; Bjork, Robert A.

    2010-01-01

    One measure of conceptual implicit memory is repetition priming in the generation of exemplars from a semantic category, but does such priming transfer across languages? That is, do the overlapping conceptual representations for translation equivalents provide a sufficient basis for such priming? In Experiment 1 (N = 96), participants carried out a deep encoding task, and priming between languages was statistically reliable, but attenuated, relative to within-language priming. Experiment 2 (N = 96) replicated the findings of Experiment 1 and assessed the contributions of conceptual and non-conceptual processes using a levels-of-processing manipulation. Words that underwent shallow encoding exhibited within-language, but not between-language, priming. Priming in shallow conditions cannot, therefore, be explained by incidental activation of the concept. Instead, part of the within-language priming effect, even under deep-encoding conditions, is due to increased availability of language-specific lemmas or phonological word forms. PMID:20924951

  18. Conceptual clusters in figurative language production.

    PubMed

    Corts, Daniel P; Meyers, Kristina

    2002-07-01

    Although most prior research on figurative language examines comprehension, several recent studies on the production of such language have proved to be informative. One of the most noticeable traits of figurative language production is that it is produced at a somewhat random rate with occasional bursts of highly figurative speech (e.g., Corts & Pollio, 1999). The present article seeks to extend these findings by observing production during speech that involves a very high base rate of figurative language, making statistically defined bursts difficult to detect. In an analysis of three Baptist sermons, burst-like clusters of figurative language were identified. Further study indicated that these clusters largely involve a central root metaphor that represents the topic under consideration. An interaction of the coherence, along with a conceptual understanding of a topic and the relative importance of the topic to the purpose of the speech, is offered as the most likely explanation for the clustering of figurative language in natural speech.

  19. First stage identification of syntactic elements in an extra-terrestrial signal

    NASA Astrophysics Data System (ADS)

    Elliott, John

    2011-02-01

    By investigating the generic attributes of a representative set of terrestrial languages at varying levels of abstraction, it is our endeavour to try and isolate elements of the signal universe, which are computationally tractable for its detection and structural decipherment. Ultimately, our aim is to contribute in some way to the understanding of what 'languageness' actually is. This paper describes algorithms and software developed to characterise and detect generic intelligent language-like features in an input signal, using natural language learning techniques: looking for characteristic statistical "language-signatures" in test corpora. As a first step towards such species-independent language-detection, we present a suite of programs to analyse digital representations of a range of data, and use the results to extrapolate whether or not there are language-like structures which distinguish this data from other sources, such as music, images, and white noise.

  20. The proper treatment of language acquisition and change in a population setting.

    PubMed

    Niyogi, Partha; Berwick, Robert C

    2009-06-23

    Language acquisition maps linguistic experience, primary linguistic data (PLD), onto linguistic knowledge, a grammar. Classically, computational models of language acquisition assume a single target grammar and one PLD source, the central question being whether the target grammar can be acquired from the PLD. However, real-world learners confront populations with variation, i.e., multiple target grammars and PLDs. Removing this idealization has inspired a new class of population-based language acquisition models. This paper contrasts 2 such models. In the first, iterated learning (IL), each learner receives PLD from one target grammar but different learners can have different targets. In the second, social learning (SL), each learner receives PLD from possibly multiple targets, e.g., from 2 parents. We demonstrate that these 2 models have radically different evolutionary consequences. The IL model is dynamically deficient in 2 key respects. First, the IL model admits only linear dynamics and so cannot describe phase transitions, attested rapid changes in languages over time. Second, the IL model cannot properly describe the stability of languages over time. In contrast, the SL model leads to nonlinear dynamics, bifurcations, and possibly multiple equilibria and so suffices to model both the case of stable language populations, mixtures of more than 1 language, as well as rapid language change. The 2 models also make distinct, empirically testable predictions about language change. Using historical data, we show that the SL model more faithfully replicates the dynamics of the evolution of Middle English.

  1. Parents' and Speech and Language Therapists' Explanatory Models of Language Development, Language Delay and Intervention

    ERIC Educational Resources Information Center

    Marshall, Julie; Goldbart, Juliet; Phillips, Julie

    2007-01-01

    Background: Parental and speech and language therapist (SLT) explanatory models may affect engagement with speech and language therapy, but there has been dearth of research in this area. This study investigated parents' and SLTs' views about language development, delay and intervention in pre-school children with language delay. Aims: The aims…

  2. Understanding Language Learning: Review of the Application of the Interaction Model in Foreign Language Contexts

    ERIC Educational Resources Information Center

    Dixon, L. Quentin; Wu, Shuang

    2014-01-01

    Purpose: This paper examined the application of the input-interaction-output model in English-as-Foreign-Language (EFL) learning environments with four specific questions: (1) How do the three components function in the model? (2) Does interaction in the foreign language classroom seem to be effective for foreign language acquisition? (3) What…

  3. The Bilingual Language Interaction Network for Comprehension of Speech*

    PubMed Central

    Marian, Viorica

    2013-01-01

    During speech comprehension, bilinguals co-activate both of their languages, resulting in cross-linguistic interaction at various levels of processing. This interaction has important consequences for both the structure of the language system and the mechanisms by which the system processes spoken language. Using computational modeling, we can examine how cross-linguistic interaction affects language processing in a controlled, simulated environment. Here we present a connectionist model of bilingual language processing, the Bilingual Language Interaction Network for Comprehension of Speech (BLINCS), wherein interconnected levels of processing are created using dynamic, self-organizing maps. BLINCS can account for a variety of psycholinguistic phenomena, including cross-linguistic interaction at and across multiple levels of processing, cognate facilitation effects, and audio-visual integration during speech comprehension. The model also provides a way to separate two languages without requiring a global language-identification system. We conclude that BLINCS serves as a promising new model of bilingual spoken language comprehension. PMID:24363602

  4. Neurocognitive mechanisms of statistical-sequential learning: what do event-related potentials tell us?

    PubMed Central

    Daltrozzo, Jerome; Conway, Christopher M.

    2014-01-01

    Statistical-sequential learning (SL) is the ability to process patterns of environmental stimuli, such as spoken language, music, or one’s motor actions, that unfold in time. The underlying neurocognitive mechanisms of SL and the associated cognitive representations are still not well understood as reflected by the heterogeneity of the reviewed cognitive models. The purpose of this review is: (1) to provide a general overview of the primary models and theories of SL, (2) to describe the empirical research – with a focus on the event-related potential (ERP) literature – in support of these models while also highlighting the current limitations of this research, and (3) to present a set of new lines of ERP research to overcome these limitations. The review is articulated around three descriptive dimensions in relation to SL: the level of abstractness of the representations learned through SL, the effect of the level of attention and consciousness on SL, and the developmental trajectory of SL across the life-span. We conclude with a new tentative model that takes into account these three dimensions and also point to several promising new lines of SL research. PMID:24994975

  5. Seamless Language Learning: Second Language Learning with Social Media

    ERIC Educational Resources Information Center

    Wong, Lung-Hsiang; Chai, Ching Sing; Aw, Guat Poh

    2017-01-01

    This conceptual paper describes a language learning model that applies social media to foster contextualized and connected language learning in communities. The model emphasizes weaving together different forms of language learning activities that take place in different learning contexts to achieve seamless language learning. It promotes social…

  6. How Many Is Enough?—Statistical Principles for Lexicostatistics

    PubMed Central

    Zhang, Menghan; Gong, Tao

    2016-01-01

    Lexicostatistics has been applied in linguistics to inform phylogenetic relations among languages. There are two important yet not well-studied parameters in this approach: the conventional size of vocabulary list to collect potentially true cognates and the minimum matching instances required to confirm a recurrent sound correspondence. Here, we derive two statistical principles from stochastic theorems to quantify these parameters. These principles validate the practice of using the Swadesh 100- and 200-word lists to indicate degree of relatedness between languages, and enable a frequency-based, dynamic threshold to detect recurrent sound correspondences. Using statistical tests, we further evaluate the generality of the Swadesh 100-word list compared to the Swadesh 200-word list and other 100-word lists sampled randomly from the Swadesh 200-word list. All these provide mathematical support for applying lexicostatistics in historical and comparative linguistics. PMID:28018261

  7. A Framework for Thinking about Informal Statistical Inference

    ERIC Educational Resources Information Center

    Makar, Katie; Rubin, Andee

    2009-01-01

    Informal inferential reasoning has shown some promise in developing students' deeper understanding of statistical processes. This paper presents a framework to think about three key principles of informal inference--generalizations "beyond the data," probabilistic language, and data as evidence. The authors use primary school classroom…

  8. How are we related? Causality, correlation, and association.

    PubMed

    Jupiter, Daniel C

    2012-01-01

    A key to clear scientific communication is careful use of the technical language of statistics. In this commentary we discuss one commonly occurring juxtaposition of statistical terminology. Copyright © 2012 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  9. Network analysis of named entity co-occurrences in written texts

    NASA Astrophysics Data System (ADS)

    Amancio, Diego Raphael

    2016-06-01

    The use of methods borrowed from statistics and physics to analyze written texts has allowed the discovery of unprecedented patterns of human behavior and cognition by establishing links between model features and language structure. While current models have been useful to unveil patterns via analysis of syntactical and semantical networks, only a few works have probed the relevance of investigating the structure arising from the relationship between relevant entities such as characters, locations and organizations. In this study, we represent entities appearing in the same context as a co-occurrence network, where links are established according to a null model based on random, shuffled texts. Computational simulations performed in novels revealed that the proposed model displays interesting topological features, such as the small world feature, characterized by high values of clustering coefficient. The effectiveness of our model was verified in a practical pattern recognition task in real networks. When compared with traditional word adjacency networks, our model displayed optimized results in identifying unknown references in texts. Because the proposed representation plays a complementary role in characterizing unstructured documents via topological analysis of named entities, we believe that it could be useful to improve the characterization of written texts (and related systems), especially if combined with traditional approaches based on statistical and deeper paradigms.

  10. Statistical learning and the challenge of syntax: Beyond finite state automata

    NASA Astrophysics Data System (ADS)

    Elman, Jeff

    2003-10-01

    Over the past decade, it has been clear that even very young infants are sensitive to the statistical structure of language input presented to them, and use the distributional regularities to induce simple grammars. But can such statistically-driven learning also explain the acquisition of more complex grammar, particularly when the grammar includes recursion? Recent claims (e.g., Hauser, Chomsky, and Fitch, 2002) have suggested that the answer is no, and that at least recursion must be an innate capacity of the human language acquisition device. In this talk evidence will be presented that indicates that, in fact, statistically-driven learning (embodied in recurrent neural networks) can indeed enable the learning of complex grammatical patterns, including those that involve recursion. When the results are generalized to idealized machines, it is found that the networks are at least equivalent to Push Down Automata. Perhaps more interestingly, with limited and finite resources (such as are presumed to exist in the human brain) these systems demonstrate patterns of performance that resemble those in humans.

  11. Learning of grammar-like visual sequences by adults with and without language-learning disabilities.

    PubMed

    Aguilar, Jessica M; Plante, Elena

    2014-08-01

    Two studies examined learning of grammar-like visual sequences to determine whether a general deficit in statistical learning characterizes this population. Furthermore, we tested the hypothesis that difficulty in sustaining attention during the learning task might account for differences in statistical learning. In Study 1, adults with normal language (NL) or language-learning disability (LLD) were familiarized with the visual artificial grammar and then tested using items that conformed or deviated from the grammar. In Study 2, a 2nd sample of adults with NL and LLD were presented auditory word pairs with weak semantic associations (e.g., groom + clean) along with the visual learning task. Participants were instructed to attend to visual sequences and to ignore the auditory stimuli. Incidental encoding of these words would indicate reduced attention to the primary task. In Studies 1 and 2, both groups demonstrated learning and generalization of the artificial grammar. In Study 2, neither the NL nor the LLD group appeared to encode the words presented during the learning phase. The results argue against a general deficit in statistical learning for individuals with LLD and demonstrate that both NL and LLD learners can ignore extraneous auditory stimuli during visual learning.

  12. Linking sounds to meanings: infant statistical learning in a natural language.

    PubMed

    Hay, Jessica F; Pelucchi, Bruna; Graf Estes, Katharine; Saffran, Jenny R

    2011-09-01

    The processes of infant word segmentation and infant word learning have largely been studied separately. However, the ease with which potential word forms are segmented from fluent speech seems likely to influence subsequent mappings between words and their referents. To explore this process, we tested the link between the statistical coherence of sequences presented in fluent speech and infants' subsequent use of those sequences as labels for novel objects. Notably, the materials were drawn from a natural language unfamiliar to the infants (Italian). The results of three experiments suggest that there is a close relationship between the statistics of the speech stream and subsequent mapping of labels to referents. Mapping was facilitated when the labels contained high transitional probabilities in the forward and/or backward direction (Experiment 1). When no transitional probability information was available (Experiment 2), or when the internal transitional probabilities of the labels were low in both directions (Experiment 3), infants failed to link the labels to their referents. Word learning appears to be strongly influenced by infants' prior experience with the distribution of sounds that make up words in natural languages. Copyright © 2011 Elsevier Inc. All rights reserved.
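
    The forward and backward transitional probabilities mentioned in this abstract can be illustrated with a minimal sketch. The syllable stream below is a hypothetical toy example, not the study's Italian materials, and the counting convention (pair count divided by the count of the first or second element across adjacent pairs) is one common definition assumed here for illustration.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Return (forward, backward) transitional probabilities for adjacent pairs.

    forward[(x, y)]  = count(xy) / count(x as the first element of a pair)
    backward[(x, y)] = count(xy) / count(y as the second element of a pair)
    """
    pairs = list(zip(syllables, syllables[1:]))
    pair_counts = Counter(pairs)
    first_counts = Counter(x for x, _ in pairs)
    second_counts = Counter(y for _, y in pairs)
    forward = {p: c / first_counts[p[0]] for p, c in pair_counts.items()}
    backward = {p: c / second_counts[p[1]] for p, c in pair_counts.items()}
    return forward, backward


# Hypothetical toy stream: "fu ga" and "bi ci" always occur together.
stream = "fu ga bi ci me fu ga lo bi ci fu ga".split()
forward, backward = transitional_probabilities(stream)
print(forward[("fu", "ga")], backward[("fu", "ga")])  # high in both directions
print(forward[("ga", "bi")], backward[("ci", "fu")])  # lower across "word" boundaries
```

    In this toy stream, pairs inside a recurring unit score higher in both directions than pairs that straddle a boundary, which is the kind of contrast the experiments manipulate.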

  13. SAP- FORTRAN STATIC SOURCE CODE ANALYZER PROGRAM (IBM VERSION)

    NASA Technical Reports Server (NTRS)

    Manteufel, R.

    1994-01-01

    The FORTRAN Static Source Code Analyzer program, SAP, was developed to automatically gather statistics on the occurrences of statements and structures within a FORTRAN program and to provide for the reporting of those statistics. Provisions have been made for weighting each statistic and to provide an overall figure of complexity. Statistics, as well as figures of complexity, are gathered on a module by module basis. Overall summed statistics are also accumulated for the complete input source file. SAP accepts as input syntactically correct FORTRAN source code written in the FORTRAN 77 standard language. In addition, code written using features in the following languages is also accepted: VAX-11 FORTRAN, IBM S/360 FORTRAN IV Level H Extended; and Structured FORTRAN. The SAP program utilizes two external files in its analysis procedure. A keyword file allows flexibility in classifying statements and in marking a statement as either executable or non-executable. A statistical weight file allows the user to assign weights to all output statistics, thus allowing the user flexibility in defining the figure of complexity. The SAP program is written in FORTRAN IV for batch execution and has been implemented on a DEC VAX series computer under VMS and on an IBM 370 series computer under MVS. The SAP program was developed in 1978 and last updated in 1985.
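
    The idea of combining a keyword file with a weight file to produce a figure of complexity can be sketched as follows. This is not SAP's actual algorithm or file format: the keyword table, the weights, the statistic names and the line-matching rule are all hypothetical, chosen only to show how weighted statement counts might be summed.

```python
import re
from collections import Counter

# Hypothetical keyword table: leading keyword -> (statistic name, is executable).
KEYWORDS = {
    "IF": ("IF", True),
    "DO": ("DO", True),
    "GOTO": ("GOTO", True),
    "CALL": ("CALL", True),
    "INTEGER": ("DECLARATION", False),
    "REAL": ("DECLARATION", False),
}

# Hypothetical weights; statistics without an entry contribute nothing.
WEIGHTS = {"IF": 2.0, "DO": 3.0, "GOTO": 5.0, "CALL": 1.0, "DECLARATION": 0.0}


def gather_statistics(source):
    """Count statements by leading keyword; return (counts, executable total)."""
    stats = Counter()
    executable = 0
    for line in source.splitlines():
        # Skip blank lines and fixed-form comment lines (C or * in column 1).
        if not line.strip() or line[0] in "Cc*":
            continue
        # Skip an optional statement label, then take the leading word.
        match = re.match(r"\s*\d*\s*([A-Z]+)", line.upper())
        word = match.group(1) if match else ""
        name, is_executable = KEYWORDS.get(word, ("OTHER", True))
        stats[name] += 1
        executable += is_executable
    return stats, executable


def figure_of_complexity(stats, weights):
    """Weighted sum of the gathered statistics."""
    return sum(weights.get(name, 0.0) * count for name, count in stats.items())


SAMPLE = """\
C     Toy module
      INTEGER I
      REAL X
      DO 10 I = 1, 5
      IF (X .GT. 0.0) CALL FOO(X)
   10 CONTINUE
"""
stats, executable = gather_statistics(SAMPLE)
complexity = figure_of_complexity(stats, WEIGHTS)
print(dict(stats), executable, complexity)
```

    Keeping the classification and the weights in separate tables mirrors the flexibility the abstract describes: either can be changed without touching the counting code.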

  14. SAP- FORTRAN STATIC SOURCE CODE ANALYZER PROGRAM (DEC VAX VERSION)

    NASA Technical Reports Server (NTRS)

    Merwarth, P. D.

    1994-01-01

    The FORTRAN Static Source Code Analyzer program, SAP, was developed to automatically gather statistics on the occurrences of statements and structures within a FORTRAN program and to provide for the reporting of those statistics. Provisions have been made for weighting each statistic and to provide an overall figure of complexity. Statistics, as well as figures of complexity, are gathered on a module by module basis. Overall summed statistics are also accumulated for the complete input source file. SAP accepts as input syntactically correct FORTRAN source code written in the FORTRAN 77 standard language. In addition, code written using features in the following languages is also accepted: VAX-11 FORTRAN, IBM S/360 FORTRAN IV Level H Extended; and Structured FORTRAN. The SAP program utilizes two external files in its analysis procedure. A keyword file allows flexibility in classifying statements and in marking a statement as either executable or non-executable. A statistical weight file allows the user to assign weights to all output statistics, thus allowing the user flexibility in defining the figure of complexity. The SAP program is written in FORTRAN IV for batch execution and has been implemented on a DEC VAX series computer under VMS and on an IBM 370 series computer under MVS. The SAP program was developed in 1978 and last updated in 1985.

  15. Infant Directed Speech Enhances Statistical Learning in Newborn Infants: An ERP Study

    PubMed Central

    Teinonen, Tuomas; Tervaniemi, Mari; Huotilainen, Minna

    2016-01-01

    Statistical learning and the social contexts of language addressed to infants are hypothesized to play important roles in early language development. Previous behavioral work has found that the exaggerated prosodic contours of infant-directed speech (IDS) facilitate statistical learning in 8-month-old infants. Here we examined the neural processes involved in on-line statistical learning and investigated whether the use of IDS facilitates statistical learning in sleeping newborns. Event-related potentials (ERPs) were recorded while newborns were exposed to 12 pseudo-words, six spoken with exaggerated pitch contours of IDS and six spoken without exaggerated pitch contours (ADS) in ten alternating blocks. We examined whether ERP amplitudes for syllable position within a pseudo-word (word-initial vs. word-medial vs. word-final, indicating statistical word learning) and speech register (ADS vs. IDS) would interact. The ADS and IDS registers elicited similar ERP patterns for syllable position in an early 0–100 ms component but elicited different ERP effects in both the polarity and topographical distribution at 200–400 ms and 450–650 ms. These results provide the first evidence that the exaggerated pitch contours of IDS result in differences in brain activity linked to on-line statistical learning in sleeping newborns. PMID:27617967

  16. Assessing segmentation processes by click detection: online measure of statistical learning, or simple interference?

    PubMed

    Franco, Ana; Gaillard, Vinciane; Cleeremans, Axel; Destrebecqz, Arnaud

    2015-12-01

    Statistical learning can be used to extract the words from continuous speech. Gómez, Bion, and Mehler (Language and Cognitive Processes, 26, 212-223, 2011) proposed an online measure of statistical learning: They superimposed auditory clicks on a continuous artificial speech stream made up of a random succession of trisyllabic nonwords. Participants were instructed to detect these clicks, which could be located either within or between words. The results showed that, over the length of exposure, reaction times (RTs) increased more for within-word than for between-word clicks. This result has been accounted for by means of statistical learning of the between-word boundaries. However, even though statistical learning occurs without an intention to learn, it nevertheless requires attentional resources. Therefore, this process could be affected by a concurrent task such as click detection. In the present study, we evaluated the extent to which the click detection task indeed reflects successful statistical learning. Our results suggest that the emergence of RT differences between within- and between-word click detection is neither systematic nor related to the successful segmentation of the artificial language. Therefore, instead of being an online measure of learning, the click detection task seems to interfere with the extraction of statistical regularities.

  17. Language screening in preschool Chinese children.

    PubMed

    Wong, V; Lee, P W; Lieh-Mak, F; Yeung, C Y; Leung, P W; Luk, S L; Yiu, E

    1992-01-01

    The incidence of language delay in Chinese preschool children was studied by a stratified proportional sampling of all 3 year olds in Hong Kong. The Developmental Language Screening Scale (DLSS) devised for use with Cantonese speaking children was used to identify children with language delay. Of 855 children sampled in the stage I screening procedure, 4%, 2.8% and 3.3% were identified as having delay in verbal comprehension, expression or both respectively. The stage II clinical diagnostic study included a randomly selected group of children screened in stage I with or without any associated behavioural problem. Among these, 3.4% were identified as having a language delay using the Reynell Language Developmental Scale (RDLS) with a criterion of language age of less than or equal to two-thirds of the chronological age; 3% had specific language delay using the criteria of language age less than or equal to two-thirds the chronological age and developmental age more than or equal to two-thirds the chronological age. More boys were found to have language delay, although this was not statistically significant.
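
    The two stage II criteria stated in this abstract (language age at most two-thirds of chronological age, and, for specific language delay, developmental age at least two-thirds of chronological age) can be written as a small check. The function name, the use of months as units and the example ages are assumptions for illustration only.

```python
def classify_language_delay(chronological_months, language_months, developmental_months):
    """Apply the two-thirds criteria using integer arithmetic (ages in months)."""
    # Language age <= 2/3 of chronological age.
    language_delay = 3 * language_months <= 2 * chronological_months
    # Developmental age >= 2/3 of chronological age.
    development_ok = 3 * developmental_months >= 2 * chronological_months
    if language_delay and development_ok:
        return "specific language delay"
    if language_delay:
        return "language delay"
    return "no language delay"


# Hypothetical 36-month-old children.
print(classify_language_delay(36, 24, 30))
print(classify_language_delay(36, 20, 20))
print(classify_language_delay(36, 30, 34))
```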

  18. Modelling the Effects of Land-Use Changes on Climate: a Case Study on Yamula DAM

    NASA Astrophysics Data System (ADS)

    Köylü, Ü.; Geymen, A.

    2016-10-01

    Dams block the flow of rivers and create artificial reservoirs, which affect the climate and land-use characteristics of the river basin. This research analyses the effect of the large water body created by Yamula Dam in the Kızılırmak Basin on the land use and climate of the surrounding area. The Mann-Kendall non-parametric statistical test, the Theil-Sen slope method, Inverse Distance Weighting (IDW) and the Soil Conservation Service-Curve Number (SCS-CN) method are integrated for spatial and temporal analysis of the research area. Humidity, temperature, wind speed and precipitation observations collected at 16 weather stations near the Kızılırmak Basin are analysed. This statistical information is then combined with GIS data over the years. An application for GIS analysis was developed in the Python programming language and integrated with ArcGIS software. The statistical analysis was carried out in the R Project for Statistical Computing and integrated with the developed application. According to the statistical analysis of the extracted time series of meteorological parameters, statistically significant spatiotemporal trends are observed for climate change and land-use characteristics. In this study, we indicate the effect of large dams on local climate for the semi-arid Yamula Dam region.
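
    The two trend statistics named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' R or Python code: it assumes a series without tied values (no tie correction in the variance), uses the usual normal approximation with continuity correction for the Mann-Kendall test, and runs on hypothetical yearly temperatures.

```python
import math
from itertools import combinations
from statistics import median


def mann_kendall(values):
    """Return the Mann-Kendall S statistic and its normal-approximation Z score.

    Assumes no tied values, so the variance has no tie-correction term.
    """
    n = len(values)
    # S sums the signs of all later-minus-earlier differences.
    s = sum((b > a) - (b < a) for a, b in combinations(values, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


def theil_sen_slope(times, values):
    """Return the median of all pairwise slopes."""
    slopes = [
        (values[j] - values[i]) / (times[j] - times[i])
        for i, j in combinations(range(len(values)), 2)
        if times[j] != times[i]
    ]
    return median(slopes)


# Hypothetical yearly mean temperatures, not data from the study.
years = [2000, 2001, 2002, 2003, 2004, 2005]
temps = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5]
s, z = mann_kendall(temps)
slope = theil_sen_slope(years, temps)
print(s, round(z, 2), slope)
```

    The Mann-Kendall test only says whether a monotonic trend is present; the Theil-Sen slope supplies a robust estimate of its magnitude, which is why the two are commonly paired.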

  19. Language Sampling for Preschoolers With Severe Speech Impairments

    PubMed Central

    Ragsdale, Jamie; Bustos, Aimee

    2016-01-01

    Purpose: The purposes of this investigation were to determine if measures such as mean length of utterance (MLU) and percentage of comprehensible words can be derived reliably from language samples of children with severe speech impairments and if such measures correlate with tools that measure constructs assumed to be related. Method: Language samples of 15 preschoolers with severe speech impairments (but receptive language within normal limits) were transcribed independently by 2 transcribers. Nonparametric statistics were used to determine which measures, if any, could be transcribed reliably and to determine if correlations existed between language sample measures and standardized measures of speech, language, and cognition. Results: Reliable measures were extracted from the majority of the language samples, including MLU in words, mean number of syllables per utterance, and percentage of comprehensible words. Language sample comprehensibility measures were correlated with a single word comprehensibility task. Also, language sample MLUs and mean length of the participants' 3 longest sentences from the MacArthur–Bates Communicative Development Inventory (Fenson et al., 2006) were correlated. Conclusion: Language sampling, given certain modifications, may be used for some 3- to 5-year-old children with normal receptive language who have severe speech impairments to provide reliable expressive language and comprehensibility information. PMID:27552110
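
    Two of the measures named in this abstract, MLU in words and percentage of comprehensible words, reduce to simple counts over a transcript. The following is a minimal sketch, not the study's transcription protocol: the function names are hypothetical, words are assumed to be whitespace-separated, and the marker "XXX" for a word the transcriber could not understand is an assumed convention.

```python
UNINTELLIGIBLE = "XXX"  # assumed marker for a word the transcriber could not understand


def mlu_in_words(utterances):
    """Mean length of utterance in words, skipping empty utterances."""
    counts = [len(u.split()) for u in utterances if u.split()]
    if not counts:
        raise ValueError("no non-empty utterances")
    return sum(counts) / len(counts)


def percent_comprehensible(utterances):
    """Percentage of transcribed words that are not marked unintelligible."""
    words = [w for u in utterances for w in u.split()]
    if not words:
        raise ValueError("no words in sample")
    understood = sum(1 for w in words if w != UNINTELLIGIBLE)
    return 100 * understood / len(words)


# Hypothetical three-utterance sample, for illustration only.
sample = ["I want juice", "more", "doggy go XXX"]
print(mlu_in_words(sample))
print(percent_comprehensible(sample))
```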

  20. White-matter microstructure and language lateralization in left-handers: a whole-brain MRI analysis.

    PubMed

    Perlaki, Gabor; Horvath, Reka; Orsi, Gergely; Aradi, Mihaly; Auer, Tibor; Varga, Eszter; Kantor, Gyongyi; Altbäcker, Anna; John, Flora; Doczi, Tamas; Komoly, Samuel; Kovacs, Norbert; Schwarcz, Attila; Janszky, Jozsef

    2013-08-01

    Most people are left-hemisphere dominant for language. However, the neuroanatomy of language lateralization is not fully understood. By combining functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI), we studied whether language lateralization is associated with cerebral white-matter (WM) microstructure. Sixteen healthy, left-handed women aged 20-25 were included in the study. Left-handers were targeted in order to increase the chances of involving subjects with atypical language lateralization. Language lateralization was determined by fMRI using a verbal fluency paradigm. Tract-based spatial statistics analysis of DTI data was applied to test for WM microstructural correlates of language lateralization across the whole brain. Fractional anisotropy and mean diffusivity were used as indicators of WM microstructural organization. Right-hemispheric language dominance was associated with reduced microstructural integrity of the left superior longitudinal fasciculus and left-sided parietal lobe WM. In left-handed women, reduced integrity of the left-sided language-related tracts may be closely linked to the development of right-hemispheric language dominance. Our results may offer new insights into language lateralization and structure-function relationships in the human language system. Copyright © 2013 Elsevier Inc. All rights reserved.
