Brysbaert, Marc; Keuleers, Emmanuel; New, Boris
2011-01-01
In this Perspective Article we assess the usefulness of Google's new word frequencies for word recognition research (lexical decision and word naming). We find that, despite the massive corpus on which the Google estimates are based (131 billion words from books published in the United States alone), the Google American English frequencies explain 11% less of the variance in the lexical decision times from the English Lexicon Project (Balota et al., 2007) than the SUBTLEX-US word frequencies, based on a corpus of 51 million words from film and television subtitles. Further analyses indicate that word frequencies derived from recent books (published after 2000) are better predictors of word processing times than frequencies based on the full corpus, and that word frequencies based on fiction books predict word processing times better than word frequencies based on the full corpus. The most predictive word frequencies from Google still do not explain more of the variance in word recognition times of undergraduate students and old adults than the subtitle-based word frequencies. PMID:21713191
Interplay between morphology and frequency in lexical access: The case of the base frequency effect
Vannest, Jennifer; Newport, Elissa L.; Newman, Aaron J.; Bavelier, Daphne
2011-01-01
A major issue in lexical processing concerns storage and access of lexical items. Here we make use of the base frequency effect to examine this. Specifically, reaction time to morphologically complex words (words made up of base and suffix, e.g., agree+able) typically reflects frequency of the base element (i.e., total frequency of all words in which agree appears) rather than surface word frequency (i.e., frequency of agreeable itself). We term these complex words decomposable. However, a class of words termed whole-word do not show such sensitivity to base frequency (e.g., serenity). Using an event-related MRI design, we exploited the fact that processing low-frequency words increases BOLD activity relative to high frequency ones, and examined effects of base frequency on brain activity for decomposable and whole-word items. Morphologically complex words, half high and half low base frequency, were compared to matched high and low frequency simple monomorphemic words using a lexical decision task. Morphologically complex words increased activation in left inferior frontal and left superior temporal cortices versus simple words. The only area to mirror the behavioral distinction between decomposable and whole-word types was the thalamus. Surprisingly, most frequency-sensitive areas failed to show base frequency effects. This variety of responses to frequency and word type across brain areas supports an integrative view of multiple variables during lexical access, rather than a dichotomy between memory-based access and on-line computation. Lexical access appears best captured as interplay of several neural processes with different sensitivities to various linguistic factors including frequency and morphological complexity. PMID:21167136
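The difference between surface frequency and base frequency described in this abstract can be illustrated with a small Python sketch. The counts and the form-to-base mapping below are made-up placeholders for illustration only; they are not taken from the study or from any corpus.

```python
# Hypothetical surface counts (illustrative values, not corpus data).
surface_counts = {
    "agree": 120,
    "agrees": 40,
    "agreed": 90,
    "agreeable": 6,
    "agreement": 75,
    "serenity": 4,
}

# Hypothetical mapping from each word form to its base element.
base_of = {
    "agree": "agree",
    "agrees": "agree",
    "agreed": "agree",
    "agreeable": "agree",
    "agreement": "agree",
    "serenity": "serene",
}


def surface_frequency(word):
    """Frequency of the exact word form itself."""
    return surface_counts.get(word, 0)


def base_frequency(word):
    """Summed frequency of all listed forms that share the word's base."""
    base = base_of[word]
    return sum(count for form, count in surface_counts.items()
               if base_of[form] == base)
```

With these placeholder counts, "agreeable" has a low surface frequency but a high base frequency, which is the contrast the base frequency effect relies on.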
SUBTLEX-ESP: Spanish Word Frequencies Based on Film Subtitles
ERIC Educational Resources Information Center
Cuetos, Fernando; Glez-Nosti, Maria; Barbon, Analia; Brysbaert, Marc
2011-01-01
Recent studies have shown that word frequency estimates obtained from film and television subtitles predict performance in word recognition experiments better than the traditional word frequency estimates based on books and newspapers. In this study, we present a subtitle-based word frequency list for Spanish, one of the most widely spoken…
SUBTLEX-CH: Chinese Word and Character Frequencies Based on Film Subtitles
Cai, Qing; Brysbaert, Marc
2010-01-01
Background: Word frequency is the most important variable in language research. However, despite the growing interest in the Chinese language, there are only a few sources of word frequency measures available to researchers, and the quality is less than what researchers in other languages are used to. Methodology: Following recent work by New, Brysbaert, and colleagues in English, French and Dutch, we assembled a database of word and character frequencies based on a corpus of film and television subtitles (46.8 million characters, 33.5 million words). In line with what has been found in the other languages, the new word and character frequencies explain significantly more of the variance in Chinese word naming and lexical decision performance than measures based on written texts. Conclusions: Our results confirm that word frequencies based on subtitles are a good estimate of daily language exposure and capture much of the variance in word processing efficiency. In addition, our database is the first to include information about the contextual diversity of the words and to provide good frequency estimates for multi-character words and the different syntactic roles in which the words are used. The word frequencies are freely available for research purposes. PMID:20532192
Just Google It: An Approach on Word Frequencies Based on Online Search Result.
Moret-Tatay, Carmen; Gamermann, Daniel; Murphy, Michael; Kuzmičová, Anezka
2018-01-01
Word frequency is one of the most robust factors in the literature on word processing, based on the lexical corpus of a language. However, different sources might be used in order to determine the actual frequency of each word. Recent research has determined frequencies based on movie subtitles, Twitter, blog posts, or newspapers. In this paper, we examine a determination of these frequencies based on the World Wide Web. For this purpose, a Python script was developed to obtain frequencies of a word through online search results. These frequencies were employed to estimate lexical decision times in comparison to the traditional frequencies in a lexical decision task. It was found that the Google frequencies predict reaction times comparably to the traditional frequencies. Still, the explained variance was higher for the traditional database.
EHME: a new word database for research in Basque language.
Acha, Joana; Laka, Itziar; Landa, Josu; Salaburu, Pello
2014-11-14
This article presents EHME, the frequency dictionary of Basque structure, an online program that enables researchers in psycholinguistics to extract word and nonword stimuli, based on a broad range of statistics concerning the properties of Basque words. The database consists of 22.7 million tokens, and properties available include morphological structure frequency and word-similarity measures, apart from classical indexes: word frequency, orthographic structure, orthographic similarity, bigram and biphone frequency, and syllable-based measures. Measures are indexed at the lemma, morpheme and word level. We include reliability and validation analysis. The application is freely available, and enables the user to extract words based on concrete statistical criteria, as well as to obtain statistical characteristics from a list of words.
The Effects of Semantic Transparency and Base Frequency on the Recognition of English Complex Words
ERIC Educational Resources Information Center
Xu, Joe; Taft, Marcus
2015-01-01
A visual lexical decision task was used to examine the interaction between base frequency (i.e., the cumulative frequencies of morphologically related forms) and semantic transparency for a list of derived words. Linear mixed effects models revealed that high base frequency facilitates the recognition of the complex word (i.e., a "base…
Reassessing word frequency as a determinant of word recognition for skilled and unskilled readers
Kuperman, Victor; Van Dyke, Julie A.
2013-01-01
The importance of vocabulary in reading comprehension emphasizes the need to accurately assess an individual’s familiarity with words. The present article highlights problems with using occurrence counts in corpora as an index of word familiarity, especially when studying individuals varying in reading experience. We demonstrate via computational simulations and norming studies that corpus-based word frequencies systematically overestimate strengths of word representations, especially in the low-frequency range and in smaller-size vocabularies. Experience-driven differences in word familiarity prove to be faithfully captured by the subjective frequency ratings collected from responders at different experience levels. When matched on those levels, this lexical measure explains more variance than corpus-based frequencies in eye-movement and lexical decision latencies to English words, attested in populations with varied reading experience and skill. Furthermore, the use of subjective frequencies removes the widely reported (corpus) frequency-by-skill interaction, showing that more skilled readers are equally faster in processing any word than the less skilled readers, not disproportionally faster in processing lower-frequency words. This finding challenges the view that the more skilled an individual is in generic mechanisms of word processing the less reliant he/she will be on the actual lexical characteristics of that word. PMID:23339352
Saint-Aubin, Jean; LeBlanc, Jacinthe
2005-12-01
In immediate serial recall, high-frequency words are better recalled than low-frequency words. Recently, it has been suggested that high-frequency words are better recalled because of their better long-term associative links, and not because of the intrinsic properties of their long-term representations. In the experiment reported here, recall performance was compared for pure lists of high- and low-frequency words, and for mixed lists composed of either one low- and five high-frequency words or the reverse. The usual advantage of high-frequency words was found with pure lists and this advantage was reduced, but still significant with mixed lists composed of five low-frequency words. However, the low-frequency word included in a high-frequency list was recalled just as well as high-frequency words. Results are challenging for the associative link hypothesis and are best interpreted within an item-based reconstruction hypothesis, along with a distinctiveness account.
Juhasz, Barbara J; Yap, Melvin J; Raoul, Akila; Kaye, Micaela
2018-04-23
Word frequency is an important predictor of lexical-decision task performance. The current study further examined the role of this variable by exploring the influence of frequency trajectory. Frequency trajectory is measured by how often a word occurs in childhood relative to adulthood. Past research on the role of this variable in word recognition has produced equivocal results. In the current study, words were selected based on their frequencies in Grade 1 (child frequency) and Grade 13 (college frequency). In Experiment 1, four frequency trajectory conditions were factorially examined in a lexical-decision task with English words: high-to-high (world), high-to-low (uncle), low-to-high (brain) and low-to-low (opera). An interaction between Grade 1 and college frequency demonstrated that words in the low-to-high condition were processed significantly faster and more accurately than words in the low-to-low condition, whereas the high-to-high and high-to-low conditions did not differ significantly. In Experiment 2, an advantage for words with an increasing frequency trajectory was also supported in regression analyses on both lexical decision and naming times for 3,039 items selected from the English Lexicon Project (Balota et al., 2007). This was replicated in Experiment 3, based on a regression analysis of 2,680 words from the British Lexicon Project (BLP; Keuleers, Lacey, Rastle, & Brysbaert, 2012). In all analyses, rated age-of-acquisition also significantly impacted word recognition. Together, the results suggest that the age at which a word is initially learned as well as its frequency trajectory across childhood impact performance in the lexical-decision task.
ERIC Educational Resources Information Center
Chen, Qi; Guang-Chun, Ge
2007-01-01
We conducted a lexical study on the word frequency and the text coverage of the 570 word families from Coxhead's Academic Word List (AWL) in medical research articles (RAs) based on a corpus of 50 medical RAs written in English with 190425 running words. By computer analysis, we found that the text coverage of the AWL words accounted for around…
Is There a Neighborhood Frequency Effect in English?: Evidence from Reading and Lexical Decision
ERIC Educational Resources Information Center
Sears, Christopher R.; Campbell, Crystal R.; Lupker, Stephen J.
2006-01-01
What is the effect of a word's higher frequency neighbors on its identification time? According to activation-based models of word identification (J. Grainger & A. M. Jacobs, 1996; J. L. McClelland & D. E. Rumelhart, 1981), words with higher frequency neighbors will be processed more slowly than words without higher frequency neighbors because of…
Development of a Frequency-based Measure of Syntactic Difficulty for Estimating Readability.
ERIC Educational Resources Information Center
Selden, Ramsay
Readability estimates are usually based on measures of word difficulty and measures of sentence difficulty. Word difficulty is measured in two ways: by the structural size and complexity of words or by reference to phenomena of language use, such as word-list frequency or the regularity of spelling patterns. Sentence difficulty is measured only in…
Subtitle-Based Word Frequencies as the Best Estimate of Reading Behavior: The Case of Greek
Dimitropoulou, Maria; Duñabeitia, Jon Andoni; Avilés, Alberto; Corral, José; Carreiras, Manuel
2010-01-01
Previous evidence has shown that word frequencies calculated from corpora based on film and television subtitles can readily account for reading performance, since the language used in subtitles greatly approximates everyday language. The present study examines this issue in a society with increased exposure to subtitle reading. We compiled SUBTLEX-GR, a subtitle-based corpus consisting of more than 27 million Modern Greek words, and tested to what extent subtitle-based frequency estimates and those taken from a written corpus of Modern Greek account for the lexical decision performance of young Greek adults who are exposed to subtitle reading on a daily basis. Results showed that SUBTLEX-GR frequency estimates effectively accounted for participants’ reading performance in two different visual word recognition experiments. More importantly, different analyses showed that frequencies estimated from a subtitle corpus explained the obtained results significantly better than traditional frequencies derived from written corpora. PMID:21833273
Procura-PALavras (P-PAL): A Web-based interface for a new European Portuguese lexical database.
Soares, Ana Paula; Iriarte, Álvaro; de Almeida, José João; Simões, Alberto; Costa, Ana; Machado, João; França, Patrícia; Comesaña, Montserrat; Rauber, Andreia; Rato, Anabela; Perea, Manuel
2018-05-31
In this article, we present Procura-PALavras (P-PAL), a Web-based interface for a new European Portuguese (EP) lexical database. Based on a contemporary printed corpus of over 227 million words, P-PAL provides a broad range of word attributes and statistics, including several measures of word frequency (e.g., raw counts, per-million word frequency, logarithmic Zipf scale), morpho-syntactic information (e.g., parts of speech [PoSs], grammatical gender and number, dominant PoS, and frequency and relative frequency of the dominant PoS), as well as several lexical and sublexical orthographic (e.g., number of letters; consonant-vowel orthographic structure; density and frequency of orthographic neighbors; orthographic Levenshtein distance; orthographic uniqueness point; orthographic syllabification; and trigram, bigram, and letter type and token frequencies), and phonological measures (e.g., pronunciation, number of phonemes, stress, density and frequency of phonological neighbors, transposed and phonographic neighbors, syllabification, and biphone and phone type and token frequencies) for ~53,000 lemmatized and ~208,000 nonlemmatized EP word forms. To obtain these metrics, researchers can choose between two word queries in the application: (i) analyze words previously selected for specific attributes and/or lexical and sublexical characteristics, or (ii) generate word lists that meet word requirements defined by the user in the menu of analyses. For the measures it provides and the flexibility it allows, P-PAL will be a key resource to support research in all cognitive areas that use EP verbal stimuli. P-PAL is freely available at http://p-pal.di.uminho.pt/tools .
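The three frequency measures named in this abstract (raw counts, per-million frequency, Zipf scale) are related by simple arithmetic. The sketch below assumes the commonly used definition of the Zipf scale, log10 of the frequency per million words plus 3; whether P-PAL applies any smoothing or correction is not stated in the abstract, and the function names are hypothetical.

```python
import math


def per_million(raw_count, corpus_size):
    """Convert a raw count into occurrences per million words."""
    return raw_count * 1_000_000 / corpus_size


def zipf(raw_count, corpus_size):
    """Zipf value under the assumed definition: log10(per-million) + 3."""
    if raw_count <= 0:
        raise ValueError("Zipf value is undefined for a zero count")
    return math.log10(per_million(raw_count, corpus_size)) + 3


# Example with a corpus of 227 million words (the size given in the abstract)
# and an invented raw count of 227, i.e. one occurrence per million words.
example = zipf(227, 227_000_000)
```

On this scale a word occurring once per million words has a Zipf value of 3, and each additional unit corresponds to a tenfold increase in frequency.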
Factors That Influence the Difficulty of Science Words
ERIC Educational Resources Information Center
Cervetti, Gina N.; Hiebert, Elfrieda H.; Pearson, P. David; McClung, Nicola A.
2015-01-01
This study examines, within the domain of science, the characteristics of words that predict word knowledge and word learning. The authors identified a set of word characteristics--length, part of speech, polysemy, frequency, morphological frequency, domain specificity, and concreteness--that, based on earlier research, were prime candidates to…
Derivational Morphology and Base Morpheme Frequency
ERIC Educational Resources Information Center
Ford, M. A.; Davis, M. H.; Marslen-Wilson, W. D.
2010-01-01
Morpheme frequency effects for derived words (e.g. an influence of the frequency of the base "dark" on responses to "darkness") have been interpreted as evidence of morphemic representation. However, it has been suggested that most derived words would not show these effects if family size (a type frequency count claimed to reflect semantic…
Automatic Text Analysis Based on Transition Phenomena of Word Occurrences
ERIC Educational Resources Information Center
Pao, Miranda Lee
1978-01-01
Describes a method of selecting index terms directly from a word frequency list, an idea originally suggested by Goffman. Results of the analysis of word frequencies of two articles seem to indicate that the automated selection of index terms from a frequency list holds some promise for automatic indexing. (Author/MBR)
Differential lexical and semantic spreading activation in Alzheimer's disease.
Foster, Paul S; Drago, Valeria; Yung, Raegan C; Pearson, Jaclyn; Stringer, Kristi; Giovannetti, Tania; Libon, David; Heilman, Kenneth M
2013-08-01
Alzheimer's disease (AD) is known to be associated with disruption in semantic networks. Previous studies examining changes in spreading activation in AD have used a lexical decision task paradigm. We have used a paradigm based on average word frequencies obtained from the words generated on the Controlled Oral Word Association Test (COWAT) and the Animal Naming (AN) test. The COWAT and AN tests were administered to a group of 25 patients with AD and 20 control participants. We predicted that the patients with AD would have higher average word frequencies on the COWAT and AN tests than the control participants. The results indicated that the AD group generated words with a higher average word frequency on the AN test but a lower average word frequency on the COWAT. The reasons for the discrepancy in average word frequencies on the AN test and COWAT are discussed.
Skipping of Chinese characters does not rely on word-based processing.
Lin, Nan; Angele, Bernhard; Hua, Huimin; Shen, Wei; Zhou, Junyi; Li, Xingshan
2018-02-01
Previous eye-movement studies have indicated that people tend to skip extremely high-frequency words in sentence reading, such as "the" in English and "的/de" in Chinese. Two alternative hypotheses have been proposed to explain how this frequent skipping happens in Chinese reading: one assumes that skipping happens when the preview has been fully identified at the word level (word-based skipping); the other assumes that skipping happens whenever the preview character is easy to identify regardless of whether lexical processing has been completed or not (character-based skipping). Using the gaze-contingent display change paradigm, we examined the two hypotheses by substituting the preview of the third character of a four-character Chinese word with the high-frequency Chinese character "的/de", which should disrupt the ongoing word-level processing. The character-based skipping hypothesis predicts that this manipulation will enhance the skipping probability of the target character (i.e., the third character of the target word), because the character "的/de" has much higher character frequency than the original character. The word-based skipping hypothesis instead predicts a reduction of the skipping probability of the target character because the presence of the character "的/de" is lexically infelicitous at word level. The results supported the character-based skipping hypothesis, indicating that in Chinese reading the decision of skipping a character can be made before integrating it into a word.
Vitevitch, Michael S.
2008-01-01
A comparison of the lexical characteristics of 88 auditory misperceptions (i.e., slips of the ear) showed no difference in word-frequency, neighborhood density, and neighborhood frequency between the actual and the perceived utterances. Another comparison of slip of the ear tokens (i.e., actual and perceived utterances) and words in general (i.e., randomly selected from the lexicon) showed that slip of the ear tokens had denser neighborhoods and higher neighborhood frequency than words in general, as predicted from laboratory studies. Contrary to prediction, slip of the ear tokens were higher in frequency of occurrence than words in general. Additional laboratory-based investigations examined the possible source of the contradictory word frequency finding, highlighting the importance of using naturalistic and experimental data to develop models of spoken language processing. PMID:12866911
ERIC Educational Resources Information Center
Herdagdelen, Amaç; Marelli, Marco
2017-01-01
Corpus-based word frequencies are one of the most important predictors in language processing tasks. Frequencies based on conversational corpora (such as movie subtitles) are shown to better capture the variance in lexical decision tasks compared to traditional corpora. In this study, we show that frequencies computed from social media are…
Effects of Word and Morpheme Familiarity on Reading of Derived Words
ERIC Educational Resources Information Center
Carlisle, Joanne F.; Katz, Lauren A.
2006-01-01
The purpose of this study is to examine factors that influence students' reading of derived words. Recent research suggests that the lexical quality of a derived word depends on the familiarity of the word, its morphemic constituents (i.e., base word and affixes), and the frequency with which the base word appears in other words (i.e., members of…
A fresh look at the predictors of naming accuracy and errors in Alzheimer's disease.
Cuetos, Fernando; Rodríguez-Ferreiro, Javier; Sage, Karen; Ellis, Andrew W
2012-09-01
In recent years, a considerable number of studies have tried to establish which characteristics of objects and their names predict the responses of patients with Alzheimer's disease (AD) in the picture-naming task. The frequency of use of words and their age of acquisition (AoA) have been implicated as two of the most influential variables, with naming being best preserved for objects with high-frequency, early-acquired names. The present study takes a fresh look at the predictors of naming success in Spanish and English AD patients using a range of measures of word frequency and AoA along with visual complexity, imageability, and word length as predictors. Analyses using generalized linear mixed modelling found that naming accuracy was better predicted by AoA ratings taken from older adults than conventional ratings from young adults. Older frequency measures based on written language samples predicted accuracy better than more modern measures based on the frequencies of words in film subtitles. Replacing adult frequency with an estimate of cumulative (lifespan) frequency did not reduce the impact of AoA. Semantic error rates were predicted by both written word frequency and senior AoA while null response errors were only predicted by frequency. Visual complexity, imageability, and word length did not predict naming accuracy or errors. ©2012 The British Psychological Society.
ERIC Educational Resources Information Center
Baayen, R. Harald; Hendrix, Peter; Ramscar, Michael
2013-01-01
Arnon and Snider ((2010). More than words: Frequency effects for multi-word phrases. "Journal of Memory and Language," 62, 67-82) documented frequency effects for compositional four-grams independently of the frequencies of lower-order "n"-grams. They argue that comprehenders apparently store frequency information about…
Lexical Influences on Spoken Spondaic Word Recognition in Hearing-Impaired Patients
Moulin, Annie; Richard, Céline
2015-01-01
Top-down contextual influences play a major part in speech understanding, especially in hearing-impaired patients with deteriorated auditory input. Those influences are most obvious in difficult listening situations, such as listening to sentences in noise but can also be observed at the word level under more favorable conditions, as in one of the most commonly used tasks in audiology, i.e., repeating isolated words in silence. This study aimed to explore the role of top-down contextual influences and their dependence on lexical factors and patient-specific factors using standard clinical linguistic material. Spondaic word perception was tested in 160 hearing-impaired patients aged 23–88 years with a four-frequency average pure-tone threshold ranging from 21 to 88 dB HL. Sixty spondaic words were randomly presented at a level adjusted to correspond to a speech perception score ranging between 40 and 70% of the performance intensity function obtained using monosyllabic words. Phoneme and whole-word recognition scores were used to calculate two context-influence indices (the j factor and the ratio of word scores to phonemic scores) and were correlated with linguistic factors, such as the phonological neighborhood density and several indices of word occurrence frequencies. Contextual influence was greater for spondaic words than in similar studies using monosyllabic words, with an overall j factor of 2.07 (SD = 0.5). For both indices, context use decreased with increasing hearing loss once the average hearing loss exceeded 55 dB HL. In right-handed patients, significantly greater context influence was observed for words presented in the right ears than for words presented in the left, especially in patients with many years of education. 
The correlations between raw word scores (and context influence indices) and word occurrence frequencies showed a significant age-dependent effect, with a stronger correlation between perception scores and word occurrence frequencies when the occurrence frequencies were based on the years corresponding to the patients' youth, showing a “historic” word frequency effect. This effect was still observed for patients with few years of formal education, but recent occurrence frequencies based on current word exposure had a stronger influence for those patients, especially for younger ones. PMID:26778945
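The two context-influence indices mentioned in this abstract can be written down in a few lines. The sketch assumes the standard definition of the j factor, in which the probability of recognizing a whole word equals the probability of recognizing a phoneme raised to the power j; the abstract itself does not give the formula, and the function names and scores below are illustrative.

```python
import math


def j_factor(word_score, phoneme_score):
    """j such that word_score == phoneme_score ** j (assumed definition).

    Both scores are proportions strictly between 0 and 1.
    """
    if not (0 < word_score < 1 and 0 < phoneme_score < 1):
        raise ValueError("scores must be proportions strictly between 0 and 1")
    return math.log(word_score) / math.log(phoneme_score)


def word_to_phoneme_ratio(word_score, phoneme_score):
    """Second index: ratio of the word score to the phoneme score."""
    return word_score / phoneme_score


# Invented scores: 80% of phonemes and 64% of whole words repeated correctly.
j = j_factor(0.64, 0.80)
ratio = word_to_phoneme_ratio(0.64, 0.80)
```

Under this definition, a lower j means fewer independent units are needed to recognize the word, which is read as stronger use of context.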
Angelelli, Paola; Marinelli, Chiara Valeria; De Salvatore, Marinella; Burani, Cristina
2017-11-01
Italian sixth graders, with and without dyslexia, read pseudowords and low-frequency words that include high-frequency morphemes better than stimuli not including any morpheme. The present study assessed whether morphemes affect (1) younger children, with and without dyslexia; (2) spelling as well as reading; and (3) words with low-frequency morphemes. Two groups of third graders (16 children with dyslexia and dysorthography and 16 age-matched typically developing children) read aloud and spelt to dictation pseudowords and words. Pseudowords included (1) root + suffix in non-existing combinations (e.g. lampadista, formed by lampad-, 'lamp', and -ista, '-ist') and (2) orthographic sequences not corresponding to any Italian root or suffix (e.g. livonosto). Words had low frequency and included: (1) root + suffix, both of high frequency (e.g. bestiale, 'beastly'); (2) root + suffix, both of low frequency (e.g. asprigno, 'rather sour'); and (3) simple words (e.g. insulso, 'vapid'). Children with dyslexia and dysorthography were less accurate than typically developing children. Root + suffix pseudowords were read and spelt more accurately than non-morphological pseudowords by both groups. Morphologically complex (root + suffix) words were read and spelt better than simple words. However, task interacted with morphology: reading was not facilitated by low-frequency morphemes. We conclude that children acquiring a transparent orthography exploit morpheme-based reading and spelling to face difficulties in processing long unfamiliar stimuli. Copyright © 2017 John Wiley & Sons, Ltd.
Examination of the neighborhood activation theory in normal and hearing-impaired listeners.
Dirks, D D; Takayanagi, S; Moshfegh, A; Noffsinger, P D; Fausti, S A
2001-02-01
Experiments were conducted to examine the effects of lexical information on word recognition among normal hearing listeners and individuals with sensorineural hearing loss. The lexical factors of interest were incorporated in the Neighborhood Activation Model (NAM). Central to this model is the concept that words are recognized relationally in the context of other phonemically similar words. NAM suggests that words in the mental lexicon are organized into similarity neighborhoods and the listener is required to select the target word from competing lexical items. Two structural characteristics of similarity neighborhoods that influence word recognition have been identified: "neighborhood density" or the number of phonemically similar words (neighbors) for a particular target item and "neighborhood frequency" or the average frequency of occurrence of all the items within a neighborhood. A third lexical factor, "word frequency" or the frequency of occurrence of a target word in the language, is assumed to optimize the word recognition process by biasing the system toward choosing a high frequency over a low frequency word. Three experiments were performed. In the initial experiments, word recognition for consonant-vowel-consonant (CVC) monosyllables was assessed in young normal hearing listeners by systematically partitioning the items into the eight possible lexical conditions that could be created by two levels of the three lexical factors, word frequency (high and low), neighborhood density (high and low), and average neighborhood frequency (high and low). Neighborhood structure and word frequency were estimated computationally using a large on-line lexicon based on Webster's Pocket Dictionary. From this program 400 highly familiar monosyllables were selected and partitioned into eight orthogonal lexical groups (50 words/group). 
The 400 words were presented randomly to normal hearing listeners in speech-shaped noise (Experiment 1) and "in quiet" (Experiment 2) as well as to an elderly group of listeners with sensorineural hearing loss in the speech-shaped noise (Experiment 3). The results of three experiments verified predictions of NAM in both normal hearing and hearing-impaired listeners. In each experiment, words from low density neighborhoods were recognized more accurately than those from high density neighborhoods. The presence of high frequency neighbors (average neighborhood frequency) produced poorer recognition performance than comparable conditions with low frequency neighbors. Word frequency was found to have a highly significant effect on word recognition. Lexical conditions with high word frequencies produced higher performance scores than conditions with low frequency words. The results supported the basic tenets of NAM theory and identified both neighborhood structural properties and word frequency as significant lexical factors affecting word recognition when listening in noise and "in quiet." The results of the third experiment permit extension of NAM theory to individuals with sensorineural hearing loss. Future development of speech recognition tests should allow for the effects of higher level cognitive (lexical) factors on lower level phonemic processing.
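The two neighborhood measures defined in this abstract can be computed as in the following sketch. It assumes the usual one-phoneme rule for neighbors (a single substitution, addition, or deletion), which the abstract does not spell out, and it uses a tiny invented lexicon with placeholder transcriptions and frequencies rather than the dictionary-based lexicon used in the study.

```python
def is_neighbor(a, b):
    """True if phoneme tuples a and b differ by exactly one
    substitution, addition, or deletion (assumed neighbor rule)."""
    if a == b:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    # Deleting one phoneme from the longer form must yield the shorter one.
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def neighborhood_stats(word, lexicon):
    """Return (neighborhood density, mean frequency of the neighbors)."""
    target, _ = lexicon[word]
    freqs = [freq for other, (phones, freq) in lexicon.items()
             if other != word and is_neighbor(target, phones)]
    density = len(freqs)
    mean_freq = sum(freqs) / density if density else 0.0
    return density, mean_freq


# Invented toy lexicon: word -> (phoneme tuple, frequency count).
lexicon = {
    "cat": (("k", "ae", "t"), 50),
    "bat": (("b", "ae", "t"), 20),
    "cut": (("k", "ah", "t"), 30),
    "at": (("ae", "t"), 100),
    "cast": (("k", "ae", "s", "t"), 10),
    "dog": (("d", "ao", "g"), 60),
}
```

In this toy lexicon "cat" sits in a dense neighborhood while "dog" has no neighbors, which corresponds to the high-density and low-density conditions contrasted in the experiments.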
Burt, Jennifer S
2016-02-01
University students made lexical decisions to eight- or nine-letter words preceded by masked primes that were the target, an unrelated word, or a typical misspelling of the target. At a stimulus onset asynchrony (SOA) of 47 ms, primes that were misspellings of the target produced a priming benefit for low-, medium-, and high-frequency words, even when the misspelled primes were changed to differ phonologically from their targets. At a longer SOA of 80 ms, misspelled primes facilitated lexical decisions only to medium- and low-frequency targets, and a phonological change attenuated the benefit for medium-frequency targets. The results indicate that orthographic similarity can be preserved over changes in letter position and word length, and that the priming effect of misspelled words at the shorter SOA is orthographically based. Orthographic-priming effects depend on the quality of the orthographic learning of the target word.
Automatic generation of stop word lists for information retrieval and analysis
Rose, Stuart J
2013-01-08
Methods and systems for automatically generating lists of stop words for information retrieval and analysis. Generation of the stop words can include providing a corpus of documents and a plurality of keywords. From the corpus of documents, a term list of all terms is constructed and both a keyword adjacency frequency and a keyword frequency are determined. If a ratio of the keyword adjacency frequency to the keyword frequency for a particular term on the term list is less than a predetermined value, then that term is excluded from the term list. The resulting term list is truncated based on predetermined criteria to form a stop word list.
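The ratio test described in this abstract might be sketched as follows. The abstract does not define its two frequencies, so this sketch assumes that "keyword adjacency frequency" counts how often a term occurs directly next to a keyword and that "keyword frequency" counts how often it occurs inside one. The function name, the threshold `min_ratio`, the handling of zero counts, and the truncation rule (most frequent terms first, capped at `max_size`) are all hypothetical choices for illustration.

```python
from collections import Counter


def generate_stop_words(documents, keywords, min_ratio=1.0, max_size=10):
    """Sketch of ratio-based stop word selection (assumed definitions)."""
    keyword_tokens = [kw.lower().split() for kw in keywords]
    adjacency = Counter()   # times a term sits directly next to a keyword
    in_keyword = Counter()  # times a term occurs inside a keyword
    term_freq = Counter()   # overall term counts, used only for truncation

    for doc in documents:
        tokens = doc.lower().split()
        term_freq.update(tokens)
        for kw in keyword_tokens:
            n = len(kw)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] != kw:
                    continue
                in_keyword.update(kw)
                if i > 0:
                    adjacency[tokens[i - 1]] += 1
                if i + n < len(tokens):
                    adjacency[tokens[i + n]] += 1

    kept = []
    for term in term_freq:
        adj, kwf = adjacency[term], in_keyword[term]
        if kwf > 0:
            ratio = adj / kwf
        else:
            # Assumption: adjacent-only terms pass; unseen terms fail.
            ratio = float("inf") if adj > 0 else 0.0
        if ratio >= min_ratio:
            kept.append(term)

    # Assumed truncation criterion: most frequent terms first, then alphabetical.
    kept.sort(key=lambda t: (-term_freq[t], t))
    return kept[:max_size]


docs = [
    "the analysis of linear systems is hard",
    "a survey of linear systems and of compact sets",
]
stop_words = generate_stop_words(docs, ["linear systems", "compact sets"])
print(stop_words)
```

With these two toy documents, only the terms that appear next to keywords but never inside them ("of", "and", "is") survive the ratio test.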
Visual attention based bag-of-words model for image classification
NASA Astrophysics Data System (ADS)
Wang, Qiwei; Wan, Shouhong; Yue, Lihua; Wang, Che
2014-04-01
Bag-of-words is a classical method for image classification. The core problem is how to count the frequency of the visual words and which visual words to select. In this paper, we propose a visual attention based bag-of-words model (VABOW model) for the image classification task. The VABOW model uses a visual attention method to generate a saliency map, and uses the saliency map as a weighting matrix to guide the counting of visual word frequencies. In addition, the VABOW model combines shape, color and texture cues and uses L1-regularized logistic regression to select the most relevant and most efficient features. We compare our approach with a traditional bag-of-words based method on two datasets, and the results show that our VABOW model outperforms the state-of-the-art method for image classification.
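The idea of letting a saliency map weight the visual word counts can be shown with a short sketch. It assumes each local feature has already been assigned a visual word index and that a saliency value has been read off the map at that feature's location. The function name and the normalization step are illustrative assumptions, not the authors' implementation.

```python
def weighted_bow_histogram(word_ids, saliency, vocab_size):
    """Saliency-weighted bag-of-words histogram (assumed formulation).

    word_ids: visual word index for each local feature.
    saliency: saliency weight at each feature's location.
    """
    hist = [0.0] * vocab_size
    for word, weight in zip(word_ids, saliency):
        hist[word] += weight  # salient features contribute more
    total = sum(hist)
    # Normalize so histograms from different images are comparable.
    return [h / total for h in hist] if total > 0 else hist


histogram = weighted_bow_histogram(
    word_ids=[0, 2, 2, 1],
    saliency=[0.5, 1.0, 0.5, 0.0],
    vocab_size=3,
)
print(histogram)
```

A feature in a region with zero saliency (the last one here) contributes nothing, whereas an unweighted histogram would count all four features equally.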
Serial recall, word frequency, and mixed lists: the influence of item arrangement.
Miller, Leonie M; Roodenrys, Steven
2012-11-01
Studies of the effect of word frequency in the serial recall task show that lists of high-frequency words are better recalled than lists of low-frequency words; however, when high- and low-frequency words are alternated within a list, there is no difference in the level of recall for the two types of words, and recall is intermediate between lists of pure frequency. This pattern has been argued to arise from the development of a network of activated long-term representations of list items that support the redintegration of all list items in a nondirectional and nonspecific way. More recently, it has been proposed that the frequency effect might be a product of the coarticulation of items at word boundaries and their influence on rehearsal rather than a consequence of memory representations. The current work examines recall performance in mixed lists of an equal number of high- and low-frequency items arranged in contiguous segments (i.e., HHHLLL and LLLHHH), under quiet and articulatory suppression conditions, to test whether the effect is (a) nondirectional and (b) dependent on articulatory processes. These experiments demonstrate that neither explanation is satisfactory, although the results suggest that the effect is mnemonic. A language-based approach to short-term memory is favored with emphasis on the role of speech production processes at output.
Parametric Effects of Word Frequency in Memory for Mixed Frequency Lists
ERIC Educational Resources Information Center
Lohnas, Lynn J.; Kahana, Michael J.
2013-01-01
The "word frequency paradox" refers to the finding that low frequency words are better recognized than high frequency words yet high frequency words are better recalled than low frequency words. Rather than comparing separate groups of low and high frequency words, we sought to quantify the functional relation between word frequency and…
An Evaluation Method of Words Tendency Depending on Time-Series Variation and Its Improvements.
ERIC Educational Resources Information Center
Atlam, El-Sayed; Okada, Makoto; Shishibori, Masami; Aoe, Jun-ichi
2002-01-01
Discussion of word frequency and keywords in text focuses on a method to estimate automatically the stability classes that indicate a word's popularity with time-series variations based on the frequency change in past electronic text data. Compares the evaluation of decision tree stability class results with manual classification results.…
A diffusion decision model analysis of evidence variability in the lexical decision task.
Tillman, Gabriel; Osth, Adam F; van Ravenzwaaij, Don; Heathcote, Andrew
2017-12-01
The lexical-decision task is among the most commonly used paradigms in psycholinguistics. In both the signal-detection theory and Diffusion Decision Model (DDM; Ratcliff, Gomez, & McKoon, Psychological Review, 111, 159-182, 2004) frameworks, lexical-decisions are based on a continuous source of word-likeness evidence for both words and non-words. The Retrieving Effectively from Memory model of Lexical-Decision (REM-LD; Wagenmakers et al., Cognitive Psychology, 48(3), 332-367, 2004) provides a comprehensive explanation of lexical-decision data and makes the prediction that word-likeness evidence is more variable for words than non-words and that higher frequency words are more variable than lower frequency words. To test these predictions, we analyzed five lexical-decision data sets with the DDM. For all data sets, drift-rate variability changed across word frequency and non-word conditions. For the most part, REM-LD's predictions about the ordering of evidence variability across stimuli in the lexical-decision task were confirmed.
The role of character positional frequency on Chinese word learning during natural reading.
Liang, Feifei; Blythe, Hazel I; Bai, Xuejun; Yan, Guoli; Li, Xin; Zang, Chuanli; Liversedge, Simon P
2017-01-01
Readers' eye movements were recorded to examine the role of character positional frequency on Chinese lexical acquisition during reading and its possible modulation by word spacing. In Experiment 1, three types of pseudowords were constructed based on each character's positional frequency, providing congruent, incongruent, and no positional word segmentation information. Each pseudoword was embedded into two sets of sentences, for the learning and the test phases. In the learning phase, half the participants read sentences in word-spaced format, and half in unspaced format. In the test phase, all participants read sentences in unspaced format. The results showed an inhibitory effect of character positional frequency upon the efficiency of word learning when processing incongruent pseudowords in both the learning and test phases, and also showed a facilitatory effect of word spacing in the learning phase, but not at test. Most importantly, these two characteristics exerted independent influences on word segmentation. In Experiment 2, three analogous types of pseudowords were created whilst controlling for orthographic neighborhood size. The results of the two experiments were consistent, except that the effect of character positional frequency was absent in the test phase in Experiment 2. We argue that the positional frequency of a word's constituent characters may influence the character-to-word assignment in a process that likely incorporates both lexical segmentation and identification.
Herdağdelen, Amaç; Marelli, Marco
2017-05-01
Corpus-based word frequencies are one of the most important predictors in language processing tasks. Frequencies based on conversational corpora (such as movie subtitles) are shown to better capture the variance in lexical decision tasks compared to traditional corpora. In this study, we show that frequencies computed from social media are currently the best frequency-based estimators of lexical decision reaction times (up to 3.6% increase in explained variance). The results are robust (observed for Twitter- and Facebook-based frequencies on American English and British English datasets) and are still substantial when we control for corpus size. © 2016 The Authors. Cognitive Science published by Wiley Periodicals, Inc. on behalf of Cognitive Science Society.
A Bootstrapping Model of Frequency and Context Effects in Word Learning.
Kachergis, George; Yu, Chen; Shiffrin, Richard M
2017-04-01
Prior research has shown that people can learn many nouns (i.e., word-object mappings) from a short series of ambiguous situations containing multiple words and objects. For successful cross-situational learning, people must approximately track which words and referents co-occur most frequently. This study investigates the effects of allowing some word-referent pairs to appear more frequently than others, as is true in real-world learning environments. Surprisingly, high-frequency pairs are not always learned better, but can also boost learning of other pairs. Using a recent associative model (Kachergis, Yu, & Shiffrin, 2012), we explain how mixing pairs of different frequencies can bootstrap late learning of the low-frequency pairs based on early learning of higher frequency pairs. We also manipulate contextual diversity, the number of pairs a given pair appears with across training, since it is naturalistically confounded with frequency. The associative model has competing familiarity and uncertainty biases, and their interaction is able to capture the individual and combined effects of frequency and contextual diversity on human learning. Two other recent word-learning models do not account for the behavioral findings. Copyright © 2016 Cognitive Science Society, Inc.
Perea, Manuel; Urkia, Miriam; Davis, Colin J; Agirre, Ainhoa; Laseka, Edurne; Carreiras, Manuel
2006-11-01
We describe a Windows program that enables users to obtain a broad range of statistics concerning the properties of word and nonword stimuli in an agglutinative language (Basque), including measures of word frequency (at the whole-word and lemma levels), bigram and biphone frequency, orthographic similarity, orthographic and phonological structure, and syllable-based measures. It is designed for use by researchers in psycholinguistics, particularly those concerned with recognition of isolated words and morphology. In addition to providing standard orthographic and phonological neighborhood measures, the program can be used to obtain information about other forms of orthographic similarity, such as transposed-letter similarity and embedded-word similarity. It is available free of charge from www.uv.es/mperea/E-Hitz.zip.
ERIC Educational Resources Information Center
Hansen, Pernille
2017-01-01
This article analyses how a set of psycholinguistic factors may account for children's lexical development. Age of acquisition is compared to a measure of lexical development based on vocabulary size rather than age, and robust regression models are used to assess the individual and joint effects of word class, frequency, imageability and…
ERIC Educational Resources Information Center
Deacon, S. Helene; Whalen, Rachel; Kirby, John R.
2011-01-01
We examined whether Grade 4, 6, and 8 children access the base form when reading morphologically complex words. We asked children to read words varying systematically in the frequency of the surface and base forms and in the transparency of the base form. At all grade levels, children were faster at reading derived words with high rather than low…
Effects of aging and text-stimulus quality on the word-frequency effect during Chinese reading.
Wang, Jingxin; Li, Lin; Li, Sha; Xie, Fang; Liversedge, Simon P; Paterson, Kevin B
2018-06-01
Age-related reading difficulty is well established for alphabetic languages. Compared to young adults (18-30 years), older adults (65+ years) read more slowly, make more and longer fixations, make more regressions, and produce larger word-frequency effects. However, whether similar effects are observed for nonalphabetic languages like Chinese remains to be determined. In particular, recent research has suggested Chinese readers experience age-related reading difficulty but do not produce age differences in the word-frequency effect. This might represent an important qualitative difference in aging effects, so we investigated this further by presenting young and older adult Chinese readers with sentences that included high- or low-frequency target words. Additionally, to test theories that suggest reductions in text-stimulus quality differentially affect lexical processing by adult age groups, we presented either the target words (Experiment 1) or all characters in sentences (Experiment 2) normally or with stimulus quality reduced. Analyses based on mean eye-movement parameters and distributional analyses of fixation times for target words showed typical age-related reading difficulty. We also observed age differences in the word-frequency effect, predominantly in the tails of fixation-time distributions, consistent with an aging effect on the processing of high- and low-frequency words. Reducing stimulus quality disrupted eye movements more for the older readers, but the influence of stimulus quality on the word-frequency effect did not differ across age groups. This suggests Chinese older readers' lexical processing is resilient to reductions in stimulus quality, perhaps due to greater experience recognizing words from impoverished visual input. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Sequence comparison alignment-free approach based on suffix tree and L-words frequency.
Soares, Inês; Goios, Ana; Amorim, António
2012-01-01
The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts by computing a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words with a preset length L (L-words) in each sequence is rapidly calculated. Based on the L-word frequency profile of each sequence, a pairwise standard Euclidean distance is then computed, producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. We present an improvement to word counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures with the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in Python and is freely available on the web.
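The profile-and-distance step can be sketched in a few lines. Note that this sketch counts L-words with a plain sliding window instead of the generalized suffix tree the abstract describes, and it uses relative frequencies over a DNA alphabet. The function names, the alphabet default, and the choice of relative (not absolute) frequencies are assumptions for illustration and are not taken from the authors' implementation.

```python
from collections import Counter
from itertools import product
from math import sqrt


def lword_profile(seq, L, alphabet="ACGT"):
    """Relative frequency of every possible L-word, in a fixed order.

    Uses a sliding window, not a suffix tree (simplifying assumption).
    """
    counts = Counter(seq[i:i + L] for i in range(len(seq) - L + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than L")
    words = ("".join(p) for p in product(alphabet, repeat=L))
    return [counts[w] / total for w in words]


def distance_matrix(seqs, L):
    """Symmetric matrix of pairwise Euclidean distances between profiles."""
    profiles = [lword_profile(s, L) for s in seqs]
    return [
        [sqrt(sum((x - y) ** 2 for x, y in zip(p, q))) for q in profiles]
        for p in profiles
    ]


matrix = distance_matrix(["AAAA", "AAAT", "TTTT"], L=2)
for row in matrix:
    print([round(d, 3) for d in row])
```

The resulting matrix could then be passed to a neighbor joining or multidimensional scaling routine, which is outside the scope of this sketch.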
A preliminary study of subjective frequency estimates of words spoken in Cantonese.
Yip, M C
2001-06-01
A database is presented of the subjective frequency estimates for a set of 30 Chinese homophones. The estimates are based on analysis of responses from a simple listening task by 120 university students. In the listening task, they were asked to report the first meaning thought of upon hearing a Chinese homophone by writing down the corresponding Chinese characters. There was a correlation of .66 between the frequency of spoken and written words, suggesting distributional information about the lexical representations is generally independent of modality. These subjective frequency counts should be useful in the construction of material sets for research on word recognition using spoken Chinese (Cantonese).
Bryden, John; Wright, Shaun P; Jansen, Vincent A A
2018-02-01
Language transmission, the passing on of language features such as words between people, is the process of inheritance that underlies linguistic evolution. To understand how language transmission works, we need a mechanistic understanding based on empirical evidence of lasting change of language usage. Here, we analysed 200 million online conversations to investigate transmission between individuals. We find that the frequency of word usage is inherited over conversations, rather than only the binary presence or absence of a word in a person's lexicon. We propose a mechanism for transmission whereby for each word someone encounters there is a chance they will use it more often. Using this mechanism, we measure that, for one word in around every hundred a person encounters, they will use that word more frequently. As more commonly used words are encountered more often, this means that it is the frequencies of words which are copied. Beyond this, our measurements indicate that this per-encounter mechanism is neutral and applies without any further distinction as to whether a word encountered in a conversation is commonly used or not. An important consequence of this is that frequencies of many words can be used in concert to observe and measure language transmission, and our results confirm this. These results indicate that our mechanism for transmission can be used to study language patterns and evolution within populations. © 2018 The Author(s).
The low-frequency encoding disadvantage: Word frequency affects processing demands.
Diana, Rachel A; Reder, Lynne M
2006-07-01
Low-frequency words produce more hits and fewer false alarms than high-frequency words in a recognition task. The low-frequency hit rate advantage has sometimes been attributed to processes that operate during the recognition test (e.g., L. M. Reder et al., 2000). When tasks other than recognition, such as recall, cued recall, or associative recognition, are used, the effects seem to contradict a low-frequency advantage in memory. Four experiments are presented to support the claim that in addition to the advantage of low-frequency words at retrieval, there is a low-frequency disadvantage during encoding. That is, low-frequency words require more processing resources to be encoded episodically than high-frequency words. Under encoding conditions in which processing resources are limited, low-frequency words show a larger decrement in recognition than high-frequency words. Also, studying items (pictures and words of varying frequencies) along with low-frequency words reduces performance for those stimuli. Copyright 2006 APA, all rights reserved.
Word List for a Spelling Program.
ERIC Educational Resources Information Center
Smith, Carl B.
What logic should educators use in choosing words for students to learn to spell? Common sense provides the answer: students should learn to spell the words they use in writing. What these words are has been a subject of concern since the beginning of this century. Dozens of word frequency lists have been developed over the years, based primarily…
Using Word Clouds to Develop Proactive Learners
ERIC Educational Resources Information Center
Miley, Frances; Read, Andrew
2011-01-01
This article examines student responses to a technique for summarizing electronically available information based on word frequency. Students used this technique to create word clouds, using those word clouds to enhance personal and small group study. This is a qualitative study. Small focus groups were used to obtain student feedback. Feedback…
ERIC Educational Resources Information Center
Heston, Wilma
The three-volume set of materials describes and presents the results to date of a federally-funded project to develop Pashto-English and English-Pashto dictionaries. The goal was to produce a list of 12,000 basic Pashto words for English-speaking users. Words were selected based on frequency in various kinds of oral and written materials, and were…
Subjective age-of-acquisition norms for 600 Turkish words from four age groups.
Göz, İlyas; Tekcan, Ali I; Erciyes, Aslı Aktan
2017-10-01
The main purpose of this study was to report age-based subjective age-of-acquisition (AoA) norms for 600 Turkish words. A total of 115 children, 100 young adults, 115 middle-aged adults, and 127 older adults provided AoA estimates for 600 words on a 7-point scale. The intraclass correlations suggested high reliability, and the AoA estimates were highly correlated across the four age groups. Children gave earlier AoA estimates than the three adult groups; this was true for high-frequency as well as low-frequency words. In addition to the means and standard deviations of the AoA estimates, we report word frequency, concreteness, and imageability ratings, as well as word length measures (numbers of syllables and letters), for the 600 words as supplemental materials. The present ratings represent a potentially useful database for researchers working on lexical processing as well as other aspects of cognitive processing, such as autobiographical memory.
Auditory word recognition: extrinsic and intrinsic effects of word frequency.
Connine, C M; Titone, D; Wang, J
1993-01-01
Two experiments investigated the influence of word frequency in a phoneme identification task. Speech voicing continua were constructed so that one endpoint was a high-frequency word and the other endpoint was a low-frequency word (e.g., best-pest). Experiment 1 demonstrated that ambiguous tokens were labeled such that a high-frequency word was formed (intrinsic frequency effect). Experiment 2 manipulated the frequency composition of the list (extrinsic frequency effect). A high-frequency list bias produced an exaggerated influence of frequency; a low-frequency list bias showed a reverse frequency effect. Reaction time effects were discussed in terms of activation and postaccess decision models of frequency coding. The results support a late use of frequency in auditory word recognition.
NASA Astrophysics Data System (ADS)
Legara, Erika Fille; Monterola, Christopher; Abundo, Cheryl
2011-01-01
We demonstrate an accurate procedure based on linear discriminant analysis that allows automatic authorship classification of opinion column articles. First, we extract the following stylometric features of 157 column articles from four authors: statistics on high frequency words, number of words per sentence, and number of sentences per paragraph. Then, by systematically ranking these features based on an effect size criterion, we show that we can achieve an average classification accuracy of 93% for the test set. In comparison, frequency size based ranking has an average accuracy of 80%. The highest possible average classification accuracy of our data merely relying on chance is ∼31%. By carrying out sensitivity analysis, we show that the effect size criterion is superior to frequency ranking because there exist low frequency words that significantly contribute to successful author discrimination. Consistent results are seen when the procedure is applied in classifying the undisputed Federalist papers of Alexander Hamilton and James Madison. To the best of our knowledge, the work is the first attempt at classifying opinion column articles, which, by virtue of being shorter in length (as compared to novels or short stories), are more prone to over-fitting issues. The near perfect classification for the longer papers supports this claim. Our results provide an important insight on authorship attribution that has been overlooked in previous studies: that ranking discriminant variables based on word frequency counts is not necessarily an optimal procedure.
Anderson, Julie D
2007-02-01
The purpose of this study was to examine (a) the role of neighborhood density (number of words that are phonologically similar to a target word) and frequency variables on the stuttering-like disfluencies of preschool children who stutter, and (b) whether these variables have an effect on the type of stuttering-like disfluency produced. A 500+ word speech sample was obtained from each participant (N = 15). Each stuttered word was randomly paired with the firstly produced word that closely matched it in grammatical class, familiarity, and number of syllables/phonemes. Frequency, neighborhood density, and neighborhood frequency values were obtained for the stuttered and fluent words from an online database. Findings revealed that stuttered words were lower in frequency and neighborhood frequency than fluent words. Words containing part-word repetitions and sound prolongations were also lower in frequency and/or neighborhood frequency than fluent words, but these frequency variables did not have an effect on single-syllable word repetitions. Neighborhood density failed to influence the susceptibility of words to stuttering, as well as the type of stuttering-like disfluency produced. In general, findings suggest that neighborhood and frequency variables not only influence the fluency with which words are produced in speech, but also have an impact on the type of stuttering-like disfluency produced.
Keith, Jeff; Westbury, Chris; Goldman, James
2015-09-01
Corpus-based semantic space models, which primarily rely on lexical co-occurrence statistics, have proven effective in modeling and predicting human behavior in a number of experimental paradigms that explore semantic memory representation. The most widely studied extant models, however, are strongly influenced by orthographic word frequency (e.g., Shaoul & Westbury, Behavior Research Methods, 38, 190-195, 2006). This has the implication that high-frequency closed-class words can potentially bias co-occurrence statistics. Because these closed-class words are purported to carry primarily syntactic, rather than semantic, information, the performance of corpus-based semantic space models may be improved by excluding closed-class words (using stop lists) from co-occurrence statistics, while retaining their syntactic information through other means (e.g., part-of-speech tagging and/or affixes from inflected word forms). Additionally, very little work has been done to explore the effect of employing morphological decomposition on the inflected forms of words in corpora prior to compiling co-occurrence statistics, despite (controversial) evidence that humans perform early morphological decomposition in semantic processing. In this study, we explored the impact of these factors on corpus-based semantic space models. From this study, morphological decomposition appears to significantly improve performance in word-word co-occurrence semantic space models, providing some support for the claim that sublexical information (specifically, word morphology) plays a role in lexical semantic processing. An overall decrease in performance was observed in models employing stop lists (e.g., excluding closed-class words). Furthermore, we found some evidence that weakens the claim that closed-class words supply primarily syntactic information in word-word co-occurrence semantic space models.
Evaluating a Pivot-Based Approach for Bilingual Lexicon Extraction
Kim, Jae-Hoon; Kwon, Hong-Seok; Seo, Hyeong-Won
2015-01-01
A pivot-based approach for bilingual lexicon extraction is based on the similarity of context vectors represented by words in a pivot language like English. In this paper, in order to show the validity and usability of the pivot-based approach, we evaluate the approach with two different methods for estimating context vectors: one estimates them from two parallel corpora based on word association between source words (resp., target words) and pivot words, and the other estimates them from two parallel corpora based on word alignment tools for statistical machine translation. Empirical results on two language pairs (Korean-Spanish and Korean-French) show that the pivot-based approach is very promising for resource-poor languages and support its validity and usability. Furthermore, our method also performs well for words with low frequency. PMID:25983745
ERIC Educational Resources Information Center
SILIAKUS, H.J.
IN PREPARATION FOR THE DEVELOPMENT OF A GENERAL FREQUENCY WORD LIST IN GERMAN DESIGNED TO MEET THE NEEDS OF THE INTERMEDIATE AND ADVANCED LEVELS OF READING IN THE GERMAN CURRICULUM, A COMPUTER-BASED WORD COUNT WAS BEGUN IN AUSTRALIA'S UNIVERSITY OF ADELAIDE. USING MAGNETIC TAPES CONTAINING (1) A TEXT OF OVER 100,000 RUNNING WORDS, (2) 1,000 MOST…
Rapid automatic keyword extraction for information retrieval and analysis
Rose, Stuart J [Richland, WA]; Cowley, Wendy E [Richland, WA]; Crow, Vernon L [Richland, WA]; Cramer, Nicholas O [Richland, WA]
2012-03-06
Methods and systems for rapid automatic keyword extraction for information retrieval and analysis. Embodiments can include parsing words in an individual document by delimiters, stop words, or both in order to identify candidate keywords. Word scores for each word within the candidate keywords are then calculated based on a function of co-occurrence degree, co-occurrence frequency, or both. Based on a function of the word scores for words within the candidate keyword, a keyword score is calculated for each of the candidate keywords. A portion of the candidate keywords are then extracted as keywords based, at least in part, on the candidate keywords having the highest keyword scores.
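The scoring pipeline described in this abstract can be sketched roughly as follows. The abstract allows word scores to be any function of co-occurrence degree, co-occurrence frequency, or both. This sketch assumes the ratio degree/frequency for word scores and a simple sum for keyword scores, which is one possible choice and not necessarily the patented one. The function names, the delimiter set, and the `top_n` cutoff are likewise hypothetical.

```python
import re
from collections import defaultdict


def candidate_keywords(text, stop_words):
    """Split text into candidate keywords at delimiters and stop words."""
    candidates = []
    # Split on punctuation delimiters first, then on stop words.
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        phrase = []
        for word in fragment.split():
            if word in stop_words:
                if phrase:
                    candidates.append(tuple(phrase))
                phrase = []
            else:
                phrase.append(word)
        if phrase:
            candidates.append(tuple(phrase))
    return candidates


def extract_keywords(text, stop_words, top_n=3):
    """Rank candidate keywords by summed word scores (assumed: degree / frequency)."""
    candidates = candidate_keywords(text, stop_words)
    freq = defaultdict(int)    # co-occurrence frequency of each word
    degree = defaultdict(int)  # co-occurrence degree, counting the word itself
    for phrase in candidates:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    scores = {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in set(candidates)
    }
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


sample = "Compatibility of systems of linear constraints. Linear constraints are studied."
top = extract_keywords(sample, stop_words={"of", "are"})
print(top)
```

Multi-word candidates such as "linear constraints" score higher than single words here because each of their words co-occurs with another word, which raises its degree relative to its frequency.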
Friends and Foes in the Lexicon: Homophone Naming in Aphasia
ERIC Educational Resources Information Center
Middleton, Erica L.; Chen, Qi; Verkuilen, Jay
2015-01-01
The study of homophones--words with different meanings that sound the same--has great potential to inform models of language production. Of particular relevance is a phenomenon termed "frequency" inheritance, where a low-frequency word (e.g., "deer") is produced more fluently than would be expected based on its frequency…
The influence of contextual diversity on eye movements in reading.
Plummer, Patrick; Perea, Manuel; Rayner, Keith
2014-01-01
Recent research has shown contextual diversity (i.e., the number of passages in which a given word appears) to be a reliable predictor of word processing difficulty. It has also been demonstrated that word-frequency has little or no effect on word recognition speed when accounting for contextual diversity in isolated word processing tasks. An eye-movement experiment was conducted wherein the effects of word-frequency and contextual diversity were directly contrasted in a normal sentence reading scenario. Subjects read sentences with embedded target words that varied in word-frequency and contextual diversity. All 1st-pass and later reading times were significantly longer for words with lower contextual diversity compared to words with higher contextual diversity when controlling for word-frequency and other important lexical properties. Furthermore, there was no difference in reading times for higher frequency and lower frequency words when controlling for contextual diversity. The results confirm prior findings regarding contextual diversity and word-frequency effects and demonstrate that contextual diversity is a more accurate predictor of word processing speed than word-frequency within a normal reading task. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
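The difference between the two predictors contrasted in this abstract can be made concrete with a small sketch. It computes word frequency as the total number of occurrences and contextual diversity as the number of passages containing the word, following the definition given in the abstract. The whitespace tokenization, the function name, and the toy passages are assumptions for illustration only.

```python
from collections import Counter


def frequency_and_diversity(passages):
    """Return (word frequency, contextual diversity) counters.

    Word frequency counts every occurrence; contextual diversity counts
    each word at most once per passage.
    """
    word_freq = Counter()
    diversity = Counter()
    for passage in passages:
        tokens = passage.lower().split()
        word_freq.update(tokens)
        diversity.update(set(tokens))  # one count per passage
    return word_freq, diversity


# Made-up passages for illustration.
passages = [
    "the cat chased the other cat",
    "the dog ran",
    "the bird sang",
]
word_freq, diversity = frequency_and_diversity(passages)
print(word_freq["cat"], diversity["cat"])
print(word_freq["the"], diversity["the"])
```

Here "cat" occurs twice but in only one passage, so its frequency and its contextual diversity diverge, which is the kind of dissociation the study exploits.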
Anderson, Julie D.
2008-01-01
Purpose: The purpose of this study was to examine (a) the role of neighborhood density (number of words that are phonologically similar to a target word) and frequency variables on the stuttering-like disfluencies of preschool children who stutter, and (b) whether these variables have an effect on the type of stuttering-like disfluency produced. Method: A 500+ word speech sample was obtained from each participant (N = 15). Each stuttered word was randomly paired with the first produced word that closely matched it in grammatical class, familiarity, and number of syllables/phonemes. Frequency, neighborhood density, and neighborhood frequency values were obtained for the stuttered and fluent words from an online database. Results: Findings revealed that stuttered words were lower in frequency and neighborhood frequency than fluent words. Words containing part-word repetitions and sound prolongations were also lower in frequency and/or neighborhood frequency than fluent words, but these frequency variables did not have an effect on single-syllable word repetitions. Neighborhood density failed to influence the susceptibility of words to stuttering, as well as the type of stuttering-like disfluency produced. Conclusions: In general, findings suggest that neighborhood and frequency variables not only influence the fluency with which words are produced in speech, but also have an impact on the type of stuttering-like disfluency produced. PMID:17344561
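As an illustration of the two neighborhood measures named in this abstract, the sketch below counts phonological neighbors and averages their frequencies. It assumes the common one-phoneme definition of a neighbor (a single substitution, deletion, or addition); the abstract does not spell out the definition used by the online database, and the transcriptions in the usage note are hypothetical.

```python
def is_neighbor(a, b):
    """True if phoneme tuples a and b differ by exactly one phoneme
    (one substitution, one deletion, or one addition)."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    # Different length: removing one phoneme from the longer form
    # must yield the shorter form.
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def neighborhood_stats(target, lexicon):
    """Return (neighborhood density, mean neighbor frequency) for target.

    lexicon maps phoneme tuples to frequency counts.
    """
    neighbor_freqs = [f for form, f in lexicon.items() if is_neighbor(target, form)]
    density = len(neighbor_freqs)
    mean_freq = sum(neighbor_freqs) / density if density else 0.0
    return density, mean_freq
```

For example, with a toy lexicon in which "cat" is transcribed as `("k", "ae", "t")`, the forms for "bat", "cut", "at", and "cast" would all count as neighbors, while "dog" would not.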
Yap, Melvin J.; Tse, Chi-Shing; Balota, David A.
2009-01-01
Word frequency and semantic priming effects are among the most robust effects in visual word recognition, and it has been generally assumed that these two variables produce interactive effects in lexical decision performance, with larger priming effects for low-frequency targets. The results from four lexical decision experiments indicate that the joint effects of semantic priming and word frequency are critically dependent upon differences in the vocabulary knowledge of the participants. Specifically, across two universities, additive effects of the two variables were observed in participants with more vocabulary knowledge, while interactive effects were observed in participants with less vocabulary knowledge. These results are discussed with reference to Borowsky and Besner’s (1993) multistage account and Plaut and Booth’s (2000) single-mechanism model. In general, the findings are also consistent with a flexible lexical processing system that optimizes performance based on processing fluency and task demands. PMID:20161653
Responding to nonwords in the lexical decision task: Insights from the English Lexicon Project.
Yap, Melvin J; Sibley, Daragh E; Balota, David A; Ratcliff, Roger; Rueckl, Jay
2015-05-01
Researchers have extensively documented how various statistical properties of words (e.g., word frequency) influence lexical processing. However, the impact of lexical variables on nonword decision-making performance is less clear. This gap is surprising, because a better specification of the mechanisms driving nonword responses may provide valuable insights into early lexical processes. In the present study, item-level and participant-level analyses were conducted on the trial-level lexical decision data for almost 37,000 nonwords in the English Lexicon Project in order to identify the influence of different psycholinguistic variables on nonword lexical decision performance and to explore individual differences in how participants respond to nonwords. Item-level regression analyses reveal that nonword response time was positively correlated with number of letters, number of orthographic neighbors, number of affixes, and base-word number of syllables, and negatively correlated with Levenshtein orthographic distance and base-word frequency. Participant-level analyses also point to within- and between-session stability in nonword responses across distinct sets of items, and intriguingly reveal that higher vocabulary knowledge is associated with less sensitivity to some dimensions (e.g., number of letters) but more sensitivity to others (e.g., base-word frequency). The present findings provide well-specified and interesting new constraints for informing models of word recognition and lexical decision. (c) 2015 APA, all rights reserved.
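One of the predictors named above, Levenshtein orthographic distance, can be sketched as the mean edit distance from a letter string to its closest words in a lexicon. The function names, the default `n=20`, and the toy lexicon in the usage note are assumptions for illustration; the abstract does not give the exact computation used.

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions, and substitutions
    needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def mean_orthographic_distance(item, lexicon, n=20):
    """Mean Levenshtein distance from item to its n closest lexicon words."""
    distances = sorted(levenshtein(item, word) for word in lexicon if word != item)
    closest = distances[:n]
    if not closest:
        raise ValueError("lexicon contains no comparison words")
    return sum(closest) / len(closest)
```

A nonword such as "cet" would sit close to many real words in a lexicon containing "cat", "set", and "bet", giving a small mean distance, that is, a dense orthographic neighborhood.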
Pupillary responses during lexical decisions vary with word frequency but not emotional valence.
Kuchinke, Lars; Võ, Melissa L-H; Hofmann, Markus; Jacobs, Arthur M
2007-08-01
Pupillary responses were examined during a lexical decision task (LDT). Word frequency (high and low frequency words) and emotional valence (positive, neutral and negative words) were varied as experimental factors incidental to the subjects. Both variables significantly affected lexical decision performance and an interaction effect was observed. The behavioral results suggest that manipulating word frequency may partly account for the heterogeneous literature findings regarding emotional valence effects in the LDT. In addition, a difference between high and low frequency words was observed in the pupil data as reflected by higher peak pupil dilations for low frequency words, whereas pupillary responses to emotionally valenced words did not differ. This result was further supported by means of a principal component analysis on the pupil data, in which a late component was shown only to be affected by word frequency. Consistent with previous findings, word frequency was found to affect the resource allocation towards processing of the letter string, while emotionally valenced words tend to facilitate processing.
Serial Recall, Word Frequency, and Mixed Lists: The Influence of Item Arrangement
ERIC Educational Resources Information Center
Miller, Leonie M.; Roodenrys, Steven
2012-01-01
Studies of the effect of word frequency in the serial recall task show that lists of high-frequency words are better recalled than lists of low-frequency words; however, when high- and low-frequency words are alternated within a list, there is no difference in the level of recall for the two types of words, and recall is intermediate between lists…
Text Detection and Translation from Natural Scenes
2001-06-01
is no explicit tags around Chinese words. A module for Chinese word segmentation is included in the system. This segmentor uses a word- frequency ... list to make segmentation decisions. We tested the EBMT based method using randomly selected 50 signs from our database, assuming perfect sign
ERIC Educational Resources Information Center
Vongpumivitch, Viphavee; Huang, Ju-yu; Chang, Yu-Chia
2009-01-01
This study is a corpus-based lexical study that aims to explore the use of words in Coxhead's (2000) Academic Word List (AWL) in journal articles in the field of applied linguistics. A 1.5 million-word corpus called the Applied Linguistics Research Articles Corpus (ALC) was created for this study. The corpus consists of 200 research articles that…
Grammar and Frequency Effects in the Acquisition of Prosodic Words in European Portuguese
ERIC Educational Resources Information Center
Vigario, Marina; Freitas, Maria Joao; Frota, Sonia
2006-01-01
This paper investigates the acquisition of prosodic words in European Portuguese (EP) through analysis of grammatical and statistical properties of the target language and child speech. The analysis of grammatical properties shows that there are solid cues to the prosodic word (PW) in EP, and the presence of early word-based phonology in child…
A Word Count of Modern Arabic Prose.
ERIC Educational Resources Information Center
Landau, Jacob M.
This book presents a word count of Arabic prose based on 60 twentieth-century Egyptian books. The text is divided into an alphabetical list and a word frequency list. This word count is intended as an aid in the: (1) writing of primers and the compilation of graded readers, (2) examination of the vocabulary selection of primers and readers…
The Role of Derivative Suffix Productivity in the Visual Word Recognition of Complex Words
ERIC Educational Resources Information Center
Lázaro, Miguel; Sainz, Javier; Illera, Víctor
2015-01-01
In this article we present two lexical decision experiments that examine the role of base frequency and of derivative suffix productivity in visual recognition of Spanish words. In the first experiment we find that complex words with productive derivative suffixes result in lower response times than those with unproductive derivative suffixes.…
Lexical frequency and voice assimilation in complex words in Dutch
NASA Astrophysics Data System (ADS)
Ernestus, Mirjam; Lahey, Mybeth; Verhees, Femke; Baayen, Harald
2004-05-01
Words with higher token frequencies tend to have more reduced acoustic realizations than lower frequency words (e.g., Hay, 2000; Bybee, 2001; Jurafsky et al., 2001). This study documents frequency effects for regressive voice assimilation (obstruents are voiced before voiced plosives) in Dutch morphologically complex words in the subcorpus of read-aloud novels in the corpus of spoken Dutch (Oostdijk et al., 2002). As expected, the initial obstruent of the cluster tends to be absent more often as lexical frequency increases. More importantly, as frequency increases, the duration of vocal-fold vibration in the cluster decreases, and the duration of the bursts in the cluster increases, after partialing out cluster duration. This suggests that there is less voicing for higher-frequency words. In fact, phonetic transcriptions show regressive voice assimilation for only half of the words and progressive voice assimilation for one third. Interestingly, the progressive voice assimilation observed for higher-frequency complex words renders these complex words more similar to monomorphemic words: Dutch monomorphemic words typically contain voiceless obstruent clusters (Zonneveld, 1983). Such high-frequency complex words may therefore be less easily parsed into their constituent morphemes (cf. Hay, 2000), favoring whole word lexical access (Bertram et al., 2000).
Jared, Debra; O'Donnell, Katrina
2017-02-01
We examined whether highly skilled adult readers activate the meanings of high-frequency words using phonology when reading sentences for meaning. A homophone-error paradigm was used. Sentences were written to fit 1 member of a homophone pair, and then 2 other versions were created in which the homophone was replaced by its mate or a spelling-control word. The error words were all high-frequency words, and the correct homophones were either higher-frequency words or low-frequency words; that is, the homophone errors were either the subordinate or dominant member of the pair. Participants read sentences as their eye movements were tracked. When the high-frequency homophone error words were the subordinate member of the homophone pair, participants had shorter immediate eye-fixation latencies on these words than on matched spelling-control words. In contrast, when the high-frequency homophone error words were the dominant member of the homophone pair, a difference between these words and spelling controls was delayed. These findings provide clear evidence that the meanings of high-frequency words are activated by phonological representations when skilled readers read sentences for meaning. Explanations of the differing patterns of results depending on homophone dominance are discussed.
Badham, Stephen P; Whitney, Cora; Sanghera, Sumeet; Maylor, Elizabeth A
2017-07-01
Many studies show that age deficits in memory are smaller for information supported by pre-experimental experience. Many studies also find dissociations in memory tasks between words that occur with high and low frequencies in language, but the literature is mixed regarding the extent of word frequency effects in normal ageing. We examined whether age deficits in episodic memory could be influenced by manipulations of word frequency. In Experiment 1, young and older adults studied short and long lists of high- and low-frequency words for free recall. The list length effect (the drop in proportion recalled for longer lists) was larger in young compared to older adults and for high- compared to low-frequency words. In Experiment 2, young and older adults completed item and associative recognition memory tests with high- and low-frequency words. Age deficits were greater for associative memory than for item memory, demonstrating an age-related associative deficit. High-frequency words led to better associative memory performance whilst low-frequency words resulted in better item memory performance. In neither experiment was there any evidence for age deficits to be smaller for high- relative to low-frequency words, suggesting that word frequency effects on memory operate independently from effects due to cognitive ageing.
Semantic Feature Distinctiveness and Frequency
ERIC Educational Resources Information Center
Lamb, Katherine M.
2012-01-01
Lexical access is the process in which basic components of meaning in language, the lexical entries (words) are activated. This activation is based on the organization and representational structure of the lexical entries. Semantic features of words, which are the prominent semantic characteristics of a word concept, provide important information…
Production frequency effects in perception of phonological variation
NASA Astrophysics Data System (ADS)
Connine, Cynthia M.; Ranbom, Larissa J.
2004-05-01
Two experiments were conducted that investigated the relationship between phonological variant occurrence frequency (based on a corpus analysis of conversational speech) and auditory word recognition. The variant investigated was an alternation between the presence of [nt] and a nasal flap (e.g., center, cen'er). The corpus analysis showed that 80% of productions are nasal flaps, with wide variability across words (from 0% for "enter" to 100% for "twenty"). In a production goodness rating experiment, listeners rated [nt] productions as better than their nasal flap counterparts. For individual items, a strong positive correlation was found between nasal flap frequency and goodness ratings: words typically produced with nasal flaps were rated as better productions. A lexical decision experiment showed that nasal flap variants were recognized more slowly and less accurately than [nt] versions. The rated quality of the nasal-flapped production was strongly correlated with the results of the lexical decision task: nasal-flapped words considered highly acceptable were recognized more quickly and accurately than words rated as poor nasal flap productions. The results demonstrate a strong relationship between experienced variant frequency and auditory word recognition and suggest that phonological variation is explicitly represented in the mental lexicon.
Meier, Beat; Rey-Mermet, Alodie; Rothen, Nicolas; Graf, Peter
2013-01-01
The goal of this study was to investigate recognition memory performance across the lifespan and to determine how estimates of recollection and familiarity contribute to performance. In each of three experiments, participants from five groups from 14 up to 85 years of age (children, young adults, middle-aged adults, young-old adults, and old-old adults) were presented with high- and low-frequency words in a study phase and were tested immediately afterwards and/or after a one day retention interval. The results showed that word frequency and retention interval affected recognition memory performance as well as estimates of recollection and familiarity. Across the lifespan, the trajectory of recognition memory followed an inverse u-shape function that was neither affected by word frequency nor by retention interval. The trajectory of estimates of recollection also followed an inverse u-shape function, and was especially pronounced for low-frequency words. In contrast, estimates of familiarity did not differ across the lifespan. The results indicate that age differences in recognition memory are mainly due to differences in processes related to recollection while the contribution of familiarity-based processes seems to be age-invariant. PMID:24198796
The Wide and Wild World of Words: Interview with Averil Coxhead
ERIC Educational Resources Information Center
Mah, Adeline Shi Hui; Yeo, Marie
2016-01-01
Averil Coxhead is widely known for developing the Academic Word List, a list of 570 word families associated with great frequency in academic texts. This list has been particularly useful to teachers of English as a Second Language as well as independent learners in tertiary education. She has also developed a Science-based word list (Coxhead and…
Standard-Chinese Lexical Neighborhood Test in normal-hearing young children.
Liu, Chang; Liu, Sha; Zhang, Ning; Yang, Yilin; Kong, Ying; Zhang, Luo
2011-06-01
The purposes of the present study were to establish the Standard-Chinese version of Lexical Neighborhood Test (LNT) and to examine the lexical and age effects on spoken-word recognition in normal-hearing children. Six lists of monosyllabic and six lists of disyllabic words (20 words/list) were selected from the database of daily speech materials for normal-hearing (NH) children of ages 3-5 years. The lists were further divided into "easy" and "hard" halves according to the word frequency and neighborhood density in the database based on the theory of Neighborhood Activation Model (NAM). Ninety-six NH children (age ranged between 4.0 and 7.0 years) were divided into three different age groups of 1-year intervals. Speech-perception tests were conducted using the Standard-Chinese monosyllabic and disyllabic LNT. The inter-list performance was found to be equivalent and inter-rater reliability was high with 92.5-95% consistency. Results of word-recognition scores showed that the lexical effects were all significant. Children scored higher with disyllabic words than with monosyllabic words. "Easy" words scored higher than "hard" words. The word-recognition performance also increased with age in each lexical category. A multiple linear regression analysis showed that neighborhood density, age, and word frequency appeared to have increasingly more contributions to Chinese word recognition. The results of the present study indicated that performances of Chinese word recognition were influenced by word frequency, age, and neighborhood density, with word frequency playing a major role. These results were consistent with those in other languages, supporting the application of NAM in the Chinese language. The development of Standard-Chinese version of LNT and the establishment of a database of children of 4-6 years old can provide a reliable means for spoken-word recognition test in children with hearing impairment. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Altszyler, Edgar; Ribeiro, Sidarta; Sigman, Mariano; Fernández Slezak, Diego
2017-11-01
Computer-based dream content analysis relies on word frequencies within predefined categories in order to identify different elements in text. As a complementary approach, we explored the capabilities and limitations of word-embedding techniques to identify word usage patterns among dream reports. These tools allow us to quantify word associations in text and to identify the meaning of target words. Word embeddings have been extensively studied in large datasets, but only a few studies analyze semantic representations in small corpora. To fill this gap, we compared the capabilities of Skip-gram and Latent Semantic Analysis (LSA) to extract semantic associations from dream reports. LSA showed better performance than Skip-gram in small corpora in two tests. Furthermore, LSA captured relevant word associations in the dream collection, even in cases with low-frequency words or small numbers of dreams. Word associations in dream reports can thus be quantified by LSA, which opens new avenues for dream interpretation and decoding. Copyright © 2017 Elsevier Inc. All rights reserved.
Does "Word Coach" Coach Words?
ERIC Educational Resources Information Center
Cobb, Tom; Horst, Marlise
2011-01-01
This study reports on the design and testing of an integrated suite of vocabulary training games for Nintendo[TM] collectively designated "My Word Coach" (Ubisoft, 2008). The games' design is based on a wide range of learning research, from classic studies on recycling patterns to frequency studies of modern corpora. Its general usage…
ERIC Educational Resources Information Center
Cho, Euna
2017-01-01
The present study examined the effects of multimedia enhancement in video form in addition to textual information on L2 vocabulary instruction for high-level, low-frequency English words among Korean learners of English. Although input-based incidental learning of L2 vocabulary through extensive reading has been conventionally believed to be…
Cluster analysis of word frequency dynamics
NASA Astrophysics Data System (ADS)
Maslennikova, Yu S.; Bochkarev, V. V.; Belashova, I. A.
2015-01-01
This paper describes the analysis and modelling of word usage frequency time series. A previous study put forward the assumption that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.
Petrova, Ana; Gaskell, M. Gareth; Ferrand, Ludovic
2011-01-01
Many studies have repeatedly shown an orthographic consistency effect in the auditory lexical decision task. Words with phonological rimes that could be spelled in multiple ways (i.e., inconsistent words) typically produce longer auditory lexical decision latencies and more errors than do words with rimes that could be spelled in only one way (i.e., consistent words). These results have been extended to different languages and tasks, suggesting that the effect is quite general and robust. Despite this growing body of evidence, some psycholinguists believe that orthographic effects on spoken language are exclusively strategic, post-lexical, or restricted to peculiar (low-frequency) words. In the present study, we manipulated consistency and word-frequency orthogonally in order to explore whether the orthographic consistency effect extends to high-frequency words. Two different tasks were used: lexical decision and rime detection. Both tasks produced reliable consistency effects for both low- and high-frequency words. Furthermore, in Experiment 1 (lexical decision), an interaction revealed a stronger consistency effect for low-frequency words than for high-frequency words, as initially predicted by Ziegler and Ferrand (1998), whereas no interaction was found in Experiment 2 (rime detection). Our results extend previous findings by showing that the orthographic consistency effect is obtained not only for low-frequency words but also for high-frequency words. Furthermore, these effects were also obtained in a rime detection task, which does not require the explicit processing of orthographic structure. Globally, our results suggest that literacy changes the way people process spoken words, even for frequent words. PMID:22025916
Woltz, Dan J; Gardner, Michael K
2015-09-01
Previous research has demonstrated a systematic, nonlinear relationship between word frequency judgments and values from word frequency norms. This relationship could reflect a perceptual process similar to that found in the psychophysics literature for a variety of sensory phenomena. Alternatively, it could reflect memory strength differences that are expected for words of varying levels of prior exposure. Two experiments tested the memory strength explanation by semantically priming words prior to frequency judgments. Exposure to related word meanings produced a small but measurable increase in target word frequency ratings. Repetition but not semantic priming had a greater impact on low compared to high frequency words. These findings are consistent with a memory strength view of frequency judgments that assumes a distributed network with lexical and semantic levels of representation. Copyright © 2015 Elsevier B.V. All rights reserved.
Emotion words and categories: evidence from lexical decision.
Scott, Graham G; O'Donnell, Patrick J; Sereno, Sara C
2014-05-01
We examined the categorical nature of emotion word recognition. Positive, negative, and neutral words were presented in lexical decision tasks. Word frequency was additionally manipulated. In Experiment 1, "positive" and "negative" categories of words were implicitly indicated by the blocked design employed. A significant emotion-frequency interaction was obtained, replicating past research. While positive words consistently elicited faster responses than neutral words, only low frequency negative words demonstrated a similar advantage. In Experiments 2a and 2b, explicit categories ("positive," "negative," and "household" items) were specified to participants. Positive words again elicited faster responses than did neutral words. Responses to negative words, however, were no different than those to neutral words, regardless of their frequency. The overall pattern of effects indicates that positive words are always facilitated, frequency plays a greater role in the recognition of negative words, and a "negative" category represents a somewhat disparate set of emotions. These results support the notion that emotion word processing may be moderated by distinct systems.
Funk, Mark E
2013-01-01
This lecture explores changes in the medical library profession over the last fifty years, as revealed by individual word usage in a body of literature. I downloaded articles published in the Bulletin of the Medical Library Association and Journal of the Medical Library Association between 1961 and 2000 to create an electronic corpus and tracked annual frequency of individual word usage. I used frequency sparklines of words, matching one of four archetypal shapes (level, rise, fall, and rise-and-fall) to identify significant words. Most significant words fell into the categories of environment, management, technology, and research. Based on word usage changes, the following trends are revealed: Compared to 1961, today's medical librarians are more concerned with digital information, not physical packages. We prefer information to be evidence-based. We focus more on health than medicine. We are reaching out to new constituents, sometimes leaving our building to do so. Teaching has become important for us. We run our libraries more like businesses, using constantly changing technology. We are publishing more research articles. Although these words were chosen by individual authors to tell their particular stories, in the aggregate, our words reveal our story of change in our profession.
The word frequency effect during sentence reading: A linear or nonlinear effect of log frequency?
White, Sarah J; Drieghe, Denis; Liversedge, Simon P; Staub, Adrian
2016-10-20
The effect of word frequency on eye movement behaviour during reading has been reported in many experimental studies. However, the vast majority of these studies compared only two levels of word frequency (high and low). Here we assess whether the effect of log word frequency on eye movement measures is linear, in an experiment in which a critical target word in each sentence was at one of three approximately equally spaced log frequency levels. Separate analyses treated log frequency as a categorical or a continuous predictor. Both analyses showed only a linear effect of log frequency on the likelihood of skipping a word, and on first fixation duration. Ex-Gaussian analyses of first fixation duration showed similar effects on distributional parameters in comparing high- and medium-frequency words, and medium- and low-frequency words. Analyses of gaze duration and the probability of a refixation suggested a nonlinear pattern, with a larger effect at the lower end of the log frequency scale. However, the nonlinear effects were small, and Bayes Factor analyses favoured the simpler linear models for all measures. The possible roles of lexical and post-lexical factors in producing nonlinear effects of log word frequency during sentence reading are discussed.
Recognizing Spoken Words: The Neighborhood Activation Model
Luce, Paul A.; Pisoni, David B.
2012-01-01
Objective A fundamental problem in the study of human spoken word recognition concerns the structural relations among the sound patterns of words in memory and the effects these relations have on spoken word recognition. In the present investigation, computational and experimental methods were employed to address a number of fundamental issues related to the representation and structural organization of spoken words in the mental lexicon and to lay the groundwork for a model of spoken word recognition. Design Using a computerized lexicon consisting of transcriptions of 20,000 words, similarity neighborhoods for each of the transcriptions were computed. Among the variables of interest in the computation of the similarity neighborhoods were: 1) the number of words occurring in a neighborhood, 2) the degree of phonetic similarity among the words, and 3) the frequencies of occurrence of the words in the language. The effects of these variables on auditory word recognition were examined in a series of behavioral experiments employing three experimental paradigms: perceptual identification of words in noise, auditory lexical decision, and auditory word naming. Results The results of each of these experiments demonstrated that the number and nature of words in a similarity neighborhood affect the speed and accuracy of word recognition. A neighborhood probability rule was developed that adequately predicted identification performance. This rule, based on Luce's (1959) choice rule, combines stimulus word intelligibility, neighborhood confusability, and frequency into a single expression. Based on this rule, a model of auditory word recognition, the neighborhood activation model, was proposed. This model describes the effects of similarity neighborhood structure on the process of discriminating among the acoustic-phonetic representations of words in memory. 
The results of these experiments have important implications for current conceptions of auditory word recognition in normal and hearing impaired populations of children and adults. PMID:9504270
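The neighborhood probability rule mentioned in this abstract can be sketched in the general form of Luce's choice rule: the frequency-weighted support for the stimulus word divided by the summed frequency-weighted support for the stimulus and all of its neighbors. The abstract does not give the exact expression, so the function below, its argument names, and the use of raw rather than transformed frequencies are assumptions for illustration only.

```python
def neighborhood_probability(stimulus_prob, stimulus_freq, neighbors):
    """Choice-rule style identification probability for a spoken word.

    stimulus_prob -- intelligibility of the stimulus word (0..1), assumed
                     to come from phoneme confusion data
    stimulus_freq -- frequency of the stimulus word
    neighbors     -- iterable of (confusion_prob, frequency) pairs, one per
                     word in the similarity neighborhood
    """
    stimulus_term = stimulus_prob * stimulus_freq
    neighbor_term = sum(prob * freq for prob, freq in neighbors)
    denominator = stimulus_term + neighbor_term
    if denominator == 0:
        raise ValueError("no support for the stimulus or any neighbor")
    return stimulus_term / denominator
```

Under this form, adding more neighbors, more confusable neighbors, or higher-frequency neighbors lowers the predicted identification probability, while raising the stimulus word's own frequency increases it, which matches the qualitative pattern the abstract reports.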
Fernández, Gerardo; Sapognikoff, Marcelo; Guinjoan, Salvador; Orozco, David; Agamennoni, Osvaldo
2016-07-01
The current study analyzes the effect of word properties (i.e., word length, word frequency and word predictability) on the eye movement behavior of patients with schizophrenia (SZ) compared to age-matched controls. Eighteen SZ patients and 40 age-matched controls participated in the study. Eye movements were recorded with an eye tracker while participants read regular sentences. Eye movement analyses were performed using linear mixed models. Analysis of eye movements revealed that patients with SZ made fewer single fixations and more second-pass fixations than healthy individuals (Controls). In addition, SZ patients showed an increase in gaze duration compared to Controls. Interestingly, the effects of current word frequency and current word length processing were similar in Controls and SZ patients. The high rate of second-pass fixations and the low rate of single fixations might reveal impairments in working memory when integrating neighboring words. In contrast, word frequency and length processing might require less complex mechanisms, which were functioning in SZ patients. To the best of our knowledge, this is the first study measuring how patients with SZ dynamically process well-defined words embedded in regular sentences. The findings suggest that evaluation of the resulting changes in eye movement behavior may supplement current symptom-based diagnosis. Copyright © 2016 Elsevier Inc. All rights reserved.
The locus of word frequency effects in skilled spelling-to-dictation.
Chua, Shi Min; Liow, Susan J Rickard
2014-01-01
In spelling-to-dictation tasks, skilled spellers consistently initiate spelling of high-frequency words faster than that of low-frequency words. Tainturier and Rapp's model of spelling shows three possible loci for this frequency effect: spoken word recognition, orthographic retrieval, and response execution of the first letter. Thus far, researchers have attributed the effect solely to orthographic retrieval without considering spoken word recognition or response execution. To investigate word frequency effects at each of these three loci, Experiment 1 involved a delayed spelling-to-dictation task and Experiment 2 involved a delayed/uncertain task. In Experiment 1, no frequency effect was found in the 1200-ms delayed condition, suggesting that response execution is not affected by word frequency. In Experiment 2, no frequency effect was found in the delayed/uncertain task that reflects the orthographic retrieval, whereas a frequency effect was found in the comparison immediate/uncertain task that reflects both spoken word recognition and orthographic retrieval. The results of this two-part study suggest that frequency effects in spoken word recognition play a substantial role in skilled spelling-to-dictation. Discrepancies between these findings and previous research, and the limitations of the present study, are discussed.
Motomura, Kenta; Nakamura, Morikazu; Otaki, Joji M.
2013-01-01
Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs. PMID:24688703
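The availability score described above can be sketched in Python as the ratio of an observed word frequency to an expected one. This is only a minimal illustration: the function name is hypothetical, and the expected frequency is assumed here to be the product of single-residue frequencies (an independence model), which may differ from the probabilistic expectation used by the SCS Package.

```python
from collections import Counter
from math import prod


def availability_scores(sequences, k=3):
    """Observed / expected frequency for every k-residue word that occurs."""
    residue_counts = Counter()
    word_counts = Counter()
    for seq in sequences:
        residue_counts.update(seq)
        word_counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))

    total_residues = sum(residue_counts.values())
    total_words = sum(word_counts.values())
    residue_freq = {aa: n / total_residues for aa, n in residue_counts.items()}

    scores = {}
    for word, n in word_counts.items():
        observed = n / total_words
        # Assumption: residues occur independently of one another.
        expected = prod(residue_freq[aa] for aa in word)
        scores[word] = observed / expected
    return scores


# Toy input, not real protein data.
scores = availability_scores(["ABAB"], k=2)
```

A score above 1 marks a word used more often than the independence model predicts, and a score below 1 marks an under-used word.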
Blind Linguistic Steganalysis against Translation Based Steganography
NASA Astrophysics Data System (ADS)
Chen, Zhili; Huang, Liusheng; Meng, Peng; Yang, Wei; Miao, Haibo
Translation based steganography (TBS) is a relatively new and secure kind of linguistic steganography. It takes advantage of the "noise" created by automatic translation of natural language text to encode the secret information. To date, there is little research on steganalysis against this kind of linguistic steganography. In this paper, a blind steganalytic method, named natural frequency zoned word distribution analysis (NFZ-WDA), is presented. This method improves on a previously proposed linguistic steganalysis method based on word distribution, which targets the detection of linguistic steganography such as nicetext and texto. The new method aims to detect the application of TBS and uses none of the related information about TBS; its only resource is a word frequency dictionary obtained from a large corpus, a so-called natural frequency dictionary, so it is totally blind. To verify the effectiveness of NFZ-WDA, two experiments are carried out, with two-class and multi-class SVM classifiers respectively. The experimental results show that the steganalytic method is promising.
Lee, Kang-Hoon; Shin, Kyung-Seop; Lim, Debora; Kim, Woo-Chan; Chung, Byung Chang; Han, Gyu-Bum; Roh, Jeongkyu; Cho, Dong-Ho; Cho, Kiho
2015-07-01
The genomes of living organisms are populated with pleomorphic repetitive elements (REs) of varying densities. Our hypothesis that genomic RE landscapes are species/strain/individual-specific was implemented into the Genome Signature Imaging system to visualize and compute the RE-based signatures of any genome. Following the occurrence profiling of 5-nucleotide REs/words, the information from top-50 frequency words was transformed into a genome-specific signature and visualized as Genome Signature Images (GSIs), using a CMYK scheme. An algorithm for computing distances among GSIs was formulated using the GSIs' variables (word identity, frequency, and frequency order). The utility of the GSI-distance computation system was demonstrated with control genomes. GSI-based computation of genome-relatedness among 1766 microbes (117 archaea and 1649 bacteria) identified their clustering patterns; although the majority paralleled the established classification, some did not. The Genome Signature Imaging system, with its visualization and distance computation functions, enables genome-scale evolutionary studies involving numerous genomes with varying sizes. Copyright © 2015 Elsevier Inc. All rights reserved.
Subjective frequency estimates for 2,938 monosyllabic words.
Balota, D A; Pilotti, M; Cortese, M J
2001-06-01
Subjective frequency estimates for a large sample of monosyllabic English words were collected from 574 young adults (undergraduate students) and from a separate group of 1,590 adults of varying ages and educational backgrounds. Estimates from the latter group were collected via the internet. In addition, 90 healthy older adults provided estimates for a random sample of 480 of these words. All groups rated words with respect to the estimated frequency of encounters of each word on a 7-point scale, ranging from never encountered to encountered several times a day. The young and older groups also rated each word with respect to the frequency of encounters in different perceptual domains (e.g., reading, hearing, writing, or speaking). The results of regression analyses indicated that objective log frequency and meaningfulness accounted for most of the variance in subjective frequency estimates, whereas neighborhood size accounted for the least amount of variance in the ratings. The predictive power of log frequency and meaningfulness was dependent on the level of subjective frequency estimates. Meaningfulness was a better predictor of subjective frequency for uncommon words, whereas log frequency was a better predictor of subjective frequency for common words. Our discussion focuses on the utility of subjective frequency estimates compared with other estimates of familiarity. The raw subjective frequency data for all words are available at http://www.artsci.wustl.edu/dbalota/labpub.html.
Learning new meanings for known words: Biphasic effects of prior knowledge.
Fang, Xiaoping; Perfetti, Charles; Stafura, Joseph
2017-01-01
In acquiring word meanings, learners are often confronted by a single word form that is mapped to two or more meanings. For example, long after learning how to roller-"skate", one may learn that "skate" is also a kind of fish. Such learning of new meanings for familiar words involves two potentially contrasting processes, relative to new form-new meaning learning: 1) Form-based familiarity may facilitate learning a new meaning, and 2) meaning-based interference may inhibit learning a new meaning. We examined these two processes by having native English speakers learn new, unrelated meanings for familiar (high frequency) and less familiar (low frequency) English words, as well as for unfamiliar (novel or pseudo-) words. Tracking learning with cued-recall tasks at several points during learning revealed a biphasic pattern: higher learning rates and greater learning efficiency for familiar words relative to novel words early in learning and a reversal of this pattern later in learning. Following learning, interference from original meanings for familiar words was detected in a semantic relatedness judgment task. Additionally, lexical access to familiar words with new meanings became faster compared to their exposure controls, but no such effect occurred for less familiar words. Overall, the results suggest a biphasic pattern of facilitating and interfering processes: Familiar word forms facilitate learning earlier, while interference from original meanings becomes more influential later. This biphasic pattern reflects the co-activation of new and old meanings during learning, a process that may play a role in lexicalization of new meanings.
Computational Modeling of Morphological Effects in Bangla Visual Word Recognition.
Dasgupta, Tirthankar; Sinha, Manjira; Basu, Anupam
2015-10-01
In this paper we aim to model the organization and processing of Bangla polymorphemic words in the mental lexicon. Our objective is to determine whether the mental lexicon accesses a polymorphemic word as a whole or decomposes the word into its constituent morphemes and then recognizes them accordingly. To address this issue, we adopted two different strategies. First, we conducted a masked priming experiment with native speakers. Analysis of reaction time (RT) and error rates indicates that, in general, morphologically derived words are accessed via a decomposition process. Next, based on the collected RT data, we developed a computational model that can explain the processing phenomena of the access and representation of Bangla derivationally suffixed words. To do so, we first explored the individual roles of different linguistic features of a Bangla morphologically complex word and observed that processing of Bangla morphologically complex words depends upon several factors, such as the base and surface word frequency, suffix type/token ratio, suffix family size and suffix productivity. Accordingly, we proposed different feature models. Finally, we combined these feature models into a new model that takes advantage of the individual feature models and successfully explains the processing phenomena of most of the Bangla morphologically derived words. Our proposed model shows an accuracy of around 80%, which outperforms the other related frequency models.
Fast alignment-free sequence comparison using spaced-word frequencies.
Leimeister, Chris-Andre; Boden, Marcus; Horwege, Sebastian; Lindner, Sebastian; Morgenstern, Burkhard
2014-07-15
Alignment-free methods for sequence comparison are increasingly used for genome analysis and phylogeny reconstruction; they circumvent various difficulties of traditional alignment-based approaches. In particular, alignment-free methods are much faster than pairwise or multiple alignments. They are, however, less accurate than methods based on sequence alignment. Most alignment-free approaches work by comparing the word composition of sequences. A well-known problem with these methods is that neighbouring word matches are far from independent. To reduce the statistical dependency between adjacent word matches, we propose to use 'spaced words', defined by patterns of 'match' and 'don't care' positions, for alignment-free sequence comparison. We describe a fast implementation of this approach using recursive hashing and bit operations, and we show that further improvements can be achieved by using multiple patterns instead of single patterns. To evaluate our approach, we use spaced-word frequencies as a basis for fast phylogeny reconstruction. Using real-world and simulated sequence data, we demonstrate that our multiple-pattern approach produces better phylogenies than approaches relying on contiguous words. Our program is freely available at http://spaced.gobics.de/. © The Author 2014. Published by Oxford University Press.
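The idea of a spaced word can be illustrated with a short Python sketch. It is a naive, direct implementation for exposition only, not the recursive-hashing implementation the abstract mentions; the function name and the binary pattern notation ('1' for a match position, '0' for a don't-care position) are assumptions made for this example.

```python
from collections import Counter


def spaced_word_counts(sequence, pattern):
    """Count spaced words in `sequence` for a binary `pattern`.

    '1' marks a match position whose character is kept;
    '0' marks a don't-care position, which is skipped.
    """
    keep = [i for i, flag in enumerate(pattern) if flag == "1"]
    span = len(pattern)
    counts = Counter()
    for start in range(len(sequence) - span + 1):
        window = sequence[start:start + span]
        counts["".join(window[i] for i in keep)] += 1
    return counts


# Toy sequence; with pattern "101" the middle character of each window is ignored.
counts = spaced_word_counts("ACGTACGA", "101")
```

Two sequences could then be compared through their spaced-word count vectors, and repeating the count with several patterns would correspond to the multiple-pattern variant described above.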
Understanding Zipf's law of word frequencies through sample-space collapse in sentence formation
Thurner, Stefan; Hanel, Rudolf; Liu, Bo; Corominas-Murtra, Bernat
2015-01-01
The formation of sentences is a highly structured and history-dependent process. The probability of using a specific word in a sentence strongly depends on the ‘history’ of word usage earlier in that sentence. We study a simple history-dependent model of text generation assuming that the sample-space of word usage reduces along sentence formation, on average. We first show that the model explains the approximate Zipf law found in word frequencies as a direct consequence of sample-space reduction. We then empirically quantify the amount of sample-space reduction in the sentences of 10 famous English books, by analysis of corresponding word-transition tables that capture which words can follow any given word in a text. We find a highly nested structure in these transition tables and show that this ‘nestedness’ is tightly related to the power law exponents of the observed word frequency distributions. With the proposed model, it is possible to understand that the nestedness of a text can be the origin of the actual scaling exponent and that deviations from the exact Zipf law can be understood by variations of the degree of nestedness on a book-by-book basis. On a theoretical level, we are able to show that in the case of weak nesting, Zipf's law breaks down in a fast transition. Unlike previous attempts to understand Zipf's law in language the sample-space reducing model is not based on assumptions of multiplicative, preferential or self-organized critical mechanisms behind language formation, but simply uses the empirically quantifiable parameter ‘nestedness’ to understand the statistics of word frequencies. PMID:26063827
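A simplified sample-space-reducing process can be simulated in a few lines of Python to show how a Zipf-like distribution may arise. The sketch assumes a process that starts at the highest of N states and repeatedly jumps to a uniformly chosen lower state until it reaches state 1. This uniform-jump rule, the function name and the parameter values are illustrative assumptions, not the word-transition analysis performed on the books.

```python
import random
from collections import Counter


def simulate_ssr(n_states, n_runs, seed=0):
    """Count visits to each state in repeated sample-space-reducing runs.

    Each run starts at the highest state and repeatedly jumps to a
    uniformly chosen lower state until it reaches state 1.
    """
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(n_runs):
        state = n_states
        visits[state] += 1
        while state > 1:
            state = rng.randint(1, state - 1)  # both bounds inclusive
            visits[state] += 1
    return visits


visits = simulate_ssr(n_states=50, n_runs=20000)
# Visits per run for a few states below the starting state.
rates = {s: visits[s] / 20000 for s in (1, 2, 5, 10)}
```

For states below the starting state, the visit rate per run in this toy process approaches 1/state, that is, a rank-frequency relation with exponent close to 1.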
Understanding Zipf's law of word frequencies through sample-space collapse in sentence formation.
Thurner, Stefan; Hanel, Rudolf; Liu, Bo; Corominas-Murtra, Bernat
2015-07-06
The formation of sentences is a highly structured and history-dependent process. The probability of using a specific word in a sentence strongly depends on the 'history' of word usage earlier in that sentence. We study a simple history-dependent model of text generation assuming that the sample-space of word usage reduces along sentence formation, on average. We first show that the model explains the approximate Zipf law found in word frequencies as a direct consequence of sample-space reduction. We then empirically quantify the amount of sample-space reduction in the sentences of 10 famous English books, by analysis of corresponding word-transition tables that capture which words can follow any given word in a text. We find a highly nested structure in these transition tables and show that this 'nestedness' is tightly related to the power law exponents of the observed word frequency distributions. With the proposed model, it is possible to understand that the nestedness of a text can be the origin of the actual scaling exponent and that deviations from the exact Zipf law can be understood by variations of the degree of nestedness on a book-by-book basis. On a theoretical level, we are able to show that in the case of weak nesting, Zipf's law breaks down in a fast transition. Unlike previous attempts to understand Zipf's law in language the sample-space reducing model is not based on assumptions of multiplicative, preferential or self-organized critical mechanisms behind language formation, but simply uses the empirically quantifiable parameter 'nestedness' to understand the statistics of word frequencies. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
Soares, Ana Paula; Medeiros, José Carlos; Simões, Alberto; Machado, João; Costa, Ana; Iriarte, Álvaro; de Almeida, José João; Pinheiro, Ana P; Comesaña, Montserrat
2014-03-01
In this article, we introduce ESCOLEX, the first European Portuguese children's lexical database with grade-level-adjusted word frequency statistics. Computed from a 3.2-million-word corpus, ESCOLEX provides 48,381 word forms extracted from 171 elementary and middle school textbooks for 6- to 11-year-old children attending the first six grades in the Portuguese educational system. Like other children's grade-level databases (e.g., Carroll, Davies, & Richman, 1971; Corral, Ferrero, & Goikoetxea, Behavior Research Methods, 41, 1009-1017, 2009; Lété, Sprenger-Charolles, & Colé, Behavior Research Methods, Instruments, & Computers, 36, 156-166, 2004; Zeno, Ivens, Millard, Duvvuri, 1995), ESCOLEX provides four frequency indices for each grade: overall word frequency (F), index of dispersion across the selected textbooks (D), estimated frequency per million words (U), and standard frequency index (SFI). It also provides a new measure, contextual diversity (CD). In addition, the number of letters in the word and its part(s) of speech, number of syllables, syllable structure, and adult frequencies taken from P-PAL (a European Portuguese corpus-based lexical database; Soares, Comesaña, Iriarte, Almeida, Simões, Costa, …, Machado, 2010; Soares, Iriarte, Almeida, Simões, Costa, França, …, Comesaña, in press) are provided. ESCOLEX will be a useful tool both for researchers interested in language processing and development and for professionals in need of verbal materials adjusted to children's developmental stages. ESCOLEX can be downloaded along with this article or from http://p-pal.di.uminho.pt/about/databases .
Encourage Students to Read through the Use of Data Visualization
ERIC Educational Resources Information Center
Bandeen, Heather M.; Sawin, Jason E.
2012-01-01
Instructors are always looking for new ways to engage students in reading assignments. The authors present a few techniques that rely on a web-based data visualization tool called Wordle (wordle.net). Wordle creates word frequency representations called word clouds. The larger a word appears within a cloud, the more frequently it occurs within a…
Reasoning Words as Linguistic Features of Exploratory Talk: Classroom Use and What It Can Tell Us
ERIC Educational Resources Information Center
Boyd, Maureen; Kong, Yiren
2017-01-01
Reasoning words are linguistic features associated with classroom exploratory talk as students talk-to-learn, explore ideas, and probe each other's thinking. This study extends established research on use of reasoning words to a fourth- to fifth-grade literature-based English language learning context. We examined frequency and patterning of…
Models of Vocabulary Acquisition: Direct Tests and Text-Derived Simulations of Vocabulary Growth
ERIC Educational Resources Information Center
Biemiller, Andrew; Rosenstein, Mark; Sparks, Randall; Landauer, Thomas K.; Foltz, Peter W.
2014-01-01
Determining word meanings that ought to be taught or introduced is important for educators. A sequence for vocabulary growth can be inferred from many sources, including testing children's knowledge of word meanings at various ages, predicting from print frequency, or adult-recalled Age of Acquisition. A new approach, Word Maturity, is based on…
Parafoveal load of word N+1 modulates preprocessing effectiveness of word N+2 in Chinese reading.
Yan, Ming; Kliegl, Reinhold; Shu, Hua; Pan, Jinger; Zhou, Xiaolin
2010-12-01
Preview benefits (PBs) from two words to the right of the fixated one (i.e., word N + 2) and associated parafoveal-on-foveal effects are critical for proposals of distributed lexical processing during reading. This experiment examined parafoveal processing during reading of Chinese sentences, using a boundary manipulation of N + 2-word preview with low- and high-frequency words N + 1. The main findings were (a) an identity PB for word N + 2 that was (b) primarily observed when word N + 1 was of high frequency (i.e., an interaction between frequency of word N + 1 and PB for word N + 2), and (c) a parafoveal-on-foveal frequency effect of word N + 1 for fixation durations on word N. We discuss implications for theories of serial attention shifts and parallel distributed processing of words during reading.
True reason for Zipf's law in language
NASA Astrophysics Data System (ADS)
Dahui, Wang; Menghui, Li; Zengru, Di
2005-12-01
Analyses of word frequency have historically used data from English, French, or other languages, data typically described by Zipf's law. Using data on traditional and modern Chinese literature, we show here that Chinese character frequency followed Zipf's law in literature before the Qin dynasty; however, it departed from Zipf's law in literature after the Qin dynasty. Combined with data about English dictionaries and Chinese dictionaries, we show that the true reason for Zipf's law in language is the growth and preferential selection mechanism of words or characters in a given language.
Funk, Mark E.
2013-01-01
Purpose: This lecture explores changes in the medical library profession over the last fifty years, as revealed by individual word usage in a body of literature. Methods: I downloaded articles published in the Bulletin of the Medical Library Association and Journal of the Medical Library Association between 1961 and 2010 to create an electronic corpus and tracked annual frequency of individual word usage. I used frequency sparklines of words, matching one of four archetypal shapes (level, rise, fall, and rise-and-fall) to identify significant words. Results: Most significant words fell into the categories of environment, management, technology, and research. Based on word usage changes, the following trends are revealed: Compared to 1961, today's medical librarians are more concerned with digital information, not physical packages. We prefer information to be evidence-based. We focus more on health than medicine. We are reaching out to new constituents, sometimes leaving our building to do so. Teaching has become important for us. We run our libraries more like businesses, using constantly changing technology. We are publishing more research articles. Conclusions: Although these words were chosen by individual authors to tell their particular stories, in the aggregate, our words reveal our story of change in our profession. PMID:23405042
Rank-frequency distributions of Romanian words
NASA Astrophysics Data System (ADS)
Cocioceanu, Adrian; Raportaru, Carina Mihaela; Nicolin, Alexandru I.; Jakimovski, Dragan
2017-12-01
The calibration of voice biometrics solutions requires detailed analyses of spoken texts and in this context we investigate by computational means the rank-frequency distributions of Romanian words and word series to determine the most common words and word series of the language. To this end, we have constructed a corpus of approximately 2.5 million words and then determined that the rank-frequency distributions of the Romanian words, as well as series of two, and three subsequent words, obey the celebrated Zipf law.
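A rank-frequency table for words and for series of consecutive words can be built as in the Python sketch below. The function name and the short sample sentence are placeholders for illustration; the study's corpus of roughly 2.5 million words is not reproduced here, and real use would need proper tokenization.

```python
from collections import Counter


def rank_frequency(tokens, n=1):
    """Return (rank, n-gram, count) triples sorted by decreasing count."""
    ngrams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in ngrams)
    ranked = counts.most_common()
    return [(rank, gram, c) for rank, (gram, c) in enumerate(ranked, start=1)]


# Placeholder text, split on whitespace only.
text = "a fost odata ca niciodata a fost un imparat"
tokens = text.split()
unigrams = rank_frequency(tokens, n=1)
bigrams = rank_frequency(tokens, n=2)
```

Plotting log(count) against log(rank) for such a table is the usual way to check whether the data approximately follow Zipf's law.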
Chinese translation norms for 1,429 English words.
Wen, Yun; van Heuven, Walter J B
2017-06-01
We present Chinese translation norms for 1,429 English words. Chinese-English bilinguals (N = 28) were asked to provide the first Chinese translation that came to mind for 1,429 English words. The results revealed that 71 % of the English words received more than one correct translation indicating the large amount of translation ambiguity when translating from English to Chinese. The relationship between translation ambiguity and word frequency, concreteness and language proficiency was investigated. Although the significant correlations were not strong, results revealed that English word frequency was positively correlated with the number of alternative translations, whereas English word concreteness was negatively correlated with the number of translations. Importantly, regression analyses showed that the number of Chinese translations was predicted by word frequency and concreteness. Furthermore, an interaction between these predictors revealed that the number of translations was more affected by word frequency for more concrete words than for less concrete words. In addition, mixed-effects modelling showed that word frequency, concreteness and English language proficiency were all significant predictors of whether or not a dominant translation was provided. Finally, correlations between the word frequencies of English words and their Chinese dominant translations were higher for translation-unambiguous pairs than for translation-ambiguous pairs. The translation norms are made available in a database together with lexical information about the words, which will be a useful resource for researchers investigating Chinese-English bilingual language processing.
Is a "Phoenician" reading style superior to a "Chinese" reading style? Evidence from fourth graders.
Bowey, Judith A
2008-07-01
This study compared normally achieving fourth-grade "Phoenician" readers, who identify nonwords significantly more accurately than they do exception words, with "Chinese" readers, who show the reverse pattern. Phoenician readers scored lower than Chinese readers on word identification, exception word reading, orthographic choice, spelling, reading comprehension, and verbal ability. When compared with normally achieving children who read nonwords and exception words equally well, Chinese readers scored as well as these children on word identification, regular word reading, orthographic choice, spelling, reading comprehension, phonological sensitivity, and verbal ability and scored better on exception word reading. Chinese readers also used rhyme-based analogies to read nonwords derived from high-frequency exception words just as often as did these children. As predicted, Phoenician and Chinese readers adopted somewhat different strategies in reading ambiguous nonwords constructed by analogy to high-frequency exception words. Phoenician readers were more likely than Chinese readers to read ambiguous monosyllabic nonwords via context-free grapheme-phoneme correspondences and were less likely to read disyllabic nonwords by analogy to high-frequency analogues. Although the Chinese reading style was more common than the Phoenician style in normally achieving fourth graders, there were similar numbers of poor readers with phonological dyslexia (identifying nonwords significantly more accurately than exception words) and surface dyslexia (showing the reverse pattern), although surface dyslexia was more common in the severely disabled readers. However, few of the poor readers showed pure patterns of phonological or surface dyslexia.
Phonological and Lexical Effects in Verbal Recall by Children with Specific Language Impairments
Coady, Jeffry A.; Mainela-Arnold, Elina; Evans, Julia L.
2014-01-01
Background & Aims The present study examined how phonological and lexical knowledge influences memory in children with specific language impairments (SLI). Previous work showed recall advantages for typical adults and children due to word frequency and phonotactic pattern frequency and a recall disadvantage due to phonological similarity among words. While children with SLI have well documented memory difficulties, it is not clear whether these language knowledge factors also influence recall in this population. Methods & Procedures 16 children with SLI (mean age 10;2) and CAM controls recalled lists of words differing in phonological similarity, word frequency, and phonotactic pattern frequency. While previous studies used a small set of words appearing in multiple word lists, the current study used a larger set of words, without replacement, so that children could not gain practice with individual test items. Outcomes & Results All main effects were significant. Interactions revealed that children with SLI were affected by similarity, but less so than their peers, comparably affected by word frequency, and unaffected by phonotactic pattern frequency. Conclusions Results due to phonological similarity suggest that children with SLI use less efficient encoding, while results due to word frequency and phonotactic pattern frequency were mixed. Children with SLI used coarse-grained language knowledge (word frequency) comparably to peers, but were less able to use fine-grained knowledge (phonotactic pattern frequency). Paired with phonological similarity results, this suggests that children with SLI have difficulty establishing robust phonological knowledge for use in language tasks. PMID:23472955
When does word frequency influence written production?
Baus, Cristina; Strijkers, Kristof; Costa, Albert
2013-01-01
The aim of the present study was to explore the central (e.g., lexical processing) and peripheral processes (motor preparation and execution) underlying word production during typewriting. To do so, we tested non-professional typists in a picture typing task while continuously recording EEG. Participants were instructed to write (by means of a standard keyboard) the corresponding name for a given picture. The lexical frequency of the words was manipulated: half of the picture names were of high frequency while the remaining half were of low frequency. Different measures were obtained: (1) first keystroke latency and (2) keystroke latency of the subsequent letters and duration of the word. Moreover, ERPs locked to the onset of the picture presentation were analyzed to explore the temporal course of word frequency in typewriting. The results showed an effect of word frequency for the first keystroke latency but not for the duration of the word or the speed at which letters were typed (interstroke intervals). The electrophysiological results showed the expected ERP frequency effect at posterior sites: amplitudes for low-frequency words were more positive than those for high-frequency words. However, relative to previous evidence in the spoken modality, the frequency effect appeared in a later time-window. These results demonstrate two marked differences in the processing dynamics underpinning typing compared to speaking: First, central processing dynamics between speaking and typing differ already in the manner that words are accessed; second, central processing differences in typing, unlike speaking, do not cascade to peripheral processes involved in response execution. PMID:24399980
Howell, Peter
2010-10-01
This letter comments on a study by Anderson (2007) that compared the effects of word frequency, neighborhood density, and phonological neighborhood frequency on part-word repetitions, prolongations, and single-syllable word repetitions produced by children who stutter. Anderson discussed her results with respect to 2 theories about stuttering: the covert repair hypothesis and execution planning (EXPLAN) theory. Her remarks about EXPLAN theory are examined. Anderson considered that EXPLAN does not predict the relationship between word and neighborhood frequency and stuttering for part-word repetitions and prolongations (she considered that EXPLAN predicts that stuttering occurs on simple words for children). The actual predictions that EXPLAN makes are upheld by her results. She also considered that EXPLAN cannot account for why stuttering is affected by the same variables that lead to speech errors, and it is shown that this is incorrect. The effects of word frequency, neighborhood density, and phonological neighborhood frequency on part-word repetitions, prolongations, and single-syllable word repetitions reported by Anderson (2007) are consistent with the predictions of the EXPLAN model.
The time course of spoken word learning and recognition: studies with artificial lexicons.
Magnuson, James S; Tanenhaus, Michael K; Aslin, Richard N; Dahan, Delphine
2003-06-01
The time course of spoken word recognition depends largely on the frequencies of a word and its competitors, or neighbors (similar-sounding words). However, variability in natural lexicons makes systematic analysis of frequency and neighbor similarity difficult. Artificial lexicons were used to achieve precise control over word frequency and phonological similarity. Eye tracking provided time course measures of lexical activation and competition (during spoken instructions to perform visually guided tasks) both during and after word learning, as a function of word frequency, neighbor type, and neighbor frequency. Apparent shifts from holistic to incremental competitor effects were observed in adults and neural network simulations, suggesting such shifts reflect general properties of learning rather than changes in the nature of lexical representations.
Learning new meanings for known words: Biphasic effects of prior knowledge
Fang, Xiaoping; Perfetti, Charles; Stafura, Joseph
2017-01-01
In acquiring word meanings, learners are often confronted by a single word form that is mapped to two or more meanings. For example, long after learning how to roller-“skate”, one may learn that “skate” is also a kind of fish. Such learning of new meanings for familiar words involves two potentially contrasting processes, relative to new form-new meaning learning: 1) Form-based familiarity may facilitate learning a new meaning, and 2) meaning-based interference may inhibit learning a new meaning. We examined these two processes by having native English speakers learn new, unrelated meanings for familiar (high frequency) and less familiar (low frequency) English words, as well as for unfamiliar (novel or pseudo-) words. Tracking learning with cued-recall tasks at several points during learning revealed a biphasic pattern: higher learning rates and greater learning efficiency for familiar words relative to novel words early in learning and a reversal of this pattern later in learning. Following learning, interference from original meanings for familiar words was detected in a semantic relatedness judgment task. Additionally, lexical access to familiar words with new meanings became faster compared to their exposure controls, but no such effect occurred for less familiar words. Overall, the results suggest a biphasic pattern of facilitating and interfering processes: Familiar word forms facilitate learning earlier, while interference from original meanings becomes more influential later. This biphasic pattern reflects the co-activation of new and old meanings during learning, a process that may play a role in lexicalization of new meanings. PMID:29399593
Experiments in automatic word class and word sense identification for information retrieval
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gauch, S.; Futrelle, R.P.
Automatic identification of related words and automatic detection of word senses are two long-standing goals of researchers in natural language processing. Word class information and word sense identification may enhance the performance of information retrieval systems. Large online corpora and increased computational capabilities make new techniques based on corpus linguistics feasible. Corpus-based analysis is especially needed for corpora from specialized fields for which no electronic dictionaries or thesauri exist. The methods described here use a combination of mutual information and word context to establish word similarities. Then, unsupervised classification is done using clustering in the word space, identifying word classes without pretagging. We also describe an extension of the method to handle the difficult problems of disambiguation and of determining part-of-speech and semantic information for low-frequency words. The method is powerful enough to produce high-quality results on a small corpus of 200,000 words from abstracts in a field of molecular biology.
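The mutual-information step mentioned in this abstract can be illustrated with a minimal sketch. It assumes pointwise mutual information (PMI) over adjacent word pairs, which may differ from the measure the authors actually used; the function name pmi_adjacent and the toy sentence are hypothetical.

```python
import math
from collections import Counter

def pmi_adjacent(tokens):
    """Pointwise mutual information, in bits, for each adjacent word pair.

    Assumption: probabilities are plain relative frequencies from this
    one token list; no smoothing is applied.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores

# Toy corpus, invented for illustration only.
tokens = "gene expression in yeast and gene expression in mice".split()
scores = pmi_adjacent(tokens)
```

Unsmoothed PMI tends to favour rare pairs, so a real pipeline would likely add a frequency threshold or smoothing before clustering.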
ERIC Educational Resources Information Center
Hendrix, Peter; Bolger, Patrick; Baayen, Harald
2017-01-01
Recent studies have documented frequency effects for word n-grams, independently of word unigram frequency. Further studies have revealed constructional prototype effects, both at the word level as well as for phrases. The present speech production study investigates the time course of these effects for the production of prepositional phrases in…
ERIC Educational Resources Information Center
Ledoux, Kerry; Gordon, Barry
2011-01-01
Processing and/or hemispheric differences in the neural bases of word recognition were examined in patients with long-standing, medically-intractable epilepsy localized to the left (N = 18) or right (N = 7) temporal lobe. Participants were asked to read words that varied in the frequency of their spelling-to-sound correspondences. For the right…
Catalan speakers' perception of word stress in unaccented contexts.
Ortega-Llebaria, Marta; del Mar Vanrell, Maria; Prieto, Pilar
2010-01-01
In unaccented contexts, formant frequency differences related to vowel reduction constitute a consistent cue to word stress in English, whereas in languages such as Spanish that have no systematic vowel reduction, stress perception is based on duration and intensity cues. This article examines the perception of word stress by speakers of Central Catalan, in which, due to its vowel reduction patterns, words either alternate stressed open vowels with unstressed mid-central vowels as in English or contain no vowel quality cues to stress, as in Spanish. Results show that Catalan listeners perceive stress based mainly on duration cues in both word types. Other cues pattern together with duration to make stress perception more robust. However, no single cue is absolutely necessary and trading effects compensate for a lack of differentiation in one dimension by changes in another dimension. In particular, speakers identify longer mid-central vowels as more stressed than shorter open vowels. These results and those obtained in other stress-accent languages provide cumulative evidence that word stress is perceived independently of pitch accents by relying on a set of cues with trading effects so that no single cue, including formant frequency differences related to vowel reduction, is absolutely necessary for stress perception.
Do handwritten words magnify lexical effects in visual word recognition?
Perea, Manuel; Gil-López, Cristina; Beléndez, Victoria; Carreiras, Manuel
2016-01-01
An examination of how the word recognition system is able to process handwritten words is fundamental to formulate a comprehensive model of visual word recognition. Previous research has revealed that the magnitude of lexical effects (e.g., the word-frequency effect) is greater with handwritten words than with printed words. In the present lexical decision experiments, we examined whether the quality of handwritten words moderates the recruitment of top-down feedback, as reflected in word-frequency effects. Results showed a reading cost for difficult-to-read and easy-to-read handwritten words relative to printed words. But the critical finding was that difficult-to-read handwritten words, but not easy-to-read handwritten words, showed a greater word-frequency effect than printed words. Therefore, the inherent physical variability of handwritten words does not necessarily boost the magnitude of lexical effects.
Knowledge inhibition and N400: a study with words that look like common words.
Debruille, J B
1998-04-01
In addition to their own representations, low frequency words, such as BRIBE, can covertly activate the representations of higher frequency words they look like (e.g., BRIDE). Hence, look-alike words can activate knowledge that is incompatible with the knowledge corresponding to accurate representations. Comparatively, eccentric words, that is, low frequency words that do not look as much like higher frequency words, are less likely to activate incompatible knowledge. This study focuses on the hypothesis that the N400 component of the event-related potential reflects the inhibition of incompatible knowledge. This hypothesis predicts that look-alike words elicit N400s of greater amplitudes than eccentric words in conditions where incompatible knowledge is inhibited. Results from a single item lexical decision experiment are reported which support the inhibition hypothesis. Copyright 1998 Academic Press.
Lexical frequency and acoustic reduction in spoken Dutch
NASA Astrophysics Data System (ADS)
Pluymaekers, Mark; Ernestus, Mirjam; Baayen, R. Harald
2005-10-01
This study investigates the effects of lexical frequency on the durational reduction of morphologically complex words in spoken Dutch. The hypothesis that high-frequency words are more reduced than low-frequency words was tested by comparing the durations of affixes occurring in different carrier words. Four Dutch affixes were investigated, each occurring in a large number of words with different frequencies. The materials came from a large database of face-to-face conversations. For each word containing a target affix, one token was randomly selected for acoustic analysis. Measurements were made of the duration of the affix as a whole and the durations of the individual segments in the affix. For three of the four affixes, a higher frequency of the carrier word led to shorter realizations of the affix as a whole, individual segments in the affix, or both. Other relevant factors were the sex and age of the speaker, segmental context, and speech rate. To accommodate these findings, models of speech production should allow word frequency to affect the acoustic realizations of lower-level units, such as individual speech sounds occurring in affixes.
A Dual-Route Model that Learns to Pronounce English Words
NASA Technical Reports Server (NTRS)
Remington, Roger W.; Miller, Craig S.; Null, Cynthia H. (Technical Monitor)
1995-01-01
This paper describes a model that learns to pronounce English words. Learning occurs in two modules: 1) a rule-based module that constructs pronunciations by phonetic analysis of the letter string, and 2) a whole-word module that learns to associate subsets of letters to the pronunciation, without phonetic analysis. In a simulation on a corpus of over 300 words the model produced pronunciation latencies consistent with the effects of word frequency and orthographic regularity observed in human data. Implications of the model for theories of visual word processing and reading instruction are discussed.
ERIC Educational Resources Information Center
Faes, Jolien; Gillis, Joris; Gillis, Steven
2017-01-01
The frequency of occurrence of words and sounds has a pervasive influence on typically developing children's language acquisition. For instance, highly frequent words appear earliest in a child's lexicon, and highly frequent phonemes are produced more accurately. This study evaluates (a) whether word frequency influences word accuracy and (b)…
Locus of Word Frequency Effects in Spelling to Dictation: Still at the Orthographic Level!
ERIC Educational Resources Information Center
Bonin, Patrick; Laroche, Betty; Perret, Cyril
2016-01-01
The present study was aimed at testing the locus of word frequency effects in spelling to dictation: Are they located at the level of spoken word recognition (Chua & Rickard Liow, 2014) or at the level of the orthographic output lexicon (Delattre, Bonin, & Barry, 2006)? Words that varied on objective word frequency and on phonological…
Bag-of-features based medical image retrieval via multiple assignment and visual words weighting.
Wang, Jingyan; Li, Yongping; Zhang, Ying; Wang, Chao; Xie, Honglan; Chen, Guoling; Gao, Xin
2011-11-01
Bag-of-features based approaches have become prominent for image retrieval and image classification tasks in the past decade. Such methods represent an image as a collection of local features, such as image patches and key points with scale invariant feature transform (SIFT) descriptors. To improve the bag-of-features methods, we first model the assignments of local descriptors as contribution functions, and then propose a novel multiple assignment strategy. Assuming the local features can be reconstructed by their neighboring visual words in a vocabulary, reconstruction weights can be solved by quadratic programming. The weights are then used to build contribution functions, resulting in a novel assignment method, called quadratic programming (QP) assignment. We further propose a novel visual word weighting method. The discriminative power of each visual word is analyzed by the sub-similarity function in the bin that corresponds to the visual word. Each sub-similarity function is then treated as a weak classifier. A strong classifier is learned by boosting methods that combine those weak classifiers. The weighting factors of the visual words are learned accordingly. We evaluate the proposed methods on medical image retrieval tasks. The methods are tested on three well-known data sets, i.e., the ImageCLEFmed data set, the 304 CT Set, and the basal-cell carcinoma image set. Experimental results demonstrate that the proposed QP assignment outperforms the traditional nearest neighbor assignment, the multiple assignment, and the soft assignment, whereas the proposed boosting based weighting strategy outperforms the state-of-the-art weighting methods, such as the term frequency weights and the term frequency-inverse document frequency weights.
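For orientation, the term frequency-inverse document frequency baseline named at the end of this abstract can be sketched as follows. This is a generic TF-IDF weighting of visual-word histograms under assumed conventions (relative term frequency, natural-log IDF), not the authors' boosting-based method; the function name tf_idf and the histograms are hypothetical.

```python
import math

def tf_idf(histograms):
    """Weight visual-word counts by TF-IDF.

    histograms[i][k] is the count of visual word k in image i.
    Assumption: TF is the count divided by the image's total count,
    and IDF is ln(number of images / images containing the word).
    """
    n_images = len(histograms)
    n_words = len(histograms[0])
    doc_freq = [sum(1 for h in histograms if h[k] > 0) for k in range(n_words)]
    idf = [math.log(n_images / d) if d > 0 else 0.0 for d in doc_freq]
    weighted = []
    for h in histograms:
        total = sum(h)
        weighted.append(
            [(c / total) * idf[k] if total else 0.0 for k, c in enumerate(h)]
        )
    return weighted

# Three toy images over a three-word visual vocabulary (invented data).
histograms = [[4, 0, 1], [2, 3, 0], [1, 0, 0]]
weights = tf_idf(histograms)
```

A visual word that occurs in every image receives zero weight, which is the sense in which TF-IDF downweights non-discriminative words.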
Moers, Cornelia; Meyer, Antje; Janse, Esther
2017-06-01
High-frequency units are usually processed faster than low-frequency units in language comprehension and language production. Frequency effects have been shown for words as well as word combinations. Word co-occurrence effects can be operationalized in terms of transitional probability (TP). TPs reflect how probable a word is, conditioned by its right or left neighbouring word. This corpus study investigates whether three different age groups-younger children (8-12 years), adolescents (12-18 years) and older (62-95 years) Dutch speakers-show frequency and TP context effects on spoken word durations in reading aloud, and whether age groups differ in the size of these effects. Results show consistent effects of TP on word durations for all age groups. Thus, TP seems to influence the processing of words in context, beyond the well-established effect of word frequency, across the entire age range. However, the study also indicates that age groups differ in the size of TP effects, with older adults having smaller TP effects than adolescent readers. Our results show that probabilistic reduction effects in reading aloud may at least partly stem from contextual facilitation that leads to faster reading times in skilled readers, as well as in young language learners.
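The definition of transitional probability given in this abstract (how probable a word is, conditioned on its left or right neighbour) can be made concrete with a minimal sketch. It assumes simple relative-frequency estimates from a single token list; the function name and the example sentence are hypothetical and not taken from the study's corpus.

```python
from collections import Counter

def transitional_probabilities(tokens):
    """Return forward and backward TPs for every adjacent word pair.

    forward[(w1, w2)]  estimates P(w2 | preceding word is w1)
    backward[(w1, w2)] estimates P(w1 | following word is w2)
    Assumption: unsmoothed relative frequencies from unigram counts.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    forward = {}
    backward = {}
    for (w1, w2), count in bigrams.items():
        forward[(w1, w2)] = count / unigrams[w1]
        backward[(w1, w2)] = count / unigrams[w2]
    return forward, backward

# Toy sentence, invented for illustration only.
tokens = "the dog saw the cat and the dog ran".split()
forward_tp, backward_tp = transitional_probabilities(tokens)
```

In this toy example "dog" is always preceded by "the", so the backward TP of the pair is 1.0, while "the" is followed by "dog" in only two of its three occurrences.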
The impact of word prevalence on lexical decision times: Evidence from the Dutch Lexicon Project 2.
Brysbaert, Marc; Stevens, Michaël; Mandera, Paweł; Keuleers, Emmanuel
2016-03-01
Keuleers, Stevens, Mandera, and Brysbaert (2015) presented a new variable, word prevalence, defined as word knowledge in the population. Some words are known to more people than others. This is particularly true for low-frequency words (e.g., screenshot vs. scourage). In the present study, we examined the impact of the measure by collecting lexical decision times for 30,000 Dutch word lemmas of various lengths (the Dutch Lexicon Project 2). Word prevalence had the second highest correlation with lexical decision times (after word frequency): Words known by everyone in the population were responded to 100 ms faster than words known to only half of the population, even after controlling for word frequency, word length, age of acquisition, similarity to other words, and concreteness. Because word prevalence has rather low correlations with the existing measures (including word frequency), the unique variance it contributes to lexical decision times is higher than that of the other variables. We consider the reasons why word prevalence has an impact on word processing times and we argue that it is likely to be the most important new variable protecting researchers against experimenter bias in selecting stimulus materials. (c) 2016 APA, all rights reserved.
Niche as a Determinant of Word Fate in Online Groups
Altmann, Eduardo G.; Pierrehumbert, Janet B.; Motter, Adilson E.
2011-01-01
Patterns of word use both reflect and influence a myriad of human activities and interactions. Like other entities that are reproduced and evolve, words rise or decline depending upon a complex interplay between their intrinsic properties and the environments in which they function. Using Internet discussion communities as model systems, we define the concept of a word niche as the relationship between the word and the characteristic features of the environments in which it is used. We develop a method to quantify two important aspects of the size of the word niche: the range of individuals using the word and the range of topics it is used to discuss. Controlling for word frequency, we show that these aspects of the word niche are strong determinants of changes in word frequency. Previous studies have already indicated that word frequency itself is a correlate of word success at historical time scales. Our analysis of changes in word frequencies over time reveals that the relative sizes of word niches are far more important than word frequencies in the dynamics of the entire vocabulary at shorter time scales, as the language adapts to new concepts and social groupings. We also distinguish endogenous versus exogenous factors as additional contributors to the fates of words, and demonstrate the force of this distinction in the rise of novel words. Our results indicate that short-term nonstationarity in word statistics is strongly driven by individual proclivities, including inclinations to provide novel information and to project a distinctive social identity. PMID:21589910
Meyer, Ted A.; Pisoni, David B.
2012-01-01
Objective The Phonetically Balanced Kindergarten (PBK) Test (Haskins, Reference Note 2) has been used for almost 50 yr to assess spoken word recognition performance in children with hearing impairments. The test originally consisted of four lists of 50 words, but only three of the lists (lists 1, 3, and 4) were considered “equivalent” enough to be used clinically with children. Our goal was to determine if the lexical properties of the different PBK lists could explain any differences between the three “equivalent” lists and the fourth PBK list (List 2) that has not been used in clinical testing. Design Word frequency and lexical neighborhood frequency and density measures were obtained from a computerized database for all of the words on the four lists from the PBK Test as well as the words from a single PB-50 (Egan, 1948) word list. Results The words in the “easy” PBK list (List 2) were of higher frequency than the words in the three “equivalent” lists. Moreover, the lexical neighborhoods of the words on the “easy” list contained fewer phonetically similar words than the neighborhoods of the words on the other three “equivalent” lists. Conclusions It is important for researchers to consider word frequency and lexical neighborhood frequency and density when constructing word lists for testing speech perception. The results of this computational analysis of the PBK Test provide additional support for the proposal that spoken words are recognized “relationally” in the context of other phonetically similar words in the lexicon. Implications of using open-set word recognition tests with children with hearing impairments are discussed with regard to the specific vocabulary and information processing demands of the PBK Test. PMID:10466571
ERIC Educational Resources Information Center
Dufour, Sophie; Brunelliere, Angele; Frauenfelder, Ulrich H.
2013-01-01
Although the word-frequency effect is one of the most established findings in spoken-word recognition, the precise processing locus of this effect is still a topic of debate. In this study, we used event-related potentials (ERPs) to track the time course of the word-frequency effect. In addition, the neighborhood density effect, which is known to…
Antecedent Frequency Effects on Anaphoric Pronoun Resolution: Evidence from Spanish
ERIC Educational Resources Information Center
Egusquiza, Nerea; Navarrete, Eduardo; Zawiszewski, Adam
2016-01-01
High-frequency words are usually understood and produced faster than low-frequency words. Although the effect of word frequency is a reliable phenomenon in many domains of language processing, it remains unclear whether and how frequency affects pronominal anaphoric resolution. We evaluated this issue by means of two self-paced reading…
Investigating the Accuracy of Teachers' Word Frequency Intuitions
ERIC Educational Resources Information Center
McCrostie, James
2007-01-01
Previous research has found that native English speakers can judge, with a relatively high degree of accuracy, the frequency of words in the English language. However, there has been little investigation of the ability to judge the frequency of high and middle frequency words. Similarly, the accuracy of EFL teachers' frequency judgements remains…
A Reassessment of Frequency and Vocabulary Size in L2 Vocabulary Teaching
ERIC Educational Resources Information Center
Schmitt, Norbert; Schmitt, Diane
2014-01-01
The high-frequency vocabulary of English has traditionally been thought to consist of the 2,000 most frequent word families, and low-frequency vocabulary as that beyond the 10,000 frequency level. This paper argues that these boundaries should be reassessed on pedagogic grounds. Based on a number of perspectives (including frequency and…
A neuroimaging study of conflict during word recognition.
Riba, Jordi; Heldmann, Marcus; Carreiras, Manuel; Münte, Thomas F
2010-08-04
Using functional magnetic resonance imaging the neural activity associated with error commission and conflict monitoring in a lexical decision task was assessed. In a cohort of 20 native speakers of Spanish conflict was introduced by presenting words with high and low lexical frequency and pseudo-words with high and low syllabic frequency for the first syllable. Erroneous versus correct responses showed activation in the frontomedial and left inferior frontal cortex. A similar pattern was found for correctly classified words of low versus high lexical frequency and for correctly classified pseudo-words of high versus low syllabic frequency. Conflict-related activations for language materials largely overlapped with error-induced activations. The effect of syllabic frequency underscores the role of sublexical processing in visual word recognition and supports the view that the initial syllable mediates between the letter and word level.
Semantic Factors Predict the Rate of Lexical Replacement of Content Words
Vejdemo, Susanne; Hörberg, Thomas
2016-01-01
The rate of lexical replacement estimates the diachronic stability of word forms on the basis of how frequently a proto-language word is replaced or retained in its daughter languages. Lexical replacement rate has been shown to be highly related to word class and word frequency. In this paper, we argue that content words and function words behave differently with respect to lexical replacement rate, and we show that semantic factors predict the lexical replacement rate of content words. For the 167 content items in the Swadesh list, data was gathered on the features of lexical replacement rate, word class, frequency, age of acquisition, synonyms, arousal, imageability and average mutual information, either from published databases or gathered from corpora and lexica. A linear regression model shows that, in addition to frequency, synonyms, senses and imageability are significantly related to the lexical replacement rate of content words–in particular the number of synonyms that a word has. The model shows no differences in lexical replacement rate between word classes, and outperforms a model with word class and word frequency predictors only. PMID:26820737
Children's early reading vocabulary: description and word frequency lists.
Stuart, Morag; Dixon, Maureen; Masterson, Jackie; Gray, Bob
2003-12-01
When constructing stimuli for experimental investigations of cognitive processes in early reading development, researchers have to rely on adult or American children's word frequency counts, as no such counts exist for English children. The present paper introduces a database of children's early reading vocabulary, for use by researchers and teachers. Texts from 685 books from reading schemes and story books read by 5-7 year-old children were used in the construction of the database. All words from the 685 books were typed or scanned into an Oracle database. The resulting up-to-date word frequency list of early print exposure in the UK is available in two forms from a website address given in this paper. This allows access to one list of the words ordered alphabetically and one list of the words ordered by frequency. We also briefly address some fundamental issues underlying early reading vocabulary (e.g., that it is heavily skewed towards low frequencies). Other characteristics of the vocabulary are then discussed. We hope the word frequency lists will be of use to researchers seeking to control word frequency, and to teachers interested in the vocabulary to which young children are exposed in their reading material.
Parkinson's disease and the effect of lexical factors on vowel articulation.
Watson, Peter J; Munson, Benjamin
2008-11-01
Lexical factors (i.e., word frequency and phonological neighborhood density) influence speech perception and production. It is unknown if these factors are affected by Parkinson's disease (PD). Ten men with PD and ten healthy men read CVC words (varying orthogonally for word frequency and density) aloud while audio recorded. Acoustic analysis was performed on duration and Bark-scaled F1-F2 values of the vowels contained in the words. Vowel space was larger for low-frequency words from dense neighborhoods than from sparse ones for both groups. However, the participants with PD did not show an effect of density on dispersion for high-frequency words.
E-READING II: words database for reading by students from Basic Education II.
Oliveira, Adriana Marques de; Capellini, Simone Aparecida
2016-01-01
To develop a database of words of high, medium and low frequency in reading for Basic Education II. The words were taken from the teaching material for Portuguese Language used by the teaching network of the State of São Paulo in the 6th to the 9th year of Basic Education. Only nouns were selected. The frequency with which each word occurred was recorded and a single database was created. To classify the words as high, medium or low frequency, the distribution terciles, the mean frequency and the cutoff points of the terciles were used. To ascertain whether the words of high, medium and low frequency corresponded to this classification, 224 students were assessed: G1 (6th year, n= 61); G2 (7th year, n= 44); G3 (8th year, n= 65); and G4 (9th year, n= 54). The lists of words were presented to the students for reading out loud in two sessions: 1st) words of high and medium frequency and 2nd) words of low frequency. Words that met the exclusion criteria, or that caused discomfort or joking on the part of the students, were excluded. The word database was made up of 1659 words and was titled 'E - LEITURA II' ('E-READING II', in English). The E-LEITURA II database is a useful resource for professionals, as it provides a database that can be used for research, educational and clinical purposes among students of Basic Education II. Professionals can choose the words according to their objectives and criteria for elaborating evaluation or intervention procedures involving reading.
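The tercile-based classification described in this abstract can be sketched as below. The sketch assumes the two tercile cut points alone decide the band (the abstract also mentions mean frequency, whose role is not specified); the function name classify_by_terciles, the words and the counts are invented for illustration.

```python
import statistics

def classify_by_terciles(frequencies):
    """Label each word 'low', 'medium' or 'high' by frequency tercile.

    Assumption: cut points come from statistics.quantiles with n=3
    (default 'exclusive' method); values at or below the lower cut
    are 'low', values above the upper cut are 'high'.
    """
    lower_cut, upper_cut = statistics.quantiles(frequencies.values(), n=3)
    labels = {}
    for word, freq in frequencies.items():
        if freq <= lower_cut:
            labels[word] = "low"
        elif freq <= upper_cut:
            labels[word] = "medium"
        else:
            labels[word] = "high"
    return labels

# Hypothetical noun counts, not taken from the E-LEITURA II database.
frequencies = {"casa": 90, "livro": 60, "escola": 45,
               "mapa": 20, "rio": 12, "bule": 3}
labels = classify_by_terciles(frequencies)
```

With real data, ties at a cut point and the heavy skew of word-frequency distributions would need an explicit rule, which the abstract does not describe.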
Vinkers, Christiaan H; Tijdink, Joeri K; Otte, Willem M
2015-12-14
To investigate whether language used in science abstracts can skew towards the use of strikingly positive and negative words over time. Retrospective analysis of all scientific abstracts in PubMed between 1974 and 2014. The yearly frequencies of positive, negative, and neutral words (25 preselected words in each category), plus 100 randomly selected words were normalised for the total number of abstracts. Subanalyses included pattern quantification of individual words, specificity for selected high impact journals, and comparison between author affiliations within or outside countries with English as the official majority language. Frequency patterns were compared with 4% of all books ever printed and digitised by use of Google Books Ngram Viewer. Frequencies of positive and negative words in abstracts compared with frequencies of words with a neutral and random connotation, expressed as relative change since 1980. The absolute frequency of positive words increased from 2.0% (1974-80) to 17.5% (2014), a relative increase of 880% over four decades. All 25 individual positive words contributed to the increase, particularly the words "robust," "novel," "innovative," and "unprecedented," which increased in relative frequency up to 15,000%. Comparable but less pronounced results were obtained when restricting the analysis to selected journals with high impact factors. Authors affiliated to an institute in a non-English speaking country used significantly more positive words. Negative word frequencies increased from 1.3% (1974-80) to 3.2% (2014), a relative increase of 257%. Over the same time period, no apparent increase was found in neutral or random word use, or in the frequency of positive word use in published books. Our lexicographic analysis indicates that scientific abstracts are currently written with more positive and negative words, and provides an insight into the evolution of scientific writing. 
Apparently scientists look on the bright side of research results. But whether this perception fits reality should be questioned. Published by the BMJ Publishing Group Limited.
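The normalisation step described in this abstract (yearly word counts divided by the number of abstracts, then expressed as change relative to a baseline period) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the counts are hypothetical, and the abstract does not specify exactly how the study computed its percentages.

```python
def normalised_frequency(word_hits: int, total_abstracts: int) -> float:
    """Share of abstracts in a given year that contain the word (0..1)."""
    if total_abstracts <= 0:
        raise ValueError("total_abstracts must be positive")
    return word_hits / total_abstracts


def relative_change(baseline: float, current: float) -> float:
    """Percentage change of `current` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline * 100.0


# Hypothetical counts, chosen only to illustrate the arithmetic.
baseline = normalised_frequency(word_hits=200, total_abstracts=10_000)
current = normalised_frequency(word_hits=900, total_abstracts=30_000)
change = relative_change(baseline, current)  # roughly a 50% increase
```

Dividing by the yearly number of abstracts matters because the raw count of any word grows simply because more abstracts are published each year.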
Serial reconstruction of order and serial recall in verbal short-term memory.
Quinlan, Philip T; Roodenrys, Steven; Miller, Leonie M
2017-10-01
We carried out a series of experiments on verbal short-term memory for lists of words. In the first experiment, participants were tested via immediate serial recall, and word frequency and list set size were manipulated. With closed lists, the same set of items was repeatedly sampled, and with open lists, no item was presented more than once. In serial recall, effects of word frequency and set size were found. When a serial reconstruction-of-order task was used, in a second experiment, robust effects of word frequency emerged, but set size failed to show an effect. The effects of word frequency in order reconstruction were further examined in two final experiments. The data from these experiments revealed that the effects of word frequency are robust and apparently are not exclusively indicative of output processes. In light of these findings, we propose a multiple-mechanisms account in which word frequency can influence both retrieval and preretrieval processes.
The effect of high- and low-frequency previews and sentential fit on word skipping during reading
Angele, Bernhard; Laishley, Abby; Rayner, Keith; Liversedge, Simon P.
2014-01-01
In a previous gaze-contingent boundary experiment, Angele and Rayner (2012) found that readers are likely to skip a word that appears to be the definite article "the" even when syntactic constraints do not allow for articles to occur in that position. In the present study, we investigated whether the word frequency of the preview of a three-letter target word influences a reader’s decision to fixate or skip that word. We found that the word frequency rather than the felicitousness (syntactic fit) of the preview affected how often the upcoming word was skipped. These results indicate that visual information about the upcoming word trumps information from the sentence context when it comes to making a skipping decision. Skipping parafoveal instances of "the" therefore may simply be an extreme case of skipping high-frequency words. PMID:24707791
The Low-Frequency Encoding Disadvantage: Word Frequency Affects Processing Demands
ERIC Educational Resources Information Center
Diana, Rachel A.; Reder, Lynne M.
2006-01-01
Low-frequency words produce more hits and fewer false alarms than high-frequency words in a recognition task. The low-frequency hit rate advantage has sometimes been attributed to processes that operate during the recognition test (e.g., L. M. Reder et al., 2000). When tasks other than recognition, such as recall, cued recall, or associative…
Adding part-of-speech information to the SUBTLEX-US word frequencies.
Brysbaert, Marc; New, Boris; Keuleers, Emmanuel
2012-12-01
The SUBTLEX-US corpus has been parsed with the CLAWS tagger, so that researchers have information about the possible word classes (parts-of-speech, or PoSs) of the entries. Five new columns have been added to the SUBTLEX-US word frequency list: the dominant (most frequent) PoS for the entry, the frequency of the dominant PoS, the frequency of the dominant PoS relative to the entry's total frequency, all PoSs observed for the entry, and the respective frequencies of these PoSs. Because the current definition of lemma frequency does not seem to provide word recognition researchers with useful information (as illustrated by a comparison of the lemma frequencies and the word form frequencies from the Corpus of Contemporary American English), we have not provided a column with this variable. Instead, we hope that the full list of PoS frequencies will help researchers to collectively determine which combination of frequencies is the most informative.
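The five added columns can be derived from a per-entry table of PoS counts. The sketch below assumes a hypothetical input format (a dictionary mapping PoS labels to counts) and hypothetical output keys; it is not the SUBTLEX-US file layout.

```python
from typing import Dict


def pos_summary(pos_counts: Dict[str, int]) -> dict:
    """Summarise the PoS counts of one entry into the five described fields."""
    total = sum(pos_counts.values())
    if total == 0:
        raise ValueError("entry has no PoS counts")
    # Sort by descending count; ties are broken alphabetically for determinism.
    ranked = sorted(pos_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    dominant_pos, dominant_freq = ranked[0]
    return {
        "dominant_pos": dominant_pos,
        "dominant_freq": dominant_freq,
        "dominant_share": dominant_freq / total,
        "all_pos": [pos for pos, _ in ranked],
        "all_freqs": [count for _, count in ranked],
    }


# Hypothetical entry that occurs 70 times as a verb and 30 times as a noun.
entry = pos_summary({"Noun": 30, "Verb": 70})
```

Keeping the full list of PoS frequencies, rather than only the dominant one, is what lets a researcher try different combinations of frequencies later.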
Quantitative learning strategies based on word networks
NASA Astrophysics Data System (ADS)
Zhao, Yue-Tian-Yi; Jia, Zi-Yang; Tang, Yong; Xiong, Jason Jie; Zhang, Yi-Cheng
2018-02-01
Learning English requires considerable effort, but the way vocabulary is introduced in textbooks is not optimized for learning efficiency. With the growing population of English learners, optimizing the learning process could substantially improve English learning and teaching. Recent developments in big data analysis and complex network science provide additional opportunities to design and investigate English learning strategies. In this paper, quantitative English learning strategies based on word networks and word usage information are proposed. The strategies integrate word frequency with topological structural information. By analyzing the influence of connected learned words, the learning weights for the unlearned words and the dynamic updating of the network are studied. The results suggest that quantitative strategies significantly improve learning efficiency while maintaining effectiveness. In particular, the optimized-weight-first strategy and segmented strategies outperform other strategies. The results provide opportunities for researchers and practitioners to reconsider how English is taught and to design vocabularies quantitatively, balancing efficiency and learning costs on the basis of the word network.
The Word Frequency Effect on Second Language Vocabulary Learning
ERIC Educational Resources Information Center
Koirala, Cesar
2015-01-01
This study examines several linguistic factors as possible contributors to perceived word difficulty in second language learners in an experimental setting. The investigated factors include: (1) frequency of word usage in the first language, (2) word length, (3) number of syllables in a word, and (4) number of consonant clusters in a word. Word…
Word Frequency Effects in Dual-Task Studies Using Lexical Decision and Naming as Task 2
NASA Technical Reports Server (NTRS)
Remington, Roger W.; McCann, Robert S.; VanSelst, Mark; Shafto, Michael G. (Technical Monitor)
1997-01-01
Word frequency effects in dual-task lexical decision are variously reported to be additive or underadditive across SOA. We replicate and extend earlier lexical decision studies and find word frequency to be additive across SOA. To more directly capture lexical processing, we examine dual-task naming. Once again, we find word frequency to be additive across SOA. Lexical processing appears to be constrained by central processing limitations.
Locus of word frequency effects in spelling to dictation: Still at the orthographic level!
Bonin, Patrick; Laroche, Betty; Perret, Cyril
2016-11-01
The present study was aimed at testing the locus of word frequency effects in spelling to dictation: Are they located at the level of spoken word recognition (Chua & Rickard Liow, 2014) or at the level of the orthographic output lexicon (Delattre, Bonin, & Barry, 2006)? Words that varied on objective word frequency and on phonological neighborhood density were orally presented to adults who had to write them down. Following the additive factors logic (Sternberg, 1969, 2001), if word frequency in spelling to dictation influences a processing level, that is, the orthographic output level, different from that influenced by phonological neighborhood density, that is, spoken word recognition, the impact of the 2 factors should be additive. In contrast, their influence should be overadditive if they act at the same processing level in spelling to dictation, namely the spoken word recognition level. We found that both factors had a reliable influence on the spelling latencies but did not interact. This finding is in line with an orthographic output locus hypothesis of word frequency effects in spelling to dictation. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
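The additive factors logic in this abstract comes down to checking whether the effect of one factor is the same size at each level of the other. A minimal sketch of that interaction contrast for a 2 x 2 design is given below; the cell labels and the latencies are hypothetical and are not data from the study.

```python
def interaction_contrast(mean_rt: dict) -> float:
    """Difference between the frequency effects at the two density levels.

    `mean_rt` maps (frequency, density) cells to mean latencies in ms.
    A contrast near zero is the additive pattern; a clearly positive
    contrast is the overadditive pattern.
    """
    effect_sparse = mean_rt[("low_freq", "sparse")] - mean_rt[("high_freq", "sparse")]
    effect_dense = mean_rt[("low_freq", "dense")] - mean_rt[("high_freq", "dense")]
    return effect_dense - effect_sparse


# Hypothetical cell means (ms): a 50 ms frequency effect at both density levels.
additive = {
    ("high_freq", "sparse"): 900, ("low_freq", "sparse"): 950,
    ("high_freq", "dense"): 930, ("low_freq", "dense"): 980,
}
# Hypothetical cell means (ms): the frequency effect grows in dense neighborhoods.
overadditive = {
    ("high_freq", "sparse"): 900, ("low_freq", "sparse"): 950,
    ("high_freq", "dense"): 930, ("low_freq", "dense"): 1010,
}
```

In practice the contrast would be tested statistically rather than read off the cell means, since a small non-zero value can arise from noise alone.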
Effects of Lexicality and Word Frequency on Brain Activation in Dyslexic Readers
ERIC Educational Resources Information Center
Heim, Stefan; Wehnelt, Anke; Grande, Marion; Huber, Walter; Amunts, Katrin
2013-01-01
We investigated the neural basis of lexical access to written stimuli in adult dyslexics and normal readers via the Lexicality effect (pseudowords greater than words) and the Frequency effect (low greater than high frequent words). The participants read aloud German words (with low or high lexical frequency) or pseudowords while being scanned. In…
Frequency of Use Leads to Automaticity of Production: Evidence from Repair in Conversation
ERIC Educational Resources Information Center
Kapatsinski, Vsevolod
2010-01-01
In spontaneous speech, speakers sometimes replace a word they have just produced or started producing by another word. The present study reports that in these replacement repairs, low-frequency replaced words are more likely to be interrupted prior to completion than high-frequency words, providing support to the hypothesis that the production of…
Additive Effects of Stimulus Quality and Word Frequency on Eye Movements during Chinese Reading
ERIC Educational Resources Information Center
Liu, Pingping; Li, Xingshan; Han, Buxin
2015-01-01
Eye movements of Chinese readers were recorded for sentences in which high- and low-frequency target words were presented normally or with reduced stimulus quality in two experiments. We found stimulus quality and word frequency produced strong additive effects on fixation durations for target words. The results demonstrate that stimulus quality…
Moreno-Martínez, F Javier; Montoro, Pedro R; Rodríguez-Rojo, Inmaculada C
2014-12-01
This article presents a new corpus of 820 words pertaining to 14 semantic categories, 7 natural (animals, body parts, insects, flowers, fruits, trees, and vegetables) and 7 man-made (buildings, clothing, furniture, kitchen utensils, musical instruments, tools, and vehicles); each word in the database was collected empirically in a previous exemplar generation study. In the present study, 152 Spanish speakers provided data for four psycholinguistic variables known to affect lexical-semantic processing in both neurologically intact and brain-damaged participants: age of acquisition, familiarity, manipulability, and typicality. Furthermore, we collected lexical frequency data derived from Internet search hits, plus three additional Spanish lexical frequency indexes. Word length, number of syllables, and the proportion of respondents citing the exemplar as a category member (which can be useful as an additional measure of typicality) are also provided. Reliability and validity indexes showed that our items display characteristics similar to those of other corpora. Overall, this new corpus of words provides a useful tool for scientists engaged in cognitive- and neuroscience-based research focused on examining language, memory, and object processing. The full set of norms can be downloaded from www.psychonomic.org/archive.
Research on aviation unsafe incidents classification with improved TF-IDF algorithm
NASA Astrophysics Data System (ADS)
Wang, Yanhua; Zhang, Zhiyuan; Huo, Weigang
2016-05-01
The text of Aviation Safety Confidential Reports contains a large amount of valuable information. The term frequency-inverse document frequency (TF-IDF) algorithm is commonly used in text analysis, but it does not take into account the sequential relationships among words in the text or their role in semantic expression. To address these shortcomings for the seven category labels of civil aviation unsafe incidents, this paper improves the TF-IDF algorithm using a co-occurrence network and establishes feature word extraction and word sequential relations for classified incidents. An aviation domain lexicon was used to improve classification accuracy. A feature word network model was designed for multi-document classification of unsafe incidents and was used in the experiment. Finally, the classification accuracy of the improved algorithm was verified experimentally.
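For orientation, the baseline that this paper sets out to improve can be sketched in a few lines. The code below is the standard textbook TF-IDF weighting only; it is not the co-occurrence-network variant proposed in the paper, and the tokenised report snippets are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (standard TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Invented, pre-tokenised report snippets.
reports = [
    ["engine", "failure", "runway"],
    ["runway", "incursion", "runway"],
    ["engine", "fire"],
]
weights = tf_idf(reports)
```

Each document is treated as a bag of words, so any reordering of the tokens gives the same weights. That is the limitation regarding word order that the abstract points out.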
Core Vocabulary in Written Personal Narratives of School-Age Children
Wood, Carla; Appleget, Allyssa; Hart, Sara
2016-01-01
This study aimed to describe core words of written personal narratives to inform the implementation of AAC supports for literacy instruction. Investigators analyzed lexical diversity, frequency of specific word use and types of words that made up 70% of the total words used in 211 written narrative samples from children in first grade (n=94) and fourth grade (n=117). Across grades, 191 different words made up 70% of the total words used in the 211 written narrative samples. The top 50 words were comprised of content words (64%) and function words (36%). Grade differences were noted in diversity and types of words, including differences in the number of words comprising the core (132 words for children in first grade and 207 for fourth grade) and a higher proportion of abstract nouns for children in fourth grade based on the 200 most frequently occurring words for each grade. PMID:27559987
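The idea of a core vocabulary covering 70% of all words used can be illustrated with a cumulative frequency count. This is a sketch under assumptions: the function name, the tie-breaking rule, and the toy sentence are hypothetical, and the study's own procedure may differ in detail.

```python
from collections import Counter


def core_words(tokens, coverage=0.70):
    """Smallest set of most frequent words whose tokens reach `coverage`."""
    counts = Counter(tokens)
    total = sum(counts.values())
    core, covered = [], 0
    # Most frequent first; ties are broken alphabetically for determinism.
    for word, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if covered >= coverage * total:
            break
        core.append(word)
        covered += count
    return core


# Toy sample: 13 tokens, 9 distinct words.
sample = "i went to the park and i saw the dog and the cat".split()
core = core_words(sample)
```

In the toy sample, six of the nine distinct words are needed to reach 70% of the tokens; in a large corpus the core is a far smaller fraction of the distinct words because a few function words account for many tokens.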
Schultheiss, Oliver C.
2013-01-01
Traditionally, implicit motives (i.e., non-conscious preferences for specific classes of incentives) are assessed through semantic coding of imaginative stories. The present research tested the marker-word hypothesis, which states that implicit motives are reflected in the frequencies of specific words. Using Linguistic Inquiry and Word Count (LIWC; Pennebaker et al., 2001), Study 1 identified word categories that converged with a content-coding measure of the implicit motives for power, achievement, and affiliation in picture stories collected in German and US student samples, showed discriminant validity with self-reported motives, and predicted well-validated criteria of implicit motives (gender difference for the affiliation motive; in interaction with personal-goal progress: emotional well-being). Study 2 demonstrated LIWC-based motive scores' causal validity by documenting their sensitivity to motive arousal. PMID:24137149
Galbraith, G C; Jhaveri, S P; Kuo, J
1997-01-01
Speech-evoked brainstem frequency-following responses (FFRs) were recorded to repeated presentations of the same stimulus word. Word repetition results in illusory verbal transformations (VTs) in which word perceptions can differ markedly from the actual stimulus. Previous behavioral studies support an explanation of VTs based on changes in arousal or attention. Horizontal and vertical dipole FFRs were recorded to assess responses with putative origins in the auditory nerve and central brainstem, respectively. FFRs were recorded from 18 subjects when they correctly heard the stimulus and when they reported VTs. Although horizontal and vertical dipole FFRs showed different frequency response patterns, dipoles did not differentiate between perceptual conditions. However, when subjects were divided into low- and high-VT groups (based on percentage of VT trials), a significant Condition x Group interaction resulted. This interaction showed the largest difference in FFR amplitudes during VT trials, with the low-VT group showing increased amplitudes, and the high-VT group showing decreased amplitudes, relative to trials in which the stimulus was correctly perceived. These results demonstrate measurable subject differences in the early processing of complex signals, due to possible effects of attention on the brainstem FFR. The present research shows that the FFR is useful in understanding human language as it is coded and processed in the brainstem auditory pathway.
Predictability Effects on Durations of Content and Function Words in Conversational English
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bell, Alan; Brenier, Jason; Gregory, Michelle L.
Content and function word duration are affected differently by their frequency and predictability. Regression analyses of conversational speech show that content words are shorter when they are more frequent, but function words are not. Repeated content words are shorter, but function words are not. Furthermore, function words have shorter pronunciations, after controlling for frequency and predictability. Both content and function words are strongly affected by predictability from the word following them, and only very frequent function words show sensitivity to predictability from the preceding word. The results support the view that content and function words are accessed by different production mechanisms. We argue that words’ form differences due to frequency or repetition stem from their faster or slower lexical access, mediated by a general mechanism that coordinates the pace of higher-level planning and the execution of the articulatory plan.
ERIC Educational Resources Information Center
Caza, Nicole; Moscovitch, Morris
2005-01-01
The purpose of this study was to investigate the issue of age-limited learning effects on visual lexical decision in normal and pathological aging, by using words with different frequency trajectories and cumulative frequencies. We selected words that objectively changed in frequency trajectory from an early word count (Thorndike, 1921, 1932;…
Scaltritti, Michele; Balota, David A; Peressotti, Francesca
2013-01-01
Stimulus quality and word frequency produce additive effects in lexical decision performance, whereas the semantic priming effect interacts with both stimulus quality and word frequency effects. This pattern places important constraints on models of visual word recognition. In Experiment 1, all three variables were investigated within a single speeded pronunciation study. The results indicated that the joint effects of stimulus quality and word frequency were dependent upon prime relatedness. In particular, an additive effect of stimulus quality and word frequency was found after related primes, and an interactive effect was found after unrelated primes. It was hypothesized that this pattern reflects an adaptive reliance on related prime information within the experimental context. In Experiment 2, related primes were eliminated from the list, and the interactive effects of stimulus quality and word frequency found following unrelated primes in Experiment 1 reverted to additive effects for the same unrelated prime conditions. The results are supportive of a flexible lexical processor that adapts to both local prime information and global list-wide context.
Pérez, Miguel A
2007-01-01
The aim of this study was to address the effect of objective age of acquisition (AoA) on picture-naming latencies when different measures of frequency (cumulative and adult word frequency) and frequency trajectory are taken into account. A total of 80 Spanish participants named a set of 178 pictures. Several multiple regression analyses assessed the influence of AoA, word frequency, frequency trajectory, object familiarity, name agreement, image agreement, image variability, name length, and orthographic neighbourhood density on naming times. The results revealed that AoA is the main predictor of picture-naming times. Cumulative frequency and adult word frequency (written or spoken) appeared as important factors in picture naming, but frequency trajectory and object familiarity did not. Other significant variables were image agreement, image variability, and neighbourhood density. These results (a) provide additional evidence of the predictive power of AoA in naming times independent of word-frequency and (b) suggest that image variability and neighbourhood density should also be taken into account in models of lexical production.
Dependence of exponents on text length versus finite-size scaling for word-frequency distributions
NASA Astrophysics Data System (ADS)
Corral, Álvaro; Font-Clos, Francesc
2017-08-01
Some authors have recently argued that a finite-size scaling law for the text-length dependence of word-frequency distributions cannot be conceptually valid. Here we give solid quantitative evidence for the validity of this scaling law, using both careful statistical tests and analytical arguments based on the generalized central-limit theorem applied to the moments of the distribution (and obtaining a novel derivation of Heaps' law as a by-product). We also find that the picture of word-frequency distributions with power-law exponents that decrease with text length [X. Yan and P. Minnhagen, Physica A 444, 828 (2016), 10.1016/j.physa.2015.10.082] does not stand with rigorous statistical analysis. Instead, we show that the distributions are perfectly described by power-law tails with stable exponents, whose values are close to 2, in agreement with the classical Zipf's law. Some misconceptions about scaling are also clarified.
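The statement that exponents close to 2 agree with the classical Zipf's law follows from a standard change of variables between the rank-frequency form and the frequency-distribution form. The derivation below is a textbook sketch rather than the authors' argument; the symbols r (rank), n (number of occurrences), alpha, and gamma are notation introduced here.

```latex
% Rank-frequency form (Zipf): the word of rank r occurs n(r) times
n(r) \propto r^{-\alpha}, \qquad \alpha \approx 1 .

% The rank of a word equals the number of word types that occur at least n times
r(n) \propto n^{-1/\alpha} .

% Differentiating gives the density of word types with exactly n occurrences
f(n) = -\frac{\mathrm{d}r}{\mathrm{d}n} \propto n^{-(1 + 1/\alpha)} \equiv n^{-\gamma},
\qquad \gamma = 1 + \frac{1}{\alpha} .

% Hence \alpha = 1 corresponds to \gamma = 2
```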
The dependence of frequency distributions on multiple meanings of words, codes and signs
NASA Astrophysics Data System (ADS)
Yan, Xiaoyong; Minnhagen, Petter
2018-01-01
The dependence of the frequency distributions due to multiple meanings of words in a text is investigated by deleting letters. By coding the words with fewer letters the number of meanings per coded word increases. This increase is measured and used as an input in a predictive theory. For a text written in English, the word-frequency distribution is broad and fat-tailed, whereas if the words are only represented by their first letter the distribution becomes exponential. Both distributions are well predicted by the theory, as is the whole sequence obtained by consecutively representing the words by the first L = 6, 5, 4, 3, 2, 1 letters. Comparisons of texts written by Chinese characters and the same texts written by letter-codes are made and the similarity of the corresponding frequency-distributions are interpreted as a consequence of the multiple meanings of Chinese characters. This further implies that the difference of the shape for word-frequencies for an English text written by letters and a Chinese text written by Chinese characters is due to the coding and not to the language per se.
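The letter-deletion coding can be sketched as truncating every word to its first L letters and recounting. The function names and the toy text below are hypothetical, and the average number of distinct words per code is used here only as a rough stand-in for the "meanings per coded word" that the abstract refers to.

```python
from collections import Counter


def coded_frequencies(tokens, n_letters):
    """Frequency of each code, where a code is a word's first `n_letters` letters."""
    return Counter(token[:n_letters] for token in tokens)


def mean_words_per_code(tokens, n_letters):
    """Average number of distinct words that collapse onto the same code."""
    codes = {}
    for token in set(tokens):
        codes.setdefault(token[:n_letters], set()).add(token)
    if not codes:
        raise ValueError("tokens must not be empty")
    return sum(len(words) for words in codes.values()) / len(codes)


# Toy text: 10 tokens, 8 distinct words.
text = "the cat sat on the mat then the cow came".split()
first_letter_counts = coded_frequencies(text, 1)
```

As L shrinks, more distinct words share a code, so each code stands for more words: in the toy text the average rises from 1.0 at L = 6 to 1.6 at L = 1.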
Spreading activation in nonverbal memory networks.
Foster, Paul S; Wakefield, Candias; Pryjmak, Scott; Roosa, Katelyn M; Branch, Kaylei K; Drago, Valeria; Harrison, David W; Ruff, Ronald
2017-09-01
Theories of spreading activation primarily involve semantic memory networks. However, the existence of separate verbal and visuospatial memory networks suggests that spreading activation may also occur in visuospatial memory networks. The purpose of the present investigation was to explore this possibility. Specifically, this study sought to create and describe the design frequency corpus and to determine whether this measure of visuospatial spreading activation was related to right hemisphere functioning and spreading activation in verbal memory networks. We used word frequencies taken from the Controlled Oral Word Association Test and design frequencies taken from the Ruff Figural Fluency Test as measures of verbal and visuospatial spreading activation, respectively. Average word and design frequencies were then correlated with measures of left and right cerebral functioning. The results indicated that a significant relationship exists between performance on a test of right posterior functioning (Block Design) and design frequency. A significant negative relationship also exists between spreading activation in semantic memory networks and design frequency. Based on our findings, the hypotheses were supported. Further research will need to be conducted to examine whether spreading activation exists in visuospatial memory networks as well as the parameters that might modulate this spreading activation, such as the influence of neurotransmitters.
Kamp, Siri-Maria; Brumback, Ty; Donchin, Emanuel
2013-11-01
We examined the degree to which ERP components elicited by items that are isolated from their context, either by their font size ("size isolates") or by their frequency of usage, are correlated with subsequent immediate recall. Study lists contained (a) 15 words including a size isolate, (b) 14 high frequency (HF) words with one low frequency word ("LF isolate"), or (c) 14 LF words with one HF word. We used spatiotemporal PCA to quantify ERP components. We replicated previously reported P300 subsequent memory effects for size isolates and found additional correlations with recall in the novelty P3, a right lateralized positivity, and a left lateralized slow wave that was distinct from the slow wave correlated with recall for nonisolates. LF isolates also showed evidence of a P300 subsequent memory effect and also elicited the left lateralized subsequent memory effect, supporting a role of distinctiveness in word frequency effects in recall. Copyright © 2013 Society for Psychophysiological Research.
ERP correlates of letter identity and letter position are modulated by lexical frequency
Vergara-Martínez, Marta; Perea, Manuel; Gómez, Pablo; Swaab, Tamara Y.
2013-01-01
The encoding of letter position is a key aspect in all recently proposed models of visual-word recognition. We analyzed the impact of lexical frequency on letter position assignment by examining the temporal dynamics of lexical activation induced by pseudowords extracted from words of different frequencies. For each word (e.g., BRIDGE), we created two pseudowords: A transposed-letter (TL: BRIGDE) and a replaced-letter pseudoword (RL: BRITGE). ERPs were recorded while participants read words and pseudowords in two tasks: Semantic categorization (Experiment 1) and lexical decision (Experiment 2). For high-frequency stimuli, similar ERPs were obtained for words and TL-pseudowords, but the N400 component to words was reduced relative to RL-pseudowords, indicating less lexical/semantic activation. In contrast, TL- and RL-pseudowords created from low-frequency stimuli elicited similar ERPs. Behavioral responses in the lexical decision task paralleled this asymmetry. The present findings impose constraints on computational and neural models of visual-word recognition. PMID:23454070
A Web-based interface to calculate phonotactic probability for words and nonwords in English
VITEVITCH, MICHAEL S.; LUCE, PAUL A.
2008-01-01
Phonotactic probability refers to the frequency with which phonological segments and sequences of phonological segments occur in words in a given language. We describe one method of estimating phonotactic probabilities based on words in American English. These estimates of phonotactic probability have been used in a number of previous studies and are now being made available to other researchers via a Web-based interface. Instructions for using the interface, as well as details regarding how the measures were derived, are provided in the present article. The Phonotactic Probability Calculator can be accessed at http://www.people.ku.edu/~mvitevit/PhonoProbHome.html. PMID:15641436
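The definition in the first sentence can be illustrated with a deliberately simplified estimate: how often each segment occurs in each word position across a lexicon. This sketch is an assumption-laden toy, not the calculator's method, whose derivation is given in the article itself and not in this abstract. The four-word lexicon, the phoneme symbols, the unweighted type counts, and the summed word score are all hypothetical choices.

```python
from collections import Counter


def positional_probabilities(lexicon):
    """Probability of each segment at each position, counted over word types."""
    segment_counts = Counter()
    position_totals = Counter()
    for word in lexicon:
        for position, segment in enumerate(word):
            segment_counts[(position, segment)] += 1
            position_totals[position] += 1
    return {
        key: count / position_totals[key[0]]
        for key, count in segment_counts.items()
    }


def word_score(word, probabilities):
    """Sum of positional segment probabilities; unseen segments contribute 0."""
    return sum(
        probabilities.get((position, segment), 0.0)
        for position, segment in enumerate(word)
    )


# Hypothetical lexicon, each word given as a tuple of phoneme symbols.
lexicon = [("k", "ae", "t"), ("k", "ae", "p"), ("b", "ae", "t"), ("d", "o", "g")]
probabilities = positional_probabilities(lexicon)
word = word_score(("k", "ae", "t"), probabilities)
nonword = word_score(("b", "ae", "p"), probabilities)
```

Because the score depends only on segments and positions, it can be computed for nonwords as well as words.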
Ota, Mitsuhiko; Green, Sam J
2013-06-01
Although it has often been hypothesized that children learn to produce new sound patterns first in frequently heard words, the available evidence in support of this claim is inconclusive. To re-examine this question, we conducted a survival analysis of word-initial consonant clusters produced by three children in the Providence Corpus (0;11-4;0). The analysis took account of several lexical factors in addition to lexical input frequency, including the age of first production, production frequency, neighborhood density and number of phonemes. The results showed that lexical input frequency was a significant predictor of the age at which the accuracy level of cluster production in each word first reached 80%. The magnitude of the frequency effect differed across cluster types. Our findings indicate that some of the between-word variance found in the development of sound production can indeed be attributed to the frequency of words in the child's ambient language.
ERIC Educational Resources Information Center
Criss, Amy H.; Malmberg, Kenneth J.
2008-01-01
One of the most studied and least well understood phenomena in episodic memory is the word frequency effect (WFE). The WFE is expressed as a mirror pattern where uncommon low frequency words (LF) are better recognized than common high frequency words (HF) by way of a higher HR and lower FAR. One explanation for the HR difference is the early-phase…
Problem Solving Frameworks for Mathematics and Software Development
ERIC Educational Resources Information Center
McMaster, Kirby; Sambasivam, Samuel; Blake, Ashley
2012-01-01
In this research, we examine how problem solving frameworks differ between Mathematics and Software Development. Our methodology is based on the assumption that the words used frequently in a book indicate the mental framework of the author. We compared word frequencies in a sample of 139 books that discuss problem solving. The books were grouped…
Predicting Lexical Proficiency in Language Learner Texts Using Computational Indices
ERIC Educational Resources Information Center
Crossley, Scott A.; Salsbury, Tom; McNamara, Danielle S.; Jarvis, Scott
2011-01-01
The authors present a model of lexical proficiency based on lexical indices related to vocabulary size, depth of lexical knowledge, and accessibility to core lexical items. The lexical indices used in this study come from the computational tool Coh-Metrix and include word length scores, lexical diversity values, word frequency counts, hypernymy…
An Investigation into the Use of Word Frequency Lists in Computing Vocabulary Profiles.
ERIC Educational Resources Information Center
Coniam, David
1999-01-01
Investigates word frequency as an indicator of language proficiency in the written English of Grade 13 learners of English in Hong Kong. The study develops Laufer and Nation's (1995) work on Lexical Frequency Profile in which student writing was analyzed for the frequency of word families, with vocabulary profiles produced from the scripts on the…
NASA Astrophysics Data System (ADS)
Balbin, Jessie R.; Padilla, Dionis A.; Fausto, Janette C.; Vergara, Ernesto M.; Garcia, Ramon G.; Delos Angeles, Bethsedea Joy S.; Dizon, Neil John A.; Mardo, Mark Kevin N.
2017-02-01
This research is about translating a series of hand gestures to form a word and produce its equivalent sound as it is read and said in a Filipino accent, using Support Vector Machine and Mel Frequency Cepstral Coefficient analysis. The concept is to detect Filipino speech input and translate the spoken words to their text form in Filipino. This study aims to help the Filipino deaf community impart their thoughts through hand gestures and communicate with people who do not know how to read hand gestures. It also helps literate deaf people to simply read the spoken words relayed to them using the Filipino speech-to-text system.
Vaden, Kenneth I; Kuchinsky, Stefanie E; Keren, Noam I; Harris, Kelly C; Ahlstrom, Jayne B; Dubno, Judy R; Eckert, Mark A
2011-11-01
The left inferior frontal gyrus (LIFG) exhibits increased responsiveness when people listen to words composed of speech sounds that frequently co-occur in the English language (Vaden, Piquado, & Hickok, 2011), termed high phonotactic frequency (Vitevitch & Luce, 1998). The current experiment aimed to further characterize the relation of phonotactic frequency to LIFG activity by manipulating word intelligibility in participants of varying age. Thirty-six native English speakers, 19-79 years old (mean=50.5, sd=21.0), indicated with a button press whether they recognized 120 binaurally presented consonant-vowel-consonant words during a sparse sampling fMRI experiment (TR=8 s). Word intelligibility was manipulated by low-pass filtering (cutoff frequencies of 400 Hz, 1000 Hz, 1600 Hz, and 3150 Hz). Group analyses revealed a significant positive correlation between phonotactic frequency and LIFG activity, which was unaffected by age and hearing thresholds. A region of interest analysis revealed that the relation between phonotactic frequency and LIFG activity was significantly strengthened for the most intelligible words (low-pass cutoff at 3150 Hz). These results suggest that the responsiveness of the left inferior frontal cortex to phonotactic frequency reflects the downstream impact of word recognition rather than support of word recognition, at least when there are no speech production demands. Published by Elsevier Ltd.
Effects of lexical competition on immediate memory span for spoken words.
Goh, Winston D; Pisoni, David B
2003-08-01
Current theories and models of the structural organization of verbal short-term memory are primarily based on evidence obtained from manipulations of features inherent in the short-term traces of the presented stimuli, such as phonological similarity. In the present study, we investigated whether properties of the stimuli that are not inherent in the short-term traces of spoken words would affect performance in an immediate memory span task. We studied the lexical neighbourhood properties of the stimulus items, which are based on the structure and organization of words in the mental lexicon. The experiments manipulated lexical competition by varying the phonological neighbourhood structure (i.e., neighbourhood density and neighbourhood frequency) of the words on a test list while controlling for word frequency and intra-set phonological similarity (family size). Immediate memory span for spoken words was measured under repeated and nonrepeated sampling procedures. The results demonstrated that lexical competition only emerged when a nonrepeated sampling procedure was used and the participants had to access new words from their lexicons. These findings were not dependent on individual differences in short-term memory capacity. Additional results showed that the lexical competition effects did not interact with proactive interference. Analyses of error patterns indicated that item-type errors, but not positional errors, were influenced by the lexical attributes of the stimulus items. These results complement and extend previous findings that have argued for separate contributions of long-term knowledge and short-term memory rehearsal processes in immediate verbal serial recall tasks.
ERIC Educational Resources Information Center
Wang, Min; Koda, Keiko
2005-01-01
This study examined word identification skills among Chinese and Korean college students learning to read English as a second language in a naming experiment and an auditory category judgment task. Both groups demonstrated faster and more accurate naming performance on high-frequency words than low-frequency words and on regular words than…
Word Frequency As a Cue For Identifying Function Words In Infancy
ERIC Educational Resources Information Center
Hochmann, Jean-Remy; Endress, Ansgar D.; Mehler, Jacques
2010-01-01
While content words (e.g., 'dog') tend to carry meaning, function words (e.g., 'the') mainly serve syntactic purposes. Here, we ask whether 17-month old infants can use one language-universal cue to identify function word candidates: their high frequency of occurrence. In Experiment 1, infants listened to a series of short, naturally recorded…
Lexico-Semantic Structure and the Word-Frequency Effect in Recognition Memory
ERIC Educational Resources Information Center
Monaco, Joseph D.; Abbott, L. F.; Kahana, Michael J.
2007-01-01
The word-frequency effect (WFE) in recognition memory refers to the finding that rarer words are better recognized than more common words. We demonstrate that a familiarity-discrimination model operating on data from a semantic word-association space yields a robust WFE in data on both hit rates and false-alarm rates. Our modeling results…
Schotter, Elizabeth R.; Leinenger, Mallorie
2016-01-01
Current theories of eye movement control in reading posit that processing of an upcoming parafoveal preview word is used to facilitate processing of that word once it is fixated (i.e., a foveal target word). This preview benefit is demonstrated by shorter fixation durations in the case of valid (i.e., identical or linguistically similar) compared to invalid (i.e., dissimilar) preview conditions. However, we suggest that processing of the preview can directly influence fixation behavior on the target, independent of similarity between them. In Experiment 1, unrelated high and low frequency words were used as orthogonally crossed previews and targets and we observed a reversed preview benefit for low frequency targets—shorter fixation durations with an invalid, higher frequency preview compared to a valid, low frequency preview. In Experiment 2, the target words were replaced with orthographically legal and illegal nonwords and we found a similar effect of preview frequency on fixation durations on the targets, as well as a bimodal distribution in the illegal nonword target conditions with a denser early peak for high than low frequency previews. In Experiment 3, nonwords were used as previews for high and low frequency targets, replicating standard findings that “denied” preview increases fixation durations and the influence of target properties. These effects can be explained by forced fixations, cases in which fixations on the target were shortened as a consequence of the timing of word recognition of the preview relative to the time course of saccade programming to that word from the prior one. That is, the preview word was (at least partially) recognized so that it should have been skipped, but the word could not be skipped because the saccade to that word was in a non-labile stage. In these cases, the system pre-initiates the subsequent saccade off the upcoming word to the following word and the intervening fixation is short. PMID:27732044
De Luca, Maria; Barca, Laura; Burani, Cristina; Zoccolotti, Pierluigi
2008-12-01
This study examined the effect of word length and several sublexical and lexico-semantic variables on the reading of Italian children with a developmental reading deficit. Previous studies indicated the role of word length in transparent orthographies. However, several factors that may interact with word length were not controlled for. Seventeen impaired and 34 skilled sixth-grade readers were presented with words of different lengths, matched for initial phoneme, bigram frequency, word frequency, age of acquisition, and imageability. Participants were asked to read aloud, as quickly and as accurately as possible. Reaction times at the onset of pronunciation and mispronunciations were recorded. Impaired readers' reaction times indicated a marked effect of word length; in skilled readers, there was no length effect for short words but, rather, a monotonic increase from 6-letter words on. Regression analyses confirmed the role of word length and indicated the influence of word frequency (similar in impaired and skilled readers). No other variables predicted reading latencies. Word length differentially influenced word recognition in impaired versus skilled readers, irrespective of the action of (potentially interfering) sublexical, lexical, and semantic variables. It is proposed that the locus of the length effect is at a perceptual level of analysis. The independent influence of word frequency on the reading performance of both groups of participants indicates the sparing of lexical activation in impaired readers.
The neural bases of the learning and generalization of morphological inflection.
Nevat, Michael; Ullman, Michael T; Eviatar, Zohar; Bitan, Tali
2017-04-01
Affixal inflectional morphology has been intensively examined as a model of productive aspects of language. Nevertheless, little is known about the neurocognition of the learning and generalization of affixal inflection, or the influence of certain factors that may affect these processes. In an event-related fMRI study, we examined the neurocognition of the learning and generalization of plural inflections in an artificial language, as well as the influence of both affix type frequency (the proportion of words receiving a given affix) and affix predictability (based on phonological cues in the stem). Adult participants were trained in three sessions, and were scanned after the first and last sessions while inflecting trained and untrained words. Untrained words yielded more activation than trained words in medial frontal (including pre-SMA) and left inferior frontal cortices, which have previously shown activation in compositional grammatical processing. A reliance on phonological cues for untrained word inflection correlated positively with pre-SMA activation, but negatively with activation in the pars triangularis. Thus, pre-SMA may be involved in phonological cue-based composition, while the pars triangularis underlies alternative processes. Inflecting trained items yielded activation in the caudate head bilaterally, only in the first session, consistent with a role for procedural memory in learning grammatical regularities. The medial frontal and left inferior regions activated by untrained items were also activated by trained items, but more weakly than untrained items, with weakest activation for trained-items taking the high-frequency affix. This suggests less involvement of compositional processes for inflecting trained than untrained items, and least of all for trained inflected forms with high-frequency affixes, consistent with the storage of such forms (e.g., in declarative memory). 
Overall, the findings further elucidate the neural bases of the learning and generalization of affixal morphology, and the roles of affix type frequency and affix phonological predictability in these processes. Moreover, the results support and further specify the declarative/procedural model, in particular in adult language learning. Copyright © 2016 Elsevier Ltd. All rights reserved.
Newman, Rochelle S; Bernstein Ratner, Nan
2007-02-01
The purpose of this study was to investigate whether lexical access in adults who stutter (AWS) differs from that in people who do not stutter. Specifically, the authors examined the role of 3 lexical factors on naming speed, accuracy, and fluency: word frequency, neighborhood density, and neighborhood frequency. If stuttering results from an impairment in lexical access, these factors were hypothesized to differentially affect AWS performance on a confrontation naming task. Twenty-five AWS and 25 normally fluent comparison speakers, matched for age and education, participated in a confrontation naming task designed to explore within-speaker performance on naming accuracy, speed, and fluency based on stimulus word frequency and neighborhood characteristics. Accuracy, fluency, and reaction time (from acoustic waveform analysis) were computed. In general, AWS demonstrated the same effects of lexical factors on their naming as did adults who do not stutter. However, accuracy of naming was reduced for AWS. Stuttering rate was influenced by word frequency but not other factors. Results suggest that AWS could have a fundamental deficit in lexical retrieval, but this deficit is unlikely to be at the level of the word's abstract phonological representation. Implications for further research are discussed.
ERIC Educational Resources Information Center
Shantz, Kailen
2017-01-01
This study reports on a self-paced reading experiment in which native and non-native speakers of English read sentences designed to evaluate the predictions of usage-based and rule-based approaches to second language acquisition (SLA). Critical stimuli were four-word sequences embedded into sentences in which phrase frequency and grammaticality…
Word Effects in Dual-Task Studies Using Lexical Decision and Naming as Task 2
NASA Technical Reports Server (NTRS)
Remington, Roger; McCann, Robert S.; VanSelst, Mark; Shafto, Michael (Technical Monitor)
1997-01-01
Word frequency effects in dual-task, lexical decision are variously reported to be additive or under-additive across SOA. We replicate and extend earlier lexical decision studies and find word frequency to be additive across SOA. To more directly capture lexical processing, we examine dual-task naming. Once again we find word frequency to be additive across SOA. Lexical processing appears to be constrained by central processing limitations.
Using Internet search engines to estimate word frequency.
Blair, Irene V; Urland, Geoffrey R; Ma, Jennifer E
2002-05-01
The present research investigated Internet search engines as a rapid, cost-effective alternative for estimating word frequencies. Frequency estimates for 382 words were obtained and compared across four methods: (1) Internet search engines, (2) the Kucera and Francis (1967) analysis of a traditional linguistic corpus, (3) the CELEX English linguistic database (Baayen, Piepenbrock, & Gulikers, 1995), and (4) participant ratings of familiarity. The results showed that Internet search engines produced frequency estimates that were highly consistent with those reported by Kucera and Francis and those calculated from CELEX, highly consistent across search engines, and very reliable over a 6-month period of time. Additional results suggested that Internet search engines are an excellent option when traditional word frequency analyses do not contain the necessary data (e.g., estimates for forenames and slang). In contrast, participants' familiarity judgments did not correspond well with the more objective estimates of word frequency. Researchers are advised to use search engines with large databases (e.g., AltaVista) to ensure the greatest representativeness of the frequency estimates.
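As a rough illustration of how agreement between two frequency sources can be checked, the sketch below correlates log-transformed counts from two lists. It is not the authors' procedure: the counts are invented, the names `pearson`, `log_frequency`, `search_hits`, and `corpus_counts` are hypothetical, and the log10(count + 1) transform is an assumption chosen so that zero counts remain defined.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_frequency(counts):
    """Log10-transform raw counts; adding 1 keeps zero counts defined."""
    return [math.log10(c + 1) for c in counts]


# Invented counts for five words (illustration only, not data from the study).
search_hits = [1200000, 450000, 30000, 900, 75]   # hypothetical search-engine hits
corpus_counts = [520, 210, 18, 2, 0]              # hypothetical corpus counts

r = pearson(log_frequency(search_hits), log_frequency(corpus_counts))
print(f"correlation of log frequencies: {r:.3f}")
```

A high correlation on the log scale would indicate that the two sources rank and space the words similarly, which is the kind of consistency the abstract reports.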
A Whole Word and Number Reading Machine Based on Two Dimensional Low Frequency Fourier Transforms
1990-12-01
they are energy normalized. The normalization process accounts for brightness variations and is equivalent to graphing each 2DFT onto the surface of an n...determined empirically (trial and error). Each set is energy normalized based on the number of coefficients within the set. Therefore, the actual...using the 6 font group case with the top 1000 words, where the energy has been renormalized based on the particular number of coefficients being used
Event Recognition Based on Deep Learning in Chinese Texts
Zhang, Yajun; Liu, Zongtian; Zhou, Wen
2016-01-01
Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%. PMID:27501231
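For readers unfamiliar with the F-measure quoted above, the following minimal sketch shows how it is conventionally computed from precision and recall. The function name `f_measure` and the counts are hypothetical and are not taken from the paper; the abstract does not state which beta weighting was used, so the balanced case (beta = 1) is assumed.

```python
def f_measure(true_pos, false_pos, false_neg, beta=1.0):
    """F-measure from counts of true positives, false positives, false negatives.

    beta = 1.0 weights precision and recall equally (the usual F1 score).
    Returns 0.0 when precision and recall are both undefined or zero.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


# Invented counts for a trigger-word recognizer (illustration only).
score = f_measure(true_pos=85, false_pos=15, false_neg=15)
print(f"F-measure: {score:.2%}")
```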
The Effects of Word Frequency and Context Variability in Cued Recall
ERIC Educational Resources Information Center
Criss, Amy H.; Aue, William R.; Smith, Larissa
2011-01-01
Normative word frequency and context variability affect memory in a range of episodic memory tasks and place constraints on theoretical development. In four experiments, we independently manipulated the word frequency and context variability of the targets (to-be-generated items) and cues in a cued recall paradigm. We found that high frequency…
Zur Wortbildung in wissenschaftlichen Texten (Word Formation in Scientific Texts)
ERIC Educational Resources Information Center
Rogalla, Hanna; Rogalla, Willy
1976-01-01
Discusses a German frequency list of 1,500 to 2,000 scientific words, which is being developed, and the importance of learning word-building principles. Substantive and adjective suffixes are listed according to frequency, followed by remarks on copulative compounds, with examples and frequency ranking, and, finally, prefixes. (Text is in German.)…
Conrad, Markus; Carreiras, Manuel; Tamm, Sascha; Jacobs, Arthur M
2009-04-01
Over the last decade, there has been increasing evidence for syllabic processing during visual word recognition. If syllabic effects prove to be independent from orthographic redundancy, this would seriously challenge the ability of current computational models to account for the processing of polysyllabic words. Three experiments are presented to disentangle effects of the frequency of syllabic units and orthographic segments in lexical decision. In Experiment 1 the authors obtained an inhibitory syllable frequency effect that was unaffected by the presence or absence of a bigram trough at the syllable boundary. In Experiments 2 and 3 an inhibitory effect of initial syllable frequency but a facilitative effect of initial bigram frequency emerged when manipulating 1 of the 2 measures and controlling for the other in Spanish words starting with consonant-vowel syllables. The authors conclude that effects of syllable frequency and letter-cluster frequency are independent and arise at different processing levels of visual word recognition. Results are discussed within the framework of an interactive activation model of visual word recognition. (c) 2009 APA, all rights reserved.
Keane, Margaret M; Martin, Elizabeth; Verfaellie, Mieke
2009-07-01
Accuracy in identifying a perceptually degraded word (e.g., stake) can be either enhanced by recent exposure to the same stimulus or reduced by recent exposure to a similar stimulus (e.g., stare). In the present study, we explored the mechanisms underlying these benefits and costs by examining the performance of amnesic and control groups in a forced choice perceptual identification (FCPI) task in which briefly flashed words (that were identical to studied words, similar to studied words, or new) had to be identified, and two response choices were provided that differed from each other by one letter. Control participants showed a performance benefit and cost in FCPI with both high- and low-frequency words. Amnesic participants showed a benefit (but no cost) with high-frequency words and a benefit and a cost with low-frequency words. The benefit/cost pattern with low-frequency words in amnesia was obtained even when the to-be-identified stimulus in the FCPI task was eliminated (Experiment 2), suggesting that this effect was driven by processes operating at the level of the response choices. Our findings suggest that implicit memory effects in FCPI reflect the operation of multiple mechanisms, the relative contributions of which may vary with the frequency of the test stimuli. The results also highlight the need for caution in interpreting results from normal participants in the FCPI task, since those findings may reflect a contribution of explicit memory processes.
Acute Alcohol Effects on Repetition Priming and Word Recognition Memory with Equivalent Memory Cues
ERIC Educational Resources Information Center
Ray, Suchismita; Bates, Marsha E.
2006-01-01
Acute alcohol intoxication effects on memory were examined using a recollection-based word recognition memory task and a repetition priming task of memory for the same information without explicit reference to the study context. Memory cues were equivalent across tasks; encoding was manipulated by varying the frequency of occurrence (FOC) of words…
Spelling Test Generator--Volume 1: English. [CD-ROM].
ERIC Educational Resources Information Center
Aud, Joel; DeWolfe, Rosemary; Gintz, Christopher; Griswold, Scott; Hefter, Richard; Lowery, Adam; Richards, Maureen; Yi, Song Choi
This software product makes it easy to select the more than 3000 most commonly used words in the English language and work them into various activities for elementary and middle school students. Users of the program have a variety of options: the program can automatically select words based on their age/grade level, frequency of…
Winsler, Kurt; Holcomb, Phillip J; Midgley, Katherine J; Grainger, Jonathan
2017-01-01
Previous studies have shown that different spatial frequency information processing streams interact during the recognition of visual stimuli. However, it is a matter of debate as to the contributions of high and low spatial frequency (HSF and LSF) information for visual word recognition. This study examined the role of different spatial frequencies in visual word recognition using event-related potential (ERP) masked priming. EEG was recorded from 32 scalp sites in 30 English-speaking adults in a go/no-go semantic categorization task. Stimuli were white characters on a neutral gray background. Targets were uppercase five-letter words preceded by a forward-mask (#######) and a 50 ms lowercase prime. Primes were either the same word (repeated) or a different word (un-repeated) than the subsequent target and either contained only high, only low, or full spatial frequency information. Additionally, within each condition, half of the prime-target pairs were high lexical frequency, and half were low. In the full spatial frequency condition, typical ERP masked priming effects were found with an attenuated N250 (sub-lexical) and N400 (lexical-semantic) for repeated compared to un-repeated primes. For HSF primes there was a weaker N250 effect which interacted with lexical frequency, a significant reversal of the effect around 300 ms, and an N400-like effect for only high lexical frequency word pairs. LSF primes did not produce any of the classic ERP repetition priming effects; however, they did elicit a distinct early effect around 200 ms in the opposite direction of typical repetition effects. HSF information accounted for many of the masked repetition priming ERP effects and therefore suggests that HSFs are more crucial for word recognition. However, LSFs did produce their own pattern of priming effects, indicating that larger scale information may still play a role in word recognition.
Strength of word-specific neural memory traces assessed electrophysiologically.
Alexandrov, Alexander A; Boricheva, Daria O; Pulvermüller, Friedemann; Shtyrov, Yury
2011-01-01
Memory traces for words are frequently conceptualized neurobiologically as networks of neurons interconnected via reciprocal links developed through associative learning in the process of language acquisition. Neurophysiological reflection of activation of such memory traces has been reported using the mismatch negativity brain potential (MMN), which demonstrates an enhanced response to meaningful words over meaningless items. This enhancement is believed to be generated by the activation of strongly intraconnected long-term memory circuits for words that can be automatically triggered by spoken linguistic input and that are absent for unfamiliar phonological stimuli. This conceptual framework critically predicts different amounts of activation depending on the strength of the word's lexical representation in the brain. The frequent use of words should lead to more strongly connected representations, whereas less frequent items would be associated with more weakly linked circuits. A word with higher frequency of occurrence in the subject's language should therefore lead to a more pronounced lexical MMN response than its low-frequency counterpart. We tested this prediction by comparing the event-related potentials elicited by low- and high-frequency words in a passive oddball paradigm; physical stimulus contrasts were kept identical. We found that, consistent with our prediction, presenting the high-frequency stimulus led to a significantly more pronounced MMN response relative to the low-frequency one, a finding that is highly similar to previously reported MMN enhancement to words over meaningless pseudowords. Furthermore, activation elicited by the higher-frequency word peaked earlier relative to low-frequency one, suggesting more rapid access to frequently used lexical entries. These results lend further support to the above view on word memory traces as strongly connected assemblies of neurons. 
The speed and magnitude of their activation appears to be linked to the strength of internal connections in a memory circuit, which is in turn determined by the everyday use of language elements.
Navarrete, Eduardo; Pastore, Massimiliano; Valentini, Rosa; Peressotti, Francesca
2015-10-01
A large body of evidence indicates that the age at which a word is acquired predicts the time required to retrieve that word during speech production. Here we explored whether age of acquisition also predicts the experience of being unable to produce a known word at a particular moment. Italian speakers named a sequence of pictures in Experiment 1 or retrieved a word as a response to a definition in Experiment 2. In both experiments, the participants were instructed to indicate when they were in a tip-of-the-tongue (TOT) state. Generalized mixed-effects models performed on the TOT and correct responses revealed that word frequency and age of acquisition predicted the TOT states. Specifically, low-frequency words elicited more TOTs than did high-frequency words, replicating previous findings. In addition, late-acquired words elicited more TOTs than did early-acquired words. Further analyses revealed that the age of acquisition was a better predictor of TOTs than was word frequency. The effects of age of acquisition were similar with subjective and objective measures of age of acquisition, and persisted when several psycholinguistic variables were taken into consideration as predictors in the generalized mixed-effects models. We explained these results in terms of weaker semantic-to-phonological connections in the speech production system for late-acquired words.
Automatic measurement and representation of prosodic features
NASA Astrophysics Data System (ADS)
Ying, Goangshiuan Shawn
Effective measurement and representation of prosodic features of the acoustic signal for use in automatic speech recognition and understanding systems is the goal of this work. Prosodic features (stress, duration, and intonation) are variations of the acoustic signal whose domains are beyond the boundaries of each individual phonetic segment. Listeners perceive prosodic features through a complex combination of acoustic correlates such as intensity, duration, and fundamental frequency (F0). We have developed new tools to measure F0 and intensity features. We apply a probabilistic global error correction routine to an Average Magnitude Difference Function (AMDF) pitch detector. A new short-term frequency-domain Teager energy algorithm is used to measure the energy of a speech signal. We have conducted a series of experiments performing lexical stress detection on words in continuous English speech from two speech corpora. We have experimented with two different approaches, a segment-based approach and a rhythm unit-based approach, in lexical stress detection. The first approach uses pattern recognition with energy- and duration-based measurements as features to build Bayesian classifiers to detect the stress level of a vowel segment. In the second approach we define a rhythm unit and use only the F0-based measurement and a scoring system to determine the stressed segment in the rhythm unit. A duration-based segmentation routine was developed to break polysyllabic words into rhythm units. The long-term goal of this work is to develop a system that can effectively detect the stress pattern for each word in continuous speech utterances. Stress information will be integrated as a constraint for pruning the word hypotheses in a word recognition system based on hidden Markov models.
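The two signal measures named in this abstract can be sketched in their textbook forms. This is only an illustration under stated assumptions: the thesis's probabilistic global error correction and its frequency-domain Teager energy algorithm are not reproduced here. The code uses the basic AMDF, D(lag) = mean |x[n] - x[n + lag]|, and the standard time-domain discrete Teager operator, x[n]^2 - x[n-1] * x[n+1]. The function names, the search range, and the tolerance used to prefer the shortest lag are all hypothetical choices.

```python
import math


def amdf(frame, lag):
    """Average Magnitude Difference Function of a frame at a given lag."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0, tol=1e-6):
    """Estimate F0 as sample_rate / lag, where lag is the shortest lag whose
    AMDF is within `tol` of the minimum over the searched range.

    Preferring the shortest such lag is a simple guard against picking a
    multiple of the true period; it is not the thesis's correction routine.
    """
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    values = {lag: amdf(frame, lag) for lag in range(min_lag, max_lag + 1)}
    lowest = min(values.values())
    best_lag = next(lag for lag in sorted(values) if values[lag] <= lowest + tol)
    return sample_rate / best_lag


def teager_energy(signal):
    """Time-domain discrete Teager energy for each interior sample."""
    return [signal[n] ** 2 - signal[n - 1] * signal[n + 1]
            for n in range(1, len(signal) - 1)]


# Synthetic 200 Hz sine sampled at 8 kHz (a stand-in for a voiced frame).
sample_rate = 8000
true_f0 = 200.0
frame = [math.sin(2 * math.pi * true_f0 * n / sample_rate) for n in range(400)]

f0_estimate = estimate_f0(frame, sample_rate)
energy = teager_energy(frame)
print(f"estimated F0: {f0_estimate:.1f} Hz")
print(f"mean Teager energy: {sum(energy) / len(energy):.5f}")
```

For a pure sinusoid the Teager operator is constant and depends on both amplitude and frequency, which is why it is often used as an energy measure that is sensitive to how fast the signal oscillates and not only to how large it is.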
ERIC Educational Resources Information Center
Crossley, Scott A.; Subtirelu, Nicholas; Salsbury, Tom
2013-01-01
This study examines frequency, contextual diversity, and contextual distinctiveness effects in predicting produced versus not-produced frequent nouns and verbs by early second language (L2) learners of English. The study analyzes whether word frequency is the strongest predictor of early L2 word production independent of contextual diversity and…
Colombo, Lucia; Fonti, Cristina; Cappa, Stefano
2004-01-01
The influence of lexical-semantic impairment and of executive dysfunction on word naming performance was investigated in a group of patients with probable Alzheimer dementia (AD). The patients, who varied in the severity of the illness, were tested in a word naming task where they had to read aloud Italian three-syllable words with a dominant or subordinate stress pattern. These types of words have been shown to interact with frequency in normal adults [J. Exp. Psychol.: Hum. Percept. Perform. 18 (4) (1992) 987], so that the effect of the subordinate stress pattern (slower reading times) is only apparent for low frequency words. The frequency and stress effects on accuracy increased across dementia severity levels. Regression analyses showed that the impairment in reading low frequency words with subordinate stress depended largely on the level of lexical-semantic impairment, measured by a test of semantic memory and comprehension. Implications for the current reading models are discussed.
Additive and Interactive Effects on Response Time Distributions in Visual Word Recognition
ERIC Educational Resources Information Center
Yap, Melvin J.; Balota, David A.
2007-01-01
Across 3 different word recognition tasks, distributional analyses were used to examine the joint effects of stimulus quality and word frequency on underlying response time distributions. Consistent with the extant literature, stimulus quality and word frequency produced additive effects in lexical decision, not only in the means but also in the…
The Influence of Word Frequency on Word Retrieval: Measuring Covert Behaviors
ERIC Educational Resources Information Center
Chih, Yu-Chun; Stierwalt, Julie A. G.; LaPointe, Leonard L.; Chih, Yu-Pin
2017-01-01
Physiological activities (heart rate and respiratory rate) during a word retrieval task were measured in normal participants. Word frequency demonstrated a significant effect on naming accuracy and latencies but not on physiological activities. These data will serve as a basis for comparison for individuals with a compromised language system.
Piai, Vitória; Roelofs, Ardi; Maris, Eric
2014-01-01
Two fundamental factors affecting the speed of spoken word production are lexical frequency and sentential constraint, but little is known about their timing and electrophysiological basis. In the present study, we investigated event-related potentials (ERPs) and oscillatory brain responses induced by these factors, using a task in which participants named pictures after reading sentences. Sentence contexts were either constraining or nonconstraining towards the final word, which was presented as a picture. Picture names varied in their frequency of occurrence in the language. Naming latencies and electrophysiological responses were examined as a function of context and lexical frequency. Lexical frequency is an index of our cumulative learning experience with words, so lexical-frequency effects most likely reflect access to memory representations for words. Pictures were named faster with constraining than nonconstraining contexts. Associated with this effect, starting around 400 ms pre-picture presentation, oscillatory power between 8 and 30 Hz was lower for constraining relative to nonconstraining contexts. Furthermore, pictures were named faster with high-frequency than low-frequency names, but only for nonconstraining contexts, suggesting differential ease of memory access as a function of sentential context. Associated with the lexical-frequency effect, starting around 500 ms pre-picture presentation, oscillatory power between 4 and 10 Hz was higher for high-frequency than for low-frequency names, but only for constraining contexts. Our results characterise electrophysiological responses associated with lexical frequency and sentential constraint in spoken word production, and point to new avenues for studying these fundamental factors in language production. © 2013 Published by Elsevier Ltd.
A model for evidence accumulation in the lexical decision task.
Wagenmakers, Eric-Jan; Steyvers, Mark; Raaijmakers, Jeroen G W; Shiffrin, Richard M; van Rijn, Hedderik; Zeelenberg, René
2004-05-01
We present a new model for lexical decision, REM-LD, that is based on REM theory. REM-LD uses a principled (i.e., Bayes' rule) decision process that simultaneously considers the diagnosticity of the evidence for the 'WORD' response and the 'NONWORD' response. The model calculates the odds ratio that the presented stimulus is a word or a nonword by averaging likelihood ratios for lexical entries from a small neighborhood of similar words. We report two experiments that used a signal-to-respond paradigm to obtain information about the time course of lexical processing. Experiment 1 verified the prediction of the model that the frequency of the word stimuli affects performance for nonword stimuli. Experiment 2 was done to study the effects of nonword lexicality, word frequency, and repetition priming and to demonstrate how REM-LD can account for the observed results. We discuss how REM-LD could be extended to account for effects of phonology such as the pseudohomophone effect, and how REM-LD can predict response times in the traditional 'respond-when-ready' paradigm.
A Comparative Usage-Based Approach to the Reduction of the Spanish and Portuguese Preposition "Para"
ERIC Educational Resources Information Center
Gradoville, Michael Stephen
2013-01-01
This study examines the frequency effect of two-word collocations involving "para" "to," "for" (e.g. "fui para," "para que") on the reduction of "para" to "pa" (in Spanish) and "pra" (in Portuguese). Collocation frequency effects demonstrate that language speakers…
Whitford, Veronica; Titone, Debra
2014-01-01
We used eye movement measures of paragraph reading to examine whether word frequency and predictability interact during the earliest stages of lexical processing, with a specific focus on whether these effects are modulated by individual differences in reading comprehension or launch site (i.e., saccade length between the prior and currently fixated word--a proxy for the amount of parafoveal word processing). The joint impact of frequency and predictability on reading will elucidate whether these variables additively or multiplicatively affect the earliest stages of lexical access, which, in turn, has implications for computational models of eye movements during reading. Linear mixed effects models revealed additive effects during both early- and late-stage reading, where predictability effects were comparable for low- and high-frequency words. Moreover, less cautious readers (e.g., readers who engaged in skimming, scanning, mindless reading) demonstrated smaller frequency effects than more cautious readers. Taken together, our findings suggest that during extended reading, frequency and predictability exert additive influences on lexical and postlexical processing, and that individual differences in reading comprehension modulate sensitivity to the effects of word frequency.
Variability in Word Duration as a Function of Probability, Speech Style, and Prosody
Baker, Rachel E.; Bradlow, Ann R.
2010-01-01
This article examines how probability (lexical frequency and previous mention), speech style, and prosody affect word duration, and how these factors interact. Participants read controlled materials in clear and plain speech styles. As expected, more probable words (higher frequencies and second mentions) were significantly shorter than less probable words, and words in plain speech were significantly shorter than those in clear speech. Interestingly, we found second mention reduction effects in both clear and plain speech, indicating that while clear speech is hyper-articulated, this hyper-articulation does not override probabilistic effects on duration. We also found an interaction between mention and frequency, but only in plain speech. High frequency words allowed more second mention reduction than low frequency words in plain speech, revealing a tendency to hypo-articulate as much as possible when all factors support it. Finally, we found that first mentions were more likely to be accented than second mentions. However, when these differences in accent likelihood were controlled, a significant second mention reduction effect remained. This supports the concept of a direct link between probability and duration, rather than a relationship solely mediated by prosodic prominence. PMID:20121039
Dynamic Self-Organization and Early Lexical Development in Children
ERIC Educational Resources Information Center
Li, Ping; Zhao, Xiaowei; MacWhinney, Brian
2007-01-01
In this study we present a self-organizing connectionist model of early lexical development. We call this model DevLex-II, based on the earlier DevLex model. DevLex-II can simulate a variety of empirical patterns in children's acquisition of words. These include a clear vocabulary spurt, effects of word frequency and length on age of acquisition,…
ERIC Educational Resources Information Center
Cornwell, Steve; Kakutani, Tomoko
1998-01-01
A project to develop word lists for first-year English as a second language instruction at Osaka Jogakuin Junior College (Japan) is described. The lists were drawn from high-frequency vocabulary lists, with word selection based on course unit themes and rhetorical patterns. These include: introduction/people and places; women's issues;…
Lin, Nan; Yu, Xi; Zhao, Ying; Zhang, Mingxia
2016-01-01
This fMRI study aimed to identify the neural mechanisms underlying the recognition of Chinese multi-character words by partialling out the confounding effect of reaction time (RT). For this purpose, a special type of nonword—transposable nonword—was created by reversing the character orders of real words. These nonwords were included in a lexical decision task along with regular (non-transposable) nonwords and real words. Through conjunction analysis on the contrasts of transposable nonwords versus regular nonwords and words versus regular nonwords, the confounding effect of RT was eliminated, and the regions involved in word recognition were reliably identified. The word-frequency effect was also examined in emerged regions to further assess their functional roles in word processing. Results showed significant conjunctional effect and positive word-frequency effect in the bilateral inferior parietal lobules and posterior cingulate cortex, whereas only conjunctional effect was found in the anterior cingulate cortex. The roles of these brain regions in recognition of Chinese multi-character words were discussed. PMID:26901644
The effect of orthographic and emotional neighbourhood in a colour categorization task.
Camblats, Anna-Malika; Mathey, Stéphanie
2016-02-01
This study investigated whether and how the strength of reading interference in a colour categorization task can be influenced by lexical competition and the emotional characteristics of words not directly presented. Previous findings showed inhibitory effects of high-frequency orthographic and emotional neighbourhood in the lexical decision task. Here, we examined the effect of orthographic neighbour frequency according to the emotional valence of the higher-frequency neighbour in an emotional orthographic Stroop paradigm. Stimuli were coloured neutral words that had either (1) no orthographic neighbour (e.g. PISTIL [pistil]), (2) one neutral higher-frequency neighbour (e.g. tirade [tirade]/TIRAGE [draw]) or (3) one negative higher-frequency neighbour (e.g. idiome [idiom]/IDIOTE [idiotic]). The results showed that colour categorization times were longer for words with no orthographic neighbour than for words with one neutral neighbour of higher frequency and even longer when the higher-frequency neighbour was neutral rather than negative. Thus, it appears not only that the orthographic neighbourhood of the coloured stimulus words intervenes in a colour categorization task, but also that the emotional content of the neighbour contributes to response times. These findings are discussed in terms of lexical competition between the stimulus word and non-presented orthographic neighbours, which in turn would modify the strength of reading interference on colour categorization times.
Localizing the Frequency x Regularity Word Reading Interaction in the Cerebral Cortex
ERIC Educational Resources Information Center
Cummine, Jacqueline; Sarty, Gordon E.; Borowsky, Ron
2010-01-01
The aim of this work is to combine behavioural and functional magnetic resonance imaging (fMRI) data to advance our knowledge of where the Frequency x Regularity interaction on word naming is located in the cerebral cortex. Participants named high and low frequency, regular and exception words in a behavioural lab (Experiment 1) and during an fMRI…
Hoffman, Paul; Jefferies, Elizabeth; Ralph, Matthew A Lambon
2011-02-01
More efficient processing of high frequency (HF) words is a ubiquitous finding in healthy individuals, yet frequency effects are often small or absent in stroke aphasia. We propose that some patients fail to show the expected frequency effect because processing of HF words places strong demands on semantic control and regulation processes, counteracting the usual effect. This may occur because HF words appear in a wide range of linguistic contexts, each associated with distinct semantic information. This theory predicts that in extreme circumstances, patients with impaired semantic control should show an outright reversal of the normal frequency effect. To test this prediction, we tested two patients with impaired semantic control with a delayed repetition task that emphasised activation of semantic representations. By alternating HF and low frequency (LF) trials, we demonstrated a significant repetition advantage for LF words, principally because of perseverative errors in which patients produced the previous LF response in place of the HF target. These errors indicated that HF words were more weakly activated than LF words. We suggest that when presented with no contextual information, patients generate a weak and unstable pattern of semantic activation for HF words because information relating to many possible contexts and interpretations is activated. In contrast, LF words are associated with more stable patterns of activation because similar semantic information is activated whenever they are encountered. Copyright © 2011 Elsevier Ltd. All rights reserved.
Contextual diversity is a main determinant of word identification times in young readers.
Perea, Manuel; Soares, Ana Paula; Comesaña, Montserrat
2013-09-01
Recent research with college-aged skilled readers by Adelman and colleagues revealed that contextual diversity (i.e., the number of contexts in which a word appears) is a more critical determinant of visual word recognition than mere repeated exposure (i.e., word frequency) (Psychological Science, 2006, Vol. 17, pp. 814-823). Given that contextual diversity has been claimed to be a relevant factor to word acquisition in developing readers, the effects of contextual diversity should also be a main determinant of word identification times in developing readers. A lexical decision experiment was conducted to examine the effects of contextual diversity and word frequency in young readers (children in fourth grade). Results revealed a sizable effect of contextual diversity, but not of word frequency, thereby generalizing Adelman and colleagues' data to a child population. These findings call for the implementation of dynamic developmental models of visual word recognition that go beyond a learning rule by mere exposure. Copyright © 2012 Elsevier Inc. All rights reserved.
Ease of identifying words degraded by visual noise.
Barber, P; de la Mahotière, C
1982-08-01
A technique is described for investigating word recognition involving the superimposition of 'noise' on the visual target word. For this task a word is printed in the form of letters made up of separate elements; noise consists of additional elements which serve to reduce the ease whereby the words may be recognized, and a threshold-like measure can be obtained in terms of the amount of noise. A word frequency effect was obtained for the noise task, and for words presented tachistoscopically but in conventional typography. For the tachistoscope task, however, the frequency effect depended on the method of presentation. A second study showed no effect of inspection interval on performance on the noise task. A word-frequency effect was also found in a third experiment with tachistoscopic exposure of the noise task stimuli in undegraded form. The question of whether common processes are drawn on by tasks entailing different ways of varying ease of recognition is addressed, and the suitability of different tasks for word recognition research is discussed.
Thompson, Charee M; Crook, Brittani; Love, Brad; Macpherson, Catherine Fiona; Johnson, Rebecca
2015-04-27
We compared adolescent and young adult cancer patient and survivor language between mediated and face-to-face support communities in order to understand how the use of certain words frames conversations about family, friends, health, work, achievement, and leisure. We analyzed transcripts from an online discussion board (N = 360) and face-to-face support group (N = 569) for adolescents and young adults using Linguistic Inquiry and Word Count, a word-based computerized text analysis software that counts the frequency of words and word stems. There were significant differences between the online and face-to-face support groups in terms of content (e.g. friends, health) and style words (e.g. verb tense, negative emotion, and cognitive process). © The Author(s) 2015.
Pina, Jamie; Massoudi, Barbara L; Chester, Kelley; Koyanagi, Mark
2018-06-07
Researchers and analysts have not completely examined word frequency analysis as an approach to creating a public health quality improvement taxonomy. Our objective was to develop a taxonomy of public health quality improvement concepts for an online exchange of quality improvement work. We analyzed documents, conducted an expert review, and employed a user-centered design along with a faceted search approach to make online entries searchable for users. To provide the most targeted facets to users, we used word frequency to analyze 334 published public health quality improvement documents to find the most common clusters of word meanings. We then reviewed the highest-weighted concepts and categorized their relationships to quality improvement details in our taxonomy. Next, we mapped meanings to items in our taxonomy and presented them in order of their weighted percentages in the data. Using these methods, we developed and sorted concepts in the faceted search presentation so that online exchange users could access relevant search criteria. We reviewed 50 of the top synonym clusters and identified 12 categories for our taxonomy data. The final categories were as follows: Summary; Planning and Execution Details; Health Impact; Training and Preparation; Information About the Community; Information About the Health Department; Results; Quality Improvement (QI) Staff; Information; Accreditation Details; Collaborations; and Contact Information of the Submitter. Feedback from users about the elements in the taxonomy and the presentation of elements in our search environment has been positive. When relevant data are available, the word frequency analysis method may be useful in other taxonomy development efforts for public health.
Phylogenetic tree construction using trinucleotide usage profile (TUP).
Chen, Si; Deng, Lih-Yuan; Bowman, Dale; Shiau, Jyh-Jen Horng; Wong, Tit-Yee; Madahian, Behrouz; Lu, Henry Horng-Shing
2016-10-06
It has been a challenging task to build a genome-wide phylogenetic tree for a large group of species containing a large number of genes with long nucleotide sequences. The most popular method, called feature frequency profile (FFP-k), finds the frequency distribution for all words of certain length k over the whole genome sequence using (overlapping) windows of the same length. For a satisfactory result, the recommended word length (k) ranges from 6 to 15 and it may not be a multiple of 3 (codon length). The total number of possible words needed for FFP-k can range from 4^6 = 4096 to 4^15. We propose a simple improvement over the popular FFP method using only a typical word length of 3. A new method, called Trinucleotide Usage Profile (TUP), is proposed based only on the (relative) frequency distribution using non-overlapping windows of length 3. The total number of possible words needed for TUP is 4^3 = 64, which is much less than the total count for the recommended optimal "resolution" for FFP. To build a phylogenetic tree, we propose first representing each of the species by a TUP vector and then using an appropriate distance measure between pairs of the TUP vectors for the tree construction. In particular, we propose summarizing a DNA sequence by a matrix of three rows corresponding to three reading frames, recording the frequency distribution of the non-overlapping words of length 3 in each of the reading frames. We also provide a numerical measure for comparing trees constructed with various methods. Compared to the FFP method, our empirical study showed that the proposed TUP method is more capable of building phylogenetic trees with a stronger biological support. We further provide some justifications on this from the information theory viewpoint. Unlike the FFP method, the TUP method takes advantage of the fact that the start of the first reading frame is (usually) known. Without this information, the FFP method could only rely on the frequency distribution of overlapping words, which is the average (or mixture) of the frequency distributions of three possible reading frames. Consequently, we show (from the entropy viewpoint) that the FFP procedure could dilute important gene information and therefore provides less accurate classification.
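The three-row summary described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `tup_matrix` and `tup_distance` are hypothetical, and the Euclidean distance stands in for the "appropriate distance measure", which the abstract does not specify.

```python
from collections import Counter
from itertools import product
import math

# All 4**3 = 64 possible words of length 3 over the DNA alphabet.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)

def tup_matrix(seq):
    """Return a 3 x 64 matrix of relative trinucleotide frequencies.

    Row r uses non-overlapping windows of length 3 starting at offset r,
    i.e. one row per reading frame. Words containing characters other
    than A, C, G, T are skipped (an assumption of this sketch).
    """
    seq = seq.upper()
    rows = []
    for frame in range(3):
        words = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        counts = Counter(w for w in words if w in CODON_SET)
        total = sum(counts.values())
        rows.append([counts[c] / total if total else 0.0 for c in CODONS])
    return rows

def tup_distance(a, b):
    """Euclidean distance between two TUP matrices (illustrative choice)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))
```

Pairwise distances computed this way could then be passed to any standard distance-based tree-building procedure; which procedure the authors used is not stated in the abstract.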
Effects of word frequency and modality on sentence comprehension impairments in people with aphasia.
DeDe, Gayle
2012-05-01
It is well known that people with aphasia have sentence comprehension impairments. The present study investigated whether lexical factors contribute to sentence comprehension impairments in both the auditory and written modalities using online measures of sentence processing. People with aphasia and non brain-damaged controls participated in the experiment (n = 8 per group). Twenty-one sentence pairs containing high- and low-frequency words were presented in self-paced listening and reading tasks. The sentences were syntactically simple and differed only in the critical words. The dependent variables were response times for critical segments of the sentence and accuracy on the comprehension questions. The results showed that word frequency influences performance on measures of sentence comprehension in people with aphasia. The accuracy data on the comprehension questions suggested that people with aphasia have more difficulty understanding sentences containing low-frequency words in the written compared to auditory modality. Both group and single-case analyses of the response time data also indicated that people with aphasia experience more difficulty with reading than listening. Sentence comprehension in people with aphasia is influenced by word frequency and presentation modality.
Zipf's Law for Word Frequencies: Word Forms versus Lemmas in Long Texts.
Corral, Álvaro; Boleda, Gemma; Ferrer-i-Cancho, Ramon
2015-01-01
Zipf's law is a fundamental paradigm in the statistics of written and spoken natural language as well as in other communication systems. We raise the question of the elementary units for which Zipf's law should hold in the most natural way, studying its validity for plain word forms and for the corresponding lemma forms. We analyze several long literary texts comprising four languages, with different levels of morphological complexity. In all cases Zipf's law is fulfilled, in the sense that a power-law distribution of word or lemma frequencies is valid for several orders of magnitude. We investigate the extent to which the word-lemma transformation preserves two parameters of Zipf's law: the exponent and the low-frequency cut-off. We are not able to demonstrate a strict invariance of the tail, as for a few texts both exponents deviate significantly, but we conclude that the exponents are very similar, despite the remarkable transformation that going from words to lemmas represents, considerably affecting all ranges of frequencies. In contrast, the low-frequency cut-offs are less stable, tending to increase substantially after the transformation.
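As a rough illustration of the rank-frequency form of Zipf's law, the Python sketch below sorts word counts by rank and estimates a log-log slope by ordinary least squares. This is a simplification chosen for brevity and is not the fitting procedure of the study, which concerns the exponent and low-frequency cut-off of a power-law distribution of frequencies; the function names are hypothetical.

```python
import math
from collections import Counter

def rank_frequency(tokens):
    """Return word counts sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)

def zipf_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 corresponds to the classical Zipf rank-frequency
    relation. Requires at least two ranks.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

Running the same estimate once on word-form tokens and once on lemma tokens of the same text would give a crude version of the comparison the abstract describes, assuming a lemmatizer is available.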
Not all cultural values are created equal: Cultural change in China reexamined through Google books.
Zhang, Rui; Weng, Liping
2017-06-20
Given its major transformations in recent decades, China has figured prominently in research on cultural change. Previous research converges in showing a general trend towards individualism in contemporary China while noting that rising individualism tends to coexist with enduring collectivism. To further understand this, we tested whether perceived traditional importance of cultural values would modulate the trajectory of cultural change reflected in word usage frequencies in published books. We re-analysed Google's Chinese corpus since 1980 based on a broad sample of words associated with individualism-collectivism. We replicated the pattern of rising individualism and declining collectivism among words of modest and low perceived traditional importance. Most important, however, collectivistic words of high perceived traditional importance increased in usage frequencies with time, thus departing from the general trend towards individualism. Overall, our research underscores the role of core culture in cultural maintenance during times of rapid cultural change. © 2017 International Union of Psychological Science.
Measuring information-based energy and temperature of literary texts
NASA Astrophysics Data System (ADS)
Chang, Mei-Chu; Yang, Albert C.-C.; Stanley, H. Eugene; Peng, C.-K.
2017-02-01
We apply a statistical method, information-based energy, to quantify informative symbolic sequences. To apply this method to literary texts, it is assumed that different words with different occurrence frequencies are at different energy levels, and that the energy-occurrence frequency distribution obeys a Boltzmann distribution. The temperature within the Boltzmann distribution can be an indicator for the author's writing capacity as the repertory of thoughts. The relative temperature of a text is obtained by comparing the energy-occurrence frequency distributions of words collected from one text versus from all texts of the same author. Combining the relative temperature with the Shannon entropy as the text complexity, the information-based energy of the text is defined and can be viewed as a quantitative evaluation of an author's writing performance. We demonstrate the method by analyzing two authors, Shakespeare in English and Jin Yong in Chinese, and find that their well-known works are associated with higher information-based energies. This method can be used to measure the creativity level of a writer's work in linguistics, and can also quantify symbolic sequences in different systems.
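Two ingredients named in this abstract, the Shannon entropy of a word distribution and energy levels tied to occurrence frequency through a Boltzmann distribution, can be sketched in Python as follows. The abstract does not give the authors' exact definitions, so the relation E(w) = -T ln p(w) (up to an additive constant) and the default temperature of 1 are assumptions of this sketch, and the function names are hypothetical.

```python
import math
from collections import Counter

def relative_frequencies(tokens):
    """Map each word to its relative occurrence frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def shannon_entropy(tokens):
    """Shannon entropy, in bits, of the word frequency distribution."""
    return -sum(p * math.log2(p)
                for p in relative_frequencies(tokens).values())

def energy_levels(tokens, temperature=1.0):
    """Assumed Boltzmann relation: p(w) proportional to exp(-E(w) / T),
    hence E(w) = -T * ln p(w) up to an additive constant."""
    return {word: -temperature * math.log(p)
            for word, p in relative_frequencies(tokens).items()}
```

Under this assumed relation, rarer words sit at higher energy levels, which matches the abstract's premise that words with different occurrence frequencies occupy different energy levels.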
Modeling the Chinese language as an evolving network
NASA Astrophysics Data System (ADS)
Liang, Wei; Shi, Yuming; Huang, Qiuling
2014-01-01
The evolution of Chinese language has three main features: the total number of characters is gradually increasing, new words are generated in the existing characters, and some old words are no longer used in daily-life language. Based on the features, we propose an evolving language network model. Finally, we use this model to simulate the character co-occurrence networks (nodes are characters, and two characters are connected by an edge if they are adjacent to each other) constructed from essays in 11 different periods of China, and find that characters that appear with high frequency in old words are likely to be reused when new words are formed.
ERIC Educational Resources Information Center
Rapp, Brenda; Dufor, Olivier
2011-01-01
This research is directed at charting the neurotopography of the component processes of the spelling system by using fMRI to identify the neural substrates that are sensitive to the factors of lexical frequency and word length. In spelling, word frequency effects index orthographic long-term memory whereas length effects, as measured by the number…
ERIC Educational Resources Information Center
Hemmer, Pernille; Criss, Amy H.
2013-01-01
The role of experience in memory, specifically the word frequency (WF) mirror effect showing higher hit rates and lower false alarm rates for low-frequency words, is one of the hallmarks of memory. However, this "regularity of memory" is limited because normative WF has been treated as discrete (low vs. high). We evaluate the extent to…
ERIC Educational Resources Information Center
Yap, Melvin J.; Tse, Chi-Shing; Balota, David A.
2009-01-01
Word frequency and semantic priming effects are among the most robust effects in visual word recognition, and it has been generally assumed that these two variables produce interactive effects in lexical decision performance, with larger priming effects for low-frequency targets. The results from four lexical decision experiments indicate that the…
ERIC Educational Resources Information Center
Eaton, Helen S., Comp.
This semantic frequency list for English, French, German, and Spanish correlates 6,474 concepts represented by individual words in an order of diminishing occurrence. Designed as a research tool, the work is segmented into seven comparative "Thousand Concepts" lists with 115 sectional subdivisions, each of which begins with the key English word…
Differences in Poor Readers' Abilities to Identify High-Frequency Words in Isolation and Context.
ERIC Educational Resources Information Center
Krieger, Veronica K.
1981-01-01
Reports that fourth-grade poor readers were able to identify more high-frequency words in context than in isolation. Discusses the findings in terms of a context-oriented approach of word identification instruction and assessment. (FL)
The ERP signature of the contextual diversity effect in visual word recognition.
Vergara-Martínez, Marta; Comesaña, Montserrat; Perea, Manuel
2017-06-01
Behavioral experiments have revealed that words appearing in many different contexts are responded to faster than words that appear in few contexts. Although this contextual diversity (CD) effect has been found to be stronger than the word-frequency (WF) effect, it is a matter of debate whether the facilitative effects of CD and WF reflect the same underlying mechanisms. The analysis of the electrophysiological correlates of CD may shed some light on this issue. This experiment is the first to examine the ERPs to high- and low-CD words when WF is controlled for. Results revealed that while high-CD words produced faster responses than low-CD words, their ERPs showed larger negativities (225-325 ms) than low-CD words. This result goes in the opposite direction of the ERP WF effect (high-frequency words elicit smaller N400 amplitudes than low-frequency words). The direction and scalp distribution of the CD effect resembled the ERP effects associated with "semantic richness." Thus, while apparently related, CD and WF originate from different sources during the access of lexical-semantic representations.
Towards a Reconceptualisation of "Word" for High Frequency Word Generation in Word Knowledge Studies
ERIC Educational Resources Information Center
Sibanda, Jabulani; Baxen, Jean
2014-01-01
The present paper derives from a PhD study investigating the nexus between Grade 4 textbook vocabulary demands and Grade 3 isiXhosa-speaking learners' knowledge of that vocabulary to enable them to read to learn in Grade 4. The paper challenges the efficacy of the four current definitions of "word" for generating high frequency words…
Lexical-Semantic Reading in a Shallow Orthography: Evidence from a Girl with Williams Syndrome
ERIC Educational Resources Information Center
Barca, Laura; Bello, Arianna; Volterra, Virginia; Burani, Cristina
2010-01-01
The reading skills of a girl with Williams Syndrome are assessed by a timed word-naming task. To test the efficiency of lexical and nonlexical reading, we considered four marker effects: Lexicality (better reading of words than nonwords), frequency (better reading of high than low frequency words), length (better reading of short than long words),…
ERIC Educational Resources Information Center
Durrwachter, Ute; Sokolov, Alexander N.; Reinhard, Jens; Klosinski, Gunther; Trauzettel-Klosinski, Susanne
2010-01-01
We independently combined word length and word frequency to examine whether the difficulty of reading material affects eye movements in readers of German, which has high orthographic regularity, and compared the outcome with previous findings available in other languages. Sixteen carefully selected German-speaking dyslexic children (mean age, 9.5…
Neighborhood Frequency Effect in Chinese Word Recognition: Evidence from Naming and Lexical Decision
ERIC Educational Resources Information Center
Li, Meng-Feng; Gao, Xin-Yu; Chou, Tai-Li; Wu, Jei-Tun
2017-01-01
Neighborhood frequency is a crucial variable for understanding the nature of word recognition. Unlike in alphabetic scripts, neighborhood frequency in Chinese is usually confounded with component character frequency and neighborhood size. Three experiments were designed to explore the role of the neighborhood frequency effect in Chinese and the stimuli…
Exploring new possibilities of astronomy education and outreach
NASA Astrophysics Data System (ADS)
Fukushima, Kodai
2015-08-01
I investigate the influence of astronomy education and outreach activities on people in order to explore their potential benefits and contribution to society. This research is based on the astronomy education lessons I gave to 287 senior high school and junior high school students in Cambodia in November 2013. Before and after my lesson, I asked them to answer my questionnaires in Khmer, where they could also write free descriptions. Sentences in their free descriptions, translated into Japanese, are analyzed by means of a text mining method, which converts text data to numbers and so makes statistical analysis possible. I counted the number of question sentences and computed their rate with respect to the total number of sentences. The rates of question sentences for 9th and 12th grade students are 39% and 9%, respectively. This shows that 9th grade students wonder why and how more frequently, and appear to be more stimulated in their curiosity, than 12th grade students. I counted the frequency of words in the free descriptions and examined high frequency words to take a broad view of the characteristics of the free descriptions. The word "world" is the fourth most frequent word among 369 words, following "the universe", "the earth", and "a star", which appear frequently in the astronomy lesson. Most sentences including the word "world" described amazement at the existence of so vast an unknown world outside of what the students had known until then. The frequency of sentences including the word "world" is much higher for 12th grade students (45%) than for 9th grade students (18%). A significant fraction of 12th grade students appears to have been strongly affected and to have changed their views of the world. Overall, my lesson and related activities inspired intellectual curiosity in many students, especially in 9th grade students. I conclude that astronomy education and outreach activities have the potential to contribute to Cambodian development.
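The two counts described in this abstract (the rate of question sentences and the frequency of individual words) can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's text mining tool: the function names and the sample sentences are invented, and the tokenizer assumes English text, whereas the study analyzed Japanese translations, which would need a morphological tokenizer.

```python
import re
from collections import Counter


def question_rate(sentences):
    """Return the fraction of sentences that end with a question mark."""
    if not sentences:
        return 0.0
    questions = sum(1 for s in sentences if s.strip().endswith("?"))
    return questions / len(sentences)


def word_frequencies(sentences):
    """Count lowercase word tokens across all sentences."""
    counts = Counter()
    for s in sentences:
        counts.update(re.findall(r"[a-z']+", s.lower()))
    return counts


# Hypothetical free descriptions, invented for illustration only.
sample = [
    "Why is the universe so big?",
    "I did not know the world was so vast.",
    "How far is a star from the earth?",
    "The world is wider than I thought.",
]

rate = question_rate(sample)          # share of question sentences
freqs = word_frequencies(sample)      # Counter of word -> occurrences
print(rate, freqs.most_common(3))
```

Comparing `question_rate` between two groups of respondents, as the abstract does for 9th and 12th grade students, only requires calling it once per group.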
Not all reading disabilities are dyslexia: distinct neurobiology of specific comprehension deficits.
Cutting, Laurie E; Clements-Stephens, Amy; Pugh, Kenneth R; Burns, Scott; Cao, Aize; Pekar, James J; Davis, Nicole; Rimrodt, Sheryl L
2013-01-01
Although an extensive literature exists on the neurobiological correlates of dyslexia (DYS), to date, no studies have examined the neurobiological profile of those who exhibit poor reading comprehension despite intact word-level abilities (specific reading comprehension deficits [S-RCD]). Here we investigated the word-level abilities of S-RCD as compared to typically developing readers (TD) and those with DYS by examining the blood oxygenation-level dependent response to words varying on frequency. Understanding whether S-RCD process words in the same manner as TD, or show alternate pathways to achieve normal word-reading abilities, may provide insights into the origin of this disorder. Results showed that as compared to TD, DYS showed abnormal covariance during word processing with right-hemisphere homologs of the left-hemisphere reading network in conjunction with left occipitotemporal underactivation. In contrast, S-RCD showed an intact neurobiological response to word stimuli in occipitotemporal regions (associated with fast and efficient word processing); however, inferior frontal gyrus (IFG) abnormalities were observed. Specifically, TD showed a higher-percent signal change within right IFG for low-versus-high frequency words as compared to both S-RCD and DYS. Using psychophysiological interaction analyses, a coupling-by-reading group interaction was found in right IFG for DYS, as indicated by a widespread greater covariance between right IFG and right occipitotemporal cortex/visual word-form areas, as well as bilateral medial frontal gyrus, as compared to TD. For S-RCD, the context-dependent functional interaction anomaly was most prominently seen in left IFG, which covaried to a greater extent with hippocampal, parahippocampal, and prefrontal areas than for TD for low- as compared to high-frequency words. 
Given the greater lexical access demands of low frequency as compared to high-frequency words, these results may suggest specific weaknesses in accessing lexical-semantic representations during word recognition. These novel findings provide foundational insights into the nature of S-RCD, and set the stage for future investigations of this common, but understudied, reading disorder.
Guo, Chunyan; Zhu, Ying; Ding, Jinhong; Fan, Silu; Paller, Ken A
2004-02-12
Memory encoding can be studied by monitoring brain activity correlated with subsequent remembering. To understand brain potentials associated with encoding, we compared multiple factors known to affect encoding. Depth of processing was manipulated by requiring subjects to detect animal names (deep encoding) or boldface (shallow encoding) in a series of Chinese words. Recognition was more accurate with deep than shallow encoding, and for low- compared to high-frequency words. Potentials were generally more positive for subsequently recognized versus forgotten words; for deep compared to shallow processing; and, for remembered words only, for low- than for high-frequency words. Latency and topographic differences between these potentials suggested that several factors influence the effectiveness of encoding and can be distinguished using these methods, even with Chinese logographic symbols.
Bilingual reading of compound words.
Ko, In Yeong; Wang, Min; Kim, Say Young
2011-02-01
The present study investigated whether bilingual readers activate constituents of compound words in one language while processing compound words in the other language via decomposition. Two experiments using a lexical decision task were conducted with adult Korean-English bilingual readers. In Experiment 1, the lexical decision of real English compound words was more accurate when the translated compounds (the combination of the translation equivalents of the constituents) in Korean (the nontarget language) were real words than when they were nonwords. In Experiment 2, when the frequency of the second constituents of compound words in English (the target language) was manipulated, the effect of lexical status of the translated compounds was greater on the compounds with high-frequency second constituents than on those with low-frequency second constituents in the target language. Together, these results provided evidence for morphological decomposition and cross-language activation in bilingual reading of compound words.
ERIC Educational Resources Information Center
Ucar, Serpil
2017-01-01
The use of recurrent English word combinations--lexical bundles--plays a fundamental role in academic prose (Karabacak & Qin, 2013). There has been very limited research comparing Turkish non-native and native English writers' use of lexical bundles in academic prose in terms of frequency, structure and functions of lexical…
ERIC Educational Resources Information Center
Sciarone, A. G.
1979-01-01
An approach to language textbook evaluation based on objective criteria and relying on data easily obtained by means of computers, such as word frequency lists, is proposed. The importance of vocabulary acquisition in language learning is emphasized. Accordingly, word selection and rate of repetition are seen as central evaluation criteria. (MES)
Dissociating visual form from lexical frequency using Japanese.
Twomey, Tae; Kawabata Duncan, Keith J; Hogan, John S; Morita, Kenji; Umeda, Kazumasa; Sakai, Katsuyuki; Devlin, Joseph T
2013-05-01
In Japanese, the same word can be written in either morphographic Kanji or syllabographic Hiragana and this provides a unique opportunity to disentangle a word's lexical frequency from the frequency of its visual form - an important distinction for understanding the neural information processing in regions engaged by reading. Behaviorally, participants responded more quickly to high than low frequency words and to visually familiar relative to less familiar words, independent of script. Critically, the imaging results showed that visual familiarity, as opposed to lexical frequency, had a strong effect on activation in ventral occipito-temporal cortex. Activation here was also greater for Kanji than Hiragana words and this was not due to their inherent differences in visual complexity. These findings can be understood within a predictive coding framework in which vOT receives bottom-up information encoding complex visual forms and top-down predictions from regions encoding non-visual attributes of the stimulus. Copyright © 2012 Elsevier Inc. All rights reserved.
Schotter, Elizabeth R.; Bicknell, Klinton; Howard, Ian; Levy, Roger; Rayner, Keith
2014-01-01
It is well-known that word frequency and predictability affect processing time. These effects change magnitude across tasks, but studies testing this use tasks with different response types (e.g., lexical decision, naming, and fixation time during reading; Schilling, Rayner & Chumbley, 1998), preventing direct comparison. Recently, Kaakinen and Hyönä (2010) overcame this problem, comparing fixation times in reading for comprehension and proofreading, showing that the frequency effect was larger in proofreading than in reading. This result could be explained by readers exhibiting substantial cognitive flexibility, and qualitatively changing how they process words in the proofreading task in a way that magnifies effects of word frequency. Alternatively, readers may not change word processing so dramatically, and instead may perform more careful identification generally, increasing the magnitude of many word processing effects (e.g., both frequency and predictability). We tested these possibilities with two experiments: subjects read for comprehension and then proofread for spelling errors (letter transpositions) that produce nonwords (e.g., trcak for track as in Kaakinen & Hyönä) or that produce real but unintended words (e.g., trial for trail) to compare how the task changes these effects. Replicating Kaakinen and Hyönä, frequency effects increased during proofreading. However, predictability effects only increased when integration with the sentence context was necessary to detect errors (i.e., when spelling errors produced words that were inappropriate in the sentence; trial for trail). The results suggest that readers adopt sophisticated word processing strategies to accommodate task demands. PMID:24434024
Brébion, Gildas; David, Anthony S; Bressan, Rodrigo A; Pilowsky, Lyn S
2007-01-01
The role of various types of slowing of processing speed, as well as the role of depressed mood, on each stage of verbal memory functioning in patients diagnosed with schizophrenia was investigated. Mixed lists of high- and low-frequency words were presented, and immediate and delayed free recall and recognition were required. Two levels of encoding were studied by contrasting the relatively automatic encoding of the high-frequency words and the more effortful encoding of the low-frequency words. Storage was studied by contrasting immediate and delayed recall. Retrieval was studied by contrasting free recall and recognition. Three tests of motor and cognitive processing speed were administered as well. Regression analyses involving the three processing speed measures revealed that cognitive speed was the only predictor of the recall and recognition of the low-frequency words. Furthermore, slowing in cognitive speed accounted for the deficit in recall and recognition of the low-frequency words relative to a healthy control group. Depressed mood was significantly associated with recognition of the low-frequency words. Neither processing speed nor depressed mood was associated with storage efficiency. It is concluded that both cognitive speed slowing and depressed mood impact on effortful encoding processes.
The Role of Semantic Diversity in Word Recognition across Aging and Bilingualism
Johns, Brendan T.; Sheppard, Christine L.; Jones, Michael N.; Taler, Vanessa
2016-01-01
Frequency effects are pervasive in studies of language, with higher frequency words being recognized faster than lower frequency words. However, the exact nature of frequency effects has recently been questioned, with some studies finding that contextual information provides a better fit to lexical decision and naming data than word frequency (Adelman et al., 2006). Recent work has cemented the importance of these results by demonstrating that a measure of the semantic diversity of the contexts that a word occurs in provides a powerful measure to account for variability in word recognition latency (Johns et al., 2012, 2015; Jones et al., 2012). The goal of the current study is to extend this measure to examine bilingualism and aging, where multiple theories use frequency of occurrence of linguistic constructs as central to accounting for empirical results (Gollan et al., 2008; Ramscar et al., 2014). A lexical decision experiment was conducted with four groups of subjects: younger and older monolinguals and bilinguals. Consistent with past results, a semantic diversity variable accounted for the greatest amount of variance in the latency data. In addition, the pattern of fits of semantic diversity across multiple corpora suggests that bilinguals and older adults are more sensitive to semantic diversity information than younger monolinguals. PMID:27458392
Reading Aloud: On the Determinants of the Joint Effects of Stimulus Quality and Word Frequency
ERIC Educational Resources Information Center
White, Darcy; Besner, Derek
2017-01-01
There are multiple reports, in the context of the time taken to read aloud, that the joint effects of stimulus quality and word frequency (a) interact when only words appear in the list but (b) are additive when nonwords are intermixed with words (O'Malley & Besner, 2008). This triple interaction has been explained in terms of the idea that…
Tip-of-the-tongue states reveal age differences in the syllable frequency effect.
Farrell, Meagan T; Abrams, Lise
2011-01-01
Syllable frequency has been shown to facilitate production in some languages but has yielded inconsistent results in English and has never been examined in older adults. Tip-of-the-tongue (TOT) states represent a unique type of production failure where the phonology of a word is unable to be retrieved, suggesting that the frequency of phonological forms, like syllables, may influence the occurrence of TOT states. In the current study, we investigated the role of first-syllable frequency on TOT incidence and resolution in young (18-26 years of age), young-old (60-74 years of age), and old-old (75-89 years of age) adults. Data from 3 published studies were compiled, where TOTs were elicited by presenting definition-like questions and asking participants to respond with "Know," "Don't Know," or "TOT." Young-old and old-old adults, but not young adults, experienced more TOTs for words beginning with low-frequency first syllables relative to high-frequency first syllables. Furthermore, age differences in TOT incidence occurred only for words with low-frequency first syllables. In contrast, when a prime word with the same first syllable as the target was presented during TOT states, all age groups resolved more TOTs for words beginning with low-frequency syllables. These findings support speech production models that allow for bidirectional activation between conceptual, lexical, and phonological forms of words. Furthermore, the age-specific effects of syllable frequency provide insight into the progression of age-linked changes to phonological processes. (PsycINFO Database Record (c) 2010 APA, all rights reserved).
Extended Attribute Constructions in German Radio Newscasts: Analysis and Implications
ERIC Educational Resources Information Center
Wipf, Joseph
2004-01-01
Although a number of word-frequency lists exist in German, there is an absence of studies investigating the relative frequency with which various grammatical structures are used. Traditionally, extended modifiers have been most prevalent in written German. Based on an analysis of authentic radio news broadcasts, this article makes the case that…
Higham, Philip A; Perfect, Timothy J; Bruno, Davide
2009-01-01
Criterion- versus distribution-shift accounts of frequency and strength effects in recognition memory were investigated with Type-2 signal detection receiver operating characteristic (ROC) analysis, which provides a measure of metacognitive monitoring. Experiment 1 demonstrated a frequency-based mirror effect, with a higher hit rate and lower false alarm rate, for low frequency words compared with high frequency words. In Experiment 2, the authors manipulated item strength with repetition, which showed an increased hit rate but no effect on the false alarm rate. Whereas Type-1 indices were ambiguous as to whether these effects were based on a criterion- or distribution-shift model, the two models predict opposite effects on Type-2 distractor monitoring under some assumptions. Hence, Type-2 ROC analysis discriminated between potential models of recognition that could not be discriminated using Type-1 indices alone. In Experiment 3, the authors manipulated Type-1 response bias by varying the number of old versus new response categories to confirm the assumptions made in Experiments 1 and 2. The authors conclude that Type-2 analyses are a useful tool for investigating recognition memory when used in conjunction with more traditional Type-1 analyses.
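For readers unfamiliar with the Type-1 indices mentioned in this abstract, the sketch below computes the standard equal-variance signal detection measures, sensitivity d' and criterion c, from a hit rate and a false alarm rate. It does not reproduce the Type-2 ROC analysis of the paper; the function name and the example rates are hypothetical and chosen only to show a mirror-effect pattern.

```python
from statistics import NormalDist


def sdt_indices(hit_rate, false_alarm_rate):
    """Equal-variance Type-1 indices: d' = z(H) - z(F), c = -(z(H) + z(F)) / 2.

    Rates of exactly 0 or 1 have no finite z-score and raise an error,
    so they must be corrected before calling this function.
    """
    z = NormalDist().inv_cdf
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    return z_hit - z_fa, -(z_hit + z_fa) / 2


# Hypothetical rates showing a mirror effect: low-frequency words get
# more hits and fewer false alarms than high-frequency words.
d_low, c_low = sdt_indices(0.80, 0.20)
d_high, c_high = sdt_indices(0.70, 0.30)
print(d_low, d_high)
```

With these invented rates d' is larger for the low-frequency items while c stays near zero for both, which is why Type-1 indices alone leave the criterion-shift and distribution-shift accounts hard to separate.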
Dynamic burstiness of word-occurrence and network modularity in textbook systems
NASA Astrophysics Data System (ADS)
Cui, Xue-Mei; Yoon, Chang No; Youn, Hyejin; Lee, Sang Hoon; Jung, Jean S.; Han, Seung Kee
2017-12-01
We show that the dynamic burstiness of word occurrence in textbook systems is attributable to the modularity of the word association networks. First, a measure of dynamic burstiness is introduced to quantify the burstiness of word occurrence in a textbook. The advantage of this measure is that the dynamic burstiness is decomposable into two contributions: one coming from the inter-event variance and the other from the memory effects. Comparing the network structures of physics textbook systems with those of surrogate random textbooks in which the memory or variance effects are absent, we show that the network modularity increases systematically with the dynamic burstiness. The intra-connectivity of an individual word, which represents the strength of the tie binding a node to a module, accordingly increases with the dynamic burstiness, suggesting that individual words with high burstiness are strongly bound to one module. Based on the frequency and dynamic burstiness, physics terminology is classified into four categories: fundamental words, topical words, special words, and common words. In addition, we test the correlation between the dynamic burstiness of word occurrence and network modularity using a two-state model of burst generation.
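The abstract does not define its dynamic burstiness measure, so the sketch below substitutes the widely used burstiness parameter B and memory coefficient M of Goh and Barabási, which likewise separate an inter-event variance contribution from a memory contribution. The function names and the example word positions are hypothetical.

```python
from statistics import mean, pstdev


def inter_event_times(positions):
    """Gaps between successive occurrences of a word (positions sorted)."""
    return [b - a for a, b in zip(positions, positions[1:])]


def burstiness(taus):
    """B = (sigma - mu) / (sigma + mu): -1 regular, about 0 Poisson-like, toward 1 bursty."""
    mu, sigma = mean(taus), pstdev(taus)
    if mu + sigma == 0:
        return 0.0
    return (sigma - mu) / (sigma + mu)


def memory(taus):
    """M = Pearson correlation between consecutive inter-event times."""
    x, y = taus[:-1], taus[1:]
    if len(x) < 2:
        return 0.0
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0  # correlation undefined for constant gaps; report 0
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sx * sy)


# Hypothetical token positions of two words in a text.
regular = inter_event_times([0, 10, 20, 30, 40])
bursty = inter_event_times([0, 1, 2, 3, 100])
print(burstiness(regular), burstiness(bursty), memory(bursty))
```

A word that recurs at even intervals gives B = -1, while a word clustered in one passage and then absent for a long stretch gives a positive B.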
Zipf’s Law for Word Frequencies: Word Forms versus Lemmas in Long Texts
Corral, Álvaro; Boleda, Gemma; Ferrer-i-Cancho, Ramon
2015-01-01
Zipf’s law is a fundamental paradigm in the statistics of written and spoken natural language as well as in other communication systems. We raise the question of the elementary units for which Zipf’s law should hold in the most natural way, studying its validity for plain word forms and for the corresponding lemma forms. We analyze several long literary texts comprising four languages, with different levels of morphological complexity. In all cases Zipf’s law is fulfilled, in the sense that a power-law distribution of word or lemma frequencies is valid for several orders of magnitude. We investigate the extent to which the word-lemma transformation preserves two parameters of Zipf’s law: the exponent and the low-frequency cut-off. We are not able to demonstrate a strict invariance of the tail, as for a few texts both exponents deviate significantly, but we conclude that the exponents are very similar, despite the remarkable transformation that going from words to lemmas represents, considerably affecting all ranges of frequencies. In contrast, the low-frequency cut-offs are less stable, tending to increase substantially after the transformation. PMID:26158787
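As a rough illustration of the two parameters discussed here, the exponent and the low-frequency cut-off, the sketch below estimates a power-law exponent for the distribution of word frequencies above a chosen cut-off. It uses the continuous-variable maximum-likelihood formula, which is only an approximation for integer counts and is not necessarily the fitting procedure of the paper; the function name and the toy token list are hypothetical.

```python
import math
from collections import Counter


def frequency_exponent(tokens, cutoff=1):
    """Estimate gamma in p(n) ~ n**(-gamma) for word counts n >= cutoff.

    Continuous approximation: gamma = 1 + k / sum(ln(n_i / cutoff)),
    where k is the number of word types at or above the cut-off.
    """
    counts = Counter(tokens)
    tail = [n for n in counts.values() if n >= cutoff]
    log_sum = sum(math.log(n / cutoff) for n in tail)
    if log_sum == 0:
        raise ValueError("all counts equal the cut-off; exponent undefined")
    return 1.0 + len(tail) / log_sum


# Toy token list, far too short for a meaningful fit.
toy_tokens = ["a"] * 4 + ["b"] * 2 + ["c", "d"]
gamma = frequency_exponent(toy_tokens, cutoff=1)
print(gamma)
```

Running the same estimate once on word forms and once on lemmas, with the same cut-off, is the kind of comparison the abstract describes, although a real analysis would also have to select the cut-off from the data.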
Effects of Word Frequency and Modality on Sentence Comprehension Impairments in People with Aphasia
DeDe, Gayle
2014-01-01
Purpose: It is well known that people with aphasia have sentence comprehension impairments. The present study investigated whether lexical factors contribute to sentence comprehension impairments in both the auditory and written modalities using on-line measures of sentence processing. Methods: People with aphasia and non-brain-damaged controls participated in the experiment (n=8 per group). Twenty-one sentence pairs containing high and low frequency words were presented in self-paced listening and reading tasks. The sentences were syntactically simple and differed only in the critical words. The dependent variables were response times for critical segments of the sentence and accuracy on the comprehension questions. Results: The results showed that word frequency influences performance on measures of sentence comprehension in people with aphasia. The accuracy data on the comprehension questions suggested that people with aphasia have more difficulty understanding sentences containing low frequency words in the written compared to auditory modality. Both group and single case analyses of the response time data also pointed to more difficulty with reading than listening. Conclusions: The results show that sentence comprehension in people with aphasia is influenced by word frequency and presentation modality. PMID:22294411
Wang, Cheng; Zhang, Qingfang
2015-01-01
To what extent do phonological codes constrain orthographic output in handwritten production? We investigated how phonological codes constrain the selection of orthographic codes via sublexical and lexical routes in Chinese written production. Participants wrote down picture names in a picture-naming task in Experiment 1 or response words in a symbol—word associative writing task in Experiment 2. A sublexical phonological property of picture names (phonetic regularity: regular vs. irregular) in Experiment 1 and a lexical phonological property of response words (homophone density: dense vs. sparse) in Experiment 2, as well as word frequency of the targets in both experiments, were manipulated. A facilitatory effect of word frequency was found in both experiments, in which words with high frequency were produced faster than those with low frequency. More importantly, we observed an inhibitory phonetic regularity effect, in which low-frequency picture names with regular first characters were slower to write than those with irregular ones, and an inhibitory homophone density effect, in which characters with dense homophone density were produced more slowly than those with sparse homophone density. Results suggested that phonological codes constrained handwritten production via lexical and sublexical routes. PMID:25879662
Reingold, Eyal M.; Reichle, Erik D.; Glaholt, Mackenzie G.; Sheridan, Heather
2013-01-01
Participants’ eye movements were monitored in an experiment that manipulated the frequency of target words (high vs. low) as well as their availability for parafoveal processing during fixations on the pre-target word (valid vs. invalid preview). The influence of the word-frequency by preview validity manipulation on the distributions of first fixation duration was examined by using ex-Gaussian fitting as well as a novel survival analysis technique which provided precise estimates of the timing of the first discernible influence of word frequency on first fixation duration. Using this technique, we found a significant influence of word frequency on fixation duration in normal reading (valid preview) as early as 145 ms from the start of fixation. We also demonstrated an equally rapid non-lexical influence on first fixation duration as a function of initial landing position (location) on target words. The time-course of frequency effects, but not location effects was strongly influenced by preview validity, demonstrating the crucial role of parafoveal processing in enabling direct lexical control of reading fixation times. Implications for models of eye-movement control are discussed. PMID:22542804
Niefind, Florian; Dimigen, Olaf
2016-12-01
During reading, the parafoveal processing of an upcoming word n+1 can influence word recognition in two ways: It can affect fixation behavior during the preceding fixation on word n (parafovea-on-fovea effect, POF), and it can facilitate subsequent foveal processing once word n+1 is fixated (preview benefit). While preview benefits are established, evidence for POF effects is mixed. Recently, it has been suggested that POF effects exist, but have a delayed impact on saccade planning and thus coincide with preview benefits measured on word n+1. We combined eye movement and EEG recordings to investigate and separate neural correlates of POF and preview benefit effects. Participants read lists of nouns either in a boundary paradigm or the RSVP-with-flankers paradigm, while we recorded fixation- or event-related potentials (FRPs/ERPs), respectively. The validity and lexical frequency of the word shown as preview for the upcoming word n+1 were orthogonally manipulated. Analyses focused on the first fixation on word n+1. Preview validity (correct vs. incorrect preview) strongly modulated fixation times and electrophysiological N1 amplitudes, replicating previous findings. Importantly, gaze durations and FRPs measured on word n+1 were also affected by the frequency of the word shown as preview, with low-frequency previews eliciting a sustained, N400-like centroparietal negativity. Results support the idea that POF effects exist but affect word recognition with a delay. Lastly, once word n+1 was fixated, its frequency also modulated N1 amplitudes in ERPs and FRPs. Taken together, we separated immediate and delayed effects of parafoveal processing on brain correlates of word recognition. © 2016 Society for Psychophysiological Research.
Car manufacturers and global road safety: a word frequency analysis of road safety documents.
Roberts, I; Wentz, R; Edwards, P
2006-10-01
The World Bank believes that the car manufacturers can make a valuable contribution to road safety in poor countries and has established the Global Road Safety Partnership (GRSP) for this purpose. However, some commentators are sceptical. The authors examined road safety policy documents to assess the extent of any bias. They conducted word frequency analyses of road safety policy documents from the World Health Organization (WHO) and the GRSP. The relative occurrence of key road safety terms was quantified by calculating a word prevalence ratio with 95% confidence intervals. Terms for which there was a fourfold difference in prevalence between the documents were tabulated. Compared to WHO's World report on road traffic injury prevention, the GRSP road safety documents were substantially less likely to use the words speed, speed limits, child restraint, pedestrian, public transport, walking, and cycling, but substantially more likely to use the words school, campaign, driver training, and billboard. There are important differences in emphasis in road safety policy documents prepared by WHO and the GRSP. Vigilance is needed to ensure that the road safety interventions that the car industry supports are based on sound evidence of effectiveness.
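A word prevalence ratio with a 95% confidence interval can be sketched as follows. The abstract does not state which interval method was used, so this example assumes the common log-ratio (Katz) approximation used for ratios of two proportions; the function name and all counts are invented for illustration.

```python
import math


def prevalence_ratio(count_a, total_a, count_b, total_b, z=1.96):
    """Ratio of a term's prevalence in corpus A to corpus B, with a CI.

    Uses the log-ratio approximation:
    SE(ln ratio) = sqrt(1/count_a - 1/total_a + 1/count_b - 1/total_b).
    Both counts must be non-zero for the ratio and its interval to exist.
    """
    if count_a == 0 or count_b == 0:
        raise ValueError("prevalence ratio is undefined for zero counts")
    ratio = (count_a / total_a) / (count_b / total_b)
    se = math.sqrt(1 / count_a - 1 / total_a + 1 / count_b - 1 / total_b)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper


# Invented counts: a term used 10 times in 10,000 words of one document
# set and 40 times in 10,000 words of the other.
ratio, lower, upper = prevalence_ratio(10, 10_000, 40, 10_000)
print(ratio, lower, upper)
```

With these invented counts the ratio is 0.25 and the whole interval lies below 1, the pattern that would mark a term as substantially less prevalent in the first document set.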
The Influence of Item Properties on Association-Memory
ERIC Educational Resources Information Center
Madan, Christopher R.; Glaholt, Mackenzie G.; Caplan, Jeremy B.
2010-01-01
Word properties like imageability and word frequency improve cued recall of verbal paired-associates. We asked whether these enhancements follow simply from prior effects on item-memory, or also strengthen associations between items. Participants studied word pairs varying in imageability or frequency: pairs were "pure" (high-high, low-low) or…
Distributional Effects of Word Frequency on Eye Fixation Durations
ERIC Educational Resources Information Center
Staub, Adrian; White, Sarah J.; Drieghe, Denis; Hollway, Elizabeth C.; Rayner, Keith
2010-01-01
Recent research using word recognition paradigms, such as lexical decision and speeded pronunciation, has investigated how a range of variables affect the location and shape of response time distributions, using both parametric and non-parametric techniques. In this article, we explore the distributional effects of a word frequency manipulation on…
Emotion and language: Valence and arousal affect word recognition
Brysbaert, Marc; Warriner, Amy Beth
2014-01-01
Emotion influences most aspects of cognition and behavior, but emotional factors are conspicuously absent from current models of word recognition. The influence of emotion on word recognition has mostly been reported in prior studies on the automatic vigilance for negative stimuli, but the precise nature of this relationship is unclear. Various models of automatic vigilance have claimed that the effect of valence on response times is categorical, an inverted-U, or interactive with arousal. The present study used a sample of 12,658 words, and included many lexical and semantic control factors, to determine the precise nature of the effects of arousal and valence on word recognition. Converging empirical patterns observed in word-level and trial-level data from lexical decision and naming indicate that valence and arousal exert independent monotonic effects: Negative words are recognized more slowly than positive words, and arousing words are recognized more slowly than calming words. Valence explained about 2% of the variance in word recognition latencies, whereas the effect of arousal was smaller. Valence and arousal do not interact, but both interact with word frequency, such that valence and arousal exert larger effects among low-frequency words than among high-frequency words. These results necessitate a new model of affective word processing whereby the degree of negativity monotonically and independently predicts the speed of responding. This research also demonstrates that incorporating emotional factors, especially valence, improves the performance of models of word recognition. PMID:24490848
Recognition memory and awareness: A high-frequency advantage in the accuracy of knowing.
Gregg, Vernon H; Gardiner, John M; Karayianni, Irene; Konstantinou, Ira
2006-04-01
The well-established advantage of low-frequency words over high-frequency words in recognition memory has been found to occur in remembering and not knowing. Two experiments employed remember and know judgements, and divided attention to investigate the possibility of an effect of word frequency on know responses given appropriate study conditions. With undivided attention at study, the usual low-frequency advantage in the accuracy of remember responses, but no effect on know responses, was obtained. Under a demanding divided attention task at encoding, a high-frequency advantage in the accuracy of know responses was obtained. The results are discussed in relation to theories of knowing, particularly those incorporating perceptual and conceptual fluency.
A Method for Correcting Broken Hyphenations in Noisy English Text
2012-04-01
words, such as a frequency list. An algorithm that would make use of word validation, taking into account the various usages of hyphens in English, is...commas, and question marks from the surrounding words. The British National Corpus (2) (BNC) frequency list was used to perform the validation...rather than a separate spell checking program. This was primarily because implementation of the algorithm using a frequency list was quite trivial
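The validation idea in this excerpt can be illustrated with a minimal Python sketch. It is not the report's algorithm: the tiny `FREQ` dictionary is a hypothetical stand-in for a frequency list such as the BNC's, and the function name `rejoin` and its decision rule are assumptions made for illustration only.

```python
import string

# Hypothetical stand-in for a real frequency list (word -> corpus count).
FREQ = {"example": 120, "well": 900, "known": 700, "well-known": 50, "text": 400}

def rejoin(left, right, freq=FREQ):
    """Rejoin a word split as 'left-' / 'right' across a line break.

    The hyphen is kept when the hyphenated form is attested more often than
    the joined form; if neither form is attested, it is kept only when both
    halves are valid words on their own.
    """
    stem = left.rstrip("-")
    word = right.rstrip(string.punctuation)   # strip trailing commas, question marks, etc.
    tail = right[len(word):]                  # punctuation to restore afterwards
    joined = stem + word
    hyphenated = stem + "-" + word
    j = freq.get(joined.lower(), 0)
    h = freq.get(hyphenated.lower(), 0)
    if j == 0 and h == 0:
        keep_hyphen = stem.lower() in freq and word.lower() in freq
    else:
        keep_hyphen = h > j
    return (hyphenated if keep_hyphen else joined) + tail
```

With this toy list, a break such as "exam-" / "ple," is joined into one word, while "well-" / "known" keeps its hyphen because only the hyphenated form is attested.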
ERIC Educational Resources Information Center
Goodman, Kenneth S.; Bird, Lois Bridges
Analyzing word frequency in six complete texts, a study investigated how vocabulary can be used to define texts. The texts included three stories from 5th and 6th grade readers, selections from literature anthologies for 8th grade and 12th grade students, and a magazine essay for adults. Results indicated that if particular words occur frequently…
ERIC Educational Resources Information Center
Plante, Elena; Bahl, Megha; Vance, Rebecca; Gerken, LouAnn
2011-01-01
Phonotactic frequency effects on word production are thought to reflect accumulated experience with a language. Here we demonstrate that frequency effects can also be obtained through short-term manipulations of the input to children. We presented children with nonwords in an experiment that systematically manipulated English phonotactic frequency…
Ratner, Nan Bernstein; Newman, Rochelle; Strekas, Amy
2009-12-01
In a prior study (Newman & Bernstein Ratner, 2007), we examined the effects of word frequency and phonological neighborhood characteristics on confrontation naming latency, accuracy and fluency in adults who stutter and typically fluent speakers. A small difference in accuracy favoring fluent adults was noted, but no other patterns differentiated fluent speaker responses from those obtained from the adults who stutter. Because lexical organization or retrieval differences might be more easily observed in less mature language users, we replicated the experiment using 15 children who stutter (ages 4;10–16;2) and age- and gender-matched peers. Results replicated the earlier study: the two groups of participants showed strikingly similar patterns of responses based on word frequency and neighborhood characteristics. There were also no differences in naming accuracy overall between the two groups. Given our results and those of other researchers who have explored the impact of neighborhood variables on lexical retrieval in people who stutter, we suggest that differences between language production in PWS and fluent speakers are not likely to involve atypical phonological organization of lexical neighborhoods. After reading this article, the reader will be able to: (1) define and illustrate words that have differing frequency and phonological neighborhood characteristics; (2) evaluate whether or not children who stutter appear to organize their mental lexicons differently than those of children who are typically fluent; (3) suggest future areas of research into language processing in people who stutter.
Chang, Xing; Zhou, Xin; Luo, Linzhi; Yang, Chengjia; Pan, Hui; Zhang, Shuyang
2017-09-12
This study aimed to identify hotspots in research on clinical competence measurements from 2012 to 2016. The authors retrieved literature published between 2012 and 2016 from PubMed using selected medical subject headings (MeSH) terms. They used BibExcel software to generate high-frequency MeSH terms and identified hotspots by co-word analysis and cluster analysis. The authors searched 588 related articles and identified 31 high-frequency MeSH terms. In addition, they obtained 6 groups of high-frequency MeSH terms that reflected the domain hotspots. This study identified 6 hotspots of domain research, including studies on influencing factors and perception evaluation, improving and developing measurement tools, feedback measurement, measurement approaches based on computer simulation, the measurement of specific students in different learning phases, and the measurement of students' communication ability. All of these research topics could provide useful information for educators and researchers to continually conduct in-depth studies.
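The core of a co-word analysis, counting how often high-frequency terms are assigned to the same article, can be sketched in a few lines of Python. This is only an illustration of the general technique, not the BibExcel workflow the authors used; the function name `coword_counts`, the `min_freq` threshold, and the input layout (one list of terms per article) are assumptions.

```python
from collections import Counter
from itertools import combinations

def coword_counts(records, min_freq=2):
    """Count term frequencies and pairwise co-occurrences.

    records: one list of indexing terms (e.g. MeSH headings) per article.
    Only terms appearing in at least `min_freq` articles are paired.
    """
    # Number of articles in which each term appears
    term_freq = Counter(t for rec in records for t in set(rec))
    frequent = {t for t, c in term_freq.items() if c >= min_freq}
    pairs = Counter()
    for rec in records:
        kept = sorted(set(rec) & frequent)      # sorted so each pair has one canonical order
        pairs.update(combinations(kept, 2))
    return term_freq, pairs
```

The resulting pair counts form the co-occurrence matrix on which a cluster analysis would then be run.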
Schotter, Elizabeth R; Bicknell, Klinton; Howard, Ian; Levy, Roger; Rayner, Keith
2014-04-01
It is well-known that word frequency and predictability affect processing time. These effects change magnitude across tasks, but studies testing this use tasks with different response types (e.g., lexical decision, naming, and fixation time during reading; Schilling, Rayner, & Chumbley, 1998), preventing direct comparison. Recently, Kaakinen and Hyönä (2010) overcame this problem, comparing fixation times in reading for comprehension and proofreading, showing that the frequency effect was larger in proofreading than in reading. This result could be explained by readers exhibiting substantial cognitive flexibility, and qualitatively changing how they process words in the proofreading task in a way that magnifies effects of word frequency. Alternatively, readers may not change word processing so dramatically, and instead may perform more careful identification generally, increasing the magnitude of many word processing effects (e.g., both frequency and predictability). We tested these possibilities with two experiments: subjects read for comprehension and then proofread for spelling errors (letter transpositions) that produce nonwords (e.g., trcak for track as in Kaakinen & Hyönä) or that produce real but unintended words (e.g., trial for trail) to compare how the task changes these effects. Replicating Kaakinen and Hyönä, frequency effects increased during proofreading. However, predictability effects only increased when integration with the sentence context was necessary to detect errors (i.e., when spelling errors produced words that were inappropriate in the sentence; trial for trail). The results suggest that readers adopt sophisticated word processing strategies to accommodate task demands. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Lexical effects on speech production and intelligibility in Parkinson's disease
NASA Astrophysics Data System (ADS)
Chiu, Yi-Fang
Individuals with Parkinson's disease (PD) often have speech deficits that lead to reduced speech intelligibility. Previous research provides a rich database regarding the articulatory deficits associated with PD including restricted vowel space (Skodda, Visser, & Schlegel, 2011) and flatter formant transitions (Tjaden & Wilding, 2004; Walsh & Smith, 2012). However, few studies consider the effect of higher level structural variables of word usage frequency and the number of similar sounding words (i.e. neighborhood density) on lower level articulation or on listeners' perception of dysarthric speech. The purpose of the study is to examine the interaction of lexical properties and speech articulation as measured acoustically in speakers with PD and healthy controls (HC) and the effect of lexical properties on the perception of their speech. Individuals diagnosed with PD and age-matched healthy controls read sentences with words that varied in word frequency and neighborhood density. Acoustic analysis was performed to compare second formant transitions in diphthongs, an indicator of the dynamics of tongue movement during speech production, across different lexical characteristics. Young listeners transcribed the spoken sentences and the transcription accuracy was compared across lexical conditions. The acoustic results indicate that both PD and HC speakers adjusted their articulation based on lexical properties but the PD group had significant reductions in second formant transitions compared to HC. Both groups of speakers increased second formant transitions for words with low frequency and low density, but the lexical effect is diphthong dependent. The change in second formant slope was limited in the PD group when the required formant movement for the diphthong is small. 
The data from listeners' perception of the speech by PD and HC show that listeners identified high frequency words with greater accuracy suggesting the use of lexical knowledge during the recognition process. The relationship between acoustic results and perceptual accuracy is limited in this study suggesting that listeners incorporate acoustic and non-acoustic information to maximize speech intelligibility.
Examining assortativity in the mental lexicon: Evidence from word associations.
Van Rensbergen, Bram; Storms, Gert; De Deyne, Simon
2015-12-01
Words are characterized by a variety of lexical and psychological properties, such as their part of speech, word-frequency, concreteness, or affectivity. In this study, we examine how these properties relate to a word's connectivity in the mental lexicon, the structure containing a person's knowledge of words. In particular, we examine the extent to which these properties display assortative mixing, that is, the extent to which words in the lexicon are more likely to be connected to words that share these properties. We investigated three types of word properties: 1) subjective word covariates: valence, dominance, arousal, and concreteness; 2) lexical information: part of speech; and 3) distributional word properties: age-of-acquisition, word frequency, and contextual diversity. We assessed which of these factors exhibit assortativity using a word association task, where the probability of producing a certain response to a cue is a measure of the associative strength between the cue and response in the mental lexicon. Our results show that the extent to which these aspects exhibit assortativity varies considerably, with a high cue-response correspondence on valence, dominance, arousal, concreteness, and part of speech, indicating that these factors correspond to the words people deem as related. In contrast, we find that cues and responses show only little correspondence on word frequency, contextual diversity, and age-of-acquisition, indicating that, compared to subjective and lexical word covariates, distributional properties exhibit only little assortativity in the mental lexicon. Possible theoretical accounts and implications of these findings are discussed.
How do we use language? Shared patterns in the frequency of word use across 17 world languages
Calude, Andreea S.; Pagel, Mark
2011-01-01
We present data from 17 languages on the frequency with which a common set of words is used in everyday language. The languages are drawn from six language families representing 65 per cent of the world's 7000 languages. Our data were collected from linguistic corpora that record frequencies of use for the 200 meanings in the widely used Swadesh fundamental vocabulary. Our interest is to assess evidence for shared patterns of language use around the world, and for the relationship of language use to rates of lexical replacement, defined as the replacement of a word by a new unrelated or non-cognate word. Frequencies of use for words in the Swadesh list range from just a few per million words of speech to 191 000 or more. The average inter-correlation among languages in the frequency of use across the 200 words is 0.73 (p < 0.0001). The first principal component of these data accounts for 70 per cent of the variance in frequency of use. Elsewhere, we have shown that frequently used words in the Indo-European languages tend to be more conserved, and that this relationship holds separately for different parts of speech. A regression model combining the principal factor loadings derived from the worldwide sample along with their part of speech predicts 46 per cent of the variance in the rates of lexical replacement in the Indo-European languages. This suggests that Indo-European lexical replacement rates might be broadly representative of worldwide rates of change. Evidence for this speculation comes from using the same factor loadings and part-of-speech categories to predict a word's position in a list of 110 words ranked from slowest to most rapidly evolving among 14 of the world's language families. This regression model accounts for 30 per cent of the variance. 
Our results point to a remarkable regularity in the way that human speakers use language, and hint that the words for a shared set of meanings have been slowly evolving and others more rapidly evolving throughout human history. PMID:21357232
Word naming times and psycholinguistic norms for Italian nouns.
Barca, Laura; Burani, Cristina; Arduino, Lisa S
2002-08-01
The present study describes normative measures for 626 Italian simple nouns. The database (LEXVAR.XLS) is freely available for downloading on the Web site http://wwwistc.ip.rm.cnr.it/materia/database/. For each of the 626 nouns, values for the following variables are reported: age of acquisition, familiarity, imageability, concreteness, adult written frequency, child written frequency, adult spoken frequency, number of orthographic neighbors, mean bigram frequency, length in syllables, and length in letters. A classification of lexical stress and of the type of word-initial phoneme is also provided. The intercorrelations among the variables, a factor analysis, and the effects of variables and of the extracted factors on word naming are reported. Naming latencies were affected primarily by a factor including word length and neighborhood size and by a word frequency factor. Neither a semantic factor including imageability, concreteness, and age of acquisition nor a factor defined by mean bigram frequency had significant effects on pronunciation times. These results hold for a language with shallow orthography, like Italian, for which lexical nonsemantic properties have been shown to affect reading aloud. These norms are useful in a variety of research areas involving the manipulation and control of stimulus attributes.
Bronk, Maria; Zwitserlood, Pienie; Bölte, Jens
2013-01-01
We tested current models of morphological processing in reading with data from four visual lexical decision experiments using German compounds and monomorphemic words. Triplets of two semantically transparent noun-noun compounds and one monomorphemic noun were used in Experiments 1a and 1b. Stimuli within a triplet were matched for full-form frequency. The frequency of the compounds' constituents was varied. The compounds of a triplet shared one constituent, while the frequency of the unshared constituent was either high or low, but always higher than full-form frequency. Reactions were faster to compounds with high-frequency constituents than to compounds with low-frequency constituents, while the latter did not differ from the monomorphemic words. This pattern was not influenced by task difficulty, induced by the type of pseudocompounds used. Pseudocompounds were either created by altering letters of an existing compound (easy pseudocompound, Experiment 1a) or by combining two free morphemes into a non-existing, but morphologically legal, compound (difficult pseudocompound, Experiment 1b). In Experiments 2a and 2b, frequency-matched pairs of semantically opaque noun-noun compounds and simple nouns were tested. In Experiment 2a, with easy pseudocompounds (of the same type as in Experiment 1a), a reaction-time advantage for compounds over monomorphemic words was again observed. This advantage disappeared in Experiment 2b, where difficult pseudocompounds were used. Although a dual-route might account for the data, the findings are best understood in terms of decomposition of low-frequency complex words prior to lexical access, followed by processing costs due to the recombination of morphemes for meaning access. These processing costs vary as a function of intrinsic factors such as semantic transparency, or external factors such as the difficulty of the experimental task. PMID:23986731
The effect of character contextual diversity on eye movements in Chinese sentence reading.
Chen, Qingrong; Zhao, Guoxia; Huang, Xin; Yang, Yiming; Tanenhaus, Michael K
2017-12-01
Chen, Huang, et al. (Psychonomic Bulletin & Review, 2017) found that when reading two-character Chinese words embedded in sentence contexts, contextual diversity (CD), a measure of the proportion of texts in which a word appears, affected fixation times to words. When CD is controlled, however, frequency did not affect reading times. Two experiments used the same experimental designs to examine whether there are frequency effects of the first character of two-character words when CD is controlled. In Experiment 1, yoked triples of characters from a control group, a group matched for character CD that is lower in frequency, and a group matched in frequency with the control group, but higher in character CD, were rotated through the same sentence frame. In Experiment 2 each character from a larger set was embedded in a separate sentence frame, allowing for a larger difference in log frequency compared to Experiment 1 (0.8 and 0.4, respectively). In both experiments, early and later eye movement measures were significantly shorter for characters with higher CD than for characters with lower CD, with no effects of character frequency. These results place constraints on models of visual word recognition and suggest ways in which Chinese can be used to tease apart the nature of context effects in word recognition and language processing in general.
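The distinction between frequency and contextual diversity (CD) that this abstract relies on can be made concrete with a short Python sketch. It assumes whitespace tokenisation and a corpus given as a list of text strings; the function name `frequency_and_cd` is hypothetical, and real Chinese corpora would need a proper character or word segmenter.

```python
from collections import Counter

def frequency_and_cd(docs):
    """Return {token: (raw frequency, contextual diversity)}.

    Frequency counts every occurrence; CD is the proportion of documents
    that contain the token at least once.
    """
    freq = Counter()
    doc_count = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        freq.update(tokens)            # every occurrence counts
        doc_count.update(set(tokens))  # each document counts once
    n = len(docs)
    return {w: (freq[w], doc_count[w] / n) for w in freq}
```

In a toy corpus such as `["the cat sat", "the dog ran", "cat cat cat"]`, "cat" is twice as frequent as "the" yet both appear in the same share of documents, which is the kind of dissociation the experiments exploit when matching items on one measure while varying the other.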
Exploring Native and Non-Native Intuitions of Word Frequency.
ERIC Educational Resources Information Center
Schmitt, Norbert; Dunham, Bruce
1999-01-01
Asked native and nonnative speakers to give judgments of frequency for near synonyms in second-language lexical sets and compared those responses to modern corpus word counts. Native speakers were able to discern the core word in lexical sets either 77% or 85%, and nonnative speakers at 71% or 79%. (Author/VWL)
The Effect of Frequency of Input-Enhancements on Word Learning and Text Comprehension
ERIC Educational Resources Information Center
Rott, Susanne
2007-01-01
Research on second language lexical development during reading has found positive effects for word frequency, the provision of glosses, and elaborative word processing. However, findings have been inconclusive regarding the effect of such intervention tasks on long-term retention. Likewise, few studies have looked at the cumulative effect of…
The Left Fusiform Area Is Affected by Written Frequency of Words
ERIC Educational Resources Information Center
Proverbio, Alice M.; Zani, Alberto; Adorni, Roberta
2008-01-01
The recent neuroimaging literature gives conflicting evidence about whether the left fusiform gyrus (FG) might recognize words as unitary visual objects. The sensitivity of the left FG to word frequency might provide a neural basis for the orthographic input lexicon theorized by reading models [Patterson, K., Marshall, J. C., & Coltheart, M.…
[A co-word analysis of current research on neonatal jaundice].
Bao, Shan; Yang, Xiao-Yan; Tang, Jun; Wu, Jin-Lin; Mu, De-Zhi
2014-08-01
To investigate the research on neonatal jaundice in recent years by co-word analysis and to summarize the hot spots and trend of research in this field in China. The CNKI was searched with "neonate" and "jaundice" as the key words to identify the papers published from January 2009 to July 2013 that were in accordance with strict inclusion and exclusion criteria. To reveal the relationship between different high-frequency key words, Microsoft Office Excel 2013 was used for statistical analysis of key words, and Ucinet 6.0 and Netdraw were used for co-occurrence analysis. A total of 2 054 papers were included, and 44 high-frequency key words were extracted. The current hotspots of research on neonatal jaundice in China were displayed, and the relationship between different high-frequency key words was presented. There has been in-depth research on clinical manifestations and diagnosis of neonatal jaundice in China, but further research is needed to investigate the etiology, mechanism, and treatment of neonatal jaundice.
Not All Reading Disabilities Are Dyslexia: Distinct Neurobiology of Specific Comprehension Deficits
Clements-Stephens, Amy; Pugh, Kenneth R.; Burns, Scott; Cao, Aize; Pekar, James J.; Davis, Nicole; Rimrodt, Sheryl L.
2013-01-01
Although an extensive literature exists on the neurobiological correlates of dyslexia (DYS), to date, no studies have examined the neurobiological profile of those who exhibit poor reading comprehension despite intact word-level abilities (specific reading comprehension deficits [S-RCD]). Here we investigated the word-level abilities of S-RCD as compared to typically developing readers (TD) and those with DYS by examining the blood oxygenation-level dependent response to words varying in frequency. Understanding whether S-RCD process words in the same manner as TD, or show alternate pathways to achieve normal word-reading abilities, may provide insights into the origin of this disorder. Results showed that as compared to TD, DYS showed abnormal covariance during word processing with right-hemisphere homologs of the left-hemisphere reading network in conjunction with left occipitotemporal underactivation. In contrast, S-RCD showed an intact neurobiological response to word stimuli in occipitotemporal regions (associated with fast and efficient word processing); however, inferior frontal gyrus (IFG) abnormalities were observed. Specifically, TD showed a higher-percent signal change within right IFG for low-versus-high frequency words as compared to both S-RCD and DYS. Using psychophysiological interaction analyses, a coupling-by-reading group interaction was found in right IFG for DYS, as indicated by a widespread greater covariance between right IFG and right occipitotemporal cortex/visual word-form areas, as well as bilateral medial frontal gyrus, as compared to TD. For S-RCD, the context-dependent functional interaction anomaly was most prominently seen in left IFG, which covaried to a greater extent with hippocampal, parahippocampal, and prefrontal areas than for TD for low- as compared to high-frequency words.
Given the greater lexical access demands of low frequency as compared to high-frequency words, these results may suggest specific weaknesses in accessing lexical-semantic representations during word recognition. These novel findings provide foundational insights into the nature of S-RCD, and set the stage for future investigations of this common, but understudied, reading disorder. PMID:23273430
The Processing of Singular and Plural Nouns in French and English
ERIC Educational Resources Information Center
New, Boris; Brysbaert, Marc; Segui, Juan; Ferrand, Ludovic; Rastle, Kathleen
2004-01-01
Contradictory data have been obtained about the processing of singular and plural nouns in Dutch and English. Whereas the Dutch findings point to an influence of the base frequency of the singular and the plural word forms on lexical decision times (Baayen, Dijkstra, & Schreuder, 1997), the English reaction times depend on the surface frequency of…
Effects of semantic neighborhood density in abstract and concrete words.
Reilly, Megan; Desai, Rutvik H
2017-12-01
Concrete and abstract words are thought to differ along several psycholinguistic variables, such as frequency and emotional content. Here, we consider another variable, semantic neighborhood density, which has received much less attention, likely because semantic neighborhoods of abstract words are difficult to measure. Using a corpus-based method that creates representations of words that emphasize featural information, the current investigation explores the relationship between neighborhood density and concreteness in a large set of English nouns. Two important observations emerge. First, semantic neighborhood density is higher for concrete than for abstract words, even when other variables are accounted for, especially for smaller neighborhood sizes. Second, the effects of semantic neighborhood density on behavior are different for concrete and abstract words. Lexical decision reaction times are fastest for words with sparse neighborhoods; however, this effect is stronger for concrete words than for abstract words. These results suggest that semantic neighborhood density plays a role in the cognitive and psycholinguistic differences between concrete and abstract words, and should be taken into account in studies involving lexical semantics. Furthermore, the pattern of results with the current feature-based neighborhood measure is very different from that with associatively defined neighborhoods, suggesting that these two methods should be treated as separate measures rather than two interchangeable measures of semantic neighborhoods. Copyright © 2017 Elsevier B.V. All rights reserved.
Mind the gap: Increased inter-letter spacing as a means of improving reading performance.
Dotan, Shahar; Katzir, Tami
2018-06-05
The effects of text display, specifically within-word spacing, on children's reading at different developmental levels have barely been investigated. This study explored the influence of manipulating inter-letter spacing on reading performance (accuracy and rate) of beginner Hebrew readers compared with older readers and of low-achieving readers compared with age-matched high-achieving readers. A computer-based isolated word reading task was performed by 132 first and third graders. Words were displayed under two spacing conditions: standard spacing (100%) and increased spacing (150%). Words were balanced for length and frequency across conditions. Results indicated that increased spacing contributed to reading accuracy without affecting reading rate. Interestingly, all first graders benefitted from the spaced condition. This effect was found only in long words but not in short words. Among third graders, only low-achieving readers gained in accuracy from the spaced condition. The theoretical and clinical effects of the findings are discussed. Copyright © 2018 Elsevier Inc. All rights reserved.
Domahs, Ulrike; Knaus, Johannes A.; El Shanawany, Heba; Wiese, Richard
2014-01-01
This article presents neurolinguistic data on word stress perception in Cairene Arabic, in comparison to previous results on German and Turkish. The main goal is to investigate how central properties of stress systems such as predictability of stress and metrical structure are reflected in the prosodic processing of words. Cairene Arabic is a language with a regular foot-based word stress system, leading to highly predictable placement of word stress. An ERP study on Cairene Arabic is reported, in which a stress violation paradigm is used to investigate the factors predictability of stress and foot structure. The results of the experiment show that for Cairene Arabic the internal structure of prosodic words in terms of feet determines prosodic processing. This structure effect is complemented by a frequency effect for stress patterns. PMID:25374546
Generalized entropies and the similarity of texts
NASA Astrophysics Data System (ADS)
Altmann, Eduardo G.; Dias, Laércio; Gerlach, Martin
2017-01-01
We show how generalized Gibbs-Shannon entropies can provide new insights on the statistical properties of texts. The universal distribution of word frequencies (Zipf’s law) implies that the generalized entropies, computed at the word level, are dominated by words in a specific range of frequencies. Here we show that this is the case not only for the generalized entropies but also for the generalized (Jensen-Shannon) divergences, used to compute the similarity between different texts. This finding allows us to identify the contribution of specific words (and word frequencies) for the different generalized entropies and also to estimate the size of the databases needed to obtain a reliable estimation of the divergences. We test our results in large databases of books (from the Google n-gram database) and scientific papers (indexed by Web of Science).
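As a rough illustration of the quantities named here, the Python sketch below computes an order-α generalized entropy of a word-frequency distribution and the corresponding Jensen-Shannon-type divergence between two texts. The abstract does not give the formulas, so the specific form used, H_α(p) = (Σ_i p_i^α − 1)/(1 − α), which reduces to the Shannon entropy as α → 1, is an assumption, as are the function names and the whitespace tokenisation.

```python
import math
from collections import Counter

def word_probs(text):
    """Relative word frequencies of a text (whitespace tokenisation)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def h_alpha(p, alpha):
    """Generalized entropy of order alpha (assumed form; Shannon at alpha = 1)."""
    if alpha == 1.0:
        return -sum(x * math.log(x) for x in p.values() if x > 0)
    return (sum(x ** alpha for x in p.values() if x > 0) - 1.0) / (1.0 - alpha)

def js_alpha(p, q, alpha):
    """Generalized Jensen-Shannon divergence: H(mixture) minus mean of H(p), H(q)."""
    vocab = set(p) | set(q)
    mix = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}
    return h_alpha(mix, alpha) - 0.5 * h_alpha(p, alpha) - 0.5 * h_alpha(q, alpha)
```

Because each term of the sum is weighted by p_i^α, larger α emphasises frequent words and smaller α emphasises rare ones, which is one way to see why a particular frequency range can dominate the result.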
Brand, Sophie; Ernestus, Mirjam
2018-05-01
In casual conversations, words often lack segments. This study investigates whether listeners rely on their experience with reduced word pronunciation variants during the processing of single segment reduction. We tested three groups of listeners in a lexical decision experiment with French words produced either with or without word-medial schwa (e.g., /ʀəvy/ and /ʀvy/ for revue). Participants also rated the relative frequencies of the two pronunciation variants of the words. If the recognition accuracy and reaction times (RTs) for a given listener group correlate best with the frequencies of occurrence holding for that given listener group, recognition is influenced by listeners' exposure to these variants. Native listeners' relative frequency ratings correlated well with their accuracy scores and RTs. Dutch advanced learners' accuracy scores and RTs were best predicted by their own ratings. In contrast, the accuracy and RTs from Dutch beginner learners of French could not be predicted by any relative frequency rating; the rating task was probably too difficult for them. The participant groups showed behaviour reflecting their difference in experience with the pronunciation variants. Our results strongly suggest that listeners store the frequencies of occurrence of pronunciation variants, and consequently the variants themselves.
Zipf’s Law and the Frequency of Kazak Phonemes in Word Formation
NASA Astrophysics Data System (ADS)
Xin, Ruiqing; Li, Yonghong; Yu, Hongzhi
2018-03-01
Zipf’s Law is the basis of the principle of Least Effort and is widely applicable across natural fields. The frequency of occurrence of each phoneme across all Kazak words was counted to test whether Zipf’s law applies to Kazak. Because of the limited sample size, some deviation is unavoidable, but the overall results indicate that the frequency of occurrence and the reciprocal rank of each phoneme in Kazak word formation are in line with Zipf’s distribution.
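A check of this kind can be sketched in Python by ranking symbols by frequency and fitting a straight line to log frequency against log rank: under Zipf's law, f(r) ∝ 1/r^s with s close to 1. The function name `zipf_exponent` and the ordinary least-squares fit are assumptions for illustration; the abstract does not describe the fitting procedure the authors used.

```python
import math
from collections import Counter

def zipf_exponent(symbols):
    """Estimate s in f(r) ~ C / r**s by least squares on log-log data.

    symbols: iterable of phonemes (or words); needs at least two distinct items.
    """
    counts = sorted(Counter(symbols).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope  # Zipf's law predicts a value near 1
```

With a small phoneme inventory the fit rests on few data points, which is consistent with the abstract's caveat that some deviation is unavoidable.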
ERIC Educational Resources Information Center
Fu, Zhuqin
2006-01-01
To many Chinese students, learning the words such as "make" and "do" seems a piece of cake, yet learning how to use them appropriately is another case. This paper aims to investigate Chinese learners' use of the verbs "make" and "do", two major representatives of high-frequency words from the perspective of…
Maintenance Rehearsal: The Key to the Role Attention Plays in Storage and Forgetting
ERIC Educational Resources Information Center
McFarlane, Kimberley A.; Humphreys, Michael S.
2012-01-01
Research with the maintenance-rehearsal paradigm, in which word pairs are rehearsed as distractor material during a series of digit recall trials, has previously indicated that low frequency and new word pairs capture attention to a greater degree than high frequency and old word pairs. This impacts delayed recognition of the pairs and interferes…
ERIC Educational Resources Information Center
Kretzschmar, Franziska; Schlesewsky, Matthias; Staub, Adrian
2015-01-01
Two very reliable influences on eye fixation durations in reading are word frequency, as measured by corpus counts, and word predictability, as measured by cloze norming. Several studies have reported strictly additive effects of these 2 variables. Predictability also reliably influences the amplitude of the N400 component in event-related…
ERIC Educational Resources Information Center
Engels, L.K.
1968-01-01
The greatest fallacy of word counts, the author maintains, lies in the fact that advocates of frequency lists stress the high percentage without telling the whole truth. It has become common to pretend that a frequency list of 3,000 words covers 95 percent of the language, that it enables a person to speak and understand a foreign language by…
Mainela-Arnold, Elina; Evans, Julia L.; Coady, Jeffry
2010-01-01
Purpose This study investigated the impact of lexical processes on target word recall in sentence span tasks in children with and without specific language impairment (SLI). Method Participants were 42 children (ages 8;2–12;3), 21 with SLI and 21 typically developing peers matched on age and nonverbal IQ. Children completed a sentence span task where target words to be recalled varied in word frequency and neighborhood density. Two measures of lexical processes were examined, the number of non-target competitor words activated during a gating task (lexical cohort competition) and word definitions. Results Neighborhood density had no effect on word recall for either group. However, both groups recalled significantly more high than low frequency words. Lexical cohort competition and specificity of semantic representations accounted for unique variance in the number of target words recalled in the SLI and CA groups combined. Conclusions Performance on verbal working memory span tasks for both SLI and CA children is influenced by word frequency, lexical cohorts, and semantic representations. Future studies need to examine the extent to which verbal working memory capacity is a cognitive construct independent of extant language knowledge representations. PMID:20705747
Rinaldi, Pasquale; Barca, Laura; Burani, Cristina
2004-08-01
The CFVlexvar.xls database includes imageability, frequency, and grammatical properties of the first words acquired by Italian children. For each of 519 words that are known by children 18-30 months of age (taken from Caselli & Casadio's, 1995, Italian version of the MacArthur Communicative Development Inventory), new values of imageability are provided and values for age of acquisition, child written frequency, and adult written and spoken frequency are included. In this article, correlations among the variables are discussed and the words are grouped into grammatical categories. The results show that words acquired early have imageable referents, are frequently used in the texts read and written by elementary school children, and are frequent in adult written and spoken language. Nouns are acquired earlier and are more imageable than both verbs and adjectives. The composition in grammatical categories of the child's first vocabulary reflects the composition of adult vocabulary. The full set of these norms can be downloaded from www.psychonomic.org/archive/.
The role of low-spatial frequencies in lexical decision and masked priming.
Boden, C; Giaschi, D
2009-04-01
Spatial frequency filtering was used to test the hypotheses that low-spatial frequency information in printed text can: (1) lead to a rapid lexical decision or (2) facilitate word recognition. Adult proficient readers made lexical decisions in unprimed and masked repetition priming experiments with unfiltered, low-pass, high-pass and notch filtered letter strings. In the unprimed experiments, a filtered target was presented for 105 or 400 ms followed by a pattern mask. Sensitivity (d') was lowest for the low-pass filtered targets at both durations with a bias towards a 'non-word' response. Sensitivity was higher in the high-pass and notch filter conditions. In the priming experiments, a forward mask was followed by a filtered prime then an unfiltered target. Primed words, but not non-words, were identified faster than unprimed words in both the low-pass and high-pass filtered conditions. These results do not support a unique role for low-spatial frequency information in either facilitating or making rapid lexical decisions.
Utterance complexity and stuttering on function words in preschool-age children who stutter.
Richels, Corrin; Buhr, Anthony; Conture, Edward; Ntourou, Katerina
2010-09-01
The purpose of the present investigation was to examine the relation between utterance complexity and utterance position and the tendency to stutter on function words in preschool-age children who stutter (CWS). Two separate studies involving two different groups of participants (Study 1, n=30; Study 2, n=30) were conducted. Participants were preschool-age CWS between the ages of 3;0 and 5;11 who engaged in 15-20 min parent-child conversational interactions. From audio-video recordings of each interaction, every child utterance of each parent-child sample was transcribed. From these transcripts, for each participant, measures of language (e.g., length and complexity) and measures of stuttering (e.g., word type and utterance position) were obtained. Results of Study 1 indicated that children stuttered more frequently on function words, but that this tendency was not greater for complex than simple utterances. Results of Study 2, involving the assessment of utterance position and MLU quartile, indicated that stuttering was more likely to occur with increasing sentence length, and that stuttering tended to occur at the utterance-initial position, the position where function words were also more likely to occur. Findings were taken to suggest that, although word-level influences cannot be discounted, utterance-level influences contribute to the loci of stuttering in preschool-age children, and may help account for developmental changes in the loci of stuttering. The reader will learn about and be able to: (a) describe the influence of word type (function versus content words), and grammatical complexity, on disfluent speech; (b) compare the effect of stuttering frequency based on the position of the word in the utterance; (c) discuss the contribution of utterance position on the frequency of stuttering on function words; and (d) explain possible reasons why preschoolers stutter more frequently on function words than content words.
ERIC Educational Resources Information Center
Caramazza, Alfonso; Bi, Yanchao; Costa, Albert; Miozzo, Michelle
2004-01-01
A. Caramazza, A. Costa, M. Miozzo, and Y. Bi (2001) reported a series of experiments showing that naming latencies for homophones are determined by specific-word frequency (e.g., frequency of nun) and not homophone frequency (frequency of nun + none). J. D. Jescheniak, A. S. Meyer, and W. J. M. Levelt (2003) have challenged these studies on a…
Genes2WordCloud: a quick way to identify biological themes from gene lists and free text.
Baroukh, Caroline; Jenkins, Sherry L; Dannenfelser, Ruth; Ma'ayan, Avi
2011-10-13
Word-clouds recently emerged on the web as a solution for quickly summarizing text by maximizing the display of most relevant terms about a specific topic in the minimum amount of space. As biologists are faced with the daunting amount of new research data commonly presented in textual formats, word-clouds can be used to summarize and represent biological and/or biomedical content for various applications. Genes2WordCloud is a web application that enables users to quickly identify biological themes from gene lists and research relevant text by constructing and displaying word-clouds. It provides users with several different options and ideas for the sources that can be used to generate a word-cloud. Different options for rendering and coloring the word-clouds give users the flexibility to quickly generate customized word-clouds of their choice. Genes2WordCloud is a word-cloud generator and a word-cloud viewer that is based on WordCram implemented using Java, Processing, AJAX, mySQL, and PHP. Text is fetched from several sources and then processed to extract the most relevant terms with their computed weights based on word frequencies. Genes2WordCloud is freely available for use online; it is open source software and is available for installation on any web-site along with supporting documentation at http://www.maayanlab.net/G2W. Genes2WordCloud provides a useful way to summarize and visualize large amounts of textual biological data or to find biological themes from several different sources. The open source availability of the software enables users to implement customized word-clouds on their own web-sites and desktop applications.
Genes2WordCloud: a quick way to identify biological themes from gene lists and free text
2011-01-01
Background Word-clouds recently emerged on the web as a solution for quickly summarizing text by maximizing the display of most relevant terms about a specific topic in the minimum amount of space. As biologists are faced with the daunting amount of new research data commonly presented in textual formats, word-clouds can be used to summarize and represent biological and/or biomedical content for various applications. Results Genes2WordCloud is a web application that enables users to quickly identify biological themes from gene lists and research relevant text by constructing and displaying word-clouds. It provides users with several different options and ideas for the sources that can be used to generate a word-cloud. Different options for rendering and coloring the word-clouds give users the flexibility to quickly generate customized word-clouds of their choice. Methods Genes2WordCloud is a word-cloud generator and a word-cloud viewer that is based on WordCram implemented using Java, Processing, AJAX, mySQL, and PHP. Text is fetched from several sources and then processed to extract the most relevant terms with their computed weights based on word frequencies. Genes2WordCloud is freely available for use online; it is open source software and is available for installation on any web-site along with supporting documentation at http://www.maayanlab.net/G2W. Conclusions Genes2WordCloud provides a useful way to summarize and visualize large amounts of textual biological data or to find biological themes from several different sources. The open source availability of the software enables users to implement customized word-clouds on their own web-sites and desktop applications. PMID:21995939
The span of correlations in dolphin whistle sequences
NASA Astrophysics Data System (ADS)
Ferrer-i-Cancho, Ramon; McCowan, Brenda
2012-06-01
Long-range correlations are found in symbolic sequences from human language, music and DNA. Determining the span of correlations in dolphin whistle sequences is crucial for shedding light on their communicative complexity. Dolphin whistles share various statistical properties with human words, i.e. Zipf's law for word frequencies (namely that the probability of the ith most frequent word of a text is about i^(-α)) and a parallel of the tendency of more frequent words to have more meanings. The finding of Zipf's law for word frequencies in dolphin whistles has been the topic of an intense debate on its implications. One of the major arguments against the relevance of Zipf's law in dolphin whistles is that it is not possible to distinguish the outcome of a die-rolling experiment from that of a linguistic or communicative source producing Zipf's law for word frequencies. Here we show that statistically significant whistle-whistle correlations extend back to the second previous whistle in the sequence, using a global randomization test, and to the fourth previous whistle, using a local randomization test. None of these correlations are expected by a die-rolling experiment and other simple explanations of Zipf's law for word frequencies, such as Simon's model, that produce sequences of unpredictable elements.
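To make the idea of a global randomization test concrete, here is a minimal sketch. It assumes mutual information between symbols a fixed distance apart as the correlation statistic and a full shuffle of the sequence as the null model. The abstract does not specify the statistic, so both choices, the function names, and the toy sequence are illustrative assumptions. The local randomization test mentioned in the abstract is not implemented here.

```python
import math
import random
from collections import Counter

def mutual_information(seq, lag):
    """Mutual information (bits) between symbols `lag` positions apart."""
    pairs = list(zip(seq, seq[lag:]))
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return sum((c / n) * math.log2(c * n / (left[a] * right[b]))
               for (a, b), c in joint.items())

def global_randomization_p(seq, lag, shuffles=200, seed=0):
    """Share of whole-sequence shuffles whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = mutual_information(seq, lag)
    shuffled = list(seq)
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(shuffled)  # destroys order, keeps symbol frequencies
        if mutual_information(shuffled, lag) >= observed:
            hits += 1
    return hits / shuffles

# Hypothetical toy sequence with a strong dependency on the previous symbol.
toy = list("AB" * 100)
p_value = global_randomization_p(toy, lag=1)
```

Because shuffling preserves the symbol frequencies, a die-rolling source with the same frequencies would behave like the shuffled sequences, which is why a small p-value counts as evidence against that explanation.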
Angelelli, Paola; Marinelli, Chiara Valeria; Putzolu, Anna; Notarnicola, Alessandra; Iaia, Marika; Burani, Cristina
2018-03-01
We examined how whole-word lexical information and knowledge of distributional properties of orthography interact in children's spelling. High- versus low-frequency words, which included inconsistently spelled segments occurring more or less frequently in the orthography, were used in two experiments: (a) word spelling; (b) lexical priming of pseudoword spelling. Participants were 1st-, 2nd-, and 4th-grade Italian children. Word spelling showed sensitivity to the distributional properties of orthography in all children: accuracy in spelling uncommon transcription segments emerged progressively as a function of word frequency and schooling. Lexical priming effects emerged as a function of age. When related primes contained an uncommon segment, 2nd- and 4th-graders preferred uncommon segments to common ones in spelling target pseudowords, thus inverting the response trend found in the control condition. A smaller but significant effect was present in 1st-graders, who, unlike 2nd- and 4th-graders, still preferred common segments, only slightly increasing the use of uncommon ones. A larger priming effect emerged for high-frequency primes than low-frequency ones. Results indicate that children learning to spell in a transparent orthography are sensitive to the distributional properties of the orthography. However, whole-word lexical representations are also used, with larger effects in more skilled pupils.
Dissociating Visual Form from Lexical Frequency Using Japanese
ERIC Educational Resources Information Center
Twomey, Tae; Duncan, Keith J. Kawabata; Hogan, John S.; Morita, Kenji; Umeda, Kazumasa; Sakai, Katsuyuki; Devlin, Joseph T.
2013-01-01
In Japanese, the same word can be written in either morphographic Kanji or syllabographic Hiragana and this provides a unique opportunity to disentangle a word's lexical frequency from the frequency of its visual form--an important distinction for understanding the neural information processing in regions engaged by reading. Behaviorally,…
How Word Frequency Affects Morphological Processing in Monolinguals and Bilinguals
ERIC Educational Resources Information Center
Lehtonen, Minna; Laine, Matti
2003-01-01
The present study investigated processing of morphologically complex words in three different frequency ranges in monolingual Finnish speakers and Finnish-Swedish bilinguals. By employing a visual lexical decision task, we found a differential pattern of results in monolinguals vs. bilinguals. Monolingual Finns seemed to process low frequency and…
A multistream model of visual word recognition.
Allen, Philip A; Smith, Albert F; Lien, Mei-Ching; Kaut, Kevin P; Canfield, Angie
2009-02-01
Four experiments are reported that test a multistream model of visual word recognition, which associates letter-level and word-level processing channels with three known visual processing streams isolated in macaque monkeys: the magno-dominated (MD) stream, the interblob-dominated (ID) stream, and the blob-dominated (BD) stream (Van Essen & Anderson, 1995). We show that mixing the color of adjacent letters of words does not result in facilitation of response times or error rates when the spatial-frequency pattern of a whole word is familiar. However, facilitation does occur when the spatial-frequency pattern of a whole word is not familiar. This pattern of results is not due to different luminance levels across the different-colored stimuli and the background because isoluminant displays were used. Also, the mixed-case, mixed-hue facilitation occurred when different display distances were used (Experiments 2 and 3), so this suggests that image normalization can adjust independently of object size differences. Finally, we show that this effect persists in both spaced and unspaced conditions (Experiment 4)--suggesting that inappropriate letter grouping by hue cannot account for these results. These data support a model of visual word recognition in which lower spatial frequencies are processed first in the more rapid MD stream. The slower ID and BD streams may process some lower spatial frequency information in addition to processing higher spatial frequency information, but these channels tend to lose the processing race to recognition unless the letter string is unfamiliar to the MD stream--as with mixed-case presentation.
The word-length effect and disyllabic words.
Lovatt, P; Avons, S E; Masterson, J
2000-02-01
Three experiments compared immediate serial recall of disyllabic words that differed on spoken duration. Two sets of long- and short-duration words were selected, in each case maximizing duration differences but matching for frequency, familiarity, phonological similarity, and number of phonemes, and controlling for semantic associations. Serial recall measures were obtained using auditory and visual presentation and spoken and picture-pointing recall. In Experiments 1a and 1b, using the first set of items, long words were better recalled than short words. In Experiments 2a and 2b, using the second set of items, no difference was found between long and short disyllabic words. Experiment 3 confirmed the large advantage for short-duration words in the word set originally selected by Baddeley, Thomson, and Buchanan (1975). These findings suggest that there is no reliable advantage for short-duration disyllables in span tasks, and that previous accounts of a word-length effect in disyllables are based on accidental differences between list items. The failure to find an effect of word duration casts doubt on theories that propose that the capacity of memory span is determined by the duration of list items or the decay rate of phonological information in short-term memory.
Level statistics of words: Finding keywords in literary texts and symbolic sequences
NASA Astrophysics Data System (ADS)
Carpena, P.; Bernaola-Galván, P.; Hackenberg, M.; Coronado, A. V.; Oliver, J. L.
2009-03-01
Using a generalization of the level statistics analysis of quantum disordered systems, we present an approach able to extract automatically keywords in literary texts. Our approach takes into account not only the frequencies of the words present in the text but also their spatial distribution along the text, and is based on the fact that relevant words are significantly clustered (i.e., they self-attract each other), while irrelevant words are distributed randomly in the text. Since a reference corpus is not needed, our approach is especially suitable for single documents for which no a priori information is available. In addition, we show that our method works also in generic symbolic sequences (continuous texts without spaces), thus suggesting its general applicability.
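The clustering idea in this abstract can be sketched with a simple spacing statistic: the standard deviation of the gaps between successive occurrences of a word, divided by the mean gap. This is a simplified proxy chosen for illustration, not necessarily the exact measure the authors use. The function name `spacing_sigma` and the toy token list are hypothetical.

```python
import statistics

def spacing_sigma(tokens, word):
    """Std. dev. of gaps between successive occurrences, divided by the mean gap.

    For a rare word placed at random the ratio is close to 1; values well
    above 1 suggest clustering, and values near 0 suggest even spacing.
    """
    positions = [i for i, t in enumerate(tokens) if t == word]
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    if len(gaps) < 2:
        return None  # too few occurrences to say anything
    return statistics.pstdev(gaps) / statistics.mean(gaps)

# Hypothetical toy text: "the" is evenly spread, "whale" appears in two bursts.
tokens = ["x"] * 200
for i in range(0, 200, 10):
    tokens[i] = "the"
for i in (11, 12, 13, 14, 151, 152, 153, 154):
    tokens[i] = "whale"

sigma_whale = spacing_sigma(tokens, "whale")
sigma_the = spacing_sigma(tokens, "the")
```

Ranking words by a statistic of this kind needs only the document itself, which matches the abstract's point that no reference corpus is required.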
Neurally dissociable cognitive components of reading deficits in subacute stroke.
Boukrina, Olga; Barrett, A M; Alexander, Edward J; Yao, Bing; Graves, William W
2015-01-01
According to cognitive models of reading, words are processed by interacting orthographic (spelling), phonological (sound), and semantic (meaning) information. Despite extensive study of the neural basis of reading in healthy participants, little group data exist on patients with reading deficits from focal brain damage pointing to critical neural systems for reading. Here, we report on one such study. We have performed neuropsychological testing and magnetic resonance imaging on 11 patients with left-hemisphere stroke (≤5 weeks post-stroke). Patients completed tasks assessing cognitive components of reading such as semantics (matching picture or word choices to a target based on meaning), phonology (matching word choices to a target based on rhyming), and orthography (a two-alternative forced choice of the most plausible non-word). They also read aloud pseudowords and words with high or low levels of usage frequency, imageability, and spelling-sound consistency. As predicted by the cognitive model, when averaged across patients, the influence of semantics was most salient for low-frequency, low-consistency words, when phonological decoding is especially difficult. Qualitative subtraction analyses revealed lesion sites specific to phonological processing. These areas were consistent with those shown previously to activate for phonology in healthy participants, including supramarginal, posterior superior temporal, middle temporal, inferior frontal gyri, and underlying white matter. Notable divergence between this analysis and previous functional imaging is the association of lesions in the mid-fusiform gyrus and anterior temporal lobe with phonological reading deficits. This study represents progress toward identifying brain lesion-deficit relationships in the cognitive components of reading. Such correspondences are expected not only to improve understanding of the neural mechanisms of reading but also to help tailor reading therapy to individual neurocognitive deficit profiles.
How Many Is Enough?—Statistical Principles for Lexicostatistics
Zhang, Menghan; Gong, Tao
2016-01-01
Lexicostatistics has been applied in linguistics to inform phylogenetic relations among languages. There are two important yet not well-studied parameters in this approach: the conventional size of vocabulary list to collect potentially true cognates and the minimum matching instances required to confirm a recurrent sound correspondence. Here, we derive two statistical principles from stochastic theorems to quantify these parameters. These principles validate the practice of using the Swadesh 100- and 200-word lists to indicate degree of relatedness between languages, and enable a frequency-based, dynamic threshold to detect recurrent sound correspondences. Using statistical tests, we further evaluate the generality of the Swadesh 100-word list compared to the Swadesh 200-word list and other 100-word lists sampled randomly from the Swadesh 200-word list. All these provide mathematical support for applying lexicostatistics in historical and comparative linguistics. PMID:28018261
Eye movements when reading sentences with handwritten words.
Perea, Manuel; Marcet, Ana; Uixera, Beatriz; Vergara-Martínez, Marta
2016-10-17
The examination of how we read handwritten words (i.e., the original form of writing) has typically been disregarded in the literature on reading. Previous research using word recognition tasks has shown that lexical effects (e.g., the word-frequency effect) are magnified when reading difficult handwritten words. To examine this issue in a more ecological scenario, we registered the participants' eye movements when reading handwritten sentences that varied in the degree of legibility (i.e., sentences composed of words in easy vs. difficult handwritten style). For comparison purposes, we included a condition with printed sentences. Results showed a larger reading cost for sentences with difficult handwritten words than for sentences with easy handwritten words, which in turn showed a reading cost relative to the sentences with printed words. Critically, the effect of word frequency was greater for difficult handwritten words than for easy handwritten words or printed words in the total times on a target word, but not on first-fixation durations or gaze durations. We examine the implications of these findings for models of eye movement control in reading.
A dual-task investigation of automaticity in visual word processing
NASA Technical Reports Server (NTRS)
McCann, R. S.; Remington, R. W.; Van Selst, M.
2000-01-01
An analysis of activation models of visual word processing suggests that frequency-sensitive forms of lexical processing should proceed normally while unattended. This hypothesis was tested by having participants perform a speeded pitch discrimination task followed by lexical decisions or word naming. As the stimulus onset asynchrony between the tasks was reduced, lexical-decision and naming latencies increased dramatically. Word-frequency effects were additive with the increase, indicating that frequency-sensitive processing was subject to postponement while attention was devoted to the other task. Either (a) the same neural hardware shares responsibility for lexical processing and central stages of choice reaction time task processing and cannot perform both computations simultaneously, or (b) lexical processing is blocked in order to optimize performance on the pitch discrimination task. Either way, word processing is not as automatic as activation models suggest.
Fond, Guillaume; Gaman, Alexandru; Brunel, Lore; Haffen, Emmanuel; Llorca, Pierre-Michel
2015-08-30
Two studies have shown that increasing the consultation of the word "suicide" in the Google search engine was associated with a subsequent increase in the prevalence of suicide attempts. The main goal of this article was to explore the trends generated by a key-word search associated with suicide, depression and bipolarity in an attempt to identify general trends (disorders epidemics in the population/"real events" vs newsworthy advertisement/"media event"). Based on previous studies, the frequency of the search words "how to suicide" and "commit suicide" were analyzed for suicide, as well as "depression" (for depressive disorders) and "bipolar disorder". Together, these analyses suggest that the search for the words "how to suicide" or "commit suicide" on the Google search engine may be a good indicator for suicide prevention policies. However, the tool is not developed enough to date to be used as a real time dynamic indicator of suicide epidemics. The frequency of the search for the word "suicide" was associated with those for "depression" but not for "bipolar disorder", but searches for psychiatric conditions seem to be influenced by media events more than by real events in the general population. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
ERIC Educational Resources Information Center
Paquot, Magali
2017-01-01
This study investigated French and Spanish EFL (English as a foreign language) learners' preferred use of three-word lexical bundles with discourse or stance-oriented function with a view to exploring the role of first language (L1) frequency effects in foreign language acquisition. Word combinations were extracted from learner performance data…
ERIC Educational Resources Information Center
Ghatala, Elizabeth S.; And Others
This study applied a frequency theory to measure the superiority of pictures over words in both discrimination learning and recognition memory tasks. Three groups of sixth grade students were given separate instructions before viewing slides of either common objects or words. The first group (control) was asked to study the items shown, the second…
ERIC Educational Resources Information Center
Bonvillian, John D.; And Others
1987-01-01
The relationship between sign language rehearsal and written free recall was examined by having deaf college students rehearse the sign language equivalents of printed English words. Studies of both immediate and delayed memory suggested that word recall increased as a function of total rehearsal frequency and frequency of appearance in rehearsal…
ERIC Educational Resources Information Center
Balota, David A.; Aschenbrenner, Andrew J.; Yap, Melvin J.
2013-01-01
A counterintuitive and theoretically important pattern of results in the visual word recognition literature is that both word frequency and stimulus quality produce large but additive effects in lexical decision performance. The additive nature of these effects has recently been called into question by Masson and Kliegl (in press), who used linear…
ERIC Educational Resources Information Center
Reingold, Eyal M.; Reichle, Erik D.; Glaholt, Mackenzie G.; Sheridan, Heather
2012-01-01
Participants' eye movements were monitored in an experiment that manipulated the frequency of target words (high vs. low) as well as their availability for parafoveal processing during fixations on the pre-target word (valid vs. invalid preview). The influence of the word-frequency by preview validity manipulation on the distributions of first…
Scaling laws and fluctuations in the statistics of word frequencies
NASA Astrophysics Data System (ADS)
Gerlach, Martin; Altmann, Eduardo G.
2014-11-01
In this paper, we combine statistical analysis of written texts and simple stochastic models to explain the appearance of scaling laws in the statistics of word frequencies. The average vocabulary of an ensemble of fixed-length texts is known to scale sublinearly with the total number of words (Heaps’ law). Analyzing the fluctuations around this average in three large databases (Google-ngram, English Wikipedia, and a collection of scientific articles), we find that the standard deviation scales linearly with the average (Taylor's law), in contrast to the prediction of decaying fluctuations obtained using simple sampling arguments. We explain both scaling laws (Heaps’ and Taylor's) by modeling the usage of words using a Poisson process with a fat-tailed distribution of word frequencies (Zipf's law) and topic-dependent frequencies of individual words (as in topic models). Considering topical variations leads to quenched averages, turns the vocabulary size into a non-self-averaging quantity, and explains the empirical observations. For the numerous practical applications relying on estimations of vocabulary size, our results show that uncertainties remain large even for long texts. We show how to account for these uncertainties in measurements of lexical richness of texts with different lengths.
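The sublinear vocabulary growth (Heaps' law) mentioned in this abstract can be illustrated with the simple-sampling baseline: tokens drawn independently from a Zipf-distributed vocabulary. The sketch deliberately leaves out the topic-dependent frequencies that the authors add to explain Taylor's law. The vocabulary size, exponent, and function name are illustrative assumptions.

```python
import random

def vocabulary_growth(checkpoints, n_types=50_000, exponent=1.0, seed=0):
    """Count distinct types seen after each checkpoint number of tokens."""
    rng = random.Random(seed)
    # Zipf weights: the type of rank r is drawn with probability proportional to r**-exponent.
    weights = [1.0 / rank ** exponent for rank in range(1, n_types + 1)]
    tokens = rng.choices(range(n_types), weights=weights, k=max(checkpoints))
    targets = set(checkpoints)
    seen, sizes = set(), {}
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        if i in targets:
            sizes[i] = len(seen)
    return sizes

sizes = vocabulary_growth((1_000, 10_000, 100_000))
```

Under these assumptions a tenfold increase in text length yields clearly less than a tenfold increase in vocabulary. Repeating the run with different seeds gives a feel for the fluctuations, which in this baseline stay small relative to the mean, unlike the linear scaling the abstract reports for real corpora.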
A novel speech processing algorithm based on harmonicity cues in cochlear implant
NASA Astrophysics Data System (ADS)
Wang, Jian; Chen, Yousheng; Zhang, Zongping; Chen, Yan; Zhang, Weifeng
2017-08-01
This paper proposed a novel speech processing algorithm for cochlear implants, which used harmonicity cues to enhance tonal information in Mandarin Chinese speech recognition. The input speech was filtered by a 4-channel band-pass filter bank. The frequency ranges for the four bands were: 300-621, 621-1285, 1285-2657, and 2657-5499 Hz. In each pass band, temporal envelope and periodicity cues (TEPCs) below 400 Hz were extracted by full-wave rectification and low-pass filtering. The TEPCs were modulated by a sinusoidal carrier whose frequency was the harmonic of the fundamental frequency (F0) closest to the center frequency of each band. Signals from each band were combined to obtain the output speech. Mandarin tone, word, and sentence recognition in quiet listening conditions were tested for the extensively used continuous interleaved sampling (CIS) strategy and the novel F0-harmonic algorithm. Results showed that the F0-harmonic algorithm performed consistently better than the CIS strategy in Mandarin tone, word, and sentence recognition. In addition, the sentence recognition rate was higher than the word recognition rate, as a result of contextual information in the sentence. Moreover, tones 3 and 4 were recognized better than tones 1 and 2, due to the easily identified features of the former. In conclusion, the F0-harmonic algorithm could enhance tonal information in cochlear implant speech processing due to the use of harmonicity cues, thereby improving Mandarin tone, word, and sentence recognition. Further study will focus on testing the F0-harmonic algorithm in noisy listening conditions.
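The carrier-selection step can be sketched as follows, using the band edges given in the abstract. The abstract does not say how the center frequency of a band is defined, so the geometric mean of the band edges is an assumption here. The function name and the example F0 of 200 Hz are also hypothetical.

```python
import math

BAND_EDGES_HZ = [300, 621, 1285, 2657, 5499]  # band edges from the abstract

def harmonic_carriers(f0_hz, edges=BAND_EDGES_HZ):
    """For each band, pick the F0 harmonic closest to the band's center."""
    carriers = []
    for lo, hi in zip(edges, edges[1:]):
        center = math.sqrt(lo * hi)  # assumption: geometric-mean center
        harmonic_number = max(1, round(center / f0_hz))
        carriers.append(harmonic_number * f0_hz)
    return carriers

carriers = harmonic_carriers(200.0)  # → [400.0, 800.0, 1800.0, 3800.0]
```

In a complete processor each carrier would be amplitude-modulated by the envelope extracted from its band. That step needs a filter design the abstract does not describe, so it is omitted here.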
Payne, Brennan R.; Lee, Chia-Lin; Federmeier, Kara D.
2015-01-01
The amplitude of the N400— an event-related potential (ERP) component linked to meaning processing and initial access to semantic memory— is inversely related to the incremental build-up of semantic context over the course of a sentence. We revisited the nature and scope of this incremental context effect, adopting a word-level linear mixed-effects modeling approach, with the goal of probing the continuous and incremental effects of semantic and syntactic context on multiple aspects of lexical processing during sentence comprehension (i.e., effects of word frequency and orthographic neighborhood). First, we replicated the classic word position effect at the single-word level: open-class words showed reductions in N400 amplitude with increasing word position in semantically congruent sentences only. Importantly, we found that accruing sentence context had separable influences on the effects of frequency and neighborhood on the N400. Word frequency effects were reduced with accumulating semantic context. However, orthographic neighborhood was unaffected by accumulating context, showing robust effects on the N400 across all words, even within congruent sentences. Additionally, we found that N400 amplitudes to closed-class words were reduced with incrementally constraining syntactic context in sentences that provided only syntactic constraints. Taken together, our findings indicate that modeling word-level variability in ERPs reveals mechanisms by which different sources of information simultaneously contribute to the unfolding neural dynamics of comprehension. PMID:26311477
Payne, Brennan R; Lee, Chia-Lin; Federmeier, Kara D
2015-11-01
The amplitude of the N400, an event-related potential (ERP) component linked to meaning processing and initial access to semantic memory, is inversely related to the incremental buildup of semantic context over the course of a sentence. We revisited the nature and scope of this incremental context effect, adopting a word-level linear mixed-effects modeling approach, with the goal of probing the continuous and incremental effects of semantic and syntactic context on multiple aspects of lexical processing during sentence comprehension (i.e., effects of word frequency and orthographic neighborhood). First, we replicated the classic word-position effect at the single-word level: Open-class words showed reductions in N400 amplitude with increasing word position in semantically congruent sentences only. Importantly, we found that accruing sentence context had separable influences on the effects of frequency and neighborhood on the N400. Word frequency effects were reduced with accumulating semantic context. However, orthographic neighborhood was unaffected by accumulating context, showing robust effects on the N400 across all words, even within congruent sentences. Additionally, we found that N400 amplitudes to closed-class words were reduced with incrementally constraining syntactic context in sentences that provided only syntactic constraints. Taken together, our findings indicate that modeling word-level variability in ERPs reveals mechanisms by which different sources of information simultaneously contribute to the unfolding neural dynamics of comprehension. © 2015 Society for Psychophysiological Research.
Inhoff, Albrecht W; Radach, Ralph; Eiter, Brianna M; Juhasz, Barbara
2003-07-01
Two experiments examined readers' use of parafoveally obtained word length information for word recognition. Both experiments manipulated the length (number of constituent characters) of a parafoveally previewed target word so that it was either accurately or inaccurately specified. In Experiment 1, previews also either revealed or denied useful orthographic information. In Experiment 2, parafoveal targets were either high- or low-frequency words. Eye movement contingent display changes were used to show the intact target upon its fixation. Examination of target viewing duration showed completely additive effects of word length previews and of orthographic previews in Experiment 1, viewing duration being shorter in the accurate-length and the orthographic preview conditions. Experiment 2 showed completely additive effects of word length and word frequency, target viewing being shorter in the accurate-length and the high-frequency conditions. Together these results indicate that functionally distinct subsystems control the use of parafoveally visible spatial and linguistic information in reading. Parafoveally visible spatial information appears to be used for two distinct extralinguistic computations: visual object selection and saccade specification.
The differential role of phonological and distributional cues in grammatical categorisation.
Monaghan, Padraic; Chater, Nick; Christiansen, Morten H
2005-06-01
Recognising the grammatical categories of words is a necessary skill for the acquisition of syntax and for on-line sentence processing. The syntactic and semantic context of the word contribute as cues for grammatical category assignment, but phonological cues, too, have been implicated as important sources of information. The value of phonological and distributional cues has not, with very few exceptions, been empirically assessed. This paper presents a series of analyses of phonological cues and distributional cues and their potential for distinguishing grammatical categories of words in corpus analyses. The corpus analyses indicated that phonological cues were more reliable for less frequent words, whereas distributional information was most valuable for high frequency words. We tested this prediction in an artificial language learning experiment, where the distributional and phonological cues of categories of nonsense words were varied. The results corroborated the corpus analyses. For high-frequency nonwords, distributional information was more useful, whereas for low-frequency words there was more reliance on phonological cues. The results indicate that phonological and distributional cues contribute differentially towards grammatical categorisation.
Randomness versus specifics for word-frequency distributions
NASA Astrophysics Data System (ADS)
Yan, Xiaoyong; Minnhagen, Petter
2016-02-01
The text-length dependence of real word-frequency distributions can be connected to the general properties of a random book. This finding has strong implications when deciding between two conceptually different views on word-frequency distributions: the specific 'Zipf's view' and the non-specific 'Randomness view'. The text-length transformation of a random book has an exact scaling property precisely for the power-law index γ = 1, as opposed to Zipf's exponent γ = 2, and the implication of this exact scaling property is discussed. However, a real text has γ > 1, and as a consequence γ increases when a real text is shortened. The connections to the predictions from the RGF (Random Group Formation) and to the infinite-length limit of a meta-book are also discussed. The difference between 'curve-fitting' and 'predicting' word-frequency distributions is stressed. The question of randomness versus specifics for the distribution of outcomes in sufficiently complex systems has a much wider relevance than just the word-frequency example analyzed in the present work.
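To make the idea of shortening a text concrete, the following Python sketch draws a shorter text at random from a longer one and tabulates how many distinct words occur exactly k times. It is only an illustration under stated assumptions: sampling tokens uniformly without replacement is one plausible reading of the random-book transformation, and the function names are hypothetical, not taken from the paper.

```python
import random
from collections import Counter


def frequency_distribution(tokens):
    """Map k to the number of distinct words that occur exactly k times."""
    word_counts = Counter(tokens)
    return dict(Counter(word_counts.values()))


def shorten_randomly(tokens, new_length, seed=0):
    """Keep new_length tokens chosen uniformly at random, without replacement.

    Assumption: this is used here as a stand-in for the text-length
    transformation of a random book.
    """
    rng = random.Random(seed)
    return rng.sample(list(tokens), new_length)


if __name__ == "__main__":
    text = "the cat saw the dog and the dog saw the cat".split()
    print(frequency_distribution(text))
    print(frequency_distribution(shorten_randomly(text, 6)))
```

Comparing the two printed distributions for a long real text, at several target lengths, is the kind of check the abstract refers to when it discusses how the distribution changes with text length.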
Effect of acute exposure to a complex fragrance on lexical decision performance.
Gaygen, Daniel E; Hedge, Alan
2009-01-01
This study tested the effect of acute exposure to a commercial air freshener, derived from fragrant botanical extracts, at an average concentration of 3.16 mg/m³ total volatile organic compounds on the lexical decision performance of 28 naive participants. Participants attended two 18-min sessions on separate days and were continuously exposed to the fragrance in either the first (F/NF) or second (NF/F) session. Participants were not instructed about the fragrance. Exposure to the fragrance did not affect high-frequency word recognition. However, there was an order of administration effect for low-frequency word recognition accuracy. When the fragrance was administered first before the no-odor control condition, it did not affect accuracy, but when it was administered second after the control condition, it significantly decreased low-frequency word recognition accuracy. Reaction times to low-frequency words were significantly slower than those for high-frequency words, but no effect of either fragrance or order of administration on reaction times was found. The presence of fragrance in the second session apparently served as a distraction that impaired lexical task performance accuracy. The introduction of fragrances into buildings may not necessarily facilitate all aspects of work performance as anticipated.
Word recognition in Alzheimer's disease: Effects of semantic degeneration.
Cuetos, Fernando; Arce, Noemí; Martínez, Carmen; Ellis, Andrew W
2017-03-01
Impairments of word recognition in Alzheimer's disease (AD) have been less widely investigated than impairments affecting word retrieval and production. In particular, we know little about what makes individual words easier or harder for patients with AD to recognize. We used a lexical selection task in which participants were shown sets of four items, each set consisting of one word and three non-words. The task was simply to point to the word on each trial. Forty patients with mild-to-moderate AD were significantly impaired on this task relative to matched controls who made very few errors. The number of patients with AD able to recognize each word correctly was predicted by the frequency, age of acquisition, and imageability of the words, but not by their length or number of orthographic neighbours. Patient Mini-Mental State Examination and phonological fluency scores also predicted the number of words recognized. We propose that progressive degradation of central semantic representations in AD differentially affects the ability to recognize low-imageability, low-frequency, late-acquired words, with the same factors affecting word recognition as affecting word retrieval. © 2015 The British Psychological Society.
Contextual diversity facilitates learning new words in the classroom.
Rosa, Eva; Tapia, José Luis; Perea, Manuel
2017-01-01
In the field of word recognition and reading, it is commonly assumed that frequently repeated words create more accessible memory traces than infrequently repeated words, thus capturing the word-frequency effect. Nevertheless, recent research has shown that a seemingly related factor, contextual diversity (defined as the number of different contexts [e.g., films] in which a word appears), is a better predictor than word-frequency in word recognition and sentence reading experiments. Recent research has shown that contextual diversity plays an important role when learning new words in a laboratory setting with adult readers. In the current experiment, we directly manipulated contextual diversity in a very ecological scenario: at school, when Grade 3 children were learning words in the classroom. The new words appeared in different contexts/topics (high-contextual diversity) or only in one of them (low-contextual diversity). Results showed that words encountered in different contexts were learned and remembered more effectively than those presented in redundant contexts. We discuss the practical (educational [e.g., curriculum design]) and theoretical (models of word recognition) implications of these findings.
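The distinction between word frequency and contextual diversity drawn in this abstract can be shown with a short Python sketch. It follows the abstract's definition (the number of different contexts in which a word appears); the input format, a list of token lists with one list per context, and the function name are assumptions for illustration.

```python
from collections import Counter


def frequency_and_diversity(contexts):
    """Count total occurrences and number of contexts for every word.

    contexts: list of token lists, one per context (e.g., one film each).
    Returns (frequency, diversity) as two Counters.
    """
    frequency = Counter()
    diversity = Counter()
    for tokens in contexts:
        frequency.update(tokens)        # every occurrence counts
        diversity.update(set(tokens))   # each context counts at most once
    return frequency, diversity


if __name__ == "__main__":
    films = [["dog", "dog", "cat"], ["dog"], ["cat", "bird"]]
    freq, div = frequency_and_diversity(films)
    for word in sorted(freq):
        print(word, freq[word], div[word])
```

In this toy input "dog" is more frequent than "cat" but both appear in two contexts, which is the kind of dissociation that lets the two measures be compared as predictors.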
Contextual diversity facilitates learning new words in the classroom
Tapia, José Luis; Perea, Manuel
2017-01-01
In the field of word recognition and reading, it is commonly assumed that frequently repeated words create more accessible memory traces than infrequently repeated words, thus capturing the word-frequency effect. Nevertheless, recent research has shown that a seemingly related factor, contextual diversity (defined as the number of different contexts [e.g., films] in which a word appears), is a better predictor than word-frequency in word recognition and sentence reading experiments. Recent research has shown that contextual diversity plays an important role when learning new words in a laboratory setting with adult readers. In the current experiment, we directly manipulated contextual diversity in a very ecological scenario: at school, when Grade 3 children were learning words in the classroom. The new words appeared in different contexts/topics (high-contextual diversity) or only in one of them (low-contextual diversity). Results showed that words encountered in different contexts were learned and remembered more effectively than those presented in redundant contexts. We discuss the practical (educational [e.g., curriculum design]) and theoretical (models of word recognition) implications of these findings. PMID:28586354
Yap, Melvin J; Balota, David A; Tse, Chi-Shing; Besner, Derek
2008-05-01
The joint effects of stimulus quality and word frequency in lexical decision were examined in 4 experiments as a function of nonword type (legal nonwords, e.g., BRONE, vs. pseudohomophones, e.g., BRANE). When familiarity was a viable dimension for word-nonword discrimination, as when legal nonwords were used, additive effects of stimulus quality and word frequency were observed in both means and distributional characteristics of the response-time distributions. In contrast, when the utility of familiarity was undermined by using pseudohomophones, additivity was observed in the means but not in distributional characteristics. Specifically, opposing interactive effects in the underlying distribution were observed, producing apparent additivity in means. These findings are consistent with the suggestion that, when familiarity is deemphasized in lexical decision, cascaded processing between letter and word levels is in play, whereas, when familiarity is a viable dimension for word-nonword discrimination, processing is discrete.
Interactive language learning by robots: the transition from babbling to word forms.
Lyon, Caroline; Nehaniv, Chrystopher L; Saunders, Joe
2012-01-01
The advent of humanoid robots has enabled a new approach to investigating the acquisition of language, and we report on the development of robots able to acquire rudimentary linguistic skills. Our work focuses on early stages analogous to some characteristics of a human child of about 6 to 14 months, the transition from babbling to first word forms. We investigate one mechanism among many that may contribute to this process, a key factor being the sensitivity of learners to the statistical distribution of linguistic elements. As well as being necessary for learning word meanings, the acquisition of anchor word forms facilitates the segmentation of an acoustic stream through other mechanisms. In our experiments some salient one-syllable word forms are learnt by a humanoid robot in real-time interactions with naive participants. Words emerge from random syllabic babble through a learning process based on a dialogue between the robot and the human participant, whose speech is perceived by the robot as a stream of phonemes. Numerous ways of representing the speech as syllabic segments are possible. Furthermore, the pronunciation of many words in spontaneous speech is variable. However, in line with research elsewhere, we observe that salient content words are more likely than function words to have consistent canonical representations; thus their relative frequency increases, as does their influence on the learner. Variable pronunciation may contribute to early word form acquisition. The importance of contingent interaction in real-time between teacher and learner is reflected by a reinforcement process, with variable success. The examination of individual cases may be more informative than group results. Nevertheless, word forms are usually produced by the robot after a few minutes of dialogue, employing a simple, real-time, frequency dependent mechanism. This work shows the potential of human-robot interaction systems in studies of the dynamics of early language acquisition.
Memory for Frequency of Occurrence in Retarded and Nonretarded Persons.
ERIC Educational Resources Information Center
Ellis, Norman R.; Allison, Pamela
1988-01-01
Ninety-six mildly mentally retarded persons and 96 nonretarded college students estimated the frequency of occurrence of words and pictures in a study test paradigm. Frequency estimates were equal for words, but the nonretarded subjects were superior in accuracy on pictorial items. This finding points to an encoding deficiency attributed to…
Examining Second Language Receptive Knowledge of Collocation and Factors That Affect Learning
ERIC Educational Resources Information Center
Nguyen, Thi My Hang; Webb, Stuart
2017-01-01
This study investigated Vietnamese EFL learners' knowledge of verb-noun and adjective-noun collocations at the first three 1,000 word frequency levels, and the extent to which five factors (node word frequency, collocation frequency, mutual information score, congruency, and part of speech) predicted receptive knowledge of collocation. Knowledge…
The Role of Orthographic Neighborhood Size Effects in Chinese Word Recognition
ERIC Educational Resources Information Center
Li, Meng-Feng; Lin, Wei-Chun; Chou, Tai-Li; Yang, Fu-Ling; Wu, Jei-Tun
2015-01-01
Previous studies of orthographic neighborhood size (NS) in Chinese have overlooked morphological processing and the co-variation between character frequency and the NS. The present study manipulated the word frequency and the NS simultaneously, with the leading character frequency controlled, to explore their influences on word…
Beyond Word Frequency: Bursts, Lulls, and Scaling in the Temporal Distributions of Words
Altmann, Eduardo G.; Pierrehumbert, Janet B.; Motter, Adilson E.
2009-01-01
Background: Zipf's discovery that word frequency distributions obey a power law established parallels between biological and physical processes, and language, laying the groundwork for a complex systems perspective on human communication. More recent research has also identified scaling regularities in the dynamics underlying the successive occurrences of events, suggesting the possibility of similar findings for language as well. Methodology/Principal Findings: By considering frequent words in USENET discussion groups and in disparate databases where the language has different levels of formality, here we show that the distributions of distances between successive occurrences of the same word display bursty deviations from a Poisson process and are well characterized by a stretched exponential (Weibull) scaling. The extent of this deviation depends strongly on semantic type – a measure of the logicality of each word – and less strongly on frequency. We develop a generative model of this behavior that fully determines the dynamics of word usage. Conclusions/Significance: Recurrence patterns of words are well described by a stretched exponential distribution of recurrence times, an empirical scaling that cannot be anticipated from Zipf's law. Because the use of words provides a uniquely precise and powerful lens on human thought and activity, our findings also have implications for other overt manifestations of collective human dynamics. PMID:19907645
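The quantity analyzed in this abstract, the distance between successive occurrences of the same word, and the stretched exponential form it is compared with can be sketched in Python as follows. The survival function is the standard Weibull form; the parameter names `scale` and `beta` are illustrative choices, and the abstract does not give fitted values or the authors' fitting procedure.

```python
import math


def recurrence_distances(tokens, word):
    """Distances (in tokens) between successive occurrences of word."""
    positions = [i for i, t in enumerate(tokens) if t == word]
    return [b - a for a, b in zip(positions, positions[1:])]


def stretched_exponential_survival(tau, scale, beta):
    """Weibull survival function P(distance > tau) = exp(-(tau / scale) ** beta).

    beta == 1 gives the plain exponential expected for a Poisson process;
    beta < 1 gives a heavier tail, i.e. burstier recurrences.
    """
    return math.exp(-((tau / scale) ** beta))


if __name__ == "__main__":
    tokens = "a b a a c a".split()
    print(recurrence_distances(tokens, "a"))
    print(stretched_exponential_survival(10, scale=5.0, beta=0.5))
```

Comparing the empirical fraction of distances exceeding tau against this function, for a range of tau, is one simple way to judge how far a word departs from Poisson-like usage.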
The word-frequency paradox for recall/recognition occurs for pictures.
Karlsen, Paul Johan; Snodgrass, Joan Gay
2004-08-01
A yes-no recognition task and two recall tasks were conducted using pictures of high and low familiarity ratings. Picture familiarity had analogous effects to word frequency, and replicated the word-frequency paradox in recall and recognition. Low-familiarity pictures were more recognizable than high-familiarity pictures, pure lists of high-familiarity pictures were more recallable than pure lists of low-familiarity pictures, and there was no effect of familiarity for mixed lists. These results are consistent with the predictions of the Search of Associative Memory (SAM) model.
Choi, Wonil; Gordon, Peter C.
2013-01-01
The coordination of word-recognition and oculomotor processes during reading was evaluated in two eye-tracking experiments that examined how word skipping, where a word is not fixated during first-pass reading, is affected by the lexical status of a letter string in the parafovea and ease of recognizing that string. Ease of lexical recognition was manipulated through target-word frequency (Experiment 1) and through repetition priming between prime-target pairs embedded in a sentence (Experiment 2). Using the gaze-contingent boundary technique the target word appeared in the parafovea either with full preview or with transposed-letter (TL) preview. The TL preview strings were nonwords in Experiment 1 (e.g., bilnk created from the target blink), but were words in Experiment 2 (e.g., sacred created from the target scared). Experiment 1 showed greater skipping for high-frequency than low-frequency target words in the full preview condition but not in the TL preview (nonword) condition. Experiment 2 showed greater skipping for target words that repeated an earlier prime word than for those that did not, with this repetition priming occurring both with preview of the full target and with preview of the target’s TL neighbor word. However, time to progress from the word after the target was greater following skips of the TL preview word, whose meaning was anomalous in the sentence context, than following skips of the full preview word whose meaning fit sensibly into the sentence context. Together, the results support the idea that coordination between word-recognition and oculomotor processes occurs at the level of implicit lexical decisions. PMID:23106372
Rank Diversity of Languages: Generic Behavior in Computational Linguistics
Cocho, Germinal; Flores, Jorge; Gershenson, Carlos; Pineda, Carlos; Sánchez, Sergio
2015-01-01
Statistical studies of languages have focused on the rank-frequency distribution of words. Instead, we introduce here a measure of how word ranks change in time and call this distribution rank diversity. We calculate this diversity for books published in six European languages since 1800, and find that it follows a universal lognormal distribution. Based on the mean and standard deviation associated with the lognormal distribution, we define three different word regimes of languages: “heads” consist of words which almost do not change their rank in time, “bodies” are words of general use, while “tails” are comprised by context-specific words and vary their rank considerably in time. The heads and bodies reflect the size of language cores identified by linguists for basic communication. We propose a Gaussian random walk model which reproduces the rank variation of words in time and thus the diversity. Rank diversity of words can be understood as the result of random variations in rank, where the size of the variation depends on the rank itself. We find that the core size is similar for all languages studied. PMID:25849150
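A minimal Python sketch of a rank-diversity computation is given below. The abstract only describes the measure as capturing how word ranks change in time; the concrete definition used here (the number of distinct words that occupy a given rank across time slices, divided by the number of slices) is an assumed reading that should be checked against the paper, and the function name and input format are hypothetical.

```python
def rank_diversity(rankings):
    """Rank diversity per rank position.

    rankings: list of ranked word lists, one per time slice (e.g., per year),
    each ordered from most to least frequent.
    Assumed definition: distinct words seen at rank k, divided by the number
    of time slices.
    """
    n_slices = len(rankings)
    depth = min(len(r) for r in rankings)  # compare only ranks present in every slice
    return [len({r[k] for r in rankings}) / n_slices for k in range(depth)]


if __name__ == "__main__":
    years = [
        ["the", "of", "cat"],
        ["the", "of", "dog"],
        ["the", "and", "sun"],
    ]
    print(rank_diversity(years))
```

In the toy input the top rank never changes (low diversity, like the "heads" regime), while the last rank holds a different word every year (high diversity, like the "tails" regime).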
Ojima, Shiro; Matsuba-Kurita, Hiroko; Dan, Ippeita; Tsuzuki, Daisuke; Katura, Takusige; Hagiwara, Hiroko
2011-01-01
A large-scale study of 484 elementary school children (6–10 years) performing word repetition tasks in their native language (L1-Japanese) and a second language (L2-English) was conducted using functional near-infrared spectroscopy. Three factors presumably associated with cortical activation, language (L1/L2), word frequency (high/low), and hemisphere (left/right), were investigated. L1 words elicited significantly greater brain activation than L2 words, regardless of semantic knowledge, particularly in the superior/middle temporal and inferior parietal regions (angular/supramarginal gyri). The greater L1-elicited activation in these regions suggests that they are phonological loci, reflecting processes tuned to the phonology of the native language, while phonologically unfamiliar L2 words were processed like nonword auditory stimuli. The activation was bilateral in the auditory and superior/middle temporal regions. Hemispheric asymmetry was observed in the inferior frontal region (right dominant), and in the inferior parietal region with interactions: low-frequency words elicited more right-hemispheric activation (particularly in the supramarginal gyrus), while high-frequency words elicited more left-hemispheric activation (particularly in the angular gyrus). The present results reveal the strong involvement of a bilateral language network in children’s brains depending more on right-hemispheric processing while acquiring unfamiliar/low-frequency words. A right-to-left shift in laterality should occur in the inferior parietal region, as lexical knowledge increases irrespective of language. PMID:21350046
Sugiura, Lisa; Ojima, Shiro; Matsuba-Kurita, Hiroko; Dan, Ippeita; Tsuzuki, Daisuke; Katura, Takusige; Hagiwara, Hiroko
2011-10-01
A large-scale study of 484 elementary school children (6-10 years) performing word repetition tasks in their native language (L1-Japanese) and a second language (L2-English) was conducted using functional near-infrared spectroscopy. Three factors presumably associated with cortical activation, language (L1/L2), word frequency (high/low), and hemisphere (left/right), were investigated. L1 words elicited significantly greater brain activation than L2 words, regardless of semantic knowledge, particularly in the superior/middle temporal and inferior parietal regions (angular/supramarginal gyri). The greater L1-elicited activation in these regions suggests that they are phonological loci, reflecting processes tuned to the phonology of the native language, while phonologically unfamiliar L2 words were processed like nonword auditory stimuli. The activation was bilateral in the auditory and superior/middle temporal regions. Hemispheric asymmetry was observed in the inferior frontal region (right dominant), and in the inferior parietal region with interactions: low-frequency words elicited more right-hemispheric activation (particularly in the supramarginal gyrus), while high-frequency words elicited more left-hemispheric activation (particularly in the angular gyrus). The present results reveal the strong involvement of a bilateral language network in children's brains depending more on right-hemispheric processing while acquiring unfamiliar/low-frequency words. A right-to-left shift in laterality should occur in the inferior parietal region, as lexical knowledge increases irrespective of language.
Mainela-Arnold, Elina; Evans, Julia L.
2016-01-01
Reduced verbal working memory capacity has been proposed as a possible account of language impairments in specific language impairment (SLI). Studies have shown, however, that differences in strength of linguistic representations in the form of word frequency affect list recall and performance on verbal working memory tasks. This suggests that verbal memory capacity and long-term linguistic knowledge may not be distinct constructs. It has been suggested that linguistic representations in SLI are weak in ways that result in a breakdown in language processing on tasks that require manipulation of unfamiliar material. In this study, the effects of word frequency, long-term linguistic knowledge, and serial order position on recall performance in the competing language processing task (CLPT) were investigated in 10 children with SLI and 10 age-matched peers (age 8 years 6 months to 12 years 4 months). The children with SLI recalled significantly fewer target words on the CLPT as compared with their age-matched controls. The SLI group did not differ, however, in their ability to recall target words having high word frequency but were significantly poorer in their ability to recall words on the CLPT having low word frequency. Differences in receptive and expressive language abilities also appeared closely related to performance on the CLPT, suggesting that working memory capacity is not distinct from language knowledge and that degraded linguistic representations may have an effect on performance on verbal working memory span tasks in children with SLI. PMID:16378481
The effect of morphology on spelling and reading accuracy: a study on Italian children
Angelelli, Paola; Marinelli, Chiara Valeria; Burani, Cristina
2014-01-01
In opaque orthographies knowledge of morphological information helps in achieving reading and spelling accuracy. In transparent orthographies with regular print-to-sound correspondences, such as Italian, the mappings of orthography onto phonology and phonology onto orthography are in principle sufficient to read and spell most words. The present study aimed to investigate the role of morphology in the reading and spelling accuracy of Italian children as a function of school experience to determine whether morphological facilitation was present in children learning a transparent orthography. The reading and spelling performances of 15 third-grade and 15 fifth-grade typically developing children were analyzed. Children read aloud and spelled both low-frequency words and pseudowords. Low-frequency words were manipulated for the presence of morphological structure (morphemic words vs. non-derived words). Morphemic words could also vary for the frequency (high vs. low) of roots and suffixes. Pseudo-words were made up of either a real root and a real derivational suffix in a combination that does not exist in the Italian language or had no morphological constituents. Results showed that, in Italian, morphological information is a useful resource for both reading and spelling. Typically developing children benefitted from the presence of morphological structure when they read and spelled pseudowords; however, in processing low-frequency words, morphology facilitated reading but not spelling. These findings are discussed in terms of morpho-lexical access and successful cooperation between lexical and sublexical processes in reading and spelling. PMID:25477855
Mestres-Missé, Anna; Trampel, Robert; Turner, Robert; Kotz, Sonja A
2016-04-01
A key aspect of optimal behavior is the ability to predict what will come next. To achieve this, we must have a fairly good idea of the probability of occurrence of possible outcomes. This is based both on prior knowledge about a particular or similar situation and on immediately relevant new information. One question that arises is: when considering converging prior probability and external evidence, is the most probable outcome selected or does the brain represent degrees of uncertainty, even highly improbable ones? Using functional magnetic resonance imaging, the current study explored these possibilities by contrasting words that differ in their probability of occurrence, namely, unbalanced ambiguous words and unambiguous words. Unbalanced ambiguous words have a strong frequency-based bias towards one meaning, while unambiguous words have only one meaning. The current results reveal larger activation in lateral prefrontal and insular cortices in response to dominant ambiguous compared to unambiguous words even when prior and contextual information biases one interpretation only. These results suggest a probability distribution, whereby all outcomes and their associated probabilities of occurrence, even if very low, are represented and maintained.
The use of ultrasound in the study of articulatory properties of vowels in clear speech.
Song, Jae Yung
2017-01-01
Although the acoustic properties of clear speech have been extensively studied, its underlying articulatory details have not been well understood. The purpose of the present study is twofold: To examine the specific articulatory processes of clear speech using ultrasound and to investigate whether and how the type of listener (hard of hearing, normal hearing) and the lexical property of words (frequency) interact in the production of clear speech. To this end, we examined productions of /ɑ/, /æ/ and /u/ from 16 speakers of US English. Overall, our ultrasound results suggested that the tongue's highest point moved in a direction that exaggerated the three vowels' phonological features, resulting in an expanded articulatory vowel space for the hard-of-hearing listener and low-frequency words. No interaction was found between the listener and word frequency, suggesting that the effects of word frequency hold constant across the two types of listeners.
Memory bias in health anxiety is related to the emotional valence of health-related words.
Ferguson, Eamonn; Moghaddam, Nima G; Bibby, Peter A
2007-03-01
A model based on the associative strength of object evaluations is tested to explain why those who score higher on health anxiety have a better memory for health-related words. Sixty participants observed health and nonhealth words. A recognition memory task followed a free recall task and finally subjects provided evaluations (emotionality, imageability, and frequency) for all the words. Hit rates for health words, d', c, and psychological response times (PRTs) for evaluations were examined using multi-level modelling (MLM) and regression. Health words had a higher hit rate, which was greater for those with higher levels of health anxiety. The higher hit rate for health words is partly mediated by the extent to which health words are evaluated as emotionally unpleasant, and this was stronger for (moderated by) those with higher levels of health anxiety. Consistent with the associative strength model, those with higher levels of health anxiety demonstrated faster PRTs when making emotional evaluations of health words compared to nonhealth words, while those lower in health anxiety were slower to evaluate health words. Emotional evaluations speed the recognition of health words for high health anxious individuals. These findings are discussed with respect to the wider literature on cognitive processes in health anxiety, automatic processing, implicit attitudes, and emotions in decision making.
ERIC Educational Resources Information Center
Carroll, John B.
In three immediately succeeding trials, 45 young adults named 50 pictures of objects as rapidly as possible; word retrieval latencies were measured for each item. Before each trial, one experimental group was given information as to the word frequency (WF) level of the items' names. The other experimental group was given information as to the…
ERIC Educational Resources Information Center
Lupker, Stephen J.; Pexman, Penny M.
2010-01-01
Performance in a lexical decision task is crucially dependent on the difficulty of the word-nonword discrimination. More wordlike nonwords cause not only a latency increase for words but also, as reported by Stone and Van Orden (1993), larger word frequency effects. Several current models of lexical decision making can explain these types of…
The Effect of High- and Low-Frequency Previews and Sentential Fit on Word Skipping during Reading
ERIC Educational Resources Information Center
Angele, Bernhard; Laishley, Abby E.; Rayner, Keith; Liversedge, Simon P.
2014-01-01
In a previous gaze-contingent boundary experiment, Angele and Rayner (2013) found that readers are likely to skip a word that appears to be the definite article "the" even when syntactic constraints do not allow for articles to occur in that position. In the present study, we investigated whether the word frequency of the preview of a…
The role of the frequency of constituents in compound words: evidence from Basque and Spanish.
Duñabeitia, Jon Andoni; Perea, Manuel; Carreiras, Manuel
2007-12-01
Recent data from compound word processing suggests that compounds are recognized via their constituent lexemes (Juhasz, Starr, Inhoff, & Placke, 2003). The present lexical decision experiment manipulated orthogonally the frequency of the constituents of compound words in two languages: Basque and Spanish. Basque and Spanish diverge widely in their morphological properties and in the number of existing compound words. Furthermore, the head lexeme (i.e., the most meaningful lexeme related to the whole-word meaning) in Spanish tends to be the second lexeme, whereas in Basque the percentage is more distributed. Results showed a facilitative effect of the frequency of the second lexeme, in both Basque and Spanish compounds. Thus, both Basque and Spanish readers decompose compounds into their constituents for lexical access, and this decomposition is carried out in a language-independent and blind-to-semantics manner. We examine the implications of these results for models of lexical access.
Algorithmic Classification of Five Characteristic Types of Paraphasias.
Fergadiotis, Gerasimos; Gorman, Kyle; Bedrick, Steven
2016-12-01
This study was intended to evaluate a series of algorithms developed to perform automatic classification of paraphasic errors (formal, semantic, mixed, neologistic, and unrelated errors). We analyzed 7,111 paraphasias from the Moss Aphasia Psycholinguistics Project Database (Mirman et al., 2010) and evaluated the classification accuracy of 3 automated tools. First, we used frequency norms from the SUBTLEXus database (Brysbaert & New, 2009) to differentiate nonword errors and real-word productions. Then we implemented a phonological-similarity algorithm to identify phonologically related real-word errors. Last, we assessed the performance of a semantic-similarity criterion that was based on word2vec (Mikolov, Yih, & Zweig, 2013). Overall, the algorithmic classification replicated human scoring for the major categories of paraphasias studied with high accuracy. The tool that was based on the SUBTLEXus frequency norms was more than 97% accurate in making lexicality judgments. The phonological-similarity criterion was approximately 91% accurate, and the overall classification accuracy of the semantic classifier ranged from 86% to 90%. Overall, the results highlight the potential of tools from the field of natural language processing for the development of highly reliable, cost-effective diagnostic tools suitable for collecting high-quality measurement data for research and clinical purposes.
NASA Technical Reports Server (NTRS)
Lawton, Teri B.
1989-01-01
A method to improve the reading performance of subjects with losses in central vision is proposed in which the amplitudes of the intermediate spatial frequencies are boosted relative to the lower spatial frequencies. In the method, words are filtered using an image enhancement function which is based on a subject's losses in visual function relative to a normal subject. It was found that 30-70 percent less magnification was necessary, and that reading rates were improved 2-3 times, using the method. The individualized compensation filters improved the clarity and visibility of words. The shape of the enhancement function was shown to be important in determining the optimum compensation filter for improving reading performance.
Quantitative Social Dialectology: Explaining Linguistic Variation Geographically and Socially
Wieling, Martijn; Nerbonne, John; Baayen, R. Harald
2011-01-01
In this study we examine linguistic variation and its dependence on both social and geographic factors. We follow dialectometry in applying a quantitative methodology and focusing on dialect distances, and social dialectology in the choice of factors we examine in building a model to predict word pronunciation distances from the standard Dutch language to 424 Dutch dialects. We combine linear mixed-effects regression modeling with generalized additive modeling to predict the pronunciation distance of 559 words. Although geographical position is the dominant predictor, several other factors emerged as significant. The model predicts a greater distance from the standard for smaller communities, for communities with a higher average age, for nouns (as contrasted with verbs and adjectives), for more frequent words, and for words with relatively many vowels. The impact of the demographic variables, however, varied from word to word. For a majority of words, larger, richer and younger communities are moving towards the standard. For a smaller minority of words, larger, richer and younger communities emerge as driving a change away from the standard. Similarly, the strength of the effects of word frequency and word category varied geographically. The peripheral areas of the Netherlands showed a greater distance from the standard for nouns (as opposed to verbs and adjectives) as well as for high-frequency words, compared to the more central areas. Our findings indicate that changes in pronunciation have been spreading (in particular for low-frequency words) from the Hollandic center of economic power to the peripheral areas of the country, meeting resistance that is stronger wherever, for well-documented historical reasons, the political influence of Holland was reduced. Our results are also consistent with the theory of lexical diffusion, in that distances from the Hollandic norm vary systematically and predictably on a word by word basis. PMID:21912639
Consolidation of novel word learning in native English-speaking adults.
Kurdziel, Laura B F; Spencer, Rebecca M C
2016-01-01
Sleep has been shown to improve the retention of newly learned words. However, most methodologies have used artificial or foreign language stimuli, with learning limited to word/novel word or word/image pairs. Such stimuli differ from many word-learning scenarios in which definition strings are learned with novel words. Thus, we examined sleep's benefit on learning new words within a native language by using very low-frequency words. Participants learned 45 low-frequency English words and, at subsequent recall, attempted to recall the words when given the corresponding definitions. Participants either learned in the morning with recall in the evening (wake group), or learned in the evening with recall the following morning (sleep group). Performance change across the delay was significantly better in the sleep than the wake group. Additionally, the Levenshtein distance, a measure of correctness of the typed word compared with the target word, became significantly worse following wake, whereas sleep protected correctness of recall. Polysomnographic data from a subsample of participants suggested that rapid eye movement (REM) sleep may be particularly important for this benefit. These results lend further support for sleep's function on semantic learning even for word/definition pairs within a native language.
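The Levenshtein distance mentioned above counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The abstract does not say how the authors computed it, so the following is only a minimal sketch of the standard dynamic-programming algorithm; the function name and the example words are illustrative assumptions, not taken from the study.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between strings a and b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical typed response compared with a hypothetical target word.
distance = levenshtein("obstreperus", "obstreperous")
```

A distance of 0 means the typed word matches the target exactly, and larger values indicate a less accurate recall attempt.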
Subjective Word Frequency Estimates in L1 and L2.
ERIC Educational Resources Information Center
Arnaud, Pierre J. L.
A study investigated the usefulness of non-native speakers' subjective, relative word frequency estimates as a measure of second language proficiency. In the experiment, two subjective frequency estimate (SFE) tasks, one on French and one on English, were presented to French learners of English (n=126) and American learners of French (n=87).…
Category Size Effects Revisited: Frequency and Masked Priming Effects in Semantic Categorization
ERIC Educational Resources Information Center
Forster, Kenneth I.
2004-01-01
Previous work indicates that semantic categorization decisions for nonexemplars (e.g., deciding that TURBAN is not an animal name) are faster for high-frequency words than low-frequency words. However, there is evidence that this result might depend on category size. When narrow categories are used (e.g., Months, Numbers), there is no frequency…
Meyer, Ted A; Frisch, Stefan A; Pisoni, David B; Miyamoto, Richard T; Svirsky, Mario A
2003-07-01
Do cochlear implants provide enough information to allow adult cochlear implant users to understand words in ways that are similar to listeners with acoustic hearing? Can we use a computational model to gain insight into the underlying mechanisms used by cochlear implant users to recognize spoken words? The Neighborhood Activation Model has been shown to be a reasonable model of word recognition for listeners with normal hearing. The Neighborhood Activation Model assumes that words are recognized in relation to other similar-sounding words in a listener's lexicon. The probability of correctly identifying a word is based on the phoneme perception probabilities from a listener's closed-set consonant and vowel confusion matrices modified by the relative frequency of occurrence of the target word compared with similar-sounding words (neighbors). Common words with few similar-sounding neighbors are more likely to be selected as responses than less common words with many similar-sounding neighbors. Recent studies have shown that several of the assumptions of the Neighborhood Activation Model also hold true for cochlear implant users. Closed-set consonant and vowel confusion matrices were obtained from 26 postlingually deafened adults who use cochlear implants. Confusion matrices were used to represent input errors to the Neighborhood Activation Model. Responses to the different stimuli were then generated by the Neighborhood Activation Model after incorporating the frequency of occurrence counts of the stimuli and their neighbors. Model outputs were compared with obtained performance measures on the Consonant-Vowel Nucleus-Consonant word test. Information transmission analysis was used to assess whether the Neighborhood Activation Model was able to successfully generate and predict word and individual phoneme recognition by cochlear implant users. 
The Neighborhood Activation Model predicted Consonant-Vowel Nucleus-Consonant test words at levels similar to those correctly identified by the cochlear implant users. The Neighborhood Activation Model also predicted phoneme feature information well. The results obtained suggest that the Neighborhood Activation Model provides a reasonable explanation of word recognition by postlingually deafened adults after cochlear implantation. It appears that multichannel cochlear implants give cochlear implant users access to their mental lexicons in a manner that is similar to listeners with acoustic hearing. The lexical properties of the test stimuli used to assess performance are important to spoken-word recognition and should be included in further models of the word recognition process.
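The identification probability described in this abstract can be illustrated with a frequency-weighted choice rule: the support for the target word is divided by the summed support for the target and all of its neighbors. The abstract gives no formula, so the sketch below is an assumption about the general form only. The function name, the use of raw rather than log-transformed frequencies, and the example numbers are all hypothetical.

```python
def identification_probability(p_target, freq_target, neighbors):
    """Frequency-weighted choice rule (simplified, assumed form).

    p_target    -- probability of perceiving the target's phonemes correctly,
                   e.g. a product of confusion-matrix entries
    freq_target -- frequency weight of the target word
    neighbors   -- list of (p_neighbor, freq_neighbor) pairs, where p_neighbor
                   is the probability of hearing that neighbor given the stimulus
    """
    target_support = p_target * freq_target
    neighbor_support = sum(p * f for p, f in neighbors)
    return target_support / (target_support + neighbor_support)


# A common word with one rare neighbor versus a rare word with many
# common neighbors (all values are made up for illustration).
easy = identification_probability(0.8, 100.0, [(0.1, 5.0)])
hard = identification_probability(0.8, 5.0, [(0.1, 100.0), (0.1, 80.0), (0.05, 60.0)])
```

Under this rule the "easy" word receives a much higher predicted identification probability than the "hard" word, which matches the qualitative claim that common words with few similar-sounding neighbors are more likely to be selected.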
ERIC Educational Resources Information Center
Keawchaum, Raksina; Pongpairoj, Nattama
2017-01-01
This study investigated how frequency influenced acquisition of L2 English infinitive and gerund complements among L1 Thai learners. Participants were separated into low and high proficiency groups based on their CU-TEP scores. Each group consisted of 30 participants. Data were collected using the Word Selection Task (WST) and the Grammaticality…
Moore, G. W.; Hutchins, G. M.; Miller, R. E.
1984-01-01
Computerized indexing and retrieval of medical records is increasingly important; but the use of natural language versus coded languages (SNOP, SNOMED) for this purpose remains controversial. In an effort to develop search strategies for natural language text, the authors examined the anatomic diagnosis reports by computer for 7000 consecutive autopsy subjects spanning a 13-year period at The Johns Hopkins Hospital. There were 923,657 words, 11,642 of them distinct. The authors observed an average of 1052 keystrokes, 28 lines, and 131 words per autopsy report, with an average 4.6 words per line and 7.0 letters per word. The entire text file represented 921 hours of secretarial effort. Words ranged in frequency from 33,959 occurrences of "and" to one occurrence for each of 3398 different words. Searches for rare diseases with unique names or for representative examples of common diseases were most readily performed with the use of computer-printed key word in context (KWIC) books. For uncommon diseases designated by commonly used terms (such as "cystic fibrosis"), needs were best served by a computerized search for logical combinations of key words. In an unbalanced word distribution, each conjunction (logical and) search should be performed in ascending order of word frequency; but each alternation (logical inclusive or) search should be performed in descending order of word frequency. Natural language text searches will assume a larger role in medical records analysis as the labor-intensive procedure of translation into a coded language becomes more costly, compared with the computer-intensive procedure of text searching. PMID:6546837
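The ordering rule for key-word searches can be sketched with short-circuit evaluation. In a conjunction, testing the rarest word first rejects most reports after a single check; in an alternation, testing the most common word first accepts most matching reports after a single check. The code below is a minimal illustration under that interpretation and is not the authors' system; the function names and the toy reports are assumptions.

```python
from collections import Counter


def document_frequency(token_sets):
    """Count in how many reports each word occurs."""
    return Counter(word for tokens in token_sets for word in tokens)


def order_terms(terms, freq, mode):
    """Ascending frequency for 'and', descending frequency for 'or'."""
    if mode not in ("and", "or"):
        raise ValueError("mode must be 'and' or 'or'")
    return sorted(terms, key=lambda t: freq.get(t, 0), reverse=(mode == "or"))


def search(reports, terms, mode):
    """Return indices of reports matching all ('and') or any ('or') terms."""
    token_sets = [set(r.lower().split()) for r in reports]
    ordered = order_terms(terms, document_frequency(token_sets), mode)
    combine = all if mode == "and" else any
    # all() stops at the first missing term, any() at the first hit,
    # so the ordering determines how many membership tests are needed.
    return [i for i, tokens in enumerate(token_sets)
            if combine(term in tokens for term in ordered)]


# Toy reports, invented for illustration only.
reports = [
    "cystic fibrosis of pancreas",
    "fibrosis of lung",
    "cystic kidney",
    "pneumonia of lung",
]
both = search(reports, ["cystic", "fibrosis"], "and")
either = search(reports, ["cystic", "fibrosis"], "or")
```

The ordering does not change which reports are returned, only how much work each search does on average.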
Assessing neglect dyslexia with compound words.
Reinhart, Stefan; Schunck, Alexander; Schaadt, Anna Katharina; Adams, Michaela; Simon, Alexandra; Kerkhoff, Georg
2016-10-01
The neglect syndrome is frequently associated with neglect dyslexia (ND), which is characterized by omissions or misread initial letters of single words. ND is usually assessed with standardized reading texts in clinical settings. However, particularly in the chronic phase of ND, patients often report reading deficits in everyday situations but show (nearly) normal performances in test situations that are commonly well-structured. To date, sensitive and standardized tests to assess the severity and characteristics of ND are lacking, although reading is of high relevance for daily life and vocational settings. Several studies found modulating effects of different word features on ND. We combined those features in a novel test to enhance test sensitivity in the assessment of ND. Low-frequency words of different length that contain residual pronounceable words when the initial letter strings are neglected were selected. We compared these words in a group of 12 ND-patients suffering from right-hemispheric first-ever stroke with word stimuli containing no existing residual words. Finally, we tested whether the serially presented words are more sensitive for the diagnosis of ND than text reading. The severity of ND was modulated strongly by the ND-test words and error frequencies in single word reading of ND words were on average more than 10 times higher than in a standardized text reading test (19.8% vs. 1.8%). The novel ND-test maximizes the frequency of specific ND-errors and is therefore more sensitive for the assessment of ND than conventional text reading tasks. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Juhasz, Barbara J; Lai, Yun-Hsuan; Woodcock, Michelle L
2015-12-01
Since the work of Taft and Forster (1976), a growing literature has examined how English compound words are recognized and organized in the mental lexicon. Much of this research has focused on whether compound words are decomposed during recognition by manipulating the word frequencies of their lexemes. However, many variables may impact morphological processing, including relational semantic variables such as semantic transparency, as well as additional form-related and semantic variables. In the present study, ratings were collected on 629 English compound words for six variables [familiarity, age of acquisition (AoA), semantic transparency, lexeme meaning dominance (LMD), imageability, and sensory experience ratings (SER)]. All of the compound words selected for this study are contained within the English Lexicon Project (Balota et al., 2007), which made it possible to use a regression approach to examine the predictive power of these variables for lexical decision and word naming performance. Analyses indicated that familiarity, AoA, imageability, and SER were all significant predictors of both lexical decision and word naming performance when they were added separately to a model containing the length and frequency of the compounds, as well as the lexeme frequencies. In addition, rated semantic transparency also predicted lexical decision performance. The database of English compound words should be beneficial to word recognition researchers who are interested in selecting items for experiments on compound words, and it will also allow researchers to conduct further analyses using the available data combined with word recognition times included in the English Lexicon Project.
Singleton, Jenny L; Morgan, Dianne; DiGello, Elizabeth; Wiles, Jill; Rivers, Rachel
2004-01-01
The written English vocabulary of 72 deaf elementary school students of various proficiency levels in American Sign Language (ASL) was compared with the performance of 60 hearing English-as-a-second-language (ESL) speakers and 61 hearing monolingual speakers of English, all of similar age. Students were asked to retell "The Tortoise and the Hare" story (previously viewed on video) in a writing activity. Writing samples were later scored for total number of words, use of words known to be highly frequent in children's writing, redundancy in writing, and use of English function words. All deaf writers showed significantly lower use of function words as compared to their hearing peers. Low-ASL-proficient students demonstrated a highly formulaic writing style, drawing mostly on high-frequency words and repetitive use of a limited range of function words. The moderate- and high-ASL-proficient deaf students' writing was not formulaic and incorporated novel, low-frequency vocabulary to communicate their thoughts. The moderate- and high-ASL students' performance revealed a departure from findings one might expect based on previous studies with deaf writers and their vocabulary use. The writing of the deaf writers also differed from the writing of hearing ESL speakers. Implications for deaf education and literacy instruction are discussed, with special attention to the fact that ASL-proficient, deaf second-language learners of English may be approaching English vocabulary acquisition in ways that are different from hearing ESL learners.
Pouplin, Samuel; Roche, Nicolas; Antoine, Jean-Yves; Vaugier, Isabelle; Pottier, Sandra; Figere, Marjorie; Bensmail, Djamel
2017-06-01
To determine whether activating the frequency of use and automatic learning parameters of word prediction software has an impact on text input speed. Forty-five participants with cervical spinal cord injury between C4 and C8 (ASIA AIS A or B) agreed to participate in this study. Participants were separated into two groups: a high lesion group for participants with a lesion level at or above C5 (ASIA AIS A or B) and a low lesion group for participants with a lesion between C6 and C8 (ASIA AIS A or B). A single evaluation session was carried out for each participant. Text input speed was evaluated during three copying tasks:
• without word prediction software (WITHOUT condition)
• with automatic learning of words and frequency of use deactivated (NOT_ACTIV condition)
• with automatic learning of words and frequency of use activated (ACTIV condition)
Results: Text input speed was significantly higher in the WITHOUT than in the NOT_ACTIV (p < 0.001) or ACTIV (p = 0.02) conditions for participants with low lesions. Text input speed was significantly higher in the ACTIV than in the NOT_ACTIV (p = 0.002) or WITHOUT (p < 0.001) conditions for participants with high lesions. Use of word prediction software with frequency of use and automatic learning activated increased text input speed in participants with high-level tetraplegia. For participants with low-level tetraplegia, the use of word prediction software with frequency of use and automatic learning activated only decreased the number of errors.
Implications for rehabilitation
Access to technology can be difficult for persons with disabilities such as cervical spinal cord injury (SCI). Several methods have been developed to increase text input speed, such as word prediction software. This study shows that a parameter of word prediction software (frequency of use) affected text input speed in persons with cervical SCI, and that its effect differed according to the level of the lesion.
• For persons with a high-level lesion, our results suggest that this parameter must be activated so that text input speed is increased.
• For persons in the low lesion group, this parameter must be activated so that the number of errors is decreased.
• In all cases, the activation of the frequency of use parameter is essential in order to improve the efficiency of the word prediction software.
• Health-related professionals should use these results in their clinical practice for better results and therefore better patient satisfaction.
ERIC Educational Resources Information Center
Yap, Melvin J.; Balota, David A.; Tse, Chi-Shing; Besner, Derek
2008-01-01
The joint effects of stimulus quality and word frequency in lexical decision were examined in 4 experiments as a function of nonword type (legal nonwords, e.g., BRONE, vs. pseudohomophones, e.g., BRANE). When familiarity was a viable dimension for word-nonword discrimination, as when legal nonwords were used, additive effects of stimulus quality…
Tamura, Niina; Castles, Anne; Nation, Kate
2017-06-01
Children learn new words via their everyday reading experience but little is known about how this learning happens. We addressed this by focusing on the conditions needed for new words to become familiar to children, drawing a distinction between lexical configuration (the acquisition of word knowledge) and lexical engagement (the emergence of interactive processes between newly learned words and existing words). In Experiment 1, 9-11-year-olds saw unfamiliar words in one of two storybook conditions, differing in degree of focus on the new words but matched for frequency of exposure. Children showed good learning of the novel words in terms of both configuration (form and meaning) and engagement (lexical competition). A frequency manipulation under incidental learning conditions in Experiment 2 revealed different time-courses of learning: a fast lexical configuration process, indexed by explicit knowledge, and a slower lexicalization process, indexed by lexical competition. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Schuster, Sarah; Hawelka, Stefan; Hutzler, Florian; Kronbichler, Martin; Richlan, Fabio
2016-01-01
Word length, frequency, and predictability count among the most influential variables during reading. Their effects are well-documented in eye movement studies, but pertinent evidence from neuroimaging primarily stems from single-word presentations. We investigated the effects of these variables during reading of whole sentences with simultaneous eye-tracking and functional magnetic resonance imaging (fixation-related fMRI). Increasing word length was associated with increasing activation in occipital areas linked to visual analysis. Additionally, length elicited a U-shaped modulation (i.e., least activation for medium-length words) within a brain stem region presumably linked to eye movement control. These effects, however, were diminished when accounting for multiple fixation cases. Increasing frequency was associated with decreasing activation within left inferior frontal, superior parietal, and occipito-temporal regions. The function of the latter region, which hosts the putative visual word form area, was originally considered as limited to sublexical processing. An exploratory analysis revealed that increasing predictability was associated with decreasing activation within middle temporal and inferior frontal regions previously implicated in memory access and unification. The findings are discussed with regard to their correspondence with findings from single-word presentations and with regard to neurocognitive models of visual word recognition, semantic processing, and eye movement control during reading. PMID:27365297
Effects of metric hierarchy and rhyme predictability on word duration in The Cat in the Hat.
Breen, Mara
2018-05-01
Word durations convey many types of linguistic information, including intrinsic lexical features like length and frequency and contextual features like syntactic and semantic structure. The current study was designed to investigate whether hierarchical metric structure and rhyme predictability account for durational variation over and above other features in productions of a rhyming, metrically-regular children's book: The Cat in the Hat (Dr. Seuss, 1957). One-syllable word durations and inter-onset intervals were modeled as functions of segment number, lexical frequency, word class, syntactic structure, repetition, and font emphasis. Consistent with prior work, factors predicting longer word durations and inter-onset intervals included more phonemes, lower frequency, first mention, alignment with a syntactic boundary, and capitalization. A model parameter corresponding to metric grid height improved model fit of word durations and inter-onset intervals. Specifically, speakers realized five levels of metric hierarchy with inter-onset intervals such that interval duration increased linearly with increased height in the metric hierarchy. Conversely, speakers realized only three levels of metric hierarchy with word duration, demonstrating that they shortened the highly predictable rhyme resolutions. These results further understanding of the factors that affect spoken word duration, and demonstrate the myriad cues that children receive about linguistic structure from nursery rhymes. Copyright © 2018 Elsevier B.V. All rights reserved.
Effects of ocular transverse chromatic aberration on peripheral word identification.
Yang, Shun-Nan; Tai, Yu-chi; Laukkanen, Hannu; Sheedy, James E
2011-11-01
Transverse chromatic aberration (TCA) smears the retinal image of peripheral stimuli. We previously found that TCA significantly reduces the ability to recognize letters presented in the near fovea by degrading image quality and exacerbating the crowding effect from adjacent letters. The present study examined whether TCA has a significant effect on near foveal and peripheral word identification, and whether within-word orthographic facilitation interacts with the TCA effect to affect word identification. Subjects were briefly presented with a 6- to 7-letter word of high or low frequency in each trial. Target words were generated with a weak or strong horizontal color fringe to attenuate the TCA in the right periphery and exacerbate it in the left. The center of the target word was 1°, 2°, 4°, or 6° to the left or right of a fixation point. Subjects' eye position was monitored with an eye-tracker to ensure proper fixation before target presentation. They were required to report the identity of the target word as quickly and accurately as possible. Results show a significant effect of color fringe on the latency and accuracy of word recognition, indicating an existing TCA effect. The observed TCA effect was more salient in the right periphery, where it was also more strongly affected by word frequency. Individuals' subjective preference for color-fringed text was correlated with the TCA effect in the near periphery. Our results suggest that TCA significantly affects peripheral word identification, especially when the word is located in the right periphery. Contextual facilitation such as word frequency interacts with TCA to influence the accuracy and latency of word recognition. Copyright © 2011 Elsevier Ltd. All rights reserved.
Kuperman, Victor; Drieghe, Denis; Keuleers, Emmanuel; Brysbaert, Marc
2013-01-01
We assess the amount of shared variance between three measures of visual word recognition latencies: eye movement latencies, lexical decision times, and naming times. After partialling out the effects of word frequency and word length, two well-documented predictors of word recognition latencies, we see that 7-44% of the variance is uniquely shared between lexical decision times and naming times, depending on the frequency range of the words used. A similar analysis of eye movement latencies shows that the percentage of variance they uniquely share either with lexical decision times or with naming times is much lower. It is 5-17% for gaze durations and lexical decision times in studies with target words presented in neutral sentences, but drops to 0.2% for corpus studies in which eye movements to all words are analysed. Correlations between gaze durations and naming latencies are lower still. These findings suggest that processing times in isolated word processing and continuous text reading are affected by specific task demands and presentation format, and that lexical decision times and naming times are not very informative in predicting eye movement latencies in text reading once the effect of word frequency and word length are taken into account. The difference between controlled experiments and natural reading suggests that reading strategies and stimulus materials may determine the degree to which the immediacy-of-processing assumption and the eye-mind assumption apply. Fixation times are more likely to exclusively reflect the lexical processing of the currently fixated word in controlled studies with unpredictable target words rather than in natural reading of sentences or texts.
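One common way to express "variance uniquely shared after partialling out frequency and length" is a squared partial correlation: regress each latency measure on the covariates, then correlate the residuals. The sketch below illustrates that idea only; it is not necessarily the authors' exact procedure, and the function names (`residuals`, `shared_variance`) are hypothetical.

```python
from statistics import mean

def solve(a, b):
    """Solve a small linear system a.x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def residuals(y, predictors):
    """Residuals of y after ordinary least squares on an intercept plus the predictors."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(rows, y)]

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

def shared_variance(task_a, task_b, covariates):
    """Squared partial correlation of two latency measures given the covariates."""
    return pearson(residuals(task_a, covariates), residuals(task_b, covariates)) ** 2
```

Here `covariates` would be a list such as `[log_frequency, word_length]`, one value per word, and the result is the proportion of residual variance the two tasks share.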
Second language experience modulates word retrieval effort in bilinguals: evidence from pupillometry
Schmidtke, Jens
2014-01-01
Bilingual speakers often have less language experience compared to monolinguals as a result of speaking two languages and/or a later age of acquisition of the second language. This may result in weaker and less precise phonological representations of words in memory, which may cause greater retrieval effort during spoken word recognition. To gauge retrieval effort, the present study compared the effects of word frequency, neighborhood density (ND), and level of English experience by testing monolingual English speakers and native Spanish speakers who differed in their age of acquisition of English (early/late). In the experimental paradigm, participants heard English words and matched them to one of four pictures while the pupil size, an indication of cognitive effort, was recorded. Overall, both frequency and ND effects could be observed in the pupil response, indicating that lower frequency and higher ND were associated with greater retrieval effort. Bilingual speakers showed an overall delayed pupil response and a larger ND effect compared to the monolingual speakers. The frequency effect was the same in early bilinguals and monolinguals but was larger in late bilinguals. Within the group of bilingual speakers, higher English proficiency was associated with an earlier pupil response in addition to a smaller frequency and ND effect. These results suggest that greater retrieval effort associated with bilingualism may be a consequence of reduced language experience rather than constitute a categorical bilingual disadvantage. Future avenues for the use of pupillometry in the field of spoken word recognition are discussed. PMID:24600428
Hauk, Olaf; Davis, Matthew H; Pulvermüller, Friedemann
2008-09-01
Psycholinguistic research has documented a range of variables that influence visual word recognition performance. Many of these variables are highly intercorrelated. Most previous studies have used factorial designs, which do not exploit the full range of values available for continuous variables, and are prone to skewed stimulus selection as well as to effects of the baseline (e.g. when contrasting words with pseudowords). In our study, we used a parametric approach to study the effects of several psycholinguistic variables on brain activation. We focussed on the variable word frequency, which has been used in numerous previous behavioural, electrophysiological and neuroimaging studies, in order to investigate the neuronal network underlying visual word processing. Furthermore, we investigated the variable orthographic typicality as well as a combined variable for word length and orthographic neighbourhood size (N), for which neuroimaging results are still either scarce or inconsistent. Data were analysed using multiple linear regression analysis of event-related fMRI data acquired from 21 subjects in a silent reading paradigm. The frequency variable correlated negatively with activation in left fusiform gyrus, bilateral inferior frontal gyri and bilateral insulae, indicating that word frequency can affect multiple aspects of word processing. N correlated positively with brain activity in left and right middle temporal gyri as well as right inferior frontal gyrus. Thus, our analysis revealed multiple distinct brain areas involved in visual word processing within one data set.
The role of learning in improving functional writing in stroke aphasia.
Thiel, Lindsey; Sage, Karen; Conroy, Paul
2016-10-01
Improving writing in people with aphasia could improve ability to communicate, reduce isolation and increase access to information. One area that has not been sufficiently explored is the effect of impairment based spelling therapies on functional writing. A multiple case study was conducted with eight participants with aphasia subsequent to stroke. This aimed to measure the effects of spelling therapy on functional writing and perception of disability. Participants engaged in 10 sessions of copy and recall spelling therapy. Outcome measures included spelling to dictation of trained and untrained words, written picture description, spelling accuracy within emails, a disability questionnaire and a writing frequency diary. All participants made significant gains on treated words and six demonstrated improvements to untreated words. Group analyses showed significant improvements to written picture description, but not email writing, writing frequency or perceptions of disability. These results show that small doses of writing therapy can lead to large gains in specific types of writing. These gains did not extend to improvements in frequency of writing in daily living, nor ecological measures of email writing. There is a need to develop bridging interventions between experimental tasks towards more multi-faceted and ecological everyday writing tasks. Implications for Rehabilitation Acquired dysgraphia can restrict people from participating in social, educational and professional life. This study has shown that copy and recall spelling therapies can improve the spelling of treated words, untreated words and written picture description in people with a range of types and severities of dysgraphia following stroke. The results of this study suggest that more specific additional training is required for other writing activities such as email writing.
Sugiura, Lisa; Ojima, Shiro; Matsuba-Kurita, Hiroko; Dan, Ippeita; Tsuzuki, Daisuke; Katura, Takusige; Hagiwara, Hiroko
2015-10-01
Previous neuroimaging studies in adults have revealed that first and second languages (L1/L2) share similar neural substrates, and that proficiency is a major determinant of the neural organization of L2 in the lexical-semantic and syntactic domains. However, little is known about neural substrates of children in the phonological domain, or about sex differences. Here, we conducted a large-scale study (n = 484) of school-aged children using functional near-infrared spectroscopy and a word repetition task, which requires a great extent of phonological processing. We investigated cortical activation during word processing, emphasizing sex differences, to clarify similarities and differences between L1 and L2, and proficiency-related differences during early L2 learning. L1 and L2 shared similar neural substrates with decreased activation in L2 compared to L1 in the posterior superior/middle temporal and angular/supramarginal gyri for both sexes. Significant sex differences were found in cortical activation within language areas during high-frequency word but not during low-frequency word processing. During high-frequency word processing, widely distributed areas including the angular/supramarginal gyri were activated in boys, while more restricted areas, excluding the angular/supramarginal gyri were activated in girls. Significant sex differences were also found in L2 proficiency-related activation: activation significantly increased with proficiency in boys, whereas no proficiency-related differences were found in girls. Importantly, cortical sex differences emerged with proficiency. Based on previous research, the present results indicate that sex differences are acquired or enlarged during language development through different cognitive strategies between sexes, possibly reflecting their different memory functions. © 2015 Wiley Periodicals, Inc.
Syllable Frequency Effects in Visual Word Recognition: Developmental Approach in French Children
ERIC Educational Resources Information Center
Maionchi-Pino, Norbert; Magnan, Annie; Ecalle, Jean
2010-01-01
This study investigates the syllable's role in the normal reading acquisition of French children at three grade levels (1st, 3rd, and 5th), using a modified version of Cole, Magnan, and Grainger's (1999) paradigm. We focused on the effects of syllable frequency and word frequency. The results suggest that from the first to third years of reading…
Does Phonology Play a Role When Skilled Readers Read High-Frequency Words? Evidence from ERPs
ERIC Educational Resources Information Center
Newman, Randy Lynn; Jared, Debra; Haigh, Corinne A.
2012-01-01
We used event-related brain potentials to clarify the role of phonology in activating the meanings of high-frequency words during skilled silent reading. Target homophones ("meet") in sentences such as "The students arranged to meet in the library to study" were replaced on some trials by either a high-frequency homophone mate…
Rizio, Avery A; Moyer, Karlee J; Diaz, Michele T
2017-04-01
Older adults often show declines in phonological aspects of language production, particularly for low-frequency words, but maintain strong semantic systems. However, there are different theories about the mechanism that may underlie such age-related differences in language (e.g., age-related declines in transmission of activation or inhibition). This study used fMRI to investigate whether age-related differences in language production are associated with transmission deficits or inhibition deficits. We used the picture-word interference paradigm to examine age-related differences in picture naming as a function of both target frequency and the relationship between the target picture and distractor word. We found that the presence of a categorically related distractor led to greater semantic elaboration by older adults compared to younger adults, as evidenced by older adults' increased recruitment of regions including the left middle frontal gyrus and bilateral precuneus. When presented with a phonologically related distractor, patterns of neural activation are consistent with previously observed age deficits in phonological processing, including age-related reductions in the recruitment of regions such as the left middle temporal gyrus and right supramarginal gyrus. Lastly, older, but not younger, adults show increased brain activation of the pre- and postcentral gyri as a function of decreasing target frequency when target pictures are paired with a phonological distractor, suggesting that cuing the phonology of the target disproportionately aids production of low-frequency items. Overall, this pattern of results is generally consistent with the transmission deficit hypothesis, illustrating that links within the phonological system, but not the semantic system, are weakened with age.
Network-based modeling and intelligent data mining of social media for improving care.
Akay, Altug; Dragomir, Andrei; Erlandsson, Bjorn-Erik
2015-01-01
Intelligently extracting knowledge from social media has recently attracted great interest from the Biomedical and Health Informatics community to simultaneously improve healthcare outcomes and reduce costs using consumer-generated opinion. We propose a two-step analysis framework that focuses on positive and negative sentiment, as well as the side effects of treatment, in users' forum posts, and identifies user communities (modules) and influential users for the purpose of ascertaining user opinion of cancer treatment. We used a self-organizing map to analyze word frequency data derived from users' forum posts. We then introduced a novel network-based approach for modeling users' forum interactions and employed a network partitioning method based on optimizing a stability quality measure. This allowed us to determine consumer opinion and identify influential users within the retrieved modules using information derived from both word-frequency data and network-based properties. Our approach can expand research into intelligently mining social media data for consumer opinion of various treatments to provide rapid, up-to-date information for the pharmaceutical industry, hospitals, and medical staff, on the effectiveness (or ineffectiveness) of future treatments.
WORD FREQUENCY IN THE MODERN GERMAN SHORT STORY. FINAL REPORT.
ERIC Educational Resources Information Center
SCHERER, GEORGE A.; AND OTHERS
A LIST OF THE MOST FREQUENTLY USED WORDS IN MODERN GERMAN SHORT STORIES WAS COMPILED. AN ANTHOLOGY OF 702 RECENTLY PUBLISHED, GERMAN SHORT STORIES WAS OBTAINED AND USED FOR A WORD COUNT, INVOLVING THE RANDOM SELECTION OF 4 WORDS IN EVERY 100-WORD PASSAGE. TWO INDEPENDENT RANDOM SAMPLES OF ABOUT 80,000 WORDS EACH WERE DRAWN FROM NEARLY 2 MILLION…
Hoffman, Paul; Lambon Ralph, Matthew A; Rogers, Timothy T
2013-09-01
Semantic ambiguity is typically measured by summing the number of senses or dictionary definitions that a word has. Such measures are somewhat subjective and may not adequately capture the full extent of variation in word meaning, particularly for polysemous words that can be used in many different ways, with subtle shifts in meaning. Here, we describe an alternative, computationally derived measure of ambiguity based on the proposal that the meanings of words vary continuously as a function of their contexts. On this view, words that appear in a wide range of contexts on diverse topics are more variable in meaning than those that appear in a restricted set of similar contexts. To quantify this variation, we performed latent semantic analysis on a large text corpus to estimate the semantic similarities of different linguistic contexts. From these estimates, we calculated the degree to which the different contexts associated with a given word vary in their meanings. We term this quantity a word's semantic diversity (SemD). We suggest that this approach provides an objective way of quantifying the subtle, context-dependent variations in word meaning that are often present in language. We demonstrate that SemD is correlated with other measures of ambiguity and contextual variability, as well as with frequency and imageability. We also show that SemD is a strong predictor of performance in semantic judgments in healthy individuals and in patients with semantic deficits, accounting for unique variance beyond that of other predictors. SemD values for over 30,000 English words are provided as supplementary materials.
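As a rough illustration of the idea described above, the sketch below scores a word by how dissimilar its contexts are to one another. It assumes the context vectors have already been produced by latent semantic analysis, and it assumes the score is the negated log of the mean pairwise cosine similarity; both the function name `semantic_diversity` and that exact transformation are assumptions here, not details given in the abstract.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def semantic_diversity(context_vectors):
    """Negated log of the mean pairwise cosine similarity of a word's contexts.

    Assumes at least two contexts and a positive mean similarity.
    Similar contexts give values near 0; varied contexts give larger values.
    """
    sims = [cosine(u, v) for u, v in combinations(context_vectors, 2)]
    return -math.log(sum(sims) / len(sims))
```

A word whose contexts all point in the same direction scores 0, and the score grows as the contexts spread apart in the semantic space.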
Influence of Lexical Factors on Word-Finding Accuracy, Error Patterns, and Substitution Types
ERIC Educational Resources Information Center
Newman, Rochelle S.; German, Diane J.; Jagielko, Jennifer R.
2018-01-01
This retrospective, exploratory investigation examined the types of target words that 66 children with/without word-finding difficulties (WFD) had difficulty naming, and the types of errors they made. Words were studied with reference to lexical factors (LFs) that might influence naming performance: word frequency, familiarity, length, phonotactic…
ERIC Educational Resources Information Center
Jones, Lyle V.; Wepman, Joseph M.
This word count is a composite listing of the different words spoken by a selected sample of 54 English-speaking adults and the frequency with which each of the different words was used in a particular test. The stimulus situation was identical for each subject and consisted of 20 cards of the Thematic Apperception Test. Although most word counts…
Languages cool as they expand: Allometric scaling and the decreasing need for new words
Petersen, Alexander M.; Tenenbaum, Joel N.; Havlin, Shlomo; Stanley, H. Eugene; Perc, Matjaž
2012-01-01
We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use which has a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This “cooling pattern” forms the basis of a third statistical regularity, which unlike the Zipf and the Heaps law, is dynamical in nature. PMID:23230508
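The two classic regularities mentioned above can be illustrated on any tokenized text. The following is a minimal sketch, not the authors' analysis: `zipf_exponent` fits a single log-log slope to the rank-frequency counts (the paper reports two scaling regimes, which a single fit ignores), and `heaps_curve` tracks vocabulary size against corpus size. All function names are hypothetical.

```python
import math
from collections import Counter

def rank_frequency(tokens):
    """Word counts sorted from most to least frequent (index 0 is rank 1)."""
    return sorted(Counter(tokens).values(), reverse=True)

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

def zipf_exponent(tokens):
    """Exponent z in count ~ rank**(-z), from one fit over all ranks."""
    counts = rank_frequency(tokens)
    ranks = range(1, len(counts) + 1)
    return -loglog_slope(ranks, counts)

def heaps_curve(tokens):
    """List of (tokens read so far, distinct words seen so far)."""
    seen = set()
    curve = []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((n, len(seen)))
    return curve
```

Passing the sizes and vocabulary counts from `heaps_curve` to `loglog_slope` gives a rough estimate of the allometric (Heaps) exponent relating vocabulary size to corpus size.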
Languages cool as they expand: Allometric scaling and the decreasing need for new words
NASA Astrophysics Data System (ADS)
Petersen, Alexander M.; Tenenbaum, Joel N.; Havlin, Shlomo; Stanley, H. Eugene; Perc, Matjaž
2012-12-01
We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use which has a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This “cooling pattern” forms the basis of a third statistical regularity, which unlike the Zipf and the Heaps law, is dynamical in nature.
A Match Method Based on Latent Semantic Analysis for Earthquake Hazard Emergency Plan
NASA Astrophysics Data System (ADS)
Sun, D.; Zhao, S.; Zhang, Z.; Shi, X.
2017-09-01
The structure of an earthquake emergency plan is complex, and it is difficult for decision makers to reach a decision in a short time. To solve this problem, this paper presents a match method based on Latent Semantic Analysis (LSA). After word-segmentation preprocessing of the emergency plan, we extract keywords according to part of speech and word frequency. Then, through LSA, we map the documents and query information into the semantic space and calculate the correlation between documents and queries from the relation between their vectors. The experimental results indicate that LSA can efficiently improve the accuracy of emergency plan retrieval.
A Hierarchy of "Confusable" High-Frequency Words in Isolation and Context.
ERIC Educational Resources Information Center
Krieger, Veronica K.
1981-01-01
The study assessed the degree to which two sets of disabled readers (17 clinic students with a mean CA of 9 years and 13 fourth-grade disabled readers) commonly and identically confuse high frequency sight words in isolation and context. (Author/CL)
Regional amplitude of the low-frequency fluctuations at rest predicts word-reading skill.
Xu, M; De Beuckelaer, A; Wang, X; Liu, L; Song, Y; Liu, J
2015-07-09
Individuals' reading skills are critical for their educational development, but variation in reading skills is known to be large. The present study used functional magnetic resonance imaging (fMRI) to examine the role of spontaneous brain activity at rest in individual differences in reading skills in a large sample of participants (N=263). Specifically, we correlated individuals' word-reading skill with their fractional amplitude of low-frequency fluctuation (fALFF) of the whole brain at rest and found that the fALFFs of both the bilateral precentral gyrus (PCG) and superior temporal plane (STP) were positively associated with reading skills. The fALFF-reading association observed in these two regions remained after controlling for general cognitive abilities and in-scanner head motion. A cross-validation confirmed that the individual differences in word-reading skills were reliably correlated with the fALFF values of the bilateral PCG and STP. A follow-up task-based fMRI experiment revealed that the reading-related regions overlapped with regions showing a higher response to sentences than to pseudo-sentences (strings of pseudo-words), suggesting the resting-state brain activity partly captures the characteristics of task-based brain activity. In short, our study provides one of the first pieces of evidence that links spontaneous brain activity to reading behavior and offers an easy-to-access neural marker for evaluating reading skill. Copyright © 2015 IBRO. Published by Elsevier Ltd. All rights reserved.
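For readers unfamiliar with the measure, fALFF is usually described as the share of a voxel's spectral amplitude that falls in a low-frequency band. The sketch below assumes the commonly used 0.01 to 0.08 Hz band and a plain discrete Fourier transform over a single time series; the band limits, the absence of any preprocessing, and the function names are assumptions, since the abstract does not specify them.

```python
import cmath
import math

def amplitude_spectrum(signal, dt):
    """(frequency in Hz, amplitude) pairs for the positive non-zero DFT bins.

    dt is the sampling interval in seconds (the repetition time for fMRI).
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        spectrum.append((k / (n * dt), abs(coeff)))
    return spectrum

def falff(signal, dt, low=0.01, high=0.08):
    """Summed amplitude inside [low, high] Hz divided by the summed amplitude overall."""
    spectrum = amplitude_spectrum(signal, dt)
    total = sum(a for _, a in spectrum)
    band = sum(a for f, a in spectrum if low <= f <= high)
    return band / total
```

A time series dominated by slow fluctuations yields a value near 1, and one dominated by faster fluctuations yields a value near 0.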
Woodhead, Zoe Victoria Joan; Wise, Richard James Surtees; Sereno, Marty; Leech, Robert
2011-10-01
Different cortical regions within the ventral occipitotemporal junction have been reported to show preferential responses to particular objects. Thus, it is argued that there is evidence for a left-lateralized visual word form area and a right-lateralized fusiform face area, but the unique specialization of these areas remains controversial. Words are characterized by greater power in the high spatial frequency (SF) range, whereas faces comprise a broader range of high and low frequencies. We investigated how these high-order visual association areas respond to simple sine-wave gratings that varied in SF. Using functional magnetic resonance imaging, we demonstrated lateralization of activity that was concordant with the low-level visual property of words and faces; left occipitotemporal cortex is more strongly activated by high than by low SF gratings, whereas the right occipitotemporal cortex responded more to low than high spatial frequencies. Therefore, the SF of a visual stimulus may bias the lateralization of processing irrespective of its higher order properties.
Lupker, Stephen J; Pexman, Penny M
2010-09-01
Performance in a lexical decision task is crucially dependent on the difficulty of the word-nonword discrimination. More wordlike nonwords cause not only a latency increase for words but also, as reported by Stone and Van Orden (1993), larger word frequency effects. Several current models of lexical decision making can explain these types of results in terms of a single mechanism, a mechanism driven by the nature of the interactions within the lexicon. In 2 experiments, we replicated Stone and Van Orden's increased frequency effect using both pseudohomophones (e.g., BEEST) and transposed-letter nonwords (e.g., JUGDE) as the more wordlike nonwords. In a 3rd experiment, we demonstrated that simply increasing word latencies without changing the difficulty of the word-nonword discrimination does not produce larger frequency effects. These results are reasonably consistent with many current models. In contrast, neither pseudohomophones nor transposed-letter nonwords altered the size of semantic priming effects across 4 additional experiments, posing a challenge to models that would attempt to explain both nonword difficulty effects and semantic priming effects in lexical decision tasks in terms of a single, lexically driven mechanism. (c) 2010 APA, all rights reserved.
ERIC Educational Resources Information Center
Porritt, Laura L.; Zinser, Michael C.; Bachorowski, Jo-Anne; Kaplan, Peter S.
2014-01-01
F[subscript 0]-based acoustic measures were extracted from a brief, sentence-final target word spoken during structured play interactions between mothers and their 3- to 14-month-old infants and were analyzed based on demographic variables and DSM-IV Axis-I clinical diagnoses and their common modifiers. F[subscript 0] range (ΔF[subscript 0]) was…
The impact of impaired semantic knowledge on spontaneous iconic gesture production
Cocks, Naomi; Dipper, Lucy; Pritchard, Madeleine; Morgan, Gary
2013-01-01
Background Previous research has found that people with aphasia produce more spontaneous iconic gesture than control participants, especially during word-finding difficulties. There is some evidence that impaired semantic knowledge impacts on the diversity of gestural handshapes, as well as the frequency of gesture production. However, no previous research has explored how impaired semantic knowledge impacts on the frequency and type of iconic gestures produced during fluent speech compared with those produced during word-finding difficulties. Aims To explore the impact of impaired semantic knowledge on the frequency and type of iconic gestures produced during fluent speech and those produced during word-finding difficulties. Methods & Procedures A group of 29 participants with aphasia and 29 control participants were video recorded describing a cartoon they had just watched. All iconic gestures were tagged and coded as either “manner,” “path only,” “shape outline” or “other”. These gestures were then separated into either those occurring during fluent speech or those occurring during a word-finding difficulty. The relationships between semantic knowledge and gesture frequency and form were then investigated in the two different conditions. Outcomes & Results As expected, the participants with aphasia produced a higher frequency of iconic gestures than the control participants, but when the iconic gestures produced during word-finding difficulties were removed from the analysis, the frequency of iconic gesture was not significantly different between the groups. While there was not a significant relationship between the frequency of iconic gestures produced during fluent speech and semantic knowledge, there was a significant positive correlation between semantic knowledge and the proportion of word-finding difficulties that contained gesture. 
There was also a significant positive correlation between the speakers' semantic knowledge and the proportion of gestures that were produced during fluent speech that were classified as “manner”. Finally while not significant, there was a positive trend between semantic knowledge of objects and the production of “shape outline” gestures during word-finding difficulties for objects. Conclusions The results indicate that impaired semantic knowledge in aphasia impacts on both the iconic gestures produced during fluent speech and those produced during word-finding difficulties but in different ways. These results shed new light on the relationship between impaired language and iconic co-speech gesture production and also suggest that analysis of iconic gesture may be a useful addition to clinical assessment. PMID:24058228
Tse, Chi-Shing; Kurby, Christopher A.; Du, Feng
2010-01-01
We examined the effect of spatial iconicity (a perceptual simulation of canonical locations of objects) and word-order frequency on language processing and episodic memory of orientation. Participants made speeded relatedness judgments to pairs of words presented in locations typical to their real world arrangements (e.g., ceiling on top and floor on bottom). They then engaged in a surprise orientation recognition task for the word pairs. We replicated Louwerse’s finding (2008) that word-order frequency has a stronger effect on semantic relatedness judgments than spatial iconicity. This is consistent with recent suggestions that linguistic representations have a stronger impact on immediate decisions about verbal materials than perceptual simulations. In contrast, spatial iconicity enhanced episodic memory of orientation to a greater extent than word-order frequency did. This new finding indicates that perceptual simulations have an important role in episodic memory. Results are discussed with respect to theories of perceptual representation and linguistic processing. PMID:19742388
Dispersion and Frequency: Is There Any Difference as Regards Their Relation to L2 Vocabulary Gains?
ERIC Educational Resources Information Center
Alcaraz-Mármol, Gema
2015-01-01
Despite the current importance given to L2 vocabulary acquisition in the last two decades, considerable deficiencies are found in L2 students' vocabulary size. One of the aspects that may influence vocabulary learning is word frequency. However, scholars warn that frequency may lead to wrong conclusions if the way words are distributed is ignored.…
NASA Astrophysics Data System (ADS)
Alfarizy, A. D.; Indahwati; Sartono, B.
2017-03-01
Indonesia was the Hollywood movie industry's largest target market in Southeast Asia in 2015. Hollywood movies distributed in Indonesia target people of all ages, including children. Because awareness of the need to guide children while they watch movies is low, children can watch films of any rating, even ones unsuitable for their age. Even after a film has been translated into Bahasa and has passed the censorship phase, words that are unsuitable for children remain. The purpose of this research is to cluster box-office Hollywood movies based on Indonesian subtitles, revenue, IMDb user rating and genre, as a reference for adults choosing suitable movies for their children to watch. Text mining is used to extract words from the subtitles and count the frequency of three groups of words (bad words, sexual words and terror words), while the Partitioning Around Medoids (PAM) algorithm, with the Gower similarity coefficient as the proximity matrix, is used as the clustering method. We clustered 624 movies from IMDb, spanning 2006 to the first half of 2016. The solution with the highest silhouette coefficient (0.36) has 5 clusters. Animation, Adventure and Comedy movies with high revenue, as in cluster 5, are recommended for children, while Comedy movies with high revenue, as in cluster 4, should be avoided.
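The Gower coefficient is what allows numeric features (word-group frequencies, revenue, rating) and categorical ones (genre) to share a single proximity matrix. The following is a minimal sketch of the corresponding distance, assuming one genre label per movie and equal feature weights; the feature names and the `gower_distance` helper are illustrative only and do not come from the study.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower distance between two records given as dicts of feature -> value.

    numeric_ranges maps each numeric feature to its (max - min) over the data set;
    every feature not listed there is treated as categorical.
    """
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            rng = numeric_ranges[key]
            # Range-normalised absolute difference, 0 when the feature is constant.
            total += abs(a[key] - b[key]) / rng if rng else 0.0
        else:
            # Simple matching for categorical features.
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)
```

The matrix of such pairwise distances (one minus the Gower similarity) is what a medoid-based method such as PAM would take as input.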
Phonotactic Probability Effects in Children Who Stutter
Anderson, Julie D.; Byrd, Courtney T.
2008-01-01
Purpose: The purpose of this study was to examine the influence of phonotactic probability, the frequency of different sound segments and segment sequences, on the overall fluency with which words are produced by preschool children who stutter (CWS), as well as to determine whether it has an effect on the type of stuttered disfluency produced. Method: A 500+ word language sample was obtained from 19 CWS. Each stuttered word was randomly paired with a fluently produced word that closely matched it in grammatical class, word length, familiarity, word and neighborhood frequency, and neighborhood density. Phonotactic probability values were obtained for the stuttered and fluent words from an online database. Results: Phonotactic probability did not have a significant influence on the overall susceptibility of words to stuttering, but it did impact the type of stuttered disfluency produced. Specifically, single-syllable word repetitions were significantly lower in phonotactic probability than fluently produced words, as well as part-word repetitions and sound prolongations. Conclusions: In general, the differential impact of phonotactic probability on the type of stuttering-like disfluency produced by young CWS provides some support for the notion that different disfluency types may originate in the disruption of different levels of processing. PMID:18658056
Yap, Melvin J; Balota, David A; Cortese, Michael J; Watson, Jason M
2006-12-01
This article evaluates 2 competing models that address the decision-making processes mediating word recognition and lexical decision performance: a hybrid 2-stage model of lexical decision performance and a random-walk model. In 2 experiments, nonword type and word frequency were manipulated across 2 contrasts (pseudohomophone-legal nonword and legal-illegal nonword). When nonwords became more wordlike (i.e., BRNTA vs. BRANT vs. BRANE), response latencies to nonwords were slowed and the word frequency effect increased. More important, distributional analyses revealed that the Nonword Type × Word Frequency interaction was modulated by different components of the response time distribution, depending on the specific nonword contrast. A single-process random-walk model was able to account for this particular set of findings more successfully than the hybrid 2-stage model. (c) 2006 APA, all rights reserved.
The Influence of Contextual Diversity on Eye Movements in Reading
ERIC Educational Resources Information Center
Plummer, Patrick; Perea, Manuel; Rayner, Keith
2014-01-01
Recent research has shown contextual diversity (i.e., the number of passages in which a given word appears) to be a reliable predictor of word processing difficulty. It has also been demonstrated that word-frequency has little or no effect on word recognition speed when accounting for contextual diversity in isolated word processing tasks. An…
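The distinction this abstract draws between word frequency and contextual diversity can be made concrete with a small sketch. It assumes a toy corpus of three passages and whitespace tokenization; the function name `frequency_and_diversity` is hypothetical and is not taken from the study.

```python
from collections import Counter


def frequency_and_diversity(passages):
    """Return (word_frequency, contextual_diversity) counters.

    word_frequency counts every occurrence of a word;
    contextual_diversity counts the number of passages containing it.
    """
    frequency = Counter()
    diversity = Counter()
    for passage in passages:
        tokens = passage.lower().split()
        frequency.update(tokens)       # every occurrence counts
        diversity.update(set(tokens))  # each passage counts at most once
    return frequency, diversity


# Toy corpus (illustrative only): "storm" is frequent but confined to one passage.
passages = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "a storm a storm a storm",
]
freq, cd = frequency_and_diversity(passages)
print(freq["storm"], cd["storm"])  # high frequency, low diversity
print(freq["cat"], cd["cat"])      # lower frequency, higher diversity
```

In this toy corpus "storm" occurs more often than "cat" yet appears in fewer passages, which is the kind of dissociation that lets the two measures be compared as predictors.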
ERIC Educational Resources Information Center
Beaumont, Lee R.
1970-01-01
The level of difficulty of straight copy, which is used to measure typewriting speed, is influenced by syllable intensity (the average number of syllables per word), stroke intensity (average number of strokes per word), and high-frequency words. (CH)
Lewis, Ashley Glen; Schriefers, Herbert; Bastiaansen, Marcel; Schoffelen, Jan-Mathijs
2018-05-21
Reinstatement of memory-related neural activity measured with high temporal precision potentially provides a useful index for real-time monitoring of the timing of activation of memory content during cognitive processing. The utility of such an index extends to any situation where one is interested in the (relative) timing of activation of different sources of information in memory, a paradigm case of which is tracking lexical activation during language processing. Essential for this approach is that memory reinstatement effects are robust, so that their absence (in the average) definitively indicates that no lexical activation is present. We used electroencephalography to test the robustness of a reported subsequent memory finding involving reinstatement of frequency-specific entrained oscillatory brain activity during subsequent recognition. Participants learned lists of words presented on a background flickering at either 6 or 15 Hz to entrain a steady-state brain response. Target words subsequently presented on a non-flickering background that were correctly identified as previously seen exhibited reinstatement effects at both entrainment frequencies. Reliability of these statistical inferences was however critically dependent on the approach used for multiple comparisons correction. We conclude that effects are not robust enough to be used as a reliable index of lexical activation during language processing.
Whitford, Veronica; Joanisse, Marc F
2018-09-01
An extensive body of research has examined reading acquisition and performance in monolingual children. Surprisingly, however, much less is known about reading in bilingual children, who outnumber monolingual children globally. Here, we address this important imbalance in the literature by employing eye movement recordings to examine both global (i.e., text-level) and local (i.e., word-level) aspects of monolingual and bilingual children's reading performance across their first-language (L1) and second-language (L2). We also had a specific focus on lexical accessibility, indexed by word frequency effects. We had three main findings. First, bilingual children displayed reduced global and local L1 reading performance relative to monolingual children, including larger L1 word frequency effects. Second, bilingual children displayed reduced global and local L2 versus L1 reading performance, including larger L2 word frequency effects. Third, both groups of children displayed reduced global and local reading performance relative to adult comparison groups (across their known languages), including larger word frequency effects. Notably, our first finding was not captured by traditional offline measures of reading, such as standardized tests, suggesting that these measures may lack the sensitivity to detect such nuanced between-group differences in reading performance. Taken together, our findings demonstrate that bilingual children's simultaneous exposure to two reading systems leads to eye movement reading behavior that differs from that of monolingual children and has important consequences for how lexical information is accessed and integrated in both languages. Copyright © 2018 Elsevier Inc. All rights reserved.
Pittman, A L; Lewis, D E; Hoover, B M; Stelmachowicz, P G
2005-12-01
This study examined rapid word-learning in 5- to 14-year-old children with normal and impaired hearing. The effects of age and receptive vocabulary were examined as well as those of high-frequency amplification. Novel words were low-pass filtered at 4 kHz (typical of current amplification devices) and at 9 kHz. It was hypothesized that (1) the children with normal hearing would learn more words than the children with hearing loss, (2) word-learning would increase with age and receptive vocabulary for both groups, and (3) both groups would benefit from a broader frequency bandwidth. Sixty children with normal hearing and 37 children with moderate sensorineural hearing losses participated in this study. Each child viewed a 4-minute animated slideshow containing 8 nonsense words created using the 24 English consonant phonemes (3 consonants per word). Each word was repeated 3 times. Half of the 8 words were low-pass filtered at 4 kHz and half were filtered at 9 kHz. After viewing the story twice, each child was asked to identify the words from among pictures in the slide show. Before testing, a measure of current receptive vocabulary was obtained using the Peabody Picture Vocabulary Test (PPVT-III). The PPVT-III scores of the hearing-impaired children were consistently poorer than those of the normal-hearing children across the age range tested. A similar pattern of results was observed for word-learning in that the performance of the hearing-impaired children was significantly poorer than that of the normal-hearing children. Further analysis of the PPVT and word-learning scores suggested that although word-learning was reduced in the hearing-impaired children, their performance was consistent with their receptive vocabularies. Additionally, no correlation was found between overall performance and the age of identification, age of amplification, or years of amplification in the children with hearing loss. 
Results also revealed a small increase in performance for both groups in the extended bandwidth condition but the difference was not significant at the traditional p = 0.05 level. The ability to learn words rapidly appears to be poorer in children with hearing loss over a wide range of ages. These results coincide with the consistently poorer receptive vocabularies for these children. Neither the word-learning nor the receptive-vocabulary measures were related to the amplification histories of these children. Finally, providing an extended high-frequency bandwidth did not significantly improve rapid word-learning for either group with these stimuli.
[A study on English loan words in French plastic surgery].
Hansson, E; Tegelberg, E
2014-10-01
The French language is used less and less as an international scientific language and many French researchers publish their work in English. Nowadays, Annales de Chirurgie Plastique Esthétique is the only international plastic surgical journal published completely in French. The use of English loan words in French plastic surgery has never been studied. The aim of this study was to describe the frequency and types of English loan words in French plastic surgery. A corpus consisting of all the articles in a number of Annales de Chirurgie Plastique Esthétique, chosen by default, was created. The frequency of English loan words was calculated and the types of words were analysed. The corpus contains 367 (0.8%) English loan words. Most of them are non-integrated loan words and calques. The majority of the plastic surgical loan words describe surgical techniques. The French plastic surgical language seems to be influenced by English. The usage of loan words does not always follow the recommendations and the usage is sometimes ambiguous. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis.
Bonham-Carter, Oliver; Steele, Joe; Bastola, Dhundy
2014-11-01
Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base-base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel-Ziv techniques from data compression. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
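As a rough illustration of the word-frequency idea behind the D2 statistic mentioned in this abstract, the sketch below counts overlapping words of length k (k-mers) in two sequences and sums the products of the matching counts. The function names and the toy sequences are assumptions made for illustration; the review itself may present the statistic in a normalized or otherwise refined form.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count all overlapping words of length k in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Unnormalized D2: sum over shared k-mers of count_a(w) * count_b(w)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    shared = counts_a.keys() & counts_b.keys()
    return sum(counts_a[w] * counts_b[w] for w in shared)


# Toy sequences (illustrative only); no alignment is computed at any point.
print(d2("ACGTACGT", "ACGT", 2))
print(d2("AAAA", "CCCC", 2))
```

Because only word counts are compared, the score is unaffected by the order in which shared segments occur, which is why such measures are described as robust to rearrangement.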
Automated video-based assessment of surgical skills for training and evaluation in medical schools.
Zia, Aneeq; Sharma, Yachna; Bettadapura, Vinay; Sarin, Eric L; Ploetz, Thomas; Clements, Mark A; Essa, Irfan
2016-09-01
Routine evaluation of basic surgical skills in medical schools requires considerable time and effort from supervising faculty. For each surgical trainee, a supervisor has to observe the trainee in person. Alternatively, supervisors may use training videos, which reduces some of the logistical overhead. All these approaches, however, are still incredibly time consuming and involve human bias. In this paper, we present an automated system for surgical skills assessment by analyzing video data of surgical activities. We compare different techniques for video-based surgical skill evaluation. We use techniques that capture the motion information at a coarser granularity using symbols or words, extract motion dynamics using textural patterns in a frame kernel matrix, and analyze fine-grained motion information using frequency analysis. We were successfully able to classify surgeons into different skill levels with high accuracy. Our results indicate that fine-grained analysis of motion dynamics via frequency analysis is most effective in capturing the skill relevant information in surgical videos. Our evaluations show that frequency features perform better than motion texture features, which in turn perform better than symbol-/word-based features. Put succinctly, skill classification accuracy is positively correlated with motion granularity as demonstrated by our results on two challenging video datasets.
Computer Supported Indexing: A History and Evaluation of NASA's MAI System. Supplement 24
NASA Technical Reports Server (NTRS)
Silvester, June P.
1997-01-01
Computer supported indexing systems may be categorized in several ways. One classification scheme refers to them as statistical, syntactic, semantic or knowledge-based. While a system may emphasize one of these aspects, most systems actually combine two or more of these mechanisms to maximize system efficiency. Statistical systems can be based on counts of words or word stems, statistical association, and correlation techniques that assign weights to word locations or provide lexical disambiguation, calculations regarding the likelihood of word co-occurrences, clustering of word stems and transformations, or any other computational method used to identify pertinent terms. If words are counted, the ones of median frequency become candidate index terms. Syntactical systems stress grammar and identify parts of speech. Concepts found in designated grammatical combinations, such as noun phrases, generate the suggested terms. Semantic systems are concerned with the context sensitivity of words in text. The primary goal of this type of indexing is to identify without regard to syntax the subject matter and the context-bearing words in the text being indexed. Knowledge-based systems provide a conceptual network that goes past thesaurus or equivalent relationships to knowing (e.g., in the National Library of Medicine (NLM) system) that because the tibia is part of the leg, a document relating to injuries to the tibia should be indexed to LEG INJURIES, not the broader MeSH term INJURIES, or knowing that the term FEMALE should automatically be added when the term PREGNANCY is assigned, and also that the indexer should be prompted to add either HUMAN or ANIMAL. Another way of categorizing indexing systems is to identify them as producing either assigned- or derived-term indexes.
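The statistical approach described in this abstract, where counted words of middling frequency become candidate index terms, can be sketched as a simple frequency band filter. This is a minimal illustration only: the function name `candidate_index_terms` and the `min_count`/`max_count` cutoffs are hypothetical and do not describe how the NASA MAI system works.

```python
from collections import Counter


def candidate_index_terms(tokens, min_count=2, max_count=3):
    """Keep words whose counts fall inside a middle frequency band.

    Very frequent words (often function words) and very rare words
    are both dropped; the band limits are illustrative assumptions.
    """
    counts = Counter(t.lower() for t in tokens)
    return sorted(w for w, c in counts.items() if min_count <= c <= max_count)


# Toy token stream (illustrative only).
tokens = "the engine the pump the engine the valve the pump the seal".split()
print(candidate_index_terms(tokens))
```

Here "the" is excluded for being too frequent and "valve" and "seal" for being too rare, leaving the mid-frequency content words as candidates.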
Meyer, Ted A.; Frisch, Stefan A.; Pisoni, David B.; Miyamoto, Richard T.; Svirsky, Mario A.
2012-01-01
Hypotheses Do cochlear implants provide enough information to allow adult cochlear implant users to understand words in ways that are similar to listeners with acoustic hearing? Can we use a computational model to gain insight into the underlying mechanisms used by cochlear implant users to recognize spoken words? Background The Neighborhood Activation Model has been shown to be a reasonable model of word recognition for listeners with normal hearing. The Neighborhood Activation Model assumes that words are recognized in relation to other similar-sounding words in a listener’s lexicon. The probability of correctly identifying a word is based on the phoneme perception probabilities from a listener’s closed-set consonant and vowel confusion matrices modified by the relative frequency of occurrence of the target word compared with similar-sounding words (neighbors). Common words with few similar-sounding neighbors are more likely to be selected as responses than less common words with many similar-sounding neighbors. Recent studies have shown that several of the assumptions of the Neighborhood Activation Model also hold true for cochlear implant users. Methods Closed-set consonant and vowel confusion matrices were obtained from 26 postlingually deafened adults who use cochlear implants. Confusion matrices were used to represent input errors to the Neighborhood Activation Model. Responses to the different stimuli were then generated by the Neighborhood Activation Model after incorporating the frequency of occurrence counts of the stimuli and their neighbors. Model outputs were compared with obtained performance measures on the Consonant-Vowel Nucleus-Consonant word test. Information transmission analysis was used to assess whether the Neighborhood Activation Model was able to successfully generate and predict word and individual phoneme recognition by cochlear implant users. 
Results The Neighborhood Activation Model predicted Consonant-Vowel Nucleus-Consonant test words at levels similar to those correctly identified by the cochlear implant users. The Neighborhood Activation Model also predicted phoneme feature information well. Conclusion The results obtained suggest that the Neighborhood Activation Model provides a reasonable explanation of word recognition by postlingually deafened adults after cochlear implantation. It appears that multichannel cochlear implants give cochlear implant users access to their mental lexicons in a manner that is similar to listeners with acoustic hearing. The lexical properties of the test stimuli used to assess performance are important to spoken-word recognition and should be included in further models of the word recognition process. PMID:12851554
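The abstract describes identification probability as phoneme-based perception probabilities weighted by the frequency of the target relative to its similar-sounding neighbors. A frequency-weighted choice rule of that general shape can be sketched as follows. The exact formula, the function name, and all numbers are assumptions for illustration; the abstract does not give the authors' implementation.

```python
def identification_probability(target, neighbors):
    """Frequency-weighted choice rule (illustrative sketch).

    target and each neighbor are (perceptual_probability, frequency) pairs.
    The target's weighted evidence is divided by the summed weighted
    evidence of the target and all of its neighbors.
    """
    def weight(item):
        probability, frequency = item
        return probability * frequency

    target_weight = weight(target)
    total = target_weight + sum(weight(n) for n in neighbors)
    return target_weight / total if total > 0 else 0.0


# Hypothetical values: a common word with one rare neighbor,
# and a rare word with four common neighbors.
p_common = identification_probability((0.8, 100), [(0.1, 10)])
p_rare = identification_probability((0.8, 5), [(0.1, 50)] * 4)
print(round(p_common, 3), round(p_rare, 3))
```

With the same perceptual probability for both targets, the common word with few neighbors is selected far more often than the rare word with many neighbors, matching the qualitative pattern stated in the abstract.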
The Development of Word Frequency Lists Prior to the 1944 Thorndike-Lorge List.
ERIC Educational Resources Information Center
Bontrager, Terry
1991-01-01
Examines the word frequency studies that preceded the 1944 Thorndike-Lorge count and places those investigations in their broad, cultural perspective. Draws attention to the impact of the studies on knowledge about language and its development, educational curriculum and assessment, and methods of research. (MG)
Short-Term and Long-Term Effects on Visual Word Recognition
ERIC Educational Resources Information Center
Protopapas, Athanassios; Kapnoula, Efthymia C.
2016-01-01
Effects of lexical and sublexical variables on visual word recognition are often treated as homogeneous across participants and stable over time. In this study, we examine the modulation of frequency, length, syllable and bigram frequency, orthographic neighborhood, and graphophonemic consistency effects by (a) individual differences, and (b) item…
Vocabulary in First-Year German Texts
ERIC Educational Resources Information Center
Tussing, Marjorie; Zimmermann, Jon
1977-01-01
Vocabulary in first-year high school and college level German texts was studied. It was suggested that textbook authors use published word frequency lists to develop teaching materials that are practical, useful, and relevant to the process of communication. A word list occurring in 12 textbooks is presented in descending frequency. (SW)
ERIC Educational Resources Information Center
Lehtonen, Minna; Hulten, Annika; Rodriguez-Fornells, Antoni; Cunillera, Toni; Tuomainen, Jyrki; Laine, Matti
2012-01-01
We investigated the behavioral and brain responses (ERPs) of bilingual word recognition to three fundamental psycholinguistic factors, frequency, morphology, and lexicality, in early bilinguals vs. monolinguals. Earlier behavioral studies have reported larger frequency effects in bilinguals' nondominant vs. dominant language and in some studies…
Measuring and Predicting Graded Reader Difficulty
ERIC Educational Resources Information Center
Holster, Trevor A.; Lake, J. W.; Pellowe, William R.
2017-01-01
This study used many-faceted Rasch measurement to investigate the difficulty of graded readers using a 3-item survey. Book difficulty was compared with Kyoto Level, Yomiyasusa Level, Lexile Level, book length, mean sentence length, and mean word frequency. Word frequency and Kyoto Level were found to be ineffective in predicting students'…
Hauk, O; Patterson, K; Woollams, A; Watling, L; Pulvermüller, F; Rogers, T T
2006-05-01
Using a speeded lexical decision task, event-related potentials (ERPs), and minimum norm current source estimates, we investigated early spatiotemporal aspects of cortical activation elicited by words and pseudo-words that varied in their orthographic typicality, that is, in the frequency of their component letter pairs (bi-grams) and triplets (tri-grams). At around 100 msec after stimulus onset, the ERP pattern revealed a significant typicality effect, where words and pseudo-words with atypical orthography (e.g., yacht, cacht) elicited stronger brain activation than items characterized by typical spelling patterns (cart, yart). At approximately 200 msec, the ERP pattern revealed a significant lexicality effect, with pseudo-words eliciting stronger brain activity than words. The two main factors interacted significantly at around 160 msec, where words showed a typicality effect but pseudo-words did not. The principal cortical sources of the effects of both typicality and lexicality were localized in the inferior temporal cortex. Around 160 msec, atypical words elicited the stronger source currents in the left anterior inferior temporal cortex, whereas the left perisylvian cortex was the site of greater activation to typical words. Our data support distinct but interactive processing stages in word recognition, with surface features of the stimulus being processed before the word as a meaningful lexical entry. The interaction of typicality and lexicality can be explained by integration of information from the early form-based system and lexicosemantic processes.
Lexical and Semantic Binding in Verbal Short-Term Memory
ERIC Educational Resources Information Center
Jefferies, Elizabeth; Frankish, Clive R.; Ralph, Matthew A. Lambon
2006-01-01
Semantic dementia patients make numerous phoneme migration errors in their immediate serial recall of poorly comprehended words. In this study, similar errors were induced in the word recall of healthy participants by presenting unpredictable mixed lists of words and nonwords. This technique revealed that lexicality, word frequency, imageability,…
On the Word Preferences of Suicidal versus Nonsuicidal College Students.
ERIC Educational Resources Information Center
Thurber, Steven; Torbet, David P.
1978-01-01
A word preference format was used to investigate reactions to verbal stimuli of suicidal and nonsuicidal persons. Words with aggressive or submissive denotative meanings significantly differentiated the two groups. The word "suicide" was selected at a higher frequency level by suicidal individuals when compared to their nonsuicidal counterparts.…
NASA Astrophysics Data System (ADS)
Neuman, Yair; Cohen, Yochai; Israeli, Navot; Tamir, Boaz
2018-02-01
The availability of historical textual corpora has led to the study of words' frequency along the historical time line, as representing the public's focus of attention over time. However, the study of the dynamics of words' meaning is still in its infancy. In this paper, we propose a methodology for studying the historical trajectory of words' meaning through Tsallis entropy. First, we present the idea that the meaning of a word may be studied through the entropy of its embedding. Using two historical case studies, we show that this entropy measure is correlated with the intensity with which a word is used. More importantly, we show that using Tsallis entropy with a superadditive entropy index may provide a better estimation of a word's frequency of use than using Shannon entropy. We explain this finding as resulting from an increasing redundancy between the words that comprise the semantic field of the target word and develop a new measure of redundancy between words. Using this measure, which relies on the Tsallis version of the Kullback-Leibler divergence, we show that the evolving meaning of a word involves the dynamics of increasing redundancy between components of its semantic field. The proposed methodology may enrich the toolkit of researchers who study the dynamics of word senses.
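For readers unfamiliar with the entropy measure named in this abstract, the sketch below computes the standard Tsallis entropy, S_q = (1 - sum_i p_i^q) / (q - 1), which reduces to Shannon entropy (in nats) as q approaches 1. The input is assumed to be an already-normalized probability vector; how the authors derive such a distribution from a word embedding is not stated in the abstract, so the example values are purely illustrative.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in nats of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)


def tsallis_entropy(p, q):
    """Tsallis entropy S_q = (1 - sum_i p_i**q) / (q - 1).

    Falls back to Shannon entropy at q == 1, which is the limiting case.
    """
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy(p)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)


# Illustrative distribution: uniform over four outcomes.
p = [0.25, 0.25, 0.25, 0.25]
print(tsallis_entropy(p, 2.0))   # entropy index above 1
print(tsallis_entropy(p, 0.5))   # entropy index below 1
print(shannon_entropy(p))        # limiting case q -> 1
```

An entropy index below 1 corresponds to the superadditive regime: for independent systems the joint Tsallis entropy then exceeds the sum of the parts.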
Gow, David W.; Olson, Bruna B.
2015-07-01
Phonotactic frequency effects play a crucial role in a number of debates over language processing and representation. It is unclear however, whether these effects reflect prelexical sensitivity to phonotactic frequency, or lexical “gang effects” in speech perception. In this paper, we use Granger causality analysis of MR-constrained MEG/EEG data to understand how phonotactic frequency influences neural processing dynamics during auditory lexical decision. Effective connectivity analysis showed weaker feedforward influence from brain regions involved in acoustic-phonetic processing (superior temporal gyrus) to lexical areas (supramarginal gyrus) for high phonotactic frequency words, but stronger top-down lexical influence for the same items. Low entropy nonwords (nonwords judged to closely resemble real words) showed a similar pattern of interactions between brain regions involved in lexical and acoustic-phonetic processing. These results contradict the predictions of a feedforward model of phonotactic frequency facilitation, but support the predictions of a lexically mediated account. PMID:25883413
Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis
Steele, Joe; Bastola, Dhundy
2014-01-01
Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base–base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel–Ziv techniques from data compression. PMID:23904502
The search for common ground: Part I. Lexical performance by linguistically diverse learners.
Windsor, Jennifer; Kohnert, Kathryn
2004-08-01
This study examines lexical performance by 3 groups of linguistically diverse school-age learners: English-only speakers with primary language impairment (LI), typical English-only speakers (EO), and typical bilingual Spanish-English speakers (BI). The accuracy and response time (RT) of 100 8- to 13-year-old children in word recognition and picture-naming tasks were analyzed. Within each task, stimulus difficulty was manipulated to include very easy stimuli (words that were high frequency/had an early age of acquisition in English) and more difficult stimuli (words of low frequency/late age of acquisition [AOA]). There was no difference among groups in real-word recognition accuracy or RT; all 3 groups showed lower accuracy with low-frequency words. In picture naming, all 3 groups showed a longer RT for words with a late AOA, although AOA had a disproportionate negative impact on BI performance. The EO group was faster and more accurate than both LI and BI groups in conditions with later acquired stimuli. Results are discussed in terms of quantitative differences separating EO children from the other 2 groups and qualitative similarities linking monolingual children with and without LI.
Word retrieval in picture descriptions produced by individuals with Alzheimer's disease
Kavé, Gitit; Goral, Mira
2016-01-01
What can tests of single-word production tell us about word retrieval in connected speech? We examined this question in 20 people with Alzheimer's disease (AD) and in 20 cognitively intact individuals. All participants completed tasks of picture naming and semantic fluency, and provided connected speech through picture descriptions. Picture descriptions were analyzed for total word output, percentages of content words, percentages of nouns, and percentages of pronouns out of all words, type-token ratio of all words and type-token ratio of nouns alone, mean frequency of all words and mean frequency of nouns alone, and mean word length. Individuals with AD performed worse than did cognitively intact individuals on the picture naming and semantic fluency tasks. They also produced a lower proportion of content words overall, a lower proportion of nouns, and a higher proportion of pronouns, as well as more frequent and shorter words on picture descriptions. Group differences in total word output and type-token ratios did not reach significance. Correlations between scores on tasks of single-word retrieval and measures of retrieval in picture descriptions emerged in the AD group but not in the control group. Scores on a picture naming task were associated with difficulties in word retrieval in connected speech in AD, while scores on a task of semantic verbal fluency were less useful in predicting measures of retrieval in context in this population. PMID:27171756
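Two of the connected-speech measures listed in this abstract, type-token ratio and mean word length, are simple enough to sketch directly. The sample description and the whitespace tokenization are assumptions for illustration; the study's actual transcription and scoring conventions are not given in the abstract.

```python
def type_token_ratio(tokens):
    """Number of distinct words (types) divided by total words (tokens)."""
    lowered = [t.lower() for t in tokens]
    return len(set(lowered)) / len(lowered) if lowered else 0.0


def mean_word_length(tokens):
    """Average number of characters per token."""
    return sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0


# Hypothetical picture-description fragment.
description = "the boy is on the stool and the girl".split()
ttr = type_token_ratio(description)
mwl = mean_word_length(description)
print(round(ttr, 3), round(mwl, 3))
```

Type-token ratio is sensitive to sample length, so comparisons like the one in this study are usually made on samples of similar size or with length taken into account.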
Empirical and Theoretical Bases of Zipf's Law.
ERIC Educational Resources Information Center
Wyllys, Ronald E.
1981-01-01
Explains Zipf's Law of Vocabulary Distribution (i.e., relationship between frequency of a word in a corpus and its rank), noting the discovery of the law, alternative forms, and literature relating to the search for a rationale for Zipf's Law. Thirty-eight references are cited. (EJS)
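The rank-frequency relationship summarized above can be sketched in a few lines. This assumes the classic form of Zipf's law, in which frequency is roughly inversely proportional to rank, so that rank times frequency stays approximately constant. The toy corpus is constructed to fit that form exactly and is not real data.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(tokens)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


def zipf_products(tokens):
    """rank * frequency for each word; roughly constant if the classic law holds."""
    return [rank * count for rank, _, count in rank_frequency(tokens)]


# Toy corpus built so that frequency is exactly proportional to 1 / rank.
toy = ("the " * 12 + "of " * 6 + "and " * 4 + "to " * 3).split()
table = rank_frequency(toy)
products = zipf_products(toy)
```

On real corpora the products drift rather than staying flat, which is one motivation for the alternative forms of the law that the article surveys.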
Predictors of photo naming: Dutch norms for 327 photos.
Shao, Zeshu; Stiegert, Julia
2016-06-01
In the present study, we report naming latencies and norms for 327 photos of objects in Dutch. We provide norms for eight psycholinguistic variables: age of acquisition, familiarity, imageability, image agreement, objective and subjective visual complexity, word frequency, word length in syllables and letters, and name agreement. Furthermore, multiple regression analyses revealed that the significant predictors of photo-naming latencies were name agreement, word frequency, imageability, and image agreement. The naming latencies, norms, and stimuli are provided as supplemental materials.
Age/Order of Acquisition Effects and the Cumulative Learning of Foreign Words: A Word Training Study
ERIC Educational Resources Information Center
Izura, Cristina; Perez, Miguel A.; Agallou, Elizabeth; Wright, Victoria C.; Marin, Javier; Stadthagen-Gonzalez, Hans; Ellis, Andrew W.
2011-01-01
Early acquired words are processed faster than later acquired words in lexical and semantic tasks. Demonstrating such age of acquisition (AoA) effects beyond reasonable doubt, and then investigating those effects empirically, is complicated by the natural correlation between AoA and other word properties such as frequency and imageability. In an…
ERIC Educational Resources Information Center
Eckerth, Johannes; Tavakoli, Parveneh
2012-01-01
Research on incidental second language (L2) vocabulary acquisition through reading has claimed that repeated encounters with unfamiliar words and the relative elaboration of processing these words facilitate word learning. However, so far both variables have been investigated in isolation. To help close this research gap, the current study…
An Adaptive Resonance Theory account of the implicit learning of orthographic word forms.
Glotin, H; Warnier, P; Dandurand, F; Dufau, S; Lété, B; Touzet, C; Ziegler, J C; Grainger, J
2010-01-01
An Adaptive Resonance Theory (ART) network was trained to identify unique orthographic word forms. Each word input to the model was represented as an unordered set of ordered letter pairs (open bigrams) that implement a flexible prelexical orthographic code. The network learned to map this prelexical orthographic code onto unique word representations (orthographic word forms). The network was trained on a realistic corpus of reading textbooks used in French primary schools. The amount of training was strictly identical to children's exposure to reading material from grade 1 to grade 5. Network performance was examined at each grade level. Adjustment of the learning and vigilance parameters of the network allowed us to reproduce the developmental growth of word identification performance seen in children. The network exhibited a word frequency effect and was found to be sensitive to the order of presentation of word inputs, particularly with low frequency words. These words were better learned with a randomized presentation order compared with the order of presentation in the school books. These results open up interesting perspectives for the application of ART networks in the study of the dynamics of learning to read.
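The open-bigram code described above, an unordered set of ordered letter pairs, can be sketched as follows. The sketch assumes no limit on the distance between the two letters of a pair. The abstract does not state whether the model restricts that distance, and the Jaccard overlap is only an illustrative similarity measure, not part of the ART network.

```python
from itertools import combinations


def open_bigrams(word):
    """Unordered set of ordered letter pairs; letters need not be adjacent."""
    # combinations() preserves left-to-right order within each pair.
    return {a + b for a, b in combinations(word, 2)}


def overlap(w1, w2):
    """Jaccard index of two open-bigram sets (illustrative measure only)."""
    b1, b2 = open_bigrams(w1), open_bigrams(w2)
    return len(b1 & b2) / len(b1 | b2)


cat_code = open_bigrams("cat")       # {"ca", "ct", "at"}
similarity = overlap("judge", "jugde")  # only "dg" vs "gd" differ
```

Because a transposition changes a single ordered pair, a transposed-letter string keeps most of its open bigrams, which is what makes the code flexible with respect to letter position.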
Letter position coding across modalities: the case of Braille readers.
Perea, Manuel; García-Chamorro, Cristina; Martín-Suesta, Miguel; Gómez, Pablo
2012-01-01
The question of how the brain encodes letter position in written words has attracted increasing attention in recent years. A number of models have recently been proposed to accommodate the fact that transposed-letter stimuli like jugde or caniso are perceptually very close to their base words. Here we examined how letter position coding is attained in the tactile modality via Braille reading. The idea is that Braille word recognition may provide more serial processing than the visual modality, and this may produce differences in the input coding schemes employed to encode letters in written words. To that end, we conducted a lexical decision experiment with adult Braille readers in which the pseudowords were created by transposing/replacing two letters. We found a word-frequency effect for words. In addition, unlike parallel experiments in the visual modality, we failed to find any clear signs of transposed-letter confusability effects. This dissociation highlights the differences between modalities. The present data argue against models of letter position coding that assume that transposed-letter effects (in the visual modality) occur at a relatively late, abstract locus.
The effects of divided attention on auditory priming.
Mulligan, Neil W; Duke, Marquinn; Cooper, Angela W
2007-09-01
Traditional theorizing stresses the importance of attentional state during encoding for later memory, based primarily on research with explicit memory. Recent research has begun to investigate the role of attention in implicit memory but has focused almost exclusively on priming in the visual modality. The present experiments examined the effect of divided attention on auditory implicit memory, using auditory perceptual identification, word-stem completion and word-fragment completion. Participants heard study words under full attention conditions or while simultaneously carrying out a distractor task (the divided attention condition). In Experiment 1, a distractor task with low response frequency failed to disrupt later auditory priming (but diminished explicit memory as assessed with auditory recognition). In Experiment 2, a distractor task with greater response frequency disrupted priming on all three of the auditory priming tasks as well as the explicit test. These results imply that although auditory priming is less reliant on attention than explicit memory, it is still greatly affected by at least some divided-attention manipulations. These results are consistent with research using visual priming tasks and have relevance for hypotheses regarding attention and auditory priming.
Grigoriev, Andrei; Oshhepkov, Ivan
2013-12-01
Normative data on the objective age of acquisition (AoA) for 286 Russian words are presented in this article. In addition, correlations between the objective AoA and subjective ratings, name agreement, picture name agreement, imageability, familiarity, word frequency, and word length are provided, as are correlations between the objective AoA and two measures of exemplar dominance (exemplar generation frequency and the number of times an exemplar was named first). The correlations between the aforementioned variables are generally consistent with the correlations reported in other normative studies. The objective AoA data are highly correlated with the subjective AoA ratings, whereas the correlations between the objective AoA and other psycholinguistic variables are moderate. The correlations between the objective AoA of Russian words and similar data for other languages are moderately high. The complete word norms may be downloaded from supplementary material.
Hierarchical Rhetorical Sentence Categorization for Scientific Papers
NASA Astrophysics Data System (ADS)
Rachman, G. H.; Khodra, M. L.; Widyantoro, D. H.
2018-03-01
Important information in scientific papers can be found in rhetorical sentences that fall into certain categories. To extract this information, text categorization is needed. Previous work on this task has employed word frequency, semantic word similarity, hierarchical classification, and other techniques. This paper therefore presents rhetorical sentence categorization for scientific papers that uses TF-IDF and Word2Vec to capture word frequency and semantic word similarity, combined with hierarchical classification. Every experiment is tested with two classifiers, namely Naïve Bayes and a linear SVM. The paper shows that the hierarchical classifier outperforms the flat classifier with either TF-IDF or Word2Vec, although the gain is small, almost 2 percentage points: from 27.82% with the flat classifier to 29.61% with the hierarchical classifier. It also shows that a different learning model can be built for each child category by the hierarchical classifier.
Zhang, Juan; Xie, Jun; Hou, Wanli; Tu, Xiaochen; Xu, Jing; Song, Fujian; Wang, Zhihong; Lu, Zuxun
2012-01-01
Background: Patient adherence is an important issue for health service providers and health researchers. However, the knowledge structure of diverse research on treatment adherence is unclear. This study used co-word analysis and social network analysis techniques to analyze research literature on adherence, and to show their knowledge structure and evolution over time. Methods: Published scientific papers about treatment adherence were retrieved from Web of Science (2000 to May 2011). A total of 2308 relevant articles were included: 788 articles published in 2000–2005 and 1520 articles published in 2006–2011. The keywords of each article were extracted by using the software Biblexcel, and the synonym and isogenous words were merged manually. The frequency of keywords and their co-occurrence frequency were counted. High frequency keywords were selected to yield the co-words matrix. Finally the decomposition maps were used to comb the complex knowledge structures. Results: Research themes were more general in the first period (2000 to 2005), and more extensive with many more new terms in the second period (2006 to 2011). Research on adherence has covered more and more diseases, populations and methods, but other diseases/conditions are not as hot as HIV/AIDS and have not become specialty themes/sub-directions. Most studies originated from the United States. Conclusion: The dynamic of this field is mainly divergent, with increasing number of new sub-directions of research. Future research is required to investigate specific directions and converge as well to construct a general paradigm in this field. PMID:22496819
The influence of lexical statistics on temporal lobe cortical dynamics during spoken word listening
Cibelli, Emily S.; Leonard, Matthew K.; Johnson, Keith; Chang, Edward F.
2015-01-01
Neural representations of words are thought to have a complex spatio-temporal cortical basis. It has been suggested that spoken word recognition is not a process of feed-forward computations from phonetic to lexical forms, but rather involves the online integration of bottom-up input with stored lexical knowledge. Using direct neural recordings from the temporal lobe, we examined cortical responses to words and pseudowords. We found that neural populations were not only sensitive to lexical status (real vs. pseudo), but also to cohort size (number of words matching the phonetic input at each time point) and cohort frequency (lexical frequency of those words). These lexical variables modulated neural activity from the posterior to anterior temporal lobe, and also dynamically as the stimuli unfolded on a millisecond time scale. Our findings indicate that word recognition is not purely modular, but relies on rapid and online integration of multiple sources of lexical knowledge. PMID:26072003
Identification of misspelled words without a comprehensive dictionary using prevalence analysis.
Turchin, Alexander; Chu, Julia T; Shubina, Maria; Einbinder, Jonathan S
2007-10-11
Misspellings are common in medical documents and can be an obstacle to information retrieval. We evaluated an algorithm to identify misspelled words through analysis of their prevalence in a representative body of text. We evaluated the algorithm's accuracy of identifying misspellings of 200 anti-hypertensive medication names on 2,000 potentially misspelled words randomly selected from narrative medical documents. Prevalence ratios (the frequency of the potentially misspelled word divided by the frequency of the non-misspelled word) in physician notes were computed by the software for each of the words. The software results were compared to the manual assessment by an independent reviewer. Area under the ROC curve for identification of misspelled words was 0.96. Sensitivity, specificity, and positive predictive value were 99.25%, 89.72% and 82.9% for the prevalence ratio threshold (0.32768) with the highest F-measure (0.903). Prevalence analysis can be used to identify and correct misspellings with high accuracy.
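The prevalence ratio defined above (frequency of the potentially misspelled word divided by frequency of the non-misspelled word) can be sketched as follows. The word counts are invented for illustration. Treating ratios below the reported threshold as misspellings is an assumption about the decision direction, which the abstract does not spell out.

```python
from collections import Counter


def prevalence_ratio(candidate, reference, counts):
    """Frequency of the candidate word divided by frequency of the reference word."""
    ref = counts[reference]
    if ref == 0:
        return None  # ratio is undefined when the reference word never occurs
    return counts[candidate] / ref


def looks_misspelled(candidate, reference, counts, threshold=0.32768):
    """Assumed rule: a candidate much rarer than the reference is a misspelling."""
    ratio = prevalence_ratio(candidate, reference, counts)
    return ratio is not None and ratio < threshold


# Hypothetical word counts from a body of clinical notes.
counts = Counter({"lisinopril": 200, "lisinipril": 3, "atenolol": 150})
ratio = prevalence_ratio("lisinipril", "lisinopril", counts)
flagged = looks_misspelled("lisinipril", "lisinopril", counts)
```

The appeal of the approach is that it needs only word counts from a representative body of text rather than a comprehensive dictionary.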
The effect of orthographic neighborhood in the reading span task.
Robert, Christelle; Postal, Virginie; Mathey, Stéphanie
2015-04-01
This study aimed at examining whether and to what extent orthographic neighborhood of words influences performance in a working memory span task. Twenty-five participants performed a reading span task in which final words to be memorized had either no higher frequency orthographic neighbor or at least one. In both neighborhood conditions, each participant completed three series of two, three, four, or five sentences. Results indicated an interaction between orthographic neighborhood and list length. In particular, an inhibitory effect of orthographic neighborhood on recall appeared in list length 5. A view is presented suggesting that words with higher frequency neighbors require more resources to be memorized than words with no such neighbors. The implications of the results are discussed with regard to memory processes and current models of visual word recognition.
Repetition priming of words and nonwords in Alzheimer's disease and normal aging
Ober, Beth A.; Shenaut, Gregory K.
2014-01-01
Objective: This study examines the magnitude and direction of nonword and word lexical decision repetition priming effects in Alzheimer’s disease (AD) and normal aging, focusing specifically on the negative priming effect sometimes observed with repeated nonwords. Method: Probable Alzheimer's disease (AD) patients (30), elderly normal controls (34), and young normal controls (49) participated in a repetition priming experiment using low-frequency words and word-like nonwords with a letter-level orthographic orienting task at study followed by a lexical decision test phase. Results: Although participants' reaction times were longer in AD compared to elderly normal, and elderly normal compared to young normal, the repetition priming effect and the degree to which the repetition priming effect was reversed for nonwords compared to words was unaffected by AD or normal aging. Conclusion: AD patients, like young and elderly normal participants, are able to modify (in the case of words) and create (in the case of nonwords) long-term memory traces for lexical stimuli, based on a single orthographic processing trial. The nonword repetition results are discussed from the perspective of new vocabulary learning commencing with a provisional lexical memory trace created after orthographic encoding of a novel word-like letter string. PMID:25000325
Balota, David A; Aschenbrenner, Andrew J; Yap, Melvin J
2013-09-01
A counterintuitive and theoretically important pattern of results in the visual word recognition literature is that both word frequency and stimulus quality produce large but additive effects in lexical decision performance. The additive nature of these effects has recently been called into question by Masson and Kliegl (in press), who used linear mixed effects modeling to provide evidence that the additive effects were actually being driven by previous trial history. Because Masson and Kliegl also included semantic priming as a factor in their study and recent evidence has shown that semantic priming can moderate the additivity of word frequency and stimulus quality (Scaltritti, Balota, & Peressotti, 2012), we reanalyzed data from 3 published studies to determine if previous trial history moderated the additive pattern when semantic priming was not also manipulated. The results indicated that previous trial history did not influence the joint influence of word frequency and stimulus quality. More important, and independent of Masson and Kliegl's conclusions, we also show how a common transformation used in linear mixed effects analyses to normalize the residuals can systematically alter the way in which two variables combine to influence performance. Specifically, using transformed, rather than raw reaction times, consistently produces more underadditive patterns.
Sentinels of Breach: Lexical Choice as a Measure of Urgency in Social Media.
Hampton, Andrew J; Shalin, Valerie L
2017-06-01
Objective: This paper identifies general properties of language style in social media to help identify areas of need in disasters. Background: In the search for metrics of need in social media data, much of the existing literature ignores processes of language usage. Psychological concepts, such as narrative breach, Gricean maxims, and lexical marking in cognition, may assist the recovery of disaster-relevant metrics from altered patterns of word prevalence. Method: We analyzed several hundred thousand location-specific microblogs from Twitter for Hurricane Sandy, Oklahoma tornadoes, and the Boston Marathon bombing along with a fantasy football control corpus, examining the relative frequency of words in 36 antonym pairs. We compared the ratio of words within these pairs to the corresponding ratios recovered from an online word norm database. Results: Partial rank correlation values between observed antonym ratios demonstrate consistent patterns across disasters. For Hurricane Sandy data, 25 antonym pairs have moderate to large effect sizes for discrepancies between observed and normative ratios. Across disasters, 7 pairs are stable and meet effect size criteria. Sentiment analysis, supplementary word frequency counts with respect to disaster proximity, and examples support a "breach" account for the observed results. Conclusion: Lexical choice between antonyms, only somewhat related to sentiment, suggests that social media capture wide-ranging breaches of normal functioning. Application: Antonym selection contributes to screening tools based on language style for identifying relevant content and quantifying disruption using social media without the a priori specification of content keywords.
Investigative change detection: identifying new topics using lexicon-based search
NASA Astrophysics Data System (ADS)
Hintz, Kenneth J.
2002-08-01
In law enforcement there is much textual data that needs to be searched in order to detect new threats. A new methodology that can be applied to this need is the automatic searching of the contents of documents from known sources to construct a lexicon of words used by that source. When analyzing future documents, the occurrence of words that have not been lexiconized is indicative of the introduction of a new topic into the source's lexicon, which should be examined in its context by an analyst. A system analogous to this has been built and used to detect Fads and Categories on web sites. Fad refers to the first appearance of a word not in the lexicon; Category refers to the repeated appearance of a Fad word and the exceeding of some frequency or spatial occurrence metric indicating a permanence to the Category.
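The Fad and Category distinction described above can be sketched as follows. The whitespace tokenization, the function names, and the fixed count threshold are all hypothetical. The abstract mentions "some frequency or spatial occurrence metric" without specifying it, so a simple occurrence count stands in for that metric here.

```python
from collections import Counter


def build_lexicon(documents):
    """Collect every word seen in documents from a known source."""
    return {word for doc in documents for word in doc.lower().split()}


def detect_new_topics(new_documents, lexicon, category_threshold=3):
    """Return (fads, categories) among words absent from the lexicon.

    fads: every out-of-lexicon word that appears at least once.
    categories: fads whose count reaches category_threshold (assumed metric).
    """
    unseen = Counter(
        word
        for doc in new_documents
        for word in doc.lower().split()
        if word not in lexicon
    )
    fads = set(unseen)
    categories = {w for w, n in unseen.items() if n >= category_threshold}
    return fads, categories


# Toy data: earlier documents define the lexicon, later ones are screened.
known = ["the shipment arrived on time", "the meeting is on monday"]
new = ["the drone arrived", "drone meeting on monday", "a drone shipment"]
lexicon = build_lexicon(known)
fads, categories = detect_new_topics(new, lexicon)
```

In this toy run both "drone" and "a" are flagged as Fads, which shows why a practical system would probably filter function words before handing results to an analyst.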
Strand, Julia F; Sommers, Mitchell S
2011-09-01
Much research has explored how spoken word recognition is influenced by the architecture and dynamics of the mental lexicon (e.g., Luce and Pisoni, 1998; McClelland and Elman, 1986). A more recent question is whether the processes underlying word recognition are unique to the auditory domain, or whether visually perceived (lipread) speech may also be sensitive to the structure of the mental lexicon (Auer, 2002; Mattys, Bernstein, and Auer, 2002). The current research was designed to test the hypothesis that both aurally and visually perceived spoken words are isolated in the mental lexicon as a function of their modality-specific perceptual similarity to other words. Lexical competition (the extent to which perceptually similar words influence recognition of a stimulus word) was quantified using metrics that are well-established in the literature, as well as a statistical method for calculating perceptual confusability based on the phi-square statistic. Both auditory and visual spoken word recognition were influenced by modality-specific lexical competition as well as stimulus word frequency. These findings extend the scope of activation-competition models of spoken word recognition and reinforce the hypothesis (Auer, 2002; Mattys et al., 2002) that perceptual and cognitive properties underlying spoken word recognition are not specific to the auditory domain. In addition, the results support the use of the phi-square statistic as a better predictor of lexical competition than metrics currently used in models of spoken word recognition.
Juhasz, Barbara J
2016-11-14
Recording eye movements provides information on the time-course of word recognition during reading. Juhasz and Rayner [Juhasz, B. J., & Rayner, K. (2003). Investigating the effects of a set of intercorrelated variables on eye fixation durations in reading. Journal of Experimental Psychology: Learning, Memory and Cognition, 29, 1312-1318] examined the impact of five word recognition variables, including familiarity and age-of-acquisition (AoA), on fixation durations. All variables impacted fixation durations, but the time-course differed. However, the study focused on relatively short, morphologically simple words. Eye movements are also informative for examining the processing of morphologically complex words such as compound words. The present study further examined the time-course of lexical and semantic variables during morphological processing. A total of 120 English compound words that varied in familiarity, AoA, semantic transparency, lexeme meaning dominance, sensory experience rating (SER), and imageability were selected. The impact of these variables on fixation durations was examined when length, word frequency, and lexeme frequencies were controlled in a regression model. The most robust effects were found for familiarity and AoA, indicating that a reader's experience with compound words significantly impacts compound recognition. These results provide insight into semantic processing of morphologically complex words during reading.
Frequency of word-use predicts rates of lexical evolution throughout Indo-European history.
Pagel, Mark; Atkinson, Quentin D; Meade, Andrew
2007-10-11
Greek speakers say "ουρά", Germans "schwanz" and the French "queue" to describe what English speakers call a 'tail', but all of these languages use a related form of 'two' to describe the number after one. Among more than 100 Indo-European languages and dialects, the words for some meanings (such as 'tail') evolve rapidly, being expressed across languages by dozens of unrelated words, while others evolve much more slowly, such as the number 'two', for which all Indo-European language speakers use the same related word-form. No general linguistic mechanism has been advanced to explain this striking variation in rates of lexical replacement among meanings. Here we use four large and divergent language corpora (English, Spanish, Russian and Greek) and a comparative database of 200 fundamental vocabulary meanings in 87 Indo-European languages to show that the frequency with which these words are used in modern language predicts their rate of replacement over thousands of years of Indo-European language evolution. Across all 200 meanings, frequently used words evolve at slower rates and infrequently used words evolve more rapidly. This relationship holds separately and identically across parts of speech for each of the four language corpora, and accounts for approximately 50% of the variation in historical rates of lexical replacement. We propose that the frequency with which specific words are used in everyday language exerts a general and law-like influence on their rates of evolution. Our findings are consistent with social models of word change that emphasize the role of selection, and suggest that owing to the ways that humans use language, some words will evolve slowly and others rapidly across all languages.
"Journal of Geography" Key Words: Trends and Recommendations
ERIC Educational Resources Information Center
Mitchell, Jerry T.; Brysch, Carmen P.; Collins, Larianne
2015-01-01
The "Journal of Geography" has used key words since 1990 to help readers and researchers seek out work of particular interest. Key words generally supplement article titles and are hopefully chosen with care. The focus of this article is the "Journal of Geography" key word, its presence, timing, and frequency. Using a…
The Representation of Morphemes in the Russian Lexicon
ERIC Educational Resources Information Center
Antic, Eugenia
2010-01-01
Different morphological theories assign different status to parts of words, roots and affixes. Models range from accepting both bound roots and affixes to only assigning unit status to standalone words. Some questions that interest researchers are (1) What are the smallest morphological units, words or word parts? (2) How does frequency affect…
A Bootstrapping Model of Frequency and Context Effects in Word Learning
ERIC Educational Resources Information Center
Kachergis, George; Yu, Chen; Shiffrin, Richard M.
2017-01-01
Prior research has shown that people can learn many nouns (i.e., word--object mappings) from a short series of ambiguous situations containing multiple words and objects. For successful cross-situational learning, people must approximately track which words and referents co-occur most frequently. This study investigates the effects of allowing…
Masked Priming with Orthographic Neighbors: A Test of the Lexical Competition Assumption
ERIC Educational Resources Information Center
Nakayama, Mariko; Sears, Christopher R.; Lupker, Stephen J.
2008-01-01
In models of visual word identification that incorporate inhibitory competition among activated lexical units, a word's higher frequency neighbors will be the word's strongest competitors. Preactivation of these neighbors by a prime is predicted to delay the word's identification. Using the masked priming paradigm (K. I. Forster & C. Davis, 1984,…
Word-Processing "Efficiency"--By Means of Personalized Word-Frequency Lists.
ERIC Educational Resources Information Center
Coniam, David
2001-01-01
Examines the concept of the efficiency with which text is entered into a word processor--from the perspective of effective use of keyboard shortcuts. Illustrates how the possibility for productiveness offered by shortcuts, available through the use of features such as Autotext, are often under-utilized by many word processor users, academics being…
Parafoveal Processing Affects Outgoing Saccade Length during the Reading of Chinese
ERIC Educational Resources Information Center
Liu, Yanping; Reichle, Erik D.; Li, Xingshan
2015-01-01
Participants' eye movements were measured while reading Chinese sentences in which target-word frequency and the availability of parafoveal processing were manipulated using a gaze-contingent boundary paradigm. The results of this study indicate that preview availability and its interaction with word frequency modulated the length of the saccades…
Improving Spelling of High Frequency Words for Transfer in Written Work
ERIC Educational Resources Information Center
DuBois, Kathleen; Erickson, Kristie; Jacobs, Monica
2007-01-01
This project describes a 12-week program developed to improve student spelling of high frequency words for transfer in written work across the curriculum. The targeted population consists of kindergarten, first, and third graders in two public elementary schools in a community located in central Illinois. Following an extensive literature review,…
Syntactic Complexity and Frequency in the Neurocognitive Language System.
Yang, Yun-Hsuan; Marslen-Wilson, William D; Bozic, Mirjana
2017-09-01
Prominent neurobiological models of language follow the widely accepted assumption that language comprehension requires two principal mechanisms: a lexicon storing the sound-to-meaning mapping of words, primarily involving bilateral temporal regions, and a combinatorial processor for syntactically structured items, such as phrases and sentences, localized in a left-lateralized network linking left inferior frontal gyrus (LIFG) and posterior temporal areas. However, recent research showing that the processing of simple phrasal sequences may engage only bilateral temporal areas, together with the claims of distributional approaches to grammar, raise the question of whether frequent phrases are stored alongside individual words in temporal areas. In this fMRI study, we varied the frequency of words and of short and long phrases in English. If frequent phrases are indeed stored, then only less frequent items should generate selective left frontotemporal activation, because memory traces for such items would be weaker or not available in temporal cortex. Complementary univariate and multivariate analyses revealed that, overall, simple words (verbs) and long phrases engaged LIFG and temporal areas, whereas short phrases engaged bilateral temporal areas, suggesting that syntactic complexity is a key factor for LIFG activation. Although we found a robust frequency effect for words in temporal areas, no frequency effects were found for the two phrasal conditions. These findings support the conclusion that long and short phrases are analyzed, respectively, in the left frontal network and in a bilateral temporal network but are not retrieved from memory in the same way as simple words during spoken language comprehension.
Marian, Viorica; Bartolotti, James; Chabal, Sarah; Shook, Anthony
2012-01-01
Past research has demonstrated cross-linguistic, cross-modal, and task-dependent differences in neighborhood density effects, indicating a need to control for neighborhood variables when developing and interpreting research on language processing. The goals of the present paper are two-fold: (1) to introduce CLEARPOND (Cross-Linguistic Easy-Access Resource for Phonological and Orthographic Neighborhood Densities), a centralized database of phonological and orthographic neighborhood information, both within and between languages, for five commonly-studied languages: Dutch, English, French, German, and Spanish; and (2) to show how CLEARPOND can be used to compare general properties of phonological and orthographic neighborhoods across languages. CLEARPOND allows researchers to input a word or list of words and obtain phonological and orthographic neighbors, neighborhood densities, mean neighborhood frequencies, word lengths by number of phonemes and graphemes, and spoken-word frequencies. Neighbors can be defined by substitution, deletion, and/or addition, and the database can be queried separately along each metric or summed across all three. Neighborhood values can be obtained both within and across languages, and outputs can optionally be restricted to neighbors of higher frequency. To enable researchers to more quickly and easily develop stimuli, CLEARPOND can also be searched by features, generating lists of words that meet precise criteria, such as a specific range of neighborhood sizes, lexical frequencies, and/or word lengths. CLEARPOND is freely-available to researchers and the public as a searchable, online database and for download at http://clearpond.northwestern.edu. PMID:22916227
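The three neighbor definitions mentioned above (substitution, deletion, and addition of a single unit) can be sketched for orthographic neighbors as follows. This is an illustrative reimplementation over a toy word list and a lowercase ASCII alphabet, both of which are assumptions. It is not CLEARPOND's own code or interface.

```python
import string


def neighbors(word, lexicon, alphabet=string.ascii_lowercase):
    """Orthographic neighbors of `word` in `lexicon` (a set), by edit type."""
    subs, dels, adds = set(), set(), set()
    for i in range(len(word)):
        # Deletion: drop the letter at position i.
        dels.add(word[:i] + word[i + 1:])
        # Substitution: replace the letter at position i with a different one.
        for ch in alphabet:
            if ch != word[i]:
                subs.add(word[:i] + ch + word[i + 1:])
    for i in range(len(word) + 1):
        # Addition: insert one letter at any position, including both ends.
        for ch in alphabet:
            adds.add(word[:i] + ch + word[i:])
    return {
        "substitution": subs & lexicon,
        "deletion": dels & lexicon,
        "addition": adds & lexicon,
    }


# Toy lexicon; a real query would run against a full word list.
toy_lexicon = {"cat", "cot", "at", "cart", "cast", "coat", "dog"}
result = neighbors("cat", toy_lexicon)
density = sum(len(v) for v in result.values())  # summed across all three metrics
```

Keeping the three edit types separate mirrors the option described in the abstract of querying each metric on its own or summing across all three.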
ERIC Educational Resources Information Center
Dambacher, Michael; Dimigen, Olaf; Braun, Mario; Wille, Kristin; Jacobs, Arthur M.; Kliegl, Reinhold
2012-01-01
Three ERP experiments examined the effect of word presentation rate (i.e., stimulus onset asynchrony, SOA) on the time course of word frequency and predictability effects in sentence reading. In Experiments 1 and 2, sentences were presented word-by-word in the screen center at an SOA of 700 and 490 ms, respectively. While these rates are typical…
Large-scale functional networks connect differently for processing words and symbol strings.
Liljeström, Mia; Vartiainen, Johanna; Kujala, Jan; Salmelin, Riitta
2018-01-01
Reconfigurations of synchronized large-scale networks are thought to be central neural mechanisms that support cognition and behavior in the human brain. Magnetoencephalography (MEG) recordings together with recent advances in network analysis now allow for sub-second snapshots of such networks. In the present study, we compared frequency-resolved functional connectivity patterns underlying reading of single words and visual recognition of symbol strings. Word reading emphasized coherence in a left-lateralized network with nodes in classical perisylvian language regions, whereas symbol processing recruited a bilateral network, including connections between frontal and parietal regions previously associated with spatial attention and visual working memory. Our results illustrate the flexible nature of functional networks, whereby processing of different form categories, written words vs. symbol strings, leads to the formation of large-scale functional networks that operate at distinct oscillatory frequencies and incorporate task-relevant regions. These results suggest that category-specific processing should be viewed not so much as a local process but as a distributed neural process implemented in signature networks. For words, increased coherence was detected particularly in the alpha (8-13 Hz) and high gamma (60-90 Hz) frequency bands, whereas increased coherence for symbol strings was observed in the high beta (21-29 Hz) and low gamma (30-45 Hz) frequency range. These findings attest to the role of coherence in specific frequency bands as a general mechanism for integrating stimulus-dependent information across brain regions.
Changing word usage predicts changing word durations in New Zealand English.
Sóskuthy, Márton; Hay, Jennifer
2017-09-01
This paper investigates the emergence of lexicalized effects of word usage on word duration by looking at parallel changes in usage and duration over 130 years in New Zealand English. Previous research has found that frequent words are shorter, informative words are longer, and words in utterance-final position are also longer. It has also been argued that some of these patterns are not simply online adjustments, but are incorporated into lexical representations. While these studies tend to focus on the synchronic aspects of such patterns, our corpus shows that word-usage patterns and word durations are not static over time. Many words change in duration and also change with respect to frequency, informativity and likelihood of occurring utterance-finally. Analysis of changing word durations over this time period shows substantial patterns of co-adaptation between word usage and word durations. Words that are increasing in frequency are becoming shorter. Words that are increasing/decreasing in informativity show a change in the same direction in duration (e.g. increasing informativity is associated with increasing duration). And words that are increasingly appearing utterance-finally are lengthening. These effects exist independently of the local effects of the predictors. For example, words that are increasing utterance-finally lengthen in all positions, including utterance-medially. We show that these results are compatible with a number of different views about lexical representations, but they cannot be explained without reference to a production-perception loop that allows speakers to update their representations dynamically on the basis of their experience. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Wierda, Stefan M; Taatgen, Niels A; van Rijn, Hedderik; Martens, Sander
2013-01-01
When a second target (T2) is presented in close succession to a first target (T1) within a stream of non-targets, people often fail to detect T2, a deficit known as the attentional blink (AB). Two types of theories can be distinguished that have tried to account for this phenomenon. Whereas attentional-control theories suggest that protection of consolidation processes induces the AB, limited-resource theories claim that the AB is caused by a lack of resources. According to the latter type of theories, increasing difficulty of one or both targets should increase the magnitude of the AB. Similarly, attentional-control theories predict that a difficult T1 increases the AB due to prolonged processing. However, the prediction for T2 is not as straightforward. Prolonged processing of T2 could cause conflicts and increase the AB. However, if consolidation of T2 is postponed without loss of identity, the AB might be attenuated. Participants performed an AB task that consisted of a stream of distractor non-words and two target words. Difficulty of T1 and T2 was manipulated by varying word frequency. Overall performance for high-frequency words was better than for low-frequency words. When T1 was highly frequent, the AB was reduced. The opposite effect was found for T2. When T2 was highly frequent, performance during the AB period was relatively worse than for a low-frequency T2. A threaded-cognition model of the AB was presented that simulated the observed pattern of behavior by taking changes in the time-course of retrieval and consolidation processes into account. Our results were replicated in a subsequent ERP study. The finding that a difficult low-frequency T2 reduces the magnitude of the AB is at odds with limited-resource accounts of the AB. However, it was successfully accounted for by the threaded-cognition model, thus providing an explanation in terms of attentional control.
Colombo, Lucia; Pasini, Margherita; Balota, David A
2006-09-01
Performance in two experiments was compared on a list of words of high and low frequency in which familiarity/meaningfulness (FM) was balanced and on a list of high- and low-frequency words in which FM was confounded with frequency (i.e., high frequency--high familiarity vs. low frequency--low familiarity). Both repetition and task (lexical decision and naming) were investigated. In the lexical decision task of Experiment 1, both frequency and repetition effects were larger in the list with FM confounded than in the list with FM matched. In the naming task, frequency and repetition effects and their interaction were significant, but there was no influence of FM list context. In Experiment 2, in which the repetitions occurred across blocks, as opposed to randomly intermixed within a list, similar results were found; however, there was no interaction between list and repetition. The results suggest that an evaluation of items in terms of their meaning and familiarity explains a large part of the variance, only in lexical decision. These dimensions may be cued both by subjective feelings of familiarity and the extent to which semantic information is available and by episodic traces due to recent encounters with the item.
Maximum Entropy, Word-Frequency, Chinese Characters, and Multiple Meanings
Yan, Xiaoyong; Minnhagen, Petter
2015-01-01
The word-frequency distribution of a text written by an author is well accounted for by a maximum entropy distribution, the RGF (random group formation)-prediction. The RGF-distribution is completely determined by the a priori values of the total number of words in the text (M), the number of distinct words (N) and the number of repetitions of the most common word (kmax). It is here shown that this maximum entropy prediction also describes a text written in Chinese characters. In particular it is shown that although the same Chinese text written in words and Chinese characters have quite differently shaped distributions, they are nevertheless both well predicted by their respective three a priori characteristic values. It is pointed out that this is analogous to the change in the shape of the distribution when translating a given text to another language. Another consequence of the RGF-prediction is that taking a part of a long text will change the input parameters (M, N, kmax) and consequently also the shape of the frequency distribution. This is explicitly confirmed for texts written in Chinese characters. Since the RGF-prediction has no system-specific information beyond the three a priori values (M, N, kmax), any specific language characteristic has to be sought in systematic deviations from the RGF-prediction and the measured frequencies. One such systematic deviation is identified and, through a statistical information theoretical argument and an extended RGF-model, it is proposed that this deviation is caused by multiple meanings of Chinese characters. The effect is stronger for Chinese characters than for Chinese words. The relation between Zipf’s law, the Simon-model for texts and the present results are discussed. PMID:25955175
Palmer, Rebecca; Hughes, Helen; Chater, Tim
2017-01-01
Background Word finding is a common difficulty for people with aphasia. Targeting words that are relevant to the individual could maximise the usefulness and impact of word finding therapy. Aims To provide insights into words that people with aphasia perceive to be personally relevant. Methods and procedures 100 people with aphasia were each asked to identify 100 words that would be particularly important for them to be able to say. Two speech and language therapist researchers conducted a quantitative content analysis of the words selected. The words were coded into a framework of topics and subtopics. The frequency with which different words and topics were selected was then calculated. Outcomes and results 100 participants representing 20 areas of the United Kingdom ranged in age from 23 to 85 years. Word finding difficulties ranged from mild to severe. The sample of 9999 words selected for practice included 3095 different words in 27 topics. The majority of words selected (79.4%) were from the topics ‘food and drink’ (30.6%), ‘nature and gardening’ (10.3%), ‘entertainment’ (9.4%), ‘places’ (7.3%), ‘people’ (6.7%), ‘house’ (6.5%), ‘clothes’ (5.2%) and ‘travel’ (3.5%). The 100 word types chosen with the greatest frequency were identified. These account for 27 percent of the 9999 words chosen by the participants. Discussion Personally relevant vocabulary is unique to each individual and is likely to contain specific or specialist words for which material needs to be individually prepared. However, there is some commonality in the words chosen by people with aphasia. This could inform pre-prepared materials for use in word finding therapy from which personally relevant words could be selected for practice. PMID:28346518
Does neighborhood size really cause the word length effect?
Guitard, Dominic; Saint-Aubin, Jean; Tehan, Gerald; Tolan, Anne
2018-02-01
In short-term serial recall, it is well-known that short words are remembered better than long words. This word length effect has been the cornerstone of the working memory model and a benchmark effect that all models of immediate memory should account for. Currently, there is no consensus as to what determines the word length effect. Jalbert and colleagues (Jalbert, Neath, Bireta, & Surprenant, 2011a; Jalbert, Neath, & Surprenant, 2011b) suggested that neighborhood size is one causal factor. In six experiments we systematically examined their suggestion. In Experiment 1, with an immediate serial recall task, multiple word lengths, and a large pool of words controlled for neighborhood size, the typical word length effect was present. In Experiments 2 and 3, with an order reconstruction task and words with either many or few neighbors, we observed the typical word length effect. In Experiment 4 we tested the hypothesis that the previous abolition of the word length effect when neighborhood size was controlled was due to a confounded factor: frequency of orthographic structure. As predicted, we reversed the word length effect when using short words with less frequent orthographic structures than the long words, as was done in both of Jalbert et al.'s studies. In Experiments 5 and 6, we again observed the typical word length effect, even if we controlled for neighborhood size and frequency of orthographic structure. Overall, the results were not consistent with the predictions of Jalbert et al. and clearly showed a large and reliable word length effect after controlling for neighborhood size.
Word lengths are optimized for efficient communication.
Piantadosi, Steven T; Tily, Harry; Gibson, Edward
2011-03-01
We demonstrate a substantial improvement on one of the most celebrated empirical laws in the study of language, Zipf's 75-y-old theory that word length is primarily determined by frequency of use. In accord with rational theories of communication, we show across 10 languages that average information content is a much better predictor of word length than frequency. This indicates that human lexicons are efficiently structured for communication by taking into account interword statistical dependencies. Lexical systems result from an optimization of communicative pressures, coding meanings efficiently given the complex statistics of natural language use.
Novel Topic Impact on Authorship Attribution
2009-12-01
sentences on a printed page. The work of Wilhelm Fucks in [7] attributed authorship based on the frequency distribution over word syllables. The most ... Unitarian Review, vol. 30, pp. 452–460, 1888. [7] W. Fucks, "On Mathematical Analysis of Style," Biometrika, vol. 39, pp. 122–129, 1952. [8
Word Families and Frequency Bands in Vocabulary Tests: Challenging Conventions
ERIC Educational Resources Information Center
Kremmel, Benjamin
2016-01-01
Vocabulary test development often appears to be based on the design principles of previous tests, without questioning or empirically examining the assumptions underlying those principles. Given the current proliferation of vocabulary tests, it seems timely for the field of vocabulary testing to problematize some of those traditionalised…
An Analysis of the Most Frequently Occurring Words in Spoken American English.
ERIC Educational Resources Information Center
Plant, Geoff
1999-01-01
A study analyzed the frequency of occurrence of consonants, vowels, and diphthongs, the syllabic structure of the words, and the segmental structure of the 311 monosyllabic words among the 500 words that occur most frequently in English. Three manners of articulation accounted for nearly 75 percent of all consonant occurrences: stops, semi-vowels, and nasals.…
Words that Second Language Learners Are Likely to Hear, Read, and Use
ERIC Educational Resources Information Center
Davidson, Douglas J.; Indefrey, Peter; Gullberg, Marianne
2008-01-01
In the present study, we explore whether multiple data sources may be more effective than single sources at predicting the words that language learners are likely to know. Second language researchers have hypothesized that there is a relationship between word frequency and the likelihood that words will be encountered or used by second language…
Comparing the Frequency Effect Between the Lexical Decision and Naming Tasks in Chinese
Wu, Jei-Tun
2016-01-01
In psycholinguistic research, the frequency effect can be one of the indicators for eligible experimental tasks that examine the nature of lexical access. Usually, only one of those tasks is chosen to examine lexical access in a study. Using two exemplar experiments, this paper introduces an approach to include both the lexical decision task (LDT) and the naming task in a study. In the first experiment, the stimuli were Chinese characters with frequency and regularity manipulated. In the second experiment, the stimuli were switched to Chinese two-character words, in which the word frequency and the regularity of the leading character were manipulated. The logic of these two exemplar experiments was to explore some important issues, such as the role of phonology in recognition, by comparing the frequency effect between the two tasks. The results revealed different patterns of lexical access from those reported in the alphabetic systems. The results of Experiment 1 manifested a larger frequency effect in the naming task as compared to the LDT, when the stimuli were Chinese characters. Notably, in Experiment 1, when the stimuli were regular Chinese characters, the frequency effect observed in the naming task was roughly equivalent to that in the LDT. However, a smaller frequency effect was shown in the naming task as compared to the LDT, when the stimuli were switched to Chinese two-character words in Experiment 2. Taking advantage of the respective demands and characteristics in both tasks, researchers can obtain a more complete and precise picture of character/word recognition. PMID:27077703
Reading laterally: the cerebral hemispheric use of spatial frequencies in visual word recognition.
Tadros, Karine; Dupuis-Roy, Nicolas; Fiset, Daniel; Arguin, Martin; Gosselin, Frédéric
2013-01-04
It is generally accepted that the left hemisphere (LH) is more capable for reading than the right hemisphere (RH). Left hemifield presentations (initially processed by the RH) lead to a globally higher error rate, slower word identification, and a significantly stronger word length effect (i.e., slower reaction times for longer words). Because the visuo-perceptual mechanisms of the brain for word recognition are primarily localized in the LH (Cohen et al., 2003), it is possible that this part of the brain possesses better spatial frequency (SF) tuning for processing the visual properties of words than the RH. The main objective of this study is to determine the SF tuning functions of the LH and RH for word recognition. Each word image was randomly sampled in the SF domain using the SF bubbles method (Willenbockel et al., 2010) and was presented laterally to the left or right visual hemifield. As expected, the LH requires less visual information than the RH to reach the same level of performance, illustrating the well-known LH advantage for word recognition. Globally, the SF tuning of both hemispheres is similar. However, these seemingly identical tuning functions hide important differences. Most importantly, we argue that the RH requires higher SFs to identify longer words because of crowding.
Rank-frequency relation for Chinese characters
NASA Astrophysics Data System (ADS)
Deng, Weibing; Allahverdyan, Armen E.; Li, Bo; Wang, Qiuping A.
2014-02-01
We show that the Zipf's law for Chinese characters perfectly holds for sufficiently short texts (few thousand different characters). The scenario of its validity is similar to the Zipf's law for words in short English texts. For long Chinese texts (or for mixtures of short Chinese texts), rank-frequency relations for Chinese characters display a two-layer, hierarchic structure that combines a Zipfian power-law regime for frequent characters (first layer) with an exponential-like regime for less frequent characters (second layer). For these two layers we provide different (though related) theoretical descriptions that include the range of low-frequency characters (hapax legomena). We suggest that this hierarchic structure of the rank-frequency relation connects to semantic features of Chinese characters (number of different meanings and homographies). The comparative analysis of rank-frequency relations for Chinese characters versus English words illustrates the extent to which the characters play for Chinese writers the same role as the words for those writing within alphabetical systems.
Evidence for reading improvement following tDCS treatment in children and adolescents with Dyslexia.
Costanzo, Floriana; Varuzza, Cristiana; Rossi, Serena; Sdoia, Stefano; Varvara, Pamela; Oliveri, Massimiliano; Giacomo, Koch; Vicari, Stefano; Menghini, Deny
2016-01-01
There is evidence that non-invasive brain stimulation transitorily modulates reading by facilitating the neural pathways underactive in individuals with dyslexia. The study aimed at investigating whether multiple sessions of transcranial direct current stimulation (tDCS) would enhance reading abilities of children and adolescents with dyslexia and whether the effect is long-lasting. Eighteen children and adolescents with dyslexia received three 20-minute sessions a week for 6 weeks (18 sessions) of left anodal/right cathodal tDCS set at 1 mA over parieto-temporal regions combined with cognitive training. The participants were randomly assigned to the active or the sham treatment; reading tasks (text, high- and low-frequency words, non-words) were used as outcome measures and collected before treatment, after treatment and one month after the end of treatment. The tolerability of tDCS was evaluated. The active group showed reduced low-frequency word reading errors and non-word reading times. These positive effects were stable even one month after the end of treatment. No participant reported adverse effects. The study shows preliminary evidence of tDCS feasibility and efficacy in improving non-word and low-frequency word reading of children and adolescents with dyslexia, and it opens new rehabilitative perspectives for the remediation of dyslexia.
Zipf's Law and Avoidance of Excessive Synonymy
ERIC Educational Resources Information Center
Manin, Dmitrii Y.
2008-01-01
Zipf's law states that if the words of a language are ranked in order of decreasing frequency in texts, the frequency of a word is inversely proportional to its rank. It is very reliably observed in the data, but to date it has escaped satisfactory theoretical explanation. This article suggests that Zipf's law may result from a hierarchical organization…
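The verbal statement of Zipf's law in this abstract can be written compactly. The notation is an assumption introduced here for illustration, not the article's own: $f(r)$ is the frequency of the word at rank $r$, and $C$ is a text-dependent constant.

```latex
f(r) = \frac{C}{r}, \qquad r = 1, 2, 3, \dots
\quad\Longrightarrow\quad
\log f(r) = \log C - \log r
```

Under this idealized form the second-ranked word occurs about half as often as the first, and a log-log plot of frequency against rank is a straight line with slope $-1$.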
The Frequency-Predictability Interaction in Reading: It Depends Where You're Coming from
ERIC Educational Resources Information Center
Hand, Christopher J.; Miellet, Sebastien; O'Donnell, Patrick J.; Sereno, Sara C.
2010-01-01
A word's frequency of occurrence and its predictability from a prior context are key factors determining how long the eyes remain on that word in normal reading. Past reaction-time and eye movement research can be distinguished by whether these variables, when combined, produce interactive or additive results, respectively. Our study addressed…
Practical Applications of Analyses and Descriptions of Texts.
ERIC Educational Resources Information Center
Pugh, A. K.
An examination of the literature supports the view that the implications of text studies have yet to have much impact on classrooms. For example, word frequency lists have been used widely in the preparation of reading materials. However, few books come with a list of the frequency of the words they contain. Thus, the main use of comparing texts…
Word Inventory and Frequency Analysis of French Conversations.
ERIC Educational Resources Information Center
Malecot, Andre
This word frequency list was extracted from a corpus of fifty half-hour conversations recorded in Paris during the academic year 1967-68. The speakers, who did not know that they were being recorded, were all well-educated professionals and all speakers of the most standard dialect of French. The list is made up of all phonetically discrete words…
Neglect Dyslexia: Frequency, Association with Other Hemispatial Neglects, and Lesion Localization
ERIC Educational Resources Information Center
Lee, Byung Hwa; Suh, Mee Kyung; Kim, Eun-Joo; Seo, Sang Won; Choi, Kyung Mook; Kim, Gyeong-Moon; Chung, Chin-Sang; Heilman, Kenneth M.; Na, Duk L.
2009-01-01
Patients with right hemisphere injury often omit or misread words on the left side of a page or the beginning letters of single words (neglect dyslexia). Our study involving a large sample of acute right hemisphere stroke investigated (1) the frequency of neglect dyslexia (ND), (2) the association between ND and other types of contralesional…
A Picture Database for Verbs and Nouns with Different Action Content in Turkish.
Bayram, Ece; Aydin, Özgür; Ergenc, Hacer Iclal; Akbostanci, Muhittin Cenk
2017-08-01
In this study we present a picture database of 160 nouns and 160 verbs. All verbs and nouns are divided into two groups: action and non-action words. Age of acquisition, familiarity, imageability, name agreement and complexity norms are reported alongside frequency, word length and morpheme count for each word. Data were collected from 600 native Turkish adults in total. The results show that, although several measures correlate weakly with each other, only age of acquisition had moderate negative relationships with familiarity and frequency, while familiarity and frequency had a rather strong positive correlation with each other. The norms and the picture database are available as supplemental materials for use in psycholinguistic studies in Turkish.
Revisiting the role of recollection in item versus forced-choice recognition memory.
Cook, Gabriel I; Marsh, Richard L; Hicks, Jason L
2005-08-01
Many memory theorists have assumed that forced-choice recognition tests can rely more on familiarity, whereas item (yes-no) tests must rely more on recollection. In actuality, several studies have found no differences in the contributions of recollection and familiarity underlying the two different test formats. Using word frequency to manipulate stimulus characteristics, the present study demonstrated that the contributions of recollection to item versus forced-choice tests are variable. Low word frequency resulted in significantly more recollection in an item test than in a forced-choice procedure, but high word frequency produced the opposite result. These results clearly constrain any uniform claim about the degree to which recollection supports responding in item versus forced-choice tests.
Speaking-rate-induced variability in F2 trajectories.
Tjaden, K; Weismer, G
1998-10-01
This study examined speaking-rate-induced spectral and temporal variability of F2 formant trajectories for target words produced in a carrier phrase at speaking rates ranging from fast to slow. F2 onset frequency measured at the first glottal pulse following the stop consonant release in target words was used to quantify the extent to which adjacent consonantal and vocalic gestures overlapped; F2 target frequency was operationally defined as the first occurrence of a frequency minimum or maximum following F2 onset frequency. Regression analyses indicated 70% of functions relating F2 onset and vowel duration were statistically significant. The strength of the effect was variable, however, and the direction of significant functions often differed from that predicted by a simple model of overlapping, sliding gestures. Results of a partial correlation analysis examining interrelationships among F2 onset, F2 target frequency, and vowel duration across the speaking rate range indicated that covariation of F2 target with vowel duration may obscure the relationship between F2 onset and vowel duration across rate. The results further suggested that a sliding based model of acoustic variability associated with speaking rate change only partially accounts for the present data, and that such a view accounts for some speakers' data better than others.
Zipf's word frequency law in natural language: a critical review and future directions.
Piantadosi, Steven T
2014-10-01
The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf's law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf's law and are then used to evaluate many of the theoretical explanations of Zipf's law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf's law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data.
Different Loci of Semantic Interference in Picture Naming vs. Word-Picture Matching Tasks
Harvey, Denise Y.; Schnur, Tatiana T.
2016-01-01
Naming pictures and matching words to pictures belonging to the same semantic category impairs performance relative to when stimuli come from different semantic categories (i.e., semantic interference). Despite similar semantic interference phenomena in both picture naming and word-picture matching tasks, the locus of interference has been attributed to different levels of the language system – lexical in naming and semantic in word-picture matching. Although both tasks involve access to shared semantic representations, the extent to which interference originates and/or has its locus at a shared level remains unclear, as these effects are often investigated in isolation. We manipulated semantic context in cyclical picture naming and word-picture matching tasks, and tested whether factors tapping semantic-level (generalization of interference to novel category items) and lexical-level processes (interactions with lexical frequency) affected the magnitude of interference, while also assessing whether interference occurs at a shared processing level(s) (transfer of interference across tasks). We found that semantic interference in naming was sensitive to both semantic- and lexical-level processes (i.e., larger interference for novel vs. old and low- vs. high-frequency stimuli), consistent with a semantically mediated lexical locus. Interference in word-picture matching exhibited stable interference for old and novel stimuli and did not interact with lexical frequency. Further, interference transferred from word-picture matching to naming. Together, these experiments provide evidence to suggest that semantic interference in both tasks originates at a shared processing stage (presumably at the semantic level), but that it exerts its effect at different loci when naming pictures vs. matching words to pictures. PMID:27242621
Mainz, Nina; Shao, Zeshu; Brysbaert, Marc; Meyer, Antje S.
2017-01-01
Vocabulary knowledge is central to a speaker's command of their language. In previous research, greater vocabulary knowledge has been associated with advantages in language processing. In this study, we examined the relationship between individual differences in vocabulary and language processing performance more closely by (i) using a battery of vocabulary tests instead of just one test, and (ii) testing not only university students (Experiment 1) but young adults from a broader range of educational backgrounds (Experiment 2). Five vocabulary tests were developed, including multiple-choice and open antonym and synonym tests and a definition test, and administered together with two established measures of vocabulary. Language processing performance was measured using a lexical decision task. In Experiment 1, vocabulary and word frequency were found to predict word recognition speed while we did not observe an interaction between the effects. In Experiment 2, word recognition performance was predicted by word frequency and the interaction between word frequency and vocabulary, with high-vocabulary individuals showing smaller frequency effects. While overall the individual vocabulary tests were correlated and showed similar relationships with language processing as compared to a composite measure of all tests, they appeared to share less variance in Experiment 2 than in Experiment 1. Implications of our findings concerning the assessment of vocabulary size in individual differences studies and the investigation of individuals from more varied backgrounds are discussed. PMID:28751871
Protopapas, Athanassios; Orfanidou, Eleni; Taylor, J S H; Karavasilis, Efstratios; Kapnoula, Efthymia C; Panagiotaropoulou, Georgia; Velonakis, Georgios; Poulou, Loukia S; Smyrnis, Nikolaos; Kelekis, Dimitrios
2016-03-01
In this study predictions of the dual-route cascaded (DRC) model of word reading were tested using fMRI. Specifically, patterns of co-localization were investigated: (a) between pseudoword length effects and a pseudowords vs. fixation contrast, to reveal the sublexical grapho-phonemic conversion (GPC) system; and (b) between word frequency effects and a words vs. pseudowords contrast, to reveal the orthographic and phonological lexicon. Forty four native speakers of Greek were scanned at 3T in an event-related lexical decision task with three event types: (a) 150 words in which frequency, length, bigram and syllable frequency, neighborhood, and orthographic consistency were decorrelated; (b) 150 matched pseudowords; and (c) fixation. Whole-brain analysis failed to reveal the predicted co-localizations. Further analysis with participant-specific regions of interest defined within masks from the group contrasts revealed length effects in left inferior parietal cortex and frequency effects in the left middle temporal gyrus. These findings could be interpreted as partially consistent with the existence of the GPC system and phonological lexicon of the model, respectively. However, there was no evidence in support of an orthographic lexicon, weakening overall support for the model. The results are discussed with respect to the prospect of using neuroimaging in cognitive model evaluation. Copyright © 2016 Elsevier Inc. All rights reserved.
Miller, Leonie M; Roodenrys, Steven
2012-11-01
The frequency effect in short-term serial recall is influenced by the composition of lists. In pure lists, a robust advantage in the recall of high-frequency (HF) words is observed, yet in alternating mixed lists, HF and low-frequency (LF) words are recalled equally well. It has been argued that the preexisting associations between all list items determine a single, global level of supportive activation that assists item recall. Preexisting associations between items are assumed to be a function of language co-occurrence; HF-HF associations are high, LF-LF associations are low, and mixed associations are intermediate in activation strength. This account, however, is based on results when alternating lists with equal numbers of HF and LF words were used. It is possible that directional association between adjacent list items is responsible for the recall patterns reported. In the present experiment, the recall of three forms of mixed lists (those with equal numbers of HF and LF items and pure lists) was examined to test the extent to which item-to-item associations are present in serial recall. Furthermore, conditional probabilities were used to examine more closely the evidence for a contribution, since correct-in-position scoring may mask recall that is dependent on the recall of prior items. The results suggest that an item-to-item effect is clearly present for early but not late list items, and they implicate an additional factor, perhaps the availability of resources at output, in the recall of late list items.
NASA Astrophysics Data System (ADS)
Zhang, Hui; Wang, Deqing; Wu, Wenjun; Hu, Hongping
2012-11-01
In today's business environment, enterprises are increasingly under pressure to process the vast amount of data produced every day within enterprises. One method is to focus on business intelligence (BI) applications and to increase commercial added value through such business analytics activities. A term weighting scheme, which is used to convert documents into vectors in the term space, is vital to enterprise Information Retrieval (IR), text categorisation, text analytics, etc. When determining the weight of a term in a document, the traditional TF-IDF scheme considers only the term's occurrence frequency within the document and in the entire set of documents, so some meaningful terms do not receive an appropriate weight. In this article, we propose a new term weighting scheme called Term Frequency - Function of Document Frequency (TF-FDF) to address this issue. Instead of using a monotonically decreasing function such as Inverse Document Frequency, FDF uses a convex function that dynamically adjusts weights according to the significance of the words in a document set. This function can be manually tuned based on the distribution of the most meaningful words which semantically represent the document set. Our experiments show that TF-FDF achieves a higher Normalised Discounted Cumulative Gain in IR than TF-IDF and its variants, improving the accuracy of relevance ranking of the IR results.
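For orientation, the traditional TF-IDF baseline that the abstract contrasts with TF-FDF can be sketched as follows. This is a minimal illustration, assuming relative term frequency and an unsmoothed logarithmic IDF; the function name `tf_idf` and the toy documents are hypothetical, and the convex FDF function itself is not specified in the abstract, so it is not reproduced here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. Weight = (count / doc length) * log(N / df),
    the plain TF-IDF baseline, not the TF-FDF scheme proposed in the article.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Hypothetical toy collection, used only to show the behaviour.
docs = [
    "the invoice lists the overdue payment".split(),
    "the report lists quarterly revenue".split(),
    "the payment cleared".split(),
]
weights = tf_idf(docs)
```

Because IDF decreases monotonically with document frequency, a term that appears in every document (here "the") receives weight zero regardless of how meaningful it is, which is the limitation the abstract says TF-FDF is meant to address.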
Zeng, Rong; Greenfield, Patricia M
2015-02-01
Chinese people have held collectivistic values such as obligation, giving to other people, obedience and sacrifice of personal interests for thousands of years. In recent decades, China has undergone rapid economic development and urbanisation. This study investigates changing cultural values in China from 1970 to 2008 and the relationship of changing values to ecological shifts. The conceptual framework for the study was Greenfield's (2009) theory of social change and human development. Changing frequencies of contrasting Chinese words indexing individualistic or collectivistic values show that values shift along with ecological changes (urbanisation, economic development and enrollment in higher education), thereby adapting to current sociodemographic contexts. Words indexing adaptive individualistic values increased in frequency between 1970 and 2008. In contrast, words indexing less adaptive collectivistic values either decreased in frequency in this same period of time or else rose more slowly than words indexing contrasting individualistic values. © 2015 International Union of Psychological Science.
Adults' Self-Directed Learning of an Artificial Lexicon: The Dynamics of Neighborhood Reorganization
ERIC Educational Resources Information Center
Bardhan, Neil Prodeep
2010-01-01
Artificial lexicons have previously been used to examine the time course of the learning and recognition of spoken words, the role of segment type in word learning, and the integration of context during spoken word recognition. However, in all of these studies the experimenter determined the frequency and order of the words to be learned. In three…
Bridging the Vocabulary Gap for EFL Medical Undergraduates: The Establishment of a Medical Word List
ERIC Educational Resources Information Center
Hsu, Wenhua
2013-01-01
This study created a medical word list (MWL) to bridge the gap between non-technical and technical vocabulary. The researcher compiled a corpus containing 155 textbooks across 31 medical subject areas from e-book databases (totaling 15 million running words) and examined the range and frequency of words outside the most frequent 3,000-word…
Terror in time: extending culturomics to address basic terror management mechanisms.
Dechesne, Mark; Bandt-Law, Bryn
2018-04-11
Building on Google's efforts to scan millions of books, this article introduces methodology using a database of annual word frequencies of the 40,000 most frequently occurring words in the American literature between 1800 and 2009. The current paper uses this methodology to replicate and identify terror management processes in historical context. Variation in frequencies of word usage of constructs relevant to terror management theory (e.g. death, worldview, self-esteem, relationships) is investigated over a time period of 209 years. Study 1 corroborated previous TMT findings and demonstrated that word use of constructs related to death and of constructs related to patriotism and romantic relationships significantly co-vary over time. Study 2 showed that the use of the word "death" most strongly co-varies over time with the use of medical constructs, but also co-varies with the use of constructs related to violence, relationships, religion, positive sentiment, and negative sentiment. Study 3 found that a change in the use of death-related words is associated with an increase in the use of fear-related words, but not in anxiety-related words. Results indicate that the described methodology generates valuable insights regarding terror management theory and provides new perspectives for theoretical advances.
The tool for the automatic analysis of lexical sophistication (TAALES): version 2.0.
Kyle, Kristopher; Crossley, Scott; Berger, Cynthia
2017-07-11
This study introduces the second release of the Tool for the Automatic Analysis of Lexical Sophistication (TAALES 2.0), a freely available and easy-to-use text analysis tool. TAALES 2.0 is housed on a user's hard drive (allowing for secure data processing) and is available on most operating systems (Windows, Mac, and Linux). TAALES 2.0 adds 316 indices to the original tool. These indices are related to word frequency, word range, n-gram frequency, n-gram range, n-gram strength of association, contextual distinctiveness, word recognition norms, semantic network, and word neighbors. In this study, we validated TAALES 2.0 by investigating whether its indices could be used to model both holistic scores of lexical proficiency in free writes and word choice scores in narrative essays. The results indicated that the TAALES 2.0 indices could be used to explain 58% of the variance in lexical proficiency scores and 32% of the variance in word-choice scores. Newly added TAALES 2.0 indices, including those related to n-gram association strength, word neighborhood, and word recognition norms, featured heavily in these predictor models, suggesting that TAALES 2.0 represents a substantial upgrade.
Cascaded processing in written compound word production
Bertram, Raymond; Tønnessen, Finn Egil; Strömqvist, Sven; Hyönä, Jukka; Niemi, Pekka
2015-01-01
In this study we investigated the intricate interplay between central linguistic processing and peripheral motor processes during typewriting. Participants had to typewrite two-constituent (noun-noun) Finnish compounds in response to picture presentation while their typing behavior was registered. As dependent measures we used writing onset time to assess what processes were completed before writing and inter-key intervals to assess what processes were going on during writing. It was found that writing onset time was determined by whole word frequency rather than constituent frequencies, indicating that compound words are retrieved as whole orthographic units before writing is initiated. In addition, we found that the length of the first syllable also affects writing onset time, indicating that the first syllable is fully prepared before writing commences. The inter-key interval results showed that linguistic planning is not fully ready before writing, but cascades into the motor execution phase. More specifically, inter-key intervals were largest at syllable and morpheme boundaries, supporting the view that additional linguistic planning takes place at these boundaries. Bigram and trigram frequency also affected inter-key intervals with shorter intervals corresponding to higher frequencies. This can be explained by stronger memory traces for frequently co-occurring letter sequences in the motor memory for typewriting. These frequency effects were even larger in the second than in the first constituent, indicating that low-level motor memory starts to become more important during the course of writing compound words. We discuss our results in the light of current models of morphological processing and written word production. PMID:25954182
Approaching the Linguistic Complexity
NASA Astrophysics Data System (ADS)
Drożdż, Stanisław; Kwapień, Jarosław; Orczyk, Adam
We analyze the rank-frequency distributions of words in selected English and Polish texts. We compare scaling properties of these distributions in both languages. We also study a few small corpora of Polish literary texts and find that for a corpus consisting of texts written by different authors the basic scaling regime is broken more strongly than in the case of comparable corpus consisting of texts written by the same author. Similarly, for a corpus consisting of texts translated into Polish from other languages the scaling regime is broken more strongly than for a comparable corpus of native Polish texts. Moreover, based on the British National Corpus, we consider the rank-frequency distributions of the grammatically basic forms of words (lemmas) tagged with their proper part of speech. We find that these distributions do not scale if each part of speech is analyzed separately. The only part of speech that independently develops a trace of scaling is verbs.
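A rank-frequency analysis of the kind described above can be sketched in a few lines. This is an illustrative sketch only, assuming whitespace-tokenized input and an ordinary least-squares fit in log-log space; the function names `rank_frequency` and `loglog_slope` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, frequency) pairs, rank 1 being the most frequent word."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(freqs, start=1))


def loglog_slope(pairs):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 corresponds to the classic Zipf scaling regime.
    Requires at least two distinct ranks.
    """
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(freq) for _, freq in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic tokens whose frequencies are exactly 120 / rank.
tokens = []
for rank in range(1, 6):
    tokens.extend([f"w{rank}"] * (120 // rank))

pairs = rank_frequency(tokens)
slope = loglog_slope(pairs)
```

Fitting the slope separately over different rank ranges, or separately per part of speech, is one simple way to see where a scaling regime breaks down, although a single regression line is a crude diagnostic compared with the analyses in the paper.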
Frequency Analyses of Prephonological Spellings as Predictors of Success in Conventional Spelling
Kessler, Brett; Pollo, Tatiana Cury; Treiman, Rebecca; Cardoso-Martins, Cláudia
2014-01-01
The present study explored how children’s prephonological writing foretells differential learning outcomes in primary school. We asked Portuguese-speaking preschool children in Brazil (mean age 4 1/4 years) to spell 12 words. Monte Carlo tests were used to identify the 31 children whose writing was not based on spellings or sounds of the target words. 2 1/2 years later, the children took a standardized spelling test. The more closely the digram (2-letter sequence) frequencies in the preschool task correlated with those in children’s books, the better scores the children had in primary school; and the more preschoolers used letters from their own name, the lower their subsequent scores. Thus, preschoolers whose prephonological writing revealed attentiveness to the statistical properties of text subsequently performed better in conventional spelling. These analytic techniques may help in the early identification of children at risk for spelling difficulties. PMID:22798104
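The digram comparison described above can be illustrated with a short sketch. It assumes raw digram counts and a Pearson correlation over the union of digrams seen in either source; the study's actual procedure, including its Monte Carlo tests, is not given in the abstract, and the function names and toy word lists here are hypothetical.

```python
import math
from collections import Counter


def digram_counts(words):
    """Count 2-letter sequences inside each word (no cross-word digrams)."""
    counts = Counter()
    for word in words:
        word = word.lower()
        counts.update(word[i:i + 2] for i in range(len(word) - 1))
    return counts


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def digram_correlation(spellings, reference_words):
    """Correlate digram counts in a child's spellings with a reference text."""
    child = digram_counts(spellings)
    reference = digram_counts(reference_words)
    keys = sorted(set(child) | set(reference))
    # Counter returns 0 for digrams that one source never produced.
    return pearson([child[k] for k in keys], [reference[k] for k in keys])


# Hypothetical reference vocabulary and two invented "spellings".
reference = ["casa", "cama"]
typical = digram_correlation(["caca"], reference)
atypical = digram_correlation(["xzxz"], reference)
```

In this toy case the spelling built from digrams that occur in the reference words correlates positively with it, while the spelling built from unseen digrams correlates negatively.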
Duration of the speech disfluencies of beginning stutterers.
Zebrowski, P M
1991-06-01
This study compared the duration of within-word disfluencies and the number of repeated units per instance of sound/syllable and whole-word repetitions of beginning stutterers to those produced by age- and sex-matched nonstuttering children. Subjects were 10 stuttering children (9 males and 1 female; mean age 4:1 [years:months]; age range 3:2-5:1), and 10 nonstuttering children (9 males and 1 female; mean age 4:0; age range 2:10-5:1). Mothers of the stuttering children reported that their children had been stuttering for 1 year or less. One 300-word conversational speech sample from each of the stuttering and nonstuttering children was analyzed for (a) mean duration of sound/syllable repetition and sound prolongation, (b) mean number of repeated units per instance of sound/syllable and whole-word repetition, and (c) various related measures of the frequency of all between- and within-word speech disfluencies. There were no significant between-group differences for either the duration of acoustically measured sound/syllable repetitions and sound prolongations or the number of repeated units per instance of sound/syllable and whole-word repetition. Unlike frequency and type of speech disfluency produced, average duration of within-word disfluencies and number of repeated units per repetition do not differentiate the disfluent speech of beginning stutterers and their nonstuttering peers. Additional analyses support findings from previous perceptual work that type and frequency of speech disfluency, not duration, are the principal characteristics listeners use in distinguishing these two talker groups.
Letter Position Coding Across Modalities: The Case of Braille Readers
Perea, Manuel; García-Chamorro, Cristina; Martín-Suesta, Miguel; Gómez, Pablo
2012-01-01
Background: The question of how the brain encodes letter position in written words has attracted increasing attention in recent years. A number of models have recently been proposed to accommodate the fact that transposed-letter stimuli like jugde or caniso are perceptually very close to their base words. Methodology: Here we examined how letter position coding is attained in the tactile modality via Braille reading. The idea is that Braille word recognition may provide more serial processing than the visual modality, and this may produce differences in the input coding schemes employed to encode letters in written words. To that end, we conducted a lexical decision experiment with adult Braille readers in which the pseudowords were created by transposing/replacing two letters. Principal Findings: We found a word-frequency effect for words. In addition, unlike parallel experiments in the visual modality, we failed to find any clear signs of transposed-letter confusability effects. This dissociation highlights the differences between modalities. Conclusions: The present data argue against models of letter position coding that assume that transposed-letter effects (in the visual modality) occur at a relatively late, abstract locus. PMID:23071522
ERIC Educational Resources Information Center
Mermillod, Martial; Bonin, Patrick; Meot, Alain; Ferrand, Ludovic; Paindavoine, Michel
2012-01-01
According to the age-of-acquisition hypothesis, words acquired early in life are processed faster and more accurately than words acquired later. Connectionist models have begun to explore the influence of the age/order of acquisition of items (and also their frequency of encounter). This study attempts to reconcile two different methodological and…
ERIC Educational Resources Information Center
van Severen, Lieve; Gillis, Joris J. M.; Molemans, Inge; van den Berg, Renate; De Maeyer, Sven; Gillis, Steven
2013-01-01
The impact of input frequency (IF) and functional load (FL) of segments in the ambient language on the acquisition order of word-initial consonants is investigated. Several definitions of IF/FL are compared and implemented. The impact of IF/FL and their components are computed using a longitudinal corpus of interactions between thirty…
ERIC Educational Resources Information Center
Rispens, Judith; Baker, Anne; Duinmeijer, Iris
2015-01-01
Purpose: The effects of neighborhood density (ND) and lexical frequency on word recognition and the effects of phonotactic probability (PP) on nonword repetition (NWR) were examined to gain insight into processing at the lexical and sublexical levels in typically developing (TD) children and children with developmental language problems. Method:…
On the role of words in the network structure of texts: Application to authorship attribution
NASA Astrophysics Data System (ADS)
Akimushkin, Camilo; Amancio, Diego R.; Oliveira, Osvaldo N.
2018-04-01
Well-established automatic analyses of texts mainly consider frequencies of linguistic units, e.g. letters, words, and bigrams. In a recent, alternative approach, medium and large-scale text structures were used in opposition to the belief that text structure is dominated by the language features. In this paper, we introduce a generalized similarity measure to compare texts which accounts for both the network structure of texts and the role of individual words in the networks. The similarity measure is used for authorship attribution of three collections of books, each composed of 8 authors and 10 books per author. High accuracy rates were obtained with typical values between 90% and 98.75%, much higher than with the traditional term frequency-inverse document frequency (tf-idf) approach for the same collections. These accuracies are also higher than those obtained solely with the topology of networks. We conclude that the different properties of specific words on the macroscopic scale structure of a whole text are as relevant as their frequency of appearance; conversely, considering the identity of nodes brings further knowledge about a piece of text represented as a network.
Tuning time-frequency methods for the detection of metered HF speech
NASA Astrophysics Data System (ADS)
Nelson, Douglas J.; Smith, Lawrence H.
2002-12-01
Speech is metered if the stresses occur at a nearly regular rate. Metered speech is common in poetry, and it can occur naturally in speech, if the speaker is spelling a word or reciting words or numbers from a list. In radio communications, the CQ request, call sign and other codes are frequently metered. In tactical communications and air traffic control, location, heading and identification codes may be metered. Moreover metering may be expected to survive even in HF communications, which are corrupted by noise, interference and mistuning. For this environment, speech recognition and conventional machine-based methods are not effective. We describe Time-Frequency methods which have been adapted successfully to the problem of mitigation of HF signal conditions and detection of metered speech. These methods are based on modeled time and frequency correlation properties of nearly harmonic functions. We derive these properties and demonstrate a performance gain over conventional correlation and spectral methods. Finally, in addressing the problem of HF single sideband (SSB) communications, the problems of carrier mistuning, interfering signals, such as manual Morse, and fast automatic gain control (AGC) must be addressed. We demonstrate simple methods which may be used to blindly mitigate mistuning and narrowband interference, and effectively invert the fast automatic gain function.
Taboo, emotionally valenced, and emotionally neutral word norms.
Janschewitz, Kristin
2008-11-01
Although taboo words are used to study emotional memory and attention, no easily accessible normative data are available that compare taboo, emotionally valenced, and emotionally neutral words on the same scales. Frequency, inappropriateness, valence, arousal, and imageability ratings for taboo, emotionally valenced, and emotionally neutral words were made by 78 native-English-speaking college students from a large metropolitan university. The valenced set comprised both positive and negative words, and the emotionally neutral set comprised category-related and category-unrelated words. To account for influences of demand characteristics and personality factors on the ratings, frequency and inappropriateness measures were decomposed into raters' personal reactions to the words versus raters' perceptions of societal reactions to the words (personal use vs. familiarity and offensiveness vs. tabooness, respectively). Although all word sets were rated higher in familiarity and tabooness than in personal use and offensiveness, these differences were most pronounced for the taboo set. In terms of valence, the taboo set was most similar to the negative set, although it yielded higher arousal ratings than did either valenced set. Imageability for the taboo set was comparable to that of both valenced sets. The ratings of each word are presented for all participants as well as for single-sex groups. The inadequacies of the application of normative data to research that uses emotional words and the conceptualization of taboo words as a coherent category are discussed. Materials associated with this article may be accessed at the Psychonomic Society's Archive of Norms, Stimuli, and Data, www.psychonomic.org/archive.
Emotion word processing: does mood make a difference?
Sereno, Sara C; Scott, Graham G; Yao, Bo; Thaden, Elske J; O'Donnell, Patrick J
2015-01-01
Visual emotion word processing has been a focus of recent psycholinguistic research. In general, emotion words provoke differential responses in comparison to neutral words. However, words are typically processed within a context rather than in isolation. For instance, how does one's inner emotional state influence the comprehension of emotion words? To address this question, the current study examined lexical decision responses to emotionally positive, negative, and neutral words as a function of induced mood as well as their word frequency. Mood was manipulated by exposing participants to different types of music. Participants were randomly assigned to one of three conditions: no music, positive music, and negative music. Participants' moods were assessed during the experiment to confirm the mood induction manipulation. Reaction time results confirmed prior demonstrations of an interaction between a word's emotionality and its frequency. Results also showed a significant interaction between participant mood and word emotionality. However, the pattern of results was not consistent with mood-congruency effects. Although positive and negative mood facilitated responses overall in comparison to the control group, neither positive nor negative mood appeared to additionally facilitate responses to mood-congruent words. Instead, the pattern of findings seemed to be the consequence of attentional effects arising from induced mood. Positive mood broadens attention to a global level, eliminating the category distinction of positive-negative valence but leaving the high-low arousal dimension intact. In contrast, negative mood narrows attention to a local level, enhancing within-category distinctions, in particular, for negative words, resulting in less effective facilitation.
Kremin, Helgard; Akhutina, Tanya; Basso, Anna; Davidoff, Jules; De Wilde, Martine; Kitzing, Peter; Lorenz, Antje; Perrier, Danièle; van der Sandt-Koenderman, Mieke; Vendrell, Josep; Weniger, Dorothea; Apt, Pia; Arabia, Catherine; De Bleser, Ria; Cohen, Henri; Corbineau, Mathilde; Dolivet, Marie-Christine; Hirsh, Kathi; Lehoux, Emilie; Metz-Lutz, Mari Noëlle; Montañes, Patricia; Plagne, Stéphanie; Polonskaya, Natalya; Sirois, Mélanie; Stachowiak, Franz; Sweeney, Trione; Vish-Brink, Evy
2003-11-01
The well-established effect of word frequency on adults' picture naming performance is now called into question. This is particularly true for variables which are correlated with frequency, as is the case of age of word acquisition. Since the work of Carroll and White (1973) there is growing agreement among researchers to confer an important role in lexical access to this variable. Indeed, it has been shown (Hodgson and Ellis, 1998) that for normal English-speaking adults only the variables 'age-of-acquisition' and 'name agreement' are independent predictors of naming success among the various variables considered. However, when brain-damaged subjects with and without degenerative pathologies are studied, word frequency and word length as well as concept familiarity all give significant effects (Hirsh and Funnell, 1995; Lambon Ralph et al., 1998; Nickels and Howard, 1995). Finally, it has been suggested that the production of specific error types may be related to such variables. According to Nickels and Howard (1994) the production of semantic errors is specifically affected by 'imageability' and in the recent study by Kremin et al. (2001) 'age of acquisition' predicts (frank) word finding difficulties.
How Listening to Music Affects Reading: Evidence From Eye Tracking.
Zhang, Han; Miller, Kevin; Cleveland, Raymond; Cortina, Kai
2018-02-01
The current research looked at how listening to music affects eye movements when college students read natural passages for comprehension. Two studies found that effects of music depend on both frequency of the word and dynamics of the music. Study 1 showed that lexical and linguistic features of the text remained highly robust predictors of looking times, even in the music condition. However, under music exposure, (a) readers produced more rereading, and (b) gaze durations on words with very low frequency were less well predicted by word length, suggesting disrupted sublexical processing. Study 2 showed that these effects were exacerbated for a short period as soon as a new song came into play. Our results suggested that word recognition generally stayed on track despite music exposure and that extensive rereading can, to some extent, compensate for disruption. However, an irrelevant auditory signal may impair sublexical processing of low-frequency words during first-pass reading, especially when the auditory signal changes dramatically. These eye movement patterns are different from those observed in some other scenarios in which reading comprehension is impaired, including mindless reading. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Ardoin, Scott P; Binder, Katherine S; Zawoyski, Andrea M; Foster, Tori E
2018-06-01
Repeated reading (RR) procedures are consistent with the procedures recommended by Haring and Eaton's (1978) Instructional Hierarchy (IH) for promoting students' fluent responding to newly learned stimuli. It is therefore not surprising that an extensive body of literature supports RR as an effective practice for promoting students' reading fluency of practiced passages. Less clear, however, is the extent to which RR helps students read the words practiced in an intervention passage when those same words are presented in a new passage. The current study employed randomized control design procedures to examine the maintenance and generalization effects of three interventions that were designed based upon Haring and Eaton's (1978) IH. Across four days, students either practiced reading (a) the same passage seven times (RR+RR), (b) one passage four times and three passages each once (RR+Guided Wide Reading [GWR]), or (c) seven passages each once (GWR+GWR). Students participated in the study across 2 weeks, with intervention being provided on a different passage set each week. All passages practiced within a week, regardless of condition, contained four target low-frequency and four high-frequency words. Across the 130 students for whom data were analyzed, results indicated that increased opportunities to practice words led to greater maintenance effects when passages were read seven days later but revealed minimal differences across conditions in students' reading of target words presented within a generalization passage. Copyright © 2018 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
The Bayesian reader: explaining word recognition as an optimal Bayesian decision process.
Norris, Dennis
2006-04-01
This article presents a theory of visual word recognition that assumes that, in the tasks of word identification, lexical decision, and semantic categorization, human readers behave as optimal Bayesian decision makers. This leads to the development of a computational model of word recognition, the Bayesian reader. The Bayesian reader successfully simulates some of the most significant data on human reading. The model accounts for the nature of the function relating word frequency to reaction time and identification threshold, the effects of neighborhood density and its interaction with frequency, and the variation in the pattern of neighborhood density effects seen in different experimental tasks. Both the general behavior of the model and the way the model predicts different patterns of results in different tasks follow entirely from the assumption that human readers approximate optimal Bayesian decision makers. ((c) 2006 APA, all rights reserved).
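The core idea in the abstract above, that word frequency acts as a prior in a Bayesian decision, can be illustrated with a toy calculation. The following is a minimal sketch under strong simplifying assumptions, not the Bayesian reader model itself: the function name, the fixed likelihood ratio per perceptual sample, and the decision threshold are all hypothetical choices made for illustration.

```python
def steps_to_threshold(prior, likelihood_ratio=2.0, threshold=0.95):
    """Count evidence samples needed before the posterior reaches `threshold`.

    Assumption (illustrative only): every perceptual sample multiplies the
    odds in favour of the correct word by the same `likelihood_ratio`.
    `prior` stands in for the word's relative frequency.
    """
    odds = prior / (1.0 - prior)             # prior odds from word frequency
    target = threshold / (1.0 - threshold)   # odds equivalent of the threshold
    steps = 0
    while odds < target:
        odds *= likelihood_ratio             # Bayes' rule in odds form
        steps += 1
    return steps


# Hypothetical priors for a high-frequency and a low-frequency word.
high_freq_steps = steps_to_threshold(0.1)
low_freq_steps = steps_to_threshold(0.001)
print(high_freq_steps, low_freq_steps)
```

In this sketch the rarer word needs more samples to reach the same posterior, and the number of samples grows with the logarithm of the prior odds, which is one simple way a frequency effect on response time can fall out of Bayesian updating.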
Benjafield, John G
2016-05-01
The digital humanities are being applied with increasing frequency to the analysis of historically important texts. In this study, the methods of G. K. Zipf are used to explore the digital history of the vocabulary of psychology. Zipf studied a great many phenomena, from word frequencies to city sizes, showing that they tend to have a characteristic distribution in which there are a few cases that occur very frequently and many more cases that occur very infrequently. We find that the number of new words and word senses that writers contribute to the vocabulary of psychology has such a Zipfian distribution. Moreover, those who make the most contributions, such as William James, tend also to invent new metaphorical senses of words rather than new words. By contrast, those who make the fewest contributions tend to invent entirely new words. The use of metaphor makes a text easier for a reader to understand. While the use of new words requires more effort on the part of the reader, it may lead to more precise understanding than does metaphor. On average, new words and word senses become a part of psychology's vocabulary in the time leading up to World War I, suggesting that psychology was "finding its language" (Danziger, 1997) during this period. ((c) 2016 APA, all rights reserved).
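A Zipfian distribution of the kind described above is usually checked by ranking the counts and looking at the slope of log frequency against log rank. The sketch below shows one way to do this with ordinary least squares; the contributor labels and counts are invented toy data (constructed so that the count at rank r is exactly 120/r), not data from the study.

```python
from collections import Counter
import math


def rank_frequency(items):
    """Return counts sorted from most to least frequent (rank 1 first)."""
    return sorted(Counter(items).values(), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Hypothetical toy data: contributor at rank r coined 120 // r new words.
contributions = []
for rank in range(1, 7):
    contributions.extend([f"writer_{rank}"] * (120 // rank))

freqs = rank_frequency(contributions)
slope = loglog_slope(freqs)
print(freqs, round(slope, 3))
```

A slope near -1 on such a plot is the classic Zipf pattern; real data will scatter around the line, and a least-squares fit on log-log axes is only a rough diagnostic rather than a rigorous test.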
Stochastic Model for the Vocabulary Growth in Natural Languages
NASA Astrophysics Data System (ADS)
Gerlach, Martin; Altmann, Eduardo G.
2013-04-01
We propose a stochastic model for the number of different words in a given database which incorporates the dependence on the database size and historical changes. The main feature of our model is the existence of two different classes of words: (i) a finite number of core words, which have higher frequency and do not affect the probability that a new word is used, and (ii) the remaining virtually infinite number of noncore words, which have lower frequency and, once used, reduce the probability that a new word is used in the future. Our model relies on a careful analysis of the Google Ngram database of books published in the last centuries, and its main consequence is the generalization of Zipf’s and Heaps’ laws to two scaling regimes. We confirm that these generalizations yield the best simple description of the data among generic descriptive models and that the two free parameters depend only on the language but not on the database. From the point of view of our model, the main change on historical time scales is the composition of the specific words included in the finite list of core words, which we observe to decay exponentially in time with a rate of approximately 30 words per year for English.
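To make the phrase "two scaling regimes" concrete, the sketch below defines a rank-frequency curve with one power-law exponent for high-frequency (core-like) ranks and a steeper one beyond a crossover rank. The functional form, the function name, and the parameter values (`b`, `gamma`, `c`) are assumptions chosen for illustration; the abstract does not state them, and they should not be read as the fitted values from the paper.

```python
import math


def two_regime_frequency(rank, b=100, gamma=1.8, c=1.0):
    """Illustrative double power law for relative frequency at a given rank.

    Assumed form (hypothetical): exponent 1 up to the crossover rank `b`,
    exponent `gamma` beyond it, with the prefactor chosen so the two pieces
    meet continuously at rank == b.
    """
    if rank <= b:
        return c / rank
    return c * b ** (gamma - 1.0) / rank ** gamma


def local_slope(r1, r2):
    """Slope of log(frequency) against log(rank) between two ranks."""
    f1, f2 = two_regime_frequency(r1), two_regime_frequency(r2)
    return (math.log(f2) - math.log(f1)) / (math.log(r2) - math.log(r1))


print(local_slope(10, 20))      # slope inside the first regime
print(local_slope(1000, 2000))  # slope inside the second regime
```

On log-log axes this curve is a line with slope -1 that bends to slope -gamma at the crossover rank, which is the qualitative shape a two-regime generalization of Zipf's law implies.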