statistical word segmentation: Topics by Science.gov

Sample records for statistical word segmentation

Familiar units prevail over statistical cues in word segmentation.

PubMed

Poulin-Charronnat, Bénédicte; Perruchet, Pierre; Tillmann, Barbara; Peereman, Ronald

2017-09-01

In language acquisition research, the prevailing position is that listeners exploit statistical cues, in particular transitional probabilities between syllables, to discover words of a language. However, other cues are also involved in word discovery. Assessing the weight learners give to these different cues leads to a better understanding of the processes underlying speech segmentation. The present study evaluated whether adult learners preferentially used known units or statistical cues for segmenting continuous speech. Before the exposure phase, participants were familiarized with part-words of a three-word artificial language. This design allowed the dissociation of the influence of statistical cues and familiar units, with statistical cues favoring word segmentation and familiar units favoring (nonoptimal) part-word segmentation. In Experiment 1, performance in a two-alternative forced choice (2AFC) task between words and part-words revealed part-word segmentation (even though part-words were less cohesive in terms of transitional probabilities and less frequent than words). By contrast, an unfamiliarized group exhibited word segmentation, as usually observed in standard conditions. Experiment 2 used a syllable-detection task to remove the likely contamination of performance by memory and strategy effects in the 2AFC task. Overall, the results suggest that familiar units overrode statistical cues, ultimately questioning the need for computation mechanisms of transitional probabilities (TPs) in natural language speech segmentation.
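The transitional-probability cue referred to in this abstract can be illustrated with a minimal sketch. The three-word language, its syllables, the word order, and the 0.9 threshold below are all invented for illustration and are not the study's materials; the sketch only shows how forward TPs between syllables tend to be high within words and lower across word boundaries.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Forward TP(x -> y) = count(x immediately followed by y) / count(x)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count x only where it has a successor, so the final syllable is excluded.
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}

def segment_at_low_tp(syllables, tps, threshold):
    """Place a word boundary wherever the TP between neighbours is below threshold."""
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tps[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words

# Hypothetical three-word language; syllables and word order are assumptions.
lexicon = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["bi", "da", "ku"]}
order = "ABCACBABCBAC"  # no word is immediately repeated
stream = [syl for word in order for syl in lexicon[word]]

tps = transitional_probabilities(stream)
words = segment_at_low_tp(stream, tps, threshold=0.9)
print(tps[("tu", "pi")], tps[("ro", "go")])  # within-word vs. between-word TP
print(words[:3])
```

In this toy stream every within-word transition is fully predictable, so a fixed threshold recovers the words. A learner already familiar with part-words, as in the study, would not necessarily follow these statistics.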
Flexibility in Statistical Word Segmentation: Finding Words in Foreign Speech

ERIC Educational Resources Information Center

Graf Estes, Katharine; Gluck, Stephanie Chen-Wu; Bastos, Carolina

2015-01-01

The present experiments investigated the flexibility of statistical word segmentation. There is ample evidence that infants can use statistical cues (e.g., syllable transitional probabilities) to segment fluent speech. However, it is unclear how effectively infants track these patterns in unfamiliar phonological systems. We examined whether…
The Surprising Power of Statistical Learning: When Fragment Knowledge Leads to False Memories of Unheard Words

ERIC Educational Resources Information Center

Endress, Ansgar D.; Mehler, Jacques

2009-01-01

Word-segmentation, that is, the extraction of words from fluent speech, is one of the first problems language learners have to master. It is generally believed that statistical processes, in particular those tracking "transitional probabilities" (TPs), are important to word-segmentation. However, there is evidence that word forms are stored in…
All words are not created equal: Expectations about word length guide infant statistical learning

PubMed Central

Lew-Williams, Casey; Saffran, Jenny R.

2011-01-01

Infants have been described as ‘statistical learners’ capable of extracting structure (such as words) from patterned input (such as language). Here, we investigated whether prior knowledge influences how infants track transitional probabilities in word segmentation tasks. Are infants biased by prior experience when engaging in sequential statistical learning? In a laboratory simulation of learning across time, we exposed 9- and 10-month-old infants to a list of either bisyllabic or trisyllabic nonsense words, followed by a pause-free speech stream composed of a different set of bisyllabic or trisyllabic nonsense words. Listening times revealed successful segmentation of words from fluent speech only when words were uniformly bisyllabic or trisyllabic throughout both phases of the experiment. Hearing trisyllabic words during the pre-exposure phase derailed infants’ abilities to segment speech into bisyllabic words, and vice versa. We conclude that prior knowledge about word length equips infants with perceptual expectations that facilitate efficient processing of subsequent language input. PMID:22088408
Why Segmentation Matters: Experience-Driven Segmentation Errors Impair "Morpheme" Learning

ERIC Educational Resources Information Center

Finn, Amy S.; Hudson Kam, Carla L.

2015-01-01

We ask whether an adult learner's knowledge of their native language impedes statistical learning in a new language beyond just word segmentation (as previously shown). In particular, we examine the impact of native-language word-form phonotactics on learners' ability to segment words into their component morphemes and learn phonologically…
Learning across Languages: Bilingual Experience Supports Dual Language Statistical Word Segmentation

ERIC Educational Resources Information Center

Antovich, Dylan M.; Graf Estes, Katharine

2018-01-01

Bilingual acquisition presents learning challenges beyond those found in monolingual environments, including the need to segment speech in two languages. Infants may use statistical cues, such as syllable-level transitional probabilities, to segment words from fluent speech. In the present study we assessed monolingual and bilingual 14-month-olds'…
GeoSegmenter: A statistically learned Chinese word segmenter for the geoscience domain

NASA Astrophysics Data System (ADS)

Huang, Lan; Du, Youfu; Chen, Gongyang

2015-03-01

Unlike English, the Chinese language has no spaces between words. Segmenting texts into words, known as the Chinese word segmentation (CWS) problem, thus becomes a fundamental issue for processing Chinese documents and the first step in many text mining applications, including information retrieval, machine translation and knowledge acquisition. However, for the geoscience subject domain, the CWS problem remains unsolved. Although generic segmenters can be applied to geoscience documents, they lack domain-specific knowledge, and consequently their segmentation accuracy drops dramatically. This motivated us to develop a segmenter specifically for the geoscience subject domain: the GeoSegmenter. We first proposed a generic two-step framework for domain-specific CWS. Following this framework, we built GeoSegmenter using conditional random fields, a principled statistical framework for sequence learning. Specifically, GeoSegmenter first identifies general terms by using a generic baseline segmenter. Then it recognises geoscience terms by learning and applying a model that can transform the initial segmentation into the goal segmentation. Empirical experimental results on geoscience documents and benchmark datasets showed that GeoSegmenter could effectively recognise both geoscience terms and general terms.
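The two-step idea described in this abstract (a generic baseline segmentation followed by a transformation toward domain terms) can be sketched roughly as follows. This is not GeoSegmenter: the abstract's second step is a learned conditional random fields model, whereas the sketch swaps in a simple lexicon-based merge as a stand-in. The function names, the tiny vocabularies, and the example sentence are all hypothetical.

```python
def max_match(text, vocab, max_len=4):
    """Step 1 (stand-in baseline): greedy forward maximum matching.
    Falls back to single characters when no vocabulary entry matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens

def merge_domain_terms(tokens, domain_terms, max_span=6):
    """Step 2 (stand-in for the learned model): merge runs of adjacent
    tokens whose concatenation is a known domain term, longest run first."""
    merged, i = [], 0
    while i < len(tokens):
        for j in range(min(len(tokens), i + max_span), i, -1):
            candidate = "".join(tokens[i:j])
            if j - i == 1 or candidate in domain_terms:
                merged.append(candidate)
                i = j
                break
    return merged

# Hypothetical vocabularies, invented for illustration only.
generic_vocab = {"研究", "发育"}
geo_terms = {"研究区", "花岗闪长岩"}  # "study area", "granodiorite"

text = "研究区花岗闪长岩发育"
initial = max_match(text, generic_vocab)
final = merge_domain_terms(initial, geo_terms)
print(initial)
print(final)
```

The baseline splits the unknown geoscience term into single characters, and the second pass reassembles it. A learned transformation would generalise beyond a fixed lexicon, which this stand-in cannot do.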
Speech Segmentation by Statistical Learning Depends on Attention

ERIC Educational Resources Information Center

Toro, Juan M.; Sinnett, Scott; Soto-Faraco, Salvador

2005-01-01

We addressed the hypothesis that word segmentation based on statistical regularities occurs without the need of attention. Participants were presented with a stream of artificial speech in which the only cue to extract the words was the presence of statistical regularities between syllables. Half of the participants were asked to passively listen…
Listening through Voices: Infant Statistical Word Segmentation across Multiple Speakers

ERIC Educational Resources Information Center

Graf Estes, Katharine; Lew-Williams, Casey

2015-01-01

To learn from their environments, infants must detect structure behind pervasive variation. This presents substantial and largely untested learning challenges in early language acquisition. The current experiments address whether infants can use statistical learning mechanisms to segment words when the speech signal contains acoustic variation…
Cracking the Language Code: Neural Mechanisms Underlying Speech Parsing

PubMed Central

McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella

2013-01-01

Word segmentation, detecting word boundaries in continuous speech, is a critical aspect of language learning. Previous research in infants and adults demonstrated that a stream of speech can be readily segmented based solely on the statistical and speech cues afforded by the input. Using functional magnetic resonance imaging (fMRI), the neural substrate of word segmentation was examined on-line as participants listened to three streams of concatenated syllables, containing either statistical regularities alone, statistical regularities and speech cues, or no cues. Despite the participants’ inability to explicitly detect differences between the speech streams, neural activity differed significantly across conditions, with left-lateralized signal increases in temporal cortices observed only when participants listened to streams containing statistical regularities, particularly the stream containing speech cues. In a second fMRI study, designed to verify that word segmentation had implicitly taken place, participants listened to trisyllabic combinations that occurred with different frequencies in the streams of speech they just heard (“words,” 45 times; “partwords,” 15 times; “nonwords,” once). Reliably greater activity in left inferior and middle frontal gyri was observed when comparing words with partwords and, to a lesser extent, when comparing partwords with nonwords. Activity in these regions, taken to index the implicit detection of word boundaries, was positively correlated with participants’ rapid auditory processing skills. These findings provide a neural signature of on-line word segmentation in the mature brain and an initial model with which to study developmental changes in the neural architecture involved in processing speech cues during language learning. PMID:16855090
Statistical word learning in children with autism spectrum disorder and specific language impairment.

PubMed

Haebig, Eileen; Saffran, Jenny R; Ellis Weismer, Susan

2017-11-01

Word learning is an important component of language development that influences child outcomes across multiple domains. Despite the importance of word knowledge, word-learning mechanisms are poorly understood in children with specific language impairment (SLI) and children with autism spectrum disorder (ASD). This study examined underlying mechanisms of word learning, specifically, statistical learning and fast-mapping, in school-aged children with typical and atypical development. Statistical learning was assessed through a word segmentation task and fast-mapping was examined in an object-label association task. We also examined children's ability to map meaning onto newly segmented words in a third task that combined exposure to an artificial language and a fast-mapping task. Children with SLI had poorer performance on the word segmentation and fast-mapping tasks relative to the typically developing and ASD groups, who did not differ from one another. However, when children with SLI were exposed to an artificial language with phonemes used in the subsequent fast-mapping task, they successfully learned more words than in the isolated fast-mapping task. There was some evidence that word segmentation abilities are associated with word learning in school-aged children with typical development and ASD, but not SLI. Follow-up analyses also examined performance in children with ASD who did and did not have a language impairment. Children with ASD with language impairment evidenced intact statistical learning abilities, but subtle weaknesses in fast-mapping abilities. As the Procedural Deficit Hypothesis (PDH) predicts, children with SLI have impairments in statistical learning. However, children with SLI also have impairments in fast-mapping. Nonetheless, they are able to take advantage of additional phonological exposure to boost subsequent word-learning performance. In contrast to the PDH, children with ASD appear to have intact statistical learning, regardless of language status; however, fast-mapping abilities differ according to broader language skills. © 2017 Association for Child and Adolescent Mental Health.
Why segmentation matters: experience-driven segmentation errors impair “morpheme” learning

PubMed Central

Finn, Amy S.; Hudson Kam, Carla L.

2015-01-01

We ask whether an adult learner’s knowledge of their native language impedes statistical learning in a new language beyond just word segmentation (as previously shown). In particular, we examine the impact of native-language word-form phonotactics on learners’ ability to segment words into their component morphemes and learn phonologically triggered variation of morphemes. We find that learning is impaired when words and component morphemes are structured to conflict with a learner’s native-language phonotactic system, but not when native-language phonotactics do not conflict with morpheme boundaries in the artificial language. A learner’s native-language knowledge can therefore have a cascading impact affecting word segmentation and the morphological variation that relies upon proper segmentation. These results show that getting word segmentation right early in learning is deeply important for learning other aspects of language, even those (morphology) that are known to pose a great difficulty for adult language learners. PMID:25730305
Linguistic Constraints on Statistical Word Segmentation: The Role of Consonants in Arabic and English

ERIC Educational Resources Information Center

Kastner, Itamar; Adriaans, Frans

2018-01-01

Statistical learning is often taken to lie at the heart of many cognitive tasks, including the acquisition of language. One particular task in which probabilistic models have achieved considerable success is the segmentation of speech into words. However, these models have mostly been tested against English data, and as a result little is known…
A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation.

PubMed

Tran, Phuoc; Dinh, Dien; Nguyen, Hien T

2016-01-01

Chinese and Vietnamese are both isolating languages in which words are not delimited by spaces. In machine translation, word segmentation is often done first when translating from Chinese or Vietnamese into other languages (typically English) and vice versa. However, whether words should be segmented at all is an open question when translating between two languages in which spaces are not used between words, such as Chinese and Vietnamese. Since Chinese-Vietnamese is a low-resource language pair, the sparse data problem is evident in translation systems for this pair. The choice of whether to segment the text therefore becomes more important. In this paper, we propose a new method for translating Chinese to Vietnamese that combines the advantages of character-level and word-level translation. A hybrid approach that combines statistics and rules is used at the word level, and statistical translation is used at the character level. The experimental results showed that our method improved the performance of machine translation over that of character-level or word-level translation alone.
Rapid Statistical Learning Supporting Word Extraction From Continuous Speech.

PubMed

Batterink, Laura J

2017-07-01

The identification of words in continuous speech, known as speech segmentation, is a critical early step in language acquisition. This process is partially supported by statistical learning, the ability to extract patterns from the environment. Given that speech segmentation represents a potential bottleneck for language acquisition, patterns in speech may be extracted very rapidly, without extensive exposure. This hypothesis was examined by exposing participants to continuous speech streams composed of novel repeating nonsense words. Learning was measured on-line using a reaction time task. After merely one exposure to an embedded novel word, learners demonstrated significant learning effects, as revealed by faster responses to predictable than to unpredictable syllables. These results demonstrate that learners gained sensitivity to the statistical structure of unfamiliar speech on a very rapid timescale. This ability may play an essential role in early stages of language acquisition, allowing learners to rapidly identify word candidates and "break in" to an unfamiliar language.
Assessing segmentation processes by click detection: online measure of statistical learning, or simple interference?

PubMed

Franco, Ana; Gaillard, Vinciane; Cleeremans, Axel; Destrebecqz, Arnaud

2015-12-01

Statistical learning can be used to extract the words from continuous speech. Gómez, Bion, and Mehler (Language and Cognitive Processes, 26, 212-223, 2011) proposed an online measure of statistical learning: They superimposed auditory clicks on a continuous artificial speech stream made up of a random succession of trisyllabic nonwords. Participants were instructed to detect these clicks, which could be located either within or between words. The results showed that, over the length of exposure, reaction times (RTs) increased more for within-word than for between-word clicks. This result has been accounted for by means of statistical learning of the between-word boundaries. However, even though statistical learning occurs without an intention to learn, it nevertheless requires attentional resources. Therefore, this process could be affected by a concurrent task such as click detection. In the present study, we evaluated the extent to which the click detection task indeed reflects successful statistical learning. Our results suggest that the emergence of RT differences between within- and between-word click detection is neither systematic nor related to the successful segmentation of the artificial language. Therefore, instead of being an online measure of learning, the click detection task seems to interfere with the extraction of statistical regularities.
A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation

PubMed Central

2016-01-01

Chinese and Vietnamese are both isolating languages in which words are not delimited by spaces. In machine translation, word segmentation is often done first when translating from Chinese or Vietnamese into other languages (typically English) and vice versa. However, whether words should be segmented at all is an open question when translating between two languages in which spaces are not used between words, such as Chinese and Vietnamese. Since Chinese-Vietnamese is a low-resource language pair, the sparse data problem is evident in translation systems for this pair. The choice of whether to segment the text therefore becomes more important. In this paper, we propose a new method for translating Chinese to Vietnamese that combines the advantages of character-level and word-level translation. A hybrid approach that combines statistics and rules is used at the word level, and statistical translation is used at the character level. The experimental results showed that our method improved the performance of machine translation over that of character-level or word-level translation alone. PMID:27446207
Words and possible words in early language acquisition.

PubMed

Marchetto, Erika; Bonatti, Luca L

2013-11-01

In order to acquire language, infants must extract its building blocks (words) and master the rules governing their legal combinations from speech. These two problems are not independent, however: words also have internal structure. Thus, infants must extract two kinds of information from the same speech input. They must find the actual words of their language. Furthermore, they must identify its possible words, that is, the sequences of sounds that, being morphologically well formed, could be words. Here, we show that infants' sensitivity to possible words appears to be more primitive and fundamental than their ability to find actual words. We expose 12- and 18-month-old infants to an artificial language containing a conflict between statistically coherent and structurally coherent items. We show that 18-month-olds can extract possible words when the familiarization stream contains marks of segmentation, but cannot do so when the stream is continuous. Yet, they can find actual words from a continuous stream by computing statistical relationships among syllables. By contrast, 12-month-olds can find possible words when familiarized with a segmented stream, but seem unable to extract statistically coherent items from a continuous stream that contains minimal conflicts between statistical and structural information. These results suggest that sensitivity to word structure is in place earlier than the ability to analyze distributional information. The ability to compute nontrivial statistical relationships becomes fully effective relatively late in development, when infants have already acquired a considerable amount of linguistic knowledge. Thus, mechanisms for structure extraction that do not rely on extensive sampling of the input are likely to have a much larger role in language acquisition than general-purpose statistical abilities. Copyright © 2013. Published by Elsevier Inc.
Implicit Language Learning: Adults' Ability to Segment Words in Norwegian

ERIC Educational Resources Information Center

Kittleson, Megan M.; Aguilar, Jessica M.; Tokerud, Gry Line; Plante, Elena; Asbjornsen, Arve E.

2010-01-01

Previous language learning research reveals that the statistical properties of the input offer sufficient information to allow listeners to segment words from fluent speech in an artificial language. The current pair of studies uses a natural language to test the ecological validity of these findings and to determine whether a listener's language…
Statistical Segmentation of Tone Sequences Activates the Left Inferior Frontal Cortex: A Near-Infrared Spectroscopy Study

ERIC Educational Resources Information Center

Abla, Dilshat; Okanoya, Kazuo

2008-01-01

Word segmentation, that is, discovering the boundaries between words that are embedded in a continuous speech stream, is an important faculty for language learners; humans solve this task partly by calculating transitional probabilities between sounds. Behavioral and ERP studies suggest that detection of sequential probabilities (statistical…

Speech segmentation in aphasia

PubMed Central

Peñaloza, Claudia; Benetello, Annalisa; Tuomiranta, Leena; Heikius, Ida-Maria; Järvinen, Sonja; Majos, Maria Carmen; Cardona, Pedro; Juncadella, Montserrat; Laine, Matti; Martin, Nadine; Rodríguez-Fornells, Antoni

2017-01-01

Background: Speech segmentation is one of the initial and mandatory phases of language learning. Although some people with aphasia have shown a preserved ability to learn novel words, their speech segmentation abilities have not been explored. Aims: We examined the ability of individuals with chronic aphasia to segment words from running speech via statistical learning. We also explored the relationships between speech segmentation and aphasia severity, and short-term memory capacity. We further examined the role of lesion location in speech segmentation and short-term memory performance. Methods & Procedures: The experimental task was first validated with a group of young adults (n = 120). Participants with chronic aphasia (n = 14) were exposed to an artificial language and were evaluated in their ability to segment words using a speech segmentation test. Their performance was contrasted against chance level and compared to that of a group of elderly matched controls (n = 14) using group and case-by-case analyses. Outcomes & Results: As a group, participants with aphasia were significantly above chance level in their ability to segment words from the novel language and did not significantly differ from the group of elderly controls. Speech segmentation ability in the aphasic participants was not associated with aphasia severity although it significantly correlated with word pointing span, a measure of verbal short-term memory. Case-by-case analyses identified four individuals with aphasia who performed above chance level on the speech segmentation task, all with predominantly posterior lesions and mild fluent aphasia. Their short-term memory capacity was also better preserved than in the rest of the group. Conclusions: Our findings indicate that speech segmentation via statistical learning can remain functional in people with chronic aphasia and suggest that this initial language learning mechanism is associated with the functionality of the verbal short-term memory system and the integrity of the left inferior frontal region. PMID:28824218
Implicit Segmentation of a Stream of Syllables Based on Transitional Probabilities: An MEG Study

ERIC Educational Resources Information Center

Teinonen, Tuomas; Huotilainen, Minna

2012-01-01

Statistical segmentation of continuous speech, i.e., the ability to utilise transitional probabilities between syllables in order to detect word boundaries, is reflected in the brain's auditory event-related potentials (ERPs). The N1 and N400 ERP components are typically enhanced for word onsets compared to random syllables during active…
Do statistical segmentation abilities predict lexical-phonological and lexical-semantic abilities in children with and without SLI?

PubMed Central

Mainela-Arnold, Elina; Evans, Julia L.

2014-01-01

This study tested the predictions of the procedural deficit hypothesis by investigating the relationship between sequential statistical learning and two aspects of lexical ability, lexical-phonological and lexical-semantic, in children with and without specific language impairment (SLI). Participants included 40 children (ages 8;5–12;3), 20 children with SLI and 20 with typical development. Children completed Saffran’s statistical word segmentation task, a lexical-phonological access task (gating task), and a word definition task. Poor statistical learners were also poor at managing lexical-phonological competition during the gating task. However, statistical learning was not a significant predictor of semantic richness in word definitions. The ability to track statistical sequential regularities may be important for learning the inherently sequential structure of lexical-phonology, but not as important for learning lexical-semantic knowledge. Consistent with the procedural/declarative memory distinction, the brain networks associated with the two types of lexical learning are likely to have different learning properties. PMID:23425593
Linking sounds to meanings: infant statistical learning in a natural language.

PubMed

Hay, Jessica F; Pelucchi, Bruna; Graf Estes, Katharine; Saffran, Jenny R

2011-09-01

The processes of infant word segmentation and infant word learning have largely been studied separately. However, the ease with which potential word forms are segmented from fluent speech seems likely to influence subsequent mappings between words and their referents. To explore this process, we tested the link between the statistical coherence of sequences presented in fluent speech and infants' subsequent use of those sequences as labels for novel objects. Notably, the materials were drawn from a natural language unfamiliar to the infants (Italian). The results of three experiments suggest that there is a close relationship between the statistics of the speech stream and subsequent mapping of labels to referents. Mapping was facilitated when the labels contained high transitional probabilities in the forward and/or backward direction (Experiment 1). When no transitional probability information was available (Experiment 2), or when the internal transitional probabilities of the labels were low in both directions (Experiment 3), infants failed to link the labels to their referents. Word learning appears to be strongly influenced by infants' prior experience with the distribution of sounds that make up words in natural languages. Copyright © 2011 Elsevier Inc. All rights reserved.
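The distinction this abstract draws between forward and backward transitional probabilities can be made concrete with a small sketch. It assumes the common definitions, forward TP = freq(xy) / freq(x) and backward TP = freq(xy) / freq(y); the syllable stream is invented and is not taken from the Italian materials.

```python
from collections import Counter

def forward_backward_tp(syllables, x, y):
    """Return (forward, backward) TP for the adjacent pair x, y.
    forward  = count(xy) / count(x with any successor)
    backward = count(xy) / count(y with any predecessor)"""
    pairs = Counter(zip(syllables, syllables[1:]))
    as_first = Counter(syllables[:-1])
    as_second = Counter(syllables[1:])
    n_xy = pairs[(x, y)]
    forward = n_xy / as_first[x] if as_first[x] else 0.0
    backward = n_xy / as_second[y] if as_second[y] else 0.0
    return forward, backward

# Hypothetical stream: "ca" is always followed by "sa",
# but "sa" is preceded by several different syllables.
stream = ["ca", "sa", "fu", "me", "sa", "bi", "ca", "sa", "lo", "sa"]
print(forward_backward_tp(stream, "ca", "sa"))
```

Here the pair is perfectly predictable in the forward direction but only half as predictable in the backward direction, which is the kind of asymmetry the experiments manipulate.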
Neurophysiological evidence for the interplay of speech segmentation and word-referent mapping during novel word learning.

PubMed

François, Clément; Cunillera, Toni; Garcia, Enara; Laine, Matti; Rodriguez-Fornells, Antoni

2017-04-01

Learning a new language requires the identification of word units from continuous speech (the speech segmentation problem) and mapping them onto conceptual representation (the word to world mapping problem). Recent behavioral studies have revealed that the statistical properties found within and across modalities can serve as cues for both processes. However, segmentation and mapping have been largely studied separately, and thus it remains unclear whether both processes can be accomplished at the same time and if they share common neurophysiological features. To address this question, we recorded EEG of 20 adult participants during both an audio alone speech segmentation task and an audiovisual word-to-picture association task. The participants were tested for both the implicit detection of online mismatches (structural auditory and visual semantic violations) as well as for the explicit recognition of words and word-to-picture associations. The ERP results from the learning phase revealed a delayed learning-related fronto-central negativity (FN400) in the audiovisual condition compared to the audio alone condition. Interestingly, while online structural auditory violations elicited clear MMN/N200 components in the audio alone condition, visual-semantic violations induced meaning-related N400 modulations in the audiovisual condition. The present results support the idea that speech segmentation and meaning mapping can take place in parallel and act in synergy to enhance novel word learning. Copyright © 2016 Elsevier Ltd. All rights reserved.
A Bayesian Framework for Word Segmentation: Exploring the Effects of Context

ERIC Educational Resources Information Center

Goldwater, Sharon; Griffiths, Thomas L.; Johnson, Mark

2009-01-01

Since the experiments of Saffran et al. [Saffran, J., Aslin, R., & Newport, E. (1996). Statistical learning in 8-month-old infants. "Science," 274, 1926-1928], there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words. In this work, we…
When Mommy Comes to the Rescue of Statistics: Infants Combine Top-Down and Bottom-Up Cues to Segment Speech

ERIC Educational Resources Information Center

Mersad, Karima; Nazzi, Thierry

2012-01-01

Transitional Probability (TP) computations are regarded as a powerful learning mechanism that is functional early in development and has been proposed as an initial bootstrapping device for speech segmentation. However, a recent study casts doubt on the robustness of early statistical word-learning. Johnson and Tyler (2010) showed that when…
The Longevity of Statistical Learning: When Infant Memory Decays, Isolated Words Come to the Rescue

ERIC Educational Resources Information Center

Karaman, Ferhat; Hay, Jessica F.

2018-01-01

Research over the past 2 decades has demonstrated that infants are equipped with remarkable computational abilities that allow them to find words in continuous speech. Infants can encode information about the transitional probability (TP) between syllables to segment words from artificial and natural languages. As previous research has tested…
The neural correlates of statistical learning in a word segmentation task: An fMRI study

PubMed Central

Karuza, Elisabeth A.; Newport, Elissa L.; Aslin, Richard N.; Starling, Sarah J.; Tivarus, Madalina E.; Bavelier, Daphne

2013-01-01

Functional magnetic resonance imaging (fMRI) was used to assess neural activation as participants learned to segment continuous streams of speech containing syllable sequences varying in their transitional probabilities. Speech streams were presented in four runs, each followed by a behavioral test to measure the extent of learning over time. Behavioral performance indicated that participants could discriminate statistically coherent sequences (words) from less coherent sequences (partwords). Individual rates of learning, defined as the difference in ratings for words and partwords, were used as predictors of neural activation to ask which brain areas showed activity associated with these measures. Results showed significant activity in the pars opercularis and pars triangularis regions of the left inferior frontal gyrus (LIFG). The relationship between these findings and prior work on the neural basis of statistical learning is discussed, and parallels to the frontal/subcortical network involved in other forms of implicit sequence learning are considered. PMID:23312790
Statistical Learning Is Related to Early Literacy-Related Skills

ERIC Educational Resources Information Center

Spencer, Mercedes; Kaschak, Michael P.; Jones, John L.; Lonigan, Christopher J.

2015-01-01

It has been demonstrated that statistical learning, or the ability to use statistical information to learn the structure of one's environment, plays a role in young children's acquisition of linguistic knowledge. Although most research on statistical learning has focused on language acquisition processes, such as the segmentation of words from…
Information extraction and knowledge graph construction from geoscience literature

NASA Astrophysics Data System (ADS)

Wang, Chengbin; Ma, Xiaogang; Chen, Jianguo; Chen, Jingwen

2018-03-01

Geoscience literature published online is an important part of open data, and brings both challenges and opportunities for data analysis. Compared with studies of numerical geoscience data, there are limited works on information extraction and knowledge discovery from textual geoscience data. This paper presents a workflow and a few empirical case studies for that topic, with a focus on documents written in Chinese. First, we set up a hybrid corpus combining the generic and geology terms from geology dictionaries to train Chinese word segmentation rules of the Conditional Random Fields model. Second, we used the word segmentation rules to parse documents into individual words, and removed the stop-words from the segmentation results to get a corpus constituted of content-words. Third, we used a statistical method to analyze the semantic links between content-words, and we selected the chord and bigram graphs to visualize the content-words and their links as nodes and edges in a knowledge graph, respectively. The resulting graph presents a clear overview of key information in an unstructured document. This study proves the usefulness of the designed workflow, and shows the potential of leveraging natural language processing and knowledge graph technologies for geoscience.
The Neural Basis of Speech Parsing in Children and Adults

ERIC Educational Resources Information Center

McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella

2010-01-01

Word segmentation, detecting word boundaries in continuous speech, is a fundamental aspect of language learning that can occur solely by the computation of statistical and speech cues. Fifty-four children underwent functional magnetic resonance imaging (fMRI) while listening to three streams of concatenated syllables that contained either high…
The longevity of statistical learning: When infant memory decays, isolated words come to the rescue.

PubMed

Karaman, Ferhat; Hay, Jessica F

2018-02-01

Research over the past 2 decades has demonstrated that infants are equipped with remarkable computational abilities that allow them to find words in continuous speech. Infants can encode information about the transitional probability (TP) between syllables to segment words from artificial and natural languages. As previous research has tested infants immediately after familiarization, infants' ability to retain sequential statistics beyond the immediate familiarization context remains unknown. Here, we examine infants' memory for statistically defined words 10 min after familiarization with an Italian corpus. Eight-month-old English-learning infants were familiarized with Italian sentences that contained 4 embedded target words-2 words had high internal TP (HTP, TP = 1.0) and 2 had low TP (LTP, TP = .33)-and were tested on their ability to discriminate HTP from LTP words using the Headturn Preference Procedure. When tested after a 10-min delay, infants failed to discriminate HTP from LTP words, suggesting that memory for statistical information likely decays over even short delays (Experiment 1). Experiments 2-4 were designed to test whether experience with isolated words selectively reinforces memory for statistically defined (i.e., HTP) words. When 8-month-olds were given additional experience with isolated tokens of both HTP and LTP words immediately after familiarization, they looked significantly longer on HTP than LTP test trials 10 min later. Although initial representations of statistically defined words may be fragile, our results suggest that experience with isolated words may reinforce the output of statistical learning by helping infants create more robust memories for words with strong versus weak co-occurrence statistics. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
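As a rough illustration of the transitional-probability statistic referred to in this abstract, the sketch below estimates the forward TP of a syllable pair as count(AB) / count(A) over a syllable stream. The toy syllables and the function name `transitional_probabilities` are invented for illustration; they are not the Italian stimuli used in the study, and the study's own counting procedure may differ.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate forward TP(a -> b) = count(a followed by b) / count(a in non-final position)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# Hypothetical toy stream: "fu" is always followed by "ga" (high TP),
# while "me" is followed by "lo" only one time in three (low TP).
stream = ["fu", "ga", "me", "lo", "fu", "ga", "me", "ti", "fu", "ga", "me", "ra"]
tps = transitional_probabilities(stream)

print(tps[("fu", "ga")])  # high-TP pair
print(tps[("me", "lo")])  # low-TP pair
```

In this toy stream the pair ("fu", "ga") mirrors a high-TP word and ("me", "lo") a low-TP word, loosely analogous to the HTP and LTP contrast described above.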
Co-occurrence statistics as a language-dependent cue for speech segmentation.

PubMed

Saksida, Amanda; Langus, Alan; Nespor, Marina

2017-05-01

To what extent can language acquisition be explained in terms of different associative learning mechanisms? It has been hypothesized that distributional regularities in spoken languages are strong enough to elicit statistical learning about dependencies among speech units. Distributional regularities could be a useful cue for word learning even without rich language-specific knowledge. However, it is not clear how strong and reliable the distributional cues are that humans might use to segment speech. We investigate cross-linguistic viability of different statistical learning strategies by analyzing child-directed speech corpora from nine languages and by modeling possible statistics-based speech segmentations. We show that languages vary as to which statistical segmentation strategies are most successful. The variability of the results can be partially explained by systematic differences between languages, such as rhythmical differences. The results confirm previous findings that different statistical learning strategies are successful in different languages and suggest that infants may have to primarily rely on non-statistical cues when they begin their process of speech segmentation. © 2016 John Wiley & Sons Ltd.
Words analysis of online Chinese news headlines about trending events: a complex network perspective.

PubMed

Li, Huajiao; Fang, Wei; An, Haizhong; Huang, Xuan

2015-01-01

Because the volume of information available online is growing at breakneck speed, keeping up with meaning and information communicated by the media and netizens is a new challenge both for scholars and for companies who must address public relations crises. Most current theories and tools are directed at identifying one website or one piece of online news and do not attempt to develop a rapid understanding of all websites and all news covering one topic. This paper represents an effort to integrate statistics, word segmentation, complex networks and visualization to analyze headlines' keywords and words relationships in online Chinese news using two samples: the 2011 Bohai Bay oil spill and the 2010 Gulf of Mexico oil spill. We gathered all the news headlines concerning the two trending events in the search results from Baidu, the most popular Chinese search engine. We used Simple Chinese Word Segmentation to segment all the headlines into words and then took words as nodes and considered adjacent relations as edges to construct word networks both using the whole sample and at the monthly level. Finally, we develop an integrated mechanism to analyze the features of words' networks based on news headlines that can account for all the keywords in the news about a particular event and therefore track the evolution of news deeply and rapidly.
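The network construction described in this abstract (words as nodes, adjacency within a headline as edges) can be sketched roughly as follows. The sketch assumes the headlines have already been segmented into words; the segmenter itself is not reproduced, the sample tokens are hypothetical English glosses, and treating edges as undirected and weighted by frequency is an assumption rather than a detail given in the abstract.

```python
from collections import Counter

def build_word_network(segmented_headlines):
    """Return undirected edge weights: how often two words appear next to each other."""
    edges = Counter()
    for words in segmented_headlines:
        for left, right in zip(words, words[1:]):
            if left != right:  # skip self-loops from immediate repetitions
                edges[frozenset((left, right))] += 1
    return edges

def weighted_degree(edges):
    """Sum the weights of the edges attached to each node."""
    degree = Counter()
    for pair, weight in edges.items():
        for word in pair:
            degree[word] += weight
    return degree

# Hypothetical, already-segmented headlines (English glosses stand in for Chinese words).
headlines = [
    ["Bohai", "oil", "spill", "investigation"],
    ["oil", "spill", "cleanup", "continues"],
]
edges = build_word_network(headlines)
degree = weighted_degree(edges)
print(degree.most_common(3))
```

Running the same functions on monthly subsets of headlines would give one network per month, which is one simple way to follow how the vocabulary around an event shifts over time.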
Do Chinese Readers Follow the National Standard Rules for Word Segmentation during Reading?

PubMed Central

Liu, Ping-Ping; Li, Wei-Jun; Lin, Nan; Li, Xing-Shan

2013-01-01

We conducted a preliminary study to examine whether Chinese readers’ spontaneous word segmentation processing is consistent with the national standard rules of word segmentation based on the Contemporary Chinese language word segmentation specification for information processing (CCLWSSIP). Participants were asked to segment Chinese sentences into individual words according to their prior knowledge of words. The results showed that Chinese readers did not follow the segmentation rules of the CCLWSSIP, and their word segmentation processing was influenced by the syntactic categories of consecutive words. In many cases, the participants did not consider the auxiliary words, adverbs, adjectives, nouns, verbs, numerals and quantifiers as single word units. Generally, Chinese readers tended to combine function words with content words to form single word units, indicating they were inclined to chunk single words into large information units during word segmentation. Additionally, the “overextension of monosyllable words” hypothesis was tested and it might need to be corrected to some degree, implying that word length has an implicit influence on Chinese readers’ segmentation processing. Implications of these results for models of word recognition and eye movement control are discussed. PMID:23408981
Native Language Influence in the Segmentation of a Novel Language

ERIC Educational Resources Information Center

Ordin, Mikhail; Nespor, Marina

2016-01-01

A major problem in second language acquisition (SLA) is the segmentation of fluent speech in the target language, i.e., detecting the boundaries of phonological constituents like words and phrases in the speech stream. To this end, among a variety of cues, people extensively use prosody and statistical regularities. We examined the role of pitch,…
Statistical learning of an auditory sequence and reorganization of acquired knowledge: A time course of word segmentation and ordering.

PubMed

Daikoku, Tatsuya; Yatomi, Yutaka; Yumoto, Masato

2017-01-27

Previous neural studies have supported the hypothesis that statistical learning mechanisms are used broadly across different domains such as language and music. However, these studies have only investigated a single aspect of statistical learning at a time, such as recognizing word boundaries or learning word order patterns. In this study, we neurally investigated how the two levels of statistical learning for recognizing word boundaries and word ordering could be reflected in neuromagnetic responses and how acquired statistical knowledge is reorganised when the syntactic rules are revised. Neuromagnetic responses to the Japanese-vowel sequence (a, e, i, o, and u), presented every 0.45 s, were recorded from 14 right-handed Japanese participants. The vowel order was constrained by a Markov stochastic model such that five nonsense words (aue, eao, iea, oiu, and uoi) were chained with an either-or rule: the probability of the forthcoming word was statistically defined (80% for one word; 20% for the other word) by the most recent two words. All of the word transition probabilities (80% and 20%) were switched in the middle of the sequence. In the first and second quarters of the sequence, the neuromagnetic responses to the words that appeared with higher transitional probability were significantly reduced compared with those that appeared with a lower transitional probability. After switching the word transition probabilities, the response reduction was replicated in the last quarter of the sequence. The responses to the final vowels in the words were significantly reduced compared with those to the initial vowels in the last quarter of the sequence. The results suggest that both within-word and between-word statistical learning are reflected in neural responses. The present study supports the hypothesis that listeners learn larger structures such as phrases first, and they subsequently extract smaller structures, such as words, from the learned phrases. The present study provides the first neurophysiological evidence that the correction of statistical knowledge requires more time than the acquisition of new statistical knowledge. Copyright © 2016 Elsevier Ltd. All rights reserved.
Songs as an aid for language acquisition.

PubMed

Schön, Daniele; Boyer, Maud; Moreno, Sylvain; Besson, Mireille; Peretz, Isabelle; Kolinsky, Régine

2008-02-01

In previous research, Saffran and colleagues [Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928; Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606-621.] have shown that adults and infants can use the statistical properties of syllable sequences to extract words from continuous speech. They also showed that a similar learning mechanism operates with musical stimuli [Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27-52.]. In this work we combined linguistic and musical information and we compared language learning based on speech sequences to language learning based on sung sequences. We hypothesized that, compared to speech sequences, a consistent mapping of linguistic and musical information would enhance learning. Results confirmed the hypothesis, showing a strong learning facilitation of song compared to speech. Most importantly, the present results show that learning a new language, especially in the first learning phase wherein one needs to segment new words, may largely benefit from the motivational and structuring properties of music in song.
Words Analysis of Online Chinese News Headlines about Trending Events: A Complex Network Perspective

PubMed Central

Li, Huajiao; Fang, Wei; An, Haizhong; Huang, Xuan

2015-01-01

Because the volume of information available online is growing at breakneck speed, keeping up with meaning and information communicated by the media and netizens is a new challenge both for scholars and for companies who must address public relations crises. Most current theories and tools are directed at identifying one website or one piece of online news and do not attempt to develop a rapid understanding of all websites and all news covering one topic. This paper represents an effort to integrate statistics, word segmentation, complex networks and visualization to analyze headlines’ keywords and words relationships in online Chinese news using two samples: the 2011 Bohai Bay oil spill and the 2010 Gulf of Mexico oil spill. We gathered all the news headlines concerning the two trending events in the search results from Baidu, the most popular Chinese search engine. We used Simple Chinese Word Segmentation to segment all the headlines into words and then took words as nodes and considered adjacent relations as edges to construct word networks both using the whole sample and at the monthly level. Finally, we develop an integrated mechanism to analyze the features of words’ networks based on news headlines that can account for all the keywords in the news about a particular event and therefore track the evolution of news deeply and rapidly. PMID:25807376

Cloze, Discourse, and Approximations to English.

ERIC Educational Resources Information Center

Oller, John W., Jr.

Five orders of approximation to normal English prose were constructed; 5th, 10th, 25th, 50th, and 100th plus. Five cloze tests were then constructed by inserting blanks for deleted words in 5 word segments (5th order), 10 word segments (10th), 25 word segments (25th), 50 word segments (50th), and 100 word segments of five different passages of…
Lexical and sublexical units in speech perception.

PubMed

Giroux, Ibrahima; Rey, Arnaud

2009-03-01

Saffran, Newport, and Aslin (1996a) found that human infants are sensitive to statistical regularities corresponding to lexical units when hearing an artificial spoken language. Two sorts of segmentation strategies have been proposed to account for this early word-segmentation ability: bracketing strategies, in which infants are assumed to insert boundaries into continuous speech, and clustering strategies, in which infants are assumed to group certain speech sequences together into units (Swingley, 2005). In the present study, we test the predictions of two computational models instantiating each of these strategies (i.e., Serial Recurrent Networks: Elman, 1990; and Parser: Perruchet & Vinter, 1998) in an experiment where we compare the lexical and sublexical recognition performance of adults after hearing 2 or 10 min of an artificial spoken language. The results are consistent with Parser's predictions and the clustering approach, showing that performance on words is better than performance on part-words only after 10 min. This result suggests that word segmentation abilities are not merely due to stronger associations between sublexical units but to the emergence of stronger lexical representations during the development of speech perception processes. Copyright © 2009, Cognitive Science Society, Inc.
The Secret Is in the Sound

PubMed Central

Christiansen, Morten H.; Onnis, Luca; Hockema, Stephen A.

2009-01-01

When learning language young children are faced with many seemingly formidable challenges, including discovering words embedded in a continuous stream of sounds and determining what role these words play in syntactic constructions. We suggest that knowledge of phoneme distributions may play a crucial part in helping children segment words and determine their lexical category, and propose an integrated model of how children might go from unsegmented speech to lexical categories. We corroborated this theoretical model using a two-stage computational analysis of a large corpus of English child-directed speech. First, we used transition probabilities between phonemes to find words in unsegmented speech. Second, we used distributional information about word edges—the beginning and ending phonemes of words—to predict whether the segmented words from the first stage were nouns, verbs, or something else. The results indicate that discovering lexical units and their associated syntactic category in child-directed speech is possible by attending to the statistics of single phoneme transitions and word-initial and final phonemes. Thus, we suggest that a core computational principle in language acquisition is that the same source of information is used to learn about different aspects of linguistic structure. PMID:19371361
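One common way to turn transition probabilities into word boundaries is to posit a boundary wherever the probability between two adjacent units falls below a threshold. The sketch below illustrates that idea only; the abstract does not state the exact boundary rule used, so the thresholding, the value 0.75, the function name `segment_by_tp`, and the toy corpus (letters standing in for phonemes) are all assumptions.

```python
from collections import Counter

def segment_by_tp(utterances, threshold):
    """Split each unsegmented utterance where the forward TP between adjacent symbols < threshold."""
    pair_counts = Counter()
    first_counts = Counter()
    for u in utterances:
        for a, b in zip(u, u[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1

    segmented = []
    for u in utterances:
        if not u:  # nothing to segment in an empty utterance
            segmented.append([])
            continue
        words, current = [], u[0]
        for a, b in zip(u, u[1:]):
            tp = pair_counts[(a, b)] / first_counts[a]
            if tp < threshold:
                words.append(current)  # low TP: close the current word
                current = b
            else:
                current += b           # high TP: keep extending the word
        words.append(current)
        segmented.append(words)
    return segmented

# Hypothetical corpus: three "words" concatenated in every ordered pairing, with no spaces.
corpus = ["dogcat", "catpig", "pigdog", "catdog", "dogpig", "pigcat"]
for utterance, words in zip(corpus, segment_by_tp(corpus, threshold=0.75)):
    print(utterance, "->", words)
```

In this toy corpus, within-word transitions always recur while cross-word transitions vary, so the dips in TP line up with the word edges; real child-directed speech is far noisier, and a fixed threshold is only one of several possible boundary criteria.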
Exploiting multiple sources of information in learning an artificial language: human data and modeling.

PubMed

Perruchet, Pierre; Tillmann, Barbara

2010-03-01

This study investigates the joint influences of three factors on the discovery of new word-like units in a continuous artificial speech stream: the statistical structure of the ongoing input, the initial word-likeness of parts of the speech flow, and the contextual information provided by the earlier emergence of other word-like units. Results of an experiment conducted with adult participants show that these sources of information have strong and interactive influences on word discovery. The authors then examine the ability of different models of word segmentation to account for these results. PARSER (Perruchet & Vinter, 1998) is compared to the view that word segmentation relies on the exploitation of transitional probabilities between successive syllables, and with the models based on the Minimum Description Length principle, such as INCDROP. The authors submit arguments suggesting that PARSER has the advantage of accounting for the whole pattern of data without ad-hoc modifications, while relying exclusively on general-purpose learning principles. This study strengthens the growing notion that nonspecific cognitive processes, mainly based on associative learning and memory principles, are able to account for a larger part of early language acquisition than previously assumed. Copyright © 2009 Cognitive Science Society, Inc.
Reduplication Facilitates Early Word Segmentation

ERIC Educational Resources Information Center

Ota, Mitsuhiko; Skarabela, Barbora

2018-01-01

This study explores the possibility that early word segmentation is aided by infants' tendency to segment words with repeated syllables ("reduplication"). Twenty-four nine-month-olds were familiarized with passages containing one novel reduplicated word and one novel non-reduplicated word. Their central fixation times in response to…
Segmentation of Vowel-Initial Words Is Facilitated by Function Words

ERIC Educational Resources Information Center

Kim, Yun Jung; Sundara, Megha

2015-01-01

Within the first year of life, infants learn to segment words from fluent speech. Previous research has shown that infants at 0;7·5 can segment consonant-initial words, yet the ability to segment vowel-initial words does not emerge until the age of 1;1-1;4 (0;11 in some restricted cases). In five experiments, we show that infants aged 0;11 but not…
Segmenting Words from Fluent Speech during Infancy--Challenges and Opportunities in a Bilingual Context

ERIC Educational Resources Information Center

Polka, Linda; Orena, Adriel John; Sundara, Megha; Worrall, Jennifer

2017-01-01

Previous research shows that word segmentation is a language-specific skill. Here, we tested segmentation of bi-syllabic words in two languages (French; English) within the same infants in a single test session. In Experiment 1, monolingual 8-month-olds (French; English) segmented bi-syllabic words in their native language, but not in an…
Speculation detection for Chinese clinical notes: Impacts of word segmentation and embedding models.

PubMed

Zhang, Shaodian; Kang, Tian; Zhang, Xingting; Wen, Dong; Elhadad, Noémie; Lei, Jianbo

2016-04-01

Speculations represent uncertainty toward certain facts. In clinical texts, identifying speculations is a critical step of natural language processing (NLP). While it is a nontrivial task in many languages, detecting speculations in Chinese clinical notes can be particularly challenging because word segmentation may be necessary as an upstream operation. The objective of this paper is to construct a state-of-the-art speculation detection system for Chinese clinical notes and to investigate whether embedding features and word segmentations are worth exploiting toward this overall task. We propose a sequence labeling based system for speculation detection, which relies on features from bag of characters, bag of words, character embedding, and word embedding. We experiment on a novel dataset of 36,828 clinical notes with 5103 gold-standard speculation annotations on 2000 notes, and compare the systems in which word embeddings are calculated based on word segmentations given by general and by domain specific segmenters respectively. Our systems are able to reach performance as high as 92.2% measured by F score. We demonstrate that word segmentation is critical to produce high quality word embedding to facilitate downstream information extraction applications, and suggest that a domain dependent word segmenter can be vital to such a clinical NLP task in Chinese language. Copyright © 2016 Elsevier Inc. All rights reserved.
Research and Implementation of Tibetan Word Segmentation Based on Syllable Methods

NASA Astrophysics Data System (ADS)

Jiang, Jing; Li, Yachao; Jiang, Tao; Yu, Hongzhi

2018-03-01

Tibetan word segmentation (TWS) is an important problem in Tibetan information processing, and abbreviated word recognition is one of the key and most difficult problems in TWS. Most of the existing methods of Tibetan abbreviated word recognition are rule-based approaches, which need vocabulary support. In this paper, we propose a method based on a sequence tagging model for abbreviated word recognition, and then implement it in TWS systems with sequence labeling models. The experimental results show that our abbreviated word recognition method is fast and effective and can be combined easily with the segmentation model. This significantly improves the performance of Tibetan word segmentation.
ASM Based Synthesis of Handwritten Arabic Text Pages

PubMed Central

Al-Hamadi, Ayoub; Elzobi, Moftah; El-etriby, Sherif; Ghoneim, Ahmed

2015-01-01

Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, generating such databases is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods, each with individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever sufficient natural ground-truthed data are not available. PMID:26295059
ASM Based Synthesis of Handwritten Arabic Text Pages.

PubMed

Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif; Ghoneim, Ahmed

2015-01-01

Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, generating such databases is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods, each with individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever sufficient natural ground-truthed data are not available.
Development of Infants' Segmentation of Words from Native Speech: A Meta-Analytic Approach

ERIC Educational Resources Information Center

Bergmann, Christina; Cristia, Alejandrina

2016-01-01

Infants start learning words, the building blocks of language, at least by 6 months. To do so, they must be able to extract the phonological form of words from running speech. A rich literature has investigated this process, termed word segmentation. We addressed the fundamental question of how infants of different ages segment words from their…
Sixteen-Month-Old Infants' Segment Words from Infant- and Adult-Directed Speech

ERIC Educational Resources Information Center

Mani, Nivedita; Pätzold, Wiebke

2016-01-01

One of the first challenges facing the young language learner is the task of segmenting words from a natural language speech stream, without prior knowledge of how these words sound. Studies with younger children find that children find it easier to segment words from fluent speech when the words are presented in infant-directed speech, i.e., the…
Phonotactics, Neighborhood Activation, and Lexical Access for Spoken Words

PubMed Central

Vitevitch, Michael S.; Luce, Paul A.; Pisoni, David B.; Auer, Edward T.

2012-01-01

Probabilistic phonotactics refers to the relative frequencies of segments and sequences of segments in spoken words. Neighborhood density refers to the number of words that are phonologically similar to a given word. Despite a positive correlation between phonotactic probability and neighborhood density, nonsense words with high probability segments and sequences are responded to more quickly than nonsense words with low probability segments and sequences, whereas real words occurring in dense similarity neighborhoods are responded to more slowly than real words occurring in sparse similarity neighborhoods. This contradiction may be resolved by hypothesizing that effects of probabilistic phonotactics have a sublexical focus and that effects of similarity neighborhood density have a lexical focus. The implications of this hypothesis for models of spoken word recognition are discussed. PMID:10433774
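Neighborhood density is often operationalized as the number of lexicon entries that differ from a target word by a single phoneme substitution, deletion, or addition. The sketch below uses that one-edit rule as an assumption; the abstract itself says only "phonologically similar", and the toy lexicon, its phoneme symbols, and the function names are hypothetical.

```python
def is_neighbor(a, b):
    """True if phoneme tuples a and b differ by exactly one substitution, deletion, or addition."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Lengths differ by one: dropping one phoneme from the longer form must give the shorter.
    shorter, longer = sorted((a, b), key=len)
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))

def neighborhood_density(word, lexicon):
    """Count the lexicon entries that are one-edit neighbors of `word`."""
    return sum(is_neighbor(word, other) for other in lexicon)

# Hypothetical toy lexicon, each word written as a tuple of phoneme symbols.
lexicon = [
    ("k", "ae", "t"),       # cat
    ("b", "ae", "t"),       # bat  (substitution)
    ("k", "ae", "p"),       # cap  (substitution)
    ("k", "ae", "t", "s"),  # cats (addition)
    ("ae", "t"),            # at   (deletion)
    ("d", "o", "g"),        # dog  (not a neighbor)
]
density = neighborhood_density(("k", "ae", "t"), lexicon)
print(density)
```

Phonotactic probability, by contrast, would be computed from segment and sequence frequencies rather than from whole-word similarity, which is why the two measures can dissociate as the abstract describes.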
Pitch enhancement facilitates word learning across visual contexts

PubMed Central

Filippi, Piera; Gingras, Bruno; Fitch, W. Tecumseh

2014-01-01

This study investigates word-learning using a new experimental paradigm that integrates three processes: (a) extracting a word out of a continuous sound sequence, (b) inferring its referential meanings in context, (c) mapping the segmented word onto its broader intended referent, such as other objects of the same semantic category, and to novel utterances. Previous work has examined the role of statistical learning and/or of prosody in each of these processes separately. Here, we combine these strands of investigation into a single experimental approach, in which participants viewed a photograph belonging to one of three semantic categories while hearing a complex, five-word utterance containing a target word. Six between-subjects conditions were tested with 20 adult participants each. In condition 1, the only cue to word-meaning mapping was the co-occurrence of word and referents. This statistical cue was present in all conditions. In condition 2, the target word was sounded at a higher pitch. In condition 3, random words were sounded at a higher pitch, creating an inconsistent cue. In condition 4, the duration of the target word was lengthened. In conditions 5 and 6, an extraneous acoustic cue and a visual cue were associated with the target word, respectively. Performance in this word-learning task was significantly higher than that observed with simple co-occurrence only when pitch prominence consistently marked the target word. We discuss implications for the pragmatic value of pitch marking as well as the relevance of our findings to language acquisition and language evolution. PMID:25566144
The Edge Factor in Early Word Segmentation: Utterance-Level Prosody Enables Word Form Extraction by 6-Month-Olds

PubMed Central

Johnson, Elizabeth K.; Seidl, Amanda; Tyler, Michael D.

2014-01-01

Past research has shown that English learners begin segmenting words from speech by 7.5 months of age. However, more recent research has begun to show that, in some situations, infants may exhibit rudimentary segmentation capabilities at an earlier age. Here, we report on four perceptual experiments and a corpus analysis further investigating the initial emergence of segmentation capabilities. In Experiments 1 and 2, 6-month-olds were familiarized with passages containing target words located either utterance medially or at utterance edges. Only those infants familiarized with passages containing target words aligned with utterance edges exhibited evidence of segmentation. In Experiments 3 and 4, 6-month-olds recognized familiarized words when they were presented in a new acoustically distinct voice (male rather than female), but not when they were presented in a phonologically altered manner (missing the initial segment). Finally, we report corpus analyses examining how often different word types occur at utterance boundaries in different registers. Our findings suggest that edge-aligned words likely play a key role in infants’ early segmentation attempts, and also converge with recent reports suggesting that 6-month-olds have already started building a rudimentary lexicon. PMID:24421892
How African American English-Speaking First Graders Segment and Rhyme Words and Nonwords With Final Consonant Clusters.

PubMed

Shollenbarger, Amy J; Robinson, Gregory C; Taran, Valentina; Choi, Seo-Eun

2017-10-05

This study explored how typically developing 1st grade African American English (AAE) speakers differ from mainstream American English (MAE) speakers in the completion of 2 common phonological awareness tasks (rhyming and phoneme segmentation) when the stimulus items were consonant-vowel-consonant-consonant (CVCC) words and nonwords. Forty-nine 1st graders met criteria for 2 dialect groups: AAE and MAE. Three conditions were tested in each rhyme and segmentation task: Real Words No Model, Real Words With a Model, and Nonwords With a Model. The AAE group had significantly more responses that rhymed CVCC words with consonant-vowel-consonant words and segmented CVCC words as consonant-vowel-consonant than the MAE group across all experimental conditions. In the rhyming task, the presence of a model in the real word condition elicited more reduced final cluster responses for both groups. In the segmentation task, the MAE group was at ceiling, so only the AAE group changed across the different stimulus presentations and reduced the final cluster less often when given a model. Rhyming and phoneme segmentation performance can be influenced by a child's dialect when CVCC words are used.
2.5-year-olds use cross-situational consistency to learn verbs under referential uncertainty.

PubMed

Scott, Rose M; Fisher, Cynthia

2012-02-01

Recent evidence shows that children can use cross-situational statistics to learn new object labels under referential ambiguity (e.g., Smith & Yu, 2008). Such evidence has been interpreted as support for proposals that statistical information about word-referent co-occurrence plays a powerful role in word learning. But object labels represent only a fraction of the vocabulary children acquire, and arguably represent the simplest case of word learning based on observations of world scenes. Here we extended the study of cross-situational word learning to a new segment of the vocabulary, action verbs, to permit a stronger test of the role of statistical information in word learning. In two experiments, on each trial 2.5-year-olds encountered two novel intransitive (e.g., "She's pimming!"; Experiment 1) or transitive verbs (e.g., "She's pimming her toy!"; Experiment 2) while viewing two action events. The consistency with which each verb accompanied each action provided the only source of information about the intended referent of each verb. The 2.5-year-olds used cross-situational consistency in verb learning, but also showed significant limits on their ability to do so as the sentences and scenes became slightly more complex. These findings help to define the role of cross-situational observation in word learning. Copyright © 2011 Elsevier B.V. All rights reserved.
2.5-year-olds use cross-situational consistency to learn verbs under referential uncertainty

PubMed Central

Scott, Rose M.; Fisher, Cynthia

2011-01-01

Recent evidence shows that children can use cross-situational statistics to learn new object labels under referential ambiguity (e.g., Smith & Yu, 2008). Such evidence has been interpreted as support for proposals that statistical information about word-referent co-occurrence plays a powerful role in word learning. But object labels represent only a fraction of the vocabulary children acquire, and arguably represent the simplest case of word learning based on observations of world scenes. Here we extended the study of cross-situational word learning to a new segment of the vocabulary, action verbs, to permit a stronger test of the role of statistical information in word learning. In two experiments, on each trial 2.5-year-olds encountered two novel intransitive (e.g., “She’s pimming!”; Experiment 1) or transitive verbs (e.g., “She’s pimming her toy!”; Experiment 2) while viewing two action events. The consistency with which each verb accompanied each action provided the only source of information about the intended referent of each verb. The 2.5-year-olds used cross-situational consistency in verb learning, but also showed significant limits on their ability to do so as the sentences and scenes became slightly more complex. These findings help to define the role of cross-situational observation in word learning. PMID:22104489
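The cross-situational idea described in this abstract, in which the only cue to a verb's referent is how consistently it accompanies each action across trials, can be illustrated with a simple co-occurrence count. The following is a minimal sketch, not the authors' procedure or model: the verbs "nading" and "tazzing", the action labels, and the trial structure are hypothetical, and choosing the most frequent co-occurring referent is an assumed decision rule.

```python
from collections import defaultdict

def cross_situational_counts(trials):
    """Count how often each word co-occurs with each candidate referent."""
    counts = defaultdict(lambda: defaultdict(int))
    for words, referents in trials:
        for word in words:
            for referent in referents:
                counts[word][referent] += 1
    return counts

def best_referent(counts, word):
    """Return the referent that co-occurred most often with the word."""
    candidates = counts[word]
    return max(candidates, key=candidates.get)

# Hypothetical trials: two novel verbs are heard while two actions are shown.
# No single trial reveals which verb goes with which action.
trials = [
    (["pimming", "nading"], ["jump", "spin"]),
    (["pimming", "tazzing"], ["jump", "wave"]),
    (["nading", "tazzing"], ["spin", "wave"]),
]
counts = cross_situational_counts(trials)
for verb in ("pimming", "nading", "tazzing"):
    print(verb, "->", best_referent(counts, verb))
```

Each trial on its own is ambiguous; the mapping only becomes recoverable once counts are pooled across trials.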
Songs as an Aid for Language Acquisition

ERIC Educational Resources Information Center

Schon, Daniele; Boyer, Maud; Moreno, Sylvain; Besson, Mireille; Peretz, Isabelle; Kolinsky, Regine

2008-01-01

In previous research, Saffran and colleagues [Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928; Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. "Journal of Memory and Language," 35, 606-621.] have shown that adults…

Segmenting words from natural speech: subsegmental variation in segmental cues.

PubMed

Rytting, C Anton; Brew, Chris; Fosler-Lussier, Eric

2010-06-01

Most computational models of word segmentation are trained and tested on transcripts of speech, rather than the speech itself, and assume that speech is converted into a sequence of symbols prior to word segmentation. We present a way of representing speech corpora that avoids this assumption, and preserves acoustic variation present in speech. We use this new representation to re-evaluate a key computational model of word segmentation. One finding is that high levels of phonetic variability degrade the model's performance. While robustness to phonetic variability may be intrinsically valuable, this finding needs to be complemented by parallel studies of the actual abilities of children to segment phonetically variable speech.
The metamorphosis of the statistical segmentation output: lexicalization during artificial language learning.

PubMed

Fernandes, Tânia; Kolinsky, Régine; Ventura, Paulo

2009-09-01

This study combined artificial language learning (ALL) with conventional experimental techniques to test whether statistical speech segmentation outputs are integrated into adult listeners' mental lexicon. Lexicalization was assessed through inhibitory effects of novel neighbors (created by the parsing process) on auditory lexical decisions to real words. Both immediately after familiarization and post-one week, ALL outputs were lexicalized only when the cues available during familiarization (transitional probabilities and wordlikeness) suggested the same parsing (Experiments 1 and 3). No lexicalization effect occurred with incongruent cues (Experiments 2 and 4). Yet, ALL differed from chance, suggesting a dissociation between item knowledge and lexicalization. Similarly contrasted results were found when frequency of occurrence of the stimuli was equated during familiarization (Experiments 3 and 4). Our findings thus indicate that ALL outputs may be lexicalized as far as the segmentation cues are congruent, and that this process cannot be accounted for by raw frequency.
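Transitional probabilities, one of the two familiarization cues named in this abstract, can be made concrete with a short computation. The sketch below is illustrative only: the three trisyllabic words, the word order, and the 0.9 boundary threshold are hypothetical choices, not the stimuli or analysis used in the study.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """TP(a -> b) = count(a immediately followed by b) / count(a followed by anything)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

def segment(syllables, tps, threshold):
    """Place a word boundary wherever the TP between adjacent syllables is below threshold."""
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Hypothetical three-word artificial language, concatenated without pauses.
lexicon = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["bi", "da", "ku"]}
order = "ABCACBABCBAC"  # no word is immediately repeated
stream = [syllable for word in order for syllable in lexicon[word]]

tps = transitional_probabilities(stream)
print(segment(stream, tps, threshold=0.9))
```

Within a word each syllable fully predicts the next (TP = 1.0), whereas across word boundaries the TP drops because several words can follow, so the low-TP positions mark the boundaries.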
Social interaction facilitates word learning in preverbal infants: Word-object mapping and word segmentation.

PubMed

Hakuno, Yoko; Omori, Takahide; Yamamoto, Jun-Ichi; Minagawa, Yasuyo

2017-08-01

In natural settings, infants learn spoken language with the aid of a caregiver who explicitly provides social signals. Although previous studies have demonstrated that young infants are sensitive to these signals that facilitate language development, the impact of real-life interactions on early word segmentation and word-object mapping remains elusive. We tested whether infants aged 5-6 months and 9-10 months could segment a word from continuous speech and acquire a word-object relation in an ecologically valid setting. In Experiment 1, infants were exposed to a live tutor, while in Experiment 2, another group of infants were exposed to a televised tutor. Results indicate that both younger and older infants were capable of segmenting a word and learning a word-object association only when the stimuli were derived from a live tutor in a natural manner, suggesting that real-life interaction enhances the learning of spoken words in preverbal infants. Copyright © 2017 Elsevier Inc. All rights reserved.
Orthographic Transparency Enhances Morphological Segmentation in Children Reading Hebrew Words.

PubMed

Haddad, Laurice; Weiss, Yael; Katzir, Tami; Bitan, Tali

2017-01-01

Morphological processing of derived words develops simultaneously with reading acquisition. However, the reader's engagement in morphological segmentation may depend on the language morphological richness and orthographic transparency, and the readers' reading skills. The current study tested the common idea that morphological segmentation is enhanced in non-transparent orthographies to compensate for the absence of phonological information. Hebrew's rich morphology and the dual version of the Hebrew script (with and without diacritic marks) provides an opportunity to study the interaction of orthographic transparency and morphological segmentation on the development of reading skills in a within-language design. Hebrew speaking 2nd (N = 27) and 5th (N = 29) grade children read aloud 96 noun words. Half of the words were simple mono-morphemic words and half were bi-morphemic derivations composed of a productive root and a morphemic pattern. In each list half of the words were presented in the transparent version of the script (with diacritic marks), and half in the non-transparent version (without diacritic marks). Our results show that in both groups, derived bi-morphemic words were identified more accurately than mono-morphemic words, but only for the transparent, pointed, script. For the un-pointed script the reverse was found, namely, that bi-morphemic words were read less accurately than mono-morphemic words, especially in second grade. Second grade children also read mono-morphemic words faster than bi-morphemic words. Finally, correlations with a standardized measure of morphological awareness were found only for second grade children, and only in bi-morphemic words. These results, showing greater morphological effects in second grade compared to fifth grade children suggest that for children raised in a language with a rich morphology, common and easily segmented morphemic units may be more beneficial for younger compared to older readers. 
Moreover, in contrast to the common hypothesis, our results show that morphemic segmentation does not compensate for the missing phonological information in a non-transparent orthography, but rather that morphological segmentation is most beneficial in the highly transparent script. These results are consistent with the idea that morphological and phonological segmentation processes occur simultaneously and do not constitute alternative pathways to visual word recognition.
Orthographic Transparency Enhances Morphological Segmentation in Children Reading Hebrew Words

PubMed Central

Haddad, Laurice; Weiss, Yael; Katzir, Tami; Bitan, Tali

2018-01-01

Morphological processing of derived words develops simultaneously with reading acquisition. However, the reader’s engagement in morphological segmentation may depend on the language morphological richness and orthographic transparency, and the readers’ reading skills. The current study tested the common idea that morphological segmentation is enhanced in non-transparent orthographies to compensate for the absence of phonological information. Hebrew’s rich morphology and the dual version of the Hebrew script (with and without diacritic marks) provides an opportunity to study the interaction of orthographic transparency and morphological segmentation on the development of reading skills in a within-language design. Hebrew speaking 2nd (N = 27) and 5th (N = 29) grade children read aloud 96 noun words. Half of the words were simple mono-morphemic words and half were bi-morphemic derivations composed of a productive root and a morphemic pattern. In each list half of the words were presented in the transparent version of the script (with diacritic marks), and half in the non-transparent version (without diacritic marks). Our results show that in both groups, derived bi-morphemic words were identified more accurately than mono-morphemic words, but only for the transparent, pointed, script. For the un-pointed script the reverse was found, namely, that bi-morphemic words were read less accurately than mono-morphemic words, especially in second grade. Second grade children also read mono-morphemic words faster than bi-morphemic words. Finally, correlations with a standardized measure of morphological awareness were found only for second grade children, and only in bi-morphemic words. These results, showing greater morphological effects in second grade compared to fifth grade children suggest that for children raised in a language with a rich morphology, common and easily segmented morphemic units may be more beneficial for younger compared to older readers. 
Moreover, in contrast to the common hypothesis, our results show that morphemic segmentation does not compensate for the missing phonological information in a non-transparent orthography, but rather that morphological segmentation is most beneficial in the highly transparent script. These results are consistent with the idea that morphological and phonological segmentation processes occur simultaneously and do not constitute alternative pathways to visual word recognition. PMID:29403413
Time course of syllabic and sub-syllabic processing in Mandarin word production: Evidence from the picture-word interference paradigm.

PubMed

Wang, Jie; Wong, Andus Wing-Kuen; Chen, Hsuan-Chih

2017-06-05

The time course of phonological encoding in Mandarin monosyllabic word production was investigated by using the picture-word interference paradigm. Participants were asked to name pictures in Mandarin while visual distractor words were presented before, at, or after picture onset (i.e., stimulus-onset asynchrony/SOA = -100, 0, or +100 ms, respectively). Compared with the unrelated control, the distractors sharing atonal syllables with the picture names significantly facilitated the naming responses at -100- and 0-ms SOAs. In addition, the facilitation effect of sharing word-initial segments only appeared at 0-ms SOA, and null effects were found for sharing word-final segments. These results indicate that both syllables and subsyllabic units play important roles in Mandarin spoken word production and more critically that syllabic processing precedes subsyllabic processing. The current results lend strong support to the proximate units principle (O'Seaghdha, Chen, & Chen, 2010), which holds that the phonological structure of spoken word production is language-specific and that atonal syllables are the proximate phonological units in Mandarin Chinese. On the other hand, the significance of word-initial segments over word-final segments suggests that serial processing of segmental information seems to be universal across Germanic languages and Chinese, which remains to be verified in future studies.
Unconventional Word Segmentation in Emerging Bilingual Students' Writing: A Longitudinal Analysis

ERIC Educational Resources Information Center

Sparrow, Wendy

2014-01-01

This study explores cross-language and longitudinal patterns in unconventional word segmentation in 25 emerging bilingual students' (Spanish/English) writing from first through third grade. Spanish and English writing samples were collected annually and analyzed for two basic types of unconventional word segmentation: hyposegmentation, in…
Word segmentation by alternating colors facilitates eye guidance in Chinese reading.

PubMed

Zhou, Wei; Wang, Aiping; Shu, Hua; Kliegl, Reinhold; Yan, Ming

2018-02-12

During sentence reading, low spatial frequency information afforded by spaces between words is the primary factor for eye guidance in spaced writing systems, whereas saccade generation for unspaced writing systems is less clear and under debate. In the present study, we investigated whether word-boundary information, provided by alternating colors (consistent or inconsistent with word-boundary information) influences saccade-target selection in Chinese. In Experiment 1, as compared to a baseline (i.e., uniform color) condition, word segmentation with alternating color shifted fixation location towards the center of words. In contrast, incorrect word segmentation shifted fixation location towards the beginning of words. In Experiment 2, we used a gaze-contingent paradigm to restrict the color manipulation only to the upcoming parafoveal words and replicated the results, including fixation location effects, as observed in Experiment 1. These results indicate that Chinese readers are capable of making use of parafoveal word-boundary knowledge for saccade generation, even if such information is unfamiliar to them. The present study provides novel support for the hypothesis that word segmentation is involved in the decision about where to fixate next during Chinese reading.
Arabic handwritten: pre-processing and segmentation

NASA Astrophysics Data System (ADS)

Maliki, Makki; Jassim, Sabah; Al-Jawad, Naseer; Sellahewa, Harin

2012-06-01

This paper is concerned with pre-processing and segmentation tasks that influence the performance of Optical Character Recognition (OCR) systems and handwritten/printed text recognition. In Arabic, these tasks are adversely affected by the fact that many words are made up of sub-words, with many sub-words having one or more associated diacritics that are not connected to the sub-word's body; sub-words may also overlap in multiple places. To overcome these problems we investigate and develop segmentation techniques that first segment a document into sub-words, link the diacritics with their sub-words, and remove possible overlapping between words and sub-words. We shall also investigate two approaches for pre-processing tasks to estimate sub-word baselines, and to determine parameters that yield appropriate slope correction and slant removal. We shall investigate the use of linear regression on sub-word pixels to determine their central x and y coordinates, as well as their high density part. We also develop a new incremental rotation procedure to be performed on sub-words that determines the best rotation angle needed to realign baselines. We shall demonstrate the benefits of these proposals by conducting extensive experiments on publicly available databases and in-house created databases. These algorithms help improve character segmentation accuracy by transforming handwritten Arabic text into a form that could benefit from analysis of printed text.
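The use of linear regression on sub-word pixels to estimate a baseline can be sketched roughly as follows. This is an assumption-laden illustration rather than the authors' algorithm: the function names are hypothetical, pixels are taken to be (x, y) coordinates of ink pixels spanning more than one column, and a single rotation by the fitted angle stands in for the paper's incremental rotation procedure.

```python
import math

def fit_baseline(pixels):
    """Least-squares line y = slope * x + intercept through (x, y) pixel coordinates."""
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels)  # assumed non-zero
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rotate(pixels, angle, centre):
    """Rotate pixel coordinates by `angle` radians about `centre`."""
    cx, cy = centre
    c, s = math.cos(angle), math.sin(angle)
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy)) for x, y in pixels]

def deskew(pixels):
    """Rotate the sub-word about its centroid so the fitted baseline becomes horizontal."""
    slope, _ = fit_baseline(pixels)
    n = len(pixels)
    centre = (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)
    return rotate(pixels, -math.atan(slope), centre)

# Hypothetical sub-word whose ink pixels lie along a sloped line.
pixels = [(x, 0.5 * x + 2) for x in range(10)]
slope, intercept = fit_baseline(pixels)
flat = deskew(pixels)
print(round(slope, 3), round(fit_baseline(flat)[0], 3))
```

The centroid used as the rotation centre also corresponds to the central x and y coordinates that the regression passes through.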
Using Linguistic Knowledge in Statistical Machine Translation

DTIC Science & Technology

2010-09-01

on newswire test data. 3.4 Arabic to English MT results for Arabic morphological segmentation, measured on...web test data. 3.5 Recombination Results. Percentage of sentences with mis-combined words...scores for syntactic reordering of the Spoken Language Domain. 5.1 Normalized likelihood of the test set alignments without decision trees, and then
Rhythmic grouping biases constrain infant statistical learning

PubMed Central

Hay, Jessica F.; Saffran, Jenny R.

2012-01-01

Linguistic stress and sequential statistical cues to word boundaries interact during speech segmentation in infancy. However, little is known about how the different acoustic components of stress constrain statistical learning. The current studies were designed to investigate whether intensity and duration each function independently as cues to initial prominence (trochaic-based hypothesis) or whether, as predicted by the Iambic-Trochaic Law (ITL), intensity and duration have characteristic and separable effects on rhythmic grouping (ITL-based hypothesis) in a statistical learning task. Infants were familiarized with an artificial language (Experiments 1 & 3) or a tone stream (Experiment 2) in which there was an alternation in either intensity or duration. In addition to potential acoustic cues, the familiarization sequences also contained statistical cues to word boundaries. In speech (Experiment 1) and non-speech (Experiment 2) conditions, 9-month-old infants demonstrated discrimination patterns consistent with an ITL-based hypothesis: intensity signaled initial prominence and duration signaled final prominence. The results of Experiment 3, in which 6.5-month-old infants were familiarized with the speech streams from Experiment 1, suggest that there is a developmental change in infants’ willingness to treat increased duration as a cue to word offsets in fluent speech. Infants’ perceptual systems interact with linguistic experience to constrain how infants learn from their auditory environment. PMID:23730217
Recognition of Handwritten Arabic words using a neuro-fuzzy network

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boukharouba, Abdelhak; Bennia, Abdelhak

We present a new method for the recognition of handwritten Arabic words based on neuro-fuzzy hybrid network. As a first step, connected components (CCs) of black pixels are detected. Then the system determines which CCs are sub-words and which are stress marks. The stress marks are then isolated and identified separately and the sub-words are segmented into graphemes. Each grapheme is described by topological and statistical features. Fuzzy rules are extracted from training examples by a hybrid learning scheme comprised of two phases: rule generation phase from data using a fuzzy c-means, and rule parameter tuning phase using gradient descent learning. After learning, the network encodes in its topology the essential design parameters of a fuzzy inference system. The contribution of this technique is shown through the significant tests performed on a handwritten Arabic words database.
Word-Form Familiarity Bootstraps Infant Speech Segmentation

ERIC Educational Resources Information Center

Altvater-Mackensen, Nicole; Mani, Nivedita

2013-01-01

At about 7 months of age, infants listen longer to sentences containing familiar words--but not deviant pronunciations of familiar words (Jusczyk & Aslin, 1995). This finding suggests that infants are able to segment familiar words from fluent speech and that they store words in sufficient phonological detail to recognize deviations from a…
Infant Word Segmentation Revisited: Edge Alignment Facilitates Target Extraction

ERIC Educational Resources Information Center

Seidl, Amanda; Johnson, Elizabeth K.

2006-01-01

In a landmark study, Jusczyk and Aslin (1995) demonstrated that English-learning infants are able to segment words from continuous speech at 7.5 months of age. In the current study, we explored the possibility that infants segment words from the edges of utterances more readily than the middle of utterances. The same procedure was used as in…
The Roles of Tonal and Segmental Information in Mandarin Spoken Word Recognition: An Eyetracking Study

ERIC Educational Resources Information Center

Malins, Jeffrey G.; Joanisse, Marc F.

2010-01-01

We used eyetracking to examine how tonal versus segmental information influence spoken word recognition in Mandarin Chinese. Participants heard an auditory word and were required to identify its corresponding picture from an array that included the target item ("chuang2" "bed"), a phonological competitor (segmental: chuang1 "window"; cohort:…
Modelling acquired dyslexia: a software tool for developing grapheme-phoneme correspondences.

PubMed Central

D'Autrechy, C. L.; Reggia, J. A.; Berndt, R. S.

1991-01-01

In extending a computer model of acquired dyslexia, it has become necessary to develop a way to group printed characters in a word so that the character groups essentially have a one-to-one correspondence with the word's phonemes (speech sounds). This requires deriving a set of correspondences (legal character groupings, legal associations of character groups with phonemes, etc.) that yield a single grouping or "segmentation" of characters when applied to any English word. To facilitate and partially automate this task, a segmentation program has been developed that uses an interchangeable set of correspondences. The program segments words according to these correspondences and tabulates their success over large sets of words. The program has been used successfully to segment a 20,000 word corpus, demonstrating that this approach can be used effectively and efficiently. PMID:1807611
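The segmentation task described here, grouping a word's characters according to an interchangeable set of legal character groups and tabulating how often exactly one grouping results, might be sketched as below. The character groups and word list are hypothetical, the function names are illustrative, and the association of groups with phonemes is left out for brevity; this is not the program reported in the abstract.

```python
def segment_graphemes(word, groups):
    """Return every way to split `word` into character groups drawn from `groups`."""
    if not word:
        return [[]]
    results = []
    for size in range(1, len(word) + 1):
        head = word[:size]
        if head in groups:
            for rest in segment_graphemes(word[size:], groups):
                results.append([head] + rest)
    return results

def tabulate(words, groups):
    """Count the words that receive exactly one segmentation under `groups`."""
    unique = [w for w in words if len(segment_graphemes(w, groups)) == 1]
    return len(unique), len(words)

# Hypothetical correspondence set and word list.
groups = {"sh", "th", "ch", "i", "p", "n"}
words = ["ship", "thin", "chip", "pit"]

print(segment_graphemes("ship", groups))                       # one grouping
print(segment_graphemes("ship", {"s", "h", "sh", "i", "p"}))   # ambiguous set
print(tabulate(words, groups))
```

Swapping in a different `groups` set and re-running `tabulate` mirrors the idea of testing interchangeable correspondence sets against a large word list.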
Perceiving non-native speech: Word segmentation

NASA Astrophysics Data System (ADS)

Mondini, Michèle; Miller, Joanne L.

2004-05-01

One important source of information listeners use to segment speech into discrete words is allophonic variation at word junctures. Previous research has shown that non-native speakers impose their native-language phonetic norms on their second language; as a consequence, non-native speech may (in some cases) exhibit altered patterns of allophonic variation at word junctures. We investigated the perceptual consequences of this for word segmentation by presenting native-English listeners with English word pairs produced either by six native-English speakers or six highly fluent, native-French speakers of English. The target word pairs had contrastive word juncture involving voiceless stop consonants (e.g., why pink/wipe ink; gray ties/great eyes; we cash/weak ash). The task was to identify randomized instances of each individual target word pair (as well as control pairs) by selecting one of four possible choices (e.g., why pink, wipe ink, why ink, wipe pink). Overall, listeners were more accurate in identifying target word pairs produced by the native-English speakers than by the non-native English speakers. These findings suggest that one contribution to the processing cost associated with listening to non-native speech may be the presence of altered allophonic information important for word segmentation. [Work supported by NIH/NIDCD.]
Does segmental overlap help or hurt? Evidence from blocked cyclic naming in spoken and written production.

PubMed

Breining, Bonnie; Nozari, Nazbanou; Rapp, Brenda

2016-04-01

Past research has demonstrated interference effects when words are named in the context of multiple items that share a meaning. This interference has been explained within various incremental learning accounts of word production, which propose that each attempt at mapping semantic features to lexical items induces slight but persistent changes that result in cumulative interference. We examined whether similar interference-generating mechanisms operate during the mapping of lexical items to segments by examining the production of words in the context of others that share segments. Previous research has shown that initial-segment overlap amongst a set of target words produces facilitation, not interference. However, this initial-segment facilitation is likely due to strategic preparation, an external factor that may mask underlying interference. In the present study, we applied a novel manipulation in which the segmental overlap across target items was distributed unpredictably across word positions, in order to reduce strategic response preparation. This manipulation led to interference in both spoken (Exp. 1) and written (Exp. 2) production. We suggest that these findings are consistent with a competitive learning mechanism that applies across stages and modalities of word production.
British English infants segment words only with exaggerated infant-directed speech stimuli.

PubMed

Floccia, Caroline; Keren-Portnoy, Tamar; DePaolis, Rory; Duffy, Hester; Delle Luche, Claire; Durrant, Samantha; White, Laurence; Goslin, Jeremy; Vihman, Marilyn

2016-03-01

The word segmentation paradigm originally designed by Jusczyk and Aslin (1995) has been widely used to examine how infants from the age of 7.5 months can extract novel words from continuous speech. Here we report a series of 13 studies conducted independently in two British laboratories, showing that British English-learning infants aged 8-10.5 months fail to show evidence of word segmentation when tested in this paradigm. In only one study did we find evidence of word segmentation at 10.5 months, when we used an exaggerated infant-directed speech style. We discuss the impact of variations in infant-directed style within and across languages in the course of language acquisition. Crown Copyright © 2015. Published by Elsevier B.V. All rights reserved.
Experience with a second language affects the use of fundamental frequency in speech segmentation

PubMed Central

Broersma, Mirjam; Cho, Taehong; Kim, Sahyang; Martínez-García, Maria Teresa; Connell, Katrina

2017-01-01

This study investigates whether listeners’ experience with a second language learned later in life affects their use of fundamental frequency (F0) as a cue to word boundaries in the segmentation of an artificial language (AL), particularly when the cues to word boundaries conflict between the first language (L1) and second language (L2). F0 signals phrase-final (and thus word-final) boundaries in French but word-initial boundaries in English. Participants were functionally monolingual French listeners, functionally monolingual English listeners, bilingual L1-English L2-French listeners, and bilingual L1-French L2-English listeners. They completed the AL-segmentation task with F0 signaling word-final boundaries or without prosodic cues to word boundaries (monolingual groups only). After listening to the AL, participants completed a forced-choice word-identification task in which the foils were either non-words or part-words. The results show that the monolingual French listeners, but not the monolingual English listeners, performed better in the presence of F0 cues than in the absence of such cues. Moreover, bilingual status modulated listeners’ use of F0 cues to word-final boundaries, with bilingual French listeners performing less accurately than monolingual French listeners on both word types but with bilingual English listeners performing more accurately than monolingual English listeners on non-words. These findings not only confirm that speech segmentation is modulated by the L1, but also newly demonstrate that listeners’ experience with the L2 (French or English) affects their use of F0 cues in speech segmentation. This suggests that listeners’ use of prosodic cues to word boundaries is adaptive and non-selective, and can change as a function of language experience. PMID:28738093

Interactive language learning by robots: the transition from babbling to word forms.

PubMed

Lyon, Caroline; Nehaniv, Chrystopher L; Saunders, Joe

2012-01-01

The advent of humanoid robots has enabled a new approach to investigating the acquisition of language, and we report on the development of robots able to acquire rudimentary linguistic skills. Our work focuses on early stages analogous to some characteristics of a human child of about 6 to 14 months, the transition from babbling to first word forms. We investigate one mechanism among many that may contribute to this process, a key factor being the sensitivity of learners to the statistical distribution of linguistic elements. As well as being necessary for learning word meanings, the acquisition of anchor word forms facilitates the segmentation of an acoustic stream through other mechanisms. In our experiments some salient one-syllable word forms are learnt by a humanoid robot in real-time interactions with naive participants. Words emerge from random syllabic babble through a learning process based on a dialogue between the robot and the human participant, whose speech is perceived by the robot as a stream of phonemes. Numerous ways of representing the speech as syllabic segments are possible. Furthermore, the pronunciation of many words in spontaneous speech is variable. However, in line with research elsewhere, we observe that salient content words are more likely than function words to have consistent canonical representations; thus their relative frequency increases, as does their influence on the learner. Variable pronunciation may contribute to early word form acquisition. The importance of contingent interaction in real-time between teacher and learner is reflected by a reinforcement process, with variable success. The examination of individual cases may be more informative than group results. Nevertheless, word forms are usually produced by the robot after a few minutes of dialogue, employing a simple, real-time, frequency dependent mechanism. 
This work shows the potential of human-robot interaction systems in studies of the dynamics of early language acquisition.
Overcoming the Effects of Variation in Infant Speech Segmentation: Influences of Word Familiarity

PubMed Central

Singh, Leher; Nestor, Sarah S.; Bortfeld, Heather

2010-01-01

Previous studies have shown that 7.5-month-olds can track and encode words in fluent speech, but they fail to equate instances of a word that contrast in talker gender, vocal affect, and fundamental frequency. By 10.5 months, they succeed at generalizing across such variability, marking a clear transition period during which infants’ word recognition skills become qualitatively more mature. Here we explore the role of word familiarity in this critical transition and, in particular, whether words that occur frequently in a child’s listening environment (i.e., “Mommy” and “Daddy”) are more easily recognized when they differ in surface characteristics than those that infants have not previously encountered (termed nonwords). Results demonstrate that words are segmented from continuous speech in a more linguistically mature fashion than nonwords at 7.5 months, but at 10.5 months, both words and nonwords are segmented in a relatively mature fashion. These findings suggest that early word recognition is facilitated in cases where infants have had significant exposure to items, but at later stages, infants are able to segment items regardless of their presumed familiarity. PMID:21088702
What's statistical about learning? Insights from modelling statistical learning as a set of memory processes

PubMed Central

2017-01-01

Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274, 1926–1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105, 2745–2750; Thiessen & Yee 2010 Child Development 81, 1287–1303; Saffran 2002 Journal of Memory and Language 47, 172–196; Misyak & Christiansen 2012 Language Learning 62, 302–331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39, 246–263; Thiessen et al. 2013 Psychological Bulletin 139, 792–814). 
From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik 2013 Cognitive Science 37, 310–343). This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences'. PMID:27872374
What's statistical about learning? Insights from modelling statistical learning as a set of memory processes.

PubMed

Thiessen, Erik D

2017-01-05

Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274, 1926-1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105, 2745-2750; Thiessen & Yee 2010 Child Development 81, 1287-1303; Saffran 2002 Journal of Memory and Language 47, 172-196; Misyak & Christiansen 2012 Language Learning 62, 302-331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39, 246-263; Thiessen et al. 2013 Psychological Bulletin 139, 792-814).
From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik 2013 Cognitive Science 37, 310-343). This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Author(s).
When It Hurts (and Helps) to Try: The Role of Effort in Language Learning

PubMed Central

Finn, Amy S.; Lee, Taraz; Kraus, Allison; Hudson Kam, Carla L.

2014-01-01

Compared to children, adults are bad at learning language. This is counterintuitive; adults outperform children on most measures of cognition, especially those that involve effort (which continue to mature into early adulthood). The present study asks whether these mature effortful abilities interfere with language learning in adults and further, whether interference occurs equally for aspects of language that adults are good (word-segmentation) versus bad (grammar) at learning. Learners were exposed to an artificial language comprised of statistically defined words that belong to phonologically defined categories (grammar). Exposure occurred under passive or effortful conditions. Passive learners were told to listen while effortful learners were instructed to try to 1) learn the words, 2) learn the categories, or 3) learn the category-order. Effortful learners showed an advantage for learning words while passive learners showed an advantage for learning the categories. Effort can therefore hurt the learning of categories. PMID:25047901
When it hurts (and helps) to try: the role of effort in language learning.

PubMed

Finn, Amy S; Lee, Taraz; Kraus, Allison; Hudson Kam, Carla L

2014-01-01

Compared to children, adults are bad at learning language. This is counterintuitive; adults outperform children on most measures of cognition, especially those that involve effort (which continue to mature into early adulthood). The present study asks whether these mature effortful abilities interfere with language learning in adults and further, whether interference occurs equally for aspects of language that adults are good (word-segmentation) versus bad (grammar) at learning. Learners were exposed to an artificial language comprised of statistically defined words that belong to phonologically defined categories (grammar). Exposure occurred under passive or effortful conditions. Passive learners were told to listen while effortful learners were instructed to try to 1) learn the words, 2) learn the categories, or 3) learn the category-order. Effortful learners showed an advantage for learning words while passive learners showed an advantage for learning the categories. Effort can therefore hurt the learning of categories.
Data from Russian Help to Determine in Which Languages the Possible Word Constraint Applies

ERIC Educational Resources Information Center

Alexeeva, Svetlana; Frolova, Anastasia; Slioussar, Natalia

2017-01-01

The Possible Word Constraint, or PWC, is a speech segmentation principle that prohibits postulating a word boundary if the remaining segment contains only consonants. The PWC was initially formulated for English, where all words contain a vowel, and was claimed to hold universally after being confirmed for various other languages. However, it is crucial to…
Can colours be used to segment words when reading?

PubMed

Perea, Manuel; Tejero, Pilar; Winskel, Heather

2015-07-01

Rayner, Fischer, and Pollatsek (1998, Vision Research) demonstrated that reading unspaced text in Indo-European languages produces a substantial reading cost in word identification (as deduced from an increased word-frequency effect on target words embedded in the unspaced vs. spaced sentences) and in eye movement guidance (as deduced from landing sites closer to the beginning of the words in unspaced sentences). However, the addition of spaces between words comes with a cost: nearby words may fall outside high-acuity central vision, thus reducing the potential benefits of parafoveal processing. In the present experiment, we introduced a salient visual cue intended to facilitate the process of word segmentation without compromising visual acuity: each alternating word was printed in a different colour. Results revealed only a small reading cost of unspaced alternating colour sentences relative to the spaced sentences. Thus, the present data demonstrate that colour can be useful to segment words for readers of spaced orthographies. Copyright © 2015 Elsevier B.V. All rights reserved.
Learning to spell in a language with transparent orthography: Distributional properties of orthography and whole-word lexical processing.

PubMed

Angelelli, Paola; Marinelli, Chiara Valeria; Putzolu, Anna; Notarnicola, Alessandra; Iaia, Marika; Burani, Cristina

2018-03-01

We examined how whole-word lexical information and knowledge of distributional properties of orthography interact in children's spelling. High- versus low-frequency words, which included inconsistently spelled segments occurring more or less frequently in the orthography, were used in two experiments: (a) word spelling; (b) lexical priming of pseudoword spelling. Participants were 1st-, 2nd-, and 4th-grade Italian children. Word spelling showed sensitivity to the distributional properties of orthography in all children: accuracy in spelling uncommon transcription segments emerged progressively as a function of word frequency and schooling. Lexical priming effects emerged as a function of age. When related primes contained an uncommon segment, 2nd- and 4th-graders preferred uncommon segments to common ones in spelling target pseudowords, thus inverting the response trend found in the control condition. A smaller but significant effect was present in 1st-graders, who, unlike 2nd- and 4th-graders, still preferred common segments, only slightly increasing the use of uncommon ones. A larger priming effect emerged for high-frequency primes than for low-frequency ones. Results indicate that children learning to spell in a transparent orthography are sensitive to the distributional properties of the orthography. However, whole-word lexical representations are also used, with larger effects in more skilled pupils.
Realization of Chinese word segmentation based on deep learning method

NASA Astrophysics Data System (ADS)

Wang, Xuefei; Wang, Mingjiang; Zhang, Qiquan

2017-08-01

Deep learning has developed rapidly in recent years and is now widely used in natural language processing. In this paper, I use deep learning to perform Chinese word segmentation on a large-scale corpus, eliminating the need to construct additional hand-crafted features. The first step is to process the corpus and use word2vec to obtain embeddings for it, with each character represented by a vector of size 50. The embeddings are then fed to a bidirectional LSTM, a linear layer is added to the hidden-layer output, and a CRF is added on top to obtain the model implemented in this paper. Experimental results show that the method achieves satisfactory accuracy on the 2014 People's Daily corpus.
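The last stage of the pipeline described above, the CRF, is commonly decoded with the Viterbi algorithm. The sketch below illustrates only that decoding step. It is not the authors' implementation: the B/M/E/S tag scheme (begin, middle, end, single), the hard transition constraints, the hand-written emission scores and the four-character example are all assumptions made for illustration; in the model described, the scores would come from the linear layer on top of the bidirectional LSTM.

```python
# Hypothetical sketch: Viterbi decoding over B/M/E/S tags, as a CRF layer might do.
TAGS = ["B", "M", "E", "S"]
NEG = -1e9  # large negative score used to forbid a transition

# Tag transitions allowed under the assumed B/M/E/S scheme.
ALLOWED = {("B", "M"), ("B", "E"), ("M", "M"), ("M", "E"),
           ("E", "B"), ("E", "S"), ("S", "B"), ("S", "S")}


def viterbi(emissions):
    """Return the best valid tag path; emissions is a list of {tag: score} dicts."""
    # A sentence can only start with B (word start) or S (single-character word).
    best = {t: (emissions[0][t] if t in ("B", "S") else NEG) for t in TAGS}
    back = []
    for scores in emissions[1:]:
        new_best, pointers = {}, {}
        for cur in TAGS:
            candidates = [
                (best[prev] + (0.0 if (prev, cur) in ALLOWED else NEG), prev)
                for prev in TAGS
            ]
            top_score, top_prev = max(candidates)
            new_best[cur] = top_score + scores[cur]
            pointers[cur] = top_prev
        best = new_best
        back.append(pointers)
    # A sentence can only end with E (word end) or S.
    last = max(("E", "S"), key=lambda t: best[t])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def tags_to_words(chars, tags):
    """Cut the character sequence after every E or S tag."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


# Invented emission scores for the four characters of "我爱北京".
example_emissions = [
    {"B": 0.1, "M": 0.0, "E": 0.0, "S": 2.0},
    {"B": 0.2, "M": 0.0, "E": 0.1, "S": 1.5},
    {"B": 1.8, "M": 0.1, "E": 0.0, "S": 0.3},
    {"B": 0.0, "M": 0.2, "E": 0.1, "S": 1.0},
]
example_tags = viterbi(example_emissions)
example_words = tags_to_words("我爱北京", example_tags)
```

In this toy input the last character scores highest as S on its own, but a B tag cannot be followed by S, so the decoder chooses E instead. Picking the best tag for each character independently would miss that constraint.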
Text Detection and Translation from Natural Scenes

DTIC Science & Technology

2001-06-01

is no explicit tags around Chinese words. A module for Chinese word segmentation is included in the system. This segmentor uses a word-frequency ... list to make segmentation decisions. We tested the EBMT based method using randomly selected 50 signs from our database, assuming perfect sign
Age and experience shape developmental changes in the neural basis of language-related learning.

PubMed

McNealy, Kristin; Mazziotta, John C; Dapretto, Mirella

2011-11-01

Very little is known about the neural underpinnings of language learning across the lifespan and how these might be modified by maturational and experiential factors. Building on behavioral research highlighting the importance of early word segmentation (i.e. the detection of word boundaries in continuous speech) for subsequent language learning, here we characterize developmental changes in brain activity as this process occurs online, using data collected in a mixed cross-sectional and longitudinal design. One hundred and fifty-six participants, ranging from age 5 to adulthood, underwent functional magnetic resonance imaging (fMRI) while listening to three novel streams of continuous speech, which contained either strong statistical regularities, strong statistical regularities and speech cues, or weak statistical regularities providing minimal cues to word boundaries. All age groups displayed significant signal increases over time in temporal cortices for the streams with high statistical regularities; however, we observed a significant right-to-left shift in the laterality of these learning-related increases with age. Interestingly, only the 5- to 10-year-old children displayed significant signal increases for the stream with low statistical regularities, suggesting an age-related decrease in sensitivity to more subtle statistical cues. Further, in a sample of 78 10-year-olds, we examined the impact of proficiency in a second language and level of pubertal development on learning-related signal increases, showing that the brain regions involved in language learning are influenced by both experiential and maturational factors. 2011 Blackwell Publishing Ltd.
The segment as the minimal planning unit in speech production: evidence based on absolute response latencies.

PubMed

Kawamoto, Alan H; Liu, Qiang; Lee, Ria J; Grebe, Patricia R

2014-01-01

A minimal amount of information about a word must be phonologically and phonetically encoded before a person can begin to utter that word. Most researchers assume that the minimum is the complete word or possibly the initial syllable. However, there is some evidence that the initial segment is sufficient based on longer durations when the initial segment is primed. In two experiments in which the initial segment of a monosyllabic word is primed or not primed, we present additional evidence based on very short absolute response times determined on the basis of acoustic and articulatory onset relative to presentation of the complete target. We argue that the previous failures to find very short absolute response times when the initial segment is primed are due in part to the exclusive use of acoustic onset as a measure of response latency, the exclusion of responses with very short acoustic latencies, the manner of articulation of the initial segment (i.e., plosive vs. nonplosive), and individual differences. Theoretical implications of the segment as the minimal planning unit are considered.
Invented Spelling, Word Stress, and Syllable Awareness in Relation to Reading Difficulties in Children.

PubMed

Mehta, Sheena; Ding, Yi; Ness, Molly; Chen, Eric C

2018-06-01

The study assessed the clinical utility of an invented spelling tool and determined whether invented spelling with linguistic manipulation at segmental and supra-segmental levels can be used to better identify reading difficulties. We conducted linguistic manipulation by using real and nonreal words, incorporating word stress, alternating the order of consonants and vowels, and alternating the number of syllables. We recruited 60 third-grade students, of which half were typical readers and half were poor readers. The invented spelling task consistently differentiated those with reading difficulties from typical readers. It explained unique variance in conventional spelling, but not in word reading. Word stress explained unique variance in both word reading and conventional spelling, highlighting the importance of addressing phonological awareness at the supra-segmental level. Poor readers had poorer performance when spelling both real and nonreal words and demonstrated substantial difficulty in detecting word stress. Poor readers struggled with spelling words with double consonants at the beginning and ending of words, and performed worse on spelling two- and three-syllable words than typical readers. Practical implications for early identification and instruction are discussed.
Development of Morphophonemic Segments in Children's Mental Representations of Words.

ERIC Educational Resources Information Center

Jones, Noel K.

This study explores children's development of dual-level phonological processing posited by generative theory for adult language users. Evidence suggesting 6-year-olds' utilization of morphophonemic segments was obtained by asking children to imitate complex words, omit specified portions, and discuss the meaning of the resulting word-parts. The…
Processing Orthographic Structure: Associations between Print and Fingerspelling

ERIC Educational Resources Information Center

Emmorey, Karen; Petrich, Jennifer A. F.

2012-01-01

Two lexical decision experiments are reported that investigate whether the same segmentation strategies are used for reading printed English words and fingerspelled words (in American Sign Language). Experiment 1 revealed that both deaf and hearing readers performed better when written words were segmented with respect to an orthographically…
Lexical Characteristics of Spanish and English Words and the Development of Phonological Awareness Skills in Spanish-Speaking Language-Minority Children

ERIC Educational Resources Information Center

Goodrich, J. Marc; Lonigan, Christopher J.

2016-01-01

The lexical restructuring model (LRM) is a theory that attempts to explain the developmental origins of phonological awareness (PA). According to the LRM, various characteristics of words should be related to the extent to which words are segmentally represented in the lexicon. Segmental representations of words allow children to access the parts…
More Limitations to Monolingualism: Bilinguals Outperform Monolinguals in Implicit Word Learning.

PubMed

Escudero, Paola; Mulak, Karen E; Fu, Charlene S L; Singh, Leher

2016-01-01

To succeed at cross-situational word learning, learners must infer word-object mappings by attending to the statistical co-occurrences of novel objects and labels across multiple encounters. While past studies have investigated this as a learning mechanism for infants and monolingual adults, bilinguals' cross-situational word learning abilities have yet to be tested. Here, we compared monolinguals' and bilinguals' performance on a cross-situational word learning paradigm that featured phonologically distinct word pairs (e.g., BON-DEET) and phonologically similar word pairs that varied by a single consonant or vowel segment (e.g., BON-TON, DEET-DIT, respectively). Both groups learned the novel word-referent mappings, providing evidence that cross-situational word learning is a learning strategy also available to bilingual adults. Furthermore, bilinguals were overall more accurate than monolinguals. This supports the view that bilingualism fosters a wide range of cognitive advantages that may benefit implicit word learning. Additionally, response patterns to the different trial types revealed greater difficulty for vowel minimal pairs than for consonant minimal pairs, replicating the pattern found in monolinguals by Escudero et al. (2016) in a different English accent. Specifically, all participants failed to learn vowel contrasts differentiated by vowel height. We discuss evidence for this bilingual advantage as a language-specific or general advantage.
More Limitations to Monolingualism: Bilinguals Outperform Monolinguals in Implicit Word Learning

PubMed Central

Escudero, Paola; Mulak, Karen E.; Fu, Charlene S. L.; Singh, Leher

2016-01-01

To succeed at cross-situational word learning, learners must infer word-object mappings by attending to the statistical co-occurrences of novel objects and labels across multiple encounters. While past studies have investigated this as a learning mechanism for infants and monolingual adults, bilinguals’ cross-situational word learning abilities have yet to be tested. Here, we compared monolinguals’ and bilinguals’ performance on a cross-situational word learning paradigm that featured phonologically distinct word pairs (e.g., BON-DEET) and phonologically similar word pairs that varied by a single consonant or vowel segment (e.g., BON-TON, DEET-DIT, respectively). Both groups learned the novel word-referent mappings, providing evidence that cross-situational word learning is a learning strategy also available to bilingual adults. Furthermore, bilinguals were overall more accurate than monolinguals. This supports the view that bilingualism fosters a wide range of cognitive advantages that may benefit implicit word learning. Additionally, response patterns to the different trial types revealed greater difficulty for vowel minimal pairs than for consonant minimal pairs, replicating the pattern found in monolinguals by Escudero et al. (2016) in a different English accent. Specifically, all participants failed to learn vowel contrasts differentiated by vowel height. We discuss evidence for this bilingual advantage as a language-specific or general advantage. PMID:27574513
Consonant and Vowel Processing in Word Form Segmentation: An Infant ERP Study.

PubMed

Von Holzen, Katie; Nishibayashi, Leo-Lyuki; Nazzi, Thierry

2018-01-31

Segmentation skill and the preferential processing of consonants (C-bias) develop during the second half of the first year of life, and it has been proposed that these facilitate language acquisition. We used event-related brain potentials (ERPs) to investigate the neural bases of early word form segmentation, and of the early processing of onset consonants, medial vowels, and coda consonants, exploring how differences in these early skills might be related to later language outcomes. Our results with French-learning eight-month-old infants primarily support previous studies that found that the word familiarity effect in segmentation is developing from a positive to a negative polarity at this age. Although as a group infants exhibited an anterior-localized negative effect, inspection of individual results revealed that a majority of infants showed a negative-going response (Negative Responders), while a minority showed a positive-going response (Positive Responders). Furthermore, all infants demonstrated sensitivity to onset consonant mispronunciations, while Negative Responders demonstrated a lack of sensitivity to vowel mispronunciations, a developmental pattern similar to that reported in previous literature. Responses to coda consonant mispronunciations revealed neither sensitivity nor lack of sensitivity. We found that infants showing a more mature, negative response to newly segmented words compared to control words (evaluating segmentation skill) and mispronunciations (evaluating phonological processing) at test also had greater growth in word production over the second year of life than infants showing a more positive response. These results establish a relationship between early segmentation skills and phonological processing (not modulated by the type of mispronunciation) and later lexical skills.

Spotting words in handwritten Arabic documents

NASA Astrophysics Data System (ADS)

Srihari, Sargur; Srinivasan, Harish; Babu, Pavithra; Bhole, Chetan

2006-01-01

The design and performance of a system for spotting handwritten Arabic words in scanned document images is presented. The three main components of the system are a word segmenter, a shape-based matcher for words, and a search interface. The user types a query in English in a search window; the system finds the equivalent Arabic word (e.g., by dictionary look-up) and locates word images in an indexed (segmented) set of documents. A two-step approach is employed in performing the search: (1) prototype selection: the query is used to obtain a set of handwritten samples of that word from a known set of writers (these are the prototypes), and (2) word matching: the prototypes are used to spot each occurrence of those words in the indexed document database. A ranking is performed on the entire set of test word images, where the ranking criterion is a similarity score between each prototype word and the candidate words based on global word shape features. A database of 20,000 word images contained in 100 scanned handwritten Arabic documents written by 10 different writers was used to study retrieval performance. Using five writers for providing prototypes and the other five for testing, using manually segmented documents, 55% precision is obtained at 50% recall. Performance increases as more writers are used for training.
Word segmentation in phonemically identical and prosodically different sequences using cochlear implants: A case study.

PubMed

Basirat, Anahita

2017-01-01

Cochlear implant (CI) users frequently achieve good speech understanding based on phoneme and word recognition. However, there is a significant variability between CI users in processing prosody. The aim of this study was to examine the abilities of an excellent CI user to segment continuous speech using intonational cues. A post-lingually deafened adult CI user and 22 normal hearing (NH) subjects segmented phonemically identical and prosodically different sequences in French such as 'l'affiche' (the poster) versus 'la fiche' (the sheet), both [lafiʃ]. All participants also completed a minimal pair discrimination task. Stimuli were presented in auditory-only and audiovisual presentation modalities. The performance of the CI user in the minimal pair discrimination task was 97% in the auditory-only and 100% in the audiovisual condition. In the segmentation task, contrary to the NH participants, the performance of the CI user did not differ from the chance level. Visual speech did not improve word segmentation. This result suggests that word segmentation based on intonational cues is challenging when using CIs even when phoneme/word recognition is very well rehabilitated. This finding points to the importance of the assessment of CI users' skills in prosody processing and the need for specific interventions focusing on this aspect of speech communication.
Statistical Learning in a Natural Language by 8-Month-Old Infants

PubMed Central

Pelucchi, Bruna; Hay, Jessica F.; Saffran, Jenny R.

2013-01-01

Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants’ ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition. PMID:19489896
Statistical learning in a natural language by 8-month-old infants.

PubMed

Pelucchi, Bruna; Hay, Jessica F; Saffran, Jenny R

2009-01-01

Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants' ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition.
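The transitional probability cue referred to above can be made concrete with a short sketch. It uses the forward transitional probability as it is usually defined in this literature, TP(x -> y) = count(xy) / count(x), and places a word boundary wherever the TP between adjacent syllables falls below a threshold. The three toy words, their ordering and the 0.9 threshold are invented for illustration; they are not the Italian stimuli or the procedure of the study.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Forward TP(x -> y) = count(x followed by y) / count(x in non-final position)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold):
    """Insert a boundary wherever the TP between adjacent syllables is below threshold."""
    tps = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tps[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words


# Toy artificial language: three invented trisyllabic words in a fixed order.
TOY_WORDS = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["bi", "da", "ku"]}
TOY_ORDER = "ABCBACAB"
toy_stream = [syl for w in TOY_ORDER for syl in TOY_WORDS[w]]

toy_tps = transitional_probabilities(toy_stream)
toy_segmentation = segment(toy_stream, threshold=0.9)
```

Within a toy word every transition has TP 1.0, while transitions that span a word boundary have lower TPs (for example, "ro" is followed by "go" in two of its three occurrences), so the dips in TP line up with the word boundaries.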
The Effect of Sonority on Word Segmentation: Evidence for the Use of a Phonological Universal

ERIC Educational Resources Information Center

Ettlinger, Marc; Finn, Amy S.; Hudson Kam, Carla L.

2012-01-01

It has been well documented how language-specific cues may be used for word segmentation. Here, we investigate what role a language-independent phonological universal, the sonority sequencing principle (SSP), may also play. Participants were presented with an unsegmented speech stream with non-English word onsets that juxtaposed adherence to the…
Modeling the Contribution of Phonotactic Cues to the Problem of Word Segmentation

ERIC Educational Resources Information Center

Blanchard, Daniel; Heinz, Jeffrey; Golinkoff, Roberta

2010-01-01

How do infants find the words in the speech stream? Computational models help us understand this feat by revealing the advantages and disadvantages of different strategies that infants might use. Here, we outline a computational model of word segmentation that aims both to incorporate cues proposed by language acquisition researchers and to…
Phonotactic Probability of Brand Names: I'd buy that!

PubMed Central

Vitevitch, Michael S.; Donoso, Alexander J.

2011-01-01

Psycholinguistic research shows that word-characteristics influence the speed and accuracy of various language-related processes. Analogous characteristics of brand names influence the retrieval of product information and the perception of risks associated with that product. In the present experiment we examined how phonotactic probability—the frequency with which phonological segments and sequences of segments appear in a word—might influence consumer behavior. Participants rated brand names that varied in phonotactic probability on the likelihood that they would buy the product. Participants indicated that they were more likely to purchase a product if the brand name was comprised of common segments and sequences of segments rather than less common segments and sequences of segments. This result suggests that word-characteristics may influence higher-level cognitive processes, in addition to language-related processes. Furthermore, the benefits of using objective measures of word characteristics in the design of brand names are discussed. PMID:21870135
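As a rough illustration of what a phonotactic probability score measures, the sketch below estimates how common each two-segment sequence is in a small lexicon and averages those estimates over a candidate name. It is a deliberately simplified stand-in and not the measure used in the study: letters stand in for phonological segments, position within the word and word frequency are ignored, and the five-word lexicon and the candidate names are invented.

```python
from collections import Counter


def biphone_probabilities(lexicon):
    """Relative frequency of each adjacent two-segment sequence in the lexicon."""
    counts = Counter()
    for word in lexicon:
        counts.update(a + b for a, b in zip(word, word[1:]))
    total = sum(counts.values())
    return {biphone: n / total for biphone, n in counts.items()}


def phonotactic_score(name, probabilities):
    """Mean biphone probability of a name; unseen biphones count as 0."""
    biphones = [a + b for a, b in zip(name, name[1:])]
    if not biphones:  # names shorter than two segments have no biphones
        return 0.0
    return sum(probabilities.get(bp, 0.0) for bp in biphones) / len(biphones)


# Invented toy lexicon; letters are used as a proxy for phonological segments.
TOY_LEXICON = ["cat", "can", "cab", "tan", "ban"]
toy_probabilities = biphone_probabilities(TOY_LEXICON)

# Hypothetical candidate names, from common to unattested sequences.
score_common = phonotactic_score("can", toy_probabilities)
score_rare = phonotactic_score("tab", toy_probabilities)
score_unseen = phonotactic_score("zix", toy_probabilities)
```

With this toy lexicon, "can" is built from the two most frequent sequences and scores highest, "tab" is built from sequences that each occur once, and "zix" contains no attested sequence at all.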
On the unsupervised analysis of domain-specific Chinese texts

PubMed Central

Deng, Ke; Bol, Peter K.; Li, Kate J.; Liu, Jun S.

2016-01-01

With the growing availability of digitized text data both publicly and privately, there is a great need for effective computational tools to automatically extract information from texts. Because the Chinese language differs most significantly from alphabet-based languages in not specifying word boundaries, most existing Chinese text-mining methods require a prespecified vocabulary and/or a large relevant training corpus, which may not be available in some applications. We introduce an unsupervised method, top-down word discovery and segmentation (TopWORDS), for simultaneously discovering and segmenting words and phrases from large volumes of unstructured Chinese texts, and propose ways to order discovered words and conduct higher-level context analyses. TopWORDS is particularly useful for mining online and domain-specific texts where the underlying vocabulary is unknown or the texts of interest differ significantly from available training corpora. When outputs from TopWORDS are fed into context analysis tools such as topic modeling, word embedding, and association pattern finding, the results are as good as or better than those obtained using the outputs of a supervised segmentation method. PMID:27185919
Lexical frequency and acoustic reduction in spoken Dutch

NASA Astrophysics Data System (ADS)

Pluymaekers, Mark; Ernestus, Mirjam; Baayen, R. Harald

2005-10-01

This study investigates the effects of lexical frequency on the durational reduction of morphologically complex words in spoken Dutch. The hypothesis that high-frequency words are more reduced than low-frequency words was tested by comparing the durations of affixes occurring in different carrier words. Four Dutch affixes were investigated, each occurring in a large number of words with different frequencies. The materials came from a large database of face-to-face conversations. For each word containing a target affix, one token was randomly selected for acoustic analysis. Measurements were made of the duration of the affix as a whole and the durations of the individual segments in the affix. For three of the four affixes, a higher frequency of the carrier word led to shorter realizations of the affix as a whole, individual segments in the affix, or both. Other relevant factors were the sex and age of the speaker, segmental context, and speech rate. To accommodate these findings, models of speech production should allow word frequency to affect the acoustic realizations of lower-level units, such as individual speech sounds occurring in affixes.
Tone matters for Cantonese-English bilingual children's English word reading development: A unified model of phonological transfer.

PubMed

Tong, Xiuli; He, Xinjie; Deacon, S Hélène

2017-02-01

Languages differ considerably in how they use prosodic features, or variations in pitch, duration, and intensity, to distinguish one word from another. Prosodic features include lexical tone in Chinese and lexical stress in English. Recent cross-sectional studies show a surprising result that Mandarin Chinese tone sensitivity is related to Mandarin-English bilingual children's English word reading. This study explores the mechanism underlying this relation by testing two explanations of these effects: the prosodic hypothesis and segmental phonological awareness transfer. We administered multiple measures of Cantonese tone sensitivity, English stress sensitivity, segmental phonological awareness in Cantonese and English, nonverbal ability, and English word reading to 123 Cantonese-English bilingual children ages 7 and 8 years. Structural equation modeling revealed a longitudinal prediction of Cantonese tone sensitivity to English word reading between 8 and 9 years of age. This relation was realized through two parallel routes. In one, Cantonese tone sensitivity predicted English stress sensitivity, and English stress sensitivity, in turn, significantly predicted English word reading, as postulated by the prosodic hypothesis. In the second, Cantonese tone sensitivity predicted English word reading through the transfer of segmental phonological awareness between Cantonese and English, as predicted by segmental phonological transfer. These results support a unified model of phonological transfer, emphasizing the role of tone in English word reading for Cantonese-English bilingual children.
Speech Perception Engages a General Timer: Evidence from a Divided Attention Word Identification Task

ERIC Educational Resources Information Center

Casini, Laurence; Burle, Boris; Nguyen, Noel

2009-01-01

Time is essential to speech. The duration of speech segments plays a critical role in the perceptual identification of these segments, and therefore in that of spoken words. Here, using a French word identification task, we show that vowels are perceived as shorter when attention is divided between two tasks, as compared to a single task control…
Min-cut segmentation of cursive handwriting in tabular documents

NASA Astrophysics Data System (ADS)

Davis, Brian L.; Barrett, William A.; Swingle, Scott D.

2015-01-01

Handwritten tabular documents, such as census, birth, death and marriage records, contain a wealth of information vital to genealogical and related research. Much work has been done in segmenting freeform handwriting; however, segmentation of cursive handwriting in tabular documents is still an unsolved problem. Tabular documents present unique segmentation challenges caused by handwriting overlapping cell-boundaries and other words, both horizontally and vertically, as "ascenders" and "descenders" overlap into adjacent cells. This paper presents a method for segmenting handwriting in tabular documents using a min-cut/max-flow algorithm on a graph formed from a distance map and connected components of handwriting. Specifically, we focus on line, word and first letter segmentation. Additionally, we include the angles of strokes of the handwriting as a third dimension to our graph to enable the resulting segments to share pixels of overlapping letters. Word segmentation accuracy is 89.5% evaluating lines of the data set used in the ICDAR2013 Handwriting Segmentation Contest. Accuracy is 92.6% for a specific application of segmenting first and last names from noisy census records. Accuracy for segmenting lines of names from noisy census records is 80.7%. The 3D graph cutting shows promise in segmenting overlapping letters, although highly convoluted or overlapping handwriting remains an ongoing challenge.
Word-level recognition of multifont Arabic text using a feature vector matching approach

NASA Astrophysics Data System (ADS)

Erlandson, Erik J.; Trenkle, John M.; Vogt, Robert C., III

1996-03-01

Many text recognition systems recognize text imagery at the character level and assemble words from the recognized characters. An alternative approach is to recognize text imagery at the word level, without analyzing individual characters. This approach avoids the problem of individual character segmentation, and can overcome local errors in character recognition. A word-level recognition system for machine-printed Arabic text has been implemented. Arabic is a script language, and is therefore difficult to segment at the character level. Character segmentation has been avoided by recognizing text imagery of complete words. The Arabic recognition system computes a vector of image-morphological features on a query word image. This vector is matched against a precomputed database of vectors from a lexicon of Arabic words. Vectors from the database with the highest match score are returned as hypotheses for the unknown image. Several feature vectors may be stored for each word in the database. Database feature vectors generated using multiple fonts and noise models allow the system to be tuned to its input stream. Used in conjunction with database pruning techniques, this Arabic recognition system has obtained promising word recognition rates on low-quality multifont text imagery.
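The matching step in this abstract (a query feature vector compared against a precomputed database, with the best-scoring entries returned as hypotheses) can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not name the match score, so cosine similarity is swapped in here, and the function names, word labels, and three-dimensional vectors are hypothetical placeholders rather than the system's image-morphological features.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def match_word(query, database, k=3):
    """Return the k best (word, score) hypotheses for a query vector.

    `database` is a list of (word, vector) pairs. A word may appear several
    times (for example, one vector per font); its best score is kept.
    """
    best = {}
    for word, vec in database:
        score = cosine(query, vec)
        if score > best.get(word, float("-inf")):
            best[word] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical database: "word_a" is stored with two font variants.
database = [
    ("word_a", [1.0, 0.0, 0.0]),
    ("word_a", [0.9, 0.1, 0.0]),
    ("word_b", [0.0, 1.0, 0.0]),
    ("word_c", [0.0, 0.0, 1.0]),
]
hypotheses = match_word([1.0, 0.05, 0.0], database, k=2)
```

Storing several vectors per word and keeping the best score mirrors the abstract's point that multiple fonts and noise models can be represented in the database.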
Diminutives facilitate word segmentation in natural speech: cross-linguistic evidence.

PubMed

Kempe, Vera; Brooks, Patricia J; Gillis, Steven; Samson, Graham

2007-06-01

Final-syllable invariance is characteristic of diminutives (e.g., doggie), which are a pervasive feature of the child-directed speech registers of many languages. Invariance in word endings has been shown to facilitate word segmentation (Kempe, Brooks, & Gillis, 2005) in an incidental-learning paradigm in which synthesized Dutch pseudonouns were used. To broaden the cross-linguistic evidence for this invariance effect and to increase its ecological validity, adult English speakers (n=276) were exposed to naturally spoken Dutch or Russian pseudonouns presented in sentence contexts. A forced choice test was given to assess target recognition, with foils comprising unfamiliar syllable combinations in Experiments 1 and 2 and syllable combinations straddling word boundaries in Experiment 3. A control group (n=210) received the recognition test with no prior exposure to targets. Recognition performance improved with increasing final-syllable rhyme invariance, with larger increases for the experimental group. This confirms that word ending invariance is a valid segmentation cue in artificial, as well as naturalistic, speech and that diminutives may aid segmentation in a number of languages.
Self-Paced Segmentation of Written Words on a Touchscreen Tablet Promotes the Oral Production of Nonverbal and Minimally Verbal Children with Autism

ERIC Educational Resources Information Center

Vernay, Frédérique; Kahina, Harma; Thierry, Marrone; Jean-Yves, Roussey

2017-01-01

We investigated in a pilot study the effects of various types of visual mediation (photos, written words and self-paced syllabic segmentation of written words displayed on a touchscreen tablet) that are thought to facilitate the oral production of nonverbal and minimally verbal children with autism, according to the participants' level of oral…
Proximate Units in Word Production: Phonological Encoding Begins with Syllables in Mandarin Chinese but with Segments in English

ERIC Educational Resources Information Center

O'Seaghdha, Padraig G.; Chen, Jenn-Yeu; Chen, Train-Min

2010-01-01

In Mandarin Chinese, speakers benefit from fore-knowledge of what the first syllable but not of what the first phonemic segment of a disyllabic word will be (Chen, Chen, & Dell, 2002), contrasting with findings in English, Dutch, and other Indo-European languages, and challenging the generality of current theories of word production. In this…
K-SPAN: A lexical database of Korean surface phonetic forms and phonological neighborhood density statistics.

PubMed

Holliday, Jeffrey J; Turnbull, Rory; Eychenne, Julien

2017-10-01

This article presents K-SPAN (Korean Surface Phonetics and Neighborhoods), a database of surface phonetic forms and several measures of phonological neighborhood density for 63,836 Korean words. Currently publicly available Korean corpora are limited by the fact that they only provide orthographic representations in Hangeul, which is problematic since phonetic forms in Korean cannot be reliably predicted from orthographic forms. We describe the method used to derive the surface phonetic forms from a publicly available orthographic corpus of Korean, and report on several statistics calculated using this database; namely, segment unigram frequencies, which are compared to previously reported results, along with segment-based and syllable-based neighborhood density statistics for three types of representation: an "orthographic" form, which is a quasi-phonological representation, a "conservative" form, which maintains all known contrasts, and a "modern" form, which represents the pronunciation of contemporary Seoul Korean. These representations are rendered in an ASCII-encoded scheme, which allows users to query the corpus without having to read Korean orthography, and permits the calculation of a wide range of phonological measures.
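To make the notion of segment-based neighborhood density concrete, the sketch below counts, for each word in a small lexicon, how many other words differ from it by exactly one segment substitution, insertion, or deletion. That one-edit definition is a common convention and is an assumption here; the abstract does not spell out the definition K-SPAN uses, and the function names and toy lexicon are hypothetical.

```python
def is_neighbor(a, b):
    """True if tuples a and b differ by one substitution, insertion, or deletion."""
    if a == b:
        return False
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ.
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: deleting one segment of the longer word
    # must yield the shorter word.
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def neighborhood_density(lexicon):
    """Map each word (a tuple of segments) to its number of neighbors."""
    return {w: sum(is_neighbor(w, other) for other in lexicon) for w in lexicon}


# Hypothetical toy lexicon in an ASCII-style transcription.
toy_lexicon = [
    ("k", "a", "t"),
    ("b", "a", "t"),
    ("k", "a", "p"),
    ("a", "t"),
    ("d", "o", "g"),
]
density = neighborhood_density(toy_lexicon)
```

The pairwise comparison is quadratic in lexicon size, which is fine for a sketch but would need indexing for a lexicon of tens of thousands of words. A syllable-based variant would use syllables rather than segments as the tuple elements.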
Dynamic Encoding of Speech Sequence Probability in Human Temporal Cortex

PubMed Central

Leonard, Matthew K.; Bouchard, Kristofer E.; Tang, Claire

2015-01-01

Sensory processing involves identification of stimulus features, but also integration with the surrounding sensory and cognitive context. Previous work in animals and humans has shown fine-scale sensitivity to context in the form of learned knowledge about the statistics of the sensory environment, including relative probabilities of discrete units in a stream of sequential auditory input. These statistics are a defining characteristic of one of the most important sequential signals humans encounter: speech. For speech, extensive exposure to a language tunes listeners to the statistics of sound sequences. To address how speech sequence statistics are neurally encoded, we used high-resolution direct cortical recordings from human lateral superior temporal cortex as subjects listened to words and nonwords with varying transition probabilities between sound segments. In addition to their sensitivity to acoustic features (including contextual features, such as coarticulation), we found that neural responses dynamically encoded the language-level probability of both preceding and upcoming speech sounds. Transition probability first negatively modulated neural responses, followed by positive modulation of neural responses, consistent with coordinated predictive and retrospective recognition processes, respectively. Furthermore, transition probability encoding was different for real English words compared with nonwords, providing evidence for online interactions with high-order linguistic knowledge. These results demonstrate that sensory processing of deeply learned stimuli involves integrating physical stimulus features with their contextual sequential structure. Despite not being consciously aware of phoneme sequence statistics, listeners use this information to process spoken input and to link low-level acoustic representations with linguistic information about word identity and meaning. PMID:25948269
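For readers unfamiliar with the statistic, the sketch below estimates forward transition probabilities between adjacent units, P(next | current) = count(current, next) / count(current as the first element of a pair), from a handful of sequences. The function name and toy strings are hypothetical, and this is a generic estimate over a small sample, not the language-level statistics used in the study.

```python
from collections import Counter


def transition_probabilities(sequences):
    """Estimate P(b | a) for every adjacent pair (a, b) in the sequences.

    Each sequence is any indexable run of units (a string of characters,
    or a list of phoneme or syllable symbols).
    """
    pair_counts = Counter()
    first_counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            # Count `a` only when it is followed by something.
            first_counts[a] += 1
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}


# Hypothetical toy items; each character stands for one sound segment.
probs = transition_probabilities(["bada", "baku", "daku"])
```

In segmentation research, dips in this probability are often taken to mark likely word boundaries, since transitions within a word tend to be more predictable than transitions across words.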
CDP++.Italian: Modelling Sublexical and Supralexical Inconsistency in a Shallow Orthography

PubMed Central

Perry, Conrad; Ziegler, Johannes C.; Zorzi, Marco

2014-01-01

Most models of reading aloud have been constructed to explain data in relatively complex orthographies like English and French. Here, we created an Italian version of the Connectionist Dual Process Model of Reading Aloud (CDP++) to examine the extent to which the model could predict data in a language which has relatively simple orthography-phonology relationships but is relatively complex at a suprasegmental (word stress) level. We show that the model exhibits good quantitative performance and accounts for key phenomena observed in naming studies, including some apparently contradictory findings. These effects include stress regularity and stress consistency, both of which have been especially important in studies of word recognition and reading aloud in Italian. Overall, the results of the model compare favourably to an alternative connectionist model that can learn non-linear spelling-to-sound mappings. This suggests that CDP++ is currently the leading computational model of reading aloud in Italian, and that its simple linear learning mechanism adequately captures the statistical regularities of the spelling-to-sound mapping both at the segmental and supra-segmental levels. PMID:24740261
Primary phonological planning units in spoken word production are language-specific: Evidence from an ERP study.

PubMed

Wang, Jie; Wong, Andus Wing-Kuen; Wang, Suiping; Chen, Hsuan-Chih

2017-07-19

It is widely acknowledged in Germanic languages that segments are the primary planning units at the phonological encoding stage of spoken word production. Mixed results, however, have been found in Chinese, and it is still unclear what roles syllables and segments play in planning Chinese spoken word production. In the current study, participants were asked to first prepare and later produce disyllabic Mandarin words upon picture prompts and a response cue while electroencephalogram (EEG) signals were recorded. Each two consecutive pictures implicitly formed a pair of prime and target, whose names shared the same word-initial atonal syllable or the same word-initial segments, or were unrelated in the control conditions. Only syllable repetition induced significant effects on event-related brain potentials (ERPs) after target onset: a widely distributed positivity in the 200- to 400-ms interval and an anterior positivity in the 400- to 600-ms interval. We interpret these to reflect syllable-size representations at the phonological encoding and phonetic encoding stages. Our results provide the first electrophysiological evidence for the distinct role of syllables in producing Mandarin spoken words, supporting a language specificity hypothesis about the primary phonological units in spoken word production.

Cascaded Segmentation-Detection Networks for Word-Level Text Spotting.

PubMed

Qin, Siyang; Manduchi, Roberto

2017-11-01

We introduce an algorithm for word-level text spotting that is able to accurately and reliably determine the bounding regions of individual words of text "in the wild". Our system is formed by the cascade of two convolutional neural networks. The first network is fully convolutional and is in charge of detecting areas containing text. This results in a very reliable but possibly inaccurate segmentation of the input image. The second network (inspired by the popular YOLO architecture) analyzes each segment produced in the first stage, and predicts oriented rectangular regions containing individual words. No post-processing (e.g. text line grouping) is necessary. With execution time of 450 ms for a 1000 × 560 image on a Titan X GPU, our system achieves good performance on the ICDAR 2013, 2015 benchmarks [2], [1].
Recognition of speaker-dependent continuous speech with KEAL

NASA Astrophysics Data System (ADS)

Mercier, G.; Bigorgne, D.; Miclet, L.; Le Guennec, L.; Querre, M.

1989-04-01

A description of the speaker-dependent continuous speech recognition system KEAL is given. An unknown utterance is recognized by means of the following procedures: acoustic analysis, phonetic segmentation and identification, word and sentence analysis. The combination of feature-based, speaker-independent coarse phonetic segmentation with speaker-dependent statistical classification techniques is one of the main design features of the acoustic-phonetic decoder. The lexical access component is essentially based on a statistical dynamic programming technique which aims at matching a phonemic lexical entry containing various phonological forms, against a phonetic lattice. Sentence recognition is achieved by use of a context-free grammar and a parsing algorithm derived from Earley's parser. A speaker adaptation module allows some of the system parameters to be adjusted by matching known utterances with their acoustical representation. The task to be performed, described by its vocabulary and its grammar, is given as a parameter of the system. Continuously spoken sentences extracted from a 'pseudo-Logo' language are analyzed and results are presented.
The interaction between specificity and linguistic contrast

NASA Astrophysics Data System (ADS)

Nielsen, Kuniko

2005-09-01

Previous studies have shown listeners' ability to remember fine phonetic details [e.g., Mullennix et al., 1989], providing support for the episodic view of speech perception. The imitation paradigm [Goldinger, 1998, Shockley et al., 2004], in which subjects' speech is compared before and after they are exposed to target speech (= study phase) has shown that subjects shift their production in the direction of the target. Our earlier results [Nielsen, 2005] showed that the imitation effect for extended VOT was generalized to new stimuli as well as to a new segment, suggesting that the locus of the imitation effect can be smaller than individual words or segments. The current study aims to further investigate how experienced speech input interacts with linguistic representations, by testing whether the imitation effect is observed when the modeled stimuli have reduced VOT (which could introduce linguistic ambiguity). In other words, do speakers imitate and generalize shorter VOT even if the change might impair linguistic contrasts? To address this question, the study phase includes words with initial /p/ with reduced VOT, while the pre- and post-study production list includes (1) the modeled words, (2) the modeled segments /p/ in new words, and (3) the new segment /k/.
Word Family Size and French-Speaking Children's Segmentation of Existing Compounds

ERIC Educational Resources Information Center

Nicoladis, Elena; Krott, Andrea

2007-01-01

The family size of the constituents of compound words, or the number of compounds sharing the constituents, affects English-speaking children's compound segmentation. This finding is consistent with a usage-based theory of language acquisition, whereby children learn abstract underlying linguistic structure through their experience with particular…
Multidimensional Assessment of Phonological Similarity within and between Children

ERIC Educational Resources Information Center

Ingram, David; Dubasik, Virginia L.

2011-01-01

Multidimensional analysis involves moving away from one-dimensional analyses such as most articulation tests to comprehensive analyses involving levels of phonological information from the word level down to segments. This article outlines one such approach that looks at four levels from words to segments, using nine phonological measures. It also…
Possible-Word Constraints in Cantonese Speech Segmentation

ERIC Educational Resources Information Center

Yip, Michael C. W.

2004-01-01

A Cantonese syllable-spotting experiment was conducted to examine whether the Possible-Word Constraint (PWC), proposed by Norris, McQueen, Cutler, and Butterfield (1997), can apply in Cantonese speech segmentation. In the experiment, listeners were asked to spot out the target Cantonese syllable from a series of nonsense sound strings. Results…
Visual speech segmentation: using facial cues to locate word boundaries in continuous speech

PubMed Central

Mitchel, Aaron D.; Weiss, Daniel J.

2014-01-01

Speech is typically a multimodal phenomenon, yet few studies have focused on the exclusive contributions of visual cues to language acquisition. To address this gap, we investigated whether visual prosodic information can facilitate speech segmentation. Previous research has demonstrated that language learners can use lexical stress and pitch cues to segment speech and that learners can extract this information from talking faces. Thus, we created an artificial speech stream that contained minimal segmentation cues and paired it with two synchronous facial displays in which visual prosody was either informative or uninformative for identifying word boundaries. Across three familiarisation conditions (audio stream alone, facial streams alone, and paired audiovisual), learning occurred only when the facial displays were informative to word boundaries, suggesting that facial cues can help learners solve the early challenges of language acquisition. PMID:25018577
Preparing novice teachers to develop basic reading and spelling skills in children.

PubMed

Spear-Swerling, Louise; Brucker, Pamela Owen

2004-12-01

This study examined the word-structure knowledge of novice teachers and the progress of children tutored by a subgroup of the teachers. Teachers' word-structure knowledge was assessed using three tasks: graphophonemic segmentation, classification of pseudowords by syllable type, and classification of real words as phonetically regular or irregular. Tutored children were assessed on several measures of basic reading and spelling skills. Novice teachers who received word-structure instruction outperformed a comparison group of teachers in word-structure knowledge at post-test. Tutored children improved significantly from pre-test to post-test on all assessments. Teachers' post-test knowledge on the graphophonemic segmentation and irregular words tasks correlated significantly with tutored children's progress in decoding phonetically regular words; error analyses indicated links between teachers' patterns of word-structure knowledge and children's patterns of decoding progress. The study suggests that word-structure knowledge is important to effective teaching of word decoding and underscores the need to include this information in teacher preparation.
The role of character positional frequency on Chinese word learning during natural reading.

PubMed

Liang, Feifei; Blythe, Hazel I; Bai, Xuejun; Yan, Guoli; Li, Xin; Zang, Chuanli; Liversedge, Simon P

2017-01-01

Readers' eye movements were recorded to examine the role of character positional frequency on Chinese lexical acquisition during reading and its possible modulation by word spacing. In Experiment 1, three types of pseudowords were constructed based on each character's positional frequency, providing congruent, incongruent, and no positional word segmentation information. Each pseudoword was embedded into two sets of sentences, for the learning and the test phases. In the learning phase, half the participants read sentences in word-spaced format, and half in unspaced format. In the test phase, all participants read sentences in unspaced format. The results showed an inhibitory effect of character positional frequency upon the efficiency of word learning when processing incongruent pseudowords in both the learning and test phases, and also showed a facilitatory effect of word spacing in the learning phase, but not at test. Most importantly, these two characteristics exerted independent influences on word segmentation. In Experiment 2, three analogous types of pseudowords were created whilst controlling for orthographic neighborhood size. The results of the two experiments were consistent, except that the effect of character positional frequency was absent in the test phase in Experiment 2. We argue that the positional frequency of a word's constituent characters may influence the character-to-word assignment in a process that likely incorporates both lexical segmentation and identification.
The use of phrase-level prosodic information in lexical segmentation: evidence from word-spotting experiments in Korean.

PubMed

Kim, Sahyang; Cho, Taehong

2009-05-01

This study investigated the role of phrase-level prosodic boundary information in word segmentation in Korean with two word-spotting experiments. In experiment 1, it was found that intonational cues alone helped listeners with lexical segmentation. Listeners paid more attention to local intonational cues (...H#L...) across the prosodic boundary than the intonational information within a prosodic phrase. The results imply that intonation patterns with high frequency are used, though not exclusively, in lexical segmentation. In experiment 2, final lengthening was added to see how multiple prosodic cues influence lexical segmentation. The results showed that listeners did not necessarily benefit from the presence of both intonational and final lengthening cues: Their performance was improved only when intonational information contained infrequent tonal patterns for boundary marking, showing only partially cumulative effects of prosodic cues. When the intonational information was optimal (frequent) for boundary marking, however, poorer performance was observed with final lengthening. This is arguably because the phrase-initial segmental allophonic cues for the accentual phrase were not matched with the prosodic cues for the intonational phrase. It is proposed that the asymmetrical use of multiple cues was due to interaction between prosodic and segmental information that are computed in parallel in lexical segmentation.
Statistical Learning is Related to Early Literacy-Related Skills

PubMed Central

Spencer, Mercedes; Kaschak, Michael P.; Jones, John L.; Lonigan, Christopher J.

2015-01-01

It has been demonstrated that statistical learning, or the ability to use statistical information to learn the structure of one’s environment, plays a role in young children’s acquisition of linguistic knowledge. Although most research on statistical learning has focused on language acquisition processes, such as the segmentation of words from fluent speech and the learning of syntactic structure, some recent studies have explored the extent to which individual differences in statistical learning are related to literacy-relevant knowledge and skills. The present study extends this literature by investigating the relations between two measures of statistical learning and multiple measures of skills that are critical to the development of literacy (oral language, vocabulary knowledge, and phonological processing) within a single model. Our sample included a total of 553 typically developing children from prekindergarten through second grade. Structural equation modeling revealed that statistical learning accounted for a unique portion of the variance in these literacy-related skills. Practical implications for instruction and assessment are discussed. PMID:26478658
Word Spotting.

ERIC Educational Resources Information Center

McQueen, James

1996-01-01

Summarizes the use of word-spotting in psycholinguistic research. Notes that listeners hear a list of nonsense words, some of which contain embedded real words, and they detect those embedded words, a task designed to study the segmentation of continuous speech. Describes the task and summarizes its advantages and disadvantages. (12 references)…
Metallic Material Image Segmentation by using 3D Grain Structure Consistency and Intra/Inter-Grain Model Information

DTIC Science & Technology

2015-01-05

Wang. KinWrite: Handwriting-Based Authentication Using Kinect, Annual Network & Distributed System Security Symposium (NDSS), San Diego, CA, 2013 21...the large variation of different handwriting styles, neighboring characters within a word are usually connected, and we may need to segment a word
How Transitional Probabilities and the Edge Effect Contribute to Listeners' Phonological Bootstrapping Success

ERIC Educational Resources Information Center

Sohail, Juwairia; Johnson, Elizabeth K.

2016-01-01

Much of what we know about the development of listeners' word segmentation strategies originates from the artificial language-learning literature. However, many artificial speech streams designed to study word segmentation lack a salient cue found in all natural languages: utterance boundaries. In this study, participants listened to a…
Modeling the Control of Phonological Encoding in Bilingual Speakers

ERIC Educational Resources Information Center

Roelofs, Ardi; Verhoef, Kim

2006-01-01

Phonological encoding is the process by which speakers retrieve phonemic segments for morphemes from memory and use the segments to assemble phonological representations of words to be spoken. When conversing in one language, bilingual speakers have to resist the temptation of encoding word forms using the phonological rules and representations of…
Sleep-Driven Computations in Speech Processing

PubMed Central

Frost, Rebecca L. A.; Monaghan, Padraic

2017-01-01

Acquiring language requires segmenting speech into individual words, and abstracting over those words to discover grammatical structure. However, these tasks can be conflicting: on the one hand requiring memorisation of precise sequences that occur in speech, and on the other requiring a flexible reconstruction of these sequences to determine the grammar. Here, we examine whether speech segmentation and generalisation of grammar can occur simultaneously, with the conflicting requirements for these tasks being overcome by sleep-related consolidation. After exposure to an artificial language comprising words containing non-adjacent dependencies, participants underwent periods of consolidation involving either sleep or wake. Participants who slept before testing demonstrated a sustained boost to word learning and a short-term improvement to grammatical generalisation of the non-adjacencies, with improvements after sleep outweighing gains seen after an equal period of wake. Thus, we propose that sleep may facilitate processing for these conflicting tasks in language acquisition, but with enhanced benefits for speech segmentation. PMID:28056104
Sleep-Driven Computations in Speech Processing.

PubMed

Frost, Rebecca L A; Monaghan, Padraic

2017-01-01

Acquiring language requires segmenting speech into individual words, and abstracting over those words to discover grammatical structure. However, these tasks can be conflicting: on the one hand requiring memorisation of precise sequences that occur in speech, and on the other requiring a flexible reconstruction of these sequences to determine the grammar. Here, we examine whether speech segmentation and generalisation of grammar can occur simultaneously, with the conflicting requirements for these tasks being overcome by sleep-related consolidation. After exposure to an artificial language comprising words containing non-adjacent dependencies, participants underwent periods of consolidation involving either sleep or wake. Participants who slept before testing demonstrated a sustained boost to word learning and a short-term improvement to grammatical generalisation of the non-adjacencies, with improvements after sleep outweighing gains seen after an equal period of wake. Thus, we propose that sleep may facilitate processing for these conflicting tasks in language acquisition, but with enhanced benefits for speech segmentation.
The functional unit of Japanese word naming: evidence from masked priming.

PubMed

Verdonschot, Rinus G; Kiyama, Sachiko; Tamaoka, Katsuo; Kinoshita, Sachiko; La Heij, Wido; Schiller, Niels O

2011-11-01

Theories of language production generally describe the segment as the basic unit in phonological encoding (e.g., Dell, 1988; Levelt, Roelofs, & Meyer, 1999). However, there is also evidence that such a unit might be language specific. Chen, Chen, and Dell (2002), for instance, found no effect of single segments when using a preparation paradigm. To shed more light on the functional unit of phonological encoding in Japanese, a language often described as being mora based, we report the results of 4 experiments using word reading tasks and masked priming. Experiment 1 demonstrated using Japanese kana script that primes, which overlapped in the whole mora with target words, sped up word reading latencies but not when just the onset overlapped. Experiments 2 and 3 investigated a possible role of script by using combinations of romaji (Romanized Japanese) and hiragana; again, facilitation effects were found only when the whole mora and not the onset segment overlapped. Experiment 4 distinguished mora priming from syllable priming and revealed that the mora priming effects obtained in the first 3 experiments are also obtained when a mora is part of a syllable. Again, no priming effect was found for single segments. Our findings suggest that the mora and not the segment (phoneme) is the basic functional phonological unit in Japanese language production planning.
Overnight lexical consolidation revealed by speech segmentation.

PubMed

Dumay, Nicolas; Gaskell, M Gareth

2012-04-01

Two experiments explored the consolidation of spoken words, and assessed whether post-sleep novel competitor effects truly reflect engagement of these novel words in competition for lexical segmentation. Two types of competitor relationships were contrasted: the onset-aligned case (such as "frenzylk"), where the novel word is a close variant of the existing word: they start at the same time point and overlap on most of their segments; and the fully embedding case (such as "lirmucktoze"), where the existing word corresponds to a smaller embedded portion of its novel competitor and is thus less noticeable. Experiment 1 (pause detection) revealed a similar performance for both cases, with no competitor effect immediately after exposure, but significant inhibition after 24 h and seven days. Experiment 2 (word spotting) produced exactly the same pattern; however, as is the case with existing word carriers (cf. McQueen, Norris, & Cutler, 1994), the inhibition was much stronger for fully embedded than for onset-aligned targets (e.g., "lirmuckt" vs. "frenzyl"). Meanwhile, explicit measures of learning, i.e., free recall and recognition, improved over time. These results cannot be explained by either consolidation of episodic traces or acquisition of new phonological/dialectal variants. We argue instead that they reflect a general trait of vocabulary learning and consolidation. Copyright © 2011 Elsevier B.V. All rights reserved.
A role for the developing lexicon in phonetic category acquisition

PubMed Central

Feldman, Naomi H.; Griffiths, Thomas L.; Goldwater, Sharon; Morgan, James L.

2013-01-01

Infants segment words from fluent speech during the same period when they are learning phonetic categories, yet accounts of phonetic category acquisition typically ignore information about the words in which sounds appear. We use a Bayesian model to illustrate how feedback from segmented words might constrain phonetic category learning by providing information about which sounds occur together in words. Simulations demonstrate that word-level information can successfully disambiguate overlapping English vowel categories. Learning patterns in the model are shown to parallel human behavior from artificial language learning tasks. These findings point to a central role for the developing lexicon in phonetic category acquisition and provide a framework for incorporating top-down constraints into models of category learning. PMID:24219848

Measuring Severity of Involvement in Speech Delay: Segmental and Whole-Word Measures

ERIC Educational Resources Information Center

Flipsen, Peter, Jr.; Hammer, Jill B.; Yost, Kathryn M.

2005-01-01

Purpose: This study examined whether any of a series of segmental and whole-word measures of articulatory competence captured more of the variance in impressionistic ratings of severity of involvement in speech delay. It also examined whether knowing the age of the child affected severity ratings. Method: Ten very experienced speech-language…
A Computational Model of Word Segmentation from Continuous Speech Using Transitional Probabilities of Atomic Acoustic Events

ERIC Educational Resources Information Center

Rasanen, Okko

2011-01-01

Word segmentation from continuous speech is a difficult task that is faced by human infants when they start to learn their native language. Several studies indicate that infants might use several different cues to solve this problem, including intonation, linguistic stress, and transitional probabilities between subsequent speech sounds. In this…
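The transitional-probability cue mentioned in this record can be illustrated with a minimal sketch. It is not the model described in the (truncated) abstract, which works on atomic acoustic events; it assumes the input is already a list of syllable labels and that a word boundary is placed wherever the forward transitional probability falls below a chosen threshold. The toy lexicon, the word order, and both function names are hypothetical.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Forward TP(x -> y) = count(x immediately followed by y) / count(x with a successor)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}


def segment_at_tp_dips(syllables, threshold):
    """Place a word boundary wherever the TP between neighbouring syllables is below threshold."""
    tps = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for x, y in zip(syllables, syllables[1:]):
        if tps[(x, y)] < threshold:
            words.append("".join(current))
            current = []
        current.append(y)
    words.append("".join(current))
    return words


# Hypothetical three-word artificial language, concatenated without pauses.
lexicon = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["da", "ko", "ti"]}
order = "ABCACBABCBAC"
stream = [syllable for word in order for syllable in lexicon[word]]

segmented = segment_at_tp_dips(stream, threshold=0.9)
```

In this toy stream every within-word transition has a probability of 1.0 and every between-word transition is well below the threshold, so the boundaries fall exactly at the word edges. Natural speech is far less tidy, which is one reason the literature listed here weighs statistical cues against other cues.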
How African American English-Speaking First Graders Segment and Rhyme Words and Nonwords with Final Consonant Clusters

ERIC Educational Resources Information Center

Shollenbarger, Amy J.; Robinson, Gregory C.; Taran, Valentina; Choi, Seo-eun

2017-01-01

Purpose: This study explored how typically developing 1st grade African American English (AAE) speakers differ from mainstream American English (MAE) speakers in the completion of 2 common phonological awareness tasks (rhyming and phoneme segmentation) when the stimulus items were consonant-vowel-consonant-consonant (CVCC) words and nonwords.…
The Relation between Order of Acquisition, Segmental Frequency and Function: The Case of Word-Initial Consonants in Dutch

ERIC Educational Resources Information Center

van Severen, Lieve; Gillis, Joris J. M.; Molemans, Inge; van den Berg, Renate; De Maeyer, Sven; Gillis, Steven

2013-01-01

The impact of input frequency (IF) and functional load (FL) of segments in the ambient language on the acquisition order of word-initial consonants is investigated. Several definitions of IF/FL are compared and implemented. The impact of IF/FL and their components are computed using a longitudinal corpus of interactions between thirty…
Lexical Stress and Phonetic Processing in Word Learning in 20- to 24-Month-Old English-Learning Children

ERIC Educational Resources Information Center

Floccia, Caroline; Nazzi, Thierry; Austin, Keith; Arreckx, Frederique; Goslin, Jeremy

2011-01-01

To investigate the interaction between segmental and supra-segmental stress-related information in early word learning, two experiments were conducted with 20- to 24-month-old English-learning children. In an adaptation of the object categorization study designed by Nazzi and Gopnik (2001), children were presented with pairs of novel objects whose…
Phonotactic Probability Effects in Children Who Stutter

PubMed Central

Anderson, Julie D.; Byrd, Courtney T.

2008-01-01

Purpose: The purpose of this study was to examine the influence of phonotactic probability, the frequency of different sound segments and segment sequences, on the overall fluency with which words are produced by preschool children who stutter (CWS), as well as to determine whether it has an effect on the type of stuttered disfluency produced. Method: A 500+ word language sample was obtained from 19 CWS. Each stuttered word was randomly paired with a fluently produced word that closely matched it in grammatical class, word length, familiarity, word and neighborhood frequency, and neighborhood density. Phonotactic probability values were obtained for the stuttered and fluent words from an online database. Results: Phonotactic probability did not have a significant influence on the overall susceptibility of words to stuttering, but it did impact the type of stuttered disfluency produced. Specifically, single-syllable word repetitions were significantly lower in phonotactic probability than fluently produced words, as well as part-word repetitions and sound prolongations. Conclusions: In general, the differential impact of phonotactic probability on the type of stuttering-like disfluency produced by young CWS provides some support for the notion that different disfluency types may originate in the disruption of different levels of processing. PMID:18658056
Segmentation of Written Words in French

ERIC Educational Resources Information Center

Chetail, Fabienne; Content, Alain

2013-01-01

Syllabification of spoken words has been largely used to define syllabic properties of written words, such as the number of syllables or syllabic boundaries. By contrast, some authors proposed that the functional structure of written words stems from visuo-orthographic features rather than from the transposition of phonological structure into the…
Partial Membership Latent Dirichlet Allocation for Soft Image Segmentation.

PubMed

Chen, Chao; Zare, Alina; Trinh, Huy N; Omotara, Gbenga O; Cobb, James Tory; Lagaunne, Timotius A

2017-12-01

Topic models [e.g., probabilistic latent semantic analysis, latent Dirichlet allocation (LDA), and supervised LDA] have been widely used for segmenting imagery. However, these models are confined to crisp segmentation, forcing a visual word (i.e., an image patch) to belong to one and only one topic. Yet, there are many images in which some regions cannot be assigned a crisp categorical label (e.g., transition regions between a foggy sky and the ground or between sand and water at a beach). In these cases, a visual word is best represented with partial memberships across multiple topics. To address this, we present a partial membership LDA (PM-LDA) model and an associated parameter estimation algorithm. This model can be useful for imagery, where a visual word may be a mixture of multiple topics. Experimental results on visual and sonar imagery show that PM-LDA can produce both crisp and soft semantic image segmentations; a capability previous topic modeling methods do not have.
Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research.

PubMed

Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif

2016-03-11

Document analysis tasks such as pattern recognition, word spotting or segmentation, require comprehensive databases for training and validation. Not only variations in writing style but also the used list of words is of importance in the case that training samples should reflect the input of a specific area of application. However, generation of training samples is expensive in the sense of manpower and time, particularly if complete text pages including complex ground truth are required. This is why there is a lack of such databases, especially for Arabic, the second most popular language. However, Arabic handwriting recognition involves different preprocessing, segmentation and recognition methods. Each requires particular ground truth or samples to enable optimal training and validation, which are often not covered by the currently available databases. To overcome this issue, we propose a system that synthesizes Arabic handwritten words and text pages and generates corresponding detailed ground truth. We use these syntheses to validate a new, segmentation based system that recognizes handwritten Arabic words. We found that a modification of the Active Shape Model based character classifiers that we proposed earlier improves the word recognition accuracy. Further improvements are achieved by using a vocabulary of the 50,000 most common Arabic words for error correction.
Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research

PubMed Central

Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-etriby, Sherif

2016-01-01

Document analysis tasks such as pattern recognition, word spotting or segmentation, require comprehensive databases for training and validation. Not only variations in writing style but also the used list of words is of importance in the case that training samples should reflect the input of a specific area of application. However, generation of training samples is expensive in the sense of manpower and time, particularly if complete text pages including complex ground truth are required. This is why there is a lack of such databases, especially for Arabic, the second most popular language. However, Arabic handwriting recognition involves different preprocessing, segmentation and recognition methods. Each requires particular ground truth or samples to enable optimal training and validation, which are often not covered by the currently available databases. To overcome this issue, we propose a system that synthesizes Arabic handwritten words and text pages and generates corresponding detailed ground truth. We use these syntheses to validate a new, segmentation based system that recognizes handwritten Arabic words. We found that a modification of the Active Shape Model based character classifiers that we proposed earlier improves the word recognition accuracy. Further improvements are achieved by using a vocabulary of the 50,000 most common Arabic words for error correction. PMID:26978368
Segmental transition of the first syllables of words in Japanese children who stutter: Comparison between word and sentence production.

PubMed

Matsumoto, Sachiyo; Ito, Tomohiko

2016-01-01

Matsumoto-Shimamori, Ito, Fukuda, & Fukuda (2011) proposed the hypothesis that in Japanese, the transition from the core vowels (i.e. syllable nucleus) of the first syllables of words to the following segments affected the occurrence of stuttering. Moreover, in this transition position, an inter-syllabic transition precipitated more stuttering than an intra-syllabic one (Shimamori & Ito, 2007, 2008). However, these studies have only used word production tasks. The purpose of this study was to investigate whether the same results could be obtained in sentence production tasks. Participants were 28 Japanese school-age children who stutter, ranging in age from 7;3 to 12;7. The frequency of stuttering on words with an inter-syllabic transition was significantly higher than on those having an intra-syllabic transition, not only in isolated words but in the first words of sentences. These results suggested that Matsumoto et al.'s hypothesis could be applicable to the results of sentence production tasks.
Word-level information influences phonetic learning in adults and infants

PubMed Central

Feldman, Naomi H.; Myers, Emily B.; White, Katherine S.; Griffiths, Thomas L.; Morgan, James L.

2013-01-01

Infants begin to segment words from fluent speech during the same time period that they learn phonetic categories. Segmented words can provide a potentially useful cue for phonetic learning, yet accounts of phonetic category acquisition typically ignore the contexts in which sounds appear. We present two experiments to show that, contrary to the assumption that phonetic learning occurs in isolation, learners are sensitive to the words in which sounds appear and can use this information to constrain their interpretation of phonetic variability. Experiment 1 shows that adults use word-level information in a phonetic category learning task, assigning acoustically similar vowels to different categories more often when those sounds consistently appear in different words. Experiment 2 demonstrates that eight-month-old infants similarly pay attention to word-level information and that this information affects how they treat phonetic contrasts. These findings suggest that phonetic category learning is a rich, interactive process that takes advantage of many different types of cues that are present in the input. PMID:23562941
Contribution of Phonemic Segmentation Instruction with Letters and Articulation Pictures to Word Reading and Spelling in Beginners

ERIC Educational Resources Information Center

Boyer, Nancy; Ehri, Linnea C.

2011-01-01

English-speaking preschoolers who knew letters but were nonreaders (M = 4 years 9 months; n = 60) were taught to segment consonant-vowel (CV), VC, and CVC words into phonemes either with letters and pictures of articulatory gestures (the LPA condition) or with letters only (the LO condition). A control group received no treatment. Both trained…
Automatic measurement and representation of prosodic features

NASA Astrophysics Data System (ADS)

Ying, Goangshiuan Shawn

Effective measurement and representation of prosodic features of the acoustic signal for use in automatic speech recognition and understanding systems is the goal of this work. Prosodic features (stress, duration, and intonation) are variations of the acoustic signal whose domains are beyond the boundaries of each individual phonetic segment. Listeners perceive prosodic features through a complex combination of acoustic correlates such as intensity, duration, and fundamental frequency (F0). We have developed new tools to measure F0 and intensity features. We apply a probabilistic global error correction routine to an Average Magnitude Difference Function (AMDF) pitch detector. A new short-term frequency-domain Teager energy algorithm is used to measure the energy of a speech signal. We have conducted a series of experiments performing lexical stress detection on words in continuous English speech from two speech corpora. We have experimented with two different approaches, a segment-based approach and a rhythm unit-based approach, in lexical stress detection. The first approach uses pattern recognition with energy- and duration-based measurements as features to build Bayesian classifiers to detect the stress level of a vowel segment. In the second approach we define a rhythm unit and use only the F0-based measurement and a scoring system to determine the stressed segment in the rhythm unit. A duration-based segmentation routine was developed to break polysyllabic words into rhythm units. The long-term goal of this work is to develop a system that can effectively detect the stress pattern for each word in continuous speech utterances. Stress information will be integrated as a constraint for pruning the word hypotheses in a word recognition system based on hidden Markov models.
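For readers unfamiliar with the Average Magnitude Difference Function named in this record, the following is a minimal sketch of a plain AMDF pitch estimate on a single frame. It does not reproduce the probabilistic global error correction routine or the Teager energy algorithm described in the abstract. The function names, the F0 search range, and the synthetic test tone are assumptions made for illustration.

```python
import math


def amdf(signal, lag):
    """Average magnitude difference between the signal and a copy shifted by `lag` samples."""
    n = len(signal) - lag
    return sum(abs(signal[i] - signal[i + lag]) for i in range(n)) / n


def estimate_f0(signal, sample_rate, f0_min=110.0, f0_max=400.0):
    """Return the F0 whose period (in samples) minimises the AMDF within the search range."""
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag = min(range(lag_min, lag_max + 1), key=lambda lag: amdf(signal, lag))
    return sample_rate / best_lag


# Synthetic 200 Hz tone sampled at 8 kHz (period of 40 samples), 0.1 s long.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(800)]

f0 = estimate_f0(tone, sample_rate)
```

The search range is kept narrow on purpose: a periodic signal also produces AMDF minima at multiples of its period, so a wider range can yield octave errors. Errors of this kind are a common motivation for adding a correction stage on top of a raw detector.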
English Listeners Use Suprasegmental Cues to Lexical Stress Early During Spoken-Word Recognition

PubMed Central

Poellmann, Katja; Kong, Ying-Yee

2017-01-01

Purpose: We used an eye-tracking technique to investigate whether English listeners use suprasegmental information about lexical stress to speed up the recognition of spoken words in English. Method: In a visual world paradigm, 24 young English listeners followed spoken instructions to choose 1 of 4 printed referents on a computer screen (e.g., “Click on the word admiral”). Displays contained a critical pair of words (e.g., ˈadmiral–ˌadmiˈration) that were segmentally identical for their first 2 syllables but differed suprasegmentally in their 1st syllable: One word began with primary lexical stress, and the other began with secondary lexical stress. All words had phrase-level prominence. Listeners' relative proportion of eye fixations on these words indicated their ability to differentiate them over time. Results: Before critical word pairs became segmentally distinguishable in their 3rd syllables, participants fixated target words more than their stress competitors, but only if targets had initial primary lexical stress. The degree to which stress competitors were fixated was independent of their stress pattern. Conclusions: Suprasegmental information about lexical stress modulates the time course of spoken-word recognition. Specifically, suprasegmental information on the primary-stressed syllable of words with phrase-level prominence helps in distinguishing the word from phonological competitors with secondary lexical stress. PMID:28056135
Whole Word Measures in Bilingual Children with Speech Sound Disorders

ERIC Educational Resources Information Center

Burrows, Lauren; Goldstein, Brian A.

2010-01-01

Phonological acquisition traditionally has been measured using constructs that focus on segments rather than the whole words. Findings from recent research have suggested whole-word productions be evaluated using measures such as phonological mean length of utterance (pMLU) and the proportion of whole-word proximity (PWP). These measures have been…
An Efficient Pipeline for Abdomen Segmentation in CT Images.

PubMed

Koyuncu, Hasan; Ceylan, Rahime; Sivri, Mesut; Erdogan, Hasan

2018-04-01

Computed tomography (CT) scans usually include some disadvantages due to the nature of the imaging procedure, and these handicaps prevent accurate abdomen segmentation. Discontinuous abdomen edges, bed section of CT, patient information, closeness between the edges of the abdomen and CT, poor contrast, and a narrow histogram can be regarded as the most important handicaps that occur in abdominal CT scans. Currently, one or more handicaps can arise and prevent technicians obtaining abdomen images through simple segmentation techniques. In other words, CT scans can include the bed section of CT, a patient's diagnostic information, low-quality abdomen edges, low-level contrast, and narrow histogram, all in one scan. These phenomena constitute a challenge, and an efficient pipeline that is unaffected by handicaps is required. In addition, analysis such as segmentation, feature selection, and classification has meaning for a real-time diagnosis system in cases where the abdomen section is directly used with a specific size. A statistical pipeline is designed in this study that is unaffected by the handicaps mentioned above. Intensity-based approaches, morphological processes, and histogram-based procedures are utilized to design an efficient structure. Performance evaluation is realized in experiments on 58 CT images (16 training, 16 test, and 26 validation) that include the abdomen and one or more disadvantage(s). The first part of the data (16 training images) is used to detect the pipeline's optimum parameters, while the second and third parts are utilized to evaluate and to confirm the segmentation performance. The segmentation results are presented as the means of six performance metrics. Thus, the proposed method achieves remarkable average rates for training/test/validation of 98.95/99.36/99.57% (Jaccard), 99.47/99.67/99.79% (Dice), 100/99.91/99.91% (sensitivity), 98.47/99.23/99.85% (specificity), 99.38/99.63/99.87% (classification accuracy), and 98.98/99.45/99.66% (precision). In summary, a statistical pipeline performing the task of abdomen segmentation is achieved that is not affected by the disadvantages, and the most detailed abdomen segmentation study is performed for use before organ and tumor segmentation, feature extraction, and classification.
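The six metrics reported in this record can all be derived from the four cells of a pixel-wise confusion matrix. The sketch below uses the standard definitions and assumes flat binary masks of equal length in which every denominator is non-zero. The function name and the toy masks are hypothetical, and the values bear no relation to the study's results.

```python
def overlap_metrics(predicted, truth):
    """Compute six standard segmentation metrics from two flat binary masks."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)          # foreground correctly found
    fp = sum(1 for p, t in pairs if p and not t)      # background marked as foreground
    fn = sum(1 for p, t in pairs if not p and t)      # foreground missed
    tn = sum(1 for p, t in pairs if not p and not t)  # background correctly rejected
    return {
        "jaccard": tp / (tp + fp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
    }


# Toy 8-pixel masks: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
predicted_mask = [1, 1, 1, 0, 0, 0, 1, 0]
truth_mask = [1, 1, 0, 0, 0, 0, 1, 1]

metrics = overlap_metrics(predicted_mask, truth_mask)
```

Jaccard and Dice are monotonically related (Dice = 2J / (1 + J)), so the two always rank segmentations in the same order. Reporting both is mainly a convenience for comparison with other studies.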
Evidence for Separate Tonal and Segmental Tiers in the Lexical Specification of Words: A Case Study of a Brain-Damaged Chinese Speaker

ERIC Educational Resources Information Center

Liang, Jie; van Heuven, Vincent J.

2004-01-01

We present an acoustic study of segmental and prosodic properties of words produced by a female speaker of Chinese with left-hemisphere brain damage. We measured the location of the point vowels /a, e, ə, i, y, o, u/ and determined their separation in the vowel plane, and their perceptual distinctivity. Similarly, the acoustic properties of the…
An eye movement based reading intervention in lexical and segmental readers with acquired dyslexia.

PubMed

Ablinger, Irene; von Heyden, Kerstin; Vorstius, Christian; Halm, Katja; Huber, Walter; Radach, Ralph

2014-01-01

Due to their brain damage, aphasic patients with acquired dyslexia often rely to a greater extent on lexical or segmental reading procedures. Thus, therapy intervention is mostly targeted on the more impaired reading strategy. In the present work we introduce a novel therapy approach based on real-time measurement of patients' eye movements as they attempt to read words. More specifically, an eye movement contingent technique of stepwise letter de-masking was used to support sequential reading, whereas fixation-dependent initial masking of non-central letters stimulated a lexical (parallel) reading strategy. Four lexical and four segmental readers with acquired central dyslexia received our intensive reading intervention. All participants showed remarkable improvements as evident in reduced total reading time, a reduced number of fixations per word and improved reading accuracy. Both types of intervention led to item-specific training effects in all subjects. A generalisation to untrained items was only found in segmental readers after the lexical training. Eye movement analyses were also used to compare word processing before and after therapy, indicating that all patients, with one exclusion, maintained their preferred reading strategy. However, in several cases the balance between sequential and lexical processing became less extreme, indicating a more effective individual interplay of both word processing routes.
Words in Puddles of Sound: Modelling Psycholinguistic Effects in Speech Segmentation

ERIC Educational Resources Information Center

Monaghan, Padraic; Christiansen, Morten H.

2010-01-01

There are numerous models of how speech segmentation may proceed in infants acquiring their first language. We present a framework for considering the relative merits and limitations of these various approaches. We then present a model of speech segmentation that aims to reveal important sources of information for speech segmentation, and to…

An Analysis of the Most Frequently Occurring Words in Spoken American English.

ERIC Educational Resources Information Center

Plant, Geoff

1999-01-01

A study analyzed frequency of occurrence of consonants, vowels, and diphthongs, syllabic structure of the words, and segmental structure of the 311 monosyllabic words among the 500 words that occur most frequently in English. Three manners of articulation accounted for nearly 75 percent of all consonant occurrences: stops, semi-vowels, and nasals.…
Acquisition of Malay word recognition skills: lessons from low-progress early readers.

PubMed

Lee, Lay Wah; Wheldall, Kevin

2011-02-01

Malay is a consistent alphabetic orthography with complex syllable structures. The focus of this research was to investigate word recognition performance in order to inform reading interventions for low-progress early readers. Forty-six Grade 1 students were sampled and 11 were identified as low-progress readers. The results indicated that both syllable awareness and phoneme blending were significant predictors of word recognition, suggesting that both syllable and phonemic grain-sizes are important in Malay word recognition. Item analysis revealed a hierarchical pattern of difficulty based on the syllable and the phonic structure of the words. Error analysis identified the sources of errors to be errors due to inefficient syllable segmentation, oversimplification of syllables, insufficient grapheme-phoneme knowledge and inefficient phonemic code assembly. Evidence also suggests that direct instruction in syllable segmentation, phonemic awareness and grapheme-phoneme correspondence is necessary for low-progress readers to acquire word recognition skills. Finally, a logical sequence to teach grapheme-phoneme decoding in Malay is suggested. Copyright © 2010 John Wiley & Sons, Ltd.
Word-initial rhotic clusters in typically developing children: European Portuguese.

PubMed

Ramalho, Ana Margarida; Freitas, M João

2018-01-01

Rhotic clusters are complex structures segmentally and prosodically and are frequently one of the last structures acquired by Portuguese-speaking children. This paper describes cross-sectional data for word-initial (WI) rhotic tap clusters in typically developing 3-, 4-, and 5-year-olds in Portugal. Additional information is provided on WI /l/ as a singleton and in clusters. A native speaker audio-recorded and transcribed single words in a story-telling task. Results for WI rhotic clusters show an age effect consistent with previous research on European Portuguese. Singleton /l/ was in advance of /l/-clusters as expected, but the tap clusters were in advance of the /l/-clusters, possibly reflecting the velarized characteristics of the lateral. The prosodic variables word stress and word length were relevant for the WI rhotic clusters: shorter words and stressed syllables showed higher accuracy. Finally, mismatches ('errors') mainly reflected negative structural constraints (deletion of C2 and epenthesis) rather than segmental constraints (substitutions).
Support for context effects on segmentation and segments depends on the context.

PubMed

Heffner, Christopher C; Newman, Rochelle S; Idsardi, William J

2017-04-01

Listeners must adapt to differences in speech rate across talkers and situations. Speech rate adaptation effects are strong for adjacent syllables (i.e., proximal syllables). For studies that have assessed adaptation effects on speech rate information more than one syllable removed from a point of ambiguity in speech (i.e., distal syllables), the difference in strength between different types of ambiguity is stark. Studies of word segmentation have shown large shifts in perception as a result of distal rate manipulations, while studies of segmental perception have shown only weak, or even nonexistent, effects. However, no study has standardized methods and materials to study context effects for both types of ambiguity simultaneously. Here, a set of sentences was created that differed as minimally as possible except for whether the sentences were ambiguous to the voicing of a consonant or ambiguous to the location of a word boundary. The sentences were then rate-modified to slow down the distal context speech rate to various extents, dependent on three different definitions of distal context that were adapted from previous experiments, along with a manipulation of proximal context to assess whether proximal effects were comparable across ambiguity types. The results indicate that the definition of distal influenced the extent of distal rate effects strongly for both segments and segmentation. They also establish the presence of distal rate effects on word-final segments for the first time. These results were replicated, with some caveats regarding the perception of individual segments, in an Internet-based sample recruited from Mechanical Turk.
Building a dictionary for genomes: Identification of presumptive regulatory sites by statistical analysis

PubMed Central

Bussemaker, Harmen J.; Li, Hao; Siggia, Eric D.

2000-01-01

The availability of complete genome sequences and mRNA expression data for all genes creates new opportunities and challenges for identifying DNA sequence motifs that control gene expression. An algorithm, “MobyDick,” is presented that decomposes a set of DNA sequences into the most probable dictionary of motifs or words. This method is applicable to any set of DNA sequences: for example, all upstream regions in a genome or all genes expressed under certain conditions. Identification of words is based on a probabilistic segmentation model in which the significance of longer words is deduced from the frequency of shorter ones of various lengths, eliminating the need for a separate set of reference data to define probabilities. We have built a dictionary with 1,200 words for the 6,000 upstream regulatory regions in the yeast genome; the 500 most significant words (some with as few as 10 copies in all of the upstream regions) match 114 of 443 experimentally determined sites (a significance level of 18 standard deviations). When analyzing all of the genes up-regulated during sporulation as a group, we find many motifs in addition to the few previously identified by analyzing the subclusters individually to the expression subclusters. Applying MobyDick to the genes derepressed when the general repressor Tup1 is deleted, we find known as well as putative binding sites for its regulatory partners. PMID:10944202
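The probabilistic segmentation idea in the abstract above can be illustrated with a short sketch. It is not the MobyDick implementation: it only shows how the likelihood of a sequence can be computed by summing over every way of cutting it into dictionary words, assuming each word is drawn independently with a fixed probability. The function name `sequence_likelihood` and the toy dictionary are hypothetical.

```python
from typing import Dict


def sequence_likelihood(seq: str, word_probs: Dict[str, float]) -> float:
    """Sum, over every segmentation of `seq` into dictionary words,
    of the product of the word probabilities (a forward recursion)."""
    max_len = max(len(w) for w in word_probs)
    # forward[i] = total probability of all segmentations of seq[:i]
    forward = [0.0] * (len(seq) + 1)
    forward[0] = 1.0
    for i in range(1, len(seq) + 1):
        # Try every dictionary word that could end at position i.
        for k in range(1, min(max_len, i) + 1):
            p = word_probs.get(seq[i - k:i])
            if p is not None:
                forward[i] += forward[i - k] * p
    return forward[len(seq)]


# Hypothetical toy dictionary: four single letters plus one longer "motif".
toy_dictionary = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.2, "TATA": 0.1}
likelihood = sequence_likelihood("GTATAC", toy_dictionary)
print(likelihood)
```

In this toy case the segmentation G + TATA + C contributes far more to the total than the letter-by-letter reading, which is the sense in which a longer word can be judged significant relative to the shorter ones it is built from.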
Development of A Two-Stage Procedure for the Automatic Recognition of Dysfluencies in the Speech of Children Who Stutter: I. Psychometric Procedures Appropriate for Selection of Training Material for Lexical Dysfluency Classifiers

PubMed Central

Howell, Peter; Sackin, Stevie; Glenn, Kazan

2007-01-01

This program of work is intended to develop automatic recognition procedures to locate and assess stuttered dysfluencies. This article and the following one together develop and test recognizers for repetitions and prolongations. The automatic recognizers classify the speech in two stages: in the first, the speech is segmented, and in the second, the segments are categorized. The units that are segmented are words. Here, assessments by human judges of the speech of 12 children who stutter are described using a corresponding procedure. The accuracy of word boundary placement across judges, categorization of the words as fluent, repetition or prolongation, and duration of the different fluency categories are reported. These measures allow reliable instances of repetitions and prolongations to be selected for training and assessing the recognizers in the subsequent paper. PMID:9328878
Unsupervised Word Spotting in Historical Handwritten Document Images using Document-oriented Local Features.

PubMed

Zagoris, Konstantinos; Pratikakis, Ioannis; Gatos, Basilis

2017-05-03

Word spotting strategies employed in historical handwritten documents face many challenges due to variation in the writing style and intense degradation. In this paper, a new method that permits effective word spotting in handwritten documents is presented. It relies upon document-oriented local features, which take into account information around representative keypoints, as well as a matching process that incorporates spatial context in a local proximity search, without using any training data. Experimental results on four historical handwritten datasets for two different scenarios (segmentation-based and segmentation-free) using standard evaluation measures show the improved performance achieved by the proposed methodology.
A Web-based interface to calculate phonotactic probability for words and nonwords in English

PubMed Central

VITEVITCH, MICHAEL S.; LUCE, PAUL A.

2008-01-01

Phonotactic probability refers to the frequency with which phonological segments and sequences of phonological segments occur in words in a given language. We describe one method of estimating phonotactic probabilities based on words in American English. These estimates of phonotactic probability have been used in a number of previous studies and are now being made available to other researchers via a Web-based interface. Instructions for using the interface, as well as details regarding how the measures were derived, are provided in the present article. The Phonotactic Probability Calculator can be accessed at http://www.people.ku.edu/~mvitevit/PhonoProbHome.html. PMID:15641436
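As a rough illustration of the two kinds of estimate such a calculator reports, the sketch below computes a positional segment average and a biphone average from a toy lexicon. It is a simplified assumption, not the published method: every word counts equally here, whereas the actual calculator's weighting and transcription scheme are described in the article itself. All names and the toy lexicon are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Word = Sequence[str]  # a word as a sequence of phoneme symbols


def build_counts(lexicon: List[Word]):
    """Count segments and biphones by position across the lexicon."""
    seg, seg_total = Counter(), Counter()
    bi, bi_total = Counter(), Counter()
    for word in lexicon:
        for pos, phone in enumerate(word):
            seg[(pos, phone)] += 1
            seg_total[pos] += 1
        for pos in range(len(word) - 1):
            bi[(pos, word[pos], word[pos + 1])] += 1
            bi_total[pos] += 1
    return seg, seg_total, bi, bi_total


def phonotactic_probability(item: Word, counts) -> Tuple[float, float]:
    """Return (positional segment average, biphone average) for `item`."""
    seg, seg_total, bi, bi_total = counts
    seg_probs = [
        seg[(pos, ph)] / seg_total[pos] if seg_total[pos] else 0.0
        for pos, ph in enumerate(item)
    ]
    bi_probs = [
        bi[(pos, item[pos], item[pos + 1])] / bi_total[pos] if bi_total[pos] else 0.0
        for pos in range(len(item) - 1)
    ]
    seg_avg = sum(seg_probs) / len(seg_probs) if seg_probs else 0.0
    bi_avg = sum(bi_probs) / len(bi_probs) if bi_probs else 0.0
    return seg_avg, bi_avg


# Hypothetical four-word lexicon in a made-up one-symbol-per-phoneme notation.
toy_lexicon = [("k", "a", "t"), ("k", "a", "p"), ("t", "a", "p"), ("p", "i", "t")]
counts = build_counts(toy_lexicon)
seg_avg, bi_avg = phonotactic_probability(("k", "i", "t"), counts)
print(seg_avg, bi_avg)
```

The nonword "kit" in this toy notation gets a fairly high segment average because each of its sounds is common in its position, but a low biphone average because the sequence k+i never occurs word-initially in the lexicon.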
Adults' Self-Directed Learning of an Artificial Lexicon: The Dynamics of Neighborhood Reorganization

ERIC Educational Resources Information Center

Bardhan, Neil Prodeep

2010-01-01

Artificial lexicons have previously been used to examine the time course of the learning and recognition of spoken words, the role of segment type in word learning, and the integration of context during spoken word recognition. However, in all of these studies the experimenter determined the frequency and order of the words to be learned. In three…
Multiple functional units in the preattentive segmentation of speech in Japanese: evidence from word illusions.

PubMed

Nakamura, Miyoko; Kolinsky, Régine

2014-12-01

We explored the functional units of speech segmentation in Japanese using dichotic presentation and a detection task requiring no intentional sublexical analysis. Indeed, illusory perception of a target word might result from preattentive migration of phonemes, morae, or syllables from one ear to the other. In Experiment 1, Japanese listeners detected targets presented in hiragana and/or kanji. Phoneme migrations did occur, suggesting that orthography-independent sublexical constituents play some role in segmentation. However, syllable and especially mora migrations were more numerous. This pattern of results was not observed in French speakers (Experiment 2), suggesting that it reflects native segmentation in Japanese. To control for the intervention of kanji representations (many words are written in kanji, and one kanji often corresponds to one syllable), in Experiment 3, Japanese listeners were presented with target loanwords that can be written only in katakana. Again, phoneme migrations occurred, while the first mora and syllable led to similar rates of illusory percepts. No migration occurred for the second, "special" mora (/J/ or /N/), probably because this constitutes the latter part of a heavy syllable. Overall, these findings suggest that multiple units, such as morae, syllables, and even phonemes, function independently of orthographic knowledge in Japanese preattentive speech segmentation.
Statistical learning of movement.

PubMed

Ongchoco, Joan Danielle Khonghun; Uddenberg, Stefan; Chun, Marvin M

2016-12-01

The environment is dynamic, but objects move in predictable and characteristic ways, whether they are a dancer in motion, or a bee buzzing around in flight. Sequences of movement are comprised of simpler motion trajectory elements chained together. But how do we know where one trajectory element ends and another begins, much like we parse words from continuous streams of speech? As a novel test of statistical learning, we explored the ability to parse continuous movement sequences into simpler element trajectories. Across four experiments, we showed that people can robustly parse such sequences from a continuous stream of trajectories under increasingly stringent tests of segmentation ability and statistical learning. Observers viewed a single dot as it moved along simple sequences of paths, and were later able to discriminate these sequences from novel and partial ones shown at test. Observers demonstrated this ability when there were potentially helpful trajectory-segmentation cues such as a common origin for all movements (Experiment 1); when the dot's motions were entirely continuous and unconstrained (Experiment 2); when sequences were tested against partial sequences as a more stringent test of statistical learning (Experiment 3); and finally, even when the element trajectories were in fact pairs of trajectories, so that abrupt directional changes in the dot's motion could no longer signal inter-trajectory boundaries (Experiment 4). These results suggest that observers can automatically extract regularities in movement - an ability that may underpin our capacity to learn more complex biological motions, as in sport or dance.
Neural Correlates of Morphology Acquisition through a Statistical Learning Paradigm

PubMed Central

Sandoval, Michelle; Patterson, Dianne; Dai, Huanping; Vance, Christopher J.; Plante, Elena

2017-01-01

The neural basis of statistical learning as it occurs over time was explored with stimuli drawn from a natural language (Russian nouns). The input reflected the “rules” for marking categories of gendered nouns, without making participants explicitly aware of the nature of what they were to learn. Participants were scanned while listening to a series of gender-marked nouns during four sequential scans, and were tested for their learning immediately after each scan. Although participants were not told the nature of the learning task, they exhibited learning after their initial exposure to the stimuli. Independent component analysis of the brain data revealed five task-related sub-networks. Unlike prior statistical learning studies of word segmentation, this morphological learning task robustly activated the inferior frontal gyrus during the learning period. This region was represented in multiple independent components, suggesting it functions as a network hub for this type of learning. Moreover, the results suggest that subnetworks activated by statistical learning are driven by the nature of the input, rather than reflecting a general statistical learning system. PMID:28798703
Thai Automatic Speech Recognition

DTIC Science & Technology

2005-01-01

used in an external DARPA evaluation involving medical scenarios between an American Doctor and a naïve monolingual Thai patient. 2. Thai Language... dictionary generation more challenging, and (3) the lack of word segmentation, which calls for automatic segmentation approaches to make n-gram language...requires a dictionary and provides various segmentation algorithms to automatically select suitable segmentations. Here we used a maximal matching
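The dictionary-based segmentation mentioned in the excerpt above can be sketched as follows. This is a greedy longest-match segmenter, one common reading of "maximal matching"; the excerpt is truncated and does not say which variant the report used, so treat this as an assumption. The function name and the Latin-script toy lexicon are hypothetical stand-ins for a Thai dictionary.

```python
from typing import List, Set


def longest_match_segment(text: str, lexicon: Set[str]) -> List[str]:
    """Scan left to right, always taking the longest dictionary word
    that starts at the current position."""
    max_len = max(map(len, lexicon))
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for k in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + k]
            if candidate in lexicon:
                tokens.append(candidate)
                i += k
                break
        else:
            # No dictionary word starts here: emit one character and move on.
            tokens.append(text[i])
            i += 1
    return tokens


toy_lexicon = {"the", "cat", "cats", "at", "sat"}
print(longest_match_segment("thecatsat", toy_lexicon))
```

On "thecatsat" the greedy choice of "cats" forces the reading "the cats at" rather than "the cat sat", which shows why systems of this kind usually generate several candidate segmentations and select among them.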
Developmental Differences in the Effects of Phonological, Lexical and Semantic Variables on Word Learning by Infants

ERIC Educational Resources Information Center

Storkel, Holly L.

2009-01-01

The influence of phonological (i.e. individual sounds), lexical (i.e. whole-word forms) and semantic (i.e. meaning) characteristics on the words known by infants age 1;4 to 2;6 was examined, using an existing database (Dale & Fenson, 1996). For each noun, word frequency, two phonological (i.e. positional segment average, biphone average), two…
Event-Related Potentials Reflecting the Processing of Phonological Constraint Violations

ERIC Educational Resources Information Center

Domahs, Ulrike; Kehrein, Wolfgang; Knaus, Johannes; Wiese, Richard; Schlesewsky, Matthias

2009-01-01

How are violations of phonological constraints processed in word comprehension? The present article reports the results of an event-related potentials (ERP) study on a phonological constraint of German that disallows identical segments within a syllable or word (CC[subscript i]VC[subscript i]). We examined three types of monosyllabic CCVC words:…
He Said, She Said: Effects of Bilingualism on Cross-Talker Word Recognition in Infancy

ERIC Educational Resources Information Center

Singh, Leher

2018-01-01

The purpose of the current study was to examine effects of bilingual language input on infant word segmentation and on talker generalization. In the present study, monolingually and bilingually exposed infants were compared on their abilities to recognize familiarized words in speech and to maintain generalizable representations of familiarized…
Probabilistic Phonotactics as a Cue for Recognizing Spoken Cantonese Words in Speech

ERIC Educational Resources Information Center

Yip, Michael C. W.

2017-01-01

Previous experimental psycholinguistic studies suggested that probabilistic phonotactic information is likely to hint at the locations of word boundaries in continuous speech and hence offers an interesting solution to the empirical question of how we recognize/segment individual spoken words in speech. We investigated this issue by using…
Got Rhythm...For Better and for Worse. Cross-Modal Effects of Auditory Rhythm on Visual Word Recognition

ERIC Educational Resources Information Center

Brochard, Renaud; Tassin, Maxime; Zagar, Daniel

2013-01-01

The present research aimed to investigate whether, as previously observed with pictures, background auditory rhythm would also influence visual word recognition. In a lexical decision task, participants were presented with bisyllabic visual words, segmented into two successive groups of letters, while an irrelevant strongly metric auditory…
Orthographic vs. Phonologic Syllables in Handwriting Production

ERIC Educational Resources Information Center

Kandel, Sonia; Herault, Lucie; Grosjacques, Geraldine; Lambert, Eric; Fayol, Michel

2009-01-01

French children program the words they write syllable by syllable. We examined whether the syllable the children use to segment words is determined phonologically (i.e., is derived from speech production processes) or orthographically. Third, 4th and 5th graders wrote on a digitiser words that were mono-syllables phonologically (e.g.…

The activation of segmental and tonal information in visual word recognition.

PubMed

Li, Chuchu; Lin, Candise Y; Wang, Min; Jiang, Nan

2013-08-01

Mandarin Chinese has a logographic script in which graphemes map onto syllables and morphemes. It is not clear whether Chinese readers activate phonological information during lexical access, although phonological information is not explicitly represented in Chinese orthography. In the present study, we examined the activation of phonological information, including segmental and tonal information in Chinese visual word recognition, using the Stroop paradigm. Native Mandarin speakers named the presentation color of Chinese characters in Mandarin. The visual stimuli were divided into five types: color characters (e.g., hong2, "red"), homophones of the color characters (S+T+; e.g., hong2, "flood"), different-tone homophones (S+T-; e.g., hong1, "boom"), characters that shared the same tone but differed in segments with the color characters (S-T+; e.g., ping2, "bottle"), and neutral characters (S-T-; e.g., qian1, "leading through"). Classic Stroop facilitation was shown in all color-congruent trials, and interference was shown in the incongruent trials. Furthermore, the Stroop effect was stronger for S+T- than for S-T+ trials, and was similar between S+T+ and S+T- trials. These findings suggested that both tonal and segmental forms of information play roles in lexical constraints; however, segmental information has more weight than tonal information. We proposed a revised visual word recognition model in which the functions of both segmental and suprasegmental types of information and their relative weights are taken into account.
The role of reference in cross-situational word learning.

PubMed

Wang, Felix Hao; Mintz, Toben H

2018-01-01

Word learning involves massive ambiguity, since in a particular encounter with a novel word, there are an unlimited number of potential referents. One proposal for how learners surmount the problem of ambiguity is that learners use cross-situational statistics to constrain the ambiguity: When a word and its referent co-occur across multiple situations, learners will associate the word with the correct referent. Yu and Smith (2007) propose that these co-occurrence statistics are sufficient for word-to-referent mapping. Alternative accounts hold that co-occurrence statistics alone are insufficient to support learning, and that learners are further guided by knowledge that words are referential (e.g., Waxman & Gelman, 2009). However, no behavioral word learning studies we are aware of explicitly manipulate subjects' prior assumptions about the role of the words in the experiments in order to test the influence of these assumptions. In this study, we directly test whether, when faced with referential ambiguity, co-occurrence statistics are sufficient for word-to-referent mappings in adult word-learners. Across a series of cross-situational learning experiments, we varied the degree to which there was support for the notion that the words were referential. At the same time, the statistical information about the words' meanings was held constant. When we overrode support for the notion that words were referential, subjects failed to learn the word-to-referent mappings, but otherwise they succeeded. Thus, cross-situational statistics were useful only when learners had the goal of discovering mappings between words and referents. We discuss the implications of these results for theories of word learning in children's language acquisition. Copyright © 2017 Elsevier B.V. All rights reserved.
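The cross-situational statistic at issue in the abstract above can be made concrete with a minimal sketch: tally how often each word co-occurs with each candidate referent across trials, then pick the most frequent pairing. This illustrates the co-occurrence proposal only, not the study's procedure or any learner's actual mechanism; the trial data and names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Trial = Tuple[List[str], List[str]]  # (words heard, referents in view)


def cooccurrence_mapping(trials: List[Trial]) -> Dict[str, str]:
    """Map each word to the referent it co-occurred with most often."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for words, referents in trials:
        for word in words:
            for referent in referents:
                counts[word][referent] += 1
    return {word: max(refs, key=refs.get) for word, refs in counts.items()}


# Each trial is ambiguous on its own: two words, two referents, no pairing given.
toy_trials = [
    (["bosa", "gasser"], ["dog", "ball"]),
    (["bosa", "manu"], ["dog", "cup"]),
    (["gasser", "manu"], ["ball", "cup"]),
]
mapping = cooccurrence_mapping(toy_trials)
print(mapping)
```

No single trial identifies a pairing, yet the aggregated counts do. The study's point is that human learners drew on these statistics only when they treated the words as referential.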
The Utility of Cognitive Plausibility in Language Acquisition Modeling: Evidence From Word Segmentation.

PubMed

Phillips, Lawrence; Pearl, Lisa

2015-11-01

The informativity of a computational model of language acquisition is directly related to how closely it approximates the actual acquisition task, sometimes referred to as the model's cognitive plausibility. We suggest that though every computational model necessarily idealizes the modeled task, an informative language acquisition model can aim to be cognitively plausible in multiple ways. We discuss these cognitive plausibility checkpoints generally and then apply them to a case study in word segmentation, investigating a promising Bayesian segmentation strategy. We incorporate cognitive plausibility by using an age-appropriate unit of perceptual representation, evaluating the model output in terms of its utility, and incorporating cognitive constraints into the inference process. Our more cognitively plausible model shows a beneficial effect of cognitive constraints on segmentation performance. One interpretation of this effect is as a synergy between the naive theories of language structure that infants may have and the cognitive constraints that limit the fidelity of their inference processes, where less accurate inference approximations are better when the underlying assumptions about how words are generated are less accurate. More generally, these results highlight the utility of incorporating cognitive plausibility more fully into computational models of language acquisition. Copyright © 2015 Cognitive Science Society, Inc.
Keywords image retrieval in historical handwritten Arabic documents

NASA Astrophysics Data System (ADS)

Saabni, Raid; El-Sana, Jihad

2013-01-01

A system is presented for spotting and searching keywords in handwritten Arabic documents. A slightly modified dynamic time warping algorithm is used to measure similarities between words. Two sets of features are generated from the outer contour of the words/word-parts. The first set is based on the angles between nodes on the contour and the second set is based on the shape context features taken from the outer contour. To recognize a given word, the segmentation-free approach is partially adopted, i.e., continuous word parts are used as the basic alphabet, instead of individual characters or complete words. Additional strokes, such as dots and detached short segments, are classified and used in a postprocessing step to determine the final comparison decision. The search for a keyword is performed by the search for its word parts given in the correct order. The performance of the presented system was very encouraging in terms of efficiency and match rates. To evaluate the presented system its performance is compared to three different systems. Unfortunately, there are no publicly available standard datasets with ground truth for testing Arabic key word searching systems. Therefore, a private set of images partially taken from Juma'a Al-Majid Center in Dubai for evaluation is used, while using a slightly modified version of the IFN/ENIT database for training.
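For readers unfamiliar with dynamic time warping, here is the textbook form of the algorithm on one-dimensional feature sequences. The abstract above describes a "slightly modified" variant applied to contour features, and those modifications are not specified, so this sketch should not be read as the authors' version. The function name and the sample sequences are hypothetical.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping cost between two sequences,
    using absolute difference as the local distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A stretched copy aligns at zero cost; a shifted sequence does not.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
print(dtw_distance([1, 2, 3], [2, 3, 4]))
```

Because the alignment may repeat elements, two renderings of the same word written at different widths can still match closely, which is why the measure suits handwriting.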
Machine-printed Arabic OCR

NASA Astrophysics Data System (ADS)

Hassibi, Khosrow M.

1994-02-01

This paper presents a brief overview of our research in the development of an OCR system for recognition of machine-printed texts in languages that use the Arabic alphabet. The cursive nature of machine-printed Arabic makes the segmentation of words into letters a challenging problem. In our approach, through a novel preliminary segmentation technique, a word is broken into pieces where each piece may not represent a valid letter in general. Neural networks trained on a training sample set of about 500 Arabic text images are used for recognition of these pieces. The rules governing the alphabet and character-level contextual information are used for recombining these pieces into valid letters. Higher-level contextual analysis schemes, including the use of an Arabic lexicon and n-grams, are also under development and are expected to improve the word recognition accuracy. The segmentation, recognition, and contextual analysis processes are closely integrated using a feedback scheme. The details of preparation of the training set and some recent results on training of the networks will be presented.
Lexical restructuring in the absence of literacy.

PubMed

Ventura, Paulo; Kolinsky, Régine; Fernandes, Sandra; Querido, Luís; Morais, José

2007-11-01

Vocabulary growth was suggested to prompt the implementation of increasingly finer-grained lexical representations of spoken words in children (e.g., [Metsala, J. L., & Walley, A. C. (1998). Spoken vocabulary growth and the segmental restructuring of lexical representations: precursors to phonemic awareness and early reading ability. In J. L. Metsala & L. C. Ehri (Eds.), Word recognition in beginning literacy (pp. 89-120). Hillsdale, NJ: Erlbaum.]). Although literacy was not explicitly mentioned in this lexical restructuring hypothesis, the process of learning to read and spell might also have a significant impact on the specification of lexical representations (e.g., [Carroll, J. M., & Snowling, M. J. (2001). The effects of global similarity between stimuli on children's judgments of rime and alliteration. Applied Psycholinguistics, 22, 327-342.]; [Goswami, U. (2000). Phonological representations, reading development and dyslexia: Towards a cross-linguistic theoretical framework. Dyslexia, 6, 133-151.]). This is what we checked in the present study. We manipulated word frequency and neighborhood density in a gating task (Experiment 1) and a word-identification-in-noise task (Experiment 2) presented to Portuguese literate and illiterate adults. Ex-illiterates were also tested in Experiment 2 in order to disentangle the effects of vocabulary size and literacy. There was an interaction between word frequency and neighborhood density, which was similar in the three groups. These did not differ even for the words that are supposed to undergo lexical restructuring the latest (low frequency words from sparse neighborhoods). Thus, segmental lexical representations seem to develop independently of literacy. While segmental restructuring is not affected by literacy, it constrains the development of phoneme awareness as shown by the fact that, in Experiment 3, neighborhood density modulated the phoneme deletion performance of both illiterates and ex-illiterates.
Effects of First and Second Language on Segmentation of Non-Native Speech

ERIC Educational Resources Information Center

Hanulikova, Adriana; Mitterer, Holger; McQueen, James M.

2011-01-01

Do Slovak-German bilinguals apply native Slovak phonological and lexical knowledge when segmenting German speech? When Slovaks listen to their native language, segmentation is impaired when fixed-stress cues are absent (Hanulikova, McQueen & Mitterer, 2010), and, following the Possible-Word Constraint (PWC; Norris, McQueen, Cutler & Butterfield,…
Segmentation and Representation of Consonant Blends in Kindergarten Children's Spellings

ERIC Educational Resources Information Center

Werfel, Krystal L.; Schuele, C. Melanie

2012-01-01

Purpose: The purpose of this study was to describe the growth of children's segmentation and representation of consonant blends in the kindergarten year and to evaluate the extent to which linguistic features influence segmentation and representation of consonant blends. Specifically, the roles of word position (initial blends, final blends),…
How Are Pronunciation Variants of Spoken Words Recognized? A Test of Generalization to Newly Learned Words

ERIC Educational Resources Information Center

Pitt, Mark A.

2009-01-01

One account of how pronunciation variants of spoken words (center-> "senner" or "sennah") are recognized is that sublexical processes use information about variation in the same phonological environments to recover the intended segments [Gaskell, G., & Marslen-Wilson, W. D. (1998). Mechanisms of phonological inference in speech perception.…
The Storage and Processing of Morphologically Complex Words in L2 Spanish

ERIC Educational Resources Information Center

Foote, Rebecca

2017-01-01

Research with native speakers indicates that, during word recognition, regularly inflected words undergo parsing that segments them into stems and affixes. In contrast, studies with learners suggest that this parsing may not take place in L2. This study's research questions are: Do L2 Spanish learners store and process regularly inflected,…
Stress Regularity or Consistency? Reading Aloud Italian Polysyllables with Different Stress Patterns

ERIC Educational Resources Information Center

Burani, Cristina; Arduino, Lisa S.

2004-01-01

Stress assignment to three- and four-syllable Italian words is not predictable by rule, but needs lexical look-up. The present study investigated whether stress assignment to low-frequency Italian words is determined by stress regularity, or by the number of words sharing the final phonological segment and the stress pattern (stress neighborhood…
Phonetic Aspects of Children's Elicited Word Revisions.

ERIC Educational Resources Information Center

Paul-Brown, Diane; Yeni-Komshian, Grace H.

A study of the phonetic changes occurring when a speaker attempts to revise an unclear word for a listener focuses on changes made in the sound segment duration to maximize differences between phonemes. In the study, five-year-olds were asked by adults to revise words differing in voicing of initial and final stop consonants; a control group of…
The Effects of Video Self-Modeling on the Decoding Skills of Children at Risk for Reading Disabilities

ERIC Educational Resources Information Center

Ayala, Sandra M.

2010-01-01

Ten first grade students, participating in a Tier II response to intervention (RTI) reading program received an intervention of video self modeling to improve decoding skills and sight word recognition. The students were video recorded blending and segmenting decodable words, and reading sight words taken directly from their curriculum…
The unrealized promise of infant statistical word-referent learning

PubMed Central

Smith, Linda B.; Suanda, Sumarga H.; Yu, Chen

2014-01-01

Recent theory and experiments offer a new solution as to how infant learners may break into word learning, by using cross-situational statistics to find the underlying word-referent mappings. Computational models demonstrate the in-principle plausibility of this statistical learning solution and experimental evidence shows that infants can aggregate and make statistically appropriate decisions from word-referent co-occurrence data. We review these contributions and then identify the gaps in current knowledge that prevent a confident conclusion about whether cross-situational learning is the mechanism through which infants break into word learning. We propose an agenda to address that gap that focuses on detailing the statistics in the learning environment and the cognitive processes that make use of those statistics. PMID:24637154
Possible-word constraints in Cantonese speech segmentation.

PubMed

Yip, Michael C

2004-03-01

A Cantonese syllable-spotting experiment was conducted to examine whether the Possible-Word Constraint (PWC), proposed by Norris, McQueen, Cutler, and Butterfield (1997), can apply in Cantonese speech segmentation. In the experiment, listeners were asked to spot the target Cantonese syllable in a series of nonsense sound strings. Results suggested that listeners found it more difficult to spot the target syllable [kDm1] in nonsense sound strings in which it was attached to a single consonant [tkDm1] than in strings in which it was attached to either a vowel [a:kDm1] or a pseudo-syllable [khow1kDm1]. Finally, the current set of results further supports the view that the PWC is a language-universal mechanism for segmenting continuous speech.
The Use of Segmentation Cues in Second Language Learners of English

ERIC Educational Resources Information Center

Lin, Candise Yue

2013-01-01

This dissertation project examined the influence of language typology on the use of segmentation cues by second language (L2) learners of English. Previous research has shown that native English speakers rely more on sentence context and lexical knowledge than segmental (i.e. phonotactics or acoustic-phonetics) or prosodic cues (e.g., word stress)…
How Does Context Play "a Part" in Splitting Words "Apart"? Production and Perception of Word Boundaries in Casual Speech

ERIC Educational Resources Information Center

Kim, Dahee; Stephens, Joseph D. W.; Pitt, Mark A.

2012-01-01

Four experiments examined listeners' segmentation of ambiguous schwa-initial sequences (e.g., "a long" vs. "along") in casual speech, where acoustic cues can be unclear, possibly increasing reliance on contextual information to resolve the ambiguity. In Experiment 1, acoustic analyses of talkers' productions showed that the one-word and two-word…
English Pronunciation: A Systematic Approach to Word-Stress and Vowel-Sounds.

ERIC Educational Resources Information Center

Carmona, Francisco

A handbook on English word stress and stressed-vowel sounds is based on the idea that these segments are, in most cases, controlled by phonological context and their pronunciation can be understood through a system of rules. It serves as a reference for teachers and as a text for students. Chapters address these topics: word stress and active and…
Catch Up® Literacy: Evaluation Report and Executive Summary

ERIC Educational Resources Information Center

Rutt, Simon; Kettlewell, Kelly; Bernardinelli, Daniele

2015-01-01

Catch Up® Literacy is a structured one-to-one literacy intervention for pupils between the ages of 6 and 14 who are struggling to learn to read. It teaches pupils to blend phonemes (combine letter sounds into words), segment phonemes (separate words into letter sounds), and memorise particular words so they can be understood without needing to use…
Overcoming the Effects of Variation in Infant Speech Segmentation: Influences of Word Familiarity

ERIC Educational Resources Information Center

Singh, Leher; Nestor, Sarah S.; Bortfeld, Heather

2008-01-01

Previous studies have shown that 7.5-month-olds can track and encode words in fluent speech, but they fail to equate instances of a word that contrast in talker gender, vocal affect, and fundamental frequency. By 10.5 months, they succeed at generalizing across such variability, marking a clear transition period during which infants' word…

The extraction and integration framework: a two-process account of statistical learning.

PubMed

Thiessen, Erik D; Kronstein, Alexandra T; Hufnagle, Daniel G

2013-07-01

The term statistical learning in infancy research originally referred to sensitivity to transitional probabilities. Subsequent research has demonstrated that statistical learning contributes to infant development in a wide array of domains. The range of statistical learning phenomena necessitates a broader view of the processes underlying statistical learning. Learners are sensitive to a much wider range of statistical information than the conditional relations indexed by transitional probabilities, including distributional and cue-based statistics. We propose a novel framework that unifies learning about all of these kinds of statistical structure. From our perspective, learning about conditional relations outputs discrete representations (such as words). Integration across these discrete representations yields sensitivity to cues and distributional information. To achieve sensitivity to all of these kinds of statistical structure, our framework combines processes that extract segments of the input with processes that compare across these extracted items. In this framework, the items extracted from the input serve as exemplars in long-term memory. The similarity structure of those exemplars in long-term memory leads to the discovery of cues and categorical structure, which guides subsequent extraction. The extraction and integration framework provides a way to explain sensitivity to both conditional statistical structure (such as transitional probabilities) and distributional statistical structure (such as item frequency and variability), and also a framework for thinking about how these different aspects of statistical learning influence each other. 2013 APA, all rights reserved
The role of tone and segmental information in visual-word recognition in Thai.

PubMed

Winskel, Heather; Ratitamkul, Theeraporn; Charoensit, Akira

2017-07-01

Tone languages represent a large proportion of the spoken languages of the world and yet lexical tone is understudied. Thai offers a unique opportunity to investigate the role of lexical tone processing during visual-word recognition, as tone is explicitly expressed in its script. We used colour words and their orthographic neighbours as stimuli to investigate facilitation (Experiment 1) and interference (Experiment 2) Stroop effects. Five experimental conditions were created: (a) the colour word (e.g., ขาว /kʰã:w/ [white]), (b) tone different word (e.g., ข่าว /kʰà:w/ [news]), (c) initial consonant phonologically same word (e.g., คาว /kʰa:w/ [fishy]), where the initial consonant of the word was phonologically the same but orthographically different, (d) initial consonant different, tone same word (e.g., หาว /hã:w/ [yawn]), where the initial consonant was orthographically different but the tone of the word was the same, and (e) initial consonant different, tone different word (e.g., กาว /ka:w/ [glue]), where the initial consonant was orthographically different, and the tone was different. In order to examine whether tone information per se had a facilitative effect, we also included a colour congruent word condition where the segmental (S) information was different but the tone (T) matched the colour word (S-T+) in Experiment 2. Facilitation/interference effects were found for all five conditions when compared with a neutral control word. Results of the critical comparisons revealed that tone information comes into play at a later stage in lexical processing, and orthographic information contributes more than phonological information.
Activity of left inferior frontal gyrus related to word repetition effects: LORETA imaging with 128-channel EEG and individual MRI.

PubMed

Kim, Young Youn; Lee, Boreom; Shin, Yong Wook; Kwon, Jun Soo; Kim, Myung-Sun

2006-02-01

We investigated the brain substrate of word repetition effects on the implicit memory task using low-resolution electromagnetic tomography (LORETA) with high-density 128-channel EEG and individual MRI as a realistic head model. Thirteen right-handed, healthy subjects performed a word/non-word discrimination task, in which the words and non-words were presented visually, and some of the words appeared twice with a lag of one or five items. All of the subjects exhibited word repetition effects with respect to the behavioral data, in which a faster reaction time was observed to the repeated word (old word) than to the first presentation of the word (new word). The old words elicited more positive-going potentials than the new words, beginning at 200 ms and lasting until 500 ms post-stimulus. We conducted source reconstruction using LORETA at a latency of 400 ms with the peak mean global field potentials and used statistical parametric mapping for the statistical analysis. We found that the source elicited by the old words exhibited a statistically significant current density reduction in the left inferior frontal gyrus. This is the first study to investigate the generators of word repetition effects using voxel-by-voxel statistical mapping of the current density with individual MRI and high-density EEG.
Emergence of good conduct, scaling and Zipf laws in human behavioral sequences in an online world.

PubMed

Thurner, Stefan; Szell, Michael; Sinatra, Roberta

2012-01-01

We study behavioral action sequences of players in a massive multiplayer online game. In their virtual life players use eight basic actions which allow them to interact with each other. These actions are communication, trade, establishing or breaking friendships and enmities, attack, and punishment. We measure the probabilities for these actions conditional on previously taken and received actions and find a dramatic increase of negative behavior immediately after receiving negative actions. Similarly, positive behavior is intensified by receiving positive actions. We observe a tendency towards antipersistence in communication sequences. Classifying actions as positive (good) and negative (bad) allows us to define binary 'world lines' of lives of individuals. Positive and negative actions are persistent and occur in clusters, indicated by large scaling exponents α ~ 0.87 of the mean square displacement of the world lines. For all eight action types we find strong signs of high levels of repetitiveness, especially for negative actions. We partition behavioral sequences into segments of length n (behavioral 'words' and 'motifs') and study their statistical properties. We find two approximate power laws in the word ranking distribution, one with an exponent of κ ~ -1 for the ranks up to 100, and another with a lower exponent for higher ranks. The Shannon n-tuple redundancy yields large values and increases in terms of word length, further underscoring the non-trivial statistical properties of behavioral sequences. On the collective, societal level the timeseries of particular actions per day can be understood by a simple mean-reverting log-normal model.
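The partitioning of an action sequence into length-n "words" and the ranking of those words by frequency can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the use of overlapping windows, the function names, and the single-letter toy sequence are all assumptions made for illustration only.

```python
from collections import Counter


def behavioral_words(actions, n):
    """Return every overlapping length-n window of the action sequence.

    Assumption: windows overlap (step of 1); the abstract does not say
    whether the original partition was overlapping or not.
    """
    return [tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)]


def rank_frequency(words):
    """Return (rank, word, count) triples, most frequent word first.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    counts = Counter(words)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


# Hypothetical toy sequence: each letter stands in for one action type.
sequence = list("CTCTCAACTCTC")
table = rank_frequency(behavioral_words(sequence, 2))

for rank, word, count in table:
    print(rank, "".join(word), count)
```

Plotting count against rank on log-log axes is the usual way to inspect such a table for an approximate power law; a toy sequence this short is far too small to show one.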
"Fragment errors" in deep dysgraphia: further support for a lexical hypothesis.

PubMed

Bormann, Tobias; Wallesch, Claus-W; Blanken, Gerhard

2008-07-01

In addition to various lexical errors, the writing of patients with deep dysgraphia may include a large number of segmental spelling errors, which increase towards the end of the word. Frequently, these errors involve deletion of two or more letters resulting in so-called "fragment errors". Different positions have been brought forward regarding their origin, including rapid decay of activation in the graphemic buffer and an impairment of more central (i.e., lexical or semantic) processing. We present data from a patient (M.D.) with deep dysgraphia who showed an increase of segmental spelling errors towards the end of the word. Several tasks were carried out to explore M.D.'s underlying functional impairment. Errors affected word-final positions in tasks like backward spelling and fragment completion. In a delayed copying task, length of the delay had no influence. In addition, when asked to recall three serially presented letters, a task which had not been carried out before, M.D. exhibited a preference for the first and the third letter and poor performance for the second letter. M.D.'s performance on these tasks contradicts the rapid decay account and instead supports a lexical-semantic account of segmental errors in deep dysgraphia. In addition, the results fit well with an implemented computational model of deep dysgraphia and segmental spelling errors.
EHME: a new word database for research in Basque language.

PubMed

Acha, Joana; Laka, Itziar; Landa, Josu; Salaburu, Pello

2014-11-14

This article presents EHME, the frequency dictionary of Basque structure, an online program that enables researchers in psycholinguistics to extract word and nonword stimuli, based on a broad range of statistics concerning the properties of Basque words. The database consists of 22.7 million tokens, and properties available include morphological structure frequency and word-similarity measures, apart from classical indexes: word frequency, orthographic structure, orthographic similarity, bigram and biphone frequency, and syllable-based measures. Measures are indexed at the lemma, morpheme and word level. We include reliability and validation analysis. The application is freely available, and enables the user to extract words based on concrete statistical criteria, as well as to obtain statistical characteristics from a list of words.
Event Recognition Based on Deep Learning in Chinese Texts

PubMed Central

Zhang, Yajun; Liu, Zongtian; Zhou, Wen

2016-01-01

Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%. PMID:27501231
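For readers unfamiliar with the F-measure reported above, the following Python sketch shows how it is commonly computed from counts of correct and incorrect trigger-word decisions. The abstract does not say which variant was used; the balanced form (beta = 1) is assumed here, and the function name and example counts are hypothetical.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """Return the F-measure: the weighted harmonic mean of precision and recall.

    Assumption: beta = 1 (balanced F1) unless stated otherwise.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 8 triggers found correctly, 2 spurious, 2 missed.
score = f_measure(true_positives=8, false_positives=2, false_negatives=2)
print(f"F-measure: {score:.2%}")
```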
A model to identify high crash road segments with the dynamic segmentation method.

PubMed

Boroujerdian, Amin Mirza; Saffarzadeh, Mahmoud; Yousefi, Hassan; Ghassemian, Hassan

2014-12-01

Currently, high social and economic costs, in addition to physical and mental consequences, put road safety among the most important issues. This paper aims at presenting a novel approach capable of identifying the location as well as the length of high crash road segments. It focuses on the location of accidents that occurred along the road and their effective regions. In other words, due to applicability and budget limitations in improving the safety of road segments, it is not possible to recognize all high crash road segments. Therefore, it is of utmost importance to identify high crash road segments and their real length in order to prioritize safety improvements on roads. In this paper, after evaluating the deficiencies of current road segmentation models, the different kinds of errors caused by these methods are addressed. One of the main deficiencies of these models is that they cannot identify the length of high crash road segments. In this paper, identifying the length of high crash road segments (corresponding to the arrangement of accidents along the road) is achieved by converting accident data to the road response signal of through traffic with a dynamic model based on wavelet theory. The significant advantage of the presented method is multi-scale segmentation. In other words, this model identifies high crash road segments of different lengths and can also recognize small segments within long segments. Applying the presented model to a real case for identifying 10-20 percent of high crash road segments showed an improvement of 25-38 percent relative to the existing methods. Copyright © 2014 Elsevier Ltd. All rights reserved.
Pinyin and English Invented Spelling in Chinese-Speaking Students Who Speak English as a Second Language.

PubMed

Ding, Yi; Liu, Ru-De; McBride, Catherine A; Fan, Chung-Hau; Xu, Le; Wang, Jia

2018-05-07

This study examined pinyin (the official phonetic system that transcribes the lexical tones and pronunciation of Chinese characters) invented spelling and English invented spelling in 72 Mandarin-speaking 6th graders who learned English as their second language. The pinyin invented spelling task measured segmental-level awareness including syllable and phoneme awareness, and suprasegmental-level awareness including lexical tones and tone sandhi in Chinese Mandarin. The English invented spelling task manipulated segmental-level awareness including syllable awareness and phoneme awareness, and suprasegmental-level awareness including word stress. This pinyin task outperformed a traditional phonological awareness task that only measured segmental-level awareness and may have optimal utility to measure unique phonological and linguistic features in Chinese reading. The pinyin invented spelling uniquely explained variance in Chinese conventional spelling and word reading in both languages. The English invented spelling uniquely explained variance in conventional spelling and word reading in both languages. Our findings appear to support the role of phonological activation in Chinese reading. Our experimental linguistic manipulations altered the phonological awareness item difficulties.
Cross-situational word learning in aphasia.

PubMed

Peñaloza, Claudia; Mirman, Daniel; Cardona, Pedro; Juncadella, Montserrat; Martin, Nadine; Laine, Matti; Rodríguez-Fornells, Antoni

2017-08-01

Human learners can resolve referential ambiguity and discover the relationships between words and meanings through a cross-situational learning (CSL) strategy. Some people with aphasia (PWA) can learn word-referent pairings under referential uncertainty supported by online feedback. However, it remains unknown whether PWA can learn new words cross-situationally and if such learning ability is supported by statistical learning (SL) mechanisms. The present study examined whether PWA can learn novel word-referent mappings in a CSL task without feedback. We also studied whether CSL is related to SL in PWA and neurologically healthy individuals. We further examined whether aphasia severity, phonological processing and verbal short-term memory (STM) predict CSL in aphasia, and also whether individual differences in verbal STM modulate CSL in healthy older adults. Sixteen people with chronic aphasia underwent a CSL task that involved exposure to a series of individually ambiguous learning trials and a SL task that taps speech segmentation. Their learning ability was compared to 18 older controls and 39 young adults recruited for task validation. CSL in the aphasia group was below the older controls and young adults and took place at a slower rate. Importantly, we found a strong association between SL and CSL performance in all three groups. CSL was modulated by aphasia severity in the aphasia group, and by verbal STM capacity in the older controls. Our findings indicate that some PWA can preserve the ability to learn new word-referent associations cross-situationally. We suggest that both PWA and neurologically intact individuals may rely on SL mechanisms to achieve CSL and that verbal STM also influences CSL. These findings contribute to the ongoing debate on the cognitive mechanisms underlying this learning ability. Copyright © 2017 Elsevier Ltd. All rights reserved.
A Computer Analysis Study of the Word Style in Love-songs of Tshang yang Gya tsho

NASA Astrophysics Data System (ADS)

Yonghong, Li; SunTing; Lei, Guo; Hongzhi, Yu

Using corpus-based statistical methods, with the 124 love-songs of Tshang yang Gya tsho as the object of study, this paper sets out principles of vocabulary segmentation and builds a Tibetan love-songs corpus and a Tibetan-Chinese grammar separation lexicon corpus. It then presents quantitative research on the achievement of the "love-songs" in the language arts from three aspects: the length of the vocabulary items, the frequency of the vocabulary items, and the distribution of the number of terms in the verses and the songs. In addition, it introduces a new research idea and method for the study of Tibetan literature.
Lasting Effects on Literacy Skills with a Computer-Assisted Learning Using Syllabic Units in Low-Progress Readers

ERIC Educational Resources Information Center

Ecalle, Jean; Magnan, Annie; Calmus, Caroline

2009-01-01

This study examines the effects of a computer-assisted learning (CAL) program in which syllabic units were highlighted inside words in comparison with a CAL program in which the words were not segmented, i.e. one requiring whole word recognition. In a randomised control trial design, two separate groups of French speaking poor readers (2 * 14) in…
Syllabic Strategy as Opposed to Coda Optimization in the Segmentation of Spanish Letter-Strings Using Word Spotting

ERIC Educational Resources Information Center

Álvarez, Carlos J.; Taft, Marcus; Hernández-Cabrera, Juan A.

2017-01-01

A word-spotting task is used in Spanish to test the way in which polysyllabic letter-strings are parsed in this language. Monosyllabic words (e.g., "bar") embedded at the beginning of a pseudoword were immediately followed by either a coda-forming consonant (e.g., "barto") or a vowel (e.g., "baros"). In the former…
An English-French-German-Spanish Word Frequency Dictionary: A Correlation of the First Six Thousand Words in Four Single-Language Frequency Lists.

ERIC Educational Resources Information Center

Eaton, Helen S., Comp.

This semantic frequency list for English, French, German, and Spanish correlates 6,474 concepts represented by individual words in an order of diminishing occurrence. Designed as a research tool, the work is segmented into seven comparative "Thousand Concepts" lists with 115 sectional subdivisions, each of which begins with the key English word…
Infants' statistical learning: 2- and 5-month-olds' segmentation of continuous visual sequences.

PubMed

Slone, Lauren Krogh; Johnson, Scott P

2015-05-01

Past research suggests that infants have powerful statistical learning abilities; however, studies of infants' visual statistical learning offer differing accounts of the developmental trajectory of and constraints on this learning. To elucidate this issue, the current study tested the hypothesis that young infants' segmentation of visual sequences depends on redundant statistical cues to segmentation. A sample of 20 2-month-olds and 20 5-month-olds observed a continuous sequence of looming shapes in which unit boundaries were defined by both transitional probability and co-occurrence frequency. Following habituation, only 5-month-olds showed evidence of statistically segmenting the sequence, looking longer to a statistically improbable shape pair than to a probable pair. These results reaffirm the power of statistical learning in infants as young as 5 months but also suggest considerable development of statistical segmentation ability between 2 and 5 months of age. Moreover, the results do not support the idea that infants' ability to segment visual sequences based on transitional probabilities and/or co-occurrence frequencies is functional at the onset of visual experience, as has been suggested previously. Rather, this type of statistical segmentation appears to be constrained by the developmental state of the learner. Factors contributing to the development of statistical segmentation ability during early infancy, including memory and attention, are discussed. Copyright © 2015 Elsevier Inc. All rights reserved.
Learning the Language of Statistics: Challenges and Teaching Approaches

ERIC Educational Resources Information Center

Dunn, Peter K.; Carey, Michael D.; Richardson, Alice M.; McDonald, Christine

2016-01-01

Learning statistics requires learning the language of statistics. Statistics draws upon words from general English, mathematical English, discipline-specific English and words used primarily in statistics. This leads to many linguistic challenges in teaching statistics and the way in which the language is used in statistics creates an extra layer…
Listeners' processing of a given reduced word pronunciation variant directly reflects their exposure to this variant: Evidence from native listeners and learners of French.

PubMed

Brand, Sophie; Ernestus, Mirjam

2018-05-01

In casual conversations, words often lack segments. This study investigates whether listeners rely on their experience with reduced word pronunciation variants during the processing of single segment reduction. We tested three groups of listeners in a lexical decision experiment with French words produced either with or without word-medial schwa (e.g., /ʀəvy/ and /ʀvy/ for revue). Participants also rated the relative frequencies of the two pronunciation variants of the words. If the recognition accuracy and reaction times (RTs) for a given listener group correlate best with the frequencies of occurrence holding for that given listener group, recognition is influenced by listeners' exposure to these variants. Native listeners' relative frequency ratings correlated well with their accuracy scores and RTs. Dutch advanced learners' accuracy scores and RTs were best predicted by their own ratings. In contrast, the accuracy and RTs from Dutch beginner learners of French could not be predicted by any relative frequency rating; the rating task was probably too difficult for them. The participant groups showed behaviour reflecting their difference in experience with the pronunciation variants. Our results strongly suggest that listeners store the frequencies of occurrence of pronunciation variants, and consequently the variants themselves.
The limits of metrical segmentation: intonation modulates infants' extraction of embedded trochees.

PubMed

Zahner, Katharina; Schönhuber, Muna; Braun, Bettina

2016-11-01

We tested German nine-month-olds' reliance on pitch and metrical stress for segmentation. In a headturn-preference paradigm, infants were familiarized with trisyllabic words (weak-strong-weak (WSW) stress pattern) in sentence-contexts. The words were presented in one of three naturally occurring intonation conditions: one in which high pitch was aligned with the stressed syllable and two misalignment conditions (with high pitch preceding vs. following the stressed syllable). Infants were tested on the SW unit of the WSW carriers. Experiment 1 showed recognition only when the stressed syllable was high-pitched. Intonation of test items (similar vs. dissimilar to familiarization) had no influence (Experiment 2). Thus, German nine-month-olds perceive stressed syllables as word onsets only when high-pitched, although they already generalize over different pitch contours. Different mechanisms underlying this pattern of results are discussed.
Infants with Williams syndrome detect statistical regularities in continuous speech.

PubMed

Cashon, Cara H; Ha, Oh-Ryeong; Graf Estes, Katharine; Saffran, Jenny R; Mervis, Carolyn B

2016-09-01

Williams syndrome (WS) is a rare genetic disorder associated with delays in language and cognitive development. The reasons for the language delay are unknown. Statistical learning is a domain-general mechanism recruited for early language acquisition. In the present study, we investigated whether infants with WS were able to detect the statistical structure in continuous speech. Eighteen 8- to 20-month-olds with WS were familiarized with 2 min of a continuous stream of synthesized nonsense words; the statistical structure of the speech was the only cue to word boundaries. They were tested on their ability to discriminate statistically-defined "words" and "part-words" (which crossed word boundaries) in the artificial language. Despite significant cognitive and language delays, infants with WS were able to detect the statistical regularities in the speech stream. These findings suggest that an inability to track the statistical properties of speech is unlikely to be the primary basis for the delays in the onset of language observed in infants with WS. These results provide the first evidence of statistical learning by infants with developmental delays. Copyright © 2016 Elsevier B.V. All rights reserved.
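The contrast between statistically defined "words" and "part-words" can be made concrete with a small Python sketch that computes forward transitional probabilities, TP(x→y) = count(xy) / count(x), over a syllable stream. The three-word language, its syllables, and the word order below are hypothetical stand-ins, not the stimuli used in the study.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Map each adjacent syllable pair (x, y) to count(xy) / count(x)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # The final syllable never starts a pair, so it is excluded here.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def mean_tp(item, tps):
    """Average transitional probability across the syllable pairs of an item."""
    pairs = list(zip(item, item[1:]))
    return sum(tps.get(pair, 0.0) for pair in pairs) / len(pairs)


# Hypothetical artificial language: three trisyllabic nonsense words.
words = {
    "A": ["tu", "pi", "ro"],
    "B": ["go", "la", "bu"],
    "C": ["da", "ko", "ti"],
}
# Hypothetical word order with no immediate repetitions.
order = "ABCBACABCACB"
stream = [syllable for label in order for syllable in words[label]]

tps = transitional_probabilities(stream)
word_score = mean_tp(words["A"], tps)          # stays inside one word
part_word = ["pi", "ro", "go"]                 # crosses the A-B boundary
part_word_score = mean_tp(part_word, tps)

print(f"word: {word_score:.2f}, part-word: {part_word_score:.2f}")
```

Within a word every transition is fully predictable, whereas a part-word includes one transition across a word boundary, where the following syllable varies; that difference is the only segmentation cue available in a stream of this kind.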

Observational Word Learning: Beyond Propose-But-Verify and Associative Bean Counting.

PubMed

Roembke, Tanja; McMurray, Bob

2016-04-01

Learning new words is difficult. In any naming situation, there are multiple possible interpretations of a novel word. Recent approaches suggest that learners may solve this problem by tracking co-occurrence statistics between words and referents across multiple naming situations (e.g. Yu & Smith, 2007), overcoming the ambiguity in any one situation. Yet, there remains debate around the underlying mechanisms. We conducted two experiments in which learners acquired eight word-object mappings using cross-situational statistics while eye-movements were tracked. These addressed four unresolved questions regarding the learning mechanism. First, eye-movements during learning showed evidence that listeners maintain multiple hypotheses for a given word and bring them all to bear in the moment of naming. Second, trial-by-trial analyses of accuracy suggested that listeners accumulate continuous statistics about word/object mappings, over and above prior hypotheses they have about a word. Third, consistent, probabilistic context can impede learning, as false associations between words and highly co-occurring referents are formed. Finally, a number of factors not previously considered in prior analysis impact observational word learning: knowledge of the foils, spatial consistency of the target object, and the number of trials between presentations of the same word. This evidence suggests that observational word learning may derive from a combination of gradual statistical or associative learning mechanisms and more rapid real-time processes such as competition, mutual exclusivity and even inference or hypothesis testing.
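Tracking co-occurrence statistics between words and referents across naming situations can be sketched in Python as a simple tally. This illustrates only the purely associative end of the accounts discussed above, not the combined mechanism the authors argue for; the trial structure, the novel words, and the function names are hypothetical.

```python
from collections import defaultdict


def cooccurrence_counts(trials):
    """Tally how often each word is heard while each referent is in view."""
    counts = defaultdict(lambda: defaultdict(int))
    for heard_words, visible_referents in trials:
        for word in heard_words:
            for referent in visible_referents:
                counts[word][referent] += 1
    return counts


def best_referent(counts, word):
    """Return the referent that co-occurred most often with the word.

    Candidates are sorted first so that ties resolve alphabetically.
    """
    candidates = counts[word]
    return max(sorted(candidates), key=candidates.get)


# Hypothetical trials: each is individually ambiguous (two objects in view).
trials = [
    (["blick"], ["dog", "ball"]),
    (["dax"], ["ball", "cup"]),
    (["blick"], ["dog", "cup"]),
    (["dax"], ["ball", "dog"]),
]

counts = cooccurrence_counts(trials)
for word in ("blick", "dax"):
    print(word, "->", best_referent(counts, word))
```

No single trial identifies the referent, but the tallies across trials do. The same tally also shows how a foil that frequently co-occurs with a word could attract a false association, as the abstract notes for consistent probabilistic context.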
Tutorial: Assessment and Analysis of Polysyllables in Young Children

ERIC Educational Resources Information Center

Masso, Sarah; McLeod, Sharynne; Baker, Elise

2018-01-01

Purpose: Polysyllables, words of 3 or more syllables, represent almost 30% of words used in American English. The purpose of this tutorial is to support speech-language pathologists' (SLPs') assessment and analysis of polysyllables, extending the focus of published assessment tools that focus on sampling and analyzing children's segmental accuracy…
Lexical Access for Phonetic Ambiguities.

ERIC Educational Resources Information Center

Spencer, N. J.; Wollman, Neil

1980-01-01

Reports on research that (1) suggests that phonetically ambiguous pairs (ice cream/I scream) have been used inaccurately to illustrate contextual effects in word segmentation, (2) supports unitary rather than exhaustive processing, and (3) supports the use of the concepts of word frequency and listener expectations instead of top-down, multiple…
From Statistics to Meaning: Infants’ Acquisition of Lexical Categories

PubMed Central

Lany, Jill; Saffran, Jenny R.

2013-01-01

Infants are highly sensitive to statistical patterns in their auditory language input that mark word categories (e.g., noun and verb). However, it is unknown whether experience with these cues facilitates the acquisition of semantic properties of word categories. In a study testing this hypothesis, infants first listened to an artificial language in which word categories were reliably distinguished by statistical cues (experimental group) or in which these properties did not cue category membership (control group). Both groups were then trained on identical pairings between the words and pictures from two categories (animals and vehicles). Only infants in the experimental group learned the trained associations between specific words and pictures. Moreover, these infants generalized the pattern to include novel pairings. These results suggest that experience with statistical cues marking lexical categories sets the stage for learning the meanings of individual words and for generalizing meanings to new category members. PMID:20424058
Connected word recognition using a cascaded neuro-computational model

NASA Astrophysics Data System (ADS)

Hoya, Tetsuya; van Leeuwen, Cees

2016-10-01

We propose a novel framework for processing a continuous speech stream that contains a varying number of words, as well as non-speech periods. Speech samples are segmented into word-tokens and non-speech periods. An augmented version of an earlier-proposed, cascaded neuro-computational model is used for recognising individual words within the stream. Simulation studies using both a multi-speaker-dependent and speaker-independent digit string database show that the proposed method yields a recognition performance comparable to that obtained by a benchmark approach using hidden Markov models with embedded training.
Learning builds on learning: Infants' use of native language sound patterns to learn words

PubMed Central

Graf Estes, Katharine

2014-01-01

The present research investigated how infants apply prior knowledge of environmental regularities to support new learning. The experiments tested whether infants could exploit experience with native language (English) phonotactic patterns to facilitate associating sounds with meanings during word learning. Fourteen-month-olds heard fluent speech that contained cues for detecting target words; the target words were embedded in sequences that occur across word boundaries. A separate group heard the target words embedded without word boundary cues. Infants then participated in an object label-learning task. With the opportunity to use native language patterns to segment the target words, infants subsequently learned the labels. Without this experience, infants failed. Novice word learners can take advantage of early learning about sounds to scaffold lexical development. PMID:24980741
Phonemic Awareness and Beginning Reading and Writing.

ERIC Educational Resources Information Center

Kamii, Constance; Manning, Maryann

2002-01-01

Examined English-speaking preschoolers' level of writing and their performance on oral-segmentation tasks. Found a close relationship between children's levels of writing and their levels of oral segmentation on a writing task in which they were asked to write four pairs of words, for example, "ham" and "hamster." Concluded…
Infant Directed Speech Enhances Statistical Learning in Newborn Infants: An ERP Study

PubMed Central

Teinonen, Tuomas; Tervaniemi, Mari; Huotilainen, Minna

2016-01-01

Statistical learning and the social contexts of language addressed to infants are hypothesized to play important roles in early language development. Previous behavioral work has found that the exaggerated prosodic contours of infant-directed speech (IDS) facilitate statistical learning in 8-month-old infants. Here we examined the neural processes involved in on-line statistical learning and investigated whether the use of IDS facilitates statistical learning in sleeping newborns. Event-related potentials (ERPs) were recorded while newborns were exposed to 12 pseudo-words, six spoken with exaggerated pitch contours of IDS and six spoken without exaggerated pitch contours (ADS) in ten alternating blocks. We examined whether ERP amplitudes for syllable position within a pseudo-word (word-initial vs. word-medial vs. word-final, indicating statistical word learning) and speech register (ADS vs. IDS) would interact. The ADS and IDS registers elicited similar ERP patterns for syllable position in an early 0–100 ms component but elicited different ERP effects in both the polarity and topographical distribution at 200–400 ms and 450–650 ms. These results provide the first evidence that the exaggerated pitch contours of IDS result in differences in brain activity linked to on-line statistical learning in sleeping newborns. PMID:27617967
Phonological Differentiation before Age Two in a Tagalog-Spanish-English Trilingual Child

ERIC Educational Resources Information Center

Montanari, Simona

2011-01-01

This study focuses on a trilingual toddler's ability to differentiate her Tagalog, Spanish and English productions on phonological/phonetic grounds. Working within the articulatory phonology framework, the word-initial segments produced by the child in Tagalog, Spanish and English words at age 1;10 were narrowly transcribed by two researchers and…
Speaking Rate Affects the Perception of Duration as a Suprasegmental Lexical-Stress Cue

ERIC Educational Resources Information Center

Reinisch, Eva; Jesse, Alexandra; McQueen, James M.

2011-01-01

Three categorization experiments investigated whether the speaking rate of a preceding sentence influences durational cues to the perception of suprasegmental lexical-stress patterns. Dutch two-syllable word fragments had to be judged as coming from one of two longer words that matched the fragment segmentally but differed in lexical stress…
Large Constituent Families Help Children Parse Compounds

ERIC Educational Resources Information Center

Krott, Andrea; Nicoladis, Elena

2005-01-01

The family size of the constituents of compound words, or the number of compounds sharing the constituents, has been shown to affect adults' access to compound words in the mental lexicon. The present study was designed to see if family size would affect children's segmentation of compounds. Twenty-five English-speaking children between 3;7 and…
Overcoming Barriers to Using Precision Teaching with a Web-Based Programme

ERIC Educational Resources Information Center

Hayes, Ben; Heather, Andrew; Jones, Daniel; Clarke, Christopher

2018-01-01

Precision Teaching (PT) is an evidence-based intervention, which research indicates is frequently not implemented following training, with few teachers using it in schools after training events. The web-based programme in this research focuses on word-level reading skills and targets blending and segmenting skills rather than whole word reading.…
Implicit Processing of Phonotactic Cues: Evidence from Electrophysiological and Vascular Responses

ERIC Educational Resources Information Center

Rossi, Sonja; Jurgenson, Ina B.; Hanulikova, Adriana; Telkemeyer, Silke; Wartenburger, Isabell; Obrig, Hellmuth

2011-01-01

Spoken word recognition is achieved via competition between activated lexical candidates that match the incoming speech input. The competition is modulated by prelexical cues that are important for segmenting the auditory speech stream into linguistic units. One such prelexical cue that listeners rely on in spoken word recognition is phonotactics.…
Performance impact of stop lists and morphological decomposition on word-word corpus-based semantic space models.

PubMed

Keith, Jeff; Westbury, Chris; Goldman, James

2015-09-01

Corpus-based semantic space models, which primarily rely on lexical co-occurrence statistics, have proven effective in modeling and predicting human behavior in a number of experimental paradigms that explore semantic memory representation. The most widely studied extant models, however, are strongly influenced by orthographic word frequency (e.g., Shaoul & Westbury, Behavior Research Methods, 38, 190-195, 2006). This implies that high-frequency closed-class words can potentially bias co-occurrence statistics. Because these closed-class words are purported to carry primarily syntactic, rather than semantic, information, the performance of corpus-based semantic space models may be improved by excluding closed-class words (using stop lists) from co-occurrence statistics, while retaining their syntactic information through other means (e.g., part-of-speech tagging and/or affixes from inflected word forms). Additionally, very little work has been done to explore the effect of employing morphological decomposition on the inflected forms of words in corpora prior to compiling co-occurrence statistics, despite (controversial) evidence that humans perform early morphological decomposition in semantic processing. In this study, we explored the impact of these factors on corpus-based semantic space models. Morphological decomposition appears to significantly improve performance in word-word co-occurrence semantic space models, providing some support for the claim that sublexical information (specifically, word morphology) plays a role in lexical semantic processing. An overall decrease in performance was observed in models employing stop lists (i.e., excluding closed-class words). Furthermore, we found some evidence that weakens the claim that closed-class words supply primarily syntactic information in word-word co-occurrence semantic space models.
Beyond Transitional Probability Computations: Extracting Word-Like Units when Only Statistical Information Is Available

ERIC Educational Resources Information Center

Perruchet, Pierre; Poulin-Charronnat, Benedicte

2012-01-01

Endress and Mehler (2009) reported that when adult subjects are exposed to an unsegmented artificial language composed from trisyllabic words such as ABX, YBC, and AZC, they are unable to distinguish between these words and what they coined as the "phantom-word" ABC in a subsequent test. This suggests that statistical learning generates knowledge…
Language as Labor: Semantic Activities as the Basis for Language Development.

ERIC Educational Resources Information Center

Riegel, Klaus F.

The processes by which the young child recognizes and regenerates some invariant and organizational properties of language are discussed. In these processes the child conjoins and contrasts recurrent segments--perhaps a recurrent word--of the messages presented to him. After repeated exposure to messages containing a common segment, the child…
Influence of Musical Expertise on Segmental and Tonal Processing in Mandarin Chinese

ERIC Educational Resources Information Center

Marie, Celine; Delogu, Franco; Lampis, Giulia; Belardinelli, Marta Olivetti; Besson, Mireille

2011-01-01

A same-different task was used to test the hypothesis that musical expertise improves the discrimination of tonal and segmental (consonant, vowel) variations in a tone language, Mandarin Chinese. Two four-word sequences (prime and target) were presented to French musicians and nonmusicians unfamiliar with Mandarin, and event-related brain…
The Exploitation of Subphonemic Acoustic Detail in L2 Speech Segmentation

ERIC Educational Resources Information Center

Shoemaker, Ellenor

2014-01-01

The current study addresses an aspect of second language (L2) phonological acquisition that has received little attention to date--namely, the acquisition of allophonic variation as a word boundary cue. The role of subphonemic variation in the segmentation of speech by native speakers has been indisputably demonstrated; however, the acquisition of…
Prediction in the service of comprehension: modulated early brain responses to omitted speech segments.

PubMed

Bendixen, Alexandra; Scharinger, Mathias; Strauß, Antje; Obleser, Jonas

2014-04-01

Speech signals are often compromised by disruptions originating from external (e.g., masking noise) or internal (e.g., inaccurate articulation) sources. Speech comprehension thus entails detecting and replacing missing information based on predictive and restorative neural mechanisms. The present study targets predictive mechanisms by investigating the influence of a speech segment's predictability on early, modality-specific electrophysiological responses to this segment's omission. Predictability was manipulated in simple physical terms in a single-word framework (Experiment 1) or in more complex semantic terms in a sentence framework (Experiment 2). In both experiments, final consonants of the German words Lachs ([laks], salmon) or Latz ([lats], bib) were occasionally omitted, resulting in the syllable La ([la], no semantic meaning), while brain responses were measured with multi-channel electroencephalography (EEG). In both experiments, the occasional presentation of the fragment La elicited a larger omission response when the final speech segment had been predictable. The omission response occurred ∼125-165 msec after the expected onset of the final segment and showed characteristics of the omission mismatch negativity (MMN), with generators in auditory cortical areas. Suggestive of a general auditory predictive mechanism at work, this main observation was robust against varying source of predictive information or attentional allocation, differing between the two experiments. Source localization further suggested the omission response enhancement by predictability to emerge from left superior temporal gyrus and left angular gyrus in both experiments, with additional experiment-specific contributions. These results are consistent with the existence of predictive coding mechanisms in the central auditory system, and suggestive of the general predictive properties of the auditory system to support spoken word recognition.
Interactions between statistical and semantic information in infant language development

PubMed Central

Lany, Jill; Saffran, Jenny R.

2013-01-01

Infants can use statistical regularities to form rudimentary word categories (e.g. noun, verb), and to learn the meanings common to words from those categories. Using an artificial language methodology, we probed the mechanisms by which two types of statistical cues (distributional and phonological regularities) affect word learning. Because linking distributional cues vs. phonological information to semantics make different computational demands on learners, we also tested whether their use is related to language proficiency. We found that 22-month-old infants with smaller vocabularies generalized using phonological cues; however, infants with larger vocabularies showed the opposite pattern of results, generalizing based on distributional cues. These findings suggest that both phonological and distributional cues marking word categories promote early word learning. Moreover, while correlations between these cues are important to forming word categories, we found infants’ weighting of these cues in subsequent word-learning tasks changes over the course of early language development. PMID:21884336

Segmenting lung fields in serial chest radiographs using both population-based and patient-specific shape statistics.

PubMed

Shi, Y; Qi, F; Xue, Z; Chen, L; Ito, K; Matsuo, H; Shen, D

2008-04-01

This paper presents a new deformable model using both population-based and patient-specific shape statistics to segment lung fields from serial chest radiographs. There are two novelties in the proposed deformable model. First, a modified scale invariant feature transform (SIFT) local descriptor, which is more distinctive than the general intensity and gradient features, is used to characterize the image features in the vicinity of each pixel. Second, the deformable contour is constrained by both population-based and patient-specific shape statistics, which yields more robust and accurate segmentation of lung fields for serial chest radiographs. In particular, for segmenting the initial time-point images, the population-based shape statistics are used to constrain the deformable contour; as more subsequent images of the same patient are acquired, the patient-specific shape statistics collected online from the previous segmentation results gradually take on a larger role. Thus, the patient-specific shape statistics are updated each time a new segmentation result is obtained, and they are further used to refine the segmentation results of all the available time-point images. Experimental results show that the proposed method is more robust and accurate than other active shape models in segmenting the lung fields from serial chest radiographs.
Infants Segment Continuous Events Using Transitional Probabilities

ERIC Educational Resources Information Center

Stahl, Aimee E.; Romberg, Alexa R.; Roseberry, Sarah; Golinkoff, Roberta Michnick; Hirsh-Pasek, Kathryn

2014-01-01

Throughout their 1st year, infants adeptly detect statistical structure in their environment. However, little is known about whether statistical learning is a primary mechanism for event segmentation. This study directly tests whether statistical learning alone is sufficient to segment continuous events. Twenty-eight 7- to 9-month-old infants…
Distributional structure in language: Contributions to noun–verb difficulty differences in infant word recognition

PubMed Central

Willits, Jon A.; Seidenberg, Mark S.; Saffran, Jenny R.

2014-01-01

What makes some words easy for infants to recognize, and other words difficult? We addressed this issue in the context of prior results suggesting that infants have difficulty recognizing verbs relative to nouns. In this work, we highlight the role played by the distributional contexts in which nouns and verbs occur. Distributional statistics predict that English nouns should generally be easier to recognize than verbs in fluent speech. However, there are situations in which distributional statistics provide similar support for verbs. The statistics for verbs that occur with the English morpheme –ing, for example, should facilitate verb recognition. In two experiments with 7.5- and 9.5-month-old infants, we tested the importance of distributional statistics for word recognition by varying the frequency of the contextual frames in which verbs occur. The results support the conclusion that distributional statistics are utilized by infant language learners and contribute to noun–verb differences in word recognition. PMID:24908342
Hidden word learning capacity through orthography in aphasia.

PubMed

Tuomiranta, Leena M; Càmara, Estela; Froudist Walsh, Seán; Ripollés, Pablo; Saunavaara, Jani P; Parkkola, Riitta; Martin, Nadine; Rodríguez-Fornells, Antoni; Laine, Matti

2014-01-01

The ability to learn to use new words is thought to depend on the integrity of the left dorsal temporo-frontal speech processing pathway. We tested this assumption in a chronic aphasic individual (AA) with an extensive left temporal lesion using a new-word learning paradigm. She exhibited severe phonological problems and Magnetic Resonance Imaging (MRI) suggested a complete disconnection of this left-sided white-matter pathway comprising the arcuate fasciculus (AF). Diffusion imaging tractography confirmed the disconnection of the direct segment and the posterior indirect segment of her left AF, essential components of the left dorsal speech processing pathway. Despite her left-hemispheric damage and moderate aphasia, AA learned to name and maintain the novel words in her active vocabulary on par with healthy controls up to 6 months after learning. This exceeds previous demonstrations of word learning ability in aphasia. Interestingly, AA's preserved word learning ability was modality-specific as it was observed exclusively for written words. Functional magnetic resonance imaging (fMRI) revealed that in contrast to normals, AA showed a significantly right-lateralized activation pattern in the temporal and parietal regions when engaged in reading. Moreover, learning of visually presented novel word-picture pairs also activated the right temporal lobe in AA. Both AA and the controls showed increased activation during learning of novel versus familiar word-picture pairs in the hippocampus, an area critical for associative learning. AA's structural and functional imaging results suggest that in a literate person, a right-hemispheric network can provide an effective alternative route for learning of novel active vocabulary. Importantly, AA's previously undetected word learning ability translated directly into therapy, as she could use written input also to successfully re-learn and maintain familiar words that she had lost due to her left hemisphere lesion. 
The effect of problem structure on problem-solving: an fMRI study of word versus number problems.

PubMed

Newman, Sharlene D; Willoughby, Gregory; Pruce, Benjamin

2011-09-02

It has long been thought that word problems are more difficult to solve than number/equation problems. However, recent findings have begun to bring this broadly believed idea into question. The current study examined the processing differences between these two types of problems. The behavioral results presented here failed to show an overwhelming advantage for number problems. In fact, there were more errors for the number problems than the word problems. The neuroimaging results reported demonstrate that there is significant overlap in the processing of what, on the surface, appear to be completely different problems that elicit different problem-solving strategies. Word and number problems rely on a general network responsible for problem-solving that includes the superior posterior parietal cortex, the horizontal segment of the intraparietal sulcus, which is hypothesized to be involved in problem representation and calculation, as well as the regions that have been linked to executive aspects of working memory, such as the pre-SMA and basal ganglia. While overlap was observed, significant differences were also found, primarily in language processing regions such as Broca's and Wernicke's areas for the word problems and the horizontal segment of the intraparietal sulcus for the number problems.
Children show right-lateralized effects of spoken word-form learning

PubMed Central

Nora, Anni; Karvonen, Leena; Renvall, Hanna; Parviainen, Tiina; Kim, Jeong-Young; Service, Elisabet; Salmelin, Riitta

2017-01-01

It is commonly thought that phonological learning is different in young children compared to adults, possibly due to the speech processing system not yet having reached full native-language specialization. However, the neurocognitive mechanisms of phonological learning in children are poorly understood. We employed magnetoencephalography (MEG) to track cortical correlates of incidental learning of meaningless word forms over two days as 6–8-year-olds overtly repeated them. Native (Finnish) pseudowords were compared with words of foreign sound structure (Korean) to investigate whether the cortical learning effects would be more dependent on previous proficiency in the language rather than maturational factors. Half of the items were encountered four times on the first day and once more on the following day. Incidental learning of these recurring word forms manifested as improved repetition accuracy and a correlated reduction of activation in the right superior temporal cortex, similarly for both languages and on both experimental days, and in contrast to a salient left-hemisphere emphasis previously reported in adults. We propose that children, when learning new word forms in either native or foreign language, are not yet constrained by left-hemispheric segmental processing and established sublexical native-language representations. Instead, they may rely more on supra-segmental contours and prosody. PMID:28158201
Difficulties of Drivers With Dyslexia When Reading Traffic Signs: Analysis of Reading, Eye Gazes, and Driving Performance.

PubMed

Tejero, Pilar; Insa, Beatriz; Roca, Javier

2018-03-01

A group of adult individuals with dyslexia and a matched group of normally reading individuals participated in a driving simulation experiment. Participants were asked to read the word presented on every direction traffic sign encountered along a route, as far as possible from the sign, maintaining driving performance. Word frequency and word length were manipulated as within-subject factors. We analyzed (a) reading accuracy, (b) how far the sign was when the participant started to give the response, (c) where the participant looked during the time leading up to the response, and (d) the variability of the vehicle's speed during that time and during driving on similar segments of the route that did not present the traffic signs. Individuals with dyslexia showed lower levels of performance in the reading task, the roles of word frequency and word length were more influential for them, and there was larger variability of the vehicle's speed during the time they were attempting to read the traffic sign, which did not occur during their driving on similar segments that did not present the targeted traffic signs. Therefore, the specific needs of individuals with dyslexia on the road should be considered in plans aimed at increasing traffic safety and fluidity.
Cross-linguistic differences in the use of durational cues for the segmentation of a novel language.

PubMed

Ordin, Mikhail; Polyanskaya, Leona; Laka, Itziar; Nespor, Marina

2017-07-01

It is widely accepted that duration can be exploited as phonological phrase final lengthening in the segmentation of a novel language, i.e., in extracting discrete constituents from continuous speech. The use of final lengthening for segmentation and its facilitatory effect has been claimed to be universal. However, lengthening in the world's languages can also mark lexically stressed syllables. Stress-induced lengthening can potentially be in conflict with right edge phonological phrase boundary lengthening. Thus, the processing of durational cues in segmentation can be dependent on the listener's linguistic background, e.g., on the specific correlates and unmarked location of lexical stress in the native language of the listener. We tested this prediction and found that segmentation by both German and Basque speakers is facilitated when lengthening is aligned with the word final syllable and is not affected by lengthening on either the penultimate or the antepenultimate syllables. Lengthening of the word final syllable, however, does not help Italian and Spanish speakers to segment continuous speech, and lengthening of the antepenultimate syllable impedes their performance. We have also found a facilitatory effect of penultimate lengthening on segmentation by Italians. These results confirm our hypothesis that processing of lengthening cues is not universal, and interpretation of lengthening as a phonological phrase final boundary marker in a novel language of exposure can be overridden by the phonology of lexical stress in the native language of the listener.
Computer-Mediated Assessment of Intelligibility in Aphasia and Apraxia of Speech

PubMed Central

Haley, Katarina L.; Roth, Heidi; Grindstaff, Enetta; Jacks, Adam

2011-01-01

Background: Previous work indicates that single word intelligibility tests developed for dysarthria are sensitive to segmental production errors in aphasic individuals with and without apraxia of speech. However, potential listener learning effects and difficulties adapting elicitation procedures to coexisting language impairments limit their applicability to left hemisphere stroke survivors. Aims: The main purpose of this study was to examine basic psychometric properties for a new monosyllabic intelligibility test developed for individuals with aphasia and/or AOS. A related purpose was to examine clinical feasibility and potential to standardize a computer-mediated administration approach. Methods & Procedures: A 600-item monosyllabic single word intelligibility test was constructed by assembling sets of phonetically similar words. Custom software was used to select 50 target words from this test in a pseudo-random fashion and to elicit and record production of these words by 23 speakers with aphasia and 20 neurologically healthy participants. To evaluate test-retest reliability, two identical sets of 50-word lists were elicited by requesting repetition after a live speaker model. To examine the effect of a different word set and auditory model, an additional set of 50 different words was elicited with a pre-recorded model. The recorded words were presented to normal-hearing listeners for identification via orthographic and multiple-choice response formats. To examine construct validity, production accuracy for each speaker was estimated via phonetic transcription and rating of overall articulation. Outcomes & Results: Recording and listening tasks were completed in less than six minutes for all speakers and listeners. Aphasic speakers were significantly less intelligible than neurologically healthy speakers and displayed a wide range of intelligibility scores. Test-retest and inter-listener reliability estimates were strong. No significant difference was found in scores based on recordings from a live model versus a pre-recorded model, but some individual speakers favored the live model. Intelligibility test scores correlated highly with segmental accuracy derived from broad phonetic transcription of the same speech sample and a motor speech evaluation. Scores correlated moderately with rated articulation difficulty. Conclusions: We describe a computerized, single-word intelligibility test that yields clinically feasible, reliable, and valid measures of segmental speech production in adults with aphasia. This tool can be used in clinical research to facilitate appropriate participant selection and to establish matching across comparison groups. For a majority of speakers, elicitation procedures can be standardized by using a pre-recorded auditory model for repetition. This assessment tool has potential utility for both clinical assessment and outcomes research. PMID:22215933
From Acoustic Segmentation to Language Processing: Evidence from Optical Imaging

PubMed Central

Obrig, Hellmuth; Rossi, Sonja; Telkemeyer, Silke; Wartenburger, Isabell

2010-01-01

During language acquisition in infancy and when learning a foreign language, the segmentation of the auditory stream into words and phrases is a complex process. Intuitively, learners use “anchors” to segment the acoustic speech stream into meaningful units like words and phrases. Regularities on a segmental (e.g., phonological) or suprasegmental (e.g., prosodic) level can provide such anchors. Regarding the neuronal processing of these two kinds of linguistic cues, a left-hemispheric dominance for segmental and a right-hemispheric bias for suprasegmental information has been reported in adults. Though lateralization is common in a number of higher cognitive functions, its prominence in language may also be a key to understanding the rapid emergence of the language network in infants and the ease with which we master our language in adulthood. One question here is whether the hemispheric lateralization is driven by linguistic input per se or whether non-linguistic, especially acoustic, factors “guide” the lateralization process. Methodologically, functional magnetic resonance imaging provides unsurpassed anatomical detail for such an enquiry. However, instrumental noise, experimental constraints and interference with EEG assessment limit its applicability, particularly in infants and also when investigating the link between auditory and linguistic processing. Optical methods have the potential to fill this gap. Here we review a number of recent studies using optical imaging to investigate hemispheric differences during segmentation and basic auditory feature analysis in language development. PMID:20725516
Three DIBELS Tasks vs. Three Informal Reading/Spelling Tasks: A Comparison of Predictive Validity

ERIC Educational Resources Information Center

Morris, Darrell; Trathen, Woodrow; Perney, Jan; Gill, Tom; Schlagal, Robert; Ward, Devery; Frye, Elizabeth M.

2017-01-01

Within a developmental framework, this study compared the predictive validity of three DIBELS tasks (phoneme segmentation fluency [PSF], nonsense word fluency [NWF], and oral reading fluency [ORF]) with that of three alternative tasks drawn from the field of reading (phonemic spelling [phSPEL], word recognition-timed [WR-t], and graded passage…
CarPrice versus CarpRice: Word Boundary Ambiguity Influences Saccade Target Selection during the Reading of Chinese Sentences

ERIC Educational Resources Information Center

Yan, Ming; Kliegl, Reinhold

2016-01-01

As a contribution to a theoretical debate about the degree of high-level influences on saccade targeting during sentence reading, we investigated eye movements during the reading of structurally ambiguous Chinese character strings and examined whether parafoveal word segmentation could influence saccade-target selection. As expected, ambiguous…
Age and Experience Shape Developmental Changes in the Neural Basis of Language-Related Learning

ERIC Educational Resources Information Center

McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella

2011-01-01

Very little is known about the neural underpinnings of language learning across the lifespan and how these might be modified by maturational and experiential factors. Building on behavioral research highlighting the importance of early word segmentation (i.e. the detection of word boundaries in continuous speech) for subsequent language learning,…
Promoting Early Literacy via Practicing Invented Spelling: A Comparison of Different Mediation Routines

ERIC Educational Resources Information Center

Levin, Iris; Aram, Dorit

2013-01-01

The present study compared the effects of different mediation routines provided to kindergartners from families of low socioeconomic status on the students' invented spelling attempts and on their gains obtained on spelling and other early literacy skills (letter naming, sounds of letters, word segmentation, and word decoding). The effects of the…
Do Newly Formed Word Representations Encode Non-Criterial Information?

ERIC Educational Resources Information Center

Curtin, Suzanne

2011-01-01

Lexical stress is useful for a number of language learning tasks. In particular, it helps infants segment the speech stream and identify phonetic contrasts. Recent work has demonstrated that infants aged 1 ; 0 can learn two novel words differing only in their stress pattern. In the current study, we ask whether infants aged 1 ; 0 store stress…
Invented Spelling, Word Stress, and Phonological Awareness in Relation to Reading Difficulties in Children

ERIC Educational Resources Information Center

Mehta, Sheena

2016-01-01

The purpose of the current research is to assess the clinical utility of an invented spelling tool and determine whether invented spelling and word stress (supra-segmental level measures) can also be used to better identify reading difficulties. The proposed invented spelling tool incorporated linguistic manipulations to alter the difficulty…
Prosodic and Phonemic Awareness in Children's Reading of Long and Short Words

ERIC Educational Resources Information Center

Wade-Woolley, Lesly

2016-01-01

Phonemic and prosodic awareness are both phonological processes that operate at different levels: the former at the level of the individual sound segment and the latter at the suprasegmental level across syllables. Both have been shown to be related to word reading in young readers. In this study we examine how these processes are differentially…
Auditory word identification in dyslexic and normally achieving readers.

PubMed

Bruno, Jennifer L; Manis, Franklin R; Keating, Patricia; Sperling, Anne J; Nakamoto, Jonathan; Seidenberg, Mark S

2007-07-01

The integrity of phonological representation/processing in dyslexic children was explored with a gating task in which children listened to successively longer segments (gates) of a word. At each gate, the task was to decide what the entire word was. Responses were scored for overall accuracy as well as the children's sensitivity to coarticulation from the final consonant. As a group, dyslexic children were less able than normally achieving readers to detect coarticulation present in the vowel portion of the word, particularly on the most difficult items, namely those ending in a nasal sound. Hierarchical regression and path analyses indicated that phonological awareness mediated the relation of gating and general language ability to word and pseudoword reading ability.
The A2iA French handwriting recognition system at the Rimes-ICDAR2011 competition

NASA Astrophysics Data System (ADS)

Menasri, Farès; Louradour, Jérôme; Bianne-Bernard, Anne-Laure; Kermorvant, Christopher

2012-01-01

This paper describes the system for the recognition of French handwriting submitted by A2iA to the competition organized at ICDAR2011 using the Rimes database. This system is composed of several recognizers based on three different recognition technologies, combined using a novel combination method. A framework multi-word recognition based on weighted finite state transducers is presented, using an explicit word segmentation, a combination of isolated word recognizers and a language model. The system was tested both for isolated word recognition and for multi-word line recognition and submitted to the RIMES-ICDAR2011 competition. This system outperformed all previously proposed systems on these tasks.

Grounding statistical learning in context: The effects of learning and retrieval contexts on cross-situational word learning.

PubMed

Chen, Chi-Hsin; Yu, Chen

2017-06-01

Natural language environments usually provide structured contexts for learning. This study examined the effects of semantically themed contexts-in both learning and retrieval phases-on statistical word learning. Results from 2 experiments consistently showed that participants had higher performance in semantically themed learning contexts. In contrast, themed retrieval contexts did not affect performance. Our work suggests that word learners are sensitive to statistical regularities not just at the level of individual word-object co-occurrences but also at another level containing a whole network of associations among objects and their properties.
Modeling Cross-Situational Word-Referent Learning: Prior Questions

ERIC Educational Resources Information Center

Yu, Chen; Smith, Linda B.

2012-01-01

Both adults and young children possess powerful statistical computation capabilities--they can infer the referent of a word from highly ambiguous contexts involving many words and many referents by aggregating cross-situational statistical information across contexts. This ability has been explained by models of hypothesis testing and by models of…
Lexical Ambiguities in the Vocabulary of Statistics

ERIC Educational Resources Information Center

Whitaker, Douglas

2016-01-01

Lexical ambiguities exist when two different meanings are ascribed to the same word. Such lexical ambiguities can be particularly problematic for learning material with technical words that have everyday meanings that are not the same as the technical meaning. This study reports on lexical ambiguities in six statistical words germane to statistics…
Dental measurements and Bolton index reliability and accuracy obtained from 2D digital, 3D segmented CBCT, and 3D intraoral laser scanner

PubMed Central

San José, Verónica; Bellot-Arcís, Carlos; Tarazona, Beatriz; Zamora, Natalia; O Lagravère, Manuel

2017-01-01

Background To compare the reliability and accuracy of direct and indirect dental measurements derived from two types of 3D virtual models: generated by intraoral laser scanning (ILS) and segmented cone beam computed tomography (CBCT), comparing these with a 2D digital model. Material and Methods One hundred patients were selected. All patients’ records included initial plaster models, an intraoral scan and a CBCT. Patients’ dental arches were scanned with the iTero® intraoral scanner while the CBCTs were segmented to create three-dimensional models. To obtain 2D digital models, plaster models were scanned using a conventional 2D scanner. When digital models had been obtained using these three methods, direct dental measurements were measured and indirect measurements were calculated. Differences between methods were assessed by means of paired t-tests and regression models. Intra- and inter-observer error was analyzed using Dahlberg’s d and coefficients of variation. Results Intraobserver and interobserver error for the ILS model was less than 0.44 mm while for segmented CBCT models, the error was less than 0.97 mm. ILS models provided statistically and clinically acceptable accuracy for all dental measurements, while CBCT models showed a tendency to underestimate measurements in the lower arch, although within the limits of clinical acceptability. Conclusions ILS and CBCT segmented models are both reliable and accurate for dental measurements. Integration of ILS with CBCT scans would provide dental and skeletal information together. Key words: CBCT, intraoral laser scanner, 2D digital models, 3D models, dental measurements, reliability. PMID:29410764
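As an illustration of the observer-error analysis named in the abstract above, Dahlberg's d for n pairs of repeated measurements is commonly computed as sqrt(sum(d_i^2) / (2n)), where d_i is the difference between the first and second measurement of item i. The sketch below assumes that standard formula. The function name and the sample values are hypothetical and are not taken from the study.

```python
import math

def dahlberg_error(first, second):
    """Dahlberg's d for paired repeated measurements (same units as the input)."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty sequences")
    # Sum of squared differences between the two measurement passes.
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * len(first)))

# Hypothetical tooth widths (mm) measured twice by the same observer.
first_pass = [8.5, 7.0, 7.5, 10.0]
second_pass = [8.0, 7.0, 8.0, 10.0]
print(dahlberg_error(first_pass, second_pass))
```

A value in millimetres can then be compared against a clinical threshold such as the sub-millimetre bounds reported in the abstract.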
A perceptive method for handwritten text segmentation

NASA Astrophysics Data System (ADS)

Lemaitre, Aurélie; Camillerapp, Jean; Coüasnon, Bertrand

2011-01-01

This paper presents a new method to address the problem of handwritten text segmentation into text lines and words. Thus, we propose a method based on the cooperation among points of view that enables the localization of the text lines in a low resolution image, and then to associate the pixels at a higher level of resolution. Thanks to the combination of levels of vision, we can detect overlapping characters and re-segment the connected components during the analysis. Then, we propose a segmentation of lines into words based on the cooperation among digital data and symbolic knowledge. The digital data are obtained from distances inside a Delaunay graph, which gives a precise distance between connected components, at the pixel level. We introduce structural rules in order to take into account some generic knowledge about the organization of a text page. This cooperation among information gives a bigger power of expression and ensures the global coherence of the recognition. We validate this work using the metrics and the database proposed for the segmentation contest of ICDAR 2009. Thus, we show that our method obtains very interesting results, compared to the other methods of the literature. More precisely, we are able to deal with slope and curvature, overlapping text lines and varied kinds of writings, which are the main difficulties met by the other methods.
Phoneme Segmenting Alignment with the Common Core Foundational Skills Standard Two: Grades K-1. Technical Report #1227

ERIC Educational Resources Information Center

Sáez, Leilani; Irvin, P. Shawn; Alonzo, Julie; Tindal, Gerald

2012-01-01

In 2006, the easyCBM reading assessment system was developed to support the progress monitoring of phoneme segmenting, letter names and sounds recognition, word reading, passage reading fluency, and comprehension skill development in elementary schools. More recently, the Common Core Standards in English Language Arts have been introduced as a…
The Functional Unit of Japanese Word Naming: Evidence from Masked Priming

ERIC Educational Resources Information Center

Verdonschot, Rinus G.; Kiyama, Sachiko; Tamaoka, Katsuo; Kinoshita, Sachiko; La Heij, Wido; Schiller, Niels O.

2011-01-01

Theories of language production generally describe the segment as the basic unit in phonological encoding (e.g., Dell, 1988; Levelt, Roelofs, & Meyer, 1999). However, there is also evidence that such a unit might be language specific. Chen, Chen, and Dell (2002), for instance, found no effect of single segments when using a preparation…
A neural network model of semantic memory linking feature-based object representation and words.

PubMed

Cuppini, C; Magosso, E; Ursino, M

2009-06-01

Recent theories in cognitive neuroscience suggest that semantic memory is a distributed process, which involves many cortical areas and is based on a multimodal representation of objects. The aim of this work is to extend a previous model of object representation to realize a semantic memory, in which sensory-motor representations of objects are linked with words. The model assumes that each object is described as a collection of features, coded in different cortical areas via a topological organization. Features in different objects are segmented via gamma-band synchronization of neural oscillators. The feature areas are further connected with a lexical area, devoted to the representation of words. Synapses among the feature areas, and among the lexical area and the feature areas are trained via a time-dependent Hebbian rule, during a period in which individual objects are presented together with the corresponding words. Simulation results demonstrate that, during the retrieval phase, the network can deal with the simultaneous presence of objects (from sensory-motor inputs) and words (from acoustic inputs), can correctly associate objects with words and segment objects even in the presence of incomplete information. Moreover, the network can realize some semantic links among words representing objects with shared features. These results support the idea that semantic memory can be described as an integrated process, whose content is retrieved by the co-activation of different multimodal regions. In perspective, extended versions of this model may be used to test conceptual theories, and to provide a quantitative assessment of existing data (for instance concerning patients with neural deficits).
Accommodation of end-state comfort reveals subphonemic planning in speech

PubMed Central

Gick, Bryan

2015-01-01

Applying Rosenbaum’s “end-state comfort” hypothesis (Rosenbaum et al., 1992, 1996) to tongue motion provides evidence of long-distance subphonemic planning in speech. Speakers’ tongue postures may anticipate upcoming speech up to three segments, two syllables, and a morpheme or word boundary later. We used m-mode ultrasound imaging to measure the direction of tongue tip/blade movements for known variants of flap/tap allophones of North American English /t/ and /d/. Results show that speakers produce different flap variants early in words or word sequences so as to facilitate the kinematic needs of flap/tap or other /r/ variants that appear later in the word or word sequence. Similar results were also observed across word boundaries, indicating that this is not a lexical effect. PMID:25790787
Interrupted Monosyllabic Words: The Effects of Ten Interruption Locations on Recognition Performance by Older Listeners with Sensorineural Hearing Loss.

PubMed

Wilson, Richard H; Sharrett, Kadie C

2017-01-01

Two previous experiments from our laboratory with 70 interrupted monosyllabic words demonstrated that recognition performance was influenced by the temporal location of the interruption pattern. The interruption pattern (10 interruptions/sec, 50% duty cycle) was always the same and referenced word onset; the only difference between the patterns was the temporal location of the on- and off-segments of the interruption cycle. In the first study, both young and older listeners obtained better recognition performances when the initial on-segment coincided with word onset than when the initial on-segment was delayed by 50 msec. The second experiment with 24 young listeners detailed recognition performance as the interruption pattern was incremented in 10-msec steps through the 0- to 90-msec onset range. Across the onset conditions, 95% of the functions were either flat or U-shaped. The purpose of this study was to define the effects that interruption pattern locations had on word recognition by older listeners with sensorineural hearing loss as the interruption pattern incremented, re: word onset, from 0 to 90 msec in 10-msec steps. The study used a repeated-measures design with ten interruption patterns (onset conditions) and one uninterrupted condition. Twenty-four older males (mean = 69.6 yr) with sensorineural hearing loss participated in two 1-hour sessions. The three-frequency pure-tone average was 24.0 dB HL and word recognition was ≥80% correct. Seventy consonant-vowel nucleus-consonant words formed the corpus of materials with 25 additional words used for practice. For each participant, the 700 interrupted stimuli (70 words by 10 onset conditions), the 70 words uninterrupted, and two practice lists each were randomized and recorded on compact disc in 33 tracks of 25 words each. The data were analyzed at the participant and word levels and compared to the results obtained earlier on 24 young listeners with normal hearing.
The mean recognition performance on the 70 words uninterrupted was 91.0% with an overall mean performance on the ten interruption conditions of 63.2% (range: 57.9-69.3%), compared to 80.4% (range: 73.0-87.7%) obtained earlier on the young adults. The best performances were at the extremes of the onset conditions. Standard deviations ranged from 22.1% to 28.1% (24 participants) and from 9.2% to 12.8% (70 words). An arithmetic algorithm categorized the shapes of the psychometric functions across the ten onset conditions. With the older participants in the current study, 40% of the functions were flat, 41.4% were U-shaped, and 18.6% were inverted U-shaped, which compared favorably to the function shapes by the young listeners in the earlier study of 50.0%, 41.4%, and 8.6%, respectively. There were two words on which the older listeners had 40% better performances. Collectively, the data are orderly, but at the individual word or participant level, the data are somewhat volatile, which may reflect auditory processing differences between the participant groups. The diversity of recognition performances by the older listeners on the ten interruption conditions with each of the 70 words supports the notion that the term hearing loss is inclusive of processes well beyond the filtering produced by end-organ sensitivity deficits. American Academy of Audiology
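The timing arithmetic behind the interruption pattern can be sketched as follows. At 10 interruptions/sec each cycle lasts 100 msec, and a 50% duty cycle gives a 50-msec on-segment followed by a 50-msec off-segment. The function below lists the on-segments that fall inside a word for a given onset condition. It assumes the pattern is strictly periodic and simply shifted by the onset value, which the abstract does not state explicitly; the function name and the 300-msec word duration are hypothetical.

```python
def on_segments(word_ms, onset_ms, rate_hz=10, duty=0.5):
    """Return (start, end) on-segments in msec, relative to word onset."""
    period = 1000 / rate_hz   # 100 msec per interruption cycle
    on_ms = period * duty     # 50 msec on, 50 msec off
    segments = []
    # Start one cycle early so a shifted periodic pattern still covers time 0.
    start = onset_ms - period
    while start < word_ms:
        lo, hi = max(start, 0), min(start + on_ms, word_ms)
        if lo < hi:
            segments.append((lo, hi))
        start += period
    return segments

# Hypothetical 300-msec word under the 0-msec and 50-msec onset conditions.
print(on_segments(300, 0))
print(on_segments(300, 50))
```

Under these assumptions the 0-msec condition passes the first 50 msec of the word, while the 50-msec condition removes it, which is the contrast the first study in the abstract examined.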
Validation tools for image segmentation

NASA Astrophysics Data System (ADS)

Padfield, Dirk; Ross, James

2009-02-01

A large variety of image analysis tasks require the segmentation of various regions in an image. For example, segmentation is required to generate accurate models of brain pathology that are important components of modern diagnosis and therapy. While the manual delineation of such structures gives accurate information, the automatic segmentation of regions such as the brain and tumors from such images greatly enhances the speed and repeatability of quantifying such structures. The ubiquitous need for such algorithms has led to a wide range of image segmentation algorithms with various assumptions, parameters, and robustness. The evaluation of such algorithms is an important step in determining their effectiveness. Therefore, rather than developing new segmentation algorithms, we here describe validation methods for segmentation algorithms. Using similarity metrics comparing the automatic to manual segmentations, we demonstrate methods for optimizing the parameter settings for individual cases and across a collection of datasets using the Design of Experiment framework. We then employ statistical analysis methods to compare the effectiveness of various algorithms. We investigate several region-growing algorithms from the Insight Toolkit and compare their accuracy to that of a separate statistical segmentation algorithm. The segmentation algorithms are used with their optimized parameters to automatically segment the brain and tumor regions in MRI images of 10 patients. The validation tools indicate that none of the ITK algorithms studied are able to outperform with statistical significance the statistical segmentation algorithm although they perform reasonably well considering their simplicity.
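The abstract above does not name the similarity metrics it uses. As a minimal sketch, the Dice coefficient and the Jaccard index are two widely used overlap measures for comparing an automatic segmentation with a manual one, and they are assumed here purely for illustration. The masks and function names are hypothetical.

```python
def _flatten(mask):
    """Flatten a 2D binary mask (nested lists of 0/1) into a list of bools."""
    return [bool(v) for row in mask for v in row]

def dice_coefficient(auto_mask, manual_mask):
    """Dice overlap: 2*|A and M| / (|A| + |M|)."""
    auto, manual = _flatten(auto_mask), _flatten(manual_mask)
    if len(auto) != len(manual):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a and m for a, m in zip(auto, manual))
    total = sum(auto) + sum(manual)
    # Two empty masks are treated as a perfect match by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total

def jaccard_index(auto_mask, manual_mask):
    """Jaccard overlap: |A and M| / |A or M|."""
    auto, manual = _flatten(auto_mask), _flatten(manual_mask)
    if len(auto) != len(manual):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a and m for a, m in zip(auto, manual))
    union = sum(a or m for a, m in zip(auto, manual))
    return 1.0 if union == 0 else intersection / union

# Hypothetical 2x3 masks: 1 marks a pixel labelled as tumor.
auto = [[0, 1, 1], [0, 1, 0]]
manual = [[0, 1, 0], [0, 1, 1]]
print(dice_coefficient(auto, manual), jaccard_index(auto, manual))
```

Scores of this kind, computed per case, are what a parameter search or a paired statistical comparison between algorithms would operate on.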
Fine-Grained Sensitivity to Statistical Information in Adult Word Learning

ERIC Educational Resources Information Center

Vouloumanos, Athena

2008-01-01

A language learner trying to acquire a new word must often sift through many potential relations between particular words and their possible meanings. In principle, statistical information about the distribution of those mappings could serve as one important source of data, but little is known about whether learners can in fact track multiple…
Multivariate statistical model for 3D image segmentation with application to medical images.

PubMed

John, Nigel M; Kabuka, Mansur R; Ibrahim, Mohamed O

2003-12-01

In this article we describe a statistical model that was developed to segment brain magnetic resonance images. The statistical segmentation algorithm was applied after a pre-processing stage involving the use of a 3D anisotropic filter along with histogram equalization techniques. The segmentation algorithm makes use of prior knowledge and a probability-based multivariate model designed to semi-automate the process of segmentation. The algorithm was applied to images obtained from the Center for Morphometric Analysis at Massachusetts General Hospital as part of the Internet Brain Segmentation Repository (IBSR). The developed algorithm showed improved accuracy over the k-means, adaptive Maximum Apriori Probability (MAP), biased MAP, and other algorithms. Experimental results showing the segmentation and the results of comparisons with other algorithms are provided. Results are based on an overlap criterion against expertly segmented images from the IBSR. The algorithm produced average results of approximately 80% overlap with the expertly segmented images (compared with 85% for manual segmentation and 55% for other algorithms).
Effect of Name Change of Schizophrenia on Mass Media Between 1985 and 2013 in Japan: A Text Data Mining Analysis

PubMed Central

Koike, Shinsuke; Yamaguchi, Sosei; Ojio, Yasutaka; Ohta, Kazusa; Ando, Shuntaro

2016-01-01

Background: Mass media such as newspapers and TV news affect mental health-related stigma. In Japan, the name of schizophrenia was changed in 2002 for the purposes of stigma reduction; however, little has been known about the effect of name change of schizophrenia on mass media. Method: Articles including old and new names of schizophrenia, depressive disorder, and diabetes mellitus (DM) in headlines and/or text were extracted from 23169092 articles in 4 major Japanese newspapers and 1 TV news program (1985–2013). The trajectory of the number of articles including each term was determined across years. Then, all text in news headlines was segmented as per part-of-speech level using text data mining. Segmented words were classified into 6 categories and in each category of extracted words by target term and period were also tested. Results: Total 51789 and 1106 articles including target terms in newspaper articles and TV news segments were obtained, respectively. The number of articles including the target terms increased across years. Relative increase was observed in the articles published on schizophrenia since 2003 compared with those on DM and between 2000 and 2005 compared with those on depressive disorder. Word tendency used in headlines was equivalent before and after 2002 for the articles including each target term. Articles for schizophrenia contained more negative words than depressive disorder and DM (31.5%, 16.0%, and 8.2%, respectively). Conclusions: Name change of schizophrenia had a limited effect on the articles published and little effect on its contents. PMID:26614786
The Processing of the Right-Sided Accent Mark in Left Neglect Dyslexia

ERIC Educational Resources Information Center

Cubelli, Roberto; Beschin, Nicoletta

2005-01-01

Italian polysyllabic words with stress falling on the last syllable are written with a diacritic sign on the last vowel. It allows discrimination between two words with the same orthographic segments (e.g., papa [pope], papà [dad]). The effect of the accent mark in left neglect dyslexia has never been investigated. In the current study, six…
Hemispheric Asymmetry for Linguistic Prosody: A Study of Stress Perception in Croatian

ERIC Educational Resources Information Center

Mildner, Vesna

2004-01-01

The aim of the study was to test for possible functional cerebral asymmetry in processing one segment of linguistic prosody, namely word stress, in Croatian. The test material consisted of eight tokens of the word "pas" under a falling accent, varying only in vowel duration between 119 and 185 ms, attached to the end of a frame sentence. The…
Perception and production of rise-fall intonation in American English.

PubMed

Steppling, Mary L; Montgomery, Allen A

2002-04-01

At the segmental level, the rate of speaking affects the degree of physical undershoot of articulatory targets and the resulting perception. Little is known regarding evidence of these effects at the suprasegmental level, particularly in intonation. In this study, the effect of rate of speaking on fundamental frequency and on perceptual judgments of peak pitch in a rise-fall intonation pattern was investigated. First, speakers produced rise-fall intonations in sentence contexts at slow, normal, and fast speaking rates. Peak fundamental frequencies (F0) of the slow productions were significantly lower than those of the normal or fast productions. The mean normal rate production of the word Miami was used as a model for the target word in a series of subsequent perceptual experiments. Altering the duration of the target word to represent slow, normal, and fast rates of speaking did not affect listener judgment of peak pitch. Finally, the pitch of the target word was measured in a sentence context. No differences between peak pitch in isolation or in sentence context were found. It was concluded that the production and perception of this form of intonation was not subject to the effects of rate that are seen at the segmental level.
Topic segmentation via community detection in complex networks

NASA Astrophysics Data System (ADS)

de Arruda, Henrique F.; Costa, Luciano da F.; Amancio, Diego R.

2016-06-01

Many real systems have been modeled in terms of network concepts, and written texts are a particular example of information networks. In recent years, the use of network methods to analyze language has allowed the discovery of several interesting effects, including the proposition of novel models to explain the emergence of fundamental universal patterns. While syntactical networks, one of the most prevalent networked models of written texts, display both scale-free and small-world properties, such a representation fails in capturing other textual features, such as the organization in topics or subjects. We propose a novel network representation whose main purpose is to capture the semantical relationships of words in a simple way. To do so, we link all words co-occurring in the same semantic context, which is defined in a threefold way. We show that the proposed representations favor the emergence of communities of semantically related words, and this feature may be used to identify relevant topics. The proposed methodology to detect topics was applied to segment selected Wikipedia articles. We found that, in general, our methods outperform traditional bag-of-words representations, which suggests that a high-level textual representation may be useful to study the semantical features of texts.
Identification and Definition of Lexically Ambiguous Words in Statistics by Tutors and Students

ERIC Educational Resources Information Center

Richardson, Alice M.; Dunn, Peter K.; Hutchins, Rene

2013-01-01

Lexical ambiguity arises when a word from everyday English is used differently in a particular discipline, such as statistics. This paper reports on a project that begins by identifying tutors' perceptions of words that are potentially lexically ambiguous to students, in two different ways. Students' definitions of nine lexically ambiguous words…

Survey statistics of automated segmentations applied to optical imaging of mammalian cells.

PubMed

Bajcsy, Peter; Cardone, Antonio; Chalfoun, Joe; Halter, Michael; Juba, Derek; Kociolek, Marcin; Majurski, Michael; Peskin, Adele; Simon, Carl; Simon, Mylene; Vandecreme, Antoine; Brady, Mary

2015-10-15

The goal of this survey paper is to overview cellular measurements using optical microscopy imaging followed by automated image segmentation. The cellular measurements of primary interest are taken from mammalian cells and their components. They are denoted as two- or three-dimensional (2D or 3D) image objects of biological interest. In our applications, such cellular measurements are important for understanding cell phenomena, such as cell counts, cell-scaffold interactions, cell colony growth rates, or cell pluripotency stability, as well as for establishing quality metrics for stem cell therapies. In this context, this survey paper is focused on automated segmentation as a software-based measurement leading to quantitative cellular measurements. We define the scope of this survey and a classification schema first. Next, all found and manually filtered publications are classified according to the main categories: (1) objects of interests (or objects to be segmented), (2) imaging modalities, (3) digital data axes, (4) segmentation algorithms, (5) segmentation evaluations, (6) computational hardware platforms used for segmentation acceleration, and (7) object (cellular) measurements. Finally, all classified papers are converted programmatically into a set of hyperlinked web pages with occurrence and co-occurrence statistics of assigned categories. The survey paper presents to a reader: (a) the state-of-the-art overview of published papers about automated segmentation applied to optical microscopy imaging of mammalian cells, (b) a classification of segmentation aspects in the context of cell optical imaging, (c) histogram and co-occurrence summary statistics about cellular measurements, segmentations, segmented objects, segmentation evaluations, and the use of computational platforms for accelerating segmentation execution, and (d) open research problems to pursue.
The novel contributions of this survey paper are: (1) a new type of classification of cellular measurements and automated segmentation, (2) statistics about the published literature, and (3) a web hyperlinked interface to classification statistics of the surveyed papers at https://isg.nist.gov/deepzoomweb/resources/survey/index.html.
The company objects keep: Linking referents together during cross-situational word learning.

PubMed

Zettersten, Martin; Wojcik, Erica; Benitez, Viridiana L; Saffran, Jenny

2018-04-01

Learning the meanings of words involves not only linking individual words to referents but also building a network of connections among entities in the world, concepts, and words. Previous studies reveal that infants and adults track the statistical co-occurrence of labels and objects across multiple ambiguous training instances to learn words. However, it is less clear whether, given distributional or attentional cues, learners also encode associations amongst the novel objects. We investigated the consequences of two types of cues that highlighted object-object links in a cross-situational word learning task: distributional structure - how frequently the referents of novel words occurred together - and visual context - whether the referents were seen on matching backgrounds. Across three experiments, we found that in addition to learning novel words, adults formed connections between frequently co-occurring objects. These findings indicate that learners exploit statistical regularities to form multiple types of associations during word learning.
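The co-occurrence tracking described in this abstract can be illustrated with a small tally. This is only a sketch under stated assumptions, not the authors' model: the trial format, the labels ("wug", "dax", "fep"), the object names, and the functions `tally_cooccurrences` and `best_referent` are all hypothetical. It counts label-object pairings across ambiguous trials and, separately, how often two objects appear together.

```python
from collections import Counter
from itertools import combinations


def tally_cooccurrences(trials):
    """Count label-object and object-object co-occurrences across trials.

    Each trial is a (labels, objects) pair; within a trial, every label
    is treated as a possible name for every object (the ambiguity that
    cross-situational learning has to resolve).
    """
    label_object = Counter()
    object_object = Counter()
    for labels, objects in trials:
        for label in labels:
            for obj in objects:
                label_object[(label, obj)] += 1
        # Unordered pairs of distinct objects seen in the same trial.
        for pair in combinations(sorted(set(objects)), 2):
            object_object[pair] += 1
    return label_object, object_object


def best_referent(label, label_object):
    """Return the object that co-occurred most often with the label."""
    candidates = {o: n for (l, o), n in label_object.items() if l == label}
    return max(candidates, key=candidates.get)


# Hypothetical training trials: two labels and two objects per trial.
trials = [
    (["wug", "dax"], ["A", "B"]),
    (["wug", "fep"], ["A", "C"]),
    (["dax", "fep"], ["B", "C"]),
    (["wug", "dax"], ["A", "B"]),
]
label_object, object_object = tally_cooccurrences(trials)
```

With these made-up trials, `best_referent("wug", label_object)` picks object "A", and `object_object` records that "A" and "B" were seen together more often than any other pair, which is the kind of object-object link the study examines.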
Age-Related Differences in Speech Rate Perception Do Not Necessarily Entail Age-Related Differences in Speech Rate Use

ERIC Educational Resources Information Center

Heffner, Christopher C.; Newman, Rochelle S.; Dilley, Laura C.; Idsardi, William J.

2015-01-01

Purpose: A new literature has suggested that speech rate can influence the parsing of words quite strongly in speech. The purpose of this study was to investigate differences between younger adults and older adults in the use of context speech rate in word segmentation, given that older adults perceive timing information differently from younger…
How Many Is Enough?—Statistical Principles for Lexicostatistics

PubMed Central

Zhang, Menghan; Gong, Tao

2016-01-01

Lexicostatistics has been applied in linguistics to inform phylogenetic relations among languages. There are two important yet not well-studied parameters in this approach: the conventional size of vocabulary list to collect potentially true cognates and the minimum matching instances required to confirm a recurrent sound correspondence. Here, we derive two statistical principles from stochastic theorems to quantify these parameters. These principles validate the practice of using the Swadesh 100- and 200-word lists to indicate degree of relatedness between languages, and enable a frequency-based, dynamic threshold to detect recurrent sound correspondences. Using statistical tests, we further evaluate the generality of the Swadesh 100-word list compared to the Swadesh 200-word list and other 100-word lists sampled randomly from the Swadesh 200-word list. All these provide mathematical support for applying lexicostatistics in historical and comparative linguistics. PMID:28018261
Sizing up the competition: quantifying the influence of the mental lexicon on auditory and visual spoken word recognition.

PubMed

Strand, Julia F; Sommers, Mitchell S

2011-09-01

Much research has explored how spoken word recognition is influenced by the architecture and dynamics of the mental lexicon (e.g., Luce and Pisoni, 1998; McClelland and Elman, 1986). A more recent question is whether the processes underlying word recognition are unique to the auditory domain, or whether visually perceived (lipread) speech may also be sensitive to the structure of the mental lexicon (Auer, 2002; Mattys, Bernstein, and Auer, 2002). The current research was designed to test the hypothesis that both aurally and visually perceived spoken words are isolated in the mental lexicon as a function of their modality-specific perceptual similarity to other words. Lexical competition (the extent to which perceptually similar words influence recognition of a stimulus word) was quantified using metrics that are well-established in the literature, as well as a statistical method for calculating perceptual confusability based on the phi-square statistic. Both auditory and visual spoken word recognition were influenced by modality-specific lexical competition as well as stimulus word frequency. These findings extend the scope of activation-competition models of spoken word recognition and reinforce the hypothesis (Auer, 2002; Mattys et al., 2002) that perceptual and cognitive properties underlying spoken word recognition are not specific to the auditory domain. In addition, the results support the use of the phi-square statistic as a better predictor of lexical competition than metrics currently used in models of spoken word recognition. © 2011 Acoustical Society of America
The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth

ERIC Educational Resources Information Center

Steyvers, Mark; Tenenbaum, Joshua B.

2005-01-01

We present statistical analyses of the large-scale structure of 3 types of semantic networks: word associations, WordNet, and Roget's Thesaurus. We show that they have a small-world structure, characterized by sparse connectivity, short average path lengths between words, and strong local clustering. In addition, the distributions of the number of…
Word recognition using a lexicon constrained by first/last character decisions

NASA Astrophysics Data System (ADS)

Zhao, Sheila X.; Srihari, Sargur N.

1995-03-01

In lexicon-based recognition of machine-printed word images, the size of the lexicon can be quite extensive. The recognition performance is closely related to the size of the lexicon. Recognition performance drops quickly when lexicon size increases. Here, we present an algorithm to improve the word recognition performance by reducing the size of the given lexicon. The algorithm utilizes the information provided by the first and last characters of a word to reduce the size of the given lexicon. Given a word image and a lexicon that contains the word in the image, the first and last characters are segmented and then recognized by a character classifier. The possible candidates based on the results given by the classifier are selected, which give us the sub-lexicon. Then a word shape analysis algorithm is applied to produce the final ranking of the given lexicon. The algorithm was tested on a set of machine-printed gray-scale word images which includes a wide range of print types and qualities.
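The lexicon-reduction step can be sketched as a simple filter. This is an illustration only, assuming the character classifier has already returned a set of candidate letters for the first and last positions; the function name `reduce_lexicon` and the example lexicon are hypothetical, and the word shape analysis that produces the final ranking is not modeled.

```python
def reduce_lexicon(lexicon, first_candidates, last_candidates):
    """Keep only words whose first and last characters are both among
    the candidates proposed by the character classifier."""
    return [
        word
        for word in lexicon
        if word
        and word[0] in first_candidates
        and word[-1] in last_candidates
    ]


# Hypothetical lexicon and classifier output (candidate character sets).
lexicon = ["march", "may", "month", "june", "july"]
sub_lexicon = reduce_lexicon(lexicon, {"m", "n"}, {"h", "b"})
```

Passing several candidates per position, rather than a single best guess, keeps the true word in the sub-lexicon even when the classifier's top choice is wrong.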
Word learning mechanisms.

PubMed

He, Angela Xiaoxue; Arunachalam, Sudha

2017-07-01

How do children acquire the meanings of words? Many word learning mechanisms have been proposed to guide learners through this challenging task. Despite the availability of rich information in the learner's linguistic and extralinguistic input, the word-learning task is insurmountable without such mechanisms for filtering through and utilizing that information. Different kinds of words, such as nouns denoting object concepts and verbs denoting event concepts, require to some extent different kinds of information and, therefore, access to different kinds of mechanisms. We review some of these mechanisms to examine the relationship between the input that is available to learners and learners' intake of that input-that is, the organized, interpreted, and stored representations they form. We discuss how learners segment individual words from the speech stream and identify their grammatical categories, how they identify the concepts denoted by these words, and how they refine their initial representations of word meanings. WIREs Cogn Sci 2017, 8:e1435. doi: 10.1002/wcs.1435 This article is categorized under: Linguistics > Language Acquisition Psychology > Language. © 2017 Wiley Periodicals, Inc.
Characterizing the D2 statistic: word matches in biological sequences.

PubMed

Forêt, Sylvain; Wilson, Susan R; Burden, Conrad J

2009-01-01

Word matches are often used in sequence comparison methods, either as a measure of sequence similarity or in the first search steps of algorithms such as BLAST or BLAT. The D2 statistic is the number of matches of words of k letters between two sequences. Recent advances have been made in the characterization of this statistic and in the approximation of its distribution. Here, these results are extended to the case of approximate word matches. We compute the exact value of the variance of the D2 statistic for the case of a uniform letter distribution, and introduce a method to provide accurate approximations of the variance in the remaining cases. This enables the distribution of D2 to be approximated for typical situations arising in biological research. We apply these results to the identification of cis-regulatory modules, and show that this method detects such sequences with a high accuracy. The ability to approximate the distribution of D2 for both exact and approximate word matches will enable the use of this statistic in a more precise manner for sequence comparison, database searches, and identification of transcription factor binding sites.
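For exact matches, the D2 statistic defined above (the number of matching k-letter words between two sequences) can be computed directly from k-mer counts. The sketch below covers only that exact-match count; the function names `kmer_counts` and `d2` are illustrative, and the approximate-match extension and the variance results from the paper are not implemented.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of (i, j) position pairs at which the k-letter word
    starting at i in seq_a equals the one starting at j in seq_b.

    Summing count_a(w) * count_b(w) over words w gives the same total
    as comparing all position pairs, without the quadratic loop.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    return sum(n * counts_b[word] for word, n in counts_a.items())
```

For example, `d2("AAAA", "AAA", 2)` is 6, because the word "AA" occurs three times in the first sequence and twice in the second.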
Robust tissue-air volume segmentation of MR images based on the statistics of phase and magnitude: Its applications in the display of susceptibility-weighted imaging of the brain.

PubMed

Du, Yiping P; Jin, Zhaoyang

2009-10-01

To develop a robust algorithm for tissue-air segmentation in magnetic resonance imaging (MRI) using the statistics of phase and magnitude of the images. A multivariate measure based on the statistics of phase and magnitude was constructed for tissue-air volume segmentation. The standard deviation of first-order phase difference and the standard deviation of magnitude were calculated in a 3 x 3 x 3 kernel in the image domain. To improve differentiation accuracy, the uniformity of phase distribution in the kernel was also calculated and linear background phase introduced by field inhomogeneity was corrected. The effectiveness of the proposed volume segmentation technique was compared to a conventional approach that uses the magnitude data alone. The proposed algorithm was shown to be more effective and robust in volume segmentation in both synthetic phantom and susceptibility-weighted images of human brain. Using our proposed volume segmentation method, veins in the peripheral regions of the brain were well depicted in the minimum-intensity projection of the susceptibility-weighted images. Using the additional statistics of phase, tissue-air volume segmentation can be substantially improved compared to that using the statistics of magnitude data alone. (c) 2009 Wiley-Liss, Inc.
Effect of Name Change of Schizophrenia on Mass Media Between 1985 and 2013 in Japan: A Text Data Mining Analysis.

PubMed

Koike, Shinsuke; Yamaguchi, Sosei; Ojio, Yasutaka; Ohta, Kazusa; Ando, Shuntaro

2016-05-01

Mass media such as newspapers and TV news affect mental health-related stigma. In Japan, the name of schizophrenia was changed in 2002 for the purposes of stigma reduction; however, little has been known about the effect of name change of schizophrenia on mass media. Articles including old and new names of schizophrenia, depressive disorder, and diabetes mellitus (DM) in headlines and/or text were extracted from 23169092 articles in 4 major Japanese newspapers and 1 TV news program (1985-2013). The trajectory of the number of articles including each term was determined across years. Then, all text in news headlines was segmented as per part-of-speech level using text data mining. Segmented words were classified into 6 categories and in each category of extracted words by target term and period were also tested. Total 51789 and 1106 articles including target terms in newspaper articles and TV news segments were obtained, respectively. The number of articles including the target terms increased across years. Relative increase was observed in the articles published on schizophrenia since 2003 compared with those on DM and between 2000 and 2005 compared with those on depressive disorder. Word tendency used in headlines was equivalent before and after 2002 for the articles including each target term. Articles for schizophrenia contained more negative words than depressive disorder and DM (31.5%, 16.0%, and 8.2%, respectively). Name change of schizophrenia had a limited effect on the articles published and little effect on its contents. © The Author 2015. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Social discourses of healthy eating. A market segmentation approach.

PubMed

Chrysochou, Polymeros; Askegaard, Søren; Grunert, Klaus G; Kristensen, Dorthe Brogård

2010-10-01

This paper proposes a framework of discourses regarding consumers' healthy eating as a useful conceptual scheme for market segmentation purposes. The objectives are: (a) to identify the appropriate number of health-related segments based on the underlying discursive subject positions of the framework, (b) to validate and further describe the segments based on their socio-demographic characteristics and attitudes towards healthy eating, and (c) to explore differences across segments in types of associations with food and health, as well as perceptions of food healthfulness. 316 Danish consumers participated in a survey that included measures of the underlying subject positions of the proposed framework, followed by a word association task that aimed to explore types of associations with food and health, and perceptions of food healthfulness. A latent class clustering approach revealed three consumer segments: the Common, the Idealists and the Pragmatists. Based on the addressed objectives, differences across the segments are described and implications of findings are discussed.
Semantic Coherence Facilitates Distributional Learning.

PubMed

Ouyang, Long; Boroditsky, Lera; Frank, Michael C

2017-04-01

Computational models have shown that purely statistical knowledge about words' linguistic contexts is sufficient to learn many properties of words, including syntactic and semantic category. For example, models can infer that "postman" and "mailman" are semantically similar because they have quantitatively similar patterns of association with other words (e.g., they both tend to occur with words like "deliver," "truck," "package"). In contrast to these computational results, artificial language learning experiments suggest that distributional statistics alone do not facilitate learning of linguistic categories. However, experiments in this paradigm expose participants to entirely novel words, whereas real language learners encounter input that contains some known words that are semantically organized. In three experiments, we show that (a) the presence of familiar semantic reference points facilitates distributional learning and (b) this effect crucially depends both on the presence of known words and the adherence of these known words to some semantic organization. Copyright © 2016 Cognitive Science Society, Inc.
Spotting handwritten words and REGEX using a two stage BLSTM-HMM architecture

NASA Astrophysics Data System (ADS)

Bideault, Gautier; Mioulet, Luc; Chatelain, Clément; Paquet, Thierry

2015-01-01

In this article, we propose a hybrid model for spotting words and regular expressions (REGEX) in handwritten documents. The model is made of the state-of-the-art BLSTM (Bidirectional Long Short-Term Memory) neural network for recognizing and segmenting characters, coupled with an HMM to build line models able to spot the desired sequences. Experiments on the Rimes database show very promising results.
Effects of hand gestures on auditory learning of second-language vowel length contrasts.

PubMed

Hirata, Yukari; Kelly, Spencer D; Huang, Jessica; Manansala, Michael

2014-12-01

Research has shown that hand gestures affect comprehension and production of speech at semantic, syntactic, and pragmatic levels for both native language and second language (L2). This study investigated a relatively less explored question: Do hand gestures influence auditory learning of an L2 at the segmental phonology level? To examine auditory learning of phonemic vowel length contrasts in Japanese, 88 native English-speaking participants took an auditory test before and after one of the following 4 types of training in which they (a) observed an instructor in a video speaking Japanese words while she made syllabic-rhythm hand gesture, (b) produced this gesture with the instructor, (c) observed the instructor speaking those words and her moraic-rhythm hand gesture, or (d) produced the moraic-rhythm gesture with the instructor. All of the training types yielded similar auditory improvement in identifying vowel length contrast. However, observing the syllabic-rhythm hand gesture yielded the most balanced improvement between word-initial and word-final vowels and between slow and fast speaking rates. The overall effect of hand gesture on learning of segmental phonology is limited. Implications for theories of hand gesture are discussed in terms of the role it plays at different linguistic levels.
A segmentation editing framework based on shape change statistics

NASA Astrophysics Data System (ADS)

Mostapha, Mahmoud; Vicory, Jared; Styner, Martin; Pizer, Stephen

2017-02-01

Segmentation is a key task in medical image analysis because its accuracy significantly affects successive steps. Automatic segmentation methods often produce inadequate segmentations, which require the user to manually edit the produced segmentation slice by slice. Because editing is time-consuming, an editing tool that enables the user to produce accurate segmentations by only drawing a sparse set of contours would be needed. This paper describes such a framework as applied to a single object. Constrained by the additional information enabled by the manually segmented contours, the proposed framework utilizes object shape statistics to transform the failed automatic segmentation to a more accurate version. Instead of modeling the object shape, the proposed framework utilizes shape change statistics that were generated to capture the object deformation from the failed automatic segmentation to its corresponding correct segmentation. An optimization procedure was used to minimize an energy function that consists of two terms, an external contour match term and an internal shape change regularity term. The high accuracy of the proposed segmentation editing approach was confirmed by testing it on a simulated data set based on 10 in-vivo infant magnetic resonance brain data sets using four similarity metrics. Segmentation results indicated that our method can provide efficient and adequately accurate segmentations (Dice segmentation accuracy increase of 10%), with very sparse contours (only 10%), which is promising in greatly decreasing the work expected from the user.
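The abstract reports accuracy as a Dice score. As a reminder of what that metric measures, here is a minimal sketch of the standard Dice coefficient between two segmentations, each represented as a collection of labeled voxel coordinates. The function name `dice` and the set-of-coordinates representation are assumptions for illustration; the paper's other three similarity metrics are not shown.

```python
def dice(segmentation_a, segmentation_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two voxel sets.

    Returns 1.0 when both segmentations are empty, a common convention
    for avoiding division by zero.
    """
    a, b = set(segmentation_a), set(segmentation_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Two hypothetical 2D segmentations sharing one of their two pixels.
automatic = {(0, 0), (0, 1)}
corrected = {(0, 1), (1, 1)}
overlap = dice(automatic, corrected)
```

A value of 1.0 means the two segmentations coincide exactly and 0.0 means they share no voxels; here the overlap is 0.5.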
Handwritten Word Recognition Using Multi-view Analysis

NASA Astrophysics Data System (ADS)

de Oliveira, J. J.; de A. Freitas, C. O.; de Carvalho, J. M.; Sabourin, R.

This paper brings a contribution to the problem of efficiently recognizing handwritten words from a limited size lexicon. For that, a multiple classifier system has been developed that analyzes the words from three different approximation levels, in order to get a computational approach inspired on the human reading process. For each approximation level a three-module architecture composed of a zoning mechanism (pseudo-segmenter), a feature extractor and a classifier is defined. The proposed application is the recognition of the Portuguese handwritten names of the months, for which a best recognition rate of 97.7% was obtained, using classifier combination.
Lexical Ambiguity: Making a Case against Spread

ERIC Educational Resources Information Center

Kaplan, Jennifer J.; Rogness, Neal T.; Fisher, Diane G.

2012-01-01

We argue for decreasing the use of the word "spread" when describing the statistical idea of dispersion or variability in introductory statistics courses. In addition, we argue for increasing the use of the word "variability" as a replacement for "spread."
Cross-Situational Learning of Minimal Word Pairs

ERIC Educational Resources Information Center

Escudero, Paola; Mulak, Karen E.; Vlach, Haley A.

2016-01-01

"Cross-situational statistical learning" of words involves tracking co-occurrences of auditory words and objects across time to infer word-referent mappings. Previous research has demonstrated that learners can infer referents across sets of very phonologically distinct words (e.g., WUG, DAX), but it remains unknown whether learners can…
Selective activation around the left occipito-temporal sulcus for words relative to pictures: individual variability or false positives?

PubMed

Wright, Nicholas D; Mechelli, Andrea; Noppeney, Uta; Veltman, Dick J; Rombouts, Serge A R B; Glensman, Janice; Haynes, John-Dylan; Price, Cathy J

2008-08-01

We used high-resolution fMRI to investigate claims that learning to read results in greater left occipito-temporal (OT) activation for written words relative to pictures of objects. In the first experiment, 9/16 subjects performing a one-back task showed activation in ≥1 left OT voxel for words relative to pictures (P < 0.05 uncorrected). In a second experiment, another 9/15 subjects performing a semantic decision task activated ≥1 left OT voxel for words relative to pictures. However, at this low statistical threshold false positives need to be excluded. The semantic decision paradigm was therefore repeated, within subject, in two different scanners (1.5 and 3 T). Both scanners consistently localised left OT activation for words relative to fixation and pictures relative to words, but there were no consistent effects for words relative to pictures. Finally, in a third experiment, we minimised the voxel size (1.5 × 1.5 × 1.5 mm³) and demonstrated a striking concordance between the voxels activated for words and pictures, irrespective of task (naming vs. one-back) or script (English vs. Hebrew). In summary, although we detected differential activation for words relative to pictures, these effects: (i) do not withstand statistical rigour; (ii) do not replicate within or between subjects; and (iii) are observed in voxels that also respond to pictures of objects. Our findings have implications for the role of left OT activation during reading. More generally, they show that studies using low statistical thresholds in single subject analyses should correct the statistical threshold for the number of comparisons made or replicate effects within subject. © 2007 Wiley-Liss, Inc.

Selective Activation Around the Left Occipito-Temporal Sulcus for Words Relative to Pictures: Individual Variability or False Positives?

PubMed Central

Wright, Nicholas D; Mechelli, Andrea; Noppeney, Uta; Veltman, Dick J; Rombouts, Serge ARB; Glensman, Janice; Haynes, John-Dylan; Price, Cathy J

2008-01-01

We used high-resolution fMRI to investigate claims that learning to read results in greater left occipito-temporal (OT) activation for written words relative to pictures of objects. In the first experiment, 9/16 subjects performing a one-back task showed activation in ≥1 left OT voxel for words relative to pictures (P < 0.05 uncorrected). In a second experiment, another 9/15 subjects performing a semantic decision task activated ≥1 left OT voxel for words relative to pictures. However, at this low statistical threshold false positives need to be excluded. The semantic decision paradigm was therefore repeated, within subject, in two different scanners (1.5 and 3 T). Both scanners consistently localised left OT activation for words relative to fixation and pictures relative to words, but there were no consistent effects for words relative to pictures. Finally, in a third experiment, we minimised the voxel size (1.5 × 1.5 × 1.5 mm³) and demonstrated a striking concordance between the voxels activated for words and pictures, irrespective of task (naming vs. one-back) or script (English vs. Hebrew). In summary, although we detected differential activation for words relative to pictures, these effects: (i) do not withstand statistical rigour; (ii) do not replicate within or between subjects; and (iii) are observed in voxels that also respond to pictures of objects. Our findings have implications for the role of left OT activation during reading. More generally, they show that studies using low statistical thresholds in single subject analyses should correct the statistical threshold for the number of comparisons made or replicate effects within subject. Hum Brain Mapp 2008. © 2007 Wiley-Liss, Inc. PMID:17712786
The statistical trade-off between word order and word structure – Large-scale evidence for the principle of least effort

PubMed Central

Koplenig, Alexander; Meyer, Peter; Wolfer, Sascha; Müller-Spitzer, Carolin

2017-01-01

Languages employ different strategies to transmit structural and grammatical information. While, for example, grammatical dependency relationships in sentences are mainly conveyed by the ordering of the words for languages like Mandarin Chinese, or Vietnamese, the word ordering is much less restricted for languages such as Inupiatun or Quechua, as these languages (also) use the internal structure of words (e.g. inflectional morphology) to mark grammatical relationships in a sentence. Based on a quantitative analysis of more than 1,500 unique translations of different books of the Bible in almost 1,200 different languages that are spoken as a native language by approximately 6 billion people (more than 80% of the world population), we present large-scale evidence for a statistical trade-off between the amount of information conveyed by the ordering of words and the amount of information conveyed by internal word structure: languages that rely more strongly on word order information tend to rely less on word structure information and vice versa. Or put differently, if less information is carried within the word, more information has to be spread among words in order to communicate successfully. In addition, we find that–despite differences in the way information is expressed–there is also evidence for a trade-off between different books of the biblical canon that recurs with little variation across languages: the more informative the word order of the book, the less informative its word structure and vice versa. We argue that this might suggest that, on the one hand, languages encode information in very different (but efficient) ways. On the other hand, content-related and stylistic features are statistically encoded in very similar ways. PMID:28282435
The ICSI+ Multilingual Sentence Segmentation System

DTIC Science & Technology

2006-01-01

these steps the ASR output needs to be enriched with information additional to words, such as speaker diarization, sentence segmentation, or story...and the out- of a speaker diarization is considered as well. We first detail extraction of the prosodic features, and then describe the clas- ation...also takes into account the speaker turns that estimated by the diarization system. In addition to the Max- 1) model speaker turn unigrams, trigram
Semi-automatic medical image segmentation with adaptive local statistics in Conditional Random Fields framework.

PubMed

Hu, Yu-Chi J; Grossberg, Michael D; Mageras, Gikas S

2008-01-01

Planning radiotherapy and surgical procedures usually requires onerous manual segmentation of anatomical structures from medical images. In this paper we present a semi-automatic and accurate segmentation method to dramatically reduce the time and effort required of expert users. This is accomplished by giving a user an intuitive graphical interface to indicate samples of target and non-target tissue by loosely drawing a few brush strokes on the image. We use these brush strokes to provide the statistical input for a Conditional Random Field (CRF) based segmentation. Since we extract purely statistical information from the user input, we eliminate the need for the assumptions on boundary contrast previously used by many other methods. A new feature of our method is that the statistics on one image can be reused on related images without registration. To demonstrate this, we show that boundary statistics provided on a few 2D slices of volumetric medical data can be propagated through the entire 3D stack of images without using the geometric correspondence between images. In addition, the image segmentation from the CRF can be formulated as a minimum s-t graph cut problem, which has a solution that is both globally optimal and fast. The combination of a fast segmentation and minimal, reusable user input makes this a powerful technique for the segmentation of medical images.
Quantitative learning strategies based on word networks

NASA Astrophysics Data System (ADS)

Zhao, Yue-Tian-Yi; Jia, Zi-Yang; Tang, Yong; Xiong, Jason Jie; Zhang, Yi-Cheng

2018-02-01

Learning English requires a considerable effort, but the way that vocabulary is introduced in textbooks is not optimized for learning efficiency. With the increasing population of English learners, learning process optimization will have significant impact and improvement towards English learning and teaching. The recent developments of big data analysis and complex network science provide additional opportunities to design and further investigate the strategies in English learning. In this paper, quantitative English learning strategies based on word network and word usage information are proposed. The strategies integrate the words frequency with topological structural information. By analyzing the influence of connected learned words, the learning weights for the unlearned words and dynamically updating of the network are studied and analyzed. The results suggest that quantitative strategies significantly improve learning efficiency while maintaining effectiveness. Especially, the optimized-weight-first strategy and segmented strategies outperform other strategies. The results provide opportunities for researchers and practitioners to reconsider the way of English teaching and designing vocabularies quantitatively by balancing the efficiency and learning costs based on the word network.
Differential Gaze Patterns on Eyes and Mouth During Audiovisual Speech Segmentation

PubMed Central

Lusk, Laina G.; Mitchel, Aaron D.

2016-01-01

Speech is inextricably multisensory: both auditory and visual components provide critical information for all aspects of speech processing, including speech segmentation, the visual components of which have been the target of a growing number of studies. In particular, a recent study (Mitchel and Weiss, 2014) established that adults can utilize facial cues (i.e., visual prosody) to identify word boundaries in fluent speech. The current study expanded upon these results, using an eye tracker to identify highly attended facial features of the audiovisual display used in Mitchel and Weiss (2014). Subjects spent the most time watching the eyes and mouth. A significant trend in gaze durations was found with the longest gaze duration on the mouth, followed by the eyes and then the nose. In addition, eye-gaze patterns changed across familiarization as subjects learned the word boundaries, showing decreased attention to the mouth in later blocks while attention on other facial features remained consistent. These findings highlight the importance of the visual component of speech processing and suggest that the mouth may play a critical role in visual speech segmentation. PMID:26869959
A new universality class in corpus of texts; A statistical physics study

NASA Astrophysics Data System (ADS)

Najafi, Elham; Darooneh, Amir H.

2018-05-01

Text can be regarded as a complex system. There are some methods in statistical physics which can be used to study this system. In this work, by means of statistical physics methods, we reveal new universal behaviors of texts associated with the fractality values of words in a text. The fractality measure indicates the importance of words in a text by considering the distribution pattern of words throughout the text. We observed a power law relation between fractality of text and vocabulary size for texts and corpora. We also observed this behavior in studying biological data.
Phonemic carryover perseveration: word blends.

PubMed

Buckingham, Hugh W; Christman, Sarah S

2004-11-01

This article will outline and describe the aphasic disorder of recurrent perseveration and will demonstrate how it interacts with the retrieval and production of spoken words in the language of fluent aphasic patients who have sustained damage to the left (dominant) posterior temporoparietal lobe. We will concentrate on the various kinds of sublexical segmental perseverations (the so-called phonemic carryovers of Santo Pietro and Rigrodsky) that most often play a role in the generation of word blendings. We will show how perseverative blends allow the clinician to better understand the dynamics of word and syllable production in fluent aphasia by scrutinizing the "onset/rime" and "onset/superrime" constituents of monosyllabic and polysyllabic words, respectively. We will demonstrate to the speech language pathologist the importance of the trochee stress pattern and the possibility that its metrical template may constitute a structural unit that can be perseverated.
E-Hitz: a word frequency list and a program for deriving psycholinguistic statistics in an agglutinative language (Basque).

PubMed

Perea, Manuel; Urkia, Miriam; Davis, Colin J; Agirre, Ainhoa; Laseka, Edurne; Carreiras, Manuel

2006-11-01

We describe a Windows program that enables users to obtain a broad range of statistics concerning the properties of word and nonword stimuli in an agglutinative language (Basque), including measures of word frequency (at the whole-word and lemma levels), bigram and biphone frequency, orthographic similarity, orthographic and phonological structure, and syllable-based measures. It is designed for use by researchers in psycholinguistics, particularly those concerned with recognition of isolated words and morphology. In addition to providing standard orthographic and phonological neighborhood measures, the program can be used to obtain information about other forms of orthographic similarity, such as transposed-letter similarity and embedded-word similarity. It is available free of charge from www.uv.es/mperea/E-Hitz.zip.
Phonological Words and Stuttering on Function Words

PubMed Central

Au-Yeung, James; Howell, Peter; Pilgrim, Lesley

2007-01-01

Stuttering on function words was examined in 51 people who stutter. The people who stutter were subdivided into young (2 to 6 years), middle (6 to 9 years), and older (9 to 12 years) child groups; teenagers (13 to 18 years); and adults (20 to 40 years). As reported by previous researchers, children up to about age 9 stuttered more on function words (pronouns, articles, prepositions, conjunctions, auxiliary verbs), whereas older people tended to stutter more on content words (nouns, main verbs, adverbs, adjectives). Function words in early positions in utterances, again as reported elsewhere, were more likely to be stuttered than function words at later positions in an utterance. This was most apparent for the younger groups of speakers. For the remaining analyses, utterances were segmented into phonological words on the basis of Selkirk’s work (1984). Stuttering rate was higher when function words occurred in early phonological word positions than other phonological word positions whether the phonological word appeared in initial position in an utterance or not. Stuttering rate was highly dependent on whether the function word occurred before or after the single content word allowed in Selkirk’s (1984) phonological words. This applied, once again, whether the phonological word was utterance-initial or not. It is argued that stuttering of function words before their content word in phonological words in young speakers is used as a delaying tactic when the forthcoming content word is not prepared for articulation. PMID:9771625
A System for Mailpiece ZIP Code Assignment through Contextual Analysis. Phase 2

DTIC Science & Technology

1991-03-01

Segmentation Address Block Interpretation Automatic Feature Generation Word Recognition Feature Detection Word Verification Optical Character Recognition Directory...in the Phase III effort. 1.1 Motivation The United States Postal Service (USPS) deploys large numbers of optical character recognition (OCR) machines...4):208-218, November 1986. [2] Gronmeyer, L. K., Ruffin, B. W., Lybanon, M. A., Neely, P. L., and Pierce, S. E. An Overview of Optical Character Recognition (OCR
Monitoring the capacity of working memory: Executive control and effects of listening effort

PubMed Central

Amichetti, Nicole M.; Stanley, Raymond S.; White, Alison G.

2013-01-01

In two experiments, we used an interruption-and-recall (IAR) task to explore listeners’ ability to monitor the capacity of working memory as new information arrived in real time. In this task, listeners heard recorded word lists with instructions to interrupt the input at the maximum point that would still allow for perfect recall. Experiment 1 demonstrated that the most commonly selected segment size closely matched participants’ memory span, as measured in a baseline span test. Experiment 2 showed that reducing the sound level of presented word lists to a suprathreshold but effortful listening level disrupted the accuracy of matching selected segment sizes with participants’ memory spans. The results are discussed in terms of whether online capacity monitoring may be subsumed under other, already enumerated working memory executive functions (inhibition, set shifting, and memory updating). PMID:23400826
ERP signatures of conscious and unconscious word and letter perception in an inattentional blindness paradigm.

PubMed

Schelonka, Kathryn; Graulty, Christian; Canseco-Gonzalez, Enriqueta; Pitts, Michael A

2017-09-01

A three-phase inattentional blindness paradigm was combined with ERPs. While participants performed a distracter task, line segments in the background formed words or consonant-strings. Nearly half of the participants failed to notice these word-forms and were deemed inattentionally blind. All participants noticed the word-forms in phase 2 of the experiment while they performed the same distracter task. In the final phase, participants performed a task on the word-forms. In all phases, including during inattentional blindness, word-forms elicited distinct ERPs during early latencies (∼200-280ms) suggesting unconscious orthographic processing. A subsequent ERP (∼320-380ms) similar to the visual awareness negativity appeared only when subjects were aware of the word-forms, regardless of the task. Finally, word-forms elicited a P3b (∼400-550ms) only when these stimuli were task-relevant. These results are consistent with previous inattentional blindness studies and help distinguish brain activity associated with pre- and post-perceptual processing from correlates of conscious perception. Copyright © 2017 Elsevier Inc. All rights reserved.
Feature Statistics Modulate the Activation of Meaning during Spoken Word Processing

ERIC Educational Resources Information Center

Devereux, Barry J.; Taylor, Kirsten I.; Randall, Billi; Geertzen, Jeroen; Tyler, Lorraine K.

2016-01-01

Understanding spoken words involves a rapid mapping from speech to conceptual representations. One distributed feature-based conceptual account assumes that the statistical characteristics of concepts' features--the number of concepts they occur in ("distinctiveness/sharedness") and likelihood of co-occurrence ("correlational…
Statistical learning in reading: variability in irrelevant letters helps children learn phonics skills.

PubMed

Apfelbaum, Keith S; Hazeltine, Eliot; McMurray, Bob

2013-07-01

Early reading abilities are widely considered to derive in part from statistical learning of regularities between letters and sounds. Although there is substantial evidence from laboratory work to support this, how it occurs in the classroom setting has not been extensively explored; there are few investigations of how statistics among letters and sounds influence how children actually learn to read or what principles of statistical learning may improve learning. We examined 2 conflicting principles that may apply to learning grapheme-phoneme-correspondence (GPC) regularities for vowels: (a) variability in irrelevant units may help children derive invariant relationships and (b) similarity between words may force children to use a deeper analysis of lexical structure. We trained 224 first-grade students on a small set of GPC regularities for vowels, embedded in words with either high or low consonant similarity, and tested their generalization to novel tasks and words. Variability offered a consistent benefit over similarity for trained and new words in both trained and new tasks.
What's in a Face? Visual Contributions to Speech Segmentation

ERIC Educational Resources Information Center

Mitchel, Aaron D.; Weiss, Daniel J.

2010-01-01

Recent research has demonstrated that adults successfully segment two interleaved artificial speech streams with incongruent statistics (i.e., streams whose combined statistics are noisier than the encapsulated statistics) only when provided with an indexical cue of speaker voice. In a series of five experiments, our study explores whether…
Lexical statistics of competition in L2 versus L1 listening

NASA Astrophysics Data System (ADS)

Cutler, Anne

2005-09-01

Spoken-word recognition involves multiple activation of alternative word candidates and competition between these alternatives. Phonemic confusions in L2 listening increase the number of potentially active words, thus slowing word recognition by adding competitors. This study used a 70,000-word English lexicon backed by frequency statistics from a 17,900,000-word corpus to assess the competition increase resulting from two representative phonemic confusions, one vocalic (ae/E) and one consonantal (r/l), in L2 versus L1 listening. The first analysis involved word embedding. Embedded words (cat in cattle, rib in ribbon) cause competition, which phonemic confusion can increase (cat in kettle, rib in liberty). The average increase in number of embedded words was 59.6 and 48.3 [...] temporary ambiguity. Even when no embeddings are present, multiple alternatives are possible: para- can become parrot, paradise, etc., but also pallet, palace given /r/-/l/ confusion. Phoneme confusions (vowel or consonant) in first or second position in the word approximately doubled the number of activated candidates; confusions later in the word increased activation by on average 53 [...] third, 42 [...] confusions significantly increase competition for L2 compared with L1 listeners.
Cross-situational statistical word learning in young children.

PubMed

Suanda, Sumarga H; Mugwanya, Nassali; Namy, Laura L

2014-10-01

Recent empirical work has highlighted the potential role of cross-situational statistical word learning in children's early vocabulary development. In the current study, we tested 5- to 7-year-old children's cross-situational learning by presenting children with a series of ambiguous naming events containing multiple words and multiple referents. Children rapidly learned word-to-object mappings by attending to the co-occurrence regularities across these ambiguous naming events. The current study begins to address the mechanisms underlying children's learning by demonstrating that the diversity of learning contexts affects performance. The implications of the current findings for the role of cross-situational word learning at different points in development are discussed along with the methodological implications of employing school-aged children to test hypotheses regarding the mechanisms supporting early word learning. Copyright © 2014 Elsevier Inc. All rights reserved.
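The co-occurrence idea behind cross-situational learning can be illustrated with a minimal sketch. It assumes the learner simply tallies how often each word is heard alongside each visible object and picks the most frequent pairing; this is a simplification for illustration, not the procedure or model used in the study. The function name `learn_mappings` and the toy words and objects are hypothetical.

```python
from collections import Counter, defaultdict

def learn_mappings(trials):
    """Tally word-object co-occurrences across ambiguous trials and
    return, for each word, the object it co-occurred with most often."""
    counts = defaultdict(Counter)
    for words, objects in trials:
        for w in words:
            for o in objects:
                counts[w][o] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

# Hypothetical toy trials: each pairs two spoken labels with two visible
# objects, so no single trial reveals which label names which object.
trials = [
    (["blick", "dax"], ["ball", "dog"]),
    (["blick", "wug"], ["ball", "cup"]),
    (["dax", "wug"], ["dog", "cup"]),
]
mappings = learn_mappings(trials)
```

Each trial alone is ambiguous, but across trials every word co-occurs twice with one object and only once with the others, which is what lets the tally resolve the mapping.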
WordCluster: detecting clusters of DNA words and genomic elements

PubMed Central

2011-01-01

Background: Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are the genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. Results: We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method in a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between inside and outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. Conclusions: WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method in a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php including additional features like the detection of co-localization with gene regions or the annotation enrichment tool for functional analysis of overlapped genes. PMID:21261981
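To make "distance between consecutive copies" concrete, here is a minimal sketch of the chaining step only. It groups occurrences of a word whenever neighbouring copies lie within a fixed gap. The fixed `max_gap` is exactly the kind of arbitrary threshold the abstract criticizes; WordCluster's statistical-significance assignment is not reproduced here. All function names and the toy sequence are hypothetical.

```python
def find_positions(sequence, word):
    """Return start indices of all (possibly overlapping) copies of word."""
    positions = []
    start = sequence.find(word)
    while start != -1:
        positions.append(start)
        start = sequence.find(word, start + 1)
    return positions

def chain_clusters(positions, max_gap, min_copies=2):
    """Group sorted positions into clusters in which consecutive copies
    are at most max_gap apart. Returns (first, last, n_copies) tuples."""
    clusters, current = [], []
    for p in positions:
        if current and p - current[-1] > max_gap:
            if len(current) >= min_copies:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(p)
    if len(current) >= min_copies:
        clusters.append((current[0], current[-1], len(current)))
    return clusters

# Hypothetical toy sequence: two dense runs of CAG and one isolated copy.
seq = "CAGTTCAGACAG" + "T" * 30 + "CAGCAG" + "T" * 30 + "CAG"
hits = find_positions(seq, "CAG")
clusters = chain_clusters(hits, max_gap=10)
```

The isolated final copy is dropped because a single occurrence does not meet `min_copies`. A significance-based method would replace the hard `max_gap` cut with a test of whether the observed inter-copy distances are shorter than expected by chance.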
The Activation of Embedded Words in Spoken Word Recognition

PubMed Central

Zhang, Xujin; Samuel, Arthur G.

2015-01-01

The current study investigated how listeners understand English words that have shorter words embedded in them. A series of auditory-auditory priming experiments assessed the activation of six types of embedded words (2 embedded positions × 3 embedded proportions) under different listening conditions. Facilitation of lexical decision responses to targets (e.g., pig) associated with words embedded in primes (e.g., hamster) indexed activation of the embedded words (e.g., ham). When the listening conditions were optimal, isolated embedded words (e.g., ham) primed their targets in all six conditions (Experiment 1a). Within carrier words (e.g., hamster), the same set of embedded words produced priming only when they were at the beginning or comprised a large proportion of the carrier word (Experiment 1b). When the listening conditions were made suboptimal by expanding or compressing the primes, significant priming was found for isolated embedded words (Experiment 2a), but no priming was produced when the carrier words were compressed/expanded (Experiment 2b). Similarly, priming was eliminated when the carrier words were presented with one segment replaced by noise (Experiment 3). When cognitive load was imposed, priming for embedded words was again found when they were presented in isolation (Experiment 4a), but not when they were embedded in the carrier words (Experiment 4b). The results suggest that both embedded position and proportion play important roles in the activation of embedded words, but that such activation only occurs under unusually good listening conditions. PMID:25593407

The Activation of Embedded Words in Spoken Word Recognition.

PubMed

Zhang, Xujin; Samuel, Arthur G

2015-01-01

The current study investigated how listeners understand English words that have shorter words embedded in them. A series of auditory-auditory priming experiments assessed the activation of six types of embedded words (2 embedded positions × 3 embedded proportions) under different listening conditions. Facilitation of lexical decision responses to targets (e.g., pig) associated with words embedded in primes (e.g., hamster) indexed activation of the embedded words (e.g., ham). When the listening conditions were optimal, isolated embedded words (e.g., ham) primed their targets in all six conditions (Experiment 1a). Within carrier words (e.g., hamster), the same set of embedded words produced priming only when they were at the beginning or comprised a large proportion of the carrier word (Experiment 1b). When the listening conditions were made suboptimal by expanding or compressing the primes, significant priming was found for isolated embedded words (Experiment 2a), but no priming was produced when the carrier words were compressed/expanded (Experiment 2b). Similarly, priming was eliminated when the carrier words were presented with one segment replaced by noise (Experiment 3). When cognitive load was imposed, priming for embedded words was again found when they were presented in isolation (Experiment 4a), but not when they were embedded in the carrier words (Experiment 4b). The results suggest that both embedded position and proportion play important roles in the activation of embedded words, but that such activation only occurs under unusually good listening conditions.
Statistical label fusion with hierarchical performance models

PubMed Central

Asman, Andrew J.; Dagley, Alexander S.; Landman, Bennett A.

2014-01-01

Label fusion is a critical step in many image segmentation frameworks (e.g., multi-atlas segmentation) as it provides a mechanism for generalizing a collection of labeled examples into a single estimate of the underlying segmentation. In the multi-label case, typical label fusion algorithms treat all labels equally – fully neglecting the known, yet complex, anatomical relationships exhibited in the data. To address this problem, we propose a generalized statistical fusion framework using hierarchical models of rater performance. Building on the seminal work in statistical fusion, we reformulate the traditional rater performance model from a multi-tiered hierarchical perspective. This new approach provides a natural framework for leveraging known anatomical relationships and accurately modeling the types of errors that raters (or atlases) make within a hierarchically consistent formulation. Herein, we describe several contributions. First, we derive a theoretical advancement to the statistical fusion framework that enables the simultaneous estimation of multiple (hierarchical) performance models within the statistical fusion context. Second, we demonstrate that the proposed hierarchical formulation is highly amenable to the state-of-the-art advancements that have been made to the statistical fusion framework. Lastly, in an empirical whole-brain segmentation task we demonstrate substantial qualitative and significant quantitative improvement in overall segmentation accuracy. PMID:24817809
The contrast between alveolar and velar stops with typical speech data: acoustic and articulatory analyses.

PubMed

Melo, Roberta Michelon; Mota, Helena Bolli; Berti, Larissa Cristina

2017-06-08

This study used acoustic and articulatory analyses to characterize the contrast between alveolar and velar stops with typical speech data, comparing the parameters (acoustic and articulatory) of adults and children with typical speech development. The sample consisted of 20 adults and 15 children with typical speech development. The analyzed corpus was organized through five repetitions of each target word (/'kapə/, /'tapə/, /'galo/, and /'daɾə/). These words were inserted into a carrier phrase and the participant was asked to name them spontaneously. Simultaneous audio and video data were recorded (tongue ultrasound images). The data were submitted to acoustic analyses (voice onset time; spectral peak and burst spectral moments; vowel/consonant transition and relative duration measures) and articulatory analyses (proportion of significant axes of the anterior and posterior tongue regions and description of tongue curves). Acoustic and articulatory parameters were effective in indicating the contrast between alveolar and velar stops, mainly in the adult group. Both speech analyses showed statistically significant differences between the two groups. The acoustic and articulatory parameters provided signals to characterize the phonic contrast of speech. One of the main findings in the comparison between adult and child speech was evidence of articulatory refinement/maturation even after the period of segment acquisition.
ERP correlates of unexpected word forms in a picture–word study of infants and adults

PubMed Central

Duta, M.D.; Styles, S.J.; Plunkett, K.

2012-01-01

We tested 14-month-olds and adults in an event-related potentials (ERPs) study in which pictures of familiar objects generated expectations about upcoming word forms. Expected word forms labelled the picture (word condition), while unexpected word forms mismatched by either a small deviation in word medial vowel height (mispronunciation condition) or a large deviation from the onset of the first speech segment (pseudoword condition). Both infants and adults showed sensitivity to both types of unexpected word form. Adults showed a chain of discrete effects: positivity over the N1 wave, negativity over the P2 wave (PMN effect) and negativity over the N2 wave (N400 effect). Infants showed a similar pattern, including a robust effect similar to the adult P2 effect. These observations were underpinned by a novel visualisation method which shows the dynamics of the ERP within bands of the scalp over time. The results demonstrate shared processing mechanisms across development, as even subtle deviations from expected word forms were indexed in both age groups by a reduction in the amplitude of characteristic waves in the early auditory evoked potential. PMID:22483072
Statistical Word Learning in Children with Autism Spectrum Disorder and Specific Language Impairment

ERIC Educational Resources Information Center

Haebig, Eileen; Saffran, Jenny R.; Ellis Weismer, Susan

2017-01-01

Background: Word learning is an important component of language development that influences child outcomes across multiple domains. Despite the importance of word knowledge, word-learning mechanisms are poorly understood in children with specific language impairment (SLI) and children with autism spectrum disorder (ASD). This study examined…
[Analyses of segment motor function in patients with degenerative lumbar disease on the treatment of WavefleX dynamic stabilization system].

PubMed

Wu, Junsong; Du, Junhua; Jiang, Xiangyun; Wang, Quan; Li, Xigong; Du, Jingyu; Lin, Xiangjin

2014-06-17

To explore the changes of range-of-motion (ROM) in patients with degenerative lumbar disease treated with the WavefleX dynamic stabilization system and to examine the postoperative regularity and tendency of lumbar ROM. Nine patients with degenerative lumbar disease treated with the WavefleX dynamic stabilization system were followed up with respect to ROMs at 5 timepoints within 12 months. Records of ROM were made for instrumented segments, adjacent segments and total lumbar. Compared with preoperation, ROMs in non-fusional segments with the WavefleX dynamic stabilization system decreased significantly (P < 0.05 or P < 0.01) at different timepoints; ROMs in adjacent segments increased at some levels without wide statistical significance. The exception was the single L3/4 level at Month 12 (P < 0.05). Versus the control group, ROMs at the levels of L3/4, L4/5 and L5/S1 decreased at Months 6 and 12 with wide statistical significance (P < 0.05 or P < 0.01). ROMs in total lumbar showed a statistically significant decrease (P < 0.01) in both the group of non-fusional segments and the hybrid group of non-fusion and fusion. Trends of continuous increase were observed during follow-ups. Statistically significant increases were also found at 4 timepoints as compared to the control group (P < 0.01). The treatment of degenerative lumbar diseases with the WavefleX dynamic stabilization system may limit excessive extension/flexion and preserve some motor functions. Moreover, it can sustain physiological lordosis and decrease and transfer disc load in adjacent segments to prevent early degeneration of adjacent segments. Trends of increasing motor function in total lumbar need to be confirmed during future long-term follow-ups.
The change of adjacent segment after cervical disc arthroplasty compared with anterior cervical discectomy and fusion: a meta-analysis of randomized controlled trials.

PubMed

Dong, Liang; Xu, Zhengwei; Chen, Xiujin; Wang, Dongqi; Li, Dichen; Liu, Tuanjing; Hao, Dingjun

2017-10-01

Many meta-analyses have been performed to study the efficacy of cervical disc arthroplasty (CDA) compared with anterior cervical discectomy and fusion (ACDF); however, there are few data referring to adjacent segment within these meta-analyses, or investigators are unable to arrive at the same conclusion in the few meta-analyses about adjacent segment. With the increased concerns surrounding adjacent segment degeneration (ASDeg) and adjacent segment disease (ASDis) after anterior cervical surgery, it is necessary to perform a comprehensive meta-analysis to analyze adjacent segment parameters. To perform a comprehensive meta-analysis to elaborate adjacent segment motion, degeneration, disease, and reoperation of CDA compared with ACDF. Meta-analysis of randomized controlled trials (RCTs). PubMed, Embase, and Cochrane Library were searched for RCTs comparing CDA and ACDF before May 2016. The analysis parameters included follow-up time, operative segments, adjacent segment motion, ASDeg, ASDis, and adjacent segment reoperation. The risk of bias scale was used to assess the papers. Subgroup analysis and sensitivity analysis were used to analyze the reason for high heterogeneity. Twenty-nine RCTs fulfilled the inclusion criteria. Compared with ACDF, the rate of adjacent segment reoperation in the CDA group was significantly lower (p<.01), and the advantage of that group in reducing adjacent segment reoperation increases with increasing follow-up time by subgroup analysis. There was no statistically significant difference in ASDeg between CDA and ACDF within the 24-month follow-up period; however, the rate of ASDeg in CDA was significantly lower than that of ACDF with the increase in follow-up time (p<.01). There was no statistically significant difference in ASDis between CDA and ACDF (p>.05). Cervical disc arthroplasty provided a lower adjacent segment range of motion (ROM) than did ACDF, but the difference was not statistically significant. 
Compared with ACDF, the advantages of CDA were lower ASDeg and adjacent segment reoperation. However, there was no statistically significant difference in ASDis and adjacent segment ROM. Copyright © 2017 Elsevier Inc. All rights reserved.
The Missing Link: The Use of Link Words and Phrases as a Link to Manuscript Quality

ERIC Educational Resources Information Center

Onwuegbuzie, Anthony J.

2016-01-01

In this article, I provide a typology of transition words/phrases. This typology comprises 12 dimensions of link words/phrases that capture 277 link words/phrases. Using QDA Miner, WordStat, and SPSS--a computer-assisted mixed methods data analysis software, content analysis software, and statistical software, respectively--I analyzed 74…
On the Utility of Content Analysis in Author Attribution: "The Federalist."

ERIC Educational Resources Information Center

Martindale, Colin; McKenzie, Dean

1995-01-01

Compares the success of lexical statistics, content analysis, and function words in determining the true author of "The Federalist." The function word approach proved most successful in attributing the papers to James Madison. Lexical statistics contributed nothing, while content analytic measures resulted in some success. (MJP)
Scaling laws and fluctuations in the statistics of word frequencies

NASA Astrophysics Data System (ADS)

Gerlach, Martin; Altmann, Eduardo G.

2014-11-01

In this paper, we combine statistical analysis of written texts and simple stochastic models to explain the appearance of scaling laws in the statistics of word frequencies. The average vocabulary of an ensemble of fixed-length texts is known to scale sublinearly with the total number of words (Heaps’ law). Analyzing the fluctuations around this average in three large databases (Google-ngram, English Wikipedia, and a collection of scientific articles), we find that the standard deviation scales linearly with the average (Taylor's law), in contrast to the prediction of decaying fluctuations obtained using simple sampling arguments. We explain both scaling laws (Heaps’ and Taylor's) by modeling the usage of words using a Poisson process with a fat-tailed distribution of word frequencies (Zipf's law) and topic-dependent frequencies of individual words (as in topic models). Considering topical variations leads to quenched averages, turns the vocabulary size into a non-self-averaging quantity, and explains the empirical observations. For the numerous practical applications relying on estimations of vocabulary size, our results show that uncertainties remain large even for long texts. We show how to account for these uncertainties in measurements of lexical richness of texts with different lengths.
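Heaps' law can be sketched numerically under simple assumptions. The example below draws tokens independently from a Zipf-like distribution, which is the plain sampling picture and not the topic-dependent Poisson model the abstract proposes. It tracks vocabulary size against text length and estimates the growth exponent as a least-squares slope in log-log space. The vocabulary size, sample length, seed, and function names are all hypothetical choices.

```python
import math
import random

def vocabulary_curve(tokens, step):
    """Return (text length, distinct words seen so far) every `step` tokens."""
    seen, curve = set(), []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        if i % step == 0:
            curve.append((i, len(seen)))
    return curve

def loglog_slope(points):
    """Least-squares slope of log(V) against log(N)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(v) for _, v in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Hypothetical synthetic "text": word ranks drawn with probability ~ 1/rank.
random.seed(0)
ranks = range(1, 5001)
weights = [1 / r for r in ranks]
tokens = random.choices(ranks, weights=weights, k=20000)

curve = vocabulary_curve(tokens, step=1000)
exponent = loglog_slope(curve)  # sublinear growth means 0 < exponent < 1
```

Repeating the sampling with different seeds and comparing the spread of vocabulary sizes at fixed length would give the fluctuation side of the picture; under independent sampling those fluctuations shrink relative to the mean, which is the prediction the abstract reports real corpora violate.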
Entropy Based Classifier Combination for Sentence Segmentation

DTIC Science & Technology

2007-01-01

speaker diarization system to divide the audio data into hypothetical speakers [17...the prosodic feature also includes turn-based features which describe the position of a word in relation to diarization segmentation. The speaker ...robust speaker segmentation: the ICSI-SRI fall 2004 diarization system,” in Proc. RT-04F Workshop, 2004. [18] “The rich transcription fall 2003,” http://nist.gov/speech/tests/rt/rt2003/fall/docs/rt03-fall-eval-plan-v9.pdf.
Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison.

PubMed

Dai, Qi; Yang, Yanchun; Wang, Tianming

2008-10-15

Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions and evolutionary information. They are related in spirit because all the ideas for sequence comparison try to use the information on the k-word distributions, the Markov model, or both. Motivated by adding k-word distributions to the Markov model directly, we investigated two novel statistical measures for sequence comparison, called wre.k.r and S2.k.r. The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences and phylogenetic analysis. This offers a systematic and quantitative experimental assessment of our measures. Moreover, we compared our results with those of alignment-based and alignment-free methods. We grouped our experiments into two sets. The first one, performed via ROC (receiver operating characteristic) analysis, aims at assessing the intrinsic ability of our statistical measures to search for similar sequences from a database and discriminate functionally related regulatory sequences from unrelated sequences. The second one aims at assessing how well our statistical measures perform in phylogenetic analysis. The experimental assessment demonstrates that our similarity measures, which incorporate k-word distributions into the Markov model, are more efficient.
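As a rough illustration of alignment-free comparison via k-word distributions, the sketch below builds a k-word frequency vector for each sequence and compares vectors with plain Euclidean distance. Euclidean distance is a stand-in chosen for simplicity; it is not the wre.k.r or S2.k.r measure from the paper, and no Markov model is involved. The function names and toy sequences are hypothetical.

```python
import math
from collections import Counter
from itertools import product

def kword_frequencies(seq, k, alphabet="ACGT"):
    """Relative frequency of every possible k-word, in a fixed order.
    Assumes seq contains only letters from `alphabet`."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return [counts["".join(w)] / total for w in product(alphabet, repeat=k)]

def euclidean(p, q):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical toy sequences: b differs from a by one substitution,
# c has an entirely different composition.
a = "ACGTACGTACGT"
b = "ACGTACGAACGT"
c = "TTTTTTTTGGGG"

fa, fb, fc = (kword_frequencies(s, 2) for s in (a, b, c))
d_ab = euclidean(fa, fb)
d_ac = euclidean(fa, fc)
```

With these inputs the near-identical pair is closer than the unrelated pair, which is the minimal property any such dissimilarity should have before it is used for database search or tree building.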
Reading handprinted addresses on IRS tax forms

NASA Astrophysics Data System (ADS)

Ramanaprasad, Vemulapati; Shin, Yong-Chul; Srihari, Sargur N.

1996-03-01

The hand-printed address recognition system described in this paper is a part of the Name and Address Block Reader (NABR) system developed by the Center of Excellence for Document Analysis and Recognition (CEDAR). NABR is currently being used by the IRS to read address blocks (hand-print as well as machine-print) on fifteen different tax forms. Although machine-print address reading was relatively straightforward, hand-print address recognition has posed some special challenges due to demands on processing speed (with an expected throughput of 8450 forms/hour) and recognition accuracy. We discuss various subsystems involved in hand-printed address recognition, including word segmentation, word recognition, digit segmentation, and digit recognition. We also describe control strategies used to make effective use of these subsystems to maximize recognition accuracy. We present system performance on 931 address blocks in recognizing various fields, such as city, state, ZIP Code, street number and name, and personal names.
Statistical mechanics of letters in words

PubMed Central

Stephens, Greg J.; Bialek, William

2013-01-01

We consider words as a network of interacting letters, and approximate the probability distribution of states taken on by this network. Despite the intuition that the rules of English spelling are highly combinatorial and arbitrary, we find that maximum entropy models consistent with pairwise correlations among letters provide a surprisingly good approximation to the full statistics of words, capturing ~92% of the multi-information in four-letter words and even “discovering” words that were not represented in the data. These maximum entropy models incorporate letter interactions through a set of pairwise potentials and thus define an energy landscape on the space of possible words. Guided by the large letter redundancy we seek a lower-dimensional encoding of the letter distribution and show that distinctions between local minima in the landscape account for ~68% of the four-letter entropy. We suggest that these states provide an effective vocabulary which is matched to the frequency of word use and much smaller than the full lexicon. PMID:20866490
Competitive Processes in Cross-Situational Word Learning

PubMed Central

Yurovsky, Daniel; Yu, Chen; Smith, Linda B.

2013-01-01

Cross-situational word learning, like any statistical learning problem, involves tracking the regularities in the environment. But the information that learners pick up from these regularities is dependent on their learning mechanism. This paper investigates the role of one type of mechanism in statistical word learning: competition. Competitive mechanisms would allow learners to find the signal in noisy input, and would help to explain the speed with which learners succeed in statistical learning tasks. Because cross-situational word learning provides information at multiple scales – both within and across trials/situations – learners could implement competition at either or both of these scales. A series of four experiments demonstrate that cross-situational learning involves competition at both levels of scale, and that these mechanisms interact to support rapid learning. The impact of both of these mechanisms is then considered from the perspective of a process-level understanding of cross-situational learning. PMID:23607610
Competitive processes in cross-situational word learning.

PubMed

Yurovsky, Daniel; Yu, Chen; Smith, Linda B

2013-07-01

Cross-situational word learning, like any statistical learning problem, involves tracking the regularities in the environment. However, the information that learners pick up from these regularities is dependent on their learning mechanism. This article investigates the role of one type of mechanism in statistical word learning: competition. Competitive mechanisms would allow learners to find the signal in noisy input and would help to explain the speed with which learners succeed in statistical learning tasks. Because cross-situational word learning provides information at multiple scales – both within and across trials/situations – learners could implement competition at either or both of these scales. A series of four experiments demonstrate that cross-situational learning involves competition at both levels of scale, and that these mechanisms interact to support rapid learning. The impact of both of these mechanisms is considered from the perspective of a process-level understanding of cross-situational learning. Copyright © 2013 Cognitive Science Society, Inc.
Primal/dual linear programming and statistical atlases for cartilage segmentation.

PubMed

Glocker, Ben; Komodakis, Nikos; Paragios, Nikos; Glaser, Christian; Tziritas, Georgios; Navab, Nassir

2007-01-01

In this paper we propose a novel approach for automatic segmentation of cartilage using a statistical atlas and efficient primal/dual linear programming. To this end, a novel statistical atlas construction is considered from registered training examples. Segmentation is then solved through registration which aims at deforming the atlas such that the conditional posterior of the learned (atlas) density is maximized with respect to the image. Such a task is reformulated using a discrete set of deformations and segmentation becomes equivalent to finding the set of local deformations which optimally match the model to the image. We evaluate our method on 56 MRI data sets (28 used for the model and 28 used for evaluation) and obtain a fully automatic segmentation of patella cartilage volume with an overlap ratio of 0.84 with a sensitivity and specificity of 94.06% and 99.92%, respectively.
Ventral and dorsal streams for choosing word order during sentence production

PubMed Central

Thothathiri, Malathi; Rattinger, Michelle

2015-01-01

Proficient language use requires speakers to vary word order and choose between different ways of expressing the same meaning. Prior statistical associations between individual verbs and different word orders are known to influence speakers’ choices, but the underlying neural mechanisms are unknown. Here we show that distinct neural pathways are used for verbs with different statistical associations. We manipulated statistical experience by training participants in a language containing novel verbs and two alternative word orders (agent-before-patient, AP; patient-before-agent, PA). Some verbs appeared exclusively in AP, others exclusively in PA, and yet others in both orders. Subsequently, we used sparse sampling neuroimaging to examine the neural substrates as participants generated new sentences in the scanner. Behaviorally, participants showed an overall preference for AP order, but also increased PA order for verbs experienced in that order, reflecting statistical learning. Functional activation and connectivity analyses revealed distinct networks underlying the increased PA production. Verbs experienced in both orders during training preferentially recruited a ventral stream, indicating the use of conceptual processing for mapping meaning to word order. In contrast, verbs experienced solely in PA order recruited dorsal pathways, indicating the use of selective attention and sensorimotor integration for choosing words in the right order. These results show that the brain tracks the structural associations of individual verbs and that the same structural output may be achieved via ventral or dorsal streams, depending on the type of regularities in the input. PMID:26621706
Exploration in free word association networks: models and experiment.

PubMed

Ludueña, Guillermo A; Behzad, Mehran Djalali; Gros, Claudius

2014-05-01

Free association is a task that requires a subject to express the first word to come to their mind when presented with a certain cue. It is a task which can be used to expose the basic mechanisms by which humans connect memories. In this work, we have made use of a publicly available database of free associations to model the exploration of the averaged network of associations using a statistical and the adaptive control of thought-rational (ACT-R) model. We performed, in addition, an online experiment asking participants to navigate the averaged network using their individual preferences for word associations. We have investigated the statistics of word repetitions in this guided association task. We find that the considered models mimic some of the statistical properties, viz the probability of word repetitions, the distance between repetitions and the distribution of association chain lengths, of the experiment, with the ACT-R model showing a particularly good fit to the experimental data for the more intricate properties as, for instance, the ratio of repetitions per length of association chains.
Russian Character Recognition using Self-Organizing Map

NASA Astrophysics Data System (ADS)

Gunawan, D.; Arisandi, D.; Ginting, F. M.; Rahmat, R. F.; Amalia, A.

2017-01-01

The World Tourism Organization (UNWTO) reported in 2014 that 28 million visitors travel to Russia. Many of these visitors may have trouble typing Russian words when using a digital dictionary, because the Cyrillic letters used in Russia and neighboring countries differ in shape from Latin letters, and visitors may not be familiar with them. This research proposes an alternative way to input Cyrillic words: instead of typing the words directly, a camera can be used to capture an image of the words as input. The captured image is cropped, and several pre-processing steps are then applied, such as noise filtering, binary image processing, segmentation and thinning. Next, feature extraction is applied to the image. Cyrillic letters in the image are recognized using the Self-Organizing Map (SOM) algorithm. SOM successfully recognizes 89.09% of the Cyrillic letters in computer-generated images and 88.89% of the Cyrillic letters in images captured by a smartphone’s camera. For word recognition, SOM successfully recognized 292 words and partially recognized 58 words from images captured by the smartphone’s camera. The accuracy of word recognition using SOM is therefore 83.42%.

Text-image alignment for historical handwritten documents

NASA Astrophysics Data System (ADS)

Zinger, S.; Nerbonne, J.; Schomaker, L.

2009-01-01

We describe our work on text-image alignment in the context of building a historical document retrieval system. We aim at aligning images of words in handwritten lines with their text transcriptions. The images of handwritten lines are automatically segmented from the scanned pages of historical documents and then manually transcribed. To train automatic routines to detect words in an image of handwritten text, we need a training set: images of words with their transcriptions. We present our results on aligning words from the images of handwritten lines and their corresponding text transcriptions. Alignment based on the longest spaces between portions of handwriting is a baseline. We then show that relative lengths, i.e. proportions of words in their lines, can be used to improve the alignment results considerably. To take into account the relative word length, we define the expressions for the cost function that has to be minimized for aligning text words with their images. We apply right-to-left alignment as well as alignment based on exhaustive search. The quality assessment of these alignments shows correct results for 69% of words from 100 lines, or 90% of partially correct and correct alignments combined.
Influence of musical expertise on segmental and tonal processing in Mandarin Chinese.

PubMed

Marie, Céline; Delogu, Franco; Lampis, Giulia; Belardinelli, Marta Olivetti; Besson, Mireille

2011-10-01

A same-different task was used to test the hypothesis that musical expertise improves the discrimination of tonal and segmental (consonant, vowel) variations in a tone language, Mandarin Chinese. Two four-word sequences (prime and target) were presented to French musicians and nonmusicians unfamiliar with Mandarin, and event-related brain potentials were recorded. Musicians detected both tonal and segmental variations more accurately than nonmusicians. Moreover, tonal variations were associated with higher error rate than segmental variations and elicited an increased N2/N3 component that developed 100 msec earlier in musicians than in nonmusicians. Finally, musicians also showed enhanced P3b components to both tonal and segmental variations. These results clearly show that musical expertise influenced the perceptual processing as well as the categorization of linguistic contrasts in a foreign language. They show positive music-to-language transfer effects and open new perspectives for the learning of tone languages.
War and peace: morphemes and full forms in a noninteractive activation parallel dual-route model.

PubMed

Baayen, H; Schreuder, R

This article introduces a computational tool for modeling the process of morphological segmentation in visual and auditory word recognition in the framework of a parallel dual-route model. Copyright 1999 Academic Press.
Speech-Enabled Interfaces for Travel Information Systems with Large Grammars

NASA Astrophysics Data System (ADS)

Zhao, Baoli; Allen, Tony; Bargiela, Andrzej

This paper introduces three grammar-segmentation methods capable of handling the large grammar issues associated with producing a real-time speech-enabled VXML bus travel application for London. Large grammars tend to produce relatively slow recognition interfaces and this work shows how this limitation can be successfully addressed. Comparative experimental results show that the novel last-word recognition based grammar segmentation method described here achieves an optimal balance between recognition rate, speed of processing and naturalness of interaction.
A multi-object statistical atlas adaptive for deformable registration errors in anomalous medical image segmentation

NASA Astrophysics Data System (ADS)

Botter Martins, Samuel; Vallin Spina, Thiago; Yasuda, Clarissa; Falcão, Alexandre X.

2017-02-01

Statistical atlases have played an important role in automated medical image segmentation. However, a challenge has been to make the atlas more adaptable to possible errors in deformable registration of anomalous images, given that the body structures of interest for segmentation might present significant differences in shape and texture. Recently, deformable registration errors have been accounted for by a method that locally translates the statistical atlas over the test image, after registration, and evaluates candidate objects from a delineation algorithm in order to choose the best one as final segmentation. In this paper, we improve its delineation algorithm and extend the model to be a multi-object statistical atlas, built from control images and adaptable to anomalous images, by incorporating a texture classifier. In order to provide a first proof of concept, we instantiate the new method for segmenting, object-by-object and all objects simultaneously, the left and right brain hemispheres, and the cerebellum, without the brainstem, and evaluate it on MRT1-images of epilepsy patients before and after brain surgery, which removed portions of the temporal lobe. The results show an efficiency gain with statistically significant higher accuracy, using the mean Average Symmetric Surface Distance, with respect to the original approach.
Statistical image segmentation for the detection of skin lesion borders in UV fluorescence excitation

NASA Astrophysics Data System (ADS)

Ortega-Martinez, Antonio; Padilla-Martinez, Juan Pablo; Franco, Walfre

2016-04-01

The skin contains several fluorescent molecules or fluorophores that serve as markers of structure, function and composition. UV fluorescence excitation photography is a simple and effective way to image specific intrinsic fluorophores, such as the one ascribed to tryptophan which emits at a wavelength of 345 nm upon excitation at 295 nm, and is a marker of cellular proliferation. Earlier, we built a clinical UV photography system to image cellular proliferation. In some samples, the naturally low intensity of the fluorescence can make it difficult to separate the fluorescence of cells in higher proliferation states from background fluorescence and other imaging artifacts, such as electronic noise. In this work, we describe a statistical image segmentation method to separate the fluorescence of interest. Statistical image segmentation is based on image averaging, background subtraction and pixel statistics. This method allows better quantification of the intensity and surface distributions of fluorescence, which in turn simplifies the detection of borders. Using this method we delineated the borders of highly-proliferative skin conditions and diseases, in particular, allergic contact dermatitis, psoriatic lesions and basal cell carcinoma. Segmented images clearly define lesion borders. UV fluorescence excitation photography along with statistical image segmentation may serve as a quick and simple diagnostic tool for clinicians.
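The three steps the abstract names (image averaging, background subtraction, pixel statistics) can be illustrated with a small sketch. The record does not give the actual statistical rule, so the threshold used here (mean plus k standard deviations of the background-subtracted image) and the function names are assumptions chosen only for illustration.

```python
from statistics import mean, pstdev


def average_frames(frames):
    """Pixel-wise mean over a list of equally sized 2-D frames (lists of rows)."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]


def segment(frames, background, k=2.0):
    """Return a boolean mask of pixels that stand out from the background.

    Steps: average the frames to suppress noise, subtract the background
    image, then keep pixels brighter than mean + k * std of the result.
    The mean + k * std rule is an illustrative assumption.
    """
    avg = average_frames(frames)
    diff = [[a - b for a, b in zip(avg_row, bg_row)]
            for avg_row, bg_row in zip(avg, background)]
    flat = [v for row in diff for v in row]
    threshold = mean(flat) + k * pstdev(flat)
    return [[v > threshold for v in row] for row in diff]


if __name__ == "__main__":
    frames = [
        [[10, 10, 10], [10, 50, 10], [10, 10, 10]],
        [[12, 12, 12], [12, 54, 12], [12, 12, 12]],
    ]
    background = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
    for row in segment(frames, background):
        print(row)
```

On real photographs the threshold would more likely be derived from a region known to contain only background, but that choice depends on details the abstract does not provide.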
Proximate factors associated with speech intelligibility in children with cochlear implants: A preliminary study.

PubMed

Chin, Steven B; Kuhns, Matthew J

2014-01-01

The purpose of this descriptive pilot study was to examine possible relationships among speech intelligibility and structural characteristics of speech in children who use cochlear implants. The Beginners Intelligibility Test (BIT) was administered to 10 children with cochlear implants, and the intelligibility of the words in the sentences was judged by panels of naïve adult listeners. Additionally, several qualitative and quantitative measures of word omission, segment correctness, duration, and intonation variability were applied to the sentences used to assess intelligibility. Correlational analyses were conducted to determine if BIT scores and the other speech parameters were related. There was a significant correlation between BIT score and percent words omitted, but no other variables correlated significantly with BIT score. The correlation between intelligibility and word omission may be task-specific as well as reflective of memory limitations.
Grammar and Frequency Effects in the Acquisition of Prosodic Words in European Portuguese

ERIC Educational Resources Information Center

Vigario, Marina; Freitas, Maria Joao; Frota, Sonia

2006-01-01

This paper investigates the acquisition of prosodic words in European Portuguese (EP) through analysis of grammatical and statistical properties of the target language and child speech. The analysis of grammatical properties shows that there are solid cues to the prosodic word (PW) in EP, and the presence of early word-based phonology in child…
Word lengths are optimized for efficient communication.

PubMed

Piantadosi, Steven T; Tily, Harry; Gibson, Edward

2011-03-01

We demonstrate a substantial improvement on one of the most celebrated empirical laws in the study of language, Zipf's 75-y-old theory that word length is primarily determined by frequency of use. In accord with rational theories of communication, we show across 10 languages that average information content is a much better predictor of word length than frequency. This indicates that human lexicons are efficiently structured for communication by taking into account interword statistical dependencies. Lexical systems result from an optimization of communicative pressures, coding meanings efficiently given the complex statistics of natural language use.
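To make the contrast between frequency and average information content concrete, the sketch below computes both for each word in a toy token list. It assumes, purely for illustration, that a word's context is the single preceding word and that probabilities are estimated from the same tokens; the abstract does not specify the context model, and the function name is hypothetical.

```python
import math
from collections import Counter, defaultdict


def word_stats(tokens):
    """Map each word to (frequency, average information content in bits, length).

    Information content of one occurrence is -log2 P(word | previous word),
    with P estimated from bigram counts in `tokens`. The first token has no
    context and is therefore not scored.
    """
    context_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    surprisal_sum = defaultdict(float)
    occurrences = Counter()
    for prev, word in zip(tokens, tokens[1:]):
        p = bigram_counts[(prev, word)] / context_counts[prev]
        surprisal_sum[word] += -math.log2(p)
        occurrences[word] += 1
    return {w: (occurrences[w], surprisal_sum[w] / occurrences[w], len(w))
            for w in occurrences}


if __name__ == "__main__":
    tokens = "the cat sat on the mat".split()
    for word, (freq, info, length) in sorted(word_stats(tokens).items()):
        print(f"{word:>4}  freq={freq}  info={info:.2f} bits  length={length}")
```

With a real corpus one would correlate word length against each of the two predictors; the toy input here only shows how the quantities are computed.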
Whole vertebral bone segmentation method with a statistical intensity-shape model based approach

NASA Astrophysics Data System (ADS)

Hanaoka, Shouhei; Fritscher, Karl; Schuler, Benedikt; Masutani, Yoshitaka; Hayashi, Naoto; Ohtomo, Kuni; Schubert, Rainer

2011-03-01

An automatic segmentation algorithm for the vertebrae in human body CT images is presented. In particular, we focused on constructing and utilizing 4 different statistical intensity-shape combined models for the cervical, upper and lower thoracic, and lumbar vertebrae, respectively. For this purpose, two previously reported methods were combined: a deformable model-based initial segmentation method and a statistical shape-intensity model-based precise segmentation method. The former is used as a pre-processing step to detect the position and orientation of each vertebra, which determines the initial condition for the latter precise segmentation method. The precise segmentation method needs prior knowledge of both the intensities and the shapes of the objects. After principal component analysis (PCA) of such shape-intensity expressions obtained from training image sets, vertebrae were parametrically modeled as a linear combination of the principal component vectors. The segmentation of each target vertebra was performed by fitting this parametric model to the target image by maximum a posteriori estimation, combined with the geodesic active contour method. In experiments on 10 cases, the initial segmentation was successful in 6 cases and only partially failed in 4 cases (2 in the cervical area and 2 in the lumbo-sacral). In the precise segmentation, the mean error distances were 2.078, 1.416, 0.777, 0.939 mm for cervical, upper and lower thoracic, lumbar spines, respectively. In conclusion, our automatic segmentation algorithm for the vertebrae in human body CT images showed a fair performance for cervical, thoracic and lumbar vertebrae.
Music reading expertise modulates hemispheric lateralization in English word processing but not in Chinese character processing.

PubMed

Li, Sara Tze Kwan; Hsiao, Janet Hui-Wen

2018-07-01

Music notation and English word reading both involve mapping horizontally arranged visual components to components in sound, in contrast to reading in logographic languages such as Chinese. Accordingly, music-reading expertise may influence English word processing more than Chinese character processing. Here we showed that musicians named English words significantly faster than non-musicians when words were presented in the left visual field/right hemisphere (RH) or the center position, suggesting an advantage of RH processing due to music reading experience. This effect was not observed in Chinese character naming. A follow-up ERP study showed that in a sequential matching task, musicians had reduced RH N170 responses to English non-words under the processing of musical segments as compared with non-musicians, suggesting a shared visual processing mechanism in the RH between music notation and English non-word reading. This shared mechanism may be related to the letter-by-letter, serial visual processing that characterizes RH English word recognition (e.g., Lavidor & Ellis, 2001), which may consequently facilitate English word processing in the RH in musicians. Thus, music reading experience may have differential influences on the processing of different languages, depending on their similarities in the cognitive processes involved. Copyright © 2018 Elsevier B.V. All rights reserved.
Evaluation of a segment-based LANDSAT full-frame approach to crop area estimation

NASA Technical Reports Server (NTRS)

Bauer, M. E. (Principal Investigator); Hixson, M. M.; Davis, S. M.

1981-01-01

As the registration of LANDSAT full frames enters the realm of current technology, sampling methods should be examined which utilize other than the segment data used for LACIE. The effect of separating the functions of sampling for training and sampling for area estimation was examined. The frame selected for analysis was acquired over north central Iowa on August 9, 1978. A stratification of the full-frame was defined. Training data came from segments within the frame. Two classification and estimation procedures were compared: statistics developed on one segment were used to classify that segment, and pooled statistics from the segments were used to classify a systematic sample of pixels. Comparisons to USDA/ESCS estimates illustrate that the full-frame sampling approach can provide accurate and precise area estimates.
2.5-Year-Olds Use Cross-Situational Consistency to Learn Verbs under Referential Uncertainty

ERIC Educational Resources Information Center

Scott, Rose M.; Fisher, Cynthia

2012-01-01

Recent evidence shows that children can use cross-situational statistics to learn new object labels under referential ambiguity (e.g., Smith & Yu, 2008). Such evidence has been interpreted as support for proposals that statistical information about word-referent co-occurrence plays a powerful role in word learning. But object labels represent only…
Are Young Children with Cochlear Implants Sensitive to the Statistics of Words in the Ambient Spoken Language?

ERIC Educational Resources Information Center

Guo, Ling-Yu; McGregor, Karla K.; Spencer, Linda J.

2015-01-01

Purpose: The purpose of this study was to determine whether children with cochlear implants (CIs) are sensitive to statistical characteristics of words in the ambient spoken language, whether that sensitivity changes in expected ways as their spoken lexicon grows, and whether that sensitivity varies with unilateral or bilateral implantation.…
Statistical Clustering and the Contents of the Infant Vocabulary

ERIC Educational Resources Information Center

Swingley, Daniel

2005-01-01

Infants parse speech into word-sized units according to biases that develop in the first year. One bias, present before the age of 7 months, is to cluster syllables that tend to co-occur. The present computational research demonstrates that this statistical clustering bias could lead to the extraction of speech sequences that are actual words,…
Real-world visual statistics and infants' first-learned object names

PubMed Central

Clerkin, Elizabeth M.; Hart, Elizabeth; Rehg, James M.; Yu, Chen

2017-01-01

We offer a new solution to the unsolved problem of how infants break into word learning based on the visual statistics of everyday infant-perspective scenes. Images from head camera video captured by 8 1/2 to 10 1/2 month-old infants at 147 at-home mealtime events were analysed for the objects in view. The images were found to be highly cluttered with many different objects in view. However, the frequency distribution of object categories was extremely right skewed such that a very small set of objects was pervasively present—a fact that may substantially reduce the problem of referential ambiguity. The statistical structure of objects in these infant egocentric scenes differs markedly from that in the training sets used in computational models and in experiments on statistical word-referent learning. Therefore, the results also indicate a need to re-examine current explanations of how infants break into word learning. This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences’. PMID:27872373
Multi-scales region segmentation for ROI separation in digital mammograms

NASA Astrophysics Data System (ADS)

Zhang, Dapeng; Zhang, Di; Li, Yue; Wang, Wei

2017-02-01

Mammography is currently the most effective imaging modality used by radiologists for breast cancer screening. Segmentation is one of the key steps in developing anatomical models for calculating a safe medical dose of radiation. This paper explores the potential of the statistical region merging (SRM) segmentation technique for breast segmentation in digital mammograms. First, the mammograms are pre-processed for region enhancement; then the enhanced images are segmented using SRM at multiple scales; finally, these segmentations are combined for region of interest (ROI) separation and edge detection. The proposed algorithm uses multi-scale region segmentation to separate the breast region from the background region, detect region edges and separate ROIs. The experiments are performed using a data set of mammograms from different patients, demonstrating the validity of the proposed criterion. Results show that the statistical region merging segmentation algorithm works on medical images and is more accurate than other methods. The outcome shows that the technique has great potential to become a method of choice for segmentation of mammograms.
Statistical shape (ASM) and appearance (AAM) models for the segmentation of the cerebellum in fetal ultrasound

NASA Astrophysics Data System (ADS)

Reyes López, Misael; Arámbula Cosío, Fernando

2017-11-01

The cerebellum is an important structure for determining the gestational age of the fetus; moreover, most of the abnormalities it presents are related to growth disorders. In this work, we present the results of the segmentation of the fetal cerebellum applying statistical shape and appearance models. Both models were tested on ultrasound images of the fetal brain taken from 23 pregnant women, between 18 and 24 gestational weeks. The accuracy results obtained on 11 ultrasound images show a mean Hausdorff distance of 6.08 mm between the manual segmentation and the segmentation using the active shape model, and a mean Hausdorff distance of 7.54 mm between the manual segmentation and the segmentation using the active appearance model. The reported results demonstrate that the active shape model is more robust in the segmentation of the fetal cerebellum in ultrasound images.
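The Hausdorff distance used as the accuracy measure above is the largest distance from any point on one contour to the nearest point on the other, taken in both directions. A minimal sketch follows, assuming each contour is given as a list of (x, y) points in millimetres; the function names are illustrative and the brute-force search is only suitable for small contours.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    manual = [(0.0, 0.0), (1.0, 0.0)]
    automatic = [(0.0, 0.0), (4.0, 0.0)]
    print(hausdorff(manual, automatic))
```

Because the measure is driven by the single worst point, one stray contour point can dominate it, which is worth keeping in mind when comparing the 6.08 mm and 7.54 mm means reported in the record.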
Off-lexicon online Arabic handwriting recognition using neural network

NASA Astrophysics Data System (ADS)

Yahia, Hamdi; Chaabouni, Aymen; Boubaker, Houcine; Alimi, Adel M.

2017-03-01

This paper presents a new method for online Arabic handwriting recognition based on grapheme segmentation. The main contribution of our work is to explore the utility of the Beta-elliptic model in segmentation and feature extraction for online handwriting recognition. Our method decomposes the input signal into continuous parts called graphemes, based on the Beta-elliptic model, and classifies them according to their position in the pseudo-word. The segmented graphemes are then described by a combination of geometric features and trajectory shape modeling. The efficiency of the considered features has been evaluated using a feed-forward neural network classifier. Experimental results using the benchmarking ADAB Database show the performance of the proposed method.
Effects of word frequency and modality on sentence comprehension impairments in people with aphasia.

PubMed

DeDe, Gayle

2012-05-01

It is well known that people with aphasia have sentence comprehension impairments. The present study investigated whether lexical factors contribute to sentence comprehension impairments in both the auditory and written modalities using online measures of sentence processing. People with aphasia and non brain-damaged controls participated in the experiment (n = 8 per group). Twenty-one sentence pairs containing high- and low-frequency words were presented in self-paced listening and reading tasks. The sentences were syntactically simple and differed only in the critical words. The dependent variables were response times for critical segments of the sentence and accuracy on the comprehension questions. The results showed that word frequency influences performance on measures of sentence comprehension in people with aphasia. The accuracy data on the comprehension questions suggested that people with aphasia have more difficulty understanding sentences containing low-frequency words in the written compared to auditory modality. Both group and single-case analyses of the response time data also indicated that people with aphasia experience more difficulty with reading than listening. Sentence comprehension in people with aphasia is influenced by word frequency and presentation modality.

Syllables and bigrams: orthographic redundancy and syllabic units affect visual word recognition at different processing levels.

PubMed

Conrad, Markus; Carreiras, Manuel; Tamm, Sascha; Jacobs, Arthur M

2009-04-01

Over the last decade, there has been increasing evidence for syllabic processing during visual word recognition. If syllabic effects prove to be independent from orthographic redundancy, this would seriously challenge the ability of current computational models to account for the processing of polysyllabic words. Three experiments are presented to disentangle effects of the frequency of syllabic units and orthographic segments in lexical decision. In Experiment 1 the authors obtained an inhibitory syllable frequency effect that was unaffected by the presence or absence of a bigram trough at the syllable boundary. In Experiments 2 and 3 an inhibitory effect of initial syllable frequency but a facilitative effect of initial bigram frequency emerged when manipulating 1 of the 2 measures and controlling for the other in Spanish words starting with consonant-vowel syllables. The authors conclude that effects of syllable frequency and letter-cluster frequency are independent and arise at different processing levels of visual word recognition. Results are discussed within the framework of an interactive activation model of visual word recognition. (c) 2009 APA, all rights reserved.
Dissociation of tone and vowel processing in Mandarin idioms.

PubMed

Hu, Jiehui; Gao, Shan; Ma, Weiyi; Yao, Dezhong

2012-09-01

Using event-related potentials, this study measured the access of suprasegmental (tone) and segmental (vowel) information in spoken word recognition with Mandarin idioms. Participants performed a delayed-response acceptability task, in which they judged the correctness of the last word of each idiom, which might deviate from the correct word in either tone or vowel. Results showed that, compared with the correct idioms, a larger early negativity appeared only for vowel violation. Additionally, a larger N400 effect was observed for vowel mismatch than tone mismatch. A control experiment revealed that these differences were not due to low-level physical differences across conditions; instead, they represented the greater constraining power of vowels than tones in the lexical selection and semantic integration of the spoken words. Furthermore, tone violation elicited a more robust late positive component than vowel violation, suggesting different reanalyses of the two types of information. In summary, the current results support a functional dissociation of tone and vowel processing in spoken word recognition. Copyright © 2012 Society for Psychophysiological Research.
Electrophysiological evidence of statistical learning of long-distance dependencies in 8-month-old preterm and full-term infants.

PubMed

Kabdebon, C; Pena, M; Buiatti, M; Dehaene-Lambertz, G

2015-09-01

Using electroencephalography, we examined 8-month-old infants' ability to discover a systematic dependency between the first and third syllables of successive words, concatenated into a monotonous speech stream, and to subsequently generalize this regularity to new items presented in isolation. Full-term and preterm infants, while exposed to the stream, displayed a significant entrainment (phase-locking) to the syllabic and word frequencies, demonstrating that they were sensitive to the word unit. The acquisition of the systematic dependency defining words was confirmed by the significantly different neural responses to rule-words and part-words subsequently presented during the test phase. Finally, we observed a correlation between syllabic entrainment during learning and the difference in phase coherence between the test conditions (rule-words vs part-words) suggesting that temporal processing of the syllable unit might be crucial in linguistic learning. No group difference was observed suggesting that non-adjacent statistical computations are already robust at 8 months, even in preterm infants, and thus develop during the first year of life, earlier than expected from behavioral studies. Copyright © 2015 Elsevier Inc. All rights reserved.
The Emotions of Abstract Words: A Distributional Semantic Analysis.

PubMed

Lenci, Alessandro; Lebani, Gianluca E; Passaro, Lucia C

2018-04-06

Recent psycholinguistic and neuroscientific research has emphasized the crucial role of emotions for abstract words, which would be grounded by affective experience, instead of a sensorimotor one. The hypothesis of affective embodiment has been proposed as an alternative to the idea that abstract words are linguistically coded and that linguistic processing plays a key role in their acquisition and processing. In this paper, we use distributional semantic models to explore the complex interplay between linguistic and affective information in the representation of abstract words. Distributional analyses on Italian norming data show that abstract words have more affective content and tend to co-occur with contexts with higher emotive values, according to affective statistical indices estimated in terms of distributional similarity with a restricted number of seed words strongly associated with a set of basic emotions. Therefore, the strong affective content of abstract words might just be an indirect byproduct of co-occurrence statistics. This is consistent with a version of representational pluralism in which concepts that are fully embodied either at the sensorimotor or at the affective level live side-by-side with concepts only indirectly embodied via their linguistic associations with other embodied words. Copyright © 2018 Cognitive Science Society, Inc.
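The affective indices mentioned above rest on distributional similarity between a word and a small set of emotion seed words. The following is only a minimal sketch of that idea, assuming cosine similarity over toy vectors; the function names, the two-dimensional vectors, and the averaging step are illustrative assumptions, not the authors' actual model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def emotive_score(word_vector, seed_vectors):
    """Mean cosine similarity between a word and the seeds of one emotion.

    Averaging over seeds is an assumption made for this sketch.
    """
    similarities = [cosine(word_vector, seed) for seed in seed_vectors]
    return sum(similarities) / len(similarities)

# Hypothetical 2-d distributional vectors, for illustration only.
joy_seeds = [[1.0, 0.0]]
fear_seeds = [[0.0, 1.0]]
freedom = [0.8, 0.6]

joy_score = emotive_score(freedom, joy_seeds)
fear_score = emotive_score(freedom, fear_seeds)
```

With these toy vectors the word sits closer to the "joy" seed than to the "fear" seed; real distributional spaces have hundreds of dimensions and many seeds per emotion.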
Effect of word familiarity on visually evoked magnetic fields.

PubMed

Harada, N; Iwaki, S; Nakagawa, S; Yamaguchi, M; Tonoike, M

2004-11-30

This study investigated the effect of word familiarity of visual stimuli on the word recognizing function of the human brain. Word familiarity is an index of the relative ease of word perception, and is characterized by facilitation and accuracy on word recognition. We studied the effect of word familiarity, using "Hiragana" (phonetic characters in Japanese orthography) characters as visual stimuli, on the elicitation of visually evoked magnetic fields with a word-naming task. The words were selected from a database of lexical properties of Japanese. The four "Hiragana" characters used were grouped and presented in 4 classes of degree of familiarity. The three components were observed in averaged waveforms of the root mean square (RMS) value on latencies at about 100 ms, 150 ms and 220 ms. The RMS value of the 220 ms component showed a significant positive correlation (F(3, 36) = 5.501; p = 0.035) with the value of familiarity. ECDs of the 220 ms component were observed in the intraparietal sulcus (IPS). Increments in the RMS value of the 220 ms component, which might reflect ideographical word recognition, retrieving "as a whole" were enhanced with increments of the value of familiarity. The interaction of characters, which increased with the value of familiarity, might function "as a large symbol"; and enhance a "pop-out" function with an escaping character inhibiting other characters and enhancing the segmentation of the character (as a figure) from the ground.
Level statistics of words: Finding keywords in literary texts and symbolic sequences

NASA Astrophysics Data System (ADS)

Carpena, P.; Bernaola-Galván, P.; Hackenberg, M.; Coronado, A. V.; Oliver, J. L.

2009-03-01

Using a generalization of the level statistics analysis of quantum disordered systems, we present an approach able to extract automatically keywords in literary texts. Our approach takes into account not only the frequencies of the words present in the text but also their spatial distribution along the text, and is based on the fact that relevant words are significantly clustered (i.e., they self-attract each other), while irrelevant words are distributed randomly in the text. Since a reference corpus is not needed, our approach is especially suitable for single documents for which no a priori information is available. In addition, we show that our method works also in generic symbolic sequences (continuous texts without spaces), thus suggesting its general applicability.
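The clustering idea in this abstract can be illustrated by looking at the gaps between successive occurrences of each word. The sketch below is a simplified stand-in, not the authors' estimator: it assumes the score is the standard deviation of the gaps divided by their mean, and the function name and `min_count` threshold are hypothetical choices.

```python
from collections import defaultdict
from statistics import mean, pstdev

def clustering_scores(tokens, min_count=3):
    """Map each word to std/mean of the gaps between its occurrences.

    Evenly spread words score near 0; strongly clustered words score
    well above 1. Words seen fewer than min_count times are skipped.
    """
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    scores = {}
    for word, where in positions.items():
        if len(where) < min_count:
            continue
        gaps = [later - earlier for earlier, later in zip(where, where[1:])]
        scores[word] = pstdev(gaps) / mean(gaps)
    return scores

# Toy sequence: "k" appears in a burst and once much later,
# while "f" fills the positions in between at regular spacing.
tokens = ["k"] * 3 + ["f"] * 47 + ["k"]
scores = clustering_scores(tokens)
```

No reference corpus is needed for this score, which matches the single-document setting the abstract describes; a fuller treatment would also correct for word frequency.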
Creating a medical dictionary using word alignment: the influence of sources and resources.

PubMed

Nyström, Mikael; Merkel, Magnus; Petersson, Håkan; Ahlfeldt, Hans

2007-11-23

Automatic word alignment of parallel texts with the same content in different languages is among other things used to generate dictionaries for new translations. The quality of the generated word alignment depends on the quality of the input resources. In this paper we report on automatic word alignment of the English and Swedish versions of the medical terminology systems ICD-10, ICF, NCSP, KSH97-P and parts of MeSH and how the terminology systems and type of resources influence the quality. We automatically word aligned the terminology systems using static resources, like dictionaries, statistical resources, like statistically derived dictionaries, and training resources, which were generated from manual word alignment. We varied which part of the terminology systems that we used to generate the resources, which parts that we word aligned and which types of resources we used in the alignment process to explore the influence the different terminology systems and resources have on the recall and precision. After the analysis, we used the best configuration of the automatic word alignment for generation of candidate term pairs. We then manually verified the candidate term pairs and included the correct pairs in an English-Swedish dictionary. The results indicate that more resources and resource types give better results but the size of the parts used to generate the resources only partly affects the quality. The most generally useful resources were generated from ICD-10 and resources generated from MeSH were not as general as other resources. Systematic inter-language differences in the structure of the terminology system rubrics make the rubrics harder to align. Manually created training resources give nearly as good results as a union of static resources, statistical resources and training resources and noticeably better results than a union of static resources and statistical resources. 
The verified English-Swedish dictionary contains 24,000 term pairs in base forms. More resources give better results in the automatic word alignment, but some resources only give small improvements. The most important type of resource is training and the most general resources were generated from ICD-10.
Creating a medical dictionary using word alignment: The influence of sources and resources

PubMed Central

Nyström, Mikael; Merkel, Magnus; Petersson, Håkan; Åhlfeldt, Hans

2007-01-01

Background Automatic word alignment of parallel texts with the same content in different languages is among other things used to generate dictionaries for new translations. The quality of the generated word alignment depends on the quality of the input resources. In this paper we report on automatic word alignment of the English and Swedish versions of the medical terminology systems ICD-10, ICF, NCSP, KSH97-P and parts of MeSH and how the terminology systems and type of resources influence the quality. Methods We automatically word aligned the terminology systems using static resources, like dictionaries, statistical resources, like statistically derived dictionaries, and training resources, which were generated from manual word alignment. We varied which part of the terminology systems that we used to generate the resources, which parts that we word aligned and which types of resources we used in the alignment process to explore the influence the different terminology systems and resources have on the recall and precision. After the analysis, we used the best configuration of the automatic word alignment for generation of candidate term pairs. We then manually verified the candidate term pairs and included the correct pairs in an English-Swedish dictionary. Results The results indicate that more resources and resource types give better results but the size of the parts used to generate the resources only partly affects the quality. The most generally useful resources were generated from ICD-10 and resources generated from MeSH were not as general as other resources. Systematic inter-language differences in the structure of the terminology system rubrics make the rubrics harder to align. Manually created training resources give nearly as good results as a union of static resources, statistical resources and training resources and noticeably better results than a union of static resources and statistical resources. 
The verified English-Swedish dictionary contains 24,000 term pairs in base forms. Conclusion More resources give better results in the automatic word alignment, but some resources only give small improvements. The most important type of resource is training and the most general resources were generated from ICD-10. PMID:18036221
SEGMENTING CT PROSTATE IMAGES USING POPULATION AND PATIENT-SPECIFIC STATISTICS FOR RADIOTHERAPY.

PubMed

Feng, Qianjin; Foskey, Mark; Tang, Songyuan; Chen, Wufan; Shen, Dinggang

2009-08-07

This paper presents a new deformable model using both population and patient-specific statistics to segment the prostate from CT images. There are two novelties in the proposed method. First, a modified scale invariant feature transform (SIFT) local descriptor, which is more distinctive than general intensity and gradient features, is used to characterize the image features. Second, an online training approach is used to build the shape statistics for accurately capturing intra-patient variation, which is more important than inter-patient variation for prostate segmentation in clinical radiotherapy. Experimental results show that the proposed method is robust and accurate, suitable for clinical application.
SEGMENTING CT PROSTATE IMAGES USING POPULATION AND PATIENT-SPECIFIC STATISTICS FOR RADIOTHERAPY

PubMed Central

Feng, Qianjin; Foskey, Mark; Tang, Songyuan; Chen, Wufan; Shen, Dinggang

2010-01-01

This paper presents a new deformable model using both population and patient-specific statistics to segment the prostate from CT images. There are two novelties in the proposed method. First, a modified scale invariant feature transform (SIFT) local descriptor, which is more distinctive than general intensity and gradient features, is used to characterize the image features. Second, an online training approach is used to build the shape statistics for accurately capturing intra-patient variation, which is more important than inter-patient variation for prostate segmentation in clinical radiotherapy. Experimental results show that the proposed method is robust and accurate, suitable for clinical application. PMID:21197416
Rank Dynamics of Word Usage at Multiple Scales

NASA Astrophysics Data System (ADS)

Morales, José A.; Colman, Ewan; Sánchez, Sergio; Sánchez-Puig, Fernanda; Pineda, Carlos; Iñiguez, Gerardo; Cocho, Germinal; Flores, Jorge; Gershenson, Carlos

2018-05-01

The recent dramatic increase in online data availability has allowed researchers to explore human culture with unprecedented detail, such as the growth and diversification of language. In particular, it provides statistical tools to explore whether word use is similar across languages, and if so, whether these generic features appear at different scales of language structure. Here we use the Google Books N-grams dataset to analyze the temporal evolution of word usage in several languages. We apply measures proposed recently to study rank dynamics, such as the diversity of N-grams in a given rank, the probability that an N-gram changes rank between successive time intervals, the rank entropy, and the rank complexity. Using different methods, results show that there are generic properties for different languages at different scales, such as a core of words necessary to minimally understand a language. We also propose a null model to explore the relevance of linguistic structure across multiple scales, concluding that N-gram statistics cannot be reduced to word statistics. We expect our results to be useful in improving text prediction algorithms, as well as in shedding light on the large-scale features of language use, beyond linguistic and cultural differences across human populations.
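Two of the rank measures named above can be sketched in a few lines. This is a minimal illustration under stated assumptions: rank diversity is taken as the number of distinct items seen at a rank divided by the number of time steps, and the change probability as the fraction of successive time steps in which the occupant of that rank differs. The function names and the toy rankings are hypothetical.

```python
def rank_diversity(rankings, k):
    """Distinct items observed at rank k, normalized by the number of time steps."""
    occupants = {ranking[k] for ranking in rankings}
    return len(occupants) / len(rankings)

def change_probability(rankings, k):
    """Fraction of successive time steps in which the item at rank k changes."""
    pairs = list(zip(rankings, rankings[1:]))
    changes = sum(1 for previous, current in pairs if previous[k] != current[k])
    return changes / len(pairs)

# Each inner list is one time step, ordered from most to least frequent.
rankings = [
    ["the", "of", "and"],
    ["the", "and", "of"],
    ["the", "of", "and"],
    ["the", "of", "to"],
]

top_diversity = rank_diversity(rankings, 0)
third_diversity = rank_diversity(rankings, 2)
second_change = change_probability(rankings, 1)
```

In this toy data the top rank never changes hands, while lower ranks are occupied by more distinct items, which is the qualitative pattern such measures are meant to expose.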
The Spiral Arm Segments of the Galaxy within 3 kpc from the Sun: A Statistical Approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Griv, Evgeny; Jiang, Ing-Guey; Hou, Li-Gang

As can be reasonably expected, upcoming large-scale APOGEE, GAIA, GALAH, LAMOST, and WEAVE stellar spectroscopic surveys will yield rather noisy Galactic distributions of stars. In view of the possibility of employing these surveys, our aim is to present a statistical method to extract information about the spiral structure of the Galaxy from currently available data, and to demonstrate the effectiveness of this method. The model differs from previous works studying how objects are distributed in space in its calculation of the statistical significance of the hypothesis that some of the objects are actually concentrated in a spiral. A statistical analysis of the distribution of cold dust clumps within molecular clouds, H II regions, Cepheid stars, and open clusters in the nearby Galactic disk within 3 kpc from the Sun is carried out. As an application of the method, we obtain distances between the Sun and the centers of the neighboring Sagittarius arm segment, the Orion arm segment in which the Sun is located, and the Perseus arm segment. Pitch angles of the logarithmic spiral segments and their widths are also estimated. The hypothesis that the collected objects accidentally form spirals is refuted with almost 100% statistical confidence. We show that these four independent distributions of young objects lead to essentially the same results. We also demonstrate that our newly deduced values of the mean distances and pitch angles for the segments are not too far from those found recently by Reid et al. using VLBI-based trigonometric parallaxes of massive star-forming regions.
The Contribution of Language-Specific Knowledge in the Selection of Statistically-Coherent Word Candidates

ERIC Educational Resources Information Center

Toro, Juan M.; Pons, Ferran; Bion, Ricardo A. H.; Sebastian-Galles, Nuria

2011-01-01

Much research has explored the extent to which statistical computations account for the extraction of linguistic information. However, it remains to be studied how language-specific constraints are imposed over these computations. In the present study we investigated if the violation of a word-forming rule in Catalan (the presence of more than one…
The discrimination of sea ice types using SAR backscatter statistics

NASA Technical Reports Server (NTRS)

Shuchman, Robert A.; Wackerman, Christopher C.; Maffett, Andrew L.; Onstott, Robert G.; Sutherland, Laura L.

1989-01-01

X-band (HH) synthetic aperture radar (SAR) data of sea ice collected during the Marginal Ice Zone Experiment in March and April of 1987 were statistically analyzed with respect to discriminating open water, first-year ice, multiyear ice, and Odden. Odden are large expanses of nilas ice that rapidly form in the Greenland Sea and transform into pancake ice. A first-order statistical analysis indicated that mean versus variance can segment out open water and first-year ice, and skewness versus modified skewness can segment the Odden and multiyear categories. In addition to first-order statistics, a model has been generated for the distribution function of the SAR ice data. Segmentation of ice types was also attempted using textural measurements. In this case, the general co-occurrence matrix was evaluated. The textural method did not generate better results than the first-order statistical approach.
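The first-order statistics referred to above are the mean, variance, and skewness of the backscatter values in an image region. The sketch below computes these three features for a list of samples; the function name is hypothetical, population (not sample) moments are an assumption, and the "modified skewness" of the abstract is not defined there, so it is left out.

```python
from statistics import fmean

def first_order_stats(samples):
    """Return (mean, variance, skewness) using population moments."""
    count = len(samples)
    mu = fmean(samples)
    variance = sum((x - mu) ** 2 for x in samples) / count
    if variance == 0:
        # A constant region has no spread, so skewness is set to 0 by convention.
        return mu, 0.0, 0.0
    third_moment = sum((x - mu) ** 3 for x in samples) / count
    skewness = third_moment / variance ** 1.5
    return mu, variance, skewness

# Hypothetical backscatter windows, for illustration only.
symmetric_window = [1.0, 2.0, 3.0]
bright_outlier_window = [1.0, 1.0, 1.0, 5.0]

sym_mean, sym_var, sym_skew = first_order_stats(symmetric_window)
out_mean, out_var, out_skew = first_order_stats(bright_outlier_window)
```

Plotting one feature against another per window (for example mean versus variance) gives the kind of two-dimensional feature space in which the abstract reports the ice classes separating.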
Antimnemonic effects of schemas in young and older adults

PubMed Central

Badham, Stephen P.; Maylor, Elizabeth A.

2016-01-01

Schema-consistent material that is aligned with an individual’s knowledge and experience is typically more memorable than abstract material. This effect is often more extreme in older adults and schema use can alleviate age deficits in memory. In three experiments, young and older adults completed memory tasks where the availability of schematic information was manipulated. Specifying nonobvious relations between to-be-remembered word pairs paradoxically hindered memory (Experiment 1). Highlighting relations within mixed lists of related and unrelated word pairs had no effect on memory for those pairs (Experiment 2). This occurred even though related word pairs were recalled better than unrelated word pairs, particularly for older adults. Revealing a schematic context in a memory task with abstract image segments also hindered memory performance, particularly for older adults (Experiment 3). The data show that processing schematic information can come with costs that offset mnemonic benefits associated with schema-consistent stimuli. PMID:25980799
Low-frequency periodicity in the coordination of progressive handwriting.

PubMed

Thomassen, A J; Meulenbroek, R G

1998-11-01

The paper addresses the question of how the effector segments are coordinated during handwriting, in particular as a function of the left-to-right progression within words. It studies the phase relations between wrist and finger-joint rotations during a repetitive graphic task (long words consisting of letters 'e'), and it subjects the resulting continuous phase-relation plots to autocorrelation analysis. A novel phenomenon, viz. that of low-frequency (1-Hz) periodicity, is observed which presumably reflects adjustments of the coordination pattern about once per second, i.e., after every three or four letters 'e'. Moreover, word length and word position are found to affect this periodicity in a predictable manner. These results are related to those of an earlier study which used an ad-hoc method of analysing wrist-finger coordination adjustments. The paper underlines the value of phase-relation analysis for certain graphic tasks, but it also points out its limitations for this purpose.
Ideophones in Japanese modulate the P2 and late positive complex responses

PubMed Central

Lockwood, Gwilym; Tuomainen, Jyrki

2015-01-01

Sound-symbolism, or the direct link between sound and meaning, is typologically and behaviorally attested across languages. However, neuroimaging research has mostly focused on artificial non-words or individual segments, which do not represent sound-symbolism in natural language. We used EEG to compare Japanese ideophones, which are phonologically distinctive sound-symbolic lexical words, and arbitrary adverbs during a sentence reading task. Ideophones elicit a larger visual P2 response than arbitrary adverbs, as well as a sustained late positive complex. Our results and previous literature suggest that the larger P2 may indicate the integration of sound and sensory information by association in response to the distinctive phonology of ideophones. The late positive complex may reflect the facilitated lexical retrieval of arbitrary words in comparison to ideophones. This account provides new evidence that ideophones exhibit similar cross-modal correspondences to those which have been proposed for non-words and individual sounds. PMID:26191031
A statistical pixel intensity model for segmentation of confocal laser scanning microscopy images.

PubMed

Calapez, Alexandre; Rosa, Agostinho

2010-09-01

Confocal laser scanning microscopy (CLSM) has been widely used in the life sciences for the characterization of cell processes because it allows the recording of the distribution of fluorescence-tagged macromolecules on a section of the living cell. It is in fact the cornerstone of many molecular transport and interaction quantification techniques where the identification of regions of interest through image segmentation is usually a required step. In many situations, because of the complexity of the recorded cellular structures or because of the amounts of data involved, image segmentation either is too difficult or inefficient to be done by hand and automated segmentation procedures have to be considered. Given the nature of CLSM images, statistical segmentation methodologies appear as natural candidates. In this work we propose a model to be used for statistical unsupervised CLSM image segmentation. The model is derived from the CLSM image formation mechanics and its performance is compared to the existing alternatives. Results show that it provides a much better description of the data on classes characterized by their mean intensity, making it suitable not only for segmentation methodologies with known number of classes but also for use with schemes aiming at the estimation of the number of classes through the application of cluster selection criteria.
Temporally selective attention supports speech processing in 3- to 5-year-old children.

PubMed

Astheimer, Lori B; Sanders, Lisa D

2012-01-01

Recent event-related potential (ERP) evidence demonstrates that adults employ temporally selective attention to preferentially process the initial portions of words in continuous speech. Doing so is an effective listening strategy since word-initial segments are highly informative. Although the development of this process remains unexplored, directing attention to word onsets may be important for speech processing in young children who would otherwise be overwhelmed by the rapidly changing acoustic signals that constitute speech. We examined the use of temporally selective attention in 3- to 5-year-old children listening to stories by comparing ERPs elicited by attention probes presented at four acoustically matched times relative to word onsets: concurrently with a word onset, 100 ms before, 100 ms after, and at random control times. By 80 ms, probes presented at and after word onsets elicited a larger negativity than probes presented before word onsets or at control times. The latency and distribution of this effect is similar to temporally and spatially selective attention effects measured in adults and, despite differences in polarity, spatially selective attention effects measured in children. These results indicate that, like adults, preschool aged children modulate temporally selective attention to preferentially process the initial portions of words in continuous speech. Copyright © 2011 Elsevier Ltd. All rights reserved.
Effects of metric hierarchy and rhyme predictability on word duration in The Cat in the Hat.

PubMed

Breen, Mara

2018-05-01

Word durations convey many types of linguistic information, including intrinsic lexical features like length and frequency and contextual features like syntactic and semantic structure. The current study was designed to investigate whether hierarchical metric structure and rhyme predictability account for durational variation over and above other features in productions of a rhyming, metrically-regular children's book: The Cat in the Hat (Dr. Seuss, 1957). One-syllable word durations and inter-onset intervals were modeled as functions of segment number, lexical frequency, word class, syntactic structure, repetition, and font emphasis. Consistent with prior work, factors predicting longer word durations and inter-onset intervals included more phonemes, lower frequency, first mention, alignment with a syntactic boundary, and capitalization. A model parameter corresponding to metric grid height improved model fit of word durations and inter-onset intervals. Specifically, speakers realized five levels of metric hierarchy with inter-onset intervals such that interval duration increased linearly with increased height in the metric hierarchy. Conversely, speakers realized only three levels of metric hierarchy with word duration, demonstrating that they shortened the highly predictable rhyme resolutions. These results further understanding of the factors that affect spoken word duration, and demonstrate the myriad cues that children receive about linguistic structure from nursery rhymes. Copyright © 2018 Elsevier B.V. All rights reserved.

Automated Identification and Characterization of Secondary & Tertiary gamma’ Precipitates in Nickel-Based Superalloys (PREPRINT)

DTIC Science & Technology

2010-01-01

and intensity information from the EFTEM images. The microstructural statistics obtained from the segmented γ’ precipitates agreed with those of the...is its ability to automate segmentation of precipitates in a reproducible manner for acquiring microstructural statistics that relate to both...were identified using a combination of visual inspection and intensity information from the EFTEM images. The microstructural statistics obtained
The effects of processing and sequence organization on the timing of turn taking: a corpus study

PubMed Central

Roberts, Seán G.; Torreira, Francisco; Levinson, Stephen C.

2015-01-01

The timing of turn taking in conversation is extremely rapid given the cognitive demands on speakers to comprehend, plan and execute turns in real time. Findings from psycholinguistics predict that the timing of turn taking is influenced by demands on processing, such as word frequency or syntactic complexity. An alternative view comes from the field of conversation analysis, which predicts that the rules of turn-taking and sequence organization may dictate the variation in gap durations (e.g., the functional role of each turn in communication). In this paper, we estimate the role of these two different kinds of factors in determining the speed of turn-taking in conversation. We use the Switchboard corpus of English telephone conversation, already richly annotated for syntactic structure, speech act sequences, and segmental alignment. To this we add further information including Floor Transfer Offset (the amount of time between the end of one turn and the beginning of the next), word frequency, concreteness, and surprisal values. We then apply a novel statistical framework (“random forests”) to show that these two dimensions are interwoven together with indexical properties of the speakers as explanatory factors determining the speed of response. We conclude that an explanation of the timing of turn taking will require insights from both processing and sequence organization. PMID:26029125
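Floor Transfer Offset, as defined in the abstract, is the time between the end of one turn and the beginning of the next. A minimal sketch of that computation follows; the tuple layout, the function name, and the choice to count only transitions between different speakers are assumptions made for illustration, not details taken from the study.

```python
def floor_transfer_offsets(turns):
    """Compute offsets for consecutive turns by different speakers.

    turns: list of (speaker, start_seconds, end_seconds), sorted by start time.
    A positive offset is a gap; a negative offset is an overlap.
    """
    offsets = []
    for previous, current in zip(turns, turns[1:]):
        previous_speaker, _, previous_end = previous
        current_speaker, current_start, _ = current
        if previous_speaker != current_speaker:
            offsets.append(current_start - previous_end)
    return offsets

# Hypothetical annotated turns, for illustration only.
turns = [
    ("A", 0.0, 1.5),
    ("B", 1.75, 3.0),   # starts 0.25 s after A ends (gap)
    ("A", 2.75, 4.0),   # starts 0.25 s before B ends (overlap)
    ("A", 4.5, 5.0),    # same speaker continues, not a floor transfer
]

offsets = floor_transfer_offsets(turns)
```

Offsets computed this way could then be paired with predictors such as word frequency or surprisal for each turn before fitting a model.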
RankExplorer: Visualization of Ranking Changes in Large Time Series Data.

PubMed

Shi, Conglei; Cui, Weiwei; Liu, Shixia; Xu, Panpan; Chen, Wei; Qu, Huamin

2012-12-01

For many applications involving time series data, people are often interested in the changes of item values over time as well as their ranking changes. For example, people search many words via search engines like Google and Bing every day. Analysts are interested in both the absolute searching number for each word as well as their relative rankings. Both sets of statistics may change over time. For very large time series data with thousands of items, how to visually present ranking changes is an interesting challenge. In this paper, we propose RankExplorer, a novel visualization method based on ThemeRiver to reveal the ranking changes. Our method consists of four major components: 1) a segmentation method which partitions a large set of time series curves into a manageable number of ranking categories; 2) an extended ThemeRiver view with embedded color bars and changing glyphs to show the evolution of aggregation values related to each ranking category over time as well as the content changes in each ranking category; 3) a trend curve to show the degree of ranking changes over time; 4) rich user interactions to support interactive exploration of ranking changes. We have applied our method to some real time series data and the case studies demonstrate that our method can reveal the underlying patterns related to ranking changes which might otherwise be obscured in traditional visualizations.
The Metamorphosis of the Statistical Segmentation Output: Lexicalization during Artificial Language Learning

ERIC Educational Resources Information Center

Fernandes, Tania; Kolinsky, Regine; Ventura, Paulo

2009-01-01

This study combined artificial language learning (ALL) with conventional experimental techniques to test whether statistical speech segmentation outputs are integrated into adult listeners' mental lexicon. Lexicalization was assessed through inhibitory effects of novel neighbors (created by the parsing process) on auditory lexical decisions to…
Direction of Wording Effects in Balanced Scales.

ERIC Educational Resources Information Center

Miller, Timothy R.; Cleary, T. Anne

1993-01-01

The degree to which statistical item selection reduces direction-of-wording effects in balanced affective measures developed from relatively small item pools was investigated with 171 male and 228 female undergraduate and graduate students at 2 U.S. universities. Clearest direction-of-wording effects result from selection of items with high…
Automated skin lesion segmentation with kernel density estimation

NASA Astrophysics Data System (ADS)

Pardo, A.; Real, E.; Fernandez-Barreras, G.; Madruga, F. J.; López-Higuera, J. M.; Conde, O. M.

2017-07-01

Skin lesion segmentation is a complex step for dermoscopy pathological diagnosis. Kernel density estimation is proposed as a segmentation technique based on the statistic distribution of color intensities in the lesion and non-lesion regions.
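The idea of segmenting by the statistical distribution of intensities in lesion and non-lesion regions can be sketched as follows. This is not the authors' implementation: the one-dimensional intensities, the Gaussian kernel, the bandwidth, and the names `gaussian_kde` and `label_pixel` are all assumptions chosen to keep the example small.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from samples with a Gaussian kernel."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm
    return density

def label_pixel(value, lesion_density, skin_density):
    """Assign the class whose estimated density is higher at this intensity."""
    return "lesion" if lesion_density(value) > skin_density(value) else "non-lesion"

# Made-up training intensities for each region (a single channel, 0-255).
lesion_density = gaussian_kde([40, 45, 50, 55], bandwidth=10.0)
skin_density = gaussian_kde([150, 160, 170, 180], bandwidth=10.0)

print([label_pixel(v, lesion_density, skin_density) for v in (60, 140)])
```

A real dermoscopy pipeline would presumably work on colour channels and on every pixel of the image; the sketch only shows the per-pixel density comparison.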
The Sound Pattern of Japanese Surnames

ERIC Educational Resources Information Center

Tanaka, Yu

2017-01-01

Compound surnames in Japanese show complex phonological patterns, which pose challenges to current theories of phonology. This dissertation proposes an account of the segmental and prosodic issues in Japanese surnames and discusses their theoretical implications. Like regular compound words, compound surnames may undergo a sound alternation known…
Visual speech information: a help or hindrance in perceptual processing of dysarthric speech.

PubMed

Borrie, Stephanie A

2015-03-01

This study investigated the influence of visual speech information on perceptual processing of neurologically degraded speech. Fifty listeners identified spastic dysarthric speech under both audio (A) and audiovisual (AV) conditions. Condition comparisons revealed that the addition of visual speech information enhanced processing of the neurologically degraded input in terms of (a) acuity (percent phonemes correct) of vowels and consonants and (b) recognition (percent words correct) of predictive and nonpredictive phrases. Listeners exploited stress-based segmentation strategies more readily in AV conditions, suggesting that the perceptual benefit associated with adding visual speech information to the auditory signal (the AV advantage) has both segmental and suprasegmental origins. Results also revealed that the magnitude of the AV advantage can be predicted, to some degree, by the extent to which an individual utilizes syllabic stress cues to inform word recognition in AV conditions. Findings inform the development of a listener-specific model of speech perception that applies to processing of dysarthric speech in everyday communication contexts.
Using visible speech to train perception and production of speech for individuals with hearing loss.

PubMed

Massaro, Dominic W; Light, Joanna

2004-04-01

The main goal of this study was to implement a computer-animated talking head, Baldi, as a language tutor for speech perception and production for individuals with hearing loss. Baldi can speak slowly; illustrate articulation by making the skin transparent to reveal the tongue, teeth, and palate; and show supplementary articulatory features, such as vibration of the neck to show voicing and turbulent airflow to show frication. Seven students with hearing loss between the ages of 8 and 13 were trained for 6 hours across 21 weeks on 8 categories of segments (4 voiced vs. voiceless distinctions, 3 consonant cluster distinctions, and 1 fricative vs. affricate distinction). Training included practice at the segment and the word level. Perception and production improved for each of the 7 children. Speech production also generalized to new words not included in the training lessons. Finally, speech production deteriorated somewhat after 6 weeks without training, indicating that the training method rather than some other experience was responsible for the improvement that was found.
Computer Supported Indexing: A History and Evaluation of NASA's MAI System. Supplement 24

NASA Technical Reports Server (NTRS)

Silvester, June P.

1997-01-01

Computer supported indexing systems may be categorized in several ways. One classification scheme refers to them as statistical, syntactic, semantic or knowledge-based. While a system may emphasize one of these aspects, most systems actually combine two or more of these mechanisms to maximize system efficiency. Statistical systems can be based on counts of words or word stems, statistical association, and correlation techniques that assign weights to word locations or provide lexical disambiguation, calculations regarding the likelihood of word co-occurrences, clustering of word stems and transformations, or any other computational method used to identify pertinent terms. If words are counted, the ones of median frequency become candidate index terms. Syntactical systems stress grammar and identify parts of speech. Concepts found in designated grammatical combinations, such as noun phrases, generate the suggested terms. Semantic systems are concerned with the context sensitivity of words in text. The primary goal of this type of indexing is to identify without regard to syntax the subject matter and the context-bearing words in the text being indexed. Knowledge-based systems provide a conceptual network that goes past thesaurus or equivalent relationships to knowing (e.g., in the National Library of Medicine (NLM) system) that because the tibia is part of the leg, a document relating to injuries to the tibia should be indexed to LEG INJURIES, not the broader MeSH term INJURIES, or knowing that the term FEMALE should automatically be added when the term PREGNANCY is assigned, and also that the indexer should be prompted to add either HUMAN or ANIMAL. Another way of categorizing indexing systems is to identify them as producing either assigned- or derived-term indexes.
Arabic word recognizer for mobile applications

NASA Astrophysics Data System (ADS)

Khanna, Nitin; Abdollahian, Golnaz; Brame, Ben; Boutin, Mireille; Delp, Edward J.

2011-03-01

When traveling in a region where the local language is not written using a "Roman alphabet," translating written text (e.g., documents, road signs, or placards) is a particularly difficult problem since the text cannot be easily entered into a translation device or searched using a dictionary. To address this problem, we are developing the "Rosetta Phone," a handheld device (e.g., PDA or mobile telephone) capable of acquiring an image of the text, locating the region (word) of interest within the image, and producing both an audio and a visual English interpretation of the text. This paper presents a system targeted for interpreting words written in Arabic script. The goal of this work is to develop an autonomous, segmentation-free Arabic phrase recognizer, with computational complexity low enough to deploy on a mobile device. A prototype of the proposed system has been deployed on an iPhone with a suitable user interface. The system was tested on a number of noisy images, in addition to the images acquired from the iPhone's camera. It identifies Arabic words or phrases by extracting appropriate features and assigning "codewords" to each word or phrase. On a dictionary of 5,000 words, the system uniquely mapped (word-image to codeword) 99.9% of the words. The system has an 82% recognition accuracy on images of words captured using the iPhone's built-in camera.
Universal partitioning of the hierarchical fold network of 50-residue segments in proteins

PubMed Central

Ito, Jun-ichi; Sonobe, Yuki; Ikeda, Kazuyoshi; Tomii, Kentaro; Higo, Junichi

2009-01-01

Background Several studies have demonstrated that protein fold space is structured hierarchically and that power-law statistics are satisfied in relation between the numbers of protein families and protein folds (or superfamilies). We examined the internal structure and statistics in the fold space of 50 amino-acid residue segments taken from various protein folds. We used inter-residue contact patterns to measure the tertiary structural similarity among segments. Using this similarity measure, the segments were classified into a number (Kc) of clusters. We examined various Kc values for the clustering. The special resolution to differentiate the segment tertiary structures increases with increasing Kc. Furthermore, we constructed networks by linking structurally similar clusters. Results The network was partitioned persistently into four regions for Kc ≥ 1000. This main partitioning is consistent with results of earlier studies, where similar partitioning was reported in classifying protein domain structures. Furthermore, the network was partitioned naturally into several dozens of sub-networks (i.e., communities). Therefore, intra-sub-network clusters were mutually connected with numerous links, although inter-sub-network ones were rarely done with few links. For Kc ≥ 1000, the major sub-networks were about 40; the contents of the major sub-networks were conserved. This sub-partitioning is a novel finding, suggesting that the network is structured hierarchically: Segments construct a cluster, clusters form a sub-network, and sub-networks constitute a region. Additionally, the network was characterized by non-power-law statistics, which is also a novel finding. Conclusion Main findings are: (1) The universe of 50 residue segments found here was characterized by non-power-law statistics. Therefore, the universe differs from those ever reported for the protein domains. (2) The 50-residue segments were partitioned persistently and universally into some dozens (ca. 40) of major sub-networks, irrespective of the number of clusters. (3) These major sub-networks encompassed 90% of all segments. Consequently, the protein tertiary structure is constructed using the dozens of elements (sub-networks). PMID:19454039
Feasibility Study for Design of a Biocybernetic Communication System

DTIC Science & Technology

1975-08-01

electrode for the Within Words variance and Between Words variance for each of the 255 data samples in the 6-sec epoch. If a given sample point was not...contributing to the computer classification of the word, the ratio of the two variances (i.e., the F-statistic) should be small. On the other hand...if the Between Word variance was significantly higher than the Within Word variance for a given sample point, we can assume with some confidence
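The variance ratio described in this excerpt can be sketched as a one-way ANOVA F-statistic computed at a single sample point. The report's exact computation is not given in the excerpt, so the standard between-group and within-group mean squares are assumed here, and the amplitudes and the name `f_ratio` are made up for illustration.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    k = len(groups)                      # number of words (groups)
    n = sum(len(g) for g in groups)      # total number of trials
    grand = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up amplitudes at one sample point: three trials for each of two words.
informative = [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]      # word means differ
uninformative = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]    # word means are identical

print(f_ratio(informative), f_ratio(uninformative))
```

A large ratio marks a sample point where the words differ more than repeated trials of the same word do, which is the case the excerpt treats as useful for classification.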
Statistical evaluation of manual segmentation of a diffuse low-grade glioma MRI dataset.

PubMed

Ben Abdallah, Meriem; Blonski, Marie; Wantz-Mezieres, Sophie; Gaudeau, Yann; Taillandier, Luc; Moureaux, Jean-Marie

2016-08-01

Software-based manual segmentation is critical to the supervision of diffuse low-grade glioma patients and to the optimal treatment's choice. However, manual segmentation being time-consuming, it is difficult to include it in the clinical routine. An alternative to circumvent the time cost of manual segmentation could be to share the task among different practitioners, providing it can be reproduced. The goal of our work is to assess diffuse low-grade gliomas' manual segmentation's reproducibility on MRI scans, with regard to practitioners, their experience and field of expertise. A panel of 13 experts manually segmented 12 diffuse low-grade glioma clinical MRI datasets using the OSIRIX software. A statistical analysis gave promising results, as the practitioner factor, the medical specialty and the years of experience seem to have no significant impact on the average values of the tumor volume variable.
Real-world visual statistics and infants' first-learned object names.

PubMed

Clerkin, Elizabeth M; Hart, Elizabeth; Rehg, James M; Yu, Chen; Smith, Linda B

2017-01-05

We offer a new solution to the unsolved problem of how infants break into word learning based on the visual statistics of everyday infant-perspective scenes. Images from head camera video captured by 8 1/2 to 10 1/2 month-old infants at 147 at-home mealtime events were analysed for the objects in view. The images were found to be highly cluttered with many different objects in view. However, the frequency distribution of object categories was extremely right skewed such that a very small set of objects was pervasively present, a fact that may substantially reduce the problem of referential ambiguity. The statistical structure of objects in these infant egocentric scenes differs markedly from that in the training sets used in computational models and in experiments on statistical word-referent learning. Therefore, the results also indicate a need to re-examine current explanations of how infants break into word learning. This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Author(s).
Multiresolution multiscale active mask segmentation of fluorescence microscope images

NASA Astrophysics Data System (ADS)

Srinivasa, Gowri; Fickus, Matthew; Kovačević, Jelena

2009-08-01

We propose an active mask segmentation framework that combines the advantages of statistical modeling, smoothing, speed and flexibility offered by the traditional methods of region-growing, multiscale, multiresolution and active contours respectively. At the crux of this framework is a paradigm shift from evolving contours in the continuous domain to evolving multiple masks in the discrete domain. Thus, the active mask framework is particularly suited to segment digital images. We demonstrate the use of the framework in practice through the segmentation of punctate patterns in fluorescence microscope images. Experiments reveal that statistical modeling helps the multiple masks converge from a random initial configuration to a meaningful one. This obviates the need for an involved initialization procedure germane to most of the traditional methods used to segment fluorescence microscope images. While we provide the mathematical details of the functions used to segment fluorescence microscope images, this is only an instantiation of the active mask framework. We suggest some other instantiations of the framework to segment different types of images.
Quantitative analysis of the text and graphic content in ophthalmic slide presentations.

PubMed

Ing, Edsel; Celo, Erdit; Ing, Royce; Weisbrod, Lawrence; Ing, Mercedes

2017-04-01

To determine the characteristics of ophthalmic digital slide presentations. Retrospective quantitative analysis. Slide presentations from a 2015 Canadian primary eye care conference were analyzed for their duration, character and word count, font size, words per minute (wpm), lines per slide, words per slide, slides per minute (spm), text density product (wpm × spm), proportion of graphic content, and Flesch Reading Ease (FRE) score using Microsoft PowerPoint and Word. The median audience evaluation score for the lectures was used to dichotomize the higher scoring lectures (HSL) from the lower scoring lectures (LSL). A priori we hypothesized that there would be a difference in the wpm, spm, text density product, and FRE score between HSL and LSL. Wilcoxon rank-sum tests with Bonferroni correction were utilized. The 17 lectures had medians of 2.5 spm, 20.3 words per slide, 5.0 lines per slide, 28-point sans serif font, 36% graphic content, and text density product of 136.4 words × slides/minute². Although not statistically significant, the HSL had more wpm, fewer words per slide, more graphics per slide, greater text density, and higher FRE score than LSL. There was a statistically significant difference in the spm of the HSL (3.1 ± 1.0) versus the LSL (2.2 ± 1.0) at p = 0.0124. All presenters showed more than 1 slide per minute. The HSL showed more spm than the LSL. The descriptive statistics from this study may aid in the preparation of slides used for teaching and conferences. Copyright © 2017 Canadian Ophthalmological Society. Published by Elsevier Inc. All rights reserved.
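The text density product named in the abstract above (wpm × spm) is simple arithmetic, shown here for a single hypothetical lecture. The slide count, word count, and duration are invented for illustration, and the function name `text_density_product` is an assumption.

```python
def text_density_product(total_words: int, total_slides: int, minutes: float) -> float:
    """Words per minute times slides per minute, in words x slides / minute^2."""
    wpm = total_words / minutes
    spm = total_slides / minutes
    return wpm * spm

# Hypothetical lecture: 800 words on 40 slides delivered in 20 minutes.
print(text_density_product(total_words=800, total_slides=40, minutes=20.0))
```

Because the study reports medians across 17 lectures, multiplying its median wpm and median spm would not necessarily reproduce its median product.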
A metric to search for relevant words

NASA Astrophysics Data System (ADS)

Zhou, Hongding; Slater, Gary W.

2003-11-01

We propose a new metric to evaluate and rank the relevance of words in a text. The method uses the density fluctuations of a word to compute an index that measures its degree of clustering. Highly significant words tend to form clusters, while common words are essentially uniformly spread in a text. If a word is not rare, the metric is stable when we move any individual occurrence of this word in the text. Furthermore, we prove that the metric always increases when words are moved to form larger clusters, or when several independent documents are merged. Using the Holy Bible as an example, we show that our approach reduces the significance of common words when compared to a recently proposed statistical metric.
Statistical Learning in Emerging Lexicons: The Case of Danish

ERIC Educational Resources Information Center

Stokes, Stephanie F.; Bleses, Dorthe; Basboll, Hans; Lambertsen, Claus

2012-01-01

Purpose: This research explored the impact of neighborhood density (ND), word frequency (WF), and word length (WL) on the vocabulary size of Danish-speaking children. Given the particular phonological properties of Danish, the impact was expected to differ from that reported in studies on English and French. Method: The monosyllabic words in the…
Judging Words by Their Covers and the Company They Keep: Probabilistic Cues Support Word Learning

ERIC Educational Resources Information Center

Lany, Jill

2014-01-01

Statistical learning may be central to lexical and grammatical development. The phonological and distributional properties of words provide probabilistic cues to their grammatical and semantic properties. Infants can capitalize on such probabilistic cues to learn grammatical patterns in listening tasks. However, infants often struggle to learn…

Jointly learning word embeddings using a corpus and a knowledge base

PubMed Central

Bollegala, Danushka; Maehara, Takanori; Kawarabayashi, Ken-ichi

2018-01-01

Methods for representing the meaning of words in vector spaces purely using the information distributed in text corpora have proved to be very valuable in various text mining and natural language processing (NLP) tasks. However, these methods still disregard the valuable semantic relational structure between words in co-occurring contexts. These beneficial semantic relational structures are contained in manually-created knowledge bases (KBs) such as ontologies and semantic lexicons, where the meanings of words are represented by defining the various relationships that exist among those words. We combine the knowledge in both a corpus and a KB to learn better word embeddings. Specifically, we propose a joint word representation learning method that uses the knowledge in the KBs, and simultaneously predicts the co-occurrences of two words in a corpus context. In particular, we use the corpus to define our objective function subject to the relational constraints derived from the KB. We further utilise the corpus co-occurrence statistics to propose two novel approaches, Nearest Neighbour Expansion (NNE) and Hedged Nearest Neighbour Expansion (HNE), that dynamically expand the KB and therefore derive more constraints that guide the optimisation process. Our experimental results over a wide range of benchmark tasks demonstrate that the proposed method statistically significantly improves the accuracy of the word embeddings learnt. It outperforms a corpus-only baseline and reports an improvement over a number of previously proposed methods that incorporate corpora and KBs in both semantic similarity prediction and word analogy detection tasks. PMID:29529052
E-WOM Review Adoption: Consumers’ Demographic Profile Influence on Green Purchase Intention

NASA Astrophysics Data System (ADS)

Rahim, Roslin Abdul; Sulaiman, Zuraidah; Chin, Thoo Ai; Arif, Mohd Shoki Mohd; Hamid, Mohd Hakim Abdul

2017-06-01

Nowadays, green products are getting popular in their acceptance by the Malaysian consumers. Due to the advancement of the Internet technologies and the wide spread of electronic word of mouth (E-WOM), consumers seem to be more influenced in purchasing the green products. In this study, consumers’ demographic profiles, such as age, gender, income, education background, and occupation are being explored to investigate their influences on consumers’ green product purchase intention. The purpose of this paper is to showcase the results of the differences between several demographic profile groups on green product purchase intention using descriptive analysis, ANOVA and independent sample T-Test. T-test results showed that there is a statistically significant difference between gender on consumers’ green product purchase intention. Meanwhile, the results generated by ANOVA indicated that there are no significant differences between age, income, education background and occupation on consumers’ green product purchase intention. These results shed light on the potential market segment that should be targeted by marketers and producers of green products in Malaysia.
The effects of energetic and informational masking on The Words-in-Noise Test (WIN).

PubMed

Wilson, Richard H; Trivette, Cristine P; Williams, Daniel A; Watts, Kelly L

2012-01-01

In certain masking paradigms, the masker can have two components, energetic and informational. Energetic masking is the traditional peripheral masking, whereas informational masking involves confusions (uncertainty) between the signal and masker that originate more centrally in the auditory system. Sperry et al (1997) used Northwestern University Auditory Test No. 6 (NU-6) words in multitalker babble to study the differential effects of energetic and informational masking using babble played temporally forward (FB) and backward (BB). The FB and BB are the same except BB is void of the contextual and semantic content cues that are available in FB. It is these informational cues that are thought to fuel informational masking. Sperry et al found 15% better recognition performance (∼3 dB) on BB than on FB, which can be interpreted as the presence of informational masking in the FB condition and not in the BB condition (Dirks and Bower, 1969). The Words-in-Noise Test (WIN) (Wilson, 2003; Wilson and McArdle, 2007) uses NU-6 words as the signal and multitalker babble as the masker, which is a combination of stimuli that potentially could produce informational masking. The WIN presents 5 or 10 words at each of seven signal-to-noise ratios (S/N, SNR) from 24 to 0 dB in 4 dB decrements with the 50% correct point being the metric of interest. The same recordings of the NU-6 words and multitalker babble used by Sperry et al are used in the WIN. To determine whether informational masking was involved with the WIN. Descriptive, quasi-experimental designs were conducted in three experiments using FB and BB in various paradigms in which FB and BB varied from 4.3 sec concatenated segments to essentially continuous. Eighty young adults with normal hearing and 64 older adults with sensorineural hearing losses participated in a series of three experiments. 
Experiment 1 compared performance on the normal WIN (FB) with performance on the WIN in which the babble segment with each word was reversed temporally (BB). Experiment 2 examined the effects of continuous FB and BB segments on WIN performance. Experiment 3 replicated the Sperry et al (1997) experiment at 4 and 0 dB S/N using NU-6 words in the FB and BB conditions. Experiment 1: with the WIN paradigm, recognition performances on FB and BB were the same for listeners with normal hearing and listeners with hearing loss, except at the 0 dB S/N with the listeners with normal hearing at which performance was significantly better on BB than FB. Experiment 2: recognition performances on FB and BB were the same at all SNRs for listeners with normal hearing using a slightly modified WIN paradigm. Experiment 3: there was no difference in performances on the FB and BB conditions with either of the two SNRs. Informational masking was not involved in the WIN paradigm. The Sperry et al results were not replicated, which is thought to be related to the way in which the Sperry et al BB condition was produced. American Academy of Audiology.
Linguistica matematica, statistica linguistica e linguistica applicata. Una nota storica sui lessici di frequenza e l'educazione linguistica (Mathematical Linguistics, Linguistic Statistics, and Applied Linguistics. An Historical Note on Word Frequencies and Linguistic Education)

ERIC Educational Resources Information Center

Elia, Annibale

1977-01-01

This article traces the history of several themes in applied linguistics and shows the relationships between linguistic theory and the sciences concerned with the learning and teaching of languages. Interest in word frequency statistics is discussed in particular. (Text is in Italian.) (CFM)
Localized Statistics for DW-MRI Fiber Bundle Segmentation

PubMed Central

Lankton, Shawn; Melonakos, John; Malcolm, James; Dambreville, Samuel; Tannenbaum, Allen

2013-01-01

We describe a method for segmenting neural fiber bundles in diffusion-weighted magnetic resonance images (DWMRI). As these bundles traverse the brain to connect regions, their local orientation of diffusion changes drastically, hence a constant global model is inaccurate. We propose a method to compute localized statistics on orientation information and use it to drive a variational active contour segmentation that accurately models the non-homogeneous orientation information present along the bundle. Initialized from a single fiber path, the proposed method proceeds to capture the entire bundle. We demonstrate results using the technique to segment the cingulum bundle and describe several extensions making the technique applicable to a wide range of tissues. PMID:23652079
Sparse intervertebral fence composition for 3D cervical vertebra segmentation

NASA Astrophysics Data System (ADS)

Liu, Xinxin; Yang, Jian; Song, Shuang; Cong, Weijian; Jiao, Peifeng; Song, Hong; Ai, Danni; Jiang, Yurong; Wang, Yongtian

2018-06-01

Statistical shape models are capable of extracting shape prior information, and are usually utilized to assist the task of segmentation of medical images. However, such models require large training datasets in the case of multi-object structures, and it also is difficult to achieve satisfactory results for complex shapes. This study proposed a novel statistical model for cervical vertebra segmentation, called sparse intervertebral fence composition (SiFC), which can reconstruct the boundary between adjacent vertebrae by modeling intervertebral fences. The complex shape of the cervical spine is replaced by a simple intervertebral fence, which considerably reduces the difficulty of cervical segmentation. The final segmentation results are obtained by using a 3D active contour deformation model without shape constraint, which substantially enhances the recognition capability of the proposed method for objects with complex shapes. The proposed segmentation framework is tested on a dataset with CT images from 20 patients. A quantitative comparison against corresponding reference vertebral segmentation yields an overall mean absolute surface distance of 0.70 mm and a dice similarity index of 95.47% for cervical vertebral segmentation. The experimental results show that the SiFC method achieves competitive cervical vertebral segmentation performances, and completely eliminates inter-process overlap.
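The Dice similarity index reported in the abstract above measures the overlap between a computed segmentation and a reference one. A minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), follows. Representing each segmentation as a set of voxel coordinates is an assumption for illustration, as are the toy voxels and the name `dice_index`.

```python
from typing import Set, Tuple

Voxel = Tuple[int, int, int]

def dice_index(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice similarity between two voxel sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

# Toy segmentations: 4 and 3 voxels, 2 of them shared.
predicted = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
reference = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}

print(dice_index(predicted, reference))  # 2 * 2 / (4 + 3)
```

The index is 1 for identical segmentations and 0 for disjoint ones, so the 95.47% quoted in the abstract indicates close agreement with the reference.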
Radiographic Response to Yttrium-90 Radioembolization in Anterior Versus Posterior Liver Segments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ibrahim, Saad M.; Lewandowski, Robert J.; Ryu, Robert K.

2008-11-15

The purpose of our study was to determine if preferential radiographic tumor response occurs in tumors located in posterior versus anterior liver segments following radioembolization with yttrium-90 glass microspheres. One hundred thirty-seven patients with chemorefractory liver metastases of various primaries were treated with yttrium-90 glass microspheres. Of these, a subset analysis was performed on 89 patients who underwent 101 whole-right-lobe infusions to liver segments V, VI, VII, and VIII. Pre- and posttreatment imaging included either triphasic contrast material-enhanced CT or gadolinium-enhanced MRI. Responses to treatment were compared in anterior versus posterior right lobe lesions using both RECIST and WHO criteria. Statistical comparative studies were conducted in 42 patients with both anterior and posterior segment lesions using the paired-sample t-test. Pearson correlation was used to determine the relationship between pretreatment tumor size and posttreatment tumor response. Median administered activity, delivered radiation dose, and treatment volume were 2.3 GBq, 118.2 Gy, and 1,072 cm³, respectively. Differences between the pretreatment tumor size of anterior and posterior liver segments were not statistically significant (p = 0.7981). Differences in tumor response between anterior and posterior liver segments were not statistically significant using WHO criteria (p = 0.8557). A statistically significant correlation did not exist between pretreatment tumor size and posttreatment tumor response (r = 0.0554, p = 0.4434). On imaging follow-up using WHO criteria, for anterior and posterior regions of the liver, (1) response rates were 50% (PR = 50%) and 45% (CR = 9%, PR = 36%), and (2) mean changes in tumor size were -41% and -40%. In conclusion, this study did not find evidence of preferential radiographic tumor response in posterior versus anterior liver segments treated with yttrium-90 glass microspheres.
Radiographic response to yttrium-90 radioembolization in anterior versus posterior liver segments.

PubMed

Ibrahim, Saad M; Lewandowski, Robert J; Ryu, Robert K; Sato, Kent T; Gates, Vanessa L; Mulcahy, Mary F; Kulik, Laura; Larson, Andrew C; Omary, Reed A; Salem, Riad

2008-01-01

The purpose of our study was to determine if preferential radiographic tumor response occurs in tumors located in posterior versus anterior liver segments following radioembolization with yttrium-90 glass microspheres. One hundred thirty-seven patients with chemorefractory liver metastases of various primaries were treated with yttrium-90 glass microspheres. Of these, a subset analysis was performed on 89 patients who underwent 101 whole-right-lobe infusions to liver segments V, VI, VII, and VIII. Pre- and posttreatment imaging included either triphasic contrast material-enhanced CT or gadolinium-enhanced MRI. Responses to treatment were compared in anterior versus posterior right lobe lesions using both RECIST and WHO criteria. Statistical comparative studies were conducted in 42 patients with both anterior and posterior segment lesions using the paired-sample t-test. Pearson correlation was used to determine the relationship between pretreatment tumor size and posttreatment tumor response. Median administered activity, delivered radiation dose, and treatment volume were 2.3 GBq, 118.2 Gy, and 1,072 cm³, respectively. Differences between the pretreatment tumor size of anterior and posterior liver segments were not statistically significant (p = 0.7981). Differences in tumor response between anterior and posterior liver segments were not statistically significant using WHO criteria (p = 0.8557). A statistically significant correlation did not exist between pretreatment tumor size and posttreatment tumor response (r = 0.0554, p = 0.4434). On imaging follow-up using WHO criteria, for anterior and posterior regions of the liver, (1) response rates were 50% (PR = 50%) and 45% (CR = 9%, PR = 36%), and (2) mean changes in tumor size were -41% and -40%. In conclusion, this study did not find evidence of preferential radiographic tumor response in posterior versus anterior liver segments treated with yttrium-90 glass microspheres.
Effects of Word Frequency and Modality on Sentence Comprehension Impairments in People with Aphasia

PubMed Central

DeDe, Gayle

2014-01-01

Purpose: It is well known that people with aphasia have sentence comprehension impairments. The present study investigated whether lexical factors contribute to sentence comprehension impairments in both the auditory and written modalities using on-line measures of sentence processing. Methods: People with aphasia and non-brain-damaged controls participated in the experiment (n=8 per group). Twenty-one sentence pairs containing high and low frequency words were presented in self-paced listening and reading tasks. The sentences were syntactically simple and differed only in the critical words. The dependent variables were response times for critical segments of the sentence and accuracy on the comprehension questions. Results: The results showed that word frequency influences performance on measures of sentence comprehension in people with aphasia. The accuracy data on the comprehension questions suggested that people with aphasia have more difficulty understanding sentences containing low frequency words in the written compared to auditory modality. Both group and single case analyses of the response time data also pointed to more difficulty with reading than listening. Conclusions: The results show that sentence comprehension in people with aphasia is influenced by word frequency and presentation modality. PMID:22294411
Investigating lexical competition and the cost of phonemic restoration.

PubMed

Balling, Laura Winther; Morris, David Jackson; Tøndering, John

2017-12-01

Due to phonemic restoration, listeners can reliably perceive words when a phoneme is replaced with noise. The cost associated with this process was investigated along with the effect of lexical uniqueness on phonemic restoration, using data from a lexical decision experiment where noise replaced phonemes that were either uniqueness points (the phoneme at which a word deviates from all nonrelated words that share the same onset) or phonemes immediately prior to these. A baseline condition was also included with no noise-interrupted stimuli. Results showed a significant cost of phonemic restoration, with 100 ms longer word identification times and a 14% decrease in word identification accuracy for interrupted stimuli compared to the baseline. Regression analysis of response times from the interrupted conditions showed no effect of whether the interrupted phoneme was a uniqueness point, but significant effects for several temporal attributes of the stimuli, including the duration and position of the interrupted segment. These results indicate that uniqueness points are not distinct breakpoints in the cohort reduction that occurs during lexical processing, but that temporal properties of the interrupted stimuli are central to auditory word recognition. These results are interpreted in the context of models of speech perception.
Serial recall, word frequency, and mixed lists: the influence of item arrangement.

PubMed

Miller, Leonie M; Roodenrys, Steven

2012-11-01

Studies of the effect of word frequency in the serial recall task show that lists of high-frequency words are better recalled than lists of low-frequency words; however, when high- and low-frequency words are alternated within a list, there is no difference in the level of recall for the two types of words, and recall is intermediate between lists of pure frequency. This pattern has been argued to arise from the development of a network of activated long-term representations of list items that support the redintegration of all list items in a nondirectional and nonspecific way. More recently, it has been proposed that the frequency effect might be a product of the coarticulation of items at word boundaries and their influence on rehearsal rather than a consequence of memory representations. The current work examines recall performance in mixed lists of an equal number of high- and low-frequency items arranged in contiguous segments (i.e., HHHLLL and LLLHHH), under quiet and articulatory suppression conditions, to test whether the effect is (a) nondirectional and (b) dependent on articulatory processes. These experiments demonstrate that neither explanation is satisfactory, although the results suggest that the effect is mnemonic. A language-based approach to short-term memory is favored with emphasis on the role of speech production processes at output.
The words children hear: Picture books and the statistics for language learning

PubMed Central

Montag, Jessica L.; Jones, Michael N.; Smith, Linda B.

2015-01-01

Young children learn language from the speech they hear. Previous work suggests that the statistical diversity of words and of linguistic contexts is associated with better language outcomes. One potential source of lexical diversity is the text of picture books that caregivers read aloud to children. Many parents begin reading to their children shortly after birth, so this is potentially an important source of linguistic input for many children. We constructed a corpus of 100 children’s picture books and compared word type and token counts to a matched sample of child-directed speech. Overall, the picture books contained more unique word types than the child-directed speech. Further, individual picture books generally contained more unique word types than length-matched, child-directed conversations. The text of picture books may be an important source of vocabulary for young children, and these findings suggest a mechanism that underlies the language benefits associated with reading to children. PMID:26243292
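The type and token comparison described in this abstract can be illustrated with a minimal Python sketch. The tokenizer (lowercasing and a simple letter pattern) and the function name `type_token_counts` are assumptions made for illustration; the abstract does not say how the authors tokenized their corpus.

```python
import re


def type_token_counts(text):
    """Return (number of unique word types, number of word tokens).

    Tokenization here is a simplifying assumption: lowercase the text and
    keep runs of letters and apostrophes.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)), len(tokens)


# Made-up example sentence, not taken from the study's corpus.
types, tokens = type_token_counts("The cat saw the dog. The dog ran!")
print(types, tokens)
```

A sample with more types at the same token count is lexically more diverse, which is the kind of contrast the abstract reports between picture books and length-matched conversations.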
The Words Children Hear: Picture Books and the Statistics for Language Learning.

PubMed

Montag, Jessica L; Jones, Michael N; Smith, Linda B

2015-09-01

Young children learn language from the speech they hear. Previous work suggests that greater statistical diversity of words and of linguistic contexts is associated with better language outcomes. One potential source of lexical diversity is the text of picture books that caregivers read aloud to children. Many parents begin reading to their children shortly after birth, so this is potentially an important source of linguistic input for many children. We constructed a corpus of 100 children's picture books and compared word type and token counts in that sample and a matched sample of child-directed speech. Overall, the picture books contained more unique word types than the child-directed speech. Further, individual picture books generally contained more unique word types than length-matched, child-directed conversations. The text of picture books may be an important source of vocabulary for young children, and these findings suggest a mechanism that underlies the language benefits associated with reading to children. © The Author(s) 2015.
Reading is fundamentally similar across disparate writing systems: A systematic characterization of how words and characters influence eye movements in Chinese reading

PubMed Central

Li, Xingshan; Bicknell, Klinton; Liu, Pingping; Wei, Wei; Rayner, Keith

2013-01-01

While much previous work on reading in languages with alphabetic scripts has suggested that reading is word-based, reading in Chinese has been argued to be less reliant on words. This is primarily because in the Chinese writing system words are not spatially segmented, and characters are themselves complex visual objects. Here, we present a systematic characterization of the effects of a wide range of word and character properties on eye movements in Chinese reading, using a set of mixed-effects regression models. The results reveal a rich pattern of effects of the properties of the current, previous, and next words on a range of reading measures, which is strikingly similar to the pattern of effects of word properties reported in spaced alphabetic languages. This finding provides evidence that reading shares a word-based core and may be fundamentally similar across languages with highly dissimilar scripts. We show that these findings are robust to the inclusion of character properties in the regression models, and are equally reliable when dependent measures are defined in terms of characters rather than words, providing strong evidence that word properties have effects in Chinese reading above and beyond characters. This systematic characterization of the effects of word and character properties in Chinese advances our knowledge of the processes underlying reading and informs the future development of models of reading. More generally, however, this work suggests that differences in script may not alter the fundamental nature of reading. PMID:23834023
The nature of the language input affects brain activation during learning from a natural language

PubMed Central

Plante, Elena; Patterson, Dianne; Gómez, Rebecca; Almryde, Kyle R.; White, Milo G.; Asbjørnsen, Arve E.

2015-01-01

Artificial language studies have demonstrated that learners are able to segment individual word-like units from running speech using the transitional probability information. However, this skill has rarely been examined in the context of natural languages, where stimulus parameters can be quite different. In this study, two groups of English-speaking learners were exposed to Norwegian sentences over the course of three fMRI scans. One group was provided with input in which transitional probabilities predicted the presence of target words in the sentences. This group quickly learned to identify the target words and fMRI data revealed an extensive and highly dynamic learning network. These results were markedly different from activation seen for a second group of participants. This group was provided with highly similar input that was modified so that word learning based on syllable co-occurrences was not possible. These participants showed a much more restricted network. The results demonstrate that the nature of the input strongly influenced the nature of the network that learners employ to learn the properties of words in a natural language. PMID:26257471
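Transitional probability between syllables, the cue manipulated in this study, can be sketched in a few lines of Python. This is an illustrative toy and not the authors' procedure: the syllable stream, the made-up words, the fixed threshold, and the function names are all assumptions.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Forward TP(a -> b) = count of the pair (a, b) / count of a as a pair's first element."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def segment(syllables, threshold):
    """Place a word boundary wherever the forward TP falls below the threshold."""
    if not syllables:
        return []
    tps = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Toy stream built from three made-up words: tupiro, golabu, bidaku.
stream = ("tu pi ro go la bu bi da ku go la bu "
          "tu pi ro bi da ku tu pi ro go la bu").split()
print(segment(stream, threshold=0.9))
```

In this toy stream every within-word transition has a probability of 1.0, while transitions across word boundaries are lower, so a single threshold recovers the words. Natural speech is much noisier, which is part of why the abstract notes that stimulus parameters can differ in natural languages.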
Pauses and Intonational Phrasing: ERP Studies in 5-Month-Old German Infants and Adults

ERIC Educational Resources Information Center

Mannel, Claudia; Friederici, Angela D.

2009-01-01

In language learning, infants are faced with the challenge of decomposing continuous speech into relevant units, such as syntactic clauses and words. Within the framework of prosodic bootstrapping, behavioral studies suggest infants approach this segmentation problem by relying on prosodic information, especially on acoustically marked…
The Utility of Cognitive Plausibility in Language Acquisition Modeling: Evidence from Word Segmentation

ERIC Educational Resources Information Center

Phillips, Lawrence; Pearl, Lisa

2015-01-01

The informativity of a computational model of language acquisition is directly related to how closely it approximates the actual acquisition task, sometimes referred to as the model's "cognitive plausibility." We suggest that though every computational model necessarily idealizes the modeled task, an informative language acquisition…
Pronunciation Instruction through Twitter: The Case of Commonly Mispronounced Words

ERIC Educational Resources Information Center

Fouz-González, Jonás

2017-01-01

This paper presents the results of a study aimed at exploring the possibilities Twitter offers for pronunciation instruction. It investigates the potential of a Twitter-based approach based on explicit instruction and input enhancement techniques to help English as a Foreign Language (EFL) learners improve their pronunciation of segmental and…
Deaf College Students' Mathematical Skills Relative to Morphological Knowledge, Reading Level, and Language Proficiency

ERIC Educational Resources Information Center

Kelly, Ronald R.; Gaustad, Martha G.

2007-01-01

This study of deaf college students examined specific relationships between their mathematics performance and their assessed skills in reading, language, and English morphology. Simple regression analyses showed that deaf college students' language proficiency scores, reading grade level, and morphological knowledge regarding word segmentation and…
Range and Precision of Formant Movement in Pediatric Dysarthria

ERIC Educational Resources Information Center

Allison, Kristen M.; Annear, Lucas; Policicchio, Marisa; Hustad, Katherine C.

2017-01-01

Purpose: This study aimed to improve understanding of speech characteristics associated with dysarthria in children with cerebral palsy by analyzing segmental and global formant measures in single-word and sentence contexts. Method: Ten 5-year-old children with cerebral palsy and dysarthria and 10 age-matched, typically developing children…

Anthropometric and Mass Distribution Characteristics of the Adult Female. Revised

DTIC Science & Technology

1983-09-01

...systems, and development of body prostheses. Key Words: Anthropometry, Anatomical Axis, Body... Distribution Statement: Document is available to the... COLLECTION... The Subjects... Anthropometry... OF TABLES. Table No.: Anthropometry and Mass Distribution Data for the Total Body and Its Segments: 1 Head
Keyword extraction by nonextensivity measure.

PubMed

Mehri, Ali; Darooneh, Amir H

2011-05-01

The presence of a long-range correlation in the spatial distribution of a relevant word type, in spite of random occurrences of an irrelevant word type, is an important feature of human-written texts. We classify the correlation between the occurrences of words by nonextensive statistical mechanics for the word-ranking process. In particular, we look at the nonextensivity parameter as an alternative metric to measure the spatial correlation in the text, from which the words may be ranked in terms of this measure. Finally, we compare different methods for keyword extraction. © 2011 American Physical Society
Exploiting Lexical Ambiguity to Help Students Understand the Meaning of "Random"

ERIC Educational Resources Information Center

Kaplan, Jennifer J.; Rogness, Neal T.; Fisher, Diane G.

2014-01-01

Words that are part of colloquial English but used differently in a technical domain may possess lexical ambiguity. The use of such words by instructors may inhibit student learning if incorrect connections are made by students between the technical and colloquial meanings. One fundamental word in statistics that has lexical ambiguity for students…
76 FR 66875 - Informal Entry Limit and Removal of a Formal Entry Requirement

Federal Register 2010, 2011, 2012, 2013, 2014

2011-10-28

... to properly assess duties on the merchandise and collect accurate statistics with respect to the.... In Sec. 10.1: a. Introductory paragraph (a) is amended by removing the word ``shall'' and adding in... removing the word ``shall'' and adding in its place the word ``must''; m. Introductory paragraph (h)(4) is...
Phonological effects in handwriting production: evidence from the implicit priming paradigm.

PubMed

Afonso, Olivia; Álvarez, Carlos J

2011-11-01

In the present article, we report 3 experiments using the odd-man-out variant of the implicit priming paradigm, aimed at determining the role played by phonological information during the handwriting process. Participants were asked to write a small set of words learned in response to prompts. Within each block, response words could share initial segments (constant homogeneous) or not (heterogeneous). Also, 2 variable homogeneous blocks were created by including a response word that did not share orthographic onset with the other response (odd-man-out). This odd-man-out could be phonologically related to the targets or not. Experiment 1 showed a preparation effect in the constant homogeneous condition, which disappeared (spoil effect) in the variable condition not phonologically related. However, no spoil effect was found when the odd-man-out shared the phonological initial segment with the targets. In Experiment 2, we obtained a spoil effect in the variable phonologically related condition, but it was significantly smaller than in the variable not phonologically related condition. The effects observed in Experiment 2 vanished in Experiment 3 under articulatory suppression, suggesting that they originated at a sublexical level. These findings suggest that phonological sublexical information is used during handwriting and provide evidence that the implicit priming paradigm (and the odd-man-out version of this) is a suitable tool for handwriting production research.
A variational approach to liver segmentation using statistics from multiple sources

NASA Astrophysics Data System (ADS)

Zheng, Shenhai; Fang, Bin; Li, Laquan; Gao, Mingqi; Wang, Yi

2018-01-01

Medical image segmentation plays an important role in digital medical research, and therapy planning and delivery. However, the presence of noise and low contrast renders automatic liver segmentation an extremely challenging task. In this study, we focus on a variational approach to liver segmentation in computed tomography scan volumes in a semiautomatic and slice-by-slice manner. In this method, one slice is selected and its connected component liver region is determined manually to initialize the subsequent automatic segmentation process. From this guiding slice, we execute the proposed method downward to the last one and upward to the first one, respectively. A segmentation energy function is proposed by combining the statistical shape prior, global Gaussian intensity analysis, and enforced local statistical feature under the level set framework. During segmentation, the liver shape is estimated by minimization of this function. The improved Chan-Vese model is used to refine the shape to capture the long and narrow regions of the liver. The proposed method was verified on two independent public databases, the 3D-IRCADb and the SLIVER07. Among all the tested methods, our method yielded the best volumetric overlap error (VOE) of 6.5 +/- 2.8%, the best root mean square symmetric surface distance (RMSD) of 2.1 +/- 0.8 mm, and the best maximum symmetric surface distance (MSD) of 18.9 +/- 8.3 mm in the 3D-IRCADb dataset, and the best average symmetric surface distance (ASD) of 0.8 +/- 0.5 mm and the best RMSD of 1.5 +/- 1.1 mm in the SLIVER07 dataset. The results of the quantitative comparison show that the proposed liver segmentation method achieves competitive segmentation performance with state-of-the-art techniques.
Interrupted Time Series Versus Statistical Process Control in Quality Improvement Projects.

PubMed

Andersson Hagiwara, Magnus; Andersson Gäre, Boel; Elg, Mattias

2016-01-01

To measure the effect of quality improvement interventions, it is appropriate to use analysis methods that measure data over time. Examples of such methods include statistical process control analysis and interrupted time series with segmented regression analysis. This article compares the use of statistical process control analysis and interrupted time series with segmented regression analysis for evaluating the longitudinal effects of quality improvement interventions, using an example study on an evaluation of a computerized decision support system.
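Segmented regression for an interrupted time series is commonly written as y_t = b0 + b1*t + b2*post_t + b3*(t - t0)*post_t, where post_t is 1 from the intervention time t0 onward. The sketch below fits that model by ordinary least squares in plain Python. It is a generic illustration, not the analysis from the article: the data are synthetic and noiseless, and the function names are assumptions.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def segmented_regression(y, intervention_index):
    """Return [b0, b1, b2, b3]: baseline level, baseline slope,
    level change at the intervention, and slope change after it."""
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= intervention_index else 0.0
        rows.append([1.0, float(t), post, (t - intervention_index) * post])
    k = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic series: level 10, slope 0.5, then a jump of 3 and extra slope 1 at t = 6.
series = [10 + 0.5 * t for t in range(6)]
series += [10 + 0.5 * t + 3 + 1.0 * (t - 6) for t in range(6, 12)]
print(segmented_regression(series, intervention_index=6))
```

The coefficients b2 and b3 are the quantities of interest in a quality improvement evaluation: an immediate shift in level and a change in trend after the intervention. A real analysis would also need to handle noise and autocorrelation, which this sketch ignores.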
Financial Stylized Facts in the Word of Mouth Model

NASA Astrophysics Data System (ADS)

Misawa, Tadanobu; Watanabe, Kyoko; Shimokawa, Tetsuya

Recently, we proposed an agent-based model, the word of mouth model, to analyze the influence of the information transmission process on price formation in financial markets. That work focused on the short-term predictability of asset returns and offered an explanation, in terms of information transmission, of why predictability is observed more clearly in small-sized stocks. Extending the previous study, this paper demonstrates that the word of mouth model is also consistent with other important financial stylized facts. This strengthens the possibility that information transmission among investors plays a crucial role in price formation. Specifically, this paper addresses two well-known statistical features of returns: the leptokurtic distribution of returns and the autocorrelation of return volatility. Among financial stylized facts, these two receive particular attention from researchers because of their statistical robustness and their practical importance, for example in derivative pricing problems.
Supervised variational model with statistical inference and its application in medical image segmentation.

PubMed

Li, Changyang; Wang, Xiuying; Eberl, Stefan; Fulham, Michael; Yin, Yong; Dagan Feng, David

2015-01-01

Automated and general medical image segmentation can be challenging because the foreground and the background may have complicated and overlapping density distributions in medical imaging. Conventional region-based level set algorithms often assume piecewise constant or piecewise smooth for segments, which are implausible for general medical image segmentation. Furthermore, low contrast and noise make identification of the boundaries between foreground and background difficult for edge-based level set algorithms. Thus, to address these problems, we suggest a supervised variational level set segmentation model to harness the statistical region energy functional with a weighted probability approximation. Our approach models the region density distributions by using the mixture-of-mixtures Gaussian model to better approximate real intensity distributions and distinguish statistical intensity differences between foreground and background. The region-based statistical model in our algorithm can intuitively provide better performance on noisy images. We constructed a weighted probability map on graphs to incorporate spatial indications from user input with a contextual constraint based on the minimization of contextual graphs energy functional. We measured the performance of our approach on ten noisy synthetic images and 58 medical datasets with heterogeneous intensities and ill-defined boundaries and compared our technique to the Chan-Vese region-based level set model, the geodesic active contour model with distance regularization, and the random walker model. Our method consistently achieved the highest Dice similarity coefficient when compared to the other methods.
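The Dice similarity coefficient used for the comparison above is defined as 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. A minimal Python sketch follows; the nested-list mask representation, the function name, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as nested lists."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Two empty masks agree perfectly; this convention is an assumption.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Made-up 2 x 3 masks, not data from the study.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
print(dice_coefficient(predicted, reference))
```

A value of 1 means the two masks coincide and 0 means they do not overlap at all.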
Do preschool children learn to read words from environmental prints?

PubMed

Zhao, Jing; Zhao, Pei; Weng, Xuchu; Li, Su

2014-01-01

Parents and teachers worldwide believe that a visual environment rich with print can contribute to young children's literacy. Children seem to recognize words in familiar logos at an early age. However, most previous studies were carried out with alphabetic scripts. Alphabetic letters regularly correspond to phonological segments in a word and provide strong cues about the identity of the whole word. Thus it was not clear whether children can learn to read words by extracting visual word form information from environmental prints. To exclude the phonological-cue confound, this study tested children's knowledge of Chinese words embedded in familiar logos. Four environmental logos were employed and transformed into four versions with the contextual cues (i.e., something apart from the presentation of the words themselves in logo format like the color, logo and font type cues) gradually minimized. Children aged from 3 to 5 were tested. We observed that children of different ages all performed better when words were presented in highly familiar logos compared to when they were presented in a plain fashion, devoid of context. This advantage for familiar logos was also present when the contextual information was only partial. However, the role of various cues in learning words changed with age. The color and logo cues had a larger effect in 3- and 4-year-olds than in 5-year-olds, while the font type cue played a greater role in 5-year-olds than in the other two groups. Our findings demonstrated that young children did not easily learn words by extracting their visual form information even from familiar environmental prints. However, children aged 5 begin to pay more attention to the visual form information of words in highly familiar logos than those aged 3 and 4.
Do Preschool Children Learn to Read Words from Environmental Prints?

PubMed Central

Zhao, Jing; Zhao, Pei; Weng, Xuchu; Li, Su

2014-01-01

Parents and teachers worldwide believe that a visual environment rich with print can contribute to young children's literacy. Children seem to recognize words in familiar logos at an early age. However, most previous studies were carried out with alphabetic scripts. Alphabetic letters regularly correspond to phonological segments in a word and provide strong cues about the identity of the whole word. Thus it was not clear whether children can learn to read words by extracting visual word form information from environmental prints. To exclude the phonological-cue confound, this study tested children's knowledge of Chinese words embedded in familiar logos. Four environmental logos were employed and transformed into four versions with the contextual cues (i.e., something apart from the presentation of the words themselves in logo format like the color, logo and font type cues) gradually minimized. Children aged from 3 to 5 were tested. We observed that children of different ages all performed better when words were presented in highly familiar logos compared to when they were presented in a plain fashion, devoid of context. This advantage for familiar logos was also present when the contextual information was only partial. However, the role of various cues in learning words changed with age. The color and logo cues had a larger effect in 3- and 4-year-olds than in 5-year-olds, while the font type cue played a greater role in 5-year-olds than in the other two groups. Our findings demonstrated that young children did not easily learn words by extracting their visual form information even from familiar environmental prints. However, children aged 5 begin to pay more attention to the visual form information of words in highly familiar logos than those aged 3 and 4. PMID:24465677
[Intervention in dyslexic disorders: phonological awareness training].

PubMed

Etchepareborda, M C

2003-02-01

Taking into account the systems for the treatment of brain information when drawing up a work plan allows us to recreate processing routines that go from multisensory perception to motor, oral and cognitive production, which is the step prior to executive levels of thought, bottom-up and top-down processing systems. In recent years, the use of phonological methods to prevent or resolve reading disorders has become the fundamental mainstay in the treatment of dyslexia. The work is mainly based on phonological proficiency, which enables the patient to detect phonemes (input), to think about them (performance) and to use them to build words (output). Daily work with rhymes, the capacity to listen, the identification of phrases and words, and handling syllables and phonemes allows us to perform a preventive intervention that enhances the capacity to identify letters, phonological analysis and the reading of single words. We present the different therapeutic models that are most frequently employed. Fast For Word (FFW) training helps make progress in phonematic awareness and other linguistic skills, such as phonological awareness, semantics, syntax, grammar, working memory and event sequencing. With Deco-Fon, a programme for training phonological decoding, work is carried out on the auditory discrimination of pure tones, letters and consonant clusters, auditory processing speed, auditory and phonematic memory, and graphophonological processing, which is fundamental for speech, language and reading writing disorders. Hamlet is a programme based on categorisation activities for working on phonological conceptualisation. It attempts to encourage the analysis of the segments of words, syllables or phonemes, and the classification of a certain segment as belonging or not to a particular phonological or orthographical category. Therapeutic approaches in the early phases of reading are oriented towards two poles based on the basic mechanisms underlying the process of learning to read, the grapheme phoneme transformation process and global word recognition. The interventionalist strategies used at school are focused on the use of cognitive strategy techniques. The purpose of these techniques is to teach pupils practical strategies or resources aimed at overcoming specific deficiencies.
Information properties of morphologically complex words modulate brain activity during word reading

PubMed Central

Hultén, Annika; Lehtonen, Minna; Lagus, Krista; Salmelin, Riitta

2018-01-01

Neuroimaging studies of the reading process point to functionally distinct stages in word recognition. Yet, current understanding of the operations linked to those various stages is mainly descriptive in nature. Approaches developed in the field of computational linguistics may offer a more quantitative approach for understanding brain dynamics. Our aim was to evaluate whether a statistical model of morphology, with well‐defined computational principles, can capture the neural dynamics of reading, using the concept of surprisal from information theory as the common measure. The Morfessor model, created for unsupervised discovery of morphemes, is based on the minimum description length principle and attempts to find optimal units of representation for complex words. In a word recognition task, we correlated brain responses to word surprisal values derived from Morfessor and from other psycholinguistic variables that have been linked with various levels of linguistic abstraction. The magnetoencephalography data analysis focused on spatially, temporally and functionally distinct components of cortical activation observed in reading tasks. The early occipital and occipito‐temporal responses were correlated with parameters relating to visual complexity and orthographic properties, whereas the later bilateral superior temporal activation was correlated with whole‐word based and morphological models. The results show that the word processing costs estimated by the statistical Morfessor model are relevant for brain dynamics of reading during late processing stages. PMID:29524274
Information properties of morphologically complex words modulate brain activity during word reading.

PubMed

Hakala, Tero; Hultén, Annika; Lehtonen, Minna; Lagus, Krista; Salmelin, Riitta

2018-06-01

Neuroimaging studies of the reading process point to functionally distinct stages in word recognition. Yet, current understanding of the operations linked to those various stages is mainly descriptive in nature. Approaches developed in the field of computational linguistics may offer a more quantitative approach for understanding brain dynamics. Our aim was to evaluate whether a statistical model of morphology, with well-defined computational principles, can capture the neural dynamics of reading, using the concept of surprisal from information theory as the common measure. The Morfessor model, created for unsupervised discovery of morphemes, is based on the minimum description length principle and attempts to find optimal units of representation for complex words. In a word recognition task, we correlated brain responses to word surprisal values derived from Morfessor and from other psycholinguistic variables that have been linked with various levels of linguistic abstraction. The magnetoencephalography data analysis focused on spatially, temporally and functionally distinct components of cortical activation observed in reading tasks. The early occipital and occipito-temporal responses were correlated with parameters relating to visual complexity and orthographic properties, whereas the later bilateral superior temporal activation was correlated with whole-word based and morphological models. The results show that the word processing costs estimated by the statistical Morfessor model are relevant for brain dynamics of reading during late processing stages. © 2018 The Authors Human Brain Mapping Published by Wiley Periodicals, Inc.
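Surprisal, the common measure named in this abstract, is the negative log probability of an item: surprisal(w) = -log2 P(w), measured in bits. The Python sketch below computes it under a plain unigram relative-frequency estimate. This is an assumption chosen for simplicity and is not how Morfessor derives its probabilities; the counts and word labels are hypothetical.

```python
import math
from collections import Counter


def surprisal_bits(word, counts):
    """Surprisal -log2 P(word), with P estimated as relative frequency in counts."""
    total = sum(counts.values())
    if counts[word] == 0:
        raise ValueError("word has zero count; its surprisal is unbounded")
    return -math.log2(counts[word] / total)


# Hypothetical frequency counts for four word forms (total = 8).
counts = Counter({"w1": 4, "w2": 2, "w3": 1, "w4": 1})
for w in ("w1", "w2", "w3"):
    print(w, surprisal_bits(w, counts))
```

Rarer items carry higher surprisal, which is why surprisal can serve as an estimate of processing cost that is comparable across different statistical models.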
Modeling envelope statistics of blood and myocardium for segmentation of echocardiographic images.

PubMed

Nillesen, Maartje M; Lopata, Richard G P; Gerrits, Inge H; Kapusta, Livia; Thijssen, Johan M; de Korte, Chris L

2008-04-01

The objective of this study was to investigate the use of speckle statistics as a preprocessing step for segmentation of the myocardium in echocardiographic images. Three-dimensional (3D) and biplane image sequences of the left ventricle of two healthy children and one dog (beagle) were acquired. Pixel-based speckle statistics of manually segmented blood and myocardial regions were investigated by fitting various probability density functions (pdf). The statistics of heart muscle and blood could both be optimally modeled by a K-pdf or Gamma-pdf (Kolmogorov-Smirnov goodness-of-fit test). Scale and shape parameters of both distributions could differentiate between blood and myocardium. Local estimation of these parameters was used to obtain parametric images, where window size was related to speckle size (5 x 2 speckles). Moment-based and maximum-likelihood estimators were used. Scale parameters were still able to differentiate blood from myocardium; however, smoothing of edges of anatomical structures occurred. Estimation of the shape parameter required a larger window size, leading to unacceptable blurring. Using these parameters as an input for segmentation resulted in unreliable segmentation. Adaptive mean squares filtering was then introduced using the moment-based scale parameter (σ²/μ) of the Gamma-pdf to automatically steer the two-dimensional (2D) local filtering process. This method adequately preserved sharpness of the edges. In conclusion, a trade-off between preservation of sharpness of edges and goodness-of-fit when estimating local shape and scale parameters is evident for parametric images. For this reason, adaptive filtering outperforms parametric imaging for the segmentation of echocardiographic images.
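The moment-based scale parameter named in this abstract (variance divided by mean) follows from the Gamma distribution's moments: with shape k and scale θ, the mean is kθ and the variance is kθ², so θ = variance/mean and k = mean²/variance. A minimal sketch of those estimators is given below; the function name `gamma_moment_estimates` and the sample values are assumptions for illustration, and the study's windowed, image-based estimation is not reproduced here.

```python
from statistics import mean, pvariance


def gamma_moment_estimates(samples):
    """Method-of-moments estimates (shape, scale) for a Gamma distribution."""
    mu = mean(samples)
    var = pvariance(samples, mu)  # population variance around mu
    if mu <= 0 or var <= 0:
        raise ValueError("samples need a positive mean and non-zero variance")
    scale = var / mu        # theta = sigma^2 / mu
    shape = mu * mu / var   # k = mu^2 / sigma^2
    return shape, scale


# Hypothetical envelope amplitudes from one local window.
window = [1.0, 2.0, 3.0, 4.0, 5.0]
shape, scale = gamma_moment_estimates(window)
```

The product of the two estimates recovers the sample mean, which is a quick sanity check on the estimator.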
The role of interword spacing in reading Japanese: an eye movement study.

PubMed

Sainio, Miia; Hyönä, Jukka; Bingushi, Kazuo; Bertram, Raymond

2007-09-01

The present study investigated the role of interword spacing in a naturally unspaced language, Japanese. Eye movements were registered of native Japanese readers reading pure Hiragana (syllabic) and mixed Kanji-Hiragana (ideographic and syllabic) text in spaced and unspaced conditions. Interword spacing facilitated both word identification and eye guidance when reading syllabic script, but not when the script contained ideographic characters. We conclude that in reading Hiragana interword spacing serves as an effective segmentation cue. In contrast, spacing information in mixed Kanji-Hiragana text is redundant, since the visually salient Kanji characters serve as effective segmentation cues by themselves.
The span of correlations in dolphin whistle sequences

NASA Astrophysics Data System (ADS)

Ferrer-i-Cancho, Ramon; McCowan, Brenda

2012-06-01

Long-range correlations are found in symbolic sequences from human language, music and DNA. Determining the span of correlations in dolphin whistle sequences is crucial for shedding light on their communicative complexity. Dolphin whistles share various statistical properties with human words, i.e. Zipf's law for word frequencies (namely that the probability of the ith most frequent word of a text is about i^(-α)) and a parallel of the tendency of more frequent words to have more meanings. The finding of Zipf's law for word frequencies in dolphin whistles has been the topic of an intense debate on its implications. One of the major arguments against the relevance of Zipf's law in dolphin whistles is that it is not possible to distinguish the outcome of a die-rolling experiment from that of a linguistic or communicative source producing Zipf's law for word frequencies. Here we show that statistically significant whistle-whistle correlations extend back to the second previous whistle in the sequence, using a global randomization test, and to the fourth previous whistle, using a local randomization test. None of these correlations are expected by a die-rolling experiment and other simple explanations of Zipf's law for word frequencies, such as Simon's model, that produce sequences of unpredictable elements.
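Zipf's law as stated in this abstract says that the probability of the ith most frequent word is roughly proportional to i raised to the power -α. The sketch below builds such a distribution over a finite vocabulary. The function name `zipf_probabilities`, the vocabulary size, and the choice α = 1 are assumptions for illustration; the randomization tests used in the study are not shown.

```python
def zipf_probabilities(n_types, alpha=1.0):
    """Normalized Zipfian probabilities for ranks 1..n_types."""
    if n_types < 1:
        raise ValueError("n_types must be at least 1")
    weights = [rank ** -alpha for rank in range(1, n_types + 1)]
    norm = sum(weights)
    return [weight / norm for weight in weights]


# With alpha = 1, the top-ranked type is twice as probable as the second.
probs = zipf_probabilities(4, alpha=1.0)
```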
Printed Arabic optical character segmentation

NASA Astrophysics Data System (ADS)

Mohammad, Khader; Ayyesh, Muna; Qaroush, Aziz; Tumar, Iyad

2015-03-01

Considerable progress has been achieved in recognition techniques for many non-Arabic characters. In contrast, little effort has been devoted to research on Arabic characters. In any Optical Character Recognition (OCR) system, segmentation is usually the essential stage, to which an extensive portion of processing is devoted and a considerable share of recognition errors is attributed. In this research, a novel segmentation approach for machine-printed Arabic text with diacritics is proposed. The proposed method reduces computation and errors, gives a clear description of the sub-word, and has advantages over the skeleton approach, in which the data and information of the character can be lost. Both initial evaluation and testing of the proposed method were carried out using MATLAB and show promising results of 98.7%.
Evolution of semilocal string networks. II. Velocity estimators

NASA Astrophysics Data System (ADS)

Lopez-Eiguren, A.; Urrestilla, J.; Achúcarro, A.; Avgoustidis, A.; Martins, C. J. A. P.

2017-07-01

We continue a comprehensive numerical study of semilocal string networks and their cosmological evolution. These can be thought of as hybrid networks comprised of (nontopological) string segments, whose core structure is similar to that of Abelian Higgs vortices, and whose ends have long-range interactions and behavior similar to that of global monopoles. Our study provides further evidence of a linear scaling regime, already reported in previous studies, for the typical length scale and velocity of the network. We introduce a new algorithm to identify the position of the segment cores. This allows us to determine the length and velocity of each individual segment and follow their evolution in time. We study the statistical distribution of segment lengths and velocities for radiation- and matter-dominated evolution in the regime where the strings are stable. Our segment detection algorithm gives higher length values than previous studies based on indirect detection methods. The statistical distribution shows no evidence of (anti)correlation between the speed and the length of the segments.
Color Image Segmentation Based on Statistics of Location and Feature Similarity

NASA Astrophysics Data System (ADS)

Mori, Fumihiko; Yamada, Hiromitsu; Mizuno, Makoto; Sugano, Naotoshi

The process of “image segmentation and extracting remarkable regions” is an important research subject in image understanding. However, algorithms based on global features are rarely found. A requisite of such an image segmentation algorithm is to reduce over-segmentation and over-unification as much as possible. We developed an algorithm that uses a density-based multidimensional convex hull as the global feature. Concretely, we propose a new algorithm in which regions are expanded according to region statistics, such as the mean, standard deviation, maximum and minimum of pixel location, brightness and color elements, and the statistics are updated. We also introduced a new concept of conspicuity degree and applied it to 21 varied images to examine its effectiveness. The remarkable object regions extracted by the presented system coincided closely with those pointed out by the sixty-four subjects who took part in the psychological experiment.

Online neural monitoring of statistical learning

PubMed Central

Batterink, Laura J.; Paller, Ken A.

2017-01-01

The extraction of patterns in the environment plays a critical role in many types of human learning, from motor skills to language acquisition. This process is known as statistical learning. Here we propose that statistical learning has two dissociable components: (1) perceptual binding of individual stimulus units into integrated composites and (2) storing those integrated representations for later use. Statistical learning is typically assessed using post-learning tasks, such that the two components are conflated. Our goal was to characterize the online perceptual component of statistical learning. Participants were exposed to a structured stream of repeating trisyllabic nonsense words and a random syllable stream. Online learning was indexed by an EEG-based measure that quantified neural entrainment at the frequency of the repeating words relative to that of individual syllables. Statistical learning was subsequently assessed using conventional measures in an explicit rating task and a reaction-time task. In the structured stream, neural entrainment to trisyllabic words was higher than in the random stream, increased as a function of exposure to track the progression of learning, and predicted performance on the RT task. These results demonstrate that monitoring this critical component of learning via rhythmic EEG entrainment reveals a gradual acquisition of knowledge whereby novel stimulus sequences are transformed into familiar composites. This online perceptual transformation is a critical component of learning. PMID:28324696
Masked Morphological Priming in German-Speaking Adults and Children: Evidence from Response Time Distributions

PubMed Central

Hasenäcker, Jana; Beyersmann, Elisabeth; Schroeder, Sascha

2016-01-01

In this study, we looked at masked morphological priming effects in German children and adults beyond mean response times by taking into account response time distributions. We conducted an experiment comparing suffixed word primes (kleidchen-KLEID), suffixed nonword primes (kleidtum-KLEID), nonsuffixed nonword primes (kleidekt-KLEID), and unrelated controls (träumerei-KLEID). The pattern of priming in adults showed facilitation from suffixed words, suffixed nonwords, and nonsuffixed nonwords relative to unrelated controls, and from both suffixed conditions relative to nonsuffixed nonwords, thus providing evidence for morpho-orthographic and embedded stem priming. Children also showed facilitation from real suffixed words, suffixed nonwords, and nonsuffixed nonwords compared to unrelated words, but no difference between the suffixed and nonsuffixed conditions, thus suggesting that German elementary school children do not make use of morpho-orthographic segmentation. Interestingly, for all priming effects, a shift of the response time distribution was observed. Consequences for theories of morphological processing are discussed. PMID:27445899
What You Learn is What You See: Using Eye Movements to Study Infant Cross-Situational Word Learning

PubMed Central

Smith, Linda

2016-01-01

Recent studies show that both adults and young children possess powerful statistical learning capabilities to solve the word-to-world mapping problem. However, the underlying mechanisms that make statistical learning possible and powerful are not yet known. With the goal of providing new insights into this issue, the research reported in this paper used an eye tracker to record the moment-by-moment eye movement data of 14-month-old babies in statistical learning tasks. Various measures are applied to such fine-grained temporal data, such as looking duration and shift rate (the number of shifts in gaze from one visual object to the other) trial by trial, showing different eye movement patterns between strong and weak statistical learners. Moreover, an information-theoretic measure is developed and applied to gaze data to quantify the degree of learning uncertainty trial by trial. Next, a simple associative statistical learning model is applied to eye movement data and these simulation results are compared with empirical results from young children, showing strong correlations between these two. This suggests that an associative learning mechanism with selective attention can provide a cognitively plausible model of cross-situational statistical learning. The work represents the first steps to use eye movement data to infer underlying real-time processes in statistical word learning. PMID:22213894
Distinguishing Man from Molecules: The Distinctiveness of Medical Concepts at Different Levels of Description

PubMed Central

Cole, William G.; Michael, Patricia; Blois, Marsden S.

1987-01-01

A computer program was created to use information about the statistical distribution of words in journal abstracts to make probabilistic judgments about the level of description (e.g. molecular, cell, organ) of medical text. Statistical analysis of 7,409 journal abstracts taken from three medical journals representing distinct levels of description revealed that many medical words seem to be highly specific to one or another level of description. For example, the word adrenoreceptors occurred only in the American Journal of Physiology, never in Journal of Biological Chemistry or in Journal of American Medical Association. Such highly specific words occurred so frequently that the automatic classification program was able to classify correctly 45 out of 45 test abstracts, with 100% confidence. These findings are interpreted in terms of both a theory of the structure of medical knowledge and the pragmatics of automatic classification.
New auto-segment method of cerebral hemorrhage

NASA Astrophysics Data System (ADS)

Wang, Weijiang; Shen, Tingzhi; Dang, Hua

2007-12-01

A novel method for automatic segmentation of computerized tomography (CT) cerebral hemorrhage (CH) images is presented in this paper. It uses an expert system that models human knowledge about the CH automatic segmentation problem. The algorithm adopts a series of special steps and extracts some easily ignored CH features that can be found from statistics over a large number of real CH images, such as region area, region CT number, region smoothness and some statistical relationships between CH regions. A seven-step extraction mechanism ensures that these CH features are obtained correctly and efficiently. Using these CH features, a decision tree that models human knowledge about the CH automatic segmentation problem is built, which ensures the rationality and accuracy of the algorithm. Finally, experiments were carried out to verify the correctness and reasonableness of the automatic segmentation; the good accuracy and high speed make wide practical application possible.
Pharmacological and histological examinations of regional differences of guinea-pig lung: a role of pleural surface smooth muscle in lung strip contraction.

PubMed Central

Wong, W. S.; Bloomquist, S. L.; Bendele, A. M.; Fleisch, J. H.

1992-01-01

1. Parenchymal lung strip preparations have been widely used as an in vitro model of peripheral airway smooth muscle. The present study examined functional responses of 4 consecutive guinea-pig lung parenchymal strips isolated from the central region (segment 1) to the distal edge (segment 4) of the lower lung lobe. The middle two segments were designated as segments 2 and 3. 2. Lung segments 1 and 4 exhibited significantly greater contraction than the other 2 segments to KCl when responses were expressed as mg force per mg tissue weight. Contractile responses to bronchospastic agents including histamine, carbachol, endothelin-1, leukotrienes (LT) B4 and D4, and the thromboxane A2-mimetic U46619 demonstrated no significant difference in EC50 values among the 4 lung segments. 3. Contractile responses of segments 1 and 4 to antigen-challenge (ovalbumin), ionophore A23187 and substance P were significantly greater than the other 2 segments with respect to either sensitivity or maximum responsiveness. 4. U46619-induced contractions of the 4 lung segments were relaxed in similar manner by papaverine and theophylline up to 100%, salbutamol up to 80%, and sodium nitroprusside by only 20%. In contrast, sodium nitroprusside markedly reversed U46619-induced contraction of pulmonary arterial rings and bronchial rings. 5. Histological studies identified 2-4 layers of smooth muscle cells underlying the lung pleural surface. Mast cells were prominent in this area. Moreover, morphometric studies showed that segment 4 possessed the least amount of smooth muscle structures from bronchial/bronchiolar wall and vasculatures as compared to the other 3 segments, and a significant difference in this respect was evident between segment 1 and segment 4.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:1378341
Statistical Laws Governing Fluctuations in Word Use from Word Birth to Word Death

PubMed Central

Petersen, Alexander M.; Tenenbaum, Joel; Havlin, Shlomo; Stanley, H. Eugene

2012-01-01

We analyze the dynamic properties of 10^7 words recorded in English, Spanish and Hebrew over the period 1800–2008 in order to gain insight into the coevolution of language and culture. We report language independent patterns useful as benchmarks for theoretical models of language evolution. A significantly decreasing (increasing) trend in the birth (death) rate of words indicates a recent shift in the selection laws governing word use. For new words, we observe a peak in the growth-rate fluctuations around 40 years after introduction, consistent with the typical entry time into standard dictionaries and the human generational timescale. Pronounced changes in the dynamics of language during periods of war show that word correlations, occurring across time and between words, are largely influenced by coevolutionary social, technological, and political factors. We quantify cultural memory by analyzing the long-term correlations in the use of individual words using detrended fluctuation analysis. PMID:22423321
Statistical Laws Governing Fluctuations in Word Use from Word Birth to Word Death

NASA Astrophysics Data System (ADS)

Petersen, Alexander M.; Tenenbaum, Joel; Havlin, Shlomo; Stanley, H. Eugene

2012-03-01

We analyze the dynamic properties of 10^7 words recorded in English, Spanish and Hebrew over the period 1800-2008 in order to gain insight into the coevolution of language and culture. We report language independent patterns useful as benchmarks for theoretical models of language evolution. A significantly decreasing (increasing) trend in the birth (death) rate of words indicates a recent shift in the selection laws governing word use. For new words, we observe a peak in the growth-rate fluctuations around 40 years after introduction, consistent with the typical entry time into standard dictionaries and the human generational timescale. Pronounced changes in the dynamics of language during periods of war show that word correlations, occurring across time and between words, are largely influenced by coevolutionary social, technological, and political factors. We quantify cultural memory by analyzing the long-term correlations in the use of individual words using detrended fluctuation analysis.
Degraded Chinese rubbing images thresholding based on local first-order statistics

NASA Astrophysics Data System (ADS)

Wang, Fang; Hou, Ling-Ying; Huang, Han

2017-06-01

Segmenting Chinese characters from degraded document images is a necessary step in optical character recognition (OCR); however, it is challenging because of the various kinds of noise in such images. In this paper, we present three adaptive thresholding methods based on local first-order statistics for separating text from non-text in Chinese rubbing images. The segmentation results were evaluated by both visual inspection and numerical measures. In experiments, the methods obtained better results than classical techniques in the binarization of real Chinese rubbing images and the PHIBD 2012 dataset.
Optimal choice of word length when comparing two Markov sequences using a χ²-statistic.

PubMed

Bai, Xin; Tang, Kujin; Ren, Jie; Waterman, Michael; Sun, Fengzhu

2017-10-03

Alignment-free sequence comparison using counts of word patterns (grams, k-tuples) has become an active research topic due to the large amount of sequence data from the new sequencing technologies. Genome sequences are frequently modelled by Markov chains and the likelihood ratio test or the corresponding approximate χ²-statistic has been suggested to compare two sequences. However, it is not known how to best choose the word length k in such studies. We develop an optimal strategy to choose k by maximizing the statistical power of detecting differences between two sequences. Let the orders of the Markov chains for the two sequences be r1 and r2, respectively. We show through both simulations and theoretical studies that the optimal k = max(r1, r2) + 1 for both long sequences and next generation sequencing (NGS) read data. The orders of the Markov chains may be unknown and several methods have been developed to estimate the orders of Markov chains based on both long sequences and NGS reads. We study the power loss of the statistics when the estimated orders are used. It is shown that the power loss is minimal for some of the estimators of the orders of Markov chains. Our studies provide guidelines on choosing the optimal word length for the comparison of Markov sequences.
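The word-length rule reported in this abstract (one more than the larger of the two Markov orders) and the k-tuple counts it applies to can be sketched in a few lines. The function names `optimal_word_length` and `count_words` and the toy sequence are assumptions for illustration; the χ²-statistic and the order estimators discussed in the abstract are not implemented here.

```python
from collections import Counter


def optimal_word_length(order_1, order_2):
    """Word length k = max(r1, r2) + 1 for two Markov chain orders."""
    if order_1 < 0 or order_2 < 0:
        raise ValueError("Markov orders must be non-negative")
    return max(order_1, order_2) + 1


def count_words(sequence, k):
    """Count overlapping k-tuples (words of length k) in a sequence."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Hypothetical example: chains of order 1 and 2 give k = 3.
k = optimal_word_length(1, 2)
word_counts = count_words("ACGTACG", k)
```

A sequence of length n yields n - k + 1 overlapping words, so the toy sequence above contributes five 3-tuples.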
Automatic Classification of Medical Text: The Influence of Publication Form

PubMed Central

Cole, William G.; Michael, Patricia A.; Stewart, James G.; Blois, Marsden S.

1988-01-01

Previous research has shown that within the domain of medical journal abstracts the statistical distribution of words is neither random nor uniform, but is highly characteristic. Many words are used mainly or solely by one medical specialty or when writing about one particular level of description. Due to this regularity of usage, automatic classification within journal abstracts has proved quite successful. The present research asks two further questions. It investigates whether this statistical regularity and automatic classification success can also be achieved in medical textbook chapters. It then goes on to see whether the statistical distribution found in textbooks is sufficiently similar to that found in abstracts to permit accurate classification of abstracts based solely on previous knowledge of textbooks. 14 textbook chapters and 45 MEDLINE abstracts were submitted to an automatic classification program that had been trained only on chapters drawn from a standard textbook series. Statistical analysis of the properties of abstracts vs. chapters revealed important differences in word use. Automatic classification performance was good for chapters, but poor for abstracts.
Processing voiceless vowels in Japanese: Effects of language-specific phonological knowledge

NASA Astrophysics Data System (ADS)

Ogasawara, Naomi

2005-04-01

There has been little research on processing allophonic variation in the field of psycholinguistics. This study focuses on processing the voiced/voiceless allophonic alternation of high vowels in Japanese. Three perception experiments were conducted to explore how listeners parse out vowels with the voicing alternation from other segments in the speech stream and how the different voicing statuses of the vowel affect listeners' word recognition process. The results from the three experiments show that listeners use phonological knowledge of their native language for phoneme processing and for word recognition. However, interactions of the phonological and acoustic effects are observed to be different in each process. The facilitatory phonological effect and the inhibitory acoustic effect cancel out one another in phoneme processing; while in word recognition, the facilitatory phonological effect overrides the inhibitory acoustic effect.
Dissociated repetition deficits in aphasia can reflect flexible interactions between left dorsal and ventral streams and gender-dimorphic architecture of the right dorsal stream

PubMed Central

Berthier, Marcelo L.; Froudist Walsh, Seán; Dávila, Guadalupe; Nabrozidis, Alejandro; Juárez y Ruiz de Mier, Rocío; Gutiérrez, Antonio; De-Torres, Irene; Ruiz-Cruces, Rafael; Alfaro, Francisco; García-Casares, Natalia

2013-01-01

Assessment of brain-damaged subjects presenting with dissociated repetition deficits after selective injury to either the left dorsal or ventral auditory pathways can provide further insight on their respective roles in verbal repetition. We evaluated repetition performance and its neural correlates using multimodal imaging (anatomical MRI, DTI, fMRI, and 18FDG-PET) in a female patient with transcortical motor aphasia (TCMA) and in a male patient with conduction aphasia (CA) who had small contiguous but non-overlapping left perisylvian infarctions. Repetition in the TCMA patient was fully preserved except for a mild impairment in nonwords and digits, whereas the CA patient had impaired repetition of nonwords, digits and word triplet lists. Sentence repetition was impaired, but he repeated novel sentences significantly better than clichés. The TCMA patient had tissue damage and reduced metabolism in the left sensorimotor cortex and insula. DTI showed damage to the left temporo-frontal and parieto-frontal segments of the arcuate fasciculus (AF) and part of the left ventral stream together with well-developed right dorsal and ventral streams, as has been reported in more than one-third of females. The CA patient had tissue damage and reduced metabolic activity in the left temporoparietal cortex with additional metabolic decrements in the left frontal lobe. DTI showed damage to the left temporo-parietal and temporo-frontal segments of the AF, but the ventral stream was spared. The direct segment of the AF in the right hemisphere was also absent with only vestigial remains of the other dorsal subcomponents present, as is often found in males. fMRI during word and nonword repetition revealed bilateral perisylvian activation in the TCMA patient suggesting recruitment of spared segments of the left dorsal stream and right dorsal stream with propagation of signals to temporal lobe structures suggesting a compensatory reallocation of resources via the ventral streams. 
The CA patient showed a greater activation of these cortical areas than the TCMA patient, but these changes did not result in normal performance. Repetition of word triplet lists activated bilateral perisylvian cortices in both patients, but activation in the CA patient with very poor performance was restricted to small frontal and posterior temporal foci bilaterally. These findings suggest that dissociated repetition deficits in our cases are probably reliant on flexible interactions between left dorsal stream (spared segments, short tracts remains) and left ventral stream and on gender-dimorphic architecture of the right dorsal stream. PMID:24391569
Dissociated repetition deficits in aphasia can reflect flexible interactions between left dorsal and ventral streams and gender-dimorphic architecture of the right dorsal stream.

PubMed

Berthier, Marcelo L; Froudist Walsh, Seán; Dávila, Guadalupe; Nabrozidis, Alejandro; Juárez Y Ruiz de Mier, Rocío; Gutiérrez, Antonio; De-Torres, Irene; Ruiz-Cruces, Rafael; Alfaro, Francisco; García-Casares, Natalia

2013-01-01

Assessment of brain-damaged subjects presenting with dissociated repetition deficits after selective injury to either the left dorsal or ventral auditory pathways can provide further insight on their respective roles in verbal repetition. We evaluated repetition performance and its neural correlates using multimodal imaging (anatomical MRI, DTI, fMRI, and 18FDG-PET) in a female patient with transcortical motor aphasia (TCMA) and in a male patient with conduction aphasia (CA) who had small contiguous but non-overlapping left perisylvian infarctions. Repetition in the TCMA patient was fully preserved except for a mild impairment in nonwords and digits, whereas the CA patient had impaired repetition of nonwords, digits and word triplet lists. Sentence repetition was impaired, but he repeated novel sentences significantly better than clichés. The TCMA patient had tissue damage and reduced metabolism in the left sensorimotor cortex and insula. DTI showed damage to the left temporo-frontal and parieto-frontal segments of the arcuate fasciculus (AF) and part of the left ventral stream together with well-developed right dorsal and ventral streams, as has been reported in more than one-third of females. The CA patient had tissue damage and reduced metabolic activity in the left temporoparietal cortex with additional metabolic decrements in the left frontal lobe. DTI showed damage to the left temporo-parietal and temporo-frontal segments of the AF, but the ventral stream was spared. The direct segment of the AF in the right hemisphere was also absent with only vestigial remains of the other dorsal subcomponents present, as is often found in males. fMRI during word and nonword repetition revealed bilateral perisylvian activation in the TCMA patient suggesting recruitment of spared segments of the left dorsal stream and right dorsal stream with propagation of signals to temporal lobe structures suggesting a compensatory reallocation of resources via the ventral streams. 
The CA patient showed a greater activation of these cortical areas than the TCMA patient, but these changes did not result in normal performance. Repetition of word triplet lists activated bilateral perisylvian cortices in both patients, but activation in the CA patient with very poor performance was restricted to small frontal and posterior temporal foci bilaterally. These findings suggest that dissociated repetition deficits in our cases are probably reliant on flexible interactions between left dorsal stream (spared segments, short tracts remains) and left ventral stream and on gender-dimorphic architecture of the right dorsal stream.
Character Reading Fluency, Word Segmentation Accuracy, and Reading Comprehension in L2 Chinese

ERIC Educational Resources Information Center

Shen, Helen H.; Jiang, Xin

2013-01-01

This study investigated the relationships between lower-level processing and general reading comprehension among adult L2 (second-language) beginning learners of Chinese, in both target and non-target language learning environments. Lower-level processing in Chinese reading includes the factors of character-naming accuracy, character-naming speed,…
21 CFR 145.130 - Canned figs.

Code of Federal Regulations, 2010 CFR

2010-04-01

... flavoring. (2) Spice. (3) Vinegar. (4) Unpeeled segments of citrus fruits. (5) Salt. Such food is sealed in... chapter and a declaration of any spice or seasoning that characterizes the product; for example, “Spice added”, or in lieu of the word “Spice”, the common name of the spice, “Seasoned with vinegar” or...
Dictionary of Afro-American Slang.

ERIC Educational Resources Information Center

Major, Clarence

The speech habits of the most oppressed --and the largest-- segment of the black population in the United States did not spring solely from an inability to handle acceptable forms of spoken English, nor mainly from the limitations caused by the particular stock of words known to the speaker. Black slang stems from a somewhat disseminated rejection…
Predictive Validity of Early Literacy Measures for Korean English Language Learners in the United States

ERIC Educational Resources Information Center

Han, Jeanie Nam; Vanderwood, Michael L.; Lee, Catherine Y.

2015-01-01

This study examined the predictive validity of early literacy measures with first-grade Korean English language learners (ELLs) in the United States at varying levels of English proficiency. Participants were screened using Dynamic Indicators of Basic Early Literacy Skills (DIBELS) Phoneme Segmentation Fluency (PSF), DIBELS Nonsense Word Fluency…
21 CFR 145.130 - Canned figs.

Code of Federal Regulations, 2011 CFR

2011-04-01

... flavoring. (2) Spice. (3) Vinegar. (4) Unpeeled segments of citrus fruits. (5) Salt. Such food is sealed in... chapter and a declaration of any spice or seasoning that characterizes the product; for example, “Spice added”, or in lieu of the word “Spice”, the common name of the spice, “Seasoned with vinegar” or...
Application of a Multitiered System of Support with English Language Learners

ERIC Educational Resources Information Center

Vanderwood, Mike L.; Tung, Catherine; Arellano, Elizabeth

2014-01-01

This study examined the effects of a phonological awareness (PA) intervention on the phonological and alphabetic principle skills of first-grade English language learners (ELLs). Nine first-grade classrooms in two large elementary schools were screened with DIBELS Phoneme Segmentation Fluency (PSF) and Nonsense Word Fluency (NWF) in the fall and…

Say It like You Mean It: Mothers' Use of Prosody to Convey Word Meaning

ERIC Educational Resources Information Center

Herold, Debora S.; Nygaard, Lynne C.; Namy, Laura L.

2012-01-01

Prosody plays a variety of roles in infants' communicative development, aiding in attention modulation, speech segmentation, and syntax acquisition. This study investigates the extent to which parents also spontaneously modulate prosodic aspects of infant directed speech in ways that distinguish semantic aspects of language. Fourteen mothers of…
Acquisition of L2 Vowel Duration in Japanese by Native English Speakers

ERIC Educational Resources Information Center

Okuno, Tomoko

2013-01-01

Research has demonstrated that focused perceptual training facilitates L2 learners' segmental perception and spoken word identification. Hardison (2003) and Motohashi-Saigo and Hardison (2009) found benefits of visual cues in the training for acquisition of L2 contrasts. The present study examined factors affecting perception and production…
L2 Perception of Spanish Palatal Variants across Different Tasks

ERIC Educational Resources Information Center

Shea, Christine; Renaud, Jeffrey

2014-01-01

While considerable dialectal variation exists, almost all varieties of Spanish exhibit some sort of alternation in terms of the palatal obstruent segments. Typically, the palatal affricate [ɟʝ] tends to occur in word onset following a pause and in specific linear phonotactic environments. The palatal fricative [ʝ] tends to occur in syllable onset…
PubMed Phrases, an open set of coherent phrases for searching biomedical literature

PubMed Central

Kim, Sun; Yeganova, Lana; Comeau, Donald C.; Wilbur, W. John; Lu, Zhiyong

2018-01-01

In biomedicine, key concepts are often expressed by multiple words (e.g., ‘zinc finger protein’). Previous work has shown treating a sequence of words as a meaningful unit, where applicable, is not only important for human understanding but also beneficial for automatic information seeking. Here we present a collection of PubMed® Phrases that are beneficial for information retrieval and human comprehension. We define these phrases as coherent chunks that are logically connected. To collect the phrase set, we apply the hypergeometric test to detect segments of consecutive terms that are likely to appear together in PubMed. These text segments are then filtered using the BM25 ranking function to ensure that they are beneficial from an information retrieval perspective. Thus, we obtain a set of 705,915 PubMed Phrases. We evaluate the quality of the set by investigating PubMed user click data and manually annotating a sample of 500 randomly selected noun phrases. We also analyze and discuss the usage of these PubMed Phrases in literature search. PMID:29893755
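The hypergeometric step mentioned in this abstract can be illustrated with a minimal sketch. It asks how surprising it is that two terms co-occur as often as they do, given how often each appears on its own. The function name, the document-level counting scheme, and the toy counts are assumptions made for illustration; the abstract does not say how the test was parameterized, and the BM25 filtering step is not shown.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K marked items, n draws)."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible outcomes contribute nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

# Hypothetical counts: 1000 documents, 20 contain term A, 30 contain term B,
# and 8 contain both. A small tail probability suggests the pair co-occurs
# far more often than chance would predict.
p_value = hypergeom_upper_tail(k=8, N=1000, K=20, n=30)
print(p_value)
```

A small p-value here only flags a candidate segment; whether it is useful for retrieval is a separate question, which the abstract addresses with a ranking-based filter.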
Pattern statistics on Markov chains and sensitivity to parameter estimation

PubMed Central

Nuel, Grégory

2006-01-01

Background: In order to compute pattern statistics in computational biology, a Markov model is commonly used to take into account the sequence composition. Usually its parameter must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what the consequences of this variability are for pattern studies (finding the most over-represented words in a genome, the most significant common words to a set of sequences, ...). Results: In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations, we use the delta-method to give an explicit expression of σ, the standard deviation of a pattern statistic. This result is validated using simulations and a simple pattern study is also considered. Conclusion: We establish that the use of a high-order Markov model could easily lead to major mistakes due to the high sensitivity of pattern statistics to parameter estimation. PMID:17044916
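To make the role of parameter estimation concrete, the following sketch estimates a first-order Markov model from a sequence and plugs the estimates into the expected count of a word. It is an illustrative plug-in estimate only, not the paper's delta-method or binomial approximation; the function names and the toy sequence are assumptions.

```python
from collections import Counter

def expected_count_markov1(seq, word):
    """Expected occurrences of `word` in a sequence of length len(seq) under a
    first-order Markov model whose parameters are estimated from `seq` itself."""
    letters = Counter(seq)
    pairs = Counter(zip(seq, seq[1:]))
    starts = Counter(seq[:-1])  # how often each letter begins a pair
    prob = letters[word[0]] / len(seq)  # estimated probability of the first letter
    for a, b in zip(word, word[1:]):
        if starts[a] == 0:
            return 0.0
        prob *= pairs[(a, b)] / starts[a]  # estimated transition probability a -> b
    return (len(seq) - len(word) + 1) * prob

def observed_count(seq, word):
    """Number of (possibly overlapping) occurrences of `word` in `seq`."""
    return sum(seq.startswith(word, i) for i in range(len(seq) - len(word) + 1))

seq = "ACGTACGTACGT"
print(expected_count_markov1(seq, "ACG"), observed_count(seq, "ACG"))
```

Because every transition probability is itself an estimate, a higher-order model multiplies more noisy terms estimated from fewer observations, which is the kind of sensitivity the abstract warns about.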
Pattern statistics on Markov chains and sensitivity to parameter estimation.

PubMed

Nuel, Grégory

2006-10-17

In order to compute pattern statistics in computational biology, a Markov model is commonly used to take into account the sequence composition. Usually its parameter must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what the consequences of this variability are for pattern studies (finding the most over-represented words in a genome, the most significant common words to a set of sequences, ...). In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations, we use the delta-method to give an explicit expression of sigma, the standard deviation of a pattern statistic. This result is validated using simulations and a simple pattern study is also considered. We establish that the use of a high-order Markov model could easily lead to major mistakes due to the high sensitivity of pattern statistics to parameter estimation.
Tumor or abnormality identification from magnetic resonance images using statistical region fusion based segmentation.

PubMed

Subudhi, Badri Narayan; Thangaraj, Veerakumar; Sankaralingam, Esakkirajan; Ghosh, Ashish

2016-11-01

In this article, a statistical fusion based segmentation technique is proposed to identify different abnormalities in magnetic resonance images (MRI). The proposed scheme follows seed selection, region growing-merging and fusion of multiple image segments. Initially, an image is divided into a number of blocks and for each block we compute the phase component of the Fourier transform. The phase component of each block reflects the gray level variation within the block but contains a large correlation among them. Hence a singular value decomposition (SVD) technique is used to generate a singular value of each block. Then a thresholding procedure is applied on these singular values to identify edgy and smooth regions and some seed points are selected for segmentation. By considering each seed point we perform a binary segmentation of the complete MRI and hence with all seed points we get an equal number of binary images. A parcel based statistical fusion process is used to fuse all the binary images into multiple segments. Effectiveness of the proposed scheme is tested on identifying different abnormalities: prostatic carcinoma detection, tuberculous granulomas identification and intracranial neoplasm or brain tumor detection. The proposed technique is established by comparing its results against seven state-of-the-art techniques with six performance evaluation measures. Copyright © 2016 Elsevier Inc. All rights reserved.
Exceptional motifs in different Markov chain models for a statistical analysis of DNA sequences.

PubMed

Schbath, S; Prum, B; de Turckheim, E

1995-01-01

Identifying exceptional motifs is often used for extracting information from long DNA sequences. The two difficulties of the method are the choice of the model that defines the expected frequencies of words and the approximation of the variance of the difference T(W) between the number of occurrences of a word W and its estimation. We consider here different Markov chain models, either with stationary or periodic transition probabilities. We estimate the variance of the difference T(W) by the conditional variance of the number of occurrences of W given the oligonucleotides counts that define the model. Two applications show how to use asymptotically standard normal statistics associated with the counts to describe a given sequence in terms of its outlying words. Sequences of Escherichia coli and of Bacillus subtilis are compared with respect to their exceptional tri- and tetranucleotides. For both bacteria, exceptional 3-words are mainly found in the coding frame. E. coli palindrome counts are analyzed in different models, showing that many overabundant words are one-letter mutations of avoided palindromes.
Statistical Validation of Image Segmentation Quality Based on a Spatial Overlap Index

PubMed Central

Zou, Kelly H.; Warfield, Simon K.; Bharatha, Aditya; Tempany, Clare M.C.; Kaus, Michael R.; Haker, Steven J.; Wells, William M.; Jolesz, Ferenc A.; Kikinis, Ron

2005-01-01

Rationale and Objectives: To examine a statistical validation method based on the spatial overlap between two sets of segmentations of the same anatomy. Materials and Methods: The Dice similarity coefficient (DSC) was used as a statistical validation metric to evaluate the performance of both the reproducibility of manual segmentations and the spatial overlap accuracy of automated probabilistic fractional segmentation of MR images, illustrated on two clinical examples. Example 1: 10 consecutive cases of prostate brachytherapy patients underwent both preoperative 1.5T and intraoperative 0.5T MR imaging. For each case, 5 repeated manual segmentations of the prostate peripheral zone were performed separately on preoperative and on intraoperative images. Example 2: A semi-automated probabilistic fractional segmentation algorithm was applied to MR imaging of 9 cases with 3 types of brain tumors. DSC values were computed and logit-transformed values were compared in the mean with the analysis of variance (ANOVA). Results: Example 1: The mean DSCs of 0.883 (range, 0.876–0.893) with 1.5T preoperative MRI and 0.838 (range, 0.819–0.852) with 0.5T intraoperative MRI (P < .001) were within and at the margin of the range of good reproducibility, respectively. Example 2: Wide ranges of DSC were observed in brain tumor segmentations: Meningiomas (0.519–0.893), astrocytomas (0.487–0.972), and other mixed gliomas (0.490–0.899). Conclusion: The DSC value is a simple and useful summary measure of spatial overlap, which can be applied to studies of reproducibility and accuracy in image segmentation. We observed generally satisfactory but variable validation results in two clinical applications. This metric may be adapted for similar validation tasks. PMID:14974593
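For readers unfamiliar with the metric, the Dice similarity coefficient of two segmentations A and B is 2|A ∩ B| / (|A| + |B|), and the logit transform maps a value p in (0, 1) to ln(p / (1 − p)). The sketch below computes both for binary segmentations represented as sets of voxel coordinates. The set representation, the handling of two empty segmentations, and the toy data are assumptions for illustration, and the probabilistic fractional segmentation used in the study is not modeled.

```python
import math

def dice(a, b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty segmentations agree fully
    return 2 * len(a & b) / (len(a) + len(b))

def logit(p):
    """Logit transform; defined only for 0 < p < 1."""
    return math.log(p / (1 - p))

# Toy example: a 4-voxel manual outline and a 3-voxel automated one sharing 2 voxels.
manual = {(0, 0), (0, 1), (1, 0), (1, 1)}
auto = {(0, 1), (1, 1), (1, 2)}
dsc = dice(manual, auto)
print(dsc, logit(dsc))
```

The logit transform stretches values near 0 and 1 onto an unbounded scale, which is one common reason to apply it before comparing means with ANOVA.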
A novel measure and significance testing in data analysis of cell image segmentation.

PubMed

Wu, Jin Chu; Halter, Michael; Kacker, Raghu N; Elliott, John T; Plant, Anne L

2017-03-14

Cell image segmentation (CIS) is an essential part of quantitative imaging of biological cells. Designing a performance measure and conducting significance testing are critical for evaluating and comparing the CIS algorithms for image-based cell assays in cytometry. Many measures and methods have been proposed and implemented to evaluate segmentation methods. However, computing the standard errors (SE) of the measures and their correlation coefficient is not described, and thus the statistical significance of performance differences between CIS algorithms cannot be assessed. We propose the total error rate (TER), a novel performance measure for segmenting all cells in the supervised evaluation. The TER statistically aggregates all misclassification error rates (MER) by taking cell sizes as weights. The MERs are for segmenting each single cell in the population. The TER is fully supported by the pairwise comparisons of MERs using 106 manually segmented ground-truth cells with different sizes and seven CIS algorithms taken from ImageJ. Further, the SE and 95% confidence interval (CI) of TER are computed based on the SE of MER that is calculated using the bootstrap method. An algorithm for computing the correlation coefficient of TERs between two CIS algorithms is also provided. Hence, the 95% CI error bars can be used to classify CIS algorithms. The SEs of TERs and their correlation coefficient can be employed to conduct the hypothesis testing, while the CIs overlap, to determine the statistical significance of the performance differences between CIS algorithms. A novel measure TER of CIS is proposed. The TER's SEs and correlation coefficient are computed. Thereafter, CIS algorithms can be evaluated and compared statistically by conducting the significance testing.
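One plausible reading of "aggregates all misclassification error rates (MER) by taking cell sizes as weights" is a size-weighted mean of the per-cell MERs. The sketch below implements only that reading. The exact definition, the bootstrap standard errors, and the correlation computation are given in the paper and are not reproduced here; the function name and sample values are hypothetical.

```python
def total_error_rate(mers, sizes):
    """Size-weighted mean of per-cell misclassification error rates.

    mers  -- one error rate per ground-truth cell
    sizes -- the matching cell sizes (e.g. pixel counts), used as weights
    """
    if len(mers) != len(sizes) or not mers:
        raise ValueError("mers and sizes must be non-empty and of equal length")
    return sum(m * s for m, s in zip(mers, sizes)) / sum(sizes)

# Hypothetical data: a small cell segmented well and a large cell segmented poorly.
print(total_error_rate([0.1, 0.2], [100, 300]))
```

Under this weighting, errors on large cells dominate the aggregate, so an algorithm that only fails on small cells is penalized less than under a plain average.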
Toward a model for lexical access based on acoustic landmarks and distinctive features

NASA Astrophysics Data System (ADS)

Stevens, Kenneth N.

2002-04-01

This article describes a model in which the acoustic speech signal is processed to yield a discrete representation of the speech stream in terms of a sequence of segments, each of which is described by a set (or bundle) of binary distinctive features. These distinctive features specify the phonemic contrasts that are used in the language, such that a change in the value of a feature can potentially generate a new word. This model is a part of a more general model that derives a word sequence from this feature representation, the words being represented in a lexicon by sequences of feature bundles. The processing of the signal proceeds in three steps: (1) Detection of peaks, valleys, and discontinuities in particular frequency ranges of the signal leads to identification of acoustic landmarks. The type of landmark provides evidence for a subset of distinctive features called articulator-free features (e.g., [vowel], [consonant], [continuant]). (2) Acoustic parameters are derived from the signal near the landmarks to provide evidence for the actions of particular articulators, and acoustic cues are extracted by sampling selected attributes of these parameters in these regions. The selection of cues that are extracted depends on the type of landmark and on the environment in which it occurs. (3) The cues obtained in step (2) are combined, taking context into account, to provide estimates of “articulator-bound” features associated with each landmark (e.g., [lips], [high], [nasal]). These articulator-bound features, combined with the articulator-free features in (1), constitute the sequence of feature bundles that forms the output of the model. Examples of cues that are used, and justification for this selection, are given, as well as examples of the process of inferring the underlying features for a segment when there is variability in the signal due to enhancement gestures (recruited by a speaker to make a contrast more salient) or due to overlap of gestures from neighboring segments.
Construction of language models for an handwritten mail reading system

NASA Astrophysics Data System (ADS)

Morillot, Olivier; Likforman-Sulem, Laurence; Grosicki, Emmanuèle

2012-01-01

This paper presents a system for the recognition of unconstrained handwritten mails. The main part of this system is an HMM recognizer which uses trigraphs to model contextual information. This recognition system does not require any segmentation into words or characters and directly works at line level. To take into account linguistic information and enhance performance, a language model is introduced. This language model is based on bigrams and built from training document transcriptions only. Different experiments with various vocabulary sizes and language models have been conducted. Word Error Rate and Perplexity values are compared to show the interest of specific language models, fit to handwritten mail recognition task.
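The bigram language model and perplexity measure mentioned in this abstract can be sketched on plain text as follows. This is a generic illustration with add-one smoothing, which is an assumption: the abstract does not state which smoothing or toolkit was used, and the handwriting recognizer itself is not modeled. Function names and the toy training data are hypothetical.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Collect history and bigram counts from tokenized training sentences."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        unigrams.update(tokens[:-1])             # counts of each token as a history
        bigrams.update(zip(tokens, tokens[1:]))  # counts of adjacent token pairs
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return unigrams, bigrams, vocab

def perplexity(sentence, unigrams, bigrams, vocab):
    """Perplexity of one tokenized sentence under an add-one smoothed bigram model."""
    tokens = ["<s>"] + sentence + ["</s>"]
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))
        log_prob += math.log2(p)
    return 2 ** (-log_prob / (len(tokens) - 1))

model = train_bigram([["a", "b"]])
print(perplexity(["a", "b"], *model))
```

Lower perplexity on held-out transcriptions indicates that the model predicts the word sequences better, which is why it is reported alongside Word Error Rate.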
A Match Method Based on Latent Semantic Analysis for Earthquake Hazard Emergency Plan

NASA Astrophysics Data System (ADS)

Sun, D.; Zhao, S.; Zhang, Z.; Shi, X.

2017-09-01

The structure of earthquake emergency plans is complex, and it is difficult for decision makers to make a decision in a short time. To solve the problem, this paper presents a match method based on Latent Semantic Analysis (LSA). After word segmentation preprocessing of the emergency plan, we carry out keyword extraction according to the part-of-speech and the frequency of words. Then, through LSA, we map the documents and query information to the semantic space, and calculate the correlation of documents and queries by the relation between vectors. The experimental results indicate that LSA can improve the accuracy of emergency plan retrieval efficiently.
The role of partial knowledge in statistical word learning

PubMed Central

Fricker, Damian C.; Yu, Chen; Smith, Linda B.

2013-01-01

A critical question about the nature of human learning is whether it is an all-or-none or a gradual, accumulative process. Associative and statistical theories of word learning rely critically on the latter assumption: that the process of learning a word's meaning unfolds over time. That is, learning the correct referent for a word involves the accumulation of partial knowledge across multiple instances. Some theories also make an even stronger claim: Partial knowledge of one word–object mapping can speed up the acquisition of other word–object mappings. We present three experiments that test and verify these claims by exposing learners to two consecutive blocks of cross-situational learning, in which half of the words and objects in the second block were those that participants failed to learn in Block 1. In line with an accumulative account, re-exposure to these mis-mapped items accelerated the acquisition of both previously experienced mappings and wholly new word–object mappings. But how does partial knowledge of some words speed the acquisition of others? We consider two hypotheses. First, partial knowledge of a word could reduce the amount of information required for it to reach threshold, and the supra-threshold mapping could subsequently aid in the acquisition of new mappings. Alternatively, partial knowledge of a word's meaning could be useful for disambiguating the meanings of other words even before the threshold of learning is reached. We construct and compare computational models embodying each of these hypotheses and show that the latter provides a better explanation of the empirical data. PMID:23702980
Phonological and Semantic Cues to Learning from Word-Types

PubMed Central

Richtsmeier, Peter

2017-01-01

Word-types represent the primary form of data for many models of phonological learning, and they often predict performance in psycholinguistic tasks. Word-types are often tacitly defined as phonologically unique words. Yet, an explicit test of this definition is lacking, and natural language patterning suggests that word meaning could also act as a cue to word-type status. This possibility was tested in a statistical phonotactic learning experiment in which phonological and semantic properties of word-types varied. During familiarization, the learning targets—word-medial consonant sequences—were instantiated either by four related word-types or by just one word-type (the experimental frequency factor). The expectation was that more word-types would lead participants to generalize the target sequences. Regarding semantic cues, related word-types were either associated with different referents or all with a single referent. Regarding phonological cues, related word-types differed from each other by one, two, or more phonemes. At test, participants rated novel wordforms for their similarity to the familiarization words. When participants heard four related word-types, they gave higher ratings to test words with the same consonant sequences, irrespective of the phonological and semantic manipulations. The results support the existing phonological definition of word-types. PMID:29187914
Hierarchical combinatorial deep learning architecture for pancreas segmentation of medical computed tomography cancer images.

PubMed

Fu, Min; Wu, Wenming; Hong, Xiafei; Liu, Qiuhua; Jiang, Jialin; Ou, Yaobin; Zhao, Yupei; Gong, Xinqi

2018-04-24

Efficient computational recognition and segmentation of target organs from medical images are foundational in diagnosis and treatment, especially for pancreatic cancer. In practice, the diversity in appearance of the pancreas and other abdominal organs makes detailed texture information important in a segmentation algorithm. According to our observations, however, the structures of previous networks, such as the Richer Feature Convolutional Network (RCF), are too coarse to segment the object (pancreas) accurately, especially at the edge. In this paper, we extend the RCF, originally proposed for edge detection, to the challenging task of pancreas segmentation, and put forward a novel pancreas segmentation network. By employing a multi-layer up-sampling structure in place of the simple up-sampling operation in all stages, the proposed network fully considers the multi-scale detailed contextual information of the object (pancreas) to perform per-pixel segmentation. Additionally, we train our network on CT scans and thus obtain an effective pipeline. Working with our pipeline with the multi-layer up-sampling model, we achieve better performance than RCF in the task of single-object (pancreas) segmentation. In addition, combined with multi-scale input, we achieve a 76.36% DSC (Dice Similarity Coefficient) value on the testing data. The results of our experiments show that our advanced model works better than previous networks on our dataset. In other words, it is better able to capture detailed contextual information. Therefore, our new single-object segmentation model has practical value for computational automatic diagnosis.
Predictions interact with missing sensory evidence in semantic processing areas.

PubMed

Scharinger, Mathias; Bendixen, Alexandra; Herrmann, Björn; Henry, Molly J; Mildner, Toralf; Obleser, Jonas

2016-02-01

Human brain function draws on predictive mechanisms that exploit higher-level context during lower-level perception. These mechanisms are particularly relevant for situations in which sensory information is compromised or incomplete, as for example in natural speech where speech segments may be omitted due to sluggish articulation. Here, we investigate which brain areas support the processing of incomplete words that were predictable from semantic context, compared with incomplete words that were unpredictable. During functional magnetic resonance imaging (fMRI), participants heard sentences that orthogonally varied in predictability (semantically predictable vs. unpredictable) and completeness (complete vs. incomplete, i.e. missing their final consonant cluster). The effects of predictability and completeness interacted in heteromodal semantic processing areas, including left angular gyrus and left precuneus, where activity did not differ between complete and incomplete words when they were predictable. The same regions showed stronger activity for incomplete than for complete words when they were unpredictable. The interaction pattern suggests that for highly predictable words, the speech signal does not need to be complete for neural processing in semantic processing areas. Hum Brain Mapp 37:704-716, 2016. © 2015 Wiley Periodicals, Inc.
Robust nuclei segmentation in cyto-histopathological images using statistical level set approach with topology preserving constraint

NASA Astrophysics Data System (ADS)

Taheri, Shaghayegh; Fevens, Thomas; Bui, Tien D.

2017-02-01

Computerized assessments for diagnosis or malignancy grading of cyto-histopathological specimens have drawn increased attention in the field of digital pathology. Automatic segmentation of cell nuclei is a fundamental step in such automated systems. Despite considerable research, nuclei segmentation is still a challenging task due to noise, nonuniform illumination, and most importantly, in 2D projection images, overlapping and touching nuclei. In most published approaches, nuclei refinement is a post-processing step after segmentation, which usually refers to the task of detaching the aggregated nuclei or merging the over-segmented nuclei. In this work, we present a novel segmentation technique which effectively addresses the problem of individually segmenting touching or overlapping cell nuclei during the segmentation process. The proposed framework is a region-based segmentation method, which consists of three major modules: i) the image is passed through a color deconvolution step to extract the desired stains; ii) then the generalized fast radial symmetry (GFRS) transform is applied to the image followed by non-maxima suppression to specify the initial seed points for nuclei, and their corresponding GFRS ellipses which are interpreted as the initial nuclei borders for segmentation; iii) finally, these nuclei border initial curves are evolved through the use of a statistical level-set approach along with topology preserving criteria for segmentation and separation of nuclei at the same time. The proposed method is evaluated using Hematoxylin and Eosin, and fluorescent stained images, performing qualitative and quantitative analysis, showing that the method outperforms thresholding and watershed segmentation approaches.
Prostate segmentation in MR images using discriminant boundary features.

PubMed

Yang, Meijuan; Li, Xuelong; Turkbey, Baris; Choyke, Peter L; Yan, Pingkun

2013-02-01

Segmentation of the prostate in magnetic resonance images is increasingly needed to assist diagnosis and surgical planning of prostate carcinoma. Due to the natural variability of anatomical structures, statistical shape models have been widely applied in medical image segmentation. Robust and distinctive local features are critical for a statistical shape model to achieve accurate segmentation results. The scale invariant feature transformation (SIFT) has been employed to capture the information of the local patch surrounding the boundary. However, when SIFT features are used for segmentation, the scale and variance are not specified with the location of the point of interest. To deal with this, discriminant analysis from machine learning is introduced to measure the distinctiveness of the learned SIFT features for each landmark directly and to make the scale and variance adaptive to the locations. As the gray values and gradients vary significantly over the boundary of the prostate, separate appearance descriptors are built for each landmark and then optimized. After that, a two-stage coarse-to-fine segmentation approach is carried out by incorporating the local shape variations. Finally, experiments on prostate segmentation from MR images are conducted to verify the efficiency of the proposed algorithms.
Semiautomatic tumor segmentation with multimodal images in a conditional random field framework.

PubMed

Hu, Yu-Chi; Grossberg, Michael; Mageras, Gikas

2016-04-01

Volumetric medical images of a single subject can be acquired using different imaging modalities, such as computed tomography, magnetic resonance imaging (MRI), and positron emission tomography. In this work, we present a semiautomatic segmentation algorithm that can leverage the synergies between different image modalities while integrating interactive human guidance. The algorithm provides a statistical segmentation framework partly automating the segmentation task while still maintaining critical human oversight. The statistical models presented are trained interactively using simple brush strokes to indicate tumor and nontumor tissues and using intermediate results within a patient's image study. To accomplish the segmentation, we construct the energy function in the conditional random field (CRF) framework. For each slice, the energy function is set using the estimated probabilities from both user brush stroke data and prior approved segmented slices within a patient study. The progressive segmentation is obtained using a graph-cut-based minimization. Although no similar semiautomated algorithm is currently available, we evaluated our method with an MRI data set from Medical Image Computing and Computer Assisted Intervention Society multimodal brain segmentation challenge (BRATS 2012 and 2013) against a similar fully automatic method based on CRF and a semiautomatic method based on grow-cut, and our method shows superior performance.

Online neural monitoring of statistical learning.

PubMed

Batterink, Laura J; Paller, Ken A

2017-05-01

The extraction of patterns in the environment plays a critical role in many types of human learning, from motor skills to language acquisition. This process is known as statistical learning. Here we propose that statistical learning has two dissociable components: (1) perceptual binding of individual stimulus units into integrated composites and (2) storing those integrated representations for later use. Statistical learning is typically assessed using post-learning tasks, such that the two components are conflated. Our goal was to characterize the online perceptual component of statistical learning. Participants were exposed to a structured stream of repeating trisyllabic nonsense words and a random syllable stream. Online learning was indexed by an EEG-based measure that quantified neural entrainment at the frequency of the repeating words relative to that of individual syllables. Statistical learning was subsequently assessed using conventional measures in an explicit rating task and a reaction-time task. In the structured stream, neural entrainment to trisyllabic words was higher than in the random stream, increased as a function of exposure to track the progression of learning, and predicted performance on the reaction time (RT) task. These results demonstrate that monitoring this critical component of learning via rhythmic EEG entrainment reveals a gradual acquisition of knowledge whereby novel stimulus sequences are transformed into familiar composites. This online perceptual transformation is a critical component of learning. Copyright © 2017 Elsevier Ltd. All rights reserved.
Perceptual statistical learning over one week in child speech production.

PubMed

Richtsmeier, Peter T; Goffman, Lisa

2017-07-01

What cognitive mechanisms account for the trajectory of speech sound development, in particular, gradually increasing accuracy during childhood? An intriguing potential contributor is statistical learning, a type of learning that has been studied frequently in infant perception but less often in child speech production. To assess the relevance of statistical learning to developing speech accuracy, we carried out a statistical learning experiment with four- and five-year-olds in which statistical learning was examined over one week. Children were familiarized with and tested on word-medial consonant sequences in novel words. There was only modest evidence for statistical learning, primarily in the first few productions of the first session. This initial learning effect nevertheless aligns with previous statistical learning research. Furthermore, the overall learning effect was similar to an estimate of weekly accuracy growth based on normative studies. The results implicate other important factors in speech sound development, particularly learning via production. Copyright © 2017 Elsevier Inc. All rights reserved.
Teachers and Textbooks: On Statistical Definitions in Senior Secondary Mathematics

ERIC Educational Resources Information Center

Dunn, Peter K.; Marshman, Margaret; McDougall, Robert; Wiegand, Aaron

2015-01-01

The new "Australian Senior Secondary Curriculum: Mathematics" contains more statistics than the existing Australian Curricula. This case study examines how a group of Queensland mathematics teachers define the word "statistics" and five statistical terms from the new curricula. These definitions are compared to those used in…
Incorporating Linguistic Knowledge for Learning Distributed Word Representations

PubMed Central

Wang, Yan; Liu, Zhiyuan; Sun, Maosong

2015-01-01

Combined with neural language models, distributed word representations achieve significant advantages in computational linguistics and text mining. Most existing models estimate distributed word vectors from large-scale data in an unsupervised fashion, but they do not take rich linguistic knowledge into consideration. Linguistic knowledge can be represented as either link-based knowledge or preference-based knowledge, and we propose knowledge regularized word representation models (KRWR) to incorporate this prior knowledge when learning distributed word representations. Experimental results demonstrate that our estimated word representations achieve better performance in the task of semantic relatedness ranking. This indicates that our methods can efficiently encode both prior knowledge from knowledge bases and statistical knowledge from large-scale text corpora into a unified word representation model, which will benefit many tasks in text mining. PMID:25874581
Use what you can: storage, abstraction processes, and perceptual adjustments help listeners recognize reduced forms

PubMed Central

Poellmann, Katja; Mitterer, Holger; McQueen, James M.

2014-01-01

Three eye-tracking experiments tested whether native listeners recognized reduced Dutch words better after having heard the same reduced words, or different reduced words of the same reduction type and whether familiarization with one reduction type helps listeners to deal with another reduction type. In the exposure phase, a segmental reduction group was exposed to /b/-reductions (e.g., minderij instead of binderij, “book binder”) and a syllabic reduction group was exposed to full-vowel deletions (e.g., p'raat instead of paraat, “ready”), while a control group did not hear any reductions. In the test phase, all three groups heard the same speaker producing reduced-/b/ and deleted-vowel words that were either repeated (Experiments 1 and 2) or new (Experiment 3), but that now appeared as targets in semantically neutral sentences. Word-specific learning effects were found for vowel-deletions but not for /b/-reductions. Generalization of learning to new words of the same reduction type occurred only if the exposure words showed a phonologically consistent reduction pattern (/b/-reductions). In contrast, generalization of learning to words of another reduction type occurred only if the exposure words showed a phonologically inconsistent reduction pattern (the vowel deletions; learning about them generalized to recognition of the /b/-reductions). In order to deal with reductions, listeners thus use various means. They store reduced variants (e.g., for the inconsistent vowel-deleted words) and they abstract over incoming information to build up and apply mapping rules (e.g., for the consistent /b/-reductions). Experience with inconsistent pronunciations leads to greater perceptual flexibility in dealing with other forms of reduction uttered by the same speaker than experience with consistent pronunciations. PMID:24910622
Fully Bayesian inference for structural MRI: application to segmentation and statistical analysis of T2-hypointensities.

PubMed

Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark

2013-01-01

Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov Chain Monte Carlo methods were used to simulate posterior distributions. First, we rendered a segmentation algorithm that uses outlier detection based on model checking techniques within a Bayesian mixture model. Second, we rendered an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields) mitigating the necessity to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity primarily within the dentate nucleus but also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for preprocessing and statistical analysis of structural MRI data.
Statistical Validation of Automatic Methods for Hippocampus Segmentation in MR Images of Epileptic Patients

PubMed Central

Hosseini, Mohammad-Parsa; Nazem-Zadeh, Mohammad R.; Pompili, Dario; Soltanian-Zadeh, Hamid

2015-01-01

Hippocampus segmentation is a key step in the evaluation of mesial Temporal Lobe Epilepsy (mTLE) by MR images. Several automated segmentation methods have been introduced for medical image segmentation. Because of multiple edges, missing boundaries, and shape changes along its longitudinal axis, manual outlining remains the benchmark for hippocampus segmentation, which, however, is impractical for large datasets due to time constraints. In this study, four automatic methods, namely FreeSurfer, Hammer, Automatic Brain Structure Segmentation (ABSS), and LocalInfo segmentation, are evaluated to find the most accurate and applicable method, that is, the one that most closely resembles the manual benchmark for the hippocampus. Results from these four methods are compared against those obtained using manual segmentation for T1-weighted images of 157 symptomatic mTLE patients. For performance evaluation of automatic segmentation, Dice coefficient, Hausdorff distance, Precision, and Root Mean Square (RMS) distance are extracted and compared. Among these four automated methods, ABSS generates the most accurate results, and statistical validation shows that its reproducibility is more similar to expert manual outlining. At a significance level of p < 0.05, the performance measurements for ABSS show that Dice is 4%, 13%, and 17% higher, Hausdorff is 23%, 87%, and 70% lower, precision is 5%, -5%, and 12% higher, and RMS is 19%, 62%, and 65% lower compared to LocalInfo, FreeSurfer, and Hammer, respectively. PMID:25571043
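Three of the overlap and distance measures named in the abstract above can be illustrated with a short sketch. It assumes each segmentation is given as a set of voxel coordinates and, for simplicity, computes the Hausdorff distance over all voxels rather than over extracted surfaces. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math

def dice(auto, manual):
    """Dice coefficient: 2|A ∩ M| / (|A| + |M|)."""
    auto, manual = set(auto), set(manual)
    if not auto and not manual:
        return 1.0
    return 2 * len(auto & manual) / (len(auto) + len(manual))

def precision(auto, manual):
    """Fraction of automatically labeled voxels that the manual outline confirms."""
    auto, manual = set(auto), set(manual)
    return len(auto & manual) / len(auto) if auto else 0.0

def hausdorff(auto, manual):
    """Symmetric Hausdorff distance between two non-empty voxel sets."""
    def directed(a, b):
        return max(min(math.dist(p, q) for q in b) for p in a)
    return max(directed(auto, manual), directed(manual, auto))

# Toy 2D example (hypothetical voxel coordinates)
auto_seg = {(0, 0), (0, 1), (1, 1)}
manual_seg = {(0, 1), (1, 1), (2, 1)}

print(dice(auto_seg, manual_seg))       # two shared voxels out of 3 + 3
print(precision(auto_seg, manual_seg))
print(hausdorff(auto_seg, manual_seg))  # → 1.0
```

Higher Dice and precision indicate better agreement with the manual outline, whereas lower Hausdorff distance is better, which matches the direction of the comparisons reported above.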
A statistical method for lung tumor segmentation uncertainty in PET images based on user inference.

PubMed

Zheng, Chaojie; Wang, Xiuying; Feng, Dagan

2015-01-01

PET has been widely accepted as an effective imaging modality for lung tumor diagnosis and treatment. However, standard criteria for delineating the tumor boundary from PET have yet to be developed, largely due to the relatively low quality of PET images, uncertain tumor boundary definition, and the variety of tumor characteristics. In this paper, we propose a statistical solution to segmentation uncertainty on the basis of user inference. We first define the uncertainty segmentation band on the basis of a segmentation probability map constructed from the Random Walks (RW) algorithm; then, based on the extracted features of the user inference, we use Principal Component Analysis (PCA) to formulate the statistical model for labeling the uncertainty band. We validated our method on 10 lung PET-CT phantom studies from the public RIDER collections [1] and 16 clinical PET studies where tumors were manually delineated by two experienced radiologists. The methods were validated using the Dice similarity coefficient (DSC) to measure the spatial volume overlap. Our method achieved an average DSC of 0.878 ± 0.078 on phantom studies and 0.835 ± 0.039 on clinical studies.
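The idea of an uncertainty band derived from a segmentation probability map can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the band is defined, so the two thresholds (0.3 and 0.7), the label encoding, and the function name are hypothetical, and the probability map here is a hand-written toy rather than Random Walks output.

```python
def uncertainty_band(prob_map, low=0.3, high=0.7):
    """Label each pixel: 1 = tumor, 0 = background, -1 = uncertain.

    `low` and `high` are assumed thresholds, not values from the study.
    """
    labels = []
    for row in prob_map:
        labels.append([1 if p > high else 0 if p < low else -1 for p in row])
    return labels

# Toy probability map (hypothetical values)
prob_map = [
    [0.05, 0.20, 0.45],
    [0.10, 0.60, 0.85],
    [0.35, 0.90, 0.95],
]

band = uncertainty_band(prob_map)
for row in band:
    print(row)
```

Pixels labeled -1 form the band that a second-stage model would then have to resolve; the PCA-based labeling described in the abstract is not reproduced here.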
Level set method with automatic selective local statistics for brain tumor segmentation in MR images.

PubMed

Thapaliya, Kiran; Pyun, Jae-Young; Park, Chun-Su; Kwon, Goo-Rak

2013-01-01

The level set approach is a powerful tool for segmenting images. This paper proposes a method for segmenting brain tumor images from MR images. A new signed pressure function (SPF) that can efficiently stop the contours at weak or blurred edges is introduced. The local statistics of the different objects present in the MR images were calculated. Using local statistics, the tumor objects were identified among different objects. In this level set method, the calculation of the parameters is a challenging task. The calculations of different parameters for different types of images were automatic. The basic thresholding value was updated and adjusted automatically for different MR images. This thresholding value was used to calculate the different parameters in the proposed algorithm. The proposed algorithm was tested on the magnetic resonance images of the brain for tumor segmentation and its performance was evaluated visually and quantitatively. Numerical experiments on some brain tumor images highlighted the efficiency and robustness of this method. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
AISLE: an automatic volumetric segmentation method for the study of lung allometry.

PubMed

Ren, Hongliang; Kazanzides, Peter

2011-01-01

We developed a fully automatic segmentation method for volumetric CT (computed tomography) datasets to support construction of a statistical atlas for the study of allometric laws of the lung. The proposed segmentation method, AISLE (Automated ITK-Snap based on Level-set), is based on the level-set implementation from an existing semi-automatic segmentation program, ITK-Snap. AISLE can segment the lung field without human interaction and provide intermediate graphical results as desired. The preliminary experimental results show that the proposed method can achieve accurate segmentation, in terms of a volumetric overlap metric, when compared with the ground-truth segmentation performed by a radiologist.
Self-correcting multi-atlas segmentation

NASA Astrophysics Data System (ADS)

Gao, Yi; Wilford, Andrew; Guo, Liang

2016-03-01

In multi-atlas segmentation, one typically registers several atlases to the new image, and their respective segmented label images are transformed and fused to form the final segmentation. After each registration, the quality of the registration is reflected by a single global value: the final registration cost. Ideally, the quality of the registration would be evaluated at each point, independent of the registration process, in a way that also provides a direction in which the deformation can be further improved; the overall segmentation performance could then be improved. We propose such a self-correcting multi-atlas segmentation method. The method is applied to hippocampus segmentation from brain images, and a statistically significant improvement is observed.
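The fusion step mentioned in the abstract above can be illustrated with majority voting, a common baseline for combining atlas labels. This sketch is not the self-correcting method the authors propose. It assumes the atlas label images have already been registered to the target and flattened to equal-length lists, and the function name and toy labels are hypothetical.

```python
from collections import Counter

def majority_vote(label_images):
    """Fuse registered, flattened label images voxel by voxel."""
    if not label_images:
        raise ValueError("need at least one label image")
    n = len(label_images[0])
    if any(len(img) != n for img in label_images):
        raise ValueError("all label images must have the same length")
    # For each voxel, keep the label proposed by the most atlases.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*label_images)]

# Three hypothetical atlases voting on four voxels (1 = hippocampus, 0 = background)
atlases = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]

fused = majority_vote(atlases)
print(fused)  # → [0, 1, 1, 0]
```

Majority voting weights every atlas equally at every voxel, which is the limitation that a pointwise registration-quality measure, as described in the abstract, is meant to address.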
Prostate segmentation in MRI using a convolutional neural network architecture and training strategy based on statistical shape models.

PubMed

Karimi, Davood; Samei, Golnoosh; Kesch, Claudia; Nir, Guy; Salcudean, Septimiu E

2018-05-15

Most of the existing convolutional neural network (CNN)-based medical image segmentation methods are based on methods that have originally been developed for segmentation of natural images. Therefore, they largely ignore the differences between the two domains, such as the smaller degree of variability in the shape and appearance of the target volume and the smaller amounts of training data in medical applications. We propose a CNN-based method for prostate segmentation in MRI that employs statistical shape models to address these issues. Our CNN predicts the location of the prostate center and the parameters of the shape model, which determine the position of prostate surface keypoints. To train such a large model for segmentation of 3D images using small data (1) we adopt a stage-wise training strategy by first training the network to predict the prostate center and subsequently adding modules for predicting the parameters of the shape model and prostate rotation, (2) we propose a data augmentation method whereby the training images and their prostate surface keypoints are deformed according to the displacements computed based on the shape model, and (3) we employ various regularization techniques. Our proposed method achieves a Dice score of 0.88, which is obtained by using both elastic-net and spectral dropout for regularization. Compared with a standard CNN-based method, our method shows significantly better segmentation performance on the prostate base and apex. Our experiments also show that data augmentation using the shape model significantly improves the segmentation results. Prior knowledge about the shape of the target organ can improve the performance of CNN-based segmentation methods, especially where image features are not sufficient for a precise segmentation. Statistical shape models can also be employed to synthesize additional training data that can ease the training of large CNNs.
READIT! A Text Presentation Application for the Macintosh

DTIC Science & Technology

1988-12-28

remaining passages would all look similar to the one shown here.) In this example, the subject’s number (7) was the basis for the output folder name: * sub7 ...before,1 segment = 1 sentence. THE FOLLOWING OUTPUT IS FOR SUBJECT " sub7 ." TIMES ARE IN VIEWING ORDER... #W => Number of Words. #L=> Number of Letters
The Role of the Syllable in the Segmentation of Cairene Spoken Arabic

ERIC Educational Resources Information Center

Aquil, Rajaa

2012-01-01

The syllable as a perceptual unit has been investigated cross-linguistically. In Cairene Arabic, syllables fall into three categories: light CV, heavy CVC/CVV, and superheavy CVCC/CVVC. However, heavy syllables in Cairene Arabic have varied weight depending on their position in a word, whether internal or final. The present paper investigates the…
Segmental Production in Mandarin-Learning Infants

ERIC Educational Resources Information Center

Chen, Li-Mei; Kent, Raymond D.

2010-01-01

The early development of vocalic and consonantal production in Mandarin-learning infants was studied at the transition from babbling to producing first words. Spontaneous vocalizations were recorded for 24 infants grouped by age: G1 (0;7 to 1;0) and G2 (1;1 to 1;6). Additionally, the infant-directed speech of 24 caregivers was recorded…
Is There a "Fete" in "Fetish"? Effects of Orthographic Opacity on Morpho-Orthographic Segmentation in Visual Word Recognition

ERIC Educational Resources Information Center

McCormick, Samantha F.; Rastle, Kathleen; Davis, Matthew H.

2008-01-01

Recent research using masked priming has suggested that there is a form of morphological decomposition that is based solely on the appearance of morphological complexity and that operates independently of semantic information [Longtin, C.M., Segui, J., & Halle, P. A. (2003). Morphological priming without morphological relationship. "Language and…
Stress Changes the Representational Landscape: Evidence from Word Segmentation

ERIC Educational Resources Information Center

Curtin, S.; Mintz, T.H.; Christiansen, M.H.

2005-01-01

Over the past couple of decades, research has established that infants are sensitive to the predominant stress pattern of their native language. However, the degree to which the stress pattern shapes infants' language development has yet to be fully determined. Whether stress is merely a cue to help organize the patterns of speech or whether it is…
Bilingual Phonological Awareness: Multilevel Construct Validation among Spanish-Speaking Kindergarteners in Transitional Bilingual Education Classrooms

ERIC Educational Resources Information Center

Branum-Martin, Lee; Mehta, Paras D.; Fletcher, Jack M.; Carlson, Coleen D.; Ortiz, Alba; Carlo, Maria; Francis, David J.

2006-01-01

The construct validity of English and Spanish phonological awareness (PA) tasks was examined with a sample of 812 kindergarten children from 71 transitional bilingual education program classrooms located in 3 different types of geographic regions in California and Texas. Tasks of PA, including blending nonwords, segmenting words, and phoneme…
Revising Segmentation Hypotheses in First and Second Language Listening

ERIC Educational Resources Information Center

Field, John

2008-01-01

Any on-line processing that takes place while an utterance is unfolding is extremely tentative, with early-formed hypotheses having to be revised as the utterance proceeds. The hypotheses in question relate not only to the words that are present but also to where their boundaries fall. This study examines how first and second language listeners…
