INTERPOL survey of the use of speaker identification by law enforcement agencies.
Morrison, Geoffrey Stewart; Sahito, Farhan Hyder; Jardine, Gaëlle; Djokic, Djordje; Clavet, Sophie; Berghs, Sabine; Goemans Dorny, Caroline
2016-06-01
A survey was conducted of the use of speaker identification by law enforcement agencies around the world. A questionnaire was circulated to law enforcement agencies in the 190 member countries of INTERPOL. 91 responses were received from 69 countries. 44 respondents reported that they had speaker identification capabilities in-house or via external laboratories; half of these came from Europe. 28 respondents reported that they had databases of audio recordings of speakers. The clearest pattern in the responses was one of diversity. A variety of different approaches to speaker identification were used: the human-supervised-automatic approach was the most popular in North America, the auditory-acoustic-phonetic approach was the most popular in Europe, and the spectrographic/auditory-spectrographic approach was the most popular in Africa, Asia, the Middle East, and South and Central America. Globally, and in Europe, the most popular framework for reporting conclusions was identification/exclusion/inconclusive. In Europe, the second most popular framework was the use of verbal likelihood ratio scales. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Analysis of human scream and its impact on text-independent speaker verification.
Hansen, John H L; Nandwana, Mahesh Kumar; Shokouhi, Navid
2017-04-01
Scream is defined as a sustained, high-energy vocalization that lacks phonological structure. This lack of phonological structure is what distinguishes a scream from other forms of loud vocalization, such as a "yell." This study investigates the acoustic aspects of screams and addresses those that are known to prevent standard speaker identification systems from recognizing the identity of screaming speakers. It is well established that speaker variability due to changes in vocal effort and the Lombard effect contributes to degraded performance in automatic speech systems (i.e., speech recognition, speaker identification, diarization, etc.). However, previous research in the general area of speaker variability has concentrated on human speech production, whereas less is known about non-speech vocalizations. The UT-NonSpeech corpus is developed here to investigate speaker verification from scream samples. This study considers a detailed analysis in terms of fundamental frequency, spectral peak shift, frame energy distribution, and spectral tilt. It is shown that traditional speaker recognition based on the Gaussian mixture model-universal background model framework is unreliable when evaluated with screams.
NASA Astrophysics Data System (ADS)
Kamiński, K.; Dobrowolski, A. P.
2017-04-01
The paper presents the architecture and the results of optimization of selected elements of an Automatic Speaker Recognition (ASR) system that uses Gaussian Mixture Models (GMM) in the classification process. Optimization was performed on the selection of individual features, using a genetic algorithm, and on the parameters of the Gaussian distributions used to describe individual voices. The developed system was tested to evaluate the impact on speaker identification effectiveness of the different compression methods used, among others, in landline, mobile, and VoIP telephony. Results are also presented for the effectiveness of speaker identification at specific levels of noise in the speech signal and in the presence of other disturbances that can occur during phone calls, which makes it possible to specify the range of applications of the presented ASR system.
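The GMM classification step described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual system: each speaker is modelled by a diagonal-covariance Gaussian mixture, and the identified speaker is the one whose model assigns the highest total log-likelihood to the observed feature frames. The toy one-dimensional models below are assumptions for demonstration only.

```python
import math

def log_gauss_diag(x, mean, var):
    # log density of a diagonal-covariance Gaussian at feature vector x
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    # gmm: list of (weight, mean, var) components; frames: feature vectors
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        mx = max(comps)
        # log-sum-exp over mixture components for numerical stability
        total += mx + math.log(sum(math.exp(c - mx) for c in comps))
    return total

def identify(frames, speaker_models):
    # pick the enrolled speaker whose GMM best explains the frames
    return max(speaker_models,
               key=lambda s: gmm_log_likelihood(frames, speaker_models[s]))

# toy single-component, one-dimensional models for two speakers
models = {
    "speaker_A": [(1.0, [0.0], [1.0])],
    "speaker_B": [(1.0, [5.0], [1.0])],
}
```

In a real system each mixture would have many components over MFCC-like feature vectors, and the feature-selection step optimized by the genetic algorithm would decide which dimensions enter `x`.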
Robust Recognition of Loud and Lombard speech in the Fighter Cockpit Environment
1988-08-01
the latter as inter-speaker variability. According to Zue [Z85], inter-speaker variabilities can be attributed to sociolinguistic background, dialect... Journal of the Acoustical Society of America, Vol. 50, 1971. [At74] B. S. Atal, "Linear prediction for speaker identification," Journal of the Acoustical Society of America, Vol. 55, 1974. [B77] B. Beek, E. P. Neuberg, and D. C. Hodge, "An Assessment of the Technology of Automatic Speech Recognition for
Automatic Method of Pause Measurement for Normal and Dysarthric Speech
ERIC Educational Resources Information Center
Rosen, Kristin; Murdoch, Bruce; Folker, Joanne; Vogel, Adam; Cahill, Louise; Delatycki, Martin; Corben, Louise
2010-01-01
This study proposes an automatic method for the detection of pauses and identification of pause types in conversational speech for the purpose of measuring the effects of Friedreich's Ataxia (FRDA) on speech. Speech samples of approximately 3 minutes were recorded from 13 speakers with FRDA and 18 healthy controls. Pauses were measured from the…
Multilevel Analysis in Analyzing Speech Data
ERIC Educational Resources Information Center
Guddattu, Vasudeva; Krishna, Y.
2011-01-01
The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…
An automatic speech recognition system with speaker-independent identification support
NASA Astrophysics Data System (ADS)
Caranica, Alexandru; Burileanu, Corneliu
2015-02-01
The novelty of this work lies in the application of an open-source research software toolkit (CMU Sphinx) to train, build and evaluate a speech recognition system, with speaker-independent support, for voice-controlled hardware applications. Moreover, we propose to use the trained acoustic model to decode offline voice commands on embedded hardware, such as an ARMv6 low-cost SoC, the Raspberry Pi. This type of single-board computer, mainly used for educational and research activities, can serve as a proof-of-concept software and hardware stack for low-cost voice automation systems.
Performance of wavelet analysis and neural networks for pathological voices identification
NASA Astrophysics Data System (ADS)
Salhi, Lotfi; Talbi, Mourad; Abid, Sabeur; Cherif, Adnane
2011-09-01
Within the medical environment, diverse techniques exist to assess the state of a patient's voice. The inspection technique is inconvenient for a number of reasons, such as its high cost, the duration of the inspection, and above all, the fact that it is invasive. This study focuses on a robust, rapid and accurate system for automatic identification of pathological voices. The system employs a non-invasive, inexpensive and fully automated method based on a hybrid approach: wavelet transform analysis and a neural network classifier. First, we present the results obtained in our previous study using classic feature parameters. These results allow visual identification of pathological voices. Second, quantified parameters derived from the wavelet analysis are proposed to characterise the speech sample. In addition, a system of multilayer neural networks (MNNs) has been developed to carry out the automatic detection of pathological voices. The developed method was evaluated using a voice database composed of recorded voice samples (continuous speech) from normophonic or dysphonic speakers. The dysphonic speakers were patients of the National Hospital 'RABTA' of Tunis, Tunisia and a University Hospital in Brussels, Belgium. Experimental results indicate a success rate ranging between 75% and 98.61% for discrimination of normal and pathological voices using the proposed parameters and neural network classifier. We also compared the average classification rates based on the MNN, Gaussian mixture model and support vector machines.
Statistical Evaluation of Biometric Evidence in Forensic Automatic Speaker Recognition
NASA Astrophysics Data System (ADS)
Drygajlo, Andrzej
Forensic speaker recognition is the process of determining if a specific individual (suspected speaker) is the source of a questioned voice recording (trace). This paper aims at presenting forensic automatic speaker recognition (FASR) methods that provide a coherent way of quantifying and presenting recorded voice as biometric evidence. In such methods, the biometric evidence consists of the quantified degree of similarity between speaker-dependent features extracted from the trace and speaker-dependent features extracted from recorded speech of a suspect. The interpretation of recorded voice as evidence in the forensic context presents particular challenges, including within-speaker (within-source) variability and between-speakers (between-sources) variability. Consequently, FASR methods must provide a statistical evaluation which gives the court an indication of the strength of the evidence given the estimated within-source and between-sources variabilities. This paper reports on the first ENFSI evaluation campaign through a fake case, organized by the Netherlands Forensic Institute (NFI), as an example, in which an automatic method using Gaussian mixture models (GMMs) and the Bayesian interpretation (BI) framework was implemented for the forensic speaker recognition task.
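The Bayesian interpretation step described above reduces to a likelihood ratio: how much more probable the observed evidence is if the suspect is the source than if someone from the potential population is. The sketch below is illustrative only; the verbal thresholds are assumptions, not any standard scale.

```python
def likelihood_ratio(p_evidence_same_source, p_evidence_diff_source):
    # LR = p(E | same source) / p(E | different source); in practice the two
    # densities come from the suspect's model and the potential-population model.
    return p_evidence_same_source / p_evidence_diff_source

def verbal_strength(lr):
    # illustrative verbal scale; the cut-offs here are assumed for demonstration
    if lr >= 100:
        return "strong support for the same-source hypothesis"
    if lr > 1:
        return "limited support for the same-source hypothesis"
    if lr == 1:
        return "neutral"
    return "support for the different-source hypothesis"
```

An LR of, say, 300 means the evidence is 300 times more probable under the same-source hypothesis; the court combines this with its prior odds, which the forensic scientist does not supply.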
Speaker-Machine Interaction in Automatic Speech Recognition. Technical Report.
ERIC Educational Resources Information Center
Makhoul, John I.
The feasibility and limitations of speaker adaptation in improving the performance of a "fixed" (speaker-independent) automatic speech recognition system were examined. The recognition system uses a fixed vocabulary of 55 syllables containing 11 stops and fricatives and five tense vowels. The results of an experiment on speaker…
Speaker gender identification based on majority vote classifiers
NASA Astrophysics Data System (ADS)
Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri
2017-03-01
Speaker gender identification is considered among the most important tools in several multimedia applications, namely automatic speech recognition, interactive voice response systems and audio browsing systems. The performance of gender identification systems is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best-performing classification method or searching for the optimum tuning of one classifier's parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch and MFCCs, as well as other temporal and frequency-domain descriptors. Five classification models were evaluated: decision tree, discriminant analysis, naïve Bayes, support vector machine and k-nearest neighbor. The three best-performing classifiers of the five contribute via majority voting over their outputs. Experiments were performed on three datasets spoken in three languages, English, German and Arabic, in order to validate the language independence of the proposed scheme. Results confirm that the presented system reaches a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.
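The final fusion step described above, combining the three best classifiers, is a plain majority vote. A minimal sketch (labels are illustrative):

```python
from collections import Counter

def majority_vote(predictions):
    # predictions: one gender label per classifier (here, the three
    # best-performing models); the most common label wins
    return Counter(predictions).most_common(1)[0][0]
```

With three voters and two classes a tie is impossible, which is one practical reason for fusing an odd number of classifiers.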
Parker, Mark; Cunningham, Stuart; Enderby, Pam; Hawley, Mark; Green, Phil
2006-01-01
The STARDUST project developed robust computer speech recognizers for use by eight people with severe dysarthria and concomitant physical disability to access assistive technologies. Speaker-independent computer speech recognizers trained with normal speech are of limited functional use to those with severe dysarthria due to limited and inconsistent proximity to "normal" articulatory patterns. Severe dysarthric output may also be characterized by a small set of distinguishable phonetic tokens, making the acoustic differentiation of target words difficult. Speaker-dependent computer speech recognition using Hidden Markov Models was achieved by the identification of robust phonetic elements within the individual speaker's output patterns. A new system of speech training using computer-generated visual and auditory feedback reduced the inconsistent production of key phonetic tokens over time.
Shattuck-Hufnagel, S.; Choi, J. Y.; Moro-Velázquez, L.; Gómez-García, J. A.
2017-01-01
Although a large number of acoustic indicators have already been proposed in the literature to evaluate the hypokinetic dysarthria of people with Parkinson's disease, the goal of this work is to identify and interpret new reliable and complementary articulatory biomarkers that could be applied to predict/evaluate Parkinson's disease from a diadochokinetic test, contributing to the possibility of a further multidimensional analysis of the speech of parkinsonian patients. The new biomarkers proposed are based on the kinetic behaviour of the envelope trace, which is directly linked with the articulatory dysfunctions introduced by the disease from the early stages. The interest of these new articulatory indicators lies in their ease of identification and interpretation, and in their potential to be translated into computer-based automatic methods to screen for the disease from speech. Throughout this paper, the accuracy provided by these acoustic kinetic biomarkers is compared with that obtained with a baseline system based on speaker identification techniques. Results show accuracies around 85%, in line with those obtained with complex state-of-the-art speaker recognition techniques, but with an easier physical interpretation, which opens the possibility of transfer to a clinical setting. PMID:29240814
Botti, F; Alexander, A; Drygajlo, A
2004-12-02
This paper deals with a procedure to compensate for mismatched recording conditions in forensic speaker recognition using statistical score normalization. Bayesian interpretation of the evidence in forensic automatic speaker recognition depends on three sets of recordings in order to perform forensic casework: reference (R) and control (C) recordings of the suspect and a potential population database (P), as well as a questioned recording (QR). The requirement of similar recording conditions between the suspect control database (C) and the questioned recording (QR) is often not satisfied in real forensic cases. The aim of this paper is to investigate a score normalization procedure, based on an adaptation of the Test-normalization (T-norm) [2] technique used in the speaker verification domain, to compensate for the mismatch. The Polyphone IPSC-02 database and ASPIC (an automatic speaker recognition system developed by EPFL and IPS-UNIL in Lausanne, Switzerland) were used to test the normalization procedure. Experimental results for three different recording-condition scenarios are presented using Tippett plots, and the effect of the compensation on the evaluation of the strength of the evidence is discussed.
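The core of T-norm, as adapted above, is to score the questioned recording against a cohort of impostor models and standardize the target score by the cohort statistics, so that a condition-induced shift in raw scores cancels out. A minimal sketch (cohort values are toy numbers):

```python
import statistics

def t_norm(raw_score, cohort_scores):
    # cohort_scores: scores of the same test utterance against a cohort of
    # impostor models; standardizing by their mean and standard deviation
    # compensates for score shifts caused by the recording conditions
    mu = statistics.mean(cohort_scores)
    sigma = statistics.stdev(cohort_scores)
    return (raw_score - mu) / sigma
```

After normalization, scores from differently recorded test segments live on a comparable scale, which is what makes a single decision threshold (or a single Tippett-plot comparison) meaningful across conditions.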
Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers
NASA Astrophysics Data System (ADS)
Caballero Morales, Santiago Omar; Cox, Stephen J.
2009-12-01
Dysarthria is a motor speech disorder characterized by weakness, paralysis, or poor coordination of the muscles responsible for speech. Although automatic speech recognition (ASR) systems have been developed for disordered speech, factors such as low intelligibility and limited phonemic repertoire decrease speech recognition accuracy, making conventional speaker adaptation algorithms perform poorly on dysarthric speakers. In this work, rather than adapting the acoustic models, we model the errors made by the speaker and attempt to correct them. For this task, two techniques have been developed: (1) a set of "metamodels" that incorporate a model of the speaker's phonetic confusion matrix into the ASR process; (2) a cascade of weighted finite-state transducers at the confusion matrix, word, and language levels. Both techniques attempt to correct the errors made at the phonetic level and make use of a language model to find the best estimate of the correct word sequence. Our experiments show that both techniques outperform standard adaptation techniques.
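The metamodel idea described above can be sketched as a Bayes decision over a phone confusion matrix: given the phone the recognizer emitted, pick the phone the speaker most probably intended. This toy sketch is an assumption-laden simplification of the paper's technique (the real systems work over sequences with a language model):

```python
def correct_phone(observed, confusion, prior):
    # confusion[(intended, observed)] = P(recognizer outputs `observed` when
    # the speaker intended `intended`), estimated from that speaker's data.
    # Bayes rule (up to normalization): maximize P(observed|intended)*P(intended)
    return max(prior, key=lambda p: confusion.get((p, observed), 0.0) * prior[p])

# toy matrix: this speaker's /t/ is often recognized as "d"
confusion = {("t", "d"): 0.4, ("t", "t"): 0.6,
             ("d", "d"): 0.6, ("d", "t"): 0.4}
```

The language-model step in both of the paper's techniques plays the role of the prior here, but over word sequences rather than isolated phones.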
Unsupervised real-time speaker identification for daily movies
NASA Astrophysics Data System (ADS)
Li, Ying; Kuo, C.-C. Jay
2002-07-01
The problem of identifying speakers for movie content analysis is addressed in this paper. While most previous work on speaker identification was carried out in a supervised mode using pure audio data, more robust results can be obtained in real time by integrating knowledge from multiple media sources in an unsupervised mode. In this work, both audio and visual cues are employed and subsequently combined in a probabilistic framework to identify speakers. In particular, audio information is used to identify speakers with a maximum likelihood (ML)-based approach, while visual information is used to distinguish speakers by detecting and recognizing their talking faces based on face detection/recognition and mouth tracking techniques. Moreover, to accommodate speakers' acoustic variation over time, we update their models on the fly by adapting to their newly contributed speech data. Encouraging results have been achieved through extensive experiments, which shows the promise of the proposed audiovisual unsupervised speaker identification system.
Automatic Intention Recognition in Conversation Processing
ERIC Educational Resources Information Center
Holtgraves, Thomas
2008-01-01
A fundamental assumption of many theories of conversation is that comprehension of a speaker's utterance involves recognition of the speaker's intention in producing that remark. However, the nature of intention recognition is not clear. One approach is to conceptualize a speaker's intention in terms of speech acts [Searle, J. (1969). "Speech…
Speaker Clustering for a Mixture of Singing and Reading (Preprint)
2012-03-01
diarization [2, 3], which answers the question of "who spoke when?", is a combination of speaker segmentation and clustering. Although it is possible to...focuses on speaker clustering, the techniques developed here can be applied to speaker diarization. For the remainder of this paper, the term "speech...and retrieval," Proceedings of the IEEE, vol. 88, 2000. [2] S. Tranter and D. Reynolds, "An overview of automatic speaker diarization systems," IEEE
Early Detection of Severe Apnoea through Voice Analysis and Automatic Speaker Recognition Techniques
NASA Astrophysics Data System (ADS)
Fernández, Ruben; Blanco, Jose Luis; Díaz, David; Hernández, Luis A.; López, Eduardo; Alcázar, José
This study is part of an on-going collaborative effort between the medical and the signal processing communities to promote research on applying voice analysis and Automatic Speaker Recognition techniques (ASR) for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based diagnosis could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we present and discuss the possibilities of using generative Gaussian Mixture Models (GMMs), generally used in ASR systems, to model distinctive apnoea voice characteristics (i.e. abnormal nasalization). Finally, we present experimental findings regarding the discriminative power of speaker recognition techniques applied to severe apnoea detection. We have achieved an 81.25% correct classification rate, which is very promising and underpins the interest in this line of inquiry.
You had me at "Hello": Rapid extraction of dialect information from spoken words.
Scharinger, Mathias; Monahan, Philip J; Idsardi, William J
2011-06-15
Research on the neuronal underpinnings of speaker identity recognition has identified voice-selective areas in the human brain with evolutionary homologues in non-human primates, who have comparable areas for processing species-specific calls. Most studies have focused on estimating the extent and location of these areas. In contrast, relatively few experiments have investigated the time-course of speaker identity, and in particular, dialect processing and identification by electro- or neuromagnetic means. We show here that dialect extraction occurs speaker-independently, pre-attentively and categorically. We used Standard American English and African-American English exemplars of 'Hello' in a magnetoencephalographic (MEG) Mismatch Negativity (MMN) experiment. The MMN, as an automatic change detection response of the brain, reflected dialect differences that were not entirely reducible to acoustic differences between the pronunciations of 'Hello'. Source analyses of the M100, an auditory evoked response to the vowels, suggested additional processing in voice-selective areas whenever a dialect change was detected. These findings are not only relevant for the cognitive neuroscience of language, but also for the social sciences concerned with dialect and race perception. Copyright © 2011 Elsevier Inc. All rights reserved.
Advancements in robust algorithm formulation for speaker identification of whispered speech
NASA Astrophysics Data System (ADS)
Fan, Xing
Whispered speech is an alternative speech production mode to neutral speech, used by talkers intentionally in natural conversational scenarios to protect privacy and to avoid certain content being overheard/made public. Due to the profound differences between whispered and neutral speech in production mechanism and the absence of whispered adaptation data, the performance of speaker identification systems trained on neutral speech degrades significantly. This dissertation therefore focuses on developing a robust closed-set speaker recognition system for whispered speech using no or limited whispered adaptation data from non-target speakers. This dissertation proposes the concept of "high"/"low" performance whispered data for the purpose of speaker identification. A variety of acoustic properties are identified that contribute to the quality of whispered data. An acoustic analysis is also conducted to compare the phoneme/speaker dependency of the differences between whispered and neutral data in the feature domain. The observations from this acoustic analysis are new in this area and serve as guidance for developing robust speaker identification systems for whispered speech. This dissertation further proposes two systems for speaker identification of whispered speech. One system focuses on front-end processing. A two-dimensional feature space is proposed to search for "low"-quality whispered utterances, and separate feature mapping functions are applied to vowels and consonants respectively in order to retain the speaker information shared between whispered and neutral speech. The other system focuses on speech-mode-independent model training. The proposed method generates pseudo-whispered features from neutral features by using the statistical information contained in a whispered Universal Background Model (UBM) trained on additional whispered data collected from non-target speakers.
Four modeling methods are proposed for the transformation estimation used to generate the pseudo-whispered features. Both systems demonstrate a significant improvement over the baseline system on the evaluation data. This dissertation has therefore contributed a scientific understanding of the differences between whispered and neutral speech, as well as improved front-end processing and modeling methods for speaker identification of whispered speech. Such advancements will ultimately help improve the robustness of speech processing systems.
Open-set speaker identification with diverse-duration speech data
NASA Astrophysics Data System (ADS)
Karadaghi, Rawande; Hertlein, Heinz; Ariyaeeinia, Aladdin
2015-05-01
This paper is concerned with an important category of applications of open-set speaker identification in criminal investigation, which involves operating with speech of short and varied duration. The study presents investigations into the adverse effects of such an operating condition on the accuracy of open-set speaker identification, based on both GMM-UBM and i-vector approaches. The experiments are conducted using a protocol developed for the identification task, based on the NIST speaker recognition evaluation corpus of 2008. In order to closely cover the real-world operating conditions in the considered application area, the study includes experiments with various combinations of training and testing data duration. The paper details the characteristics of the experimental investigations conducted and provides a thorough analysis of the results obtained.
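Open-set identification, as used above, adds one step to the closed-set task: after picking the best-matching enrolled speaker, the system may still reject the match. A minimal sketch (score values and threshold are illustrative):

```python
def open_set_identify(scores, threshold):
    # scores: similarity of the test utterance to each enrolled speaker
    # (e.g. a GMM-UBM log-likelihood ratio or an i-vector cosine score).
    # Closed-set step: pick the best match; open-set step: return "unknown"
    # when even the best score falls below the rejection threshold.
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"
```

Short test utterances make both steps harder: scores become noisier, so the best match is less often correct and the threshold separates targets from non-targets less cleanly, which is exactly the degradation the paper quantifies.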
Kharlamov, Viktor; Campbell, Kenneth; Kazanina, Nina
2011-11-01
Speech sounds are not always perceived in accordance with their acoustic-phonetic content. For example, an early and automatic process of perceptual repair, which ensures conformity of speech inputs to the listener's native language phonology, applies to individual input segments that do not exist in the native inventory or to sound sequences that are illicit according to the native phonotactic restrictions on sound co-occurrences. The present study with Russian and Canadian English speakers shows that listeners may perceive phonetically distinct and licit sound sequences as equivalent when the native language system provides robust evidence for mapping multiple phonetic forms onto a single phonological representation. In Russian, due to an optional but productive t-deletion process that affects /stn/ clusters, the surface forms [sn] and [stn] may be phonologically equivalent and map to a single phonological form /stn/. In contrast, [sn] and [stn] clusters are usually phonologically distinct in (Canadian) English. Behavioral data from identification and discrimination tasks indicated that [sn] and [stn] clusters were more confusable for Russian than for English speakers. The EEG experiment employed an oddball paradigm with the nonwords [asna] and [astna] used as the standard and deviant stimuli. A reliable mismatch negativity response was elicited approximately 100 msec post-change in the English group but not in the Russian group. These findings point to a perceptual repair mechanism that is engaged automatically at a prelexical level to ensure immediate encoding of speech inputs in phonological terms, which in turn enables efficient access to the meaning of a spoken utterance.
NASA Astrophysics Data System (ADS)
Morishima, Shigeo; Nakamura, Satoshi
2004-12-01
We introduce a multimodal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion by synchronizing it to the translated speech. This system also introduces both a face synthesis technique that can generate any viseme lip shape and a face tracking technique that can estimate the original position and rotation of a speaker's face in an image sequence. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a 3D wire-frame model that is adaptable to any speaker. Our approach provides translated image synthesis with an extremely small database. The tracking motion of the face from a video image is performed by template matching. In this system, the translation and rotation of the face are detected by using a 3D personal face model whose texture is captured from a video frame. We also propose a method to customize the personal face model by using our GUI tool. By combining these techniques and the translated voice synthesis technique, an automatic multimodal translation can be achieved that is suitable for video mail or automatic dubbing systems into other languages.
NASA Astrophysics Data System (ADS)
Tovarek, Jaromir; Partila, Pavol
2017-05-01
This article discusses speaker identification for the improvement of secure communication between law enforcement units. The main task of this research was to develop a text-independent speaker identification system that can be used for real-time recognition. The system is designed for identification in the open set, meaning that the unknown speaker can be anyone. Communication itself is secured, but we have to check the authorization of the communicating parties, i.e., decide whether the unknown speaker is authorized for the given action. The calls are recorded by an IP telephony server and these recordings are then evaluated using classification. If the system decides that the speaker is not authorized, it sends a warning message to the administrator. This message can indicate, for example, a stolen phone or another unusual situation. The administrator then performs the appropriate actions. Our proposed system uses a multilayer neural network for classification, consisting of three layers (input layer, hidden layer, and output layer). The number of neurons in the input layer corresponds to the length of the speech features, and the output layer represents the classified speakers. The artificial neural network classifies the speech signal frame by frame, but the final decision is made over the complete record. This rule substantially increases the accuracy of the classification. The input data for the neural network are thirteen Mel-frequency cepstral coefficients, which describe the behavior of the vocal tract. These parameters are the most widely used for speaker recognition. Parameters for training, testing and validation were extracted from recordings of authorized users. Recording conditions for the training data correspond to the real traffic of the system (sampling frequency, bit rate). The main benefit of the research is a system for text-independent speaker identification applied to secure communication between law enforcement units.
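The frame-to-record decision rule described above can be sketched by averaging the network's per-frame posteriors before deciding. This is one plausible reading of "final decision over the complete record" (the exact aggregation used by the authors is not specified), with toy posterior values for illustration:

```python
def record_decision(frame_posteriors, speakers):
    # frame_posteriors: per-frame posterior vectors from the network,
    # one probability per enrolled speaker; averaging over the whole
    # recording smooths out noisy individual frames before deciding
    n = len(frame_posteriors)
    avg = [sum(fp[i] for fp in frame_posteriors) / n
           for i in range(len(speakers))]
    return speakers[avg.index(max(avg))]
```

Deciding once per record rather than per frame is what makes the single noisy frame a minority voice instead of a misclassification.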
Optimization of multilayer neural network parameters for speaker recognition
NASA Astrophysics Data System (ADS)
Tovarek, Jaromir; Partila, Pavol; Rozhon, Jan; Voznak, Miroslav; Skapa, Jan; Uhrin, Dominik; Chmelikova, Zdenka
2016-05-01
This article discusses the impact of multilayer neural network parameters on speaker identification. The main task of speaker identification is to find a specific person in a known set of speakers, i.e., to determine which of the reference speakers in the voice database the voice of an unknown (wanted) person belongs to. One of the requirements was to develop a text-independent system, which means classifying the wanted person regardless of content and language. A multilayer neural network was used for speaker identification in this research. An artificial neural network (ANN) requires parameters to be set, such as the activation function of the neurons, the steepness of the activation functions, the learning rate, the maximum number of iterations, and the numbers of neurons in the hidden and output layers. ANN accuracy and validation time are directly influenced by these parameter settings, and different tasks require different settings. Identification accuracy and ANN validation time were evaluated with the same input data but different parameter settings. The goal was to find the parameters giving the neural network the highest precision and shortest validation time. The input data of the neural network are Mel-frequency cepstral coefficients (MFCCs), which describe the properties of the vocal tract. Audio samples were recorded for all speakers in a laboratory environment. The data were split into training, testing and validation sets of 70%, 15% and 15%. The result of the research described in this article is a parameter setting for the multilayer neural network for four speakers.
Integrating hidden Markov model and PRAAT: a toolbox for robust automatic speech transcription
NASA Astrophysics Data System (ADS)
Kabir, A.; Barker, J.; Giurgiu, M.
2010-09-01
A toolbox for automatic time-aligned phone transcription of English speech corpora has been developed. The toolbox is particularly useful for generating robust automatic transcriptions and can produce phone-level transcriptions using both speaker-independent and speaker-dependent models without manual intervention. The system is based on the standard hidden Markov model (HMM) approach and was successfully tested on a large audiovisual speech corpus, the GRID corpus. One of the most powerful features of the toolbox is its flexibility: the speech community can import automatic transcriptions generated by the HMM Toolkit (HTK) into the popular transcription software PRAAT, and vice versa. The toolbox was evaluated through statistical analysis on GRID data, which shows that the automatic transcription deviates from manual transcription by an average of 20 ms.
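A minimal sketch of the HTK-to-PRAAT direction described above, assuming the standard HTK label format (start/end times in 100-nanosecond units) and the plain-text TextGrid layout; error handling and multi-tier support are omitted, and the label lines are invented examples.

```python
def htk_to_textgrid(htk_lines, tier="phones"):
    """Convert HTK-style label lines ('start end label') to a one-tier TextGrid."""
    intervals = []
    for line in htk_lines:
        start, end, label = line.split()[:3]
        intervals.append((int(start) / 1e7, int(end) / 1e7, label))  # 100 ns -> s
    xmax = intervals[-1][1]
    out = ['File type = "ooTextFile"', 'Object class = "TextGrid"', "",
           "xmin = 0", f"xmax = {xmax}", "tiers? <exists>", "size = 1", "item []:",
           "    item [1]:", '        class = "IntervalTier"',
           f'        name = "{tier}"', "        xmin = 0", f"        xmax = {xmax}",
           f"        intervals: size = {len(intervals)}"]
    for i, (t0, t1, lab) in enumerate(intervals, 1):
        out += [f"        intervals [{i}]:", f"            xmin = {t0}",
                f"            xmax = {t1}", f'            text = "{lab}"']
    return "\n".join(out)

lab = ["0 2000000 sil", "2000000 5500000 b", "5500000 9000000 ih"]
textgrid = htk_to_textgrid(lab)
```

The reverse direction (TextGrid to HTK labels) is just the inverse parse, multiplying interval times by 1e7 and rounding to integers.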
Evaluation of speaker de-identification based on voice gender and age conversion
NASA Astrophysics Data System (ADS)
Přibil, Jiří; Přibilová, Anna; Matoušek, Jindřich
2018-03-01
Two basic tasks are covered in this paper. The first consists in the design and practical testing of a new method for voice de-identification that changes the apparent age and/or gender of a speaker by multi-segmental frequency scale transformation combined with prosody modification. The second task is aimed at verifying the applicability of a classifier based on Gaussian mixture models (GMM) to detect the original Czech and Slovak speakers after voice de-identification has been applied. The performed experiments confirm the functionality of the developed gender and age conversion for all selected types of de-identification, which can be objectively evaluated by the GMM-based open-set classifier. Original-speaker detection accuracy was also compared for sentences uttered by German and English speakers, showing the language independence of the proposed method.
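The GMM-based open-set classification used for evaluation above can be illustrated schematically. In this sketch, single diagonal Gaussians with a hypothetical score threshold stand in for the per-speaker GMMs; the threshold value and feature dimensions are invented.

```python
import numpy as np

rng = np.random.default_rng(2)

# Placeholder "models": one mean vector per enrolled speaker (a real system
# would use full GMMs trained on each speaker's features).
means = rng.normal(size=(3, 13))  # 3 enrolled speakers, 13-dim features

def avg_loglik(frames, mean):
    # Average per-frame log-likelihood under a unit-variance Gaussian (up to a constant).
    return float(np.mean(-0.5 * np.sum((frames - mean) ** 2, axis=1)))

def open_set_identify(frames, threshold=-10.0):
    # Open-set rule: pick the best-scoring speaker, but reject the decision
    # (return None = unknown) if the score does not clear the threshold.
    scores = [avg_loglik(frames, m) for m in means]
    best = int(np.argmax(scores))
    return best if scores[best] > threshold else None

known = means[1] + 0.1 * rng.normal(size=(50, 13))    # utterance close to speaker 1
unknown = 10.0 + rng.normal(size=(50, 13))            # utterance far from all models
```

The rejection branch is what makes the classifier open-set: a successfully de-identified voice should fall below the threshold for its original speaker's model.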
NASA Astrophysics Data System (ADS)
Karam, Walid; Mokbel, Chafic; Greige, Hanna; Chollet, Gerard
2006-05-01
A GMM-based audiovisual speaker verification system is described, and an Active Appearance Model with a linear speaker transformation system is used to evaluate the robustness of the verification. An Active Appearance Model (AAM) is used to automatically locate and track a speaker's face in a video recording, and a Gaussian mixture model (GMM) based classifier (BECARS) is used for face verification. GMM training and testing are performed on DCT-based features extracted from the detected faces. On the audio side, speech features are extracted and used for speaker verification with the GMM-based classifier. Fusion of the audio and video modalities for audiovisual speaker verification is compared with face-only and speaker-only verification systems. To improve the robustness of the multimodal biometric identity verification system, an audiovisual imposture system is envisioned: an automatic voice transformation technique that an impostor may use to assume the identity of an authorized client. Features of the transformed voice are then combined with the corresponding appearance features and fed into the GMM-based system BECARS for training. An attempt is made to increase the acceptance rate of the impostor and to analyze the robustness of the verification system. Experiments are being conducted on the BANCA database, with the prospect of experimenting on the newly developed PDAtabase created within the scope of the SecurePhone project.
Discriminative analysis of lip motion features for speaker identification and speech-reading.
Cetingül, H Ertan; Yemez, Yücel; Erzin, Engin; Tekalp, A Murat
2006-10-01
There have been several studies that jointly use audio, lip intensity, and lip geometry information for speaker identification and speech-reading applications. This paper proposes using explicit lip motion information, instead of or in addition to lip intensity and/or geometry information, for speaker identification and speech-reading within a unified feature selection and discrimination analysis framework, and addresses two important issues: 1) Is using explicit lip motion information useful, and, 2) if so, what are the best lip motion features for these two applications? The best lip motion features for speaker identification are considered to be those that result in the highest discrimination of individual speakers in a population, whereas for speech-reading, the best features are those providing the highest phoneme/word/phrase recognition rate. Several lip motion feature candidates have been considered, including dense motion features within a bounding box about the lip, lip contour motion features, and combinations of these with lip shape features. Furthermore, a novel two-stage spatial and temporal discrimination analysis is introduced to select the best lip motion features for speaker identification and speech-reading applications. Experimental results using a hidden-Markov-model-based recognition system indicate that using explicit lip motion information provides additional performance gains in both applications, and lip motion features prove more valuable in the speech-reading application.
Second Language Learners and Speech Act Comprehension
ERIC Educational Resources Information Center
Holtgraves, Thomas
2007-01-01
Recognizing the specific speech act ( Searle, 1969) that a speaker performs with an utterance is a fundamental feature of pragmatic competence. Past research has demonstrated that native speakers of English automatically recognize speech acts when they comprehend utterances (Holtgraves & Ashley, 2001). The present research examined whether this…
Fuhrman, Orly; Boroditsky, Lera
2010-11-01
Across cultures people construct spatial representations of time. However, the particular spatial layouts created to represent time may differ across cultures. This paper examines whether people automatically access and use culturally specific spatial representations when reasoning about time. In Experiment 1, we asked Hebrew and English speakers to arrange pictures depicting temporal sequences of natural events, and to point to the hypothesized location of events relative to a reference point. In both tasks, English speakers (who read left to right) arranged temporal sequences to progress from left to right, whereas Hebrew speakers (who read right to left) arranged them from right to left, replicating previous work. In Experiments 2 and 3, we asked the participants to make rapid temporal order judgments about pairs of pictures presented one after the other (i.e., to decide whether the second picture showed a conceptually earlier or later time-point of an event than the first picture). Participants made responses using two adjacent keyboard keys. English speakers were faster to make "earlier" judgments when the "earlier" response needed to be made with the left response key than with the right response key. Hebrew speakers showed exactly the reverse pattern. Asking participants to use a space-time mapping inconsistent with the one suggested by writing direction in their language created interference, suggesting that participants were automatically creating writing-direction consistent spatial representations in the course of their normal temporal reasoning. It appears that people automatically access culturally specific spatial representations when making temporal judgments even in nonlinguistic tasks. Copyright © 2010 Cognitive Science Society, Inc.
Parametric Representation of the Speaker's Lips for Multimodal Sign Language and Speech Recognition
NASA Astrophysics Data System (ADS)
Ryumin, D.; Karpov, A. A.
2017-05-01
In this article, we propose a new method for parametric representation of the human lips region. The functional diagram of the method is described, and implementation details with an explanation of its key stages and features are given. The results of automatic detection of the regions of interest are illustrated, and the processing speed of the method on several computers with different performance levels is reported. This universal method allows applying the parametric representation of the speaker's lips to the tasks of biometrics, computer vision, machine learning, and automatic recognition of faces, elements of sign languages, and audio-visual speech, including lip-reading.
Vasconcelos, Maria J M; Ventura, Sandra M R; Freitas, Diamantino R S; Tavares, João Manuel R S
2012-03-01
The morphological and dynamic characterisation of the vocal tract during speech production has been gaining greater attention due to the latest improvements in magnetic resonance (MR) imaging, namely the use of higher magnetic fields such as 3.0 Tesla. In this work, the automatic study of the vocal tract from 3.0 Tesla MR images was assessed through the application of statistical deformable models. The primary goal was the analysis of the shape of the vocal tract during the articulation of European Portuguese sounds, followed by the evaluation of the results concerning automatic segmentation, i.e. identification of the vocal tract in new MR images. As far as speech production is concerned, this is the first attempt to automatically characterise and reconstruct the vocal tract shape from 3.0 Tesla MR images using deformable models; in particular, active shape and appearance models. The achieved results clearly evidence the adequacy and advantage of these deformable models for the automatic analysis of 3.0 Tesla MR images in order to extract the vocal tract shape and assess the involved articulatory movements. Such capabilities are needed, for example, for a better understanding of speech production, particularly in patients suffering from articulatory disorders, and for building enhanced speech synthesizer models.
ERIC Educational Resources Information Center
Young, Victoria; Mihailidis, Alex
2010-01-01
Despite their growing presence in home computer applications and various telephony services, commercial automatic speech recognition technologies are still not easily employed by everyone; especially individuals with speech disorders. In addition, relatively little research has been conducted on automatic speech recognition performance with older…
Noise Reduction with Microphone Arrays for Speaker Identification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cohen, Z
Reducing acoustic noise in audio recordings is an ongoing problem that plagues many applications. This noise is hard to reduce because of interfering sources and the non-stationary behavior of the overall background noise. Many single-channel noise reduction algorithms exist but are limited in that the more the noise is reduced, the more the signal of interest is distorted, because the signal and noise overlap in frequency. Acoustic background noise causes particular problems in the area of speaker identification: recording a speaker in the presence of acoustic noise ultimately limits the performance and confidence of speaker identification algorithms. In situations where it is impossible to control the environment where the speech sample is taken, noise reduction filtering algorithms need to be developed to clean the recorded speech of background noise. Because single-channel noise reduction algorithms would distort the speech signal, the overall challenge of this project was to see whether spatial information provided by microphone arrays could be exploited to aid in speaker identification. The goals are: (1) test the feasibility of using microphone arrays to reduce background noise in speech recordings; (2) characterize and compare different multichannel noise reduction algorithms; (3) provide recommendations for using these multichannel algorithms; and (4) ultimately answer the question: can the use of microphone arrays aid in speaker identification?
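The core idea of exploiting spatial information can be illustrated with a delay-and-sum beamformer, the simplest multichannel noise reduction approach; the array geometry, sample delays, and signals below are synthetic, not from the project.

```python
import numpy as np

rng = np.random.default_rng(3)
fs = 8000
t = np.arange(4096) / fs
clean = np.sin(2 * np.pi * 440 * t)  # stand-in for the speech signal of interest

# Simulate a 4-microphone array: each channel receives the source with a
# known integer-sample steering delay plus independent background noise.
delays = [0, 3, 5, 8]
channels = [np.roll(clean, d) + 0.5 * rng.normal(size=len(t)) for d in delays]

# Delay-and-sum: undo the steering delays, then average. The coherent speech
# adds constructively while the uncorrelated noise partially cancels.
beamformed = np.mean([np.roll(ch, -d) for ch, d in zip(channels, delays)], axis=0)

def snr_db(sig):
    noise = sig - clean
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))
```

Averaging M channels with independent noise reduces the noise power by a factor of M without distorting the aligned signal, which is exactly the advantage over single-channel filtering noted in the abstract.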
Using Avatars for Improving Speaker Identification in Captioning
NASA Astrophysics Data System (ADS)
Vy, Quoc V.; Fels, Deborah I.
Captioning is the main method for accessing television and film content by people who are deaf or hard-of-hearing. One major difficulty consistently identified by the community is that of knowing who is speaking, particularly for an off-screen narrator. A captioning system was created using a participatory design method to improve speaker identification. The final prototype contained avatars and a coloured border for identifying specific speakers. Evaluation results were very positive; however, participants also wanted to customize various components such as caption and avatar location.
Speakers of Different Languages Process the Visual World Differently
Chabal, Sarah; Marian, Viorica
2015-01-01
Language and vision are highly interactive. Here we show that people activate language when they perceive the visual world, and that this language information impacts how speakers of different languages focus their attention. For example, when searching for an item (e.g., clock) in the same visual display, English and Spanish speakers look at different objects. Whereas English speakers searching for the clock also look at a cloud, Spanish speakers searching for the clock also look at a gift, because the Spanish names for gift (regalo) and clock (reloj) overlap phonologically. These different looking patterns emerge despite an absence of direct linguistic input, showing that language is automatically activated by visual scene processing. We conclude that the varying linguistic information available to speakers of different languages affects visual perception, leading to differences in how the visual world is processed. PMID:26030171
Automatic detection of articulation disorders in children with cleft lip and palate.
Maier, Andreas; Hönig, Florian; Bocklet, Tobias; Nöth, Elmar; Stelzle, Florian; Nkenke, Emeka; Schuster, Maria
2009-11-01
Speech of children with cleft lip and palate (CLP) is sometimes still disordered even after adequate surgical and nonsurgical therapies. Such speech shows complex articulation disorders, which are usually assessed perceptually, consuming time and manpower. Hence, there is a need for an easy-to-apply and reliable automatic method. To create a reference for an automatic system, speech data of 58 children with CLP were assessed perceptually by experienced speech therapists for characteristic phonetic disorders at the phoneme level. The first part of the article aims to detect such characteristics by a semiautomatic procedure, and the second to evaluate a fully automatic, thus simple, procedure. The methods are based on a combination of speech processing algorithms. The semiautomatic method achieves moderate to good agreement (kappa approximately 0.6) for the detection of all phonetic disorders. At the speaker level, significant correlations of 0.89 between the perceptual evaluation and the automatic system are obtained. The fully automatic system yields a correlation at the speaker level of 0.81 with the perceptual evaluation. This correlation is in the range of the inter-rater correlation of the listeners. The automatic speech evaluation is able to detect phonetic disorders at an expert level without any additional human postprocessing.
A Limited-Vocabulary, Multi-Speaker Automatic Isolated Word Recognition System.
ERIC Educational Resources Information Center
Paul, James E., Jr.
Techniques for automatic recognition of isolated words are investigated, and a computer simulation of a word recognition system is effected. Considered in detail are data acquisition and digitizing, word detection, amplitude and time normalization, short-time spectral estimation including spectral windowing, spectral envelope approximation,…
Ma, Joan K-Y; Whitehill, Tara L; So, Susanne Y-S
2010-08-01
Speech produced by individuals with hypokinetic dysarthria associated with Parkinson's disease (PD) is characterized by a number of features including impaired speech prosody. The purpose of this study was to investigate intonation contrasts produced by this group of speakers. Speech materials with a question-statement contrast were collected from 14 Cantonese speakers with PD. Twenty listeners then classified the productions as either questions or statements. Acoustic analyses of F0, duration, and intensity were conducted to determine which acoustic cues distinguished the production of questions from statements, and which cues appeared to be exploited by listeners in identifying intonational contrasts. The results show that listeners identified statements with a high degree of accuracy, but the accuracy of question identification ranged from 0.56% to 96% across the 14 speakers. The speakers with PD used similar acoustic cues as nondysarthric Cantonese speakers to mark the question-statement contrast, although the contrasts were not observed in all speakers. Listeners mainly used F0 cues at the final syllable for intonation identification. These data contribute to the researchers' understanding of intonation marking in speakers with PD, with specific application to the production and perception of intonation in a lexical tone language.
Speakers of different languages process the visual world differently.
Chabal, Sarah; Marian, Viorica
2015-06-01
Language and vision are highly interactive. Here we show that people activate language when they perceive the visual world, and that this language information impacts how speakers of different languages focus their attention. For example, when searching for an item (e.g., clock) in the same visual display, English and Spanish speakers look at different objects. Whereas English speakers searching for the clock also look at a cloud, Spanish speakers searching for the clock also look at a gift, because the Spanish names for gift (regalo) and clock (reloj) overlap phonologically. These different looking patterns emerge despite an absence of direct language input, showing that linguistic information is automatically activated by visual scene processing. We conclude that the varying linguistic information available to speakers of different languages affects visual perception, leading to differences in how the visual world is processed. (c) 2015 APA, all rights reserved.
Speech endpoint detection with non-language speech sounds for generic speech processing applications
NASA Astrophysics Data System (ADS)
McClain, Matthew; Romanowski, Brian
2009-05-01
Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS, such as filled pauses, will require future research.
ERIC Educational Resources Information Center
Yook, Cheongmin; Lindemann, Stephanie
2013-01-01
This study investigates how the attitudes of 60 Korean university students towards five varieties of English are affected by the identification of the speaker's nationality and ethnicity. The study employed both a verbal guise technique and questions eliciting overt beliefs and preferences related to learning English. While the majority of the…
Landwehr, Markus; Fürstenberg, Dirk; Walger, Martin; von Wedel, Hasso; Meister, Hartmut
2014-01-01
Advances in speech coding strategies and electrode array designs for cochlear implants (CIs) predominantly aim at improving speech perception. Current efforts are also directed at transmitting appropriate cues of the fundamental frequency (F0) to the auditory nerve with respect to speech quality, prosody, and music perception. The aim of this study was to examine the effects of various electrode configurations and coding strategies on speech intonation identification, speaker gender identification, and music quality rating. In six MED-EL CI users electrodes were selectively deactivated in order to simulate different insertion depths and inter-electrode distances when using the high definition continuous interleaved sampling (HDCIS) and fine structure processing (FSP) speech coding strategies. Identification of intonation and speaker gender was determined and music quality rating was assessed. For intonation identification HDCIS was robust against the different electrode configurations, whereas fine structure processing showed significantly worse results when a short electrode depth was simulated. In contrast, speaker gender recognition was not affected by electrode configuration or speech coding strategy. Music quality rating was sensitive to electrode configuration. In conclusion, the three experiments revealed different outcomes, even though they all addressed the reception of F0 cues. Rapid changes in F0, as seen with intonation, were the most sensitive to electrode configurations and coding strategies. In contrast, electrode configurations and coding strategies did not show large effects when F0 information was available over a longer time period, as seen with speaker gender. Music quality relies on additional spectral cues other than F0, and was poorest when a shallow insertion was simulated.
Hantke, Simone; Weninger, Felix; Kurle, Richard; Ringeval, Fabien; Batliner, Anton; Mousa, Amr El-Desoky; Schuller, Björn
2016-01-01
We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i.e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start with demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features as well as higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i.e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, which reaches up to 62.3% average recall for multi-way classification of the eating condition, i.e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to 56.2% determination coefficient. PMID:27176486
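The leave-one-speaker-out protocol used above can be sketched in a few lines; the speaker IDs and sample values here are made up, and the classifier itself is omitted.

```python
def loso_folds(samples):
    """Leave-one-speaker-out: each fold holds out all data from one speaker.

    `samples` is a list of (speaker_id, feature) pairs; the classifier in each
    fold never sees any data from the held-out test speaker.
    """
    speakers = sorted({spk for spk, _ in samples})
    for held_out in speakers:
        train = [x for spk, x in samples if spk != held_out]
        test = [x for spk, x in samples if spk == held_out]
        yield held_out, train, test

# Toy data: three hypothetical speakers with a scalar feature each.
data = [("s1", 0.2), ("s1", 0.4), ("s2", 0.9), ("s3", 0.1), ("s3", 0.3)]
folds = list(loso_folds(data))
```

Holding out whole speakers (rather than random utterances) prevents the classifier from exploiting speaker identity, so the reported recalls reflect generalization to unseen speakers.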
Shibboleth: An Automated Foreign Accent Identification Program
ERIC Educational Resources Information Center
Frost, Wende
2013-01-01
The speech of non-native (L2) speakers of a language contains phonological rules that differentiate them from native speakers. These phonological rules characterize or distinguish accents in an L2. The Shibboleth program creates combinatorial rule-sets to describe the phonological pattern of these accents and classifies L2 speakers into their…
Single-Word Intelligibility in Speakers with Repaired Cleft Palate
ERIC Educational Resources Information Center
Whitehill, Tara; Chau, Cynthia
2004-01-01
Many speakers with repaired cleft palate have reduced intelligibility, but there are limitations with current procedures for assessing intelligibility. The aim of this study was to construct a single-word intelligibility test for speakers with cleft palate. The test used a multiple-choice identification format, and was based on phonetic contrasts…
ERIC Educational Resources Information Center
Bryden, James D.
The purpose of this study was to specify variables which function significantly in the racial identification and speech quality rating of Negro and white speakers by Negro and white listeners. Ninety-one adults served as subjects for the speech task; 86 of these subjects, 43 Negro and 43 white, provided the listener responses. Subjects were chosen…
Automatic speech recognition technology development at ITT Defense Communications Division
NASA Technical Reports Server (NTRS)
White, George M.
1977-01-01
An assessment of the applications of automatic speech recognition to defense communication systems is presented. Future research efforts include investigations into the following areas: (1) dynamic programming; (2) recognition of speech degraded by noise; (3) speaker independent recognition; (4) large vocabulary recognition; (5) word spotting and continuous speech recognition; and (6) isolated word recognition.
Using Automatic Speech Recognition Technology with Elicited Oral Response Testing
ERIC Educational Resources Information Center
Cox, Troy L.; Davies, Randall S.
2012-01-01
This study examined the use of automatic speech recognition (ASR) scored elicited oral response (EOR) tests to assess the speaking ability of English language learners. It also examined the relationship between ASR-scored EOR and other language proficiency measures and the ability of the ASR to rate speakers without bias to gender or native…
When speaker identity is unavoidable: Neural processing of speaker identity cues in natural speech.
Tuninetti, Alba; Chládková, Kateřina; Peter, Varghese; Schiller, Niels O; Escudero, Paola
2017-11-01
Speech sound acoustic properties vary largely across speakers and accents. When perceiving speech, adult listeners normally disregard non-linguistic variation caused by speaker or accent differences, in order to comprehend the linguistic message, e.g. to correctly identify a speech sound or a word. Here we tested whether the process of normalizing speaker and accent differences, facilitating the recognition of linguistic information, is found at the level of neural processing, and whether it is modulated by the listeners' native language. In a multi-deviant oddball paradigm, native and nonnative speakers of Dutch were exposed to naturally-produced Dutch vowels varying in speaker, sex, accent, and phoneme identity. Unexpectedly, the analysis of mismatch negativity (MMN) amplitudes elicited by each type of change shows a large degree of early perceptual sensitivity to non-linguistic cues. This finding on perception of naturally-produced stimuli contrasts with previous studies examining the perception of synthetic stimuli wherein adult listeners automatically disregard acoustic cues to speaker identity. The present finding bears relevance to speech normalization theories, suggesting that at an unattended level of processing, listeners are indeed sensitive to changes in fundamental frequency in natural speech tokens. Copyright © 2017 Elsevier Inc. All rights reserved.
Identification and tracking of particular speaker in noisy environment
NASA Astrophysics Data System (ADS)
Sawada, Hideyuki; Ohkado, Minoru
2004-10-01
Humans are able to exchange information smoothly using voice in different situations, such as in a noisy crowd or in the presence of several speakers. We can detect the position of a sound source in 3D space, extract a particular sound from mixed sounds, and recognize who is talking. By realizing this mechanism with a computer, new applications become possible: recording sound with high quality by reducing noise, presenting a clarified sound, and realizing microphone-free speech recognition by extracting a particular sound. This paper introduces real-time detection and identification of a particular speaker in a noisy environment using a microphone array, based on the location of the speaker and individual voice characteristics. The study will be applied to developing an adaptive auditory system for a mobile robot that collaborates with a factory worker.
NASA Astrophysics Data System (ADS)
Al-Kaltakchi, Musab T. S.; Woo, Wai L.; Dlay, Satnam; Chambers, Jonathon A.
2017-12-01
In this study, a speaker identification system is considered, consisting of a feature extraction stage that utilizes both power normalized cepstral coefficients (PNCCs) and Mel frequency cepstral coefficients (MFCCs). Normalization is applied by employing cepstral mean and variance normalization (CMVN) and feature warping (FW), together with acoustic modeling using a Gaussian mixture model-universal background model (GMM-UBM). The main contributions are comprehensive evaluations of the effect of both additive white Gaussian noise (AWGN) and non-stationary noise (NSN) (with and without a G.712 type handset) upon identification performance. In particular, three NSN types with varying signal-to-noise ratios (SNRs) were tested, corresponding to street traffic, a bus interior, and a crowded talking environment. The performance evaluation also considered the effect of late fusion techniques based on score fusion, namely mean, maximum, and linear weighted sum fusion. The databases employed were TIMIT, SITW, and NIST 2008; 120 speakers were selected from each database to yield 3600 speech utterances. As recommendations from the study, mean fusion is found to yield the overall best performance in terms of speaker identification accuracy (SIA) with noisy speech, whereas linear weighted sum fusion is best overall for original database recordings.
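The CMVN step and the three score-fusion rules compared in the study can be sketched as follows; the feature values, score values, and the weights of the linear sum are illustrative placeholders, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
feats = rng.normal(loc=5.0, scale=2.0, size=(300, 20))  # e.g. MFCC/PNCC frames

# Cepstral mean and variance normalization (CMVN): per utterance, per
# coefficient, remove the mean and scale to unit variance.
cmvn = (feats - feats.mean(axis=0)) / feats.std(axis=0)

# Late (score-level) fusion of two subsystems, e.g. MFCC- and PNCC-based,
# one score per candidate speaker. The 0.6/0.4 weights are illustrative.
mfcc_scores = rng.normal(size=5)
pncc_scores = rng.normal(size=5)
mean_fused = (mfcc_scores + pncc_scores) / 2
max_fused = np.maximum(mfcc_scores, pncc_scores)
weighted_fused = 0.6 * mfcc_scores + 0.4 * pncc_scores
```

CMVN makes the features invariant to fixed channel effects (such as a handset's transfer function), which is why it helps under the G.712 condition; fusion then combines the complementary robustness of the two feature streams.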
[Study on the automatic parameters identification of water pipe network model].
Jia, Hai-Feng; Zhao, Qi-Feng
2010-01-01
Based on an analysis of problems in the development and application of water pipe network models, automatic identification of model parameters is regarded as a key bottleneck for a model's application in a water supply enterprise. A methodology for automatic identification of water pipe network model parameters based on GIS and SCADA databases is proposed. The core algorithms are then studied: Regionalized Sensitivity Analysis (RSA) is used for automatic recognition of sensitive parameters, and Monte Carlo Sampling (MCS) is used for automatic identification of parameters; a detailed technical route based on RSA and MCS is presented. A module for automatic identification of water pipe network model parameters was developed. Finally, a typical water pipe network was selected as a case study, automatic parameter identification was carried out on it, and satisfactory results were achieved.
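The Monte Carlo Sampling step can be illustrated with a toy one-parameter model; the pressure formula, parameter range, and "SCADA observation" below are invented stand-ins for a real hydraulic simulation.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy stand-in for a hydraulic model: observed nodal pressure as a linear
# function of an unknown roughness coefficient c (purely illustrative).
def simulate(c):
    return 50.0 - 0.8 * c

TRUE_C = 12.0
observed = simulate(TRUE_C) + 0.05 * rng.normal()  # noisy "SCADA" measurement

# Monte Carlo Sampling: draw candidate parameters from a prior range and keep
# the candidate whose simulated output best matches the observation.
candidates = rng.uniform(5.0, 20.0, size=5000)
errors = np.abs(simulate(candidates) - observed)
best_c = float(candidates[np.argmin(errors)])
```

In the paper's setting, the same loop runs over the sensitive parameters selected by RSA, with the network simulator in place of `simulate` and SCADA records in place of `observed`.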
Do Bilinguals Automatically Activate Their Native Language When They Are Not Using It?
ERIC Educational Resources Information Center
Costa, Albert; Pannunzi, Mario; Deco, Gustavo; Pickering, Martin J.
2017-01-01
Most models of lexical access assume that bilingual speakers activate their two languages even when they are in a context in which only one language is used. A critical piece of evidence used to support this notion is the observation that a given word automatically activates its translation equivalent in the other language. Here, we argue that…
The Development of the Speaker Independent ARM Continuous Speech Recognition System
1992-01-01
spoken airborne reconnaissance reports using a speech recognition system based on phoneme-level hidden Markov models (HMMs). Previous versions of the ARM...will involve automatic selection from multiple model sets, corresponding to different speaker types, and that the most rudimentary partition of a...The vocabulary size for the ARM task is 497 words. These words are related to the phoneme-level symbols corresponding to the models in the model set
Evolving Spiking Neural Networks for Recognition of Aged Voices.
Silva, Marco; Vellasco, Marley M B R; Cataldo, Edson
2017-01-01
The aging of the voice, known as presbyphonia, is a natural process that can cause great change in vocal quality of the individual. This is a relevant problem to those people who use their voices professionally, and its early identification can help determine a suitable treatment to avoid its progress or even to eliminate the problem. This work focuses on the development of a new model for the identification of aging voices (independently of their chronological age), using as input attributes parameters extracted from the voice and glottal signals. The proposed model, named Quantum binary-real evolving Spiking Neural Network (QbrSNN), is based on spiking neural networks (SNNs), with an unsupervised training algorithm, and a Quantum-Inspired Evolutionary Algorithm that automatically determines the most relevant attributes and the optimal parameters that configure the SNN. The QbrSNN model was evaluated in a database composed of 120 records, containing samples from three groups of speakers. The results obtained indicate that the proposed model provides better accuracy than other approaches, with fewer input attributes. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Liu, Hanjun; Wang, Emily Q.; Chen, Zhaocong; Liu, Peng; Larson, Charles R.; Huang, Dongfeng
2010-01-01
The purpose of this cross-language study was to examine whether the online control of voice fundamental frequency (F0) during vowel phonation is influenced by language experience. Native speakers of Cantonese and Mandarin, both tonal languages spoken in China, participated in the experiments. Subjects were asked to vocalize a vowel sound ∕u∕ at their comfortable habitual F0, during which their voice pitch was unexpectedly shifted (±50, ±100, ±200, or ±500 cents, 200 ms duration) and fed back instantaneously to them over headphones. The results showed that Cantonese speakers produced significantly smaller responses than Mandarin speakers when the stimulus magnitude varied from 200 to 500 cents. Further, response magnitudes decreased along with the increase in stimulus magnitude in Cantonese speakers, which was not observed in Mandarin speakers. These findings suggest that online control of voice F0 during vocalization is sensitive to language experience. Further, systematic modulations of vocal responses across stimulus magnitude were observed in Cantonese speakers but not in Mandarin speakers, which indicates that this highly automatic feedback mechanism is sensitive to the specific tonal system of each language. PMID:21218905
2013-01-01
Background Most of the institutional and research information in the biomedical domain is available in the form of English text. Even in countries where English is an official language, such as the United States, language can be a barrier for accessing biomedical information for non-native speakers. Recent progress in machine translation suggests that this technique could help make English texts accessible to speakers of other languages. However, the lack of adequate specialized corpora needed to train statistical models currently limits the quality of automatic translations in the biomedical domain. Results We show how a large-sized parallel corpus can automatically be obtained for the biomedical domain, using the MEDLINE database. The corpus generated in this work comprises article titles obtained from MEDLINE and abstract text automatically retrieved from journal websites, which substantially extends the corpora used in previous work. After assessing the quality of the corpus for two language pairs (English/French and English/Spanish) we use the Moses package to train a statistical machine translation model that outperforms previous models for automatic translation of biomedical text. Conclusions We have built translation data sets in the biomedical domain that can easily be extended to other languages available in MEDLINE. These sets can successfully be applied to train statistical machine translation models. While further progress should be made by incorporating out-of-domain corpora and domain-specific lexicons, we believe that this work improves the automatic translation of biomedical texts. PMID:23631733
Performance enhancement for audio-visual speaker identification using dynamic facial muscle model.
Asadpour, Vahid; Towhidkhah, Farzad; Homayounpour, Mohammad Mehdi
2006-10-01
The science of human identification using physiological characteristics, or biometrics, has been of great concern in security systems. However, robust multimodal identification systems based on audio-visual information have not been thoroughly investigated yet. The aim of this work is therefore to propose a model-based feature extraction method which employs the physiological characteristics of the facial muscles producing lip movements. This approach adopts intrinsic muscle properties such as viscosity, elasticity, and mass, which are extracted from a dynamic lip model. These parameters depend exclusively on the neuro-muscular properties of the speaker; consequently, imitation of valid speakers could be reduced to a large extent. The parameters are applied to a hidden Markov model (HMM) audio-visual identification system. In this work, a combination of audio and video features has been employed by adopting a multistream pseudo-synchronized HMM training method. Noise-robust audio features such as Mel-frequency cepstral coefficients (MFCC), spectral subtraction (SS), and relative spectra perceptual linear prediction (J-RASTA-PLP) have been used to evaluate the performance of the multimodal system. The superior performance of the proposed system is demonstrated on a large multispeaker database of continuously spoken digits, along with a phonetically rich sentence. To evaluate the robustness of the algorithms, some experiments were performed on genetically identical twins. Furthermore, changes in speaker voice were simulated with drug inhalation tests. At 3 dB signal-to-noise ratio (SNR), the dynamic muscle model improved the identification rate of the audio-visual system from 91% to 98%. Results on identical twins revealed an apparent improvement in performance for the dynamic muscle model-based system, whose audio-visual identification rate was enhanced from 87% to 96%.
GMM-based speaker age and gender classification in Czech and Slovak
NASA Astrophysics Data System (ADS)
Přibil, Jiří; Přibilová, Anna; Matoušek, Jindřich
2017-01-01
The paper describes an experiment using Gaussian mixture models (GMM) for automatic classification of speaker age and gender. It analyses and compares the influence of different numbers of mixtures and different types of speech features on GMM gender/age classification. The dependence of computational complexity on the number of mixtures is also analysed. Finally, the GMM classification accuracy is compared with the output of conventional listening tests; the objective and subjective evaluations are in good agreement.
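As a hedged illustration of the classification principle, the sketch below fits one diagonal Gaussian per class (a single-component simplification of the paper's GMMs) and assigns a test feature vector to the class with the highest log-likelihood. The toy pitch features and class labels are invented for the example.

```python
import numpy as np

def train_gaussians(features_by_class):
    """Fit one diagonal Gaussian per class: a 1-mixture simplification
    of a GMM gender/age classifier. Returns {label: (mean, variance)}."""
    models = {}
    for label, X in features_by_class.items():
        X = np.asarray(X, dtype=float)
        models[label] = (X.mean(axis=0), X.var(axis=0) + 1e-6)
    return models

def classify(models, x):
    """Pick the class whose diagonal Gaussian gives x the highest
    log-likelihood."""
    x = np.asarray(x, dtype=float)
    def loglik(mu, var):
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
    return max(models, key=lambda c: loglik(*models[c]))

# Toy 1-D "pitch feature": female speakers have higher F0 on average.
models = train_gaussians({"male": [[110], [120], [130]],
                          "female": [[200], [210], [220]]})
```

A full GMM adds several weighted components per class (the "number of mixtures" the paper varies), which is what drives the computational-complexity trade-off it analyses.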
Sadakata, Makiko; McQueen, James M
2013-08-01
This study reports effects of a high-variability training procedure on nonnative learning of a Japanese geminate-singleton fricative contrast. Thirty native speakers of Dutch took part in a 5-day training procedure in which they identified geminate and singleton variants of the Japanese fricative /s/. Participants were trained with either many repetitions of a limited set of words recorded by a single speaker (low-variability training) or with fewer repetitions of a more variable set of words recorded by multiple speakers (high-variability training). Both types of training enhanced identification of speech but not of nonspeech materials, indicating that learning was domain specific. High-variability training led to superior performance in identification but not in discrimination tests, and supported better generalization of learning as shown by transfer from the trained fricatives to the identification of untrained stops and affricates. Variability thus helps nonnative listeners to form abstract categories rather than to enhance early acoustic analysis.
Blinded by taboo words in L1 but not L2.
Colbeck, Katie L; Bowers, Jeffrey S
2012-04-01
The present study compares the emotionality of English taboo words in native English speakers and native Chinese speakers who learned English as a second language. Neutral and taboo/sexual words were included in a Rapid Serial Visual Presentation (RSVP) task as to-be-ignored distracters in a short- and long-lag condition. Compared with neutral distracters, taboo/sexual distracters impaired the performance in the short-lag condition only. Of critical note, however, is that the performance of Chinese speakers was less impaired by taboo/sexual distracters. This supports the view that a first language is more emotional than a second language, even when words are processed quickly and automatically. (PsycINFO Database Record (c) 2012 APA, all rights reserved).
Visual speech influences speech perception immediately but not automatically.
Mitterer, Holger; Reinisch, Eva
2017-02-01
Two experiments examined the time course of the use of auditory and visual speech cues to spoken word recognition using an eye-tracking paradigm. Results of the first experiment showed that the use of visual speech cues from lipreading is reduced if concurrently presented pictures require a division of attentional resources. This reduction was evident even when listeners' eye gaze was on the speaker rather than the (static) pictures. Experiment 2 used a deictic hand gesture to foster attention to the speaker. At the same time, the visual processing load was reduced by keeping the visual display constant over a fixed number of successive trials. Under these conditions, the visual speech cues from lipreading were used. Moreover, the eye-tracking data indicated that visual information was used immediately and even earlier than auditory information. In combination, these data indicate that visual speech cues are not used automatically, but if they are used, they are used immediately.
AdjScales: Visualizing Differences between Adjectives for Language Learners
NASA Astrophysics Data System (ADS)
Sheinman, Vera; Tokunaga, Takenobu
In this study we introduce AdjScales, a method for scaling similar adjectives by their strength. It combines existing Web-based computational linguistic techniques to automatically differentiate, by strength, between similar adjectives that describe the same property. Though this kind of information is rarely present in lexical resources and dictionaries, it may be useful for language learners who try to distinguish between similar words. Additionally, learners might gain from a simple visualization of these differences using unidimensional scales. The method is evaluated by comparison with annotations of a subset of adjectives from WordNet by four native English speakers, and also against two non-native speakers of English. The collected annotation is an interesting resource in its own right. This work is a first step toward automatic differentiation of meaning between similar words for language learners. AdjScales can be useful for lexical resource enhancement.
Gender identification from high-pass filtered vowel segments: the use of high-frequency energy.
Donai, Jeremy J; Lass, Norman J
2015-10-01
The purpose of this study was to examine the use of high-frequency information for making gender identity judgments from high-pass filtered vowel segments produced by adult speakers. Specifically, the effect of removing lower-frequency spectral detail (i.e., F3 and below) from vowel segments via high-pass filtering was evaluated. Thirty listeners (ages 18-35) with normal hearing participated in the experiment. A within-subjects design was used to measure gender identification for six 250-ms vowel segments (/æ/, /ɪ /, /ɝ/, /ʌ/, /ɔ/, and /u/), produced by ten male and ten female speakers. The results of this experiment demonstrated that despite the removal of low-frequency spectral detail, the listeners were accurate in identifying speaker gender from the vowel segments, and did so with performance significantly above chance. The removal of low-frequency spectral detail reduced gender identification by approximately 16 % relative to unfiltered vowel segments. Classification results using linear discriminant function analyses followed the perceptual data, using spectral and temporal representations derived from the high-pass filtered segments. Cumulatively, these findings indicate that normal-hearing listeners are able to make accurate perceptual judgments regarding speaker gender from vowel segments with low-frequency spectral detail removed via high-pass filtering. Therefore, it is reasonable to suggest the presence of perceptual cues related to gender identity in the high-frequency region of naturally produced vowel signals. Implications of these findings and possible mechanisms for performing the gender identification task from high-pass filtered stimuli are discussed.
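The core signal operation, removing spectral detail below a cutoff via high-pass filtering, can be sketched with a simple FFT mask (a stand-in for whatever filter the authors actually used; the sampling rate, cutoff, and synthetic "vowel" below are illustrative, not the study's stimuli).

```python
import numpy as np

def highpass_fft(signal, fs, cutoff_hz):
    """Zero out spectral energy below cutoff_hz via an FFT mask,
    a simple stand-in for high-pass filtering a vowel segment."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spec[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(spec, n=len(signal))

# 250 ms synthetic "vowel": 150 Hz fundamental plus a 4 kHz component.
fs = 16000
t = np.arange(int(0.25 * fs)) / fs
vowel = np.sin(2 * np.pi * 150 * t) + 0.3 * np.sin(2 * np.pi * 4000 * t)
filtered = highpass_fft(vowel, fs, 3000.0)  # only high-frequency energy remains
```

After filtering, the strong low-frequency fundamental is gone and only the weaker high-frequency component survives, which is the kind of residual energy the listeners in this study used to judge speaker gender.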
33 CFR 401.20 - Automatic Identification System.
Code of Federal Regulations, 2010 CFR
2010-07-01
...' maritime Differential Global Positioning System radiobeacon services; or (7) The use of a temporary unit... Identification System. (a) Each of the following vessels must use an Automatic Identification System (AIS... 33 Navigation and Navigable Waters 3 2010-07-01 2010-07-01 false Automatic Identification System...
47 CFR 80.231 - Technical Requirements for Class B Automatic Identification System (AIS) equipment.
Code of Federal Regulations, 2012 CFR
2012-10-01
... Identification System (AIS) equipment. 80.231 Section 80.231 Telecommunication FEDERAL COMMUNICATIONS COMMISSION... § 80.231 Technical Requirements for Class B Automatic Identification System (AIS) equipment. (a) Class B Automatic Identification System (AIS) equipment must meet the technical requirements of IEC 62287...
47 CFR 80.231 - Technical Requirements for Class B Automatic Identification System (AIS) equipment.
Code of Federal Regulations, 2013 CFR
2013-10-01
... Identification System (AIS) equipment. 80.231 Section 80.231 Telecommunication FEDERAL COMMUNICATIONS COMMISSION... § 80.231 Technical Requirements for Class B Automatic Identification System (AIS) equipment. (a) Class B Automatic Identification System (AIS) equipment must meet the technical requirements of IEC 62287...
47 CFR 80.231 - Technical Requirements for Class B Automatic Identification System (AIS) equipment.
Code of Federal Regulations, 2014 CFR
2014-10-01
... Identification System (AIS) equipment. 80.231 Section 80.231 Telecommunication FEDERAL COMMUNICATIONS COMMISSION... § 80.231 Technical Requirements for Class B Automatic Identification System (AIS) equipment. (a) Class B Automatic Identification System (AIS) equipment must meet the technical requirements of IEC 62287...
33 CFR 164.03 - Incorporation by reference.
Code of Federal Regulations, 2014 CFR
2014-07-01
... radiocommunication equipment and systems—Automatic identification systems (AIS)—part 2: Class A shipborne equipment of the universal automatic identification system (AIS)—Operational and performance requirements..., Recommendation on Performance Standards for a Universal Shipborne Automatic Identification System (AIS), adopted...
33 CFR 164.03 - Incorporation by reference.
Code of Federal Regulations, 2012 CFR
2012-07-01
... radiocommunication equipment and systems—Automatic identification systems (AIS)—part 2: Class A shipborne equipment of the universal automatic identification system (AIS)—Operational and performance requirements..., Recommendation on Performance Standards for a Universal Shipborne Automatic Identification System (AIS), adopted...
33 CFR 164.03 - Incorporation by reference.
Code of Federal Regulations, 2013 CFR
2013-07-01
... radiocommunication equipment and systems—Automatic identification systems (AIS)—part 2: Class A shipborne equipment of the universal automatic identification system (AIS)—Operational and performance requirements..., Recommendation on Performance Standards for a Universal Shipborne Automatic Identification System (AIS), adopted...
NASA Astrophysics Data System (ADS)
Kayasith, Prakasith; Theeramunkong, Thanaruk
Measuring the severity of dysarthria by manually evaluating a speaker's speech with standard perception-based assessment methods is a tedious and subjective task. This paper presents an automated approach to assessing the speech quality of a dysarthric speaker with cerebral palsy. Considering two complementary factors, speech consistency and speech distinction, a speech quality indicator called the speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce consistent speech signals for a given word and distinct speech signals for different words. As an application, it can be used to assess speech quality and forecast the speech recognition rate for an individual dysarthric speaker before exhaustive implementation of an automatic speech recognition system for that speaker. The effectiveness of Ψ as a predictor of recognition rate is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square difference, comparing its predicted recognition rates with those predicted by the standard articulatory and intelligibility tests on two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were conducted on a speech corpus comprising data from eight normal speakers and eight dysarthric speakers.
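The abstract does not give the exact formula for Ψ, but the idea of balancing consistency (repetitions of a word cluster tightly) against distinction (different words stay far apart) can be illustrated with a between/within distance ratio. The sketch below is an analogue in that spirit only, not the paper's definition.

```python
import numpy as np

def clarity_index(words):
    """Illustrative consistency/distinction ratio, in the spirit of the
    speech clarity index (NOT the paper's actual formula).

    words: dict mapping word -> list of feature vectors (repetitions).
    Higher values mean repetitions of each word cluster tightly
    (consistency) while different words stay far apart (distinction).
    """
    centroids = {w: np.mean(np.asarray(X, float), axis=0)
                 for w, X in words.items()}
    within = np.mean([np.linalg.norm(np.asarray(x, float) - centroids[w])
                      for w, X in words.items() for x in X])
    cs = list(centroids.values())
    between = np.mean([np.linalg.norm(cs[i] - cs[j])
                       for i in range(len(cs)) for j in range(i + 1, len(cs))])
    return between / (within + 1e-9)

# Invented 2-D feature vectors for two words, two repetitions each.
clear = {"a": [[0.0, 0.0], [0.1, 0.0]], "b": [[5.0, 5.0], [5.1, 5.0]]}
muddy = {"a": [[0.0, 0.0], [2.0, 2.0]], "b": [[1.0, 1.0], [3.0, 3.0]]}
```

A speaker whose repetitions are tight and whose words are well separated scores higher, which is the property the paper links to recognizer accuracy.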
ERIC Educational Resources Information Center
Bullock, Heather E.; Fernald, Julian L.
2003-01-01
Drawing on a communications model of persuasion (Hovland, Janis, & Kelley, 1953), this study examined the effect of target appearance on feminists' and nonfeminists' perceptions of a speaker delivering a feminist or an antifeminist message. One hundred three college women watched one of four videotaped speeches that varied by content (profeminist…
Accent Identification by Adults with Aphasia
ERIC Educational Resources Information Center
Newton, Caroline; Burns, Rebecca; Bruce, Carolyn
2013-01-01
The UK is a diverse society where individuals regularly interact with speakers with different accents. Whilst there is a growing body of research on the impact of speaker accent on comprehension in people with aphasia, there is none which explores their ability to identify accents. This study investigated the ability of this group to identify the…
NASA Astrophysics Data System (ADS)
Zilletti, Michele; Marker, Arthur; Elliott, Stephen John; Holland, Keith
2017-05-01
In this study, model identification of the nonlinear dynamics of a micro-speaker is carried out using purely electrical measurements, avoiding any explicit vibration measurements. It is shown that a dynamic model of the micro-speaker, which takes into account the nonlinear damping characteristic of the device, can be identified by measuring the response between the voltage input and the current flowing into the coil. An analytical formulation of the quasi-linear model of the micro-speaker is first derived, and an optimisation method is then used to identify a polynomial function describing the mechanical damping behaviour of the micro-speaker. The analytical results of the quasi-linear model are compared with numerical results. This study opens up the possibility of efficiently implementing nonlinear echo cancellers.
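The final identification step, fitting a polynomial to the damping characteristic, can be illustrated with an ordinary least-squares polynomial fit. The odd-polynomial damping law and sample values below are invented for the example; the paper's actual optimisation works from voltage/current measurements rather than directly observed force.

```python
import numpy as np

# Hypothetical damping force vs. coil velocity samples for a micro-speaker,
# generated from an assumed "true" odd-polynomial damping law.
velocity = np.linspace(-1.0, 1.0, 21)
force = 0.5 * velocity + 0.2 * velocity ** 3

# Least-squares fit of a cubic polynomial damping characteristic,
# standing in for the paper's optimisation step.
# np.polyfit returns coefficients highest degree first: [c3, c2, c1, c0].
coeffs = np.polyfit(velocity, force, 3)
```

Because the samples here are noise-free, the fit recovers the generating coefficients essentially exactly; with measured data, the residual would quantify how well a cubic captures the nonlinear damping.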
Perception of relative location of F0 within the speaking range
NASA Astrophysics Data System (ADS)
Honorof, Douglas N.; Whalen, D. H.
2003-10-01
It has been argued that intrinsic fundamental frequency (IF0) is an automatic consequence of vowel production [Whalen et al., J. Phon. 27, 125-142 (1999)], yet speakers do not adjust F0 so as to overcome IF0. It may be that so adjusting F0 would distort information about the F0 range, information important to the interpretation of F0. Therefore, a speech production/perception experiment was designed to determine whether listeners can perceive position within a speaker-specific F0 range on the basis of isolated tokens. Ten male and ten female adult native speakers of US English were recorded speaking (not singing) the vowel /a/ on eight different pitches spaced throughout speaker-specific ranges. Recordings were randomized across speakers. Naive listeners made pitch-magnitude estimates of the location of F0 relative to each speaker's range. Preliminary results show correlations between estimated and actual location within the range. Adjusting F0 to compensate for IF0 differences between vowels would seem to obscure voice quality in such a way as to make it difficult for the listener to recover relative F0, requiring a greater perceptual adjustment than simply normalizing for IF0. [Work supported by NIH Grant No. DC02717.]
Tuning time-frequency methods for the detection of metered HF speech
NASA Astrophysics Data System (ADS)
Nelson, Douglas J.; Smith, Lawrence H.
2002-12-01
Speech is metered if the stresses occur at a nearly regular rate. Metered speech is common in poetry, and it can occur naturally in speech, if the speaker is spelling a word or reciting words or numbers from a list. In radio communications, the CQ request, call sign and other codes are frequently metered. In tactical communications and air traffic control, location, heading and identification codes may be metered. Moreover metering may be expected to survive even in HF communications, which are corrupted by noise, interference and mistuning. For this environment, speech recognition and conventional machine-based methods are not effective. We describe Time-Frequency methods which have been adapted successfully to the problem of mitigation of HF signal conditions and detection of metered speech. These methods are based on modeled time and frequency correlation properties of nearly harmonic functions. We derive these properties and demonstrate a performance gain over conventional correlation and spectral methods. Finally, in addressing the problem of HF single sideband (SSB) communications, the problems of carrier mistuning, interfering signals, such as manual Morse, and fast automatic gain control (AGC) must be addressed. We demonstrate simple methods which may be used to blindly mitigate mistuning and narrowband interference, and effectively invert the fast automatic gain function.
Speaker Recognition by Combining MFCC and Phase Information in Noisy Conditions
NASA Astrophysics Data System (ADS)
Wang, Longbiao; Minami, Kazue; Yamamoto, Kazumasa; Nakagawa, Seiichi
In this paper, we investigate the effectiveness of phase information for speaker recognition in noisy conditions and combine the phase information with mel-frequency cepstral coefficients (MFCCs). To date, almost all speaker recognition methods are based on MFCCs, even in noisy conditions. MFCCs dominantly capture vocal tract information: only the magnitude of the Fourier transform of time-domain speech frames is used, and the phase information has been ignored. Because the phase information contains rich voice-source information, it is expected to complement MFCCs well. Furthermore, some studies have reported that phase-based features are robust to noise. In our previous study, we proposed a phase information extraction method that normalizes the variation in phase caused by the clipping position of the input speech, and the combination of the phase information and MFCCs performed remarkably better than MFCCs alone. In this paper, we evaluate the robustness of the proposed phase information for speaker identification in noisy conditions. Spectral subtraction, the skipping of frames with low energy/signal-to-noise ratio (SNR), and noisy-speech training models are used to analyze the effect of the phase information and MFCCs in noisy conditions. The NTT database and the JNAS (Japanese Newspaper Article Sentences) database, with stationary/non-stationary noise added, were used to evaluate the proposed method. MFCCs outperformed the phase information for clean speech. On the other hand, the degradation of the phase information was significantly smaller than that of MFCCs for noisy speech. With clean-speech training models, the phase information alone was even better than MFCCs in many cases. Deleting unreliable frames (frames with low energy/SNR) improved speaker identification performance significantly. By integrating the phase information with MFCCs, the speaker identification error was reduced by about 30%-60% compared with the standard MFCC-based method.
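Of the noise-handling techniques mentioned, spectral subtraction is easy to sketch: subtract an estimated noise magnitude spectrum from the noisy magnitude spectrum, floor the result, and resynthesize with the noisy phase. The sketch below uses an oracle noise estimate and invented signal parameters purely for illustration; a real front end estimates the noise spectrum from non-speech frames.

```python
import numpy as np

def spectral_subtraction(noisy, noise_estimate, floor=0.01):
    """Basic magnitude spectral subtraction over one frame.

    Subtracts the estimated noise magnitude spectrum from the noisy
    magnitude spectrum, applies a spectral floor to avoid negative
    magnitudes, and resynthesizes using the noisy phase.
    """
    noisy_spec = np.fft.rfft(noisy)
    mag = np.abs(noisy_spec) - np.abs(np.fft.rfft(noise_estimate))
    mag = np.maximum(mag, floor * np.abs(noisy_spec))  # spectral floor
    return np.fft.irfft(mag * np.exp(1j * np.angle(noisy_spec)), n=len(noisy))

# Toy example: a tone buried in white noise, with an oracle noise estimate.
fs = 8000
t = np.arange(1024) / fs
clean = np.sin(2 * np.pi * 440 * t)
rng = np.random.default_rng(0)
noise = 0.4 * rng.standard_normal(len(t))
noisy = clean + noise
enhanced = spectral_subtraction(noisy, noise)
```

The enhanced frame is closer to the clean signal than the noisy one; in a recognition front end this step is applied frame by frame before feature extraction.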
Jiang, Jun; Liu, Fang; Wan, Xuan; Jiang, Cunmei
2015-07-01
Tone language experience benefits pitch processing in music and speech for typically developing individuals. No known studies have examined pitch processing in individuals with autism who speak a tone language. This study investigated discrimination and identification of melodic contour and speech intonation in a group of Mandarin-speaking individuals with high-functioning autism. Individuals with autism showed superior melodic contour identification but comparable contour discrimination relative to controls. In contrast, these individuals performed worse than controls on both discrimination and identification of speech intonation. These findings provide the first evidence for differential pitch processing in music and speech in tone language speakers with autism, suggesting that tone language experience may not compensate for speech intonation perception deficits in individuals with autism.
Speaker recognition with temporal cues in acoustic and electric hearing
NASA Astrophysics Data System (ADS)
Vongphoe, Michael; Zeng, Fan-Gang
2005-08-01
Natural spoken language processing includes not only speech recognition but also identification of the speaker's gender, age, emotional, and social status. Our purpose in this study is to evaluate whether temporal cues are sufficient to support both speech and speaker recognition. Ten cochlear-implant and six normal-hearing subjects were presented with vowel tokens spoken by three men, three women, two boys, and two girls. In one condition, the subject was asked to recognize the vowel. In the other condition, the subject was asked to identify the speaker. Extensive training was provided for the speaker recognition task. Normal-hearing subjects achieved nearly perfect performance in both tasks. Cochlear-implant subjects achieved good performance in vowel recognition but poor performance in speaker recognition. The level of the cochlear implant performance was functionally equivalent to normal performance with eight spectral bands for vowel recognition but only to one band for speaker recognition. These results show a dissociation between speech and speaker recognition with primarily temporal cues, highlighting the limitation of current speech processing strategies in cochlear implants. Several methods, including explicit encoding of fundamental frequency and frequency modulation, are proposed to improve speaker recognition for current cochlear implant users.
English Language Schooling, Linguistic Realities, and the Native Speaker of English in Hong Kong
ERIC Educational Resources Information Center
Hansen Edwards, Jette G.
2018-01-01
The study employs a case study approach to examine the impact of educational backgrounds on nine Hong Kong tertiary students' English and Cantonese language practices and identifications as native speakers of English and Cantonese. The study employed both survey and interview data to probe the participants' English and Cantonese language use at…
Priming of Non-Speech Vocalizations in Male Adults: The Influence of the Speaker's Gender
ERIC Educational Resources Information Center
Fecteau, Shirley; Armony, Jorge L.; Joanette, Yves; Belin, Pascal
2004-01-01
Previous research reported a priming effect for voices. However, the type of information primed is still largely unknown. In this study, we examined the influence of speaker's gender and emotional category of the stimulus on priming of non-speech vocalizations in 10 male participants, who performed a gender identification task. We found a…
Effects of Phonetic Similarity in the Identification of Mandarin Tones
ERIC Educational Resources Information Center
Li, Bin; Shao, Jing; Bao, Mingzhen
2017-01-01
Tonal languages differ in how they use phonetic correlates, e.g. average pitch height and pitch direction, for tonal contrasts. Thus, native speakers of a tonal language may need to adjust their attention to familiar or unfamiliar phonetic cues when perceiving non-native tones. On the other hand, speakers of a non-tonal language may need to…
33 CFR 164.43 - Automatic Identification System Shipborne Equipment-Prince William Sound.
Code of Federal Regulations, 2012 CFR
2012-07-01
... 33 Navigation and Navigable Waters 2 2012-07-01 2012-07-01 false Automatic Identification System Shipborne Equipment-Prince William Sound. 164.43 Section 164.43 Navigation and Navigable Waters COAST GUARD... Automatic Identification System Shipborne Equipment—Prince William Sound. (a) Until December 31, 2004, each...
33 CFR 164.43 - Automatic Identification System Shipborne Equipment-Prince William Sound.
Code of Federal Regulations, 2013 CFR
2013-07-01
... 33 Navigation and Navigable Waters 2 2013-07-01 2013-07-01 false Automatic Identification System Shipborne Equipment-Prince William Sound. 164.43 Section 164.43 Navigation and Navigable Waters COAST GUARD... Automatic Identification System Shipborne Equipment—Prince William Sound. (a) Until December 31, 2004, each...
33 CFR 164.43 - Automatic Identification System Shipborne Equipment-Prince William Sound.
Code of Federal Regulations, 2014 CFR
2014-07-01
... 33 Navigation and Navigable Waters 2 2014-07-01 2014-07-01 false Automatic Identification System Shipborne Equipment-Prince William Sound. 164.43 Section 164.43 Navigation and Navigable Waters COAST GUARD... Automatic Identification System Shipborne Equipment—Prince William Sound. (a) Until December 31, 2004, each...
Effects of a metronome on the filled pauses of fluent speakers.
Christenfeld, N
1996-12-01
Filled pauses (the "ums" and "uhs" that litter spontaneous speech) seem to be a product of the speaker paying deliberate attention to the normally automatic act of talking. This is the same sort of explanation that has been offered for stuttering. In this paper we explore whether a manipulation that has long been known to decrease stuttering, synchronizing speech to the beats of a metronome, will then also decrease filled pauses. Two experiments indicate that a metronome has a dramatic effect on the production of filled pauses. This effect is not due to any simplification or slowing of the speech and supports the view that a metronome causes speakers to attend more to how they are talking and less to what they are saying. It also lends support to the connection between stutters and filled pauses.
Multiple Metaphors: Teaching Tense and Aspect to English-Speakers.
ERIC Educational Resources Information Center
Cody, Karen
2000-01-01
This paper proposes a synthesis of instructional methods from both traditional/explicit grammar and learner-centered/constructivist camps that also incorporates many types of metaphors (abstract, visual, and kinesthetic) in order to lead learners from declarative to proceduralized to automatized knowledge. This integrative, synthetic approach…
ERIC Educational Resources Information Center
Hayes-Harb, Rachel
2006-01-01
English as a second language (ESL) teachers have long noted that native speakers of Arabic exhibit exceptional difficulty with English reading comprehension (e.g., Thompson-Panos & Thomas-Ruzic, 1983). Most existing work in this area has looked to higher level aspects of reading such as familiarity with discourse structure and cultural knowledge…
The Effect of Scene Variation on the Redundant Use of Color in Definite Reference
ERIC Educational Resources Information Center
Koolen, Ruud; Goudbeek, Martijn; Krahmer, Emiel
2013-01-01
This study investigates to what extent the amount of variation in a visual scene causes speakers to mention the attribute color in their definite target descriptions, focusing on scenes in which this attribute is not needed for identification of the target. The results of our three experiments show that speakers are more likely to redundantly…
The perception of FM sweeps by Chinese and English listeners.
Luo, Huan; Boemio, Anthony; Gordon, Michael; Poeppel, David
2007-02-01
Frequency-modulated (FM) signals are an integral acoustic component of ecologically natural sounds and are analyzed effectively in the auditory systems of humans and animals. Linearly frequency-modulated tone sweeps were used here to evaluate two questions. First, how rapid a sweep can listeners accurately perceive? Second, is there an effect of native language insofar as the language (phonology) is differentially associated with processing of FM signals? Speakers of English and Mandarin Chinese were tested to evaluate whether being a speaker of a tone language altered the perceptual identification of non-speech tone sweeps. In two psychophysical studies, we demonstrate that Chinese subjects perform better than English subjects in FM direction identification, but not in an FM discrimination task, in which English and Chinese speakers show similar detection thresholds of approximately 20 ms duration. We suggest that the better FM direction identification in Chinese subjects is related to their experience with FM direction analysis in the tone-language environment, even though supra-segmental tonal variation occurs over a longer time scale. Furthermore, the observed common discrimination temporal threshold across two language groups supports the conjecture that processing auditory signals at durations of approximately 20 ms constitutes a fundamental auditory perceptual threshold.
Johansson, Kerstin; Strömbergsson, Sofia; Robieux, Camille; McAllister, Anita
2017-01-01
Reduced respiratory function following lower cervical spinal cord injuries (CSCIs) may indirectly result in vocal dysfunction. Although self-reports indicate voice change and limitations following CSCI, earlier efforts using global perceptual ratings to distinguish speakers with CSCI from noninjured speakers have not been very successful. We investigate the use of an audience response system-based approach to distinguish speakers with CSCI from noninjured speakers, and explore whether specific vocal traits can be identified as characteristic of speakers with CSCI. Fourteen speech-language pathologists participated in a web-based perceptual task, where their overt reactions to vocal dysfunction were registered during the continuous playback of recordings of 36 speakers (18 with CSCI, and 18 matched controls). Dysphonic events were identified through manual perceptual analysis, to allow the exploration of connections between dysphonic events and listener reactions. More dysphonic events, and more listener reactions, were registered for speakers with CSCI than for noninjured speakers. Strain (particularly in phrase-final position) and creak (particularly in nonphrase-final position) distinguish speakers with CSCI from noninjured speakers. For the identification of intermittent and subtle signs of vocal dysfunction, an approach where the temporal distribution of symptoms is registered offers a viable means to distinguish speakers affected by voice dysfunction from non-affected speakers. In speakers with CSCI, clinicians should listen for the presence of final strain and nonfinal creak, and pay attention to self-reported voice function and voice problems, to identify individuals in need of clinical assessment and intervention. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Effects of individual differences in verbal skills on eye-movement patterns during sentence reading
Kuperman, Victor; Van Dyke, Julie A.
2011-01-01
This study is a large-scale exploration of the influence that individual reading skills exert on eye-movement behavior in sentence reading. Seventy-one non-college-bound 16- to 24-year-old speakers of English completed a battery of 18 verbal and cognitive skill assessments, and read a series of sentences as their eye movements were monitored. Statistical analyses were performed to establish which tests of reading abilities were predictive of eye-movement patterns across this population and how strong the effects were. We found that individual scores in rapid automatized naming and word identification tests (i) were the only participant variables with reliable predictivity throughout the time-course of reading; (ii) elicited effects that superseded in magnitude the effects of established predictors like word length or frequency; and (iii) strongly modulated the influence of word length and frequency on fixation times. We discuss implications of our findings for testing reading ability, as well as for research on eye movements in reading. PMID:21709808
Cost-sensitive learning for emotion robust speaker recognition.
Li, Dongdong; Yang, Yingchun; Dai, Weihui
2014-01-01
In the field of information security, voice is one of the most important modalities in biometrics. In particular, with the growth of voice communication over the Internet and telephone systems, huge voice data resources have become available. In speaker recognition, a voiceprint can serve as a unique password with which a user proves his or her identity. However, speech carrying various emotions can cause an unacceptably high error rate and degrade the performance of a speaker recognition system. This paper addresses the problem by introducing a cost-sensitive learning technique that reweights the probability of test affective utterances at the pitch-envelope level, which effectively enhances robustness in emotion-dependent speaker recognition. Based on that technique, a new recognition system architecture and its components are proposed. An experiment conducted on the Mandarin Affective Speech Corpus shows an 8% improvement in identification rate over traditional speaker recognition.
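The abstract's reweighting idea can be sketched roughly as follows; this is an illustrative assumption, not the authors' actual implementation. Frame-level speaker-model scores are pooled with weights that shrink as a frame's pitch deviates from a neutral reference, so that emotionally colored (pitch-shifted) frames count less toward the utterance score. The `neutral_pitch` and `alpha` parameters are hypothetical.

```python
def weighted_utterance_score(frame_scores, frame_pitch,
                             neutral_pitch=120.0, alpha=0.01):
    """Cost-sensitive pooling of per-frame speaker-model log-likelihoods.

    frame_scores: per-frame scores for one speaker model.
    frame_pitch:  per-frame F0 estimates in Hz.
    Frames whose pitch deviates strongly from a neutral F0 (a crude proxy
    for emotional, high-effort speech) are down-weighted before pooling.
    """
    total, weight_sum = 0.0, 0.0
    for score, f0 in zip(frame_scores, frame_pitch):
        w = 1.0 / (1.0 + alpha * abs(f0 - neutral_pitch))  # cost weight
        total += w * score
        weight_sum += w
    return total / weight_sum if weight_sum else float("-inf")
```

With this weighting, a frame whose F0 sits 200 Hz above the neutral reference contributes only a third as much as a neutral-pitch frame with the same raw score.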
García-Betances, Rebeca I; Huerta, Mónica K
2012-01-01
A comparative review is presented of available technologies suitable for automatic reading of patient identification bracelet tags. Existing technologies' backgrounds, characteristics, advantages and disadvantages are described in relation to their possible use by public health care centers with budgetary limitations. A comparative assessment is presented of suitable automatic identification systems based on graphic codes, both one-dimensional (1D) and two-dimensional (2D), printed on labels, as well as those based on radio frequency identification (RFID) tags. The analysis examines the tradeoffs of these technologies to provide guidance to hospital administrators looking to deploy patient identification technology. The results suggest that affordable automatic patient identification systems can be easily and inexpensively implemented using 2D codes printed on low-cost bracelet labels, which can then be read and automatically decoded by ordinary mobile smart phones. Because of mobile smart phones' present versatility and ubiquity, the implementation and operation of 2D code, and especially Quick Response® (QR) Code, technology emerges as a very attractive alternative for automating patient identification processes in low-budget situations.
PechaKucha Presentations: Teaching Storytelling, Visual Design, and Conciseness
ERIC Educational Resources Information Center
Lucas, Kristen; Rawlins, Jacob D.
2015-01-01
When speakers rely too heavily on presentation software templates, they often end up stultifying audiences with a triple-whammy of bullet points. In this article, Lucas and Rawlins present an alternative method--PechaKucha (the Japanese word for "chit chat")--a presentation style driven by a carefully planned, automatically timed…
Acoustic Analysis of Voice in Dysarthria following Stroke
ERIC Educational Resources Information Center
Wang, Yu-Tsai; Kent, Ray D.; Kent, Jane Finley; Duffy, Joseph R.; Thomas, Jack E.
2009-01-01
Although perceptual studies indicate the likelihood of voice disorders in persons with stroke, there have been few objective instrumental studies of voice dysfunction in dysarthria following stroke. This study reports automatic analysis of sustained vowel phonation for 61 speakers with stroke. The results show: (1) men with stroke and healthy…
Second Language Writing Classification System Based on Word-Alignment Distribution
ERIC Educational Resources Information Center
Kotani, Katsunori; Yoshimi, Takehiko
2010-01-01
The present paper introduces an automatic classification system for assisting second language (L2) writing evaluation. This system, which classifies sentences written by L2 learners as either native speaker-like or learner-like sentences, is constructed by machine learning algorithms using word-alignment distributions as classification features…
NASA Astrophysics Data System (ADS)
Wang, Hongcui; Kawahara, Tatsuya
CALL (Computer Assisted Language Learning) systems using ASR (Automatic Speech Recognition) for second language learning have received increasing interest recently. However, it still remains a challenge to achieve high speech recognition performance, including accurate detection of erroneous utterances by non-native speakers. Conventionally, possible error patterns, based on linguistic knowledge, are added to the lexicon and language model, or to the ASR grammar network. However, this approach easily falls into a trade-off between error coverage and increased perplexity. To solve the problem, we propose a method based on a decision tree to learn effective prediction of errors made by non-native speakers. An experimental evaluation with a number of foreign students learning Japanese shows that the proposed method can effectively generate an ASR grammar network, given a target sentence, achieving both better error coverage and lower perplexity, resulting in a significant improvement in ASR accuracy.
Bent, Tessa; Holt, Rachael Frush
2018-02-01
Children's ability to understand speakers with a wide range of dialects and accents is essential for efficient language development and communication in a global society. Here, the impact of regional dialect and foreign-accent variability on children's speech understanding was evaluated in both quiet and noisy conditions. Five- to seven-year-old children (n = 90) and adults (n = 96) repeated sentences produced by three speakers with different accents (American English, British English, and Japanese-accented English) in quiet or noisy conditions. Adults had no difficulty understanding any speaker in quiet conditions. Their performance declined for the nonnative speaker with a moderate amount of noise; their performance only substantially declined for the British English speaker (i.e., below 93% correct) when their understanding of the American English speaker was also impeded. In contrast, although children showed accurate word recognition for the American and British English speakers in quiet conditions, they had difficulty understanding the nonnative speaker even under ideal listening conditions. With a moderate amount of noise, their perception of British English speech declined substantially and their ability to understand the nonnative speaker was particularly poor. These results suggest that although school-aged children can understand unfamiliar native dialects under ideal listening conditions, their ability to recognize words in these dialects may be highly susceptible to the influence of environmental degradation. Fully adult-like word identification for speakers with unfamiliar accents and dialects may exhibit a protracted developmental trajectory.
Speaker emotion recognition: from classical classifiers to deep neural networks
NASA Astrophysics Data System (ADS)
Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri
2018-04-01
Speaker emotion recognition has been considered among the most challenging tasks in recent years. In fact, automatic systems for security, medicine or education can be improved by taking the affective state of speech into account. In this paper, a twofold approach for speech emotion classification is proposed: first, a relevant set of features is adopted; second, numerous supervised training techniques, involving classic methods as well as deep learning, are evaluated. Experimental results indicate that deep architectures can improve classification performance on two affective databases, the Berlin Dataset of Emotional Speech and the SAVEE (Surrey Audio-Visual Expressed Emotion) Dataset.
How Captain Amerika uses neural networks to fight crime
NASA Technical Reports Server (NTRS)
Rogers, Steven K.; Kabrisky, Matthew; Ruck, Dennis W.; Oxley, Mark E.
1994-01-01
Artificial neural network models can make amazing computations. These models are explained along with their application in problems associated with fighting crime. Specific problems addressed are identification of people using face recognition, speaker identification, and fingerprint and handwriting analysis (biometric authentication).
The Effects of the Literal Meaning of Emotional Phrases on the Identification of Vocal Emotions.
Shigeno, Sumi
2018-02-01
This study investigates the discrepancy between the literal emotional content of speech and emotional tone in the identification of speakers' vocal emotions in both the listeners' native language (Japanese), and in an unfamiliar language (random-spliced Japanese). Both experiments involve a "congruent condition," in which the emotion contained in the literal meaning of speech (words and phrases) was compatible with vocal emotion, and an "incongruent condition," in which these forms of emotional information were discordant. Results for Japanese indicated that performance in identifying emotions did not differ significantly between the congruent and incongruent conditions. However, the results for random-spliced Japanese indicated that vocal emotion was correctly identified more often in the congruent than in the incongruent condition. The different results for Japanese and random-spliced Japanese suggested that the literal meaning of emotional phrases influences the listener's perception of the speaker's emotion, and that Japanese participants could infer speakers' intended emotions in the incongruent condition.
Wang, Jiang-Ning; Chen, Xiao-Lin; Hou, Xin-Wen; Zhou, Li-Bing; Zhu, Chao-Dong; Ji, Li-Qiang
2017-07-01
Many species of Tephritidae are damaging to fruit, which might negatively impact international fruit trade. Automatic or semi-automatic identification of fruit flies is greatly needed for diagnosing causes of damage and for quarantine protocols for economically relevant insects. A fruit fly image identification system named AFIS1.0 has been developed using 74 species belonging to six genera, which include the majority of pests in the Tephritidae. The system combines automated image identification and manual verification, balancing operability and accuracy. AFIS1.0 integrates image analysis and an expert system into a content-based image retrieval framework. In the automatic identification module, AFIS1.0 gives candidate identification results. Afterwards, users can perform manual selection by comparing unidentified images with a subset of images corresponding to the automatic identification result. The system uses Gabor surface features in automated identification and yielded an overall classification success rate of 87% to the species level in an Independent Multi-part Image Automatic Identification Test. The system is useful for users with or without specific expertise on Tephritidae in the task of rapid and effective identification of fruit flies. It brings the application of computer vision technology to fruit fly recognition much closer to production level. © 2016 Society of Chemical Industry.
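The retrieval step of such a system can be sketched, under assumption, as ranking gallery feature vectors by cosine similarity and handing the top candidates to a human for verification. The Gabor feature extraction is assumed to happen upstream; the function and species names below are illustrative, not AFIS1.0's actual API.

```python
import math

def top_k_candidates(query_feats, gallery, k=5):
    """Content-based image retrieval step of a hypothetical AFIS-style pipeline:
    rank reference species images by cosine similarity to the query's feature
    vector (e.g. Gabor surface features), then return the top-k species names
    for manual verification by the user."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0
    ranked = sorted(gallery.items(),
                    key=lambda kv: cos(query_feats, kv[1]),
                    reverse=True)
    return [species for species, _ in ranked[:k]]
```

Returning a short ranked list rather than a single label is what makes the semi-automatic design work: the automatic stage narrows 74 species to a handful, and the manual stage resolves the rest.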
A language-familiarity effect for speaker discrimination without comprehension.
Fleming, David; Giordano, Bruno L; Caldara, Roberto; Belin, Pascal
2014-09-23
The influence of language familiarity upon speaker identification is well established, to such an extent that it has been argued that "Human voice recognition depends on language ability" [Perrachione TK, Del Tufo SN, Gabrieli JDE (2011) Science 333(6042):595]. However, 7-mo-old infants discriminate speakers of their mother tongue better than they do foreign speakers [Johnson EK, Westrek E, Nazzi T, Cutler A (2011) Dev Sci 14(5):1002-1011] despite their limited speech comprehension abilities, suggesting that speaker discrimination may rely on familiarity with the sound structure of one's native language rather than the ability to comprehend speech. To test this hypothesis, we asked Chinese and English adult participants to rate speaker dissimilarity in pairs of sentences in English or Mandarin that were first time-reversed to render them unintelligible. Even in these conditions a language-familiarity effect was observed: Both Chinese and English listeners rated pairs of native-language speakers as more dissimilar than foreign-language speakers, despite their inability to understand the material. Our data indicate that the language familiarity effect is not based on comprehension but rather on familiarity with the phonology of one's native language. This effect may stem from a mechanism analogous to the "other-race" effect in face recognition.
Auto identification technology and its impact on patient safety in the Operating Room of the Future.
Egan, Marie T; Sandberg, Warren S
2007-03-01
Automatic identification technologies, such as bar coding and radio frequency identification, are ubiquitous in everyday life but virtually nonexistent in the operating room. User expectations, based on everyday experience with automatic identification technologies, have generated much anticipation that these systems will improve readiness, workflow, and safety in the operating room, with minimal training requirements. We report, in narrative form, a multi-year experience with various automatic identification technologies in the Operating Room of the Future Project at Massachusetts General Hospital. In each case, the additional human labor required to make these 'labor-saving' technologies function in the medical environment has proved to be their undoing. We conclude that while automatic identification technologies show promise, significant barriers to realizing their potential still exist. Nevertheless, overcoming these obstacles is necessary if the vision of an operating room of the future in which all processes are monitored, controlled, and optimized is to be achieved.
The Roles of Suprasegmental Features in Predicting English Oral Proficiency with an Automated System
ERIC Educational Resources Information Center
Kang, Okim; Johnson, David
2018-01-01
Suprasegmental features have received growing attention in the field of oral assessment. In this article we describe a set of computer algorithms that automatically scores the oral proficiency of non-native speakers using unconstrained English speech. The algorithms employ machine learning and 11 suprasegmental measures divided into four groups…
ERIC Educational Resources Information Center
LaCelle-Peterson, Mark W.; Rivera, Charlene
1994-01-01
Educational reforms will not automatically have the same effects for native English speakers and English language learners (ELLs). Equitable assessment for ELLs must consider equity issues of assessment technologies, provide information on ELLs' developing language abilities and content-area achievement, and be comprehensive, flexible, progress…
The Influence of Anticipation of Word Misrecognition on the Likelihood of Stuttering
ERIC Educational Resources Information Center
Brocklehurst, Paul H.; Lickley, Robin J.; Corley, Martin
2012-01-01
This study investigates whether the experience of stuttering can result from the speaker's anticipation of his words being misrecognized. Twelve adults who stutter (AWS) repeated single words into what appeared to be an automatic speech-recognition system. Following each iteration of each word, participants provided a self-rating of whether they…
The influence of tone inventory on ERP without focal attention: a cross-language study.
Zheng, Hong-Ying; Peng, Gang; Chen, Jian-Yong; Zhang, Caicai; Minett, James W; Wang, William S-Y
2014-01-01
This study investigates the effect of tone inventories on brain activities underlying pitch without focal attention. We find that the electrophysiological responses to across-category stimuli are larger than those to within-category stimuli when the pitch contours are superimposed on nonspeech stimuli; however, there is no electrophysiological response difference associated with category status in speech stimuli. Moreover, this category effect in nonspeech stimuli is stronger for Cantonese speakers. Results of previous and present studies lead us to conclude that brain activities to the same native lexical tone contrasts are modulated by speakers' language experiences not only in active phonological processing but also in automatic feature detection without focal attention. In contrast to the condition with focal attention, where phonological processing is stronger for speech stimuli, the feature detection (pitch contours in this study) without focal attention as shaped by language background is superior in relatively regular stimuli, that is, the nonspeech stimuli. The results suggest that Cantonese listeners outperform Mandarin listeners in automatic detection of pitch features because of the denser Cantonese tone system.
ERIC Educational Resources Information Center
Blount, Ben G.; Padgug, Elise J.
Features of parental speech to young children were studied in four English-speaking and four Spanish-speaking families. Children ranged in age from 9 to 12 months for the English speakers and from 8 to 22 months for the Spanish speakers. Examination of the utterances led to the identification of 34 prosodic, paralinguistic, and interactional…
Tidal analysis and Arrival Process Mining Using Automatic Identification System (AIS) Data
2017-01-01
files, organized by location. The data were processed using the Python programming language (van Rossum and Drake 2001), the Pandas data analysis... (ERDC/CHL TR-17-2, Coastal Inlets Research Program, January 2017; Brandan M. Scully, Coastal and...)
Recognition of speaker-dependent continuous speech with KEAL
NASA Astrophysics Data System (ADS)
Mercier, G.; Bigorgne, D.; Miclet, L.; Le Guennec, L.; Querre, M.
1989-04-01
A description of the speaker-dependent continuous speech recognition system KEAL is given. An unknown utterance is recognized by means of the following procedures: acoustic analysis, phonetic segmentation and identification, and word and sentence analysis. The combination of feature-based, speaker-independent coarse phonetic segmentation with speaker-dependent statistical classification techniques is one of the main design features of the acoustic-phonetic decoder. The lexical access component is essentially based on a statistical dynamic programming technique which aims at matching a phonemic lexical entry containing various phonological forms against a phonetic lattice. Sentence recognition is achieved by use of a context-free grammar and a parsing algorithm derived from Earley's parser. A speaker adaptation module allows some of the system parameters to be adjusted by matching known utterances with their acoustical representation. The task to be performed, described by its vocabulary and its grammar, is given as a parameter of the system. Continuously spoken sentences extracted from a 'pseudo-Logo' language are analyzed and results are presented.
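The dynamic-programming lexical access described above can be illustrated with a minimal sketch: align a lexicon entry's phoneme sequence against a recognized phoneme sequence using substitution, insertion, and deletion costs, keeping the entry with the lowest total cost. The unit costs are illustrative assumptions; the actual system matches against a phonetic lattice with statistical scores rather than a single best phoneme string.

```python
def phoneme_match_cost(lexical, observed, sub=1.0, ins=1.0, dele=1.0):
    """Weighted edit-distance alignment of a lexical phoneme sequence
    against an observed (recognized) phoneme sequence, filled in with
    the standard dynamic-programming recurrence."""
    m, n = len(lexical), len(observed)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * dele          # all lexical phonemes deleted
    for j in range(1, n + 1):
        d[0][j] = j * ins           # all observed phonemes inserted
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0.0 if lexical[i - 1] == observed[j - 1] else sub
            d[i][j] = min(d[i - 1][j - 1] + cost,   # match / substitute
                          d[i - 1][j] + dele,       # delete lexical phoneme
                          d[i][j - 1] + ins)        # insert observed phoneme
    return d[m][n]
```

Lexical access then reduces to scoring every phonological form of every candidate entry and taking the minimum-cost match.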
Age as a Factor in Ethnic Accent Identification in Singapore
ERIC Educational Resources Information Center
Tan, Ying Ying
2012-01-01
This study seeks to answer two research questions. First, can listeners distinguish the ethnicity of the speakers on the basis of voice quality alone? Second, do demographic differences among the listeners affect discriminability? A simple but carefully designed and controlled ethnic identification test was carried out on 325 Singaporean…
NASA Astrophysics Data System (ADS)
O'Sullivan, James; Chen, Zhuo; Herrero, Jose; McKhann, Guy M.; Sheth, Sameer A.; Mehta, Ashesh D.; Mesgarani, Nima
2017-10-01
Objective. People who suffer from hearing impairments can find it difficult to follow a conversation in a multi-speaker environment. Current hearing aids can suppress background noise; however, there is little that can be done to help a user attend to a single conversation amongst many without knowing which speaker the user is attending to. Cognitively controlled hearing aids that use auditory attention decoding (AAD) methods are the next step in offering help. Translating the successes in AAD research to real-world applications poses a number of challenges, including the lack of access to the clean sound sources in the environment with which to compare with the neural signals. We propose a novel framework that combines single-channel speech separation algorithms with AAD. Approach. We present an end-to-end system that (1) receives a single audio channel containing a mixture of speakers that is heard by a listener along with the listener’s neural signals, (2) automatically separates the individual speakers in the mixture, (3) determines the attended speaker, and (4) amplifies the attended speaker’s voice to assist the listener. Main results. Using invasive electrophysiology recordings, we identified the regions of the auditory cortex that contribute to AAD. Given appropriate electrode locations, our system is able to decode the attention of subjects and amplify the attended speaker using only the mixed audio. Our quality assessment of the modified audio demonstrates a significant improvement in both subjective and objective speech quality measures. Significance. Our novel framework for AAD bridges the gap between the most recent advancements in speech processing technologies and speech prosthesis research and moves us closer to the development of cognitively controlled hearable devices for the hearing impaired.
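The attended-speaker stage of the pipeline above can be sketched in toy form: pick the separated source whose envelope correlates best with the neural signal, then boost it in the remix. Real AAD systems use trained stimulus-reconstruction decoders on cortical recordings; plain Pearson correlation and the `gain` parameter here are simplifying assumptions.

```python
def decode_and_amplify(neural, sources, gain=4.0):
    """Toy auditory-attention-decoding step: 'neural' is a decoded envelope
    from the listener's recordings, 'sources' are the separated speakers'
    envelopes. Returns the attended source index and the reweighted mix."""
    def pearson(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        vx = sum((a - mx) ** 2 for a in x)
        vy = sum((b - my) ** 2 for b in y)
        return cov / (vx * vy) ** 0.5 if vx and vy else 0.0
    # The attended speaker is the source most correlated with the neural signal.
    attended = max(range(len(sources)), key=lambda i: pearson(neural, sources[i]))
    # Remix: amplify the attended source, pass the others through unchanged.
    mix = [sum(gain * s[t] if i == attended else s[t]
               for i, s in enumerate(sources))
           for t in range(len(neural))]
    return attended, mix
```

In a practical system the decoding would run on sliding windows so the amplified speaker can track shifts of the listener's attention.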
ERIC Educational Resources Information Center
Horton, William S.
2007-01-01
In typical interactions, speakers frequently produce utterances that appear to reflect beliefs about the common ground shared with particular addressees. Horton and Gerrig (2005a) proposed that one important basis for audience design is the manner in which conversational partners serve as cues for the automatic retrieval of associated information…
Frequency of Use Leads to Automaticity of Production: Evidence from Repair in Conversation
ERIC Educational Resources Information Center
Kapatsinski, Vsevolod
2010-01-01
In spontaneous speech, speakers sometimes replace a word they have just produced or started producing by another word. The present study reports that in these replacement repairs, low-frequency replaced words are more likely to be interrupted prior to completion than high-frequency words, providing support to the hypothesis that the production of…
ERIC Educational Resources Information Center
Lansman, Marcy, Ed.; Hunt, Earl, Ed.
This technical report contains papers prepared by the 11 speakers at the 1980 Lake Wilderness (Seattle, Washington) Conference on Attention. The papers are divided into general models, physiological evidence, and visual attention categories. Topics of the papers include the following: (1) willed versus automatic control of behavior; (2) multiple…
Congenital amusia in speakers of a tone language: association with lexical tone agnosia.
Nan, Yun; Sun, Yanan; Peretz, Isabelle
2010-09-01
Congenital amusia is a neurogenetic disorder that affects the processing of musical pitch in speakers of non-tonal languages like English and French. We assessed whether this musical disorder exists among speakers of Mandarin Chinese who use pitch to alter the meaning of words. Using the Montreal Battery of Evaluation of Amusia, we tested 117 healthy young Mandarin speakers with no self-declared musical problems and 22 individuals who reported musical difficulties and scored two standard deviations below the mean obtained by the Mandarin speakers without amusia. These 22 amusic individuals showed a similar pattern of musical impairment as did amusic speakers of non-tonal languages, by exhibiting a more pronounced deficit in melody than in rhythm processing. Furthermore, nearly half the tested amusics had impairments in the discrimination and identification of Mandarin lexical tones. Six showed marked impairments, displaying what could be called lexical tone agnosia, but had normal tone production. Our results show that speakers of tone languages such as Mandarin may experience musical pitch disorder despite early exposure to speech-relevant pitch contrasts. The observed association between the musical disorder and lexical tone difficulty indicates that the pitch disorder as defining congenital amusia is not specific to music or culture but is rather general in nature.
Alternative Speech Communication System for Persons with Severe Speech Disorders
NASA Astrophysics Data System (ADS)
Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas
2009-12-01
Assistive speech-enabled systems are proposed to help both French- and English-speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech, making it as natural as possible and close to the speaker's original voice. The resynthesized utterances use new basic units, a new concatenating algorithm, and a grafting technique to correct poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) were carried out to demonstrate the efficiency of the proposed methods. Improvements in the Perceptual Evaluation of Speech Quality (PESQ) value of 5% and more than 20% were achieved by the speech synthesis systems dealing with SSDs and dysarthria, respectively.
47 CFR 80.275 - Technical Requirements for Class A Automatic Identification System (AIS) equipment.
Code of Federal Regulations, 2013 CFR
2013-10-01
... Compulsory Ships § 80.275 Technical Requirements for Class A Automatic Identification System (AIS) equipment. (a) Prior to submitting a certification application for a Class A AIS device, the following... Identification System (AIS) equipment. 80.275 Section 80.275 Telecommunication FEDERAL COMMUNICATIONS COMMISSION...
Sensing of Particular Speakers for the Construction of Voice Interface Utilized in Noisy Environment
NASA Astrophysics Data System (ADS)
Sawada, Hideyuki; Ohkado, Minoru
Humans are able to exchange information smoothly by voice in difficult situations, such as a noisy crowd or in the presence of multiple speakers. We can detect the position of a sound source in 3D space, extract a particular sound from mixed sounds, and recognize who is talking. By realizing this mechanism with a computer, new applications become possible: recording sound with high quality by reducing noise, presenting a clarified sound, and achieving microphone-free speech recognition by extracting a particular sound. This paper introduces real-time detection and identification of a particular speaker in a noisy environment using a microphone array, based on the location of the speaker and individual voice characteristics. The study will be applied to develop an adaptive auditory system for a mobile robot that collaborates with a factory worker.
Understanding of emotions and false beliefs among hearing children versus deaf children.
Ziv, Margalit; Most, Tova; Cohen, Shirit
2013-04-01
Emotion understanding and theory of mind (ToM) are two major aspects of social cognition in which deaf children demonstrate developmental delays. The current study investigated these social cognition aspects in two subgroups of deaf children-those with cochlear implants who communicate orally (speakers) and those who communicate primarily using sign language (signers)-in comparison to hearing children. Participants were 53 Israeli kindergartners-20 speakers, 10 signers, and 23 hearing children. Tests included four emotion identification and understanding tasks and one false belief task (ToM). Results revealed similarities among all children's emotion labeling and affective perspective taking abilities, similarities between speakers and hearing children in false beliefs and in understanding emotions in typical contexts, and lower performance of signers on the latter three tasks. Adapting educational experiences to the unique characteristics and needs of speakers and signers is recommended.
Automatic identification of species with neural networks.
Hernández-Serna, Andrés; Jiménez-Segura, Luz Fernanda
2014-01-01
A new automatic identification system using photographic images has been designed to recognize fish, plant, and butterfly species from Europe and South America. The automatic classification system integrates multiple image processing tools to extract the geometry, morphology, and texture of the images. Artificial neural networks (ANNs) were used as the pattern recognition method. We tested a data set that included 740 species and 11,198 individuals. Our results show that the system performed with high accuracy, reaching 91.65% true-positive identifications for fish, 92.87% for plants, and 93.25% for butterflies. Our results highlight how neural networks can serve as a complementary tool for species identification.
ERIC Educational Resources Information Center
Adel, Annelie; Erman, Britt
2012-01-01
In order for discourse to be considered idiomatic, it needs to exhibit features like fluency and pragmatically appropriate language use. Advances in corpus linguistics make it possible to examine idiomaticity from the perspective of recurrent word combinations. One approach to capture such word combinations is by the automatic retrieval of lexical…
Jung, Jeeyoun; Park, Bongki; Lee, Ju Ah; You, Sooseong; Alraek, Terje; Bian, Zhao-Xiang; Birch, Stephen; Kim, Tae-Hun; Xu, Hao; Zaslawski, Chris; Kang, Byoung-Kab; Lee, Myeong Soo
2016-09-01
An international brainstorming session on standardizing pattern identification (PI) was held at the Korea Institute of Oriental Medicine on October 1, 2013 in Daejeon, South Korea. This brainstorming session was convened to gather insights from international traditional East Asian medicine specialists regarding PI standardization. With eight presentations and discussion sessions, the meeting allowed participants to discuss research methods and diagnostic systems used in traditional medicine for PI. One speaker presented a talk titled "The diagnostic criteria for blood stasis syndrome: implications for standardization of PI". Four speakers presented on future strategies and objective measurement tools that could be used in PI research. Later, participants shared information and methodology for accurate diagnosis and PI. They also discussed the necessity for standardizing PI and methods for international collaborations in pattern research.
Suspect/foil identification in actual crimes and in the laboratory: a reality monitoring analysis.
Behrman, Bruce W; Richards, Regina E
2005-06-01
Four reality monitoring variables were used to discriminate suspect from foil identifications in 183 actual criminal cases. Four hundred sixty-one identification attempts based on five- and six-person lineups were analyzed. These identification attempts resulted in 238 suspect identifications and 68 foil identifications. Confidence, automatic processing, eliminative processing, and feature use comprised the set of reality monitoring variables. Thirty-five verbal confidence phrases taken from police reports were assigned numerical values on a 10-point confidence scale. Automatic processing identifications were those that occurred "immediately" or "without hesitation." Eliminative processing identifications occurred when witnesses compared or eliminated persons in the lineups. Confidence, automatic processing, and eliminative processing were significant predictors, but feature use was not. Confidence was the most effective discriminator. In cases that involved substantial evidence extrinsic to the identification, 43% of the suspect identifications were made with high confidence, whereas only 10% of the foil identifications were made with high confidence. The results of a laboratory study using the same predictors generally paralleled the archival results. Forensic implications are discussed.
Automatic Car Identification - an Evaluation
DOT National Transportation Integrated Search
1972-03-01
In response to a Federal Railroad Administration request, the Transportation Systems Center evaluated the Automatic Car Identification System (ACI) used on the nation's railroads. The ACI scanner was found to be adequate for reliable data output whil...
Perception of musical and lexical tones by Taiwanese-speaking musicians.
Lee, Chao-Yang; Lee, Yuh-Fang; Shr, Chia-Lin
2011-07-01
This study explored the relationship between music and speech by examining absolute pitch and lexical tone perception. Taiwanese-speaking musicians were asked to identify musical tones without a reference pitch and multispeaker Taiwanese level tones without acoustic cues typically present for speaker normalization. The results showed that a high percentage of the participants (65% with an exact match required and 81% with one-semitone errors allowed) possessed absolute pitch, as measured by the musical tone identification task. A negative correlation was found between occurrence of absolute pitch and age of onset of musical training, suggesting that the acquisition of absolute pitch resembles the acquisition of speech. The participants were able to identify multispeaker Taiwanese level tones with above-chance accuracy, even though the acoustic cues typically present for speaker normalization were not available in the stimuli. No correlations were found between the performance in musical tone identification and the performance in Taiwanese tone identification. Potential reasons for the lack of association between the two tasks are discussed. © 2011 Acoustical Society of America
"Who" is saying "what"? Brain-based decoding of human voice and speech.
Formisano, Elia; De Martino, Federico; Bonte, Milene; Goebel, Rainer
2008-11-07
Can we decipher speech content ("what" is being said) and speaker identity ("who" is saying it) from observations of brain activity of a listener? Here, we combine functional magnetic resonance imaging with a data-mining algorithm and retrieve what and whom a person is listening to from the neural fingerprints that speech and voice signals elicit in the listener's auditory cortex. These cortical fingerprints are spatially distributed and insensitive to acoustic variations of the input so as to permit the brain-based recognition of learned speech from unknown speakers and of learned voices from previously unheard utterances. Our findings unravel the detailed cortical layout and computational properties of the neural populations at the basis of human speech recognition and speaker identification.
Optical Automatic Car Identification (OACI) : Volume 1. Advanced System Specification.
DOT National Transportation Integrated Search
1978-12-01
A performance specification is provided in this report for an Optical Automatic Car Identification (OACI) scanner system which features 6% improved readability over existing industry scanner systems. It also includes the analysis and rationale which ...
Estimating spatial travel times using automatic vehicle identification data
DOT National Transportation Integrated Search
2001-01-01
Prepared ca. 2001. The paper describes an algorithm that was developed for estimating reliable and accurate average roadway link travel times using Automatic Vehicle Identification (AVI) data. The algorithm presented is unique in two aspects. First, ...
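The core of AVI-based travel-time estimation is timestamp differencing between vehicle IDs matched at two readers, followed by outlier screening. A minimal sketch with invented reads; the matching and the median-based filter here are illustrative assumptions, not the specific algorithm described in the paper:

```python
# Each reader logs (vehicle_id, timestamp_seconds) tuples.

def link_travel_times(upstream, downstream):
    """Match vehicle IDs seen at both readers and return per-vehicle
    travel times in seconds (downstream time minus upstream time)."""
    up = dict(upstream)
    return [t - up[vid] for vid, t in downstream if vid in up and t > up[vid]]

def average_travel_time(times, low=0.5, high=2.0):
    """Average after discarding times far from the median
    (a simple outlier screen for mis-matches and detours)."""
    med = sorted(times)[len(times) // 2]
    kept = [t for t in times if low * med <= t <= high * med]
    return sum(kept) / len(kept)

upstream = [("A1", 0), ("B2", 5), ("C3", 9), ("D4", 12)]
downstream = [("A1", 60), ("B2", 70), ("C3", 300), ("D4", 71)]

times = link_travel_times(upstream, downstream)  # [60, 65, 291, 59]
print(average_travel_time(times))                # 291 s is screened out
```

Vehicle C3 (291 s against a median of 65 s) is excluded by the filter, so the average reflects only plausible link traversals.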
The influence of ambient speech on adult speech productions through unintentional imitation.
Delvaux, Véronique; Soquet, Alain
2007-01-01
This paper deals with the influence of ambient speech on individual speech productions. A methodological framework is defined to gather the experimental data necessary to feed computer models simulating self-organisation in phonological systems. Two experiments were carried out. Experiment 1 was run on four French native speakers from two regiolects of Belgium: two from Liège and two from Brussels. When exposed to the way of speaking of the other regiolect via loudspeakers, the speakers of one regiolect produced vowels that were significantly different from their typical realisations and significantly closer to the way of speaking specific to the other regiolect. Experiment 2 replicated the results with 8 Mons speakers hearing a Liège speaker. A significant part of the imitative effect remained up to 10 min after the end of the exposure to the other regiolect's productions. As a whole, the results suggest that: (i) imitation occurs automatically and unintentionally, and (ii) the modified realisations leave a memory trace, in which case the mechanism may be better described as 'mimesis' than as 'imitation'. The potential effects of multiple imitative speech interactions on sound change are discussed, as well as the implications for a general theory of phonetic implementation and phonetic representation.
On the Development of Speech Resources for the Mixtec Language
2013-01-01
The Mixtec language is one of the main native languages in Mexico. In general, due to urbanization, discrimination, and limited attempts to promote the culture, the native languages are disappearing. Most of the information available about the Mixtec language is in written form, as in dictionaries which, although including examples of how to pronounce the Mixtec words, are not as reliable as listening to the correct pronunciation from a native speaker. Formal acoustic resources, such as speech corpora, are almost non-existent for Mixtec, and no speech technologies are known to have been developed for it. This paper presents the development of the following resources for the Mixtec language: (1) a speech database of traditional narratives of the Mixtec culture spoken by a native speaker (labelled at the phonetic and orthographic levels by means of spectral analysis) and (2) a native speaker-adaptive automatic speech recognition (ASR) system (trained with the speech database) integrated with a Mixtec-to-Spanish/Spanish-to-Mixtec text translator. The speech database, although small and limited to a single variant, was reliable enough to build the multiuser speech application, which achieved a mean recognition/translation performance of up to 94.36% in experiments with non-native speakers (the target users). PMID:23710134
Choi, Ji Eun; Moon, Il Joon; Kim, Eun Yeon; Park, Hee-Sung; Kim, Byung Kil; Chung, Won-Ho; Cho, Yang-Sun; Brown, Carolyn J; Hong, Sung Hwa
The aim of this study was to compare binaural performance on an auditory localization task and a speech-perception-in-babble measure between children who use a cochlear implant (CI) in one ear and a hearing aid (HA) in the other (bimodal fitting) and those who use bilateral CIs. Thirteen children (mean age ± SD = 10 ± 2.9 years) with bilateral CIs and 19 children with bimodal fitting were recruited to participate. Sound localization was assessed using a 13-loudspeaker array in a quiet sound-treated booth. Speakers were placed in an arc from -90° azimuth to +90° azimuth (15° intervals) in the horizontal plane. To assess the accuracy of sound location identification, we calculated the absolute error in degrees between the target speaker and the response speaker during each trial. The mean absolute error was computed by dividing the sum of absolute errors by the total number of trials. We also calculated the hemifield identification score to reflect the accuracy of right/left discrimination. Speech-in-babble perception was also measured in the sound field using target speech presented from the front speaker. Eight-talker babble was presented in four different listening conditions: from the front speaker (0°), from one of the two side speakers (+90° or -90°), and from both side speakers (±90°). A speech, spatial, and quality questionnaire was administered. When the two groups of children were directly compared, there was no significant difference in localization accuracy or hemifield identification score under the binaural condition. Performance on the speech perception test was also similar under most babble conditions. However, when the babble was from the first device side (CI side for children with bimodal stimulation or first CI side for children with bilateral CIs), speech understanding in babble by bilateral CI users was significantly better than that by bimodal listeners.
Speech, spatial, and quality scores were comparable between the two groups. Overall, binaural performance was similar between children fit with two CIs (CI + CI) and those using bimodal stimulation (HA + CI) in most conditions. However, the bilateral CI group showed better speech perception than the bimodal group when babble was from the first device side (first CI side for bilateral CI users or CI side for bimodal listeners). Therefore, if bimodal performance is significantly below the mean bilateral CI performance on speech perception in babble, these results suggest that transitioning the child from bimodal stimulation to bilateral CIs should be considered.
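The two localization metrics described above are straightforward to compute. A minimal Python sketch with illustrative azimuths rather than the study's data; the midline-exclusion rule in the hemifield score is an assumption, since the abstract does not specify how 0° trials were handled:

```python
def mean_absolute_error(targets, responses):
    """Mean absolute error in degrees: sum of per-trial absolute
    target-response differences divided by the number of trials."""
    errors = [abs(t - r) for t, r in zip(targets, responses)]
    return sum(errors) / len(errors)

def hemifield_score(targets, responses):
    """Fraction of trials where the response falls in the same
    left/right hemifield as the target (midline targets skipped)."""
    same = total = 0
    for t, r in zip(targets, responses):
        if t == 0:          # skip midline (0° azimuth) targets
            continue
        total += 1
        if (t > 0) == (r > 0):
            same += 1
    return same / total

targets = [-90, -45, 15, 60, 90]     # degrees azimuth
responses = [-75, -60, -15, 45, 90]

print(mean_absolute_error(targets, responses))  # 15.0
print(hemifield_score(targets, responses))      # 0.8
```

In the example, only the 15° trial crosses the midline, so the hemifield score is 4/5 even though every trial contributes to the error average.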
Interface of Linguistic and Visual Information During Audience Design.
Fukumura, Kumiko
2015-08-01
Evidence suggests that speakers can take account of the addressee's needs when referring. However, what representations drive the speaker's audience design has been less clear. This study aims to go beyond previous studies by investigating the interplay between the visual and linguistic context during audience design. Speakers repeated subordinate descriptions (e.g., firefighter) given in the prior linguistic context less and used basic-level descriptions (e.g., man) more when the addressee did not hear the linguistic context than when s/he did. But crucially, this effect happened only when the referent lacked the visual attributes associated with the expressions (e.g., the referent was in plain clothes rather than in a firefighter uniform), so there was no other contextual cue available for the identification of the referent. This suggests that speakers flexibly use different contextual cues to help their addressee map the referring expression onto the intended referent. In addition, speakers used fewer pronouns when the addressee did not hear the linguistic antecedent than when s/he did. This suggests that although speakers may be egocentric during anaphoric reference (Fukumura & Van Gompel, 2012), they can cooperatively avoid pronouns when the linguistic antecedents were not shared with their addressee during initial reference. © 2014 Cognitive Science Society, Inc.
Roadway system assessment using bluetooth-based automatic vehicle identification travel time data.
DOT National Transportation Integrated Search
2012-12-01
This monograph is an exposition of several practice-ready methodologies for automatic vehicle identification (AVI) data collection : systems. This includes considerations in the physical setup of the collection system as well as the interpretation of...
Voice input/output capabilities at Perception Technology Corporation
NASA Technical Reports Server (NTRS)
Ferber, Leon A.
1977-01-01
Condensed resumes of key company personnel at Perception Technology Corporation are presented. The staff possesses expertise in speech recognition, speech synthesis, speaker authentication, and language identification. The capabilities of the hardware and software engineers are also included.
Noh, Heil; Lee, Dong-Hee
2012-01-01
This study aimed to identify the quantitative differences between Korean and English in long-term average speech spectra (LTASS). Twenty Korean speakers, who lived in the capital of Korea and spoke standard Korean as their first language, were compared with 20 native English speakers. For the Korean speakers, a passage from a novel and a passage from a leading newspaper article were chosen. For the English speakers, the Rainbow Passage was used. The speech was digitally recorded using a GenRad 1982 Precision Sound Level Meter and GoldWave® software, and analyzed in MATLAB. There was no significant difference in the LTASS between the Korean subjects reading a news article or a novel. For male subjects, the LTASS of Korean speakers was significantly lower than that of English speakers above 1.6 kHz except at 4 kHz, and the difference was more than 5 dB, especially at higher frequencies. For female subjects, the LTASS of Korean speakers showed significantly lower levels at 0.2, 0.5, 1, 1.25, 2, 2.5, 6.3, 8, and 10 kHz, but the differences were less than 5 dB. Compared with English speakers, the LTASS of Korean speakers showed significantly lower levels at frequencies above 2 kHz except at 4 kHz. The difference was less than 5 dB between 2 and 5 kHz but more than 5 dB above 6 kHz. To adjust the formula for fitting hearing aids for Koreans, our results based on the LTASS analysis suggest that one needs to raise the gain in high-frequency regions.
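An LTASS is essentially an average power spectrum taken over many short frames of a long recording. A minimal NumPy sketch; the Hann window, 1024-sample frame, and 50% hop are arbitrary illustrative choices, not the parameters of the study's MATLAB analysis:

```python
import numpy as np

def ltass_db(signal, fs, frame_len=1024, hop=512):
    """Long-term average speech spectrum: average the power spectra of
    overlapping Hann-windowed frames, return dB values plus the
    matching frequency axis."""
    window = np.hanning(frame_len)
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        spectra.append(np.abs(np.fft.rfft(frame)) ** 2)
    mean_power = np.mean(spectra, axis=0)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    return freqs, 10 * np.log10(mean_power + 1e-12)  # small floor avoids log(0)

# Illustrative use on synthetic "speech": one second of noise at 16 kHz.
rng = np.random.default_rng(0)
freqs, spectrum = ltass_db(rng.standard_normal(16000), fs=16000)
```

Comparing two speaker groups, as in the study, then reduces to subtracting their `spectrum` arrays at each frequency bin (or within frequency bands) to find where the level difference exceeds 5 dB.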
Martens, Heidi; Van Nuffelen, Gwen; Dekens, Tomas; Hernández-Díaz Huici, Maria; Kairuz Hernández-Díaz, Hector Arturo; De Letter, Miet; De Bodt, Marc
2015-01-01
Most studies on treatment of prosody in individuals with dysarthria due to Parkinson's disease are based on intensive treatment of loudness. The present study investigates the effect of intensive treatment of speech rate and intonation on the intelligibility of individuals with dysarthria due to Parkinson's disease. A one-group pretest-posttest design was used to compare intelligibility, speech rate, and intonation before and after treatment. Participants included eleven Dutch-speaking individuals with predominantly moderate dysarthria due to Parkinson's disease, who received five one-hour treatment sessions per week during three weeks. Treatment focused on lowering speech rate and magnifying the phrase-final intonation contrast between statements and questions. Intelligibility was perceptually assessed using a standardized sentence intelligibility test. Speech rate was automatically assessed during the sentence intelligibility test as well as during a passage reading task and a storytelling task. Intonation was perceptually assessed using a sentence reading task and a sentence repetition task, and also acoustically analyzed in terms of maximum fundamental frequency. After treatment, there was a significant improvement of sentence intelligibility (effect size .83), a significant increase of pause frequency during the passage reading task, a significant improvement of correct listener identification of statements and questions, and a significant increase of the maximum fundamental frequency in the final syllable of questions during both intonation tasks. The findings suggest that participants were more intelligible and more able to manipulate pause frequency and statement-question intonation after treatment. However, the relationship between the change in intelligibility on the one hand and the changes in speech rate and intonation on the other hand is not yet fully understood. Results are interpreted in light of the research design used.
The reader will be able to: (1) describe the effect of intensive speech rate and intonation treatment on intelligibility of speakers with dysarthria due to PD, (2) describe the effect of intensive speech rate treatment on rate manipulation by speakers with dysarthria due to PD, and (3) describe the effect of intensive intonation treatment on manipulation of the phrase final intonation contrast between statements and questions by speakers with dysarthria due to PD. Copyright © 2015 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Suzuki, Yuichi; DeKeyser, Robert
2017-01-01
Recent research has called for the use of fine-grained measures that distinguish implicit knowledge from automatized explicit knowledge. In the current study, such measures were used to determine how the two systems interact in a naturalistic second language (L2) acquisition context. One hundred advanced L2 speakers of Japanese living in Japan…
ERIC Educational Resources Information Center
Yee, Penny L.
This study investigates the role of specific inhibitory processes in lexical ambiguity resolution. An attentional view of inhibition and a view based on specific automatic inhibition between nodes predict different results when a neutral item is processed between an ambiguous word and a related target. Subjects were 32 English speakers with normal…
NASA Astrophysics Data System (ADS)
Kao, Meng-Chun; Ting, Chien-Kun; Kuo, Wen-Chuan
2018-02-01
Incorrect placement of the needle causes medical complications during epidural block, such as dural puncture or spinal cord injury. This study proposes a system that combines an optical coherence tomography (OCT) imaging probe with an automatic identification (AI) system to objectively identify the position of the epidural needle tip. The automatic identification system uses three image features as parameters to distinguish the different tissues via three classifiers. Finally, we found that the support vector machine (SVM) classifier had the highest accuracy, specificity, and sensitivity, which reached 95%, 98%, and 92%, respectively.
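The three figures reported for the SVM classifier all derive from a binary confusion matrix. A small sketch with invented labels (1 = epidural space, 0 = other tissue); the label values are illustrative, not data from the study:

```python
def classifier_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true-positive rate), and specificity
    (true-negative rate) from binary classification labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)   # fraction of true positives detected
    specificity = tn / (tn + fp)   # fraction of true negatives detected
    return accuracy, sensitivity, specificity

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(classifier_metrics(y_true, y_pred))
```

Note that sensitivity and specificity can diverge sharply from accuracy when classes are imbalanced, which is why the study reports all three.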
Weber, Kirsten; Luther, Lisa; Indefrey, Peter; Hagoort, Peter
2016-05-01
When we learn a second language later in life, do we integrate it with the established neural networks in place for the first language or is at least a partially new network recruited? While there is evidence that simple grammatical structures in a second language share a system with the native language, the story becomes more multifaceted for complex sentence structures. In this study, we investigated the underlying brain networks in native speakers compared with proficient second language users while processing complex sentences. As hypothesized, complex structures were processed by the same large-scale inferior frontal and middle temporal language networks of the brain in the second language, as seen in native speakers. These effects were seen both in activations and task-related connectivity patterns. Furthermore, the second language users showed increased task-related connectivity from inferior frontal to inferior parietal regions of the brain, regions related to attention and cognitive control, suggesting less automatic processing for these structures in a second language.
Video indexing based on image and sound
NASA Astrophysics Data System (ADS)
Faudemay, Pascal; Montacie, Claude; Caraty, Marie-Jose
1997-10-01
Video indexing is a major challenge for both scientific and economic reasons. Information extraction can sometimes be easier from the sound channel than from the image channel. We first present a multi-channel and multi-modal query interface to query sound, image, and script through 'pull' and 'push' queries. We then summarize the segmentation phase, which needs information from the image channel. Detection of critical segments is proposed; it should speed up both automatic and manual indexing. We then present an overview of the information extraction phase. Information can be extracted from the sound channel through speaker recognition, vocal dictation with unconstrained vocabularies, and script alignment with speech. We present experimental results for these various techniques. Speaker recognition methods were tested on the TIMIT and NTIMIT databases. Vocal dictation was tested on newspaper sentences spoken by several speakers. Script alignment was tested on part of a cartoon movie, 'Ivanhoe'. For good-quality sound segments, error rates are low enough for use in indexing applications. Major issues are the processing of sound segments with noise or music, and performance improvement through the use of appropriate, low-cost architectures or networks of workstations.
Automatic Publication of a MIS Product to GeoNetwork: Case of the AIS Indexer
2012-11-01
An Automatic Identification System (AIS) reception indexer Java application was developed in the summer of 2011, based on the work of Lapinski and... The report includes instructions for installing and configuring the software packages Java 1.6 and MySQL 5.5 that the application requires.
Associative priming in a masked perceptual identification task: evidence for automatic processes.
Pecher, Diane; Zeelenberg, René; Raaijmakers, Jeroen G W
2002-10-01
Two experiments investigated the influence of automatic and strategic processes on associative priming effects in a perceptual identification task in which prime-target pairs are briefly presented and masked. In this paradigm, priming is defined as a higher percentage of correctly identified targets for related pairs than for unrelated pairs. In Experiment 1, priming was obtained for mediated word pairs. This mediated priming effect was affected neither by the presence of direct associations nor by the presentation time of the primes, indicating that automatic priming effects play a role in perceptual identification. Experiment 2 showed that the priming effect was not affected by the proportion (.90 vs. .10) of related pairs if primes were presented briefly to prevent their identification. However, a large proportion effect was found when primes were presented for 1000 ms so that they were clearly visible. These results indicate that priming in a masked perceptual identification task is the result of automatic processes and is not affected by strategies. The present paradigm provides a valuable alternative to more commonly used tasks such as lexical decision.
Abbreviation definition identification based on automatic precision estimates.
Sohn, Sunghwan; Comeau, Donald C; Kim, Won; Wilbur, W John
2008-09-25
The rapid growth of biomedical literature presents challenges for automatic text processing, and one of these challenges is abbreviation identification. The presence of unrecognized abbreviations in text hinders indexing algorithms and adversely affects information retrieval and extraction. Automatic abbreviation definition identification can help resolve these issues. However, abbreviations and their definitions identified by an automatic process are of uncertain validity. Due to the size of databases such as MEDLINE, only a small fraction of abbreviation-definition pairs can be examined manually. An automatic way to estimate the accuracy of abbreviation-definition pairs extracted from text is therefore needed. In this paper we propose an abbreviation definition identification algorithm that employs a variety of strategies to identify the most probable abbreviation definition. In addition, our algorithm produces an accuracy estimate, pseudo-precision, for each strategy without using a human-judged gold standard. The pseudo-precisions determine the order in which the algorithm applies the strategies in seeking to identify the definition of an abbreviation. On the Medstract corpus our algorithm produced 97% precision and 85% recall, which is higher than previously reported results. We also annotated 1250 randomly selected MEDLINE records as a gold standard. On this set we achieved 96.5% precision and 83.2% recall. This compares favourably with the well-known Schwartz and Hearst algorithm. We developed an algorithm for abbreviation identification that uses a variety of strategies to identify the most probable definition for an abbreviation and also produces an estimated accuracy of the result. This process is purely automatic.
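The Schwartz and Hearst baseline mentioned above works by matching the characters of a short form backwards through the text that precedes a parenthesized abbreviation. A simplified sketch of that character-matching idea, not the authors' multi-strategy system with pseudo-precision ranking:

```python
def find_definition(text_before_paren, short_form):
    """Minimal Schwartz-Hearst-style matcher: scan backwards from the
    end of the text preceding '(SF)', matching each alphanumeric
    character of the short form; the first character must also begin
    a word of the candidate long form. Returns the long form or None."""
    s_idx = len(short_form) - 1
    l_idx = len(text_before_paren) - 1
    while s_idx >= 0:
        c = short_form[s_idx].lower()
        if not c.isalnum():          # skip punctuation in the short form
            s_idx -= 1
            continue
        # move left until the character matches; the first short-form
        # character must additionally sit at a word boundary
        while l_idx >= 0 and (
            text_before_paren[l_idx].lower() != c
            or (s_idx == 0 and l_idx > 0
                and text_before_paren[l_idx - 1].isalnum())
        ):
            l_idx -= 1
        if l_idx < 0:
            return None              # no consistent long form found
        s_idx -= 1
        l_idx -= 1
    return text_before_paren[l_idx + 1:].strip()

print(find_definition("identified using magnetic resonance imaging", "MRI"))
# → magnetic resonance imaging
```

A pseudo-precision estimate of the kind the paper describes would then score how often such a strategy's output agrees with other, independent extraction strategies, avoiding the need for a human-judged gold standard.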
Affective processing in bilingual speakers: disembodied cognition?
Pavlenko, Aneta
2012-01-01
A recent study by Keysar, Hayakawa, and An (2012) suggests that "thinking in a foreign language" may reduce decision biases because a foreign language provides a greater emotional distance than a native tongue. The possibility of such "disembodied" cognition is of great interest for theories of affect and cognition and for many other areas of psychological theory and practice, from clinical and forensic psychology to marketing, but first this claim needs to be properly evaluated. The purpose of this review is to examine the findings of clinical, introspective, cognitive, psychophysiological, and neuroimaging studies of affective processing in bilingual speakers in order to identify converging patterns of results, to evaluate the claim about "disembodied cognition," and to outline directions for future inquiry. The findings to date reveal two interrelated processing effects. First-language (L1) advantage refers to increased automaticity of affective processing in the L1 and heightened electrodermal reactivity to L1 emotion-laden words. Second-language (L2) advantage refers to decreased automaticity of affective processing in the L2, which reduces interference effects and lowers electrodermal reactivity to negative emotional stimuli. The differences in L1 and L2 affective processing suggest that in some bilingual speakers, in particular late bilinguals and foreign language users, respective languages may be differentially embodied, with the later learned language processed semantically but not affectively. This difference accounts for the reduction of framing biases in L2 processing in the study by Keysar et al. (2012). The follow-up discussion identifies the limits of the findings to date in terms of participant populations, levels of processing, and types of stimuli, puts forth alternative explanations of the documented effects, and articulates predictions to be tested in future research.
The emotional impact of being myself: Emotions and foreign-language processing.
Ivaz, Lela; Costa, Albert; Duñabeitia, Jon Andoni
2016-03-01
Native languages are acquired in emotionally rich contexts, whereas foreign languages are typically acquired in emotionally neutral academic environments. As a consequence of this difference, it has been suggested that bilinguals' emotional reactivity in foreign-language contexts is reduced as compared with native language contexts. In the current study, we investigated whether this emotional distance associated with foreign languages could modulate automatic responses to self-related linguistic stimuli. Self-related stimuli enhance performance by boosting memory, speed, and accuracy as compared with stimuli unrelated to the self (the so-called self-bias effect). We explored whether this effect depends on the language context by comparing self-biases in a native and a foreign language. Two experiments were conducted with native Spanish speakers with a high level of English proficiency in which they were asked to complete a perceptual matching task during which they associated simple geometric shapes (circles, squares, and triangles) with the labels "you," "friend," and "other" either in their native or foreign language. Results showed a robust asymmetry in the self-bias in the native- and foreign-language contexts: A larger self-bias was found in the native than in the foreign language. An additional control experiment demonstrated that the same materials administered to a group of native English speakers yielded robust self-bias effects that were comparable in magnitude to the ones obtained with the Spanish speakers when tested in their native language (but not in their foreign language). We suggest that the emotional distance evoked by the foreign-language contexts caused these differential effects across language contexts. These results demonstrate that the foreign-language effects are pervasive enough to affect automatic stages of emotional processing. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Optical Automatic Car Identification (OACI) Field Test Program
DOT National Transportation Integrated Search
1976-05-01
The results of the Optical Automatic Car Identification (OACI) tests at Chicago conducted from August 16 to September 4, 1975 are presented. The main purpose of this test was to determine the suitability of optics as a principle of operation for an a...
Perception of English palatal codas by Korean speakers of English
NASA Astrophysics Data System (ADS)
Yeon, Sang-Hee
2003-04-01
This study examined the perception of English palatal codas by Korean speakers of English to determine whether perception problems are the source of production problems. First, it looked at a possible first-language effect on the perception of English palatal codas. Second, a possible perceptual source of vowel epenthesis after English palatal codas was investigated. In addition, individual factors such as length of residence, TOEFL score, gender, and academic status were compared to determine whether they affected perceptual accuracy. Eleven adult Korean speakers of English and three native speakers of English participated in the study. Three perception tests involving identification of minimally different English pseudowords or real words were carried out. The results showed, first, that the Korean speakers perceived the English codas significantly worse than the native speakers did. Second, the study supported the idea that the Korean speakers perceived an extra /i/ after final affricates due to final release. Finally, none of the individual factors explained the variation in perceptual accuracy; in particular, TOEFL scores and perception test scores showed no statistically significant association.
Pruitt, John S; Jenkins, James J; Strange, Winifred
2006-03-01
Perception of second language speech sounds is influenced by one's first language. For example, speakers of American English have difficulty perceiving dental versus retroflex stop consonants in Hindi, although English has both dental and retroflex allophones of alveolar stops. Japanese, unlike English, has a contrast similar to Hindi, specifically, the Japanese /d/ versus the flapped /r/, which is sometimes produced as a retroflex. This study compared American and Japanese speakers' identification of the Hindi contrast in CV syllable contexts where C varied in voicing and aspiration. The study then evaluated the participants' improvement in identifying the distinction after training with a computer-interactive program. Training sessions progressively increased in difficulty by decreasing the extent of vowel truncation in stimuli and by adding new speakers. Although all participants improved significantly, Japanese participants were more accurate than Americans in distinguishing the contrast on pretest, during training, and on posttest. Transfer was observed to three new consonantal contexts, a new vowel context, and a new speaker's productions. Some abstract aspect of the contrast was apparently learned during training. It is suggested that allophonic experience with dental and retroflex stops may be detrimental to perception of the new contrast.
Code of Federal Regulations, 2011 CFR
2011-07-01
..., or transcriptions of electronic recordings including the identification of speakers, shall to the... cost of transcription. (c) The agency shall maintain a complete verbatim copy of the transcript, a...
Code of Federal Regulations, 2010 CFR
2010-07-01
..., or transcriptions of electronic recordings including the identification of speakers, shall to the... cost of transcription. (c) The agency shall maintain a complete verbatim copy of the transcript, a...
DOT National Transportation Integrated Search
2002-11-01
This paper develops an algorithm for optimally locating surveillance technologies with an emphasis on Automatic Vehicle Identification tag readers by maximizing the benefit that would accrue from measuring travel times on a transportation network. Th...
47 CFR 25.281 - Automatic Transmitter Identification System (ATIS).
Code of Federal Regulations, 2010 CFR
2010-10-01
... identified through the use of an automatic transmitter identification system as specified below. (a.... (3) The ATIS signal as a minimum shall consist of the following: (i) The FCC assigned earth station... (ATIS). 25.281 Section 25.281 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) COMMON...
47 CFR 25.281 - Automatic Transmitter Identification System (ATIS).
Code of Federal Regulations, 2013 CFR
2013-10-01
... identified through the use of an automatic transmitter identification system as specified below. (a.... (3) The ATIS signal as a minimum shall consist of the following: (i) The FCC assigned earth station... (ATIS). 25.281 Section 25.281 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) COMMON...
47 CFR 25.281 - Automatic Transmitter Identification System (ATIS).
Code of Federal Regulations, 2012 CFR
2012-10-01
... identified through the use of an automatic transmitter identification system as specified below. (a.... (3) The ATIS signal as a minimum shall consist of the following: (i) The FCC assigned earth station... (ATIS). 25.281 Section 25.281 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) COMMON...
47 CFR 25.281 - Automatic Transmitter Identification System (ATIS).
Code of Federal Regulations, 2011 CFR
2011-10-01
... identified through the use of an automatic transmitter identification system as specified below. (a.... (3) The ATIS signal as a minimum shall consist of the following: (i) The FCC assigned earth station... (ATIS). 25.281 Section 25.281 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) COMMON...
Facilitator control as automatic behavior: A verbal behavior analysis
Hall, Genae A.
1993-01-01
Several studies of facilitated communication have demonstrated that the facilitators were controlling and directing the typing, although they appeared to be unaware of doing so. Such results shift the focus of analysis to the facilitator's behavior and raise questions regarding the controlling variables for that behavior. This paper analyzes facilitator behavior as an instance of automatic verbal behavior, from the perspective of Skinner's (1957) book Verbal Behavior. Verbal behavior is automatic when the speaker or writer is not stimulated by the behavior at the time of emission, the behavior is not edited, the products of behavior differ from what the person would produce normally, and the behavior is attributed to an outside source. All of these characteristics appear to be present in facilitator behavior. Other variables seem to account for the thematic content of the typed messages. These variables also are discussed. PMID:22477083
NASA Astrophysics Data System (ADS)
Fernández Pozo, Rubén; Blanco Murillo, Jose Luis; Hernández Gómez, Luis; López Gonzalo, Eduardo; Alcázar Ramírez, José; Toledano, Doroteo T.
2009-12-01
This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and nonnasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry.
1988-12-15
[OCR-damaged report documentation page; only reference fragments are recoverable:] Experimental Aging Research, 7, 253-268. McLeod, B., & McLaughlin, B. (1986). Restructuring or automatization? Reading in a second language.
Automated Intelligibility Assessment of Pathological Speech Using Phonological Features
NASA Astrophysics Data System (ADS)
Middag, Catherine; Martens, Jean-Pierre; Van Nuffelen, Gwen; De Bodt, Marc
2009-12-01
It is commonly acknowledged that word or phoneme intelligibility is an important criterion in the assessment of the communication efficiency of a pathological speaker. Considerable effort has therefore been put into the design of perceptual intelligibility rating tests. These tests usually have the drawback that they employ unnatural speech material (e.g., nonsense words) and that they cannot fully exclude errors due to listener bias. Therefore, there is a growing interest in the application of objective automatic speech recognition technology to automate the intelligibility assessment. Current research is headed towards the design of automated methods which can be shown to produce ratings that correspond well with those emerging from a well-designed and well-performed perceptual test. In this paper, a novel methodology that is built on previous work (Middag et al., 2008) is presented. It utilizes phonological features, automatic speech alignment based on acoustic models that were trained on normal speech, context-dependent speaker feature extraction, and intelligibility prediction based on a small model that can be trained on pathological speech samples. The experimental evaluation of the new system reveals that the root mean squared error of the discrepancies between perceived and computed intelligibilities can be as low as 8 on a scale of 0 to 100.
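The evaluation criterion reported above, root mean squared error between perceived and computed intelligibility on a 0-100 scale, is straightforward to compute. A minimal sketch, using made-up scores:

```python
import math

def rmse(perceived, predicted):
    """Root mean squared error between perceptual and automatic
    intelligibility ratings (both on a 0-100 scale)."""
    assert len(perceived) == len(predicted) and perceived
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(perceived, predicted))
                     / len(perceived))

# Illustrative numbers only; a system matching the paper's reported result
# would score around 8 on this measure.
print(round(rmse([80, 65, 90, 40], [74, 70, 88, 47]), 2))
```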
Automatically identifying health outcome information in MEDLINE records.
Demner-Fushman, Dina; Few, Barbara; Hauser, Susan E; Thoma, George
2006-01-01
Understanding the effect of a given intervention on the patient's health outcome is one of the key elements in providing optimal patient care. This study presents a methodology for automatic identification of outcomes-related information in medical text and evaluates its potential in satisfying clinical information needs related to health care outcomes. An annotation scheme based on an evidence-based medicine model for critical appraisal of evidence was developed and used to annotate 633 MEDLINE citations. Textual, structural, and meta-information features essential to outcome identification were learned from the created collection and used to develop an automatic system. Accuracy of automatic outcome identification was assessed in an intrinsic evaluation and in an extrinsic evaluation, in which ranking of MEDLINE search results obtained using PubMed Clinical Queries relied on identified outcome statements. The accuracy and positive predictive value of outcome identification were calculated. Effectiveness of the outcome-based ranking was measured using mean average precision and precision at rank 10. Automatic outcome identification achieved 88% to 93% accuracy. The positive predictive value of individual sentences identified as outcomes ranged from 30% to 37%. Outcome-based ranking improved retrieval accuracy, tripling mean average precision and achieving 389% improvement in precision at rank 10. Preliminary results in outcome-based document ranking show potential validity of the evidence-based medicine-model approach in timely delivery of information critical to clinical decision support at the point of service.
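The ranking metrics used in this evaluation, mean average precision and precision at rank 10, can be sketched as follows. This is one common formulation (average precision normalized by the number of relevant documents retrieved); the binary relevance lists are illustrative:

```python
def average_precision(rels):
    """Average precision for one ranked result list; `rels` holds binary
    relevance judgments in rank order."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(runs):
    """Mean of the per-query average precisions."""
    return sum(average_precision(r) for r in runs) / len(runs)

def precision_at_k(rels, k=10):
    """Fraction of relevant documents among the top k ranks."""
    return sum(rels[:k]) / k
```

Under these definitions, "tripling mean average precision" means the re-ranked runs placed relevant outcome statements much earlier in the result lists than the baseline ordering did.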
Report of an international symposium on drugs and driving
DOT National Transportation Integrated Search
1975-06-30
This report presents the proceedings of a Symposium on Drugs (other than alcohol) and Driving. Speaker's papers and work session summaries are included. Major topics include: Overview of Problem, Risk Identification, Drug Measurement in Biological Ma...
Individual differences in selective attention predict speech identification at a cocktail party.
Oberfeld, Daniel; Klöckner-Nowotny, Felicitas
2016-08-31
Listeners with normal hearing show considerable individual differences in speech understanding when competing speakers are present, as in a crowded restaurant. Here, we show that one source of this variance is individual differences in the ability to focus selective attention on a target stimulus in the presence of distractors. In 50 young normal-hearing listeners, performance in tasks measuring auditory and visual selective attention was associated with sentence identification in the presence of spatially separated competing speakers. Together, the measures of selective attention explained a similar proportion of variance as the binaural sensitivity for the acoustic temporal fine structure. Working memory span, age, and audiometric thresholds showed no significant association with speech understanding. These results suggest that a reduced ability to focus attention on a target is one reason why some listeners with normal hearing sensitivity have difficulty communicating in situations with background noise.
Motor signatures of emotional reactivity in frontotemporal dementia.
Marshall, Charles R; Hardy, Chris J D; Russell, Lucy L; Clark, Camilla N; Bond, Rebecca L; Dick, Katrina M; Brotherhood, Emilie V; Mummery, Cath J; Schott, Jonathan M; Rohrer, Jonathan D; Kilner, James M; Warren, Jason D
2018-01-18
Automatic motor mimicry is essential to the normal processing of perceived emotion, and disrupted automatic imitation might underpin socio-emotional deficits in neurodegenerative diseases, particularly the frontotemporal dementias. However, the pathophysiology of emotional reactivity in these diseases has not been elucidated. We studied facial electromyographic responses during emotion identification on viewing videos of dynamic facial expressions in 37 patients representing canonical frontotemporal dementia syndromes versus 21 healthy older individuals. Neuroanatomical associations of emotional expression identification accuracy and facial muscle reactivity were assessed using voxel-based morphometry. Controls showed characteristic profiles of automatic imitation, and this response predicted correct emotion identification. Automatic imitation was reduced in the behavioural and right temporal variant groups, while the normal coupling between imitation and correct identification was lost in the right temporal and semantic variant groups. Grey matter correlates of emotion identification and imitation were delineated within a distributed network including primary visual and motor, prefrontal, insular, anterior temporal and temporo-occipital junctional areas, with common involvement of supplementary motor cortex across syndromes. Impaired emotional mimesis may be a core mechanism of disordered emotional signal understanding and reactivity in frontotemporal dementia, with implications for the development of novel physiological biomarkers of socio-emotional dysfunction in these diseases.
Greek perception and production of an English vowel contrast: A preliminary study
NASA Astrophysics Data System (ADS)
Podlipský, Václav J.
2005-04-01
This study focused on language-independent principles functioning in acquisition of second language (L2) contrasts. Specifically, it tested Bohn's Desensitization Hypothesis [in Speech perception and linguistic experience: Issues in Cross Language Research, edited by W. Strange (York Press, Baltimore, 1995)] which predicted that Greek speakers of English as an L2 would base their perceptual identification of English /i/ and /I/ on durational differences. Synthetic vowels differing orthogonally in duration and spectrum between the /i/ and /I/ endpoints served as stimuli for a forced-choice identification test. To assess L2 proficiency and to evaluate the possibility of cross-language category assimilation, productions of English /i/, /I/, and /ɛ/ and of Greek /i/ and /e/ were elicited and analyzed acoustically. The L2 utterances were also rated for the degree of foreign accent. Two native speakers of Modern Greek with low and 2 with intermediate experience in English participated. Six native English (NE) listeners and 6 NE speakers tested in an earlier study constituted the control groups. Heterogeneous perceptual behavior was observed for the L2 subjects. It is concluded that until acquisition in completely naturalistic settings is tested, possible interference of formally induced meta-linguistic differentiation between a "short" and a "long" vowel cannot be eliminated.
Robust speaker's location detection in a vehicle environment using GMM models.
Hu, Jwu-Sheng; Cheng, Chieh-Cheng; Liu, Wei-Han
2006-04-01
Human-computer interaction (HCI) using speech communication is becoming increasingly important, especially in driving, where safety is the primary concern. Knowing the speaker's location (i.e., speaker localization) not only improves the enhancement of a corrupted signal, but also assists speaker identification. Since conventional speech localization algorithms suffer from the uncertainties of environmental complexity and noise, as well as from the microphone mismatch problem, they are frequently not robust in practice. Without high reliability, speech-based HCI will not gain acceptance. This work presents a novel speaker-location detection method and demonstrates high accuracy within a vehicle cabin using a single linear microphone array. The proposed approach utilizes Gaussian mixture models (GMMs) to model the distributions of the phase differences among the microphones caused by the complex characteristics of room acoustics and microphone mismatch. The model can be applied in both near-field and far-field situations in a noisy environment. Each Gaussian component of a GMM represents a general location-dependent but content- and speaker-independent phase-difference distribution. Moreover, the scheme performs well not only in non-line-of-sight cases, but also when the speakers are aligned toward the microphone array but at different distances from it. This strong performance can be achieved by exploiting the fact that the phase-difference distributions at different locations are distinguishable in the environment of a car. The experimental results also show that the proposed method outperforms the conventional multiple signal classification (MUSIC) technique at various SNRs.
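The core classification step described above, one model per candidate location scored on observed inter-microphone phase differences, can be illustrated with a simplified sketch. The synthetic data, the single diagonal Gaussian standing in for the paper's GMMs, and the two-seat setup are all assumptions made for illustration:

```python
import math
import random

random.seed(0)

# Hypothetical training data: each sample is a vector of inter-microphone
# phase differences; the per-seat distributions here are invented.
DIM = 8
train = {
    "driver":    [[random.gauss(0.5, 0.2) for _ in range(DIM)] for _ in range(500)],
    "passenger": [[random.gauss(-0.5, 0.2) for _ in range(DIM)] for _ in range(500)],
}

def fit_gaussian(samples):
    """Diagonal Gaussian, a 1-component stand-in for the paper's GMMs."""
    n = len(samples)
    mean = [sum(col) / n for col in zip(*samples)]
    var = [sum((x - m) ** 2 for x in col) / n + 1e-6
           for col, m in zip(zip(*samples), mean)]
    return mean, var

models = {loc: fit_gaussian(x) for loc, x in train.items()}

def log_likelihood(vec, mean, var):
    return -0.5 * sum(math.log(2 * math.pi * v) + (x - m) ** 2 / v
                      for x, m, v in zip(vec, mean, var))

def locate(phase_vec):
    """Pick the location whose model best explains the observed phase differences."""
    return max(models, key=lambda loc: log_likelihood(phase_vec, *models[loc]))

print(locate([0.5] * DIM))  # a driver-like observation
```

A full GMM (several weighted components per location) would capture the multimodal phase-difference patterns that reflections and microphone mismatch produce in a real cabin.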
21 CFR 892.1900 - Automatic radiographic film processor.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 21 Food and Drugs 8 2011-04-01 2011-04-01 false Automatic radiographic film processor. 892.1900... (CONTINUED) MEDICAL DEVICES RADIOLOGY DEVICES Diagnostic Devices § 892.1900 Automatic radiographic film processor. (a) Identification. An automatic radiographic film processor is a device intended to be used to...
21 CFR 892.1900 - Automatic radiographic film processor.
Code of Federal Regulations, 2013 CFR
2013-04-01
... 21 Food and Drugs 8 2013-04-01 2013-04-01 false Automatic radiographic film processor. 892.1900... (CONTINUED) MEDICAL DEVICES RADIOLOGY DEVICES Diagnostic Devices § 892.1900 Automatic radiographic film processor. (a) Identification. An automatic radiographic film processor is a device intended to be used to...
21 CFR 892.1900 - Automatic radiographic film processor.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Automatic radiographic film processor. 892.1900... (CONTINUED) MEDICAL DEVICES RADIOLOGY DEVICES Diagnostic Devices § 892.1900 Automatic radiographic film processor. (a) Identification. An automatic radiographic film processor is a device intended to be used to...
21 CFR 892.1900 - Automatic radiographic film processor.
Code of Federal Regulations, 2012 CFR
2012-04-01
... 21 Food and Drugs 8 2012-04-01 2012-04-01 false Automatic radiographic film processor. 892.1900... (CONTINUED) MEDICAL DEVICES RADIOLOGY DEVICES Diagnostic Devices § 892.1900 Automatic radiographic film processor. (a) Identification. An automatic radiographic film processor is a device intended to be used to...
21 CFR 892.1900 - Automatic radiographic film processor.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Automatic radiographic film processor. 892.1900... (CONTINUED) MEDICAL DEVICES RADIOLOGY DEVICES Diagnostic Devices § 892.1900 Automatic radiographic film processor. (a) Identification. An automatic radiographic film processor is a device intended to be used to...
Rep. Connolly, Gerald E. [D-VA-11
2013-02-13
House - 02/13/2013 Referred to the Committee on House Administration, and in addition to the Committee on Oversight and Government Reform, for a period to be subsequently determined by the Speaker, in each case for consideration of such provisions as fall within the jurisdiction of...
47 CFR 80.275 - Technical Requirements for Class A Automatic Identification System (AIS) equipment.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 47 Telecommunication 5 2010-10-01 2010-10-01 false Technical Requirements for Class A Automatic Identification System (AIS) equipment. 80.275 Section 80.275 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) SAFETY AND SPECIAL RADIO SERVICES STATIONS IN THE MARITIME SERVICES Equipment Authorization for Compulsory Ships § 80.275...
RFID: A Revolution in Automatic Data Recognition
ERIC Educational Resources Information Center
Deal, Walter F., III
2004-01-01
Radio frequency identification, or RFID, is a generic term for technologies that use radio waves to automatically identify people or objects. There are several methods of identification, but the most common is to store a serial number that identifies a person or object, and perhaps other information, on a microchip that is attached to an antenna…
33 CFR 164.43 - Automatic Identification System Shipborne Equipment-Prince William Sound.
Code of Federal Regulations, 2010 CFR
2010-07-01
... (AISSE) system consisting of a: (1) Twelve-channel all-in-view Differential Global Positioning System (d... to indicate to shipboard personnel that the U.S. Coast Guard dGPS system cannot provide the required... 33 Navigation and Navigable Waters 2 2010-07-01 2010-07-01 false Automatic Identification System...
Pakulak, Eric; Neville, Helen J.
2010-01-01
While anecdotally there appear to be differences in the way native speakers use and comprehend their native language, most empirical investigations of language processing study university students and none have studied differences in language proficiency which may be independent of resource limitations such as working memory span. We examined differences in language proficiency in adult monolingual native speakers of English using an event-related potential (ERP) paradigm. ERPs were recorded to insertion phrase structure violations in naturally spoken English sentences. Participants recruited from a wide spectrum of society were given standardized measures of English language proficiency, and two complementary ERP analyses were performed. In between-groups analyses, participants were divided, based on standardized proficiency scores, into Lower Proficiency (LP) and Higher Proficiency (HP) groups. Compared to LP participants, HP participants showed an early anterior negativity that was more focal, both spatially and temporally, and a larger and more widely distributed positivity (P600) to violations. In correlational analyses, we utilized a wide spectrum of proficiency scores to examine the degree to which individual proficiency scores correlated with individual neural responses to syntactic violations in regions and time windows identified in the between-group analyses. This approach also employed partial correlation analyses to control for possible confounding variables. These analyses provided evidence for the effects of proficiency that converged with the between-groups analyses. These results suggest that adult monolingual native speakers of English who vary in language proficiency differ in the recruitment of syntactic processes that are hypothesized to be at least in part automatic as well as of those thought to be more controlled. 
These results also suggest that in order to fully characterize neural organization for language in native speakers it is necessary to include participants of varying proficiency. PMID:19925188
A cross-language study of perception of lexical stress in English.
Yu, Vickie Y; Andruski, Jean E
2010-08-01
This study investigates whether language background affects the perception of lexical stress in English. Thirty native English speakers and 30 native Chinese learners of English participated in a stressed-syllable identification task and a discrimination task involving three types of stimuli (real words/pseudowords/hums). The results show that both language groups were able to identify and discriminate stress patterns. Lexical and segmental information affected the English and Chinese speakers to varying degrees. English and Chinese speakers showed different response patterns to trochaic vs. iambic stress across the three types of stimuli. An acoustic analysis revealed that the two language groups used different acoustic cues to process lexical stress. The findings suggest that the different degrees of lexical and segmental effects can be explained by language background, which in turn supports the hypothesis that language background affects the perception of lexical stress in English.
Brain Plasticity in Speech Training in Native English Speakers Learning Mandarin Tones
NASA Astrophysics Data System (ADS)
Heinzen, Christina Carolyn
The current study employed behavioral and event-related potential (ERP) measures to investigate brain plasticity associated with second-language (L2) phonetic learning based on an adaptive computer training program. The program utilized the acoustic characteristics of Infant-Directed Speech (IDS) to train monolingual American English-speaking listeners to perceive Mandarin lexical tones. Behavioral identification and discrimination tasks were conducted using naturally recorded speech, carefully controlled synthetic speech, and non-speech control stimuli. The ERP experiments were conducted with selected synthetic speech stimuli in a passive listening oddball paradigm. Identical pre- and post- tests were administered on nine adult listeners, who completed two-to-three hours of perceptual training. The perceptual training sessions used pair-wise lexical tone identification, and progressed through seven levels of difficulty for each tone pair. The levels of difficulty included progression in speaker variability from one to four speakers and progression through four levels of acoustic exaggeration of duration, pitch range, and pitch contour. Behavioral results for the natural speech stimuli revealed significant training-induced improvement in identification of Tones 1, 3, and 4. Improvements in identification of Tone 4 generalized to novel stimuli as well. Additionally, comparison between discrimination of across-category and within-category stimulus pairs taken from a synthetic continuum revealed a training-induced shift toward more native-like categorical perception of the Mandarin lexical tones. Analysis of the Mismatch Negativity (MMN) responses in the ERP data revealed increased amplitude and decreased latency for pre-attentive processing of across-category discrimination as a result of training. 
There were also laterality changes in the MMN responses to the non-speech control stimuli, which could reflect reallocation of brain resources in processing pitch patterns for the across-category lexical tone contrast. Overall, the results support the use of IDS characteristics in training non-native speech contrasts and provide impetus for further research.
Early-stage chunking of finger tapping sequences by persons who stutter and fluent speakers.
Smits-Bandstra, Sarah; De Nil, Luc F
2013-01-01
This research note explored the hypothesis that chunking differences underlie the slow finger-tap sequencing performance reported in the literature for persons who stutter (PWS) relative to fluent speakers (PNS). Early-stage chunking was defined as an immediate and spontaneous tendency to organize a long sequence into chunks of fluent motor performance separated by pauses for motor planning. A previously published study in which 12 PWS and 12 matched PNS practised a 10-item finger tapping sequence 30 times was examined. Both groups significantly decreased the duration of between-chunk intervals (BCIs) and within-chunk intervals (WCIs) over practice. PNS had significantly shorter WCIs relative to PWS, but minimal differences between groups were found for the number or duration of BCIs. Results imply that sequencing differences found between PNS and PWS may be due to differences in automatizing movements within chunks or retrieving chunks from memory rather than chunking per se.
DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST speech disc 1-1.1
NASA Astrophysics Data System (ADS)
Garofolo, J. S.; Lamel, L. F.; Fisher, W. M.; Fiscus, J. G.; Pallett, D. S.
1993-02-01
The Texas Instruments/Massachusetts Institute of Technology (TIMIT) corpus of read speech has been designed to provide speech data for the acquisition of acoustic-phonetic knowledge and for the development and evaluation of automatic speech recognition systems. TIMIT contains speech from 630 speakers representing 8 major dialect divisions of American English, each speaking 10 phonetically-rich sentences. The TIMIT corpus includes time-aligned orthographic, phonetic, and word transcriptions, as well as speech waveform data for each spoken sentence. The release of TIMIT contains several improvements over the Prototype CD-ROM released in December, 1988: (1) full 630-speaker corpus, (2) checked and corrected transcriptions, (3) word-alignment transcriptions, (4) NIST SPHERE-headered waveform files and header manipulation software, (5) phonemic dictionary, (6) new test and training subsets balanced for dialectal and phonetic coverage, and (7) more extensive documentation.
Sensory Intelligence for Extraction of an Abstract Auditory Rule: A Cross-Linguistic Study.
Guo, Xiao-Tao; Wang, Xiao-Dong; Liang, Xiu-Yuan; Wang, Ming; Chen, Lin
2018-02-21
In a complex linguistic environment, while speech sounds can greatly vary, some shared features are often invariant. These invariant features constitute so-called abstract auditory rules. Our previous study has shown that with auditory sensory intelligence, the human brain can automatically extract the abstract auditory rules in the speech sound stream, presumably serving as the neural basis for speech comprehension. However, whether the sensory intelligence for extraction of abstract auditory rules in speech is inherent or experience-dependent remains unclear. To address this issue, we constructed a complex speech sound stream using auditory materials in Mandarin Chinese, in which syllables had a flat lexical tone but differed in other acoustic features to form an abstract auditory rule. This rule was occasionally and randomly violated by the syllables with the rising, dipping or falling tone. We found that both Chinese and foreign speakers detected the violations of the abstract auditory rule in the speech sound stream at a pre-attentive stage, as revealed by the whole-head recordings of mismatch negativity (MMN) in a passive paradigm. However, MMNs peaked earlier in Chinese speakers than in foreign speakers. Furthermore, Chinese speakers showed different MMN peak latencies for the three deviant types, which paralleled recognition points. These findings indicate that the sensory intelligence for extraction of abstract auditory rules in speech sounds is innate but shaped by language experience. Copyright © 2018 IBRO. Published by Elsevier Ltd. All rights reserved.
Information Processing Techniques Program. Volume 1. Packet Speech Systems Technology
1980-03-31
DMA transfer is enabled from the 2652 serial I/O device to the buffer memory. This enables automatic reception of an incoming packet without CPU...conference speaker. Producing multiple copies at the source wastes network bandwidth and is likely to cause local overload conditions for a large...wasted. If the setup fails because ST can find no route with sufficient capacity, the phone will have rung and possibly been answered but the call will
The non-trusty clown attack on model-based speaker recognition systems
NASA Astrophysics Data System (ADS)
Farrokh Baroughi, Alireza; Craver, Scott
2015-03-01
Biometric detectors for speaker identification commonly employ a statistical model for a subject's voice, such as a Gaussian Mixture Model, that combines multiple means to improve detector performance. This allows a malicious insider to amend or append a component of a subject's statistical model so that a detector behaves normally except under a carefully engineered circumstance. This allows an attacker to force a misclassification of his or her voice only when desired, by smuggling data into a database far in advance of an attack. Note that the attack is possible if the attacker has access to the database, even for a limited time, to modify the victim's model. We exhibit such an attack on a speaker identification system, in which an attacker forces a misclassification by speaking in an unusual voice and replacing the least-weighted component of the victim's model with the most-weighted component of the attacker's unusual-voice model. The attacker makes his or her voice unusual during the attack because the attacker's normal voice model may already be in the database; by attacking with an unusual voice, the attacker retains the option to be recognized as himself or herself when talking normally, or as the victim when talking in the unusual manner. By attaching an appropriately weighted vector to a victim's model, we can impersonate all users in our simulations while avoiding unwanted false rejections.
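The component-swap at the heart of this attack can be illustrated with a toy model. The sketch below is hypothetical: plain (weight, mean) pairs stand in for full GMM components (covariances omitted), and all names and values are illustrative, not the authors' code.

```python
# Hypothetical sketch of the component-swap attack described above.
# A speaker model is simplified to a list of (weight, mean_vector) pairs.

def swap_components(victim_model, attacker_model):
    """Replace the victim's least-weighted component with the attacker's
    most-weighted component, then renormalize the mixture weights."""
    victim = sorted(victim_model, key=lambda c: c[0])   # ascending by weight
    attacker_best = max(attacker_model, key=lambda c: c[0])
    # Drop the victim's weakest component, insert the attacker's strongest.
    tampered = victim[1:] + [attacker_best]
    total = sum(w for w, _ in tampered)
    return [(w / total, m) for w, m in tampered]

victim = [(0.5, [1.0, 2.0]), (0.3, [0.0, 0.0]), (0.2, [5.0, 5.0])]
attacker = [(0.7, [9.0, 9.0]), (0.3, [4.0, 4.0])]
tampered = swap_components(victim, attacker)
```

After the swap, the tampered model still has the original number of components and properly normalized weights, but now contains a component centered on the attacker's unusual voice.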
Application of the wavelet transform for speech processing
NASA Technical Reports Server (NTRS)
Maes, Stephane
1994-01-01
Speaker identification and word spotting will shortly play a key role in space applications. An approach based on the wavelet transform is presented that, in the context of the 'modulation model,' enables extraction of speech features which are used as input for the classification process.
Individual differences in selective attention predict speech identification at a cocktail party
Oberfeld, Daniel; Klöckner-Nowotny, Felicitas
2016-01-01
Listeners with normal hearing show considerable individual differences in speech understanding when competing speakers are present, as in a crowded restaurant. Here, we show that one source of this variance is individual differences in the ability to focus selective attention on a target stimulus in the presence of distractors. In 50 young normal-hearing listeners, the performance in tasks measuring auditory and visual selective attention was associated with sentence identification in the presence of spatially separated competing speakers. Together, the measures of selective attention explained a similar proportion of variance as the binaural sensitivity for the acoustic temporal fine structure. Working memory span, age, and audiometric thresholds showed no significant association with speech understanding. These results suggest that a reduced ability to focus attention on a target is one reason why some listeners with normal hearing sensitivity have difficulty communicating in situations with background noise. DOI: http://dx.doi.org/10.7554/eLife.16747.001 PMID:27580272
Flege, J E; Hillenbrand, J
1986-02-01
This study examined the effect of linguistic experience on perception of the English /s/-/z/ contrast in word-final position. The durations of the periodic ("vowel") and aperiodic ("fricative") portions of stimuli, ranging from peas to peace, were varied in a 5 X 5 factorial design. Forced-choice identification judgments were elicited from two groups of native speakers of American English differing in dialect, and from two groups each of native speakers of French, Swedish, and Finnish differing in English-language experience. The results suggested that the non-native subjects used cues established for the perception of phonetic contrasts in their native language to identify fricatives as /s/ or /z/. Lengthening vowel duration increased /z/ judgments in all eight subject groups, although the effect was smaller for native speakers of French than for native speakers of the other languages. Shortening fricative duration, on the other hand, significantly decreased /z/ judgments only by the English and French subjects. It did not influence voicing judgments by the Swedish and Finnish subjects, even those who had lived for a year or more in an English-speaking environment. These findings raise the question of whether adults who learn a foreign language can acquire the ability to integrate multiple acoustic cues to a phonetic contrast which does not exist in their native language.
Tao, Qian; Milles, Julien; Zeppenfeld, Katja; Lamb, Hildo J; Bax, Jeroen J; Reiber, Johan H C; van der Geest, Rob J
2010-08-01
Accurate assessment of the size and distribution of a myocardial infarction (MI) from late gadolinium enhancement (LGE) MRI is of significant prognostic value for postinfarction patients. In this paper, an automatic MI identification method combining both intensity and spatial information is presented in a clear framework of (i) initialization, (ii) false acceptance removal, and (iii) false rejection removal. The method was validated on LGE MR images of 20 chronic postinfarction patients, using manually traced MI contours from two independent observers as reference. Good agreement was observed between automatic and manual MI identification. Validation results showed that the average Dice indices, which describe the percentage of overlap between two regions, were 0.83 +/- 0.07 and 0.79 +/- 0.08 between the automatic identification and the manual tracing from observer 1 and observer 2, and the errors in estimated infarct percentage were 0.0 +/- 1.9% and 3.8 +/- 4.7% compared with observer 1 and observer 2. The difference between the automatic method and manual tracing is in the order of interobserver variation. In conclusion, the developed automatic method is accurate and robust in MI delineation, providing an objective tool for quantitative assessment of MI in LGE MR imaging.
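The Dice index used for validation above has a simple closed form, 2|A ∩ B| / (|A| + |B|). A minimal illustrative implementation over binary masks represented as sets of pixel coordinates (not the paper's actual validation code):

```python
def dice_index(a, b):
    """Dice overlap between two binary masks given as sets of pixel
    coordinates: 2 * |A intersect B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty masks agree perfectly by convention
    return 2 * len(a & b) / (len(a) + len(b))

# Toy automatic vs. manual infarct masks (4 pixels each, 3 shared):
auto = {(0, 0), (0, 1), (1, 0), (1, 1)}
manual = {(0, 1), (1, 0), (1, 1), (2, 1)}
print(dice_index(auto, manual))  # 0.75
```

A Dice index of 0.83, as reported against observer 1, thus means the automatic and manual regions share roughly 83% overlap relative to their combined size.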
Yu, Chengzhu; Hansen, John H L
2017-03-01
Human physiology has evolved to accommodate environmental conditions, including temperature, pressure, and air chemistry unique to Earth. However, the environment in space varies significantly compared to that on Earth and, therefore, variability is expected in astronauts' speech production mechanism. In this study, the variations of astronaut voice characteristics during the NASA Apollo 11 mission are analyzed. Specifically, acoustical features such as fundamental frequency and phoneme formant structure that are closely related to the speech production system are studied. For a further understanding of astronauts' vocal tract spectrum variation in space, a maximum likelihood frequency warping based analysis is proposed to detect the vocal tract spectrum displacement during space conditions. The results from fundamental frequency, formant structure, as well as vocal spectrum displacement indicate that astronauts change their speech production mechanism when in space. Moreover, the experimental results for astronaut voice identification tasks indicate that current speaker recognition solutions are highly vulnerable to astronaut voice production variations in space conditions. Future recommendations from this study suggest that successful applications of speaker recognition during extended space missions require robust speaker modeling techniques that could effectively adapt to voice production variation caused by diverse space conditions.
NASA Astrophysics Data System (ADS)
Yao, Guang-tao; Zhang, Xiao-hui; Ge, Wei-long
2012-01-01
Underwater laser imaging detection is an effective method of detecting short-distance targets underwater and an important complement to sonar detection. With the development of underwater laser imaging technology and underwater vehicle technology, underwater automatic target identification has received increasing attention and remains a difficult research problem in underwater optical imaging information processing. Today, underwater automatic target identification based on optical imaging is usually realized in software running on digital circuits. The algorithm realization and control of this method are very flexible. However, optical imaging yields 2D or even 3D images, so the amount of information to be processed is large; purely digital electronic hardware therefore requires long identification times and can hardly meet the demands of real-time identification. Parallel computer processing can improve identification speed, but it increases complexity, size, and power consumption. This paper attempts to apply optical correlation identification technology to realize underwater automatic target identification. Optical correlation identification exploits the Fourier-transform property of a Fourier lens, which can accomplish the Fourier transform of image information on the nanosecond scale; optical spatial interconnection computation is parallel, fast, high-capacity, and high-resolution. Combined with the computational and control flexibility of digital circuits, this yields a hybrid optoelectronic identification mode. We derive the theoretical formulation of correlation identification, analyze the principle of optical correlation identification, and write a MATLAB simulation program.
We identify targets in single frames obtained by underwater range-gated laser imaging; by identifying and locating targets at different positions, we effectively improve the speed and localization efficiency of target identification, and provide a preliminary validation of the method's feasibility.
1980-01-01
nor has it ever been. My only interest is getting the best deal for the government so that all of us have a smaller tax bill. That’s what the Hoover... US ARMY CERCOM Added following page 50 LIST OF SPEAKERS ... LIST OF STEERING COMMITTEE...application of technology to defense needs. We need ETE to help us maintain an adequate national defense posture. The annual DoD expenditures for automatic
Communication and Miscommunication.
1985-10-01
piece" to refer to the NOZZLE. When A adds the relative clause "that has four gizmos on it," J is forced to drop the NOZZLE as the referent and to select...a red plastic piece [J starts for NOZZLE] 14. that has four gizmos on it. [J switches to SLIDEVALVE] J: 15. Yes. A: 16. Okay. Put the ungizmoed end...not always be specified explicitly since some postconditions automatically come with an action. If the speaker said the utterance "fit the red gizmo
[Wearable Automatic External Defibrillators].
Luo, Huajie; Luo, Zhangyuan; Jin, Xun; Zhang, Leilei; Wang, Changjin; Zhang, Wenzan; Tu, Quan
2015-11-01
Defibrillation is the most effective method of treating ventricular fibrillation (VF). This paper introduces a wearable automatic external defibrillator based on an embedded system, which includes ECG measurement, bioelectrical impedance measurement, and a discharge defibrillation module; it can automatically identify the VF signal and deliver a biphasic exponential waveform defibrillation discharge. As verified by animal tests, the device can acquire the ECG and perform automatic identification. After identifying the ventricular fibrillation signal, it automatically delivers a defibrillation shock to abort ventricular fibrillation and achieve electrical cardioversion.
Shahamiri, Seyed Reza; Salim, Siti Salwah Binti
2014-09-01
Automatic speech recognition (ASR) can be very helpful for speakers who suffer from dysarthria, a neurological disability that damages the control of motor speech articulators. Although a few attempts have been made to apply ASR technologies to sufferers of dysarthria, previous studies show that such ASR systems have not attained an adequate level of performance. In this study, a dysarthric multi-networks speech recognizer (DM-NSR) model is provided using a realization of the multi-views multi-learners approach called multi-nets artificial neural networks, which tolerates the variability of dysarthric speech. In particular, the DM-NSR model employs several ANNs (as learners) to approximate the likelihood of ASR vocabulary words and to deal with the complexity of dysarthric speech. The proposed DM-NSR approach was presented in both speaker-dependent and speaker-independent paradigms. In order to highlight the performance of the proposed model over legacy models, multi-views single-learner models of the DM-NSRs were also provided and their efficiencies were compared in detail. Moreover, a comparison among the prominent dysarthric ASR methods and the proposed one is provided. The results show that the DM-NSR improved the recognition rate by up to 24.67% and reduced the error rate by up to 8.63% over the reference model.
Individuals with autism spectrum disorders do not use social stereotypes in irony comprehension.
Zalla, Tiziana; Amsellem, Frederique; Chaste, Pauline; Ervas, Francesca; Leboyer, Marion; Champagne-Lavau, Maud
2014-01-01
Social and communication impairments are part of the essential diagnostic criteria used to define Autism Spectrum Disorders (ASDs). Difficulties in appreciating non-literal speech, such as irony, in ASDs have been explained as due to impairments in social understanding and in recognizing the speaker's communicative intention. It has been shown that social-interactional factors, such as a listener's beliefs about the speaker's attitudinal propensities (e.g., a tendency to use sarcasm, to be mocking, less sincere and more prone to criticism), as conveyed by an occupational stereotype, do influence a listener's interpretation of potentially ironic remarks. We investigated the effect of occupational stereotype on irony detection in adults with High Functioning Autism or Asperger Syndrome (HFA/AS) and a comparison group of typically developed adults. We used a series of verbally presented stories containing ironic or literal utterances produced by a speaker having either a "sarcastic" or a "non-sarcastic" occupation. Although individuals with HFA/AS were able to recognize ironic intent and occupational stereotypes when the latter were made salient, stereotype information enhanced irony detection and modulated its social meaning (i.e., mockery and politeness) only in comparison participants. We concluded that when stereotype knowledge is not made salient, it does not automatically affect pragmatic communicative processes in individuals with HFA/AS.
Age Estimation Based on Children's Voice: A Fuzzy-Based Decision Fusion Strategy
Ting, Hua-Nong
2014-01-01
Automatic estimation of a speaker's age is a challenging research topic in the area of speech analysis. In this paper, a novel approach to estimate a speaker's age is presented. The method features a “divide and conquer” strategy wherein the speech data are divided into six groups based on the vowel classes. There are two reasons behind this strategy. First, reduction in the complicated distribution of the processing data improves the classifier's learning performance. Second, different vowel classes contain complementary information for age estimation. Mel-frequency cepstral coefficients are computed for each group and single layer feed-forward neural networks based on self-adaptive extreme learning machine are applied to the features to make a primary decision. Subsequently, fuzzy data fusion is employed to provide an overall decision by aggregating the classifier's outputs. The results are then compared with a number of state-of-the-art age estimation methods. Experiments conducted based on six age groups including children aged between 7 and 12 years revealed that fuzzy fusion of the classifier's outputs resulted in considerable improvement of up to 53.33% in age estimation accuracy. Moreover, the fuzzy fusion of decisions aggregated the complementary information of a speaker's age from various speech sources. PMID:25006595
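The final fusion step above aggregates the six vowel-class classifiers' outputs into one age-group decision. The paper's exact fuzzy fusion rule is not reproduced here; the sketch below uses simple `mean` and `max` aggregation operators as illustrative stand-ins, with hypothetical scores:

```python
def fuse_decisions(memberships, operator="mean"):
    """Aggregate per-classifier membership vectors (one per vowel-class
    classifier) into an overall age-group decision. 'mean' and 'max' are
    two common aggregation operators; the paper's rule may differ."""
    n_groups = len(memberships[0])
    if operator == "mean":
        fused = [sum(m[g] for m in memberships) / len(memberships)
                 for g in range(n_groups)]
    else:  # "max": optimistic aggregation
        fused = [max(m[g] for m in memberships) for g in range(n_groups)]
    return max(range(n_groups), key=lambda g: fused[g])  # winning group index

# Three vowel-class classifiers, each scoring six age groups (ages 7-12):
scores = [
    [0.10, 0.20, 0.40, 0.10, 0.10, 0.10],
    [0.00, 0.10, 0.50, 0.20, 0.10, 0.10],
    [0.20, 0.10, 0.30, 0.30, 0.05, 0.05],
]
print(fuse_decisions(scores))  # 2 (third age group wins on average score)
```

The idea is that each vowel-class classifier contributes complementary evidence, and the fusion step reconciles their (possibly conflicting) votes.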
Korean Word Frequency and Commonality Study for Augmentative and Alternative Communication
ERIC Educational Resources Information Center
Shin, Sangeun; Hill, Katya
2016-01-01
Background: Vocabulary frequency results have been reported to design and support augmentative and alternative communication (AAC) interventions. A few studies exist for adult speakers and for other natural languages. With the increasing demand on AAC treatment for Korean adults, identification of high-frequency or core vocabulary (CV) becomes…
A Report by the Air Pollution Committee
ERIC Educational Resources Information Center
Kirkpatrick, Lane
1972-01-01
Description of a symposium, "Air Resource Protection and the Environment," held at the 1972 Environmental Health Conference and Exposition. Reports included a mathematical model for predicting future levels of air pollution, evaluation and identification of transportation controls, and a panel discussion of points raised by the speakers. (LK)
Myths and Political Rhetoric: Jimmy Carter Accepts the Nomination.
ERIC Educational Resources Information Center
Corso, Dianne M.
Like other political speakers who have drawn on the personification, identification, and dramatic encounter images of mythology to pressure and persuade audiences, Jimmy Carter evoked the myths of the hero, the American Dream, and the ideal political process in his presidential nomination acceptance speech. By stressing his unknown status, his…
System Identification and Automatic Mass Balancing of Ground-Based Three-Axis Spacecraft Simulator
2006-08-01
commanded torque to move away from these singularity points. The introduction of this error may not degrade the performance for large slew angle ...trajectory has been generated and quaternion feedback control has been implemented for reference trajectory tracking. The testbed was reasonably well...System Identification and Automatic Mass Balancing of Ground-Based Three-Axis Spacecraft Simulator Jae-Jun Kim∗ and Brij N. Agrawal † Department of
A new methodology for automatic detection of reference points in 3D cephalometry: A pilot study.
Ed-Dhahraouy, Mohammed; Riri, Hicham; Ezzahmouly, Manal; Bourzgui, Farid; El Moutaoukkil, Abdelmajid
2018-04-05
The aim of this study was to develop a new method for an automatic detection of reference points in 3D cephalometry to overcome the limits of 2D cephalometric analyses. A specific application was designed using the C++ language for automatic and manual identification of 21 (reference) points on the craniofacial structures. Our algorithm is based on the implementation of an anatomical and geometrical network adapted to the craniofacial structure. This network was constructed based on the anatomical knowledge of the 3D cephalometric (reference) points. The proposed algorithm was tested on five CBCT images. The proposed approach for the automatic 3D cephalometric identification was able to detect 21 points with a mean error of 2.32 mm. In this pilot study, we propose an automated methodology for the identification of the 3D cephalometric (reference) points. A larger sample will be implemented in the future to assess the method validity and reliability. Copyright © 2018 CEO. Published by Elsevier Masson SAS. All rights reserved.
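The reported mean detection error is simply the average Euclidean distance between each automatically detected landmark and its manually placed reference. A minimal illustrative sketch (toy coordinates, not the study's data):

```python
def mean_landmark_error(detected, reference):
    """Mean Euclidean distance between paired 3D points, in the same
    units as the scan (e.g. mm)."""
    assert len(detected) == len(reference)
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    return sum(dist(p, q) for p, q in zip(detected, reference)) / len(detected)

# Two toy landmarks: one off by a 3-4-0 displacement (distance 5), one exact.
err = mean_landmark_error([(0, 0, 0), (1, 1, 1)], [(3, 4, 0), (1, 1, 1)])
print(err)  # 2.5
```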
NASA Astrophysics Data System (ADS)
Leavens, Claudia; Vik, Torbjørn; Schulz, Heinrich; Allaire, Stéphane; Kim, John; Dawson, Laura; O'Sullivan, Brian; Breen, Stephen; Jaffray, David; Pekar, Vladimir
2008-03-01
Manual contouring of target volumes and organs at risk in radiation therapy is extremely time-consuming, in particular for treating the head-and-neck area, where a single patient treatment plan can take several hours to contour. As radiation treatment delivery moves towards adaptive treatment, the need for more efficient segmentation techniques will increase. We are developing a method for automatic model-based segmentation of the head and neck. This process can be broken down into three main steps: i) automatic landmark identification in the image dataset of interest, ii) automatic landmark-based initialization of deformable surface models to the patient image dataset, and iii) adaptation of the deformable models to the patient-specific anatomical boundaries of interest. In this paper, we focus on the validation of the first step of this method, quantifying the results of our automatic landmark identification method. We use an image atlas formed by applying thin-plate spline (TPS) interpolation to ten atlas datasets, using 27 manually identified landmarks in each atlas/training dataset. The principal variation modes returned by principal component analysis (PCA) of the landmark positions were used by an automatic registration algorithm, which sought the corresponding landmarks in the clinical dataset of interest using a controlled random search algorithm. Applying a run time of 60 seconds to the random search, a root mean square (rms) distance to the ground-truth landmark position of 9.5 +/- 0.6 mm was calculated for the identified landmarks. Automatic segmentation of the brain, mandible and brain stem, using the detected landmarks, is demonstrated.
Aviation Careers Series: Airline Non-Flying Careers
DOT National Transportation Integrated Search
1996-01-01
TRAVLINK demonstrated the use of Automatic Vehicle Location (AVL), ComputerAided dispatch (CAD), and Automatic Vehicle Identification (AVI) systems on Metropolitan Council Transit Operations (MCTO) buses in Minneapolis, Minnesota and western suburbs,...
Automatic limb identification and sleeping parameters assessment for pressure ulcer prevention.
Baran Pouyan, Maziyar; Birjandtalab, Javad; Nourani, Mehrdad; Matthew Pompeo, M D
2016-08-01
Pressure ulcers (PUs) are common among vulnerable patients, such as the elderly and those who are bedridden or diabetic. PUs are very painful for patients and costly for hospitals and nursing homes. Assessment of sleeping parameters on at-risk limbs is critical for ulcer prevention. An effective assessment depends on automatic identification and tracking of at-risk limbs. An accurate limb identification can be used to analyze the pressure distribution and assess risk for each limb. In this paper, we propose a graph-based clustering approach to extract the body limbs from the pressure data collected by a commercial pressure map system. A robust signature-based technique is employed to automatically label each limb. Finally, an assessment technique is applied to evaluate the stress experienced by each limb over time. The experimental results indicate high performance and more than 94% average accuracy of the proposed approach. Copyright © 2016 Elsevier Ltd. All rights reserved.
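The clustering step can be approximated by grouping above-threshold cells of the pressure map into connected components. The toy 4-connectivity version below is an illustrative stand-in for the paper's graph-based clustering, with a hypothetical grid and threshold:

```python
from collections import deque

def find_limbs(pressure, threshold):
    """Group above-threshold cells of a pressure map into connected
    clusters (4-connectivity); each cluster is a candidate limb region."""
    rows, cols = len(pressure), len(pressure[0])
    seen, clusters = set(), []
    for r in range(rows):
        for c in range(cols):
            if pressure[r][c] >= threshold and (r, c) not in seen:
                queue, cluster = deque([(r, c)]), []
                seen.add((r, c))
                while queue:  # breadth-first flood fill of one cluster
                    y, x = queue.popleft()
                    cluster.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and pressure[ny][nx] >= threshold
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                clusters.append(cluster)
    return clusters

grid = [
    [0, 9, 9, 0, 0],
    [0, 9, 0, 0, 8],
    [0, 0, 0, 8, 8],
]
print(len(find_limbs(grid, threshold=5)))  # 2
```

Each returned cluster could then be passed to a signature-based labeler and tracked over time, as the abstract describes.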
Automatic speech recognition research at NASA-Ames Research Center
NASA Technical Reports Server (NTRS)
Coler, Clayton R.; Plummer, Robert P.; Huff, Edward M.; Hitchcock, Myron H.
1977-01-01
A trainable acoustic pattern recognizer manufactured by Scope Electronics is presented. The voice command system VCS encodes speech by sampling 16 bandpass filters with center frequencies in the range from 200 to 5000 Hz. Variations in speaking rate are compensated for by a compression algorithm that subdivides each utterance into eight subintervals in such a way that the amount of spectral change within each subinterval is the same. The recorded filter values within each subinterval are then reduced to a 15-bit representation, giving a 120-bit encoding for each utterance. The VCS incorporates a simple recognition algorithm that utilizes five training samples of each word in a vocabulary of up to 24 words. The recognition rate of approximately 85 percent correct for untrained speakers and 94 percent correct for trained speakers was not considered adequate for flight systems use. Therefore, the built-in recognition algorithm was disabled, and the VCS was modified to transmit 120-bit encodings to an external computer for recognition.
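The rate-compensation step described above, subdividing an utterance into eight subintervals of equal cumulative spectral change, can be sketched as follows. The VCS's actual algorithm is not documented here, so this is an assumption-laden illustration using toy single-filter frames:

```python
def subdivide(frames, n_parts=8):
    """Split a sequence of spectral frames (tuples of filter-bank values)
    into n_parts subintervals of roughly equal cumulative spectral change.
    Returns the frame indices where each subinterval starts/ends."""
    # Frame-to-frame spectral change: sum of absolute filter differences.
    change = [sum(abs(a - b) for a, b in zip(frames[i], frames[i + 1]))
              for i in range(len(frames) - 1)]
    total = sum(change) or 1.0
    boundaries, acc, k = [0], 0.0, 1
    for i, c in enumerate(change):
        acc += c
        if acc >= k * total / n_parts and k < n_parts:
            boundaries.append(i + 1)  # close the k-th subinterval here
            k += 1
    boundaries.append(len(frames))
    return boundaries

# Toy utterance: slow change at the start, faster change later.
frames = [(0.0,), (0.0,), (0.0,), (1.0,), (2.0,),
          (3.0,), (3.0,), (3.0,), (4.0,), (5.0,)]
boundaries = subdivide(frames, n_parts=2)
```

Because boundaries follow cumulative spectral change rather than elapsed time, a slowly spoken word and a quickly spoken one map to comparable fixed-length encodings.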
Fast and automatic thermographic material identification for the recycling process
NASA Astrophysics Data System (ADS)
Haferkamp, Heinz; Burmester, Ingo
1998-03-01
Within the framework of the future closed-loop recycling process, the automatic and economical sorting of plastics is a decisive element. The identification and sorting systems available at present are not yet suitable for sorting technical plastics, since essential requirements, such as high recognition reliability and high identification rates across the variety of technical plastics, cannot be guaranteed. The Laser Zentrum Hannover e.V., in cooperation with the Hoerotron GmbH and the Preussag Noell GmbH, has therefore investigated a rapid thermographic, laser-supported material-identification system for automatic material-sorting systems. The automatic identification of different engineering plastics from electronic or automotive waste is possible. Identification rates of up to 10 parts per second are enabled by the use of fast IR line scanners. The procedure is based on the following principle: within a few milliseconds, a spot on the sample is heated by a CO2 laser. The sample's specific chemical and physical material properties cause characteristic temperature distributions on its surface, which are measured by a fast IR line-scan system. This 'thermal impulse response' is then analyzed by a computer system. Investigations have shown that it is possible to distinguish more than 18 different sorts of plastics at a frequency of 10 Hz. Crucial for the development of such a system are the rapid processing of imaging data, the minimization of interference caused by oscillating sample geometries, and handling the wide range of possible additives in the plastics in question. One possible application area is the sorting of plastics from car and electronic waste recycling.
Gender Differences in the Recognition of Vocal Emotions
Lausen, Adi; Schacht, Annekathrin
2018-01-01
The conflicting findings from the few studies conducted with regard to gender differences in the recognition of vocal expressions of emotion have left the exact nature of these differences unclear. Several investigators have argued that a comprehensive understanding of gender differences in vocal emotion recognition can only be achieved by replicating these studies while accounting for influential factors such as stimulus type, gender-balanced samples, number of encoders, decoders, and emotional categories. This study aimed to account for these factors by investigating whether emotion recognition from vocal expressions differs as a function of both listeners' and speakers' gender. A total of N = 290 participants were randomly and equally allocated to two groups. One group listened to words and pseudo-words, while the other group listened to sentences and affect bursts. Participants were asked to categorize the stimuli with respect to the expressed emotions in a fixed-choice response format. Overall, females were more accurate than males when decoding vocal emotions; however, when testing for specific emotions these differences were small in magnitude. Speakers' gender had a significant impact on how listeners judged emotions from the voice. The group listening to words and pseudo-words had higher identification rates for emotions spoken by male than by female actors, whereas in the group listening to sentences and affect bursts the identification rates were higher when emotions were uttered by female than male actors. The mixed pattern for emotion-specific effects, however, indicates that, in the vocal channel, the reliability of emotion judgments is not systematically influenced by speakers' gender and the related stereotypes of emotional expressivity. Together, these results extend previous findings by showing effects of listeners' and speakers' gender on the recognition of vocal emotions.
They stress the importance of distinguishing these factors to explain recognition ability in the processing of emotional prosody. PMID:29922202
Hearing history influences voice gender perceptual performance in cochlear implant users.
Kovačić, Damir; Balaban, Evan
2010-12-01
The study was carried out to assess the role that five hearing history variables (chronological age, age at onset of deafness, age of first cochlear implant [CI] activation, duration of CI use, and duration of known deafness) play in the ability of CI users to identify speaker gender. Forty-one juvenile CI users participated in two voice gender identification tasks. In a fixed, single-interval task, subjects listened to a single speech item from one of 20 adult male or 20 adult female speakers and had to identify speaker gender. In an adaptive speech-based voice gender discrimination task with the fundamental frequency difference between the voices as the adaptive parameter, subjects listened to a pair of speech items presented in sequential order, one of which was always spoken by an adult female and the other by an adult male. Subjects had to identify the speech item spoken by the female voice. Correlation and regression analyses between perceptual scores in the two tasks and the hearing history variables were performed. Subjects fell into three performance groups: (1) those who could distinguish voice gender in both tasks, (2) those who could distinguish voice gender in the adaptive but not the fixed task, and (3) those who could not distinguish voice gender in either task. Gender identification performance for single voices in the fixed task was significantly and negatively related to the duration of deafness before cochlear implantation (shorter deafness yielded better performance), whereas performance in the adaptive task was weakly but significantly related to age at first activation of the CI device, with earlier activations yielding better scores. 
The existence of a group of subjects able to perform adaptive discrimination but unable to identify the gender of singly presented voices demonstrates the potential dissociability of the skills required for these two tasks, suggesting that duration of deafness and age of cochlear implantation could have dissociable effects on the development of different skills required by CI users to identify speaker gender.
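The adaptive task described above, with the F0 difference between the voices as the adaptive parameter, is typically run as a staircase procedure. The sketch below is a generic 1-up/2-down staircase, not the authors' exact protocol; the function and parameter names (`run_staircase`, `respond`, `step_hz`) are hypothetical.

```python
def run_staircase(respond, start_hz=24.0, step_hz=2.0, floor_hz=1.0, max_trials=60):
    """1-up/2-down staircase on the F0 difference (Hz) between two voices.
    `respond(delta)` returns True when the listener correctly picks the
    female voice at the given F0 separation. Two correct answers in a row
    shrink the difference; one error enlarges it, so the track converges
    near the ~70.7%-correct point."""
    delta = start_hz
    streak = 0
    reversals = []        # deltas where the track changed direction
    direction = 0         # -1 = getting harder, +1 = getting easier
    for _ in range(max_trials):
        if respond(delta):
            streak += 1
            if streak == 2:                 # two correct -> smaller delta
                streak = 0
                if direction == +1:
                    reversals.append(delta)
                direction = -1
                delta = max(floor_hz, delta - step_hz)
        else:                               # one error -> larger delta
            streak = 0
            if direction == -1:
                reversals.append(delta)
            direction = +1
            delta += step_hz
    tail = reversals[-4:]                   # threshold = mean of last reversals
    return sum(tail) / len(tail) if tail else delta

# An ideal listener with an 8 Hz threshold yields an estimate near 8 Hz:
estimate = run_staircase(lambda d: d >= 8.0)
```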
Acoustic and perceptual effects of overall F0 range in a lexical pitch accent distinction
NASA Astrophysics Data System (ADS)
Wade, Travis
2002-05-01
A speaker's overall fundamental frequency range is generally considered a variable, nonlinguistic element of intonation. This study examined the precision with which overall F0 is predictable from previous intonational context and the extent to which it may be perceptually significant. Speakers of Tokyo Japanese produced pairs of sentences differing lexically only in the presence or absence of a single pitch accent, as responses to visual and prerecorded speech cues presented in an interactive manner. F0 placement of high tones (previously observed to be relatively variable in pitch contours) was found to be consistent across speakers and uniformly dependent on the intonation of the different sentences used as cues. In a subsequent perception experiment, continuous manipulations of these same sentences, ranging between typical accented and typical unaccented versions, were presented to Japanese listeners for lexical identification. Results showed that listeners' perception was not significantly altered in compensation for artificial manipulation of preceding intonation. Implications are discussed within an autosegmental analysis of tone. The current results are consistent with the notion that pitch range (i.e., the specific vertical locations of tonal peaks) does not simply vary gradiently across speakers and situations but constitutes a predictable part of the phonetic specification of tones.
Senthil Kumar, S; Suresh Babu, S S; Anand, P; Dheva Shantha Kumari, G
2012-06-01
The purpose of our study was to fabricate an in-house, web-camera-based automatic continuous patient-movement monitoring device and to control the movement of patients during EXRT. The web-camera-based patient-movement monitoring device consists of a computer, a digital web camera, a mounting system, a breaker circuit, a speaker, and a visual indicator. The computer is used to control and analyze patient movement using indigenously developed software. The speaker and the visual indicator are placed in the console room to indicate positional displacement of the patient. Studies were conducted on a phantom and on 150 patients with different types of cancers. Our preliminary clinical results indicate that the device is highly reliable and can accurately report small movements of the patients in all directions. The results demonstrated that the device was able to detect patient movements with a sensitivity of about 1 mm. When a patient moves, the receiver activates the circuit and an audible warning sound is produced in the console. Through real-time measurements, the audible alarm can alert the radiation technologist to stop the treatment if the user-defined positional threshold is violated. Simultaneously, the electrical circuit to the teletherapy machine is activated and radiation is halted. Patient movement during the course of radiotherapy was studied. The beam is halted automatically when the threshold level of the system is exceeded. By using the threshold provided in the system, it is possible to monitor the patient continuously within certain fixed limits. An additional benefit is that it has reduced the tension and stress of a treatment team associated with treating patients who are not immobilized. It also enables the technologists to do their work more efficiently, because they do not have to monitor patients with as much scrutiny as was previously required. © 2012 American Association of Physicists in Medicine.
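The paper does not publish its software, but a webcam movement monitor of this kind typically reduces to frame differencing against a calibrated threshold. A minimal hypothetical sketch follows; the names `movement_exceeds` and `monitor` and the threshold values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def movement_exceeds(prev_frame, frame, pixel_thresh=25, area_thresh=50):
    """Frame differencing: count grayscale pixels whose change exceeds
    `pixel_thresh`; report movement when the changed area passes
    `area_thresh` pixels (a stand-in for the calibrated ~1 mm limit)."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return int((diff > pixel_thresh).sum()) > area_thresh

def monitor(frames, on_alarm):
    """Compare each webcam frame to the previous one; call `on_alarm`
    (sound the console speaker, halt the beam) when the threshold trips."""
    prev = frames[0]
    for frame in frames[1:]:
        if movement_exceeds(prev, frame):
            on_alarm()
        prev = frame

# Two identical frames, then one with a displaced bright patch -> one alarm.
still = np.zeros((100, 100), dtype=np.uint8)
moved = still.copy()
moved[10:30, 10:30] = 255
alarms = []
monitor([still, still, moved], lambda: alarms.append(1))
```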
The Language of Persuasion, English, Vocabulary: 5114.68.
ERIC Educational Resources Information Center
Groff, Irvin
Developed for a high school quinmester unit on the language of persuasion, this guide provides the teacher with teaching strategies for a study of the speaker or writer as a persuader, the identification of the logical and psychological tools of persuasion, an examination of the levels of abstraction, the techniques of propaganda, and the…
ERIC Educational Resources Information Center
Jiang, Jun; Liu, Fang; Wan, Xuan; Jiang, Cunmei
2015-01-01
Tone language experience benefits pitch processing in music and speech for typically developing individuals. No known studies have examined pitch processing in individuals with autism who speak a tone language. This study investigated discrimination and identification of melodic contour and speech intonation in a group of Mandarin-speaking…
A Cross-Language Study of Perception of Lexical Stress in English
ERIC Educational Resources Information Center
Yu, Vickie Y.; Andruski, Jean E.
2010-01-01
This study investigates the question of whether language background affects the perception of lexical stress in English. Thirty native English speakers and 30 native Chinese learners of English participated in a stressed-syllable identification task and a discrimination task involving three types of stimuli (real words/pseudowords/hums). The…
Processing of Acoustic Cues in Lexical-Tone Identification by Pediatric Cochlear-Implant Recipients
ERIC Educational Resources Information Center
Peng, Shu-Chen; Lu, Hui-Ping; Lu, Nelson; Lin, Yung-Song; Deroche, Mickael L. D.; Chatterjee, Monita
2017-01-01
Purpose: The objective was to investigate acoustic cue processing in lexical-tone recognition by pediatric cochlear-implant (CI) recipients who are native Mandarin speakers. Method: Lexical-tone recognition was assessed in pediatric CI recipients and listeners with normal hearing (NH) in 2 tasks. In Task 1, participants identified naturally…
Acquisition of L2 Vowel Duration in Japanese by Native English Speakers
ERIC Educational Resources Information Center
Okuno, Tomoko
2013-01-01
Research has demonstrated that focused perceptual training facilitates L2 learners' segmental perception and spoken word identification. Hardison (2003) and Motohashi-Saigo and Hardison (2009) found benefits of visual cues in the training for acquisition of L2 contrasts. The present study examined factors affecting perception and production…
Improving Speaker Recognition by Biometric Voice Deconstruction
Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro
2015-01-01
Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have necessarily been replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved in recent years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers, combined with a set of features derived from the deconstruction of the voice into its glottal-source and vocal-tract estimates, will enhance recognition rates compared with classical approaches. A general description of the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a database recorded under highly controlled acoustic conditions and on one recorded over a mobile phone network under non-controlled acoustic conditions. PMID:26442245
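A common textbook route to separating the voice into glottal-source and vocal-tract estimates, as the deconstruction above requires, is linear-prediction (LPC) inverse filtering. The self-contained sketch below implements the generic method, not the authors' specific parameterization.

```python
import numpy as np

def lpc(signal, order):
    """Autocorrelation-method LPC via Levinson-Durbin recursion; returns
    the coefficients of the all-pole vocal-tract model A(z) = 1 + a1 z^-1 + ..."""
    n = len(signal)
    r = np.array([float(np.dot(signal[:n - k], signal[k:])) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a.copy()
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a

def glottal_residual(signal, order=12):
    """Inverse-filter the speech with A(z) to strip the vocal-tract envelope;
    the residual approximates the glottal excitation."""
    a = lpc(signal, order)
    return np.convolve(signal, a)[:len(signal)]

# Synthetic check: an AR(2) "vocal tract" driven by an impulse train.
e = np.zeros(400)
e[::40] = 1.0
x = np.zeros(400)
for t in range(400):
    x[t] = e[t] + (1.3 * x[t - 1] if t >= 1 else 0.0) - (0.64 * x[t - 2] if t >= 2 else 0.0)
a_hat = lpc(x, 2)   # should be close to [1, -1.3, 0.64]
```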
Automatic tracking of wake vortices using ground-wind sensor data
DOT National Transportation Integrated Search
1977-01-03
Algorithms for automatic tracking of wake vortices using ground-wind anemometer data are developed. Methods of bad-data suppression, track initiation, and track termination are included. An effective sensor-failure detection-and identification ...
Crescent Evaluation : appendix D : crescent computer system components evaluation report
DOT National Transportation Integrated Search
1994-02-01
In 1990, Lockheed Integrated Systems Company (LISC) was awarded a contract, under the Crescent Demonstration Project, to demonstrate the integration of Weigh In Motion (WIM), Automatic Vehicle Classification (AVC) and Automatic Vehicle Identification...
Automatic Molar Extraction from Dental Panoramic Radiographs for Forensic Personal Identification
NASA Astrophysics Data System (ADS)
Samopa, Febriliyan; Asano, Akira; Taguchi, Akira
Measurement of an individual molar provides rich information for forensic personal identification. We propose a computer-based system for extracting an individual molar from dental panoramic radiographs. A molar is obtained by extracting the region of interest, separating the maxilla and mandible, and extracting the boundaries between teeth. The proposed system is almost fully automatic; all that the user has to do is click three points on the boundary between the maxilla and the mandible.
NASA Technical Reports Server (NTRS)
Wolf, Jared J.
1977-01-01
The following research was discussed: (1) speech signal processing; (2) automatic speech recognition; (3) continuous speech understanding; (4) speaker recognition; (5) speech compression; (6) subjective and objective evaluation of speech communication systems; (7) measurement of the intelligibility and quality of speech when degraded by noise or other masking stimuli; (8) speech synthesis; (9) instructional aids for second-language learning and for training of the deaf; and (10) investigation of speech correlates of psychological stress. Experimental psychology, control systems, and human factors engineering, which are often relevant to the proper design and operation of speech systems, are also described.
Robust Speaker Authentication Based on Combined Speech and Voiceprint Recognition
NASA Astrophysics Data System (ADS)
Malcangi, Mario
2009-08-01
Personal authentication is becoming increasingly important in many applications that have to protect proprietary data. Passwords and personal identification numbers (PINs) prove not to be robust enough to ensure that unauthorized people do not use them. Biometric authentication technology may offer a secure, convenient, accurate solution but sometimes fails due to its intrinsically fuzzy nature. This research aims to demonstrate that combining two basic speech processing methods, voiceprint identification and speech recognition, can provide a very high degree of robustness, especially if fuzzy decision logic is used.
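The combination of voiceprint identification and speech recognition with fuzzy decision logic can be illustrated by a toy score-fusion rule that keeps an explicit "uncertain" band instead of a hard accept/reject boundary. The weights and thresholds below are illustrative assumptions, not values from the paper.

```python
def authenticate(voiceprint_score, asr_score, w=0.6, accept=0.7, reject=0.4):
    """Fuzzy-style fusion of two normalized scores in [0, 1]:
    a weighted combination mapped to accept / reject / uncertain,
    where 'uncertain' would trigger a fallback (e.g., re-prompting)."""
    fused = w * voiceprint_score + (1.0 - w) * asr_score
    if fused >= accept:
        return "accept"
    if fused <= reject:
        return "reject"
    return "uncertain"
```

A strong match on both channels is accepted, a weak match on both is rejected, and conflicting evidence lands in the uncertain band.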
Event identification by acoustic signature recognition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dress, W.B.; Kercel, S.W.
1995-07-01
Many events of interest to the security community produce acoustic emissions that are, in principle, identifiable as to cause. Some obvious examples are gunshots, breaking glass, takeoffs and landings of small aircraft, vehicular engine noises, footsteps (high frequencies when on gravel, very low frequencies when on soil), and voices (whispers to shouts). We are investigating wavelet-based methods to extract unique features of such events for classification and identification. We also discuss methods of classification and pattern recognition specifically tailored for acoustic signatures obtained by wavelet analysis. The paper is divided into three parts: completed work, work in progress, and future applications. The completed phase has led to the successful recognition of aircraft types on landing and takeoff. Both small aircraft (twin-engine turboprop) and large aircraft (commercial airliners) were included in the study. The project considered the design of a small, field-deployable, inexpensive device. The techniques developed during the aircraft identification phase were then adapted to a multispectral electromagnetic interference monitoring device now deployed in a nuclear power plant. This is a general-purpose wavelet analysis engine, spanning 14 octaves, and can be adapted for other specific tasks. Work in progress is focused on applying the methods previously developed to speaker identification. Some of the problems to be overcome include recognition of sounds as voice patterns, distinct from possible background noises (e.g., music), as well as identification of the speaker from a short-duration voice sample. A generalization of the completed work and the work in progress is a device capable of classifying any number of acoustic events, particularly quasi-stationary events such as engine noises and voices, and singular events such as gunshots and breaking glass. We show examples of both kinds of events and discuss their recognition likelihood.
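Wavelet-based feature extraction of the kind described can be sketched with a plain Haar decomposition: at each level the signal splits into a smooth half (pairwise averages) and a detail half (pairwise differences), and the per-octave detail energies form a compact signature for classifying events. This is a generic illustration, not the deployed 14-octave engine.

```python
import numpy as np

def haar_energy_signature(x, levels=6):
    """Normalized per-octave energy signature from a Haar wavelet cascade.
    Level 0 holds the finest (highest-frequency) detail energy."""
    x = np.asarray(x, dtype=float)
    feats = []
    for _ in range(levels):
        if len(x) < 2:
            break
        if len(x) % 2:                       # drop an odd trailing sample
            x = x[:-1]
        approx = (x[0::2] + x[1::2]) / np.sqrt(2)   # smooth half
        detail = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail half
        feats.append(float(np.sum(detail ** 2)))
        x = approx                            # recurse on the smooth half
    total = sum(feats) or 1.0
    return [e / total for e in feats]

# A maximally high-frequency signal puts all its energy in the finest octave.
sig = haar_energy_signature(np.tile([1.0, -1.0], 256))
```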
Occurrence Frequencies of Acoustic Patterns of Vocal Fry in American English Speakers.
Abdelli-Beruh, Nassima B; Drugman, Thomas; Red Owl, R H
2016-11-01
The goal of this study was to analyze the occurrence frequencies of three individual acoustic patterns (A, B, C) and of vocal fry overall (A + B + C) as a function of gender, word position in the sentence (Not Last Word vs. Last Word), and sentence length (number of words in a sentence). This is an experimental design. Twenty-five male and 29 female American English (AE) speakers read the Grandfather Passage. The recordings were processed by a Matlab toolbox designed for the analysis and detection of creaky segments, automatically identified using the Kane-Drugman algorithm. The experiment produced subsamples of outcomes, three that reflect a single, discrete acoustic pattern (A, B, or C) and the fourth that reflects the occurrence frequency counts of Vocal Fry Overall without regard to any specific pattern. Zero-truncated Poisson regression analyses were conducted with Gender and Word Position as predictors and Sentence Length as a covariate. The results of the present study showed that the occurrence frequencies of the three acoustic patterns and vocal fry overall (A + B + C) are greatest at the end of sentences but are unaffected by sentence length. The findings also reveal that AE female speakers exhibit Pattern C significantly more frequently than Pattern B, and the converse holds for AE male speakers. Future studies are needed to confirm such outcomes, assess the perceptual salience of these acoustic patterns, and determine the physiological correlates of these acoustic patterns. The findings have implications for the design of new excitation models of vocal fry. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Port-of-entry advanced sorting system (PASS) operational test
DOT National Transportation Integrated Search
1998-12-01
In 1992 the Oregon Department of Transportation undertook an operational test of the Port-of-Entry Advanced Sorting System (PASS), which uses a two-way communication automatic vehicle identification system, integrated with weigh-in-motion, automatic ...
DeRobertis, Christopher V.; Lu, Yantian T.
2010-02-23
A method, system, and program storage device for creating a new user account or user group with a unique identification number in a computing environment having multiple user registries is provided. In response to receiving a command to create a new user account or user group, an operating system of a clustered computing environment automatically checks multiple registries configured for the operating system to determine whether a candidate identification number for the new user account or user group has been assigned already to one or more existing user accounts or groups, respectively. The operating system automatically assigns the candidate identification number to the new user account or user group created in a target user registry if the checking indicates that the candidate identification number has not been assigned already to any of the existing user accounts or user groups, respectively.
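The check-all-registries step of the method can be sketched as follows. The data layout (dicts mapping account name to ID) and the function name are illustrative assumptions, not the patented implementation.

```python
def assign_unique_id(registries, target, name, start=1000):
    """Pick the first candidate ID not already assigned in ANY configured
    registry, then create the account in the target registry, mirroring
    the check-all-registries step the patent describes."""
    used = set()
    for registry in registries.values():
        used.update(registry.values())
    candidate = start
    while candidate in used:        # candidate collides; try the next number
        candidate += 1
    registries[target][name] = candidate
    return candidate

# Two registries already hold 1000 and 1001, so "carol" receives 1002.
registries = {"local": {"alice": 1000}, "ldap": {"bob": 1001}}
new_id = assign_unique_id(registries, "local", "carol")
```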
Perception of Non-Native Consonant Length Contrast: The Role of Attention in Phonetic Processing
ERIC Educational Resources Information Center
Porretta, Vincent J.; Tucker, Benjamin V.
2015-01-01
The present investigation examines English speakers' ability to identify and discriminate non-native consonant length contrast. Three groups (L1 English No-Instruction, L1 English Instruction, and L1 Finnish control) performed a speeded forced-choice identification task and a speeded AX discrimination task on Finnish non-words (e.g.…
The Effect of Pitch Peak Alignment on Sentence Type Identification in Russian
ERIC Educational Resources Information Center
Makarova, Veronika
2007-01-01
This paper reports the results of an experimental phonetic study examining pitch peak alignment in production and perception of three-syllable one-word sentences with phonetic rising-falling pitch movement by speakers of Russian. The first part of the study (Experiment 1) utilizes 22 one-word three-syllable utterances read by five female speakers…
ERIC Educational Resources Information Center
Liu, Fang; Xu, Yi; Patel, Aniruddh D.; Francart, Tom; Jiang, Cunmei
2012-01-01
This study examined whether "melodic contour deafness" (insensitivity to the direction of pitch movement) in congenital amusia is associated with specific types of pitch patterns (discrete versus gliding pitches) or stimulus types (speech syllables versus complex tones). Thresholds for identification of pitch direction were obtained using discrete…
Effects of Audio-Visual Integration on the Detection of Masked Speech and Non-Speech Sounds
ERIC Educational Resources Information Center
Eramudugolla, Ranmalee; Henderson, Rachel; Mattingley, Jason B.
2011-01-01
Integration of simultaneous auditory and visual information about an event can enhance our ability to detect that event. This is particularly evident in the perception of speech, where the articulatory gestures of the speaker's lips and face can significantly improve the listener's detection and identification of the message, especially when that…
The Crescent Project : an evaluation of an element of the HELP Program : executive summary
DOT National Transportation Integrated Search
1994-02-01
The HELP/Crescent Project on the West Coast evaluated the applicability of four technologies for screening transponder-equipped vehicles. The technologies included automatic vehicle identification, weigh-in-motion, automatic vehicle classification, a...
Port-of-entry Advanced Sorting System (PASS) operational test : final report
DOT National Transportation Integrated Search
1998-12-01
In 1992 the Oregon Department of Transportation undertook an operational test of the Port-of-Entry Advanced Sorting System (PASS), which uses a two-way communication automatic vehicle identification system, integrated with weigh-in-motion, automatic ...
Understanding ITS/CVO Technology Applications, Student Manual, Course 3
DOT National Transportation Integrated Search
1999-01-01
Weigh-in-motion or WIM, commercial vehicle information systems and networks or CVISN, automatic vehicle identification or AVI, automatic location or AVL, electronic data interchange or EDI, Global Positioning System or GPS, Internet or World Wide Web...
Response Identification in the Extremely Low Frequency Region of an Electret Condenser Microphone
Jeng, Yih-Nen; Yang, Tzung-Ming; Lee, Shang-Yin
2011-01-01
This study shows that a small electret condenser microphone connected to a notebook or a personal computer (PC) has a prominent response in the extremely low frequency region in a specific environment. It confines most acoustic waves within a tiny air cell as follows. The air cell is constructed by drilling a small hole in a digital versatile disk (DVD) plate. A small speaker and an electret condenser microphone are attached to the two sides of the hole. Thus, the acoustic energy emitted by the speaker and reaching the microphone is strong enough to actuate the diaphragm of the latter. The experiments showed that, once small air leakages are allowed on the margin of the speaker, the microphone captured the signal in the range of 0.5 to 20 Hz. Moreover, by removing the plastic cover of the microphone and attaching the microphone head to the vibration surface, the low frequency signal can be effectively captured too. Two examples are included to show the convenience of applying the microphone to pick up the low frequency vibration information of practical systems. PMID:22346594
Gas Reservoir Identification Basing on Deep Learning of Seismic-print Characteristics
NASA Astrophysics Data System (ADS)
Cao, J.; Wu, S.; He, X.
2016-12-01
Reservoir identification based on seismic data analysis is the core task in oil and gas geophysical exploration. The essence of reservoir identification is to identify the properties of rock pore fluid. We developed a novel gas reservoir identification method named seismic-print analysis, in imitation of the vocal-print analysis techniques used in speaker identification. The term "seismic-print" refers to the characteristics of the seismic waveform that can definitively identify the property of a geological objective, for instance, a natural gas reservoir. A seismic-print can be characterized by one or a few parameters, named seismic-print parameters. It has been proven that gas reservoirs exhibit a negative first-order cepstrum coefficient anomaly and a positive second-order cepstrum coefficient anomaly concurrently. The method is valid for sandstone gas reservoirs, carbonate reservoirs, and shale gas reservoirs, and its accuracy rate may reach up to 90%. There are two main problems to deal with in the application of the seismic-print analysis method. One is to identify the "ripple" of a reservoir on the seismogram, and the other is to construct the mapping relationship between the seismic-print and the gas reservoirs. Deep learning, developed in recent years, has the ability to reveal the complex non-linear relationship between attributes and data, and to extract features of the objective from the data automatically. Thus, deep learning can be used to deal with these two problems. There are many algorithms for deep learning, which can be roughly divided into two categories: Deep Belief Networks (DBNs) and Convolutional Neural Networks (CNNs). A DBN is a probabilistic generative model that can establish a joint distribution of the observed data and tags. A CNN is a feedforward neural network that can be used to extract the 2D structural features of the input data.
Both DBNs and CNNs can be used to deal with seismic data. We used an improved DBN to identify carbonate rocks from log data; the accuracy rate reached up to 83%. When a DBN is used to deal with seismic waveform data, more information is obtained. The work was supported by NSFC under grant No. 41430323 and No. 41274128, and the State Key Lab. of Oil and Gas Reservoir Geology and Exploration.
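The cepstrum coefficients central to seismic-print analysis come from the standard real cepstrum: the inverse FFT of the log magnitude spectrum. A generic sketch (not the authors' exact processing chain):

```python
import numpy as np

def real_cepstrum(x, n=None):
    """Real cepstrum c = IFFT(log |FFT(x)|). The low-order coefficients
    c[1], c[2] are the first- and second-order values whose anomalies the
    seismic-print method screens for."""
    spec = np.abs(np.fft.fft(x, n=n))
    spec[spec == 0] = np.finfo(float).tiny   # guard against log(0)
    return np.real(np.fft.ifft(np.log(spec)))

# Sanity check: a unit impulse has a flat spectrum, so its cepstrum is zero.
impulse = np.zeros(64)
impulse[0] = 1.0
c = real_cepstrum(impulse)
```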
Astrometrica: Astrometric data reduction of CCD images
NASA Astrophysics Data System (ADS)
Raab, Herbert
2012-03-01
Astrometrica is an interactive software tool for scientific-grade astrometric data reduction of CCD images. The current version of the software is for the 32-bit Windows operating system family. Astrometrica reads FITS (8-, 16- and 32-bit integer) and SBIG image files. The size of the images is limited only by available memory. It also offers automatic image calibration (dark frame and flat field correction), automatic reference star identification, automatic moving object detection and identification, and access to new-generation star catalogs (PPMXL, UCAC 3 and CMC-14), in addition to online help and other features. Astrometrica is shareware, available for use free of charge for a limited period of time (100 days); special arrangements can be made for educational projects.
Automatic identification of bullet signatures based on consecutive matching striae (CMS) criteria.
Chu, Wei; Thompson, Robert M; Song, John; Vorburger, Theodore V
2013-09-10
The consecutive matching striae (CMS) numeric criteria for firearm and toolmark identifications have been widely accepted by forensic examiners, although there have been questions concerning their observer subjectivity and limited statistical support. In this paper, based on signal processing and extraction, a model for the automatic and objective counting of CMS is proposed. The position and shape information of the striae on the bullet land is represented by a feature profile, which is used for determining the CMS number automatically. Rapid counting of the CMS number provides a basis for ballistics correlations with large databases and further statistical and probability analysis. Experimental results in this report, using bullets fired from ten consecutively manufactured barrels, support the developed model. Published by Elsevier Ireland Ltd.
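A deliberately simplified illustration of the counting step: given two aligned sequences of stria positions, find the longest run of consecutive positions that agree within a tolerance. The model in the paper works on extracted feature profiles; this toy version shows only the CMS count itself.

```python
def max_consecutive_matching_striae(a, b, tol=1):
    """Longest run of consecutive striae whose positions agree within
    `tol` units between two aligned land-impression profiles, i.e. the
    quantity that CMS criteria are stated in (e.g. >= 6 consecutive
    matches for an identification)."""
    run = best = 0
    for pos_a, pos_b in zip(a, b):
        if abs(pos_a - pos_b) <= tol:
            run += 1
            best = max(best, run)
        else:
            run = 0                     # a mismatch breaks the run
    return best

# Four consecutive matches, then a 5-unit mismatch breaks the run.
cms = max_consecutive_matching_striae([10, 20, 30, 40, 55, 60],
                                      [10, 21, 30, 41, 50, 60])
```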
Experiments in automatic word class and word sense identification for information retrieval
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gauch, S.; Futrelle, R.P.
Automatic identification of related words and automatic detection of word senses are two long-standing goals of researchers in natural language processing. Word class information and word sense identification may enhance the performance of information retrieval systems. Large online corpora and increased computational capabilities make new techniques based on corpus linguistics feasible. Corpus-based analysis is especially needed for corpora from specialized fields for which no electronic dictionaries or thesauri exist. The methods described here use a combination of mutual information and word context to establish word similarities. Then, unsupervised classification is done using clustering in the word space, identifying word classes without pretagging. We also describe an extension of the method to handle the difficult problems of disambiguation and of determining part-of-speech and semantic information for low-frequency words. The method is powerful enough to produce high-quality results on a small corpus of 200,000 words from abstracts in a field of molecular biology.
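The mutual-information step can be illustrated with pointwise mutual information (PMI) between words and their context words; the high-PMI context profiles are then the raw material that clustering operates on. A minimal sketch under those assumptions (the window size and function name are illustrative, not the paper's configuration):

```python
from collections import Counter
from math import log

def pmi_table(tokens, window=2):
    """PMI between each word and the words inside a +/-`window` token
    context: log of observed co-occurrence probability over the
    probability expected if the words were independent."""
    word_n = Counter(tokens)
    pair_n = Counter()
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                pair_n[(w, tokens[j])] += 1
    total = len(tokens)
    pairs_total = sum(pair_n.values())
    return {
        (w, c): log((n / pairs_total) / ((word_n[w] / total) * (word_n[c] / total)))
        for (w, c), n in pair_n.items()
    }

# "cat" co-occurs with "the" more often than chance, so PMI is positive.
table = pmi_table("the cat sat the cat ran".split())
```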
Preparing a collection of radiology examinations for distribution and retrieval.
Demner-Fushman, Dina; Kohli, Marc D; Rosenman, Marc B; Shooshan, Sonya E; Rodriguez, Laritza; Antani, Sameer; Thoma, George R; McDonald, Clement J
2016-03-01
Clinical documents made available for secondary use play an increasingly important role in the discovery of clinical knowledge, the development of research methods, and education. An important step in facilitating secondary use of clinical document collections is easy access to descriptions and samples that represent the content of the collections. This paper presents an approach to developing a collection of radiology examinations, including both the images and the radiologists' narrative reports, and making them publicly available in a searchable database. The authors collected 3996 radiology reports from the Indiana Network for Patient Care and 8121 associated images from the hospitals' picture archiving systems. The images and reports were de-identified automatically, and the automatic de-identification was then manually verified. The authors coded the key findings of the reports and empirically assessed the benefits of manual coding on retrieval. The automatic de-identification of the narrative was aggressive and achieved 100% precision, at the cost of rendering a few findings uninterpretable. The automatic de-identification of images was not as complete: images for two of 3996 patients (0.05%) showed protected health information. Manual encoding of findings improved retrieval precision. Stringent de-identification methods can remove all identifiers from text radiology reports. DICOM de-identification of images does not remove all identifying information and needs special attention for images scanned from film. Adding manual coding to the radiologists' narrative reports significantly improved the relevancy of the retrieved clinical documents. The de-identified Indiana chest X-ray collection is available for searching and downloading from the National Library of Medicine (http://openi.nlm.nih.gov/). Published by Oxford University Press on behalf of the American Medical Informatics Association 2015. This work is written by US Government employees and is in the public domain in the US.
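Rule-based de-identification of narrative text, as used in a first automatic pass, can be illustrated with a few regular-expression patterns. These patterns are toy examples only; a production system, including the one described above, layers far more extensive rules and follows up with manual verification.

```python
import re

# Illustrative patterns for three kinds of protected health information.
PATTERNS = [
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"), "[DATE]"),          # 3/14/2015
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),                 # 123-45-6789
    (re.compile(r"\b(?:Dr|Mr|Ms|Mrs)\.\s+[A-Z][a-z]+\b"), "[NAME]"), # Dr. Smith
]

def deidentify(text):
    """Replace every pattern match with its placeholder token."""
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text

cleaned = deidentify("Seen by Dr. Smith on 3/14/2015.")
```

Aggressive rules of this sort trade recall of findings for precision of removal, which is exactly the trade-off the abstract reports.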
Human Activity Recognition in AAL Environments Using Random Projections.
Damaševičius, Robertas; Vasiljevas, Mindaugas; Šalkevičius, Justas; Woźniak, Marcin
2016-01-01
Automatic human activity recognition systems aim to capture the state of the user and the environment by exploiting heterogeneous sensors attached to the subject's body, and permit continuous monitoring of numerous physiological signals reflecting the state of human actions. Successful identification of human activities can be immensely useful in healthcare applications for Ambient Assisted Living (AAL), such as automatic and intelligent activity monitoring systems developed for elderly and disabled people. In this paper, we propose a method for activity recognition and subject identification based on random projections from a high-dimensional feature space to a low-dimensional projection space, where the classes are separated using the Jaccard distance between probability density functions of the projected data. Two HAR domain tasks are considered: activity identification and subject identification. Experimental results using the proposed method with Human Activity Dataset (HAD) data are presented.
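The projection-and-compare idea in this abstract can be sketched in a few lines. This is an illustrative toy, assuming a Gaussian random projection matrix and histogram estimates of the projected densities; the authors' actual features, projection dimensions, and classifier details are not reproduced here.

```python
import numpy as np

def random_project(X, out_dim, seed=0):
    """Project rows of X to out_dim dimensions with a Gaussian random matrix."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], out_dim)) / np.sqrt(out_dim)
    return X @ R

def jaccard_distance(p, q):
    """Jaccard-style distance between two non-negative histograms."""
    return 1.0 - np.minimum(p, q).sum() / np.maximum(p, q).sum()

# Toy demo: two 50-D classes projected to one dimension and compared
# through histogram estimates of their projected densities.
rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, size=(500, 50))
b = rng.normal(3.0, 1.0, size=(500, 50))
pa, pb = random_project(a, 1).ravel(), random_project(b, 1).ravel()
bins = np.linspace(min(pa.min(), pb.min()), max(pa.max(), pb.max()), 31)
ha, _ = np.histogram(pa, bins=bins, density=True)
hb, _ = np.histogram(pb, bins=bins, density=True)
d_self = jaccard_distance(ha, ha)      # 0.0: identical densities
d_between = jaccard_distance(ha, hb)   # grows as the projected classes separate
print(d_self, d_between)
```

In a classifier, a new sample would be assigned to whichever class density it is closest to under this distance.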
A comparison of algorithms for inference and learning in probabilistic graphical models.
Frey, Brendan J; Jojic, Nebojsa
2005-09-01
Research into methods for reasoning under uncertainty is currently one of the most exciting areas of artificial intelligence, largely because it has recently become possible to record, store, and process large amounts of data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition, face detection, speaker identification, and prediction of gene function, it is even more exciting that researchers are on the verge of introducing systems that can perform large-scale combinatorial analyses of data, decomposing the data into interacting components. For example, computational methods for automatic scene analysis are now emerging in the computer vision community. These methods decompose an input image into its constituent objects, lighting conditions, motion patterns, etc. Two of the main challenges are finding effective representations and models in specific applications and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms. We review exact techniques and various approximate, computationally efficient techniques, including iterated conditional modes, the expectation maximization (EM) algorithm, Gibbs sampling, the mean field method, variational techniques, structured variational techniques and the sum-product algorithm ("loopy" belief propagation). We describe how each technique can be applied in a vision model of multiple, occluding objects and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.
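Among the algorithms this review covers, expectation maximization is the easiest to illustrate in miniature. The sketch below fits a two-component 1-D Gaussian mixture by EM; it is a generic textbook instance, not the occlusion vision model or free-energy comparison discussed in the paper.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """Minimal EM for a two-component 1-D Gaussian mixture (illustrative only)."""
    mu = np.array([x.min(), x.max()])   # crude initialisation at the data extremes
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] proportional to pi_k * N(x_i | mu_k, var_k)
        d = x[:, None] - mu[None, :]
        lik = pi * np.exp(-0.5 * d**2 / var) / np.sqrt(2 * np.pi * var)
        r = lik / lik.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances from responsibilities
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu)**2).sum(axis=0) / nk
    return pi, mu, var

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(2, 0.5, 300)])
pi, mu, var = em_gmm_1d(x)
print(sorted(mu))   # component means recovered near -2 and 2
```

The same E/M alternation generalizes to the structured models the paper reviews; what changes is the form of the posterior computed in the E-step.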
The Mechanism of Speech Processing in Congenital Amusia: Evidence from Mandarin Speakers
Liu, Fang; Jiang, Cunmei; Thompson, William Forde; Xu, Yi; Yang, Yufang; Stewart, Lauren
2012-01-01
Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results. PMID:22347374
Voice recognition through phonetic features with Punjabi utterances
NASA Astrophysics Data System (ADS)
Kaur, Jasdeep; Juglan, K. C.; Sharma, Vishal; Upadhyay, R. K.
2017-07-01
This paper deals with perception and disorders of speech in Punjabi. In view of the importance of voice identification, various parameters of speaker identification have been studied. The speech material was recorded with a tape recorder in the speakers' normal and disguised modes of utterance. From the recorded material, utterances free from noise were selected for auditory and acoustic spectrographic analysis. The comparison of normal and disguised speech of seven subjects is reported. The fundamental frequency (F0) at similar places, plosive duration at certain phonemes, and the amplitude ratio (A1:A2) were compared in normal and disguised speech. It was found that the formant frequencies of normal and disguised speech remain almost similar only if they are compared at positions of the same vowel quality and quantity. If the vowel is more closed or more open in the disguised utterance, the formant frequencies change relative to the normal utterance. The amplitude ratio (A1:A2) was found to be speaker dependent and generally remained unchanged in the disguised utterance, although this value may shift if cross-sectioning is not done at the same location.
Deguchi, Shinji; Kawashima, Kazutaka; Washio, Seiichi
2008-12-01
The effect of artificially altered transglottal pressures on the voice fundamental frequency (F0) is known to be associated with vocal fold stiffness. Its measurement, though useful as a potential diagnostic tool for noncontact assessment of vocal fold stiffness, often requires manual and painstaking determination of an unstable F0 of voice. Here, we provide a computer-aided technique that enables one to carry out the determination easily and accurately. Human subjects vocalized in accordance with a series of reference sounds from a speaker controlled by a computer. Transglottal pressures were altered by means of a valve embedded in a mouthpiece. Time-varying vocal F0 was extracted, without manual procedures, from a specific range of the voice spectrum determined on the basis of the controlled reference sounds. The validity of the proposed technique was assessed for 11 healthy subjects. Fluctuating voice F0 was tracked automatically during experiments, providing the relationship between transglottal pressure change and F0 on the computer. The proposed technique overcomes the difficulty in automatic determination of the voice F0, which tends to be transient both in normal voice and in some types of pathological voice.
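The key trick in this abstract, restricting the F0 search to a range implied by the known reference sound, can be illustrated with a simple spectral-peak picker. This is a hedged toy: the band fraction, windowing, and peak-picking are assumptions, not the authors' algorithm.

```python
import numpy as np

def f0_near_reference(x, fs, f_ref, band=0.2):
    """Strongest spectral peak within +/- band*f_ref of a known reference
    frequency, mimicking an F0 search restricted by the reference sound."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    sel = (freqs >= f_ref * (1 - band)) & (freqs <= f_ref * (1 + band))
    return freqs[sel][np.argmax(spec[sel])]

fs = 8000
t = np.arange(0, 1.0, 1 / fs)              # 1 s of synthetic "voice": F0 plus one harmonic
voice = np.sin(2 * np.pi * 207 * t) + 0.5 * np.sin(2 * np.pi * 414 * t)
print(f0_near_reference(voice, fs, f_ref=200))   # → 207.0
```

Restricting the search band is what keeps the picker from locking onto the stronger harmonic or onto noise when the voice F0 fluctuates.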
Chaspari, Theodora; Soldatos, Constantin; Maragos, Petros
2015-01-01
The development of ecologically valid procedures for collecting reliable and unbiased emotional data is essential for computer interfaces with social and affective intelligence targeting patients with mental disorders. To this end, the Athens Emotional States Inventory (AESI) proposes the design, recording and validation of an audiovisual database for five emotional states: anger, fear, joy, sadness and neutral. The items of the AESI consist of sentences, each having content indicative of the corresponding emotion. Emotional content was assessed through a survey of 40 young participants with a questionnaire following the Latin square design. The emotional sentences that were correctly identified by 85% of the participants were recorded in a soundproof room with microphones and cameras. A preliminary validation of the AESI is performed through automatic emotion recognition experiments from speech. The resulting database contains 696 utterances recorded in the Greek language by 20 native speakers and has a total duration of approximately 28 min. Speech classification results yield accuracy of up to 75.15% for automatically recognizing the emotions in the AESI. These results indicate the usefulness of our approach for collecting emotional data with reliable content, balanced across classes and with reduced environmental variability.
Intrinsic fundamental frequency of vowels is moderated by regional dialect
Jacewicz, Ewa; Fox, Robert Allen
2015-01-01
There has been a long-standing debate whether the intrinsic fundamental frequency (IF0) of vowels is an automatic consequence of articulation or whether it is independently controlled by speakers to perceptually enhance vowel contrasts along the height dimension. This paper provides evidence from regional variation in American English that IF0 difference between high and low vowels is, in part, controlled and varies across dialects. The sources of this F0 control are socio-cultural and cannot be attributed to differences in the vowel inventory size. The socially motivated enhancement was found only in prosodically prominent contexts. PMID:26520352
Enhancing the authenticity of assessments through grounding in first impressions.
Humă, Bogdana
2015-09-01
This article examines first impressions through a discursive and interactional lens. Until now, social psychologists have studied first impressions in laboratory conditions, in isolation from their natural environment, thus overlooking their discursive roles as devices for managing situated interactional concerns. I examine fragments of text and talk in which individuals spontaneously invoke first impressions of other persons as part of assessment activities in settings where the authenticity of speakers' stances might be threatened: (1) in activities with inbuilt evaluative components and (2) in sequential contexts where recipients have been withholding affiliation to speakers' actions. I discuss the relationship between authenticity, as a type of credibility issue related to intersubjective trouble, and the characteristics of first impression assessments, which render them useful for dealing with this specific credibility concern. I identify four features of first impression assessments which make them effective in enhancing authenticity: witness positioning (Potter, 1996, Representing reality: Discourse, rhetoric and social construction, Sage, London), (dis)location in time and space, automaticity, and extreme formulations (Edwards, 2003, Analyzing race talk: Multidisciplinary perspectives on the research interview, Cambridge University Press, New York). © 2014 The British Psychological Society.
Eccles, B A; Klevecz, R R
1986-06-01
Mitotic frequency in a synchronous culture of mammalian cells was determined fully automatically and in real time using low-intensity phase-contrast microscopy and a Newvicon video camera connected to an EyeCom III image processor. Image samples, at a frequency of one per minute for 50 hours, were analyzed by first extracting the high-frequency picture components, then thresholding and probing for annular objects indicative of putative mitotic cells. Both the extraction of high-frequency components and the recognition of rings of varying radii and discontinuities employed novel algorithms. Spatial and temporal relationships between annuli were examined to discern the occurrences of mitoses, and such events were recorded in a computer data file. At present, the automatic analysis is suited for random cell proliferation rate measurements or cell cycle studies. The automatic identification of mitotic cells as described here provides a measure of the average proliferative activity of the cell population as a whole and eliminates more than eight hours of manual review per time-lapse video recording.
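The ring-probing step can be caricatured as follows: score a candidate centre and radius by comparing brightness on the circle against brightness inside it. This is a toy on a synthetic binary frame; the paper's actual algorithms for high-frequency extraction and for tolerating ring discontinuities are not reproduced.

```python
import numpy as np

def ring_score(img, cy, cx, r, width=1.0):
    """Mean brightness on a circle of radius r minus mean brightness inside it;
    high for annular (mitotic-cell-like) objects centred at (cy, cx)."""
    yy, xx = np.indices(img.shape)
    d = np.hypot(yy - cy, xx - cx)
    on_ring = np.abs(d - r) <= width
    inside = d < (r - width)
    return img[on_ring].mean() - (img[inside].mean() if inside.any() else 0.0)

# Synthetic binary frame containing one bright ring of radius 5 at (10, 10).
img = np.zeros((21, 21))
yy, xx = np.indices(img.shape)
img[np.abs(np.hypot(yy - 10, xx - 10) - 5) <= 1.0] = 1.0
print(ring_score(img, 10, 10, 5))   # → 1.0
```

Probing the same centre at the wrong radius (e.g. r=8) yields a low or negative score, which is what lets thresholding on this score separate annuli from blobs.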
NASA Astrophysics Data System (ADS)
Masson, Josiane; Soille, Pierre; Mueller, Rick
2004-10-01
In the context of the Common Agricultural Policy (CAP), the European Commission has a strong interest in counting and individually locating fruit trees. An automatic counting algorithm developed by the JRC (OLICOUNT) was used in the past for olive trees only, on 1 m black-and-white orthophotos, but with limits in the case of young trees or irregular groves. This study investigates the improvement of fruit tree identification using VHR images on a large set of data in three test sites: one in Crete (Greece); one in the south-east of France with a majority of olive trees and associated fruit trees; and the last one in Florida on citrus trees. OLICOUNT was compared with two other automatic tree-counting applications, one using the CRISP software on citrus trees and the other completely automatic, based on regional minima (morphological image analysis). Additional investigation was undertaken to refine the methods. This paper describes the automatic methods and presents the results derived from the tests.
Research into automatic recognition of joints in human symmetrical movements
NASA Astrophysics Data System (ADS)
Fan, Yifang; Li, Zhiyu
2008-03-01
High-speed photography is a major means of collecting data on human body movement. It enables the automatic identification of joints, which is of great significance for the research, treatment and rehabilitation of injuries, the analysis and diagnosis of sport techniques, and ergonomics. Based on the observation that the distance between adjacent joints remains constant when the joints are in planar motion, and on the laws of human joint movement (such as articular anatomy and kinematic constraints), a new approach is introduced to threshold images of joints filmed by a high-speed camera and to automatically identify and trace the joint points (by labeling markers at the joints). Owing to the closed shape of the marking points, automatic identification can be achieved through thresholding. Given the sampling frequency and the laws of human segment movement, once the marking points have been initialized, their automatic tracking can be achieved over the sequential images. Kinematic analysis of the tracking results against data from a three-dimensional force platform, exploiting the constraint that a body segment rotates only about its adjacent joint, indicates that the approach is valid.
NASA Astrophysics Data System (ADS)
Abdullah, Nurul Azma; Saidi, Md. Jamri; Rahman, Nurul Hidayah Ab; Wen, Chuah Chai; Hamid, Isredza Rahmi A.
2017-10-01
In practice, identification of criminals in Malaysia is done through thumbprint identification. However, this type of identification is constrained, as most criminals nowadays are careful not to leave their thumbprints at the scene. With the advent of security technology, cameras, especially CCTV, have been installed in many public and private areas for surveillance. CCTV footage can be used to identify suspects at the scene. However, because little software has been developed to automatically measure the similarity between faces in the footage and recorded photographs of criminals, law enforcement still relies on thumbprint identification. In this paper, an automated facial recognition system for a criminal database is proposed using the well-known Principal Component Analysis (PCA) approach. The system is able to detect and recognize faces automatically. This will help law enforcement to identify suspects when no thumbprint is present at the scene. The results show that about 80% of input photos can be matched with the template data.
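A minimal eigenfaces-style PCA pipeline of the kind this abstract describes can be sketched with an SVD: project the gallery into a low-dimensional subspace and match a probe by nearest neighbour. Toy random vectors stand in for face images here; the image size, component count, and matching rule are assumptions, not the paper's configuration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA via SVD on mean-centred data; rows of X are flattened images."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]          # principal axes ("eigenfaces")

def pca_project(X, mean, axes):
    return (X - mean) @ axes.T

def match(probe, gallery_coords, mean, axes):
    """Index of the gallery image nearest to the probe in PCA space."""
    c = pca_project(probe[None, :], mean, axes)
    return int(np.argmin(np.linalg.norm(gallery_coords - c, axis=1)))

# Toy gallery of 4 "faces" (random 8x8 images) plus a noisy probe of face 2.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(4, 64))
mean, axes = pca_fit(gallery, n_components=3)
coords = pca_project(gallery, mean, axes)
probe = gallery[2] + 0.05 * rng.normal(size=64)
print(match(probe, coords, mean, axes))     # → 2
```

In a real system the gallery rows would be aligned, normalized face crops, and a distance threshold would reject probes that match no one.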
Degraded Vowel Acoustics and the Perceptual Consequences in Dysarthria
NASA Astrophysics Data System (ADS)
Lansford, Kaitlin L.
Distorted vowel production is a hallmark characteristic of dysarthric speech, irrespective of the underlying neurological condition or dysarthria diagnosis. A variety of acoustic metrics have been used to study the nature of vowel production deficits in dysarthria; however, not all demonstrate sensitivity to the exhibited deficits. Less attention has been paid to quantifying the vowel production deficits associated with the specific dysarthrias. Attempts to characterize the relationship between naturally degraded vowel production in dysarthria with overall intelligibility have met with mixed results, leading some to question the nature of this relationship. It has been suggested that aberrant vowel acoustics may be an index of overall severity of the impairment and not an "integral component" of the intelligibility deficit. A limitation of previous work detailing perceptual consequences of disordered vowel acoustics is that overall intelligibility, not vowel identification accuracy, has been the perceptual measure of interest. A series of three experiments were conducted to address the problems outlined herein. The goals of the first experiment were to identify subsets of vowel metrics that reliably distinguish speakers with dysarthria from non-disordered speakers and differentiate the dysarthria subtypes. Vowel metrics that capture vowel centralization and reduced spectral distinctiveness among vowels differentiated dysarthric from non-disordered speakers. Vowel metrics generally failed to differentiate speakers according to their dysarthria diagnosis. The second and third experiments were conducted to evaluate the relationship between degraded vowel acoustics and the resulting percept. In the second experiment, correlation and regression analyses revealed vowel metrics that capture vowel centralization and distinctiveness and movement of the second formant frequency were most predictive of vowel identification accuracy and overall intelligibility. 
The third experiment was conducted to evaluate the extent to which the nature of the acoustic degradation predicts the resulting percept. Results suggest distinctive vowel tokens are better identified and, likewise, better-identified tokens are more distinctive. Further, an above-chance level agreement between nature of vowel misclassification and misidentification errors was demonstrated for all vowels, suggesting degraded vowel acoustics are not merely an index of severity in dysarthria, but rather are an integral component of the resultant intelligibility disorder.
RFID applications in transportation operation and intelligent transportation systems (ITS).
DOT National Transportation Integrated Search
2009-06-01
Radio frequency identification (RFID) transmits the identity of an object or a person wirelessly. It is grouped under the broad category of automatic identification technologies with corresponding standards and established protocols. RFID is suit...
MAC, A System for Automatically IPR Identification, Collection and Distribution
NASA Astrophysics Data System (ADS)
Serrão, Carlos
Controlling Intellectual Property Rights (IPR) in the digital world is a very hard challenge. The ease of creating multiple bit-by-bit identical copies of original IPR works creates opportunities for digital piracy. One of the industries most affected is the music industry, which has suffered huge losses in recent years as a result. Moreover, this is also affecting the way that music rights collecting and distributing societies operate to assure correct music IPR identification, collection and distribution. In this article a system for automating this IPR identification, collection and distribution is presented and described. The system makes use of an advanced automatic audio identification system based on audio fingerprinting technology. This paper presents the details of the system and a use-case scenario in which the system is being used.
DOT National Transportation Integrated Search
2014-05-01
The federally mandated materials clearance process requires state transportation agencies to subject all construction field samples to quality control/assurance testing in order to pass standardized state inspections....
An automatic system to detect and extract texts in medical images for de-identification
NASA Astrophysics Data System (ADS)
Zhu, Yingxuan; Singh, P. D.; Siddiqui, Khan; Gillam, Michael
2010-03-01
Recently, there is an increasing need to share medical images for research purposes. In order to respect and preserve patient privacy, most medical images are de-identified, with protected health information (PHI) removed, before research sharing. Since manual de-identification is time-consuming and tedious, an automatic de-identification system is necessary and helpful for removing text from medical images. Many papers have been written on algorithms for text detection and extraction; however, little of this work has been applied to de-identification of medical images. Since the de-identification system is designed for end-users, it should be effective, accurate and fast. This paper proposes an automatic system to detect and extract text from medical images for de-identification purposes, while keeping the anatomic structures intact. First, considering that text has a strong contrast with the background, a region-variance-based algorithm is used to detect the text regions. In post-processing, geometric constraints are applied to the detected text regions to eliminate over-segmentation, e.g., lines and anatomic structures. After that, a region-based level set method is used to extract text from the detected text regions. A GUI for the prototype application of the text detection and extraction system is implemented, which shows that our method can detect most of the text in the images. Experimental results validate that our method can detect and extract text in medical images with a 99% recall rate. Future research on this system includes algorithm improvement, performance evaluation, and computation optimization.
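The region-variance detection step can be illustrated in a few lines: burned-in text is high-contrast, so blocks with high local variance become candidates. This is a simplified sketch; the non-overlapping blocks and fixed threshold are assumptions, and the paper's geometric constraints and level-set extraction are omitted.

```python
import numpy as np

def local_variance(img, win=8):
    """Per-block variance over non-overlapping win x win blocks."""
    h, w = img.shape
    h, w = h - h % win, w - w % win
    blocks = img[:h, :w].reshape(h // win, win, w // win, win).swapaxes(1, 2)
    return blocks.var(axis=(2, 3))

def text_candidate_mask(img, win=8, thresh=0.05):
    """Blocks whose variance exceeds thresh are candidate text regions."""
    return local_variance(img, win) > thresh

# Synthetic image: flat background with one high-contrast "text" stripe.
img = np.zeros((32, 32))
img[8:16, :] = (np.arange(32) % 2).astype(float)   # alternating pixels mimic glyphs
mask = text_candidate_mask(img, win=8, thresh=0.05)
print(mask)   # only the row of blocks covering the stripe is True
```

On real scans, anatomy also has texture, which is why the abstract's geometric post-processing is needed to prune false candidates before extraction.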
Parallel Processing of Large Scale Microphone Arrays for Sound Capture
NASA Astrophysics Data System (ADS)
Jan, Ea-Ee.
1995-01-01
Performance of microphone sound pick up is degraded by deleterious properties of the acoustic environment, such as multipath distortion (reverberation) and ambient noise. The degradation becomes more prominent in a teleconferencing environment in which the microphone is positioned far away from the speaker. Besides, the ideal teleconference should feel as easy and natural as face-to-face communication with another person. This suggests hands-free sound capture with no tether or encumbrance by hand-held or body-worn sound equipment. Microphone arrays for this application represent an appropriate approach. This research develops new microphone array and signal processing techniques for high quality hands-free sound capture in noisy, reverberant enclosures. The new techniques combine matched-filtering of individual sensors and parallel processing to provide acute spatial volume selectivity which is capable of mitigating the deleterious effects of noise interference and multipath distortion. The new method outperforms traditional delay-and-sum beamformers which provide only directional spatial selectivity. The research additionally explores truncated matched-filtering and random distribution of transducers to reduce complexity and improve sound capture quality. All designs are first established by computer simulation of array performance in reverberant enclosures. The simulation is achieved by a room model which can efficiently calculate the acoustic multipath in a rectangular enclosure up to a prescribed order of images. It also calculates the incident angle of the arriving signal. Experimental arrays were constructed and their performance was measured in real rooms. Real room data were collected in a hard-walled laboratory and a controllable variable acoustics enclosure of similar size, approximately 6 x 6 x 3 m. An extensive speech database was also collected in these two enclosures for future research on microphone arrays. 
The simulation results are shown to be consistent with the real room data. Localization of sound sources has been explored using cross-power spectrum time delay estimation and has been evaluated using real room data under slightly, moderately and highly reverberant conditions. To improve the accuracy and reliability of the source localization, an outlier detector that removes incorrect time delay estimates has been developed. To provide speaker selectivity for microphone array systems, a hands-free speaker identification system has been studied. A recently developed feature using selected spectrum information outperforms traditional recognition methods. Measured results demonstrate the capabilities of speaker selectivity from a matched-filtered array. In addition, simulation utilities, including matched-filtering processing of the array and hands-free speaker identification, have been implemented on the massively-parallel nCube super-computer. This parallel computation highlights the requirements for real-time processing of array signals.
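Cross-power spectrum time delay estimation, the localization method named above, can be sketched as follows: multiply one channel's spectrum by the conjugate of the other's, inverse-transform, and take the lag of the correlation peak. Unit spectral weighting is used for simplicity (practical GCC systems often apply weightings such as PHAT), and the paper's outlier detector is omitted.

```python
import numpy as np

def cross_spectrum_delay(sig, ref, fs=1):
    """Delay of sig relative to ref, estimated from the cross-power spectrum."""
    n = len(sig) + len(ref)
    S = np.fft.rfft(sig, n) * np.conj(np.fft.rfft(ref, n))
    cc = np.fft.irfft(S, n)        # cross-correlation via inverse FFT
    lag = int(np.argmax(cc))
    if lag >= n // 2:              # indices past n//2 correspond to negative lags
        lag -= n
    return lag / fs

rng = np.random.default_rng(0)
ref = rng.normal(size=1024)                        # signal at one microphone
sig = np.concatenate((np.zeros(37), ref[:-37]))    # same source, 37 samples later
print(cross_spectrum_delay(sig, ref))              # → 37.0
```

Pairwise delays like this, taken across the array, are what a localizer triangulates; reverberation produces spurious peaks, which motivates the outlier rejection described in the abstract.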
Development of equally intelligible Telugu sentence-lists to test speech recognition in noise.
Tanniru, Kishore; Narne, Vijaya Kumar; Jain, Chandni; Konadath, Sreeraj; Singh, Niraj Kumar; Sreenivas, K J Ramadevi; K, Anusha
2017-09-01
To develop sentence lists in the Telugu language for the assessment of the speech recognition threshold (SRT) in the presence of background noise, through identification of the mean signal-to-noise ratio required to attain a 50% sentence recognition score (SRTn). This study was conducted in three phases. The first phase involved the selection and recording of Telugu sentences. In the second phase, 20 lists, each consisting of 10 sentences of equal intelligibility, were formulated using a numerical optimisation procedure. In the third phase, the SRTn of the developed lists was estimated using adaptive procedures on individuals with normal hearing. A total of 68 native Telugu speakers with normal hearing participated in the study. Of these, 18 (including the speakers) performed various subjective measures in the first phase, 20 performed sentence/word recognition in noise for the second phase, and 30 participated in the list equivalency procedures in the third phase. In all, 15 lists of comparable difficulty were formulated as test material. The mean SRTn across these lists corresponded to -2.74 dB (SD = 0.21). The developed sentence lists provide a valid and reliable tool to measure SRTn in native Telugu speakers.
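An adaptive procedure of the kind used to estimate SRTn can be sketched as a simple 1-up/1-down staircase run against a simulated listener. The step size, trial count, reversal averaging, and the logistic listener below are all assumptions for illustration, not the study's exact protocol.

```python
import math
import random

def adaptive_srt(respond, start_snr=4.0, step=2.0, n_trials=30):
    """1-up/1-down staircase converging on the SNR giving 50% correct (SRTn).
    respond(snr) returns True when the sentence is repeated correctly."""
    snr, reversals, prev = start_snr, [], None
    for _ in range(n_trials):
        correct = respond(snr)
        if prev is not None and correct != prev:
            reversals.append(snr)           # response flipped: record a reversal
        prev = correct
        snr += -step if correct else step   # harder after a hit, easier after a miss
    last = reversals[-6:] or [snr]
    return sum(last) / len(last)            # SRTn: mean of the last reversal SNRs

# Simulated listener with a logistic psychometric function centred at -2.7 dB.
random.seed(3)
listener = lambda snr: random.random() < 1 / (1 + math.exp(-(snr + 2.7)))
est = adaptive_srt(listener)
print(round(est, 1))
```

Averaging the later reversals discards the initial descent from the easy starting SNR, so the estimate reflects only the region where performance hovers near 50%.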
Automatic building identification under bomb damage conditions
NASA Astrophysics Data System (ADS)
Woodley, Robert; Noll, Warren; Barker, Joseph; Wunsch, Donald C., II
2009-05-01
Given the vast amount of image intelligence utilized in support of planning and executing military operations, a passive automated image processing capability for target identification is urgently required. Furthermore, transmitting large image streams from remote locations would quickly consume the available bandwidth (BW), precipitating the need for processing to occur at the sensor location. This paper addresses the problem of automatic target recognition for battle damage assessment (BDA). We utilize an Adaptive Resonance Theory approach to cluster templates of target buildings. The results show that the network successfully distinguishes targets from non-targets in a virtual test bed environment.
Deem, J F; Manning, W H; Knack, J V; Matesich, J S
1989-09-01
A program for the automatic extraction of jitter (PAEJ) was developed for the clinical measurement of pitch perturbations using a microcomputer. The program currently includes 12 implementations of an algorithm for marking the boundary criteria for a fundamental period of vocal fold vibration. The relative sensitivity of these extraction procedures for identifying the pitch period was compared using sine waves. Data obtained to date provide information for each procedure concerning the effects of waveform peakedness and slope, sample duration in cycles, noise level of the analysis system with both direct and tape recorded input, and the influence of interpolation. Zero crossing extraction procedures provided lower jitter values regardless of sine wave frequency or sample duration. The procedures making use of positive- or negative-going zero crossings with interpolation provided the lowest measures of jitter with the sine wave stimuli. Pilot data obtained with normal-speaking adults indicated that jitter measures varied as a function of the speaker, vowel, and sample duration.
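A zero-crossing period extractor with interpolation, of the kind PAEJ compares, can be sketched as follows: mark positive-going zero crossings, refine each crossing instant by linear interpolation, and compute period-to-period jitter. This is a minimal illustration; the boundary criteria and jitter formula variants differ across the program's 12 implementations.

```python
import numpy as np

def jitter_percent(x, fs):
    """Mean absolute period-to-period perturbation (%) from positive-going
    zero crossings, with linear interpolation of each crossing instant."""
    idx = np.where((x[:-1] < 0) & (x[1:] >= 0))[0]
    # interpolate the exact crossing time between samples idx and idx + 1
    t = (idx - x[idx] / (x[idx + 1] - x[idx])) / fs
    periods = np.diff(t)
    return 100 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)

fs = 44100
t = np.arange(0, 0.5, 1 / fs)
pure = np.sin(2 * np.pi * 100 * t)      # steady 100 Hz tone: near-zero jitter
print(jitter_percent(pure, fs))
```

Consistent with the abstract's findings, the interpolation step matters: without it, crossing times are quantized to the sample grid and the measured jitter floor rises with the analysis system's noise.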
An investigation of articulatory setting using real-time magnetic resonance imaging
Ramanarayanan, Vikram; Goldstein, Louis; Byrd, Dani; Narayanan, Shrikanth S.
2013-01-01
This paper presents an automatic procedure to analyze articulatory setting in speech production using real-time magnetic resonance imaging of the moving human vocal tract. The procedure extracts frames corresponding to inter-speech pauses, speech-ready intervals and absolute rest intervals from magnetic resonance imaging sequences of read and spontaneous speech elicited from five healthy speakers of American English and uses automatically extracted image features to quantify vocal tract posture during these intervals. Statistical analyses show significant differences between vocal tract postures adopted during inter-speech pauses and those at absolute rest before speech; the latter also exhibits a greater variability in the adopted postures. In addition, the articulatory settings adopted during inter-speech pauses in read and spontaneous speech are distinct. The results suggest that adopted vocal tract postures differ on average during rest positions, ready positions and inter-speech pauses, and might, in that order, involve an increasing degree of active control by the cognitive speech planning mechanism. PMID:23862826
Multimodal fusion of polynomial classifiers for automatic person recognition
NASA Astrophysics Data System (ADS)
Broun, Charles C.; Zhang, Xiaozheng
2001-03-01
With the prevalence of the information age, privacy and personalization are at the forefront in today's society. As such, biometrics are viewed as essential components of current evolving technological systems. Consumers demand unobtrusive and non-invasive approaches. In our previous work, we have demonstrated a speaker verification system that meets these criteria. However, there are additional constraints for fielded systems. The required recognition transactions are often performed in adverse environments and across diverse populations, necessitating robust solutions. There are two significant problem areas in current-generation speaker verification systems. The first is the difficulty in acquiring clean audio signals in all environments without encumbering the user with a head-mounted close-talking microphone. Second, unimodal biometric systems do not work with a significant percentage of the population. To combat these issues, multimodal techniques are being investigated to improve system robustness to environmental conditions, as well as improve overall accuracy across the population. We propose a multimodal approach that builds on our current state-of-the-art speaker verification technology. In order to maintain the transparent nature of the speech interface, we focus on optical sensing technology to provide the additional modality, giving us an audio-visual person recognition system. For the audio domain, we use our existing speaker verification system. For the visual domain, we focus on lip motion. This is chosen, rather than static face or iris recognition, because it provides dynamic information about the individual. In addition, the lip dynamics can aid speech recognition to provide liveness testing. The visual processing method makes use of both color and edge information, combined within a Markov random field (MRF) framework, to localize the lips. Geometric features are extracted and input to a polynomial classifier for the person recognition process.
A late integration approach, based on a probabilistic model, is employed to combine the two modalities. The system is tested on the XM2VTS database combined with AWGN in the audio domain over a range of signal-to-noise ratios.
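The late-integration step can be illustrated with a small sketch. The weighted log-likelihood combination below is a common stand-in for the probabilistic fusion model the authors employ; the weight `alpha` and the toy per-class scores are assumptions for illustration only.

```python
import numpy as np

def late_fusion(audio_loglik, visual_loglik, alpha=0.7):
    """Reliability-weighted late integration of two modality scores.

    audio_loglik / visual_loglik: per-class log-likelihoods from the
    audio and lip-motion classifiers. alpha is the trust placed in the
    audio stream (a tunable assumption here; in practice it could be
    tied to the measured signal-to-noise ratio).
    Returns the index of the winning class (person identity).
    """
    fused = alpha * np.asarray(audio_loglik) + (1 - alpha) * np.asarray(visual_loglik)
    return int(np.argmax(fused))

# Audio is ambiguous between identities 0 and 1; lip motion breaks the tie.
audio = [-1.0, -1.1, -5.0]
visual = [-4.0, -0.5, -4.5]
decision = late_fusion(audio, visual)
```

Lowering `alpha` as the audio SNR degrades is what lets such a system stay robust across the noise range reported for the XM2VTS experiments.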
Computer-Mediated Assessment of Intelligibility in Aphasia and Apraxia of Speech
Haley, Katarina L.; Roth, Heidi; Grindstaff, Enetta; Jacks, Adam
2011-01-01
Background: Previous work indicates that single word intelligibility tests developed for dysarthria are sensitive to segmental production errors in aphasic individuals with and without apraxia of speech (AOS). However, potential listener learning effects and difficulties adapting elicitation procedures to coexisting language impairments limit their applicability to left hemisphere stroke survivors. Aims: The main purpose of this study was to examine basic psychometric properties for a new monosyllabic intelligibility test developed for individuals with aphasia and/or AOS. A related purpose was to examine clinical feasibility and the potential to standardize a computer-mediated administration approach. Methods & Procedures: A 600-item monosyllabic single word intelligibility test was constructed by assembling sets of phonetically similar words. Custom software was used to select 50 target words from this test in a pseudo-random fashion and to elicit and record production of these words by 23 speakers with aphasia and 20 neurologically healthy participants. To evaluate test-retest reliability, two identical sets of 50-word lists were elicited by requesting repetition after a live speaker model. To examine the effect of a different word set and auditory model, an additional set of 50 different words was elicited with a pre-recorded model. The recorded words were presented to normal-hearing listeners for identification via orthographic and multiple-choice response formats. To examine construct validity, production accuracy for each speaker was estimated via phonetic transcription and rating of overall articulation. Outcomes & Results: Recording and listening tasks were completed in less than six minutes for all speakers and listeners. Aphasic speakers were significantly less intelligible than neurologically healthy speakers and displayed a wide range of intelligibility scores. Test-retest and inter-listener reliability estimates were strong.
No significant difference was found in scores based on recordings from a live model versus a pre-recorded model, but some individual speakers favored the live model. Intelligibility test scores correlated highly with segmental accuracy derived from broad phonetic transcription of the same speech sample and a motor speech evaluation. Scores correlated moderately with rated articulation difficulty. Conclusions: We describe a computerized, single-word intelligibility test that yields clinically feasible, reliable, and valid measures of segmental speech production in adults with aphasia. This tool can be used in clinical research to facilitate appropriate participant selection and to establish matching across comparison groups. For a majority of speakers, elicitation procedures can be standardized by using a pre-recorded auditory model for repetition. This assessment tool has potential utility for both clinical assessment and outcomes research. PMID:22215933
Investigation of an automatic trim algorithm for restructurable aircraft control
NASA Technical Reports Server (NTRS)
Weiss, J.; Eterno, J.; Grunberg, D.; Looze, D.; Ostroff, A.
1986-01-01
This paper develops and solves an automatic trim problem for restructurable aircraft control. The trim solution is applied as a feed-forward control to reject measurable disturbances following control element failures. Disturbance rejection and command following performances are recovered through the automatic feedback control redesign procedure described by Looze et al. (1985). For this project the existence of a failure detection mechanism is assumed, and methods to cope with potential detection and identification inaccuracies are addressed.
Automatic Identification and Organization of Index Terms for Interactive Browsing.
ERIC Educational Resources Information Center
Wacholder, Nina; Evans, David K.; Klavans, Judith L.
The potential of automatically generated indexes for information access has been recognized for several decades, but the quantity of text and the ambiguity of natural language processing have made progress at this task more difficult than was originally foreseen. Recently, a body of work on development of interactive systems to support phrase…
Video-assisted segmentation of speech and audio track
NASA Astrophysics Data System (ADS)
Pandit, Medha; Yusoff, Yusseri; Kittler, Josef; Christmas, William J.; Chilton, E. H. S.
1999-08-01
Video database research is commonly concerned with the storage and retrieval of visual information involving sequence segmentation, shot representation and video clip retrieval. In multimedia applications, video sequences are usually accompanied by a sound track. The sound track contains potential cues to aid shot segmentation, such as different speakers, background music, singing and distinctive sounds. These different acoustic categories can be modeled to allow for effective database retrieval. In this paper, we address the problem of automatic segmentation of the audio track of multimedia material. This audio-based segmentation can be combined with video scene shot detection in order to achieve partitioning of the multimedia material into semantically significant segments.
Kouider, Sid; Dupoux, Emmanuel
2005-08-01
We present a novel subliminal priming technique that operates in the auditory modality. Masking is achieved by hiding a spoken word within a stream of time-compressed speechlike sounds with similar spectral characteristics. Participants were unable to consciously identify the hidden words, yet reliable repetition priming was found. This effect was unaffected by a change in the speaker's voice and remained restricted to lexical processing. The results show that the speech modality, like the written modality, involves the automatic extraction of abstract word-form representations that do not include nonlinguistic details. In both cases, priming operates at the level of discrete and abstract lexical entries and is little influenced by overlap in form or semantics.
Heinrich, Andreas; Güttler, Felix; Wendt, Sebastian; Schenkl, Sebastian; Hubig, Michael; Wagner, Rebecca; Mall, Gita; Teichgräber, Ulf
2018-06-18
In forensic odontology, the comparison between antemortem and postmortem panoramic radiographs (PRs) is a reliable method for person identification. The purpose of this study was to improve and automate identification of unknown people by comparison between antemortem and postmortem PRs using computer vision. The study includes 43 467 PRs from 24 545 patients (46% female / 54% male). All PRs were filtered and evaluated with Matlab R2014b, including the Image Processing and Computer Vision System toolboxes. The matching process used SURF features to find the corresponding points between two PRs (unknown person and database entry) out of the whole database. Of 40 randomly selected persons, 34 (85%) could be reliably identified by corresponding PR matching points between an already existing scan in the database and the most recent PR. The systematic matching yielded a maximum of 259 points for a successful identification between two different PRs of the same person and a maximum of 12 corresponding matching points for other, non-identical persons in the database. Hence, 12 matching points is the threshold for reliable assignment. An automatic PR system with computer vision could be a successful and reliable tool for identification purposes. The applied method distinguishes itself by its fast and reliable identification of persons by PR. This identification method is suitable even if dental characteristics were removed or added in the past. The system seems to be robust for large amounts of data. · Computer vision allows an automated antemortem and postmortem comparison of panoramic radiographs (PRs) for person identification. · The present method is able to find identical matching partners among huge datasets (big data) in a short computing time. · The identification method is suitable even if dental characteristics were removed or added. · Heinrich A, Güttler F, Wendt S et al.
Forensic Odontology: Automatic Identification of Persons Comparing Antemortem and Postmortem Panoramic Radiographs Using Computer Vision. Fortschr Röntgenstr 2018; DOI: 10.1055/a-0632-4744. © Georg Thieme Verlag KG Stuttgart · New York.
Cohen, Aaron M
2008-01-01
We participated in the i2b2 smoking status classification challenge task. The purpose of this task was to evaluate the ability of systems to automatically identify patient smoking status from discharge summaries. Our submission included several techniques that we compared and studied, including hot-spot identification, zero-vector filtering, inverse class frequency weighting, error-correcting output codes, and post-processing rules. We evaluated our approaches using the same methods as the i2b2 task organizers, using micro- and macro-averaged F1 as the primary performance metric. Our best performing system achieved a micro-F1 of 0.9000 on the test collection, equivalent to the best performing system submitted to the i2b2 challenge. Hot-spot identification, zero-vector filtering, classifier weighting, and error correcting output coding contributed additively to increased performance, with hot-spot identification having by far the largest positive effect. High performance on automatic identification of patient smoking status from discharge summaries is achievable with the efficient and straightforward machine learning techniques studied here.
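Hot-spot identification, the technique with the largest positive effect, can be sketched as keyword-anchored windowing: the classifier sees only text near smoking-related trigger terms rather than the whole discharge summary. The trigger list and window size below are illustrative guesses, not the submission's actual configuration; a document yielding no hot spots produces an all-zero feature vector, which the zero-vector filtering step then handles.

```python
import re

# Hypothetical trigger terms; the real system's lexicon may differ.
SMOKING_TRIGGERS = re.compile(r"smok|tobacc|cigarett|nicotine", re.IGNORECASE)

def extract_hot_spots(summary, window=5):
    """Return word windows centered on smoking-related trigger terms."""
    words = summary.split()
    spots = []
    for i, w in enumerate(words):
        if SMOKING_TRIGGERS.search(w):
            spots.append(" ".join(words[max(0, i - window): i + window + 1]))
    return spots

note = ("Admitted for chest pain. Patient quit smoking two years ago. "
        "No alcohol use. Family history of diabetes.")
spots = extract_hot_spots(note)
```

Only the extracted windows would then be vectorized and fed to the weighted classifier ensemble.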
Natural language processing of spoken diet records (SDRs).
Lacson, Ronilda; Long, William
2006-01-01
Dietary assessment is a fundamental aspect of nutritional evaluation that is essential for management of obesity as well as for assessing dietary impact on chronic diseases. Various methods have been used for dietary assessment including written records, 24-hour recalls, and food frequency questionnaires. The use of mobile phones to provide real-time dietary records provides potential advantages for accessibility, ease of use and automated documentation. However, understanding even a perfect transcript of spoken dietary records (SDRs) is challenging for people. This work presents a first step towards automatic analysis of SDRs. Our approach consists of four steps - identification of food items, identification of food quantifiers, classification of food quantifiers and temporal annotation. Our method enables automatic extraction of dietary information from SDRs, which in turn allows automated mapping to a Diet History Questionnaire dietary database. Our model has an accuracy of 90%. This work demonstrates the feasibility of automatically processing SDRs.
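The first three steps of the pipeline (food item identification, quantifier identification, quantifier classification) can be sketched with small lookup lexicons; the fourth step, temporal annotation, is omitted here. The mini-lexicons below are hypothetical stand-ins for the much larger Diet History Questionnaire vocabulary the paper maps to.

```python
import re

# Hypothetical mini-lexicons; the real vocabularies are far larger.
FOODS = {"apple", "rice", "chicken", "milk", "bread"}
QUANTIFIERS = {"cup": "volume", "cups": "volume", "slice": "count",
               "slices": "count", "bowl": "volume", "glass": "volume"}

def parse_sdr(utterance):
    """Identify food items and quantifiers in one spoken diet record,
    and classify each quantifier (volume vs. count)."""
    tokens = re.findall(r"[a-z]+", utterance.lower())
    foods = [t for t in tokens if t in FOODS]
    quants = [(t, QUANTIFIERS[t]) for t in tokens if t in QUANTIFIERS]
    return {"foods": foods, "quantifiers": quants}

record = parse_sdr("I had two slices of bread and a glass of milk this morning")
```

A phrase like "this morning" would be handled by the temporal annotation step not shown in this sketch.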
Automatic identification of artifacts in electrodermal activity data.
Taylor, Sara; Jaques, Natasha; Chen, Weixuan; Fedor, Szymon; Sano, Akane; Picard, Rosalind
2015-01-01
Recently, wearable devices have allowed for long term, ambulatory measurement of electrodermal activity (EDA). Despite the fact that ambulatory recording can be noisy, and recording artifacts can easily be mistaken for a physiological response during analysis, to date there is no automatic method for detecting artifacts. This paper describes the development of a machine learning algorithm for automatically detecting EDA artifacts, and provides an empirical evaluation of classification performance. We have encoded our results into a freely available web-based tool for artifact and peak detection.
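The windowed-feature approach can be sketched as follows. The published system trains machine learning classifiers on features computed over short EDA windows; here a single threshold on the sharpest sample-to-sample jump stands in for that trained classifier, and both the feature choices and the threshold are illustrative assumptions.

```python
import numpy as np

def eda_features(window):
    """Shape features for one EDA window (feature choice is illustrative)."""
    w = np.asarray(window, dtype=float)
    d = np.diff(w)
    return np.array([w.max() - w.min(),   # amplitude range
                     np.abs(d).max(),     # sharpest jump
                     np.std(d)])          # overall roughness

def is_artifact(window, jump_threshold=0.1):
    """Flag windows with a physiologically implausible jump.

    A trained classifier (e.g., an SVM over many such features) would
    replace this single-threshold rule in a real system.
    """
    return eda_features(window)[1] > jump_threshold

t = np.linspace(0, 5, 200)
smooth_scr = 0.4 + 0.2 * np.exp(-((t - 2.0) ** 2))   # plausible skin response
motion_spike = smooth_scr.copy()
motion_spike[100] += 1.0                              # abrupt electrode artifact
```

Genuine skin conductance responses rise over seconds, so a large single-sample jump is a reliable artifact cue.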
Management of natural resources through automatic cartographic inventory
NASA Technical Reports Server (NTRS)
Rey, P.; Gourinard, Y.; Cambou, F. (Principal Investigator)
1973-01-01
The author has identified the following significant results. Significant results of the ARNICA program from August 1972 - January 1973 have been: (1) establishment of image to object correspondence codes for all types of soil use and forestry in northern Spain; (2) establishment of a transfer procedure between qualitative (remote identification and remote interpretation) and quantitative (numerization, storage, automatic statistical cartography) use of images; (3) organization of microdensitometric data processing and automatic cartography software; and (4) development of a system for measuring reflectance simultaneous with imagery.
Go, Taesik; Byeon, Hyeokjun; Lee, Sang Joon
2018-04-30
Cell types of erythrocytes should be identified because they are closely related to their functionality and viability. Conventional methods for classifying erythrocytes are time consuming and labor intensive. Therefore, an automatic and accurate erythrocyte classification system is indispensable in healthcare and biomedical fields. In this study, we proposed a new label-free sensor for automatic identification of erythrocyte cell types using a digital in-line holographic microscopy (DIHM) combined with machine learning algorithms. A total of 12 features, including information on intensity distributions, morphological descriptors, and optical focusing characteristics, is quantitatively obtained from numerically reconstructed holographic images. All individual features for discocytes, echinocytes, and spherocytes are statistically different. To improve the performance of cell type identification, we adopted several machine learning algorithms, such as decision tree model, support vector machine, linear discriminant classification, and k-nearest neighbor classification. With the aid of these machine learning algorithms, the extracted features are effectively utilized to distinguish erythrocytes. Among the four tested algorithms, the decision tree model exhibits the best identification performance for the training sets (n = 440, 98.18%) and test sets (n = 190, 97.37%). This proposed methodology, which smartly combined DIHM and machine learning, would be helpful for sensing abnormal erythrocytes and computer-aided diagnosis of hematological diseases in clinic. Copyright © 2017 Elsevier B.V. All rights reserved.
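The final classification stage can be sketched with scikit-learn's DecisionTreeClassifier standing in for the paper's best-performing decision tree model. The two synthetic features and their cluster values below are invented stand-ins for the 12 holographic features extracted from the reconstructed images.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Invented stand-ins for two holographic features: a size-like measure
# and a sphericity-like focus metric, per cell type.
def make_class(n, size_mu, sphericity_mu):
    return np.column_stack([rng.normal(size_mu, 2.0, n),
                            rng.normal(sphericity_mu, 0.03, n)])

X = np.vstack([make_class(50, 50, 0.70),   # discocytes
               make_class(50, 45, 0.85),   # echinocytes
               make_class(50, 35, 0.95)])  # spherocytes
y = np.repeat([0, 1, 2], 50)

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
train_acc = clf.score(X, y)
```

With well-separated feature clusters a shallow tree suffices, which mirrors why the tree outperformed the other three classifiers tested.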
The impact of musical training and tone language experience on talker identification
Xie, Xin; Myers, Emily
2015-01-01
Listeners can use pitch changes in speech to identify talkers. Individuals exhibit large variability in sensitivity to pitch and in accuracy perceiving talker identity. In particular, people who have musical training or long-term tone language use are found to have enhanced pitch perception. In the present study, the influence of pitch experience on talker identification was investigated as listeners identified talkers in native language as well as non-native languages. Experiment 1 was designed to explore the influence of pitch experience on talker identification in two groups of individuals with potential advantages for pitch processing: musicians and tone language speakers. Experiment 2 further investigated individual differences in pitch processing and the contribution to talker identification by testing a mediation model. Cumulatively, the results suggested that (a) musical training confers an advantage for talker identification, supporting a shared resources hypothesis regarding music and language and (b) linguistic use of lexical tones also increases accuracy in hearing talker identity. Importantly, these two types of hearing experience enhance talker identification by sharpening pitch perception skills in a domain-general manner. PMID:25618071
Smits-Bandstra, Sarah; De Nil, Luc F
2007-01-01
The basal ganglia and cortico-striato-thalamo-cortical connections are known to play a critical role in sequence skill learning and increasing automaticity over practice. The current paper reviews four studies comparing the sequence skill learning and the transition to automaticity of persons who stutter (PWS) and fluent speakers (PNS) over practice. Studies One and Two found PWS to have poor finger tap sequencing skill and nonsense syllable sequencing skill after practice, and on retention and transfer tests relative to PNS. Studies Three and Four found PWS to be significantly less accurate and/or significantly slower after practice on dual tasks requiring concurrent sequencing and colour recognition over practice relative to PNS. Evidence of PWS' deficits in sequence skill learning and automaticity development support the hypothesis that dysfunction in cortico-striato-thalamo-cortical connections may be one etiological component in the development and maintenance of stuttering. As a result of this activity, the reader will: (1) be able to articulate the research regarding the basal ganglia system relating to sequence skill learning; (2) be able to summarize the research on stuttering with indications of sequence skill learning deficits; and (3) be able to discuss basal ganglia mechanisms with relevance for theory of stuttering.
48 CFR 252.211-7003 - Item identification and valuation.
Code of Federal Regulations, 2013 CFR
2013-10-01
..., used to retrieve data encoded on machine-readable media. Concatenated unique item identifier means— (1... (or controlling) authority for the enterprise identifier. Item means a single hardware article or a...-readable means an automatic identification technology media, such as bar codes, contact memory buttons...
Automatic Adviser on Mobile Objects Status Identification and Classification
NASA Astrophysics Data System (ADS)
Shabelnikov, A. N.; Liabakh, N. N.; Gibner, Ya M.; Saryan, A. S.
2018-05-01
A mobile object status identification task is defined within image discrimination theory. It is proposed to classify objects into three classes: the object is operational; the object requires maintenance; and the object should be removed from the production process. Two methods were developed to construct the separating boundaries between the designated classes: (a) using statistical information on the objects' executed movements, and (b) based on regulatory documents and expert commentary. A simulation of Automatic Adviser operation and a complex for analyzing the operation results were synthesized. The research results are illustrated with a specific example of cuts rolling from the hump yard. The work was supported by the Russian Fundamental Research Fund, project No. 17-20-01040.
Automatic measurement of images on astrometric plates
NASA Astrophysics Data System (ADS)
Ortiz Gil, A.; Lopez Garcia, A.; Martinez Gonzalez, J. M.; Yershov, V.
1994-04-01
We present some results on the process of automatic detection and measurement of objects in overlapped fields of astrometric plates. The main steps of our algorithm are the following: determination of the scale and tilt between the charge-coupled device (CCD) and microscope coordinate systems, and estimation of the signal-to-noise ratio in each field; image identification and improvement of its position and size; final image centering; and image selection and storage. Several parameters allow the use of variable criteria for image identification, characterization and selection. Problems related to faint images and crowded fields will be approached by special techniques (morphological filters, histogram properties and fitting models).
Automatic vasculature identification in coronary angiograms by adaptive geometrical tracking.
Xiao, Ruoxiu; Yang, Jian; Goyal, Mahima; Liu, Yue; Wang, Yongtian
2013-01-01
Owing to the uneven distribution of contrast agents and the perspective projection of X-ray imaging, the vasculature in an angiographic image has low contrast and is generally superimposed on other organic tissues; it is therefore very difficult to identify the vasculature and quantitatively estimate blood flow directly from angiographic images. In this paper, we propose a fully automatic algorithm named adaptive geometrical vessel tracking (AGVT) for coronary artery identification in X-ray angiograms. Initially, a ridge enhancement (RE) image is obtained using multiscale Hessian information. Then, automatic initialization procedures, including seed point detection and initial direction determination, are performed on the RE image. The extracted ridge points are adjusted to geometrical centerline points adaptively through diameter estimation. Bifurcations are identified by discriminating the connectivity of the tracked ridge points. Finally, all the tracked centerlines are merged and smoothed by classifying the connected components of the vascular structures. Synthetic angiographic images and clinical angiograms are used to evaluate the performance of the proposed algorithm, which is compared with two other vascular tracking techniques in terms of efficiency and accuracy, demonstrating successful application of the proposed segmentation and extraction scheme to vasculature identification.
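The ridge enhancement step can be sketched at a single scale: across a bright tubular vessel the Hessian has one large negative eigenvalue, so clipping the negated minimum eigenvalue at zero highlights ridges. The paper's version is multiscale; this one-scale, finite-difference form is a simplification for illustration.

```python
import numpy as np

def ridge_enhance(image):
    """Single-scale Hessian ridge measure for bright tubular structures.

    Returns max(-lambda_min, 0) per pixel, where lambda_min is the
    smaller eigenvalue of the 2x2 Hessian estimated by finite differences.
    """
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)          # first derivatives (rows, cols)
    gyy, gyx = np.gradient(gy)         # second derivatives
    gxy, gxx = np.gradient(gx)
    # eigenvalues of [[gxx, gxy], [gxy, gyy]] via trace/determinant
    tr = gxx + gyy
    det = gxx * gyy - gxy * gxy
    disc = np.sqrt(np.maximum((tr / 2) ** 2 - det, 0.0))
    lam_min = tr / 2 - disc
    return np.maximum(-lam_min, 0.0)

# A bright horizontal line on a dark background should light up strongly.
img = np.zeros((21, 21))
img[10, :] = 1.0
R = ridge_enhance(img)
```

A multiscale version would repeat this after Gaussian smoothing at several scales and keep the per-pixel maximum response.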
Lee, Kichol; Casali, John G
2016-01-01
To investigate the effect of controlled low-speed wind-noise on the auditory situation awareness performance afforded by military hearing protection/enhancement devices (HPED) and tactical communication and protective systems (TCAPS). Recognition/identification and pass-through communications tasks were separately conducted under three wind conditions (0, 5, and 10 mph). Subjects wore two in-ear-type TCAPS, one earmuff-type TCAPS, a Combat Arms Earplug in its 'open' or pass-through setting, and an EB-15LE electronic earplug. Devices with electronic gain systems were tested under two gain settings: 'unity' and 'max'. Testing without any device (open ear) was conducted as a control. Ten subjects were recruited from the student population at Virginia Tech. Audiometric requirements were 25 dBHL or better at 500, 1000, 2000, 4000, and 8000 Hz in both ears. Performance on the interaction of communication task-by-device was significantly different only in 0 mph wind speed. The between-device performance differences varied with azimuthal speaker locations. It is evident from this study that stable (non-gusting) wind speeds up to 10 mph did not significantly degrade recognition/identification task performance and pass-through communication performance of the group of HPEDs and TCAPS tested. However, the various devices performed differently as the test sound signal speaker location was varied and it appears that physical as well as electronic features may have contributed to this directional result.
Tone classification of syllable-segmented Thai speech based on multilayer perceptron
NASA Astrophysics Data System (ADS)
Satravaha, Nuttavudh; Klinkhachorn, Powsiri; Lass, Norman
2002-05-01
Thai is a monosyllabic tonal language that uses tone to convey lexical information about the meaning of a syllable. Thus to completely recognize a spoken Thai syllable, a speech recognition system not only has to recognize a base syllable but also must correctly identify a tone. Hence, tone classification of Thai speech is an essential part of a Thai speech recognition system. Thai has five distinctive tones (``mid,'' ``low,'' ``falling,'' ``high,'' and ``rising'') and each tone is represented by a single fundamental frequency (F0) pattern. However, several factors, including tonal coarticulation, stress, intonation, and speaker variability, affect the F0 pattern of a syllable in continuous Thai speech. In this study, an efficient method for tone classification of syllable-segmented Thai speech, which incorporates the effects of tonal coarticulation, stress, and intonation, as well as a method to perform automatic syllable segmentation, were developed. Acoustic parameters were used as the main discriminating parameters. The F0 contour of a segmented syllable was normalized by using a z-score transformation before being presented to a tone classifier. The proposed system was evaluated on 920 test utterances spoken by 8 speakers. A recognition rate of 91.36% was achieved by the proposed system.
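The z-score transformation applied to the F0 contour can be sketched directly: it makes the classifier input invariant to a speaker's overall pitch level and range, so the tone classifier sees contour shape rather than absolute frequency. The falling contours below are invented values for illustration.

```python
import numpy as np

def znorm_f0(contour):
    """z-score normalize an F0 contour (Hz) before tone classification."""
    c = np.asarray(contour, dtype=float)
    return (c - c.mean()) / c.std()

# The same falling-tone shape produced by a low- and a high-pitched speaker
# normalizes to an identical contour:
low_voice = np.array([140.0, 130.0, 115.0, 100.0])
high_voice = np.array([280.0, 260.0, 230.0, 200.0])
same_shape = np.allclose(znorm_f0(low_voice), znorm_f0(high_voice))
```

The normalized contour, possibly with added coarticulation and stress features, is what would be presented to the multilayer perceptron.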
Vanrell, Maria del Mar; Mascaró, Ignasi; Torres-Tamarit, Francesc; Prieto, Pilar
2013-06-01
Recent studies in the field of intonational phonology have shown that information-seeking questions can be distinguished from confirmation-seeking questions by prosodic means in a variety of languages (Armstrong, 2010, for Puerto Rican Spanish; Grice & Savino, 1997, for Bari Italian; Kügler, 2003, for Leipzig German; Mata & Santos, 2010, for European Portuguese; Vanrell, Mascaró, Prieto, & Torres-Tamarit, 2010, for Catalan). However, all these studies have relied on production experiments, and little is known about the perceptual relevance of these intonational cues. This paper explores whether Majorcan Catalan listeners distinguish information- and confirmation-seeking questions by means of two distinct nuclear falling pitch accents. Three behavioral tasks were conducted with 20 Majorcan Catalan subjects, namely a semantic congruity test, a rating test, and a classical categorical perception identification/discrimination test. The results show that a difference in pitch scaling on the leading H tone of the H+L* nuclear pitch accent is the main cue used by Majorcan Catalan listeners to distinguish confirmation questions from information-seeking questions. Thus, while an ¡H+L* pitch accent signals an information-seeking question (i.e., the speaker has no expectation about the nature of the answer), the H+L* pitch accent indicates that the speaker is asking about mutually shared information. We argue that these results have implications for the representation of tonal height distinctions in Catalan. The results also support the claim that phonological contrasts in intonation, together with other linguistic strategies, can signal the speakers' beliefs about the certainty of the proposition expressed.
ERIC Educational Resources Information Center
Human Engineering Inst., Cleveland, OH.
This module of a 25-module course is designed to develop an understanding of the operation and maintenance of specific models of automatic transmissions used on diesel powered vehicles. Topics are (1) general specification data, (2) options for various applications, (3) road test instructions, (4) identification and specification data, (5) Allison…
Code of Federal Regulations, 2010 CFR
2010-04-01
... reports with appropriate statistical methodology in accordance with § 820.100. (c) Each manufacturer who... chapter shall automatically consider the report a complaint and shall process it in accordance with the... device serviced; (2) Any device identification(s) and control number(s) used; (3) The date of service; (4...
48 CFR 252.211-7003 - Item unique identification and valuation.
Code of Federal Regulations, 2014 CFR
2014-10-01
... reader or interrogator, used to retrieve data encoded on machine-readable media. Concatenated unique item... identifier. Item means a single hardware article or a single unit formed by a grouping of subassemblies... manufactured under identical conditions. Machine-readable means an automatic identification technology media...
Perceptual Learning of Time-Compressed Speech: More than Rapid Adaptation
Banai, Karen; Lavner, Yizhar
2012-01-01
Background: Time-compressed speech, a form of rapidly presented speech, is harder to comprehend than natural speech, especially for non-native speakers. Although it is possible to adapt to time-compressed speech after a brief exposure, it is not known whether additional perceptual learning occurs with further practice. Here, we ask whether multiday training on time-compressed speech yields more learning than that observed during the initial adaptation phase and whether the pattern of generalization following successful learning is different than that observed with initial adaptation only. Methodology/Principal Findings: Two groups of non-native Hebrew speakers were tested on five different conditions of time-compressed speech identification in two assessments conducted 10–14 days apart. Between those assessments, one group of listeners received five practice sessions on one of the time-compressed conditions. Between the two assessments, trained listeners improved significantly more than untrained listeners on the trained condition. Furthermore, the trained group generalized its learning to two untrained conditions in which different talkers presented the trained speech materials. In addition, when the performance of the non-native speakers was compared to that of a group of naïve native Hebrew speakers, performance of the trained group was equivalent to that of the native speakers on all conditions on which learning occurred, whereas performance of the untrained non-native listeners was substantially poorer. Conclusions/Significance: Multiday training on time-compressed speech results in significantly more perceptual learning than brief adaptation. Compared to previous studies of adaptation, the training-induced learning is more stimulus-specific. Taken together, the perceptual learning of time-compressed speech appears to progress from an initial, rapid adaptation phase to a subsequent prolonged and more stimulus-specific phase.
These findings are consistent with the predictions of the Reverse Hierarchy Theory of perceptual learning and suggest constraints on the use of perceptual-learning regimens during second language acquisition. PMID:23056592
Analysis and Classification of Voice Pathologies Using Glottal Signal Parameters.
Forero M, Leonardo A; Kohler, Manoela; Vellasco, Marley M B R; Cataldo, Edson
2016-09-01
The classification of voice diseases has many applications in healthcare, in the treatment of diseases, and in the design of new medical equipment to help doctors diagnose pathologies related to the voice. This work uses the parameters of the glottal signal to help identify two types of voice disorders related to pathologies of the vocal folds: nodule and unilateral paralysis. The parameters of the glottal signal are obtained through a known inverse filtering method and are used as inputs to an Artificial Neural Network, a Support Vector Machine, and a Hidden Markov Model, in order to classify the voice signals into three different groups and to compare the results: speakers with a nodule on the vocal folds; speakers with unilateral paralysis of the vocal folds; and speakers with normal voices, that is, without nodule or unilateral paralysis of the vocal folds. The database is composed of 248 voice recordings (signals of vowel production) containing samples corresponding to the three groups mentioned. This study used a larger database than similar studies and achieved a superior classification rate of 97.2%. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Integrated Robust Open-Set Speaker Identification System (IROSIS)
2012-05-01
[Extracted fragments from the report's front matter and body: Table 1 details the NIST data used for training and testing; the enrollment/test scenarios are referred to as VB-YB, VL-YL, VB-YL and VL-YB; M denotes the UBM supervector, and the difference between L(m) and Q(M, m) is the Kullback-Leibler divergence between the "alignments".]
78 FR 63159 - Amendment to Certification of Nebraska's Central Filing System
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-23
... system for Nebraska to permit the conversion of all debtor social security and taxpayer identification... automatically convert social security numbers and taxpayer identification numbers into ten number unique... certified central filing systems is available through the Internet on the GIPSA Web site ( http://www.gipsa...
Semi-automated identification of leopard frogs
Petrovska-Delacrétaz, Dijana; Edwards, Aaron; Chiasson, John; Chollet, Gérard; Pilliod, David S.
2014-01-01
Principal component analysis is used to implement a semi-automatic recognition system to identify recaptured northern leopard frogs (Lithobates pipiens). Results of both open set and closed set experiments are given. The presented algorithm is shown to provide accurate identification of 209 individual leopard frogs from a total set of 1386 images.
33 CFR 169.235 - What exemptions are there from reporting?
Code of Federal Regulations, 2012 CFR
2012-07-01
... SECURITY (CONTINUED) PORTS AND WATERWAYS SAFETY SHIP REPORTING SYSTEMS Transmission of Long Range Identification and Tracking Information § 169.235 What exemptions are there from reporting? A ship is exempt from this subpart if it is— (a) Fitted with an operating automatic identification system (AIS), under 33 CFR...
33 CFR 164.46 - Automatic Identification System (AIS).
Code of Federal Regulations, 2014 CFR
2014-07-01
... (AIS). 164.46 Section 164.46 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND... Identification System (AIS). (a) The following vessels must have a properly installed, operational, type approved AIS as of the date specified: (1) Self-propelled vessels of 65 feet or more in length, other than...
33 CFR 169.235 - What exemptions are there from reporting?
Code of Federal Regulations, 2011 CFR
2011-07-01
... SECURITY (CONTINUED) PORTS AND WATERWAYS SAFETY SHIP REPORTING SYSTEMS Transmission of Long Range Identification and Tracking Information § 169.235 What exemptions are there from reporting? A ship is exempt from this subpart if it is— (a) Fitted with an operating automatic identification system (AIS), under 33 CFR...
33 CFR 164.46 - Automatic Identification System (AIS).
Code of Federal Regulations, 2013 CFR
2013-07-01
... (AIS). 164.46 Section 164.46 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND... Identification System (AIS). (a) The following vessels must have a properly installed, operational, type approved AIS as of the date specified: (1) Self-propelled vessels of 65 feet or more in length, other than...
33 CFR 164.46 - Automatic Identification System (AIS).
Code of Federal Regulations, 2010 CFR
2010-07-01
... (AIS). 164.46 Section 164.46 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND... Identification System (AIS). (a) The following vessels must have a properly installed, operational, type approved AIS as of the date specified: (1) Self-propelled vessels of 65 feet or more in length, other than...
33 CFR 169.235 - What exemptions are there from reporting?
Code of Federal Regulations, 2013 CFR
2013-07-01
... SECURITY (CONTINUED) PORTS AND WATERWAYS SAFETY SHIP REPORTING SYSTEMS Transmission of Long Range Identification and Tracking Information § 169.235 What exemptions are there from reporting? A ship is exempt from this subpart if it is— (a) Fitted with an operating automatic identification system (AIS), under 33 CFR...
33 CFR 164.46 - Automatic Identification System (AIS).
Code of Federal Regulations, 2011 CFR
2011-07-01
... (AIS). 164.46 Section 164.46 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND... Identification System (AIS). (a) The following vessels must have a properly installed, operational, type approved AIS as of the date specified: (1) Self-propelled vessels of 65 feet or more in length, other than...
33 CFR 169.235 - What exemptions are there from reporting?
Code of Federal Regulations, 2014 CFR
2014-07-01
... SECURITY (CONTINUED) PORTS AND WATERWAYS SAFETY SHIP REPORTING SYSTEMS Transmission of Long Range Identification and Tracking Information § 169.235 What exemptions are there from reporting? A ship is exempt from this subpart if it is— (a) Fitted with an operating automatic identification system (AIS), under 33 CFR...
33 CFR 164.46 - Automatic Identification System (AIS).
Code of Federal Regulations, 2012 CFR
2012-07-01
... (AIS). 164.46 Section 164.46 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND... Identification System (AIS). (a) The following vessels must have a properly installed, operational, type approved AIS as of the date specified: (1) Self-propelled vessels of 65 feet or more in length, other than...
Report: Unsupervised identification of malaria parasites using computer vision.
Khan, Najeed Ahmed; Pervaz, Hassan; Latif, Arsalan; Musharaff, Ayesha
2017-01-01
Malaria in humans is a serious and potentially fatal tropical disease, caused by Plasmodium species transmitted by infected Anopheles mosquitoes. The clinical diagnosis of malaria based on history, symptoms and clinical findings must always be confirmed by laboratory diagnosis, which involves identification of the malaria parasite or its antigens/products in the blood of the patient. Manual identification of the malaria parasite by pathologists has proven cumbersome; therefore, there is a need for automatic, efficient and accurate identification. In this paper, we propose a computer-vision-based approach to identify the malaria parasite in light microscopy images. This research deals with the challenges involved in the automatic detection of malaria parasite tissues. Our proposed method is pixel-based: we used K-means clustering (an unsupervised approach) for segmentation to identify malaria parasite tissues.
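The pixel-based K-means segmentation step described above can be sketched as follows. This is a minimal, generic K-means over RGB pixels, not the authors' implementation; the deterministic initialisation and the choice of k are illustrative assumptions.

```python
import numpy as np

def kmeans_segment(image, k=3, iters=20):
    """Cluster the pixels of an RGB image into k groups (minimal K-means)."""
    pixels = image.reshape(-1, 3).astype(float)
    # Initialise centroids from evenly spaced pixels (simple, deterministic).
    centroids = pixels[np.linspace(0, len(pixels) - 1, k).astype(int)].copy()
    for _ in range(iters):
        # Assign each pixel to its nearest centroid.
        dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old centroid if a cluster empties.
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return labels.reshape(image.shape[:2]), centroids
```

In a segmentation setting, the cluster whose centroid matches the stain colour of the parasite would then be kept as the candidate parasite mask.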
Automatic network coupling analysis for dynamical systems based on detailed kinetic models.
Lebiedz, Dirk; Kammerer, Julia; Brandt-Pollmann, Ulrich
2005-10-01
We introduce a numerical complexity reduction method for the automatic identification and analysis of dynamic network decompositions in (bio)chemical kinetics based on error-controlled computation of a minimal model dimension represented by the number of (locally) active dynamical modes. Our algorithm exploits a generalized sensitivity analysis along state trajectories and subsequent singular value decomposition of sensitivity matrices for the identification of these dominant dynamical modes. It allows for a dynamic coupling analysis of (bio)chemical species in kinetic models that can be exploited for the piecewise computation of a minimal model on small time intervals and offers valuable functional insight into highly nonlinear reaction mechanisms and network dynamics. We present results for the identification of network decompositions in a simple oscillatory chemical reaction, time scale separation based model reduction in a Michaelis-Menten enzyme system and network decomposition of a detailed model for the oscillatory peroxidase-oxidase enzyme system.
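The mode-counting idea above (singular value decomposition of sensitivity matrices to estimate the number of locally active dynamical modes) can be illustrated with a small sketch; the relative threshold is an assumed, illustrative parameter, not a value from the paper.

```python
import numpy as np

def active_modes(sensitivity, rel_tol=1e-3):
    """Estimate the number of dominant dynamical modes of a sensitivity
    matrix by counting singular values above a relative threshold."""
    s = np.linalg.svd(sensitivity, compute_uv=False)
    if s.size == 0 or s[0] == 0:
        return 0
    return int(np.sum(s / s[0] > rel_tol))
```

Applied piecewise along a trajectory, such a count would give the minimal model dimension on each time interval.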
NASA Astrophysics Data System (ADS)
Lasaponara, Rosa; Masini, Nicola
2018-06-01
The identification and quantification of disturbance at archaeological sites have generally been approached by visual inspection of optical aerial or satellite pictures. In this paper, we briefly summarize the state of the art of traditional satellite-based approaches to looting identification and propose a new automatic archaeological looting feature extraction approach (ALFEA). It is based on three steps: enhancement using spatial autocorrelation, unsupervised classification, and segmentation. ALFEA has been applied to Google Earth images of two test areas, selected in desert environs in Syria (Dura Europos) and in Peru (Cahuachi-Nasca). The reliability of ALFEA was assessed through field surveys in Peru and visual inspection for the Syrian case study. Results from the evaluation showed satisfactory performance in both test cases, with a success rate higher than 90%.
Automatic photointerpretation for plant species and stress identification (ERTS-A1)
NASA Technical Reports Server (NTRS)
Swanlund, G. D. (Principal Investigator); Kirvida, L.; Johnson, G. R.
1973-01-01
The author has identified the following significant results. Automatic stratification of forested land from ERTS-1 data provides a valuable tool for resource management. The results are useful for wood product yield estimates, recreation and wildlife management, forest inventory, and forest condition monitoring. Automatic procedures based on both multispectral and spatial features are evaluated. With five classes, training and testing on the same samples, classification accuracy of 74 percent was achieved using the MSS multispectral features. When adding texture computed from 8 x 8 arrays, classification accuracy of 90 percent was obtained.
NASA Astrophysics Data System (ADS)
Broersen, Tom; Peters, Ravi; Ledoux, Hugo
2017-09-01
Drainage networks play a crucial role in protecting land against floods. It is therefore important to have an accurate map of the watercourses that form the drainage network. Previous work on the automatic identification of watercourses was typically based on grids, focused on natural landscapes, and used mostly the slope and curvature of the terrain. We focus in this paper on areas characterised by low-lying, flat, and engineered landscapes, such as those typical of the Netherlands. We propose a new methodology to identify watercourses automatically from elevation data; it uses solely a raw classified LiDAR point cloud as input. We show that by computing a skeleton of the point cloud twice (once in 2D and once in 3D) and by using the properties of the skeletons, we can identify most of the watercourses. We have implemented our methodology and tested it for three different soil types around Utrecht, the Netherlands. Compared to a reference dataset obtained semi-automatically, we were able to detect 98% of the watercourses for one soil type, and around 75% in the worst case.
Automatic contact in DYNA3D for vehicle crashworthiness
DOE Office of Scientific and Technical Information (OSTI.GOV)
Whirley, R.G.; Engelmann, B.E.
1993-07-15
This paper presents a new formulation for the automatic definition and treatment of mechanical contact in explicit nonlinear finite element analysis. Automatic contact offers the benefits of significantly reduced model construction time and fewer opportunities for user error, but faces significant challenges in reliability and computational costs. This paper discusses in detail a new four-step automatic contact algorithm. Key aspects of the proposed method include automatic identification of adjacent and opposite surfaces in the global search phase, and the use of a smoothly varying surface normal which allows a consistent treatment of shell intersection and corner contact conditions without ad-hoc rules. The paper concludes with three examples which illustrate the performance of the newly proposed algorithm in the public DYNA3D code.
Zhang, Shufang; Sun, Xiaowen
2018-01-01
This paper investigates the Additional Secondary Phase Factor (ASF) characteristics of Automatic Identification System (AIS) signals propagating over a rough sea surface. Based on how the ASFs change for AIS signals of different forms, the influences of different propagation conditions on the ASFs are analyzed. The expression, numerical calculation, and simulation analysis of the ASFs of the AIS signal are performed for the rough sea surface. The results contribute to high-accuracy propagation-delay measurement of AIS signals propagating over a rough sea surface, as well as providing a reference for reliable communication link design in marine engineering for Very High Frequency (VHF) signals. PMID:29462995
NASA Astrophysics Data System (ADS)
Rulaningtyas, Riries; Suksmono, Andriyan B.; Mengko, Tati L. R.; Saptawati, Putri
2015-04-01
Sputum smear observation plays an important role in the diagnosis of tuberculosis (TB), because accurate identification is needed to avoid diagnostic errors. In developing countries, sputum smear slides are commonly examined with a conventional light microscope on Ziehl-Neelsen-stained tissue, as the microscope is inexpensive to maintain. Clinicians screen each sputum smear slide manually, which is time consuming and requires extensive training to detect the presence of TB bacilli (Mycobacterium tuberculosis) accurately, especially for negative slides and slides with few TB bacilli. To help clinicians, we propose an automatic scanning microscope with automatic identification of TB bacilli. The designed system drives the field movement of a light microscope with a stepper motor controlled by a microcontroller. Every sputum smear field is captured by a camera, and several image processing techniques are then applied to the sputum smear images. Colour thresholding on the hue channel of the HSV colour space is used for background subtraction, and the Sobel edge detection algorithm is used to segment TB bacilli. Shape-based features are extracted to analyse the bacilli, and a neural network then classifies whether they are TB bacilli. The results indicate that the identification works well and detects TB bacilli accurately in sputum smear slides with normal staining, but not in over-stained or under-stained tissue slides. Overall, however, the designed system can make sputum smear observation easier for clinicians.
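The Sobel edge detection step used for bacilli segmentation can be sketched as follows. This is a generic Sobel gradient-magnitude computation on a grayscale NumPy image, not the authors' code; edge-replicating padding is an illustrative choice.

```python
import numpy as np

def sobel_magnitude(gray):
    """Gradient magnitude of a 2-D grayscale image via the Sobel operator.

    Cross-correlation is used instead of true convolution; for the
    magnitude this only flips gradient signs, which does not matter."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T  # vertical-gradient kernel is the transpose
    h, w = gray.shape
    padded = np.pad(gray.astype(float), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            patch = padded[i:i + h, j:j + w]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.hypot(gx, gy)
```

Thresholding the magnitude map would then yield candidate bacilli boundaries for the subsequent shape analysis.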
Automatic poisson peak harvesting for high throughput protein identification.
Breen, E J; Hopwood, F G; Williams, K L; Wilkins, M R
2000-06-01
High throughput identification of proteins by peptide mass fingerprinting requires an efficient means of picking peaks from mass spectra. Here, we report the development of a peak harvester to automatically pick monoisotopic peaks from spectra generated on matrix-assisted laser desorption/ionisation time-of-flight (MALDI-TOF) mass spectrometers. The peak harvester uses advanced mathematical morphology and watershed algorithms to first process spectra to stick representations. Subsequently, Poisson modelling is applied to determine which peak in an isotopically resolved group represents the monoisotopic mass of a peptide. We illustrate the features of the peak harvester with mass spectra of standard peptides, digests of gel-separated bovine serum albumin, and with Escherichia coli proteins prepared by two-dimensional polyacrylamide gel electrophoresis. In all cases, the peak harvester proved as effective as an experienced human operator in picking monoisotopic peaks, and also proved effective in identifying monoisotopic masses in cases where the isotopic distributions of peptides overlapped. The peak harvester can be operated in an interactive mode, or can be completely automated and linked to peptide mass fingerprinting protein identification tools to achieve high-throughput automated protein identification.
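The idea behind Poisson modelling of isotope groups can be sketched as follows: the number of heavy isotopes in a peptide is approximately Poisson-distributed with a rate that grows with mass, so each candidate start peak can be scored against the expected intensity pattern. The constant `lam_per_da` is an assumed, averagine-like illustration, and the least-squares scoring is a simplified stand-in for the paper's method.

```python
import math

def poisson_pattern(mass, n=4, lam_per_da=4.3e-4):
    """Expected isotope-intensity pattern for a peptide of a given mass,
    modelled as a Poisson distribution whose rate grows with mass."""
    lam = lam_per_da * mass
    p = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(n)]
    total = sum(p)
    return [x / total for x in p]

def pick_monoisotopic(masses, intensities):
    """Given an isotopically resolved peak group (ascending m/z), score each
    candidate start peak against the Poisson model and return the mass whose
    following intensities best match the expected pattern."""
    best, best_err = None, float("inf")
    for start in range(len(masses) - 1):
        obs = intensities[start:]
        s = sum(obs)
        obs = [x / s for x in obs]
        exp = poisson_pattern(masses[start], n=len(obs))
        err = sum((o - e) ** 2 for o, e in zip(obs, exp))
        if err < best_err:
            best, best_err = masses[start], err
    return best
```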
An automatic method to detect and track the glottal gap from high speed videoendoscopic images.
Andrade-Miranda, Gustavo; Godino-Llorente, Juan I; Moro-Velázquez, Laureano; Gómez-García, Jorge Andrés
2015-10-29
The image-based analysis of vocal fold vibration plays an important role in the diagnosis of voice disorders. The analysis is based not only on direct observation of the video sequences, but also on an objective characterization of the phonation process by means of features extracted from the recorded images. However, such analysis depends on a prior, accurate identification of the glottal gap, which is the most challenging step for any further automatic assessment of vocal fold vibration. In this work, a complete framework to automatically segment and track the glottal area (or glottal gap) is proposed. The algorithm identifies a region of interest (ROI) that is adapted over time and combines active contours and the watershed transform for the final delineation of the glottis; an automatic procedure to synthesize different videokymograms (VKGs) is also proposed. Thanks to the ROI implementation, the technique is robust to camera shifting, and objective tests proved the effectiveness and performance of the approach in the most challenging scenario, namely when there is inappropriate closure of the vocal folds. The novelties of the proposed algorithm lie in the use of temporal information to identify an adaptive ROI and the use of watershed merging combined with active contours for glottis delimitation. Additionally, an automatic procedure to synthesize multiline VKGs by identifying the glottal main axis is developed.
Velupillai, Sumithra; Dalianis, Hercules; Hassel, Martin; Nilsson, Gunnar H
2009-12-01
Electronic patient records (EPRs) contain a large amount of information written in free text. This information is considered very valuable for research but is also very sensitive, since the free-text parts may contain information that could reveal the identity of a patient. Therefore, methods for de-identifying EPRs are needed. The work presented here aims to perform a manual and automatic Protected Health Information (PHI) annotation trial for EPRs written in Swedish. This study consists of two main parts: the initial creation of a manually PHI-annotated gold standard, and the porting and evaluation of existing de-identification software written for American English to Swedish in a preliminary automatic de-identification trial. Results are measured with precision, recall and F-measure. This study reports fairly high Inter-Annotator Agreement (IAA) results on the manually created gold standard, especially for specific tags such as names. The average IAA over all tags was 0.65 F-measure (0.84 F-measure highest pairwise agreement); for name tags the average IAA was 0.80 F-measure (0.91 F-measure highest pairwise agreement). Porting de-identification software written for American English directly to Swedish was unfortunately non-trivial, yielding poor results. Developing gold-standard sets as well as automatic systems for de-identification tasks in Swedish is feasible. However, discussion and definitions of identifiable information are needed, as well as further development of both the tag sets and the annotation guidelines, in order to obtain a reliable gold standard. A completely new de-identification software needs to be developed.
Automatic topics segmentation for TV news video
NASA Astrophysics Data System (ADS)
Hmayda, Mounira; Ejbali, Ridha; Zaied, Mourad
2017-03-01
Automatic identification of television programs in the TV stream is an important task for operating archives. This article proposes a new spatio-temporal approach to identifying programs in a TV stream in two main steps. First, a reference catalogue of visual jingles is built from video features. We exploit the features that characterize instances of the same program type to identify the different types of programs in the television stream; the video features represent the visual invariants of each jingle using appropriate automatic descriptors for each television program. Second, programs in television streams are identified by examining the similarity of the video signal to the visual jingles in the catalogue. The main idea of the identification process is to compare the visual similarity of the video signal features in the television stream to the catalogue. After presenting the proposed approach, the paper reports encouraging experimental results on several streams extracted from different channels and comprising several programs.
Ambrosini, Emilia; Ferrante, Simona; Schauer, Thomas; Ferrigno, Giancarlo; Molteni, Franco; Pedrocchi, Alessandra
2014-01-01
Cycling induced by Functional Electrical Stimulation (FES) training currently requires a manual setting of different parameters, which is a time-consuming and scarcely repeatable procedure. We proposed an automatic procedure for setting session-specific parameters optimized for hemiparetic patients. This procedure consisted of the identification of the stimulation strategy as the angular ranges during which FES drove the motion, the comparison between the identified strategy and the physiological muscular activation strategy, and the setting of the pulse amplitude and duration of each stimulated muscle. Preliminary trials on 10 healthy volunteers helped define the procedure. Feasibility tests on 8 hemiparetic patients (5 stroke, 3 traumatic brain injury) were performed. The procedure maximized the motor output within the tolerance constraint, identified a biomimetic strategy in 6 patients, and always lasted less than 5 minutes. Its reasonable duration and automatic nature make the procedure usable at the beginning of every training session, potentially enhancing the performance of FES-cycling training.
NASA Astrophysics Data System (ADS)
Ghoraani, Behnaz; Krishnan, Sridhar
2009-12-01
The number of people affected by speech problems is increasing as the modern world places increasing demands on the human voice via mobile telephones, voice recognition software, and interpersonal verbal communications. In this paper, we propose a novel methodology for automatic pattern classification of pathological voices. The main contribution of this paper is extraction of meaningful and unique features using Adaptive time-frequency distribution (TFD) and nonnegative matrix factorization (NMF). We construct Adaptive TFD as an effective signal analysis domain to dynamically track the nonstationarity in the speech and utilize NMF as a matrix decomposition (MD) technique to quantify the constructed TFD. The proposed method extracts meaningful and unique features from the joint TFD of the speech, and automatically identifies and measures the abnormality of the signal. Depending on the abnormality measure of each signal, we classify the signal into normal or pathological. The proposed method is applied on the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database which consists of 161 pathological and 51 normal speakers, and an overall classification accuracy of 98.6% was achieved.
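The NMF step used above to quantify the time-frequency distribution can be illustrated with the standard Lee-Seung multiplicative updates for the Frobenius objective. This is a generic sketch of NMF itself, assuming a nonnegative matrix as input, not the paper's full feature-extraction pipeline.

```python
import numpy as np

def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V ~= W @ H using Lee-Seung
    multiplicative updates (Frobenius-norm objective)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        # Multiplicative updates keep W and H nonnegative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

For a TFD, the columns of W would act as spectral bases and the rows of H as their temporal activations, from which scalar features can be derived.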
Evaluation of the automatic optical authentication technologies for control systems of objects
NASA Astrophysics Data System (ADS)
Averkin, Vladimir V.; Volegov, Peter L.; Podgornov, Vladimir A.
2000-03-01
The report considers the evaluation of automatic optical authentication technologies for the automated integrated system of physical protection, control and accounting of nuclear materials at RFNC-VNIITF, and for supporting the nuclear materials nonproliferation regime. The report presents the objectives and strategies of nuclear object authentication, the methodology of automatic optical authentication, and results of the development of pattern recognition techniques carried out under ISTC project #772 to identify unique features of the surface structure of a controlled object and the effects of its random treatment. The report describes current solutions for the following functional control tasks: confirmation of item authenticity (proof of the absence of substitution by an item of similar shape), control over unforeseen changes in item state, and control over unauthorized access to the item. The most important distinctive feature of all the techniques is not a comprehensive description of some properties of the controlled item, but the unique identification of the item using the minimum necessary set of parameters, which together constitute the item's identification attribute. The main emphasis in the technical approach is placed on the development of relatively simple technological methods intended, for the first time, for use in systems of physical protection, control and accounting of nuclear materials. The developed authentication devices and system are described.
Edmands, William M B; Barupal, Dinesh K; Scalbert, Augustin
2015-03-01
MetMSLine represents a complete collection of functions in the R programming language as an accessible GUI for biomarker discovery in large-scale liquid-chromatography high-resolution mass spectral datasets from acquisition through to final metabolite identification forming a backend to output from any peak-picking software such as XCMS. MetMSLine automatically creates subdirectories, data tables and relevant figures at the following steps: (i) signal smoothing, normalization, filtration and noise transformation (PreProc.QC.LSC.R); (ii) PCA and automatic outlier removal (Auto.PCA.R); (iii) automatic regression, biomarker selection, hierarchical clustering and cluster ion/artefact identification (Auto.MV.Regress.R); (iv) Biomarker-MS/MS fragmentation spectra matching and fragment/neutral loss annotation (Auto.MS.MS.match.R) and (v) semi-targeted metabolite identification based on a list of theoretical masses obtained from public databases (DBAnnotate.R). All source code and suggested parameters are available in an un-encapsulated layout on http://wmbedmands.github.io/MetMSLine/. Readme files and a synthetic dataset of both X-variables (simulated LC-MS data), Y-variables (simulated continuous variables) and metabolite theoretical masses are also available on our GitHub repository. © The Author 2014. Published by Oxford University Press.
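The PCA-with-automatic-outlier-removal step (Auto.PCA.R) can be illustrated, in Python rather than R, by a generic score-based rule: project samples onto the leading principal components and flag rows with extreme scores. The component count and z-score threshold are illustrative assumptions, not MetMSLine's actual parameters.

```python
import numpy as np

def pca_outliers(X, n_pc=2, thresh=3.0):
    """Project samples (rows of X) onto the first n_pc principal components
    and flag rows whose score on any component exceeds thresh standard
    deviations of that component's scores."""
    Xc = X - X.mean(axis=0)
    # SVD-based PCA: rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_pc].T
    z = np.abs(scores) / scores.std(axis=0, ddof=1)
    return np.where((z > thresh).any(axis=1))[0]
```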
Parametric diagnosis of the adaptive gas path in the automatic control system of the aircraft engine
NASA Astrophysics Data System (ADS)
Kuznetsova, T. A.
2017-01-01
The paper presents an adaptive multimode mathematical model of the gas-turbine aircraft engine (GTE) embedded in the automatic control system (ACS). The mathematical model is based on the throttle performances and is characterized by highly accurate identification of engine parameters in stationary and dynamic modes. The proposed on-board engine model is a linearized, low-level state-space simulation. Engine health is identified through the influence coefficient matrix, which is determined by the high-level GTE mathematical model based on measurements of gas-dynamic parameters. In the automatic control algorithm, the sum of squares of the deviations between the parameters of the mathematical model and the real GTE is minimized. The proposed mathematical model is effectively used for detecting gas path defects in on-line GTE health monitoring. The accuracy of the on-board mathematical model embedded in the ACS determines the quality of adaptive control and the reliability of the engine. To improve the accuracy and robustness of the identification solutions, the Monte Carlo numerical method was used. A parametric diagnostic algorithm based on the LPτ-sequence was developed and tested. Analysis of the results suggests that the developed algorithms achieve higher identification accuracy and reliability than similar models used in practice.
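The least-squares identification underlying such a linearized on-board state-space model can be sketched generically: given recorded states and inputs, estimate A and B in x[k+1] = A x[k] + B u[k] by minimizing the sum of squared deviations. This is a textbook sketch under that assumption, not the ACS algorithm from the paper.

```python
import numpy as np

def identify_linear_model(X, U, X_next):
    """Identify A, B in x[k+1] = A x[k] + B u[k] by least squares over
    recorded state/input trajectories (rows are time samples)."""
    Z = np.hstack([X, U])                          # regressor [x u]
    Theta, *_ = np.linalg.lstsq(Z, X_next, rcond=None)
    n = X.shape[1]
    A = Theta[:n].T                                # state-transition part
    B = Theta[n:].T                                # input part
    return A, B
```

With noise-free, persistently exciting data the estimate is exact; in practice the residual of this fit is what a health-monitoring scheme would track.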
NASA Astrophysics Data System (ADS)
Ometto, Giovanni; Calivá, Francesco; Al-Diri, Bashir; Bek, Toke; Hunter, Andrew
2016-03-01
Automatic, quick, and reliable identification of retinal landmarks from fundus photography is key for measurements used in research and in the diagnosis, screening, and treatment of common diseases affecting the eyes. This study presents a fast method for detecting the centre of mass of the vascular arcades, the optic nerve head (ONH), and the fovea, used to define five clinically relevant areas employed in screening programmes for diabetic retinopathy (DR). Thirty-eight fundus photographs showing 7203 DR lesions were analysed to locate the landmarks, manually by two retina experts and automatically by the proposed method. The automatic identification of the ONH and fovea was performed using template matching based on normalised cross-correlation. The centre of mass of the arcades was obtained by fitting an ellipse to sample coordinates of the main vessels; the coordinates were obtained by processing the image with Hessian filtering, followed by shape analyses and sampling of the results. The regions obtained manually and automatically were used to count the retinal lesions falling within them and to evaluate the method. 92.7% of the lesions fell within the same regions based on the landmarks selected by the two experts; 91.7% and 89.0% fell in the same areas when comparing the method with the first and second expert, respectively. The inter-repeatability of the proposed method and the experts is comparable, while the 100% intra-repeatability makes the algorithm a valuable tool for tasks such as real-time analysis, analysis of large datasets, and assessment of intra-patient variability.
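Template matching by normalised cross-correlation, as used here for ONH and fovea detection, can be sketched with a brute-force implementation. The function below is an illustrative assumption rather than the authors' code; a production system would typically use FFT-based correlation for speed.

```python
import numpy as np

def match_template_ncc(image, template):
    """Locate a template in a grayscale image by normalised
    cross-correlation; returns the (row, col) of the best-matching
    top-left corner and the correlation score (max 1.0)."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    tnorm = np.sqrt((t ** 2).sum())
    best, best_pos = -np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w = image[r:r + th, c:c + tw]
            wz = w - w.mean()
            denom = np.sqrt((wz ** 2).sum()) * tnorm
            if denom == 0:
                continue  # constant window: correlation undefined, skip
            score = (wz * t).sum() / denom
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best
```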
6 CFR 37.19 - Machine readable technology on the driver's license or identification card.
Code of Federal Regulations, 2011 CFR
2011-01-01
For the machine readable portion of the driver's license or identification card, States must use the ISO/IEC 15438:2006(E) Information Technology—Automatic identification and data capture techniques standard (6 CFR Subpart B—Minimum Documentation, Verification, and Card Issuance Requirements, § 37.19).
Caller I.D. and ANI: The Technology and the Controversy.
ERIC Educational Resources Information Center
Bertot, John C.
1992-01-01
Examines telephone caller identification (caller-I.D.) and Automatic Number Identification (ANI) technology and discusses policy and privacy issues at the state and federal levels of government. A comparative analysis of state caller-I.D. adoption policies is presented, caller-I.D. blocking is discussed, costs are reported, and legal aspects of…
Lotto, A J; Kluender, K R
1998-05-01
When members of a series of synthesized stop consonants varying acoustically in F3 characteristics and varying perceptually from /da/ to /ga/ are preceded by /al/, subjects report hearing more /ga/ syllables relative to when each member is preceded by /ar/ (Mann, 1980). It has been suggested that this result demonstrates the existence of a mechanism that compensates for coarticulation via tacit knowledge of articulatory dynamics and constraints, or through perceptual recovery of vocal-tract dynamics. The present study was designed to assess the degree to which these perceptual effects are specific to qualities of human articulatory sources. In three experiments, series of consonant-vowel (CV) stimuli varying in F3-onset frequency (/da/-/ga/) were preceded by speech versions or nonspeech analogues of /al/ and /ar/. The effect of liquid identity on stop consonant labeling remained when the preceding VC was produced by a female speaker and the CV syllable was modeled after a male speaker's productions. Labeling boundaries also shifted when the CV was preceded by a sine wave glide modeled after F3 characteristics of /al/ and /ar/. Identifications shifted even when the preceding sine wave was of constant frequency equal to the offset frequency of F3 from a natural production. These results suggest an explanation in terms of general auditory processes as opposed to recovery of or knowledge of specific articulatory dynamics.
Vieira, Manuel; Fonseca, Paulo J; Amorim, M Clara P; Teixeira, Carlos J C
2015-12-01
The study of acoustic communication in animals often requires not only the recognition of species-specific acoustic signals but also the identification of individual subjects, all against a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools for extracting the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented, inspired by the successful results obtained with the most widely known and complex acoustic communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover, this method also proved to be a powerful tool for assessing signal durations in large data sets. However, the system failed to recognize other sound types.
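The HMM-based recognition scheme can be illustrated with a toy forward-algorithm classifier that scores a discrete observation sequence under per-individual models. The two 2-state models and the symbol sequence below are invented for illustration; the paper's models for toadfish boatwhistles are far richer:

```python
import numpy as np

def log_forward(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM
    (forward algorithm in log space)."""
    alpha = np.log(pi) + np.log(B[:, obs[0]])
    for o in obs[1:]:
        alpha = np.logaddexp.reduce(alpha[:, None] + np.log(A), axis=0) + np.log(B[:, o])
    return np.logaddexp.reduce(alpha)

# Two toy 2-state models standing in for two individual "callers".
pi = np.array([0.5, 0.5])
A  = np.array([[0.9, 0.1],
               [0.1, 0.9]])
B1 = np.array([[0.8, 0.2],   # individual 1: state 0 strongly favours symbol 0
               [0.2, 0.8]])
B2 = np.array([[0.3, 0.7],   # individual 2: both states favour symbol 1
               [0.4, 0.6]])

seq = [0] * 8                # a quantised call made up of repeated symbol 0
scores = [log_forward(seq, pi, A, B) for B in (B1, B2)]
print("best match: individual", 1 + int(np.argmax(scores)))  # → individual 1
```

Classification then amounts to evaluating each individual's model on the observed call and picking the highest likelihood, as above.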
Automated Dispersion and Orientation Analysis for Carbon Nanotube Reinforced Polymer Composites
Gao, Yi; Li, Zhuo; Lin, Ziyin; Zhu, Liangjia; Tannenbaum, Allen; Bouix, Sylvain; Wong, C.P.
2012-01-01
The properties of carbon nanotube (CNT)/polymer composites are strongly dependent on the dispersion and orientation of CNTs in the host matrix. Quantification of the dispersion and orientation of CNTs by microstructure observation and image analysis has been demonstrated to be a useful way to understand the structure-property relationship of CNT/polymer composites. However, due to the varied morphologies and the large number of CNTs in a single image, automatic and accurate identification of CNTs has become the bottleneck for dispersion/orientation analysis. To solve this problem, shape identification is performed for each pixel in the filler identification step, so that individual CNTs can be extracted from images automatically. The improved filler identification enables more accurate analysis of CNT dispersion and orientation. The resulting dispersion and orientation indices of both synthetic and real images from model compounds correspond well with the observations. Moreover, these indices help to explain the electrical properties of a CNT/silicone composite used as a model compound. This method can also be extended to other polymer composites with high-aspect-ratio fillers. PMID:23060008
González-Vidal, Juan José; Pérez-Pueyo, Rosanna; Soneira, María José; Ruiz-Moreno, Sergio
2015-03-01
A new method has been developed to automatically identify Raman spectra, whether they correspond to single- or multicomponent spectra. The method requires no user input or judgment. There are thus no parameters to be tweaked. Furthermore, it provides a reliability factor on the resulting identification, with the aim of becoming a useful support tool for the analyst in the decision-making process. The method relies on the multivariate techniques of principal component analysis (PCA) and independent component analysis (ICA), and on some metrics. It has been developed for the application of automated spectral analysis, where the analyzed spectrum is provided by a spectrometer that has no previous knowledge of the analyzed sample, meaning that the number of components in the sample is unknown. We describe the details of this method and demonstrate its efficiency by identifying both simulated spectra and real spectra. The method has been applied to artistic pigment identification. The reliable and consistent results that were obtained make the methodology a helpful tool suitable for the identification of pigments in artwork or in paint in general.
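The PCA step of such a pipeline can be sketched with simulated single-component spectra: project an unknown spectrum onto the principal components of a reference library and pick the nearest reference. The Gaussian "bands", the library, and the nearest-neighbour rule below are illustrative assumptions, not the paper's full PCA/ICA method:

```python
import numpy as np

wavenumbers = np.linspace(200, 1800, 400)

def band(centre, width):
    """A single Gaussian band standing in for a Raman feature."""
    return np.exp(-0.5 * ((wavenumbers - centre) / width) ** 2)

# Reference library: three hypothetical single-pigment spectra.
library = np.stack([band(450, 20), band(900, 30), band(1400, 25)])

# PCA of the library via SVD (rows are the observed spectra).
mean = library.mean(axis=0)
U, S, Vt = np.linalg.svd(library - mean, full_matrices=False)

# Project an unknown two-component mixture onto the principal components.
unknown = 0.6 * library[0] + 0.4 * library[2]
scores_unknown = (unknown - mean) @ Vt.T
scores_library = (library - mean) @ Vt.T

# Nearest library spectrum in PC space as a crude identification.
dists = np.linalg.norm(scores_library - scores_unknown, axis=1)
print("closest reference:", int(np.argmin(dists)))  # → closest reference: 0
```

The paper additionally uses ICA to unmix multicomponent spectra and a reliability metric on the result; this sketch only shows the subspace-projection idea.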
Hybrid neuro-fuzzy approach for automatic vehicle license plate recognition
NASA Astrophysics Data System (ADS)
Lee, Hsi-Chieh; Jong, Chung-Shi
1998-03-01
Most currently available vehicle identification systems use techniques such as RF, microwave, or infrared to help identify the vehicle. Transponders are usually installed in the vehicle to transmit the corresponding information to the sensory system. It is expensive to install a transponder in each vehicle, and a malfunctioning transponder causes the vehicle identification system to fail. In this study, a novel hybrid approach is proposed for automatic vehicle license plate recognition. A system prototype was built that can be used independently or in cooperation with a current vehicle identification system to identify a vehicle. The prototype consists of four major modules: license plate region identification, character extraction from the license plate, character recognition, and the SimNet neuro-fuzzy system. To test the performance of the proposed system, three hundred and eighty vehicle image samples were taken with a digital camera. The license plate recognition success rate of the prototype is approximately 91%, while its character recognition success rate is approximately 97%.
Automatic identification of inertial sensor placement on human body segments during walking
2013-01-01
Background: Current inertial motion capture systems are rarely used in biomedical applications. The attachment and connection of the sensors with cables is often a complex and time-consuming task. Moreover, it is prone to errors, because each sensor has to be attached to a predefined body segment. By using wireless inertial sensors and automatic identification of their positions on the human body, the complexity of the set-up can be reduced and incorrect attachments are avoided. We present a novel method for the automatic identification of inertial sensors on human body segments during walking. This method allows the user to place (wireless) inertial sensors on arbitrary body segments. Next, the user walks for just a few seconds and the segment to which each sensor is attached is identified automatically. Methods: Walking data was recorded from ten healthy subjects using an Xsens MVN Biomech system with full-body configuration (17 inertial sensors). Subjects were asked to walk for about 6 seconds at normal walking speed (about 5 km/h). After rotating the sensor data to a global coordinate frame with x-axis in walking direction, y-axis pointing left and z-axis vertical, RMS, mean, and correlation coefficient features were extracted from the x-, y- and z-components and magnitudes of the accelerations, angular velocities and angular accelerations. As a classifier, a decision tree based on the C4.5 algorithm was developed using Weka (Waikato Environment for Knowledge Analysis). Results and conclusions: After testing the algorithm with 10-fold cross-validation using 31 walking trials (involving 527 sensors), 514 sensors were correctly classified (97.5%). When a decision tree for a lower body plus trunk configuration (8 inertial sensors) was trained and tested using 10-fold cross-validation, 100% of the sensors were correctly identified.
This decision tree was also tested on walking trials of 7 patients (17 walking trials) after anterior cruciate ligament reconstruction, which also resulted in 100% correct identification, thus illustrating the robustness of the method. PMID:23517757
Automatic identification of inertial sensor placement on human body segments during walking.
Weenk, Dirk; van Beijnum, Bert-Jan F; Baten, Chris T M; Hermens, Hermie J; Veltink, Peter H
2013-03-21
Current inertial motion capture systems are rarely used in biomedical applications. The attachment and connection of the sensors with cables is often a complex and time-consuming task. Moreover, it is prone to errors, because each sensor has to be attached to a predefined body segment. By using wireless inertial sensors and automatic identification of their positions on the human body, the complexity of the set-up can be reduced and incorrect attachments are avoided. We present a novel method for the automatic identification of inertial sensors on human body segments during walking. This method allows the user to place (wireless) inertial sensors on arbitrary body segments. Next, the user walks for just a few seconds and the segment to which each sensor is attached is identified automatically. Walking data was recorded from ten healthy subjects using an Xsens MVN Biomech system with full-body configuration (17 inertial sensors). Subjects were asked to walk for about 6 seconds at normal walking speed (about 5 km/h). After rotating the sensor data to a global coordinate frame with x-axis in walking direction, y-axis pointing left and z-axis vertical, RMS, mean, and correlation coefficient features were extracted from the x-, y- and z-components and magnitudes of the accelerations, angular velocities and angular accelerations. As a classifier, a decision tree based on the C4.5 algorithm was developed using Weka (Waikato Environment for Knowledge Analysis). After testing the algorithm with 10-fold cross-validation using 31 walking trials (involving 527 sensors), 514 sensors were correctly classified (97.5%). When a decision tree for a lower body plus trunk configuration (8 inertial sensors) was trained and tested using 10-fold cross-validation, 100% of the sensors were correctly identified.
This decision tree was also tested on walking trials of 7 patients (17 walking trials) after anterior cruciate ligament reconstruction, which also resulted in 100% correct identification, thus illustrating the robustness of the method.
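The feature extraction described (RMS, mean, and correlation-coefficient features from the acceleration signals) can be sketched as follows. The synthetic "walking" signal and the particular x-z correlation feature are illustrative choices, not the paper's exact feature set:

```python
import numpy as np

def segment_features(acc):
    """RMS, mean, and x-z correlation-coefficient features from one
    3-axis acceleration window (columns: x, y, z)."""
    rms  = np.sqrt((acc ** 2).mean(axis=0))           # per-axis RMS
    mean = acc.mean(axis=0)                           # per-axis mean
    r_xz = np.corrcoef(acc[:, 0], acc[:, 2])[0, 1]    # x-z correlation
    return np.concatenate([rms, mean, [r_xz]])

t = np.linspace(0, 6, 600)       # ~6 s of "walking" sampled at 100 Hz
acc = np.stack([np.sin(2 * np.pi * 1.8 * t),          # x: stride-frequency sway
                0.3 * np.sin(2 * np.pi * 3.6 * t),    # y: double-frequency term
                9.81 + np.sin(2 * np.pi * 1.8 * t)],  # z: gravity + in-phase motion
               axis=1)
features = segment_features(acc)
print(features.shape)            # (7,)
```

In the paper one such feature vector per sensor (extended to angular velocities and angular accelerations) fed a C4.5 decision tree trained in Weka.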
Flow Pattern Identification of Horizontal Two-Phase Refrigerant Flow Using Neural Networks
2015-12-31
Report AFRL-RQ-WP-TP-2016-0079; journal article postprint covering 01 October 2013 – 22 June 2015; author Abdeel J… [surname truncated in source]. Neural networks were used to automatically identify two-phase flow patterns for refrigerant R-134a flowing in a horizontal tube in laboratory experiments.
NASA Astrophysics Data System (ADS)
Peng, Bo; Zheng, Sifa; Liao, Xiangning; Lian, Xiaomin
2018-03-01
In order to achieve sound field reproduction in a wide frequency band, multiple-type speakers are used. The reproduction accuracy is not only affected by the signals sent to the speakers, but also depends on the position and the number of each type of speaker. The method of optimizing a mixed speaker array is investigated in this paper. A virtual-speaker weighting method is proposed to optimize both the position and the number of each type of speaker. In this method, a virtual-speaker model is proposed to quantify the increment of controllability of the speaker array when the speaker number increases. While optimizing a mixed speaker array, the gain of the virtual-speaker transfer function is used to determine the priority orders of the candidate speaker positions, which optimizes the position of each type of speaker. Then the relative gain of the virtual-speaker transfer function is used to determine whether the speakers are redundant, which optimizes the number of each type of speaker. Finally the virtual-speaker weighting method is verified by reproduction experiments of the interior sound field in a passenger car. The results validate that the optimum mixed speaker array can be obtained using the proposed method.
Revis, J; Galant, C; Fredouille, C; Ghio, A; Giovanni, A
2012-01-01
Although widely studied in terms of perception, acoustics and aerodynamics, dysphonia nevertheless remains a speech phenomenon, closely related to the phonetic composition of the message conveyed by the voice. In this paper, we present a series of three studies aimed at understanding the phonetic manifestation of dysphonia. Our first study proposes a new approach to the perceptual analysis of dysphonia (phonetic labeling), whose principle is to listen to and evaluate each phoneme in a sentence separately. This study confirms Laver's hypothesis that dysphonia is not a constant noise added to the speech signal, but a discontinuous phenomenon occurring on certain phonemes, depending on the phonetic context. However, the labour-intensiveness of the task led us to turn to techniques from automatic speaker recognition (ASR) to automate the procedure. In collaboration with the LIA, we developed a system for automatic classification of dysphonia based on ASR techniques; this is the subject of our second study. The first results obtained with this system suggest that unvoiced consonants are the most informative in the task of automatic classification of dysphonia. This result is surprising, since it is often assumed that dysphonia affects only laryngeal vibration. In the third study, we present our hypotheses and experiments aimed at explaining this phenomenon.
On the road to somewhere: Brain potentials reflect language effects on motion event perception.
Flecken, Monique; Athanasopoulos, Panos; Kuipers, Jan Rouke; Thierry, Guillaume
2015-08-01
Recent studies have identified neural correlates of language effects on perception in static domains of experience such as colour and objects. The generalization of such effects to dynamic domains like motion events remains elusive. Here, we focus on grammatical differences between languages relevant for the description of motion events and their impact on visual scene perception. Two groups of native speakers of German or English were presented with animated videos featuring a dot travelling along a trajectory towards a geometrical shape (endpoint). English is a language with grammatical aspect in which attention is drawn to trajectory and endpoint of motion events equally. German, in contrast, is a non-aspect language which highlights endpoints. We tested the comparative perceptual saliency of trajectory and endpoint of motion events by presenting motion event animations (primes) followed by a picture symbolising the event (target): In 75% of trials, the animation was followed by a mismatching picture (both trajectory and endpoint were different); in 10% of trials, only the trajectory depicted in the picture matched the prime; in 10% of trials, only the endpoint matched the prime; and in 5% of trials both trajectory and endpoint were matching, which was the condition requiring a response from the participant. In Experiment 1 we recorded event-related brain potentials elicited by the picture in native speakers of German and native speakers of English. German participants exhibited a larger P3 wave in the endpoint match than the trajectory match condition, whereas English speakers showed no P3 amplitude difference between conditions. In Experiment 2 participants performed a behavioural motion matching task using the same stimuli as those used in Experiment 1. German and English participants did not differ in response times showing that motion event verbalisation cannot readily account for the difference in P3 amplitude found in the first experiment. 
We argue that, even in a non-verbal context, the grammatical properties of the native language and associated sentence-level patterns of event encoding influence motion event perception, such that attention is automatically drawn towards aspects highlighted by the grammar. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
Holzrichter, J.F.; Ng, L.C.
1998-03-17
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as combined voiced and unvoiced, speech. The methods also describe how to deconvolve the speech excitation function from the acoustic speech output to obtain the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
Holzrichter, John F.; Ng, Lawrence C.
1998-01-01
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as combined voiced and unvoiced, speech. The methods also describe how to deconvolve the speech excitation function from the acoustic speech output to obtain the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.
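Deconvolving a known excitation function from the acoustic output to obtain the per-frame transfer function can be sketched as regularised spectral division. The toy impulse response and random excitation below are illustrative; a real implementation would use the EM-sensor-derived excitation data the patent describes:

```python
import numpy as np

rng = np.random.default_rng(2)
h = np.array([1.0, -0.6, 0.25, -0.1])     # toy vocal-tract impulse response
excitation = rng.normal(size=64)          # toy excitation over one frame

# Synthesise one frame of "speech" (circular convolution, for a clean demo).
H = np.fft.rfft(np.pad(h, (0, 64 - h.size)))
E = np.fft.rfft(excitation)
speech = np.fft.irfft(E * H, n=64)

# Deconvolve the known excitation from the output to recover the transfer
# function, with Tikhonov-style regularisation where |E| is near zero.
S = np.fft.rfft(speech)
H_hat = S * np.conj(E) / (np.abs(E) ** 2 + 1e-8)
h_hat = np.fft.irfft(H_hat, n=64)[:h.size]
print(np.round(h_hat, 3))                 # ≈ [ 1.  -0.6   0.25 -0.1 ]
```

The recovered transfer function (or its impulse response) would then be packed into the per-frame feature vector alongside the excitation description.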
[The maintenance of automatic analysers and associated documentation].
Adjidé, V; Fournier, P; Vassault, A
2010-12-01
The maintenance of automatic analysers and the associated documentation, which form part of the requirements of the ISO 15189 standard as well as of French regulation, have to be defined in the laboratory's policy. The management of periodic maintenance and its documentation shall be implemented and carried out. Corrective maintenance has to be organised so as to avoid interrupting the work of the laboratory. The various recommendations concern the identification of materials, including automatic analysers; the environmental conditions to take into account; the documentation provided by the manufacturer; and the documents prepared by the laboratory, including maintenance procedures.
Automatic integration of data from dissimilar sensors
NASA Astrophysics Data System (ADS)
Citrin, W. I.; Proue, R. W.; Thomas, J. W.
The present investigation is concerned with the automatic integration of radar and electronic support measures (ESM) sensor data, and with the development of a method for the automatic integration of identification friend or foe (IFF) and radar sensor data. On the basis of the two projects considered, significant advances have been made in sensor data integration. It is pointed out that the log-likelihood approach to sensor data correlation is appropriate for both similar and dissimilar sensor data. Attention is given to the real-time integration of radar and ESM sensor data, and to a radar/ESM correlation simulation program.
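The log-likelihood approach to correlating reports from dissimilar sensors can be sketched under a Gaussian measurement-error model; the covariance and positions below are invented for illustration, not taken from the study:

```python
import numpy as np

def log_likelihood_assoc(z_a, z_b, R):
    """Log-likelihood that two sensor reports originate from one platform,
    assuming their difference is zero-mean Gaussian with covariance R."""
    d = z_a - z_b
    k = d.size
    return -0.5 * (d @ np.linalg.solve(R, d)
                   + np.log(np.linalg.det(R)) + k * np.log(2 * np.pi))

R = np.diag([100.0, 100.0])                      # combined error covariance (m^2)
same = log_likelihood_assoc(np.array([1000.0, 500.0]),
                            np.array([1005.0, 498.0]), R)   # nearby reports
diff = log_likelihood_assoc(np.array([1000.0, 500.0]),
                            np.array([1300.0, 650.0]), R)   # distant reports
print(same > diff)   # → True
```

Pairing radar and ESM reports then reduces to comparing such log-likelihoods (or thresholding them) across candidate associations.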
Song, Dandan; Li, Ning; Liao, Lejian
2015-01-01
Because enormous amounts of data can now be generated at lower cost and in shorter time, whole-exome sequencing technologies provide dramatic opportunities for identifying disease genes implicated in Mendelian disorders. Since upwards of thousands of genomic variants can be sequenced in each exome, it is challenging to filter pathogenic variants in protein-coding regions while keeping the number of missed true variants low. Therefore, an automatic and efficient pipeline for finding disease variants in Mendelian disorders is designed by exploiting a combination of variant-filtering steps to analyse family-based exome sequencing data. Recent studies on Freeman-Sheldon syndrome are revisited, and the results show that the proposed method outperforms other existing candidate-gene identification methods.
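The kind of filter chain such a pipeline applies can be sketched with a toy variant table; the field names and thresholds below are hypothetical, though MYH3 is indeed the gene implicated in Freeman-Sheldon syndrome:

```python
# Illustrative family-based exome filtering; not any specific tool's schema.
variants = [
    {"gene": "MYH3",  "coding": True,  "pop_freq": 0.0001, "in_all_affected": True},
    {"gene": "TTN",   "coding": True,  "pop_freq": 0.0500, "in_all_affected": True},
    {"gene": "OR4F5", "coding": False, "pop_freq": 0.0001, "in_all_affected": True},
    {"gene": "BRCA2", "coding": True,  "pop_freq": 0.0001, "in_all_affected": False},
]

def candidate_disease_genes(variants, max_pop_freq=0.001):
    keep = (v for v in variants if v["coding"])                 # protein-coding only
    keep = (v for v in keep if v["pop_freq"] <= max_pop_freq)   # rare in population
    keep = [v for v in keep if v["in_all_affected"]]            # segregates in family
    return [v["gene"] for v in keep]

print(candidate_disease_genes(variants))   # → ['MYH3']
```

Each successive filter discards variants that are non-coding, common, or that fail to segregate with the phenotype, leaving a short candidate list.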
Training Japanese listeners to identify English /r/ and /l/: A first report
Logan, John S.; Lively, Scott E.; Pisoni, David B.
2012-01-01
Native speakers of Japanese learning English generally have difficulty differentiating the phonemes /r/ and /l/, even after years of experience with English. Previous research that attempted to train Japanese listeners to distinguish this contrast using synthetic stimuli reported little success, especially when transfer to natural tokens containing /r/ and /l/ was tested. In the present study, a different training procedure that emphasized variability among stimulus tokens was used. Japanese subjects were trained in a minimal pair identification paradigm using multiple natural exemplars contrasting /r/ and /l/ from a variety of phonetic environments as stimuli. A pretest–posttest design containing natural tokens was used to assess the effects of training. Results from six subjects showed that the new procedure was more robust than earlier training techniques. Small but reliable differences in performance were obtained between pretest and posttest scores. The results demonstrate the importance of stimulus variability and task-related factors in training nonnative speakers to perceive novel phonetic contrasts that are not distinctive in their native language. PMID:2016438
MARTI: man-machine animation real-time interface
NASA Astrophysics Data System (ADS)
Jones, Christian M.; Dlay, Satnam S.
1997-05-01
The research introduces MARTI (man-machine animation real-time interface) for the realization of natural human-machine interfacing. The system uses simple vocal sound-tracks of human speakers to provide lip synchronization of computer graphical facial models. We present novel research in a number of engineering disciplines, including speech recognition, facial modeling, and computer animation. This interdisciplinary research utilizes the latest hybrid connectionist/hidden Markov model speech recognition system to provide very accurate phone recognition and timing for speaker-independent continuous speech, and expands on knowledge from the animation industry in the development of accurate facial models and automated animation. The research has many real-world applications, which include a highly accurate and 'natural' man-machine interface to assist user interactions with computer systems and communication with one another using human idiosyncrasies; a complete special-effects and animation toolbox providing automatic lip synchronization without the normal constraints of head-sets, joysticks, and skilled animators; compression of video data to well below standard telecommunication channel bandwidth for video communications and multimedia systems; assistance with speech training and aids for the handicapped; and player interaction for 'video gaming' and 'virtual worlds.' MARTI has introduced a previously unseen level of realism to man-machine interfacing and special-effect animation.
2008-12-01
Kung, H. T.; Lin, Chit-Kwan; Su, Chia-Yung; Vlah, Dario; Grieco, John; Huggins, Mark; Suter, Bruce (Harvard University; Air Force Research Laboratory). [Only report front matter and reference fragments are recoverable from the source; no abstract.]
Code of Federal Regulations, 2011 CFR
2011-10-01
..., and 3.358-3.6 GHz. (a) Operation under the provisions of this section is limited to automatic vehicle identification systems (AVIS) which use swept frequency techniques for the purpose of automatically identifying transportation vehicles. (b) The field strength anywhere within the frequency range swept by the signal shall not...
Director, Operational Test and Evaluation FY 2004 Annual Report
2004-01-01
[Front-matter fragments: program listings including Space Based Radar (SBR), Sensor Fuzed Weapon (SFW) P3I (CBU-97/B), Small Diameter Bomb (SDB), and Secure Mobile Anti-Jam Reliable Tactical Terminal.] The system must automatically detect and identify up to ten … and provide detection, identification, and sampling capability for both fixed-site and mobile operations. The Services envision JCAD as a hand-held device that automatically detects, identifies, and …
Ferrández, Oscar; South, Brett R; Shen, Shuying; Friedlin, F Jeffrey; Samore, Matthew H; Meystre, Stéphane M
2012-07-27
The increased use and adoption of Electronic Health Records (EHR) causes a tremendous growth in digital information useful for clinicians, researchers and many other operational purposes. However, this information is rich in Protected Health Information (PHI), which severely restricts its access and possible uses. A number of investigators have developed methods for automatically de-identifying EHR documents by removing PHI, as specified in the Health Insurance Portability and Accountability Act "Safe Harbor" method. This study focuses on the evaluation of existing automated text de-identification methods and tools, as applied to Veterans Health Administration (VHA) clinical documents, to assess which methods perform better with each category of PHI found in our clinical notes, and when new methods are needed to improve performance. We installed and evaluated five text de-identification systems "out-of-the-box" using a corpus of VHA clinical documents. The systems based on machine learning methods were trained with the 2006 i2b2 de-identification corpora and evaluated with our VHA corpus, and also evaluated with a ten-fold cross-validation experiment using our VHA corpus. We counted exact, partial, and fully contained matches with reference annotations, considering each PHI type separately, or only one unique 'PHI' category. Performance of the systems was assessed using recall (equivalent to sensitivity) and precision (equivalent to positive predictive value) metrics, as well as the F2-measure. Overall, systems based on rules and pattern matching achieved better recall, while precision was always better with systems based on machine learning approaches. The highest "out-of-the-box" F2-measure was 67% for partial matches; the best precision and recall were 95% and 78%, respectively. Finally, the ten-fold cross-validation experiment allowed the F2-measure to increase to 79% with partial matches.
The "out-of-the-box" evaluation of text de-identification systems provided us with compelling insight into the best methods for de-identification of VHA clinical documents. The error analysis demonstrated an important need for customization to PHI formats specific to VHA documents. This study informed the planning and development of a "best-of-breed" automatic de-identification application for VHA clinical text.
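The F2-measure used in the evaluation is the F-beta score with beta = 2, which weights recall above precision, appropriate when missed PHI is costlier than over-redaction. A small sketch with illustrative counts (note the paper's best precision and recall came from different systems, so these numbers do not reproduce its reported F2):

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta measure; beta = 2 weights recall more than precision."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Illustrative counts per 100 reference PHI annotations: recall 0.78 and
# precision ~0.95 correspond to tp=78, fn=22, fp=4.
print(round(f_beta(78, 4, 22), 3))   # → 0.809
```

With beta = 2 the score stays close to recall, which is why a rules-based system with high recall can post a strong F2 despite mediocre precision.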
Wang, Hong-ping; Chen, Chang; Liu, Yan; Yang, Hong-Jun; Wu, Hong-Wei; Xiao, Hong-Bin
2015-11-01
The incomplete identification of the chemical components of traditional Chinese medicinal formulas has been one of the bottlenecks in the modernization of traditional Chinese medicine. Tandem mass spectrometry has been widely used for the identification of chemical substances. Conventional automatic tandem mass spectrometry acquisition, in which precursor ions are selected according to their signal intensity, has a drawback for chemical substance identification when samples contain many overlapping signals: compounds present in minor or trace amounts cannot be identified because most of their tandem mass spectrometry information is lost. Herein, a molecular-feature-orientated precursor ion selection and tandem mass spectrometry structure elucidation method for analysing the complex chemical constituents of Chinese medicine was developed. Precursor ions were selected according to the two-dimensional characteristics of retention time and mass-to-charge ratio ranges of herbal compounds, so that all precursor ions from herbal compounds were included and more minor chemical constituents in Chinese medicine were identified. Compared with conventional automatic tandem mass spectrometry setups, the approach is novel and overcomes this drawback. As an example, 276 compounds from the Chinese medicine Yi-Xin-Shu capsule were identified. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Speaker normalization for Chinese vowel recognition in cochlear implants.
Luo, Xin; Fu, Qian-Jie
2005-07-01
Because of the limited spectro-temporal resolution associated with cochlear implants, implant patients often have greater difficulty with multitalker speech recognition. The present study investigated whether multitalker speech recognition can be improved by applying speaker normalization techniques to cochlear implant speech processing. Multitalker Chinese vowel recognition was tested with normal-hearing Chinese-speaking subjects listening to a 4-channel cochlear implant simulation, with and without speaker normalization. For each subject, speaker normalization was referenced to the speaker that produced the best recognition performance under conditions without speaker normalization. To match the remaining speakers to this "optimal" output pattern, the overall frequency range of the analysis filter bank was adjusted for each speaker according to the ratio of the mean third-formant (F3) frequency values between that speaker and the reference speaker. Results showed that speaker normalization provided a small but significant improvement in subjects' overall recognition performance. After speaker normalization, subjects' patterns of recognition performance across speakers changed, demonstrating the potential for speaker-dependent effects with the proposed normalization technique.
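The normalization step described above reduces to rescaling the filter bank's frequency range by a ratio of mean third-formant values. A minimal sketch under that reading (the function name and the direction of the ratio are our interpretation of the description, not the authors' code):

```python
def normalize_filterbank_range(low_hz, high_hz, speaker_f3, reference_f3):
    """Rescale a speaker's analysis filter-bank range [low_hz, high_hz]
    by the ratio of the reference ("optimal") speaker's mean F3 to this
    speaker's mean F3, so that the processed output pattern approximates
    the reference speaker's.
    """
    ratio = reference_f3 / speaker_f3
    return low_hz * ratio, high_hz * ratio
```

Under this convention, a speaker with a higher mean F3 than the reference gets a proportionally compressed analysis range, and vice versa.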
Valderrama, Joaquin T; de la Torre, Angel; Alvarez, Isaac; Segura, Jose Carlos; Thornton, A Roger D; Sainz, Manuel; Vargas, Jose Luis
2014-05-01
The recording of the auditory brainstem response (ABR) is used worldwide for hearing screening purposes. In this process, a precise estimation of the most relevant components is essential for an accurate interpretation of these signals. This evaluation is usually carried out subjectively by an audiologist. However, the use of automatic methods for this purpose is now being encouraged in order to reduce human evaluation biases and ensure uniformity among test conditions, patients, and screening personnel. This article describes a new method that performs automatic quality assessment and identification of the peaks: the fitted parametric peaks (FPP). This method is based on the use of synthesized peaks that are adjusted to the ABR response. The FPP is validated, on one hand, by an analysis of amplitudes and latencies measured manually by an audiologist and automatically by the FPP method in ABR signals recorded at different stimulation rates; and on the other hand, by contrasting the performance of the FPP method with automatic evaluation techniques based on the correlation coefficient, FSP, and cross-correlation with a predefined template waveform, comparing the automatic quality evaluations of these methods with subjective evaluations provided by five experienced evaluators on a set of ABR signals of different quality. The results of this study suggest (a) that the FPP method can be used to provide an accurate parameterization of the peaks in terms of amplitude, latency, and width, and (b) that the FPP is the method that best approaches the averaged subjective quality evaluation, as well as providing the best results in terms of sensitivity and specificity in ABR signal validation. The significance of these findings and the clinical value of the FPP method are highlighted in this paper. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Oral-diadochokinesis rates across languages: English and Hebrew norms.
Icht, Michal; Ben-David, Boaz M
2014-01-01
Oro-facial and speech motor control disorders represent a variety of speech and language pathologies. Early identification of such problems is important and carries clinical implications. A common and simple tool for gauging the presence and severity of speech motor control impairments is oral-diadochokinesis (oral-DDK). Surprisingly, norms for adult performance are missing from the literature. The goals of this study were: (1) to establish a norm for oral-DDK rate for (young to middle-age) adult English speakers, by collecting data from the literature (five studies, N=141); (2) to investigate the possible effect of language (and culture) on oral-DDK performance, by analyzing studies conducted in other languages (five studies, N=140), alongside the English norm; and (3) to find a new norm for adult Hebrew speakers, by testing 115 speakers. We first offer an English norm with a mean of 6.2 syllables/s (SD=0.8), and a lower boundary of 5.4 syllables/s that can be used to indicate possible abnormality. Next, we found significant differences between the four tested languages (English, Portuguese, Farsi and Greek) in oral-DDK rates. Results suggest the need to set language- and culture-sensitive norms for the application of the oral-DDK task worldwide. Finally, we found the oral-DDK performance of adult Hebrew speakers to be 6.4 syllables/s (SD=0.8), not significantly different from the English norm. This implies possible phonological similarities between English and Hebrew. We further note that no gender effects were found in our study. We recommend using oral-DDK as an important tool in the speech-language pathologist's arsenal. Yet, application of this task should be done carefully, comparing individual performance to a norm set within the specific language. 
Readers will be able to: (1) identify the speech-language pathologist's assessment process using the oral-DDK task, by comparing an individual's performance to the present English norm; (2) describe the impact of language on oral-DDK performance; and (3) accurately assess Hebrew-speaking patients using this tool. Copyright © 2014 Elsevier Inc. All rights reserved.
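The reported lower boundary (5.4 syllables/s) is one standard deviation below the English mean (6.2 - 0.8). A minimal sketch of how such a norm check might be applied in practice, using the values from the abstract (the one-SD cutoff is our reading of the reported boundary):

```python
def ddk_rate(syllables: int, seconds: float) -> float:
    """Oral-DDK rate in syllables per second."""
    return syllables / seconds

def below_lower_boundary(rate: float, norm_mean: float, norm_sd: float) -> bool:
    """Flag rates more than one SD below the language-specific norm mean."""
    return rate < norm_mean - norm_sd

# English norm pooled from the literature (mean, SD), per the abstract
ENGLISH_NORM = (6.2, 0.8)
```

As the abstract stresses, the norm pair would be swapped for a language-specific one (e.g. the Hebrew values) before flagging an individual.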
Crop Identification Technology Assessment for Remote Sensing (CITARS). Volume 1: Task design plan
NASA Technical Reports Server (NTRS)
Hall, F. G.; Bizzell, R. M.
1975-01-01
A plan for quantifying the crop identification performances resulting from the remote identification of corn, soybeans, and wheat is described. Steps for the conversion of multispectral data tapes to classification results are specified. The crop identification performances resulting from the use of several basic types of automatic data processing techniques are compared and examined for significant differences. The techniques are evaluated also for changes in geographic location, time of the year, management practices, and other physical factors. The results of the Crop Identification Technology Assessment for Remote Sensing task will be applied extensively in the Large Area Crop Inventory Experiment.
Lu, Yingjie
2013-01-01
To facilitate patient involvement in online health communities and help patients obtain the informational and emotional support they need, a topic identification approach is proposed in this paper for automatically identifying the topics of health-related messages in an online health community, thus assisting patients in reaching the most relevant messages for their queries efficiently. A feature-based classification framework was presented for automatic topic identification in our study. We first collected messages related to some predefined topics in an online health community. Then we combined three different types of features (n-gram-based features, domain-specific features, and sentiment features) to build four feature sets for health-related text representation. Finally, three different text classification techniques, C4.5, Naïve Bayes and SVM, were adopted to evaluate our topic classification model. By comparing different feature sets and different classification techniques, we found that n-gram-based features, domain-specific features and sentiment features were all effective in distinguishing different types of health-related topics. In addition, feature reduction based on information gain was also effective in improving topic classification performance. In terms of classification techniques, SVM significantly outperformed C4.5 and Naïve Bayes. The experimental results demonstrate that the proposed approach can identify the topics of online health-related messages efficiently.
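The feature reduction step mentioned above ranks features by information gain, i.e. the drop in label entropy once a feature's presence is known. A minimal, library-free sketch of the criterion (illustrative only, not the paper's code):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature_present, labels):
    """IG of a binary feature: H(labels) - H(labels | feature present?)."""
    n = len(labels)
    split = ([], [])
    for present, label in zip(feature_present, labels):
        split[1 if present else 0].append(label)
    conditional = sum(len(part) / n * entropy(part) for part in split if part)
    return entropy(labels) - conditional
```

Features whose presence says nothing about the topic score near zero and can be pruned, while a feature that perfectly separates two topics recovers the full label entropy.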
Shaheen, E; Mowafy, B; Politis, C; Jacobs, R
2017-12-01
Previous research proposed the use of the mandibular midline neurovascular canal structures as a forensic fingerprint. In that observer study, an average correct identification rate of 95% was reached, which triggered the present study. The aim here was to present a semi-automatic computer recognition approach to replace the observers and to validate the accuracy of this newly proposed method. Imaging data from computed tomography (CT) and cone beam computed tomography (CBCT) of mandibles scanned at two different moments were collected to simulate an ante-mortem (AM) and post-mortem (PM) situation, where the first scan represented AM and the second scan was used to simulate PM. Ten cases with 20 scans were used to build a classifier which relies on voxel-based matching and classifies each comparison into one of two groups: "Unmatched" and "Matched". This protocol was then tested using five other scans from the database. Unpaired t-testing was applied and the accuracy of the computerized approach was determined. A significant difference was found between the "Unmatched" and "Matched" classes, with means of 0.41 and 0.86, respectively. Furthermore, the testing phase showed an accuracy of 100%. The validation of this method pushes the protocol further toward a fully automatic victim identification procedure based on the mandibular midline canal structures alone, in cases where AM and PM CBCT/CT data are available.
Lassahn, Gordon D.; Lancaster, Gregory D.; Apel, William A.; Thompson, Vicki S.
2013-01-08
Image portion identification methods, image parsing methods, image parsing systems, and articles of manufacture are described. According to one embodiment, an image portion identification method includes accessing data regarding an image depicting a plurality of biological substrates corresponding to at least one biological sample and indicating presence of at least one biological indicator within the biological sample and, using processing circuitry, automatically identifying a portion of the image depicting one of the biological substrates but not others of the biological substrates.
ERIC Educational Resources Information Center
Boger, Zvi; Kuflik, Tsvi; Shoval, Peretz; Shapira, Bracha
2001-01-01
Discussion of information filtering (IF) and information retrieval focuses on the use of an artificial neural network (ANN) as an alternative method for both IF and term selection and compares its effectiveness to that of traditional methods. Results show that the ANN relevance prediction out-performs the prediction of an IF system. (Author/LRW)
Plat, Rika; Lowie, Wander; de Bot, Kees
2017-01-01
Reaction time data have long been collected in order to gain insight into the underlying mechanisms involved in language processing. Means analyses often attempt to break down what factors relate to what portion of the total reaction time. From a dynamic systems theory perspective or an interaction dominant view of language processing, it is impossible to isolate discrete factors contributing to language processing, since these continually and interactively play a role. Non-linear analyses offer the tools to investigate the underlying process of language use in time, without having to isolate discrete factors. Patterns of variability in reaction time data may disclose the relative contribution of automatic (grapheme-to-phoneme conversion) processing and attention-demanding (semantic) processing. The presence of a fractal structure in the variability of a reaction time series indicates automaticity in the mental structures contributing to a task. A decorrelated pattern of variability will indicate a higher degree of attention-demanding processing. A focus on variability patterns allows us to examine the relative contribution of automatic and attention-demanding processing when a speaker is using the mother tongue (L1) or a second language (L2). A word naming task conducted in the L1 (Dutch) and L2 (English) shows L1 word processing to rely more on automatic spelling-to-sound conversion than L2 word processing. A word naming task with a semantic categorization subtask showed more reliance on attention-demanding semantic processing when using the L2. A comparison to L1 English data shows this was not only due to the amount of language use or language dominance, but also to the difference in orthographic depth between Dutch and English. 
An important implication of this finding is that when the same task is used to test and compare different languages, one cannot straightforwardly assume that the same cognitive subprocesses are involved to an equal degree across those languages.
NASA Astrophysics Data System (ADS)
Chen, Jin; Cheng, Wen; Lopresti, Daniel
2011-01-01
Since real data are time-consuming and expensive to collect and label, researchers have proposed approaches using synthetic variations for the tasks of signature verification, speaker authentication, handwriting recognition, keyword spotting, etc. However, the limitation of real data is particularly critical in the field of writer identification, in that, in forensics, adversaries cannot be expected to provide sufficient data to train a classifier. Therefore, it is unrealistic to always assume sufficient real data to train classifiers extensively for writer identification. In addition, this field differs from many others in that we strive to preserve as much inter-writer variation as possible, and model-perturbed handwriting might break such discriminability among writers. Building on work described in another paper, in which human subjects were involved in calibrating realistic-looking transformations, we measured the effects of incorporating perturbed handwriting into the training dataset. Experimental results justified our hypothesis that, with limited real data, model-perturbed handwriting improves the performance of writer identification. In particular, if only a single sample per writer was available, incorporating perturbed data achieved a 36x performance gain.
Automatic Processing and the Unitization of Two Features.
1980-02-01
experiment, LaBerge (1973) showed that with practice two features could be automatically unitized to form a novel character. We wish to address a...different from a search for a target which requires identification of one of the features alone. Indeed, LaBerge (1973) used a similar implicit...perception? Journal of Experimental Child Psychology, 1978, 26, 498-507. LaBerge, D. Attention and the measurement of perceptual learning. Memory and
Wolterink, Jelmer M; Leiner, Tim; de Vos, Bob D; Coatrieux, Jean-Louis; Kelm, B Michael; Kondo, Satoshi; Salgado, Rodrigo A; Shahzad, Rahil; Shu, Huazhong; Snoeren, Miranda; Takx, Richard A P; van Vliet, Lucas J; van Walsum, Theo; Willems, Tineke P; Yang, Guanyu; Zheng, Yefeng; Viergever, Max A; Išgum, Ivana
2016-05-01
The amount of coronary artery calcification (CAC) is a strong and independent predictor of cardiovascular disease (CVD) events. In clinical practice, CAC is manually identified and automatically quantified in cardiac CT using commercially available software. This is a tedious and time-consuming process in large-scale studies. Therefore, a number of automatic methods that require no interaction and semiautomatic methods that require very limited interaction for the identification of CAC in cardiac CT have been proposed. Thus far, a comparison of their performance has been lacking. The objective of this study was to perform an independent evaluation of (semi)automatic methods for CAC scoring in cardiac CT using a publicly available standardized framework. Cardiac CT exams of 72 patients distributed over four CVD risk categories were provided for (semi)automatic CAC scoring. Each exam consisted of a noncontrast-enhanced calcium scoring CT (CSCT) and a corresponding coronary CT angiography (CCTA) scan. The exams were acquired in four different hospitals using state-of-the-art equipment from four major CT scanner vendors. The data were divided into 32 training exams and 40 test exams. A reference standard for CAC in CSCT was defined by consensus of two experts following a clinical protocol. The framework organizers evaluated the performance of (semi)automatic methods on test CSCT scans, per lesion, artery, and patient. Five (semi)automatic methods were evaluated. Four methods used both CSCT and CCTA to identify CAC, and one method used only CSCT. The evaluated methods correctly detected between 52% and 94% of CAC lesions with positive predictive values between 65% and 96%. Lesions in distal coronary arteries were most commonly missed and aortic calcifications close to the coronary ostia were the most common false positive errors. The majority (between 88% and 98%) of correctly identified CAC lesions were assigned to the correct artery. 
Linearly weighted Cohen's kappa for patient CVD risk categorization by the evaluated methods ranged from 0.80 to 1.00. A publicly available standardized framework for the evaluation of (semi)automatic methods for CAC identification in cardiac CT is described. An evaluation of five (semi)automatic methods within this framework shows that automatic per patient CVD risk categorization is feasible. CAC lesions at ambiguous locations such as the coronary ostia remain challenging, but their detection had limited impact on CVD risk determination.
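The agreement statistic used for risk categorization, linearly weighted Cohen's kappa, penalizes disagreements in proportion to how many ordinal categories apart the two risk assignments fall. A minimal sketch for categories 0..k-1 (our illustration, not the study's implementation):

```python
def linear_weighted_kappa(rater_a, rater_b, k):
    """Linearly weighted Cohen's kappa for two raters assigning each of n
    subjects to one of k ordered categories (0..k-1). Disagreement weight
    is |i - j|; kappa = 1 - observed / chance-expected weighted disagreement.
    Assumes the ratings vary, so the expected disagreement is nonzero.
    """
    n = len(rater_a)
    observed = sum(abs(a - b) for a, b in zip(rater_a, rater_b)) / n
    # marginal category probabilities for each rater
    pa = [sum(1 for a in rater_a if a == i) / n for i in range(k)]
    pb = [sum(1 for b in rater_b if b == i) / n for i in range(k)]
    expected = sum(pa[i] * pb[j] * abs(i - j)
                   for i in range(k) for j in range(k))
    return 1.0 - observed / expected
```

Perfect agreement gives 1.0 (the range reported above topped out there), while systematic disagreement can drive the statistic negative.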
NASA Astrophysics Data System (ADS)
Sa, Qila; Wang, Zhihui
2018-03-01
At present, content-based video retrieval (CBVR) is the mainstream video retrieval method, using a video's own features to perform automatic identification and retrieval. This method involves a key technology: shot segmentation. In this paper, a method of automatic video shot boundary detection with K-means clustering and improved adaptive dual-threshold comparison is proposed. First, the visual features of every frame are extracted and divided into two categories using the K-means clustering algorithm, namely, frames with significant change and frames with no significant change. Then, based on the classification results, the improved adaptive dual-threshold comparison method is used to determine the abrupt as well as gradual shot boundaries. Finally, an automatic video shot boundary detection system is achieved.
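The two-stage idea can be sketched as one-dimensional 2-means over inter-frame difference values followed by a dual-threshold pass. A minimal, library-free sketch (feature choice, names, and fixed thresholds are illustrative; the paper's thresholds are adaptive):

```python
def two_means_1d(values, iters=100):
    """Cluster scalar inter-frame differences into 'no significant change'
    (label 0) and 'significant change' (label 1) with 1-D k-means, k=2."""
    centers = [min(values), max(values)]          # seed at the extremes
    for _ in range(iters):
        groups = ([], [])
        for v in values:
            # bool index: True (1) means closer to the high-change center
            groups[abs(v - centers[0]) > abs(v - centers[1])].append(v)
        updated = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return [int(abs(v - centers[0]) > abs(v - centers[1])) for v in values]

def shot_boundaries(diffs, t_high, t_low):
    """Dual-threshold pass over frames the clustering flags as significant:
    differences >= t_high are abrupt cuts; those in [t_low, t_high) are
    gradual-transition candidates."""
    labels = two_means_1d(diffs)
    boundaries = []
    for i, (d, significant) in enumerate(zip(diffs, labels)):
        if significant and d >= t_high:
            boundaries.append((i, "abrupt"))
        elif significant and d >= t_low:
            boundaries.append((i, "gradual"))
    return boundaries
```

In a full system the per-frame "difference" would itself be a distance between visual feature vectors (e.g. color histograms) of adjacent frames.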
NASA Astrophysics Data System (ADS)
Csorba, Robert
2002-09-01
The General Accounting Office found that the Navy, between 1996 and 1998, lost $3 billion worth of materiel in transit. This thesis explores the benefits and costs of the automatic identification and serial number tracking technologies under consideration by the Naval Supply Systems Command and the Naval Air Systems Command. Detailed cost-savings estimates are made for each aircraft type in the Navy inventory. Project and item managers of repairable components using serial number tracking were surveyed as to the value of this system. The thesis concludes that two-thirds of the in-transit losses could be avoided with the implementation of effective information technology-based logistics and maintenance tracking systems. Recommendations are made for specific steps and components of such an implementation, and suggestions are made for further research.
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holzrichter, J.F.; Ng, L.C.
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as combined voiced and unvoiced, speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well-defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
Blue-green color categorization in Mandarin-English speakers.
Wuerger, Sophie; Xiao, Kaida; Mylonas, Dimitris; Huang, Qingmei; Karatzas, Dimosthenis; Hird, Emily; Paramei, Galina
2012-02-01
Observers are faster to detect a target among a set of distracters if the targets and distracters come from different color categories. This cross-boundary advantage seems to be limited to the right visual field, which is consistent with the dominance of the left hemisphere for language processing [Gilbert et al., Proc. Natl. Acad. Sci. USA 103, 489 (2006)]. Here we study whether a similar visual field advantage is found in the color identification task in speakers of Mandarin, a language that uses a logographic system. Forty late Mandarin-English bilinguals performed a blue-green color categorization task, in a blocked design, in their first language (L1: Mandarin) or second language (L2: English). Eleven color singletons ranging from blue to green were presented for 160 ms, randomly in the left visual field (LVF) or right visual field (RVF). Color boundary and reaction times (RTs) at the color boundary were estimated in L1 and L2, for both visual fields. We found that the color boundary did not differ between the languages; RTs at the color boundary, however, were on average more than 100 ms shorter in the English compared to the Mandarin sessions, but only when the stimuli were presented in the RVF. The finding may be explained by the script nature of the two languages: Mandarin logographic characters are analyzed visuospatially in the right hemisphere, which conceivably facilitates identification of color presented to the LVF. © 2012 Optical Society of America
Qi, Beier; Liu, Bo; Liu, Sha; Liu, Haihong; Dong, Ruijuan; Zhang, Ning; Gong, Shusheng
2011-05-01
To study the effect of cochlear electrode coverage and different insertion regions on speech recognition, especially tone perception, in cochlear implant users whose native language is Mandarin Chinese. Seven test conditions were set with the fitting software; all conditions were created by switching respective channels on or off in order to simulate different insertion positions. Mandarin CI users then received four speech tests: a Vowel Identification test, a Consonant Identification test, a Tone Identification test (male speaker), and the Mandarin HINT test (SRS) in quiet and noise. Across the test conditions, the average vowel identification score differed significantly, from 56% to 91% (rank sum test, P < 0.05). The average consonant identification score differed significantly, from 72% to 85% (ANOVA, P < 0.05). The average tone identification score was not significantly different (ANOVA, P > 0.05); however, the more channels activated, the higher the scores obtained, from 68% to 81%. This study shows that there is a correlation between insertion depth and speech recognition. Because all parts of the basilar membrane can help CI users to improve their speech recognition ability, it is very important to enhance the verbal communication and social interaction abilities of CI users by increasing insertion depth and actively stimulating the apical region of the cochlea.
77 FR 28923 - Shipping Coordinating Committee; Notice of Committee Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-16
... new symbols for Automatic Identification System (AIS) aids to navigation --Casualty analysis..., parking in the vicinity of the building is extremely limited. Additional information regarding this and...
Automatic identification of alpine mass movements based on seismic and infrasound signals
NASA Astrophysics Data System (ADS)
Schimmel, Andreas; Hübl, Johannes
2017-04-01
The automatic detection and identification of alpine mass movements such as debris flows, debris floods, or landslides is of increasing importance for mitigation measures in the densely populated and intensively used alpine regions. Since these mass movement processes emit characteristic seismic and acoustic waves in the low frequency range, such events can be detected and identified from these signals. Several approaches for detection and warning systems based on seismic or infrasound signals have already been developed, but a combination of both methods, which can increase detection probability and reduce false alarms, is currently used very rarely and is a promising basis for an automatic detection and identification system. This work therefore presents an approach for a detection and identification system based on a combination of seismic and infrasound sensors, which can detect sediment-related mass movements from a remote location unaffected by the process. The system consists of one infrasound sensor and one geophone, placed co-located, and a microcontroller on which a specially designed detection algorithm runs, detecting mass movements in real time directly at the sensor site. Further, this work attempts to extract more information from the seismic and infrasound spectra produced by different sediment-related mass movements, in order to identify the process type and estimate the magnitude of the event. The system is currently installed and tested at five test sites in Austria, two in Italy, one in Switzerland, and one in Germany. This large number of test sites provides a database of very different events, which will be the basis for a new identification method for alpine mass movements. The tests show promising results, and the system thus offers an easy-to-install and inexpensive approach to a detection and warning system.
Identification of Cichlid Fishes from Lake Malawi Using Computer Vision
Joo, Deokjin; Kwan, Ye-seul; Song, Jongwoo; Pinho, Catarina; Hey, Jody; Won, Yong-Jin
2013-01-01
Background: The explosively radiating evolution of cichlid fishes of Lake Malawi has yielded an amazing number of haplochromine species, estimated at as many as 500 to 800, with a surprising degree of diversity not only in color and stripe pattern but also in the shape of jaw and body. As these morphological diversities have been a central subject of adaptive speciation and taxonomic classification, such high diversity could serve as a foundation for automating species identification of cichlids. Methodology/Principal Findings: Here we demonstrate a method for automatic classification of the Lake Malawi cichlids based on computer vision and geometric morphometrics. To this end we developed a pipeline that integrates multiple image processing tools to automatically extract informative features of color and stripe patterns from a large set of photographic images of wild cichlids. The extracted information was evaluated by the statistical classifiers Support Vector Machine and Random Forests. Both classifiers performed better when body shape information was added to the color and stripe features; beyond the coloration and stripe pattern, body shape variables boosted classification accuracy by about 10%. The programs were able to classify 594 live cichlid individuals belonging to 12 different classes (species and sexes) with an average accuracy of 78%, in contrast to a mere 42% success rate by human eyes. The variables that contributed most to the accuracy were body height and the hue of the most frequent color. Conclusions: Computer vision showed a notable performance in extracting information from the color and stripe patterns of Lake Malawi cichlids, although the information was not enough for errorless species identification. Our results indicate that there appears to be an unavoidable difficulty in automatic species identification of cichlid fishes, which may arise from short divergence times and gene flow between closely related species. PMID:24204918
Automatic Identification of Subtechniques in Skating-Style Roller Skiing Using Inertial Sensors
Sakurai, Yoshihisa; Fujita, Zenya; Ishige, Yusuke
2016-01-01
This study aims to develop and validate an automated system for identifying skating-style cross-country subtechniques using inertial sensors. In the first experiment, the performance of a male cross-country skier was used to develop an automated identification system. In the second, eight male and seven female college cross-country skiers participated to validate the developed identification system. Each subject wore inertial sensors on both wrists and both roller skis, and a small video camera on a backpack. All subjects skied through a 3450 m roller ski course using a skating style at their maximum speed. The adopted subtechniques were identified by the automated method based on the data obtained from the sensors, as well as by visual observations from a video recording of the same ski run. The system correctly identified 6418 subtechniques from a total of 6768 cycles, which indicates an accuracy of 94.8%. The precisions of the automatic system for identifying the V1R, V1L, V2R, V2L, V2AR, and V2AL subtechniques were 87.6%, 87.0%, 97.5%, 97.8%, 92.1%, and 92.0%, respectively. Most incorrect identification cases occurred during a subtechnique identification that included a transition and turn event. Identification accuracy can be improved by separately identifying transition and turn events. This system could be used to evaluate each skier’s subtechniques in course conditions. PMID:27049388
Transmit: An Advanced Traffic Management System
DOT National Transportation Integrated Search
1995-11-27
TRANSCOM's system for managing incidents and traffic, known as TRANSMIT, was initiated to establish the feasibility of using Automatic Vehicle Identification (AVI) equipment for traffic management and surveillance applications. AVI technology systems...
Tasking and sharing sensing assets using controlled natural language
NASA Astrophysics Data System (ADS)
Preece, Alun; Pizzocaro, Diego; Braines, David; Mott, David
2012-06-01
We introduce an approach to representing intelligence, surveillance, and reconnaissance (ISR) tasks at a relatively high level in controlled natural language. We demonstrate that this facilitates both human interpretation and machine processing of tasks. More specifically, it allows the automatic assignment of sensing assets to tasks, and the informed sharing of tasks between collaborating users in a coalition environment. To enable automatic matching of sensor types to tasks, we created a machine-processable knowledge representation based on the Military Missions and Means Framework (MMF), and implemented a semantic reasoner to match task types to sensor types. We combined this mechanism with a sensor-task assignment procedure based on a well-known distributed protocol for resource allocation. In this paper, we re-formulate the MMF ontology in Controlled English (CE), a type of controlled natural language designed to be readable by a native English speaker whilst representing information in a structured, unambiguous form to facilitate machine processing. We show how CE can be used to describe both ISR tasks (for example, detection, localization, or identification of particular kinds of object) and sensing assets (for example, acoustic, visual, or seismic sensors, mounted on motes or unmanned vehicles). We show how these representations enable an automatic sensor-task assignment process. Where a group of users are cooperating in a coalition, we show how CE task summaries give users in the field a high-level picture of ISR coverage of an area of interest. This allows them to make efficient use of sensing resources by sharing tasks.
Xiao, Bo; Huang, Chewei; Imel, Zac E; Atkins, David C; Georgiou, Panayiotis; Narayanan, Shrikanth S
2016-04-01
Scaling up psychotherapy services such as addiction counseling is a critical societal need. One challenge is ensuring the quality of therapy, due to the heavy cost of manual observational assessment. This work proposes a speech technology-based system to automate the assessment of therapist empathy, a key therapy quality index, from audio recordings of the psychotherapy interactions. We designed a speech processing system that includes voice activity detection and diarization modules, and an automatic speech recognizer plus a speaker role matching module to extract the therapist's language cues. We employed Maximum Entropy models, Maximum Likelihood language models, and a Lattice Rescoring method to characterize high vs. low empathic language. We estimated therapy-session-level empathy codes using utterance-level evidence obtained from these models. Our experiments showed that the fully automated system achieved a correlation of 0.643 between expert-annotated empathy codes and machine-derived estimations, and an accuracy of 81% in classifying high vs. low empathy, in comparison to a 0.721 correlation and 86% accuracy in the oracle setting using manual transcripts. The results show that the system provides useful information that can contribute to automatic quality assurance and therapist training.
Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review
NASA Astrophysics Data System (ADS)
Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH
2017-09-01
This paper reviews the state of the art in automatic speech recognition (ASR) based approaches for speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from a speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR based solutions are increasingly being researched for speech and language therapy. ASR is a technology that transcribes human speech into text by matching it against the system's library. This is particularly useful in speech rehabilitation therapies, as ASR systems provide accurate, real-time evaluation of speech input from an individual with a speech disorder. ASR based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback on their mistakes. However, the accuracy of ASR depends on many factors, such as phoneme recognition, speech continuity, speaker and environmental differences, as well as our depth of knowledge of human language understanding. Hence, the review examines recent developments in ASR technologies and their performance for individuals with speech and language disorders.
Investigations into the Properties, Conditions, and Effects of the Ionosphere
1990-01-15
ionogram database to be used in testing trace-identification algorithms; d. Development of automatic trace-identification algorithms and autoscaling ...Scaler ( ARTIST ) and improvement of the ARTIST software; g. Maintenance and upgrade of the digital ionosondes at Argentia, Newfoundland, and Goose Bay...provided by the contractor; j. Upgrade of the ARTIST computer at the Danish Meteorological Institute/GL Qaanaaq site to provide digisonde tape-playback
Automatic identification and location technology of glass insulator self-shattering
NASA Astrophysics Data System (ADS)
Huang, Xinbo; Zhang, Huiying; Zhang, Ye
2017-11-01
Insulators are among the most important infrastructure components of transmission lines and are vital to their safe operation under complex and harsh conditions. Glass insulators often self-shatter, but the available identification methods are inefficient and unreliable. To address this, an automatic identification and localization technology for self-shattered glass insulators is proposed, consisting of cameras installed on tower video monitoring devices or unmanned aerial vehicles, a 4G/OPGW network, and a monitoring center, where the identification and localization algorithm is embedded in the expert software. First, images of insulators are captured by the cameras and processed to identify the region of the insulator string using the presented insulator-string identification algorithm. Second, according to the characteristics of the insulator string image, a mathematical model of the insulator string is established to estimate the direction and length of the sliding blocks. Third, local binary pattern (LBP) histograms of the template and the sliding block are extracted, by which a self-shattered insulator can be recognized and located. Finally, a series of experiments was carried out to verify the effectiveness of the algorithm. For single-insulator images, the accuracy (Ac), precision (Pr), and recall (Rc) of the algorithm are 94.5%, 92.38%, and 96.78%, respectively. For double-insulator images, Ac, Pr, and Rc are 90.00%, 86.36%, and 93.23%, respectively.
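A minimal sketch of the third step, comparing local binary pattern (LBP) histograms of a template and a sliding block, is given below; the plain 8-neighbour LBP and histogram intersection are simplifying assumptions, not the paper's exact implementation:

```python
import numpy as np

def lbp_image(gray):
    # Basic 8-neighbour LBP codes for interior pixels: each neighbour
    # contributes one bit set when it is >= the centre pixel.
    c = gray[1:-1, 1:-1]
    codes = np.zeros_like(c, dtype=np.uint8)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = gray.shape
    for bit, (dy, dx) in enumerate(shifts):
        n = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (n >= c).astype(np.uint8) << bit
    return codes

def lbp_hist(gray):
    # Normalised 256-bin histogram of LBP codes.
    h = np.bincount(lbp_image(gray).ravel(), minlength=256).astype(float)
    return h / h.sum()

def hist_similarity(h1, h2):
    # Histogram intersection: 1.0 for identical distributions,
    # lower when a block (e.g. a shattered cap) deviates from the template.
    return float(np.minimum(h1, h2).sum())
```

A shattered insulator cap would show up as a sliding block whose histogram similarity to the intact-cap template falls below some threshold.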
78 FR 32699 - Shipping Coordinating Committee; Notice of Committee Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-31
...)) --Revision of the Guidelines for the onboard operational use of shipborne automatic identification systems... transportation is not generally available). However, parking in the vicinity of the building is limited...
Speaking fundamental frequency and vowel formant frequencies: effects on perception of gender.
Gelfer, Marylou Pausewang; Bennett, Quinn E
2013-09-01
The purpose of the present study was to investigate the contribution of vowel formant frequencies to gender identification in connected speech, the distinctiveness of vowel formants in males versus females, and how ambiguous speaking fundamental frequencies (SFFs) and vowel formants might affect perception of gender. Multivalent experimental. Speaker subjects (eight tall males, eight short females, and seven males and seven females of "middle" height) were recorded saying two carrier phrases to elicit the vowels /i/ and /α/ and a sentence. The gender/height groups were selected to (presumably) maximize formant differences between some groups (tall vs short) and minimize differences between others (middle height). Each subject's samples were digitally altered to distinct SFFs (116, 145, 155, 165, and 207 Hz) to represent SFFs typical of average males, average females, and in an ambiguous range. Listeners judged the gender of each randomized altered speech sample. Results indicated that female speakers were perceived as female even with an SFF in the typical male range. For male speakers, gender perception was less accurate at SFFs of 165 Hz and higher. Although the ranges of vowel formants had considerable overlap between genders, significant differences in formant frequencies of males and females were seen. Vowel formants appeared to be important to perception of gender, especially for SFFs in the range of 145-165 Hz; however, formants may be a more salient cue in connected speech when compared with isolated vowels or syllables. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Effect of gender on communication of health information to older adults.
Dearborn, Jennifer L; Panzer, Victoria P; Burleson, Joseph A; Hornung, Frederick E; Waite, Harrison; Into, Frances H
2006-04-01
To examine the effect of gender on three key elements of communication with elderly individuals: effectiveness of the communication, perceived relevance to the individual, and effect of gender-stereotyped content. Survey. University of Connecticut Health Center. Thirty-three subjects (17 female); aged 69 to 91 (mean+/-standard deviation 82+/-5.4). Older adults listened to 16 brief narratives randomized in order and by the sex of the speaker (Narrator Voice). Effectiveness was measured according to ability to identify key features (Risks), and subjects were asked to rate the relevance (Plausibility). Number of Risks detected and determinations of plausibility were analyzed according to Subject Gender and Narrator Voice. Narratives were written for either sex or included male or female bias (Neutral or Stereotyped). Female subjects identified a significantly higher number of Risks across all narratives (P=.01). Subjects perceived a significantly higher number of Risks with a female Narrator Voice (P=.03). A significant Voice-by-Stereotype interaction was present for female-stereotyped narratives (P=.009). In narratives rated as Plausible, subjects detected more Risks (P=.02). Subject Gender influenced communication effectiveness. A female speaker resulted in identification of more Risks for subjects of both sexes, particularly for Stereotyped narratives. There was no significant effect of matching Subject Gender and Narrator Voice. This study suggests that the sex of the speaker influences the effectiveness of communication with older adults. These findings should motivate future research into the means by which medical providers can improve communication with their patients.
Lee, Soomin; Katsuura, Tetsuo; Shimomura, Yoshihiro
2011-01-01
In recent years, a new type of speaker called the parametric speaker has been used to generate highly directional sound, and these speakers are now commercially available. In our previous study, we verified that the burden of the parametric speaker on endocrine function was lower than that of a general speaker. However, the effects of distances shorter than 2.6 m between parametric speakers and the human body had not yet been examined. Therefore, we investigated the effect of distance on endocrinological function and subjective evaluation. Nine male subjects participated in this study. They completed three consecutive sessions: a 20-min quiet period as a baseline, a 30-min mental task period with general speakers or parametric speakers, and a 20-min recovery period. We measured salivary cortisol and chromogranin A (CgA) concentrations. Furthermore, subjects took the Kwansei-gakuin Sleepiness Scale (KSS) test before and after the task and also a sound quality evaluation test after it. Four experiments, crossing a speaker condition (general speaker vs parametric speaker) with a distance condition (0.3 m vs 1.0 m), were conducted at the same time of day on separate days. We used three-way repeated measures ANOVA (speaker factor × distance factor × time factor) to examine the effects of the parametric speaker. We found that the endocrinological measures did not differ significantly between the speaker conditions or the distance conditions. The results also showed that the physiological burden increased over time independent of the speaker condition and distance condition.
Wolf, M; Miller, L; Donnelly, K
2000-01-01
The most important implication of the double-deficit hypothesis (Wolf & Bowers, in this issue) concerns a new emphasis on fluency and automaticity in intervention for children with developmental reading disabilities. The RAVE-O (Retrieval, Automaticity, Vocabulary Elaboration, Orthography) program is an experimental, fluency-based approach to reading intervention that is designed to accompany a phonological analysis program. In an effort to address multiple possible sources of dysfluency in readers with disabilities, the program involves comprehensive emphases both on fluency in word attack, word identification, and comprehension and on automaticity in underlying componential processes (e.g., phonological, orthographic, semantic, and lexical retrieval skills). The goals, theoretical principles, and applied activities of the RAVE-O curriculum are described with particular stress on facilitating the development of rapid orthographic pattern recognition and on changing children's attitudes toward language.
The Communication of Public Speaking Anxiety: Perceptions of Asian and American Speakers.
ERIC Educational Resources Information Center
Martini, Marianne; And Others
1992-01-01
Finds that U.S. audiences perceive Asian speakers to have more speech anxiety than U.S. speakers, even though Asian speakers do not self-report higher anxiety levels. Confirms that speech state anxiety is not communicated effectively between speakers and audiences for Asian or U.S. speakers. (SR)
An Investigation of Syntactic Priming among German Speakers at Varying Proficiency Levels
ERIC Educational Resources Information Center
Ruf, Helena T.
2011-01-01
This dissertation investigates syntactic priming in second language (L2) development among three speaker populations: (1) less proficient L2 speakers; (2) advanced L2 speakers; and (3) L1 speakers. Using confederate scripting, this study examines how German speakers choose certain word orders in locative constructions (e.g., "Auf dem Tisch…
Modeling Speaker Proficiency, Comprehensibility, and Perceived Competence in a Language Use Domain
ERIC Educational Resources Information Center
Schmidgall, Jonathan Edgar
2013-01-01
Research suggests that listener perceptions of a speaker's oral language use, or a speaker's "comprehensibility," may be influenced by a variety of speaker-, listener-, and context-related factors. Primary speaker factors include aspects of the speaker's proficiency in the target language such as pronunciation and…
Methods and apparatus for non-acoustic speech characterization and recognition
Holzrichter, John F.
1999-01-01
By simultaneously recording EM wave reflections and acoustic speech information, the positions and velocities of the speech organs as speech is articulated can be defined for each acoustic speech unit. Well defined time frames and feature vectors describing the speech, to the degree required, can be formed. Such feature vectors can uniquely characterize the speech unit being articulated each time frame. The onset of speech, rejection of external noise, vocalized pitch periods, articulator conditions, accurate timing, the identification of the speaker, acoustic speech unit recognition, and organ mechanical parameters can be determined.
76 FR 19176 - Shipping Coordinating Committee; Notice of Committee Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2011-04-06
... (SOLAS) regulation V/22 --Development of policy and new symbols for Automatic Identification System (AIS... transportation is not generally available). However, parking in the vicinity of the building is extremely limited...
Better service, greater efficiency : transit management for demand response systems
DOT National Transportation Integrated Search
1999-01-01
This brochure briefly describes different technologies which can enhance demand response transit systems. It covers automated scheduling and dispatching, mobile data terminals, electronic identification cards, automatic vehicle location, and geograph...
Nikolic, Dejan; Stojkovic, Nikola; Lekic, Nikola
2018-04-09
To obtain the complete operational picture of the maritime situation in an Exclusive Economic Zone (EEZ) which lies over the horizon (OTH) requires the integration of data obtained from various sensors. These sensors include high-frequency surface-wave radar (HFSWR), the satellite automatic identification system (SAIS), and the land automatic identification system (LAIS). The algorithm proposed in this paper utilizes radar tracks obtained from a network of HFSWRs, which are already processed by a multi-target tracking algorithm, and associates SAIS and LAIS data with the corresponding radar tracks, thus forming an integrated data pair. During the integration process, all HFSWR targets in the vicinity of AIS data are evaluated and the one which has the highest matching factor is used for data association. On the other hand, if there are multiple AIS data in the vicinity of a single HFSWR track, the algorithm still makes only one data pair, consisting of the AIS and HFSWR data with the highest mutual matching factor. During the design and testing, special attention was given to the latency of AIS data, which can be very high in the EEZs of developing countries. The algorithm was designed, implemented, and tested in a real working environment. The testing environment is located in the Gulf of Guinea and includes a network of two HFSWRs, several coastal sites with LAIS receivers, and SAIS data provided by a SAIS data provider.
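The association step can be sketched as follows; the distance-only matching factor and the one-pair-per-report rule below are simplifying assumptions, since the real factor in the paper also weighs quantities such as course, speed, and AIS latency:

```python
import math

def matching_factor(radar_track, ais_report, max_dist=5.0):
    # Toy matching factor: 1.0 at zero distance, falling linearly to 0
    # at the gating radius max_dist (units arbitrary here).
    d = math.dist(radar_track, ais_report)
    return max(0.0, 1.0 - d / max_dist)

def associate(radar_tracks, ais_reports):
    # Pair each AIS report with the single radar track of highest
    # matching factor, forming one integrated data pair per report.
    pairs = {}
    for aid, pos in ais_reports.items():
        scored = [(matching_factor(trk, pos), tid)
                  for tid, trk in radar_tracks.items()]
        best, tid = max(scored)
        if best > 0.0:
            pairs[aid] = tid
    return pairs
```

The abstract's mutual-best-match rule (one pair even when several AIS reports crowd one track) would need an extra pass over `pairs`; it is omitted here for brevity.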
Oßmann, Barbara E; Sarau, George; Schmitt, Sebastian W; Holtmannspötter, Heinrich; Christiansen, Silke H; Dicke, Wilhelm
2017-06-01
When analysing microplastics in food, it is important for toxicological reasons to achieve clear identification of particles down to a size of at least 1 μm. One reliable optical analytical technique allowing this is micro-Raman spectroscopy. After isolation of particles via filtration, analysis is typically performed directly on the filter surface. In order to obtain high-quality Raman spectra, the material of the membrane filters should not show any interference, in terms of background or Raman signals, during spectrum acquisition. To facilitate the use of automatic particle detection, membrane filters should also show specific optical properties. In this work, besides eight different commercially available membrane filters, three newly designed metal-coated polycarbonate membrane filters were tested against these requirements. We found that aluminium-coated polycarbonate membrane filters had ideal characteristics as a substrate for micro-Raman spectroscopy. Their spectrum shows no or minimal interference with particle spectra, depending on the laser wavelength. Furthermore, automatic particle detection can be applied when analysing the filter surface under dark-field illumination. With this new membrane filter, interference-free analysis of microplastics down to a size of 1 μm becomes possible. Thus, an important size class of these contaminants can now be visualized and spectrally identified. Graphical abstract: A newly developed aluminium-coated polycarbonate membrane filter enables automatic particle detection and the generation of high-quality Raman spectra, allowing identification of small microplastics.
Bergstra, Myrthe; DE Mulder, Hannah N M; Coopmans, Peter
2018-04-06
This study investigated how speaker certainty (a rational cue) and speaker benevolence (an emotional cue) influence children's willingness to learn words in a selective learning paradigm. In two experiments four- to six-year-olds learnt novel labels from two speakers and, after a week, their memory for these labels was reassessed. Results demonstrated that children retained the label-object pairings for at least a week. Furthermore, children preferred to learn from certain over uncertain speakers, but they had no significant preference for nice over nasty speakers. When the cues were combined, children followed certain speakers, even if they were nasty. However, children did prefer to learn from nice and certain speakers over nasty and certain speakers. These results suggest that rational cues regarding a speaker's linguistic competence trump emotional cues regarding a speaker's affective status in word learning. However, emotional cues were found to have a subtle influence on this process.
Automated Drug Identification for Urban Hospitals
NASA Technical Reports Server (NTRS)
Shirley, Donna L.
1971-01-01
Many urban hospitals are becoming overloaded with drug abuse cases requiring chemical analysis for identification of drugs. In this paper, the requirements for chemical analysis of body fluids for drugs are determined and a system model for automated drug analysis is selected. The system, as modeled, would perform chemical preparation of samples, gas-liquid chromatographic separation of drugs in the chemically prepared samples, and infrared spectrophotometric analysis of the drugs, and would utilize automatic data processing and control for drug identification. Requirements of cost, maintainability, reliability, flexibility, and operability are considered.
Chun, Audrey; Reinhardt, Joann P; Ramirez, Mildred; Ellis, Julie M; Silver, Stephanie; Burack, Orah; Eimicke, Joseph P; Cimarolli, Verena; Teresi, Jeanne A
2017-12-01
To examine agreement between Minimum Data Set clinician ratings and researcher assessments of depression among ethnically diverse nursing home residents using the 9-item Patient Health Questionnaire. Although depression is common among nursing homes residents, its recognition remains a challenge. Observational baseline data from a longitudinal intervention study. Sample of 155 residents from 12 long-term care units in one US facility; 50 were interviewed in Spanish. Convergence between clinician and researcher ratings was examined for (i) self-report capacity, (ii) suicidal ideation, (iii) at least moderate depression, (iv) Patient Health Questionnaire severity scores. Experiences by clinical raters using the depression assessment were analysed. The intraclass correlation coefficient was used to examine concordance and Cohen's kappa to examine agreement between clinicians and researchers. Moderate agreement (κ = 0.52) was observed in determination of capacity and poor to fair agreement in reporting suicidal ideation (κ = 0.10-0.37) across time intervals. Poor agreement was observed in classification of at least moderate depression (κ = -0.02 to 0.24), lower than the maximum kappa obtainable (0.58-0.85). Eight assessors indicated problems assessing Spanish-speaking residents. Among Spanish speakers, researchers identified 16% with Patient Health Questionnaire scores of 10 or greater, and 14% with thoughts of self-harm whilst clinicians identified 6% and 0%, respectively. This study advances the field of depression recognition in long-term care by identification of possible challenges in assessing Spanish speakers. Use of the Patient Health Questionnaire requires further investigation, particularly among non-English speakers. Depression screening for ethnically diverse nursing home residents is required, as underreporting of depression and suicidal ideation among Spanish speakers may result in lack of depression recognition and referral for evaluation and treatment. 
Training in depression recognition is imperative to improve the recognition, evaluation and treatment of depression in older people living in nursing homes. © 2017 John Wiley & Sons Ltd.
Gayle, Alberto Alexander; Shimaoka, Motomu
2017-01-01
The predominance of English in scientific research has created hurdles for "non-native speakers" of English. Here we present a novel application of native language identification (NLI) for the assessment of medical-scientific writing. For this purpose, we created a novel classification system whereby scoring would be based solely on text features found to be distinctive among native English speakers (NS) within a given context. We dubbed this the "Genuine Index" (GI). This methodology was validated using a small set of journals in the field of pediatric oncology. Our dataset consisted of 5,907 abstracts, representing work from 77 countries. A support vector machine (SVM) was used to generate our model and for scoring. Accuracy, precision, and recall of the classification model were 93.3%, 93.7%, and 99.4%, respectively. Class-specific F-scores were 96.5% for NS and 39.8% for our benchmark class, Japan. Overall kappa was calculated to be 37.2%. We found significant differences between countries with respect to the GI score. Significant correlation was found between GI scores and two validated objective measures of writing proficiency and readability. Two sets of key terms and phrases differentiating NS and non-native writing were identified. Our GI model was able to detect, with a high degree of reliability, subtle differences between the terms and phrasing used by native and non-native speakers in peer-reviewed journals in the field of pediatric oncology. In addition, L1 language transfer was found to be very likely to survive revision, especially in non-Western countries such as Japan. These findings show that even when the language used is technically correct, there may still be some phrasing or usage that impacts quality.
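The evaluation statistics quoted above (class-specific F-scores and overall kappa) can be reproduced from binary confusion-matrix counts; this is a generic sketch of the metrics, not the authors' code:

```python
def f_score(tp, fp, fn):
    # Harmonic mean of precision and recall for one class,
    # as reported per class (96.5% for NS, 39.8% for Japan).
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return 2 * p * r / (p + r)

def cohens_kappa(tp, fp, fn, tn):
    # Observed agreement corrected for chance agreement,
    # the overall statistic (37.2%) reported for the GI model.
    n = tp + fp + fn + tn
    po = (tp + tn) / n
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (po - pe) / (1 - pe)
```

A high overall accuracy with a modest kappa, as in the abstract, is typical when one class (here NS) dominates the dataset.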
Speaker's voice as a memory cue.
Campeanu, Sandra; Craik, Fergus I M; Alain, Claude
2015-02-01
Speaker's voice occupies a central role as the cornerstone of auditory social interaction. Here, we review the evidence suggesting that speaker's voice constitutes an integral context cue in auditory memory. Investigation into the nature of voice representation as a memory cue is essential to understanding auditory memory and the neural correlates which underlie it. Evidence from behavioral and electrophysiological studies suggests that while specific voice reinstatement (i.e., same speaker) often appears to facilitate word memory even without attention to voice at study, the presence of a partial benefit of similar voices between study and test is less clear. In terms of explicit memory experiments utilizing unfamiliar voices, encoding methods appear to play a pivotal role. Voice congruency effects have been found when voice is specifically attended at study (i.e., when relatively shallow, perceptual encoding takes place). These behavioral findings coincide with neural indices of memory performance such as the parietal old/new recollection effect and the late right frontal effect. The former distinguishes between correctly identified old words and correctly identified new words, and reflects voice congruency only when voice is attended at study. Characterization of the latter likely depends upon voice memory, rather than word memory. There is also evidence to suggest that voice effects can be found in implicit memory paradigms. However, the presence of voice effects appears to depend greatly on the task employed. Using a word identification task, perceptual similarity between study and test conditions is, like for explicit memory tests, crucial. In addition, the type of noise employed appears to have a differential effect. While voice effects have been observed when white noise is used at both study and test, using multi-talker babble does not confer the same results.
In neuroimaging research, characterization of an implicit memory effect reflective of voice congruency is currently lacking. Copyright © 2014 Elsevier B.V. All rights reserved.
Callan, Daniel E.; Jones, Jeffery A.; Callan, Akiko
2014-01-01
Behavioral and neuroimaging studies have demonstrated that brain regions involved with speech production also support speech perception, especially under degraded conditions. The premotor cortex (PMC) has been shown to be active during both observation and execution of action (“Mirror System” properties), and may facilitate speech perception by mapping unimodal and multimodal sensory features onto articulatory speech gestures. For this functional magnetic resonance imaging (fMRI) study, participants identified vowels produced by a speaker in audio-visual (saw the speaker's articulating face and heard her voice), visual only (only saw the speaker's articulating face), and audio only (only heard the speaker's voice) conditions with varying audio signal-to-noise ratios in order to determine the regions of the PMC involved with multisensory and modality specific processing of visual speech gestures. The task was designed so that identification could be made with a high level of accuracy from visual only stimuli to control for task difficulty and differences in intelligibility. The results of the functional magnetic resonance imaging (fMRI) analysis for visual only and audio-visual conditions showed overlapping activity in inferior frontal gyrus and PMC. The left ventral inferior premotor cortex (PMvi) showed properties of multimodal (audio-visual) enhancement with a degraded auditory signal. The left inferior parietal lobule and right cerebellum also showed these properties. The left ventral superior and dorsal premotor cortex (PMvs/PMd) did not show this multisensory enhancement effect, but there was greater activity for the visual only over audio-visual conditions in these areas. 
The results suggest that the inferior regions of the ventral premotor cortex are involved with integrating multisensory information, whereas, more superior and dorsal regions of the PMC are involved with mapping unimodal (in this case visual) sensory features of the speech signal with articulatory speech gestures. PMID:24860526
Improvements of ModalMax High-Fidelity Piezoelectric Audio Device
NASA Technical Reports Server (NTRS)
Woodard, Stanley E.
2005-01-01
ModalMax audio speakers have been enhanced by innovative means of tailoring the vibration response of thin piezoelectric plates to produce a high-fidelity audio response. The ModalMax audio speakers are 1 mm in thickness. The device completely supplants the need for a separate driver and speaker cone. ModalMax speakers can serve the same applications as cone speakers, but unlike cone speakers, ModalMax speakers can function in harsh environments such as high humidity or extreme wetness. New design features allow the speakers to be completely submersed in salt water, making them well suited for maritime applications. The sound produced by the ModalMax audio speakers has spatial resolution that is readily discernible for headset users.
NASA Astrophysics Data System (ADS)
Chen, Jianbo; Guo, Baolin; Yan, Rui; Sun, Suqin; Zhou, Qun
2017-07-01
With the utilization of hand-held equipment, Fourier transform infrared (FT-IR) spectroscopy is a promising analytical technique for minimizing the time cost of the chemical identification of herbal materials. This research examines the feasibility of the hand-held FT-IR spectrometer for the on-site testing of herbal materials, using Lonicerae Japonicae Flos (LJF) and Lonicerae Flos (LF) as examples. Correlation-based linear discriminant models for LJF and LF are established based on the benchtop and hand-held FT-IR instruments. The benchtop FT-IR models can exactly recognize all articles of LJF and LF. Although a few LF articles are misjudged at the sub-class level, the hand-held FT-IR models are able to exactly discriminate LJF and LF. As a direct and label-free analytical technique, FT-IR spectroscopy has great potential for the rapid and automatic chemical identification of herbal materials, either in the laboratory or in the field. This can help prevent the spread and use of adulterated herbal materials in a timely manner.
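A correlation-based discriminant of the kind described can be sketched as follows; matching a measured spectrum against per-class mean spectra by Pearson correlation is an assumption about the model's form, not the paper's exact method:

```python
import numpy as np

def correlation_discriminant(spectrum, class_means):
    # Assign a spectrum to the reference class whose mean spectrum
    # it correlates with most strongly; return (label, correlation).
    best_label, best_r = None, -2.0
    for label, mean in class_means.items():
        r = float(np.corrcoef(spectrum, mean)[0, 1])
        if r > best_r:
            best_label, best_r = label, r
    return best_label, best_r
```

A threshold on the winning correlation would additionally allow the model to reject spectra that match neither LJF nor LF.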
Chang, Son-A; Won, Jong Ho; Kim, HyangHee; Oh, Seung-Ha; Tyler, Richard S.; Cho, Chang Hyun
2018-01-01
Background and Objectives It is important to understand the frequency region of cues used, and not used, by cochlear implant (CI) recipients. Speech and environmental sound recognition by individuals with CI and normal hearing (NH) was measured. Gradients were also computed to evaluate the pattern of change in identification performance with respect to the low-pass filtering or high-pass filtering cutoff frequencies. Subjects and Methods Frequency-limiting effects were implemented in the acoustic waveforms by passing the signals through low-pass filters (LPFs) or high-pass filters (HPFs) with seven different cutoff frequencies. Identification of Korean vowels and consonants produced by a male and female speaker and environmental sounds was measured. Crossover frequencies were determined for each identification test, where the LPF and HPF conditions show identical identification scores. Results CI and NH subjects showed changes in identification performance in a similar manner as a function of cutoff frequency for the LPF and HPF conditions, suggesting that the degraded spectral information in the acoustic signals may similarly constrain the identification performance for both subject groups. However, CI subjects were generally less efficient than NH subjects in using the limited spectral information for speech and environmental sound identification due to the inefficient coding of acoustic cues through the CI sound processors. Conclusions This finding will provide vital information in Korean for understanding how the frequency information received through a CI processor for speech and environmental sounds differs from normal hearing. PMID:29325391
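The crossover-frequency computation described (the cutoff at which LPF and HPF identification scores are equal) can be sketched by linear interpolation of the score difference across cutoffs; this is an illustrative reconstruction, not the study's procedure:

```python
import numpy as np

def crossover_frequency(cutoffs, lpf_scores, hpf_scores):
    # Find the cutoff where the low-pass and high-pass identification
    # curves cross, interpolating linearly between measured cutoffs.
    diff = np.asarray(lpf_scores, float) - np.asarray(hpf_scores, float)
    for i in range(len(diff) - 1):
        if diff[i] == 0.0:
            return float(cutoffs[i])
        if diff[i] * diff[i + 1] < 0.0:  # sign change: curves cross here
            t = diff[i] / (diff[i] - diff[i + 1])
            return float(cutoffs[i] + t * (cutoffs[i + 1] - cutoffs[i]))
    return None  # curves never cross in the measured range
```

Comparing crossover frequencies between CI and NH groups then indicates which frequency regions each group relies on.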
Spunt, Robert P; Lieberman, Matthew D
2013-01-01
Much social-cognitive processing is believed to occur automatically; however, the relative automaticity of the brain systems underlying social cognition remains largely undetermined. We used functional MRI to test for automaticity in the functioning of two brain systems that research has indicated are important for understanding other people's behavior: the mirror neuron system and the mentalizing system. Participants remembered either easy phone numbers (low cognitive load) or difficult phone numbers (high cognitive load) while observing actions after adopting one of four comprehension goals. For all four goals, mirror neuron system activation showed relatively little evidence of modulation by load; in contrast, the association of mentalizing system activation with the goal of inferring the actor's mental state was extinguished by increased cognitive load. These results support a dual-process model of the brain systems underlying action understanding and social cognition; the mirror neuron system supports automatic behavior identification, and the mentalizing system supports controlled social causal attribution.
Partially supervised speaker clustering.
Tang, Hao; Chu, Stephen Mingyu; Hasegawa-Johnson, Mark; Huang, Thomas S
2012-05-01
Content-based multimedia indexing, retrieval, and processing, as well as multimedia databases, demand the structuring of media content (image, audio, video, text, etc.), one significant goal being to associate the identity of the content with the individual segments of the signals. In this paper, we specifically address the problem of speaker clustering, the task of assigning every speech utterance in an audio stream to its speaker. We offer a complete treatment of the idea of partially supervised speaker clustering, which refers to the use of our prior knowledge of speakers in general to assist the unsupervised speaker clustering process. By means of an independent training data set, we encode the prior knowledge at the various stages of the speaker clustering pipeline via 1) learning a speaker-discriminative acoustic feature transformation, 2) learning a universal speaker prior model, and 3) learning a discriminative speaker subspace, or equivalently, a speaker-discriminative distance metric. We study the directional scattering property of the Gaussian mixture model (GMM) mean supervector representation of utterances in the high-dimensional space, and advocate exploiting this property by using the cosine distance metric instead of the Euclidean distance metric for speaker clustering in the GMM mean supervector space. We propose to perform discriminant analysis based on the cosine distance metric, which leads to a novel distance metric learning algorithm: linear spherical discriminant analysis (LSDA). We show that the proposed LSDA formulation can be systematically solved within the elegant graph embedding general dimensionality reduction framework.
Our speaker clustering experiments on the GALE database clearly indicate that 1) our speaker clustering methods based on the GMM mean supervector representation and vector-based distance metrics outperform traditional speaker clustering methods based on the “bag of acoustic features” representation and statistical model-based distance metrics, 2) our advocated use of the cosine distance metric yields consistent increases in speaker clustering performance compared to the commonly used Euclidean distance metric, 3) our partially supervised speaker clustering concept and strategies significantly improve speaker clustering performance over the baselines, and 4) our proposed LSDA algorithm further leads to state-of-the-art speaker clustering performance.
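The advocated cosine distance over GMM mean supervectors can be illustrated with a toy sketch. The vectors below are invented three-dimensional stand-ins; real supervectors would be the concatenated, adapted mixture means of each utterance.

```python
# Sketch: cosine vs Euclidean distance between toy "supervectors".
import math

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two utterances from the same speaker may differ mainly in magnitude
# (e.g., channel gain), not direction, which cosine distance ignores:
same_speaker_a = [1.0, 2.0, 3.0]
same_speaker_b = [2.0, 4.0, 6.0]     # same direction, scaled by 2
other_speaker  = [3.0, 2.0, 1.0]

print(cosine_distance(same_speaker_a, same_speaker_b))    # ~0: same direction
print(euclidean_distance(same_speaker_a, same_speaker_b)) # large despite same speaker
print(cosine_distance(same_speaker_a, other_speaker))     # clearly nonzero
```
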
A speech-controlled environmental control system for people with severe dysarthria.
Hawley, Mark S; Enderby, Pam; Green, Phil; Cunningham, Stuart; Brownsell, Simon; Carmichael, James; Parker, Mark; Hatzis, Athanassios; O'Neill, Peter; Palmer, Rebecca
2007-06-01
Automatic speech recognition (ASR) can provide a rapid means of controlling electronic assistive technology. Off-the-shelf ASR systems function poorly for users with severe dysarthria because of the increased variability of their articulations. We have developed a limited vocabulary speaker dependent speech recognition application which has greater tolerance to variability of speech, coupled with a computerised training package which assists dysarthric speakers to improve the consistency of their vocalisations and provides more data for recogniser training. These applications, and their implementation as the interface for a speech-controlled environmental control system (ECS), are described. The results of field trials to evaluate the training program and the speech-controlled ECS are presented. The user-training phase increased the recognition rate from 88.5% to 95.4% (p<0.001). Recognition rates were good for people with even the most severe dysarthria in everyday usage in the home (mean word recognition rate 86.9%). Speech-controlled ECS were less accurate (mean task completion accuracy 78.6% versus 94.8%) but were faster to use than switch-scanning systems, even taking into account the need to repeat unsuccessful operations (mean task completion time 7.7s versus 16.9s, p<0.001). It is concluded that a speech-controlled ECS is a viable alternative to switch-scanning systems for some people with severe dysarthria and would lead, in many cases, to more efficient control of the home.
Shin, Young Hoon; Seo, Jiwon
2016-10-29
People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the motions of a speaker's vocal tract and articulators. Because most silent speech recognition systems use contact sensors that are very inconvenient to users or optical systems that are susceptible to environmental interference, a contactless and robust solution is hence required. Toward this objective, this paper presents a series of signal processing algorithms for a contactless silent speech recognition system using an impulse radio ultra-wide band (IR-UWB) radar. The IR-UWB radar is used to remotely and wirelessly detect motions of the lips and jaw. In order to extract the necessary features of lip and jaw motions from the received radar signals, we propose a feature extraction algorithm. The proposed algorithm noticeably improved speech recognition performance compared to the existing algorithm during our word recognition test with five speakers. We also propose a speech activity detection algorithm to automatically select speech segments from continuous input signals. Thus, speech recognition processing is performed only when speech segments are detected. Our testbed consists of commercial off-the-shelf radar products, and the proposed algorithms are readily applicable without designing specialized radar hardware for silent speech processing.
Automatic Adviser on stationary devices status identification and anticipated change
NASA Astrophysics Data System (ADS)
Shabelnikov, A. N.; Liabakh, N. N.; Gibner, Ya M.; Pushkarev, E. A.
2018-05-01
The task of synthesizing an Automatic Adviser for identifying the status of stationary automation-system devices is defined, using an autoregressive model of changes in their key parameters. The choice of model type is justified, and a monitoring-process algorithm for the research objects is developed. A suite for simulating the status of the monitored objects and analyzing prediction results is proposed. The results are illustrated with a specific example of a hump yard compressor station. The work was supported by the Russian Foundation for Basic Research, project No. 17-20-01040.
Suba, Eric J; Pfeifer, John D; Raab, Stephen S
2007-10-01
Patient identification errors in surgical pathology often involve switches of prostate or breast needle core biopsy specimens among patients. We assessed strategies for decreasing the occurrence of these uncommon and yet potentially catastrophic events. Root cause analyses were performed following 3 cases of patient identification error involving prostate needle core biopsy specimens. Patient identification errors in surgical pathology result from slips and lapses of automatic human action that may occur at numerous steps during pre-laboratory, laboratory and post-laboratory work flow processes. Patient identification errors among prostate needle biopsies may be difficult to entirely prevent through the optimization of work flow processes. A DNA time-out, whereby DNA polymorphic microsatellite analysis is used to confirm patient identification before radiation therapy or radical surgery, may eliminate patient identification errors among needle biopsies.
ERIC Educational Resources Information Center
Subtirelu, Nicholas Close; Lindemann, Stephanie
2016-01-01
While most research in applied linguistics has focused on second language (L2) speakers and their language capabilities, the success of interaction between such speakers and first language (L1) speakers also relies on the positive attitudes and communication skills of the L1 speakers. However, some research has suggested that many L1 speakers lack…
Neural Network Design on the SRC-6 Reconfigurable Computer
2006-12-01
fingerprint identification. In this field, automatic identification methods are used to save time, especially for the purpose of fingerprint matching in ... grid widths and lengths and therefore was useful in producing an accurate canvas with which to create sample training images. The added benefit of ... tools available free of charge and readily accessible on the computer, it was simple to design bitmap data files visually on a canvas and then
Temporal and acoustic characteristics of Greek vowels produced by adults with cerebral palsy
NASA Astrophysics Data System (ADS)
Botinis, Antonis; Orfanidou, Ioanna; Fourakis, Marios
2005-09-01
The present investigation examined the temporal and spectral characteristics of Greek vowels as produced by speakers with intact (NO) versus cerebral palsy affected (CP) neuromuscular systems. Six NO and six CP native speakers of Greek produced the Greek vowels [i, e, a, o, u] in the first syllable of CVCV nonsense words in a short carrier phrase. Stress could be on either the first or second syllable. There were three female and three male speakers in each group. In terms of temporal characteristics, the results showed that: vowels produced by CP speakers were longer than vowels produced by NO speakers; stressed vowels were longer than unstressed vowels; vowels produced by female speakers were longer than vowels produced by male speakers. In terms of spectral characteristics the results showed that the vowel space of the CP speakers was smaller than that of the NO speakers. This is similar to the results recently reported by Liu et al. [J. Acoust. Soc. Am. 117, 3879-3889 (2005)] for CP speakers of Mandarin. There was also a reduction of the acoustic vowel space defined by unstressed vowels, but this reduction was much more pronounced in the vowel productions of CP speakers than NO speakers.
Consistency between verbal and non-verbal affective cues: a clue to speaker credibility.
Gillis, Randall L; Nilsen, Elizabeth S
2017-06-01
Listeners are exposed to inconsistencies in communication; for example, when speakers' words (i.e. verbal) are discrepant with their demonstrated emotions (i.e. non-verbal). Such inconsistencies introduce ambiguity, which may render a speaker to be a less credible source of information. Two experiments examined whether children make credibility discriminations based on the consistency of speakers' affect cues. In Experiment 1, school-age children (7- to 8-year-olds) preferred to solicit information from consistent speakers (e.g. those who provided a negative statement with negative affect), over novel speakers, to a greater extent than they preferred to solicit information from inconsistent speakers (e.g. those who provided a negative statement with positive affect) over novel speakers. Preschoolers (4- to 5-year-olds) did not demonstrate this preference. Experiment 2 showed that school-age children's ratings of speakers were influenced by speakers' affect consistency when the attribute being judged was related to information acquisition (speakers' believability, "weird" speech), but not general characteristics (speakers' friendliness, likeability). Together, findings suggest that school-age children are sensitive to, and use, the congruency of affect cues to determine whether individuals are credible sources of information.
Inferring speaker attributes in adductor spasmodic dysphonia: ratings from unfamiliar listeners.
Isetti, Derek; Xuereb, Linnea; Eadie, Tanya L
2014-05-01
To determine whether unfamiliar listeners' perceptions of speakers with adductor spasmodic dysphonia (ADSD) differ from control speakers on the parameters of relative age, confidence, tearfulness, and vocal effort and are related to speaker-rated vocal effort or voice-specific quality of life. Twenty speakers with ADSD (including 6 speakers with ADSD plus tremor) and 20 age- and sex-matched controls provided speech recordings, completed a voice-specific quality-of-life instrument (Voice Handicap Index; Jacobson et al., 1997), and rated their own vocal effort. Twenty listeners evaluated speech samples for relative age, confidence, tearfulness, and vocal effort using rating scales. Listeners judged speakers with ADSD as sounding significantly older, less confident, more tearful, and more effortful than control speakers (p < .01). Increased vocal effort was strongly associated with decreased speaker confidence (rs = .88-.89) and sounding more tearful (rs = .83-.85). Self-rated speaker effort was moderately related (rs = .45-.52) to listener impressions. Listeners' perceptions of confidence and tearfulness were also moderately associated with higher Voice Handicap Index scores (rs = .65-.70). Unfamiliar listeners judge speakers with ADSD more negatively than control speakers, with judgments extending beyond typical clinical measures. The results have implications for counseling and understanding the psychosocial effects of ADSD.
NASA Astrophysics Data System (ADS)
Dimakis, Nikolaos; Soldatos, John; Polymenakos, Lazaros; Sturm, Janienke; Neumann, Joachim; Casas, Josep R.
The CHIL Memory Jog service focuses on facilitating the collaboration of participants in meetings, lectures, presentations, and other human interactive events occurring in indoor CHIL spaces. It exploits the whole set of perceptual components developed by the CHIL Consortium partners (e.g., person tracking, face identification, audio source localization) along with a wide range of actuating devices such as projectors, displays, targeted audio devices, and speakers. The underlying set of perceptual components provides a constant flow of elementary contextual information, such as “person at location x0,y0” or “speech at location x0,y0”, information that alone is not of significant use. The CHIL Memory Jog service is therefore accompanied by powerful situation identification techniques that fuse all the incoming information and create complex states that drive the actuating logic.
Waaramaa, Teija; Leisiö, Timo
2013-01-01
The present study focused on voice quality and the perception of the basic emotions from speech samples under cross-cultural conditions. It was examined whether voice quality, cultural or language background, age, or gender were related to the identification of the emotions. Professional actors (n = 2) and actresses (n = 2) produced nonsense sentences (n = 32) and protracted vowels (n = 8) expressing the six basic emotions, interest, and a neutral emotional state. The impact of musical interests on the ability to distinguish between emotions or valence (on an axis positivity – neutrality – negativity) from voice samples was studied. Listening tests were conducted on location in five countries: Estonia, Finland, Russia, Sweden, and the USA, with 50 randomly chosen participants (25 males and 25 females) in each country. The participants (total N = 250) completed a questionnaire eliciting their background information and musical interests. The responses in the listening test and the questionnaires were statistically analyzed. Voice quality parameters and the share of emotions and valence identified correlated significantly with each other for both genders. The percentage of emotions and valence identified was clearly above the chance level in each of the five countries studied; however, the countries differed significantly from each other in the emotions identified and the gender of the speaker. The samples produced by females were identified significantly better than those produced by males. Listener's age was a significant variable. Only minor gender differences were found for the identification. Perceptual confusion between emotions in the listening test seemed to depend on their similar voice production types. Musical interests tended to have a positive effect on the identification of the emotions. 
The results also suggest that identifying emotions from speech samples may be easier for those listeners who share a similar language or cultural background with the speaker. PMID:23801972
MASGOMAS PROJECT, New automatic-tool for cluster search on IR photometric surveys
NASA Astrophysics Data System (ADS)
Rübke, K.; Herrero, A.; Borissova, J.; Ramirez-Alegria, S.; García, M.; Marin-Franch, A.
2015-05-01
The Milky Way is expected to contain a large number of young massive (a few × 1000 solar masses) stellar clusters, born in dense cores of gas and dust. Yet their known number remains small. We have started a programme to search for such clusters, MASGOMAS (MAssive Stars in Galactic Obscured MAssive clusterS). Initially, we selected promising candidates by means of visual inspection of infrared images. In a second phase of the project we presented a semi-automatic method to search for obscured massive clusters, which resulted in the identification of new massive clusters such as MASGOMAS-1 (with more than 10,000 solar masses) and MASGOMAS-4 (a double-cored association of about 3,000 solar masses). We have now developed a new automatic tool for MASGOMAS that allows the identification of a large number of massive cluster candidates from the 2MASS and VVV catalogues. Cluster candidates fulfilling criteria appropriate for massive OB stars are thus selected in an efficient and objective way. We present the results from this tool and the observations of the first selected cluster, and discuss the implications for Milky Way structure.
49 CFR 599.303 - Agency disposition of dealer application for reimbursement.
Code of Federal Regulations, 2010 CFR
2010-10-01
... correct a non-conforming submission. (d) Electronic rejection. An application is automatically rejected... transaction, or identifies the vehicle identification number of a new or trade-in vehicle that was involved in...
The integrated manual and automatic control of complex flight systems
NASA Technical Reports Server (NTRS)
Schmidt, D. K.
1984-01-01
A unified control synthesis methodology for complex and/or non-conventional flight vehicles is developed. Prediction techniques for the handling characteristics of such vehicles and pilot parameter identification from experimental data are addressed.
Critical Assessment of Small Molecule Identification 2016: automated methods.
Schymanski, Emma L; Ruttkies, Christoph; Krauss, Martin; Brouard, Céline; Kind, Tobias; Dührkop, Kai; Allen, Felicity; Vaniya, Arpana; Verdegem, Dries; Böcker, Sebastian; Rousu, Juho; Shen, Huibin; Tsugawa, Hiroshi; Sajed, Tanvir; Fiehn, Oliver; Ghesquière, Bart; Neumann, Steffen
2017-03-27
The fourth round of the Critical Assessment of Small Molecule Identification (CASMI) Contest ( www.casmi-contest.org ) was held in 2016, with two new categories for automated methods. This article covers the 208 challenges in Categories 2 and 3, without and with metadata, from organization, participation, results and post-contest evaluation of CASMI 2016 through to perspectives for future contests and small molecule annotation/identification. The Input Output Kernel Regression (CSI:IOKR) machine learning approach performed best in "Category 2: Best Automatic Structural Identification-In Silico Fragmentation Only", won by Team Brouard with 41% challenge wins. The winner of "Category 3: Best Automatic Structural Identification-Full Information" was Team Kind (MS-FINDER), with 76% challenge wins. The best methods were able to achieve over 30% Top 1 ranks in Category 2, with all methods ranking the correct candidate in the Top 10 in around 50% of challenges. This success rate rose to 70% Top 1 ranks in Category 3, with candidates in the Top 10 in over 80% of the challenges. The machine learning and chemistry-based approaches are shown to perform in complementary ways. The improvement in (semi-)automated fragmentation methods for small molecule identification has been substantial. The achieved high rates of correct candidates in the Top 1 and Top 10, despite large candidate numbers, open up great possibilities for high-throughput annotation of untargeted analysis for "known unknowns". As more high quality training data becomes available, the improvements in machine learning methods will likely continue, but the alternative approaches still provide valuable complementary information. Improved integration of experimental context will also improve identification success further for "real life" annotations. The true "unknown unknowns" remain to be evaluated in future CASMI contests.
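The Top 1 / Top 10 success rates quoted above can be computed with a one-line metric over per-challenge ranks. The ranks below are hypothetical (rank 1 means the correct candidate was listed first for that challenge), not CASMI results.

```python
# Sketch: Top-k rate over a list of per-challenge ranks of the correct candidate.
ranks = [1, 3, 1, 12, 2, 1, 7, 25, 1, 4]   # hypothetical ranks, one per challenge

def top_k_rate(ranks, k):
    """Fraction of challenges whose correct candidate ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

print(top_k_rate(ranks, 1))    # Top 1 rate
print(top_k_rate(ranks, 10))   # Top 10 rate
```
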
Vignally, P; Fondi, G; Taggi, F; Pitidis, A
2011-03-31
In Italy, the European Union Injury Database reports the involvement of chemical products in 0.9% of home and leisure accidents. The Emergency Department registry on domestic accidents in Italy and the Poison Control Centres record that 90% of cases of exposure to toxic substances occur in the home. It is not rare for the effects of chemical agents to be observed in hospitals, with a high potential risk of damage: the rate of this cause of hospital admission is double the domestic injury average. The aim of this study was to monitor the effects of injuries caused by caustic agents in Italy using automatic free-text recognition in Emergency Department medical databases. We created a Stata software program to automatically identify caustic or corrosive injury cases using an agent-specific list of keywords. We focused attention on the procedure's sensitivity and specificity. Ten hospitals in six regions of Italy participated in the study. The program identified 112 cases of injury by caustic or corrosive agents. Checking the cases by quality controls (based on manual reading of ED reports), we assessed 99 cases as true positive, i.e. 88.4% of the patients were automatically recognized by the software as being affected by caustic substances (99% CI: 80.6%-96.2%), that is to say 0.59% (99% CI: 0.45%-0.76%) of the whole sample of home injuries, a value almost three times as high as that expected (p < 0.0001) from European codified information. False positives were 11.6% of the recognized cases (99% CI: 5.1%-21.5%). Our automatic procedure for caustic agent identification proved to have excellent product recognition capacity with an acceptable level of excess sensitivity. Contrary to our a priori hypothesis, the automatic recognition system provided a level of identification of agents possessing caustic effects that was significantly greater than was predictable on the basis of the values from current codifications reported in the European Database.
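The keyword-based free-text screening described above reduces to matching an agent-specific keyword list against each narrative and then checking the flags against manual review. The keywords and records below are invented for illustration; they are not the study's Stata list.

```python
# Sketch: flag an ED narrative as a caustic-agent case if it contains any
# keyword from an agent-specific list, then compute precision against
# manually assigned labels. Keywords and records are hypothetical.

CAUSTIC_KEYWORDS = ("caustic", "corrosive", "lye", "bleach", "acid")

def is_caustic_case(narrative: str) -> bool:
    text = narrative.lower()
    return any(kw in text for kw in CAUSTIC_KEYWORDS)

records = [  # (free-text narrative, manual-review label)
    ("burn to hand after contact with drain cleaner containing lye", True),
    ("ingestion of bleach, oral burns", True),
    ("fell from ladder, wrist fracture", False),
    ("eye irritation, acid splash while cleaning", True),
]

flagged = [is_caustic_case(text) for text, _ in records]
true_pos = sum(1 for f, (_, label) in zip(flagged, records) if f and label)
precision = true_pos / sum(flagged)
print(flagged, precision)
```
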
NASA Astrophysics Data System (ADS)
White, Jonathan; Panda, Brajendra
A major concern for computer system security is the threat from malicious insiders who target and abuse critical data items in the system. In this paper, we propose a solution to enable automatic identification of critical data items in a database by way of data dependency relationships. This identification of critical data items is necessary because insider threats often target mission critical data in order to accomplish malicious tasks. Unfortunately, currently available systems fail to address this problem in a comprehensive manner. It is more difficult for non-experts to identify these critical data items because of their lack of familiarity and due to the fact that data systems are constantly changing. By identifying the critical data items automatically, security engineers will be better prepared to protect what is critical to the mission of the organization and also have the ability to focus their security efforts on these critical data items. We have developed an algorithm that scans the database logs and forms a directed graph showing which items influence a large number of other items and at what frequency this influence occurs. This graph is traversed to reveal the data items which have a large influence throughout the database system by using a novel metric based formula. These items are critical to the system because if they are maliciously altered or stolen, the malicious alterations will spread throughout the system, delaying recovery and causing a much more malignant effect. As these items have significant influence, they are deemed to be critical and worthy of extra security measures. Our proposal is not intended to replace existing intrusion detection systems, but rather is intended to complement current and future technologies. Our proposal has never been performed before, and our experimental results have shown that it is very effective in revealing critical data items automatically.
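The log-scanning idea above can be sketched as a directed "influences" graph built from dependencies observed in transaction logs, with each item scored by how many other items it transitively influences and how often it is updated. The log entries and the reach-times-frequency metric below are illustrative stand-ins for the paper's metric, not its exact formula.

```python
# Sketch: build a directed influence graph from logged dependencies and
# rank items by a simple criticality score (reach size * update frequency).
from collections import defaultdict

log = [  # (influencing item, influenced item) pairs observed in the log
    ("price_table", "invoice"), ("price_table", "quote"),
    ("invoice", "ledger"), ("quote", "ledger"),
    ("ledger", "report"), ("customer", "invoice"),
]

graph = defaultdict(set)
freq = defaultdict(int)
for src, dst in log:
    graph[src].add(dst)
    freq[src] += 1

def reachable(item):
    """All items transitively influenced by `item` (iterative DFS)."""
    seen, stack = set(), [item]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

scores = {item: len(reachable(item)) * freq[item] for item in graph}
most_critical = max(scores, key=scores.get)
print(most_critical, scores[most_critical])
```
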
Speaker Linking and Applications using Non-Parametric Hashing Methods
2016-09-08
clustering method based on hashing: canopy clustering. We apply this method to a large corpus of speaker recordings, demonstrate performance tradeoffs ... and compare to other hashing methods. Index Terms: speaker recognition, clustering, hashing, locality sensitive hashing. 1. Introduction We assume ... speaker in our corpus. Second, given a QBE method, how can we perform speaker clustering: each cluster should be a single speaker, and a cluster should
On the Time Course of Vocal Emotion Recognition
Pell, Marc D.; Kotz, Sonja A.
2011-01-01
How quickly do listeners recognize emotions from a speaker's voice, and does the time course for recognition vary by emotion type? To address these questions, we adapted the auditory gating paradigm to estimate how much vocal information is needed for listeners to categorize five basic emotions (anger, disgust, fear, sadness, happiness) and neutral utterances produced by male and female speakers of English. Semantically-anomalous pseudo-utterances (e.g., The rivix jolled the silling) conveying each emotion were divided into seven gate intervals according to the number of syllables that listeners heard from sentence onset. Participants (n = 48) judged the emotional meaning of stimuli presented at each gate duration interval, in a successive, blocked presentation format. Analyses looked at how recognition of each emotion evolves as an utterance unfolds and estimated the “identification point” for each emotion. Results showed that anger, sadness, fear, and neutral expressions are recognized more accurately at short gate intervals than happiness, and particularly disgust; however, as speech unfolds, recognition of happiness improves significantly towards the end of the utterance (and fear is recognized more accurately than other emotions). When the gate associated with the emotion identification point of each stimulus was calculated, data indicated that fear (M = 517 ms), sadness (M = 576 ms), and neutral (M = 510 ms) expressions were identified from shorter acoustic events than the other emotions. These data reveal differences in the underlying time course for conscious recognition of basic emotions from vocal expressions, which should be accounted for in studies of emotional speech processing. PMID:22087275
Zare, Marzieh; Rezvani, Zahra; Benasich, April A
2016-07-01
This study assesses the ability of a novel, "automatic classification" approach to facilitate identification of infants at highest familial risk for language-learning disorders (LLD) and to provide converging assessments to enable earlier detection of developmental disorders that disrupt language acquisition. Network connectivity measures derived from 62-channel electroencephalogram (EEG) recording were used to identify selected features within two infant groups who differed on LLD risk: infants with a family history of LLD (FH+) and typically-developing infants without such a history (FH-). A support vector machine was deployed; global efficiency and global and local clustering coefficients were computed. A novel minimum spanning tree (MST) approach was also applied. Cross-validation was employed to assess the resultant classification. Infants were classified with about 80% accuracy into FH+ and FH- groups with 89% specificity and precision of 92%. Clustering patterns differed by risk group and MST network analysis suggests that FH+ infants' EEG complexity patterns were significantly different from FH- infants. The automatic classification techniques used here were shown to be both robust and reliable and should provide valuable information when applied to early identification of risk or clinical groups. The ability to identify infants at highest risk for LLD using "automatic classification" strategies is a novel convergent approach that may facilitate earlier diagnosis and remediation. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
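One of the network features mentioned above, global efficiency, is the average inverse shortest-path length over all node pairs of the connectivity graph. The sketch below computes it for toy unweighted adjacency matrices; real EEG connectivity matrices would be thresholded channel-by-channel networks.

```python
# Sketch: global efficiency of an unweighted network via all-pairs
# shortest paths (Floyd-Warshall). Matrices below are toy examples.
INF = float("inf")

def global_efficiency(adj):
    n = len(adj)
    dist = [[0 if i == j else (1 if adj[i][j] else INF)
             for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    total = sum(1.0 / dist[i][j]
                for i in range(n) for j in range(n)
                if i != j and dist[i][j] < INF)
    return total / (n * (n - 1))

# Fully connected triangle: every pair at distance 1, efficiency 1.0
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(global_efficiency(triangle))

# Three-node chain: pair distances 1, 1, 2, efficiency (1 + 1 + 0.5) / 3
chain = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(global_efficiency(chain))
```
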
Travtek Evaluation Task C3: Camera Car Study
DOT National Transportation Integrated Search
1998-11-01
A "biometric" technology is an automatic method for the identification, or identity verification, of an individual based on physiological or behavioral characteristics. The primary objective of the study summarized in this tech brief was to make reco...
Superville, Pierre-Jean; Pižeta, Ivanka; Omanović, Dario; Billon, Gabriel
2013-08-15
Based on automatic on-line measurements on the Deûle River that showed daily variation of a peak around -0.56 V (vs Ag|AgCl 3 M), identification of Reduced Sulphur Species (RSS) in oxic waters was performed by applying cathodic stripping voltammetry (CSV) with the hanging mercury drop electrode (HMDE). Pseudopolarographic studies, accompanied by increasing concentrations of copper, revealed the presence of elemental sulphur S(0), thioacetamide (TA) and reduced glutathione (GSH) as the main sulphur compounds in the Deûle River. In order to resolve these three species, a simple procedure was developed and integrated into an automatic on-line monitoring system. During one week of monitoring with hourly measurements, GSH and S(0) exhibited daily cycles, whereas no consequential pattern was observed for TA. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Ni, Y. Q.; Fan, K. Q.; Zheng, G.; Chan, T. H. T.; Ko, J. M.
2003-08-01
An automatic modal identification program is developed for continuous extraction of modal parameters of three cable-supported bridges in Hong Kong which are instrumented with a long-term monitoring system. The program employs the Complex Modal Indication Function (CMIF) algorithm to identify modal properties from continuous ambient vibration measurements in an on-line manner. By using the LabVIEW graphical programming language, the software realizes the algorithm in Virtual Instrument (VI) style. The applicability and implementation issues of the developed software are demonstrated by using one-year measurement data acquired from 67 channels of accelerometers deployed on the cable-stayed Ting Kau Bridge. With the continuously identified results, normal variability of modal vectors caused by varying environmental and operational conditions is observed. Such observation is very helpful for selection of appropriate measured modal vectors for structural health monitoring applications.
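The core of the CMIF algorithm mentioned above can be sketched numerically: at each frequency line, the singular values of the cross-spectral matrix of the measured channels are computed, and peaks in the leading singular value indicate modes. The toy two-channel signal, the 2 Hz "mode", and all names below are illustrative assumptions, not the authors' LabVIEW software.

```python
# Sketch of the Complex Mode Indication Function (CMIF) idea: singular
# values of the per-frequency cross-spectral matrix; peaks in the first
# singular value indicate structural modes.
import numpy as np

fs = 100.0
t = np.arange(0, 60, 1 / fs)
rng = np.random.default_rng(1)
mode = np.sin(2 * np.pi * 2.0 * t)                   # simulated 2 Hz mode
x1 = mode + 0.5 * rng.standard_normal(t.size)        # accelerometer 1
x2 = 0.8 * mode + 0.5 * rng.standard_normal(t.size)  # accelerometer 2

X = np.fft.rfft(np.vstack([x1, x2]))          # spectra, shape (2, n_freq)
freqs = np.fft.rfftfreq(t.size, 1 / fs)
# Cross-spectral matrix G[f, i, j] = X[i, f] * conj(X[j, f])
G = np.einsum("if,jf->fij", X, X.conj())
cmif = np.linalg.svd(G, compute_uv=False)     # singular values per frequency

peak_freq = freqs[np.argmax(cmif[:, 0])]
print(f"dominant singular-value peak near {peak_freq:.1f} Hz")
```

In practice the cross-spectral matrix would be averaged over many segments of the continuous ambient-vibration record; a single snapshot is used here only to keep the sketch short.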
Use of AFIS for linking scenes of crime.
Hefetz, Ido; Liptz, Yakir; Vaturi, Shaul; Attias, David
2016-05-01
Forensic intelligence can provide critical information in criminal investigations: the linkage of crime scenes. The Automatic Fingerprint Identification System (AFIS) is an example of a technological improvement that has driven the entire forensic identification field to strive for new goals and achievements. In one example using AFIS, a series of burglaries into private apartments enabled a fingerprint examiner to search latent prints from different burglary scenes against an unsolved latent print database. Latent finger and palm prints coming from the same source were associated with more than 20 cases. Then, through forensic intelligence and profile analysis, the offender's behavior could be anticipated. He was caught, identified, and arrested. It is recommended to perform an AFIS search of LT/UL prints against current crimes automatically as part of laboratory protocol, rather than at an examiner's discretion. This approach may link different crime scenes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
An intelligent identification algorithm for the monoclonal picking instrument
NASA Astrophysics Data System (ADS)
Yan, Hua; Zhang, Rongfu; Yuan, Xujun; Wang, Qun
2017-11-01
Colony selection has traditionally been performed manually, a process that is inefficient and highly subjective. It is therefore important to develop an automatic monoclonal-picking instrument. The critical stage of automatic monoclonal picking and intelligent optimal selection is the identification algorithm. This paper proposes an auto-screening algorithm based on a Support Vector Machine (SVM), a supervised learning method that uses colony morphological characteristics to classify colonies accurately. From the basic morphological features of a colony, the system derives a series of morphological parameters step by step. A maximal-margin classifier, combined with analysis of colony growth trends, was then used to select monoclonal colonies. Experimental results showed that the auto-screening algorithm could separate regular colonies from the rest, meeting the requirements across the various parameters.
Automatic identification and normalization of dosage forms in drug monographs
2012-01-01
Background Each day, millions of health consumers seek drug-related information on the Web. Despite some efforts in linking related resources, drug information is largely scattered across a wide variety of websites of varying quality and credibility. Methods As a step toward providing users with integrated access to multiple trustworthy drug resources, we aim to develop a method capable of identifying a drug's dosage form in addition to recognizing the drug name. We developed rules and patterns for identifying dosage forms from different sections of full-text drug monographs, and subsequently normalized them to standardized RxNorm dosage forms. Results Our method represents a significant improvement over a baseline lookup approach, achieving an overall macro-averaged Precision of 80%, Recall of 98%, and F-Measure of 85%. Conclusions We successfully developed an automatic approach for drug dosage form identification, which is critical for building links between different drug-related resources. PMID:22336431
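The rule-and-pattern approach with normalization to standard names, as described above, can be sketched like this. The pattern list and the RxNorm-style target names are illustrative assumptions, not the authors' actual rule set.

```python
# Sketch of rule-based dosage-form identification in monograph text,
# with normalization of surface variants to a standardized form.
import re

# Map surface variants to a normalized (RxNorm-style) dosage form name.
# These rules are illustrative only.
DOSAGE_FORM_RULES = [
    (re.compile(r"\b(tablets?|tabs?)\b", re.I), "Oral Tablet"),
    (re.compile(r"\bcapsules?\b", re.I), "Oral Capsule"),
    (re.compile(r"\b(solution for injection|injectable)\b", re.I),
     "Injectable Solution"),
    (re.compile(r"\btopical cream\b", re.I), "Topical Cream"),
]

def identify_dosage_forms(text: str) -> set[str]:
    """Return the set of normalized dosage forms mentioned in `text`."""
    return {norm for pat, norm in DOSAGE_FORM_RULES if pat.search(text)}

monograph = "Available as 250 mg tabs and as a solution for injection."
print(identify_dosage_forms(monograph))
# Both the abbreviation "tabs" and the long phrase map to normalized forms.
```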
NASA Astrophysics Data System (ADS)
Facsko, Gabor; Sibeck, David; Balogh, Tamas; Kis, Arpad; Wesztergom, Viktor
2017-04-01
The bow shock and the outer rim of the outer radiation belt are detected automatically by our algorithm, developed as part of the Boundary Layer Identification Code Cluster Active Archive project. The radiation belt positions are determined from energized electron measurements working properly onboard all Cluster spacecraft. For bow shock identification we use magnetometer data and, when available, ion plasma instrument data. In addition, electrostatic wave instrument electron density, spacecraft potential measurements and wake indicator auxiliary data are also used, so the events can be identified by all Cluster probes in a highly redundant way, as the magnetometer and these instruments are still operational on all spacecraft. The capability and performance of the bow shock identification algorithm were tested using known bow shock crossings determined manually from January 29, 2002 to February 3. The verification enabled 70% of the bow shock crossings to be identified automatically. The method shows high flexibility and can be applied to observations from various spacecraft. These tools have now been applied to Time History of Events and Macroscale Interactions during Substorms (THEMIS)/Acceleration, Reconnection, Turbulence, and Electrodynamics of the Moon's Interaction with the Sun (ARTEMIS) magnetic field, plasma and spacecraft potential observations to identify bow shock crossings, and to Van Allen Probes supra-thermal electron observations to identify the edges of the radiation belt. The outcomes of the algorithms are checked manually and the parameters used for bow shock identification are refined.
The effect of tonal changes on voice onset time in Mandarin esophageal speech.
Liu, Hanjun; Ng, Manwa L; Wan, Mingxi; Wang, Supin; Zhang, Yi
2008-03-01
The present study investigated the effect of tonal changes on voice onset time (VOT) between normal laryngeal (NL) and superior esophageal (SE) speakers of Mandarin Chinese. VOT values were measured from the syllables /pha/, /tha/, and /kha/ produced at four tone levels by eight NL and seven SE speakers who were native speakers of Mandarin. Results indicated that Mandarin tones were associated with significantly different VOT values for NL speakers, in which high-falling tone was associated with significantly shorter VOT values than mid-rising tone and falling-rising tone. Regarding speaker group, SE speakers showed significantly shorter VOT values than NL speakers across all tone levels. This may be related to their use of the pharyngoesophageal (PE) segment as an alternative sound source. SE speakers appear to take a shorter time to initiate PE segment vibration compared to NL speakers using the vocal folds for vibration.
Ng, Manwa L; Chen, Yang
2011-12-01
The present study examined English sentence stress produced by native Cantonese speakers who were speaking English as a second language (ESL). Cantonese ESL speakers' proficiency in English stress production, as perceived by English-speaking listeners, was also studied. Acoustical parameters associated with sentence stress, including fundamental frequency (F0), vowel duration, and intensity, were measured from the English sentences produced by 40 Cantonese ESL speakers. Data were compared with those obtained from 40 native speakers of American English. The speech samples were also judged by eight listeners who were native speakers of American English for placement, degree, and naturalness of stress. Results showed that Cantonese ESL speakers were able to use F0, vowel duration, and intensity to differentiate sentence stress patterns. Yet both female and male Cantonese ESL speakers exhibited consistently higher F0 in stressed words than English speakers. Overall, Cantonese ESL speakers were found to be proficient in using duration and intensity to signal sentence stress in a way comparable to English speakers. In addition, F0 and intensity were found to correlate closely with perceptual judgement, and the degree of stress with the naturalness of stress.
Automatic extraction of road features in urban environments using dense ALS data
NASA Astrophysics Data System (ADS)
Soilán, Mario; Truong-Hong, Linh; Riveiro, Belén; Laefer, Debra
2018-02-01
This paper describes a methodology that automatically extracts semantic information from urban ALS data for urban parameterization and road network definition. First, building façades are segmented from the ground surface by combining knowledge-based information with both voxel and raster data. Next, heuristic rules and unsupervised learning are applied to the ground surface data to distinguish sidewalk and pavement points as a means for curb detection. Then, radiometric information is employed for road marking extraction. Using high-density ALS data from Dublin, Ireland, this fully automatic workflow was able to generate an F-score close to 95% for pavement and sidewalk identification at a resolution of 20 cm, and better than 80% for road marking detection.
Multiple layer identification label using stacked identification symbols
NASA Technical Reports Server (NTRS)
Schramm, Harry F. (Inventor)
2005-01-01
An automatic identification system and method are provided which employ a machine readable multiple layer label. The label has a plurality of machine readable marking layers stacked one upon another. Each of the marking layers encodes an identification symbol detectable using one or more sensing technologies. The various marking layers may comprise the same marking material or each marking layer may comprise a different medium having characteristics detectable by a different sensing technology. These sensing technologies include x-ray, radar, capacitance, thermal, magnetic and ultrasonic. A complete symbol may be encoded within each marking layer or a symbol may be segmented into fragments which are then divided within a single marking layer or encoded across multiple marking layers.
The Speaker Gender Gap at Critical Care Conferences.
Mehta, Sangeeta; Rose, Louise; Cook, Deborah; Herridge, Margaret; Owais, Sawayra; Metaxa, Victoria
2018-06-01
To review women's participation as faculty at five critical care conferences over 7 years. Retrospective analysis of five scientific programs to identify the proportion of females and each speaker's profession based on conference conveners, program documents, or internet research. Three international (European Society of Intensive Care Medicine, International Symposium on Intensive Care and Emergency Medicine, Society of Critical Care Medicine) and two national (Critical Care Canada Forum, U.K. Intensive Care Society State of the Art Meeting) annual critical care conferences held between 2010 and 2016. Female faculty speakers. None. Male speakers outnumbered female speakers at all five conferences, in all 7 years. Overall, women represented 5-31% of speakers, and female physicians represented 5-26% of speakers. Nursing and allied health professional faculty represented 0-25% of speakers; in general, more than 50% of allied health professionals were women. Over the 7 years, Society of Critical Care Medicine had the highest representation of female (27% overall) and nursing/allied health professional (16-25%) speakers; notably, male physicians substantially outnumbered female physicians in all years (62-70% vs 10-19%, respectively). Women's representation on conference program committees ranged from 0% to 40%, with Society of Critical Care Medicine having the highest representation of women (26-40%). The female proportions of speakers, physician speakers, and program committee members increased significantly over time at the Society of Critical Care Medicine and U.K. Intensive Care Society State of the Art Meeting conferences (p < 0.05), but there was no temporal change at the other three conferences. There is a speaker gender gap at critical care conferences, with male faculty outnumbering female faculty. This gap is more marked among physician speakers than those speakers representing nursing and allied health professionals. 
Several organizational strategies can address this gender gap.
Automatic identification of individual killer whales.
Brown, Judith C; Smaragdis, Paris; Nousek-McGregor, Anna
2010-09-01
Following the successful use of HMM and GMM models for classification of a set of 75 calls of northern resident killer whales into call types [Brown, J. C., and Smaragdis, P., J. Acoust. Soc. Am. 125, 221-224 (2009)], the use of these same methods has been explored for the identification of vocalizations of the same call type, N2, from four individual killer whales. With an average of 20 vocalizations from each individual, the pairwise comparisons have an extremely high success rate of 80 to 100%, and identification within the entire group yields around 78% accuracy.
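The GMM-based identification step can be sketched as follows: fit one Gaussian mixture per individual on its training feature vectors, then assign a test call to the model giving the highest average log-likelihood. The random features stand in for real acoustic features (e.g. MFCC frames); the whale names, dimensions and component count are assumptions, not the authors' settings.

```python
# Sketch of per-individual GMM classification: one mixture model per
# whale, test calls assigned by maximum average log-likelihood.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
n_dim = 12  # e.g. 12 cepstral coefficients per frame (assumed)

# Training frames for two simulated individuals with different means
train = {
    "whale_A": rng.normal(0.0, 1.0, size=(300, n_dim)),
    "whale_B": rng.normal(1.5, 1.0, size=(300, n_dim)),
}
models = {
    name: GaussianMixture(n_components=4, random_state=0).fit(frames)
    for name, frames in train.items()
}

def identify(call_frames: np.ndarray) -> str:
    """Pick the individual whose GMM best explains the call's frames."""
    return max(models, key=lambda name: models[name].score(call_frames))

test_call = rng.normal(1.5, 1.0, size=(40, n_dim))  # drawn like whale_B
print(identify(test_call))
```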
Reflecting on Native Speaker Privilege
ERIC Educational Resources Information Center
Berger, Kathleen
2014-01-01
The issues surrounding native speakers (NSs) and nonnative speakers (NNSs) as teachers (NESTs and NNESTs, respectively) in the field of teaching English to speakers of other languages (TESOL) are a current topic of interest. In many contexts, the native speaker of English is viewed as the model teacher, thus putting the NEST into a position of…
ERIC Educational Resources Information Center
Kersten, Alan W.; Meissner, Christian A.; Lechuga, Julia; Schwartz, Bennett L.; Albrechtsen, Justin S.; Iglesias, Adam
2010-01-01
Three experiments provide evidence that the conceptualization of moving objects and events is influenced by one's native language, consistent with linguistic relativity theory. Monolingual English speakers and bilingual Spanish/English speakers tested in an English-speaking context performed better than monolingual Spanish speakers and bilingual…
Zou, Yuan; Nathan, Viswam; Jafari, Roozbeh
2016-01-01
Electroencephalography (EEG) is the recording of electrical activity produced by the firing of neurons within the brain. These activities can be decoded by signal processing techniques. However, EEG recordings are always contaminated with artifacts which hinder the decoding process. Therefore, identifying and removing artifacts is an important step. Researchers often clean EEG recordings with assistance from independent component analysis (ICA), since it can decompose EEG recordings into a number of artifact-related and event-related potential (ERP)-related independent components. However, existing ICA-based artifact identification strategies mostly restrict themselves to a subset of artifacts, e.g., identifying eye movement artifacts only, and have not been shown to reliably identify artifacts caused by nonbiological origins like high-impedance electrodes. In this paper, we propose an automatic algorithm for the identification of general artifacts. The proposed algorithm consists of two parts: 1) an event-related feature-based clustering algorithm used to identify artifacts which have physiological origins; and 2) the electrode-scalp impedance information employed for identifying nonbiological artifacts. The results on EEG data collected from ten subjects show that our algorithm can effectively detect, separate, and remove both physiological and nonbiological artifacts. Qualitative evaluation of the reconstructed EEG signals demonstrates that our proposed method can effectively enhance the signal quality, especially the quality of ERPs, even for those that barely display ERPs in the raw EEG. The performance results also show that our proposed method can effectively identify artifacts and subsequently enhance the classification accuracies compared to four commonly used automatic artifact removal methods.
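The ICA decomposition step common to such pipelines can be sketched as follows. This illustrates only the generic ICA-plus-screening idea (here a simple kurtosis threshold, a common heuristic for spiky artifacts such as blinks), not the authors' event-related clustering or their impedance-based detection of nonbiological artifacts; the signal sizes and threshold are assumptions.

```python
# Sketch: decompose multichannel EEG with ICA, flag high-kurtosis
# components as artifacts, zero them out, and reconstruct cleaned EEG.
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import FastICA

rng = np.random.default_rng(3)
n_samples, n_channels = 2000, 8
sources = rng.standard_normal((n_samples, n_channels))
sources[::250, 0] += 40.0           # inject a spiky "blink-like" artifact
mixing = rng.standard_normal((n_channels, n_channels))
eeg = sources @ mixing.T            # simulated mixed channel data

ica = FastICA(n_components=n_channels, random_state=0)
components = ica.fit_transform(eeg)  # (n_samples, n_components)

k = kurtosis(components, axis=0)
artifact_idx = np.where(np.abs(k) > 5.0)[0]  # heuristic threshold
print("flagged components:", artifact_idx)

# Remove flagged components and reconstruct the cleaned recording
components[:, artifact_idx] = 0.0
cleaned = ica.inverse_transform(components)
```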
First tests of a multi-wavelength mini-DIAL system for the automatic detection of greenhouse gases
NASA Astrophysics Data System (ADS)
Parracino, S.; Gelfusa, M.; Lungaroni, M.; Murari, A.; Peluso, E.; Ciparisse, J. F.; Malizia, A.; Rossi, R.; Ventura, P.; Gaudio, P.
2017-10-01
Considering the increase of atmospheric pollution levels in our cities, due to emissions from vehicles and domestic heating, and the growing threat of terrorism, it is necessary to develop instrumentation and gather know-how for the automatic detection and measurement of dangerous substances, as quickly and from as far away as possible. The Multi-Wavelength DIAL, an extension of the conventional DIAL technique, is one of the most powerful remote sensing methods for the identification of multiple substances and seems to be a promising solution compared to existing alternatives. In this paper, first in-field tests of a smart and fully automated Multi-Wavelength mini-DIAL are presented and discussed in detail. The recently developed system, based on a long-wavelength infrared (IR-C) CO2 laser source, has the potential of giving an early warning whenever something anomalous is found in the atmosphere, followed by identification and simultaneous concentration measurements of many chemical species, ranging from the most important Greenhouse Gases (GHG) to other harmful Volatile Organic Compounds (VOCs). Preliminary studies regarding the fingerprints of the investigated substances have been carried out by cross-referencing a database of infrared (IR) spectra, obtained using in-cell measurements, with typical Mixing Ratios in the examined region, extrapolated from the literature. First experiments in the atmosphere have been performed in a suburban and moderately busy area of Rome. Moreover, to optimize the automatic identification of the harmful species to be recognized on the basis of in-cell measurements of the absorption coefficient spectra, an advanced multivariate statistical method for classification has been developed and tested.
Fu, Yili; Gao, Wenpeng; Chen, Xiaoguang; Zhu, Minwei; Shen, Weigao; Wang, Shuguo
2010-01-01
The reference system based on the fourth ventricular landmarks (including the fastigial point and ventricular floor plane) is used in medical image analysis of the brain stem. The objective of this study was to develop a rapid, robust, and accurate method for the automatic identification of this reference system on T1-weighted magnetic resonance images. The fully automated method developed in this study consisted of four stages: preprocessing of the data set, expectation-maximization algorithm-based extraction of the fourth ventricle in the region of interest, a coarse-to-fine strategy for identifying the fastigial point, and localization of the base point. The method was evaluated qualitatively on 27 BrainWeb data sets and quantitatively on 18 Internet Brain Segmentation Repository data sets and 30 clinical scans. The results of qualitative evaluation indicated that the method was robust to rotation, landmark variation, noise, and inhomogeneity. The results of quantitative evaluation indicated that the method was able to identify the reference system with an accuracy of 0.7 +/- 0.2 mm for the fastigial point and 1.1 +/- 0.3 mm for the base point. It took <6 seconds for the method to identify the related landmarks on a personal computer with an Intel Core 2 6300 processor and 2 GB of random-access memory. The proposed method for the automatic identification of the reference system based on the fourth ventricular landmarks was shown to be rapid, robust, and accurate. The method has potential utility in image registration and computer-aided surgery.
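The expectation-maximization extraction stage can be illustrated with a 1-D toy version: EM fitting of a Gaussian mixture to voxel intensities, with the darkest component taken as cerebrospinal fluid (the fourth ventricle is CSF-filled and dark on T1-weighted images). The intensity values, class count and labels below are assumptions for illustration, not the authors' parameters.

```python
# Sketch: EM-based intensity clustering of a region of interest, keeping
# the lowest-mean (CSF-like) component, as a 1-D stand-in for the
# fourth-ventricle extraction stage.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
# Simulated voxel intensities: CSF (dark), grey matter, white matter
csf = rng.normal(30, 5, 500)
grey = rng.normal(90, 8, 1500)
white = rng.normal(130, 8, 1500)
intensities = np.concatenate([csf, grey, white]).reshape(-1, 1)

# GaussianMixture.fit runs the expectation-maximization algorithm
gmm = GaussianMixture(n_components=3, random_state=0).fit(intensities)
labels = gmm.predict(intensities)

# The CSF class is the component with the lowest mean intensity
csf_class = int(np.argmin(gmm.means_.ravel()))
csf_mask = labels == csf_class
print(f"voxels labelled CSF: {csf_mask.sum()}")
```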
Diaz-Varela, R A; Zarco-Tejada, P J; Angileri, V; Loudjani, P
2014-02-15
Agricultural terraces are features that provide a number of ecosystem services. As a result, their maintenance is supported by measures established by the European Common Agricultural Policy (CAP). In the framework of CAP implementation and monitoring, there is a current and future need for the development of robust, repeatable and cost-effective methodologies for the automatic identification and monitoring of these features at farm scale. This is a complex task, particularly when terraces are associated to complex vegetation cover patterns, as happens with permanent crops (e.g. olive trees). In this study we present a novel methodology for automatic and cost-efficient identification of terraces using only imagery from commercial off-the-shelf (COTS) cameras on board unmanned aerial vehicles (UAVs). Using state-of-the-art computer vision techniques, we generated orthoimagery and digital surface models (DSMs) at 11 cm spatial resolution with low user intervention. In a second stage, these data were used to identify terraces using a multi-scale object-oriented classification method. Results show the potential of this method even in highly complex agricultural areas, both regarding DSM reconstruction and image classification. The UAV-derived DSM had a root mean square error (RMSE) lower than 0.5 m when the height of the terraces was assessed against field GPS data. The subsequent automated terrace classification yielded an overall accuracy of 90% based exclusively on spectral and elevation data derived from the UAV imagery. Copyright © 2014 Elsevier Ltd. All rights reserved.
The contribution of dynamic visual cues to audiovisual speech perception.
Jaekl, Philip; Pesquita, Ana; Alsius, Agnes; Munhall, Kevin; Soto-Faraco, Salvador
2015-08-01
Seeing a speaker's facial gestures can significantly improve speech comprehension, especially in noisy environments. However, the nature of the visual information from the speaker's facial movements that is relevant for this enhancement is still unclear. Like auditory speech signals, visual speech signals unfold over time and contain both dynamic configural information and luminance-defined local motion cues; two information sources that are thought to engage anatomically and functionally separate visual systems. Whereas some past studies have highlighted the importance of local, luminance-defined motion cues in audiovisual speech perception, the contribution of dynamic configural information signalling changes in form over time has not yet been assessed. We therefore attempted to single out the contribution of dynamic configural information to audiovisual speech processing. To this end, we measured word identification performance in noise using unimodal auditory stimuli and audiovisual stimuli. In the audiovisual condition, speaking faces were presented as point-light displays achieved via motion capture of the original talker. Point-light displays could be isoluminant, to minimise the contribution of effective luminance-defined local motion information, or have added luminance contrast, allowing the combined effect of dynamic configural cues and local motion cues. Audiovisual enhancement was found in both the isoluminant and contrast-based luminance conditions compared to the auditory-only condition, demonstrating, for the first time, the specific contribution of dynamic configural cues to audiovisual speech improvement. These findings imply that globally processed changes in a speaker's facial shape contribute significantly towards the perception of articulatory gestures and the analysis of audiovisual speech. Copyright © 2015 Elsevier Ltd. All rights reserved.
Hybrid Speaker Recognition Using Universal Acoustic Model
NASA Astrophysics Data System (ADS)
Nishimura, Jun; Kuroda, Tadahiro
We propose a novel speaker recognition approach using a speaker-independent universal acoustic model (UAM) for sensornet applications. In sensornet applications such as “Business Microscope”, interactions among knowledge workers in an organization can be visualized by sensing face-to-face communication using wearable sensor nodes. In conventional studies, speakers are detected by comparing the energy of input speech signals among the nodes. However, there are often synchronization errors among the nodes, which degrade speaker recognition performance. By focusing on properties of the speaker's acoustic channel, UAM can provide robustness against these synchronization errors. The overall speaker recognition accuracy is improved by combining UAM with the energy-based approach. For 0.1 s speech inputs and 4 subjects, a speaker recognition accuracy of 94% is achieved at synchronization errors of less than 100 ms.
Zdravevski, Eftim; Risteska Stojkoska, Biljana; Standl, Marie; Schulz, Holger
2017-01-01
Assessment of the health benefits associated with physical activity depends on the activity's duration, intensity and frequency; therefore their correct identification is very valuable and important in epidemiological and clinical studies. The aims of this study are: to develop an algorithm for automatic identification of intended jogging periods; and to assess whether identification performance is improved when using two accelerometers at the hip and ankle, compared to using only one at either position. The study used diarized jogging periods and the corresponding accelerometer data from thirty-nine 15-year-old adolescents, collected under field conditions as part of the GINIplus study. The data were obtained from two accelerometers placed at the hip and ankle. An automated feature engineering technique was performed to extract features from the raw accelerometer readings and to select a subset of the most significant features. Four machine learning algorithms were used for classification: logistic regression, Support Vector Machines, Random Forest and Extremely Randomized Trees. Classification was performed using only data from the hip accelerometer, using only data from the ankle accelerometer, and using data from both accelerometers. The reported jogging periods were verified by visual inspection and used as the gold standard. After feature selection and tuning of the classification algorithms, all options provided a classification accuracy of at least 0.99, independent of the applied segmentation strategy with sliding windows of either 60 s or 180 s. The best matching ratio, i.e. the length of correctly identified jogging periods relative to the total time including missed ones, was up to 0.875. It could be further improved, up to 0.967, by applying post-classification rules that considered the duration of breaks and jogging periods.
There was no obvious benefit of using two accelerometers; almost the same performance could be achieved from either accelerometer position. Machine learning techniques can be used for automatic activity recognition, as they provide very accurate recognition, significantly more accurate than keeping a diary. Identification of jogging periods in adolescents can be performed using only one accelerometer; performance-wise, there is no significant benefit from using accelerometers at both locations.
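The windowed-feature classification pipeline described above can be sketched as follows. The synthetic acceleration magnitudes, the three features, and the class means are assumptions standing in for the hip/ankle recordings and the automated feature engineering of the study.

```python
# Sketch: segment accelerometer magnitude into windows, compute simple
# per-window features, classify windows as jogging vs other activity
# with a random forest, and evaluate with cross-validation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
fs, win = 10, 60  # 10 Hz sampling, 60 s windows (assumed)

def make_windows(n: int, mean: float, std: float) -> np.ndarray:
    """Simulated acceleration-magnitude windows, shape (n, fs*win)."""
    return rng.normal(mean, std, size=(n, fs * win))

jog = make_windows(100, 2.0, 0.8)    # vigorous movement
walk = make_windows(100, 1.1, 0.3)   # everyday activity
raw = np.vstack([jog, walk])
y = np.array([1] * 100 + [0] * 100)  # 1 = jogging

# Per-window features: mean, std, and 90th-percentile magnitude
X = np.column_stack([raw.mean(1), raw.std(1), np.percentile(raw, 90, 1)])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
acc = cross_val_score(clf, X, y, cv=5).mean()
print(f"window-level CV accuracy: {acc:.2f}")
```

A post-classification pass, as in the study, would then merge adjacent jogging windows and drop implausibly short periods before reporting durations.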
Developing an Active Traffic Management System for I-70 in Colorado
DOT National Transportation Integrated Search
2012-09-01
The Colorado DOT is at the forefront of developing an Active Traffic Management (ATM) system that not only considers operational aspects, but also integrates safety measures. In this research, data collected from Automatic Vehicle Identification (A...
Electrical continuity scanner facilitates identification of wires for soldering to connectors
NASA Technical Reports Server (NTRS)
Boulton, H. C.; Diclemente, R. A.
1966-01-01
Electrical continuity scanner automatically scans 50 wires in 2 seconds to correlate all wires in a circuit with their respective known ends. Modifications made to the basic plan provide circuitry for scanning up to 250 wires.
Analysis of wolves and sheep. Final report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, J.; Papcun, G.; Zlokarnik, I.
1997-08-01
In evaluating speaker verification systems, asymmetries have been observed in the ease with which people are able to break into other people's voice locks. People who are good at breaking into voice locks are called wolves, and people whose locks are easy to break into are called sheep. (Goats are people who have a difficult time opening their own voice locks.) Analyses of speaker verification algorithms could be used to understand wolf/sheep asymmetries. Using the notion of a "speaker space", it is demonstrated that such asymmetries could arise even though the similarity of voice 1 to voice 2 is the same as the inverse similarity. This partially explains the wolf/sheep asymmetries, although there may be other factors. The speaker space can be computed from interspeaker similarity data using multidimensional scaling, and such a speaker space can be used to give a good approximation of the interspeaker similarities. The derived speaker space can be used to predict which of the enrolled speakers are likely to be wolves and which are likely to be sheep. However, a speaker must first enroll in the speaker key system and then be compared to each of the other speakers; a good estimate of a person's speaker space position could be obtained using only a speech sample.
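The speaker-space construction described above can be sketched with multidimensional scaling: given a matrix of pairwise inter-speaker dissimilarities, MDS embeds each speaker as a point so that inter-point distances approximate the dissimilarities. The synthetic dissimilarity matrix and the centrality heuristic below are illustrative assumptions; in the report the data would come from pairwise speaker-verification scores.

```python
# Sketch: derive a "speaker space" from inter-speaker dissimilarity data
# using multidimensional scaling (MDS).
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(6)
n_speakers = 10
# Hidden 2-D "true" positions generate the pairwise dissimilarities
true_pos = rng.normal(size=(n_speakers, 2))
diff = true_pos[:, None, :] - true_pos[None, :, :]
dissim = np.linalg.norm(diff, axis=-1)  # symmetric, zero diagonal

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
space = mds.fit_transform(dissim)       # recovered speaker space

# Speakers near the centre of the space are similar to many others and
# so, under the report's reasoning, are candidate wolves/sheep.
dist_to_centre = np.linalg.norm(space - space.mean(0), axis=1)
print("most central speaker:", int(np.argmin(dist_to_centre)))
```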
Investigating Auditory Processing of Syntactic Gaps with L2 Speakers Using Pupillometry
ERIC Educational Resources Information Center
Fernandez, Leigh; Höhle, Barbara; Brock, Jon; Nickels, Lyndsey
2018-01-01
According to the Shallow Structure Hypothesis (SSH), second language (L2) speakers, unlike native speakers, build shallow syntactic representations during sentence processing. In order to test the SSH, this study investigated the processing of a syntactic movement in both native speakers of English and proficient late L2 speakers of English using…
A Model of Mandarin Tone Categories--A Study of Perception and Production
ERIC Educational Resources Information Center
Yang, Bei
2010-01-01
The current study lays the groundwork for a model of Mandarin tones based on both native speakers' and non-native speakers' perception and production. It demonstrates that there is variability in non-native speakers' tone productions and that there are differences in the perceptual boundaries in native speakers and non-native speakers. There…
Literacy Skill Differences between Adult Native English and Native Spanish Speakers
ERIC Educational Resources Information Center
Herman, Julia; Cote, Nicole Gilbert; Reilly, Lenore; Binder, Katherine S.
2013-01-01
The goal of this study was to compare the literacy skills of adult native English and native Spanish ABE speakers. Participants were 169 native English speakers and 124 native Spanish speakers recruited from five prior research projects. The results showed that the native Spanish speakers were less skilled on morphology and passage comprehension…
ERIC Educational Resources Information Center
Lee, Jiyeon; Yoshida, Masaya; Thompson, Cynthia K.
2015-01-01
Purpose: Grammatical encoding (GE) is impaired in agrammatic aphasia; however, the nature of such deficits remains unclear. We examined grammatical planning units during real-time sentence production in speakers with agrammatic aphasia and control speakers, testing two competing models of GE. We queried whether speakers with agrammatic aphasia…
Plat, Rika; Lowie, Wander; de Bot, Kees
2018-01-01
Reaction time data have long been collected in order to gain insight into the underlying mechanisms involved in language processing. Means analyses often attempt to break down what factors relate to what portion of the total reaction time. From a dynamic systems theory perspective or an interaction dominant view of language processing, it is impossible to isolate discrete factors contributing to language processing, since these continually and interactively play a role. Non-linear analyses offer the tools to investigate the underlying process of language use in time, without having to isolate discrete factors. Patterns of variability in reaction time data may disclose the relative contribution of automatic (grapheme-to-phoneme conversion) processing and attention-demanding (semantic) processing. The presence of a fractal structure in the variability of a reaction time series indicates automaticity in the mental structures contributing to a task. A decorrelated pattern of variability will indicate a higher degree of attention-demanding processing. A focus on variability patterns allows us to examine the relative contribution of automatic and attention-demanding processing when a speaker is using the mother tongue (L1) or a second language (L2). A word naming task conducted in the L1 (Dutch) and L2 (English) shows L1 word processing to rely more on automatic spelling-to-sound conversion than L2 word processing. A word naming task with a semantic categorization subtask showed more reliance on attention-demanding semantic processing when using the L2. A comparison to L1 English data shows this was not only due to the amount of language use or language dominance, but also to the difference in orthographic depth between Dutch and English. 
An important implication of this finding is that when the same task is used to test and compare different languages, one cannot straightforwardly assume that the same cognitive subprocesses are involved to an equal degree. PMID:29403404
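The contrast the authors draw between fractal (1/f-like) and decorrelated variability can be illustrated by estimating the log-log slope of a series' power spectrum, a simple stand-in for the non-linear analyses used in the study, with synthetic series in place of real reaction times.

```python
import numpy as np

def spectral_slope(series):
    """Log-log slope of a series' power spectrum. A slope near -1
    (1/f 'pink' noise) suggests fractal structure and automatic
    processing; a slope near 0 (white noise) suggests decorrelated,
    attention-demanding processing."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    psd = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x))
    mask = freqs > 0
    slope, _ = np.polyfit(np.log(freqs[mask]), np.log(psd[mask]), 1)
    return slope

rng = np.random.default_rng(0)
white = rng.normal(size=4096)            # decorrelated variability
brown = np.cumsum(rng.normal(size=4096)) # strongly correlated variability
sw = spectral_slope(white)
sb = spectral_slope(brown)
print(round(sw, 2), round(sb, 2))
```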
Automated determination of dust particles trajectories in the coma of comet 67P
NASA Astrophysics Data System (ADS)
Marín-Yaseli de la Parra, J.; Küppers, M.; Perez Lopez, F.; Besse, S.; Moissl, R.
2017-09-01
During the more than two years Rosetta spent at comet 67P, it took thousands of images that contain individual dust particles. To arrive at statistics of the dust properties, automatic image analysis is required. We present a new methodology for fast dust identification using a star mask reference system for matching a set of images automatically. The main goal is to derive particle size distributions and to determine if traces of the size distribution of primordial pebbles are still present in today's cometary dust [1].
Automatic cytometric device using multiple wavelength excitations
NASA Astrophysics Data System (ADS)
Rongeat, Nelly; Ledroit, Sylvain; Chauvet, Laurence; Cremien, Didier; Urankar, Alexandra; Couderc, Vincent; Nérin, Philippe
2011-05-01
Precise identification of eosinophils, basophils, and specific subpopulations of blood cells (B lymphocytes) in an unconventional automatic hematology analyzer is demonstrated. Our apparatus mixes two excitation radiations by means of an acousto-optic tunable filter to properly control the fluorescence emission of phycoerythrin cyanin 5 (PC5) conjugated to antibodies (anti-CD20 or anti-CRTH2) and Thiazole Orange. In this way our analyzer, combining techniques of hematology analysis and flow cytometry based on multiple fluorescence detection, drastically improves the signal-to-noise ratio and decreases the impact of spectral overlap from multiple fluorescence emissions.
Development of panel loudspeaker system: design, evaluation and enhancement.
Bai, M R; Huang, T
2001-06-01
Panel speakers are investigated in terms of structural vibration and acoustic radiation. A panel speaker primarily consists of a panel and an inertia exciter. Unlike in conventional speakers, flexural resonance is encouraged so that the panel vibrates as randomly as possible. Simulation tools are developed to facilitate system integration of panel speakers. In particular, electro-mechanical analogy, finite element analysis, and the fast Fourier transform are employed to predict panel vibration and acoustic radiation. Design procedures are also summarized. In order to compare panel speakers with conventional speakers, experimental investigations were undertaken to evaluate the frequency response, directional response, sensitivity, efficiency, and harmonic distortion of both speakers. The results revealed that the panel speakers suffered from a problem of sensitivity and efficiency. To alleviate the problem, a woofer using electronic compensation based on the H2 model-matching principle is utilized to supplement the bass response. As indicated in the results, significant improvement over the panel speaker alone was achieved by using the combined panel-woofer system.
And then I saw her race: Race-based expectations affect infants' word processing.
Weatherhead, Drew; White, Katherine S
2018-08-01
How do our expectations about speakers shape speech perception? Adults' speech perception is influenced by social properties of the speaker (e.g., race). When in development do these influences begin? In the current study, 16-month-olds heard familiar words produced in their native accent (e.g., "dog") and in an unfamiliar accent involving a vowel shift (e.g., "dag"), in the context of an image of either a same-race speaker or an other-race speaker. Infants' interpretation of the words depended on the speaker's race. For the same-race speaker, infants only recognized words produced in the familiar accent; for the other-race speaker, infants recognized both versions of the words. Two additional experiments showed that infants only recognized an other-race speaker's atypical pronunciations when they differed systematically from the native accent. These results provide the first evidence that expectations driven by unspoken properties of speakers, such as race, influence infants' speech processing. Copyright © 2018 Elsevier B.V. All rights reserved.
Word Durations in Non-Native English
Baker, Rachel E.; Baese-Berk, Melissa; Bonnasse-Gahot, Laurent; Kim, Midam; Van Engen, Kristin J.; Bradlow, Ann R.
2010-01-01
In this study, we compare the effects of English lexical features on word duration for native and non-native English speakers and for non-native speakers with different L1s and a range of L2 experience. We also examine whether non-native word durations lead to judgments of a stronger foreign accent. We measured word durations in English paragraphs read by 12 American English (AE), 20 Korean, and 20 Chinese speakers. We also had AE listeners rate the 'accentedness' of these non-native speakers. AE speech had shorter durations, greater within-speaker word duration variance, greater reduction of function words, and less between-speaker variance than non-native speech. However, both AE and non-native speakers showed sensitivity to lexical predictability by reducing second mentions and high frequency words. Non-native speakers with more native-like word durations, greater within-speaker word duration variance, and greater function word reduction were perceived as less accented. Overall, these findings identify word duration as an important and complex feature of foreign-accented English. PMID:21516172
Experimental study on GMM-based speaker recognition
NASA Astrophysics Data System (ADS)
Ye, Wenxing; Wu, Dapeng; Nucci, Antonio
2010-04-01
Speaker recognition plays a very important role in the field of biometric security. In order to improve recognition performance, many pattern recognition techniques have been explored in the literature. Among these techniques, the Gaussian mixture model (GMM) has proved to be an effective statistical model for speaker recognition and is used in most state-of-the-art speaker recognition systems. The GMM is used to represent the 'voice print' of a speaker by modeling the spectral characteristics of the speaker's speech signals. In this paper, we implement a speaker recognition system, which consists of preprocessing, Mel-frequency cepstrum coefficient (MFCC) based feature extraction, and GMM-based classification. We test our system with the TIDIGITS data set (325 speakers) and our own recordings of more than 200 speakers; our system achieves a 100% correct recognition rate. Moreover, we also test our system under the scenario that training samples are from one language but test samples are from a different language; our system again achieves a 100% correct recognition rate, which indicates that our system is language independent.
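The pipeline described (per-speaker features, per-speaker model, classification by likelihood) can be sketched in miniature. Here a single diagonal-covariance Gaussian per speaker stands in for a full GMM, and the "MFCC" frames are random toy data rather than real speech features; speaker names are invented.

```python
import numpy as np

def fit_speaker(features):
    """Fit a diagonal-covariance Gaussian to a speaker's feature frames
    (a single-component stand-in for a full Gaussian mixture model)."""
    return features.mean(axis=0), features.var(axis=0) + 1e-6

def log_likelihood(features, model):
    """Total log-likelihood of the frames under one speaker's model."""
    mu, var = model
    ll = -0.5 * (np.log(2 * np.pi * var) + (features - mu) ** 2 / var)
    return ll.sum()

def identify(features, models):
    """Return the enrolled speaker whose model best explains the frames."""
    return max(models, key=lambda spk: log_likelihood(features, models[spk]))

rng = np.random.default_rng(1)
# Toy 'MFCC' frames for two enrolled speakers with different feature means.
enroll = {"alice": rng.normal(0.0, 1.0, (200, 13)),
          "bob":   rng.normal(2.0, 1.0, (200, 13))}
models = {spk: fit_speaker(f) for spk, f in enroll.items()}
test_frames = rng.normal(2.0, 1.0, (50, 13))  # an unlabeled utterance
print(identify(test_frames, models))
```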
Steensberg, Alvilda T; Eriksen, Mette M; Andersen, Lars B; Hendriksen, Ole M; Larsen, Heinrich D; Laier, Gunnar H; Thougaard, Thomas
2017-06-01
The European Resuscitation Council Guidelines 2015 recommend that bystanders activate their mobile phone's speaker function, if possible, in cases of suspected cardiac arrest. This is to facilitate continuous dialogue with the dispatcher, including (if required) cardiopulmonary resuscitation instructions. The aim of this study was to measure bystanders' capability to activate the speaker function in cases of suspected cardiac arrest. Over 87 days, a systematic prospective registration of bystander capability to activate the speaker function, when cardiac arrest was suspected, was performed. For those asked "can you activate your mobile phone's speaker function?", audio recordings were examined and categorized into groups according to the bystander's capability to activate the speaker function on their own initiative, without instructions, or with instructions from the emergency medical dispatcher. Time delay was measured, in seconds, for the bystanders without pre-activated speaker function. 42.0% (58) were able to activate the speaker function without instructions, 2.9% (4) with instructions, 18.1% (25) on their own initiative, and 37.0% (51) were unable to activate the speaker function. The median time to activate the speaker function was 19 s and 8 s with and without instructions, respectively. Dispatcher-assisted cardiopulmonary resuscitation with activated speaker function, in cases of suspected cardiac arrest, allows for continuous dialogue between the emergency medical dispatcher and the bystander. In this study, we found a 63.0% success rate of activating the speaker function in such situations. Copyright © 2017 Elsevier B.V. All rights reserved.
Automatic retinal interest evaluation system (ARIES).
Yin, Fengshou; Wong, Damon Wing Kee; Yow, Ai Ping; Lee, Beng Hai; Quan, Ying; Zhang, Zhuo; Gopalakrishnan, Kavitha; Li, Ruoying; Liu, Jiang
2014-01-01
In recent years, there has been increasing interest in the use of automatic computer-based systems for the detection of eye diseases such as glaucoma, age-related macular degeneration and diabetic retinopathy. However, in practice, retinal image quality is a big concern as automatic systems without consideration of degraded image quality will likely generate unreliable results. In this paper, an automatic retinal image quality assessment system (ARIES) is introduced to assess both image quality of the whole image and focal regions of interest. ARIES achieves 99.54% accuracy in distinguishing fundus images from other types of images through a retinal image identification step in a dataset of 35342 images. The system employs high level image quality measures (HIQM) to perform image quality assessment, and achieves areas under curve (AUCs) of 0.958 and 0.987 for whole image and optic disk region respectively in a testing dataset of 370 images. ARIES acts as a form of automatic quality control which ensures good quality images are used for processing, and can also be used to alert operators of poor quality images at the time of acquisition.
NASA Astrophysics Data System (ADS)
Fink, Wolfgang; Brooks, Alexander J.-W.; Tarbell, Mark A.; Dohm, James M.
2017-05-01
Autonomous reconnaissance missions are called for in extreme environments, as well as in potentially hazardous (e.g., the theatre, disaster-stricken areas, etc.) or inaccessible operational areas (e.g., planetary surfaces, space). Such future missions will require increasing degrees of operational autonomy, especially when following up on transient events. Operational autonomy encompasses: (1) Automatic characterization of operational areas from different vantages (i.e., spaceborne, airborne, surface, subsurface); (2) automatic sensor deployment and data gathering; (3) automatic feature extraction including anomaly detection and region-of-interest identification; (4) automatic target prediction and prioritization; (5) and subsequent automatic (re-)deployment and navigation of robotic agents. This paper reports on progress towards several aspects of autonomous C4ISR systems, including: Caltech-patented and NASA award-winning multi-tiered mission paradigm, robotic platform development (air, ground, water-based), robotic behavior motifs as the building blocks for autonomous tele-commanding, and autonomous decision making based on a Caltech-patented framework comprising sensor-data-fusion (feature-vectors), anomaly detection (clustering and principal component analysis), and target prioritization (hypothetical probing).
Drechsler, Axel; Helling, Tobias; Steinfartz, Sebastian
2015-01-01
Capture–mark–recapture (CMR) approaches are the backbone of many studies in population ecology to gain insight on the life cycle, migration, habitat use, and demography of target species. The reliable and repeatable recognition of an individual throughout its lifetime is the basic requirement of a CMR study. Although invasive techniques are available to mark individuals permanently, noninvasive methods for individual recognition mainly rest on photographic identification of external body markings, which are unique at the individual level. The re-identification of an individual based on comparing shape patterns of photographs by eye is commonly used. Automated processes for photographic re-identification have been recently established, but their performance in large datasets (i.e., > 1000 individuals) has rarely been tested thoroughly. Here, we evaluated the performance of the program AMPHIDENT, an automatic algorithm to identify individuals on the basis of ventral spot patterns in the great crested newt (Triturus cristatus) versus the genotypic fingerprint of individuals based on highly polymorphic microsatellite loci using GENECAP. Between 2008 and 2010, we captured, sampled and photographed adult newts and calculated for 1648 samples/photographs recapture rates for both approaches. Recapture rates differed slightly with 8.34% for GENECAP and 9.83% for AMPHIDENT. With an estimated rate of 2% false rejections (FRR) and 0.00% false acceptances (FAR), AMPHIDENT proved to be a highly reliable algorithm for CMR studies of large datasets. We conclude that the application of automatic recognition software of individual photographs can be a rather powerful and reliable tool in noninvasive CMR studies for a large number of individuals. Because the cross-correlation of standardized shape patterns is generally applicable to any pattern that provides enough information, this algorithm is capable of becoming a single application with broad use in CMR studies for many species. 
PMID:25628871
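The FRR/FAR figures reported for AMPHIDENT can be made concrete: a minimal sketch of how the two error rates fall out of match scores and a decision threshold, using invented scores rather than the study's data.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """False rejection rate (genuine pairs scored below the threshold)
    and false acceptance rate (impostor pairs scored at or above it)."""
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return frr, far

# Toy cross-correlation scores for matched and non-matched photographs.
genuine = [0.91, 0.88, 0.35, 0.95, 0.90]   # one genuine match scores low
impostor = [0.10, 0.22, 0.05, 0.18]
frr, far = error_rates(genuine, impostor, threshold=0.5)
print(frr, far)
```

Raising the threshold trades false acceptances for false rejections, which is why both rates are reported together.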
Rusz, J; Cmejla, R; Ruzickova, H; Ruzicka, E
2011-01-01
An assessment of vocal impairment is presented for separating healthy people from persons with early untreated Parkinson's disease (PD). This study's main purpose was to (a) determine whether voice and speech disorders are present from the early stages of PD before starting dopaminergic pharmacotherapy, (b) ascertain the specific characteristics of the PD-related vocal impairment, (c) identify PD-related acoustic signatures for the major part of traditional clinically used measurement methods with respect to their automatic assessment, and (d) design new automatic measurement methods of articulation. The varied speech data were collected from 46 Czech native speakers, 23 with PD. Subsequently, 19 representative measurements were pre-selected, and Wald sequential analysis was then applied to assess the efficiency of each measure and the extent of vocal impairment of each subject. It was found that measurement of fundamental frequency variations applied to two selected tasks was the best method for separating healthy from PD subjects. On the basis of objective acoustic measures, statistical decision-making theory, and validation from practicing speech therapists, it has been demonstrated that 78% of early untreated PD subjects exhibit some form of vocal impairment. The speech defects thus uncovered differ individually in various characteristics including phonation, articulation, and prosody.
Analysis and synthesis of intonation using the Tilt model.
Taylor, P
2000-03-01
This paper introduces the Tilt intonational model and describes how this model can be used to automatically analyze and synthesize intonation. In the model, intonation is represented as a linear sequence of events, which can be pitch accents or boundary tones. Each event is characterized by continuous parameters representing amplitude, duration, and tilt (a measure of the shape of the event). The paper describes an event detector, in effect an intonational recognition system, which produces a transcription of an utterance's intonation. The features and parameters of the event detector are discussed and performance figures are shown on a variety of read and spontaneous speaker independent conversational speech databases. Given the event locations, algorithms are described which produce an automatic analysis of each event in terms of the Tilt parameters. Synthesis algorithms are also presented which generate F0 contours from Tilt representations. The accuracy of these is shown by comparing synthetic F0 contours to real F0 contours. The paper concludes with an extensive discussion on linguistic representations of intonation and gives evidence that the Tilt model goes a long way to satisfying the desired goals of such a representation in that it has the right number of degrees of freedom to be able to describe and synthesize intonation accurately.
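A rough sketch of the rise-fall idea behind Tilt synthesis: an event's amplitude and duration are split between a rise and a fall according to the tilt parameter, with smooth interpolation in between. The exact split and the cosine interpolation used here are simplifications and assumed conventions, not Taylor's published synthesis equations.

```python
import numpy as np

def tilt_event(amp, dur, tilt, fs=100, f0_base=120.0):
    """Sketch of a Tilt-style rise-fall accent: tilt = +1 gives a pure
    rise, 0 a symmetric rise-fall, -1 a pure fall. Returns an F0 contour
    sampled at fs frames per second. Simplified, not Taylor's exact model."""
    a_rise, a_fall = amp * (1 + tilt) / 2, amp * (1 - tilt) / 2
    d_rise, d_fall = dur * (1 + tilt) / 2, dur * (1 - tilt) / 2
    n_rise, n_fall = int(round(d_rise * fs)), int(round(d_fall * fs))
    rise = f0_base + a_rise * (1 - np.cos(np.linspace(0, np.pi, max(n_rise, 0)))) / 2
    peak = f0_base + a_rise
    fall = peak - a_fall * (1 - np.cos(np.linspace(0, np.pi, max(n_fall, 0)))) / 2
    return np.concatenate([rise, fall])

# A symmetric accent: 40 Hz excursion over 400 ms, tilt = 0.
contour = tilt_event(40.0, 0.4, 0.0)
print(len(contour), contour.max())
```

Fitting such parameters to a detected event, and regenerating the contour from them, is the analysis/synthesis loop the paper evaluates.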
Early sensory encoding of affective prosody: neuromagnetic tomography of emotional category changes.
Thönnessen, Heike; Boers, Frank; Dammers, Jürgen; Chen, Yu-Han; Norra, Christine; Mathiak, Klaus
2010-03-01
In verbal communication, prosodic codes may be phylogenetically older than lexical ones. Little is known, however, about early, automatic encoding of emotional prosody. This study investigated the neuromagnetic analogue of mismatch negativity (MMN) as an index of early stimulus processing of emotional prosody using whole-head magnetoencephalography (MEG). We applied two different paradigms to study MMN; in addition to the traditional oddball paradigm, the so-called optimum design was adapted to emotion detection. In a sequence of randomly changing disyllabic pseudo-words produced by one male speaker in neutral intonation, a traditional oddball design with emotional deviants (10% happy and angry each) and an optimum design with emotional (17% happy and sad each) and nonemotional gender deviants (17% female) elicited the mismatch responses. The emotional category changes demonstrated early responses (<200 ms) at both auditory cortices, with larger amplitudes at the right hemisphere. Responses to the nonemotional change from male to female voices emerged later (approximately 300 ms). Source analysis pointed at bilateral auditory cortex sources without robust contribution from other (e.g., frontal) sources. Conceivably, both auditory cortices encode categorical representations of emotional prosody. Cognitive feature extraction and automatic emotion appraisal may overlap at this level, enabling rapid attentional shifts to important social cues. Copyright (c) 2009 Elsevier Inc. All rights reserved.
Retrieving Tract Variables From Acoustics: A Comparison of Different Machine Learning Strategies.
Mitra, Vikramjit; Nam, Hosung; Espy-Wilson, Carol Y; Saltzman, Elliot; Goldstein, Louis
2010-09-13
Many different studies have claimed that articulatory information can be used to improve the performance of automatic speech recognition systems. Unfortunately, such articulatory information is not readily available in typical speaker-listener situations. Consequently, such information has to be estimated from the acoustic signal in a process which is usually termed "speech-inversion." This study aims to propose and compare various machine learning strategies for speech inversion: Trajectory mixture density networks (TMDNs), feedforward artificial neural networks (FF-ANN), support vector regression (SVR), autoregressive artificial neural network (AR-ANN), and distal supervised learning (DSL). Further, using a database generated by the Haskins Laboratories speech production model, we test the claim that information regarding constrictions produced by the distinct organs of the vocal tract (vocal tract variables) is superior to flesh-point information (articulatory pellet trajectories) for the inversion process.
Applying face identification to detecting hijacking of airplane
NASA Astrophysics Data System (ADS)
Luo, Xuanwen; Cheng, Qiang
2004-09-01
The terrorist hijackings that crashed airplanes into the World Trade Center were a disaster for civilization, and preventing hijackings is critical to homeland security. Reporting a hijacking in time, limiting the terrorists' ability to operate the plane, and landing the plane at the nearest airport could be an efficient way to avoid such misery. Image processing techniques for human face recognition or identification could be used for this task. Before the plane takes off, the face images of the pilots are input into a face identification system installed in the airplane. A camera in front of the pilot's seat keeps taking images of the pilot's face during the flight and comparing them with the pre-input pilot face images. If a different face is detected, a warning signal is sent to the ground automatically. At the same time, the automatic cruise system is started or the plane is controlled from the ground. The terrorists will have no control over the plane, which will be landed at the nearest or most appropriate airport under the control of the ground or the cruise system. This technique could also be used in the automobile industry as an image key to prevent car theft.
A De-Identification Pipeline for Ultrasound Medical Images in DICOM Format.
Monteiro, Eriksson; Costa, Carlos; Oliveira, José Luís
2017-05-01
Clinical data sharing between healthcare institutions, and between practitioners, is often hindered by privacy protection requirements. This problem is critical in collaborative scenarios where data sharing is fundamental for establishing a workflow among parties. The anonymization of patient information burned into DICOM images requires elaborate processes somewhat more complex than simple de-identification of textual information. Usually, before sharing, specific areas of the images containing sensitive information must be removed manually. In this paper, we present a pipeline for ultrasound medical image de-identification, provided as a free anonymization REST service for medical image applications, and a Software-as-a-Service to streamline automatic de-identification of medical images, which is freely available to end-users. The proposed approach applies image processing functions and machine-learning models to build an automatic system for anonymizing medical images. To perform character recognition, we evaluated several machine-learning models, with convolutional neural networks (CNNs) selected as the best approach. To assess the system's quality, 500 processed images were manually inspected, showing an anonymization rate of 89.2%. The tool can be accessed at https://bioinformatics.ua.pt/dicom/anonymizer and is available for the most recent versions of Google Chrome, Mozilla Firefox and Safari. A Docker image containing the proposed service is also publicly available to the community.
ERIC Educational Resources Information Center
Ellis, Elizabeth M.
2016-01-01
Teacher linguistic identity has so far mainly been researched in terms of whether a teacher identifies (or is identified by others) as a native speaker (NEST) or nonnative speaker (NNEST) (Moussu & Llurda, 2008; Reis, 2011). Native speakers are presumed to be monolingual, and nonnative speakers, although by definition bilingual, tend to be…
Pitfalls of Establishing DNA Barcoding Systems in Protists: The Cryptophyceae as a Test Case
Hoef-Emden, Kerstin
2012-01-01
A DNA barcode is a preferably short and highly variable region of DNA supposed to facilitate a rapid identification of species. In many protistan lineages, a lack of species-specific morphological characters hampers an identification of species by light or electron microscopy, and difficulties in performing mating experiments in laboratory cultures also do not allow for an identification of biological species. Thus, testing candidate barcode markers as well as establishing accurately working species identification systems are more challenging than in multicellular organisms. In cryptic species complexes the performance of a potential barcode marker cannot be monitored using morphological characters as a feedback, and an inappropriate choice of DNA region may result in artifactual species trees for several reasons. Therefore a priori knowledge of the systematics of a group is required. In addition to identification of known species, methods for an automatic delimitation of species with DNA barcodes have been proposed. The Cryptophyceae provide a mixture of systematically well characterized and poorly characterized groups and are used in this study to test the suitability of some of the methods for protists. As a species identification method, the performance of BLAST in searches against poorly to well-sampled reference databases has been tested with COI-5P and 5′-partial LSU rDNA (domains A to D of the nuclear LSU rRNA gene). In addition, the performance of two different methods for automatic species delimitation, fixed thresholds of genetic divergence and the general mixed Yule-coalescent model (GMYC), has been examined. The study demonstrates some pitfalls of barcoding methods that have to be taken care of. A best-practice approach towards establishing a DNA barcode system in protists is also proposed. PMID:22970104
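One of the automatic delimitation methods mentioned, a fixed threshold of genetic divergence, amounts to single-linkage clustering of aligned sequences under a pairwise-distance cutoff. The sequences and the 6% threshold below are invented for illustration, not the study's data.

```python
import itertools

def p_distance(a, b):
    """Proportion of mismatched positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def delimit(seqs, threshold=0.05):
    """Fixed-threshold species delimitation: single-linkage clusters of
    sequences chained by pairs whose divergence is below the threshold."""
    clusters = [{i} for i in range(len(seqs))]
    for i, j in itertools.combinations(range(len(seqs)), 2):
        if p_distance(seqs[i], seqs[j]) < threshold:
            ci = next(c for c in clusters if i in c)
            cj = next(c for c in clusters if j in c)
            if ci is not cj:
                ci |= cj
                clusters.remove(cj)
    return clusters

seqs = ["ACGTACGTACGTACGTACGT",
        "ACGTACGTACGTACGTACGA",   # 1 mismatch (5%) from the first
        "TTGTACGTACCTACGAACGT"]   # divergent
print(delimit(seqs, threshold=0.06))
```

The pitfall the paper discusses follows directly: the cluster boundaries depend entirely on where the threshold sits relative to intra- and interspecific divergences in the group.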
NASA Astrophysics Data System (ADS)
Wei, Qingyang; Ma, Tianyu; Xu, Tianpeng; Zeng, Ming; Gu, Yu; Dai, Tiantian; Liu, Yaqiang
2018-01-01
Modern positron emission tomography (PET) detectors are made from pixelated scintillation crystal arrays and read out using Anger logic. The interaction position of the gamma-ray should be assigned to a crystal using a crystal position map or look-up table. Crystal identification is a critical procedure for pixelated PET systems. In this paper, we propose a novel crystal identification method for a dual-layer-offset LYSO based animal PET system via Lu-176 background radiation and the mean shift algorithm. Single photon event data of the Lu-176 background radiation are acquired in list-mode for 3 h to generate a single photon flood map (SPFM). Coincidence events are obtained from the same data using time information to generate a coincidence flood map (CFM). The CFM is used to identify the peaks of the inner layer using the mean shift algorithm. The response of the inner layer is deducted from the SPFM by subtracting the CFM. Then, the peaks of the outer layer are also identified using the mean shift algorithm. The automatically identified peaks are manually inspected with a graphical user interface program. Finally, a crystal position map is generated using a distance criterion based on these peaks. The proposed method is verified on the animal PET system with 48 detector blocks on a laptop with an Intel i7-5500U processor. The total runtime for whole-system peak identification is 67.9 s. Results show that the automatic crystal identification has 99.98% and 99.09% accuracy for the peaks of the inner and outer layers of the whole system, respectively. In conclusion, the proposed method is suitable for the dual-layer-offset lutetium-based PET system to perform crystal identification instead of using external radiation sources.
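The peak search at the heart of the method can be sketched with a toy flat-kernel mean shift on simulated flood-map blobs; the bandwidth, blob layout, and merge rule are illustrative choices, not the paper's parameters.

```python
import numpy as np

def mean_shift(points, bandwidth=1.5, n_iter=30):
    """Toy flat-kernel mean shift: move each point toward the mean of its
    neighbours within `bandwidth` until the shifts settle, then merge
    coincident modes. A stand-in for the crystal-map peak search."""
    modes = points.astype(float).copy()
    for _ in range(n_iter):
        for i, m in enumerate(modes):
            nbr = points[np.linalg.norm(points - m, axis=1) < bandwidth]
            modes[i] = nbr.mean(axis=0)
    # Merge modes that converged to (almost) the same location.
    peaks = []
    for m in modes:
        if not any(np.linalg.norm(m - p) < bandwidth / 2 for p in peaks):
            peaks.append(m)
    return np.array(peaks)

rng = np.random.default_rng(2)
# Simulated flood-map blobs around three crystal positions.
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
pts = np.vstack([c + rng.normal(0, 0.3, (100, 2)) for c in centers])
peaks = mean_shift(pts)
print(len(peaks))
```

Each recovered peak would then be assigned to a crystal by the distance criterion when building the position map.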
DOT National Transportation Integrated Search
1993-01-01
Electronic toll collection (ETC) and traffic management (ETTM), automatic vehicle identification (AVI): Electronic toll collection and traffic management (ETTM) systems are not a futuristic dream; they are operating or are being tested today i...
Automatic vehicle identification technology applications to toll collection services
DOT National Transportation Integrated Search
1997-01-01
Intelligent transportation systems technologies are being developed and applied throughout transportation systems in the United States. An example of this type of innovation can be seen on toll roads, where a driver is required to deposit a toll in order...
DOT National Transportation Integrated Search
2010-02-01
It is important for many applications, such as intersection delay estimation and adaptive signal control, to obtain vehicle turning movement information at signalized intersections. However, vehicle turning movement information is very time consu...
How Cognitive Load Influences Speakers' Choice of Referring Expressions.
Vogels, Jorrig; Krahmer, Emiel; Maes, Alfons
2015-08-01
We report on two experiments investigating the effect of an increased cognitive load for speakers on the choice of referring expressions. Speakers produced story continuations to addressees, in which they referred to characters that were either salient or non-salient in the discourse. In Experiment 1, referents that were salient for the speaker were non-salient for the addressee, and vice versa. In Experiment 2, all discourse information was shared between speaker and addressee. Cognitive load was manipulated by the presence or absence of a secondary task for the speaker. The results show that speakers under load are more likely to produce pronouns, at least when referring to less salient referents. We take this finding as evidence that speakers under load have more difficulties taking discourse salience into account, resulting in the use of expressions that are more economical for themselves. © 2014 Cognitive Science Society, Inc.
Identification of forensic samples by using an infrared-based automatic DNA sequencer.
Ricci, Ugo; Sani, Ilaria; Klintschar, Michael; Cerri, Nicoletta; De Ferrari, Francesco; Giovannucci Uzielli, Maria Luisa
2003-06-01
We have recently introduced a new protocol for analyzing all core loci of the Federal Bureau of Investigation's (FBI) Combined DNA Index System (CODIS) with an infrared (IR) automatic DNA sequencer (LI-COR 4200). The amplicons were labeled with forward oligonucleotide primers covalently linked to a new infrared fluorescent molecule (IRDye 800). The alleles were displayed as familiar autoradiogram-like images with real-time detection. This protocol was employed for paternity testing, population studies, and identification of degraded forensic samples. We extensively analyzed some simulated forensic samples and mixed stains (blood, semen, saliva, bones, and fixed archival embedded tissues), comparing the results with donor samples. Sensitivity studies were also performed for the four multiplex systems. Our results show the efficiency, reliability, and accuracy of the IR system for the analysis of forensic samples. We also compared the efficiency of the multiplex protocol with ultraviolet (UV) technology. Paternity tests, undegraded DNA samples, and real forensic samples were analyzed with this approach based on IR technology and with UV-based automatic sequencers in combination with commercially available kits. The comparability of the results with the widespread UV methods suggests that it is possible to exchange data between laboratories using the same core group of markers but different primer sets and detection methods.
Satija, Udit; Ramkumar, Barathram; Sabarimalai Manikandan, M.
2017-01-01
Automatic electrocardiogram (ECG) signal enhancement has become a crucial pre-processing step in most ECG signal analysis applications. In this Letter, the authors propose an automated noise-aware dictionary learning-based generalised ECG signal enhancement framework which can automatically learn the dictionaries based on the ECG noise type for effective representation of ECG signal and noises, and can reduce the computational load of sparse representation-based ECG enhancement system. The proposed framework consists of noise detection and identification, noise-aware dictionary learning, sparse signal decomposition and reconstruction. The noise detection and identification is performed based on the moving average filter, first-order difference, and temporal features such as number of turning points, maximum absolute amplitude, zero crossings, and autocorrelation features. The representation dictionary is learned based on the type of noise identified in the previous stage. The proposed framework is evaluated using noise-free and noisy ECG signals. Results demonstrate that the proposed method can significantly reduce computational load as compared with conventional dictionary learning-based ECG denoising approaches. Further, comparative results show that the method outperforms existing methods in automatically removing noises such as baseline wanders, power-line interference, muscle artefacts and their combinations without distorting the morphological content of local waves of ECG signal. PMID: 28529758
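Several of the temporal features named for the noise identification stage are easy to illustrate. A hypothetical sketch (function and key names are ours, not from the Letter), computing turning points, maximum absolute amplitude, and zero crossings on a short signal window before a noise-specific dictionary would be selected:

```python
def temporal_features(x):
    """Simple per-window features of a sampled signal x (list of floats)."""
    # a turning point is a sample where the first difference changes sign
    turning = sum(1 for i in range(1, len(x) - 1)
                  if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0)
    # a zero crossing is a sign change between consecutive samples
    zero_cross = sum(1 for i in range(1, len(x)) if x[i - 1] * x[i] < 0)
    return {"turning_points": turning,
            "max_abs_amplitude": max(abs(v) for v in x),
            "zero_crossings": zero_cross}
```

High turning-point and zero-crossing counts at low amplitude would, for example, hint at muscle artefact rather than baseline wander.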
Multi-stage robust scheme for citrus identification from high resolution airborne images
NASA Astrophysics Data System (ADS)
Amorós-López, Julia; Izquierdo Verdiguier, Emma; Gómez-Chova, Luis; Muñoz-Marí, Jordi; Zoilo Rodríguez-Barreiro, Jorge; Camps-Valls, Gustavo; Calpe-Maravilla, Javier
2008-10-01
Identification of land cover types is one of the most critical activities in remote sensing. Nowadays, managing land resources using remote sensing techniques is becoming a common procedure to speed up the process while reducing costs. However, data analysis procedures should satisfy the accuracy figures demanded by institutions and governments for further administrative actions. This paper presents a methodological scheme to update the citrus Geographical Information System (GIS) of the Comunidad Valenciana autonomous region (Spain). The proposed approach introduces a multi-stage automatic scheme to reduce visual photointerpretation and ground validation tasks. First, an object-oriented feature extraction process is carried out for each cadastral parcel from very high spatial resolution (VHR) images (0.5 m) acquired in the visible and near infrared. Next, several automatic classifiers (decision trees, multilayer perceptrons, and support vector machines) are trained and combined to improve the final accuracy of the results. The proposed strategy fulfills the high accuracy demanded by policy makers by combining automatic classification methods with available visual photointerpretation resources. A level of confidence based on the agreement between classifiers allows effective management by fixing the number of parcels to be reviewed. The proposed methodology can be applied to similar problems and applications.
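The agreement-based confidence idea can be sketched in a few lines (names and the unanimity rule are illustrative assumptions, not the paper's exact criterion): parcels where all classifiers agree get an automatic label, the rest are queued for visual photointerpretation.

```python
def triage(predictions):
    """predictions: {parcel_id: [label from each classifier, ...]}.

    Returns (auto-labelled parcels, parcels needing human review)."""
    auto, review = {}, []
    for parcel, labels in predictions.items():
        if len(set(labels)) == 1:   # full agreement: accept automatically
            auto[parcel] = labels[0]
        else:                       # disagreement: send to photointerpretation
            review.append(parcel)
    return auto, review
```

Relaxing or tightening the agreement rule is what lets operators fix the number of parcels sent for review.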
NASA Astrophysics Data System (ADS)
Fujiwara, Yukihiro; Yoshii, Masakazu; Arai, Yasuhito; Adachi, Shuichi
Advanced safety vehicles (ASV) assist drivers' manipulation to avoid traffic accidents. A variety of research on automatic driving systems is necessary as an element of ASV. Among these, we focus on a visual feedback approach in which the automatic driving system is realized by recognizing the road trajectory using image information. The purpose of this paper is to examine the validity of this approach through experiments using a radio-controlled car. First, a practical image processing algorithm to recognize white lines on the road is proposed. Second, a model of the radio-controlled car is built via system identification experiments. Third, an automatic steering control system is designed based on H∞ control theory. Finally, the effectiveness of the designed control system is examined via traveling experiments.
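The white-line recognition step lends itself to a tiny illustration. A hypothetical sketch (function name and threshold are ours, not the paper's algorithm): locate the lane line in one row of a grayscale image by thresholding and taking the centroid of the bright pixels.

```python
def line_center(row, threshold=200):
    """Column position of a white line in one grayscale image row (0-255)."""
    bright = [i for i, v in enumerate(row) if v >= threshold]
    if not bright:
        return None          # no line visible in this row
    return sum(bright) / len(bright)
```

Repeating this per row yields a set of line points that a controller can track as the road trajectory.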
Automatic high-throughput screening of colloidal crystals using machine learning
NASA Astrophysics Data System (ADS)
Spellings, Matthew; Glotzer, Sharon C.
Recent improvements in hardware and software have united to pose an interesting problem for computational scientists studying self-assembly of particles into crystal structures: while studies covering large swathes of parameter space can be dispatched at once using modern supercomputers and parallel architectures, identifying the different regions of a phase diagram is often a serial task completed by hand. While analytic methods exist to distinguish some simple structures, they can be difficult to apply, and automatic identification of more complex structures is still lacking. In this talk we describe one method to create numerical "fingerprints" of local order and use them to analyze a study of complex ordered structures. We can use these methods as first steps toward automatic exploration of parameter space and, more broadly, the strategic design of new materials.
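A toy version of such a "fingerprint" (our own illustration, far simpler than the descriptors used in this line of work): a normalised histogram of neighbour distances around each particle, averaged over the system. Different local packings give different histograms, which a classifier can then separate.

```python
def fingerprint(coords, cutoff=2.0, bins=8):
    """Normalised neighbour-distance histogram for a list of coordinate tuples."""
    hist = [0] * bins
    for i, p in enumerate(coords):
        for j, q in enumerate(coords):
            if i == j:
                continue
            d = sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
            if d < cutoff:
                hist[min(int(d / cutoff * bins), bins - 1)] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]   # normalised so systems are comparable
```

In practice such descriptors are made rotation-invariant and fed to clustering or supervised learning to label phase-diagram regions automatically.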