ERIC Educational Resources Information Center
Imani, Sahar Sadat Afshar
2013-01-01
The Modular EFL Educational Program has offered specialized language education in two specific fields: Audio-visual Materials Translation and Translation of Deeds and Documents. However, no explicit empirical studies can be traced on its internal and external validity measures or on the extent of compatibility of both courses with the…
Fuzzy Logic-Based Audio Pattern Recognition
NASA Astrophysics Data System (ADS)
Malcangi, M.
2008-11-01
Audio and audio-pattern recognition is becoming one of the most important technologies for automatically controlling embedded systems. Fuzzy logic may be the most important enabling methodology due to its ability to rapidly and economically model such applications. An audio and audio-pattern recognition engine based on fuzzy logic has been developed for use in very low-cost, deeply embedded systems to automate human-to-machine and machine-to-machine interaction. This engine consists of simple digital signal-processing algorithms for feature extraction and normalization, and a set of pattern-recognition rules tuned manually or automatically by a self-learning process.
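As a rough illustration of the kind of engine the abstract describes, the sketch below pairs two cheap frame features with hand-tuned triangular memberships and min/max fuzzy rules; all thresholds and class names are hypothetical, not taken from the paper.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function rising on [a, b] and falling on [b, c]."""
    return float(np.clip(min((x - a) / (b - a + 1e-12),
                             (c - x) / (c - b + 1e-12)), 0.0, 1.0))

def classify_frame(frame):
    """Classify one audio frame with hand-tuned fuzzy rules (toy classes)."""
    energy = float(np.mean(frame ** 2))                           # loudness cue
    zcr = float(np.mean(np.abs(np.diff(np.sign(frame))))) / 2.0   # noisiness cue

    # Hand-tuned memberships over normalized features (illustrative thresholds)
    loud, quiet = tri(energy, 0.01, 0.10, 1.0), tri(energy, -0.1, 0.0, 0.02)
    buzzy, tonal = tri(zcr, 0.10, 0.30, 0.60), tri(zcr, -0.1, 0.02, 0.12)

    # Fuzzy rules: min() acts as AND; the best-scoring rule wins
    scores = {
        "alarm": min(loud, buzzy),    # IF loud AND buzzy THEN alarm
        "voice": min(loud, tonal),    # IF loud AND tonal THEN voice
        "silence": quiet,             # IF quiet THEN silence
    }
    return max(scores, key=scores.get)
```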
Robot Command Interface Using an Audio-Visual Speech Recognition System
NASA Astrophysics Data System (ADS)
Ceballos, Alexánder; Gómez, Juan; Prieto, Flavio; Redarce, Tanneguy
In recent years, audio-visual speech recognition has emerged as an active field of research thanks to advances in pattern recognition, signal processing, and machine vision. Its ultimate goal is to allow human-computer communication using voice, taking into account the visual information contained in the audio-visual speech signal. This document presents an automatic command-recognition system using audio-visual information. The system is expected to control the da Vinci laparoscopic robot. The audio signal is parameterized using Mel-frequency cepstral coefficients (MFCCs). In addition, features based on the points that define the mouth's outer contour, as specified in the MPEG-4 standard, are used to extract the visual speech information.
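The audio front end here is standard MFCC parameterization; a minimal sketch using the librosa library (not the authors' implementation), with a hypothetical 16 kHz command recording:

```python
import librosa

# Load a spoken command (hypothetical file) and parameterize it as 13 MFCCs
# per 25 ms frame with a 10 ms hop, a typical speech front-end configuration.
y, sr = librosa.load("command.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=400, hop_length=160)
print(mfcc.shape)  # (13, n_frames)
```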
pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis.
Giannakopoulos, Theodoros
2015-01-01
Audio information plays an important role in the increasing amount of digital content available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automation and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g., audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wide range of audio analysis procedures including: feature extraction, classification of audio signals, supervised and unsupervised segmentation, and content visualization. pyAudioAnalysis is licensed under the Apache License and is available at GitHub (https://github.com/tyiannak/pyAudioAnalysis/). Here we present the theoretical background behind the implemented methodologies, along with evaluation metrics for some of the methods. pyAudioAnalysis has already been used in several audio analysis research applications: smart-home functionalities through audio event detection, speech emotion recognition, depression classification based on audio-visual features, music segmentation, multimodal content-based movie recommendation, and health applications (e.g., monitoring eating habits). The feedback from all these audio applications has led to practical enhancement of the library. PMID:26656189
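A minimal usage sketch of the library's short-term feature extraction. The module and function names follow recent releases on GitHub, which renamed the modules used at the time of the 2015 paper, so they may differ from the version described here:

```python
from pyAudioAnalysis import audioBasicIO, ShortTermFeatures

# Read a file (hypothetical path), downmix to mono, and extract short-term
# features with a 50 ms window and a 25 ms step.
fs, x = audioBasicIO.read_audio_file("sample.wav")
x = audioBasicIO.stereo_to_mono(x)
features, feature_names = ShortTermFeatures.feature_extraction(
    x, fs, int(0.050 * fs), int(0.025 * fs))
print(len(feature_names), features.shape)  # one row per feature, one column per frame
```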
Method and apparatus for obtaining complete speech signals for speech recognition applications
NASA Technical Reports Server (NTRS)
Abrash, Victor (Inventor); Cesari, Federico (Inventor); Franco, Horacio (Inventor); George, Christopher (Inventor); Zheng, Jing (Inventor)
2009-01-01
The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.
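The continuous-recording idea can be sketched with a fixed-length ring buffer that always holds the most recent audio, so frames preceding the user command remain available; the sizes and callback names below are illustrative, not the patent's implementation:

```python
from collections import deque

import numpy as np

FRAME_LEN = 160          # 10 ms frames at 16 kHz (illustrative)
PRE_ROLL_FRAMES = 100    # keep roughly 1 s of audio before the command

ring = deque(maxlen=PRE_ROLL_FRAMES)   # oldest frames drop off automatically

def on_audio_frame(frame: np.ndarray) -> None:
    """Called for every captured frame; continuously fills the circular buffer."""
    ring.append(frame)

def on_start_command() -> np.ndarray:
    """User signaled 'start recognition': prepend the buffered frames so that
    speech beginning before the command is not lost."""
    return np.concatenate(list(ring)) if ring else np.empty(0)
```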
ERIC Educational Resources Information Center
Schlenker, Richard M.; And Others
Presented is a manuscript for an introductory boiler water chemistry course for marine engineer education. The course is modular, self-paced, and audio-tutorial, with contract grading and combined lecture-laboratory instruction. Lectures are presented to students individually via audio tapes and 35 mm slides. The course consists of a total of 17 modules -…
Talker variability in audio-visual speech perception
Heald, Shannon L. M.; Nusbaum, Howard C.
2014-01-01
A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories, and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker's face, speech recognition is improved under adverse listening (e.g., noise or distortion) conditions that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker's face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to the audio-only condition. These results suggest that seeing a talker's face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener that a change in talker has occurred. PMID:25076919
Audio-visual affective expression recognition
NASA Astrophysics Data System (ADS)
Huang, Thomas S.; Zeng, Zhihong
2007-11-01
Automatic affective expression recognition has attracted increasing attention from researchers in different disciplines, and it will contribute significantly to a new paradigm for human-computer interaction (affect-sensitive interfaces, socially intelligent environments) and advance research in affect-related fields including psychology, psychiatry, and education. Multimodal information integration is a process that enables humans to assess affective states robustly and flexibly. In order to capture the richness and subtlety of human emotional behavior, a computer should be able to integrate information from multiple sensors. We introduce in this paper our efforts toward machine understanding of audio-visual affective behavior, based on both deliberate and spontaneous displays. Some promising methods are presented for integrating information from the audio and visual modalities. Our experiments show the advantage of audio-visual fusion in affective expression recognition over audio-only or visual-only approaches.
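One simple form of the audio-visual integration discussed here is decision-level fusion of per-class posteriors from separate audio and visual classifiers; a hedged sketch (the weighting scheme is illustrative, not the authors' method):

```python
import numpy as np

def fuse_posteriors(p_audio, p_video, alpha=0.7):
    """Decision-level fusion: weighted sum of per-class posteriors from the
    audio and visual classifiers. alpha would be tuned on held-out data or
    adapted to the estimated reliability of each channel."""
    p = alpha * np.asarray(p_audio) + (1.0 - alpha) * np.asarray(p_video)
    return p / p.sum()

# Posteriors over (happy, sad, angry, neutral) from two hypothetical classifiers
print(fuse_posteriors([0.6, 0.1, 0.2, 0.1], [0.3, 0.3, 0.3, 0.1]))
```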
Speech information retrieval: a review
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hafen, Ryan P.; Henry, Michael J.
Audio is an information-rich component of multimedia. Information can be extracted from audio in a number of different ways, and thus there are several established audio signal analysis research fields. These fields include speech recognition, speaker recognition, audio segmentation and classification, and audio fingerprinting. The information that can be extracted with tools and methods developed in these fields can greatly enhance multimedia systems. In this paper, we present the current state of research in each of the major audio analysis fields. The goal is to introduce enough background for someone new to the field to quickly gain a high-level understanding and to provide direction for further study.
CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset
Cao, Houwei; Cooper, David G.; Keutmann, Michael K.; Gur, Ruben C.; Nenkova, Ani; Verma, Ragini
2014-01-01
People convey their emotional state in their face and voice. We present an audio-visual data set uniquely suited for the study of multi-modal emotion expression and perception. The data set consists of facial and vocal emotional expressions in sentences spoken in a range of basic emotional states (happy, sad, anger, fear, disgust, and neutral). 7,442 clips of 91 actors with diverse ethnic backgrounds were rated by multiple raters in three modalities: audio, visual, and audio-visual. Categorical emotion labels and real-valued intensity values for the perceived emotion were collected using crowd-sourcing from 2,443 raters. Human recognition of the intended emotion for the audio-only, visual-only, and audio-visual data is 40.9%, 58.2%, and 63.6%, respectively. Recognition rates are highest for neutral, followed by happy, anger, disgust, fear, and sad. Average intensity levels of emotion are rated highest for visual-only perception. The accurate recognition of disgust and fear requires simultaneous audio-visual cues, while anger and happiness can be well recognized based on evidence from a single modality. The large dataset we introduce can be used to probe other questions concerning the audio-visual perception of emotion. PMID:25653738
Multi-modal gesture recognition using integrated model of motion, audio and video
NASA Astrophysics Data System (ADS)
Goutsu, Yusuke; Kobayashi, Takaki; Obara, Junya; Kusajima, Ikuo; Takeichi, Kazunari; Takano, Wataru; Nakamura, Yoshihiko
2015-07-01
Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation, and sign language. With ongoing motion-sensor development, multiple data sources have become available, which has led to the rise of multi-modal gesture recognition. Because our previous approach to gesture recognition depends on a unimodal system, it has difficulty classifying similar motion patterns. To solve this problem, a novel approach that integrates motion, audio, and video models is proposed, using a dataset captured with Kinect. The proposed system recognizes observed gestures using the three models, whose recognition results are integrated by the proposed framework to produce the final result. The motion and audio models are learned using Hidden Markov Models, while a random forest classifier is used to learn the video model. In the experiments testing the performance of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All experiments are conducted on the dataset provided by the organizers of the Multi-Modal Gesture Recognition Challenge (MMGRC) workshop. The comparison results show that the multi-modal model composed of the three models scores the highest recognition rate, indicating that the complementary relationship among the three models improves the accuracy of gesture recognition. The proposed system provides application technology for understanding human actions of daily life more precisely.
NASA Technical Reports Server (NTRS)
2003-01-01
Topics covered include: Stable, Thermally Conductive Fillers for Bolted Joints; Connecting to Thermocouples with Fewer Lead Wires; Zipper Connectors for Flexible Electronic Circuits; Safety Interlock for Angularly Misdirected Power Tool; Modular, Parallel Pulse-Shaping Filter Architectures; High-Fidelity Piezoelectric Audio Device; Photovoltaic Power Station with Ultracapacitors for Storage; Time Analyzer for Time Synchronization and Monitor of the Deep Space Network; Program for Computing Albedo; Integrated Software for Analyzing Designs of Launch Vehicles; Abstract-Reasoning Software for Coordinating Multiple Agents; Software Searches for Better Spacecraft-Navigation Models; Software for Partly Automated Recognition of Targets; Antistatic Polycarbonate/Copper Oxide Composite; Better VPS Fabrication of Crucibles and Furnace Cartridges; Burn-Resistant, Strong Metal-Matrix Composites; Self-Deployable Spring-Strip Booms; Explosion Welding for Hermetic Containerization; Improved Process for Fabricating Carbon Nanotube Probes; Automated Serial Sectioning for 3D Reconstruction; and Parallel Subconvolution Filtering Architectures.
Robust audio-visual speech recognition under noisy audio-video conditions.
Stewart, Darryl; Seymour, Rowan; Pass, Adrian; Ming, Ji
2014-02-01
This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances, with corruption added to the video and/or audio streams using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison to a well-known dynamic stream weighting approach, and also compared to any fixed-weight integration approach, both in clean conditions and when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams, and also according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.
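A simplified sketch of the frame-by-frame stream weighting idea: for each frame, search a grid of stream weights and keep the one whose weighted log-linear combination scores highest. The paper's exact MWSP criterion differs in detail; this only conveys the mechanism:

```python
import numpy as np

def mwsp_frame(log_p_audio, log_p_video, weights=np.linspace(0.0, 1.0, 11)):
    """Per-frame stream weighting: try each candidate audio weight w, combine
    the per-class stream log-scores log-linearly, and keep the weight whose
    best class score is maximal, so a corrupted stream gets down-weighted."""
    best_score, best_w, best_cls = -np.inf, None, None
    for w in weights:
        combined = w * np.asarray(log_p_audio) + (1.0 - w) * np.asarray(log_p_video)
        if combined.max() > best_score:
            best_score, best_w, best_cls = combined.max(), w, int(combined.argmax())
    return best_cls, best_w, best_score
```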
Utterance independent bimodal emotion recognition in spontaneous communication
NASA Astrophysics Data System (ADS)
Tao, Jianhua; Pan, Shifeng; Yang, Minghao; Li, Ya; Mu, Kaihui; Che, Jianfeng
2011-12-01
In spontaneous face-to-face communication, emotional expressions are sometimes mixed with utterance-related facial movements, which complicates emotion recognition. This article introduces methods for reducing utterance influences in the visual parameters used for audio-visual emotion recognition. The audio and visual channels are first combined under a Multistream Hidden Markov Model (MHMM). The utterance reduction is then performed by computing the residual between the observed visual parameters and the utterance-related visual parameters predicted by the model. This article introduces a Fused Hidden Markov Model Inversion method, trained on a neutrally expressed audio-visual corpus, to solve this problem. To reduce computational complexity, the inversion model is further simplified to a Gaussian Mixture Model (GMM) mapping. Compared with traditional bimodal emotion recognition methods (e.g., SVM, CART, Boosting), the utterance reduction method gives better emotion recognition results. The experiments also show the effectiveness of our emotion recognition system when used in a live environment.
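The utterance-reduction step can be illustrated with a stand-in for the fused-HMM inversion: learn an audio-to-visual mapping on neutral data, then subtract its prediction from the observed visual parameters. A plain least-squares mapping and synthetic data are used below purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "neutral" training data: 12-D audio features -> 6 visual parameters
A_train = rng.normal(size=(500, 12))
V_train = A_train @ rng.normal(size=(12, 6)) + 0.1 * rng.normal(size=(500, 6))

# Learn the utterance-related audio-to-visual mapping (least-squares stand-in
# for the paper's fused-HMM inversion / GMM mapping)
W, *_ = np.linalg.lstsq(A_train, V_train, rcond=None)

def emotion_residual(audio_feat, visual_param):
    """Observed visual parameters minus the utterance-predicted ones; the
    residual is what the emotion recognizer should see."""
    return visual_param - audio_feat @ W
```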
The sweet-home project: audio technology in smart homes to improve well-being and reliance.
Vacher, Michel; Istrate, Dan; Portet, François; Joubert, Thierry; Chevalier, Thierry; Smidtas, Serge; Meillon, Brigitte; Lecouteux, Benjamin; Sehili, Mohamed; Chahuara, Pedro; Méniard, Sylvain
2011-01-01
The Sweet-Home project aims at providing audio-based interaction technology that lets the user have full control over their home environment, at detecting distress situations and at easing the social inclusion of the elderly and frail population. This paper presents an overview of the project focusing on the multimodal sound corpus acquisition and labelling and on the investigated techniques for speech and sound recognition. The user study and the recognition performances show the interest of this audio technology.
Specific and Modular Binding Code for Cytosine Recognition in Pumilio/FBF (PUF) RNA-binding Domains
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dong, Shuyun; Wang, Yang; Cassidy-Amstutz, Caleb
2011-10-28
Pumilio/fem-3 mRNA-binding factor (PUF) proteins possess a recognition code for bases A, U, and G, allowing designed RNA sequence specificity of their modular Pumilio (PUM) repeats. However, recognition side chains in a PUM repeat for cytosine are unknown. Here we report identification of a cytosine-recognition code by screening random amino acid combinations at conserved RNA recognition positions using a yeast three-hybrid system. This C-recognition code is specific and modular, as specificity can be transferred to different positions in the RNA recognition sequence. A crystal structure of a modified PUF domain reveals specific contacts between an arginine side chain and the cytosine base. We applied the C-recognition code to design PUF domains that recognize targets with multiple cytosines and to generate engineered splicing factors that modulate alternative splicing. Finally, we identified a divergent yeast PUF protein, Nop9p, that may recognize natural target RNAs with cytosine. This work deepens our understanding of natural PUF protein target recognition and expands the ability to engineer PUF domains to recognize any RNA sequence.
Automatic violence detection in digital movies
NASA Astrophysics Data System (ADS)
Fischer, Stephan
1996-11-01
Research on computer-based recognition of violence is scant. We are working on the automatic recognition of violence in digital movies, a first step toward the goal of a computer-assisted system capable of protecting children against TV programs containing a great deal of violence. In the video domain, collision detection and a model mapping to locate human figures are run, while in the audio domain, fingerprints are created and compared to find certain events. This article centers on the recognition of fistfights in the video domain and on the recognition of shots, explosions, and cries in the audio domain.
Recognition and characterization of unstructured environmental sounds
NASA Astrophysics Data System (ADS)
Chu, Selina
2011-12-01
Environmental sounds are what we hear every day or, more generally, the ambient or background audio that surrounds us. Humans utilize both vision and hearing to respond to their surroundings, a capability still quite limited in machine processing. The first step toward achieving multimodal input applications is the ability to process unstructured audio and recognize audio scenes (or environments). Such an ability would have applications in content analysis and mining of multimedia data, or in improving robustness in context-aware applications through multi-modality, such as assistive robotics, surveillance, or mobile device-based services. The goal of this thesis is the characterization of unstructured environmental sounds for understanding and predicting the context surrounding an agent or device. Most research on audio recognition has focused primarily on speech and music; less attention has been paid to the challenges and opportunities of using audio to characterize unstructured environments. My research focuses on investigating the challenging issues in characterizing unstructured environmental audio and on developing novel algorithms for modeling the variations of the environment. The first step in building a recognition system for unstructured auditory environments was to investigate techniques and audio features for working with such data, beginning with a study that explored suitable features and the feasibility of designing an automatic environment recognition system using audio information. This initial investigation found that traditional recognition and feature extraction for audio were not suitable for environmental sound, which lacks the formantic and harmonic structures of speech and music, dispelling the notion that traditional speech and music recognition techniques can simply be applied to realistic environmental sound. Natural unstructured environment sounds contain a large variety of sounds, which are in fact noise-like and are not effectively modeled by Mel-frequency cepstral coefficients (MFCCs) or other commonly used audio features, e.g., energy, zero-crossing rate, etc. Given this lack of appropriate features for environmental audio, and to achieve a more effective representation, I proposed a specialized feature extraction algorithm for environmental sounds that uses the matching pursuit (MP) algorithm to learn the inherent structure of each type of sound; we call these MP-features. MP-features have been shown to capture and represent sounds from different sources and different ranges where frequency-domain features (e.g., MFCCs) fail, and they can be advantageous when combined with MFCCs to improve overall performance. The third component is the investigation of modeling and detecting the background audio. One of the goals of this research is to characterize an environment. Since many events blend into the background, I sought a general model for any particular environment: once the background is modeled, foreground events can be identified even if they have not been seen before. The next step, therefore, was to learn the audio background model for each environment type, despite the occurrence of different foreground events.
In this work, I presented a framework for robust audio background modeling, which includes learning models for prediction, data knowledge, and persistent characteristics of the environment. This approach can model the background and detect foreground events, as well as verify whether the predicted background is indeed the background or a foreground event that persists for a longer period of time. I also investigated the use of a semi-supervised learning technique to exploit and label new unlabeled audio data. The final components of my thesis involve learning sound structures for generalization and applying the proposed ideas to context-aware applications. The inherent nature of environmental sound is noisy, with relatively large amounts of overlap between events of different environments. Environmental sounds contain large variances even within a single environment type, and frequently there are no clear or divisible boundaries between some types. Traditional classification methods are generally not robust enough to handle classes with overlaps, so this audio requires representation by complex models. A deep learning architecture provides a generative, model-based method for classification. Specifically, I considered Deep Belief Networks (DBNs) to model environmental audio and investigated their applicability to noisy data to improve robustness and generalization. A framework was proposed using composite DBNs to discover high-level representations and to learn a hierarchical structure for different acoustic environments in a data-driven fashion. Experimental results on real data sets demonstrate its effectiveness over traditional methods, with over 90% recognition accuracy across a large number of environmental sound types.
Luque, Joaquín; Larios, Diego F; Personal, Enrique; Barbancho, Julio; León, Carlos
2016-05-18
Environmental audio monitoring is an area of huge interest for biologists all over the world, and several audio monitoring systems have been proposed in the literature. They follow two different approaches: acquiring and compressing all audio patterns in order to send them as raw data to a main server, or running specific recognition systems based on audio patterns. The first approach has the drawback of the large amount of information to be stored on a main server, information that then requires considerable effort to analyze. The second approach has the drawback of poor scalability when new patterns need to be detected. To overcome these limitations, this paper proposes an environmental Wireless Acoustic Sensor Network architecture based on generic descriptors from the MPEG-7 standard. These descriptors are shown to be suitable for recognizing different patterns, allowing high scalability. The proposed parameters have been tested on recognizing different behaviors of two anuran species that live in Spanish natural parks, the Epidalea calamita and Alytes obstetricans toads, demonstrating high classification performance. PMID:27213375
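Two of the generic low-level descriptors standardized by MPEG-7 are the audio spectrum centroid and spread; a simplified per-frame computation (using a linear frequency axis rather than the standard's log scale) gives the flavor of such descriptors:

```python
import numpy as np

def spectrum_centroid_spread(frame, fs):
    """Per-frame spectral centroid and spread, in the spirit of MPEG-7's
    AudioSpectrumCentroid / AudioSpectrumSpread descriptors."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    p = spec / (spec.sum() + 1e-12)          # normalized power distribution
    centroid = np.sum(freqs * p)             # spectral center of mass (Hz)
    spread = np.sqrt(np.sum((freqs - centroid) ** 2 * p))
    return centroid, spread
```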
Advances in audio source separation and multisource audio content retrieval
NASA Astrophysics Data System (ADS)
Vincent, Emmanuel
2012-06-01
Audio source separation aims to extract the signals of individual sound sources from a given recording. In this paper, we review three recent advances which improve the robustness of source separation in real-world challenging scenarios and enable its use for multisource content retrieval tasks, such as automatic speech recognition (ASR) or acoustic event detection (AED) in noisy environments. We present a Flexible Audio Source Separation Toolkit (FASST) and discuss its advantages compared to earlier approaches such as independent component analysis (ICA) and sparse component analysis (SCA). We explain how cues as diverse as harmonicity, spectral envelope, temporal fine structure or spatial location can be jointly exploited by this toolkit. We subsequently present the uncertainty decoding (UD) framework for the integration of audio source separation and audio content retrieval. We show how the uncertainty about the separated source signals can be accurately estimated and propagated to the features. Finally, we explain how this uncertainty can be efficiently exploited by a classifier, both at the training and the decoding stage. We illustrate the resulting performance improvements in terms of speech separation quality and speaker recognition accuracy.
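For a Gaussian classifier with diagonal covariances, uncertainty decoding has a particularly clean form: the feature uncertainty is added to the model variance before scoring. A toy sketch of that rule (not FASST's implementation):

```python
import numpy as np
from scipy.stats import multivariate_normal

def ud_log_likelihood(mu, var_model, x_hat, var_uncertainty):
    """Uncertainty decoding for one diagonal Gaussian: score the point
    estimate x_hat with the feature uncertainty added to the model variance."""
    return multivariate_normal.logpdf(x_hat, mean=mu,
                                      cov=np.diag(var_model + var_uncertainty))

mu, x_hat = np.zeros(2), np.array([1.0, -0.5])
print(ud_log_likelihood(mu, np.ones(2), x_hat, np.zeros(2)))       # certain feature
print(ud_log_likelihood(mu, np.ones(2), x_hat, 4.0 * np.ones(2)))  # uncertain -> flatter score
```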
Structuring Broadcast Audio for Information Access
NASA Astrophysics Data System (ADS)
Gauvain, Jean-Luc; Lamel, Lori
2003-12-01
One rapidly expanding application area for state-of-the-art speech recognition technology is the automatic processing of broadcast audiovisual data for information access. Since much of the linguistic information is found in the audio channel, speech recognition is a key enabling technology which, when combined with information retrieval techniques, can be used for searching large audiovisual document collections. Audio indexing must take into account the specificities of audio data such as needing to deal with the continuous data stream and an imperfect word transcription. Other important considerations are dealing with language specificities and facilitating language portability. At Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), broadcast news transcription systems have been developed for seven languages: English, French, German, Mandarin, Portuguese, Spanish, and Arabic. The transcription systems have been integrated into prototype demonstrators for several application areas such as audio data mining, structuring audiovisual archives, selective dissemination of information, and topic tracking for media monitoring. As examples, this paper addresses the spoken document retrieval and topic tracking tasks.
An Introduction to the Fundamentals of Chemistry for the Marine Engineer.
ERIC Educational Resources Information Center
Schlenker, Richard M.
This document describes an introductory course in the fundamentals of chemistry for marine engineers. The course is modularized and audio-tutorial, allowing students to progress at their own rate while integrating laboratory and lecture materials. (SL)
A Grey Wolf Optimizer for Modular Granular Neural Networks for Human Recognition
Sánchez, Daniela; Melin, Patricia; Castillo, Oscar
2017-01-01
A grey wolf optimizer for modular neural networks (MNNs) with a granular approach is proposed. The proposed method performs optimal granulation of data and design of modular neural network architectures to perform human recognition; to prove its effectiveness, benchmark databases of ear, iris, and face biometric measures are used for tests and comparisons against other works. The design of a modular granular neural network (MGNN) consists in finding optimal parameters of its architecture: the number of subgranules, the percentage of data for the training phase, the learning algorithm, the goal error, the number of hidden layers, and their numbers of neurons. Nowadays, a great variety of approaches and new techniques have emerged within the evolutionary computing area to help find optimal solutions to problems or models; bioinspired algorithms are part of this area. In this work, a grey wolf optimizer is proposed for the design of modular granular neural networks, and the results are compared against a genetic algorithm and a firefly algorithm in order to determine which of these techniques provides better results when applied to human recognition. PMID:28894461
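For reference, a minimal grey wolf optimizer over a continuous search space; in the paper the objective would be the validation error of a candidate MGNN architecture, whereas a stand-in sphere function is used here:

```python
import numpy as np

def gwo(objective, dim, lo, hi, n_wolves=20, n_iter=100, seed=0):
    """Minimal grey wolf optimizer (minimization): the three fittest wolves
    (alpha, beta, delta) steer the rest of the pack each iteration."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, size=(n_wolves, dim))
    for t in range(n_iter):
        fitness = np.array([objective(x) for x in X])
        alpha, beta, delta = X[np.argsort(fitness)[:3]]
        a = 2.0 - 2.0 * t / n_iter              # decreases linearly from 2 to 0
        for i in range(n_wolves):
            new = np.zeros(dim)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2.0 * a * r1 - a, 2.0 * r2
                new += leader - A * np.abs(C * leader - X[i])
            X[i] = np.clip(new / 3.0, lo, hi)
    fitness = np.array([objective(x) for x in X])
    return X[fitness.argmin()], fitness.min()

# Stand-in objective; in the paper this would be MGNN validation error
best_x, best_f = gwo(lambda x: float(np.sum(x ** 2)), dim=5, lo=-10.0, hi=10.0)
```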
Channel Compensation for Speaker Recognition using MAP Adapted PLDA and Denoising DNNs
2016-06-21
improvement has been the availability of large quantities of speaker-labeled data from telephone recordings. For new data applications, such as audio from...microphone channels to the telephone channel. Audio files were rejected if the alignment process failed. At the end of the process a total of 873...Microphone 01 AT3035 (Audio Technica Studio Mic) 02 MX418S (Shure Gooseneck Mic) 03 Crown PZM Soundgrabber II 04 AT Pro45 (Audio Technica Hanging Mic
NASA Astrophysics Data System (ADS)
Medjkoune, Sofiane; Mouchère, Harold; Petitrenaud, Simon; Viard-Gaudin, Christian
2013-01-01
The work reported in this paper concerns the problem of mathematical expression recognition, a task known to be very hard. We propose to alleviate the difficulties by taking into account two complementary modalities: handwriting and audio. To combine the signals coming from both modalities, various fusion methods are explored. Performances evaluated on the HAMEX dataset show a significant improvement compared to a single-modality (handwriting) based system.
Robust Radio Broadcast Monitoring Using a Multi-Band Spectral Entropy Signature
NASA Astrophysics Data System (ADS)
Camarena-Ibarrola, Antonio; Chávez, Edgar; Tellez, Eric Sadit
Monitoring media broadcast content has recently attracted considerable attention from both academia and industry due to the technical challenge involved and its economic importance (e.g., in advertising). The problem poses a unique challenge from the pattern recognition point of view because a very high recognition rate is needed under non-ideal conditions. It consists of comparing a short audio sequence (the commercial ad) with a large audio stream (the broadcast), searching for matches.
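The signature itself is easy to sketch: split each frame's power spectrum into sub-bands and record the entropy of the normalized power in each band. The band count and windowing below are illustrative choices, not the authors' exact configuration:

```python
import numpy as np

def multiband_spectral_entropy(frame, n_bands=24):
    """Signature of one frame: the entropy of the normalized power inside each
    of n_bands sub-bands of the windowed power spectrum."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    signature = []
    for band in np.array_split(spec, n_bands):
        p = band / (band.sum() + 1e-12)
        signature.append(-np.sum(p * np.log2(p + 1e-12)))
    return np.array(signature)

# Matching then slides the ad's sequence of signatures along the broadcast's
# sequence and reports positions where the distance falls below a threshold.
```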
Fifty years of progress in speech and speaker recognition
NASA Astrophysics Data System (ADS)
Furui, Sadaoki
2004-10-01
Speech and speaker recognition technology has made very significant progress in the past 50 years. The progress can be summarized by the following changes: (1) from template matching to corpus-based statistical modeling, e.g., HMMs and n-grams, (2) from filter bank/spectral resonance to cepstral features (cepstrum + Δcepstrum + ΔΔcepstrum), (3) from heuristic time-normalization to DTW/DP matching, (4) from "distance"-based to likelihood-based methods, (5) from maximum likelihood to discriminative approaches, e.g., MCE/GPD and MMI, (6) from isolated word to continuous speech recognition, (7) from small vocabulary to large vocabulary recognition, (8) from context-independent units to context-dependent units for recognition, (9) from clean speech to noisy/telephone speech recognition, (10) from single speaker to speaker-independent/adaptive recognition, (11) from monologue to dialogue/conversation recognition, (12) from read speech to spontaneous speech recognition, (13) from recognition to understanding, (14) from single-modality (audio signal only) to multi-modal (audio/visual) speech recognition, (15) from hardware recognizer to software recognizer, and (16) from no commercial application to many practical commercial applications. Most of these advances have taken place in both speech recognition and speaker recognition. The majority of technological changes have been directed toward increasing the robustness of recognition, including many other important techniques not noted above.
Engel, Annerose; Bangert, Marc; Horbank, David; Hijmans, Brenda S; Wilkens, Katharina; Keller, Peter E; Keysers, Christian
2012-11-01
To investigate the cross-modal transfer of movement patterns necessary to perform melodies on the piano, 22 non-musicians learned to play short sequences on a piano keyboard by (1) merely listening and replaying (vision of own fingers occluded) or (2) merely observing silent finger movements and replaying (on a silent keyboard). After training, participants recognized with above-chance accuracy (1) audio-motor-learned sequences upon visual presentation (89±17%) and (2) visuo-motor-learned sequences upon auditory presentation (77±22%). The recognition rates for visual presentation significantly exceeded those for auditory presentation (p<.05). fMRI revealed that observing finger movements corresponding to audio-motor-trained melodies is associated with stronger activation in the left rolandic operculum than observing untrained sequences. This region was also involved in silent execution of sequences, suggesting that a link to motor representations may play a role in cross-modal transfer from the audio-motor training condition to visual recognition. No significant differences in brain activity were found during listening to visuo-motor-trained compared to untrained melodies. Cross-modal transfer was stronger from the audio-motor training condition to visual recognition, and this is discussed in relation to the fact that non-musicians are familiar with how their finger movements look (motor-to-vision transformation) but not with how they sound on a piano (motor-to-sound transformation). Copyright © 2012 Elsevier Inc. All rights reserved.
Cross-modal individual recognition in wild African lions.
Gilfillan, Geoffrey; Vitale, Jessica; McNutt, John Weldon; McComb, Karen
2016-08-01
Individual recognition is considered to have been fundamental in the evolution of complex social systems and is thought to be a widespread ability throughout the animal kingdom. Although robust evidence for individual recognition remains limited, recent experimental paradigms that examine cross-modal processing have demonstrated individual recognition in a range of captive non-human animals. It is now highly relevant to test whether cross-modal individual recognition exists within wild populations and thus examine how it is employed during natural social interactions. We address this question by testing audio-visual cross-modal individual recognition in wild African lions (Panthera leo) using an expectancy-violation paradigm. When presented with a scenario where the playback of a loud-call (roaring) broadcast from behind a visual block is incongruent with the conspecific previously seen there, subjects responded more strongly than during the congruent scenario where the call and individual matched. These findings suggest that lions are capable of audio-visual cross-modal individual recognition and provide a useful method for studying this ability in wild populations. © 2016 The Author(s).
Reconstruction of audio waveforms from spike trains of artificial cochlea models
Zai, Anja T.; Bhargava, Saurabh; Mesgarani, Nima; Liu, Shih-Chii
2015-01-01
Spiking cochlea models describe the analog processing and spike generation process within the biological cochlea. Reconstructing the audio input from the artificial cochlea spikes is therefore useful for understanding the fidelity of the information preserved in the spikes. The reconstruction process is challenging, particularly for spikes from mixed-signal (analog/digital) integrated circuit (IC) cochleas, because of multiple non-linearities in the model and the additional variance caused by random transistor mismatch. This work proposes an offline method for reconstructing the audio input from spike responses of both a particular spike-based hardware model called the AEREAR2 cochlea and an equivalent software cochlea model. This method was previously used to reconstruct the auditory stimulus based on the peri-stimulus histogram of spike responses recorded in the ferret auditory cortex. The reconstructed audio from the hardware cochlea is evaluated against an analogous software model using objective measures of speech quality and intelligibility, and further tested in a word recognition task. The audio reconstructed under low signal-to-noise ratio (SNR) conditions (SNR < –5 dB) gives better classification performance in this word recognition task than the original input at the same SNR. PMID:26528113
MPEG-7 audio-visual indexing test-bed for video retrieval
NASA Astrophysics Data System (ADS)
Gagnon, Langis; Foucher, Samuel; Gouaillier, Valerie; Brun, Christelle; Brousseau, Julie; Boulianne, Gilles; Osterrath, Frederic; Chapdelaine, Claude; Dutrisac, Julie; St-Onge, Francis; Champagne, Benoit; Lu, Xiaojian
2003-12-01
This paper reports on the development status of a Multimedia Asset Management (MAM) test-bed for content-based indexing and retrieval of audio-visual documents within the MPEG-7 standard. The project, called "MPEG-7 Audio-Visual Document Indexing System" (MADIS), specifically targets the indexing and retrieval of video shots and key frames from documentary film archives, based on audio-visual content like face recognition, motion activity, speech recognition, and semantic clustering. The MPEG-7/XML encoding of the film database is done off-line. The description decomposition is based on a temporal decomposition into visual segments (shots), key frames, and audio/speech sub-segments. The visible outcome will be a web site that allows video retrieval using a proprietary XQuery-based search engine, accessible to members of the Canadian National Film Board (NFB) Cineroute site. For example, end users will be able to retrieve movie shots from the database that were produced in a specific year, contain the face of a specific actor saying a specific word, and show no motion activity. Video streaming is performed over the high-bandwidth CA*net network deployed by CANARIE, a public Canadian Internet development organization.
The $19.95 Solution to Large Group Telephone Interviews with Special Speakers.
ERIC Educational Resources Information Center
Robinson, George H.
1998-01-01
Describes an inexpensive solution for holding large-group telephone interviews, listing the equipment needed (record control, telephone, phone line with modular jack, portable amplifier with microphone-level input jack, audio cable with jack and plug compatible with the microphone input jack on the amplifier) and providing directions for setup.…
Improved Techniques for Automatic Chord Recognition from Music Audio Signals
ERIC Educational Resources Information Center
Cho, Taemin
2014-01-01
This thesis is concerned with the development of techniques that facilitate the effective implementation of capable automatic chord transcription from music audio signals. Since chord transcriptions can capture many important aspects of music, they are useful for a wide variety of music applications and also useful for people who learn and perform…
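A common baseline for this task (not the thesis's own method) is template matching over chroma features: correlate each frame's 12-bin chroma vector with rotated chord templates. A sketch using librosa, restricted to major triads and a hypothetical input file:

```python
import numpy as np
import librosa

# Chroma features of a recording (hypothetical file)
y, sr = librosa.load("song.wav")
chroma = librosa.feature.chroma_cqt(y=y, sr=sr)        # shape (12, n_frames)

# Binary template for a major triad, rotated to all 12 roots
maj = np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0], dtype=float)
names = [librosa.midi_to_note(60 + r, octave=False) + ":maj" for r in range(12)]
T = np.array([np.roll(maj, r) for r in range(12)])

# Cosine similarity between each frame and each template; best template wins
C = chroma / (np.linalg.norm(chroma, axis=0, keepdims=True) + 1e-12)
T = T / np.linalg.norm(T, axis=1, keepdims=True)
frame_chords = [names[i] for i in (T @ C).argmax(axis=0)]
```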
Multimodal fusion of polynomial classifiers for automatic person recognition
NASA Astrophysics Data System (ADS)
Broun, Charles C.; Zhang, Xiaozheng
2001-03-01
With the prevalence of the information age, privacy and personalization are at the forefront of today's society. As such, biometrics are viewed as essential components of current evolving technological systems. Consumers demand unobtrusive and non-invasive approaches. In our previous work, we demonstrated a speaker verification system that meets these criteria. However, there are additional constraints for fielded systems. The required recognition transactions are often performed in adverse environments and across diverse populations, necessitating robust solutions. There are two significant problem areas in current-generation speaker verification systems. The first is the difficulty of acquiring clean audio signals in all environments without encumbering the user with a head-mounted close-talking microphone. Second, unimodal biometric systems do not work with a significant percentage of the population. To combat these issues, multimodal techniques are being investigated to improve system robustness to environmental conditions, as well as to improve overall accuracy across the population. We propose a multimodal approach that builds on our current state-of-the-art speaker verification technology. In order to maintain the transparent nature of the speech interface, we focus on optical sensing technology to provide the additional modality, giving us an audio-visual person recognition system. For the audio domain, we use our existing speaker verification system. For the visual domain, we focus on lip motion. This is chosen, rather than static face or iris recognition, because it provides dynamic information about the individual; in addition, the lip dynamics can aid speech recognition to provide liveness testing. The visual processing method makes use of both color and edge information, combined within a Markov random field (MRF) framework, to localize the lips. Geometric features are extracted and input to a polynomial classifier for the person recognition process. A late-integration approach, based on a probabilistic model, is employed to combine the two modalities. The system is tested on the XM2VTS database combined with additive white Gaussian noise in the audio domain over a range of signal-to-noise ratios.
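The polynomial classifier at the core of this system can be approximated with a polynomial feature expansion followed by a linear model; the sketch below uses scikit-learn on toy stand-ins for the geometric lip features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Toy stand-ins for per-utterance geometric lip features and identity labels
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] * X[:, 1] + X[:, 2] > 0).astype(int)   # target needs a cross-term

# Degree-2 polynomial expansion followed by a linear decision rule
clf = make_pipeline(PolynomialFeatures(degree=2),
                    LogisticRegression(max_iter=1000))
clf.fit(X, y)
print(clf.score(X, y))
```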
[Intermodal timing cues for audio-visual speech recognition].
Hashimoto, Masahiro; Kumashiro, Masaharu
2004-06-01
The purpose of this study was to investigate the limitations of lip-reading advantages for Japanese young adults by desynchronizing the visual and auditory information in speech. In the experiment, audio-visual speech stimuli were presented under six test conditions: audio alone, and audio-visual with either 0, 60, 120, 240, or 480 ms of audio delay. The stimuli were video recordings of the face of a female Japanese speaker producing long and short Japanese sentences. The intelligibility of the audio-visual stimuli was measured as a function of audio delay in sixteen untrained young subjects. Speech intelligibility under audio-delay conditions of less than 120 ms was significantly better than under the audio-alone condition. The delay of 120 ms corresponded to the mean mora duration measured for the audio stimuli. The results imply that audio delays of up to 120 ms do not disrupt the lip-reading advantage, because the visual and auditory information in speech appear to be integrated on a syllabic time scale. Potential applications of this research include noisy workplaces in which a worker must extract relevant speech from competing noise.
2011-11-17
Mr. Frank Salvatore, High Performance Technologies FIXED AND ROTARY WING AIRCRAFT 13274 - "CREATE-AV DaVinci: Model-Based Engineering for Systems... Tools for Reliability Improvement and Addressing Modularity Issues in Evaluation and Physical Testing", Dr. Richard Heine, Army Materiel Systems
Vacher, Michel; Chahuara, Pedro; Lecouteux, Benjamin; Istrate, Dan; Portet, Francois; Joubert, Thierry; Sehili, Mohamed; Meillon, Brigitte; Bonnefond, Nicolas; Fabre, Sébastien; Roux, Camille; Caffiau, Sybille
2013-01-01
The Sweet-Home project aims at providing audio-based interaction technology that lets the user have full control over their home environment, at detecting distress situations, and at easing the social inclusion of the elderly and frail population. This paper presents an overview of the project, focusing on the implemented techniques for speech and sound recognition as well as context-aware decision making under uncertainty. A user experiment in a smart home demonstrates the interest of this audio-based technology.
High-Level Event Recognition in Unconstrained Videos
2013-01-01
frames performs well for urban soundscapes but not for polyphonic music. In place of GMM, Lu et al. [78] adopted spectral clustering to generate...Aucouturier JJ, Defreville B, Pachet F (2007) The bag-of-frames approach to audio pattern recognition: a sufficient model for urban soundscapes but not
Modular structure of functional networks in olfactory memory.
Meunier, David; Fonlupt, Pierre; Saive, Anne-Lise; Plailly, Jane; Ravel, Nadine; Royet, Jean-Pierre
2014-07-15
Graph theory enables the study of systems by describing those systems as a set of nodes and edges. Graph theory has been widely applied to characterize the overall structure of data sets in the social, technological, and biological sciences, including neuroscience. Modular structure decomposition enables the definition of sub-networks whose components are gathered in the same module and work together closely, while working weakly with components from other modules. This processing is of interest for studying memory, a cognitive process that is widely distributed. We propose a new method to identify modular structure in task-related functional magnetic resonance imaging (fMRI) networks. The modular structure was obtained directly from correlation coefficients and thus retained information about both signs and weights. The method was applied to functional data acquired during a yes-no odor recognition memory task performed by young and elderly adults. Four response categories were explored: correct (Hit) and incorrect (False alarm, FA) recognition and correct and incorrect rejection. We extracted time series data for 36 areas as a function of response categories and age groups and calculated condition-based weighted correlation matrices. Overall, condition-based modular partitions were more homogeneous in young than elderly subjects. Using partition similarity-based statistics and a posteriori statistical analyses, we demonstrated that several areas, including the hippocampus, caudate nucleus, and anterior cingulate gyrus, belonged to the same module more frequently during Hit than during all other conditions. Modularity values were negatively correlated with memory scores in the Hit condition and positively correlated with bias scores (liberal/conservative attitude) in the Hit and FA conditions. We further demonstrated that the proportion of positive and negative links between areas of different modules (i.e., the proportion of correlated and anti-correlated areas) accounted for most of the observed differences in signed modularity. Taken together, our results provided some evidence that the neural networks involved in odor recognition memory are organized into modules and that these modular partitions are linked to behavioral performance and individual strategies. Copyright © 2014 Elsevier Inc. All rights reserved.
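The modular-decomposition step can be sketched with standard tooling: build a weighted graph from an ROI correlation matrix and run modularity-based community detection. Note that the paper retains both signs and weights, whereas the networkx routine below handles positive weights only; data are synthetic:

```python
import numpy as np
from networkx import Graph
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(0)
ts = rng.normal(size=(36, 120))        # stand-in: 36 ROI time series, 120 scans
corr = np.corrcoef(ts)

G = Graph()
for i in range(36):
    for j in range(i + 1, 36):
        if corr[i, j] > 0:             # this sketch keeps positive links only
            G.add_edge(i, j, weight=corr[i, j])

modules = greedy_modularity_communities(G, weight="weight")
print([sorted(m) for m in modules])    # ROIs grouped into modules
```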
Psychological Operations: Fighting the War of Ideas
2007-05-18
is the success of the Joint Interagency Task Force on the Former Regime Elements (JIATF-FRE) operation to capture Fadhil Ibrahim Habib al-Mashadani... DAPS), Fly Away Broadcast System (FABS), and Target Audience Analysis Detachment (TAAD). This provides the Brigade a radio development and broadcast...level. Production and dissemination assets must include a Modular Print System (MPS), Deployable Audio Production Suite (DAPS), and Special
Schierholz, Irina; Finke, Mareike; Kral, Andrej; Büchner, Andreas; Rach, Stefan; Lenarz, Thomas; Dengler, Reinhard; Sandmann, Pascale
2017-04-01
There is substantial variability in speech recognition ability across patients with cochlear implants (CIs), auditory brainstem implants (ABIs), and auditory midbrain implants (AMIs). To better understand how this variability is related to central processing differences, the current electroencephalography (EEG) study compared hearing abilities and auditory-cortex activation in patients with electrical stimulation at different sites of the auditory pathway. Three groups of patients with auditory implants (Hannover Medical School; ABI: n = 6, CI: n = 6; AMI: n = 2) performed a speeded response task and a speech recognition test with auditory, visual, and audio-visual stimuli. Behavioral performance and cortical processing of auditory and audio-visual stimuli were compared between groups. ABI and AMI patients showed prolonged response times for auditory and audio-visual stimuli compared with NH listeners and CI patients. This was confirmed by prolonged N1 latencies and reduced N1 amplitudes in ABI and AMI patients. However, patients with central auditory implants showed a remarkable gain in performance when visual and auditory input were combined, in both speech and non-speech conditions, which was reflected by a strong visual modulation of auditory-cortex activation in these individuals. In sum, the results suggest that the behavioral improvement for audio-visual conditions in central auditory implant patients is based on enhanced audio-visual interactions in the auditory cortex. These findings may provide important implications for the optimization of electrical stimulation and rehabilitation strategies in patients with central auditory prostheses. Hum Brain Mapp 38:2206-2225, 2017. © 2017 Wiley Periodicals, Inc.
Using speech recognition to enhance the Tongue Drive System functionality in computer access.
Huo, Xueliang; Ghovanloo, Maysam
2011-01-01
Tongue Drive System (TDS) is a wireless, tongue-operated assistive technology (AT) that can enable people with severe physical disabilities to access computers and drive powered wheelchairs using their volitional tongue movements. TDS offers six discrete commands, simultaneously available to users, for pointing and typing as substitutes for the mouse and keyboard in computer access. To enhance TDS performance in typing, we added a microphone, an audio codec, and a wireless audio link to its readily available 3-axial magnetic sensor array, and combined it with commercially available speech recognition software, Dragon NaturallySpeaking, which is regarded as one of the most efficient tools for text entry. Our preliminary evaluations indicate that the combined TDS and speech recognition technologies can provide end users with significantly higher performance than using each technology alone, particularly in completing tasks that require both pointing and text entry, such as web surfing.
Transfer Learning for Improved Audio-Based Human Activity Recognition.
Ntalampiras, Stavros; Potamitis, Ilyas
2018-06-25
Human activities are accompanied by characteristic sound events, the processing of which might provide valuable information for automated human activity recognition. This paper presents a novel approach addressing the case where one or more human activities are associated with limited audio data, resulting in a potentially highly imbalanced dataset. Data augmentation is based on transfer learning; more specifically, the proposed method: (a) identifies the classes which are statistically close to the ones associated with limited data; (b) learns a multiple input, multiple output transformation; and (c) transforms the data of the closest classes so that it can be used for modeling the ones associated with limited data. Furthermore, the proposed framework includes a feature set extracted out of signal representations of diverse domains, i.e., temporal, spectral, and wavelet. Extensive experiments demonstrate the relevance of the proposed data augmentation approach under a variety of generative recognition schemes.
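The three steps (a)-(c) lend themselves to a compact sketch. Everything below is illustrative: the abstract does not pin down the statistical distance or the form of the multiple-input, multiple-output transformation, so Euclidean distance between class means and a least-squares linear map stand in for them.

```python
import numpy as np

def augment_rare_class(X_by_class, rare):
    """X_by_class: dict mapping class label -> (n_samples, n_features) array."""
    mu = {c: X.mean(axis=0) for c, X in X_by_class.items()}
    # (a) find the class statistically closest to the under-represented one
    donor = min((c for c in X_by_class if c != rare),
                key=lambda c: np.linalg.norm(mu[c] - mu[rare]))
    Xd, Xr = X_by_class[donor], X_by_class[rare]
    n = min(len(Xd), len(Xr))
    # (b) learn a multiple-input, multiple-output (here: linear) map donor -> rare
    W, *_ = np.linalg.lstsq(Xd[:n], Xr[:n], rcond=None)
    # (c) transform all donor data into synthetic samples for the rare class
    return Xd @ W
```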
Single-sensor multispeaker listening with acoustic metamaterials
Xie, Yangbo; Tsai, Tsung-Han; Konneker, Adam; Popa, Bogdan-Ioan; Brady, David J.; Cummer, Steven A.
2015-01-01
Designing a “cocktail party listener” that functionally mimics the selective perception of a human auditory system has been pursued over the past decades. By exploiting acoustic metamaterials and compressive sensing, we present here a single-sensor listening device that separates simultaneous overlapping sounds from different sources. The device with a compact array of resonant metamaterials is demonstrated to distinguish three overlapping and independent sources with 96.67% correct audio recognition. Segregation of the audio signals is achieved using physical layer encoding without relying on source characteristics. This hardware approach to multichannel source separation can be applied to robust speech recognition and hearing aids and may be extended to other acoustic imaging and sensing applications. PMID:26261314
Blind speech separation system for humanoid robot with FastICA for audio filtering and separation
NASA Astrophysics Data System (ADS)
Budiharto, Widodo; Santoso Gunawan, Alexander Agung
2016-07-01
Nowadays, there are many developments in building intelligent humanoid robots, mainly to handle voice and images. In this research, we propose a blind speech separation system using FastICA for audio filtering and separation that can be used in education or entertainment. Our main problem is to separate multiple speech sources and to filter out irrelevant noise. After the speech separation step, the results are integrated with our previous speech and face recognition system, which is based on the Bioloid GP robot with a Raspberry Pi 2 as controller. The experimental results show that the accuracy of our blind speech separation system is about 88% in command and query recognition cases.
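For readers unfamiliar with the separation step, a minimal FastICA example with scikit-learn is shown below. The synthetic two-source mixture is purely illustrative and unrelated to the robot's actual microphone setup.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)
s1 = np.sin(2 * np.pi * 440 * t)                 # tone "speaker"
s2 = np.sign(np.sin(2 * np.pi * 3 * t))          # square-wave "speaker"
S = np.c_[s1, s2] + 0.05 * rng.standard_normal((t.size, 2))
A = np.array([[1.0, 0.6], [0.4, 1.0]])           # unknown mixing matrix
X = S @ A.T                                      # two observed mixtures

ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)                     # sources, up to scale and order
```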
NASA Astrophysics Data System (ADS)
Campo, D.; Quintero, O. L.; Bastidas, M.
2016-04-01
We propose a study of the mathematical properties of voice as an audio signal. This work includes signals in which the channel conditions are not ideal for emotion recognition. Multiresolution analysis (the discrete wavelet transform) was performed using the Daubechies wavelet family (Db1/Haar, Db6, Db8, Db10), allowing the decomposition of the initial audio signal into sets of coefficients on which a set of features was extracted and analyzed statistically in order to differentiate emotional states. Artificial neural networks (ANNs) proved to be a system that allows an appropriate classification of such states. This study shows that the features extracted using wavelet decomposition are enough to analyze and extract emotional content in audio signals, yielding a high accuracy rate in classification of emotional states without the need for other classical frequency-time features. Accordingly, this paper seeks to characterize mathematically the six basic emotions in humans (boredom, disgust, happiness, anxiety, anger and sadness), plus neutrality, for a total of seven states to identify.
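A hedged sketch of the decomposition-plus-statistics stage, assuming the PyWavelets library is available; the particular statistics (mean, standard deviation, energy, and an entropy term per coefficient band) are one plausible choice rather than the study's exact feature set.

```python
import numpy as np
import pywt

def wavelet_features(signal, wavelet="db6", level=4):
    """Statistical features from each set of DWT coefficients."""
    feats = []
    for c in pywt.wavedec(signal, wavelet, level=level):   # approx + detail bands
        energy = np.sum(c ** 2) + 1e-12
        p = c ** 2 / energy                                # band energy distribution
        feats += [c.mean(), c.std(), energy, -np.sum(p * np.log(p + 1e-12))]
    return np.array(feats)   # input to an ANN classifier of emotional states
```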
Audiovisual cues benefit recognition of accented speech in noise but not perceptual adaptation
Banks, Briony; Gowen, Emma; Munro, Kevin J.; Adank, Patti
2015-01-01
Perceptual adaptation allows humans to recognize different varieties of accented speech. We investigated whether perceptual adaptation to accented speech is facilitated if listeners can see a speaker’s facial and mouth movements. In Study 1, participants listened to sentences in a novel accent and underwent a period of training with audiovisual or audio-only speech cues, presented in quiet or in background noise. A control group also underwent training with visual-only (speech-reading) cues. We observed no significant difference in perceptual adaptation between any of the groups. To address a number of remaining questions, we carried out a second study using a different accent, speaker and experimental design, in which participants listened to sentences in a non-native (Japanese) accent with audiovisual or audio-only cues, without separate training. Participants’ eye gaze was recorded to verify that they looked at the speaker’s face during audiovisual trials. Recognition accuracy was significantly better for audiovisual than for audio-only stimuli; however, no statistical difference in perceptual adaptation was observed between the two modalities. Furthermore, Bayesian analysis suggested that the data supported the null hypothesis. Our results suggest that although the availability of visual speech cues may be immediately beneficial for recognition of unfamiliar accented speech in noise, it does not improve perceptual adaptation. PMID:26283946
Rainsford, M; Palmer, M A; Paine, G
2018-04-01
Despite numerous innovative studies, rates of replication in the field of music psychology are extremely low (Frieler et al., 2013). Two key methodological challenges affecting researchers wishing to administer and reproduce studies in music cognition are the difficulty of measuring musical responses, particularly when conducting free-recall studies, and access to a reliable set of novel stimuli unrestricted by copyright or licensing issues. In this article, we propose a solution for these challenges in computer-based administration. We present a computer-based application for testing memory for melodies. Created using the software Max/MSP (Cycling '74, 2014a), the MUSOS (Music Software System) Toolkit uses a simple modular framework configurable for testing common paradigms such as recall, old-new recognition, and stem completion. The program is accompanied by a stimulus set of 156 novel, copyright-free melodies, in audio and Max/MSP file formats. Two pilot tests were conducted to establish the properties of the accompanying stimulus set that are relevant to music cognition and general memory research. By using this software, a researcher without specialist musical training may administer and accurately measure responses from common paradigms used in the study of memory for music.
Spatialized audio improves call sign recognition during multi-aircraft control.
Kim, Sungbin; Miller, Michael E; Rusnock, Christina F; Elshaw, John J
2018-07-01
We investigated the impact of a spatialized audio display on response time, workload, and accuracy while monitoring auditory information for relevance. The human ability to differentiate sound direction implies that spatial audio may be used to encode information. Therefore, it is hypothesized that spatial audio cues can be applied to aid differentiation of critical versus noncritical verbal auditory information. We used a human performance model and a laboratory study involving 24 participants to examine the effect of applying a notional, automated parser to present audio in a particular ear depending on information relevance. Operator workload and performance were assessed while subjects listened for and responded to relevant audio cues associated with critical information among additional noncritical information. Encoding relevance through spatial location in a spatial audio display system--as opposed to monophonic, binaural presentation--significantly reduced response time and workload, particularly for noncritical information. Future auditory displays employing spatial cues to indicate relevance have the potential to reduce workload and improve operator performance in similar task domains. Furthermore, these displays have the potential to reduce the dependence of workload and performance on the number of audio cues. Published by Elsevier Ltd.
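The core idea, presenting a message to one ear according to its relevance, takes only a few lines. This is a toy sketch: the study's notional automated parser and actual rendering chain are not reproduced, and the hard left-for-critical assignment is an assumption.

```python
import numpy as np

def route_by_relevance(mono, is_critical):
    """Pan a mono message entirely to one ear based on its relevance flag."""
    stereo = np.zeros((mono.size, 2))           # columns: left, right
    stereo[:, 0 if is_critical else 1] = mono   # critical -> left ear (assumed)
    return stereo
```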
Multifunctional microcontrollable interface module
NASA Astrophysics Data System (ADS)
Spitzer, Mark B.; Zavracky, Paul M.; Rensing, Noa M.; Crawford, J.; Hockman, Angela H.; Aquilino, P. D.; Girolamo, Henry J.
2001-08-01
This paper reports the development of a complete eyeglass-mounted computer interface system including display, camera and audio subsystems. The display system provides an SVGA image with a 20 degree horizontal field of view. The camera system has been optimized for face recognition and provides a 19 degree horizontal field of view. A microphone and built-in pre-amp optimized for voice recognition and a speaker on an articulated arm are included for audio. An important feature of the system is a high degree of adjustability and reconfigurability. The system has been developed for testing by the Military Police, in a complete system comprising the eyeglass-mounted interface, a wearable computer, and an RF link. Details of the design, construction, and performance of the eyeglass-based system are discussed.
Two Stage Data Augmentation for Low Resourced Speech Recognition (Author’s Manuscript)
2016-09-12
speech recognition, deep neural networks, data augmentation. 1. Introduction: When training data is limited, whether it be audio or text, the obvious... Schwartz, and S. Tsakalidis, "Enhancing low resource keyword spotting with automatically retrieved web documents," in Interspeech, 2015, pp. 839-843. [2]... and F. Seide, "Feature learning in deep neural networks - a study on speech recognition tasks," in International Conference on Learning Representations
How Deep Neural Networks Can Improve Emotion Recognition on Video Data
2016-09-25
HOW DEEP NEURAL NETWORKS CAN IMPROVE EMOTION RECOGNITION ON VIDEO DATA Pooya Khorrami1, Tom Le Paine1, Kevin Brady2, Charlie Dagli2, Thomas S...this work, we present a system that performs emotion recognition on video data using both convolutional neural networks (CNNs) and recurrent neural networks (RNNs). We present our findings on videos from the Audio/Visual+Emotion Challenge (AV+EC2015). In our experiments, we analyze the effects
Clearing the skies over modular polyketide synthases.
Sherman, David H; Smith, Janet L
2006-09-19
Modular polyketide synthases (PKSs) are large multifunctional proteins that synthesize complex polyketide metabolites in microbial cells. A series of recent studies confirm the close protein structural relationship between catalytic domains in the type I mammalian fatty acid synthase (FAS) and the basic synthase unit of the modular PKS. They also establish a remarkable similarity in the overall organization of the type I FAS and the PKS module. This information provides important new conclusions about catalytic domain architecture, function, and molecular recognition that are essential for future efforts to engineer useful polyketide metabolites with valuable biological activities.
Automatic lip reading by using multimodal visual features
NASA Astrophysics Data System (ADS)
Takahashi, Shohei; Ohya, Jun
2013-12-01
Speech recognition has been researched for a long time, but it does not work well in noisy places such as in a car or on a train. In addition, people who are hearing-impaired or have difficulty hearing cannot benefit from speech recognition. To recognize speech automatically, visual information is also important: people understand speech not only from audio information, but also from visual information such as temporal changes in lip shape. A vision-based speech recognition method could work well in noisy places and could also be useful for people with hearing disabilities. In this paper, we propose an automatic lip-reading method for recognizing speech using multimodal visual information, without using any audio information. First, the Active Shape Model (ASM) is used to track and detect the face and lips in a video sequence. Second, the shape, optical flow and spatial frequencies of the lip features are extracted from the lips detected by the ASM. Next, the extracted multimodal features are ordered chronologically and a Support Vector Machine is used to learn and classify the spoken words. Experiments on classifying several words show promising results for the proposed method.
Rasmussen, Luke V; Peissig, Peggy L; McCarty, Catherine A; Starren, Justin
2012-06-01
Although the penetration of electronic health records is increasing rapidly, much of the historical medical record is only available in handwritten notes and forms, which require labor-intensive, human chart abstraction for some clinical research. The few previous studies on automated extraction of data from these handwritten notes have focused on monolithic, custom-developed recognition systems or third-party systems that require proprietary forms. We present an optical character recognition processing pipeline, which leverages the capabilities of existing third-party optical character recognition engines, and provides the flexibility offered by a modular custom-developed system. The system was configured and run on a selected set of form fields extracted from a corpus of handwritten ophthalmology forms. The processing pipeline allowed multiple configurations to be run, with the optimal configuration consisting of the Nuance and LEADTOOLS engines running in parallel with a positive predictive value of 94.6% and a sensitivity of 13.5%. While limitations exist, preliminary experience from this project yielded insights on the generalizability and applicability of integrating multiple, inexpensive general-purpose third-party optical character recognition engines in a modular pipeline.
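A minimal sketch of such a modular multi-engine pipeline follows. The `recognize` method is a hypothetical stand-in for the proprietary Nuance and LEADTOOLS APIs, and the agreement rule is just one way to trade sensitivity for positive predictive value, in the spirit of the figures reported above.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def run_field(engines, field_image):
    """Run all OCR engines on one form field in parallel and vote."""
    with ThreadPoolExecutor() as pool:
        readings = list(pool.map(lambda e: e.recognize(field_image), engines))
    votes = Counter(r for r in readings if r)
    if not votes:
        return None                            # no engine produced a reading
    text, count = votes.most_common(1)[0]
    # Accept a value only when at least two engines agree: high precision,
    # at the cost of sensitivity, mirroring the behavior described above.
    return text if count >= 2 else None
```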
FIRRE command and control station (C2)
NASA Astrophysics Data System (ADS)
Laird, R. T.; Kramer, T. A.; Cruickshanks, J. R.; Curd, K. M.; Thomas, K. M.; Moneyhun, J.
2006-05-01
The Family of Integrated Rapid Response Equipment (FIRRE) is an advanced technology demonstration program intended to develop a family of affordable, scalable, modular, and logistically supportable unmanned systems to meet urgent operational force protection needs and requirements worldwide. The near-term goal is to provide the best available unmanned ground systems to the warfighter in Iraq and Afghanistan. The overarching long-term goal is to develop a fully-integrated, layered force protection system of systems for our forward deployed forces that is networked with the future force C4ISR systems architecture. The intent of the FIRRE program is to reduce manpower requirements, enhance force protection capabilities, and reduce casualties through the use of unmanned systems. FIRRE is sponsored by the Office of the Under Secretary of Defense, Acquisitions, Technology and Logistics (OUSD AT&L), and is managed by the Product Manager, Force Protection Systems (PM-FPS). The FIRRE Command and Control (C2) Station supports two operators, hosts the Joint Battlespace Command and Control Software for Manned and Unmanned Assets (JBC2S), and will be able to host Mission Planning and Rehearsal (MPR) software. The C2 Station consists of an M1152 HMMWV fitted with an S-788 TYPE I shelter. The C2 Station employs five 24" LCD monitors for display of JBC2S software [1], MPR software, and live video feeds from unmanned systems. An audio distribution system allows each operator to select between various audio sources including: AN/PRC-117F tactical radio (SINCGARS compatible), audio prompts from JBC2S software, audio from unmanned systems, audio from other operators, and audio from external sources such as an intercom in an adjacent Tactical Operations Center (TOC). A power distribution system provides battery backup for momentary outages. The Ethernet network, audio distribution system, and audio/video feeds are available for use outside the C2 Station.
Face recognition by applying wavelet subband representation and kernel associative memory.
Zhang, Bai-Ling; Zhang, Haihong; Ge, Shuzhi Sam
2004-01-01
In this paper, we propose an efficient face recognition scheme which has two features: 1) representation of face images by two-dimensional (2-D) wavelet subband coefficients and 2) recognition by a modular, personalised classification method based on kernel associative memory models. Compared to PCA projections and low resolution "thumb-nail" image representations, wavelet subband coefficients can efficiently capture substantial facial features while keeping computational complexity low. As there are usually very limited samples, we constructed an associative memory (AM) model for each person and proposed to improve the performance of AM models by kernel methods. Specifically, we first applied kernel transforms to each possible pair of training face samples and then mapped the high-dimensional feature space back to input space. Our scheme using modular autoassociative memory for face recognition is inspired by the same motivation as using autoencoders for optical character recognition (OCR), for which the advantages have been proven. By associative memory, all the prototypical faces of one particular person are used to reconstruct themselves, and the reconstruction error for a probe face image is used to decide whether the probe face is from the corresponding person. We carried out extensive experiments on three standard face recognition datasets, the FERET data, the XM2VTS data, and the ORL data. Detailed comparisons with earlier published results are provided, and our proposed scheme offers better recognition accuracy on all of the face datasets.
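To make the decision rule concrete, here is a linear (non-kernel) sketch of per-person autoassociative memories classifying by reconstruction error. The paper's kernel transform over training pairs is omitted, so this is a simplified illustration rather than the published method.

```python
import numpy as np

class AssociativeMemory:
    """One person's memory: prototypes reconstruct themselves via projection."""
    def fit(self, X):                            # X: (n_prototypes, n_features)
        self.W = X.T @ np.linalg.pinv(X.T)       # projector onto prototype subspace
        return self

    def error(self, probe):                      # reconstruction error of a probe
        return float(np.linalg.norm(probe - self.W @ probe))

def identify(memories, probe):
    """Attribute the probe face to the person whose memory rebuilds it best."""
    return min(memories, key=lambda person: memories[person].error(probe))
```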
The Benefit of Remote Microphones Using Four Wireless Protocols.
Rodemerk, Krishna S; Galster, Jason A
2015-09-01
Many studies have reported the speech recognition benefits of a personal remote microphone system when used by adult listeners with hearing loss. The advance of wireless technology has allowed for many wireless audio transmission protocols, some of which interface with commercially available hearing aids. As a result, commercial remote microphone systems use a variety of different protocols for wireless audio transmission. It is not known how these systems compare with regard to adult speech recognition in noise. The primary goal of this investigation was to determine the speech recognition benefits of four different commercially available remote microphone systems, each with a different wireless audio transmission protocol. A repeated-measures design was used in this study. Sixteen adults, ages 52 to 81 years, with mild to severe sensorineural hearing loss participated in this study. Participants were fit with three different sets of bilateral hearing aids and four commercially available remote microphone systems (FM, 900 MHz, 2.4 GHz, and Bluetooth® paired with near-field magnetic induction). Speech recognition scores were measured by an adaptive version of the Hearing in Noise Test (HINT). The participants were seated both 6 and 12 feet away from the talker loudspeaker. Participants repeated HINT sentences with and without hearing aids and with the four commercially available remote microphone systems in both seated positions, with and without contributions from the hearing aid or environmental microphone (24 total conditions). The HINT SNR-50, or the signal-to-noise ratio required for correct repetition of 50% of the sentences, was recorded for all conditions. A one-way repeated-measures analysis of variance was used to determine the statistical significance of microphone condition. The results of this study revealed that use of the remote microphone systems statistically improved speech recognition in noise relative to unaided and hearing aid-only conditions across all four wireless transmission protocols at 6 and 12 feet away from the talker. Participants showed a significant improvement in speech recognition in noise when comparing the four remote microphone systems with different wireless transmission methods to hearing aids alone. American Academy of Audiology.
Automated Cough Assessment on a Mobile Platform
2014-01-01
The development of an Automated System for Asthma Monitoring (ADAM) is described. This consists of a consumer electronics mobile platform running a custom application. The application acquires an audio signal from an external user-worn microphone connected to the device analog-to-digital converter (microphone input). This signal is processed to determine the presence or absence of cough sounds. Symptom tallies and raw audio waveforms are recorded and made easily accessible for later review by a healthcare provider. The symptom detection algorithm is based upon standard speech recognition and machine learning paradigms and consists of an audio feature extraction step followed by a Hidden Markov Model based Viterbi decoder that has been trained on a large database of audio examples from a variety of subjects. Multiple Hidden Markov Model topologies and orders are studied. Performance of the recognizer is presented in terms of the sensitivity and the rate of false alarm as determined in a cross-validation test. PMID:25506590
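A sketch of the decoding stage, assuming the hmmlearn library and a precomputed frame-level feature matrix. The real recognizer is trained on a large labeled audio database; the toy fit below is unsupervised on random stand-in features, so only the Viterbi mechanics carry over.

```python
import numpy as np
from hmmlearn import hmm

features = np.random.randn(500, 13)      # stand-in for frame-level audio features
model = hmm.GaussianHMM(n_components=2, covariance_type="diag", n_iter=20)
model.fit(features)                      # ADAM instead trains on labeled examples
log_prob, states = model.decode(features, algorithm="viterbi")
# `states` assigns each frame a hidden state; the state that aligns with
# labeled cough audio during training is reported as a cough event.
```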
Simpson, Claire; Pinkham, Amy E; Kelsven, Skylar; Sasson, Noah J
2013-12-01
Emotion can be expressed by both the voice and face, and previous work suggests that presentation modality may impact emotion recognition performance in individuals with schizophrenia. We investigated the effect of stimulus modality on emotion recognition accuracy and the potential role of visual attention to faces in emotion recognition abilities. Thirty-one patients who met DSM-IV criteria for schizophrenia (n=8) or schizoaffective disorder (n=23) and 30 non-clinical control individuals participated. Both groups identified emotional expressions in three different conditions: audio only, visual only, combined audiovisual. In the visual only and combined conditions, time spent visually fixating salient features of the face were recorded. Patients were significantly less accurate than controls in emotion recognition during both the audio and visual only conditions but did not differ from controls on the combined condition. Analysis of visual scanning behaviors demonstrated that patients attended less than healthy individuals to the mouth in the visual condition but did not differ in visual attention to salient facial features in the combined condition, which may in part explain the absence of a deficit for patients in this condition. Collectively, these findings demonstrate that patients benefit from multimodal stimulus presentations of emotion and support hypotheses that visual attention to salient facial features may serve as a mechanism for accurate emotion identification. © 2013.
Santos, Rui; Pombo, Nuno; Flórez-Revuelta, Francisco
2018-01-01
An increase in the accuracy of identification of Activities of Daily Living (ADL) is very important for different goals of Enhanced Living Environments and for Ambient Assisted Living (AAL) tasks. This increase may be achieved through identification of the surrounding environment. Although this is usually used to identify the location, ADL recognition can be improved with the identification of the sound in that particular environment. This paper reviews audio fingerprinting techniques that can be used with the acoustic data acquired from mobile devices. A comprehensive literature search was conducted in order to identify relevant English language works aimed at the identification of the environment of ADLs using data acquired with mobile devices, published between 2002 and 2017. In total, 40 studies were analyzed and selected from 115 citations. The results highlight several audio fingerprinting techniques, including Modified discrete cosine transform (MDCT), Mel-frequency cepstrum coefficients (MFCC), Principal Component Analysis (PCA), Fast Fourier Transform (FFT), Gaussian mixture models (GMM), likelihood estimation, logarithmic modulated complex lapped transform (LMCLT), support vector machine (SVM), constant Q transform (CQT), symmetric pairwise boosting (SPB), Philips robust hash (PRH), linear discriminant analysis (LDA) and discrete cosine transform (DCT). PMID:29315232
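As a taste of one reviewed technique, the snippet below computes a crude MFCC-based clip fingerprint with librosa; the file name, clip-level summary, and matching threshold are illustrative assumptions rather than any specific reviewed system.

```python
import numpy as np
import librosa

y, sr = librosa.load("environment_clip.wav", sr=16000)   # hypothetical recording
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)       # (13, n_frames)
fingerprint = mfcc.mean(axis=1)                          # crude clip-level summary

def same_environment(fp_a, fp_b, threshold=25.0):        # threshold is made up
    return np.linalg.norm(fp_a - fp_b) < threshold
```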
NASA Astrophysics Data System (ADS)
Vassiliou, Marius S.; Sundareswaran, Venkataraman; Chen, S.; Behringer, Reinhold; Tam, Clement K.; Chan, M.; Bangayan, Phil T.; McGee, Joshua H.
2000-08-01
We describe new systems for improved integrated multimodal human-computer interaction and augmented reality for a diverse array of applications, including future advanced cockpits, tactical operations centers, and others. We have developed an integrated display system featuring: speech recognition of multiple concurrent users equipped with both standard air-coupled microphones and novel throat-coupled sensors (developed at Army Research Labs for increased noise immunity); lip reading for improving speech recognition accuracy in noisy environments; three-dimensional spatialized audio for improved display of warnings, alerts, and other information; wireless, coordinated handheld-PC control of a large display; real-time display of data and inferences from wireless integrated networked sensors with on-board signal processing and discrimination; gesture control with disambiguated point-and-speak capability; head- and eye-tracking coupled with speech recognition for 'look-and-speak' interaction; and integrated tetherless augmented reality on a wearable computer. The various interaction modalities (speech recognition, 3D audio, eyetracking, etc.) are implemented as 'modality servers' in an Internet-based client-server architecture. Each modality server encapsulates and exposes commercial and research software packages, presenting a socket network interface that is abstracted to a high-level interface, minimizing both vendor dependencies and required changes on the client side as the server's technology improves.
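The 'modality server' pattern, one engine wrapped behind a socket and exposed through a small high-level protocol, can be sketched briefly. The JSON line protocol and the trivial engine below are assumptions; the original system wrapped commercial and research packages in the same spirit.

```python
import json
import socketserver

class ModalityHandler(socketserver.StreamRequestHandler):
    def handle(self):
        request = json.loads(self.rfile.readline())        # one JSON request per line
        result = self.server.engine(request["payload"])    # vendor call hidden here
        self.wfile.write((json.dumps({"result": result}) + "\n").encode())

class ModalityServer(socketserver.ThreadingTCPServer):
    """Wraps one recognition engine so clients see only the socket protocol."""
    def __init__(self, addr, engine):
        self.engine = engine
        super().__init__(addr, ModalityHandler)

if __name__ == "__main__":
    # Serve a stand-in "recognizer" on localhost:9000.
    ModalityServer(("localhost", 9000), engine=str.upper).serve_forever()
```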
Modular Algorithm Testbed Suite (MATS): A Software Framework for Automatic Target Recognition
2017-01-01
OFFICE OF NAVAL RESEARCH, ATTN: JASON STACK, MINE WARFARE & OCEAN ENGINEERING PROGRAMS, CODE 32, SUITE 1092, 875 N RANDOLPH ST, ARLINGTON VA 22203... naval mine countermeasures (MCM) operations by automating a large portion of the data analysis. Successful long-term implementation of ATR requires a... Keywords: Modular Algorithm Testbed Suite; MATS; Mine Countermeasures Operations
Using peptide array to identify binding motifs and interaction networks for modular domains.
Li, Shawn S-C; Wu, Chenggang
2009-01-01
Specific protein-protein interactions underlie all essential biological processes and form the basis of cellular signal transduction. The recognition of a short, linear peptide sequence in one protein by a modular domain in another represents a common theme of macromolecular recognition in cells, and the importance of this mode of protein-protein interaction is highlighted by the large number of peptide-binding domains encoded by the human genome. This phenomenon also provides a unique opportunity to identify protein-protein binding events using peptide arrays and complementary biochemical assays. Accordingly, high-density peptide array has emerged as a useful tool by which to map domain-mediated protein-protein interaction networks at the proteome level. Using the Src-homology 2 (SH2) and 3 (SH3) domains as examples, we describe the application of oriented peptide array libraries in uncovering specific motifs recognized by an SH2 domain and the use of high-density peptide arrays in identifying interaction networks mediated by the SH3 domain. Methods reviewed here could also be applied to other modular domains, including catalytic domains, that recognize linear peptide sequences.
Modular Activating Receptors in Innate and Adaptive Immunity.
Berry, Richard; Call, Matthew E
2017-03-14
Triggering of cell-mediated immunity is largely dependent on the recognition of foreign or abnormal molecules by a myriad of cell surface-bound receptors. Many activating immune receptors do not possess any intrinsic signaling capacity but instead form noncovalent complexes with one or more dimeric signaling modules that communicate with a common set of kinases to initiate intracellular information-transfer pathways. This modular architecture, where the ligand binding and signaling functions are detached from one another, is a common theme that is widely employed throughout the innate and adaptive arms of immune systems. The evolutionary advantages of this highly adaptable platform for molecular recognition are visible in the variety of ligand-receptor interactions that can be linked to common signaling pathways, the diversification of receptor modules in response to pathogen challenges, and the amplification of cellular responses through incorporation of multiple signaling motifs. Here we provide an overview of the major classes of modular activating immune receptors and outline the current state of knowledge regarding how these receptors assemble, recognize their ligands, and ultimately trigger intracellular signal transduction pathways that activate immune cell effector functions.
A microcomputer interface for a digital audio processor-based data recording system.
Croxton, T L; Stump, S J; Armstrong, W M
1987-10-01
An inexpensive interface is described that performs direct transfer of digitized data from the digital audio processor and video cassette recorder based data acquisition system designed by Bezanilla (1985, Biophys. J., 47:437-441) to an IBM PC/XT microcomputer. The FORTRAN callable software that drives this interface is capable of controlling the video cassette recorder and starting data collection immediately after recognition of a segment of previously collected data. This permits piecewise analysis of long intervals of data that would otherwise exceed the memory capability of the microcomputer.
Smartphone based face recognition tool for the blind.
Kramer, K M; Hedin, D S; Rolkosky, D J
2010-01-01
The inability to identify people during group meetings is a disadvantage for blind people in many professional and educational situations. To explore the efficacy of face recognition using smartphones in these settings, we have prototyped and tested a face recognition tool for blind users. The tool utilizes smartphone technology in conjunction with a wireless network to provide audio feedback identifying the people in front of the blind user. Testing indicated that the face recognition technology can tolerate up to a 40 degree angle between the direction a person is looking and the camera's axis, and achieved a 96% success rate with no false positives. Future work will further develop the technology for local face recognition on the smartphone in addition to remote server-based face recognition.
Automatic image database generation from CAD for 3D object recognition
NASA Astrophysics Data System (ADS)
Sardana, Harish K.; Daemi, Mohammad F.; Ibrahim, Mohammad K.
1993-06-01
The development and evaluation of multiple-view 3-D object recognition systems is based on a large set of model images. Due to the various advantages of using CAD, it is becoming more and more practical to use existing CAD data in computer vision systems. Current PC-level CAD systems are capable of providing physical image modelling and rendering involving positional variations in cameras, light sources, etc. We have formulated a modular scheme for automatic generation of various aspects (views) of the objects in a model-based 3-D object recognition system. These views are generated at desired orientations on the unit Gaussian sphere. With a suitable network file sharing system (NFS), the images can be stored directly in a database located on a file server. This paper presents image modelling solutions using CAD in relation to the multiple-view approach. Our modular scheme for data conversion and automatic image database storage for such a system is discussed. We have used this approach in 3-D polyhedron recognition. An overview of the results, advantages and limitations of using CAD data, and conclusions from using such a scheme are also presented.
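One simple way to place view orientations on the unit Gaussian sphere is a Fibonacci lattice, sketched below. The paper does not state its sampling scheme, so this is an assumed stand-in that yields roughly uniform camera directions for rendering each aspect.

```python
import numpy as np

def sphere_viewpoints(n):
    """Return n roughly uniform unit vectors (camera directions) on the sphere."""
    i = np.arange(n)
    phi = np.arccos(1 - 2 * (i + 0.5) / n)   # polar angle
    theta = np.pi * (1 + 5 ** 0.5) * i       # golden-angle azimuth
    return np.c_[np.sin(phi) * np.cos(theta),
                 np.sin(phi) * np.sin(theta),
                 np.cos(phi)]                # shape (n, 3)
```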
Robotics control using isolated word recognition of voice input
NASA Technical Reports Server (NTRS)
Weiner, J. M.
1977-01-01
A speech input/output system is presented that can be used to communicate with a task oriented system. Human speech commands and synthesized voice output extend conventional information exchange capabilities between man and machine by utilizing audio input and output channels. The speech input facility is comprised of a hardware feature extractor and a microprocessor implemented isolated word or phrase recognition system. The recognizer offers a medium sized (100 commands), syntactically constrained vocabulary, and exhibits close to real time performance. The major portion of the recognition processing required is accomplished through software, minimizing the complexity of the hardware feature extractor.
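Isolated-word recognizers of this era typically matched an utterance's feature sequence against stored templates; the dynamic-time-warping sketch below illustrates that style of matching. It is a period-typical stand-in, not an algorithm taken from the paper itself.

```python
import numpy as np

def dtw(a, b):
    """DTW distance between two (frames, features) arrays."""
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[len(a), len(b)]

def recognize(templates, utterance):
    """templates: dict mapping command word -> stored feature sequence."""
    return min(templates, key=lambda w: dtw(templates[w], utterance))
```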
NASA Astrophysics Data System (ADS)
Costache, G. N.; Gavat, I.
2004-09-01
Along with the aggressive growth of the amount of digital data available (text, audio samples, digital photos and digital movies, joined in the multimedia domain), the need for classification, recognition and retrieval of this kind of data has become very important. In this paper, a system structure to handle multimedia data from a recognition perspective is presented. The main processing steps for the multimedia objects of interest are: first, parameterization by analysis, in order to obtain a feature-based description forming the parameter vector; second, classification, generally with a hierarchical structure, to make the necessary decisions. For audio signals, both speech and music, the derived perceptual features are the mel-cepstral coefficients (MFCC) and the perceptual linear predictive (PLP) coefficients. For images, the derived features are the geometric parameters of the speaker's mouth. The hierarchical classifier generally consists of a clustering stage, based on Kohonen Self-Organizing Maps (SOM), and a final stage based on a powerful classification algorithm called Support Vector Machines (SVM). The system, in specific variants, is applied with good results to two tasks: the first is bimodal speech recognition, which fuses features obtained from the speech signal with features obtained from the speaker's image; the second is music retrieval from a large music database.
Affective State Level Recognition in Naturalistic Facial and Vocal Expressions.
Meng, Hongying; Bianchi-Berthouze, Nadia
2014-03-01
Naturalistic affective expressions change at a rate much slower than the typical rate at which video or audio is recorded. This increases the probability that consecutive recorded instants of expressions represent the same affective content. In this paper, we exploit such a relationship to improve the recognition performance of continuous naturalistic affective expressions. Using datasets of naturalistic affective expressions (AVEC 2011 audio and video dataset, PAINFUL video dataset) continuously labeled over time and over different dimensions, we analyze the transitions between levels of those dimensions (e.g., transitions in pain intensity level). We use an information theory approach to show that the transitions occur very slowly and hence suggest modeling them as first-order Markov models. The dimension levels are considered to be the hidden states in the Hidden Markov Model (HMM) framework. Their discrete transition and emission matrices are trained by using the labels provided with the training set. The recognition problem is converted into a best path-finding problem to obtain the best hidden states sequence in HMMs. This is a key difference from previous use of HMMs as classifiers. Modeling of the transitions between dimension levels is integrated in a multistage approach, where the first level performs a mapping between the affective expression features and a soft decision value (e.g., an affective dimension level), and further classification stages are modeled as HMMs that refine that mapping by taking into account the temporal relationships between the output decision labels. The experimental results for each of the unimodal datasets show overall performance to be significantly above that of a standard classification system that does not take into account temporal relationships. In particular, the results on the AVEC 2011 audio dataset outperform all other systems presented at the international competition.
Visual face-movement sensitive cortex is relevant for auditory-only speech recognition.
Riedel, Philipp; Ragert, Patrick; Schelinski, Stefanie; Kiebel, Stefan J; von Kriegstein, Katharina
2015-07-01
It is commonly assumed that the recruitment of visual areas during audition is not relevant for performing auditory tasks ('auditory-only view'). According to an alternative view, however, the recruitment of visual cortices is thought to optimize auditory-only task performance ('auditory-visual view'). This alternative view is based on functional magnetic resonance imaging (fMRI) studies. These studies have shown, for example, that even if there is only auditory input available, face-movement sensitive areas within the posterior superior temporal sulcus (pSTS) are involved in understanding what is said (auditory-only speech recognition). This is particularly the case when speakers are known audio-visually, that is, after brief voice-face learning. Here we tested whether the left pSTS involvement is causally related to performance in auditory-only speech recognition when speakers are known by face. To test this hypothesis, we applied cathodal transcranial direct current stimulation (tDCS) to the pSTS during (i) visual-only speech recognition of a speaker known only visually to participants and (ii) auditory-only speech recognition of speakers they learned by voice and face. We defined the cathode as active electrode to down-regulate cortical excitability by hyperpolarization of neurons. tDCS to the pSTS interfered with visual-only speech recognition performance compared to a control group without pSTS stimulation (tDCS to BA6/44 or sham). Critically, compared to controls, pSTS stimulation additionally decreased auditory-only speech recognition performance selectively for voice-face learned speakers. These results are important in two ways. First, they provide direct evidence that the pSTS is causally involved in visual-only speech recognition; this confirms a long-standing prediction of current face-processing models. Secondly, they show that visual face-sensitive pSTS is causally involved in optimizing auditory-only speech recognition. These results are in line with the 'auditory-visual view' of auditory speech perception, which assumes that auditory speech recognition is optimized by using predictions from previously encoded speaker-specific audio-visual internal models. Copyright © 2015 Elsevier Ltd. All rights reserved.
Information system for diagnosis of respiratory system diseases
NASA Astrophysics Data System (ADS)
Abramov, G. V.; Korobova, L. A.; Ivashin, A. L.; Matytsina, I. A.
2018-05-01
An information system is presented for the diagnosis of patients with lung diseases. The main problem solved by this system is the determination of the parameters of cough fragments in monitoring recordings made with a voice recorder. The authors give recognition criteria for recorded cough events and an analysis of the audio recordings. The results of the research are systematized. The cough recognition system can be used by medical specialists to diagnose the condition of patients and to monitor the process of their treatment.
Large Vocabulary Audio-Visual Speech Recognition
2002-06-12
www.is.cs.cmu.edu Email: waibel@cs.cmu.edu Interactive Systems Labs... Meeting Browser - Interpreting Human Communication "Why did... Speech Interaction... Focus of Attention Tracking... Conclusion - Complete Model of Human Communication is Needed - Include all
Speech to Text Translation for Malay Language
NASA Astrophysics Data System (ADS)
Al-khulaidi, Rami Ali; Akmeliawati, Rini
2017-11-01
A speech recognition system is a front-end and back-end process that receives an audio signal uttered by a speaker and converts it into a text transcription. Speech systems can be used in several fields, including therapeutic technology, education, social robotics and computer entertainment. Our system is proposed for control tasks, where speed of performance and response matter because the system should integrate with other controlling platforms, such as voice-controlled robots. This creates a need for flexible platforms that can easily be edited to fit the functionality of the surroundings, unlike software programs such as MATLAB and Phoenix that require recording audio and multiple training passes for every entry. In this paper, a speech recognition system for the Malay language is implemented using Microsoft Visual Studio C#. Ninety (90) Malay phrases were tested by ten (10) speakers of both genders in different contexts. The result shows that the overall accuracy (calculated from the confusion matrix) is satisfactory at 92.69%.
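For clarity, the reported overall accuracy is the diagonal of the confusion matrix (correct decisions) divided by all trials; the small matrix below is made up purely to show the arithmetic.

```python
import numpy as np

cm = np.array([[28, 1, 1],      # rows: true phrase class (illustrative values)
               [2, 27, 1],      # cols: recognized class
               [1, 1, 28]])
accuracy = np.trace(cm) / cm.sum()   # about 0.92 for this made-up matrix
```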
Sex differences in the ability to recognise non-verbal displays of emotion: a meta-analysis.
Thompson, Ashley E; Voyer, Daniel
2014-01-01
The present study aimed to quantify the magnitude of sex differences in humans' ability to accurately recognise non-verbal emotional displays. Studies of relevance were those that required explicit labelling of discrete emotions presented in the visual and/or auditory modality. A final set of 551 effect sizes from 215 samples was included in a multilevel meta-analysis. The results showed a small overall advantage in favour of females on emotion recognition tasks (d=0.19). However, the magnitude of that sex difference was moderated by several factors, namely specific emotion, emotion type (negative, positive), sex of the actor, sensory modality (visual, audio, audio-visual) and age of the participants. Method of presentation (computer, slides, print, etc.), type of measurement (response time, accuracy) and year of publication did not significantly contribute to variance in effect sizes. These findings are discussed in the context of social and biological explanations of sex differences in emotion recognition.
Four-Channel Biosignal Analysis and Feature Extraction for Automatic Emotion Recognition
NASA Astrophysics Data System (ADS)
Kim, Jonghwa; André, Elisabeth
This paper investigates the potential of physiological signals as a reliable channel for automatic recognition of a user's emotional state. For emotion recognition, little attention has been paid so far to physiological signals compared to audio-visual emotion channels such as facial expression or speech. All essential stages of an automatic recognition system using biosignals are discussed, from recording the physiological dataset up to feature-based multiclass classification. Four-channel biosensors are used to measure electromyogram, electrocardiogram, skin conductivity and respiration changes. A wide range of physiological features from various analysis domains, including time/frequency, entropy, geometric analysis, subband spectra, multiscale entropy, etc., is proposed in order to find the best emotion-relevant features and to correlate them with emotional states. The best features extracted are specified in detail and their effectiveness is proven by emotion recognition results.
Modularity of music: evidence from a case of pure amusia.
Piccirilli, M; Sciarma, T; Luzzi, S
2000-10-01
A case of pure amusia in a 20-year-old, left-handed, non-professional musician is reported. The patient showed an impairment of musical abilities in the presence of normal processing of speech and environmental sounds. Furthermore, whereas recognition and production of melodic sequences were grossly disturbed, both the recognition and production of rhythm patterns were preserved. This selective breakdown pattern was produced by a focal lesion in the left superior temporal gyrus. This case thus suggests that not only linguistic and musical skills, but also melodic and rhythmic processing, are independent of each other. This functional dissociation in the musical domain supports the hypothesis that music components have a modular organisation. Furthermore, there is the suggestion that amusia may be produced by a lesion located strictly in one hemisphere and that the superior temporal gyrus plays a crucial part in melodic processing.
Introducing a modular activity monitoring system.
Reiss, Attila; Stricker, Didier
2011-01-01
In this paper, the idea of a modular activity monitoring system is introduced. By using different combinations of the system's three modules, different functionality becomes available: 1) a coarse intensity estimation of physical activities 2) different features based on HR-data and 3) the recognition of basic activities and postures. 3D-accelerometers--placed on lower arm, chest and foot--and a heart rate monitor were used as sensors. A dataset with 8 subjects and 14 different activities was recorded to evaluate the performance of the system. The overall performance on the intensity estimation task, relying on the chest-worn accelerometer and the HR-monitor, was 94.37%. The overall performance on the activity recognition task, using all three accelerometer placements and the HR-monitor, was 90.65%. This paper also gives an analysis of the importance of different accelerometer placements and the importance of a HR-monitor for both tasks.
Audio feature extraction using probability distribution function
NASA Astrophysics Data System (ADS)
Suhaib, A.; Wan, Khairunizam; Aziz, Azri A.; Hazry, D.; Razlan, Zuradzman M.; Shahriman A., B.
2015-05-01
Voice recognition has been one of the most popular applications in the robotics field. It is also known to be used recently in biometric and multimedia information retrieval systems. This technology stems from successive research on audio feature extraction analysis. The probability distribution function (PDF) is a statistical method which is usually used as one of the processes in complex feature extraction methods such as GMM and PCA. In this paper, a new method for audio feature extraction is proposed which uses the PDF alone as the feature extraction method for speech analysis. Certain pre-processing techniques are performed prior to the proposed feature extraction method. Subsequently, the PDF values for each frame of sampled voice signals, obtained from a number of individuals, are plotted. From the experimental results, it can be seen visually from the plotted data that each individual's voice has comparable PDF values and shapes.
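A minimal sketch of using a histogram-estimated PDF per frame as the feature itself, as proposed above. The frame length and bin count are assumptions, and the pre-processing steps are omitted.

```python
import numpy as np

def frame_pdfs(signal, frame_len=400, bins=32):
    """Return one normalized-histogram PDF estimate per frame: (n_frames, bins)."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    edges = np.linspace(signal.min(), signal.max(), bins + 1)   # shared bin edges
    return np.stack([np.histogram(f, bins=edges, density=True)[0]
                     for f in frames])
```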
Unsupervised real-time speaker identification for daily movies
NASA Astrophysics Data System (ADS)
Li, Ying; Kuo, C.-C. Jay
2002-07-01
The problem of identifying speakers for movie content analysis is addressed in this paper. While most previous work on speaker identification was carried out in a supervised mode using pure audio data, more robust results can be obtained in real-time by integrating knowledge from multiple media sources in an unsupervised mode. In this work, both audio and visual cues will be employed and subsequently combined in a probabilistic framework to identify speakers. Particularly, audio information is used to identify speakers with a maximum likelihood (ML)-based approach while visual information is adopted to distinguish speakers by detecting and recognizing their talking faces based on face detection/recognition and mouth tracking techniques. Moreover, to accommodate for speakers' acoustic variations along time, we update their models on the fly by adapting to their newly contributed speech data. Encouraging results have been achieved through extensive experiments, which shows a promising future of the proposed audiovisual-based unsupervised speaker identification system.
Savran, Arman; Cao, Houwei; Shah, Miraj; Nenkova, Ani; Verma, Ragini
2013-01-01
We present experiments on fusing facial video, audio and lexical indicators for affect estimation during dyadic conversations. We use temporal statistics of texture descriptors extracted from facial video, a combination of various acoustic features, and lexical features to create regression based affect estimators for each modality. The single modality regressors are then combined using particle filtering, by treating these independent regression outputs as measurements of the affect states in a Bayesian filtering framework, where previous observations provide prediction about the current state by means of learned affect dynamics. Tested on the Audio-visual Emotion Recognition Challenge dataset, our single modality estimators achieve substantially higher scores than the official baseline method for every dimension of affect. Our filtering-based multi-modality fusion achieves correlation performance of 0.344 (baseline: 0.136) and 0.280 (baseline: 0.096) for the fully continuous and word level sub challenges, respectively. PMID:25300451
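A toy version of the filtering step: the per-modality regression outputs at each instant are treated as noisy measurements of one scalar latent affect state tracked by a bootstrap particle filter. The random-walk dynamics and noise levels below are placeholders for the learned affect dynamics described above.

```python
import numpy as np

def particle_filter_fusion(measurements, n_particles=500,
                           proc_std=0.05, meas_std=0.2, seed=0):
    """measurements: (n_steps, n_modalities) array of regressor outputs."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, n_particles)
    estimates = []
    for z in np.atleast_2d(measurements):
        particles = particles + rng.normal(0.0, proc_std, n_particles)  # predict
        sq_err = ((z[None, :] - particles[:, None]) ** 2).sum(axis=1)
        w = np.exp(-0.5 * (sq_err - sq_err.min()) / meas_std ** 2)      # weight
        w /= w.sum()
        estimates.append(float(w @ particles))                          # fuse
        particles = particles[rng.choice(n_particles, n_particles, p=w)]  # resample
    return np.array(estimates)
```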
Using Voice Coils to Actuate Modular Soft Robots: Wormbot, an Example.
Nemitz, Markus P; Mihaylov, Pavel; Barraclough, Thomas W; Ross, Dylan; Stokes, Adam A
2016-12-01
In this study, we present a modular worm-like robot, which utilizes voice coils as a new paradigm in soft robot actuation. Drive electronics are incorporated into the actuators, providing a significant improvement in self-sufficiency when compared with existing soft robot actuation modes such as pneumatics or hydraulics. The body plan of this robot is inspired by the phylum Annelida and consists of three-dimensional printed voice coil actuators, which are connected by flexible silicone membranes. Each electromagnetic actuator engages with its neighbor to compress or extend the membrane of each segment, and the sequence in which they are actuated results in an earthworm-inspired peristaltic motion. We find that a minimum of three segments is required for locomotion, but due to our modular design, robots of any length can be quickly and easily assembled. In addition to actuation, voice coils provide audio input and output capabilities. We demonstrate transmission of data between segments by high-frequency carrier waves and, using a similar mechanism, we note that the passing of power between coupled coils in neighboring modules, or from an external power source, is also possible. Voice coils are a convenient multifunctional alternative to existing soft robot actuators. Their self-contained nature and ability to communicate with each other are ideal for modular robotics, and the additional functionality of sound input/output and power transfer will become increasingly useful as soft robots begin the transition from early proof-of-concept systems toward fully functional and highly integrated robotic systems.
Audio-Visual Speech in Noise Perception in Dyslexia
ERIC Educational Resources Information Center
van Laarhoven, Thijs; Keetels, Mirjam; Schakel, Lemmy; Vroomen, Jean
2018-01-01
Individuals with developmental dyslexia (DD) may experience, besides reading problems, other speech-related processing deficits. Here, we examined the influence of visual articulatory information (lip-read speech) at various levels of background noise on auditory word recognition in children and adults with DD. We found that children with a…
Inferring Speaker Affect in Spoken Natural Language Communication
ERIC Educational Resources Information Center
Pon-Barry, Heather Roberta
2013-01-01
The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…
The distinctiveness heuristic in false recognition and false recall.
McCabe, David P; Smith, Anderson D
2006-07-01
The effects of generative processing on false recognition and recall were examined in four experiments using the Deese-Roediger-McDermott false memory paradigm (Deese, 1959; Roediger & McDermott, 1995). In each experiment, a Generate condition in which subjects generated studied words from audio anagrams was compared to a Control condition in which subjects simply listened to studied words presented normally. Rates of false recognition and false recall were lower for critical lures associated with generated lists, than for critical lures associated with control lists, but only in between-subjects designs. False recall and recognition did not differ when generate and control conditions were manipulated within-subjects. This pattern of results is consistent with the distinctiveness heuristic (Schacter, Israel, & Racine, 1999), a metamemorial decision-based strategy whereby global changes in decision criteria lead to reductions of false memories. This retrieval-based monitoring mechanism appears to operate in a similar fashion in reducing false recognition and false recall.
Audio-based deep music emotion recognition
NASA Astrophysics Data System (ADS)
Liu, Tong; Han, Li; Ma, Liangkai; Guo, Dongwei
2018-05-01
With the rapid development of multimedia networking, more and more songs are issued through the Internet and stored in large digital music libraries. However, music information retrieval on these libraries can be difficult, and the recognition of musical emotion is especially challenging. In this paper, we report a strategy to recognize the emotion contained in songs by classifying their spectrograms, which contain both time and frequency information, with a convolutional neural network (CNN). The experiments conducted on the 1000-song dataset indicate that the proposed model outperforms traditional machine learning methods.
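A minimal PyTorch sketch of a spectrogram classifier in this spirit; the layer sizes, 128x128 input, and four emotion classes are assumptions, not the paper's architecture.

```python
# Minimal CNN over log-spectrogram inputs; sizes are illustrative.
import torch
import torch.nn as nn

class SpectrogramCNN(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 32 * 32, n_classes)

    def forward(self, x):          # x: (batch, 1, 128, 128) log-spectrograms
        h = self.features(x)       # -> (batch, 32, 32, 32)
        return self.classifier(h.flatten(1))

model = SpectrogramCNN()
logits = model(torch.randn(8, 1, 128, 128))
print(logits.shape)  # torch.Size([8, 4])
```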
ERIC Educational Resources Information Center
Gunal, Serkan
2008-01-01
Digital libraries play a crucial role in distance learning. Nowadays, they are one of the fundamental information sources for the students enrolled in this learning system. These libraries contain huge amount of instructional data (text, audio and video) offered by the distance learning program. Organization of the digital libraries is…
ERIC Educational Resources Information Center
Carlin, Michael; Toglia, Michael P.; Belmonte, Colleen; DiMeglio, Chiara
2012-01-01
In the present study the effects of visual, auditory, and audio-visual presentation formats on memory for thematically constructed lists were assessed in individuals with intellectual disability and mental age-matched children. The auditory recognition test included target items, unrelated foils, and two types of semantic lures: critical related…
Long-Term Animal Observation by Wireless Sensor Networks with Sound Recognition
NASA Astrophysics Data System (ADS)
Liu, Ning-Han; Wu, Chen-An; Hsieh, Shu-Ju
Because wireless sensor networks can transmit data wirelessly and are easy to deploy, they are used in the wild to monitor environmental change. However, a sensor's lifetime is limited by its battery, and when the monitored data type is audio the lifetime is especially short due to the huge volume of data transmitted. Intuitively, energy can be saved if the sensor mote analyzes the sensed data itself and decides which of them not to deliver to the server. Nevertheless, sensor motes are not powerful enough to run complicated methods, so designing a method that preserves analysis speed and accuracy under restricted memory and processing power is an urgent issue. This research proposes an embedded audio-processing module in the sensor mote that extracts and analyzes audio features in advance. Then, by estimating the likelihood that an observed sound comes from the target animal based on its frequency distribution, only the interesting audio data are sent back to the server. A prototype WSN system was built and tested in the wild to observe frogs. According to the experimental results, the energy consumed by the sensors can be reduced effectively with our method, prolonging the observation time of the animal-detecting sensors.
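The transmit-or-drop idea can be sketched as a band-energy test: score a frame by the fraction of its spectral energy inside an assumed frog-call band and send it only above a threshold. The band edges, sampling rate, and threshold below are illustrative assumptions.

```python
# On-mote gating sketch: transmit a frame only if enough of its energy
# falls in an assumed frog-call frequency band.
import numpy as np

def band_energy_ratio(frame, fs=8000, band=(800, 3000)):
    spectrum = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
    in_band = spectrum[(freqs >= band[0]) & (freqs <= band[1])].sum()
    return in_band / (spectrum.sum() + 1e-12)

def should_transmit(frame, threshold=0.6):
    return band_energy_ratio(frame) >= threshold

rng = np.random.default_rng(3)
noise = rng.normal(0, 1, 1024)               # broadband noise spreads energy
t = np.arange(1024) / 8000
call = np.sin(2 * np.pi * 1500 * t)          # synthetic in-band tone
print(should_transmit(noise), should_transmit(call))
```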
Multi-stream face recognition on dedicated mobile devices for crime-fighting
NASA Astrophysics Data System (ADS)
Jassim, Sabah A.; Sellahewa, Harin
2006-09-01
Automatic face recognition is a useful tool in the fight against crime and terrorism. Technological advances in mobile communication systems and multi-application mobile devices enable the creation of hybrid platforms for active and passive surveillance. A dedicated mobile device that incorporates audio-visual sensors would not only complement existing networks of fixed surveillance devices (e.g. CCTV) but could also provide wide geographical coverage in almost any situation and anywhere. Such a device can hold a small portion of a law-enforcement agency's biometric database that consists of audio and/or visual data of a number of suspects, wanted persons, or missing persons who are expected to be in a local geographical area. This will assist law-enforcement officers on the ground in identifying persons whose biometric templates are downloaded onto their devices. Biometric data on the device can be regularly updated, which will reduce the number of faces an officer has to remember. Such a dedicated device would act as an active/passive mobile surveillance unit that incorporates automatic identification. This paper is concerned with the feasibility of using wavelet-based face recognition schemes on such devices. The proposed schemes extend our recently developed face verification scheme for implementation on a currently available PDA. In particular, we investigate the use of a combination of wavelet frequency channels for multi-stream face recognition. We present experimental results on the performance of our proposed schemes for a number of publicly available face databases, including a new AV database of videos recorded on a PDA.
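A minimal sketch of multi-stream matching over wavelet frequency channels, assuming PyWavelets subbands as the streams and a weighted subband distance for fusion; the wavelet, decomposition level, and weights are illustrative, not the paper's scheme.

```python
# Multi-stream matching sketch: each wavelet subband contributes a
# distance score, and the scores are fused by a weighted sum.
import numpy as np
import pywt

def subband_streams(img, wavelet="db2", level=2):
    cA, *details = pywt.wavedec2(img, wavelet, level=level)
    streams = [cA.ravel()]
    for (cH, cV, cD) in details:
        streams.append(np.concatenate([cH.ravel(), cV.ravel(), cD.ravel()]))
    return streams

def fused_distance(img_a, img_b, weights=(0.5, 0.3, 0.2)):
    sa, sb = subband_streams(img_a), subband_streams(img_b)
    return sum(w * np.linalg.norm(a - b) / a.size
               for w, a, b in zip(weights, sa, sb))

rng = np.random.default_rng(4)
gallery = {"id_01": rng.random((64, 64)), "id_02": rng.random((64, 64))}
probe = gallery["id_02"] + rng.normal(0, 0.05, (64, 64))  # noisy view of id_02
print(min(gallery, key=lambda k: fused_distance(probe, gallery[k])))  # id_02
```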
Tardif, Carole; Lainé, France; Rodriguez, Mélissa; Gepner, Bruno
2007-09-01
This study examined the effects of slowing down presentation of facial expressions and their corresponding vocal sounds on facial expression recognition and facial and/or vocal imitation in children with autism. Twelve autistic children and twenty-four normal control children were presented with emotional and non-emotional facial expressions on CD-Rom, under audio or silent conditions, and under dynamic visual conditions (slowly, very slowly, at normal speed) plus a static control. Overall, children with autism showed lower performance in expression recognition and more induced facial-vocal imitation than controls. In the autistic group, facial expression recognition and induced facial-vocal imitation were significantly enhanced in slow conditions. Findings may give new perspectives for understanding and intervention for verbal and emotional perceptive and communicative impairments in autistic populations.
SNR-adaptive stream weighting for audio-MES ASR.
Lee, Ki-Seung
2008-08-01
Myoelectric signals (MESs) from the speaker's mouth region have been shown to improve the noise robustness of automatic speech recognizers (ASRs), thus promising to extend their usability in implementing noise-robust ASR. In the recognition system presented herein, extracted audio and facial MES features were integrated by a decision fusion method, where the likelihood score of the audio-MES observation vector was given by a linear combination of the class-conditional observation log-likelihoods of two classifiers, using appropriate weights. We developed a weighting process adaptive to SNR. The main objective of the paper is to determine the optimal SNR classification boundaries and to construct a set of optimal stream weights for each SNR class. These two parameters were determined by a method based on a maximum mutual information criterion. Acoustic and facial MES data were collected from five subjects, using a 60-word vocabulary. Four types of acoustic noise, including babble, car, aircraft, and white noise, were acoustically added to clean speech signals at SNRs ranging from -14 to 31 dB. The classification accuracy of the audio ASR was as low as 25.5%, whereas that of the MES ASR was 85.2%. The classification accuracy could be further improved by employing the proposed audio-MES weighting method, reaching as high as 89.4% in the case of babble noise. A similar result was also found for the other types of noise.
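The fusion rule reduces to a convex combination of the two classifiers' class-conditional log-likelihoods with an SNR-dependent weight. The sketch below uses made-up class boundaries and weights in place of the paper's MMI-optimized values.

```python
# SNR-adaptive decision fusion sketch: combined score is a convex
# combination of audio and MES log-likelihoods; the weight is looked up
# from the SNR class. Boundaries and weights below are illustrative.
import numpy as np

SNR_BOUNDS = [-5.0, 10.0]          # dB boundaries splitting three SNR classes
AUDIO_WEIGHTS = [0.2, 0.5, 0.8]    # trust audio more as SNR improves

def audio_weight(snr_db):
    return AUDIO_WEIGHTS[np.searchsorted(SNR_BOUNDS, snr_db)]

def fused_decision(ll_audio, ll_mes, snr_db):
    """ll_audio, ll_mes: per-class log-likelihood arrays from the two ASRs."""
    w = audio_weight(snr_db)
    return int(np.argmax(w * np.asarray(ll_audio) + (1 - w) * np.asarray(ll_mes)))

print(fused_decision([-3.0, -1.0], [-0.5, -2.0], snr_db=-10))  # MES dominates
print(fused_decision([-3.0, -1.0], [-0.5, -2.0], snr_db=20))   # audio dominates
```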
Kwon, Young-Min
2016-07-01
Although dual taper modular-neck total hip arthroplasty (THA) design with additional neck-stem modularity has the potential to optimize hip biomechanical parameters by facilitating adjustments of leg length, femoral neck version and offset, there is increasing concern regarding this stem design as a result of the growing numbers of adverse local tissue reactions due to fretting and corrosion at the neck-stem taper junction. Implant factors such as taper cone angle, taper surface roughness, taper contact area, modular neck taper metallurgy, and femoral head size play important roles in influencing extent of taper corrosion. There should be a low threshold to conduct a systematic clinical evaluation of patients with dual-taper modular-neck stem THA using systematic risk stratification algorithms as early recognition and diagnosis will ensure prompt and appropriate treatment. Although specialized tests such as metal ion analysis and cross-sectional imaging modalities such as metal artifact reduction sequence magnetic resonance imaging (MARS MRI) are useful in optimizing clinical decision-making, overreliance on any single investigative tool in the clinical decision-making process for revision surgery should be avoided. Copyright © 2016 Elsevier Inc. All rights reserved.
Programmable and Multiparameter DNA-Based Logic Platform For Cancer Recognition and Targeted Therapy
You, Mingxu; Zhu, Guizhi; Chen, Tao; Donovan, Michael J; Tan, Weihong
2015-01-21
The specific inventory of molecules on diseased cell surfaces (e.g., cancer cells) provides clinicians an opportunity for accurate diagnosis and intervention. With the discovery of panels of cancer markers, carrying out analyses of multiple cell-surface markers is conceivable. As a trial to accomplish this, we have recently designed a DNA-based device that is capable of performing autonomous logic-based analysis of two or three cancer cell-surface markers. Combining the specific target-recognition properties of DNA aptamers with toehold-mediated strand displacement reactions, multicellular marker-based cancer analysis can be realized based on modular AND, OR, and NOT Boolean logic gates. Specifically, we report here a general approach for assembling these modular logic gates to execute programmable and higher-order profiling of multiple coexisting cell-surface markers, including several found on cancer cells, with the capacity to report a diagnostic signal and/or deliver targeted photodynamic therapy. The success of this strategy demonstrates the potential of DNA nanotechnology in facilitating targeted disease diagnosis and effective therapy.
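Only the logic layer of such a device can be mimicked in software; the toy sketch below evaluates modular AND/OR/NOT profiles over hypothetical marker sets and does not model the strand-displacement chemistry.

```python
# Toy evaluation of modular Boolean marker profiles. Marker names and the
# example profile are hypothetical; the wet-lab chemistry is not modeled.
def AND(*gates):  return lambda cell: all(g(cell) for g in gates)
def OR(*gates):   return lambda cell: any(g(cell) for g in gates)
def NOT(gate):    return lambda cell: not gate(cell)
def marker(name): return lambda cell: name in cell

# Example profile: (A AND B) AND NOT C -> report signal / deliver therapy.
profile = AND(marker("A"), marker("B"), NOT(marker("C")))

for cell in [{"A", "B"}, {"A", "B", "C"}, {"A"}]:
    print(sorted(cell), "->", profile(cell))
```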
The pandemonium system of reflective agents.
Smieja, F
1996-01-01
The Pandemonium system of reflective MINOS agents solves problems by automatic dynamic modularization of the input space. The agents contain feedforward neural networks which adapt using the backpropagation algorithm. We demonstrate the performance of Pandemonium on various categories of problems. These include learning continuous functions with discontinuities, separating two spirals, learning the parity function, and optical character recognition. It is shown how strongly the advantages gained from using a modularization technique depend on the nature of the problem. The superiority of the Pandemonium method over a single net on the first two test categories is contrasted with its limited advantages for the second two categories. In the first case the system converges quicker with modularization and is seen to lead to simpler solutions. For the second case the problem is not significantly simplified through flat decomposition of the input space, although convergence is still quicker.
Challenges older adults face in detecting deceit: the role of emotion recognition.
Stanley, Jennifer Tehan; Blanchard-Fields, Fredda
2008-03-01
Facial expressions of emotion are key cues to deceit (M. G. Frank & P. Ekman, 1997). Given that the literature on aging has shown an age-related decline in decoding emotions, we investigated (a) whether there are age differences in deceit detection and (b) if so, whether they are related to impairments in emotion recognition. Young and older adults (N = 364) were presented with 20 interviews (crime and opinion topics) and asked to decide whether each interview subject was lying or telling the truth. There were 3 presentation conditions: visual, audio, or audiovisual. In older adults, reduced emotion recognition was related to poor deceit detection in the visual condition for crime interviews only. (c) 2008 APA, all rights reserved.
Donato, Anthony A; Kaliyadan, Antony G; Wasser, Thomas
2014-01-01
Studies of physicians at all levels of training demonstrate significant deficiencies in cardiac auscultation skills. The best instructional methods to augment these skills are not known. This study was a randomized, controlled trial of 83 noncardiologist volunteers exposed to a 12-week lower cognitive load self-study group using MP3 players containing heart sound audio files compared to a group receiving a 1-time 1-hour higher cognitive load multimedia lecture using the same audio files. The primary outcome measure was change in 15-question posttest score at 4 and 12 weeks as compared to pretest on recognition of identical audio files introduced during training. In the self-study group, the association of total exposure and deliberate practice effort (estimated by standard deviation of files played/mean) to improvement in test score was measured as a secondary end point. Self-study group participants improved as compared to pretest by 4.42 ± 3.41 answers correct at 12 weeks (5.09-9.51 correct, p < .001), while those exposed to the multimedia lecture improved by an average of 1.13 ± 3.2 answers correct (4.48-5.61 correct, p = .03). In the self-study arm, improvement in the posttest was positively associated with both total exposure (β = 0.55, p < .001) and deliberate practice score (β = 0.31, p = .02). A lower cognitive load self-study of audio files improved recognition of cardiac sounds, as compared to multimedia lecture, and deliberate practice strategies improved study efficiency. More investigation is needed to assess transfer of learning to a wider range of cardiac sounds in both simulated and clinical environments. © 2014 The Alliance for Continuing Education in the Health Professions, the Society for Academic Continuing Medical Education, and the Council on Continuing Medical Education, Association for Hospital Medical Education.
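The deliberate practice score used here is a coefficient of variation: the standard deviation of per-file play counts divided by their mean. A worked example with made-up play counts:

```python
# Coefficient-of-variation estimate of deliberate practice effort.
# The play counts below are invented for illustration.
import numpy as np

plays_uniform = np.array([10, 10, 10, 10, 10])   # equal exposure to all files
plays_targeted = np.array([2, 25, 3, 28, 2])     # practice focused on hard sounds

for plays in (plays_uniform, plays_targeted):
    score = plays.std(ddof=1) / plays.mean()     # sample std / mean
    print(f"total={plays.sum():3d}  deliberate-practice score={score:.2f}")
```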
Parametric Representation of the Speaker's Lips for Multimodal Sign Language and Speech Recognition
NASA Astrophysics Data System (ADS)
Ryumin, D.; Karpov, A. A.
2017-05-01
In this article, we propose a new method for parametric representation of the human lip region. The functional diagram of the method is described, and implementation details with an explanation of its key stages and features are given. The results of automatic detection of the regions of interest are illustrated. The speed of the method on several computers with different performance levels is reported. This universal method allows applying a parametric representation of the speaker's lips to the tasks of biometrics, computer vision, machine learning, and automatic recognition of faces, elements of sign languages, and audio-visual speech, including lip-reading.
Planning the National Agricultural Library's Multimedia CD-ROM "Ornamental Horticulture."
ERIC Educational Resources Information Center
Mason, Pamela R.
1991-01-01
Discussion of issues involved in planning a multimedia CD-ROM product explains the selection of authoring tools, the design of a user interface, expert systems, text conversion and capture (including scanning and optical character recognition), and problems associated with image files. The use of audio is also discussed, and a 14-item glossary is…
Polyphonic Music Information Retrieval Based on Multi-Label Cascade Classification System
ERIC Educational Resources Information Center
Jiang, Wenxin
2009-01-01
Recognition and separation of sounds played by various instruments is very useful in labeling audio files with semantic information. This is a non-trivial task requiring sound analysis, but the results can aid automatic indexing and browsing music data when searching for melodies played by user specified instruments. Melody match based on pitch…
Finger vein recognition based on finger crease location
NASA Astrophysics Data System (ADS)
Lu, Zhiying; Ding, Shumeng; Yin, Jing
2016-07-01
Finger vein recognition technology has significant advantages over other methods in terms of accuracy, uniqueness, and stability, and it has wide and promising applications in the field of biometric recognition. We propose using finger creases to locate and extract an object region. Then we use linear fitting to overcome the problem of finger rotation in the plane. The method of modular adaptive histogram equalization (MAHE) is presented to enhance image contrast and reduce computational cost. To extract the finger vein features, we use a fusion method, which can obtain clear and distinguishable vein patterns under different conditions. We used the Hausdorff average distance algorithm to examine the recognition performance of the system. The experimental results demonstrate that MAHE better balances recognition accuracy and time expenditure compared with three other methods. The resulting equal error rate over the total procedure was 3.268% on a database of 153 finger vein images.
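MAHE itself is the paper's contribution; as a related, readily available stand-in, the sketch below applies OpenCV's tile-based CLAHE to a synthetic vein image. Tile size and clip limit are illustrative.

```python
# Tile-based adaptive histogram equalization via OpenCV's CLAHE, used here
# as a stand-in for the paper's MAHE method.
import cv2
import numpy as np

def enhance_vein_image(gray, tiles=(8, 8), clip=2.0):
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=tiles)
    return clahe.apply(gray)

# Synthetic uint8 image as a placeholder for a captured finger-vein frame.
img = (np.random.default_rng(5).random((120, 240)) * 255).astype(np.uint8)
enhanced = enhance_vein_image(img)
print(enhanced.shape, enhanced.dtype)
```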
Contempt - Where the modularity of the mind meets the modularity of the brain?
Bzdok, Danilo; Schilbach, Leonhard
2017-01-01
"Contempt" is proposed to be a unique aspect of human nature, yet a non-natural kind. Its psychological construct is framed as a sentiment emerging from a stratification of diverse basic emotions and dispositional attitudes. Accordingly, "contempt" might transcend traditional conceptual levels in social psychology, including experience and recognition of emotion, dyadic and group dynamics, context-conditioned attitudes, time-enduring personality structure, and morality. This strikes us as a modern psychological account of a high-level, social-affective cognitive facet that joins forces with recent developments in the social neuroscience by drawing psychological conclusions from brain biology.
Nirme, Jens; Haake, Magnus; Lyberg Åhlander, Viveka; Brännström, Jonas; Sahlén, Birgitta
2018-04-05
Seeing a speaker's face facilitates speech recognition, particularly under noisy conditions. Evidence for how it might affect comprehension of the content of the speech is more sparse. We investigated how children's listening comprehension is affected by multi-talker babble noise, with or without presentation of a digitally animated virtual speaker, and whether successful comprehension is related to performance on a test of executive functioning. We performed a mixed-design experiment with 55 (34 female) participants (8- to 9-year-olds), recruited from Swedish elementary schools. The children were presented with four different narratives, each in one of four conditions: audio-only presentation in a quiet setting, audio-only presentation in noisy setting, audio-visual presentation in a quiet setting, and audio-visual presentation in a noisy setting. After each narrative, the children answered questions on the content and rated their perceived listening effort. Finally, they performed a test of executive functioning. We found significantly fewer correct answers to explicit content questions after listening in noise. This negative effect was only mitigated to a marginally significant degree by audio-visual presentation. Strong executive function only predicted more correct answers in quiet settings. Altogether, our results are inconclusive regarding how seeing a virtual speaker affects listening comprehension. We discuss how methodological adjustments, including modifications to our virtual speaker, can be used to discriminate between possible explanations to our results and contribute to understanding the listening conditions children face in a typical classroom.
A framework of text detection and recognition from natural images for mobile device
NASA Astrophysics Data System (ADS)
Selmi, Zied; Ben Halima, Mohamed; Wali, Ali; Alimi, Adel M.
2017-03-01
In light of the remarkable effect of audio-visual media on modern life and the massive use of new technologies (smartphones, tablets, ...), the image has been given great importance in the field of communication. It has become the most effective, attractive and suitable means of communication for transmitting information between people. Of all the pieces of information that can be extracted from an image, our focus is particularly on text. Since text detection and recognition in natural images is a major problem in many applications, it has drawn the attention of a great number of researchers in recent years. In this paper, we present a framework for text detection and recognition from natural images for mobile devices.
Concurrent evolution of feature extractors and modular artificial neural networks
NASA Astrophysics Data System (ADS)
Hannak, Victor; Savakis, Andreas; Yang, Shanchieh Jay; Anderson, Peter
2009-05-01
This paper presents a new approach for the design of feature-extracting recognition networks that do not require expert knowledge in the application domain. Feature-Extracting Recognition Networks (FERNs) are composed of interconnected functional nodes (feurons), which serve as feature extractors, and are followed by a subnetwork of traditional neural nodes (neurons) that act as classifiers. A concurrent evolutionary process (CEP) is used to search the space of feature extractors and neural networks in order to obtain an optimal recognition network that simultaneously performs feature extraction and recognition. By constraining the hill-climbing search functionality of the CEP on specific parts of the solution space, i.e., individually limiting the evolution of feature extractors and neural networks, it was demonstrated that concurrent evolution is a necessary component of the system. Application of this approach to a handwritten digit recognition task illustrates that the proposed methodology is capable of producing recognition networks that perform in-line with other methods without the need for expert knowledge in image processing.
ERIC Educational Resources Information Center
Kamijo, Haruo; Morii, Shingo; Yamaguchi, Wataru; Toyooka, Naoki; Tada-Umezaki, Masahito; Hirobayashi, Shigeki
2016-01-01
Various tactile methods, such as Braille, have been employed to enhance the recognition ability of chemical structures by individuals with visual disabilities. However, it is unknown whether reading aloud the names of chemical compounds would be effective in this regard. There are no systems currently available using an audio component to assist…
Pitch-Based Segregation of Reverberant Speech
2005-02-01
speaker recognition in real environments, audio information retrieval, and hearing prostheses. Second, although binaural listening improves the ... intelligibility of target speech under anechoic conditions (Bronkhorst, 2000), this binaural advantage is largely eliminated by reverberation (Plomp, 1976 ... Brown and Cooke, 1994; Wang and Brown, 1999; Hu and Wang, 2004), as well as in binaural separation (e.g., Roman et al., 2003; Palomaki et al., 2004).
The Downside of Greater Lexical Influences: Selectively Poorer Speech Perception in Noise
Xie, Zilong; Tessmer, Rachel; Chandrasekaran, Bharath
2017-01-01
Purpose Although lexical information influences phoneme perception, the extent to which reliance on lexical information enhances speech processing in challenging listening environments is unclear. We examined the extent to which individual differences in lexical influences on phonemic processing impact speech processing in maskers containing varying degrees of linguistic information (2-talker babble or pink noise). Method Twenty-nine monolingual English speakers were instructed to ignore the lexical status of spoken syllables (e.g., gift vs. kift) and to only categorize the initial phonemes (/g/ vs. /k/). The same participants then performed speech recognition tasks in the presence of 2-talker babble or pink noise in audio-only and audiovisual conditions. Results Individuals who demonstrated greater lexical influences on phonemic processing experienced greater speech processing difficulties in 2-talker babble than in pink noise. These selective difficulties were present across audio-only and audiovisual conditions. Conclusion Individuals with greater reliance on lexical processes during speech perception exhibit impaired speech recognition in listening conditions in which competing talkers introduce audible linguistic interferences. Future studies should examine the locus of lexical influences/interferences on phonemic processing and speech-in-speech processing. PMID:28586824
Speaker emotion recognition: from classical classifiers to deep neural networks
NASA Astrophysics Data System (ADS)
Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri
2018-04-01
Speaker emotion recognition has been considered among the most challenging tasks in recent years. In fact, automatic systems for security, medicine or education can be improved by taking the affective state of speech into account. In this paper, a twofold approach for speech emotion classification is proposed: first, a relevant set of features is adopted; second, numerous supervised training techniques, involving classic methods as well as deep learning, are evaluated. Experimental results indicate that deep architectures can improve classification performance on two affective databases, the Berlin Dataset of Emotional Speech and the Surrey Audio-Visual Expressed Emotion (SAVEE) dataset.
Crossmodal and incremental perception of audiovisual cues to emotional speech.
Barkhuysen, Pashiera; Krahmer, Emiel; Swerts, Marc
2010-01-01
In this article we report on two experiments about the perception of audiovisual cues to emotional speech. The article addresses two questions: (1) how do visual cues from a speaker's face to emotion relate to auditory cues, and (2) what is the recognition speed for various facial cues to emotion? Both experiments reported below are based on tests with video clips of emotional utterances collected via a variant of the well-known Velten method. More specifically, we recorded speakers who displayed positive or negative emotions, which were congruent or incongruent with the (emotional) lexical content of the uttered sentence. In order to test this, we conducted two experiments. The first experiment is a perception experiment in which Czech participants, who do not speak Dutch, rate the perceived emotional state of Dutch speakers in a bimodal (audiovisual) or a unimodal (audio- or vision-only) condition. It was found that incongruent emotional speech leads to significantly more extreme perceived emotion scores than congruent emotional speech, where the difference between congruent and incongruent emotional speech is larger for the negative than for the positive conditions. Interestingly, the largest overall differences between congruent and incongruent emotions were found for the audio-only condition, which suggests that posing an incongruent emotion has a particularly strong effect on the spoken realization of emotions. The second experiment uses a gating paradigm to test the recognition speed for various emotional expressions from a speaker's face. In this experiment participants were presented with the same clips as in experiment I, but this time vision-only. The clips were shown in successive segments (gates) of increasing duration. Results show that participants are surprisingly accurate in their recognition of the various emotions, as they already reach high recognition scores in the first gate (after only 160 ms). Interestingly, the recognition scores rise faster for positive than for negative conditions. Finally, the gating results suggest that incongruent emotions are perceived as more intense than congruent emotions, as the former get more extreme recognition scores than the latter, already after a short period of exposure.
Duke, Mila Morais; Wolfe, Jace; Schafer, Erin
2016-05-01
Cochlear implant (CI) recipients often experience difficulty understanding speech in noise and speech that originates from a distance. Many CI recipients also experience difficulty understanding speech originating from a television. Use of hearing assistance technology (HAT) may improve speech recognition in noise and for signals that originate from more than a few feet from the listener; however, there are no published studies evaluating the potential benefits of a wireless HAT designed to deliver audio signals from a television directly to a CI sound processor. The objective of this study was to compare speech recognition in quiet and in noise of CI recipients with the use of their CI alone and with the use of their CI and a wireless HAT (Cochlear Wireless TV Streamer). A two-way repeated measures design was used to evaluate performance differences obtained in quiet and in competing noise (65 dBA) with the CI sound processor alone and with the sound processor coupled to the Cochlear Wireless TV Streamer. Sixteen users of Cochlear Nucleus 24 Freedom, CI512, and CI422 implants were included in the study. Participants were evaluated in four conditions including use of the sound processor alone and use of the sound processor with the wireless streamer in quiet and in the presence of competing noise at 65 dBA. Speech recognition was evaluated in each condition with two full lists of Computer-Assisted Speech Perception Testing and Training Sentence-Level Test sentences presented from a light-emitting diode television. Speech recognition in noise was significantly better with use of the wireless streamer compared to participants' performance with their CI sound processor alone. There was also a nonsignificant trend toward better performance in quiet with use of the TV Streamer. Performance was significantly poorer when evaluated in noise compared to performance in quiet when the TV Streamer was not used. Use of the Cochlear Wireless TV Streamer designed to stream audio from a television directly to a CI sound processor provides better speech recognition in quiet and in noise when compared to performance obtained with use of the CI sound processor alone. American Academy of Audiology.
Phosphorescent nanosensors for in vivo tracking of histamine levels.
Cash, Kevin J; Clark, Heather A
2013-07-02
Continuously tracking bioanalytes in vivo will enable clinicians and researchers to profile normal physiology and monitor diseased states. Current in vivo monitoring system designs are limited by invasive implantation procedures and biofouling, limiting the utility of these tools for obtaining physiologic data. In this work, we demonstrate the first success in optically tracking histamine levels in vivo using a modular, injectable sensing platform based on diamine oxidase and a phosphorescent oxygen nanosensor. Our new approach increases the range of measurable analytes by combining an enzymatic recognition element with a reversible nanosensor capable of measuring the effects of enzymatic activity. We use these enzyme nanosensors (EnzNS) to monitor the in vivo histamine dynamics as the concentration rapidly increases and decreases due to administration and clearance. The EnzNS system measured kinetics that match those reported from ex vivo measurements. This work establishes a modular approach to in vivo nanosensor design for measuring a broad range of potential target analytes. Simply replacing the recognition enzyme, or both the enzyme and nanosensor, can produce a new sensor system capable of measuring a wide range of specific analytical targets in vivo.
Cultural Specific Effects on the Recognition of Basic Emotions: A Study on Italian Subjects
NASA Astrophysics Data System (ADS)
Esposito, Anna; Riviello, Maria Teresa; Bourbakis, Nikolaos
The present work reports the results of perceptual experiments aimed at investigating whether some basic emotions are perceptually privileged and whether the cultural environment and the perceptual mode play a role in this preference. To this aim, Italian subjects were asked to assess emotional stimuli extracted from Italian and American English movies in a single mode (either video or audio alone) and in the combined audio/video mode. Results showed that anger, fear, and sadness are better perceived than surprise and happiness in both cultural environments (irony, by contrast, depends strongly on the language), that emotional information is affected by the communication mode, and that language plays a role in assessing emotional information. Implications for the implementation of emotionally colored interactive systems are discussed.
Kasabov, Nikola; Scott, Nathan Matthew; Tu, Enmei; Marks, Stefan; Sengupta, Neelava; Capecci, Elisa; Othman, Muhaini; Doborjeh, Maryam Gholami; Murli, Norhanifah; Hartono, Reggio; Espinosa-Ramos, Josafath Israel; Zhou, Lei; Alvi, Fahad Bashir; Wang, Grace; Taylor, Denise; Feigin, Valery; Gulyaev, Sergei; Mahmoud, Mahmoud; Hou, Zeng-Guang; Yang, Jie
2016-06-01
The paper describes a new type of evolving connectionist systems (ECOS) called evolving spatio-temporal data machines based on neuromorphic, brain-like information processing principles (eSTDM). These are multi-modular computer systems designed to deal with large and fast spatio/spectro temporal data using spiking neural networks (SNN) as major processing modules. ECOS and eSTDM in particular can learn incrementally from data streams, can include 'on the fly' new input variables, new output class labels or regression outputs, can continuously adapt their structure and functionality, can be visualised and interpreted for new knowledge discovery and for a better understanding of the data and the processes that generated it. eSTDM can be used for early event prediction due to the ability of the SNN to spike early, before whole input vectors (they were trained on) are presented. A framework for building eSTDM called NeuCube along with a design methodology for building eSTDM using this is presented. The implementation of this framework in MATLAB, Java, and PyNN (Python) is presented. The latter facilitates the use of neuromorphic hardware platforms to run the eSTDM. Selected examples are given of eSTDM for pattern recognition and early event prediction on EEG data, fMRI data, multisensory seismic data, ecological data, climate data, audio-visual data. Future directions are discussed, including extension of the NeuCube framework for building neurogenetic eSTDM and also new applications of eSTDM. Copyright © 2015 Elsevier Ltd. All rights reserved.
Improved Open-Microphone Speech Recognition
NASA Astrophysics Data System (ADS)
Abrash, Victor
2002-12-01
Many current and future NASA missions make extreme demands on mission personnel both in terms of work load and in performing under difficult environmental conditions. In situations where hands are impeded or needed for other tasks, eyes are busy attending to the environment, or tasks are sufficiently complex that ease of use of the interface becomes critical, spoken natural language dialog systems offer unique input and output modalities that can improve efficiency and safety. They also offer new capabilities that would not otherwise be available. For example, many NASA applications require astronauts to use computers in micro-gravity or while wearing space suits. Under these circumstances, command and control systems that allow users to issue commands or enter data in hands-and eyes-busy situations become critical. Speech recognition technology designed for current commercial applications limits the performance of the open-ended state-of-the-art dialog systems being developed at NASA. For example, today's recognition systems typically listen to user input only during short segments of the dialog, and user input outside of these short time windows is lost. Mistakes detecting the start and end times of user utterances can lead to mistakes in the recognition output, and the dialog system as a whole has no way to recover from this, or any other, recognition error. Systems also often require the user to signal when that user is going to speak, which is impractical in a hands-free environment, or only allow a system-initiated dialog requiring the user to speak immediately following a system prompt. In this project, SRI has developed software to enable speech recognition in a hands-free, open-microphone environment, eliminating the need for a push-to-talk button or other signaling mechanism. The software continuously captures a user's speech and makes it available to one or more recognizers. By constantly monitoring and storing the audio stream, it provides the spoken dialog manager extra flexibility to recognize the signal with no audio gaps between recognition requests, as well as to rerecognize portions of the signal, or to rerecognize speech with different grammars, acoustic models, recognizers, start times, and so on. SRI expects that this new open-mic functionality will enable NASA to develop better error-correction mechanisms for spoken dialog systems, and may also enable new interaction strategies.
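The continuous-capture idea can be sketched as a ring buffer over the audio stream, letting a dialog manager re-request any recent span (for example, to re-recognize it with a different grammar). This is an illustrative sketch, not SRI's software.

```python
# Ring buffer over a continuous audio stream so recent spans can be
# re-read for re-recognition. Sizes are illustrative.
import numpy as np

class AudioRingBuffer:
    def __init__(self, capacity_samples):
        self.buf = np.zeros(capacity_samples, dtype=np.float32)
        self.n_written = 0

    def write(self, chunk):
        for x in np.asarray(chunk, dtype=np.float32):
            self.buf[self.n_written % len(self.buf)] = x
            self.n_written += 1

    def read(self, start, end):
        """Return samples [start, end) by absolute index, if still buffered."""
        if start < self.n_written - len(self.buf):
            raise ValueError("requested audio has been overwritten")
        return np.array([self.buf[i % len(self.buf)] for i in range(start, end)])

rb = AudioRingBuffer(16000)            # ~1 s of audio at 16 kHz
rb.write(np.ones(4000)); rb.write(np.zeros(4000))
print(rb.read(3000, 5000).mean())      # span crossing both chunks -> 0.5
```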
Thin membrane sensor with biochemical switch
NASA Technical Reports Server (NTRS)
Worley, III, Jennings F. (Inventor); Case, George D. (Inventor)
1994-01-01
A modular biosensor system for chemical or biological agent detection utilizes electrochemical measurement of an ion current across a gate membrane triggered by the reaction of the target agent with a recognition protein conjugated to a channel blocker. The sensor system includes a bioresponse simulator or biochemical switch module which contains the recognition protein-channel blocker conjugate, and in which the detection reactions occur, and a transducer module which contains a gate membrane and a measuring electrode, and in which the presence of agent is sensed electrically. In the poised state, ion channels in the gate membrane are blocked by the recognition protein-channel blocker conjugate. Detection reactions remove the recognition protein-channel blocker conjugate from the ion channels, thus eliciting an ion current surge in the gate membrane which subsequently triggers an output alarm. Sufficiently large currents are generated that simple direct current electronics are adequate for the measurements. The biosensor has applications for environmental, medical, and industrial use.
Gender differences in emotion recognition: Impact of sensory modality and emotional category.
Lambrecht, Lena; Kreifelts, Benjamin; Wildgruber, Dirk
2014-04-01
Results from studies on gender differences in emotion recognition vary, depending on the types of emotion and the sensory modalities used for stimulus presentation. This makes comparability between different studies problematic. This study investigated emotion recognition of healthy participants (N = 84; 40 males; ages 20 to 70 years), using dynamic stimuli, displayed by two genders in three different sensory modalities (auditory, visual, audio-visual) and five emotional categories. The participants were asked to categorise the stimuli on the basis of their nonverbal emotional content (happy, alluring, neutral, angry, and disgusted). Hit rates and category selection biases were analysed. Women were found to be more accurate in recognition of emotional prosody. This effect was partially mediated by hearing loss for the frequency of 8,000 Hz. Moreover, there was a gender-specific selection bias for alluring stimuli: Men, as compared to women, chose "alluring" more often when a stimulus was presented by a woman as compared to a man.
Advertisement recognition using mode voting acoustic fingerprint
NASA Astrophysics Data System (ADS)
Fahmi, Reza; Abedi Firouzjaee, Hosein; Janalizadeh Choobbasti, Ali; Mortazavi Najafabadi, S. H. E.; Safavi, Saeid
2017-12-01
The emergence of media outlets and public relations tools such as TV, radio and the Internet since the 20th century has provided companies with a good platform for advertising their goods and services. Advertisement recognition is an important task that can help companies measure the efficiency of their advertising campaigns in the market and compare their performance with competitors in order to gain better business insights. Advertisement recognition is usually performed manually with the help of human labor or through automated methods that are mainly based on heuristic features; these methods usually lack scalability and the ability to generalize to different situations. In this paper, we present an automated method for advertisement recognition based on audio processing that makes this process fairly simple and eliminates the human factor from the equation. This method has ultimately been used at Miras information technology to monitor 56 TV channels and detect all ad video clips broadcast over some networks.
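A minimal sketch of fingerprinting with mode voting: frames are hashed by their dominant spectral peaks, matching frames vote for a time offset, and the modal offset signals a detection. The hashing scheme below is an assumption, not the paper's fingerprint.

```python
# Acoustic fingerprint sketch with mode voting over frame-match offsets.
import numpy as np
from collections import Counter

def frame_hashes(signal, frame_len=1024, hop=512, n_peaks=3):
    hashes = []
    for s in range(0, len(signal) - frame_len + 1, hop):
        mag = np.abs(np.fft.rfft(signal[s:s + frame_len]))
        hashes.append(tuple(sorted(np.argsort(mag)[-n_peaks:])))  # peak bins
    return hashes

def match_offset(reference, query):
    ref_index = {}
    for i, h in enumerate(frame_hashes(reference)):
        ref_index.setdefault(h, []).append(i)
    votes = Counter()
    for j, h in enumerate(frame_hashes(query)):
        for i in ref_index.get(h, []):
            votes[i - j] += 1        # each match votes for an alignment offset
    return votes.most_common(1)[0] if votes else None

rng = np.random.default_rng(6)
ad = rng.normal(0, 1, 8 * 1024)                           # synthetic ad clip
stream = np.concatenate([rng.normal(0, 1, 4 * 1024), ad]) # broadcast stream
print(match_offset(stream, ad))  # modal offset: 8 frames into the stream
```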
ERIC Educational Resources Information Center
Suendermann-Oeft, David; Ramanarayanan, Vikram; Yu, Zhou; Qian, Yao; Evanini, Keelan; Lange, Patrick; Wang, Xinhao; Zechner, Klaus
2017-01-01
We present work in progress on a multimodal dialog system for English language assessment using a modular cloud-based architecture adhering to open industry standards. Among the modules being developed for the system, multiple modules heavily exploit machine learning techniques, including speech recognition, spoken language proficiency rating,…
A neural network with modular hierarchical learning
NASA Technical Reports Server (NTRS)
Baldi, Pierre F. (Inventor); Toomarian, Nikzad (Inventor)
1994-01-01
This invention provides a new hierarchical approach for supervised neural learning of time-dependent trajectories. The modular hierarchical methodology leads to architectures which are more structured than fully interconnected networks. The networks utilize a general feedforward flow of information and sparse recurrent connections to achieve dynamic effects. The advantages include the sparsity of units and connections and the modular organization. A further advantage is that the learning is much more circumscribed than in fully interconnected systems. The present invention is embodied by a neural network including a plurality of neural modules, each having a pre-established performance capability, wherein each neural module has an output outputting present results of the performance capability and an input for changing the present results of the performance capability. For pattern recognition applications, the performance capability may be an oscillation capability producing a repeating wave pattern as the present results. In the preferred embodiment, each of the plurality of neural modules includes a pre-established capability portion and a performance adjustment portion connected to control the pre-established capability portion.
FunBlocks. A modular framework for AmI system development.
Baquero, Rafael; Rodríguez, José; Mendoza, Sonia; Decouchant, Dominique; Papis, Alfredo Piero Mateos
2012-01-01
The last decade has seen explosive growth in the technologies required to implement Ambient Intelligence (AmI) systems. Technologies such as facial and speech recognition, home networks, household cleaning robots, to name a few, have become commonplace. However, due to the multidisciplinary nature of AmI systems and the distinct requirements of different user groups, integrating these developments into full-scale systems is not an easy task. In this paper we propose FunBlocks, a minimalist modular framework for the development of AmI systems based on the function module abstraction used in the IEC 61499 standard for distributed control systems. FunBlocks provides a framework for the development of AmI systems through the integration of modules loosely joined by means of an event-driven middleware and a module and sensor/actuator catalog. The modular design of the FunBlocks framework allows the development of AmI systems which can be customized to a wide variety of usage scenarios.
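The event-driven middleware that loosely joins FunBlocks modules can be sketched as a small publish/subscribe bus; the event names and handlers below are hypothetical, not part of the framework's actual module catalog.

```python
# Minimal publish/subscribe event bus in the spirit of loosely coupled
# AmI modules. Events and handlers are invented for illustration.
from collections import defaultdict

class EventBus:
    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event, handler):
        self.handlers[event].append(handler)

    def publish(self, event, **payload):
        for handler in self.handlers[event]:
            handler(**payload)

bus = EventBus()
bus.subscribe("motion_detected", lambda room: print(f"lights on in {room}"))
bus.subscribe("motion_detected", lambda room: print(f"camera armed in {room}"))
bus.publish("motion_detected", room="kitchen")
```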
Video2vec Embeddings Recognize Events When Examples Are Scarce.
Habibian, Amirhossein; Mensink, Thomas; Snoek, Cees G M
2017-10-01
This paper aims for event recognition when video examples are scarce or even completely absent. The key in such a challenging setting is a semantic video representation. Rather than building the representation from individual attribute detectors and their annotations, we propose to learn the entire representation from freely available web videos and their descriptions using an embedding between video features and term vectors. In our proposed embedding, which we call Video2vec, the correlations between the words are utilized to learn a more effective representation by optimizing a joint objective balancing descriptiveness and predictability. We show how learning the Video2vec embedding using a multimodal predictability loss, including appearance, motion and audio features, results in a better predictable representation. We also propose an event specific variant of Video2vec to learn a more accurate representation for the words, which are indicative of the event, by introducing a term sensitive descriptiveness loss. Our experiments on three challenging collections of web videos from the NIST TRECVID Multimedia Event Detection and Columbia Consumer Videos datasets demonstrate: i) the advantages of Video2vec over representations using attributes or alternative embeddings, ii) the benefit of fusing video modalities by an embedding over common strategies, iii) the complementarity of term sensitive descriptiveness and multimodal predictability for event recognition. By its ability to improve predictability of present day audio-visual video features, while at the same time maximizing their semantic descriptiveness, Video2vec leads to state-of-the-art accuracy for both few- and zero-example recognition of events in video.
NASA Astrophysics Data System (ADS)
Lecun, Yann; Bengio, Yoshua; Hinton, Geoffrey
2015-05-01
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
Influences of emotion on context memory while viewing film clips.
Anderson, Lisa; Shimamura, Arthur P
2005-01-01
Participants listened to words while viewing film clips (audio off). Film clips were classified as neutral, positively valenced, negatively valenced, and arousing. Memory was assessed in three ways: recall of film content, recall of words, and context recognition. In the context recognition test, participants were presented a word and determined which film clip was showing when the word was originally presented. In two experiments, context memory performance was disrupted when words were presented during negatively valenced film clips, whereas it was enhanced when words were presented during arousing film clips. Free recall of words presented during the negatively valenced films was also disrupted. These findings suggest multiple influences of emotion on memory performance.
Mining the modular structure of protein interaction networks.
Berenstein, Ariel José; Piñero, Janet; Furlong, Laura Inés; Chernomoretz, Ariel
2015-01-01
Cluster-based descriptions of biological networks have received much attention in recent years, fostered by accumulated evidence of meaningful correlations between topological network clusters and biological functional modules. Several well-performing clustering algorithms exist to infer topological network partitions. However, due to their respective technical idiosyncrasies, they might produce dissimilar modular decompositions of a given network. In this contribution, we aimed to analyze how alternative modular descriptions could condition the outcome of follow-up network biology analysis. We considered a human protein interaction network and two paradigmatic cluster recognition algorithms, namely the Clauset-Newman-Moore and infomap procedures. We analyzed to what extent both methodologies yielded different results in terms of granularity and biological congruency. In addition, taking into account Guimera's cartographic role characterization of network nodes, we explored how the adoption of a given clustering methodology impinged on the ability to highlight relevant network meso-scale connectivity patterns. As a case study we considered a set of aging-related proteins and showed that only the high-resolution modular description provided by infomap could unveil statistically significant associations between them and inter/intra-modular cartographic features. Besides reporting novel biological insights gained from the discovered associations, our contribution warns of possible technical pitfalls affecting the tools used to mine for interaction patterns in network biology studies. In particular, our results suggested that partitions that are sub-optimal strictly in terms of modularity may still be worth analyzing when meso-scale features are explored in connection with external sources of biological knowledge.
Split-brain reveals separate but equal self-recognition in the two cerebral hemispheres.
Uddin, Lucina Q; Rayman, Jan; Zaidel, Eran
2005-09-01
To assess the ability of the disconnected cerebral hemispheres to recognize images of the self, a split-brain patient (an individual who underwent complete cerebral commissurotomy to relieve intractable epilepsy) was tested using morphed self-face images presented to one visual hemifield (projecting to one hemisphere) at a time while making "self/other" judgments. The performance of the right and left hemispheres of this patient as assessed by a signal detection method was not significantly different, though a measure of bias did reveal hemispheric differences. The right and left hemispheres of this patient independently and equally possessed the ability to self-recognize, but only the right hemisphere could successfully recognize familiar others. This supports a modular concept of self-recognition and other-recognition, separately present in each cerebral hemisphere.
ERIC Educational Resources Information Center
Ritzhaupt, Albert Dieter; Barron, Ann
2008-01-01
The purpose of this study was to investigate the effect of time-compressed narration and representational adjunct images on a learner's ability to recall and recognize information. The experiment was a 4 Audio Speeds (1.0 = normal vs. 1.5 = moderate vs. 2.0 = fast vs. 2.5 = fastest rate) x Adjunct Image (Image Present vs. Image Absent) factorial…
Aucouturier, Jean-Julien; Defreville, Boris; Pachet, François
2007-08-01
The "bag-of-frames" approach (BOF) to audio pattern recognition represents signals as the long-term statistical distribution of their local spectral features. This approach has proved nearly optimal for simulating the auditory perception of natural and human environments (or soundscapes), and is also the most predominent paradigm to extract high-level descriptions from music signals. However, recent studies show that, contrary to its application to soundscape signals, BOF only provides limited performance when applied to polyphonic music signals. This paper proposes to explicitly examine the difference between urban soundscapes and polyphonic music with respect to their modeling with the BOF approach. First, the application of the same measure of acoustic similarity on both soundscape and music data sets confirms that the BOF approach can model soundscapes to near-perfect precision, and exhibits none of the limitations observed in the music data set. Second, the modification of this measure by two custom homogeneity transforms reveals critical differences in the temporal and statistical structure of the typical frame distribution of each type of signal. Such differences may explain the uneven performance of BOF algorithms on soundscapes and music signals, and suggest that their human perception rely on cognitive processes of a different nature.
Exploring expressivity and emotion with artificial voice and speech technologies.
Pauletto, Sandra; Balentine, Bruce; Pidcock, Chris; Jones, Kevin; Bottaci, Leonardo; Aretoulaki, Maria; Wells, Jez; Mundy, Darren P; Balentine, James
2013-10-01
Emotion in audio-voice signals, as synthesized by text-to-speech (TTS) technologies, was investigated to formulate a theory of expression for user interface design. Emotional parameters were specified with markup tags, and the resulting audio was further modulated with post-processing techniques. Software was then developed to link a selected TTS synthesizer with an automatic speech recognition (ASR) engine, producing a chatbot that could speak and listen. Using these two artificial voice subsystems, investigators explored both artistic and psychological implications of artificial speech emotion. Goals of the investigation were interdisciplinary, with interest in musical composition, augmentative and alternative communication (AAC), commercial voice announcement applications, human-computer interaction (HCI), and artificial intelligence (AI). The work-in-progress points towards an emerging interdisciplinary ontology for artificial voices. As one study output, HCI tools are proposed for future collaboration.
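A bare speak-and-listen loop of the kind described, an ASR engine feeding a TTS synthesizer, can be improvised with the widely available speech_recognition and pyttsx3 packages. This sketch omits the study's emotional markup and post-processing stages, and the speaking-rate setting is only a crude stand-in for an expressive parameter:

```python
import speech_recognition as sr
import pyttsx3

engine = pyttsx3.init()
engine.setProperty("rate", 150)          # crude stand-in for an expressive parameter

recognizer = sr.Recognizer()
with sr.Microphone() as mic:
    recognizer.adjust_for_ambient_noise(mic)
    audio = recognizer.listen(mic, phrase_time_limit=5)

try:
    heard = recognizer.recognize_google(audio)   # needs network access
except (sr.UnknownValueError, sr.RequestError):
    heard = ""

reply = f"You said: {heard}" if heard else "I did not catch that."
engine.say(reply)
engine.runAndWait()
```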
Facial recognition using multisensor images based on localized kernel eigen spaces.
Gundimada, Satyanadh; Asari, Vijayan K
2009-06-01
A feature selection technique along with an information fusion procedure for improving the recognition accuracy of a visual and thermal image-based facial recognition system is presented in this paper. A novel modular kernel eigenspaces approach is developed and implemented on the phase congruency feature maps extracted from the visual and thermal images individually. Smaller sub-regions from a predefined neighborhood within the phase congruency images of the training samples are merged to obtain a large set of features. These features are then projected into higher dimensional spaces using kernel methods. The proposed localized nonlinear feature selection procedure helps to overcome the bottlenecks of illumination variations, partial occlusions, expression variations and variations due to temperature changes that affect the visual and thermal face recognition techniques. AR and Equinox databases are used for experimentation and evaluation of the proposed technique. The proposed feature selection procedure has greatly improved the recognition accuracy for both the visual and thermal images when compared to conventional techniques. Also, a decision level fusion methodology is presented which along with the feature selection procedure has outperformed various other face recognition techniques in terms of recognition accuracy.
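The modular part of the scheme, learning a separate kernel eigenspace for each sub-region of the face and concatenating the projections, can be outlined as follows. This is a rough sketch under several assumptions: phase congruency preprocessing is omitted, image and block sizes are arbitrary, and sklearn's KernelPCA stands in for the paper's kernel eigenspace computation:

```python
import numpy as np
from sklearn.decomposition import KernelPCA

def block_views(images, block=16):
    # images: (n, H, W) -> one (n, block*block) matrix per sub-region
    n, H, W = images.shape
    return [images[:, r:r + block, c:c + block].reshape(n, -1)
            for r in range(0, H, block) for c in range(0, W, block)]

def fit_modular_kpca(train_images, n_components=10):
    # one kernel eigenspace per sub-region
    return [KernelPCA(n_components=n_components, kernel="rbf").fit(b)
            for b in block_views(train_images)]

def project(models, images):
    # concatenate per-region projections into one feature vector
    return np.hstack([m.transform(b) for m, b in zip(models, block_views(images))])

train = np.random.rand(40, 64, 64)     # stand-in for visual/thermal face images
models = fit_modular_kpca(train)
features = project(models, train)      # classify with e.g. nearest neighbour, then fuse
```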
Extracting semantics from audio-visual content: the final frontier in multimedia retrieval.
Naphade, M R; Huang, T S
2002-01-01
Multimedia understanding is a fast emerging interdisciplinary research area. There is tremendous potential for effective use of multimedia content through intelligent analysis. Diverse application areas are increasingly relying on multimedia understanding systems. Advances in multimedia understanding are related directly to advances in signal processing, computer vision, pattern recognition, multimedia databases, and smart sensors. We review the state-of-the-art techniques in multimedia retrieval. In particular, we discuss how multimedia retrieval can be viewed as a pattern recognition problem. We discuss how reliance on powerful pattern recognition and machine learning techniques is increasing in the field of multimedia retrieval. We review the state-of-the-art multimedia understanding systems with particular emphasis on a system for semantic video indexing centered around multijects and multinets. We discuss how semantic retrieval is centered around concepts and context and the various mechanisms for modeling concepts and context.
Debener, Stefan; Emkes, Reiner; Volkening, Nils; Fudickar, Sebastian; Bleichner, Martin G.
2017-01-01
Objective Our aim was the development and validation of a modular signal processing and classification application enabling online electroencephalography (EEG) signal processing on off-the-shelf mobile Android devices. The software application SCALA (Signal ProCessing and CLassification on Android) supports a standardized communication interface to exchange information with external software and hardware. Approach In order to implement a closed-loop brain-computer interface (BCI) on the smartphone, we used a multiapp framework, which integrates applications for stimulus presentation, data acquisition, data processing, classification, and delivery of feedback to the user. Main Results We have implemented the open source signal processing application SCALA. We present timing test results supporting sufficient temporal precision of audio events. We also validate SCALA with a well-established auditory selective attention paradigm and report above chance level classification results for all participants. Regarding the 24-channel EEG signal quality, evaluation results confirm typical sound onset auditory evoked potentials as well as cognitive event-related potentials that differentiate between correct and incorrect task performance feedback. Significance We present a fully smartphone-operated, modular closed-loop BCI system that can be combined with different EEG amplifiers and can easily implement other paradigms. PMID:29349070
Blum, Sarah; Debener, Stefan; Emkes, Reiner; Volkening, Nils; Fudickar, Sebastian; Bleichner, Martin G
2017-01-01
Our aim was the development and validation of a modular signal processing and classification application enabling online electroencephalography (EEG) signal processing on off-the-shelf mobile Android devices. The software application SCALA (Signal ProCessing and CLassification on Android) supports a standardized communication interface to exchange information with external software and hardware. In order to implement a closed-loop brain-computer interface (BCI) on the smartphone, we used a multiapp framework, which integrates applications for stimulus presentation, data acquisition, data processing, classification, and delivery of feedback to the user. We have implemented the open source signal processing application SCALA. We present timing test results supporting sufficient temporal precision of audio events. We also validate SCALA with a well-established auditory selective attention paradigm and report above chance level classification results for all participants. Regarding the 24-channel EEG signal quality, evaluation results confirm typical sound onset auditory evoked potentials as well as cognitive event-related potentials that differentiate between correct and incorrect task performance feedback. We present a fully smartphone-operated, modular closed-loop BCI system that can be combined with different EEG amplifiers and can easily implement other paradigms.
Birkun, Alexei; Glotov, Maksim; Ndjamen, Herman Franklin; Alaiye, Esther; Adeleke, Temidara; Samarin, Sergey
2018-01-01
BACKGROUND: To assess the effectiveness of the telephone chest-compression-only cardiopulmonary resuscitation (CPR) guided by a pre-recorded instructional audio when compared with dispatcher-assisted resuscitation. METHODS: It was a prospective, blind, randomised controlled study involving 109 medical students without previous CPR training. In a standardized mannequin scenario, after the step of dispatcher-assisted cardiac arrest recognition, the participants performed compression-only resuscitation guided over the telephone by either: (1) the pre-recorded instructional audio (n=57); or (2) verbal dispatcher assistance (n=52). The simulation video records were reviewed to assess the CPR performance using a 13-item checklist. The interval from call reception to the first compression, total number and rate of compressions, total number and duration of pauses after the first compression were also recorded. RESULTS: There were no significant differences between the recording-assisted and dispatcher-assisted groups based on the overall performance score (5.6±2.2 vs. 5.1±1.9, P>0.05) or individual criteria of the CPR performance checklist. The recording-assisted group demonstrated significantly shorter time interval from call receipt to the first compression (86.0±14.3 vs. 91.2±14.2 s, P<0.05), higher compression rate (94.9±26.4 vs. 89.1±32.8 min-1) and number of compressions provided (170.2±48.0 vs. 156.2±60.7). CONCLUSION: When provided by untrained persons in the simulated settings, the compression-only resuscitation guided by the pre-recorded instructional audio is no less efficient than dispatcher-assisted CPR. Future studies are warranted to further assess feasibility of using instructional audio aid as a potential alternative to dispatcher assistance.
A modular framework for biomedical concept recognition
2013-01-01
Background Concept recognition is an essential task in biomedical information extraction, presenting several complex and unsolved challenges. The development of such solutions is typically performed in an ad-hoc manner or using general information extraction frameworks, which are not optimized for the biomedical domain and normally require the integration of complex external libraries and/or the development of custom tools. Results This article presents Neji, an open source framework optimized for biomedical concept recognition built around four key characteristics: modularity, scalability, speed, and usability. It integrates modules for biomedical natural language processing, such as sentence splitting, tokenization, lemmatization, part-of-speech tagging, chunking and dependency parsing. Concept recognition is provided through dictionary matching and machine learning with normalization methods. Neji also integrates an innovative concept tree implementation, supporting overlapped concept names and respective disambiguation techniques. The most popular input and output formats, namely Pubmed XML, IeXML, CoNLL and A1, are also supported. On top of the built-in functionalities, developers and researchers can implement new processing modules or pipelines, or use the provided command-line interface tool to build their own solutions, applying the most appropriate techniques to identify heterogeneous biomedical concepts. Neji was evaluated against three gold standard corpora with heterogeneous biomedical concepts (CRAFT, AnEM and NCBI disease corpus), achieving high performance results on named entity recognition (F1-measure for overlap matching: species 95%, cell 92%, cellular components 83%, gene and proteins 76%, chemicals 65%, biological processes and molecular functions 63%, disorders 85%, and anatomical entities 82%) and on entity normalization (F1-measure for overlap name matching and correct identifier included in the returned list of identifiers: species 88%, cell 71%, cellular components 72%, gene and proteins 64%, chemicals 53%, and biological processes and molecular functions 40%). Neji provides fast and multi-threaded data processing, annotating up to 1200 sentences/second when using dictionary-based concept identification. Conclusions Considering the provided features and underlying characteristics, we believe that Neji is an important contribution to the biomedical community, streamlining the development of complex concept recognition solutions. Neji is freely available at http://bioinformatics.ua.pt/neji. PMID:24063607
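The dictionary-matching core of such a pipeline can be reduced to a toy longest-match lookup over tokenized text. This is a schematic illustration, not Neji's implementation; the lexicon entries and identifiers are invented:

```python
# Invented lexicon: surface forms mapped to concept identifiers
lexicon = {
    ("breast", "cancer"): "DOID:1612",
    ("tp53",): "HGNC:11998",
}
max_len = max(len(key) for key in lexicon)

def recognize(sentence):
    tokens = sentence.lower().split()
    i, concepts = 0, []
    while i < len(tokens):
        # try the longest candidate span first
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in lexicon:
                concepts.append((" ".join(span), lexicon[span]))
                i += n
                break
        else:
            i += 1
    return concepts

print(recognize("TP53 mutations are frequent in breast cancer"))
```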
Benchmarking multimedia performance
NASA Astrophysics Data System (ADS)
Zandi, Ahmad; Sudharsanan, Subramania I.
1998-03-01
With the introduction of faster processors and special instruction sets tailored to multimedia, a number of exciting applications are now feasible on the desktop. Among these is DVD playback, consisting, among other things, of MPEG-2 video and Dolby Digital or MPEG-2 audio. Other multimedia applications such as video conferencing and speech recognition are also becoming popular on computer systems. In view of this tremendous interest in multimedia, a group of major computer companies has formed the Multimedia Benchmarks Committee as part of the Standard Performance Evaluation Corporation (SPEC) to address the performance issues of multimedia applications. The approach is multi-tiered, with three tiers of fidelity from minimal to fully compliant. In each case the fidelity of the bitstream reconstruction as well as the quality of the video or audio output are measured, and the system is classified accordingly. At the next step the performance of the system is measured. Many multimedia applications, such as DVD playback, must run at a specific rate; in such cases the measurement of excess processing power makes all the difference. All of this makes a system-level, application-based multimedia benchmark very challenging. Several ideas and methodologies for each aspect of the problem will be presented and analyzed.
Automatic concept extraction from spoken medical reports.
Happe, André; Pouliquen, Bruno; Burgun, Anita; Cuggia, Marc; Le Beux, Pierre
2003-07-01
The objective of this project is to investigate methods whereby a combination of speech recognition and automated indexing can substitute for current transcription and indexing practices. We based our study on existing speech recognition software programs and on NOMINDEX, a tool that extracts MeSH concepts from medical text in natural language and that is mainly based on a French medical lexicon and on the UMLS. For each document, the process consists of three steps: (1) dictation and digital audio recording, (2) speech recognition, (3) automatic indexing. The evaluation consisted of a comparison between the set of concepts extracted by NOMINDEX after the speech recognition phase and the set of keywords manually extracted from the initial document. The method was evaluated on a set of 28 patient discharge summaries extracted from the MENELAS corpus in French, corresponding to in-patients admitted for coronarography. The overall precision was 73% and the overall recall was 90%. Indexing errors were mainly due to word sense ambiguity and abbreviations. A specific issue was the fact that the standard French translation of MeSH terms lacks diacritics. A preliminary evaluation of speech recognition tools showed that the rate of accurate recognition was higher than 98%. Only 3% of the indexing errors were generated by inadequate speech recognition. We discuss several areas to focus on to improve this prototype. However, the very low rate of indexing errors due to speech recognition errors highlights the potential benefits of combining speech recognition techniques and automatic indexing.
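The evaluation step amounts to set overlap between automatically extracted concepts and manually assigned keywords. A minimal sketch with invented concept sets:

```python
def precision_recall(auto, manual):
    true_positives = len(auto & manual)
    precision = true_positives / len(auto) if auto else 0.0
    recall = true_positives / len(manual) if manual else 0.0
    return precision, recall

auto = {"coronary angiography", "myocardial infarction", "aspirin"}        # invented
manual = {"coronary angiography", "myocardial infarction", "hypertension"}  # invented
print(precision_recall(auto, manual))    # approximately (0.67, 0.67) here
```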
Wolfe, Jace; Morais Duke, Mila; Schafer, Erin; Cire, George; Menapace, Christine; O'Neill, Lori
2016-01-01
The objective of this study was to evaluate the potential improvement in word recognition in quiet and in noise obtained with use of a Bluetooth-compatible wireless hearing assistance technology (HAT) relative to the acoustic mobile telephone condition (e.g. the mobile telephone receiver held to the microphone of the sound processor). A two-way repeated measures design was used to evaluate differences in telephone word recognition obtained in quiet and in competing noise in the acoustic mobile telephone condition compared to performance obtained with use of the CI sound processor and a telephone HAT. Sixteen adult users of Nucleus cochlear implants and the Nucleus 6 sound processor were included in this study. Word recognition over the mobile telephone in quiet and in noise was significantly better with use of the wireless HAT compared to performance in the acoustic mobile telephone condition. Word recognition over the mobile telephone was better in quiet when compared to performance in noise. The results of this study indicate that use of a wireless HAT improves word recognition over the mobile telephone in quiet and in noise relative to performance in the acoustic mobile telephone condition for a group of adult cochlear implant recipients.
Real-Time Reconfigurable Adaptive Speech Recognition Command and Control Apparatus and Method
NASA Technical Reports Server (NTRS)
Salazar, George A. (Inventor); Haynes, Dena S. (Inventor); Sommers, Marc J. (Inventor)
1998-01-01
An adaptive speech recognition and control system, and associated method, for controlling various mechanisms and systems in response to spoken instructions is discussed; spoken commands direct the system to the appropriate memory nodes and to the memory templates corresponding to the voiced command. Spoken commands from any of a group of operators for which the system is trained may be identified, and voice templates are updated as required in response to changes in pronunciation and voice characteristics of any of those operators over time. Provisions are made both for near-real-time retraining of the system with respect to individual terms which are determined not to be positively identified, and for an overall system training and updating process in which recognition of each command and vocabulary term is checked, and in which the memory templates are retrained if necessary for respective commands or vocabulary terms with respect to the operator currently using the system. In one embodiment, the system includes input circuitry connected to a microphone and including signal-processing and control sections for sensing the level of vocabulary recognition over a given period and, if recognition performance falls below a given level, processing audio-derived signals to enhance the recognition performance of the system.
Recognition as a patient-centered medical home: fundamental or incidental?
Dohan, Daniel; McCuistion, Mary Honodel; Frosch, Dominick L; Hung, Dorothy Y; Tai-Seale, Ming
2013-01-01
Little is known about reasons why a medical group would seek recognition as a patient-centered medical home (PCMH). We examined the motivations for seeking recognition in one group and assessed why the group allowed recognition to lapse 3 years later. As part of a larger mixed methods case study, we conducted 38 key informant interviews with executives, clinicians, and front-line staff. Interviews were conducted according to a guide that evolved during the project and were audio-recorded and fully transcribed. Transcripts were analyzed and thematically coded. PCMH principles were consistent with the organization's culture and mission, which valued innovation and putting patients first. Motivations for implementing specific PCMH components varied; some components were seen as part of the organization's patient-centered culture, whereas others helped the practice compete in its local market. Informants consistently reported that National Committee for Quality Assurance recognition arose incidentally because of a 1-time incentive from a local group of large employers and because the organization decided to allocate some organizational resources to respond to the complex reporting requirements for about one-half of its clinics. Becoming patient centered and seeking recognition as such ran along separate but parallel tracks within this organization. As the Affordable Care Act continues to focus attention on primary care redesign, this apparent disconnect should be borne in mind.
Multisensory emotion perception in congenitally, early, and late deaf CI users.
Fengler, Ineke; Nava, Elena; Villwock, Agnes K; Büchner, Andreas; Lenarz, Thomas; Röder, Brigitte
2017-01-01
Emotions are commonly recognized by combining auditory and visual signals (i.e., vocal and facial expressions). Yet it is unknown whether the ability to link emotional signals across modalities depends on early experience with audio-visual stimuli. In the present study, we investigated the role of auditory experience at different stages of development for auditory, visual, and multisensory emotion recognition abilities in three groups of adolescent and adult cochlear implant (CI) users. CI users had a different deafness onset and were compared to three groups of age- and gender-matched hearing control participants. We hypothesized that congenitally deaf (CD) but not early deaf (ED) and late deaf (LD) CI users would show reduced multisensory interactions and a higher visual dominance in emotion perception than their hearing controls. The CD (n = 7), ED (deafness onset: <3 years of age; n = 7), and LD (deafness onset: >3 years; n = 13) CI users and the control participants performed an emotion recognition task with auditory, visual, and audio-visual emotionally congruent and incongruent nonsense speech stimuli. In different blocks, participants judged either the vocal (Voice task) or the facial expressions (Face task). In the Voice task, all three CI groups performed overall less efficiently than their respective controls and experienced higher interference from incongruent facial information. Furthermore, the ED CI users benefitted more than their controls from congruent faces and the CD CI users showed an analogous trend. In the Face task, recognition efficiency of the CI users and controls did not differ. Our results suggest that CI users acquire multisensory interactions to some degree, even after congenital deafness. When judging affective prosody they appear impaired and more strongly biased by concurrent facial information than typically hearing individuals. We speculate that limitations inherent to the CI contribute to these group differences.
Intentional Voice Command Detection for Trigger-Free Speech Interface
NASA Astrophysics Data System (ADS)
Obuchi, Yasunari; Sumiyoshi, Takashi
In this paper we introduce a new framework of audio processing, which is essential to achieve a trigger-free speech interface for home appliances. If the speech interface works continually in real environments, it must extract occasional voice commands and reject everything else. It is extremely important to reduce the number of false alarms because the number of irrelevant inputs is much larger than the number of voice commands even for heavy users of appliances. The framework, called Intentional Voice Command Detection, is based on voice activity detection, but enhanced by various speech/audio processing techniques such as emotion recognition. The effectiveness of the proposed framework is evaluated using a newly-collected large-scale corpus. The advantages of combining various features were tested and confirmed, and the simple LDA-based classifier demonstrated acceptable performance. The effectiveness of various methods of user adaptation is also discussed.
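The final classification stage described, a simple LDA classifier over a combination of per-segment features, is easy to mock up. The feature set and data below are synthetic stand-ins, not the paper's corpus or features:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
# rows: audio segments; columns: assumed per-segment features (energy, pitch, ...)
X_command = rng.normal(loc=1.0, size=(200, 4))
X_other = rng.normal(loc=0.0, size=(800, 4))
X = np.vstack([X_command, X_other])
y = np.array([1] * 200 + [0] * 800)      # 1 = intentional command, 0 = everything else

clf = LinearDiscriminantAnalysis().fit(X, y)
print("training accuracy:", clf.score(X, y))
```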
Concept recognition for extracting protein interaction relations from biomedical text
Baumgartner, William A; Lu, Zhiyong; Johnson, Helen L; Caporaso, J Gregory; Paquette, Jesse; Lindemann, Anna; White, Elizabeth K; Medvedeva, Olga; Cohen, K Bretonnel; Hunter, Lawrence
2008-01-01
Background: Reliable information extraction applications have been a long-sought goal of the biomedical text mining community, a goal that, if reached, would provide valuable tools to benchside biologists in their increasingly difficult task of assimilating the knowledge contained in the biomedical literature. We present an integrated approach to concept recognition in biomedical text. Concept recognition provides key information that has been largely missing from previous biomedical information extraction efforts, namely direct links to well-defined knowledge resources that explicitly cement the concept's semantics. The BioCreative II tasks discussed in this special issue have provided a unique opportunity to demonstrate the effectiveness of concept recognition in the field of biomedical language processing. Results: Through the modular construction of a protein interaction relation extraction system, we present several use cases of concept recognition in biomedical text, and relate these use cases to potential uses by the benchside biologist. Conclusion: Current information extraction technologies are approaching performance standards at which concept recognition can begin to deliver high quality data to the benchside biologist. Our system is available as part of the BioCreative Meta-Server project and on the internet. PMID:18834500
Sequence Discrimination by Alternatively Spliced Isoforms of a DNA Binding Zinc Finger Domain
NASA Astrophysics Data System (ADS)
Gogos, Joseph A.; Hsu, Tien; Bolton, Jesse; Kafatos, Fotis C.
1992-09-01
Two major developmentally regulated isoforms of the Drosophila chorion transcription factor CF2 differ by an extra zinc finger within the DNA binding domain. The preferred DNA binding sites were determined and are distinguished by an internal duplication of TAT in the site recognized by the isoform with the extra finger. The results are consistent with modular interactions between zinc fingers and trinucleotides and also suggest rules for recognition of AT-rich DNA sites by zinc finger proteins. The results show how modular finger interactions with trinucleotides can be used, in conjunction with alternative splicing, to alter the binding specificity and increase the spectrum of sites recognized by a DNA binding domain. Thus, CF2 may potentially regulate distinct sets of target genes during development.
Zhu, Jing; Wang, Lei; Xu, Xiaowen; Wei, Haiping; Jiang, Wei
2016-04-05
Here, we explored a modular strategy for rational design of nuclease-responsive three-way junctions (TWJs) and fabricated a dynamic DNA device in a "plug-and-play" fashion. First, inactivated TWJs were designed, which contained three functional domains: the inaccessible toehold and branch migration domains, the specific sites of nucleases, and the auxiliary complementary sequence. The actions of different nucleases on their specific sites in TWJs caused the close proximity of the same toehold and branch migration domains, resulting in the activation of the TWJs and the formation of a universal trigger for the subsequent dynamic assembly. Second, two hairpins (H1 and H2) were introduced, which could initially coexist in a metastable state, to act as the components for the dynamic assembly. Once the trigger initiated the opening of H1 via TWJ-driven strand displacement, the cascade hybridization of hairpins immediately switched on, resulting in the formation of concatemers of the H1/H2 complex appending numerous integrated G-quadruplexes, which were used to obtain label-free signal readout. The inherent modularity of this design allowed us to fabricate a flexible dynamic DNA device and detect multiple nucleases by altering the recognition pattern slightly. Taking uracil-DNA glycosylase and CpG methyltransferase M.SssI as models, we successfully coupled the uracil-DNA glycosylase and M.SssI recognition events to the dynamic assembly process. Furthermore, we achieved ultrasensitive assays of nuclease activity and inhibitor screening. The DNA device proposed here will offer an adaptive and flexible tool for clinical diagnosis and anticancer drug discovery.
Construction and updating of event models in auditory event processing.
Huff, Markus; Maurer, Annika E; Brich, Irina; Pagenkopf, Anne; Wickelmaier, Florian; Papenmeier, Frank
2018-02-01
Humans segment the continuous stream of sensory information into distinct events at points of change. Between 2 events, humans perceive an event boundary. Present theories propose that changes in the sensory information trigger updating processes of the current event model. Increased encoding effort finally leads to a memory benefit at event boundaries. Evidence from reading time studies (increased reading times with increasing amount of change) suggests that updating of event models is incremental. We present results from 5 experiments that studied event processing (including memory formation processes and reading times) using an audio drama as well as a transcript thereof as stimulus material. Experiments 1a and 1b replicated the event boundary advantage effect for memory. In contrast to recent evidence from studies using visual stimulus material, Experiments 2a and 2b found no support for incremental updating with normally sighted and blind participants for recognition memory. In Experiment 3, we replicated Experiment 2a using a written transcript of the audio drama as stimulus material, allowing us to disentangle encoding and retrieval processes. Our results indicate incremental updating processes at encoding (as measured with reading times). At the same time, we again found recognition performance to be unaffected by the amount of change. We discuss these findings in light of current event cognition theories. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Performance evaluation of wavelet-based face verification on a PDA recorded database
NASA Astrophysics Data System (ADS)
Sellahewa, Harin; Jassim, Sabah A.
2006-05-01
The rise of international terrorism and the rapid increase in fraud and identity theft have added urgency to the task of developing biometric-based person identification as a reliable alternative to conventional authentication methods. Human identification based on face images is a tough challenge in comparison to identification based on fingerprints or iris recognition. Yet, due to its unobtrusive nature, face recognition is the preferred method of identification for security-related applications. The success of such systems will depend on the support of massive infrastructures. Current mobile communication devices (3G smart phones) and PDAs are equipped with a camera which can capture both still images and streaming video clips, and a touch-sensitive display panel. Besides convenience, such devices provide an adequate secure infrastructure for sensitive and financial transactions by protecting against fraud and repudiation while ensuring accountability. Biometric authentication systems for mobile devices would have obvious advantages in conflict scenarios when communication from beyond enemy lines is essential to save soldier and civilian lives. In areas of conflict or disaster the luxury of fixed infrastructure is not available or is destroyed. In this paper, we present a wavelet-based face verification scheme that has been specifically designed and implemented on a currently available PDA. We report on its performance on the benchmark audio-visual BANCA database and on a newly developed PDA-recorded audio-visual database that includes indoor and outdoor recordings.
Ultraino: An Open Phased-Array System for Narrowband Airborne Ultrasound Transmission.
Marzo, Asier; Corkett, Tom; Drinkwater, Bruce W
2018-01-01
Modern ultrasonic phased-array controllers are electronic systems capable of delaying the transmitted or received signals of multiple transducers. Configurable transmit-receive array systems, capable of electronic steering and shaping of the beam in near real-time, are available commercially, for example, for medical imaging. However, emerging applications, such as ultrasonic haptics, parametric audio, or ultrasonic levitation, require only a small subset of the capabilities provided by the existing controllers. To meet this need, we present Ultraino, a modular, inexpensive, and open platform that provides hardware, software, and example applications specifically aimed at controlling the transmission of narrowband airborne ultrasound. Our system is composed of software, driver boards, and arrays that enable users to quickly and efficiently perform research in various emerging applications. The software can be used to define array geometries, simulate the acoustic field in real time, and control the connected driver boards. The driver board design is based on an Arduino Mega and can control 64 channels with a square wave of up to 17 Vpp and π/5 phase resolution. Multiple boards can be chained together to increase the number of channels. The 40-kHz arrays with flat and spherical geometries are demonstrated for parametric audio generation, acoustic levitation, and haptic feedback.
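The basic physics such driver boards implement is simple: to focus the beam, each transducer is driven with a phase that compensates for its path length to the focal point, then quantized to the hardware's phase resolution. A sketch with an assumed 8 x 8, 10 mm pitch array geometry and an arbitrary focal point:

```python
import numpy as np

f = 40e3                                   # driving frequency (Hz)
c = 343.0                                  # speed of sound in air (m/s)
k = 2 * np.pi * f / c                      # wavenumber

# 8x8 transducer grid with 10 mm pitch in the z = 0 plane (assumed geometry)
xs = (np.arange(8) - 3.5) * 0.01
px, py = np.meshgrid(xs, xs)
positions = np.stack([px.ravel(), py.ravel(), np.zeros(64)], axis=1)

focus = np.array([0.0, 0.0, 0.10])         # focal point 10 cm above the array
d = np.linalg.norm(positions - focus, axis=1)
phases = (-k * d) % (2 * np.pi)            # emission phases that align at the focus

# Quantize to the pi/5 phase resolution mentioned above
step = np.pi / 5
quantized = np.round(phases / step) * step
```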
Characterizing the role benthos plays in large coastal seas and estuaries: A modular approach
Tenore, K.R.; Zajac, R.N.; Terwin, J.; Andrade, F.; Blanton, J.; Boynton, W.; Carey, D.; Diaz, R.; Holland, Austin F.; Lopez-Jamar, E.; Montagna, P.; Nichols, F.; Rosenberg, R.; Queiroga, H.; Sprung, M.; Whitlatch, R.B.
2006-01-01
Ecologists studying coastal and estuarine benthic communities have long taken a macroecological view, by relating benthic community patterns to environmental factors across several spatial scales. Although many general ecological patterns have been established, often a significant amount of the spatial and temporal variation in soft-sediment communities within and among systems remains unexplained. Here we propose a framework that may aid in unraveling the complex influence of environmental factors associated with the different components of coastal systems (i.e. the terrestrial and benthic landscapes, and the hydrological seascape) on benthic communities, and use this information to assess the role played by benthos in coastal ecosystems. A primary component of the approach is the recognition of system modules (e.g. marshes, dendritic systems, tidal rivers, enclosed basins, open bays, lagoons). The modules may differentially interact with key forcing functions (e.g. temperature, salinity, currents) that influence system processes and in turn benthic responses and functions. Modules may also constrain benthic characteristics and related processes within certain ecological boundaries and help explain their overall spatio-temporal variation. We present an example of how benthic community characteristics are related to the modular structure of 14 coastal seas and estuaries, and show that benthic functional group composition is significantly related to the modular structure of these systems. We also propose a framework for exploring the role of benthic communities in coastal systems using this modular approach and offer predictions of how benthic communities may vary depending on the modular composition and characteristics of a coastal system. © 2006 Elsevier B.V. All rights reserved.
Engineering Translational Activators with CRISPR-Cas System.
Du, Pei; Miao, Chensi; Lou, Qiuli; Wang, Zefeng; Lou, Chunbo
2016-01-15
RNA parts often serve as critical components in genetic engineering. Here we report the design of translational activators composed of an RNA endoribonuclease (Csy4) and two exchangeable RNA modules. Csy4, a member of the Cas endoribonuclease family, cleaves at a specific recognition site; this cleavage releases a cis-repressive RNA module (crRNA) from the masked ribosome binding site (RBS), which subsequently allows downstream translation initiation. Unlike a small RNA acting as a translational activator, the endoribonuclease-based activator is able to efficiently unfold the perfect RBS-crRNA pairing. As an exchangeable module, the crRNA-RBS duplex was forwardly and reversely engineered to modulate the dynamic range of translational activity. We further showed that Csy4 and its recognition site, together as a module, can also be replaced by orthogonal endoribonuclease-recognition site homologues. These modularly structured, high-performance translational activators should make programming gene expression at the translational level considerably more feasible.
Lozano-Diez, Alicia; Zazo, Ruben; Toledano, Doroteo T; Gonzalez-Rodriguez, Joaquin
2017-01-01
Language recognition systems based on bottleneck features have recently become the state-of-the-art in this research field, as shown by their success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based on a deep neural network (DNN) trained to discriminate between phonetic units, i.e. trained for the task of automatic speech recognition (ASR). This DNN aims to compress information in one of its layers, known as the bottleneck (BN) layer, which is used to obtain a new frame representation of the audio signal. This representation has been proven to be useful for the task of language identification (LID). Thus, bottleneck features are used as input to the language recognition system, instead of a classical parameterization of the signal based on cepstral feature vectors such as MFCCs (Mel Frequency Cepstral Coefficients). Despite the success of this approach in language recognition, there is a lack of studies analyzing in a systematic way how the topology of the DNN influences the performance of bottleneck feature-based language recognition systems. In this work, we try to fill in this gap, analyzing language recognition results with different topologies for the DNN used to extract the bottleneck features, comparing them with each other and against a reference system based on a more classical cepstral representation of the input signal with a total variability model. In this way, we obtain useful knowledge about how the DNN configuration influences the performance of bottleneck feature-based language recognition systems.
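Bottleneck feature extraction of the kind described can be prototyped with a standard deep-learning toolkit: train a frame classifier with one narrow hidden layer, then read that layer's activations as the new representation. A schematic Keras sketch with random stand-in data; the layer sizes and the 40-unit bottleneck are assumptions:

```python
import numpy as np
from tensorflow import keras

n_frames, feat_dim, n_phones = 1000, 39, 40    # e.g. MFCCs with deltas per frame
X = np.random.rand(n_frames, feat_dim).astype("float32")
y = np.random.randint(0, n_phones, size=n_frames)

inputs = keras.Input(shape=(feat_dim,))
h = keras.layers.Dense(512, activation="relu")(inputs)
bottleneck = keras.layers.Dense(40, name="bn")(h)        # narrow linear layer
h2 = keras.layers.Dense(512, activation="relu")(bottleneck)
outputs = keras.layers.Dense(n_phones, activation="softmax")(h2)

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, epochs=2, batch_size=64, verbose=0)      # stands in for ASR training

# Re-use the trained layers up to "bn" to turn frames into bottleneck features
bn_extractor = keras.Model(inputs, model.get_layer("bn").output)
bn_feats = bn_extractor.predict(X, verbose=0)            # input to the LID back-end
```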
Thin-Membrane Sensor With Biochemical Switch
NASA Technical Reports Server (NTRS)
Case, George D.; Worley, Jennings F.
1992-01-01
Modular sensor electrochemically detects chemical or biological agent, indicating presence of agent via gate-membrane-crossing ion current triggered by chemical reaction between agent and recognition protein conjugated to channel blocker. Used in such laboratory, industrial, or field applications as detection of bacterial toxins in food, military chemical agents in air, and pesticides or other contaminants in environment. Also used in biological screening for hepatitis, acquired immune-deficiency syndrome, and the like.
Exploring the repeat protein universe through computational protein design
Brunette, TJ; Parmeggiani, Fabio; Huang, Po-Ssu; ...
2015-12-16
A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. In this paper, we use computational protein design to investigate the space of folded structures that can be generated by tandem repeating a simple helix–loop–helix–loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models, with root mean square deviations ranging from 0.7 to 2.5 Å. Finally, our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering.
Hutchinson, David; Bradley, Samuel D
2009-03-01
In the recent United States-led "war on terror," including ongoing engagements in Iraq and Afghanistan, news organizations have been accused of showing a negative view of developments on the ground. In particular, news depictions of casualties have brought accusations of anti-Americanism and aiding and abetting the terrorists' cause. In this study, video footage of war from television news stories was manipulated to investigate the effects of negative compelling images on cognitive resource allocation, physiological arousal, and recognition memory. Results of a within-subjects experiment indicate that negatively valenced depictions of casualties and destruction elicit greater attention and physiological arousal than positive and low-intensity images. Recognition memory for visual information in the graphic negative news condition was highest, whereas audio recognition for this condition was lowest. The results suggest that negative, high-intensity video imagery diverts cognitive resources away from the encoding of verbal information in the newscast, positioning visual images and not the spoken narrative as a primary channel of viewer learning.
NASA Astrophysics Data System (ADS)
Cerwin, Steve; Barnes, Julie; Kell, Scott; Walters, Mark
2003-09-01
This paper describes development and application of a novel method to accomplish real-time solid-angle acoustic direction finding using two 8-element orthogonal microphone arrays. The developed prototype system was intended for localization and signature recognition of ground-based sounds from a small UAV. Recent advances in computer speeds have enabled the implementation of microphone arrays in many audio applications. Still, the real-time presentation of a two-dimensional sound field for the purpose of audio target localization is computationally challenging. In order to overcome this challenge, a crosspower spectrum phase (CSP) technique was applied to each 8-element arm of a 16-element cross array to provide audio target localization. In this paper, we describe the technique and compare it with two other commonly used techniques, the Cross-Spectral Matrix and MUSIC. The results show that the CSP technique applied to two 8-element orthogonal arrays provides a computationally efficient solution with reasonable accuracy and tolerable artifacts, sufficient for real-time applications. Additional topics include development of a synchronized 16-channel transmitter and receiver to relay the airborne data to the ground-based processor and presentation of test data demonstrating both ground-mounted operation and airborne localization of ground-based gunshots and loud engine sounds.
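The crosspower spectrum phase technique is closely related to what is now commonly called GCC-PHAT: whiten the cross-power spectrum of two microphone signals so that only phase information remains, then locate the correlation peak to estimate the time difference of arrival. A compact NumPy sketch on synthetic signals, for illustration only:

```python
import numpy as np

def csp_tdoa(x1, x2, fs):
    """Estimate the time difference of arrival between two signals."""
    n = len(x1) + len(x2)
    X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12        # phase transform: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

fs = 16000
src = np.random.randn(fs)                 # one second of noise as a test source
delay = 25                                # inter-microphone delay in samples
x1, x2 = src[delay:], src[:-delay]
print(csp_tdoa(x1, x2, fs) * fs)          # roughly +/-25, depending on sign convention
```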
1990-03-01
are linked together so that a user can easily move from one to another." ([Ref. 2], Doc. #1522) Music, audio, and other signals can be added to the ... videodisc player, starting a video presentation, complete with music, highlighting the benefits of hypermedia to the company's information needs ... Application areas include: entertainment; travel; multi-language applications; real estate; retail kiosks and information booths; landscaping and design ...
NASA Astrophysics Data System (ADS)
Lozano-Vega, Gildardo; Benezeth, Yannick; Marzani, Franck; Boochs, Frank
2014-09-01
Accurate recognition of airborne pollen taxa is crucial for understanding and treating allergic diseases, which affect a significant proportion of the world population. Modern computer vision techniques enable the detection of discriminant characteristics. Apertures are among the important characteristics that have not been adequately explored until now. A flexible method for the detection, localization, and counting of apertures of different pollen taxa with varying appearances is proposed. Aperture description is based on primitive images following the bag-of-words strategy. A confidence map is estimated based on the classification of sampled regions. The method is designed to be extended modularly to new aperture types, employing the same algorithm by building individual classifiers. The method was evaluated on the top five allergenic pollen taxa in Germany, and its robustness to unseen particles was verified.
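A bag-of-words pipeline of this general shape can be outlined with standard components: cluster local descriptors into a visual vocabulary, encode each sampled region as a word histogram, and train one classifier per aperture type whose probabilities feed a confidence map. The descriptors below are random stand-ins for real patch features:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)
train_descriptors = rng.normal(size=(5000, 32))    # stand-in local patch descriptors
vocab = KMeans(n_clusters=50, n_init=4).fit(train_descriptors)

def encode(region_descriptors):
    # histogram of visual-word occurrences in one sampled region
    words = vocab.predict(region_descriptors)
    return np.bincount(words, minlength=50) / len(words)

# one histogram per sampled region, labelled aperture (1) or background (0)
X = np.vstack([encode(rng.normal(size=(80, 32))) for _ in range(100)])
y = rng.integers(0, 2, size=100)
clf = SVC(probability=True).fit(X, y)    # region probabilities feed a confidence map
```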
Wolfe, Jace; Morais, Mila; Schafer, Erin
2016-02-01
The goals of the present investigation were (1) to evaluate recognition of recorded speech presented over a mobile telephone for a group of adult bimodal cochlear implant users, and (2) to measure the potential benefits of wireless hearing assistance technology (HAT) for mobile telephone speech recognition using bimodal stimulation (i.e., a cochlear implant in one ear and a hearing aid on the other ear). A three-by-two repeated-measures design was used to evaluate mobile telephone sentence-recognition performance differences obtained in quiet and in noise with and without the wireless HAT accessory coupled to the hearing aid alone, CI sound processor alone, and in the bimodal condition. Outpatient cochlear implant clinic. Sixteen bimodal users with Nucleus 24, Freedom, CI512, or CI422 cochlear implants participated in this study. Performance was measured with and without the use of a wireless HAT for the telephone used with the hearing aid alone, CI alone, and bimodal condition. CNC word recognition in quiet and in noise with and without the use of a wireless HAT telephone accessory in the hearing aid alone, CI alone, and bimodal conditions. Results suggested that the bimodal condition gave significantly better speech recognition on the mobile telephone with the wireless HAT. A wireless HAT for the mobile telephone provides bimodal users with significant improvement in word recognition in quiet and in noise over the mobile telephone.
Modular Neural Networks for Speech Recognition.
1996-08-01
automatic speech recognition, understanding and translation since the early 1950s. Although researchers have demonstrated impressive results with ... nodes. It serves only as a data source for the following hidden layer(s). Finally, the network's output is computed by neurons in the output layer. The ... following update rule for weights in the hidden layer: w_ji^(n+1) = w_ji^(n) - η ∂E/∂w_ji (learning rate η, error E). It is easy to generalize the backpropagation
Scholze, Heidi; Boch, Jens
2010-01-01
TAL effectors are important virulence factors of bacterial plant pathogenic Xanthomonas, which infect a wide variety of plants including valuable crops like pepper, rice, and citrus. TAL proteins are translocated via the bacterial type III secretion system into host cells and induce transcription of plant genes by binding to target gene promoters. Members of the TAL effector family differ mainly in their central domain of tandemly arranged repeats of typically 34 amino acids each with hypervariable di-amino acids at positions 12 and 13. We recently showed that target DNA-recognition specificity of TAL effectors is encoded in a modular and clearly predictable mode. The repeats of TAL effectors feature a surprising one repeat-to-one-bp correlation with different repeat types exhibiting a different DNA base pair specificity. Accordingly, we predicted DNA specificities of TAL effectors and generated artificial TAL proteins with novel DNA recognition specificities. We describe here novel artificial TALs and discuss implications for the DNA recognition specificity. The unique TAL-DNA binding domain allows design of proteins with potentially any given DNA recognition specificity enabling many uses for biotechnology.
Kim, Min-Beom; Chung, Won-Ho; Choi, Jeesun; Hong, Sung Hwa; Cho, Yang-Sun; Park, Gyuseok; Lee, Sangmin
2014-06-01
The objective was to evaluate speech perception improvement through Bluetooth-implemented hearing aids in hearing-impaired adults. Thirty subjects with bilateral symmetric moderate sensorineural hearing loss participated in this study. A Bluetooth-implemented hearing aid was fitted unilaterally in all study subjects. Objective speech recognition scores and subjective satisfaction were measured with a Bluetooth-implemented hearing aid used to replace the acoustic connection from either a cellular phone or a loudspeaker system. In each system, participants were assigned to four conditions: wireless speech signal transmission into the hearing aid (wireless mode) in a quiet or noisy environment, and conventional speech signal transmission using the external microphone of the hearing aid (conventional mode) in a quiet or noisy environment. Participants also completed questionnaires to investigate subjective satisfaction. In both the cellular phone and loudspeaker system situations, participants showed improvements in sentence and word recognition scores with the wireless mode compared to the conventional mode in both quiet and noise conditions (P < .001). Participants also reported subjective improvements, including better sound quality, less noise interference, and better accuracy and naturalness, when using the wireless mode (P < .001). Bluetooth-implemented hearing aids helped to improve subjective and objective speech recognition performance in quiet and noisy environments during the use of electronic audio devices.
Large-Scale Pattern Discovery in Music
NASA Astrophysics Data System (ADS)
Bertin-Mahieux, Thierry
This work focuses on extracting patterns in musical data from very large collections. The problem is split into two parts. First, we build such a large collection, the Million Song Dataset, to provide researchers access to commercial-size datasets. Second, we use this collection to study cover song recognition, which involves finding harmonic patterns from audio features. Regarding the Million Song Dataset, we detail how we built the original collection from an online API, and how we encouraged other organizations to participate in the project. The result is the largest research dataset with heterogeneous sources of data available to music technology researchers. We demonstrate some of its potential and discuss the impact it has already had on the field. On cover song recognition, we must revisit the existing literature, since there are no publicly available results on a dataset of more than a few thousand entries. We present two solutions to tackle the problem, one using a hashing method, and one using a higher-level feature computed from the chromagram (dubbed the 2DFTM). We further investigate the 2DFTM, since it has the potential to be a relevant representation for any task involving audio harmonic content. Finally, we discuss the future of the dataset and the hope of seeing more work making use of the different sources of data that are linked in the Million Song Dataset. Regarding cover songs, we explain how this might be a first step towards defining a harmonic manifold of music, a space where harmonic similarities between songs would be more apparent.
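A minimal sketch of a 2DFTM-style feature is given below: the magnitude of the 2-D Fourier transform of fixed-length chromagram patches, averaged into one song-level descriptor. The patch length and averaging are illustrative assumptions rather than the thesis's exact recipe.

```python
# Hedged sketch of a 2DFTM-style feature: |FFT2| of chromagram patches is
# invariant to pitch rotation and time shift, which suits cover-song matching.
import numpy as np

def two_dftm(chroma, patch_len=75):
    """chroma: (12, T) chromagram; returns an averaged |FFT2| descriptor."""
    feats = []
    for start in range(0, chroma.shape[1] - patch_len + 1, patch_len):
        patch = chroma[:, start:start + patch_len]
        feats.append(np.abs(np.fft.fft2(patch)))
    return np.mean(feats, axis=0).ravel()  # fixed-size song-level descriptor

chroma = np.random.rand(12, 300)
print(two_dftm(chroma).shape)  # (12 * 75,) = (900,)
```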
Automated Assessment of Child Vocalization Development Using LENA.
Richards, Jeffrey A; Xu, Dongxin; Gilkerson, Jill; Yapanel, Umit; Gray, Sharmistha; Paul, Terrance
2017-07-12
To produce a novel, efficient measure of children's expressive vocal development on the basis of automatic vocalization assessment (AVA), child vocalizations were automatically identified and extracted from audio recordings using Language Environment Analysis (LENA) System technology. Assessment was based on full-day audio recordings collected in a child's unrestricted, natural language environment. AVA estimates were derived using automatic speech recognition modeling techniques to categorize and quantify the sounds in child vocalizations (e.g., protophones and phonemes). These were expressed as phone and biphone frequencies, reduced to principal components, and inputted to age-based multiple linear regression models to predict independently collected criterion-expressive language scores. From these models, we generated vocal development AVA estimates as age-standardized scores and development age estimates. AVA estimates demonstrated strong statistical reliability and validity when compared with standard criterion expressive language assessments. Automated analysis of child vocalizations extracted from full-day recordings in natural settings offers a novel and efficient means to assess children's expressive vocal development. More research remains to identify specific mechanisms of operation.
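The modeling chain described here (phone and biphone frequencies, reduced to principal components, regressed onto criterion expressive-language scores) can be sketched as follows; all dimensions and data are synthetic stand-ins, not LENA's production configuration.

```python
# Hedged sketch of the AVA scoring idea: per-child phone/biphone frequencies
# -> PCA -> linear regression onto criterion expressive-language scores.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
n_children, n_phone_feats = 120, 400      # e.g. phone + biphone counts
X = rng.poisson(3.0, size=(n_children, n_phone_feats)).astype(float)
X /= X.sum(axis=1, keepdims=True)         # per-child relative frequencies
criterion_scores = rng.normal(100, 15, size=n_children)  # criterion scores

model = make_pipeline(PCA(n_components=20), LinearRegression())
model.fit(X, criterion_scores)
ava_estimate = model.predict(X[:1])       # predicted developmental score
print(float(ava_estimate))
```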
Dura-Bernal, Salvador; Garreau, Guillaume; Georgiou, Julius; Andreou, Andreas G; Denham, Susan L; Wennekers, Thomas
2013-10-01
The ability to recognize the behavior of individuals is of great interest in the general field of safety (e.g. building security, crowd control, transport analysis, independent living for the elderly). Here we report a new real-time acoustic system for human action and behavior recognition that integrates passive audio and active micro-Doppler sonar signatures over multiple time scales. The system architecture is based on a six-layer convolutional neural network, trained and evaluated using a dataset of 10 subjects performing seven different behaviors. Probabilistic combination of system output through time for each modality separately yields 94% (passive audio) and 91% (micro-Doppler sonar) correct behavior classification; probabilistic multimodal integration increases classification performance to 98%. This study supports the efficacy of micro-Doppler sonar systems in characterizing human actions, which can then be efficiently classified using ConvNets. It also demonstrates that the integration of multiple sources of acoustic information can significantly improve the system's performance.
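A minimal sketch of the probabilistic combination step: per-frame class posteriors from each modality are accumulated over time in log space and then summed across modalities. The ConvNet producing the posteriors is omitted and the inputs are random stand-ins.

```python
# Hedged sketch of probabilistic output combination through time and across
# modalities (passive audio + micro-Doppler sonar), as described above.
import numpy as np

rng = np.random.default_rng(3)
n_frames, n_classes = 50, 7

def fake_posteriors():
    p = rng.random((n_frames, n_classes))
    return p / p.sum(axis=1, keepdims=True)

audio_post, sonar_post = fake_posteriors(), fake_posteriors()

# Combine over time within each modality (product of frame posteriors).
audio_log = np.log(audio_post).sum(axis=0)
sonar_log = np.log(sonar_post).sum(axis=0)

# Multimodal integration: add log-evidence across modalities.
fused = audio_log + sonar_log
print("predicted behavior class:", int(np.argmax(fused)))
```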
Using voice input and audio feedback to enhance the reality of a virtual experience
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miner, N.E.
1994-04-01
Virtual Reality (VR) is a rapidly emerging technology which allows participants to experience a virtual environment through stimulation of the participant's senses. Intuitive and natural interactions with the virtual world help to create a realistic experience. Typically, a participant is immersed in a virtual environment through the use of a 3-D viewer. Realistic, computer-generated environment models and accurate tracking of a participant's view are important factors for adding realism to a virtual experience. Stimulating a participant's sense of sound and providing a natural form of communication for interacting with the virtual world are equally important. This paper discusses the advantages and importance of incorporating voice recognition and audio feedback capabilities into a virtual world experience. Various approaches and levels of complexity are discussed. Examples of the use of voice and sound are presented through the description of a research application developed in the VR laboratory at Sandia National Laboratories.
Lee, Sungkyoung; Cappella, Joseph N.
2014-01-01
Findings from previous studies on smoking cues and argument strength in antismoking messages have shown that the presence of smoking cues undermines the persuasiveness of antismoking public service announcements (PSAs) with weak arguments. This study conceptualized smoking cues (i.e., scenes showing smoking-related objects and behaviors) as stimuli motivationally relevant to the former smoker population and examined how smoking cues influence former smokers’ processing of antismoking PSAs. Specifically, by defining smoking cues and the strength of antismoking arguments in terms of resource allocation, this study examined former smokers’ recognition accuracy, memory strength, and memory judgment of visual (i.e., scenes excluding smoking cues) and audio information from antismoking PSAs. In line with previous findings, the results of the study showed that the presence of smoking cues undermined former smokers’ encoding of antismoking arguments, which includes the visual and audio information that compose the main content of antismoking messages. PMID:25477766
Auditory cross-modal reorganization in cochlear implant users indicates audio-visual integration.
Stropahl, Maren; Debener, Stefan
2017-01-01
There is clear evidence for cross-modal cortical reorganization in the auditory system of post-lingually deafened cochlear implant (CI) users. A recent report suggests that moderate sensorineural hearing loss is already sufficient to initiate corresponding cortical changes. To what extent these changes are deprivation-induced or related to sensory recovery is still debated. Moreover, the influence of cross-modal reorganization on CI benefit is also still unclear. While reorganization during deafness may impede speech recovery, reorganization also has beneficial influences on face recognition and lip-reading. As CI users were observed to show differences in multisensory integration, the question arises whether cross-modal reorganization is related to audio-visual integration skills. The current electroencephalography study investigated cortical reorganization in experienced post-lingually deafened CI users (n = 18), untreated mildly to moderately hearing impaired individuals (n = 18) and normal hearing controls (n = 17). Cross-modal activation of the auditory cortex, by means of EEG source localization in response to human faces, and audio-visual integration, quantified with the McGurk illusion, were measured. CI users revealed stronger cross-modal activations compared to age-matched normal hearing individuals. Furthermore, CI users showed a relationship between cross-modal activation and audio-visual integration strength. This may further support a beneficial relationship between cross-modal activation and daily-life communication skills that may not be fully captured by laboratory-based speech perception tests. Interestingly, hearing impaired individuals showed behavioral and neurophysiological results that were numerically between those of the other two groups, and they showed a moderate relationship between cross-modal activation and the degree of hearing loss. This further supports the notion that auditory deprivation evokes a reorganization of the auditory system even at early stages of hearing loss.
Modular architecture of eukaryotic RNase P and RNase MRP revealed by electron microscopy.
Hipp, Katharina; Galani, Kyriaki; Batisse, Claire; Prinz, Simone; Böttcher, Bettina
2012-04-01
Ribonuclease P (RNase P) and RNase MRP are closely related ribonucleoprotein enzymes, which process RNA substrates including tRNA precursors for RNase P and 5.8 S rRNA precursors, as well as some mRNAs, for RNase MRP. The structures of RNase P and RNase MRP have not yet been solved, so it is unclear how the proteins contribute to the structure of the complexes and how substrate specificity is determined. Using electron microscopy and image processing we show that eukaryotic RNase P and RNase MRP have a modular architecture, where proteins stabilize the RNA fold and contribute to cavities, channels and chambers between the modules. Such features are located at strategic positions for substrate recognition by shape and coordination of the cleaved-off sequence. These are also the sites of greatest difference between RNase P and RNase MRP, highlighting the importance of the adaptation of this region to the different substrates.
Bach, Christian; Sherman, William; Pallis, Jani; ...
2014-01-01
Zinc finger nucleases (ZFNs) are associated with cell death and apoptosis by binding at countless undesired locations. This cytotoxicity is associated with the ability of engineered zinc finger domains to bind dissimilar DNA sequences with high affinity. In general, the binding preferences of transcription factors are associated with significant degenerate diversity and complexity, which convolutes the design and engineering of precise DNA binding domains. The evolutionary success of natural zinc finger proteins, however, evinces that nature created specific evolutionary traits and strategies, such as modularity and rank-specific recognition, to cope with binding complexity that are critical for creating clinically viable tools to precisely modify the human genome. Our findings indicate preservation of general modularity and significant alteration of the rank-specific binding preferences of the three-finger binding domain of transcription factor SP1 when exchanging amino acids in the 2nd finger.
NASA Astrophysics Data System (ADS)
Hsu, Charles; Viazanko, Michael; O'Looney, Jimmy; Szu, Harold
2009-04-01
Modularity Biometric System (MBS) is an approach to support AiTR of cooperative and/or non-cooperative standoff biometrics in area persistent surveillance. An advanced active and passive EO/IR and RF sensor suite is not considered here, nor will we consider the ROC (PD vs. FAR) versus the standoff POT in this paper. Our goal is to catch the roughly two dozen "most wanted (MW)" individuals, further separated ad hoc into a woman MW class and a man MW class, given their archival sparse frontal-face database, by means of various new instantaneous inputs called probing faces. We present an advanced algorithm, the mini-Max classifier, a sparse-sample realization of the Cramer-Rao Fisher bound of the Maximum Likelihood classifier, which minimizes the dispersion within the same woman class and maximizes the separation among different man and woman classes, based on the simple feature space of MIT Pentland eigenfaces. The original aspect consists of a modular structured design approach at the system level, with multi-level architectures, multiple computing paradigms, and adaptable/evolvable techniques, to achieve a scalable structure in terms of biometric algorithms, identification quality, sensors, database complexity, database integration, and component heterogeneity. MBS consists of a number of biometric technologies including fingerprints, vein maps, and voice and face recognition with innovative DSP algorithms and their hardware implementations, such as Field Programmable Gate Arrays (FPGAs). Biometric technologies and the composed modularity biometric system are significant for governmental agencies, enterprises, banks and all other organizations to protect people or control access to critical resources.
FTDD973: A multimedia knowledge-based system and methodology for operator training and diagnostics
NASA Technical Reports Server (NTRS)
Hekmatpour, Amir; Brown, Gary; Brault, Randy; Bowen, Greg
1993-01-01
FTDD973 (973 Fabricator Training, Documentation, and Diagnostics) is an interactive multimedia knowledge-based system and methodology for computer-aided training and certification of operators, as well as tool and process diagnostics, in IBM's CMOS SGP fabrication line (building 973). FTDD973 is an example of what can be achieved with modern multimedia workstations. Knowledge-based systems, hypertext, hypergraphics, high-resolution images, audio, motion video, and animation are technologies that in synergy can be far more useful than each by itself. FTDD973's modular and object-oriented architecture is also an example of how improvements in software engineering are finally making it possible to combine many software modules into one application. FTDD973 is developed in ExperMedia/2, an OS/2 multimedia expert system shell for domain experts.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, T; Huang, S; Zhao, XF
Recent studies indicate that the DNA recognition domain of transcription activator-like (TAL) effectors can be combined with the nuclease domain of the FokI restriction enzyme to produce TAL effector nucleases (TALENs) that, in pairs, bind adjacent DNA target sites and produce double-strand breaks between the target sequences, stimulating non-homologous end-joining and homologous recombination. Here, we exploit the four prevalent TAL repeats and their DNA recognition cipher to develop a 'modular assembly' method for rapid production of designer TALENs (dTALENs) that recognize unique DNA sequences of up to 23 bases in any gene. We have used this approach to engineer 10 dTALENs to target specific loci in native yeast chromosomal genes. All dTALENs produced high rates of site-specific gene disruptions and created strains with expected mutant phenotypes. Moreover, dTALENs stimulated high rates (up to 34%) of gene replacement by homologous recombination. Finally, dTALENs caused no detectable cytotoxicity and minimal levels of undesired genetic mutations in the treated yeast strains. These studies expand the realm of verified TALEN activity from cultured human cells to an intact eukaryotic organism and suggest that low-cost, highly dependable dTALENs can assume a significant role for gene modifications of value in human and animal health, agriculture and industry.
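Since the method turns on the one-repeat-to-one-base cipher, a minimal sketch of the modular design step follows, assuming the commonly cited RVD code (NI→A, HD→C, NG→T, NN→G); the repeat scaffolds and the assembly cloning itself are not modeled.

```python
# Hedged sketch of 'modular assembly': each DNA base in the target is matched
# to one repeat type via its repeat-variable di-residue (RVD).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}

def design_repeat_array(target: str) -> list[str]:
    """Return the ordered RVD modules recognizing a DNA target (5'->3')."""
    return [RVD_FOR_BASE[base] for base in target.upper()]

print(design_repeat_array("TGACCT"))  # ['NG', 'NN', 'NI', 'HD', 'HD', 'NG']
```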
François, Clément; Cunillera, Toni; Garcia, Enara; Laine, Matti; Rodriguez-Fornells, Antoni
2017-04-01
Learning a new language requires the identification of word units from continuous speech (the speech segmentation problem) and mapping them onto conceptual representation (the word to world mapping problem). Recent behavioral studies have revealed that the statistical properties found within and across modalities can serve as cues for both processes. However, segmentation and mapping have been largely studied separately, and thus it remains unclear whether both processes can be accomplished at the same time and if they share common neurophysiological features. To address this question, we recorded EEG of 20 adult participants during both an audio alone speech segmentation task and an audiovisual word-to-picture association task. The participants were tested for both the implicit detection of online mismatches (structural auditory and visual semantic violations) as well as for the explicit recognition of words and word-to-picture associations. The ERP results from the learning phase revealed a delayed learning-related fronto-central negativity (FN400) in the audiovisual condition compared to the audio alone condition. Interestingly, while online structural auditory violations elicited clear MMN/N200 components in the audio alone condition, visual-semantic violations induced meaning-related N400 modulations in the audiovisual condition. The present results support the idea that speech segmentation and meaning mapping can take place in parallel and act in synergy to enhance novel word learning. Copyright © 2016 Elsevier Ltd. All rights reserved.
Drosou, A.; Ioannidis, D.; Moustakas, K.; Tzovaras, D.
2011-01-01
Unobtrusive Authentication Using ACTIvity-Related and Soft BIOmetrics (ACTIBIO) is an EU Specific Targeted Research Project (STREP) where new types of biometrics are combined with state-of-the-art unobtrusive technologies in order to enhance security in a wide spectrum of applications. The project aims to develop a modular, robust, multimodal biometrics security authentication and monitoring system, which uses a biodynamic physiological profile, unique for each individual, and advancements of the state of the art in unobtrusive behavioral and other biometrics, such as face, gait recognition, and seat-based anthropometrics. Several shortcomings of existing biometric recognition systems are addressed within this project, which have helped in improving existing sensors, in developing new algorithms, and in designing applications, towards creating new, unobtrusive, biometric authentication procedures in security-sensitive, Ambient Intelligence environments. This paper presents the concept of the ACTIBIO project and describes its unobtrusive authentication demonstrator in a real scenario by focusing on the vision-based biometric recognition modalities. PMID:21380485
The Precise and Efficient Identification of Medical Order Forms Using Shape Trees
NASA Astrophysics Data System (ADS)
Henker, Uwe; Petersohn, Uwe; Ultsch, Alfred
A powerful and flexible technique to identify, classify and process documents using images from a scanning process is presented. The types of documents can be described to the system as a set of differentiating features in a case base using shape trees. The features are filtered and abstracted from an extremely reduced scanner image of the document. Classification rules are stored with the cases to enable precise recognition and the subsequent mark-reading and Optical Character Recognition (OCR) processes. The method is implemented in a system which currently processes the majority of requests for medical lab procedures in Germany. A large practical experiment with data from practitioners was performed: an average of 97% of the forms were correctly identified, and none were identified incorrectly. This meets the quality requirements for most medical applications. The modular description of the recognition process allows for flexible adaptation to future changes in the form and content of the document's structures.
NASA Astrophysics Data System (ADS)
Holtzman, B. K.; Paté, A.; Paisley, J.; Waldhauser, F.; Repetto, D.; Boschi, L.
2017-12-01
The earthquake process reflects complex interactions of stress, fracture and frictional properties. New machine learning methods reveal patterns in time-dependent spectral properties of seismic signals and enable identification of changes in faulting processes. Our methods are based closely on those developed for music information retrieval and voice recognition, using the spectrogram instead of the waveform directly. Unsupervised learning involves identification of patterns based on differences among signals without any additional information provided to the algorithm. Clustering of 46,000 earthquakes of $0.3
Plant pattern recognition receptor complexes at the plasma membrane.
Monaghan, Jacqueline; Zipfel, Cyril
2012-08-01
A key feature of innate immunity is the ability to recognize and respond to potential pathogens in a highly sensitive and specific manner. In plants, the activation of pattern recognition receptors (PRRs) by pathogen-associated molecular patterns (PAMPs) elicits a defense programme known as PAMP-triggered immunity (PTI). Although only a handful of PAMP-PRR pairs have been defined, all known PRRs are modular transmembrane proteins containing ligand-binding ectodomains. It is becoming clear that PRRs do not act alone but rather function as part of multi-protein complexes at the plasma membrane. Recent studies describing the molecular interactions and protein modifications that occur between PRRs and their regulatory proteins have provided important mechanistic insight into how plants avoid infection and achieve immunity. Copyright © 2012 Elsevier Ltd. All rights reserved.
More About Thin-Membrane Biosensor
NASA Technical Reports Server (NTRS)
Case, George D.; Worley, Jennings F., III
1994-01-01
Report presents additional information about device described in "Thin-Membrane Sensor With Biochemical Switch" (MFS-26121). Device is modular sensor that puts out electrical signal indicative of chemical or biological agent. Signal produced as membrane-crossing ion current triggered by chemical reaction between agent and recognition protein conjugated to channel blocker. Prototype of biosensor useful in numerous laboratory, industrial, or field applications, such as detecting bacterial toxins in food, screening for disease-producing micro-organisms, or warning of toxins or pollutants in air.
Wang, Wen-Jie; Cheng, Wang; Luo, Ming; Yan, Qingyu; Yu, Hong-Mei; Li, Qiong; Cao, Dong-Dong; Huang, Shengfeng; Xu, Anlong; Mariuzza, Roy A.; Chen, Yuxing; Zhou, Cong-Zhao
2015-01-01
Peptidoglycan recognition proteins (PGRPs), which have been identified in most animals, are pattern recognition molecules involved in antimicrobial defense. As a result of an extraordinary expansion of innate immune genes, the amphioxus genome encodes many PGRPs of diverse functions. For instance, three isoforms of PGRP encoded by Branchiostoma belcheri tsingtauense, termed BbtPGRP1~3, are fused with a chitin binding domain (CBD) at the N-terminus. Here we report the 2.7 Å crystal structure of BbtPGRP3, revealing an overall structure of an N-terminal hevein-like CBD followed by a catalytic PGRP domain. Activity assays combined with site-directed mutagenesis indicated that the individual PGRP domain exhibits amidase activity towards both DAP-type and Lys-type peptidoglycans (PGNs), the former of which is favored. The N-terminal CBD not only has chitin-binding activity, but also gives BbtPGRP3 a five-fold increase in amidase activity towards Lys-type PGNs, leading to a significantly broadened substrate spectrum. Together, we propose that modular evolution via domain shuffling combined with horizontal gene transfer makes BbtPGRP1~3 novel PGRPs of augmented catalytic activity and broad recognition spectrum. PMID:26479246
Speech endpoint detection with non-language speech sounds for generic speech processing applications
NASA Astrophysics Data System (ADS)
McClain, Matthew; Romanowski, Brian
2009-05-01
Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS, such as filled pauses, will require future research.
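A minimal sketch of the likelihood-based classification this abstract describes: one Gaussian HMM per class (LSS vs. NLSS) trained on feature sequences, with segments assigned to the higher-scoring model. hmmlearn, the feature dimensionality, and the topology are assumptions for illustration, not the paper's exact setup.

```python
# Hedged sketch: train one HMM per class on acoustic feature sequences and
# classify a new segment by comparing model log-likelihoods.
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(4)

def train_model(segments):
    X = np.vstack(segments)
    lengths = [len(s) for s in segments]
    m = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=20)
    m.fit(X, lengths)
    return m

# Synthetic 13-dim feature sequences (e.g. MFCC-like) for each class.
lss_model = train_model([rng.normal(0.0, 1.0, (40, 13)) for _ in range(10)])
nlss_model = train_model([rng.normal(1.5, 1.0, (40, 13)) for _ in range(10)])

segment = rng.normal(1.5, 1.0, (40, 13))
label = "NLSS" if nlss_model.score(segment) > lss_model.score(segment) else "LSS"
print(label)
```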
Sugiura, Motoaki; Sassa, Yuko; Watanabe, Jobu; Akitsuki, Yuko; Maeda, Yasuhiro; Matsue, Yoshihiko; Kawashima, Ryuta
2009-10-01
Person recognition has been assumed to entail many types of person-specific cognitive responses, including retrieval of knowledge, episodic recollection, and emotional responses. To demonstrate the cortical correlates of this modular structure of multimodal person representation, we investigated neural responses preferential to personally familiar people and responses dependent on familiarity with famous people in the temporal and parietal cortices. During functional magnetic resonance imaging (fMRI) measurements, normal subjects recognized personally familiar names (personal) or famous names with high or low degrees of familiarity (high or low, respectively). Effects of familiarity with famous people (i.e., high-low) were identified in the bilateral angular gyri, the left supramarginal gyrus, the middle part of the bilateral posterior cingulate cortices, and the left precuneus. Activation preferentially relevant to personally familiar people (i.e., personal-high) was identified in the bilateral temporo-parietal junctions, the right anterolateral temporal cortices, posterior middle temporal gyrus, posterior cingulate cortex (with a peak in the posterodorsal part), and the left precuneus; these activation foci exhibited varying degrees of activation for high and low names. An equivalent extent of activation was observed for all familiar names in the bilateral temporal poles, the left orbito-insular junction, the middle temporal gyrus, and the anterior part of the posterior cingulate cortex. The results demonstrated that distinct cortical areas supported different types of cognitive responses, induced to different degrees during recognition of famous and personally familiar people, providing neuroscientific evidence for the modularity of multimodal person representation.
NASA Astrophysics Data System (ADS)
Martin, P.; Tseu, A.; Férey, N.; Touraine, D.; Bourdot, P.
2014-02-01
Most advanced immersive devices provide a collaborative environment in which several users have their own head-tracked stereoscopic point of view. Combined with commonly used interactive features such as voice and gesture recognition, 3D mice, haptic feedback, and spatialized audio rendering, these environments should faithfully reproduce a real context. However, even if many studies have been carried out on multimodal systems, we are far from definitively solving the issue of multimodal fusion, which consists in merging multimodal events coming from users and devices into interpretable commands performed by the application. Multimodality and collaboration have often been studied separately, despite the fact that these two aspects share interesting similarities. We discuss how we address this problem through the design and implementation of a supervisor that is able to deal with both multimodal fusion and collaborative aspects. The aim of this supervisor is to ensure the merging of user input from virtual reality devices in order to control immersive multi-user applications. We deal with this problem from a practical point of view, because the main requirements of this supervisor were defined according to an industrial task proposed by our automotive partner that has to be performed with multimodal and collaborative interactions in a co-located multi-user environment. In this task, two co-located workers of a virtual assembly chain have to cooperate to insert a seat into the bodywork of a car, using haptic devices to feel collisions and to manipulate objects, and combining speech recognition and two-handed gesture recognition as multimodal instructions. Besides the architectural aspects of this supervisor, we describe how we ensure the modularity of our solution so that it can be applied to different virtual reality platforms, interactive contexts, and virtual contents. A virtual context observer included in this supervisor was especially designed to be independent of the content of the targeted application's virtual scene, and is used to report high-level interactive and collaborative events. This context observer allows the supervisor to merge these interactive and collaborative events, but is also used to deal with new issues arising from our observation of two co-located users performing this assembly task in an immersive device. We highlight the fact that when speech recognition features are provided to the two users, it is necessary to detect automatically, according to the interactive context, whether the vocal instructions must be translated into commands to be performed by the machine, or whether they are part of the natural communication necessary for collaboration. Information coming from this context observer indicating that a user is looking at their collaborator is important for detecting whether the user is talking to their partner. Moreover, as the users are physically co-located, and head-tracking is used to provide high-fidelity stereoscopic rendering and natural walking navigation in the virtual scene, we have to deal with collisions and screen occlusion between the co-located users in the physical workspace. The working area and focus of each user, computed and reported by the context observer, are necessary to prevent or avoid these situations.
Integrated Collision Avoidance System for Air Vehicle
NASA Technical Reports Server (NTRS)
Lin, Ching-Fang (Inventor)
2013-01-01
Collision with ground/water/terrain and midair obstacles is one of the common causes of severe aircraft accidents. The various data from the coremicro AHRS/INS/GPS Integration Unit, terrain database, and object detection sensors are processed to produce collision warning audio/visual messages and to provide collision detection and avoidance of terrain and obstacles through generation of guidance commands in a closed-loop system. The vision sensors provide additional information for the Integrated System, such as terrain recognition and ranging of terrain and obstacles, which plays an important role in the improvement of the Integrated Collision Avoidance System.
NASA Astrophysics Data System (ADS)
Dehé, Alfons
2017-06-01
After decades of research and more than ten years of successful production in very high volumes, silicon MEMS microphones are mature and unbeatable in form factor and robustness. Audio applications such as video, noise cancellation and speech recognition are key differentiators in smartphones, and microphones with low self-noise enable those functions. Backplate-free microphones achieve signal-to-noise ratios above 70 dB(A). This talk will describe the state-of-the-art MEMS technology of Infineon Technologies, and an outlook on future technologies such as the comb-sensor microphone will be given.
Zhang, Min; Shi, Zhen; Bai, Yinjuan; Gao, Yong; Hu, Rongzu; Zhao, Fenqi
2006-02-01
This study presents a novel method for determining the molecular weights (MWs) of low-molecular-weight energetic compounds through their complexes with beta-cyclodextrin (beta-CD) and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) in a mass range of 500 to 1700 Da, avoiding matrix interference. The MWs of one composite explosive composed of 2,6-DNT, TNT, and RDX; one propellant with unknown components; and 14 single-compound explosives (RDX, HMX, 3,4-DNT, 2,6-DNT, 2,5-DNT, 2,4,6-TNT, TNAZ, DNI, BTTN, NG, TO, NTO, NP, and 662) were measured. The molecular recognition and inclusion behavior of beta-CD towards energetic materials (EMs) was investigated. The results show that (1) the established method is sensitive, simple, accurate, and suitable for determining the MWs of low-MW single-compound explosives and energetic components in composite explosives and propellants; and (2) beta-CD has good inclusion and molecular recognition abilities towards the above EMs.
Real-time mental arithmetic task recognition from EEG signals.
Wang, Qiang; Sourina, Olga
2013-03-01
Monitoring the state of the user's brain functioning via electroencephalography (EEG) and giving visual/audio/tactile feedback is called the neurofeedback technique, and it can allow the user to train the corresponding brain functions. It can provide an alternative treatment for some psychological disorders such as attention deficit hyperactivity disorder (ADHD), where a concentration function deficit exists; autism spectrum disorder (ASD); or dyscalculia, where difficulty in learning and comprehending arithmetic exists. In this paper, a novel method for multifractal analysis of EEG signals named the generalized Higuchi fractal dimension spectrum (GHFDS) is proposed and applied to mental arithmetic task recognition from EEG signals. Other features such as power spectral density (PSD), autoregressive model (AR) coefficients, and statistical features were analyzed as well. Using the proposed fractal dimension spectrum of the EEG signal in combination with other features improved mental arithmetic task recognition accuracy in both multi-channel and one-channel subject-dependent algorithms, up to 97.87% and 84.15% respectively. Based on the channel ranking, four channels were chosen, which gave accuracy up to 97.11%. A reliable real-time neurofeedback system could be implemented based on the algorithms proposed in this paper.
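For reference, the standard Higuchi fractal dimension that the proposed GHFDS generalizes can be sketched as follows; the generalization itself is not reproduced here.

```python
# Hedged sketch of the standard Higuchi fractal dimension of a 1-D signal:
# curve lengths at multiple subsampling scales k, with FD the slope of
# log L(k) against log(1/k).
import numpy as np

def higuchi_fd(x, kmax=8):
    x = np.asarray(x, dtype=float)
    n = len(x)
    log_k, log_l = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)             # subsampled series x_m^k
            if len(idx) < 2:
                continue
            diff = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / (len(idx) - 1) / k  # curve-length normalization
            lengths.append(diff * norm / k)
        log_k.append(np.log(1.0 / k))
        log_l.append(np.log(np.mean(lengths)))
    slope, _ = np.polyfit(log_k, log_l, 1)       # FD is the slope
    return slope

print(higuchi_fd(np.random.default_rng(5).normal(size=1000)))  # ~2 for noise
```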
Modular supramolecular approach for co-crystallization of donors and acceptors into ordered networks
Stupp, Samuel I.; Stoddart, J. Fraser; Shveyd, Alex K.; Tayi, Alok S.; Sue, Andrew C. H.; Narayanan, Ashwin
2016-09-20
Organic charge-transfer (CT) co-crystals in a mixed stack system are disclosed, wherein a donor molecule (D) and an acceptor molecule (A) occupy alternating positions (DADADA) along the CT axis. A platform is provided which amplifies the molecular recognition of donors and acceptors and produces co-crystals at ambient conditions, wherein the platform comprises (i) a molecular design of the first constituent (.alpha.-complement), (ii) a molecular design of the second compound (.beta.-complement), and (iii) a solvent system that promotes co-crystallization.
OSCAR4: a flexible architecture for chemical text-mining.
Jessop, David M; Adams, Sam E; Willighagen, Egon L; Hawizy, Lezan; Murray-Rust, Peter
2011-10-14
The Open-Source Chemistry Analysis Routines (OSCAR) software, a toolkit for the recognition of named entities and data in chemistry publications, has been developed since 2002. Recent work has resulted in the separation of the core OSCAR functionality and its release as the OSCAR4 library. This library features a modular API (based on reduction of surface coupling) that permits client programmers to easily incorporate it into external applications. OSCAR4 offers a domain-independent architecture upon which chemistry specific text-mining tools can be built, and its development and usage are discussed.
Toward an automated signature recognition toolkit for mission operations
NASA Technical Reports Server (NTRS)
Cleghorn, T.; Laird, P; Perrine, L.; Culbert, C.; Macha, M.; Saul, R.; Hammen, D.; Moebes, T.; Shelton, R.
1994-01-01
Signature recognition is the problem of identifying an event or events from its time series. The generic problem has numerous applications to science and engineering. At NASA's Johnson Space Center, for example, mission control personnel, using electronic displays and strip chart recorders, monitor telemetry data from three-phase electrical buses on the Space Shuttle and maintain records of device activation and deactivation. Since few electrical devices have sensors to indicate their actual status, changes of state are inferred from characteristic current and voltage fluctuations. Controllers recognize these events both by examining the waveform signatures and by listening to audio channels between ground and crew. Recently the authors have developed a prototype system that identifies major electrical events from the telemetry and displays them on a workstation. Eventually the system will be able to identify accurately the signatures of over fifty distinct events in real time, while contending with noise, intermittent loss of signal, overlapping events, and other complications. This system is just one of many possible signature recognition applications in Mission Control. While much of the technology underlying these applications is the same, each application has unique data characteristics, and every control position has its own interface and performance requirements. There is a need, therefore, for CASE tools that can reduce the time to implement a running signature recognition application from months to weeks or days. This paper describes our work to date and our future plans.
Lightweight composites for modular panelized construction
NASA Astrophysics Data System (ADS)
Vaidya, Amol S.
Rapid advances in construction materials technology have enabled civil engineers to achieve impressive gains in the safety, economy, and functionality of structures built to serve the common needs of society. Modular building systems are a fast-growing, modern form of construction gaining recognition for increased efficiency and the ability to apply modern technology to the needs of the marketplace. In the modular construction technique, a single structural panel can perform a number of functions, such as providing thermal insulation, vibration damping, and structural strength. These multifunctional panels can be prefabricated in a manufacturing facility and then transferred to the construction site. A system that uses prefabricated panels for construction is called a "panelized construction system". This study focuses on the development of pre-cast, lightweight, multifunctional sandwich composite panels to be used for panelized construction. Two thermoplastic composite panels are proposed in this study, namely Composite Structural Insulated Panels (CSIPs) for exterior walls, floors and roofs, and Open Core Sandwich composite for multifunctional interior walls of a structure. Special manufacturing techniques are developed for manufacturing these panels. The structural behavior of these panels is analyzed based on various building design codes. Detailed descriptions of the design, cost analysis, manufacturing, finite element modeling and structural testing of the proposed panels are included in this study, in the form of five peer-reviewed journal articles. The structural testing of the proposed panels involved flexural testing, axial compression testing, and low- and high-velocity impact testing. Based on the current study, the proposed CSIP wall and floor panels were found satisfactory according to building design codes ASCE 7-05 and ACI 318-05. Joining techniques are proposed in this study for connecting the precast panels on the construction site. Keywords: Modular panelized construction, sandwich composites, composite structural insulated panels (CSIPs).
Modular evolution of phosphorylation-based signalling systems
Jin, Jing; Pawson, Tony
2012-01-01
Phosphorylation sites are formed by protein kinases (‘writers’), frequently exert their effects following recognition by phospho-binding proteins (‘readers’) and are removed by protein phosphatases (‘erasers’). This writer–reader–eraser toolkit allows phosphorylation events to control a broad range of regulatory processes, and has been pivotal in the evolution of new functions required for the development of multi-cellular animals. The proteins that comprise this system of protein kinases, phospho-binding targets and phosphatases are typically modular in organization, in the sense that they are composed of multiple globular domains and smaller peptide motifs with binding or catalytic properties. The linkage of these binding and catalytic modules in new ways through genetic recombination, and the selection of particular domain combinations, has promoted the evolution of novel, biologically useful processes. Conversely, the joining of domains in aberrant combinations can subvert cell signalling and be causative in diseases such as cancer. Major inventions such as phosphotyrosine (pTyr)-mediated signalling that flourished in the first multi-cellular animals and their immediate predecessors resulted from stepwise evolutionary progression. This involved changes in the binding properties of interaction domains such as SH2 and their linkage to new domain types, and alterations in the catalytic specificities of kinases and phosphatases. This review will focus on the modular aspects of signalling networks and the mechanism by which they may have evolved. PMID:22889906
Schulz, Sebastian; Eckweiler, Denitsa; Bielecka, Agata; Nicolai, Tanja; Franke, Raimo; Dötsch, Andreas; Hornischer, Klaus; Bruchmann, Sebastian; Düvel, Juliane; Häussler, Susanne
2015-01-01
Sigma factors are essential global regulators of transcription initiation in bacteria which confer promoter recognition specificity to the RNA polymerase core enzyme. They provide effective mechanisms for simultaneously regulating expression of large numbers of genes in response to challenging conditions, and their presence has been linked to bacterial virulence and pathogenicity. In this study, we constructed nine his-tagged sigma factor expressing and/or deletion mutant strains in the opportunistic pathogen Pseudomonas aeruginosa. To uncover the direct and indirect sigma factor regulons, we performed mRNA profiling, as well as chromatin immunoprecipitation coupled to high-throughput sequencing. We furthermore elucidated the de novo binding motif of each sigma factor, and validated the RNA- and ChIP-seq results by global motif searches in the proximity of transcriptional start sites (TSS). Our integrated approach revealed a highly modular network architecture which is composed of insulated functional sigma factor modules. Analysis of the interconnectivity of the various sigma factor networks uncovered a limited, but highly function-specific, crosstalk which orchestrates complex cellular processes. Our data indicate that the modular structure of sigma factor networks enables P. aeruginosa to function adequately in its environment and at the same time is exploited to build up higher-level functions by specific interconnections that are dominated by a participation of RpoN. PMID:25780925
Public domain optical character recognition
NASA Astrophysics Data System (ADS)
Garris, Michael D.; Blue, James L.; Candela, Gerald T.; Dimmick, Darrin L.; Geist, Jon C.; Grother, Patrick J.; Janet, Stanley A.; Wilson, Charles L.
1995-03-01
A public domain document processing system has been developed by the National Institute of Standards and Technology (NIST). The system is a standard reference form-based handprint recognition system for evaluating optical character recognition (OCR), and it is intended to provide a baseline of performance on an open application. The system's source code, training data, performance assessment tools, and type of forms processed are all publicly available. The system recognizes the handprint entered on handwriting sample forms like the ones distributed with NIST Special Database 1. From these forms, the system reads hand-printed numeric fields, upper and lowercase alphabetic fields, and unconstrained text paragraphs comprised of words from a limited-size dictionary. The modular design of the system makes it useful for component evaluation and comparison, training and testing set validation, and multiple system voting schemes. The system contains a number of significant contributions to OCR technology, including an optimized probabilistic neural network (PNN) classifier that operates a factor of 20 times faster than traditional software implementations of the algorithm. The source code for the recognition system is written in C and is organized into 11 libraries. In all, there are approximately 19,000 lines of code supporting more than 550 subroutines. Source code is provided for form registration, form removal, field isolation, field segmentation, character normalization, feature extraction, character classification, and dictionary-based postprocessing. The recognition system has been successfully compiled and tested on a host of UNIX workstations. This paper gives an overview of the recognition system's software architecture, including descriptions of the various system components along with timing and accuracy statistics.
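A minimal sketch of a probabilistic neural network (Parzen-window) classifier of the kind the NIST system optimizes is shown below; the kernel width and data are illustrative, and none of NIST's speed optimizations are reproduced.

```python
# Hedged sketch of a PNN: each class's density is a sum of Gaussian kernels
# centered on its training vectors, and the class with the highest density wins.
import numpy as np

class PNN:
    def __init__(self, sigma=1.0):
        self.sigma = sigma

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.centers = {c: X[y == c] for c in self.classes}
        return self

    def predict(self, X):
        scores = []
        for c in self.classes:
            # Squared distances from every query to every class center.
            d2 = ((X[:, None, :] - self.centers[c][None]) ** 2).sum(-1)
            scores.append(np.exp(-d2 / (2 * self.sigma ** 2)).mean(axis=1))
        return self.classes[np.argmax(scores, axis=0)]

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(3, 1, (50, 8))])
y = np.repeat([0, 1], 50)
print((PNN(sigma=1.0).fit(X, y).predict(X) == y).mean())  # training accuracy
```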
Presentation video retrieval using automatically recovered slide and spoken text
NASA Astrophysics Data System (ADS)
Cooper, Matthew
2013-03-01
Video is becoming a prevalent medium for e-learning. Lecture videos contain text information in both the presentation slides and lecturer's speech. This paper examines the relative utility of automatically recovered text from these sources for lecture video retrieval. To extract the visual information, we automatically detect slides within the videos and apply optical character recognition to obtain their text. Automatic speech recognition is used similarly to extract spoken text from the recorded audio. We perform controlled experiments with manually created ground truth for both the slide and spoken text from more than 60 hours of lecture video. We compare the automatically extracted slide and spoken text in terms of accuracy relative to ground truth, overlap with one another, and utility for video retrieval. Results reveal that automatically recovered slide text and spoken text contain different content with varying error profiles. Experiments demonstrate that automatically extracted slide text enables higher precision video retrieval than automatically recovered spoken text.
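One simple way to make the recovered slide and spoken text searchable is sketched here, assuming a plain TF-IDF index with cosine ranking; the paper's actual retrieval setup may differ.

```python
# Hedged sketch: index OCR'd slide text and ASR transcripts with TF-IDF and
# rank videos by cosine similarity to a query. Strings are stand-ins for
# real OCR/ASR output.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

videos = [
    "gradient descent convergence learning rate slides",  # video 1 text
    "uh today we talk about convolution and pooling",     # video 2 text
]
vec = TfidfVectorizer()
index = vec.fit_transform(videos)
query = vec.transform(["learning rate"])
print(cosine_similarity(query, index))  # per-video relevance scores
```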
Language Model Combination and Adaptation Using Weighted Finite State Transducers
NASA Technical Reports Server (NTRS)
Liu, X.; Gales, M. J. F.; Hieronymus, J. L.; Woodland, P. C.
2010-01-01
In speech recognition systems, language models (LMs) are often constructed by training and combining multiple n-gram models. They can be used either to represent different genres or tasks found in diverse text sources, or to capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimal change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant relative error rate reductions of 7.3% were obtained on a state-of-the-art broadcast audio recognition task using a history-dependently adapted multi-level LM modelling both syllable and word sequences.
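As a point of reference for what the WFST framework subsumes, here is a sketch of the simplest form of LM combination, linear interpolation of two component n-gram models; the dictionaries and backoff stub are illustrative stand-ins for real LMs, and the paper's WFST composition is not reproduced.

```python
# Hedged sketch of LM combination by linear interpolation: P(w|h) is a
# weighted mixture of two component models. Real systems would realize this
# via weighted FST union/composition.
def interp_prob(word, history, lm_a, lm_b, lam=0.6):
    """P(word|history) as a weighted mixture of two component LMs."""
    pa = lm_a.get((history, word), 1e-10)  # backoff stub for unseen n-grams
    pb = lm_b.get((history, word), 1e-10)
    return lam * pa + (1.0 - lam) * pb

news_lm = {(("the",), "market"): 0.02, (("the",), "end"): 0.01}
broadcast_lm = {(("the",), "market"): 0.005, (("the",), "end"): 0.03}
print(interp_prob("end", ("the",), news_lm, broadcast_lm))
```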
Visual speech information: a help or hindrance in perceptual processing of dysarthric speech.
Borrie, Stephanie A
2015-03-01
This study investigated the influence of visual speech information on perceptual processing of neurologically degraded speech. Fifty listeners identified spastic dysarthric speech under both audio (A) and audiovisual (AV) conditions. Condition comparisons revealed that the addition of visual speech information enhanced processing of the neurologically degraded input in terms of (a) acuity (percent phonemes correct) of vowels and consonants and (b) recognition (percent words correct) of predictive and nonpredictive phrases. Listeners exploited stress-based segmentation strategies more readily in AV conditions, suggesting that the perceptual benefit associated with adding visual speech information to the auditory signal-the AV advantage-has both segmental and suprasegmental origins. Results also revealed that the magnitude of the AV advantage can be predicted, to some degree, by the extent to which an individual utilizes syllabic stress cues to inform word recognition in AV conditions. Findings inform the development of a listener-specific model of speech perception that applies to processing of dysarthric speech in everyday communication contexts.
Statistical data mining of streaming motion data for fall detection in assistive environments.
Tasoulis, S K; Doukas, C N; Maglogiannis, I; Plagianakos, V P
2011-01-01
The analysis of human motion data is interesting for the purpose of activity recognition or emergency event detection, especially in the case of elderly or disabled people living independently in their homes. Several techniques have been proposed for identifying such distress situations using motion, audio, or video sensors either on the monitored subject (wearable sensors) or in the surrounding environment. Such sensors produce data streams that require real-time recognition, especially in emergency situations, so traditional classification approaches may not be applicable for immediate alarm triggering or fall prevention. This paper presents a statistical mining methodology that may be used for the specific problem of real-time fall detection. Visual data are captured from the user's environment using overhead cameras, motion data are collected from accelerometers on the subject's body, and both are fed to the fall detection system. The paper includes the details of the stream data mining methodology incorporated in the system along with an initial evaluation of the achieved accuracy in detecting falls.
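A hedged sketch of stream-based fall detection on accelerometer magnitude, not the paper's statistical mining method: a simple sliding-window threshold detector over a simulated 3-axis stream, with made-up thresholds.

```python
# Flag an impact spike followed by ~1 s of near-stillness in a 3-axis stream.
import numpy as np

def detect_falls(samples, impact_g=2.5, still_g=0.3, fs=50):
    """Return times (s) of impact spikes followed by near-stillness."""
    mag = np.linalg.norm(samples, axis=1)          # |a| per sample
    falls = []
    for i in np.where(mag > impact_g)[0]:          # candidate impacts
        window = mag[i + fs // 2 : i + fs + fs // 2]
        if window.size and np.all(np.abs(window - 1.0) < still_g):
            falls.append(i / fs)
    return falls

rng = np.random.default_rng(0)
stream = rng.normal([0, 0, 1.0], 0.05, size=(500, 3))  # quiet standing
stream[250] = [0, 0, 4.0]                              # injected impact
print(detect_falls(stream))                            # -> [5.0]
```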
Why are some languages confused for others? Investigating data from the Great Language Game.
Skirgård, Hedvig; Roberts, Seán G; Yencken, Lars
2017-01-01
In this paper we explore the results of a large-scale online game called 'the Great Language Game', in which people listen to an audio speech sample and make a forced-choice guess about the identity of the language from 2 or more alternatives. The data include 15 million guesses from 400 audio recordings of 78 languages. We investigate which languages are confused for which in the game, and if this correlates with the similarities that linguists identify between languages. This includes shared lexical items, similar sound inventories and established historical relationships. Our findings are, as expected, that players are more likely to confuse two languages that are objectively more similar. We also investigate factors that may affect players' ability to accurately select the target language, such as how many people speak the language, how often the language is mentioned in written materials and the economic power of the target language community. We see that non-linguistic factors affect players' ability to accurately identify the target. For example, languages with wider 'global reach' are more often identified correctly. This suggests that both linguistic and cultural knowledge influence the perception and recognition of languages and their similarity.
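As a sketch of the confusion analysis, the snippet below tallies (target, guess) pairs into confusion counts; the guess records are invented examples, not data from the game.

```python
# Tally which language is guessed when another is the target.
from collections import Counter

guesses = [("Swedish", "Norwegian"), ("Swedish", "Swedish"),
           ("Ukrainian", "Russian"), ("Swedish", "Norwegian")]

confusions = Counter((t, g) for t, g in guesses if t != g)
for (target, guessed), n in confusions.most_common():
    print(f"{target} mistaken for {guessed}: {n}")
```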
Visual cues and listening effort: individual variability.
Picou, Erin M; Ricketts, Todd A; Hornsby, Benjamin W Y
2011-10-01
To investigate the effect of visual cues on listening effort as well as whether predictive variables such as working memory capacity (WMC) and lipreading ability affect the magnitude of listening effort. Twenty participants with normal hearing were tested using a paired-associates recall task in 2 conditions (quiet and noise) and 2 presentation modalities (audio only [AO] and auditory-visual [AV]). Signal-to-noise ratios were adjusted to provide matched speech recognition across audio-only and AV noise conditions. Also measured were subjective perceptions of listening effort and 2 predictive variables: (a) lipreading ability and (b) WMC. Objective and subjective results indicated that listening effort increased in the presence of noise, but on average the addition of visual cues did not significantly affect the magnitude of listening effort. Although there was substantial individual variability, on average participants who were better lipreaders or had larger WMCs demonstrated reduced listening effort in noise in AV conditions. Overall, the results support the hypothesis that integrating auditory and visual cues requires cognitive resources in some participants. The data indicate that low lipreading ability or low WMC is associated with relatively effortful integration of auditory and visual information in noise.
Action Unit Models of Facial Expression of Emotion in the Presence of Speech
Shah, Miraj; Cooper, David G.; Cao, Houwei; Gur, Ruben C.; Nenkova, Ani; Verma, Ragini
2014-01-01
Automatic recognition of emotion using facial expressions in the presence of speech poses a unique challenge because talking reveals clues for the affective state of the speaker but distorts the canonical expression of emotion on the face. We introduce a corpus of acted emotion expression where speech is either present (talking) or absent (silent). The corpus is uniquely suited for analysis of the interplay between the two conditions. We use a multimodal decision level fusion classifier to combine models of emotion from talking and silent faces as well as from audio to recognize five basic emotions: anger, disgust, fear, happy and sad. Our results strongly indicate that emotion prediction in the presence of speech from action unit facial features is less accurate when the person is talking. Modeling talking and silent expressions separately and fusing the two models greatly improves accuracy of prediction in the talking setting. The advantages are most pronounced when silent and talking face models are fused with predictions from audio features. In this multi-modal prediction both the combination of modalities and the separate models of talking and silent facial expression of emotion contribute to the improvement. PMID:25525561
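A minimal sketch of decision-level fusion as described above: class-probability vectors from the talking-face, silent-face, and audio models are averaged before the argmax. The probability values are placeholders, and the unweighted average is only one possible fusion rule.

```python
# Late fusion of three per-modality emotion classifiers.
import numpy as np

EMOTIONS = ["anger", "disgust", "fear", "happy", "sad"]
p_talking = np.array([0.40, 0.10, 0.15, 0.20, 0.15])  # talking-face model
p_silent  = np.array([0.30, 0.15, 0.20, 0.20, 0.15])  # silent-face model
p_audio   = np.array([0.50, 0.05, 0.10, 0.20, 0.15])  # audio model

p_fused = (p_talking + p_silent + p_audio) / 3        # unweighted late fusion
print(EMOTIONS[int(np.argmax(p_fused))])              # -> "anger"
```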
OSCAR4: a flexible architecture for chemical text-mining
2011-01-01
The Open-Source Chemistry Analysis Routines (OSCAR) software, a toolkit for the recognition of named entities and data in chemistry publications, has been developed since 2002. Recent work has resulted in the separation of the core OSCAR functionality and its release as the OSCAR4 library. This library features a modular API (based on reduction of surface coupling) that permits client programmers to easily incorporate it into external applications. OSCAR4 offers a domain-independent architecture upon which chemistry specific text-mining tools can be built, and its development and usage are discussed. PMID:21999457
Hattori, Takamitsu; Lai, Darson; Dementieva, Irina S.; ...
2016-02-09
Antibodies have a well-established modular architecture wherein the antigen-binding site residing in the antigen-binding fragment (Fab or Fv) is an autonomous and complete unit for antigen recognition. Here, we describe antibodies departing from this paradigm. We developed recombinant antibodies to trimethylated lysine residues on histone H3, important epigenetic marks and challenging targets for molecular recognition. Quantitative characterization demonstrated their exquisite specificity and high affinity, and they performed well in common epigenetics applications. Surprisingly, crystal structures and biophysical analyses revealed that two antigen-binding sites of these antibodies form a head-to-head dimer and cooperatively recognize the antigen in the dimer interface. This “antigen clasping” produced an expansive interface where trimethylated Lys bound to an unusually extensive aromatic cage in one Fab and the histone N terminus to a pocket in the other, thereby rationalizing the high specificity. A long-neck antibody format with a long linker between the antigen-binding module and the Fc region facilitated antigen clasping and achieved both high specificity and high potency. Antigen clasping substantially expands the paradigm of antibody–antigen recognition and suggests a strategy for developing extremely specific antibodies.
Krylov, Victor; Shaburova, Olga; Pleteneva, Elena; Bourkaltseva, Maria; Krylov, Sergey; Kaplan, Alla; Chesnokova, Elena; Kulakov, Leonid; Magill, Damian; Polygach, Olga
2016-01-01
This review discusses the potential application of bacterial viruses (phage therapy) toward the eradication of antibiotic-resistant Pseudomonas aeruginosa in children with cystic fibrosis (CF). In this regard, several potential relationships between bacteria and their bacteriophages are considered. The most important aspect that must be addressed with respect to phage therapy of bacterial infections in the lungs of CF patients is ensuring the continuity of treatment in light of the continual emergence of resistant bacteria. This depends on the ability to rapidly select phages exhibiting an enhanced spectrum of lytic activity from several well-studied phage groups of proven safety. We propose a modular approach, utilizing both mono-species and hetero-species phage mixtures. An approach based on visual recognition of the characteristics that phages of well-studied groups exhibit on lawns of the standard P. aeruginosa PAO1 strain permits simple and rapid enhancement of a cocktail's lytic spectrum, allowing the development of tailored preparations for patients capable of circumventing problems associated with phage-resistant bacterial mutants. PMID:27790211
Thubagere, Anupama J; Li, Wei; Johnson, Robert F; Chen, Zibo; Doroudi, Shayan; Lee, Yae Lim; Izatt, Gregory; Wittman, Sarah; Srinivas, Niranjan; Woods, Damien; Winfree, Erik; Qian, Lulu
2017-09-15
Two critical challenges in the design and synthesis of molecular robots are modularity and algorithm simplicity. We demonstrate three modular building blocks for a DNA robot that performs cargo sorting at the molecular level. A simple algorithm encoding recognition between cargos and their destinations allows for a simple robot design: a single-stranded DNA with one leg and two foot domains for walking, and one arm and one hand domain for picking up and dropping off cargos. The robot explores a two-dimensional testing ground on the surface of DNA origami, picks up multiple cargos of two types that are initially at unordered locations, and delivers them to specified destinations until all molecules are sorted into two distinct piles. The robot is designed to perform a random walk without any energy supply. Exploiting this feature, a single robot can repeatedly sort multiple cargos. Localization on DNA origami allows for distinct cargo-sorting tasks to take place simultaneously in one test tube or for multiple robots to collectively perform the same task.
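For intuition about the cargo-sorting algorithm, here is a toy random-walk simulation; the grid, cargo layout, and pickup/drop-off rules are simplified stand-ins for the DNA implementation, not a model of its chemistry.

```python
# A robot random-walks on a grid, picks up cargo it encounters, and
# drops it at a matching goal site. All parameters are invented.
import random

random.seed(1)
cargo = {(2, 3): "A", (5, 1): "B", (6, 6): "A"}
goals = {"A": (0, 0), "B": (7, 7)}

pos, carrying, steps = (4, 4), None, 0
while cargo or carrying:
    x, y = pos
    dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
    pos = (min(7, max(0, x + dx)), min(7, max(0, y + dy)))
    steps += 1
    if carrying is None and pos in cargo:
        carrying = cargo.pop(pos)          # pick up
    elif carrying and pos == goals[carrying]:
        carrying = None                    # drop off at destination
print("sorted all cargo in", steps, "random steps")
```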
Fengler, Ineke; Delfau, Pia-Céline; Röder, Brigitte
2018-04-01
It is yet unclear whether congenitally deaf cochlear implant (CD CI) users' visual and multisensory emotion perception is influenced by their history in sign language acquisition. We hypothesized that early-signing CD CI users, relative to late-signing CD CI users and hearing, non-signing controls, show better facial expression recognition and rely more on the facial cues of audio-visual emotional stimuli. Two groups of young adult CD CI users-early signers (ES CI users; n = 11) and late signers (LS CI users; n = 10)-and a group of hearing, non-signing, age-matched controls (n = 12) performed an emotion recognition task with auditory, visual, and cross-modal emotionally congruent and incongruent speech stimuli. On different trials, participants categorized either the facial or the vocal expressions. The ES CI users more accurately recognized affective prosody than the LS CI users in the presence of congruent facial information. Furthermore, the ES CI users, but not the LS CI users, gained more than the controls from congruent visual stimuli when recognizing affective prosody. Both CI groups performed overall worse than the controls in recognizing affective prosody. These results suggest that early sign language experience affects multisensory emotion perception in CD CI users.
Orchestration of Molecular Information through Higher Order Chemical Recognition
NASA Astrophysics Data System (ADS)
Frezza, Brian M.
Broadly defined, higher order chemical recognition is the process whereby discrete chemical building blocks capable of specifically binding to cognate moieties are covalently linked into oligomeric chains. These chains, or sequences, are then able to recognize and bind to their cognate sequences with a high degree of cooperativity. Principally speaking, DNA and RNA are the most readily obtained examples of this chemical phenomenon, and function via Watson-Crick cognate pairing: guanine pairs with cytosine and adenine with thymine (DNA) or uracil (RNA), in an anti-parallel manner. While the theoretical principles, techniques, and equations derived herein apply generally to any higher-order chemical recognition system, in practice we utilize DNA oligomers as a model-building material to experimentally investigate and validate our hypotheses. Historically, general purpose information processing has been a task limited to semiconductor electronics. Molecular computing on the other hand has been limited to ad hoc approaches designed to solve highly specific and unique computation problems, often involving components or techniques that cannot be applied generally in a manner suitable for precise and predictable engineering. Herein, we provide a fundamental framework for harnessing higher-order recognition in a modular and programmable fashion to synthesize molecular information processing networks of arbitrary construction and complexity. This document provides a solid foundation for routinely embedding computational capability into chemical and biological systems where semiconductor electronics are unsuitable for practical application.
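A minimal sketch of the Watson-Crick cognate relationship used as the model recognition system: a strand "recognizes" the reverse complement of its cognate sequence.

```python
# Reverse complement: the sequence that binds a strand antiparallel.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def cognate(seq: str) -> str:
    """Return the Watson-Crick cognate (reverse complement) of a DNA strand."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

probe = "GATTACA"
print(cognate(probe))                    # TGTAATC
print(cognate(cognate(probe)) == probe)  # recognition is symmetric: True
```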
Brambilla, Marco; Martani, Francesca; Branduardi, Paola
2017-09-01
The Saccharomyces cerevisiae poly(A)-binding protein Pab1 is a modular protein composed of four RNA recognition motifs (RRM), a proline-rich domain (P) and a C-terminus. Thanks to this modularity, Pab1 is involved in different interactions that regulate many aspects of mRNA metabolism, including the assembly of stress granules. In this work, we analyzed the contribution of each domain for the recruitment of the protein within stress granules by comparing the intracellular distribution of synthetic Pab1-GFP variants, lacking one or more domains, with the localization of the endogenous mCherry-tagged Pab1. Glucose starvation and heat shock were used to trigger the formation of stress granules. We found that Pab1 association into these aggregates relies mainly on RRMs, whose number is important for an efficient recruitment of the protein. Interestingly, although the P and C domains do not directly participate in Pab1 association to stress granules, their presence strengthens or decreases, respectively, the distribution of synthetic Pab1 lacking at least one RRM into these aggregates. In addition to describing the contribution of domains in determining Pab1 association within stress granules, the outcomes of this study suggest the modularity of Pab1 as an attractive platform for synthetic biology approaches aimed at rewiring mRNA metabolism.
Polyethylene wear debris in modular acetabular prostheses.
Chen, P C; Mead, E H; Pinto, J G; Colwell, C W
1995-08-01
The longevity of total hip arthroplasty has brought forth the recognition of aseptic loosening of prosthetic components as the leading cause of implant failure. Modularity of implants, although a significant improvement in versatility, may increase debris formation, a recognized cause of implant failure. This study was designed to measure the relative motion, and to assess the polyethylene wear debris production, at the interface between the metal acetabular shell and the back side of the polyethylene liner in modular hip prostheses. Five models from 4 manufacturers with different locking mechanisms and acetabular shell surface treatments were tested under long-term simultaneous sinusoidal and static loading (10^7 cycles at 3 Hz with ±2.5 N·m and a 220 N static load). Results showed that there were marked differences in the security of the acetabular shell and polyethylene liner locking mechanism, wear pattern, damage sites, and amount of polyethylene debris on the acetabular shell and polyethylene liner surfaces. The range of polyethylene liner motion observed among the 5 models during 1 cycle of testing varied from an average of 0.96 degrees to movement too small to be detected by the test machines. Image and scanning electron microscopy analysis showed different wear patterns and a wide range in the average polyethylene liner surface wear area (0.26–4.61 cm²). In general, a stable locking mechanism and a smooth acetabular shell surface are essential in minimizing polyethylene liner wear and polyethylene debris production.
Modular packaging concept for MEMS and MOEMS
NASA Astrophysics Data System (ADS)
Stenchly, Vanessa; Reinert, Wolfgang; Quenzer, Hans-Joachim
2017-11-01
Wherever technical systems detect objects in their environment or interact with people, optical devices may play an important role. Light can be produced relatively easily and modulated both spatially and temporally. Lasers can project sharp images over long distances or cut materials at close range. Depending on the wavelength, invisible scanning in the near infrared for gesture recognition is possible, as is the projection of brilliant colour images. For several years, the Fraunhofer ISIT has developed Opto-Packaging processes based on the viscous reshaping of glass wafers. First, hermetically sealed wafer-level-packaged (WLP) laser micro-mirror scanners with inclined windows deflect the central light reflex of the window out of the image area. Second, housings with lateral light exit permit hermetic sealing of edge-emitting lasers for the highest reliability and durability. Such systems are currently attracting extremely high industrial interest in all segments, from consumer and automotive through to materials processing. Our modular Opto-Packaging platform enables fast product development. Housings for opto-mechanical MEMS devices are equipped with inclined windows to minimize distortion, stray light and reflection losses. The hot viscous glass-forming technology is also applied to functionalized substrate wafers, which possess areas of high heat dissipation alongside thermally insulating areas. Electrical contacts may be realized with metal-filled vias, or TGVs (Through-Glass Vias). The modular system reduces development times for new, miniaturized optical systems, so that manufacturers can focus on the essentials in their development, namely their product functionalities.
Prevalence of co-morbid depression in out-patients with type 2 diabetes mellitus in Bangladesh.
Roy, Tapash; Lloyd, Cathy E; Parvin, Masuma; Mohiuddin, Khondker Galib B; Rahman, Mosiur
2012-08-22
Little is known about the prevalence of depression in people with diabetes in Bangladesh. This study examined the prevalence of, and factors associated with, depression in out-patients with Type 2 diabetes in Bangladesh. In this cross-sectional study a random sample of 483 diabetes out-patients from three diabetes clinics in Bangladesh was invited to participate, of whom 417 took part. Depressive symptoms were measured using previously developed and culturally standardized Bengali and Sylheti versions of the World Health Organization-5 Well-Being Index (WHO-5) and the Patient Health Questionnaire-9 (PHQ-9) with predefined cut-off scores. Data were collected using two different modes: standard assisted collection and audio questionnaire methods. Associations between depression and patient characteristics were explored using regression analysis. The prevalence of depressive symptoms was 34% (PHQ-9 score ≥ 5) and 36% (WHO-5 score < 52) with the audio questionnaire delivery method. The prevalence rates were similar regardless of the type (PHQ-9 vs. WHO-5) and language (Sylheti vs. Bengali) of the questionnaires, and the method of delivery (standard assisted vs. audio). The significant predictors of depressive symptoms using either the PHQ-9 or WHO-5 questionnaires were age, income, gender, treatment intensity, and co-morbid cardiovascular disease. Further, depression was strongly associated with poor glycaemic control and the number of co-morbid conditions. This study demonstrated that depression is common in out-patients with type 2 diabetes in Bangladesh. In a setting where recognition, screening and treatment levels remain low, health care providers need to focus their efforts on diagnosing, referring and effectively treating this important condition in order to improve service delivery.
McMenamin, Brenton W.; Deason, Rebecca G.; Steele, Vaughn R.; Koutstaal, Wilma; Marsolek, Chad J.
2014-01-01
Previous research indicates that dissociable neural subsystems underlie abstract-category (AC) recognition and priming of objects (e.g., cat, piano) and specific-exemplar (SE) recognition and priming of objects (e.g., a calico cat, a different calico cat, a grand piano, etc.). However, the degree of separability between these subsystems is not known, despite the importance of this issue for assessing relevant theories. Visual object representations are widely distributed in visual cortex, thus a multivariate pattern analysis (MVPA) approach to analyzing functional magnetic resonance imaging (fMRI) data may be critical for assessing the separability of different kinds of visual object processing. Here we examined the neural representations of visual object categories and visual object exemplars using multi-voxel pattern analyses of brain activity elicited in visual object processing areas during a repetition-priming task. In the encoding phase, participants viewed visual objects and the printed names of other objects. In the subsequent test phase, participants identified objects that were either same-exemplar primed, different-exemplar primed, word-primed, or unprimed. In visual object processing areas, classifiers were trained to distinguish same-exemplar primed objects from word-primed objects. Then, the abilities of these classifiers to discriminate different-exemplar primed objects and word-primed objects (reflecting AC priming) and to discriminate same-exemplar primed objects and different-exemplar primed objects (reflecting SE priming) was assessed. Results indicated that (a) repetition priming in occipital-temporal regions is organized asymmetrically, such that AC priming is more prevalent in the left hemisphere and SE priming is more prevalent in the right hemisphere, and (b) AC and SE subsystems are weakly modular, not strongly modular or unified. PMID:25528436
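The cross-decoding logic can be sketched as follows: train a linear classifier on same-exemplar-primed vs. word-primed patterns, then apply it to the held-out different-exemplar condition. The arrays are random placeholders, not fMRI data, and the linear SVM is an assumption about the classifier family.

```python
# Train on two priming conditions, probe generalization on a third.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n, voxels = 40, 100
same_exemplar = rng.normal(0.5, 1, (n, voxels))   # trained class 1
word_primed   = rng.normal(-0.5, 1, (n, voxels))  # trained class 0
diff_exemplar = rng.normal(0.2, 1, (n, voxels))   # held-out condition

clf = LinearSVC(C=1.0).fit(
    np.vstack([same_exemplar, word_primed]),
    np.r_[np.ones(n), np.zeros(n)])

# AC priming: do different exemplars look more like primed than word-primed?
print("fraction classified as primed:", clf.predict(diff_exemplar).mean())
```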
Activities report of PTT Research
NASA Astrophysics Data System (ADS)
In the field of postal infrastructure research, activities were performed on postcode readers, radiolabels, and techniques of operations research and artificial intelligence. In the field of telecommunication, transportation, and information, research was conducted on multipurpose coding schemes, speech recognition, hypertext, a multimedia information server, security of electronic data interchange, document retrieval, improvement of the quality of user interfaces, domotics living support techniques, and standardization of telecommunication protocols. In the field of telecommunication infrastructure and provisions research, activities were performed on universal personal telecommunications, advanced broadband network technologies, coherent techniques, measurement of audio quality, near-field facilities, local beam communication, local area networks, network security, coupling of broadband and narrowband integrated services digital networks, digital mapping, and standardization of protocols.
Department of Cybernetic Acoustics
NASA Astrophysics Data System (ADS)
The development of the theory, instrumentation and applications of methods and systems for the measurement, analysis, processing and synthesis of acoustic signals within the audio frequency range is discussed, particularly for the speech signal and the vibro-acoustic signals emitted by technical and industrial equipment treated as noise and vibration sources. The research work, both theoretical and experimental, aims at applications in various branches of science and medicine, such as: acoustical diagnostics and phoniatric rehabilitation of pathological and postoperative states of the speech organ; bilateral "man-machine" speech communication based on the analysis, recognition and synthesis of the speech signal; and vibro-acoustical diagnostics and continuous monitoring of the state of machines, technical equipment and technological processes.
NASA Astrophysics Data System (ADS)
Guidang, Excel Philip B.; Llanda, Christopher John R.; Palaoag, Thelma D.
2018-03-01
A face detection technique was implemented in this study as a strategy for controlling a multimedia instructional material. Specifically, the study achieved the following objectives: 1) developed a face detection application in Python that controls an embedded mother-tongue-based instructional material through face-recognition configuration; 2) determined the perceptions of the students using Mutt Susan's student app review rubric. The study concludes that the face detection technique is effective in controlling an electronic instructional material and can be used to change the way students interact with such material. 90% of the students perceived the application to be a great app and 10% rated it good.
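A minimal face-detection sketch in the spirit of the study, using OpenCV's bundled Haar cascade; the video path 'lesson.mp4' and the pause rule are hypothetical, and the paper's exact Python pipeline is not specified here.

```python
# Detect faces per frame; if no face is visible, signal the player to pause.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture("lesson.mp4")   # hypothetical instructional video
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        print("no face -> pause the instructional material")
cap.release()
```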
The influence of lexical statistics on temporal lobe cortical dynamics during spoken word listening
Cibelli, Emily S.; Leonard, Matthew K.; Johnson, Keith; Chang, Edward F.
2015-01-01
Neural representations of words are thought to have a complex spatio-temporal cortical basis. It has been suggested that spoken word recognition is not a process of feed-forward computations from phonetic to lexical forms, but rather involves the online integration of bottom-up input with stored lexical knowledge. Using direct neural recordings from the temporal lobe, we examined cortical responses to words and pseudowords. We found that neural populations were not only sensitive to lexical status (real vs. pseudo), but also to cohort size (number of words matching the phonetic input at each time point) and cohort frequency (lexical frequency of those words). These lexical variables modulated neural activity from the posterior to anterior temporal lobe, and also dynamically as the stimuli unfolded on a millisecond time scale. Our findings indicate that word recognition is not purely modular, but relies on rapid and online integration of multiple sources of lexical knowledge. PMID:26072003
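The two lexical variables can be sketched directly: cohort size counts the words still consistent with the input so far, and cohort frequency sums their lexical frequencies. The mini-lexicon and its per-million counts are invented.

```python
# Cohort size and summed cohort frequency as phonetic input unfolds.
LEXICON = {"cat": 21.0, "cap": 7.0, "captain": 12.0, "dog": 25.0}

def cohort(prefix):
    members = {w: f for w, f in LEXICON.items() if w.startswith(prefix)}
    return len(members), sum(members.values())

for prefix in ["c", "ca", "cap"]:   # input unfolding over time
    size, freq = cohort(prefix)
    print(prefix, "-> cohort size", size, ", summed frequency", freq)
```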
Huang, Ying; Bayfield, Mark A; Intine, Robert V; Maraia, Richard J
2006-07-01
By sequence-specific binding to 3' UUU-OH, the La protein shields precursor (pre)-RNAs from 3' end digestion and is required to protect defective pre-transfer RNAs from decay. Although La is comprised of a La motif and an RNA-recognition motif (RRM), a recent structure indicates that the RRM beta-sheet surface is not involved in UUU-OH recognition, raising questions as to its function. Progressively defective suppressor tRNAs in Schizosaccharomyces pombe reveal differential sensitivities to La and Rrp6p, a 3' exonuclease component of pre-tRNA decay. 3' end protection is compromised by mutations to the La motif but not the RRM surface. The most defective pre-tRNAs require a second activity of La, in addition to 3' protection, that requires an intact RRM surface. The two activities of La in tRNA maturation map to its two conserved RNA-binding surfaces and suggest a modular model that has implications for its other ligands.
Multiperson visual focus of attention from head pose and meeting contextual cues.
Ba, Sileye O; Odobez, Jean-Marc
2011-01-01
This paper introduces a novel contextual model for the recognition of people's visual focus of attention (VFOA) in meetings from audio-visual perceptual cues. More specifically, instead of independently recognizing the VFOA of each meeting participant from his own head pose, we propose to jointly recognize the participants' visual attention in order to introduce context-dependent interaction models that relate to group activity and the social dynamics of communication. Meeting contextual information is represented by the location of people, conversational events identifying floor holding patterns, and a presentation activity variable. By modeling the interactions between the different contexts and their combined and sometimes contradictory impact on the gazing behavior, our model allows us to handle VFOA recognition in difficult task-based meetings involving artifacts, presentations, and moving people. We validated our model through rigorous evaluation on a publicly available and challenging data set of 12 real meetings (5 hours of data). The results demonstrated that the integration of the presentation and conversation dynamical context using our model can lead to significant performance improvements.
García-Hernández, Alejandra; Galván-Tejada, Carlos E; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Velasco-Elizondo, Perla; Cárdenas-Vargas, Rogelio
2017-11-21
Human Activity Recognition (HAR) is one of the main subjects of study in the areas of computer vision and machine learning due to the great benefits that can be achieved. Examples of the study areas are: health prevention, security and surveillance, automotive research, and many others. The proposed approaches are carried out using machine learning techniques and present good results. However, it is difficult to observe how the descriptors of human activities are grouped. In order to obtain a better understanding of the behavior of descriptors, it is important to improve the ability to recognize human activities. This paper proposes a novel approach to HAR based on acoustic data and similarity networks. In this approach, we were able to characterize the sound of the activities and identify those activities by looking for similarity in the sound pattern. We evaluated the similarity of the sounds considering mainly two features: the sound location and the materials that were used. As a result, the materials proved a better reference for classifying the human activities than the location.
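A hedged sketch of the similarity-network idea: recordings become nodes, and edges connect acoustic feature vectors whose cosine similarity exceeds a threshold. The feature vectors and the 0.8 threshold are illustrative, not the paper's descriptors.

```python
# Build edges between activity recordings with similar acoustic features.
import numpy as np

rng = np.random.default_rng(2)
labels = ["dishes", "vacuum", "shower", "dishes2"]
feats = rng.normal(size=(4, 20))
feats[3] = feats[0] + rng.normal(0, 0.1, 20)   # make 'dishes2' like 'dishes'

norm = feats / np.linalg.norm(feats, axis=1, keepdims=True)
sim = norm @ norm.T                             # cosine similarity matrix
edges = [(labels[i], labels[j]) for i in range(4) for j in range(i + 1, 4)
         if sim[i, j] > 0.8]
print(edges)   # expected: [('dishes', 'dishes2')]
```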
Automatic Speech Acquisition and Recognition for Spacesuit Audio Systems
NASA Technical Reports Server (NTRS)
Ye, Sherry
2015-01-01
NASA has a widely recognized but unmet need for novel human-machine interface technologies that can facilitate communication during astronaut extravehicular activities (EVAs), when loud noises and strong reverberations inside spacesuits make communication challenging. WeVoice, Inc., has developed a multichannel signal-processing method for speech acquisition in noisy and reverberant environments that enables automatic speech recognition (ASR) technology inside spacesuits. The technology reduces noise by exploiting differences between the statistical nature of signals (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, ASR accuracy can be improved to the level at which crewmembers will find the speech interface useful. System components and features include beam forming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, and ASR decoding. Arithmetic complexity models were developed and will help designers of real-time ASR systems select proper tasks when confronted with constraints in computational resources. In Phase I of the project, WeVoice validated the technology. The company further refined the technology in Phase II and developed a prototype for testing and use by suited astronauts.
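One stage listed above, single-channel noise reduction, can be sketched with basic spectral subtraction; this is a generic textbook method under stated assumptions, not WeVoice's proprietary algorithm.

```python
# Spectral subtraction: estimate the noise magnitude from a noise-only
# lead-in, subtract it per STFT frame, and resynthesize. Signals are synthetic.
import numpy as np
from scipy.signal import stft, istft

fs = 16000
rng = np.random.default_rng(3)
noise = 0.3 * rng.normal(size=2 * fs)
speech = np.zeros(2 * fs)
speech[fs:] = np.sin(2 * np.pi * 300 * np.arange(fs) / fs)  # tone as "speech"
noisy = speech + noise

f, t, Z = stft(noisy, fs=fs, nperseg=512)
noise_mag = np.abs(Z[:, t < 1.0]).mean(axis=1, keepdims=True)  # noise-only second
mag = np.maximum(np.abs(Z) - noise_mag, 0.05 * np.abs(Z))      # subtract, floored
_, enhanced = istft(mag * np.exp(1j * np.angle(Z)), fs=fs)

err_before = np.mean((noisy - speech) ** 2)
err_after = np.mean((enhanced[: len(speech)] - speech) ** 2)
print("noise power reduced by factor", err_before / err_after)
```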
Unspoken vowel recognition using facial electromyogram.
Arjunan, Sridhar P; Kumar, Dinesh K; Yau, Wai C; Weghorn, Hans
2006-01-01
The paper aims to identify speech from facial muscle activity, without audio signals. It presents an effective technique that measures the relative activity of the articulatory muscles. Five English vowels were used as recognition variables. The paper reports using the moving root mean square (RMS) of the surface electromyogram (SEMG) of four facial muscles to segment the signal and identify the start and end of the utterance. The RMS of the signal between the start and end markers was integrated and normalised, representing the relative activity of the four muscles. These features were classified using a back-propagation neural network to identify the speech. The technique successfully classified the 5 vowels into three classes and was not sensitive to variation in the speed and style of speaking across subjects. The results also show that the technique was suitable for classifying the 5 vowels into 5 classes when trained for each subject. It is suggested that such a technology may be used to give simple unvoiced commands when trained for the specific user.
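The segmentation step can be sketched as a moving RMS over a simulated SEMG channel, thresholded to find the start and end of an utterance; the window length and threshold are illustrative.

```python
# Moving-RMS segmentation of a simulated SEMG burst.
import numpy as np

rng = np.random.default_rng(4)
semg = rng.normal(0, 0.05, 2000)
semg[800:1400] += rng.normal(0, 0.5, 600)        # simulated utterance burst

win = 50
rms = np.sqrt(np.convolve(semg ** 2, np.ones(win) / win, mode="same"))
active = np.where(rms > 0.2)[0]
print("utterance from sample", active[0], "to", active[-1])
```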
Three-dimensional audio-magnetotelluric sounding in monitoring coalbed methane reservoirs
NASA Astrophysics Data System (ADS)
Wang, Nan; Zhao, Shanshan; Hui, Jian; Qin, Qiming
2017-03-01
Audio-magnetotelluric (AMT) sounding is widely employed for rapid resistivity delineation of objective geometry in near-surface exploration. According to reservoir patterns and electrical parameters obtained in the Qinshui Basin, China, two-dimensional and three-dimensional synthetic "objective anomaly" models were designed and inverted using a modular system for electromagnetic inversion (ModEM). The results revealed that 3-D full impedance inversion yielded the subsurface models closest to the synthetic models, with one or more conductive targets correctly recovered. Therefore, conductive aquifers in the study area, including hydrous coalbed methane (CBM) reservoirs, were suggested as interpretation signs for reservoir characterization. With the aim of dynamic monitoring of CBM reservoirs, AMT surveys were carried out in consecutive years (June 2013-May 2015). 3-D inversion results demonstrated that conductive anomalies accumulated around the producing reservoirs at the corresponding depths when CBM reservoirs had high water production rates. In contrast, smaller conductive anomalies generally coincided with rapid gas production or with reservoirs where production had stopped. These analyses were in accordance with the actual production history of the CBM wells. The dynamic traces of conductive anomalies revealed that reservoir water migrated deep or converged in the axial parts and wings of folds, which contributed significantly to the formation of CBM traps. The well spacing scenario was also evaluated based on the dynamic production analysis: wells located near closed faults or flat folds, rather than open faults, had the production potential to maintain stable gas output. Three-dimensional AMT sounding thus becomes an attractive option for dynamic monitoring of CBM reservoirs, and lays a solid foundation for quantitative evaluation of reservoir parameters.
Gontier, Félix; Lagrange, Mathieu; Can, Arnaud; Lavandier, Catherine
2017-01-01
The spreading of urban areas and the growth of human population worldwide raise societal and environmental concerns. To better address these concerns, the monitoring of the acoustic environment in urban as well as rural or wilderness areas is an important matter. Building on the recent development of low cost hardware acoustic sensors, we propose in this paper to consider a sensor grid approach to tackle this issue. In this kind of approach, the crucial question is the nature of the data that are transmitted from the sensors to the processing and archival servers. To this end, we propose an efficient audio coding scheme based on third octave band spectral representation that allows: (1) the estimation of standard acoustic indicators; and (2) the recognition of acoustic events at state-of-the-art performance rate. The former is useful to provide quantitative information about the acoustic environment, while the latter is useful to gather qualitative information and build perceptually motivated indicators using for example the emergence of a given sound source. The coding scheme is also demonstrated to transmit spectrally encoded data that, reverted to the time domain using state-of-the-art techniques, are not intelligible, thus protecting the privacy of citizens. PMID:29186021
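A sketch of the third-octave band representation: band levels computed from an FFT magnitude spectrum using the base-2 convention f_c = 1000·2^(n/3); the test tone and band range are illustrative, and the paper's exact filter bank may differ.

```python
# Third-octave band levels around 1 kHz from a synthetic test tone.
import numpy as np

fs, n = 32000, 32000
x = np.sin(2 * np.pi * 1000 * np.arange(n) / fs)      # 1 kHz test tone
spec = np.abs(np.fft.rfft(x)) ** 2
freqs = np.fft.rfftfreq(n, 1 / fs)

for band in range(-3, 4):                              # bands around 1 kHz
    fc = 1000 * 2 ** (band / 3)
    lo, hi = fc * 2 ** (-1 / 6), fc * 2 ** (1 / 6)     # band edges
    power = spec[(freqs >= lo) & (freqs < hi)].sum()
    print(f"{fc:7.1f} Hz band level: {10 * np.log10(power + 1e-12):6.1f} dB")
```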
Is automatic speech-to-text transcription ready for use in psychological experiments?
Ziman, Kirsten; Heusser, Andrew C; Fitzpatrick, Paxton C; Field, Campbell E; Manning, Jeremy R
2018-04-23
Verbal responses are a convenient and naturalistic way for participants to provide data in psychological experiments (Salzinger, The Journal of General Psychology, 61(1),65-94:1959). However, audio recordings of verbal responses typically require additional processing, such as transcribing the recordings into text, as compared with other behavioral response modalities (e.g., typed responses, button presses, etc.). Further, the transcription process is often tedious and time-intensive, requiring human listeners to manually examine each moment of recorded speech. Here we evaluate the performance of a state-of-the-art speech recognition algorithm (Halpern et al., 2016) in transcribing audio data into text during a list-learning experiment. We compare transcripts made by human annotators to the computer-generated transcripts. Both sets of transcripts matched to a high degree and exhibited similar statistical properties, in terms of the participants' recall performance and recall dynamics that the transcripts captured. This proof-of-concept study suggests that speech-to-text engines could provide a cheap, reliable, and rapid means of automatically transcribing speech data in psychological experiments. Further, our findings open the door for verbal response experiments that scale to thousands of participants (e.g., administered online), as well as a new generation of experiments that decode speech on the fly and adapt experimental parameters based on participants' prior responses.
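The transcript comparison can be sketched at the word level with difflib; the two transcripts are invented examples, and the ratio is a generic agreement measure rather than the study's exact metric.

```python
# Word-level agreement between a human and a machine transcript.
import difflib

human   = "the cat sat on the mat".split()
machine = "the cat sat on a mat".split()

matcher = difflib.SequenceMatcher(a=human, b=machine)
agreement = matcher.ratio()          # 2*matches / (len(a) + len(b))
print(f"word-level agreement: {agreement:.2f}")   # ~0.83
```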
Design, Assembly, and Characterization of TALE-Based Transcriptional Activators and Repressors.
Thakore, Pratiksha I; Gersbach, Charles A
2016-01-01
Transcription activator-like effectors (TALEs) are modular DNA-binding proteins that can be fused to a variety of effector domains to regulate the epigenome. Nucleotide recognition by TALE monomers follows a simple cipher, making this a powerful and versatile method to activate or repress gene expression. Described here are methods to design, assemble, and test TALE transcription factors (TALE-TFs) for control of endogenous gene expression. In this protocol, TALE arrays are constructed by Golden Gate cloning and tested for activity by transfection and quantitative RT-PCR. These methods for engineering TALE-TFs are useful for studies in reverse genetics and genomics, synthetic biology, and gene therapy.
Lyons, Peter B
2011-01-01
The history of nuclear regulation is briefly reviewed to underscore the early recognition that independence of the regulator was essential in achieving and maintaining public credibility. The current licensing process is reviewed along with the status of applications. Challenges faced by both the NRC and the industry are reviewed, such as new construction techniques involving modular construction, digital controls replacing analog circuitry, globalization of the entire supply chain, and increased security requirements. The vital area of safety culture is discussed in some detail, and its importance is emphasized.
Hatamikia, Sepideh; Maghooli, Keivan; Nasrabadi, Ali Motie
2014-01-01
Electroencephalogram (EEG) is one of the useful biological signals for distinguishing different brain diseases and mental states. In recent years, detecting different emotional states from biological signals has received increasing attention from researchers, and several feature extraction methods and classifiers have been suggested to recognize emotions from EEG signals. In this research, we introduce an emotion recognition system using an autoregressive (AR) model, sequential forward feature selection (SFS) and a K-nearest neighbor (KNN) classifier using EEG signals recorded during emotional audio-visual inductions. The main purpose of this paper is to investigate the performance of AR features in the classification of emotional states. To achieve this goal, a distinguished AR method (Burg's method) based on the Levinson-Durbin recursive algorithm is used and AR coefficients are extracted as feature vectors. In the next step, two different feature selection methods, based on the SFS algorithm and the Davies-Bouldin index, are used in order to decrease the complexity of computation and the redundancy of features; then, three different classifiers, namely KNN, quadratic discriminant analysis and linear discriminant analysis, are used to discriminate two and three different classes of valence and arousal levels. The proposed method is evaluated with EEG signals from the publicly available database for emotion analysis using physiological signals, recorded from 32 participants during 40 one-minute audio-visual inductions. According to the results, AR features are efficient for recognizing emotional states from EEG signals, and KNN performs better than the two other classifiers in discriminating both two and three valence/arousal classes. The results also show that the SFS method improves accuracies by almost 10-15% compared to Davies-Bouldin-based feature selection. The best accuracies are 72.33% and 74.20% for two classes of valence and arousal, and 61.10% and 65.16% for three classes, respectively. PMID:25298928
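A hedged sketch of the feature pipeline: AR coefficients per EEG epoch (estimated here via Yule-Walker rather than Burg's method) fed to a KNN classifier; the signals, labels, and model order are synthetic placeholders.

```python
# AR-coefficient features + KNN on synthetic "EEG" epochs.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def ar_coeffs(x, order=6):
    """Yule-Walker AR coefficient estimate from the autocorrelation."""
    x = x - x.mean()
    r = np.correlate(x, x, "full")[len(x) - 1 : len(x) + order] / len(x)
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1 : order + 1])

rng = np.random.default_rng(5)
epochs = rng.normal(size=(60, 512))            # 60 single-epoch stand-ins
labels = rng.integers(0, 2, 60)                # two valence classes
X = np.array([ar_coeffs(e) for e in epochs])

knn = KNeighborsClassifier(n_neighbors=5).fit(X[:40], labels[:40])
print("held-out accuracy:", knn.score(X[40:], labels[40:]))
```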
Automated recognition and tracking of aerosol threat plumes with an IR camera pod
NASA Astrophysics Data System (ADS)
Fauth, Ryan; Powell, Christopher; Gruber, Thomas; Clapp, Dan
2012-06-01
Protection of fixed sites from chemical, biological, or radiological aerosol plume attacks depends on early warning so that there is time to take mitigating actions. Early warning requires continuous, autonomous, and rapid coverage of large surrounding areas; however, this must be done at an affordable cost. Once a potential threat plume is detected though, a different type of sensor (e.g., a more expensive, slower sensor) may be cued for identification purposes, but the problem is to quickly identify all of the potential threats around the fixed site of interest. To address this problem of low cost, persistent, wide area surveillance, an IR camera pod and multi-image stitching and processing algorithms have been developed for automatic recognition and tracking of aerosol plumes. A rugged, modular, static pod design, which accommodates as many as four micro-bolometer IR cameras for 45° to 180° of azimuth coverage, is presented. Various OpenCV-based image-processing algorithms, including stitching of multiple adjacent FOVs, recognition of aerosol plume objects, and the tracking of aerosol plumes, are presented using process block diagrams and sample field test results, including chemical and biological simulant plumes. Methods for dealing with the background removal, brightness equalization between images, and focus quality for optimal plume tracking are also discussed.
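The plume-recognition stage can be sketched with OpenCV background subtraction plus contour extraction; the video path, MOG2 parameters, and area threshold are illustrative assumptions, not the fielded algorithm.

```python
# Background subtraction + contour extraction on IR frames.
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)
cap = cv2.VideoCapture("ir_panorama.mp4")   # hypothetical stitched IR video
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                            cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:        # ignore small clutter
            print("candidate plume at", cv2.boundingRect(c))
cap.release()
```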
Wavelet-based audio embedding and audio/video compression
NASA Astrophysics Data System (ADS)
Mendenhall, Michael J.; Claypoole, Roger L., Jr.
2001-12-01
Watermarking, traditionally used for copyright protection, is used in a new and exciting way. An efficient wavelet-based watermarking technique embeds audio information into a video signal. Several effective compression techniques are applied to compress the resulting audio/video signal in an embedded fashion. This wavelet-based compression algorithm incorporates bit-plane coding, index coding, and Huffman coding. To demonstrate the potential of this audio embedding and audio/video compression algorithm, we embed an audio signal into a video signal and then compress. Results show that overall compression rates of 15:1 can be achieved. The video signal is reconstructed with a median PSNR of nearly 33 dB. Finally, the audio signal is extracted from the compressed audio/video signal without error.
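A hedged sketch of coefficient-domain embedding with PyWavelets: payload bits are hidden in the fractional parts of fine-scale wavelet coefficients of a frame. This is a simplified stand-in for coefficient-domain data hiding, not the paper's bit-plane/index/Huffman scheme.

```python
# Hide bits in the fractional parts of fine-scale wavelet coefficients.
import numpy as np
import pywt

rng = np.random.default_rng(6)
frame = rng.integers(0, 256, (64, 64)).astype(float)   # stand-in video frame
bits = rng.integers(0, 2, 100)                          # "audio" payload

coeffs = pywt.wavedec2(frame, "db2", level=2)
flat = coeffs[-1][0].ravel()                            # finest horizontal details
# Encode each bit as fractional part 0.25 (bit 0) or 0.75 (bit 1).
flat[:100] = np.sign(flat[:100] + 1e-9) * (
    np.floor(np.abs(flat[:100])) + 0.25 + 0.5 * bits)

recovered = (np.abs(flat[:100]) % 1 > 0.5).astype(int)
print("payload recovered:", np.array_equal(recovered, bits))
```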
Three-Dimensional Audio Client Library
NASA Technical Reports Server (NTRS)
Rizzi, Stephen A.
2005-01-01
The Three-Dimensional Audio Client Library (3DAudio library) is a group of software routines written to facilitate development of both stand-alone (audio only) and immersive virtual-reality application programs that utilize three-dimensional audio displays. The library is intended to enable the development of three-dimensional audio client application programs by use of a code base common to multiple audio server computers. The 3DAudio library calls vendor-specific audio client libraries and currently supports the AuSIM Gold-Server and Lake Huron audio servers. 3DAudio library routines contain common functions for (1) initiation and termination of a client/audio server session, (2) configuration-file input, (3) positioning functions, (4) coordinate transformations, (5) audio transport functions, (6) rendering functions, (7) debugging functions, and (8) event-list-sequencing functions. The 3DAudio software is written in the C++ programming language and currently operates under the Linux, IRIX, and Windows operating systems.
SCORPION II persistent surveillance system update
NASA Astrophysics Data System (ADS)
Coster, Michael; Chambers, Jon
2010-04-01
This paper updates the improvements and benefits demonstrated in the next generation Northrop Grumman SCORPION II family of persistent surveillance and target recognition systems produced by the Xetron Campus in Cincinnati, Ohio. SCORPION II reduces the size, weight, and cost of all SCORPION components in a flexible, field programmable system that is easier to conceal and enables integration of over fifty different Unattended Ground Sensor (UGS) and camera types from a variety of manufacturers, with a modular approach to supporting multiple Line of Sight (LOS) and Beyond Line of Sight (BLOS) communications interfaces. Since 1998 Northrop Grumman has been integrating best in class sensors with its proven universal modular Gateway to provide encrypted data exfiltration to Common Operational Picture (COP) systems and remote sensor command and control. In addition to feeding COP systems, SCORPION and SCORPION II data can be directly processed using a common sensor status graphical user interface (GUI) that allows for viewing and analysis of images and sensor data from up to seven hundred SCORPION system gateways on single or multiple displays. This GUI enables a large amount of sensor data and imagery to be used for actionable intelligence as well as remote sensor command and control by a minimum number of analysts.
A modular DNA signal translator for the controlled release of a protein by an aptamer.
Beyer, Stefan; Simmel, Friedrich C
2006-01-01
Owing to the intimate linkage of sequence and structure in nucleic acids, DNA is an extremely attractive molecule for the development of molecular devices, in particular when a combination of information processing and chemomechanical tasks is desired. Many of the previously demonstrated devices are driven by hybridization between DNA 'effector' strands and specific recognition sequences on the device. For applications it is of great interest to link several of such molecular devices together within artificial reaction cascades. Often it will not be possible to choose DNA sequences freely, e.g. when functional nucleic acids such as aptamers are used. In such cases translation of an arbitrary 'input' sequence into a desired effector sequence may be required. Here we demonstrate a molecular 'translator' for information encoded in DNA and show how it can be used to control the release of a protein by an aptamer using an arbitrarily chosen DNA input strand. The function of the translator is based on branch migration and the action of the endonuclease FokI. The modular design of the translator facilitates the adaptation of the device to various input or output sequences.
A modular DNA signal translator for the controlled release of a protein by an aptamer
Beyer, Stefan; Simmel, Friedrich C.
2006-01-01
Owing to the intimate linkage of sequence and structure in nucleic acids, DNA is an extremely attractive molecule for the development of molecular devices, in particular when a combination of information processing and chemomechanical tasks is desired. Many of the previously demonstrated devices are driven by hybridization between DNA ‘effector’ strands and specific recognition sequences on the device. For applications it is of great interest to link several of such molecular devices together within artificial reaction cascades. Often it will not be possible to choose DNA sequences freely, e.g. when functional nucleic acids such as aptamers are used. In such cases translation of an arbitrary ‘input’ sequence into a desired effector sequence may be required. Here we demonstrate a molecular ‘translator’ for information encoded in DNA and show how it can be used to control the release of a protein by an aptamer using an arbitrarily chosen DNA input strand. The function of the translator is based on branch migration and the action of the endonuclease FokI. The modular design of the translator facilitates the adaptation of the device to various input or output sequences. PMID:16547201
Incipient failure detection (IFD) of SSME ball bearings
NASA Technical Reports Server (NTRS)
1982-01-01
Because of the immense noise background during the operation of a large engine such as the SSME, the relatively low-level, unique ball bearing signatures were often buried by the overall machine signal. As a result, the most commonly used bearing failure detection technique, pattern recognition using power spectral densities (PSDs) constructed from the extracted bearing signals, is rendered useless. Data enhancement was carried out using an HP5451C Fourier Analyzer. The signal was preprocessed by a Digital Audio Corp. DAC-1024I noise-cancelling filter in order to estimate the desired signal corrupted by the background noise. Reference levels for good bearings were established; any deviation of bearing signals from these reference levels indicates incipient bearing failure.
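The reference-level approach can be sketched with Welch PSDs: establish a baseline spectrum from a good bearing and flag bins that deviate; the signals, the injected defect tone, and the 6 dB criterion are synthetic illustrations.

```python
# Baseline PSD vs. current PSD; flag frequency bins well above baseline.
import numpy as np
from scipy.signal import welch

fs = 10000
rng = np.random.default_rng(7)
t = np.arange(fs) / fs
good = rng.normal(size=fs)                          # healthy-bearing stand-in
worn = good + 0.5 * np.sin(2 * np.pi * 1870 * t)    # injected defect tone

f, psd_ref = welch(good, fs=fs, nperseg=1024)
_, psd_now = welch(worn, fs=fs, nperseg=1024)

excess = 10 * np.log10(psd_now / psd_ref)
flagged = f[excess > 6.0]                           # >6 dB above baseline
print("suspect frequencies (Hz):", flagged[:5])
```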
Effects of Instructor Attractiveness on Learning.
Westfall, Richard; Millar, Murray; Walsh, Mandy
2016-01-01
Although a considerable body of research has examined the impact of student attractiveness on instructors, little attention has been given to the influence of instructor attractiveness on students. This study tested the hypothesis that persons would perform significantly better on a learning task when they perceived their instructor to be high in physical attractiveness. To test the hypothesis, participants listened to an audio lecture while viewing a photograph of the instructor. The photograph depicted either a physically attractive instructor or a less attractive instructor. Following the lecture, participants completed a forced-choice recognition task covering material from the lecture. Consistent with predictions, attractive instructors were associated with more learning. Finally, we replicated previous findings demonstrating the role attractiveness plays in person perception.
Ad Hoc Selection of Voice over Internet Streams
NASA Technical Reports Server (NTRS)
Macha, Mitchell G. (Inventor); Bullock, John T. (Inventor)
2014-01-01
A method and apparatus for a communication system technique involving ad hoc selection of at least two audio streams is provided. Each of the at least two audio streams is a packetized version of an audio source. A data connection exists between a server and a client where a transport protocol actively propagates the at least two audio streams from the server to the client. Furthermore, software instructions executable on the client indicate a presence of the at least two audio streams, allow selection of at least one of the at least two audio streams, and direct the selected at least one of the at least two audio streams for audio playback.
Ad Hoc Selection of Voice over Internet Streams
NASA Technical Reports Server (NTRS)
Macha, Mitchell G. (Inventor); Bullock, John T. (Inventor)
2008-01-01
A method and apparatus for a communication system technique involving ad hoc selection of at least two audio streams is provided. Each of the at least two audio streams is a packetized version of an audio source. A data connection exists between a server and a client where a transport protocol actively propagates the at least two audio streams from the server to the client. Furthermore, software instructions executable on the client indicate a presence of the at least two audio streams, allow selection of at least one of the at least two audio streams, and direct the selected at least one of the at least two audio streams for audio playback.
Audio in Courseware: Design Knowledge Issues.
ERIC Educational Resources Information Center
Aarntzen, Diana
1993-01-01
Considers issues that need to be addressed when incorporating audio in courseware design. Topics discussed include functions of audio in courseware; the relationship between auditive and visual information; learner characteristics in relation to audio; events of instruction; and audio characteristics, including interactivity and speech technology.…
A Virtual Audio Guidance and Alert System for Commercial Aircraft Operations
NASA Technical Reports Server (NTRS)
Begault, Durand R.; Wenzel, Elizabeth M.; Shrum, Richard; Miller, Joel; Null, Cynthia H. (Technical Monitor)
1996-01-01
Our work in virtual reality systems at NASA Ames Research Center includes the area of aurally-guided visual search, using specially-designed audio cues and spatial audio processing (also known as virtual or "3-D audio") techniques (Begault, 1994). Previous studies at Ames had revealed that use of 3-D audio for Traffic Collision Avoidance System (TCAS) advisories significantly reduced head-down time, compared to a head-down map display (0.5 sec advantage) or no display at all (2.2 sec advantage) (Begault, 1993, 1995; Begault & Pittman, 1994; see Wenzel, 1994, for an audio demo). Since the crew must keep their head up and looking out the window as much as possible when taxiing under low-visibility conditions, and the potential for "blunder" is increased under such conditions, it was sensible to evaluate the audio spatial cueing for a prototype audio ground collision avoidance warning (GCAW) system, and a 3-D audio guidance system. Results were favorable for GCAW, but not for the audio guidance system.
The priming function of in-car audio instruction.
Keyes, Helen; Whitmore, Antony; Naneva, Stanislava; McDermott, Daragh
2018-05-01
Studies to date have focused on the priming power of visual road signs, but not the priming potential of audio road scene instruction. Here, the relative priming power of visual, audio, and multisensory road scene instructions was assessed. In a lab-based study, participants responded to target road scene turns following visual, audio, or multisensory road turn primes that were directionally congruent or incongruent with the targets, or control primes. All types of instruction (visual, audio, and multisensory) were successful in priming responses to a road scene. Responses to multisensory-primed targets (both audio and visual) were faster than responses to either audio or visual primes alone. Incongruent audio primes did not affect performance negatively in the manner of incongruent visual or multisensory primes. Results suggest that audio instructions have the potential to prime drivers to respond quickly and safely to their road environment. Peak performance will be observed if audio and visual road instruction primes can be timed to co-occur.
Audio-visual interactions in environment assessment.
Preis, Anna; Kociński, Jędrzej; Hafke-Dys, Honorata; Wrzosek, Małgorzata
2015-08-01
The aim of the study was to examine how visual and audio information influences audio-visual environment assessment. Original audio-visual recordings were made at seven different places in the city of Poznań. Participants in the psychophysical experiments were asked to rate, on a numerical standardized scale, the degree of comfort they would feel if they were in such an environment. The assessments of audio-visual comfort were carried out in a laboratory in four different conditions: (a) audio samples only, (b) original audio-visual samples, (c) video samples only, and (d) mixed audio-visual samples. The general results of this experiment showed a significant difference between the investigated conditions, but not for all the investigated samples. When conditions (a) and (b) were compared, adding visual information significantly improved comfort assessment in only three out of seven cases. On the other hand, the results show that the comfort assessment of audio-visual samples could be changed by manipulating the audio rather than the video part of the audio-visual sample. Finally, it seems that people differentiate audio-visual representations of a given place in the environment based on the composition of sound sources rather than on the sound level. Object identification is responsible for both landscape and soundscape grouping. Copyright © 2015. Published by Elsevier B.V.
47 CFR 73.403 - Digital audio broadcasting service requirements.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 47 Telecommunication 4 2012-10-01 2012-10-01 false Digital audio broadcasting service requirements... SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service requirements. (a) Broadcast radio stations using IBOC must transmit at least one over-the-air digital audio...
47 CFR 73.403 - Digital audio broadcasting service requirements.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 47 Telecommunication 4 2011-10-01 2011-10-01 false Digital audio broadcasting service requirements... SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service requirements. (a) Broadcast radio stations using IBOC must transmit at least one over-the-air digital audio...
47 CFR 73.403 - Digital audio broadcasting service requirements.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 47 Telecommunication 4 2014-10-01 2014-10-01 false Digital audio broadcasting service requirements... SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service requirements. (a) Broadcast radio stations using IBOC must transmit at least one over-the-air digital audio...
47 CFR 73.403 - Digital audio broadcasting service requirements.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 47 Telecommunication 4 2013-10-01 2013-10-01 false Digital audio broadcasting service requirements... SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service requirements. (a) Broadcast radio stations using IBOC must transmit at least one over-the-air digital audio...
Desantis, Andrea; Haggard, Patrick
2016-01-01
To maintain a temporally-unified representation of audio and visual features of objects in our environment, the brain recalibrates audio-visual simultaneity. This process allows adjustment for both differences in time of transmission and time for processing of audio and visual signals. In four experiments, we show that the cognitive processes for controlling instrumental actions also have strong influence on audio-visual recalibration. Participants learned that right and left hand button-presses each produced a specific audio-visual stimulus. Following one action the audio preceded the visual stimulus, while for the other action audio lagged vision. In a subsequent test phase, left and right button-press generated either the same audio-visual stimulus as learned initially, or the pair associated with the other action. We observed recalibration of simultaneity only for previously-learned audio-visual outcomes. Thus, learning an action-outcome relation promotes temporal grouping of the audio and visual events within the outcome pair, contributing to the creation of a temporally unified multisensory object. This suggests that learning action-outcome relations and the prediction of perceptual outcomes can provide an integrative temporal structure for our experiences of external events. PMID:27982063
The power of digital audio in interactive instruction: An unexploited medium
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pratt, J.; Trainor, M.
1989-01-01
Widespread use of audio in computer-based training (CBT) occurred with the advent of the interactive videodisc technology. This paper discusses the alternative of digital audio, which, unlike videodisc audio, enables one to rapidly revise the audio used in the CBT and which may be used in nonvideo CBT applications as well. We also discuss techniques used in audio script writing, editing, and production. Results from evaluations indicate a high degree of user satisfaction. 4 refs.
47 CFR 11.51 - EAS code and Attention Signal Transmission requirements.
Code of Federal Regulations, 2012 CFR
2012-10-01
... Message (EOM) codes using the EAS Protocol. The Attention Signal must precede any emergency audio message... audio messages. No Attention Signal is required for EAS messages that do not contain audio programming... EAS messages in the main audio channel. All DAB stations shall also transmit EAS messages on all audio...
47 CFR 11.51 - EAS code and Attention Signal Transmission requirements.
Code of Federal Regulations, 2014 CFR
2014-10-01
... Message (EOM) codes using the EAS Protocol. The Attention Signal must precede any emergency audio message... audio messages. No Attention Signal is required for EAS messages that do not contain audio programming... EAS messages in the main audio channel. All DAB stations shall also transmit EAS messages on all audio...
47 CFR 11.51 - EAS code and Attention Signal Transmission requirements.
Code of Federal Regulations, 2013 CFR
2013-10-01
... Message (EOM) codes using the EAS Protocol. The Attention Signal must precede any emergency audio message... audio messages. No Attention Signal is required for EAS messages that do not contain audio programming... EAS messages in the main audio channel. All DAB stations shall also transmit EAS messages on all audio...
Communicative Competence in Audio Classrooms: A Position Paper for the CADE 1991 Conference.
ERIC Educational Resources Information Center
Burge, Liz
Classroom practitioners need to move their attention away from the technological and logistical competencies required for audio conferencing (AC) to the required communicative competencies in order to advance their skills in handling the psychodynamics of audio virtual classrooms which include audio alone and audio with graphics. While the…
The Audio Description as a Physics Teaching Tool
ERIC Educational Resources Information Center
Cozendey, Sabrina; Costa, Maria da Piedade
2016-01-01
This study analyses the use of audio description in teaching physics concepts, aiming to determine the variables that influence the understanding of the concept. One educational resource was audio described. To produce the audio description, the screen was frozen. The video, with and without audio description, should be presented to students, so that…
Design, Assembly, and Characterization of TALE-Based Transcriptional Activators and Repressors
Thakore, Pratiksha I.; Gersbach, Charles A.
2016-01-01
Transcription activator-like effectors (TALEs) are modular DNA-binding proteins that can be fused to a variety of effector domains to regulate the epigenome. Nucleotide recognition by TALE monomers follows a simple cipher, making this a powerful and versatile method to activate or repress gene expression. Described here are methods to design, assemble, and test TALE transcription factors (TALE-TFs) for control of endogenous gene expression. In this protocol, TALE arrays are constructed by Golden Gate cloning and tested for activity by transfection and quantitative RT-PCR. These methods for engineering TALE-TFs are useful for studies in reverse genetics and genomics, synthetic biology, and gene therapy. PMID:26443215
Adaptive fuzzy leader clustering of complex data sets in pattern recognition
NASA Technical Reports Server (NTRS)
Newton, Scott C.; Pemmaraju, Surya; Mitra, Sunanda
1992-01-01
A modular, unsupervised neural network architecture for clustering and classification of complex data sets is presented. The adaptive fuzzy leader clustering (AFLC) architecture is a hybrid neural-fuzzy system that learns on-line in a stable and efficient manner. The initial classification is performed in two stages: a simple competitive stage and a distance metric comparison stage. The cluster prototypes are then incrementally updated by relocating the centroid positions from fuzzy C-means system equations for the centroids and the membership values. The AFLC algorithm is applied to the Anderson Iris data and laser-luminescent fingerprint image data. It is concluded that the AFLC algorithm successfully classifies features extracted from real data, discrete or continuous.
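A minimal NumPy sketch of the fuzzy C-means update equations that AFLC uses to relocate centroids (illustrative only; the competitive first stage and the distance-metric comparison stage of AFLC are omitted, and all names are ours):

    import numpy as np

    def fcm_update(X, centroids, m=2.0, eps=1e-9):
        # X: (n_samples, n_features); centroids: (n_clusters, n_features)
        # Distance of every sample to every centroid, shape (C, N)
        d = np.linalg.norm(X[None, :, :] - centroids[:, None, :], axis=2) + eps
        # Membership u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        power = 2.0 / (m - 1.0)
        u = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** power, axis=1)
        # New centroids: mean of the samples weighted by u^m
        w = u ** m
        return (w @ X) / w.sum(axis=1, keepdims=True), u

Iterating fcm_update until the centroids stop moving reproduces the standard fuzzy C-means fixed-point scheme.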
Painting recognition with smartphones equipped with inertial measurement unit
NASA Astrophysics Data System (ADS)
Masiero, Andrea; Guarnieri, Alberto; Pirotti, Francesco; Vettore, Antonio
2015-06-01
Recently, several works have been proposed in the literature to take advantage of the diffusion of smartphones to improve people's experience during museum visits. The rationale is that of substituting traditional written/audio guides with interactive electronic guides usable on a mobile phone. Augmented reality systems are usually considered to make the use of such electronic guides more effective for the user. The main goal of such an augmented reality system (i.e. providing the user with the information of his/her interest) is usually achieved by executing the following three tasks: recognizing the object of interest to the user, retrieving the most relevant information about it, and properly presenting the retrieved information. This paper focuses on the first task: we consider the problem of painting recognition by means of measurements provided by a smartphone. We assume that the user acquires one image of the painting of interest with the standard camera of the device. This image is compared with a set of reference images of the museum objects in order to recognize the object of interest to the user. Since comparing images taken in different conditions can lead to unsatisfactory recognition results, the acquired image is typically transformed in order to improve the results of the recognition system: first, the system estimates the homography between matched features in the two images. Then, the user image is transformed according to the estimated homography. Finally, it is compared with the reference one. This work proposes a novel method to exploit inertial measurement unit (IMU) measurements to improve the system performance, in particular in terms of computational load reduction: IMU measurements are exploited to reduce both the computational burden required to estimate the transformation to be applied to the user image, and the number of reference images to be compared with it.
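A hedged OpenCV sketch of the match-and-warp step described above (ORB features and RANSAC are one common choice, not necessarily the paper's; in the paper's method the IMU would additionally supply an initial attitude estimate to shrink this search):

    import cv2
    import numpy as np

    def warp_to_reference(user_img, ref_img, min_matches=10):
        # Detect and describe local features in both images
        orb = cv2.ORB_create(nfeatures=2000)
        kp1, des1 = orb.detectAndCompute(user_img, None)
        kp2, des2 = orb.detectAndCompute(ref_img, None)
        if des1 is None or des2 is None:
            return None
        # Match descriptors and keep the most reliable pairs
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
        if len(matches) < min_matches:
            return None  # probably not this painting
        src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        # Robustly estimate the homography, then warp into the reference frame
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        if H is None:
            return None
        h, w = ref_img.shape[:2]
        return cv2.warpPerspective(user_img, H, (w, h))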
Music-based memory enhancement in Alzheimer's disease: promise and limitations.
Simmons-Stern, Nicholas R; Deason, Rebecca G; Brandler, Brian J; Frustace, Bruno S; O'Connor, Maureen K; Ally, Brandon A; Budson, Andrew E
2012-12-01
In a previous study (Simmons-Stern, Budson & Ally, 2010), we found that patients with Alzheimer's disease (AD) better recognized visually presented lyrics when the lyrics were also sung rather than spoken at encoding. The present study sought to further investigate the effects of music on memory in patients with AD by making the content of the song lyrics relevant for the daily life of an older adult and by examining how musical encoding alters several different aspects of episodic memory. Patients with AD and healthy older adults studied visually presented novel song lyrics related to instrumental activities of daily living (IADL) that were accompanied by either a sung or a spoken recording. Overall, participants performed better on a memory test of general lyric content for lyrics that were studied sung as compared to spoken. However, on a memory test of specific lyric content, participants performed equally well for sung and spoken lyrics. We interpret these results in terms of a dual-process model of recognition memory such that the general content questions represent a familiarity-based representation that is preferentially sensitive to enhancement via music, while the specific content questions represent a recollection-based representation unaided by musical encoding. Additionally, in a test of basic recognition memory for the audio stimuli, patients with AD demonstrated equal discrimination for sung and spoken stimuli. We propose that the perceptual distinctiveness of musical stimuli enhanced metamemorial awareness in AD patients via a non-selective distinctiveness heuristic, thereby reducing false recognition while at the same time reducing true recognition and eliminating the mnemonic benefit of music. These results are discussed in the context of potential music-based memory enhancement interventions for the care of patients with AD. Published by Elsevier Ltd.
Falcón-González, Juan C; Borkoski-Barreiro, Silvia; Limiñana-Cañal, José María; Ramos-Macías, Angel
2014-01-01
Music is a universal, cross-cultural phenomenon. Perception and enjoyment of music are still not solved with the current technological objectives of cochlear implants. The objective of this article was to advance the development and validation of a method of programming cochlear implants that implements a frequency allocation strategy. We compared standard programming vs frequency programming in every subject. We studied a total of 40 patients with cochlear implants. Each patient was programmed with an optimal version of the standard program, using the Custom Sound Suite 3.2 cochlear platform. Speech tests in quiet were performed using syllable word lists from the protocol for the assessment of hearing in the Spanish language. Patients implanted bilaterally were tested in both ears at the same time. For assessing music listening habits we used the Munich Music Questionnaire and the «MACarena» (minimum auditory capability) software. All patients achieved better results in recognition, instrument tests and tonal scales with frequency programming (P<.005). Likewise, there were better results with frequency programming in recognising harmonics and in the pitch test (P<.005). Frequency programming achieves better perception and recognition results in patients in comparison with standard programming. Bilaterally stimulated patients have better perception of musical patterns and better performance in recognition of tonal scales, harmonics and musical instruments compared with patients with unilateral stimulation. Modification of frequency allocation during programming allows decreased levels of current intensity and an increased dynamic range, which allows each audio band to be mapped less obtrusively and improves the quality of representation of the signal. Copyright © 2013 Elsevier España, S.L.U. y Sociedad Española de Otorrinolaringología y Patología Cérvico-Facial. All rights reserved.
Elucidation of the binding preferences of peptide recognition modules: SH3 and PDZ domains.
Teyra, Joan; Sidhu, Sachdev S; Kim, Philip M
2012-08-14
Peptide-binding domains play a critical role in the regulation of cellular processes by mediating protein interactions involved in signalling. In recent years, the development of large-scale technologies has enabled exhaustive studies on the peptide recognition preferences of a number of peptide-binding domain families. These efforts have provided significant insights into the binding specificities of these modular domains. Many research groups have taken advantage of this unprecedented volume of specificity data and have developed a variety of new algorithms for the prediction of binding specificities of peptide-binding domains and for the prediction of their natural binding targets. This knowledge has also been applied to the design of synthetic peptide-binding domains in order to rewire protein-protein interaction networks. Here, we describe how these experimental technologies have impacted on our understanding of peptide-binding domain specificities and on the elucidation of their natural ligands. We discuss SH3 and PDZ domains as well-characterized examples, and we explore the feasibility of expanding high-throughput experiments to other peptide-binding domains. Copyright © 2012. Published by Elsevier B.V.
Localizing Tortoise Nests by Neural Networks.
Barbuti, Roberto; Chessa, Stefano; Micheli, Alessio; Pucci, Rita
2016-01-01
The goal of this research is to recognize the nest digging activity of tortoises using a device mounted atop the tortoise carapace. The device classifies tortoise movements in order to discriminate between nest digging, and non-digging activity (specifically walking and eating). Accelerometer data was collected from devices attached to the carapace of a number of tortoises during their two-month nesting period. Our system uses an accelerometer and an activity recognition system (ARS) which is modularly structured using an artificial neural network and an output filter. For the purpose of experiment and comparison, and with the aim of minimizing the computational cost, the artificial neural network has been modelled according to three different architectures based on the input delay neural network (IDNN). We show that the ARS can achieve very high accuracy on segments of data sequences, with an extremely small neural network that can be embedded in programmable low power devices. Given that digging is typically a long activity (up to two hours), the application of ARS on data segments can be repeated over time to set up a reliable and efficient system, called Tortoise@, for digging activity recognition.
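A small sketch of the tapped-delay idea behind the IDNN classifier (the arrays acc and labels are hypothetical, and an off-the-shelf feed-forward classifier stands in for the paper's embedded network; windowing turns delayed samples into one input vector, which is exactly what the input delay line provides):

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def make_windows(acc, labels, width=32, step=16):
        # acc: (n_samples, 3) tri-axial accelerometer trace
        # labels: per-sample activity codes (e.g. 0 = walk/eat, 1 = dig)
        X, y = [], []
        for start in range(0, len(acc) - width, step):
            X.append(acc[start:start + width].ravel())  # width*3 delayed inputs
            y.append(labels[start + width - 1])         # label at window end
        return np.array(X), np.array(y)

    # X, y = make_windows(acc, labels)
    # clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500).fit(X, y)

The output filter mentioned in the abstract would then smooth the per-window predictions over time, exploiting the fact that digging lasts far longer than a single window.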
Audio-Visual Temporal Recalibration Can be Constrained by Content Cues Regardless of Spatial Overlap
Roseboom, Warrick; Kawabe, Takahiro; Nishida, Shin’Ya
2013-01-01
It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated estimates of audio-visual temporal synchrony. However, it remains unclear precisely what defines a given audio-visual pair such that it is possible to maintain a temporal relationship distinct from other pairs. It has been suggested that spatial separation of the different audio-visual pairs is necessary to achieve multiple distinct audio-visual synchrony estimates. Here we investigated if this is necessarily true. Specifically, we examined whether it is possible to obtain two distinct temporal recalibrations for stimuli that differed only in featural content. Using both complex (audio visual speech; see Experiment 1) and simple stimuli (high and low pitch audio matched with either vertically or horizontally oriented Gabors; see Experiment 2) we found concurrent, and opposite, recalibrations despite there being no spatial difference in presentation location at any point throughout the experiment. This result supports the notion that the content of an audio-visual pair alone can be used to constrain distinct audio-visual synchrony estimates regardless of spatial overlap. PMID:23658549
Yu, Zhiqiang; Paul, Rakesh; Bhattacharya, Chandrabali; Bozeman, Trevor C; Rishel, Michael J; Hecht, Sidney M
2015-05-19
We have shown previously that the bleomycin (BLM) carbohydrate moiety can recapitulate the tumor cell targeting effects of the entire BLM molecule, that BLM itself is modular in nature consisting of a DNA-cleaving aglycone which is delivered selectively to the interior of tumor cells by its carbohydrate moiety, and that there are disaccharides structurally related to the BLM disaccharide which are more efficient than the natural disaccharide at tumor cell targeting/uptake. Because BLM sugars can deliver molecular cargoes selectively to tumor cells, and thus potentially form the basis for a novel antitumor strategy, it seemed important to consider additional structural features capable of affecting the efficiency of tumor cell recognition and delivery. These included the effects of sugar polyvalency and net charge (at physiological pH) on tumor cell recognition, internalization, and trafficking. Since these parameters have been shown to affect cell surface recognition, internalization, and distribution in other contexts, this study has sought to define the effects of these structural features on tumor cell recognition by bleomycin and its disaccharide. We demonstrate that both can have a significant effect on tumor cell binding/internalization, and present data which suggests that the metal ions normally bound by bleomycin following clinical administration may significantly contribute to the efficiency of tumor cell uptake, in addition to their characterized function in DNA cleavage. A BLM disaccharide-Cy5** conjugate incorporating the positively charged dipeptide d-Lys-d-Lys was found to associate with both the mitochondria and the nuclear envelope of DU145 cells, suggesting possible cellular targets for BLM disaccharide-cytotoxin conjugates.
47 CFR 73.322 - FM stereophonic sound transmission standards.
Code of Federal Regulations, 2014 CFR
2014-10-01
... transmission, modulation of the carrier by audio components within the baseband range of 50 Hz to 15 kHz shall... the carrier by audio components within the audio baseband range of 23 kHz to 99 kHz shall not exceed... method described in (a), must limit the modulation of the carrier by audio components within the audio...
47 CFR 73.322 - FM stereophonic sound transmission standards.
Code of Federal Regulations, 2013 CFR
2013-10-01
... transmission, modulation of the carrier by audio components within the baseband range of 50 Hz to 15 kHz shall... the carrier by audio components within the audio baseband range of 23 kHz to 99 kHz shall not exceed... method described in (a), must limit the modulation of the carrier by audio components within the audio...
47 CFR 73.322 - FM stereophonic sound transmission standards.
Code of Federal Regulations, 2011 CFR
2011-10-01
... transmission, modulation of the carrier by audio components within the baseband range of 50 Hz to 15 kHz shall... the carrier by audio components within the audio baseband range of 23 kHz to 99 kHz shall not exceed... method described in (a), must limit the modulation of the carrier by audio components within the audio...
47 CFR 73.322 - FM stereophonic sound transmission standards.
Code of Federal Regulations, 2012 CFR
2012-10-01
... transmission, modulation of the carrier by audio components within the baseband range of 50 Hz to 15 kHz shall... the carrier by audio components within the audio baseband range of 23 kHz to 99 kHz shall not exceed... method described in (a), must limit the modulation of the carrier by audio components within the audio...
Video content parsing based on combined audio and visual information
NASA Astrophysics Data System (ADS)
Zhang, Tong; Kuo, C.-C. Jay
1999-08-01
While previous research on audiovisual data segmentation and indexing primarily focuses on the pictorial part, significant clues contained in the accompanying audio flow are often ignored. A fully functional system for video content parsing can be achieved more successfully through a proper combination of audio and visual information. By investigating the data structure of different video types, we present tools for both audio and visual content analysis and a scheme for video segmentation and annotation in this research. In the proposed system, video data are segmented into audio scenes and visual shots by detecting abrupt changes in audio and visual features, respectively. Then, each audio scene is categorized and indexed as one of the basic audio types, while a visual shot is represented by keyframes and associated image features. An index table is then generated automatically for each video clip based on the integration of outputs from audio and visual analysis. It is shown that the proposed system provides satisfying video indexing results.
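A minimal sketch of the abrupt-change rule used for segmentation (the feature set and the threshold factor are placeholders, not the paper's values):

    import numpy as np

    def scene_boundaries(features, k=2.0):
        # features: (n_windows, n_dims) per-window audio or visual features,
        # e.g. energy/ZCR/MFCC means per audio window, colour histograms per frame
        d = np.linalg.norm(np.diff(features, axis=0), axis=1)
        # Declare a boundary where the jump is large relative to typical motion
        return np.where(d > k * np.median(d))[0] + 1  # indices starting new segments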
Comparing Audio and Video Data for Rating Communication
Williams, Kristine; Herman, Ruth; Bontempo, Daniel
2013-01-01
Video recording has become increasingly popular in nursing research, adding rich nonverbal, contextual, and behavioral information. However, the benefits of video over audio data have not been well established. We compared communication ratings of audio versus video data using the Emotional Tone Rating Scale. Twenty raters watched video clips of nursing care and rated staff communication on 12 descriptors that reflect dimensions of person-centered and controlling communication. Another group rated audio-only versions of the same clips. Interrater consistency was high within each group, with ICC (2,1) for audio = .91 and video = .94. Interrater consistency for both groups combined was also high, with ICC (2,1) for audio and video = .95. Communication ratings using audio and video data were highly correlated. The added value of video over audio-recorded data should be weighed when designing studies evaluating nursing care. PMID:23579475
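For reference, ICC(2,1) is the two-way random-effects, absolute-agreement, single-rater intraclass correlation; a compact NumPy version of the standard Shrout-Fleiss computation (our own sketch, not the authors' code):

    import numpy as np

    def icc_2_1(ratings):
        # ratings: (n_targets, k_raters) matrix of scores
        n, k = ratings.shape
        grand = ratings.mean()
        rows = ratings.mean(axis=1)   # per-target means
        cols = ratings.mean(axis=0)   # per-rater means
        msr = k * np.sum((rows - grand) ** 2) / (n - 1)   # between-target MS
        msc = n * np.sum((cols - grand) ** 2) / (k - 1)   # between-rater MS
        sse = np.sum((ratings - rows[:, None] - cols[None, :] + grand) ** 2)
        mse = sse / ((n - 1) * (k - 1))                   # residual MS
        return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)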
SH2 Domains Recognize Contextual Peptide Sequence Information to Determine Selectivity*
Liu, Bernard A.; Jablonowski, Karl; Shah, Eshana E.; Engelmann, Brett W.; Jones, Richard B.; Nash, Piers D.
2010-01-01
Selective ligand recognition by modular protein interaction domains is a primary determinant of specificity in signaling pathways. Src homology 2 (SH2) domains fulfill this capacity immediately downstream of tyrosine kinases, acting to recruit their host polypeptides to ligand proteins harboring phosphorylated tyrosine residues. The degree to which SH2 domains are selective and the mechanisms underlying selectivity are fundamental to understanding phosphotyrosine signaling networks. An examination of interactions between 50 SH2 domains and a set of 192 phosphotyrosine peptides corresponding to physiological motifs within FGF, insulin, and IGF-1 receptor pathways indicates that individual SH2 domains have distinct recognition properties and exhibit a remarkable degree of selectivity beyond that predicted by previously described binding motifs. The underlying basis for such selectivity is the ability of SH2 domains to recognize both permissive amino acid residues that enhance binding and non-permissive amino acid residues that oppose binding in the vicinity of the essential phosphotyrosine. Neighboring positions affect one another so local sequence context matters to SH2 domains. This complex linguistics allows SH2 domains to distinguish subtle differences in peptide ligands. This newly appreciated contextual dependence substantially increases the accessible information content embedded in the peptide ligands that can be effectively integrated to determine binding. This concept may serve more broadly as a paradigm for subtle recognition of physiological ligands by protein interaction domains. PMID:20627867
A human transcription factor in search mode.
Hauser, Kevin; Essuman, Bernard; He, Yiqing; Coutsias, Evangelos; Garcia-Diaz, Miguel; Simmerling, Carlos
2016-01-08
Transcription factors (TF) can change shape to bind and recognize DNA, shifting the energy landscape from a weak binding, rapid search mode to a higher affinity recognition mode. However, the mechanism(s) driving this conformational change remains unresolved and in most cases high-resolution structures of the non-specific complexes are unavailable. Here, we investigate the conformational switch of the human mitochondrial transcription termination factor MTERF1, which has a modular, superhelical topology complementary to DNA. Our goal was to characterize the details of the non-specific search mode to complement the crystal structure of the specific binding complex, providing a basis for understanding the recognition mechanism. In the specific complex, MTERF1 binds a significantly distorted and unwound DNA structure, exhibiting a protein conformation incompatible with binding to B-form DNA. In contrast, our simulations of apo MTERF1 revealed significant flexibility, sampling structures with superhelical pitch and radius complementary to the major groove of B-DNA. Docking these structures to B-DNA followed by unrestrained MD simulations led to a stable complex in which MTERF1 was observed to undergo spontaneous diffusion on the DNA. Overall, the data support an MTERF1-DNA binding and recognition mechanism driven by intrinsic dynamics of the MTERF1 superhelical topology. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Predicting the Overall Spatial Quality of Automotive Audio Systems
NASA Astrophysics Data System (ADS)
Koya, Daisuke
The spatial quality of automotive audio systems is often compromised due to their non-ideal listening environments. Automotive audio systems need to be developed quickly due to industry demands. A suitable perceptual model could evaluate the spatial quality of automotive audio systems with similar reliability to formal listening tests but take less time. Such a model is developed in this research project by adapting an existing model of spatial quality for automotive audio use. The requirements for the adaptation were investigated in a literature review. A perceptual model called QESTRAL was reviewed, which predicts the overall spatial quality of domestic multichannel audio systems. It was determined that automotive audio systems are likely to be impaired in terms of the spatial attributes that were not considered in developing the QESTRAL model, but metrics are available that might predict these attributes. To establish whether the QESTRAL model in its current form can accurately predict the overall spatial quality of automotive audio systems, MUSHRA listening tests using headphone auralisation with head tracking were conducted to collect results to be compared against predictions by the model. Based on guideline criteria, the model in its current form could not accurately predict the overall spatial quality of automotive audio systems. To improve prediction performance, the QESTRAL model was recalibrated and modified using existing metrics of the model, those that were proposed from the literature review, and newly developed metrics. The most important metrics for predicting the overall spatial quality of automotive audio systems included those that were interaural cross-correlation (IACC) based, relate to localisation of the frontal audio scene, and account for the perceived scene width in front of the listener. Modifying the model for automotive audio systems did not invalidate its use for domestic audio systems. The resulting model predicts the overall spatial quality of 2- and 5-channel automotive audio systems with a cross-validation performance of R^2 = 0.85 and root-mean-square error (RMSE) = 11.03%.
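The IACC-based metrics referred to above reduce, at their core, to the peak of the normalized interaural cross-correlation within roughly +/-1 ms of lag; a plain NumPy sketch of that core quantity (the QESTRAL metrics themselves are more elaborate):

    import numpy as np

    def iacc(left, right, fs, max_lag_ms=1.0):
        # left, right: equal-length ear signals; fs: sample rate in Hz
        max_lag = int(fs * max_lag_ms / 1000.0)
        norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
        vals = []
        for lag in range(-max_lag, max_lag + 1):
            if lag >= 0:
                vals.append(np.dot(left[:len(left) - lag], right[lag:]))
            else:
                vals.append(np.dot(left[-lag:], right[:len(right) + lag]))
        return np.max(np.abs(vals)) / norm  # 1.0 = fully correlated ear signals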
Exploring the Implementation of Steganography Protocols on Quantum Audio Signals
NASA Astrophysics Data System (ADS)
Chen, Kehan; Yan, Fei; Iliyasu, Abdullah M.; Zhao, Jianping
2018-02-01
Two quantum audio steganography (QAS) protocols are proposed, each of which manipulates or modifies the least significant qubit (LSQb) of the host quantum audio signal that is encoded as FRQA (flexible representation of quantum audio) content. The first protocol (i.e. the conventional LSQb QAS protocol, or simply the cLSQ stego protocol) is built on exchanges between qubits encoding the quantum audio message and the LSQb of the amplitude information in the host quantum audio samples. The second protocol implants information from a quantum audio message deep into the constraint-imposed most significant qubit (MSQb) of the host quantum audio samples; we refer to it as the pseudo-MSQb QAS protocol, or simply the pMSQ stego protocol. The cLSQ stego protocol is designed to guarantee high imperceptibility between the host quantum audio and its stego version, whereas the pMSQ stego protocol ensures that the resulting stego quantum audio signal is better immune to illicit tampering and copyright violations (a.k.a. robustness). Built on the circuit model of quantum computation, the circuit networks to execute the embedding and extraction algorithms of both QAS protocols are determined, and simulation-based experiments are conducted to demonstrate their implementation. Outcomes attest that both protocols offer promising trade-offs in terms of imperceptibility and robustness.
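The imperceptibility argument behind the cLSQ protocol is the same one made for classical least-significant-bit embedding; a classical analogue on 16-bit PCM samples (this is emphatically not the quantum circuit, just the bit-level idea the protocol transplants to FRQA amplitude qubits):

    import numpy as np

    def lsb_embed(samples, bits):
        # samples: np.int16 PCM; bits: np.int16 array of 0/1 message bits
        out = samples.copy()
        out[:len(bits)] = (out[:len(bits)] & ~1) | bits  # amplitude error <= 1 LSB
        return out

    def lsb_extract(samples, n_bits):
        return samples[:n_bits] & 1  # recover the embedded message bits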
Moeller, Sara K; Lee, Elizabeth A Ewing; Robinson, Michael D
2011-08-01
Dominance and submission constitute fundamentally different social interaction strategies that may be enacted most effectively to the extent that the emotions of others are relatively ignored (dominance) versus noticed (submission). On the basis of such considerations, we hypothesized a systematic relationship between chronic tendencies toward high versus low levels of interpersonal dominance and emotion decoding accuracy in objective tasks. In two studies (total N = 232), interpersonally dominant individuals exhibited poorer levels of emotion recognition in response to audio and video clips (Study 1) and facial expressions of emotion (Study 2). The results provide a novel perspective on interpersonal dominance, suggest its strategic nature (Study 2), and are discussed in relation to Fiske's (1993) social-cognitive theory of power. © 2011 APA, all rights reserved.
Tse, Longping V; Moller-Tank, Sven; Meganck, Rita M; Asokan, Aravind
2018-04-25
Adeno-associated viruses (AAV) encode a unique assembly activating protein (AAP) within their genome that is essential for capsid assembly. Studies to date have focused on establishing the role of AAP as a chaperone that mediates stability, nucleolar transport, and assembly of AAV capsid proteins. Here, we map structure-function correlates of AAP using secondary structure analysis followed by deletion and substitutional mutagenesis of specific domains, namely, the hydrophobic N-terminal domain (HR), conserved core (CC), proline-rich region (PRR), threonine/serine rich region (T/S) and basic region (BR). First, we establish that the centrally located PRR and T/S regions are flexible linker domains that can either be deleted completely or replaced by heterologous functional domains that enable ancillary functions such as fluorescent imaging or increased AAP stability. We also demonstrate that the C-terminal BR domains can be substituted with heterologous nuclear or nucleolar localization sequences that display varying ability to support AAV capsid assembly. Further, by replacing the BR domain with immunoglobulin (IgG) Fc domains, we assessed AAP complexation with AAV capsid subunits and demonstrate that the hydrophobic region (HR) and the conserved core (CC) in the AAP N-terminus are the sole determinants for viral protein (VP) recognition. However, VP recognition alone is not sufficient for capsid assembly. Our study sheds light on the modular structure-function correlates of AAP and provides multiple approaches to engineer AAP that might prove useful towards understanding and controlling AAV capsid assembly. Importance: Adeno-associated viruses (AAV) encode a unique assembly activating protein (AAP) within their genome that is essential for capsid assembly. Understanding how AAP acts as a chaperone for viral assembly could help improve efficiency and potentially control this process. Our studies reveal that AAP has a modular architecture, with each module playing a distinct role and can be engineered for carrying out new functions. Copyright © 2018 American Society for Microbiology.
Babjack, Destiny L; Cernicky, Brandon; Sobotka, Andrew J; Basler, Lee; Struthers, Devon; Kisic, Richard; Barone, Kimberly; Zuccolotto, Anthony P
2015-09-01
Using differing computer platforms and audio output devices to deliver audio stimuli often introduces (1) substantial variability across labs and (2) variable time between the intended and actual sound delivery (the sound onset latency). Fast, accurate audio onset latencies are particularly important when audio stimuli need to be delivered precisely as part of studies that depend on accurate timing (e.g., electroencephalographic, event-related potential, or multimodal studies), or in multisite studies in which standardization and strict control over the computer platforms used is not feasible. This research describes the variability introduced by using differing configurations and introduces a novel approach to minimizing audio sound latency and variability. A stimulus presentation and latency assessment approach is presented using E-Prime and Chronos (a new multifunction, USB-based data presentation and collection device). The present approach reliably delivers audio stimuli with low latencies that vary by ≤1 ms, independent of hardware and Windows operating system (OS)/driver combinations. The Chronos audio subsystem adopts a buffering, aborting, querying, and remixing approach to the delivery of audio, to achieve a consistent 1-ms sound onset latency for single-sound delivery, and precise delivery of multiple sounds that achieves standard deviations of 1/10th of a millisecond without the use of advanced scripting. Chronos's sound onset latencies are small, reliable, and consistent across systems. Testing of standard audio delivery devices and configurations highlights the need for careful attention to consistency between labs, experiments, and multiple study sites in their hardware choices, OS selections, and adoption of audio delivery systems designed to sidestep the audio latency variability issue.
Phillips, Yvonne F; Towsey, Michael; Roe, Paul
2018-01-01
Audio recordings of the environment are an increasingly important technique to monitor biodiversity and ecosystem function. While the acquisition of long-duration recordings is becoming easier and cheaper, the analysis and interpretation of that audio remains a significant research area. The issue addressed in this paper is the automated reduction of environmental audio data to facilitate ecological investigations. We describe a method that first reduces environmental audio to vectors of acoustic indices, which are then clustered. This can reduce the audio data by six to eight orders of magnitude yet retain useful ecological information. We describe techniques to visualise sequences of cluster occurrence (using for example, diel plots, rose plots) that assist interpretation of environmental audio. Colour coding acoustic clusters allows months and years of audio data to be visualised in a single image. These techniques are useful in identifying and indexing the contents of long-duration audio recordings. They could also play an important role in monitoring long-term changes in species abundance brought about by habitat degradation and/or restoration.
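A toy version of the reduce-then-cluster pipeline described above (the three indices here are crude stand-ins for the published acoustic index set, and the cluster count is arbitrary):

    import numpy as np
    from sklearn.cluster import KMeans

    def minute_index_vectors(wave, fs):
        # One small vector of acoustic indices per one-minute block of audio
        hop, vecs = fs * 60, []
        for start in range(0, len(wave) - hop + 1, hop):
            seg = wave[start:start + hop].astype(float)
            p = np.abs(np.fft.rfft(seg)) ** 2
            p /= p.sum()
            entropy = -np.sum(p * np.log2(p + 1e-12))                    # spectral entropy
            energy = np.mean(seg ** 2)                                   # mean power
            active = np.mean(np.abs(seg) > 3 * np.median(np.abs(seg)))  # activity fraction
            vecs.append([energy, entropy, active])
        return np.array(vecs)

    # labels = KMeans(n_clusters=25, n_init=10).fit_predict(minute_index_vectors(wave, fs))

Colour-coding the resulting per-minute labels along the time axis yields the diel-style visualisations the paper describes, compressing months of audio into a single image.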
Holographic disk with high data transfer rate: its application to an audio response memory.
Kubota, K; Ono, Y; Kondo, M; Sugama, S; Nishida, N; Sakaguchi, M
1980-03-15
This paper describes a memory realized with a high data transfer rate using the holographic parallel-processing function and its application to an audio response system that supplies many audio messages to many terminals simultaneously. Digitalized audio messages are recorded as tiny 1-D Fourier transform holograms on a holographic disk. A hologram recorder and a hologram reader were constructed to test and demonstrate the holographic audio response memory feasibility. Experimental results indicate the potentiality of an audio response system with a 2000-word vocabulary and 250-Mbit/sec bit transfer rate.
Haston, Elspeth; Cubey, Robert; Pullan, Martin; Atkins, Hannah; Harris, David J
2012-01-01
Digitisation programmes in many institutes frequently involve disparate and irregular funding, diverse selection criteria and scope, with different members of staff managing and operating the processes. These factors have influenced the decision at the Royal Botanic Garden Edinburgh to develop an integrated workflow for the digitisation of herbarium specimens which is modular and scalable, enabling a single overall workflow to be used for all digitisation projects. This integrated workflow is comprised of three principal elements: a specimen workflow, a data workflow and an image workflow. The specimen workflow is strongly linked to curatorial processes which will impact on the prioritisation, selection and preparation of the specimens. The importance of including a conservation element within the digitisation workflow is highlighted. The data workflow includes the concept of three main categories of collection data: label data, curatorial data and supplementary data. It is shown that each category of data has its own properties which influence the timing of data capture within the workflow. Software has been developed for the rapid capture of curatorial data, and optical character recognition (OCR) software is being used to increase the efficiency of capturing label data and supplementary data. The large number and size of the images have necessitated the inclusion of automated systems within the image workflow.
Modular representation of layered neural networks.
Watanabe, Chihiro; Hiramatsu, Kaoru; Kashino, Kunio
2018-01-01
Layered neural networks have greatly improved the performance of various applications including image processing, speech recognition, natural language processing, and bioinformatics. However, it is still difficult to discover or interpret knowledge from the inference provided by a layered neural network, since its internal representation has many nonlinear and complex parameters embedded in hierarchical layers. Therefore, it becomes important to establish a new methodology by which layered neural networks can be understood. In this paper, we propose a new method for extracting a global and simplified structure from a layered neural network. Based on network analysis, the proposed method detects communities or clusters of units with similar connection patterns. We show its effectiveness by applying it to three use cases. (1) Network decomposition: it can decompose a trained neural network into multiple small independent networks, thus dividing the problem and reducing the computation time. (2) Training assessment: the appropriateness of a trained result with a given hyperparameter or randomly chosen initial parameters can be evaluated by using a modularity index. (3) Data analysis: in practical data, it reveals the community structure in the input, hidden, and output layers, which serves as a clue for discovering knowledge from a trained neural network. Copyright © 2017 Elsevier Ltd. All rights reserved.
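One plausible concrete reading of the community-detection step, using NetworkX greedy modularity maximization over the strongest inter-layer weights (a sketch under our own assumptions; the paper's clustering procedure may differ in detail):

    import numpy as np
    import networkx as nx

    def unit_communities(weight_mats, keep=0.2):
        # weight_mats: list of (n_in, n_out) weight matrices, one per layer
        G, offset = nx.Graph(), 0
        for W in weight_mats:
            n_in, _ = W.shape
            thresh = np.quantile(np.abs(W), 1 - keep)  # keep strongest 20% of weights
            for i, j in zip(*np.where(np.abs(W) >= thresh)):
                G.add_edge(offset + i, offset + n_in + j, weight=abs(W[i, j]))
            offset += n_in
        comms = nx.algorithms.community.greedy_modularity_communities(G, weight="weight")
        return list(comms)  # sets of unit indices with similar connection patterns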
Eye center localization and gaze gesture recognition for human-computer interaction.
Zhang, Wenhao; Smith, Melvyn L; Smith, Lyndon N; Farooq, Abdul
2016-03-01
This paper introduces an unsupervised modular approach for accurate and real-time eye center localization in images and videos, following a coarse-to-fine, global-to-regional scheme. The trajectories of eye centers in consecutive frames, i.e., gaze gestures, are further analyzed, recognized, and employed to boost the human-computer interaction (HCI) experience. This modular approach makes use of isophote and gradient features to estimate the eye center locations. A selective oriented gradient filter has been specifically designed to remove strong gradients from eyebrows, eye corners, and shadows, which defeat most eye center localization methods. A real-world implementation utilizing these algorithms has been designed in the form of an interactive advertising billboard to demonstrate the effectiveness of our method for HCI. The eye center localization algorithm has been compared with 10 other algorithms on the BioID database and six other algorithms on the GI4E database, outperforming all of them in localization accuracy. Further tests on the Extended Yale Face Database B and self-collected data have shown this algorithm to be robust against moderate head poses and poor illumination conditions. The interactive advertising billboard has demonstrated outstanding usability and effectiveness in our tests and shows great potential for benefiting a wide range of real-world HCI applications.
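The gradient feature can be illustrated with a naive means-of-gradients eye-center estimator of the kind this family of methods builds on (a NumPy sketch; the paper's isophote features and selective oriented gradient filter are omitted, and the brute-force search here is far from real time):

# Score each candidate center by how well displacement vectors to strong
# gradient pixels align with the gradient directions (circular iris edges).
import numpy as np

def eye_center(gray: np.ndarray) -> tuple:
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    mask = mag > np.percentile(mag, 90)      # keep strong gradients only
    ys, xs = np.nonzero(mask)
    gxn, gyn = gx[mask] / mag[mask], gy[mask] / mag[mask]

    best, best_score = (0, 0), -np.inf
    h, w = gray.shape
    for cy in range(h):
        for cx in range(w):
            dy, dx = ys - cy, xs - cx
            norm = np.hypot(dx, dy)
            norm[norm == 0] = 1.0            # avoid dividing by zero at c
            dots = (dx / norm) * gxn + (dy / norm) * gyn
            score = np.mean(np.maximum(dots, 0) ** 2)
            if score > best_score:
                best, best_score = (cx, cy), score
    return best  # (x, y) of the estimated eye center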
SCORPION II persistent surveillance system with universal gateway
NASA Astrophysics Data System (ADS)
Coster, Michael; Chambers, Jonathan; Brunck, Albert
2009-05-01
This paper addresses improvements and benefits derived from the next-generation Northrop Grumman SCORPION II family of persistent surveillance and target recognition systems produced by the Xetron campus in Cincinnati, Ohio. SCORPION II reduces the size, weight, and cost of all SCORPION components in a flexible, field-programmable system that is easier to conceal, backward compatible, and enables integration of over forty Unattended Ground Sensor (UGS) and camera types from a variety of manufacturers, with a modular approach to supporting multiple Line of Sight (LOS) and Beyond Line of Sight (BLOS) communications interfaces. Since 1998 Northrop Grumman has been integrating best-in-class sensors with its proven universal modular Gateway to provide encrypted data exfiltration to Common Operational Picture (COP) systems and remote sensor command and control. In addition to being fed to COP systems, SCORPION and SCORPION II data can be directly processed using a common sensor-status graphical user interface (GUI) that allows for viewing and analysis of images and sensor data from up to seven hundred SCORPION system Gateways on single or multiple displays. This GUI enables a large amount of sensor data and imagery to be used for actionable intelligence, as well as remote sensor command and control, by a minimum number of analysts.
Electrophysiological evidence for Audio-visuo-lingual speech integration.
Treille, Avril; Vilain, Coriandre; Schwartz, Jean-Luc; Hueber, Thomas; Sato, Marc
2018-01-31
Recent neurophysiological studies demonstrate that audio-visual speech integration partly operates through temporal expectations and speech-specific predictions. From these results, one common view is that the binding of auditory and visual (lipread) speech cues relies on their joint probability and prior associative audio-visual experience. The present EEG study examined whether visual tongue movements integrate with relevant speech sounds, despite little associative audio-visual experience between the two modalities. A second objective was to determine possible similarities and differences in audio-visual speech integration between the unusual audio-visuo-lingual and the classical audio-visuo-labial modality. To this aim, participants were presented with auditory, visual, and audio-visual isolated syllables, with the visual presentation showing either a sagittal view of the tongue movements or a facial view of the lip movements of a speaker, the lingual and facial movements having been recorded by an ultrasound imaging system and a video camera, respectively. In line with previous EEG studies, our results revealed an amplitude decrease and a latency facilitation of P2 auditory evoked potentials in both the audio-visuo-lingual and the audio-visuo-labial condition compared to the sum of the unimodal conditions. These results argue against the view that auditory and visual speech cues integrate solely on the basis of prior associative audio-visual perceptual experience. Rather, they suggest that dynamic and phonetic informational cues are sharable across sensory modalities, possibly through a cross-modal transfer of implicit articulatory motor knowledge. Copyright © 2017 Elsevier Ltd. All rights reserved.
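The additive-model comparison reported here, bimodal response versus the sum of the unimodal responses, can be sketched as follows; the sampling rate, P2 window and random toy data are assumptions for illustration only:

# Compare AV against A + V; sub-additive integration shows up as
# AV - (A + V) < 0 around the P2 component (~150-250 ms).
import numpy as np

fs = 500                                    # sampling rate (Hz), assumed
t = np.arange(-0.1, 0.5, 1 / fs)            # epoch from -100 to 500 ms

def p2_peak(erp, t, window=(0.15, 0.25)):
    """Return (latency_s, amplitude) of the maximum in the P2 window."""
    sel = (t >= window[0]) & (t <= window[1])
    idx = np.argmax(erp[sel])
    return t[sel][idx], erp[sel][idx]

# erp_a, erp_v, erp_av would be trial-averaged ERPs per condition;
# random toy traces stand in for real data here.
rng = np.random.default_rng(1)
erp_a, erp_v, erp_av = (rng.normal(0, 0.1, t.size) for _ in range(3))
interaction = erp_av - (erp_a + erp_v)      # negative at P2 => sub-additivity
print(p2_peak(erp_av, t), p2_peak(erp_a + erp_v, t))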
78 FR 38093 - Seventh Meeting: RTCA Special Committee 226, Audio Systems and Equipment
Federal Register 2010, 2011, 2012, 2013, 2014
2013-06-25
... Committee 226, Audio Systems and Equipment AGENCY: Federal Aviation Administration (FAA), U.S. Department of Transportation (DOT). ACTION: Meeting Notice of RTCA Special Committee 226, Audio Systems and Equipment. SUMMARY... 226, Audio Systems and Equipment
Diagnostic accuracy of sleep bruxism scoring in absence of audio-video recording: a pilot study.
Carra, Maria Clotilde; Huynh, Nelly; Lavigne, Gilles J
2015-03-01
Based on the most recent polysomnographic (PSG) research diagnostic criteria, sleep bruxism is diagnosed when >2 rhythmic masticatory muscle activity (RMMA) episodes/h of sleep are scored on the masseter and/or temporalis muscles. These criteria have not yet been validated for portable PSG systems. This pilot study aimed to assess the diagnostic accuracy of scoring sleep bruxism in the absence of audio-video recordings. Ten subjects (mean age 24.7 ± 2.2 years) with a clinical diagnosis of sleep bruxism spent one night in the sleep laboratory. PSG recordings were performed with a portable system (type 2) while audio-video was recorded. Sleep studies were scored by the same examiner three times: (1) without, (2) with, and (3) without audio-video, in order to test the intra-scoring and intra-examiner reliability of RMMA scoring. The RMMA event-by-event concordance rate between scoring without audio-video and with audio-video was 68.3%. Overall, the RMMA index was overestimated by 23.8% without audio-video. However, the intra-class correlation coefficient (ICC) between scorings with and without audio-video was good (ICC = 0.91; p < 0.001), and the intra-examiner reliability was high (ICC = 0.97; p < 0.001). The clinical diagnosis of sleep bruxism was confirmed in 8/10 subjects based on scoring without audio-video and in 6/10 subjects with audio-video. Despite the absence of audio-video recording, the diagnostic accuracy of assessing RMMA with portable PSG systems appeared to remain good, supporting their use for both research and clinical purposes. However, the risk of moderate overestimation in the absence of audio-video must be taken into account.
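Once RMMA episodes have been scored, the quoted research criterion reduces to a simple rule; a minimal sketch, assuming the episode count and total sleep time are already available (the expert scoring itself is the hard part):

# Research diagnostic criterion stated above: > 2 RMMA episodes per hour.
def rmma_index(n_episodes: int, total_sleep_hours: float) -> float:
    return n_episodes / total_sleep_hours

def meets_sb_criterion(n_episodes: int, total_sleep_hours: float) -> bool:
    return rmma_index(n_episodes, total_sleep_hours) > 2.0

print(meets_sb_criterion(19, 7.5))   # index = 2.53/h -> True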
Theory for the Emergence of Modularity in Complex Systems
NASA Astrophysics Data System (ADS)
Deem, Michael; Park, Jeong-Man
2013-03-01
Biological systems are modular, and this modularity evolves over time and in different environments. A number of observations have been made of increased modularity in biological systems under increased environmental pressure. We here develop a theory for the dynamics of modularity in these systems. We find a principle of least action for the evolved modularity at long times. In addition, we find a fluctuation dissipation relation for the rate of change of modularity at short times. We discuss a number of biological and social systems that can be understood with this framework. The modularity of the protein-protein interaction network increases when yeast are exposed to heat shock, and the modularity of the protein-protein networks in both yeast and E. coli appears to have increased over evolutionary time. Food webs in low-energy, stressful environments are more modular than those in plentiful environments, arid ecologies are more modular during droughts, and foraging of sea otters is more modular when food is limiting. The modularity of social networks changes over time: stock brokers' instant-messaging networks are more modular under stressful market conditions, criminal networks are more modular under increased police pressure, and world trade network modularity has decreased.
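The abstract does not state which modularity measure is used; a standard choice in network studies is Newman's index, where A_ij is the adjacency matrix, k_i the degree of node i, m the total number of edges, and delta(c_i, c_j) = 1 when nodes i and j belong to the same module:

Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)

Larger Q indicates denser connectivity within modules and sparser connectivity between them.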
47 CFR 73.403 - Digital audio broadcasting service requirements.
Code of Federal Regulations, 2010 CFR
2010-10-01
... programming stream at no direct charge to listeners. In addition, a broadcast radio station must simulcast its analog audio programming on one of its digital audio programming streams. The DAB audio programming... analog programming service currently provided to listeners. (b) Emergency information. The emergency...
High-Fidelity Piezoelectric Audio Device
NASA Technical Reports Server (NTRS)
Woodward, Stanley E.; Fox, Robert L.; Bryant, Robert G.
2003-01-01
ModalMax is an innovative means of harnessing the vibration of a piezoelectric actuator to produce an energy-efficient, low-profile device with high-bandwidth, high-fidelity audio response. The piezoelectric audio device outperforms many commercially available speakers made using speaker cones. The piezoelectric device weighs substantially less (4 g) than speaker cones that use magnets (10 g). ModalMax devices are extremely simple to fabricate: the entire audio device is made by lamination, and this simplicity lends itself to lower cost. The piezoelectric audio device can be used without its acoustic chambers, resulting in a very low thickness of 0.023 in. (0.58 mm). It can also be completely encapsulated, which makes it very attractive for use in wet environments; encapsulation does not significantly alter the audio response. Its small size makes it applicable to many consumer electronic products, such as pagers, portable radios, headphones, laptop computers, computer monitors, toys, and electronic games. The audio device can also be used in automobile or aircraft sound systems.
Defraene, Bruno; van Waterschoot, Toon; Diehl, Moritz; Moonen, Marc
2016-07-01
Subjective audio quality evaluation experiments have been conducted to assess the performance of embedded-optimization-based precompensation algorithms for mitigating perceptible linear and nonlinear distortion in audio signals. It is concluded with statistical significance that the perceived audio quality is improved by applying an embedded-optimization-based precompensation algorithm, both when (i) nonlinear distortion alone and (ii) a combination of linear and nonlinear distortion is present. Moreover, a significant positive correlation is reported between the collected subjective and objective PEAQ audio quality scores, supporting the validity of using PEAQ to predict the impact of linear and nonlinear distortion on the perceived audio quality.
Validation of a digital audio recording method for the objective assessment of cough in the horse.
Duz, M; Whittaker, A G; Love, S; Parkin, T D H; Hughes, K J
2010-10-01
To validate the use of digital audio recording and analysis for quantification of coughing in horses. Part A: Nine simultaneous digital audio and video recordings were collected individually from seven stabled horses over a 1 h period using a digital audio recorder attached to the halter. Audio files were analysed using audio analysis software. Video and audio recordings were analysed for cough count and timing by two blinded operators on two occasions using a randomised study design for determination of intra-operator and inter-operator agreement. Part B: Seventy-eight hours of audio recordings obtained from nine horses were analysed once by two blinded operators to assess inter-operator repeatability on a larger sample. Part A: There was complete agreement between audio and video analyses and inter- and intra-operator analyses. Part B: There was >97% agreement between operators on number and timing of 727 coughs recorded over 78 h. The results of this study suggest that the cough monitor methodology used has excellent sensitivity and specificity for the objective assessment of cough in horses and intra- and inter-operator variability of recorded coughs is minimal. Crown Copyright 2010. Published by Elsevier India Pvt Ltd. All rights reserved.
47 CFR 73.9005 - Compliance requirements for covered demodulator products: Audio.
Code of Federal Regulations, 2010 CFR
2010-10-01
... products: Audio. 73.9005 Section 73.9005 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED....9005 Compliance requirements for covered demodulator products: Audio. Except as otherwise provided in §§ 73.9003(a) or 73.9004(a), covered demodulator products shall not output the audio portions of...
36 CFR 1002.12 - Audio disturbances.
Code of Federal Regulations, 2014 CFR
2014-07-01
... 36 Parks, Forests, and Public Property 3 2014-07-01 2014-07-01 false Audio disturbances. 1002.12... RECREATION § 1002.12 Audio disturbances. (a) The following are prohibited: (1) Operating motorized equipment or machinery such as an electric generating plant, motor vehicle, motorized toy, or an audio device...
36 CFR 1002.12 - Audio disturbances.
Code of Federal Regulations, 2012 CFR
2012-07-01
... 36 Parks, Forests, and Public Property 3 2012-07-01 2012-07-01 false Audio disturbances. 1002.12... RECREATION § 1002.12 Audio disturbances. (a) The following are prohibited: (1) Operating motorized equipment or machinery such as an electric generating plant, motor vehicle, motorized toy, or an audio device...
50 CFR 27.72 - Audio equipment.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 50 Wildlife and Fisheries 6 2010-10-01 2010-10-01 false Audio equipment. 27.72 Section 27.72 Wildlife and Fisheries UNITED STATES FISH AND WILDLIFE SERVICE, DEPARTMENT OF THE INTERIOR (CONTINUED) THE... Audio equipment. The operation or use of audio devices including radios, recording and playback devices...
36 CFR 1002.12 - Audio disturbances.
Code of Federal Regulations, 2011 CFR
2011-07-01
... 36 Parks, Forests, and Public Property 3 2011-07-01 2011-07-01 false Audio disturbances. 1002.12... RECREATION § 1002.12 Audio disturbances. (a) The following are prohibited: (1) Operating motorized equipment or machinery such as an electric generating plant, motor vehicle, motorized toy, or an audio device...
36 CFR 1002.12 - Audio disturbances.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 36 Parks, Forests, and Public Property 3 2010-07-01 2010-07-01 false Audio disturbances. 1002.12... RECREATION § 1002.12 Audio disturbances. (a) The following are prohibited: (1) Operating motorized equipment or machinery such as an electric generating plant, motor vehicle, motorized toy, or an audio device...
50 CFR 27.72 - Audio equipment.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 50 Wildlife and Fisheries 8 2011-10-01 2011-10-01 false Audio equipment. 27.72 Section 27.72 Wildlife and Fisheries UNITED STATES FISH AND WILDLIFE SERVICE, DEPARTMENT OF THE INTERIOR (CONTINUED) THE... Audio equipment. The operation or use of audio devices including radios, recording and playback devices...
50 CFR 27.72 - Audio equipment.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 50 Wildlife and Fisheries 9 2012-10-01 2012-10-01 false Audio equipment. 27.72 Section 27.72 Wildlife and Fisheries UNITED STATES FISH AND WILDLIFE SERVICE, DEPARTMENT OF THE INTERIOR (CONTINUED) THE... Audio equipment. The operation or use of audio devices including radios, recording and playback devices...
47 CFR 87.483 - Audio visual warning systems.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 47 Telecommunication 5 2014-10-01 2014-10-01 false Audio visual warning systems. 87.483 Section 87... AVIATION SERVICES Stations in the Radiodetermination Service § 87.483 Audio visual warning systems. An audio visual warning system (AVWS) is a radar-based obstacle avoidance system. AVWS activates...
Valency-Controlled Framework Nucleic Acid Signal Amplifiers.
Liu, Qi; Ge, Zhilei; Mao, Xiuhai; Zhou, Guobao; Zuo, Xiaolei; Shen, Juwen; Shi, Jiye; Li, Jiang; Wang, Lihua; Chen, Xiaoqing; Fan, Chunhai
2018-06-11
Weak ligand-receptor recognition events are often amplified by recruiting multiple regulatory biomolecules to the action site in biological systems. However, signal amplification in in vitro biomimetic systems generally lacks the spatiotemporal regulation found in vivo. Herein we report a framework nucleic acid (FNA)-programmed strategy to develop valence-controlled signal amplifiers with high modularity for ultrasensitive biosensing. We demonstrated that the FNA-programmed signal amplifiers could recruit nucleic acids, proteins, and inorganic nanoparticles in a stoichiometric manner. The valence-controlled signal amplifier enhanced the quantification ability of electrochemical biosensors, and enabled ultrasensitive detection of tumor-relevant circulating free DNA (cfDNA) with sensitivity enhancement of 3-5 orders of magnitude and improved dynamic range. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Semantic Context Detection Using Audio Event Fusion
NASA Astrophysics Data System (ADS)
Chu, Wei-Ta; Cheng, Wen-Huang; Wu, Ja-Ling
2006-12-01
Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events, that is, gunshot, explosion, engine, and car braking, in action movies. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining by using audio characteristics.
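A minimal sketch of the audio-event layer, assuming the hmmlearn library and precomputed feature frames (e.g., MFCCs) per event class; one HMM is trained per event and a clip is labelled by maximum log-likelihood. The context-level fusion (ergodic HMM or SVM) is omitted.

# Per-class GaussianHMM audio event recognizers.
import numpy as np
from hmmlearn import hmm

EVENTS = ["gunshot", "explosion", "engine", "car_braking"]

def train_event_models(train_data):
    """train_data: {event: (X, lengths)}, X = stacked feature frames,
    lengths = per-sequence frame counts."""
    models = {}
    for event, (X, lengths) in train_data.items():
        m = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=20)
        m.fit(X, lengths)
        models[event] = m
    return models

def classify(models, X):
    """Label a feature sequence X by the highest-scoring event HMM."""
    return max(models, key=lambda e: models[e].score(X))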
Effect of Audio Coaching on Correlation of Abdominal Displacement With Lung Tumor Motion
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nakamura, Mitsuhiro; Narita, Yuichiro; Matsuo, Yukinori
2009-10-01
Purpose: To assess the effect of audio coaching on the time-dependent behavior of the correlation between abdominal motion and lung tumor motion and the corresponding lung tumor position mismatches. Methods and Materials: Six patients who had a lung tumor with a motion range >8 mm were enrolled in the present study. Breathing-synchronized fluoroscopy was performed initially without audio coaching, followed by fluoroscopy with recorded audio coaching, for multiple days. Two different measurements, anteroposterior abdominal displacement using the real-time positioning management system and superoinferior (SI) lung tumor motion by X-ray fluoroscopy, were performed simultaneously, and their sequential images were recorded using one display system. The lung tumor position was automatically detected with a template matching technique. The relationship between the abdominal and lung tumor motion was analyzed with and without audio coaching. Results: The mean SI tumor displacement was 10.4 mm without audio coaching and increased to 23.0 mm with audio coaching (p < .01). The correlation coefficients ranged from 0.89 to 0.97 with free breathing. With audio coaching, the correlation coefficients improved significantly (range, 0.93-0.99; p < .01), and the SI lung tumor position mismatches became larger in 75% of all sessions. Conclusion: Audio coaching served to increase the degree of correlation and make it more reproducible. In addition, the phase shifts between tumor motion and abdominal displacement were improved; however, all patients breathed more deeply, and the SI lung tumor position mismatches became slightly larger with audio coaching than without.
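The core correlation analysis can be sketched with NumPy on synchronized displacement traces; the breathing period, amplitude, phase lag and noise below are toy values, not the study's data:

# Correlate abdominal displacement with SI tumor position (toy signals).
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 60, 1800)                    # 60 s at 30 Hz, assumed
abdomen = np.sin(2 * np.pi * 0.25 * t)          # AP abdominal displacement
tumor = 11.5 * np.sin(2 * np.pi * 0.25 * t - 0.3) + rng.normal(0, 0.5, t.size)

r = np.corrcoef(abdomen, tumor)[0, 1]
print(f"abdomen-tumor correlation r = {r:.2f}")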
47 CFR 10.520 - Common audio attention signal.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 47 Telecommunication 1 2011-10-01 2011-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...
36 CFR 2.12 - Audio disturbances.
Code of Federal Regulations, 2012 CFR
2012-07-01
... 36 Parks, Forests, and Public Property 1 2012-07-01 2012-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...
36 CFR 2.12 - Audio disturbances.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 36 Parks, Forests, and Public Property 1 2010-07-01 2010-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...
37 CFR 202.22 - Acquisition and deposit of unpublished audio and audiovisual transmission programs.
Code of Federal Regulations, 2011 CFR
2011-07-01
... unpublished audio and audiovisual transmission programs. 202.22 Section 202.22 Patents, Trademarks, and... REGISTRATION OF CLAIMS TO COPYRIGHT § 202.22 Acquisition and deposit of unpublished audio and audiovisual... and copies of unpublished audio and audiovisual transmission programs by the Library of Congress under...
36 CFR § 1002.12 - Audio disturbances.
Code of Federal Regulations, 2013 CFR
2013-07-01
... 36 Parks, Forests, and Public Property 3 2013-07-01 2012-07-01 true Audio disturbances. § 1002.12... RECREATION § 1002.12 Audio disturbances. (a) The following are prohibited: (1) Operating motorized equipment or machinery such as an electric generating plant, motor vehicle, motorized toy, or an audio device...
47 CFR 10.520 - Common audio attention signal.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 47 Telecommunication 1 2013-10-01 2013-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...
37 CFR 202.22 - Acquisition and deposit of unpublished audio and audiovisual transmission programs.
Code of Federal Regulations, 2012 CFR
2012-07-01
... unpublished audio and audiovisual transmission programs. 202.22 Section 202.22 Patents, Trademarks, and... REGISTRATION OF CLAIMS TO COPYRIGHT § 202.22 Acquisition and deposit of unpublished audio and audiovisual... and copies of unpublished audio and audiovisual transmission programs by the Library of Congress under...
36 CFR 2.12 - Audio disturbances.
Code of Federal Regulations, 2013 CFR
2013-07-01
... 36 Parks, Forests, and Public Property 1 2013-07-01 2013-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...
37 CFR 202.22 - Acquisition and deposit of unpublished audio and audiovisual transmission programs.
Code of Federal Regulations, 2013 CFR
2013-07-01
... unpublished audio and audiovisual transmission programs. 202.22 Section 202.22 Patents, Trademarks, and... REGISTRATION OF CLAIMS TO COPYRIGHT § 202.22 Acquisition and deposit of unpublished audio and audiovisual... and copies of unpublished audio and audiovisual transmission programs by the Library of Congress under...
ENERGY STAR Certified Audio Video
Certified models meet all ENERGY STAR requirements as listed in the Version 3.0 ENERGY STAR Program Requirements for Audio Video Equipment that are effective as of May 1, 2013. A detailed listing of key efficiency criteria is available at http://www.energystar.gov/index.cfm?c=audio_dvd.pr_crit_audio_dvd
36 CFR 2.12 - Audio disturbances.
Code of Federal Regulations, 2014 CFR
2014-07-01
... 36 Parks, Forests, and Public Property 1 2014-07-01 2014-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...
Code of Federal Regulations, 2014 CFR
2014-10-01
...: (1) Inputs. Decoders must have the capability to receive at least two audio inputs from EAS... externally, at least two minutes of audio or text messages. A decoder manufactured without an internal means to record and store audio or text must be equipped with a means (such as an audio or digital jack...
Code of Federal Regulations, 2013 CFR
2013-10-01
...: (1) Inputs. Decoders must have the capability to receive at least two audio inputs from EAS... externally, at least two minutes of audio or text messages. A decoder manufactured without an internal means to record and store audio or text must be equipped with a means (such as an audio or digital jack...
47 CFR 10.520 - Common audio attention signal.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 47 Telecommunication 1 2014-10-01 2014-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...
37 CFR 202.22 - Acquisition and deposit of unpublished audio and audiovisual transmission programs.
Code of Federal Regulations, 2014 CFR
2014-07-01
... unpublished audio and audiovisual transmission programs. 202.22 Section 202.22 Patents, Trademarks, and... REGISTRATION OF CLAIMS TO COPYRIGHT § 202.22 Acquisition and deposit of unpublished audio and audiovisual... and copies of unpublished audio and audiovisual transmission programs by the Library of Congress under...
47 CFR 10.520 - Common audio attention signal.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 47 Telecommunication 1 2012-10-01 2012-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...
Code of Federal Regulations, 2012 CFR
2012-10-01
...: (1) Inputs. Decoders must have the capability to receive at least two audio inputs from EAS... externally, at least two minutes of audio or text messages. A decoder manufactured without an internal means to record and store audio or text must be equipped with a means (such as an audio or digital jack...
36 CFR 2.12 - Audio disturbances.
Code of Federal Regulations, 2011 CFR
2011-07-01
... 36 Parks, Forests, and Public Property 1 2011-07-01 2011-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...
Camuñas-Mesa, Luis A; Domínguez-Cordero, Yaisel L; Linares-Barranco, Alejandro; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabé
2018-01-01
Convolutional Neural Networks (ConvNets) are a particular type of neural network often used for applications like image recognition, video analysis or natural language processing. They are inspired by the human brain, following a specific organization of the connectivity pattern between layers of neurons known as the receptive field. These networks have traditionally been implemented in software, but they become more computationally expensive as they scale up, limiting real-time processing of high-speed stimuli. Hardware implementations, on the other hand, are difficult to reuse across applications due to their reduced flexibility. In this paper, we propose a fully configurable event-driven convolutional node with a rate saturation mechanism that can be used to implement arbitrary ConvNets on FPGAs. This node includes a convolutional processing unit and a routing element, which allows large 2D arrays to be built in which any multilayer structure can be implemented. The rate saturation mechanism emulates the refractory behavior of biological neurons, guaranteeing a minimum separation in time between consecutive events. A 4-layer ConvNet with 22 convolutional nodes trained for poker card symbol recognition has been implemented on a Spartan6 FPGA. This network has been tested with a stimulus in which 40 poker cards were observed by a Dynamic Vision Sensor (DVS) in 1 s. Different slow-down factors were applied to characterize the behavior of the system for high-speed processing. For slow stimulus play-back, a 96% recognition rate is obtained with a power consumption of 0.85 mW. At maximum play-back speed, a traffic control mechanism downsamples the input stimulus, obtaining a recognition rate above 63% when less than 20% of the input events are processed, demonstrating the robustness of the network.
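A software sketch of the node's two key mechanisms, event-driven convolution and rate saturation via a refractory period, is given below; all parameters are illustrative, and the FPGA routing element is omitted.

# Each input event adds a kernel patch to a state map; a location emits an
# output event only when its value crosses threshold AND its refractory
# period has elapsed (rate saturation); otherwise the value is retained.
import numpy as np

class ConvNode:
    def __init__(self, shape, kernel, threshold=1.0, refractory=1e-3):
        self.state = np.zeros(shape)
        self.last_spike = np.full(shape, -np.inf)
        self.kernel, self.threshold, self.refractory = kernel, threshold, refractory

    def process(self, x, y, t):
        """Apply the kernel centred at event (x, y); return output events."""
        k = self.kernel.shape[0] // 2
        h, w = self.state.shape
        out = []
        for dy in range(-k, k + 1):
            for dx in range(-k, k + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    self.state[yy, xx] += self.kernel[dy + k, dx + k]
                    if (self.state[yy, xx] >= self.threshold and
                            t - self.last_spike[yy, xx] >= self.refractory):
                        self.state[yy, xx] = 0.0
                        self.last_spike[yy, xx] = t
                        out.append((xx, yy, t))
        return out

node = ConvNode((128, 128), kernel=np.full((3, 3), 0.4))
events = node.process(64, 64, t=0.0)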
Sounding ruins: reflections on the production of an 'audio drift'.
Gallagher, Michael
2015-07-01
This article is about the use of audio media in researching places, which I term 'audio geography'. The article narrates some episodes from the production of an 'audio drift', an experimental environmental sound work designed to be listened to on a portable MP3 player whilst walking in a ruinous landscape. Reflecting on how this work functions, I argue that, as well as representing places, audio geography can shape listeners' attention and bodily movements, thereby reworking places, albeit temporarily. I suggest that audio geography is particularly apt for amplifying the haunted and uncanny qualities of places. I discuss some of the issues raised for research ethics, epistemology and spectral geographies.
DETECTOR FOR MODULATED AND UNMODULATED SIGNALS
Patterson, H.H.; Webber, G.H.
1959-08-25
An r-f signal-detecting device is described, which is embodied in a compact coaxial circuit principally comprising a detecting crystal diode and a modulating crystal diode connected in parallel. Incoming modulated r-f signals are demodulated by the detecting crystal diode to furnish an audio input to an audio amplifier. The detecting diode will not, however, produce an audio signal from an unmodulated r-f signal. In order that unmodulated signals may be detected, such incoming signals have a locally produced audio signal superimposed on them at the modulating crystal diode, and then the "induced or artificially modulated" signal is reflected toward the detecting diode, which in the process of demodulation produces an audio signal for the audio amplifier.
A digital audio/video interleaving system. [for Shuttle Orbiter
NASA Technical Reports Server (NTRS)
Richards, R. W.
1978-01-01
A method of interleaving an audio signal with its associated video signal for simultaneous transmission or recording, and the subsequent separation of the two signals, is described. Comparisons are made between the new audio signal interleaving system and the Skylab PAM audio/video interleaving system, pointing out improvements gained by using the digital audio/video interleaving system. It was found that the digital technique is the simplest, most effective, and most reliable method for interleaving audio and/or other types of data into the video signal for the Shuttle Orbiter application. Details are given of the design of a multiplexer capable of accommodating two basic data channels, each consisting of a single 31.5-kb/s digital bit stream. An adaptive slope delta modulation system is introduced to digitize audio signals, producing a high immunity of word intelligibility to channel errors, primarily due to the robust nature of the delta-modulation algorithm.
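The robustness follows from each transmitted bit encoding only a step up or down: a corrupted bit perturbs the reconstruction locally, and a leaky integrator lets the error decay. A minimal sketch of an adaptive (slope-varying) delta modulator and demodulator, with illustrative parameters that stand in for the actual Shuttle design:

# Adaptive delta modulation: the step grows on repeated same-sign bits
# (steep slopes) and shrinks on alternation; encoder and decoder run
# identical state updates, so the decoder tracks the encoder estimate.
import numpy as np

def adm_encode(x, step_min=0.01, step_max=1.0, k=1.5, leak=0.99):
    bits, est, step, last = [], 0.0, step_min, 0
    for sample in x:
        bit = 1 if sample >= est else 0
        step = min(step * k, step_max) if bit == last else max(step / k, step_min)
        est = leak * est + (step if bit else -step)
        bits.append(bit)
        last = bit
    return bits

def adm_decode(bits, step_min=0.01, step_max=1.0, k=1.5, leak=0.99):
    out, est, step, last = [], 0.0, step_min, 0
    for bit in bits:
        step = min(step * k, step_max) if bit == last else max(step / k, step_min)
        est = leak * est + (step if bit else -step)
        out.append(est)
        last = bit
    return np.array(out)

t = np.linspace(0, 1, 8000)
x = 0.5 * np.sin(2 * np.pi * 5 * t)
rec = adm_decode(adm_encode(x))       # tracks x; one flipped bit decays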
Characteristics of audio and sub-audio telluric signals
DOE Office of Scientific and Technical Information (OSTI.GOV)
Telford, W.M.
1977-06-01
Telluric current measurements in the audio and sub-audio frequency range, made in various parts of Canada and South America over the past four years, indicate that the signal amplitude is relatively uniform over 6 to 8 midday hours (LMT), except in Chile, and that the signal anisotropy is reasonably constant in azimuth.
43 CFR 8365.2-2 - Audio devices.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 43 Public Lands: Interior 2 2013-10-01 2013-10-01 false Audio devices. 8365.2-2 Section 8365.2-2..., DEPARTMENT OF THE INTERIOR RECREATION PROGRAMS VISITOR SERVICES Rules of Conduct § 8365.2-2 Audio devices. On... audio device such as a radio, television, musical instrument, or other noise producing device or...
43 CFR 8365.2-2 - Audio devices.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 43 Public Lands: Interior 2 2012-10-01 2012-10-01 false Audio devices. 8365.2-2 Section 8365.2-2..., DEPARTMENT OF THE INTERIOR RECREATION PROGRAMS VISITOR SERVICES Rules of Conduct § 8365.2-2 Audio devices. On... audio device such as a radio, television, musical instrument, or other noise producing device or...
43 CFR 8365.2-2 - Audio devices.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 43 Public Lands: Interior 2 2011-10-01 2011-10-01 false Audio devices. 8365.2-2 Section 8365.2-2..., DEPARTMENT OF THE INTERIOR RECREATION PROGRAMS VISITOR SERVICES Rules of Conduct § 8365.2-2 Audio devices. On... audio device such as a radio, television, musical instrument, or other noise producing device or...
43 CFR 8365.2-2 - Audio devices.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 43 Public Lands: Interior 2 2014-10-01 2014-10-01 false Audio devices. 8365.2-2 Section 8365.2-2..., DEPARTMENT OF THE INTERIOR RECREATION PROGRAMS VISITOR SERVICES Rules of Conduct § 8365.2-2 Audio devices. On... audio device such as a radio, television, musical instrument, or other noise producing device or...
78 FR 18416 - Sixth Meeting: RTCA Special Committee 226, Audio Systems and Equipment
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-26
... 226, Audio Systems and Equipment AGENCY: Federal Aviation Administration (FAA), U.S. Department of Transportation (DOT). ACTION: Meeting Notice of RTCA Special Committee 226, Audio Systems and Equipment. SUMMARY... 226, Audio Systems and Equipment. DATES: The meeting will be held April 15-17, 2013 from 9:00 a.m.-5...
Could Audio-Described Films Benefit from Audio Introductions? An Audience Response Study
ERIC Educational Resources Information Center
Romero-Fresco, Pablo; Fryer, Louise
2013-01-01
Introduction: Time constraints limit the quantity and type of information conveyed in audio description (AD) for films, in particular the cinematic aspects. Inspired by introductory notes for theatre AD, this study developed audio introductions (AIs) for "Slumdog Millionaire" and "Man on Wire." Each AI comprised 10 minutes of…
Audio-Vision: Audio-Visual Interaction in Desktop Multimedia.
ERIC Educational Resources Information Center
Daniels, Lee
Although sophisticated multimedia authoring applications are now available to amateur programmers, the use of audio in these programs has been inadequate. Due to the lack of research in the use of audio in instruction, there are few resources to assist the multimedia producer in using sound effectively and efficiently. This paper addresses the…
Audio Frequency Analysis in Mobile Phones
ERIC Educational Resources Information Center
Aguilar, Horacio Munguía
2016-01-01
A new experiment using mobile phones is proposed in which the audio frequency response is analyzed by using the audio port to input an external signal and obtain a measurable output. This experiment shows how the limited audio bandwidth used in mobile telephony is the main cause of the poor speech quality in this service. A brief discussion is…
A Longitudinal, Quantitative Study of Student Attitudes towards Audio Feedback for Assessment
ERIC Educational Resources Information Center
Parkes, Mitchell; Fletcher, Peter
2017-01-01
This paper reports on the findings of a three-year longitudinal study investigating the experiences of postgraduate level students who were provided with audio feedback for their assessment. Results indicated that students positively received audio feedback. Overall, students indicated a preference for audio feedback over written feedback. No…
Audio-Tutorial Instruction: A Strategy For Teaching Introductory College Geology.
ERIC Educational Resources Information Center
Fenner, Peter; Andrews, Ted F.
The rationale of audio-tutorial instruction is discussed, and the history and development of the audio-tutorial botany program at Purdue University is described. Audio-tutorial programs in geology at eleven colleges and one school are described, illustrating several ways in which programs have been developed and integrated into courses. Programs…
Audio-video decision support for patients: the documentary genre as a basis for decision aids.
Volandes, Angelo E; Barry, Michael J; Wood, Fiona; Elwyn, Glyn
2013-09-01
Decision support tools increasingly use audio-visual materials. However, disagreement exists about the use of audio-visual materials, as they may be subjective and biased. This is a literature review of the major texts in documentary film studies, undertaken to extrapolate issues of objectivity and bias from film to decision support tools. The key features of documentary films are that they attempt to portray real events and that the attempted reality is always filtered through the lens of the filmmaker. The same key features can be said of decision support tools that use audio-visual materials. Three concerns arising from documentary film studies as they apply to the use of audio-visual materials in decision support tools are whose perspective matters (stakeholder bias), how to choose among audio-visual materials (selection bias) and how to ensure objectivity (editorial bias). Decision science needs to start a debate about how audio-visual materials are to be used in decision support tools. Simply because audio-visual materials may be subjective and open to bias does not mean that we should not use them. Methods need to be found to ensure consensus around balance and editorial control, such that audio-visual materials can be used. © 2011 John Wiley & Sons Ltd.
Survey of Modular Military Vehicles: Benefits and Burdens
Dasch, Jean M.; Gorsich, David J.
2016-01-01
Modularity in military vehicle design is generally... considered a positive attribute that promotes adaptability, resilience, and cost savings. The benefits and burdens of modularity are considered by... Engineering Center, vehicles were considered based on horizontal modularity, vertical modularity, and distributed modularity. Examples were given for each
Audio Motor Training at the Foot Level Improves Space Representation.
Aggius-Vella, Elena; Campus, Claudio; Finocchietti, Sara; Gori, Monica
2017-01-01
Spatial representation is developed thanks to the integration of visual signals with the other senses. It has been shown that the lack of vision compromises the development of some spatial representations. In this study we tested the effect of a new rehabilitation device called ABBI (Audio Bracelet for Blind Interaction) to improve space representation. ABBI produces an audio feedback linked to body movement. Previous studies from our group showed that this device improves the representation of space in early blind adults around the upper part of the body. Here we evaluate whether the audio motor feedback produced by ABBI can also improve audio spatial representation of sighted individuals in the space around the legs. Forty-five blindfolded sighted subjects participated in the study, subdivided into three experimental groups. An audio space localization (front-back discrimination) task was performed twice by all groups of subjects, before and after different training conditions. One group (experimental) performed an audio-motor training with the ABBI device placed on their foot. Another group (control) performed a free motor activity without audio feedback associated with body movement. The third group (control) passively listened to the ABBI sound moved at foot level by the experimenter without producing any body movement. Results showed that only the experimental group, which performed the training with the audio-motor feedback, showed an improvement in accuracy for sound discrimination. No improvement was observed for the two control groups. These findings suggest that audio-motor training with ABBI improves audio space perception also in the space around the legs in sighted individuals. This result provides important input for the rehabilitation of space representation in the lower part of the body.
Navit, Saumya; Johri, Nikita; Khan, Suleman Abbas; Singh, Rahul Kumar; Chadha, Dheera; Navit, Pragati; Sharma, Anshul; Bahuguna, Rachana
2015-12-01
Dental anxiety is a widespread phenomenon and a concern for paediatric dentistry. The inability of children to deal with threatening dental stimuli often manifests as behaviour management problems. Nowadays, the use of non-aversive behaviour management techniques is more advocated, as these are more acceptable to parents, patients and practitioners. Therefore, the present study was conducted to find out which audio aid was the most effective in managing anxious children. The aim of the present study was to compare the efficacy of audio-distraction aids in reducing the anxiety of paediatric patients while undergoing various stressful and invasive dental procedures. The objectives were to ascertain whether audio distraction is an effective means of anxiety management and which type of audio aid is the most effective. A total of 150 children, aged 6 to 12 years, randomly selected amongst the patients who came for their first dental check-up, were placed in five groups of 30 each. These groups were the control group, the instrumental music group, the musical nursery rhymes group, the movie songs group and the audio stories group. The control group was treated under a normal set-up, and the audio groups listened to the various audio presentations during treatment. Each child had four visits. In each visit, after the procedure was completed, the anxiety levels of the children were measured by the Venham's Picture Test (VPT), Venham's Clinical Rating Scale (VCRS) and pulse rate measurement with a pulse oximeter. A significant difference was seen between all the groups for the mean pulse rate, with an increase in subsequent visits. However, no significant difference was seen in the VPT and VCRS scores between the groups. Audio aids in general reduced anxiety in comparison to the control group, and the most significant reduction in anxiety level was observed in the audio stories group. The conclusion derived from the present study was that audio distraction was effective in reducing anxiety and that audio stories were the most effective aid.
Software for Acoustic Rendering
NASA Technical Reports Server (NTRS)
Miller, Joel D.
2003-01-01
SLAB is a software system that can be run on a personal computer to simulate an acoustic environment in real time. SLAB was developed to enable computational experimentation in which one can exert low-level control over a variety of signal-processing parameters, related to spatialization, for conducting psychoacoustic studies. Among the parameters that can be manipulated are the number and position of reflections, the fidelity (that is, the number of taps in finite-impulse-response filters), the system latency, and the update rate of the filters. Another goal in the development of SLAB was to provide an inexpensive means of dynamic synthesis of virtual audio over headphones, without the need for special-purpose signal-processing hardware. SLAB has a modular, object-oriented design that affords the flexibility and extensibility needed to accommodate a variety of computational experiments and signal-flow structures. SLAB's spatial renderer has a fixed signal-flow architecture corresponding to a set of parallel signal paths from each source to a listener. This fixed architecture can be regarded as a compromise that optimizes efficiency at the expense of complete flexibility. Such a compromise is necessary, given the design goal of enabling computational psychoacoustic experimentation on inexpensive personal computers.
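One of the parallel source-to-listener signal paths that such a renderer sums can be sketched as a propagation delay followed by a short FIR filter; the sketch below assumes scipy and uses toy filter taps rather than measured head-related impulse responses.

# One render path = integer-sample delay + FIR filter; a direct path and
# one reflection are rendered in parallel and summed at the listener.
import numpy as np
from scipy.signal import lfilter

def render_path(src: np.ndarray, delay_samples: int, fir: np.ndarray) -> np.ndarray:
    delayed = np.concatenate([np.zeros(delay_samples), src])
    return lfilter(fir, [1.0], delayed)

fs = 44100
src = np.random.default_rng(3).normal(size=fs)        # 1 s of test noise
direct = render_path(src, delay_samples=44, fir=np.array([0.8, 0.1]))
reflection = render_path(src, delay_samples=441, fir=np.array([0.3, 0.2]))
out = direct[:src.size] + reflection[:src.size]       # sum of parallel paths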
Christakis, D. A.; Ramirez, J. S. B.; Ramirez, J. M.
2012-01-01
Observational studies in humans have found associations between overstimulation in infancy via excessive television viewing and subsequent deficits in cognition and attention. We developed and tested a mouse model of overstimulation whereby P10 mice were subjected to audio (70 dB) and visual stimulation (flashing lights) for six hours per day for a total of 42 days. Ten days later, cognition and behavior were tested using the following tests: Light Dark Latency, Elevated Plus Maze, Novel Object Recognition, and Barnes Maze. In all tests, overstimulated mice performed significantly worse compared to controls, suggesting increased activity and risk taking, diminished short-term memory, and decreased cognitive function. These findings suggest that excessive non-normative stimulation during critical periods of brain development can have demonstrable untoward effects on subsequent neurocognitive function. PMID:22855702
ERIC Educational Resources Information Center
Bilbro, J.; Iluzada, C.; Clark, D. E.
2013-01-01
The authors compared student perceptions of audio and written feedback in order to assess what types of students may benefit from receiving audio feedback on their essays rather than written feedback. Many instructors have previously reported the advantages they see in audio feedback, but little quantitative research has been done on how the…
Design and Usability Testing of an Audio Platform Game for Players with Visual Impairments
ERIC Educational Resources Information Center
Oren, Michael; Harding, Chris; Bonebright, Terri L.
2008-01-01
This article reports on the evaluation of a novel audio platform game that creates a spatial, interactive experience via audio cues. A pilot study with players with visual impairments, and usability testing comparing the visual and audio game versions using both sighted players and players with visual impairments, revealed that all the…
78 FR 57673 - Eighth Meeting: RTCA Special Committee 226, Audio Systems and Equipment
Federal Register 2010, 2011, 2012, 2013, 2014
2013-09-19
... Committee 226, Audio Systems and Equipment AGENCY: Federal Aviation Administration (FAA), U.S. Department of Transportation (DOT). ACTION: Meeting Notice of RTCA Special Committee 226, Audio Systems and Equipment. SUMMARY... Committee 226, Audio Systems and Equipment. DATES: The meeting will be held October 8-10, 2012 from 9:00 a.m...
77 FR 37732 - Fourteenth Meeting: RTCA Special Committee 224, Audio Systems and Equipment
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-22
... Committee 224, Audio Systems and Equipment AGENCY: Federal Aviation Administration (FAA), U.S. Department of Transportation (DOT). ACTION: Meeting Notice of RTCA Special Committee 224, Audio Systems and Equipment. SUMMARY... Committee 224, Audio Systems and Equipment. DATES: The meeting will be held July 11, 2012, from 10 a.m.-4 p...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-09-19
... Rules and Policies for the Satellite Digital Audio Radio Service in the 2310-2360 MHz Frequency Band... Digital Audio Radio Service (SDARS) Second Report and Order. The information collection requirements were... of these rule sections. See Satellite Digital Audio Radio Service (SDARS) Second Report and Order...
The Use of Asynchronous Audio Feedback with Online RN-BSN Students
ERIC Educational Resources Information Center
London, Julie E.
2013-01-01
The use of audio technology by online nursing educators is a recent phenomenon. Research has been conducted in the area of audio technology in different domains and populations, but very few researchers have focused on nursing. Preliminary results have indicated that using audio in place of text can increase student cognition and socialization.…
ERIC Educational Resources Information Center
Aleman-Centeno, Josefina R.
1983-01-01
Discusses the development and evaluation of CAVIS, which consists of an Apple microcomputer used with audiovisual dialogs. Includes research on the effects of three conditions: (1) computer with audio and visual, (2) computer with audio alone and (3) audio alone in short-term and long-term recall. (EKN)
Low-delay predictive audio coding for the HIVITS HDTV codec
NASA Astrophysics Data System (ADS)
McParland, A. K.; Gilchrist, N. H. C.
1995-01-01
The status of work relating to predictive audio coding, as part of the European project on High Quality Video Telephone and HD(TV) Systems (HIVITS), is reported. The predictive coding algorithm is developed, along with six-channel audio coding and decoding hardware. Demonstrations of the audio codec operating in conjunction with the video codec are given.
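Sample-by-sample predictive coding keeps algorithmic delay low because each sample is encoded as a quantized prediction residual with no block buffering; a toy sketch with a first-order predictor and uniform quantizer (the abstract does not disclose the actual HIVITS design):

# Closed-loop (DPCM-style) predictive codec: the encoder predicts from its
# own reconstructed output, so encoder and decoder state stay in sync.
import numpy as np

def pred_encode(x, a=0.9, q=0.01):
    prev, codes = 0.0, []
    for sample in x:
        pred = a * prev
        code = int(round((sample - pred) / q))   # quantized residual
        prev = pred + code * q                   # decoder-matched state
        codes.append(code)
    return codes

def pred_decode(codes, a=0.9, q=0.01):
    prev, out = 0.0, []
    for code in codes:
        prev = a * prev + code * q
        out.append(prev)
    return np.array(out)

x = 0.3 * np.sin(2 * np.pi * np.linspace(0, 10, 4800))
rec = pred_decode(pred_encode(x))                # within ±q/2 of x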
Zekveld, Adriana A.; Kramer, Sophia E.; Kessens, Judith M.; Vlaming, Marcel S. M. G.; Houtgast, Tammo
2009-01-01
This study examined the subjective benefit obtained from automatically generated captions during telephone-speech comprehension in the presence of babble noise. Short stories were presented by telephone either with or without captions that were generated offline by an automatic speech recognition (ASR) system. To simulate online ASR, the word accuracy (WA) level of the captions was 60% or 70% and the text was presented with a delay relative to the speech. After each test, the hearing-impaired participants (n = 20) completed the NASA Task Load Index and several rating scales evaluating the support from the captions. Participants indicated that using the erroneous text in speech comprehension was difficult, and the reported task load did not differ between the audio + text and audio-only conditions. In a follow-up experiment (n = 10), the perceived benefit of presenting captions increased when the WA level rose to 80% and 90% and the text delay was eliminated. However, in general, the task load did not decrease when captions were presented. These results suggest that the extra effort required to process the text could have been compensated for by less effort required to comprehend the speech. Future research should aim at reducing the complexity of the task to increase the willingness of hearing-impaired persons to use an assistive communication system that automatically provides captions. The current results underline the need for obtaining both objective and subjective measures of benefit when evaluating assistive communication systems. PMID:19126551
Escorihuela García, Vicente; Pitarch Ribas, María Ignacia; Llópez Carratalá, Ignacio; Latorre Monteagudo, Emilia; Morant Ventura, Antonio; Marco Algarra, Jaime
2016-01-01
Studies evaluating the effectiveness of bilateral cochlear implantation in children suggest an improvement in hearing with respect to sound localization and speech discrimination. In this paper we show the differences in audio-linguistic achievement between early bilateral and unilateral cochlear implantation, and between simultaneous and sequential bilateral implantation. We present 88 children with bilateral profound sensorineural hearing loss, treated with bilateral cochlear implantation in 32 cases and unilateral implantation in 56 cases, during the first 12 months of life (27 children) or between 12 and 24 months (61 children). We conducted a statistical comparison of both groups on audiometry, the IT-MAIS, Nottingham and LittlEars scales, and verbal tests. No significant differences in hearing thresholds and questionnaires between unilateral and bilateral implantation were detected in either the first or the second year. Verbal tests, however, do show statistically significant differences: children with bilateral cochlear implants obtain 100% recognition of disyllabic words and phrases within 2-3 years after implantation, whilst children with one implant do not obtain those results even 5 years after surgery. No differences between simultaneous and sequential bilateral implantation were detected. We emphasize the importance of ensuring good early audiological screening in order to carry out early, bilateral cochlear implantation, with the consequent development of audio-language skills similar to those of normal-hearing children. Copyright © 2015 Elsevier España, S.L.U. y Sociedad Española de Otorrinolaringología y Cirugía de Cabeza y Cuello. All rights reserved.
Behavior Selection of Mobile Robot Based on Integration of Multimodal Information
NASA Astrophysics Data System (ADS)
Chen, Bin; Kaneko, Masahide
Recently, biologically inspired robots have been developed that can direct visual attention to salient stimuli in the audiovisual environment. To realize this behavior, a common approach is to compute saliency maps that represent how strongly external information attracts the robot's visual attention, taking into account both the audiovisual information and the robot's motion status. In this paper, we present a visual attention model in which three modalities are considered: audio information, visual information, and the robot's motor status; previous research has not considered all three together. First, we introduce a 2-D density map whose values denote how much attention the robot pays to each spatial location, and model this attention density using a Bayesian network that incorporates the robot's motion status. Second, the information from the audio and visual modalities is integrated with the attention density map in integrate-and-fire neurons; the robot directs its attention to the locations where these neurons fire. Finally, the visual attention model is applied to make the robot select visual information from the environment and react to the selected content. Experimental results show that robots can acquire the visual information relevant to their behaviors by using the attention model that considers motion status, and that the robot can select behaviors to adapt to a dynamic environment as well as switch to another task according to the results of visual attention.
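A minimal sketch of the integrate-and-fire integration step described above, assuming per-location audio and visual saliency maps and an attention density map as 2-D arrays; all names, thresholds and constants are illustrative, not the paper's actual model:

import numpy as np

def integrate_fire_attention(audio_sal, visual_sal, density, threshold=1.0, leak=0.9):
    """Accumulate audio and visual saliency, weighted by the attention
    density map, in leaky integrate-and-fire units; return fired locations."""
    potential = np.zeros_like(density)
    fired = np.zeros_like(density, dtype=bool)
    for _ in range(50):                      # integration steps
        potential = leak * potential + density * (audio_sal + visual_sal)
        fired |= potential >= threshold      # units that reach threshold fire
        potential[fired] = 0.0               # reset fired units
    return fired

# toy 2-D maps: the robot would attend where combined saliency is high
h, w = 32, 32
audio = np.random.rand(h, w) * 0.1
visual = np.random.rand(h, w) * 0.1
density = np.ones((h, w)) / 2                # uniform prior attention density
print(integrate_fire_attention(audio, visual, density).sum(), "locations fired")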
Fort, Alexandra; Delpuech, Claude; Pernier, Jacques; Giard, Marie-Hélène
2002-10-01
Very recently, a number of neuroimaging studies in humans have begun to investigate how the brain integrates information from different sensory modalities to form unified percepts. Intermodal neural processing already appears to depend on the modalities of the inputs and on the nature (speech/non-speech) of the information to be combined. Yet the variety of paradigms, stimuli and techniques used makes it difficult to understand the relationships between the factors operating at the perceptual level and the underlying physiological processes. In a previous experiment, we used event-related potentials to describe the spatio-temporal organization of audio-visual interactions during a bimodal object recognition task. Here we examined the network of cross-modal interactions involved in simple detection of the same objects. The objects were defined either by unimodal auditory or visual features alone, or by the combination of the two features. As expected, subjects detected bimodal stimuli more rapidly than either unimodal stimulus. Combined analysis of potentials, scalp current densities and dipole modeling revealed several interaction patterns within the first 200 ms post-stimulus: in occipito-parietal visual areas (45-85 ms), in deep brain structures, possibly the superior colliculus (105-140 ms), and in right temporo-frontal regions (170-185 ms). These interactions differed from those found during object identification in sensory-specific areas and possibly in the superior colliculus, indicating that the neural operations governing multisensory integration depend crucially on the nature of the perceptual processes involved.
A neural network based artificial vision system for licence plate recognition.
Draghici, S
1997-02-01
This paper presents a neural network based artificial vision system able to analyze the image of a car given by a camera, locate the registration plate and recognize the registration number of the car. The paper describes in detail various practical problems encountered in implementing this particular application and the solutions used to solve them. The main features of the system presented are: controlled stability-plasticity behavior, controlled reliability threshold, both off-line and on-line learning, self-assessment of the output reliability and high reliability based on high-level multiple feedback. The system has been designed using a modular approach. Sub-modules can be upgraded and/or substituted independently, thus making the system potentially suitable for a large variety of vision applications. The OCR engine was designed as an interchangeable plug-in module. This allows the user to choose an OCR engine which is suited to the particular application and to upgrade it easily in the future. At present, there are several versions of this OCR engine. One of them is based on a fully connected feedforward artificial neural network with sigmoidal activation functions. This network can be trained with various training algorithms such as error backpropagation. An alternative OCR engine is based on the constraint based decomposition (CBD) training architecture. The system has shown the following performance (on average) on real-world data: successful plate location and segmentation about 99%, successful character recognition about 98% and successful recognition of complete registration plates about 80%.
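A minimal sketch of the kind of OCR engine the abstract names: a fully connected feedforward network with sigmoidal activations trained by error backpropagation. The layer sizes, learning rate and 16x16 glyph input are assumptions for illustration, not the system's actual parameters:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# hypothetical shapes: 16x16 glyph bitmaps in, 36 class scores (A-Z, 0-9) out
rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.1, (256, 64)); b1 = np.zeros(64)
W2 = rng.normal(0, 0.1, (64, 36));  b2 = np.zeros(36)

def forward(x):
    h = sigmoid(x @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

def train_step(x, target, lr=0.5):
    global W1, b1, W2, b2
    h, y = forward(x)
    # error backpropagation using the sigmoid derivative y * (1 - y)
    d2 = (y - target) * y * (1 - y)
    d1 = (d2 @ W2.T) * h * (1 - h)
    W2 -= lr * np.outer(h, d2); b2 -= lr * d2
    W1 -= lr * np.outer(x, d1); b1 -= lr * d1

x = rng.random(256)                 # a stand-in character bitmap
t = np.eye(36)[7]                   # one-hot target class
for _ in range(100):
    train_step(x, t)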
Geophysical phenomena classification by artificial neural networks
NASA Technical Reports Server (NTRS)
Gough, M. P.; Bruckner, J. R.
1995-01-01
Space science information systems involve accessing vast databases. There is a need for an automatic process by which properties of the whole data set can be assimilated and presented to the user. Where data are in the form of spectrograms, phenomena can be detected by pattern recognition techniques. Presented are the first results obtained by applying unsupervised Artificial Neural Networks (ANNs) to the classification of magnetospheric wave spectra. The networks used here were a simple unsupervised Hamming network run on a PC and a more sophisticated CALM network run on a Sparc workstation. The ANNs were compared in their geophysical data recognition performance. CALM networks offer such qualities as fast learning, superiority in generalizing, the ability to continuously adapt to changes in the pattern set, and the possibility of modularizing the network to express the inter-relation between phenomena and data sets. This work is the first step toward an information system interface being developed at Sussex, the Whole Information System Expert (WISE). Phenomena in the data are automatically identified and provided to the user in the form of a data occurrence morphology, the Whole Information System Data Occurrence Morphology (WISDOM), along with relationships to other parameters and phenomena.
Code of Federal Regulations, 2011 CFR
2011-10-01
... Digital Audio Broadcasting § 73.402 Definitions. (a) DAB. Digital audio broadcast stations are those radio... into multiple channels for additional audio programming uses. (g) Datacasting. Subdividing the digital...
Code of Federal Regulations, 2012 CFR
2012-10-01
... Digital Audio Broadcasting § 73.402 Definitions. (a) DAB. Digital audio broadcast stations are those radio... into multiple channels for additional audio programming uses. (g) Datacasting. Subdividing the digital...
Code of Federal Regulations, 2014 CFR
2014-10-01
... Digital Audio Broadcasting § 73.402 Definitions. (a) DAB. Digital audio broadcast stations are those radio... into multiple channels for additional audio programming uses. (g) Datacasting. Subdividing the digital...
Code of Federal Regulations, 2013 CFR
2013-10-01
... Digital Audio Broadcasting § 73.402 Definitions. (a) DAB. Digital audio broadcast stations are those radio... into multiple channels for additional audio programming uses. (g) Datacasting. Subdividing the digital...
The relative efficiency of modular and non-modular networks of different size
Tosh, Colin R.; McNally, Luke
2015-01-01
Most biological networks are modular, but previous work with small model networks has indicated that modularity does not necessarily lead to increased functional efficiency. Most biological networks are large, however, and here we examine the relative functional efficiency of modular and non-modular neural networks at a range of sizes. We conduct a detailed analysis of efficiency in networks of two size classes, 'small' and 'large', and a less detailed analysis across a range of network sizes. The former analysis reveals that while the modular network is less efficient than one of the two non-modular networks considered when networks are small, it is usually equally or more efficient than both non-modular networks when networks are large. The latter analysis shows that in networks of small to intermediate size, modular networks are much more efficient than non-modular networks of the same (low) connective density. If connective density must be kept low, to reduce energy needs for example, this could promote modularity. We have shown how relative functionality/performance scales with network size, but the precise nature of the evolutionary relationship between network size and the prevalence of modularity will depend on the costs of connectivity. PMID:25631996
DOE Office of Scientific and Technical Information (OSTI.GOV)
George, Rohini; Department of Biomedical Engineering, Virginia Commonwealth University, Richmond, VA; Chung, Theodore D.
2006-07-01
Purpose: Respiratory gating is a commercially available technology for reducing the deleterious effects of motion during imaging and treatment. The efficacy of gating is dependent on the reproducibility within and between respiratory cycles during imaging and treatment. The aim of this study was to determine whether audio-visual biofeedback can improve respiratory reproducibility by decreasing residual motion and therefore increasing the accuracy of gated radiotherapy. Methods and Materials: A total of 331 respiratory traces were collected from 24 lung cancer patients. The protocol consisted of five breathing training sessions spaced about a week apart. Within each session the patients initially breathed without any instruction (free breathing), then with audio instructions, and then with audio-visual biofeedback. Residual motion was quantified by the standard deviation of the respiratory signal within the gating window. Results: Audio-visual biofeedback significantly reduced residual motion compared with free breathing and audio instruction. Displacement-based gating had lower residual motion than phase-based gating. Little reduction in residual motion was found for duty cycles less than 30%; for duty cycles above 50% there was a sharp increase in residual motion. Conclusions: The efficiency and reproducibility of gating can be improved by incorporating audio-visual biofeedback, using a 30-50% duty cycle, gating during exhalation, and using displacement-based gating.
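A minimal sketch of the residual-motion metric defined above (the standard deviation of the respiratory signal within the gating window), assuming displacement-based gating around end-exhale; the trace, sampling rate and duty cycle are synthetic stand-ins, not the study's data:

import numpy as np

def residual_motion(signal, duty_cycle=0.3):
    """Residual motion as the standard deviation of a displacement-based
    respiratory signal inside the gating window: the duty_cycle fraction
    of samples whose displacement is closest to end-exhale (the minimum)."""
    n_gated = int(len(signal) * duty_cycle)
    gated = np.sort(signal)[:n_gated]        # samples nearest end-exhale
    return gated.std()

t = np.linspace(0, 60, 1500)                 # a 60 s trace at 25 Hz
trace = np.sin(2 * np.pi * t / 4) + 0.05 * np.random.randn(t.size)
print(residual_motion(trace, duty_cycle=0.3))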
Understanding the mechanisms of familiar voice-identity recognition in the human brain.
Maguinness, Corrina; Roswandowitz, Claudia; von Kriegstein, Katharina
2018-03-31
Humans have a remarkable skill for voice-identity recognition: most of us can remember many voices that surround us as 'unique'. In this review, we explore the computational and neural mechanisms which may support our ability to represent and recognise a unique voice-identity. We examine the functional architecture of voice-sensitive regions in the superior temporal gyrus/sulcus, and bring together findings on how these regions may interact with each other, and additional face-sensitive regions, to support voice-identity processing. We also contrast findings from studies on neurotypicals and clinical populations which have examined the processing of familiar and unfamiliar voices. Taken together, the findings suggest that representations of familiar and unfamiliar voices might dissociate in the human brain. Such an observation does not fit well with current models for voice-identity processing, which by-and-large assume a common sequential analysis of the incoming voice signal, regardless of voice familiarity. We provide a revised audio-visual integrative model of voice-identity processing which brings together traditional and prototype models of identity processing. This revised model includes a mechanism of how voice-identity representations are established and provides a novel framework for understanding and examining the potential differences in familiar and unfamiliar voice processing in the human brain. Copyright © 2018 Elsevier Ltd. All rights reserved.
Doi, Hirokazu; Shinohara, Kazuyuki
2015-03-01
Cross-modal integration of visual and auditory emotional cues is supposed to be advantageous for the accurate recognition of emotional signals. However, the neural locus of cross-modal integration between affective prosody and unconsciously presented facial expression in the neurologically intact population is still elusive. The present study examined the influences of unconsciously presented facial expressions on the event-related potentials (ERPs) in emotional prosody recognition. In the experiment, fearful, happy, and neutral faces were presented without awareness by continuous flash suppression, simultaneously with voices containing laughter and a fearful shout. The conventional peak analysis revealed that the ERPs were modulated interactively by emotional prosody and facial expression at multiple latency ranges, indicating that audio-visual integration of emotional signals takes place automatically, without conscious awareness. In addition, the global field power during the late-latency range was larger for the shout than for laughter only when a fearful face was presented unconsciously. The neural locus of this effect was localized to the left posterior fusiform gyrus, giving support to the view that this cortical region, traditionally considered to be a unisensory region for visual processing, functions as a locus of audiovisual integration of emotional signals. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Brochier, Tim; McDermott, Hugh J; McKay, Colette M
2017-06-01
In order to improve speech understanding for cochlear implant users, it is important to maximize the transmission of temporal information. The combined effects of stimulation rate and presentation level on temporal information transfer and speech understanding remain unclear. The present study systematically varied presentation level (60, 50, and 40 dBA) and stimulation rate [500 and 2400 pulses per second per electrode (pps)] in order to observe how the effect of rate on speech understanding changes for different presentation levels. Speech recognition in quiet and noise, and acoustic amplitude modulation detection thresholds (AMDTs) were measured with acoustic stimuli presented to speech processors via direct audio input (DAI). With the 500 pps processor, results showed significantly better performance for consonant-vowel nucleus-consonant words in quiet, and a reduced effect of noise on sentence recognition. However, no rate or level effect was found for AMDTs, perhaps partly because of amplitude compression in the sound processor. AMDTs were found to be strongly correlated with the effect of noise on sentence perception at low levels. These results indicate that AMDTs, at least when measured with the CP910 Freedom speech processor via DAI, explain between-subject variance of speech understanding, but do not explain within-subject variance for different rates and levels.
ERIC Educational Resources Information Center
Bergman, Daniel
2015-01-01
This study examined the effects of audio and video self-recording on preservice teachers' written reflections. Participants (n = 201) came from a secondary teaching methods course and its school-based (clinical) fieldwork. The audio group (n[subscript A] = 106) used audio recorders to monitor their teaching in fieldwork placements; the video group…
ERIC Educational Resources Information Center
Rush, S. Craig
2014-01-01
This article draws on the author's experience using qualitative video and audio analysis, most notably through use of the Transana qualitative video and audio analysis software program, as an alternative method for teaching IQ administration skills to students in a graduate psychology program. Qualitative video and audio analysis may be useful for…
Development and Assessment of Web Courses That Use Streaming Audio and Video Technologies.
ERIC Educational Resources Information Center
Ingebritsen, Thomas S.; Flickinger, Kathleen
Iowa State University, through a program called Project BIO (Biology Instructional Outreach), has been using RealAudio technology for about 2 years in college biology courses that are offered entirely via the World Wide Web. RealAudio is a type of streaming media technology that can be used to deliver audio content and a variety of other media…
Audio Distribution and Monitoring Circuit
NASA Technical Reports Server (NTRS)
Kirkland, J. M.
1983-01-01
A versatile circuit accepts and distributes TV audio signals. The three-meter audio distribution and monitoring circuit provides flexibility in monitoring, mixing, and distributing audio inputs and outputs at various signal and impedance levels. Program material can be monitored simultaneously on three channels, or a single-channel version can be built to monitor transmitted or received signal levels, drive speakers, interface with building communications, and drive long-line circuits.
Hearing You Loud and Clear: Student Perspectives of Audio Feedback in Higher Education
ERIC Educational Resources Information Center
Gould, Jill; Day, Pat
2013-01-01
The use of audio feedback for students in a full-time community nursing degree course is appraised. The aim of this mixed methods study was to examine student views on audio feedback for written assignments. Questionnaires and a focus group were used to capture student opinion of this pilot project. The majority of students valued audio feedback…
How we give personalised audio feedback after summative OSCEs.
Harrison, Christopher J; Molyneux, Adrian J; Blackwell, Sara; Wass, Valerie J
2015-04-01
Students often receive little feedback after summative objective structured clinical examinations (OSCEs) to enable them to improve their performance. Electronic audio feedback has shown promise in other educational areas. We investigated the feasibility of electronic audio feedback in OSCEs. An electronic OSCE system was designed, comprising (1) an application for iPads allowing examiners to mark in the key consultation skill domains, provide "tick-box" feedback identifying strengths and difficulties, and record voice feedback; (2) a feedback website giving students the opportunity to view/listen in multiple ways to the feedback. Acceptability of the audio feedback was investigated, using focus groups with students and questionnaires with both examiners and students. 87 (95%) students accessed the examiners' audio comments; 83 (90%) found the comments useful and 63 (68%) reported changing the way they perform a skill as a result of the audio feedback. They valued its highly personalised, relevant nature and found it much more useful than written feedback. Eighty-nine per cent of examiners gave audio feedback to all students on their stations. Although many found the method easy, lack of time was a factor. Electronic audio feedback provides timely, personalised feedback to students after a summative OSCE provided enough time is allocated to the process.
Space Shuttle Orbiter audio subsystem
NASA Technical Reports Server (NTRS)
Stewart, C. H.
1978-01-01
The selection of the audio multiplex control configuration for the Space Shuttle Orbiter audio subsystem is discussed, with special attention given to the evaluation criteria of cost, weight and complexity. The specifications and design of the subsystem are described, with detail given to the configurations of the audio terminal unit and audio central control unit (ATU, ACCU). The audio input from the ACCU, at a nominal signal level of -12.2 to 14.8 dBV at 1 kHz, was found to have a balanced source impedance and a balanced load impedance of 6000 ± 600 ohms at 1 kHz, dc isolated. The Lyndon B. Johnson Space Center (JSC) electroacoustic test laboratory, an audio engineering facility consisting of a collection of acoustic test chambers, analyzed problems of speaker and headset performance, multiplexed control data coupled with audio channels, and the effects of Orbiter cabin acoustics on the operational performance of voice communications. This system allows technical management and project engineering to address key constraining issues, such as identifying design deficiencies of the headset interface unit and assessing the Orbiter cabin performance of voice communications, which affect the subsystem development.
Implementing Audio-CASI on Windows’ Platforms
Cooley, Philip C.; Turner, Charles F.
2011-01-01
Audio computer-assisted self-interviewing (Audio-CASI) technologies have recently been shown to provide important and sometimes dramatic improvements in the quality of survey measurements. This is particularly true for measurements requiring respondents to divulge highly sensitive information such as their sexual, drug use, or other sensitive behaviors. However, DOS-based Audio-CASI systems that were designed and adopted in the early 1990s have important limitations. Most salient is the poor control they provide for manipulating the video presentation of survey questions. This article reports our experiences adapting Audio-CASI to Microsoft Windows 3.1 and Windows 95 platforms. Overall, our Windows-based system provided the desired control over video presentation and afforded other advantages, including compatibility with a much wider array of audio devices than our DOS-based Audio-CASI technologies. These advantages came at the cost of increased system requirements, including the need for both more RAM and larger hard disks. While these costs will be an issue for organizations converting large inventories of PCs to Windows Audio-CASI today, this will not be a serious constraint for organizations and individuals with small inventories of machines to upgrade or those purchasing new machines today. PMID:22081743
Audio Steganography with Embedded Text
NASA Astrophysics Data System (ADS)
Teck Jian, Chua; Chai Wen, Chuah; Rahman, Nurul Hidayah Binti Ab.; Hamid, Isredza Rahmi Binti A.
2017-08-01
Audio steganography hides a secret message inside audio. It is a technique used to secure the transmission of secret information or to hide the information's existence, and it can also provide confidentiality if the message is encrypted. To date, most steganography software, such as Mp3Stego and DeepSound, uses a block cipher such as the Advanced Encryption Standard or the Data Encryption Standard to encrypt the secret message. This is good security practice; however, the encrypted message may become too long to embed and may distort the cover audio, because a block cipher pads its output to a fixed block length, producing longer output than a stream cipher's bit-by-bit encryption. Hence, there is a case for encrypting the message with a stream cipher before embedding it into the audio. In this project, an audio steganography scheme that embeds text encrypted with the Rivest Cipher 4 (RC4) stream cipher is designed, developed and tested.
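A minimal sketch of the scheme's two stages, RC4 stream encryption followed by least-significant-bit embedding in 16-bit samples. RC4 itself is the standard algorithm; the embedding layout and function names are illustrative assumptions, not the project's actual code:

import numpy as np

def rc4_keystream(key: bytes, n: int) -> bytes:
    S = list(range(256))
    j = 0
    for i in range(256):                            # key scheduling (KSA)
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = bytearray(), 0, 0
    for _ in range(n):                              # keystream generation (PRGA)
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def embed(samples: np.ndarray, message: bytes, key: bytes) -> np.ndarray:
    cipher = bytes(m ^ k for m, k in zip(message, rc4_keystream(key, len(message))))
    bits = np.unpackbits(np.frombuffer(cipher, dtype=np.uint8))
    stego = samples.copy()
    stego[:bits.size] = (stego[:bits.size] & ~1) | bits   # overwrite sample LSBs
    return stego

audio = (np.random.randn(48000) * 1000).astype(np.int16)  # stand-in cover audio
stego = embed(audio, b"secret", b"key")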
High capacity reversible watermarking for audio by histogram shifting and predicted error expansion.
Wang, Fei; Xie, Zhaoxin; Chen, Zuo
2014-01-01
Because the scheme is reversible, the watermark embedded in an audio signal can be extracted while the original audio data are recovered losslessly. Currently, the few reversible audio watermarking algorithms are confronted with the following problems: relatively low SNR (signal-to-noise ratio) of the embedded audio; a large amount of auxiliary embedded location information; and the absence of accurate capacity control. In this paper, we present a novel reversible audio watermarking scheme based on improved prediction error expansion and histogram shifting. First, we use a differential evolution algorithm to optimize the prediction coefficients and then apply prediction error expansion to produce the stego data. Second, in order to reduce the length of the location map, we introduce a histogram shifting scheme. Meanwhile, the prediction error modification threshold for a given embedding capacity can be computed by the proposed scheme. Experiments show that this algorithm improves the SNR of the embedded audio signal and the embedding capacity, drastically reduces the location map length, and enhances capacity control.
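A minimal sketch of prediction error expansion with threshold-based shifting in the spirit described above, using a simple fixed neighbour-average predictor rather than the paper's differential-evolution-optimized coefficients; function names and the threshold are illustrative:

import numpy as np

def pee_embed(x, bits, T=2):
    """Reversible embedding by prediction error expansion: odd-indexed
    samples are predicted from their even-indexed neighbours (which stay
    untouched, so a decoder can recompute the same predictions). Errors
    with |e| < T are expanded to carry one bit; larger errors are shifted
    outward (histogram shifting) so the two ranges cannot collide."""
    y = x.astype(np.int32).copy()
    k = 0
    for i in range(1, len(x) - 1, 2):
        pred = (int(x[i - 1]) + int(x[i + 1])) // 2
        e = int(x[i]) - pred
        if -T < e < T and k < len(bits):
            y[i] = pred + 2 * e + bits[k]; k += 1   # expand and embed one bit
        elif e >= T:
            y[i] = pred + e + T                     # shift right
        else:
            y[i] = pred + e - T                     # shift left
    return y, k                                     # stego signal, bits used

audio = (np.random.randn(1000) * 1000).astype(np.int16)
stego, used = pee_embed(audio, [1, 0, 1, 1])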
Audio-visual speech experience with age influences perceived audio-visual asynchrony in speech.
Alm, Magnus; Behne, Dawn
2013-10-01
Previous research indicates that perception of audio-visual (AV) synchrony changes in adulthood. Possible explanations for these age differences include a decline in hearing acuity, a decline in cognitive processing speed, and increased experience with AV binding. The current study aims to isolate the effect of AV experience by comparing synchrony judgments from 20 young adults (20 to 30 yrs) and 20 normal-hearing middle-aged adults (50 to 60 yrs), an age range for which a decline of cognitive processing speed is expected to be minimal. When presented with AV stop consonant syllables with asynchronies ranging from 440 ms audio-lead to 440 ms visual-lead, middle-aged adults showed significantly less tolerance for audio-lead than young adults. Middle-aged adults also showed a greater shift in their point of subjective simultaneity than young adults. Natural audio-lead asynchronies are arguably more predictable than natural visual-lead asynchronies, and this predictability may render audio-lead thresholds more prone to experience-related fine-tuning.
WebGL and web audio software lightweight components for multimedia education
NASA Astrophysics Data System (ADS)
Chang, Xin; Yuksel, Kivanc; Skarbek, Władysław
2017-08-01
The paper presents the results of our recent work on the development of the contemporary computing platform DC2 for multimedia education using WebGL and Web Audio, the W3C standards. Using the literate programming paradigm, the WEBSA educational tools were developed. They offer the user (student) access to an expandable collection of WebGL shaders and Web Audio scripts. The unique feature of DC2 is the option of literate programming, offered to both the author and the reader in order to improve the interactivity of lightweight WebGL and Web Audio components. For instance, users can define source audio nodes (including synthetic sources), destination audio nodes, and nodes for audio processing such as sound wave shaping, spectral band filtering, and convolution-based modification. In the case of WebGL, besides classic graphics effects based on mesh and fractal definitions, novel image processing and analysis by shaders is offered, such as nonlinear filtering, histograms of gradients, and Bayesian classifiers.
Design and implementation of an audio indicator
NASA Astrophysics Data System (ADS)
Zheng, Shiyong; Li, Zhao; Li, Biqing
2017-04-01
This paper proposes an audio level indicator designed around a C9014 transistor amplifier stage, an LED level display, and a CD4017 decimal counter/distributor. The circuit can drive neon and holiday lights in time with an audio signal. The input audio signal is amplified by the C9014 power amplifier stage; a potentiometer taps off the amplified signal, which drives the CD4017 counter, and the counter outputs in turn drive the LED display to show the running state of the circuit. Using only a single IC (U1), this simple audio indicator produces a two-colour LED chasing effect that follows the audio signal, so the LED display gives a general picture of the variation in amplitude and frequency of the audio signal and the corresponding level. The lights can jump, fade, gather, and flash, so the circuit can be used in homes, hotels, discos, theatres, advertising and other fields, and has a wide range of uses in modern life.
Ultrasonic speech translator and communications system
Akerman, M.A.; Ayers, C.W.; Haynes, H.D.
1996-07-23
A wireless communication system undetectable by radio frequency methods for converting audio signals, including human voice, to electronic signals in the ultrasonic frequency range, transmitting the ultrasonic signal by way of acoustical pressure waves across a carrier medium, including gases, liquids, or solids, and reconverting the ultrasonic acoustical pressure waves back to the original audio signal. The ultrasonic speech translator and communication system includes an ultrasonic transmitting device and an ultrasonic receiving device. The ultrasonic transmitting device accepts as input an audio signal such as human voice input from a microphone or tape deck. The ultrasonic transmitting device frequency modulates an ultrasonic carrier signal with the audio signal producing a frequency modulated ultrasonic carrier signal, which is transmitted via acoustical pressure waves across a carrier medium such as gases, liquids or solids. The ultrasonic receiving device converts the frequency modulated ultrasonic acoustical pressure waves to a frequency modulated electronic signal, demodulates the audio signal from the ultrasonic carrier signal, and conditions the demodulated audio signal to reproduce the original audio signal at its output. 7 figs.
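The patent abstract describes frequency-modulating an ultrasonic carrier with an audio signal and recovering the audio by demodulation at the receiver. A minimal numerical sketch of that FM step, assuming a 40 kHz carrier, analytic-phase demodulation via scipy, and illustrative rates and deviation not taken from the patent:

import numpy as np
from scipy.signal import hilbert

fs = 192_000                      # sample rate high enough for ultrasound
fc = 40_000                       # ultrasonic carrier frequency (Hz)
dev = 3_000                       # peak frequency deviation (Hz)

t = np.arange(fs) / fs            # one second of samples
voice = np.sin(2 * np.pi * 440 * t)          # stand-in for a voice signal

# FM: integrate the message to obtain the instantaneous phase offset
phase = 2 * np.pi * fc * t + 2 * np.pi * dev * np.cumsum(voice) / fs
tx = np.cos(phase)                # frequency modulated ultrasonic carrier

# Demodulation: differentiate the unwrapped analytic phase, then remove
# the carrier; the result is proportional to the original audio signal
inst_phase = np.unwrap(np.angle(hilbert(tx)))
recovered = np.diff(inst_phase) * fs / (2 * np.pi) - fc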
Alderete, John; Davies, Monica
2018-04-01
This work describes a methodology for collecting speech errors from audio recordings and investigates how some of its assumptions affect data quality and composition. Speech errors of all types (sound, lexical, syntactic, etc.) were collected by eight data collectors from audio recordings of unscripted English speech. Analysis of these errors showed that: (i) different listeners find different errors in the same audio recordings, but (ii) the frequencies of error patterns are similar across listeners; (iii) errors collected "online" using on-the-spot observational techniques are more likely to be affected by perceptual biases than "offline" errors collected from audio recordings; and (iv) datasets built from audio recordings can be explored and extended in a number of ways that traditional corpus studies cannot be.
Adaptive multi-resolution Modularity for detecting communities in networks
NASA Astrophysics Data System (ADS)
Chen, Shi; Wang, Zhi-Zhong; Bao, Mei-Hua; Tang, Liang; Zhou, Ji; Xiang, Ju; Li, Jian-Ming; Yi, Chen-He
2018-02-01
Community structure is a common topological property of complex networks that has attracted much attention from various fields. Optimizing quality functions for community structure, such as Modularity optimization, is a popular strategy for community detection. Here, we introduce a general definition of Modularity, from which several classical (multi-resolution) Modularities can be derived, and then propose a kind of adaptive (multi-resolution) Modularity that can combine the advantages of different Modularities. By applying these Modularities to various synthetic and real-world networks, we study the behaviors of the methods, showing the validity and advantages of multi-resolution Modularity in community detection. The adaptive Modularity, as a kind of multi-resolution method, can naturally resolve the first-type limit of Modularity and detect communities at different scales; it can quicken the disconnecting of communities and delay the breakup of communities in heterogeneous networks, and thus it is expected to generate stable community structures in networks more effectively and to have stronger tolerance against the second-type limit of Modularity.
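A minimal sketch of Modularity with a resolution parameter, the usual starting point for the multi-resolution variants discussed above: Q(gamma) = (1/2m) * sum_ij [A_ij - gamma * k_i * k_j / (2m)] * delta(c_i, c_j). The toy graph is illustrative, not one of the paper's networks:

import numpy as np

def modularity(A, labels, gamma=1.0):
    """Newman-Girvan Modularity with resolution parameter gamma:
    gamma < 1 favours larger communities, gamma > 1 smaller ones."""
    k = A.sum(axis=1)                         # node degrees
    two_m = k.sum()                           # twice the edge count
    same = np.equal.outer(labels, labels)     # delta(c_i, c_j)
    return ((A - gamma * np.outer(k, k) / two_m) * same).sum() / two_m

# two 3-node cliques joined by a single bridge edge
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
print(modularity(A, np.array([0, 0, 0, 1, 1, 1]), gamma=1.0))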
Product modular design incorporating preventive maintenance issues
NASA Astrophysics Data System (ADS)
Gao, Yicong; Feng, Yixiong; Tan, Jianrong
2016-03-01
Traditional modular design methods lead to product maintenance problems, because the module form of a system is created according to either the function requirements or the manufacturing considerations. To solve these problems, a new modular design method is proposed that considers not only the traditional function-related attributes, but also maintenance-related ones. First, modularity parameters and modularity scenarios for product modularity are defined. Then the reliability and economic assessment models of product modularity strategies are formulated with the introduction of the effective working age of modules. A mathematical model is used to evaluate the differences among the modules of the product so that the optimal module structure can be established. After that, a multi-objective optimization problem based on metrics for the preventive maintenance interval difference degree and preventive maintenance economics is formulated for modular optimization. A multi-objective GA is utilized to rapidly approximate the Pareto set of optimal modularity strategy trade-offs between preventive maintenance cost and preventive maintenance interval difference degree. Finally, a coordinate CNC boring machine is adopted to illustrate the process of product modularity. In addition, two factorial design experiments based on the modularity parameters are constructed and analyzed. These experiments investigate the impacts of these parameters on the optimal modularity strategies and the module structure. The research proposes a new modular design method, which may help to improve the maintainability of products in modular design.
NASA Technical Reports Server (NTRS)
1974-01-01
A descriptive handbook for the audio/CTE splitter/interleaver (RCA part No. 8673734-502) is presented. This unit is designed to perform two major functions: extract audio and time data from an interleaved video/audio signal (splitter section), and provide a test interleaved video/audio/CTE signal for the system (interleaver section). It is a rack-mounting unit 7 inches high, 19 inches wide, and 20 inches deep, mounted on slides for retracting from the rack, and weighs approximately 40 pounds. The following information is provided: installation, operation, principles of operation, maintenance, schematics and parts lists.
Paper-Based Textbooks with Audio Support for Print-Disabled Students.
Fujiyoshi, Akio; Ohsawa, Akiko; Takaira, Takuya; Tani, Yoshiaki; Fujiyoshi, Mamoru; Ota, Yuko
2015-01-01
Utilizing invisible 2-dimensional codes and digital audio players with a 2-dimensional code scanner, we developed paper-based textbooks with audio support for students with print disabilities, called "multimodal textbooks." Multimodal textbooks can be read with the combination of the two modes: "reading printed text" and "listening to the speech of the text from a digital audio player with a 2-dimensional code scanner." Since multimodal textbooks look the same as regular textbooks and the price of a digital audio player is reasonable (about 30 euro), we think multimodal textbooks are suitable for students with print disabilities in ordinary classrooms.
Musical examination to bridge audio data and sheet music
NASA Astrophysics Data System (ADS)
Pan, Xunyu; Cross, Timothy J.; Xiao, Liangliang; Hei, Xiali
2015-03-01
The digitalization of audio is commonly implemented for the purpose of convenient storage and transmission of music and songs in today's digital age. Analyzing digital audio for an insightful look at a specific musical characteristic, however, can be quite challenging for various types of applications. Many existing musical analysis techniques can examine a particular piece of audio data. For example, the frequency of digital sound can be easily read and identified at a specific section in an audio file. Based on this information, we could determine the musical note being played at that instant, but what if you want to see a list of all the notes played in a song? While most existing methods provide information about a single piece of the audio data at a time, few of them can analyze the audio file on a larger scale. The research conducted in this work considers how to further utilize the examination of audio data by retaining more information from the original audio file. In practice, we develop a novel musical analysis system, Musicians Aid, to process the musical representation and examination of audio data. Musicians Aid solves the problem above by storing and analyzing the audio information as it reads it rather than discarding it. The system can provide professional musicians with an insightful look at the music they created and advance their understanding of their work. Amateur musicians could also benefit from using it solely for the purpose of obtaining feedback about a song they were attempting to play. By comparing the system's interpretation of traditional sheet music with their own playing, musicians can verify that what they played was correct; more specifically, the system can show them exactly where they went wrong and how to correct their mistakes. In addition, the application could be extended over the Internet to allow users to play music with one another and then review the audio data they produced, which would be particularly useful for teaching music lessons on the web. The developed system is evaluated with songs played on guitar, keyboard, violin, and other popular musical instruments (primarily electronic or stringed instruments). The Musicians Aid system is successful at both representing and analyzing audio data, and it is also powerful in assisting individuals interested in learning and understanding music.
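The abstract's example of mapping a measured frequency to the musical note being played can be made concrete with the standard equal-temperament/MIDI relation (note 69 = A4 = 440 Hz, 12 semitones per octave); a minimal sketch, since the system's actual note-tracking logic is not given here:

import math

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(freq_hz):
    """Map a frequency to the nearest equal-tempered note name."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return NOTES[midi % 12] + str(midi // 12 - 1)

print(freq_to_note(261.63))   # -> C4
print(freq_to_note(440.0))    # -> A4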
ERIC Educational Resources Information Center
Udo, J. P.; Acevedo, B.; Fels, D. I.
2010-01-01
Audio description (AD) has been introduced as one solution for providing people who are blind or have low vision with access to live theatre, film and television content. However, there is little research to inform the process, user preferences and presentation style. We present a study of a single live audio-described performance of Hart House…
Rosemann, Stephanie; Thiel, Christiane M
2018-07-15
Hearing loss is associated with difficulties in understanding speech, especially under adverse listening conditions. In these situations, seeing the speaker improves speech intelligibility in hearing-impaired participants. On the neuronal level, previous research has shown cross-modal plastic reorganization in the auditory cortex following hearing loss, leading to altered processing of auditory, visual and audio-visual information. However, how reduced auditory input affects audio-visual speech perception in hearing-impaired subjects is largely unknown. We here investigated the impact of mild to moderate age-related hearing loss on processing audio-visual speech using functional magnetic resonance imaging. Normal-hearing and hearing-impaired participants performed two audio-visual speech integration tasks: a sentence detection task inside the scanner and the McGurk illusion outside the scanner. Both tasks consisted of congruent and incongruent audio-visual conditions, as well as auditory-only and visual-only conditions. We found a significantly stronger McGurk illusion in the hearing-impaired participants, which indicates stronger audio-visual integration. Neurally, hearing loss was associated with an increased recruitment of frontal brain areas when processing incongruent audio-visual, auditory and also visual speech stimuli, which may reflect the increased effort to perform the task. Hearing loss modulated both the audio-visual integration strength measured with the McGurk illusion and brain activation in frontal areas in the sentence task, showing stronger integration and higher brain activation with increasing hearing loss. Incongruent compared to congruent audio-visual speech revealed an opposite brain activation pattern in the left ventral postcentral gyrus in both groups, with higher activation in hearing-impaired participants in the incongruent condition. Our results indicate that even mild to moderate hearing loss impacts audio-visual speech processing, accompanied by changes in brain activation particularly involving frontal areas; these changes are modulated by the extent of hearing loss. Copyright © 2018 Elsevier Inc. All rights reserved.
Haston, Elspeth; Cubey, Robert; Pullan, Martin; Atkins, Hannah; Harris, David J
2012-01-01
Digitisation programmes in many institutes frequently involve disparate and irregular funding, diverse selection criteria and scope, with different members of staff managing and operating the processes. These factors have influenced the decision at the Royal Botanic Garden Edinburgh to develop an integrated workflow for the digitisation of herbarium specimens which is modular and scalable, enabling a single overall workflow to be used for all digitisation projects. This integrated workflow comprises three principal elements: a specimen workflow, a data workflow and an image workflow. The specimen workflow is strongly linked to curatorial processes which will impact on the prioritisation, selection and preparation of the specimens. The importance of including a conservation element within the digitisation workflow is highlighted. The data workflow includes the concept of three main categories of collection data: label data, curatorial data and supplementary data. It is shown that each category of data has its own properties which influence the timing of data capture within the workflow. Software has been developed for the rapid capture of curatorial data, and optical character recognition (OCR) software is being used to increase the efficiency of capturing label data and supplementary data. The large number and size of the images has necessitated the inclusion of automated systems within the image workflow. PMID:22859881
Motion-seeded object-based attention for dynamic visual imagery
NASA Astrophysics Data System (ADS)
Huber, David J.; Khosla, Deepak; Kim, Kyungnam
2017-05-01
This paper describes a novel system that finds and segments "objects of interest" from dynamic imagery (video) by (1) processing each frame using an advanced motion algorithm that pulls out regions that exhibit anomalous motion, and (2) extracting the boundary of each object of interest using a biologically inspired segmentation algorithm based on feature contours. The system uses a series of modular, parallel algorithms, which allows many complicated operations to be carried out in a very short time, and can be used as a front-end to a larger system that includes object recognition and scene understanding modules. Using this method, we show 90% accuracy with fewer than 0.1 false positives per frame of video, which represents a significant improvement over detection using a baseline attention algorithm.
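A minimal sketch of the motion-seeding idea, flagging pixels whose frame-to-frame change is statistically anomalous; the z-score test and synthetic video are illustrative stand-ins for the paper's advanced motion algorithm:

import numpy as np

def motion_seeds(frames, z_thresh=3.0):
    """Flag pixels whose frame-to-frame change is anomalous relative to
    the sequence statistics; connected regions of such pixels would seed
    the object-of-interest segmentation stage."""
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0))
    mu, sigma = diffs.mean(), diffs.std() + 1e-8
    return (diffs - mu) / sigma > z_thresh     # boolean mask per frame pair

video = np.random.randint(0, 256, (10, 64, 64), dtype=np.uint8)
video[5:, 20:30, 20:30] = 255                  # a hypothetical moving patch
print(motion_seeds(video).sum(), "anomalous pixel events")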
Reconfigurable, Cognitive Software-Defined Radio
NASA Technical Reports Server (NTRS)
Bhat, Arvind
2015-01-01
Software-defined radio (SDR) technology allows radios to be reconfigured to perform different communication functions without using multiple radios to accomplish each task. Intelligent Automation, Inc., has developed SDR platforms that switch adaptively between different operation modes. The innovation works by modifying both transmit waveforms and receiver signal processing tasks. In Phase I of the project, the company developed SDR cognitive capabilities, including adaptive modulation and coding (AMC), automatic modulation recognition (AMR), and spectrum sensing. In Phase II, these capabilities were integrated into SDR platforms. The reconfigurable transceiver design employs high-speed field-programmable gate arrays, enabling multimode operation and scalable architecture. Designs are based on commercial off-the-shelf (COTS) components and are modular in nature, making it easier to upgrade individual components rather than redesigning the entire SDR platform as technology advances.
Development of modularity in the neural activity of children's brains
NASA Astrophysics Data System (ADS)
Chen, Man; Deem, Michael W.
2015-02-01
We study how modularity of the human brain changes as children develop into adults. Theory suggests that modularity can enhance the response function of a networked system subject to changing external stimuli. Thus, greater cognitive performance might be achieved for more modular neural activity, and modularity might likely increase as children develop. The value of modularity calculated from functional magnetic resonance imaging (fMRI) data is observed to increase during childhood development and peak in young adulthood. Head motion is deconvolved from the fMRI data, and it is shown that the dependence of modularity on age is independent of the magnitude of head motion. A model is presented to illustrate how modularity can provide greater cognitive performance at short times, i.e. task switching. A fitness function is extracted from the model. Quasispecies theory is used to predict how the average modularity evolves with age, illustrating the increase of modularity during development from children to adults that arises from selection for rapid cognitive function in young adults. Experiments exploring the effect of modularity on cognitive performance are suggested. Modularity may be a potential biomarker for injury, rehabilitation, or disease.
Hantke, Simone; Weninger, Felix; Kurle, Richard; Ringeval, Fabien; Batliner, Anton; Mousa, Amr El-Desoky; Schuller, Björn
2016-01-01
We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i.e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start by demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features as well as higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i.e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, reaching up to 62.3% average recall for multi-way classification of the eating condition, i.e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to a 56.2% coefficient of determination. PMID:27176486
Digital Audio Application to Short Wave Broadcasting
NASA Technical Reports Server (NTRS)
Chen, Edward Y.
1997-01-01
Digital audio is becoming prevalent not only in consumer electronics, but also in different broadcasting media. Terrestrial analog audio broadcasting in the AM and FM bands will eventually be replaced by digital systems.
Steganalysis of recorded speech
NASA Astrophysics Data System (ADS)
Johnson, Micah K.; Lyu, Siwei; Farid, Hany
2005-03-01
Digital audio provides a suitable cover for high-throughput steganography. At 16 bits per sample and sampled at a rate of 44,100 Hz, digital audio has the bit-rate to support large messages. In addition, audio is often transient and unpredictable, facilitating the hiding of messages. Using an approach similar to our universal image steganalysis, we show that hidden messages alter the underlying statistics of audio signals. Our statistical model begins by building a linear basis that captures certain statistical properties of audio signals. A low-dimensional statistical feature vector is extracted from this basis representation and used by a non-linear support vector machine for classification. We show the efficacy of this approach on LSB embedding and Hide4PGP. While no explicit assumptions about the content of the audio are made, our technique has been developed and tested on high-quality recorded speech.
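A minimal sketch of the described pipeline, assuming a DCT as the linear basis and simple coefficient statistics as the low-dimensional feature vector, with scikit-learn's SVC as the non-linear SVM; the paper's actual basis is learned from audio, so this is only a structural stand-in with synthetic data:

import numpy as np
from scipy.fft import dct
from sklearn.svm import SVC

def features(audio, frame=512):
    """Project frames onto a fixed linear basis (here a DCT) and summarise
    the coefficient statistics in a low-dimensional feature vector."""
    n = len(audio) // frame
    coef = dct(audio[:n * frame].reshape(n, frame), axis=1, norm="ortho")
    return np.hstack([coef.mean(0)[:8], coef.std(0)[:8],
                      np.abs(coef).mean(0)[:8]])

# hypothetical training set: clean covers vs. LSB-style perturbed versions
rng = np.random.default_rng(1)
covers = [rng.standard_normal(48000) for _ in range(20)]
stegos = [c + (rng.integers(0, 2, c.size) * 2 - 1) * 1e-4 for c in covers]
X = np.array([features(s) for s in covers + stegos])
y = np.array([0] * 20 + [1] * 20)
clf = SVC(kernel="rbf").fit(X, y)          # non-linear SVM classifier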
Effects of aging on audio-visual speech integration.
Huyse, Aurélie; Leybaert, Jacqueline; Berthommier, Frédéric
2014-10-01
This study investigated the impact of aging on audio-visual speech integration. A syllable identification task was presented in auditory-only, visual-only, and audio-visual congruent and incongruent conditions. Visual cues were either degraded or unmodified. Stimuli were embedded in stationary noise alternating with modulated noise. Fifteen young adults and 15 older adults participated in this study. Results showed that older adults had preserved lipreading abilities when the visual input was clear but not when it was degraded. The impact of aging on audio-visual integration also depended on the quality of the visual cues. In the visual clear condition, the audio-visual gain was similar in both groups and analyses in the framework of the fuzzy-logical model of perception confirmed that older adults did not differ from younger adults in their audio-visual integration abilities. In the visual reduction condition, the audio-visual gain was reduced in the older group, but only when the noise was stationary, suggesting that older participants could compensate for the loss of lipreading abilities by using the auditory information available in the valleys of the noise. The fuzzy-logical model of perception confirmed the significant impact of aging on audio-visual integration by showing an increased weight of audition in the older group.
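The fuzzy-logical model of perception used in the analyses above combines the auditory and visual degrees of support multiplicatively and normalises across response alternatives; a minimal two-alternative sketch with illustrative support values:

def flmp(a, v):
    """Fuzzy-logical model of perception for two response alternatives:
    auditory support a and visual support v (each in [0, 1]) combine
    multiplicatively and are normalised across the alternatives."""
    return (a * v) / (a * v + (1 - a) * (1 - v))

# strong auditory evidence combined with weaker visual support
print(flmp(0.9, 0.6))   # ~0.93: integration amplifies consistent cues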
TECHNICAL NOTE: Portable audio electronics for impedance-based measurements in microfluidics
NASA Astrophysics Data System (ADS)
Wood, Paul; Sinton, David
2010-08-01
We demonstrate the use of audio electronics-based signals to perform on-chip electrochemical measurements. Cell phones and portable music players are examples of consumer electronics that are easily operated and are ubiquitous worldwide. Audio output (play) and input (record) signals are voltage based and contain frequency and amplitude information. A cell phone, laptop soundcard and two compact audio players are compared with respect to frequency response; the laptop soundcard provides the most uniform frequency response, while the cell phone performance is found to be insufficient. The audio signals in the common portable music players and laptop soundcard operate in the range of 20 Hz to 20 kHz and are found to be applicable, as voltage input and output signals, to impedance-based electrochemical measurements in microfluidic systems. Validated impedance-based measurements of concentration (0.1-50 mM), flow rate (2-120 µL min-1) and particle detection (32 µm diameter) are demonstrated. The prevailing, lossless, wave audio file format is found to be suitable for data transmission to and from external sources, such as a centralized lab, and the cost of all hardware (in addition to audio devices) is ~10 USD. The utility demonstrated here, in combination with the ubiquitous nature of portable audio electronics, presents new opportunities for impedance-based measurements in portable microfluidic systems.
NASA Astrophysics Data System (ADS)
Park, Nam In; Kim, Seon Man; Kim, Hong Kook; Kim, Ji Woon; Kim, Myeong Bo; Yun, Su Won
In this paper, we propose a video-zoom driven audio-zoom algorithm in order to provide audio zooming effects in accordance with the degree of video zoom. The proposed algorithm is designed around a super-directive beamformer operating with a 4-channel microphone system, in conjunction with a soft masking process that considers the phase differences between microphones. The audio-zoom processed signal is obtained by multiplying the masked signal by an audio gain derived from the video-zoom level. A real-time audio-zoom system was then implemented on an ARM Cortex-A8 with a clock speed of 600 MHz after several levels of optimization, including algorithmic, C-code, and memory optimizations. To evaluate the complexity of the proposed real-time audio-zoom system, test data 21.3 seconds in length, sampled at 48 kHz, were used. The experiments show that the processing time for the proposed audio-zoom system occupies 14.6% or less of the ARM clock cycles. Experimental results obtained in a semi-anechoic chamber also show that the signal from the front direction can be amplified by approximately 10 dB relative to the other directions.
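A minimal sketch of the soft-masking and zoom-gain steps described above, reduced to two microphones; the Gaussian phase mask, gain mapping and array shapes are assumptions, not the paper's 4-channel beamformer design:

import numpy as np

def audio_zoom(stft_mics, zoom_level, max_gain_db=10.0, phase_tol=0.3):
    """Soft-mask time-frequency bins whose inter-microphone phase
    difference is small (front direction), then apply a gain that grows
    with the video-zoom level. stft_mics: (2, freq, time) complex STFTs."""
    phase_diff = np.angle(stft_mics[0] * np.conj(stft_mics[1]))
    mask = np.exp(-(phase_diff / phase_tol) ** 2)       # soft mask
    gain_db = max_gain_db * np.clip(zoom_level, 0.0, 1.0)
    return stft_mics[0] * mask * 10 ** (gain_db / 20.0)

# a hypothetical 2-mic STFT of shape (2, 257, 100)
stft = np.random.randn(2, 257, 100) + 1j * np.random.randn(2, 257, 100)
zoomed = audio_zoom(stft, zoom_level=0.5)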
Laboratory and in-flight experiments to evaluate 3-D audio display technology
NASA Technical Reports Server (NTRS)
Ericson, Mark; Mckinley, Richard; Kibbe, Marion; Francis, Daniel
1994-01-01
Laboratory and in-flight experiments were conducted to evaluate 3-D audio display technology for cockpit applications. A 3-D audio display generator was developed which digitally encodes naturally occurring direction information onto any audio signal and presents the binaural sound over headphones. The acoustic image is stabilized for head movement by use of an electromagnetic head-tracking device. In the laboratory, a 3-D audio display generator was used to spatially separate competing speech messages to improve the intelligibility of each message. Up to a 25 percent improvement in intelligibility was measured for spatially separated speech at high ambient noise levels (115 dB SPL). During the in-flight experiments, pilots reported that spatial separation of speech communications provided a noticeable improvement in intelligibility. The use of 3-D audio for target acquisition was also investigated. In the laboratory, 3-D audio enabled the acquisition of visual targets in about two seconds average response time at 17 degrees accuracy. During the in-flight experiments, pilots correctly identified ground targets 50, 75, and 100 percent of the time at separation angles of 12, 20, and 35 degrees, respectively. In general, pilot performance in the field with the 3-D audio display generator was as expected, based on data from laboratory experiments.
NASA Astrophysics Data System (ADS)
Nasrudin, Ajeng Ratih; Setiawan, Wawan; Sanjaya, Yayan
2017-05-01
This study, titled "The impact of audio-narrated animation on students' understanding in learning the human respiratory system based on gender", was conducted in the eighth grade of a junior high school. It aims to investigate differences in students' understanding and learning environment between boys' and girls' classes learning the human respiratory system using audio-narrated animation. The research method used is a quasi-experiment with a matching pre-test/post-test comparison group design. The procedures of the study are: (1) preliminary study and learning habituation using audio-narrated animation; (2) implementation of learning using audio-narrated animation and data collection; (3) analysis and discussion. The results show a significant difference in students' understanding and learning environment between boys' and girls' classes learning the human respiratory system using audio-narrated animation, both in general and specifically in achieving the learning indicators. The discussion relates the findings to the impact of audio-narrated animation, gender characteristics, and a constructivist learning environment. It can be concluded that there is a significant difference in students' understanding between boys' and girls' classes in learning the human respiratory system using audio-narrated animation. Additionally, based on the interpretation of students' responses, there is a difference in the increase of agreement levels regarding the learning environment.
Dynamics of modularity of neural activity in the brain during development
NASA Astrophysics Data System (ADS)
Deem, Michael; Chen, Man
2014-03-01
Theory suggests that more modular systems can have better response functions at short times. This theory suggests that greater cognitive performance may be achieved for more modular neural activity, and that modularity of neural activity may, therefore, likely increase with development in children. We study the relationship between age and modularity of brain neural activity in developing children. The value of modularity calculated from fMRI data is observed to increase during childhood development and peak in young adulthood. We interpret these results as evidence of selection for plasticity in the cognitive function of the human brain. We present a model to illustrate how modularity can provide greater cognitive performance at short times and enhance fast, low-level, automatic cognitive processes. Conversely, high-level, effortful, conscious cognitive processes may not benefit from modularity. We use quasispecies theory to predict how the average modularity evolves with age, given a fitness function extracted from the model. We suggest further experiments exploring the effect of modularity on cognitive performance and suggest that modularity may be a potential biomarker for injury, rehabilitation, or disease.
Real Time Implementation of an LPC Algorithm. Speech Signal Processing Research at CHI
1975-05-01
[Table of contents fragment recovered from a scanned report:] 2. SIGNAL PROCESSING HARDWARE: 2.1 Introduction; 2.2 Two-Channel Audio Signal System; 2.3 Multi-Channel Audio Signal System; 2.3.1 ... Channel Audio Signal System. List of figures (fragment): ... Messages; 1-13. Lost or Out-of-Order Message; 2-1. Block Diagram of Two-Channel Audio Signal System; 2-2. Block Diagram of Audio ...
Review of Audio Interfacing Literature for Computer-Assisted Music Instruction.
ERIC Educational Resources Information Center
Watanabe, Nan
1980-01-01
Presents a review of the literature dealing with audio devices used in computer assisted music instruction and discusses the need for research and development of reliable, cost-effective, random access audio hardware. (Author)
Yu, Jesang; Choi, Ji Hoon; Ma, Sun Young; Jeung, Tae Sig; Lim, Sangwook
2015-09-01
To compare audio-only biofeedback to conventional audiovisual biofeedback for regulating patients' respiration during four-dimensional radiotherapy, limiting damage to healthy surrounding tissues caused by organ movement. Six healthy volunteers were assisted by audiovisual or audio-only biofeedback systems to regulate their respiration. Volunteers breathed through a mask developed for this study by following computer-generated guiding curves displayed on a screen, combined with instructional sounds; they then performed breathing following instructional sounds only. The guiding signals and the volunteers' respiratory signals were logged at 20 samples per second. The standard deviations between the guiding and respiratory curves for the audiovisual and audio-only biofeedback systems were 21.55% and 23.19%, respectively; the average correlation coefficients were 0.9778 and 0.9756, respectively. A paired t-test showed no statistically significant difference in respiratory regularity between the two systems across the six volunteers. The difference between the audiovisual and audio-only biofeedback methods was therefore not significant. Audio-only biofeedback has many advantages, as patients do not require a mask and can quickly adapt to the method in the clinic.
Ultrasonic speech translator and communications system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Akerman, M.A.; Ayers, C.W.; Haynes, H.D.
1996-07-23
A wireless communication system undetectable by radio frequency methods for converting audio signals, including human voice, to electronic signals in the ultrasonic frequency range, transmitting the ultrasonic signal by way of acoustical pressure waves across a carrier medium, including gases, liquids, or solids, and reconverting the ultrasonic acoustical pressure waves back to the original audio signal. The ultrasonic speech translator and communication system includes an ultrasonic transmitting device and an ultrasonic receiving device. The ultrasonic transmitting device accepts as input an audio signal such as human voice input from a microphone or tape deck. The ultrasonic transmitting device frequency modulates an ultrasonic carrier signal with the audio signal, producing a frequency modulated ultrasonic carrier signal, which is transmitted via acoustical pressure waves across a carrier medium such as gases, liquids or solids. The ultrasonic receiving device converts the frequency modulated ultrasonic acoustical pressure waves to a frequency modulated electronic signal, demodulates the audio signal from the ultrasonic carrier signal, and conditions the demodulated audio signal to reproduce the original audio signal at its output. 7 figs.
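The modulation scheme the patent describes, an ultrasonic carrier frequency-modulated by the audio signal and recovered by demodulation at the receiver, can be sketched numerically. The carrier frequency, deviation, and sample rate below are illustrative assumptions, not values from the patent.

```python
import numpy as np
from scipy.signal import hilbert

fs = 192_000                        # sample rate high enough for ultrasound
t = np.arange(fs) / fs
audio = np.sin(2 * np.pi * 440 * t)  # stand-in for the voice input

fc = 40_000.0                       # ultrasonic carrier (assumed value)
kf = 5_000.0                        # frequency deviation in Hz (assumed)

# FM: instantaneous phase is the running integral of instantaneous frequency
phase = 2 * np.pi * np.cumsum(fc + kf * audio) / fs
tx = np.cos(phase)                  # transmitted ultrasonic pressure wave

# Receiver: track the phase of the analytic signal and differentiate it
inst_phase = np.unwrap(np.angle(hilbert(tx)))
inst_freq = np.diff(inst_phase) * fs / (2 * np.pi)
recovered = (inst_freq - fc) / kf   # approximates the original audio
```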
Ultrasonic speech translator and communications system
Akerman, M. Alfred; Ayers, Curtis W.; Haynes, Howard D.
1996-01-01
A wireless communication system undetectable by radio frequency methods for converting audio signals, including human voice, to electronic signals in the ultrasonic frequency range, transmitting the ultrasonic signal by way of acoustical pressure waves across a carrier medium, including gases, liquids, or solids, and reconverting the ultrasonic acoustical pressure waves back to the original audio signal. The ultrasonic speech translator and communication system (20) includes an ultrasonic transmitting device (100) and an ultrasonic receiving device (200). The ultrasonic transmitting device (100) accepts as input (115) an audio signal such as human voice input from a microphone (114) or tape deck. The ultrasonic transmitting device (100) frequency modulates an ultrasonic carrier signal with the audio signal producing a frequency modulated ultrasonic carrier signal, which is transmitted via acoustical pressure waves across a carrier medium such as gases, liquids or solids. The ultrasonic receiving device (200) converts the frequency modulated ultrasonic acoustical pressure waves to a frequency modulated electronic signal, demodulates the audio signal from the ultrasonic carrier signal, and conditions the demodulated audio signal to reproduce the original audio signal at its output (250).
Mining knowledge in noisy audio data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Czyzewski, A.
1996-12-31
This paper demonstrates a KDD (knowledge discovery in databases) method applied to audio data analysis; in particular, it presents the possibilities that result from replacing traditional acoustic signal analysis and processing methods with KDD algorithms when restoring audio recordings affected by strong noise.
Research into Teleconferencing
1981-02-01
Wichman (1970) found more cooperation under conditions of audio-visual communication than under conditions of audio communication alone. Laplante (1971) found ... was found for audio teleconferences. These results, taken with the results concerning group performance, seem to indicate that visual communication gives
ERIC Educational Resources Information Center
Virginia State Dept. of Agriculture and Consumer Services, Richmond, VA.
This document is an annotated bibliography of audio-visual aids in the field of consumer education, intended especially for use among low-income, elderly, and handicapped consumers. It was developed to aid consumer education program planners in finding audio-visual resources to enhance their presentations. Materials listed include 293 resources…
Papadopoulos, Konstantinos; Koustriava, Eleni; Koukourikos, Panagiotis; Kartasidou, Lefkothea; Barouti, Marialena; Varveris, Asimis; Misiou, Marina; Zacharogeorga, Timoclia; Anastasiadis, Theocharis
2017-01-01
Disorientation and wayfinding failures occur frequently when individuals with visual impairments travel through novel environments. Orientation and mobility aids can therefore be important tools in preparing safer, cognitively mapped travel. The aim of the present study was to examine whether the spatial knowledge that an individual with blindness constructs by studying a map of an urban area, delivered as a verbal description, an audio-tactile map, or an audio-haptic map, can be used to locate specific points of interest in that area. The effectiveness of the three aids relative to one another was also examined. The results of the present study highlight the effectiveness of the audio-tactile and audio-haptic maps as orientation and mobility aids, especially when compared with verbal descriptions.
Entertainment and Pacification System For Car Seat
NASA Technical Reports Server (NTRS)
Elrod, Susan Vinz (Inventor); Dabney, Richard W. (Inventor)
2006-01-01
An entertainment and pacification system for use with a child car seat has speakers mounted in the child car seat with a plurality of audio sources and an anti-noise audio system coupled to the child car seat. A controllable switching system provides for, at any given time, the selective activation of i) one of the audio sources such that the audio signal generated thereby is coupled to one or more of the speakers, and ii) the anti-noise audio system such that an ambient-noise-canceling audio signal generated thereby is coupled to one or more of the speakers. The controllable switching system can receive commands generated at one of first controls located at the child car seat and second controls located remotely with respect to the child car seat with commands generated by the second controls overriding commands generated by the first controls.
Detecting double compression of audio signal
NASA Astrophysics Data System (ADS)
Yang, Rui; Shi, Yun Q.; Huang, Jiwu
2010-01-01
MP3 is currently the most popular audio format in daily use; for example, music downloaded from the Internet and files saved by digital recorders are often in MP3 format. However, low-bitrate MP3s are often transcoded to high bitrate, since high-bitrate files carry higher commercial value. Audio recordings from digital recorders can also be doctored easily with pervasive audio-editing software. This paper presents two methods for the detection of double MP3 compression, which are essential for identifying fake-quality MP3s and for audio forensics. The proposed methods use support vector machine classifiers with feature vectors formed from the distributions of the first digits of the quantized MDCT (modified discrete cosine transform) coefficients. Extensive experiments demonstrate the effectiveness of the proposed methods. To the best of our knowledge, this work is the first to detect double compression of audio signals.
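A minimal sketch of the feature idea follows: histogram the first digits of quantized transform coefficients and feed the distributions to a support vector machine. For simplicity it substitutes a framewise DCT for the MDCT coefficients that would actually be parsed from the MP3 bitstream, and it trains on synthetic labels, so it demonstrates the plumbing rather than the forensic result.

```python
import numpy as np
from scipy.fft import dct
from sklearn.svm import SVC

def first_digit_features(signal, frame=1024):
    """Histogram of leading digits (1-9) of quantized DCT coefficients,
    standing in for the quantized MDCT coefficients of an MP3 stream."""
    n = len(signal) // frame
    coeffs = dct(signal[:n * frame].reshape(n, frame), axis=1)
    q = np.round(np.abs(coeffs)).astype(int)
    q = q[q > 0]
    digits = q // 10 ** np.floor(np.log10(q)).astype(int)  # leading digit
    hist = np.bincount(digits, minlength=10)[1:10]
    return hist / hist.sum()

# Synthetic demo only: real experiments would load single- and
# double-compressed recordings here instead of random data.
rng = np.random.default_rng(1)
X = np.array([first_digit_features(rng.standard_normal(44100))
              for _ in range(40)])
y = np.repeat([0, 1], 20)            # 0 = single, 1 = double compression
clf = SVC(kernel="rbf").fit(X, y)
```

The forensic intuition is that a single compression leaves coefficients whose leading digits roughly follow Benford's law, while recompression disturbs that distribution.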
A high efficiency PWM CMOS class-D audio power amplifier
NASA Astrophysics Data System (ADS)
Zhangming, Zhu; Lianxi, Liu; Yintang, Yang; Han, Lei
2009-02-01
Based on a differential closed-loop feedback technique and a differential pre-amplifier, a high-efficiency PWM CMOS class-D audio power amplifier is proposed. A rail-to-rail PWM comparator with a window function has been embedded in the amplifier. Design results based on the CSMC 0.5 μm CMOS process show a maximum efficiency of 90%, a PSRR of -75 dB, a power supply voltage range of 2.5-5.5 V, a THD+N of less than 0.20% at a 1 kHz input frequency, a no-load quiescent current of 2.8 mA, and a shutdown current of 0.5 μA. The active area of the class-D audio power amplifier is about 1.47 × 1.52 mm². With this performance, the amplifier can be applied to a variety of audio power systems.
Modular Courses in British Higher Education: A Critical Assessment
ERIC Educational Resources Information Center
Church, Clive
1975-01-01
The trend towards modular course structures is examined. British conceptions of modularization are compared with American interpretations of modular instruction, the former shown to be concerned almost exclusively with content, the latter attempting more radical changes in students' learning behavior. Rationales for British modular schemes are…
Musical stairs: the impact of audio feedback during stair-climbing physical therapies for children.
Khan, Ajmal; Biddiss, Elaine
2015-05-01
Enhanced biofeedback during rehabilitation therapies has the potential to provide a therapeutic environment optimally designed for neuroplasticity. This study investigates the impact of audio feedback on the achievement of a targeted therapeutic goal, namely, use of reciprocal steps. Stair-climbing therapy sessions conducted with and without audio feedback were compared in a randomized AB/BA cross-over study design. Seventeen children, aged 4-7 years, with various diagnoses participated. Reports from the participants, therapists, and a blinded observer were collected to evaluate achievement of the therapeutic goal, motivation and enjoyment during the therapy sessions. Audio feedback resulted in a 5.7% increase (p = 0.007) in reciprocal steps. Levels of participant enjoyment increased significantly (p = 0.031) and motivation was reported by child participants and therapists to be greater when audio feedback was provided. These positive results indicate that audio feedback may influence the achievement of therapeutic goals and promote enjoyment and motivation in young patients engaged in rehabilitation therapies. This study lays the groundwork for future research to determine the long term effects of audio feedback on functional outcomes of therapy. Stair-climbing is an important mobility skill for promoting independence and activities of daily life and is a key component of rehabilitation therapies for physically disabled children. Provision of audio feedback during stair-climbing therapies for young children may increase their achievement of a targeted therapeutic goal (i.e., use of reciprocal steps). Children's motivation and enjoyment of the stair-climbing therapy was enhanced when audio feedback was provided.
NASA Astrophysics Data System (ADS)
Shimizu, Dominique
Though blended-course audio feedback has been associated with several measures of course satisfaction at the postsecondary and graduate levels compared to text feedback, it may take longer to prepare, and its positive results are largely unverified in the K-12 literature. The purpose of this quantitative study was to investigate the time investment and learning impact of audio communications with 228 secondary students in a blended online learning biology unit at a central Florida public high school. A short, individualized audio message regarding the student's progress was given to each student in the audio group; similar text-based messages were given to each student in the text-based group on the same schedule; a control group received no feedback. A pretest and posttest were employed to measure learning gains in the three groups. To compare the learning gains of the two types of feedback with each other and with no feedback, a controlled, randomized, experimental design was implemented. In addition, the creation and posting of audio and text feedback communications were timed to assess whether audio feedback took longer to produce than text-only feedback. While audio feedback communications did take longer to create and post, there was no difference in learning gains as measured by posttest scores when students received audio, text-based, or no feedback. Future studies using a similar randomized, controlled experimental design are recommended to verify these results and to test whether the trend holds across a broader range of subjects, over different time frames, and with a variety of assessment types to measure student learning.
NASA Astrophysics Data System (ADS)
Siswanto, Didik
2017-12-01
Schools, as places of study, require learning media. Instructional media contain information about the lessons that teachers use to convey material. The early childhood education school Al-Kindy Pekanbaru still uses conventional learning media to teach the hijaiyah letters, but conventional media are not very attractive, so an engaging learning medium is needed that can make children interested in learning. The purpose of this study was to create a multimedia learning application for the introduction of the hijaiyah letters; its benefit is a renewal of the learning media at the Al-Kindy Pekanbaru early childhood education school. The authors developed a learning application containing basic knowledge of the hijaiyah letters, accompanied by animation, audio, and explanations of how to read the letters, in order to make the hijaiyah learning media more interactive.
Reblin, Maija; Otis-Green, Shirley; Ellington, Lee; Clayton, Margaret F
2014-12-01
Although there is growing recognition of the importance of integrating spirituality within health care, there is little evidence to guide clinicians in how to best communicate with patients and family about their spiritual or existential concerns. Using an audio-recorded home hospice nurse visit immediately following the death of a patient as a case-study, we identify spiritually-sensitive communication strategies. The nurse incorporates spirituality in her support of the family by 1) creating space to allow for the expression of emotions and spiritual beliefs and 2) encouraging meaning-based coping, including emphasizing the caregivers' strengths and reframing negative experiences. Hospice provides an excellent venue for modeling successful examples of spiritual communication. Health care professionals can learn these techniques to support patients and families in their own holistic practice. All health care professionals benefit from proficiency in spiritual communication skills. Attention to spiritual concerns ultimately improves care. © The Author(s) 2014.
NASA Astrophysics Data System (ADS)
Obermayer, Richard W.; Nugent, William A.
2000-11-01
The SPAWAR Systems Center San Diego is currently developing an advanced Multi-Modal Watchstation (MMWS); design concepts and software from this effort are intended for transition to future United States Navy surface combatants. The MMWS features multiple flat-panel displays and several modes of user interaction, including voice input and output, natural language recognition, 3D audio, and stylus and gestural inputs. In 1999, an extensive literature review was conducted on basic and applied research concerned with alerting and warning systems. After summarizing that literature, a human-computer interaction (HCI) designer's guide was prepared to support the design of an attention allocation subsystem (AAS) for the MMWS. The resultant HCI guidelines are being applied in the design of a fully interactive AAS prototype. An overview of key findings from the literature review, a proposed design methodology with illustrative examples, and an assessment of progress made in implementing the HCI designer's guide are presented.
Digitized molecular diagnostics: reading disk-based bioassays with standard computer drives.
Li, Yunchao; Ou, Lily M L; Yu, Hua-Zhong
2008-11-01
We report herein a digital signal readout protocol for screening disk-based bioassays with standard optical drives of ordinary desktop/notebook computers. Three different types of biochemical recognition reactions (biotin-streptavidin binding, DNA hybridization, and protein-protein interaction) were performed directly on a compact disk in a line array format with the help of microfluidic channel plates. Being well-correlated with the optical darkness of the binding sites (after signal enhancement by gold nanoparticle-promoted autometallography), the reading error levels of prerecorded audio files can serve as a quantitative measure of biochemical interaction. This novel readout protocol is about 1 order of magnitude more sensitive than fluorescence labeling/scanning and has the capability of examining multiplex microassays on the same disk. Because no modification to either hardware or software is needed, it promises a platform technology for rapid, low-cost, and high-throughput point-of-care biomedical diagnostics.
Dissociation of modular total hip arthroplasty at the neck-stem interface without dislocation.
Kouzelis, A; Georgiou, C S; Megas, P
2012-12-01
Modular femoral and acetabular components are now widely used, but only a few complications related to the modularity itself have been reported. We describe a case of dissociation of the modular total hip arthroplasty (THA) at the femoral neck-stem interface during walking. The possible causes of this dissociation are discussed. Successful treatment was provided with surgical revision and replacement of the modular neck components. Surgeons who use modular components in hip arthroplasties should be aware of possible early complications in which the modularity of the prostheses is the major factor of failure.
Quasispecies theory for evolution of modularity.
Park, Jeong-Man; Niestemski, Liang Ren; Deem, Michael W
2015-01-01
Biological systems are modular, and this modularity evolves over time and in different environments. A number of observations have been made of increased modularity in biological systems under increased environmental pressure. We here develop a quasispecies theory for the dynamics of modularity in populations of these systems. We show how the steady-state fitness in a randomly changing environment can be computed. We derive a fluctuation dissipation relation for the rate of change of modularity and use it to derive a relationship between rate of environmental changes and rate of growth of modularity. We also find a principle of least action for the evolved modularity at steady state. Finally, we compare our predictions to simulations of protein evolution and find them to be consistent.
Code of Federal Regulations, 2013 CFR
2013-01-01
... that conducting the conference by audio-visual telecommunication: (i) Is necessary to prevent prejudice.... If the Judge determines that a conference conducted by audio-visual telecommunication would... correspondence, the conference shall be conducted by audio-visual telecommunication unless the Judge determines...
Code of Federal Regulations, 2011 CFR
2011-01-01
... that conducting the conference by audio-visual telecommunication: (i) Is necessary to prevent prejudice.... If the Judge determines that a conference conducted by audio-visual telecommunication would... correspondence, the conference shall be conducted by audio-visual telecommunication unless the Judge determines...
47 CFR 11.54 - EAS operation during a National Level emergency.
Code of Federal Regulations, 2013 CFR
2013-10-01
... emergency, EAS Participants may transmit in lieu of the EAS audio feed an audio feed of the President's voice message from an alternative source, such as a broadcast network audio feed. [77 FR 16705, Mar. 22...
Code of Federal Regulations, 2012 CFR
2012-01-01
... that conducting the conference by audio-visual telecommunication: (i) Is necessary to prevent prejudice.... If the Judge determines that a conference conducted by audio-visual telecommunication would... correspondence, the conference shall be conducted by audio-visual telecommunication unless the Judge determines...
7 CFR 47.14 - Prehearing conferences.
Code of Federal Regulations, 2012 CFR
2012-01-01
... determines that conducting the conference by audio-visual telecommunication: (i) Is necessary to prevent.... If the examiner determines that a conference conducted by audio-visual telecommunication would... correspondence, the conference shall be conducted by audio-visual telecommunication unless the examiner...
47 CFR 11.54 - EAS operation during a National Level emergency.
Code of Federal Regulations, 2014 CFR
2014-10-01
... emergency, EAS Participants may transmit in lieu of the EAS audio feed an audio feed of the President's voice message from an alternative source, such as a broadcast network audio feed. [77 FR 16705, Mar. 22...
Code of Federal Regulations, 2014 CFR
2014-01-01
... that conducting the conference by audio-visual telecommunication: (i) Is necessary to prevent prejudice.... If the Judge determines that a conference conducted by audio-visual telecommunication would... correspondence, the conference shall be conducted by audio-visual telecommunication unless the Judge determines...
Code of Federal Regulations, 2012 CFR
2012-01-01
... which the deposition is to be conducted (telephone, audio-visual telecommunication, or by personal...) The place of the deposition; (iii) The manner of the deposition (telephone, audio-visual... shall be conducted in the manner (telephone, audio-visual telecommunication, or personal attendance of...
Code of Federal Regulations, 2010 CFR
2010-01-01
... that conducting the conference by audio-visual telecommunication: (i) Is necessary to prevent prejudice.... If the Judge determines that a conference conducted by audio-visual telecommunication would... correspondence, the conference shall be conducted by audio-visual telecommunication unless the Judge determines...
47 CFR 11.54 - EAS operation during a National Level emergency.
Code of Federal Regulations, 2012 CFR
2012-10-01
... emergency, EAS Participants may transmit in lieu of the EAS audio feed an audio feed of the President's voice message from an alternative source, such as a broadcast network audio feed. [77 FR 16705, Mar. 22...
Instrumental Landing Using Audio Indication
NASA Astrophysics Data System (ADS)
Burlak, E. A.; Nabatchikov, A. M.; Korsun, O. N.
2018-02-01
The paper proposes an audio indication method for presenting to a pilot information about the relative position of an aircraft in precision piloting tasks. The implementation of the method is presented, and the use of audio-signal parameters such as loudness, frequency, and modulation is discussed. To confirm the operability of the audio indication channel, experiments were carried out on a modern aircraft simulation facility. The simulator pilots performed instrument landings using the proposed audio method to indicate aircraft deviations from the glide path. The results proved comparable with simulated instrument landings using the traditional glideslope pointers. This encourages further development of the method for other precision piloting tasks.
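A minimal sketch of such an indication channel maps piloting deviations onto audio parameters, here pitch for vertical error and stereo panning for lateral error. The mapping and its constants are illustrative assumptions; the paper's actual loudness/frequency/modulation scheme is not reproduced here.

```python
import numpy as np

def deviation_tone(lateral_dev, vertical_dev, fs=44100, dur=0.25):
    """Map glide-path deviations (normalized to [-1, 1]) to a short
    stereo cue: vertical error shifts pitch by up to an octave,
    lateral error pans the tone.  All constants are assumptions."""
    f0 = 600.0 * 2.0 ** vertical_dev           # above path -> higher pitch
    t = np.arange(int(fs * dur)) / fs
    tone = np.sin(2 * np.pi * f0 * t)
    pan = (lateral_dev + 1) / 2                 # 0 = hard left, 1 = hard right
    return np.stack([(1 - pan) * tone, pan * tone], axis=1)

cue = deviation_tone(lateral_dev=-0.3, vertical_dev=0.5)  # high and left
```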
Realization of guitar audio effects using methods of digital signal processing
NASA Astrophysics Data System (ADS)
Buś, Szymon; Jedrzejewski, Konrad
2015-09-01
The paper is devoted to studies of the possibilities of realizing guitar audio effects by means of digital signal processing. As a result of this research, selected audio effects suited to the specifics of guitar sound were realized as a real-time system called the Digital Guitar Multi-effect. Before implementation in the system, the selected effects were investigated using a dedicated application with a graphical user interface created in the Matlab environment. In the second stage, a real-time system based on a microcontroller and an audio codec was designed and realized. The system is designed to apply audio effects to the output signal of an electric guitar.
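Two effects typical of such a multi-effect unit, a soft-clipping overdrive and a feedback delay, can be expressed compactly. The sketch below is illustrative Python, not the authors' Matlab prototype or microcontroller code.

```python
import numpy as np

def overdrive(x, drive=5.0):
    """Soft-clipping waveshaper: tanh compresses peaks, adding the
    odd harmonics characteristic of an overdriven guitar amp."""
    return np.tanh(drive * x) / np.tanh(drive)

def delay(x, fs, time_s=0.35, feedback=0.4, mix=0.5):
    """Feedback delay line, the basis of echo effects."""
    d = int(fs * time_s)
    y = np.copy(x)
    for n in range(d, len(x)):
        y[n] += feedback * y[n - d]     # each echo feeds later echoes
    return (1 - mix) * x + mix * y

fs = 44100
t = np.arange(fs) / fs
guitar = np.sin(2 * np.pi * 196 * t) * np.exp(-3 * t)  # plucked-string stand-in
out = delay(overdrive(guitar), fs)
```

A real-time implementation would process fixed-size blocks with a circular buffer instead of whole arrays, but the signal math is the same.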
Power saver circuit for audio/visual signal unit
DOE Office of Scientific and Technical Information (OSTI.GOV)
Right, R. W.
1985-02-12
A combined audio and visual signal unit with the audio and visual components actuated alternately and powered over a single cable pair in such a manner that only one of the audio and visual components is drawing power from the power supply at any given instant. Thus, the power supply is never called upon to provide more energy than that drawn by the one of the components having the greater power requirement. This is particularly advantageous when several combined audio and visual signal units are coupled in parallel on one cable pair. Typically, the signal unit may comprise a horn and a strobe light for a fire alarm signalling system.
A centralized audio presentation manager
DOE Office of Scientific and Technical Information (OSTI.GOV)
Papp, A.L. III; Blattner, M.M.
1994-05-16
The centralized audio presentation manager addresses the problems which occur when multiple programs running simultaneously attempt to use the audio output of a computer system. Time dependence of sound means that certain auditory messages must be scheduled simultaneously, which can lead to perceptual problems due to psychoacoustic phenomena. Furthermore, the combination of speech and nonspeech audio is examined; each presents its own problems of perceptibility in an acoustic environment composed of multiple auditory streams. The centralized audio presentation manager receives abstract parameterized message requests from the currently running programs, and attempts to create and present a sonic representation in the most perceptible manner through the use of a theoretically and empirically designed rule set.
Design of batch audio/video conversion platform based on JavaEE
NASA Astrophysics Data System (ADS)
Cui, Yansong; Jiang, Lianpin
2018-03-01
With the rapid development of the digital publishing industry, audio/video publishing is marked by a diversity of coding standards for audio and video files, massive data volumes, and other significant features. Faced with massive and diverse data, converting it quickly and efficiently to a unified coding format has posed great difficulties for digital publishing organizations. In view of this demand, this paper proposes a distributed online audio and video format conversion platform with a B/S structure, based on the Spring+SpringMVC+MyBatis development architecture combined with the open-source FFMPEG format conversion tool. Based on the Java language, the key technologies and strategies in the platform architecture are analyzed, and an efficient audio and video format conversion system is designed and developed, composed of a front display system, a core scheduling server, and conversion servers. The test results show that, compared with an ordinary audio and video conversion scheme, the batch conversion platform can effectively improve the conversion efficiency of audio and video files and reduce the complexity of the work. Practice has proved that the key technology discussed in this paper can be applied to large-batch file processing and has practical application value.
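The heart of such a platform is dispatching FFMPEG jobs across conversion workers. The sketch below uses Python instead of the platform's Java purely for illustration; the target codec, bitrate, and the thread pool standing in for the distributed conversion servers are all assumptions.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def convert(src: Path, dst_dir: Path) -> Path:
    """Transcode one file to a unified target format by shelling out to
    ffmpeg (the same open-source tool the platform wraps).  The AAC
    codec and 128k bitrate are illustrative choices, not the platform's."""
    dst = dst_dir / (src.stem + ".m4a")
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(src), "-c:a", "aac", "-b:a", "128k",
         str(dst)],
        check=True, capture_output=True)
    return dst

def batch_convert(sources, dst_dir, workers=4):
    # A local thread pool stands in for the platform's conversion
    # servers; the real system distributes jobs via a scheduling server.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: convert(s, dst_dir), sources))
```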
Self-organized modularization in evolutionary algorithms.
Dauscher, Peter; Uthmann, Thomas
2005-01-01
The principle of modularization has proven to be extremely successful in the field of technical applications and particularly for Software Engineering purposes. The question to be answered within the present article is whether mechanisms can also be identified within the framework of Evolutionary Computation that cause a modularization of solutions. We will concentrate on processes, where modularization results only from the typical evolutionary operators, i.e. selection and variation by recombination and mutation (and not, e.g., from special modularization operators). This is what we call Self-Organized Modularization. Based on a combination of two formalizations by Radcliffe and Altenberg, some quantitative measures of modularity are introduced. Particularly, we distinguish Built-in Modularity as an inherent property of a genotype and Effective Modularity, which depends on the rest of the population. These measures can easily be applied to a wide range of present Evolutionary Computation models. It will be shown, both theoretically and by simulation, that under certain conditions, Effective Modularity (as defined within this paper) can be a selection factor. This causes Self-Organized Modularization to take place. The experimental observations emphasize the importance of Effective Modularity in comparison with Built-in Modularity. Although the experimental results have been obtained using a minimalist toy model, they can lead to a number of consequences for existing models as well as for future approaches. Furthermore, the results suggest a complex self-amplification of highly modular equivalence classes in the case of respected relations. Since the well-known Holland schemata are just the equivalence classes of respected relations in most Simple Genetic Algorithms, this observation emphasizes the role of schemata as Building Blocks (in comparison with arbitrary subsets of the search space).
[Modular enteral nutrition in pediatrics].
Murillo Sanchís, S; Prenafeta Ferré, M T; Sempere Luque, M D
1991-01-01
Modular enteral nutrition may be a substitute for parenteral nutrition in children with different pathologies. We studied 4 children with different pathologies, selected from a group of 40 admitted to the Maternal-Children's Hospital "Valle de Hebrón" in Barcelona, who received modular enteral nutrition and were monitored daily by the Dietician Service. Modular enteral nutrition consists of modules of proteins, peptides, lipids, glucids, and mineral salts-vitamins. 1. Cranio-encephalic traumatism with loss of consciousness: feeding with a combination of parenteral and modular enteral nutrition for 7 days; in view of the tolerance and good results of the modular enteral nutrition, parenteral nutrition was suspended and modular enteral nutrition alone was used for a total of 43 days. 2. 55% burns: 36 days of hyperproteic modular enteral nutrition together with normal feeding; a more rapid recovery was achieved, with an increase in total proteins and albumin. 3. Persistent diarrhoea: 31 days of modular enteral nutrition, 5 days on parenteral nutrition alone, and 8 days on combined parenteral and modular enteral nutrition; in view of the tolerance and good results of the modular enteral nutrition, parenteral nutrition was suspended. 4. Mucoviscidosis: a total of 19 days on modular enteral nutrition, 12 exclusively and 7 as a night supplement to normal feeding. We administered protein intakes of up to 20% of total caloric intake, at concentrations of up to 1.2 calories/ml of the final preparation, always with good tolerance. Modular enteral nutrition can and should be used as a substitute for parenteral nutrition in children with different pathologies, thus preventing the complications inherent in parenteral nutrition.
Convergent evolution of modularity in metabolic networks through different community structures.
Zhou, Wanding; Nakhleh, Luay
2012-09-14
It has been reported that the modularity of metabolic networks of bacteria is closely related to the variability of their living habitats. However, given the dependency of the modularity score on the community structure, it remains unknown whether organisms achieve certain modularity via similar or different community structures. In this work, we studied the relationship between similarities in modularity scores and similarities in community structures of the metabolic networks of 1021 species. Both similarities are then compared against the genetic distances. We revisited the association between modularity and variability of the microbial living environments and extended the analysis to other aspects of their life style such as temperature and oxygen requirements. We also tested both topological and biological intuition of the community structures identified and investigated the extent of their conservation with respect to the taxonomy. We find that similar modularities are realized by different community structures. We find that such convergent evolution of modularity is closely associated with the number of (distinct) enzymes in the organism's metabolome, a consequence of different life styles of the species. We find that the order of modularity is the same as the order of the number of the enzymes under the classification based on the temperature preference but not on the oxygen requirement. Besides, inspection of modularity-based communities reveals that these communities are graph-theoretically meaningful yet not reflective of specific biological functions. From an evolutionary perspective, we find that the community structures are conserved only at the level of kingdoms. Our results call for more investigation into the interplay between evolution and modularity: how evolution shapes modularity, and how modularity affects evolution (mainly in terms of fitness and evolvability). Further, our results call for exploring new measures of modularity and network communities that better correspond to functional categorizations.
Code of Federal Regulations, 2012 CFR
2012-01-01
... (telephone, audio-visual telecommunication, or personal attendance of those who are to participate in the... that conducting the deposition by audio-visual telecommunication: (i) Is necessary to prevent prejudice... determines that a deposition conducted by audio-visual telecommunication would measurably increase the United...
47 CFR Figure 2 to Subpart N of... - Typical Audio Wave
Code of Federal Regulations, 2011 CFR
2011-10-01
... 47 Telecommunication 1 2011-10-01 2011-10-01 false Typical Audio Wave 2 Figure 2 to Subpart N of Part 2 Telecommunication FEDERAL COMMUNICATIONS COMMISSION GENERAL FREQUENCY ALLOCATIONS AND RADIO... Audio Wave ...
9 CFR 202.112 - Rule 12: Oral hearing.
Code of Federal Regulations, 2010 CFR
2010-01-01
... hearing shall be conducted by audio-visual telecommunication unless the presiding officer determines that... hearing by audio-visual telecommunication. If the presiding officer determines that a hearing conducted by audio-visual telecommunication would measurably increase the United States Department of Agriculture's...
9 CFR 202.112 - Rule 12: Oral hearing.
Code of Federal Regulations, 2011 CFR
2011-01-01
... hearing shall be conducted by audio-visual telecommunication unless the presiding officer determines that... hearing by audio-visual telecommunication. If the presiding officer determines that a hearing conducted by audio-visual telecommunication would measurably increase the United States Department of Agriculture's...
MedlinePlus FAQ: Is audio description available for videos on MedlinePlus?
... audiodescription.html Question: Is audio description available for videos on MedlinePlus? Answer: Audio description of videos helps make the content of videos accessible to ...
A Multimodal Emotion Detection System during Human-Robot Interaction
Alonso-Martín, Fernando; Malfaz, María; Sequeira, João; Gorostiza, Javier F.; Salichs, Miguel A.
2013-01-01
In this paper, a multimodal user-emotion detection system for social robots is presented. This system is intended to be used during human–robot interaction, and it is integrated as part of the overall interaction system of the robot: the Robotics Dialog System (RDS). Two modes are used to detect emotions: the voice and face expression analysis. In order to analyze the voice of the user, a new component has been developed: Gender and Emotion Voice Analysis (GEVA), which is written using the Chuck language. For emotion detection in facial expressions, the system, Gender and Emotion Facial Analysis (GEFA), has been also developed. This last system integrates two third-party solutions: Sophisticated High-speed Object Recognition Engine (SHORE) and Computer Expression Recognition Toolbox (CERT). Once these new components (GEVA and GEFA) give their results, a decision rule is applied in order to combine the information given by both of them. The result of this rule, the detected emotion, is integrated into the dialog system through communicative acts. Hence, each communicative act gives, among other things, the detected emotion of the user to the RDS so it can adapt its strategy in order to get a greater satisfaction degree during the human–robot dialog. Each of the new components, GEVA and GEFA, can also be used individually. Moreover, they are integrated with the robotic control platform ROS (Robot Operating System). Several experiments with real users were performed to determine the accuracy of each component and to set the final decision rule. The results obtained from applying this decision rule in these experiments show a high success rate in automatic user emotion recognition, improving the results given by the two information channels (audio and visual) separately. PMID:24240598
Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems
NASA Technical Reports Server (NTRS)
Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan
2010-01-01
A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise that exist in the spatial and temporal domains. As a result, automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will enable both crewmember usability and operational efficiency: it offers a fast rate of data/text entry and small overall size, can be lightweight, and frees the hands and eyes of a suited crewmember. The system components and steps include beam forming/multi-channel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaptation, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise; when used in spacesuits, the rate drops to about 33 percent. With the developed microphone-array speech-processing technologies, performance improves and the phoneme recognition accuracy rises to 44 percent. The recognizer can be further improved by combining the microphone array with HMM model adaptation techniques and by using speech samples collected inside spacesuits. In addition, arithmetic-complexity models for the major HMM-based ASR components were developed; they can help real-time ASR system designers select appropriate tasks when facing constraints on computational resources.
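The first pipeline stage listed, beam forming/multi-channel noise reduction, can be illustrated with the simplest beamformer. The delay-and-sum sketch below assumes a far-field plane wave and known microphone geometry; the system described combines array processing of this general kind with further single-channel noise reduction and model adaptation.

```python
import numpy as np

def delay_and_sum(channels, fs, mic_positions, direction):
    """Simplest beamformer: time-align each microphone for a plane wave
    arriving from `direction` (unit vector toward the talker) and
    average.  channels: (n_mics, n_samples); mic_positions: (n_mics, 3)
    in meters.  Averaging reinforces the aligned speech while
    uncorrelated noise partially cancels."""
    c = 343.0                                  # speed of sound, m/s
    proj = mic_positions @ direction           # distance along arrival axis
    lags = (proj.max() - proj) / c             # lag behind the earliest mic
    n = channels.shape[1]
    out = np.zeros(n)
    for sig, lag in zip(channels, lags):
        shift = int(round(lag * fs))
        out[:n - shift] += sig[shift:]         # advance to cancel the lag
    return out / len(channels)

# Example: 4-mic linear array with 2 cm spacing, broadside source
mics = np.array([[0.02 * i, 0.0, 0.0] for i in range(4)])
noisy = np.random.default_rng(0).standard_normal((4, 16000))
enhanced = delay_and_sum(noisy, 16000, mics, np.array([0.0, 1.0, 0.0]))
```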
Fukushima, Hidetada; Panczyk, Micah; Hu, Chengcheng; Dameff, Christian; Chikani, Vatsal; Vadeboncoeur, Tyler; Spaite, Daniel W; Bobrow, Bentley J
2017-08-29
Emergency 9-1-1 callers use a wide range of terms to describe abnormal breathing in persons with out-of-hospital cardiac arrest (OHCA). These breathing descriptors can obstruct the telephone cardiopulmonary resuscitation (CPR) process. We conducted an observational study of emergency call audio recordings linked to confirmed OHCAs in a statewide Utstein-style database. Breathing descriptors fell into 1 of 8 groups (eg, gasping, snoring). We divided the study population into groups with and without descriptors for abnormal breathing to investigate the impact of these descriptors on patient outcomes and telephone CPR process. Callers used descriptors in 459 of 2411 cases (19.0%) between October 1, 2010, and December 31, 2014. Survival outcome was better when the caller used a breathing descriptor (19.6% versus 8.8%, P <0.0001), with an odds ratio of 1.63 (95% confidence interval, 1.17-2.25). After exclusions, 379 of 459 cases were eligible for process analysis. When callers described abnormal breathing, the rates of telecommunicator OHCA recognition, CPR instruction, and telephone CPR were lower than when callers did not use a breathing descriptor (79.7% versus 93.0%, P <0.0001; 65.4% versus 72.5%, P =0.0078; and 60.2% versus 66.9%, P =0.0123, respectively). The time interval between call receipt and OHCA recognition was longer when the caller used a breathing descriptor (118.5 versus 73.5 seconds, P <0.0001). Descriptors of abnormal breathing are associated with improved outcomes but also with delays in the identification of OHCA. Familiarizing telecommunicators with these descriptors may improve the telephone CPR process including OHCA recognition for patients with increased probability of survival. © 2017 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley.
Implicit Contractive Mappings in Modular Metric and Fuzzy Metric Spaces
Hussain, N.; Salimi, P.
2014-01-01
The notion of modular metric spaces, a natural generalization of classical modulars over linear spaces such as Lebesgue, Orlicz, Musielak-Orlicz, Lorentz, Orlicz-Lorentz, and Calderon-Lozanovskii spaces, was recently introduced. In this paper we investigate the existence of fixed points of generalized α-admissible modular contractive mappings in modular metric spaces. As applications, we derive some new fixed point theorems in partially ordered modular metric spaces, Suzuki-type fixed point theorems in modular metric spaces, and new fixed point theorems for integral contractions. In the last section, we develop an important relation between fuzzy metrics and modular metrics and deduce certain new fixed point results in triangular fuzzy metric spaces. Moreover, some examples are provided to illustrate the usability of the obtained results. PMID:25003157
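For readers outside the area, the underlying notion is Chistyakov's metric modular, quoted here from the standard literature rather than from the abstract itself:

```latex
% A modular metric on a set $X$: a function
% $w : (0,\infty) \times X \times X \to [0,\infty]$, written $w_\lambda(x,y)$,
% such that for all $x, y, z \in X$:
\begin{align*}
&\text{(i)}\;\; w_\lambda(x,y) = 0 \text{ for all } \lambda > 0
  \iff x = y;\\
&\text{(ii)}\;\; w_\lambda(x,y) = w_\lambda(y,x)
  \text{ for all } \lambda > 0;\\
&\text{(iii)}\;\; w_{\lambda+\mu}(x,y) \le w_\lambda(x,z) + w_\mu(z,y)
  \text{ for all } \lambda, \mu > 0.
\end{align*}
```

Property (iii) plays the role of the triangle inequality, with the parameter λ often interpreted as time and w as velocity-like "distance per λ".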
StirMark Benchmark: audio watermarking attacks based on lossy compression
NASA Astrophysics Data System (ADS)
Steinebach, Martin; Lang, Andreas; Dittmann, Jana
2002-04-01
StirMark Benchmark is a well-known evaluation tool for watermarking robustness, and additional attacks are added to it continuously. To enable application-based evaluation, our paper addresses attacks against audio watermarks based on lossy audio compression algorithms, to be included in the test environment. We discuss the effect of different lossy compression algorithms, such as MPEG-2 Audio Layer 3, Ogg, or VQF, on a selection of audio test data. Our focus is on changes to basic characteristics of the audio data, such as spectrum or average power, and on removal of embedded watermarks. Furthermore, we compare results of different watermarking algorithms and show that lossy compression is still a challenge for most of them. There are two strategies for adding evaluation of robustness against lossy compression to StirMark Benchmark: (a) use of existing free compression algorithms; (b) implementation of a generic lossy compression simulation. We discuss how such a model can be implemented based on the results of our tests. This method is less complex, as no real psychoacoustic model has to be applied. Our model can be used for audio watermarking evaluation in numerous application fields; as an example, we describe its importance for e-commerce applications with watermarking security.
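Strategy (b), a generic lossy compression simulation, can be approximated without any psychoacoustic model by discarding weak transform coefficients frame by frame. The sketch below is one possible reading of that idea; the frame size and retention ratio are assumed parameters, and a real codec's perceptual bit allocation is deliberately not modeled.

```python
import numpy as np
from scipy.fft import dct, idct

def simulate_lossy(signal, frame=1024, keep=0.10):
    """Crude stand-in for a perceptual codec: per frame, zero all but
    the strongest `keep` fraction of DCT coefficients, then invert.
    The point, following the paper's strategy (b), is that even a
    simple simulation of coefficient loss can stress a watermark."""
    n = len(signal) // frame
    frames = dct(signal[:n * frame].reshape(n, frame),
                 axis=1, norm="ortho")
    k = int(frame * keep)
    for row in frames:
        weak = np.argsort(np.abs(row))[:-k]   # indices of weakest coeffs
        row[weak] = 0.0
    return idct(frames, axis=1, norm="ortho").reshape(-1)

# Attack evaluation: embed a watermark, run simulate_lossy() on the
# marked signal, then attempt watermark detection on the degraded copy.
```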
Padmanabhan, R; Hildreth, A J; Laws, D
2005-09-01
Pre-operative anxiety is common and often significant. Ambulatory surgery challenges our pre-operative goal of an anxiety-free patient by requiring people to be 'street ready' within a brief period of time after surgery. Recently, it has been demonstrated that music can be used successfully to relieve patient anxiety before operations, and that audio embedded with tones that create binaural beats within the brain of the listener decreases subjective levels of anxiety in patients with chronic anxiety states. We measured anxiety with the State-Trait Anxiety Inventory questionnaire and compared binaural beat audio (Binaural Group) with an identical soundtrack but without these added tones (Audio Group) and with a third group who received no specific intervention (No Intervention Group). Mean [95% confidence intervals] decreases in anxiety scores were 26.3%[19-33%] in the Binaural Group (p = 0.001 vs. Audio Group, p < 0.0001 vs. No Intervention Group), 11.1%[6-16%] in the Audio Group (p = 0.15 vs. No Intervention Group) and 3.8%[0-7%] in the No Intervention Group. Binaural beat audio has the potential to decrease acute pre-operative anxiety significantly.
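A binaural-beat soundtrack of the kind used in such studies is simple to synthesize: each ear receives a pure tone at a slightly different frequency, and the listener perceives a beat at the difference frequency. The 200 Hz carrier and 10 Hz offset below are assumed for illustration; the abstract does not state the study's tone parameters.

```python
import numpy as np
from scipy.io import wavfile

fs, dur = 44100, 10.0
f_carrier, f_beat = 200.0, 10.0     # assumed values; the 10 Hz difference
t = np.arange(int(fs * dur)) / fs   # is perceived as a 10 Hz beat

left = np.sin(2 * np.pi * f_carrier * t)
right = np.sin(2 * np.pi * (f_carrier + f_beat) * t)
stereo = np.stack([left, right], axis=1)

# 16-bit stereo WAV; must be heard over headphones, since the beat is
# created inside the auditory system, not in the air.
wavfile.write("binaural_beat.wav", fs, (stereo * 32767).astype(np.int16))
```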
Yu, Jesang; Choi, Ji Hoon; Ma, Sun Young; Jeung, Tae Sig
2015-01-01
Purpose To compare audio-only biofeedback to conventional audiovisual biofeedback for regulating patients' respiration during four-dimensional radiotherapy, limiting damage to healthy surrounding tissues caused by organ movement. Materials and Methods Six healthy volunteers were assisted by audiovisual or audio-only biofeedback systems to regulate their respiration. Volunteers breathed through a mask developed for this study by following computer-generated guiding curves displayed on a screen, combined with instructional sounds. They then performed breathing following instructional sounds only. The guiding signals and the volunteers' respiratory signals were logged at 20 samples per second. Results The standard deviations between the guiding and respiratory curves for the audiovisual and audio-only biofeedback systems were 21.55% and 23.19%, respectively; the average correlation coefficients were 0.9778 and 0.9756, respectively. A paired t-test showed no statistically significant difference in respiratory regularity between the two systems across the six volunteers. Conclusion The difference between the audiovisual and audio-only biofeedback methods was not significant. Audio-only biofeedback has many advantages, as patients do not require a mask and can quickly adapt to this method in the clinic. PMID:26484309
Modular Power Standard for Space Explorations Missions
NASA Technical Reports Server (NTRS)
Oeftering, Richard C.; Gardner, Brent G.
2016-01-01
Future human space exploration will most likely be composed of assemblies of multiple modular spacecraft elements with interconnected electrical power systems. An electrical system composed of a standardized set of modular building blocks provides significant development, integration, and operational cost advantages. The modular approach can also provide the flexibility to configure power systems to meet mission needs. A primary goal of the Advanced Exploration Systems (AES) Modular Power System (AMPS) project is to establish the Modular Power Standard needed to realize these benefits. This paper is intended to give the space exploration community a "first look" at the evolving Modular Power Standard and to invite comments and technical contributions.
Molecular Dynamics Simulations of DNA-Free and DNA-Bound TAL Effectors
Wan, Hua; Hu, Jian-ping; Li, Kang-shun; Tian, Xu-hong; Chang, Shan
2013-01-01
TAL (transcriptional activator-like) effectors (TALEs) are DNA-binding proteins, containing a modular central domain that recognizes specific DNA sequences. Recently, the crystallographic studies of TALEs revealed the structure of DNA-recognition domain. In this article, molecular dynamics (MD) simulations are employed to study two crystal structures of an 11.5-repeat TALE, in the presence and absence of DNA, respectively. The simulated results indicate that the specific binding of RVDs (repeat-variable diresidues) with DNA leads to the markedly reduced fluctuations of tandem repeats, especially at the two ends. In the DNA-bound TALE system, the base-specific interaction is formed mainly by the residue at position 13 within a TAL repeat. Tandem repeats with weak RVDs are unfavorable for the TALE-DNA binding. These observations are consistent with experimental studies. By using principal component analysis (PCA), the dominant motions are open-close movements between the two ends of the superhelical structure in both DNA-free and DNA-bound TALE systems. The open-close movements are found to be critical for the recognition and binding of TALE-DNA based on the analysis of free energy landscape (FEL). The conformational analysis of DNA indicates that the 5′ end of DNA target sequence has more remarkable structural deformability than the other sites. Meanwhile, the conformational change of DNA is likely associated with the specific interaction of TALE-DNA. We further suggest that the arrangement of N-terminal repeats with strong RVDs may help in the design of efficient TALEs. This study provides some new insights into the understanding of the TALE-DNA recognition mechanism. PMID:24130757
Digital Multicasting of Multiple Audio Streams
NASA Technical Reports Server (NTRS)
Macha, Mitchell; Bullock, John
2007-01-01
The Mission Control Center Voice Over Internet Protocol (MCC VOIP) system comprises hardware and software that effect simultaneous, nearly real-time transmission of as many as 14 different audio streams to authorized listeners via the MCC intranet and/or the Internet. The original version of the MCC VOIP system was conceived to enable flight-support personnel located in offices outside a spacecraft mission control center to monitor audio loops within the mission control center. Different versions of the MCC VOIP system could be used for a variety of public and commercial purposes, for example, to enable members of the general public to monitor one or more NASA audio streams through their home computers, to enable air-traffic supervisors to monitor communication between airline pilots and air-traffic controllers in training, or to monitor conferences among brokers in a stock exchange. At the transmitting end, the audio-distribution process begins with feeding the audio signals to analog-to-digital converters. The resulting digital streams are sent through the MCC intranet, using the user datagram protocol (UDP), to a server that converts them to encrypted data packets. The encrypted data packets are then routed to the personal computers of authorized users by use of multicasting techniques. The total data-processing load on the portion of the system upstream of and including the encryption server is the load imposed by encoding all of the audio streams, regardless of the number of listeners or the number of streams being monitored concurrently. The personal computer of an authorized listener is equipped with special-purpose MCC audio-player software. When the user launches the program, the user is prompted to provide identification and a password. In one of two access-control provisions, the program is hard-coded to validate the user's identity and password against a list maintained on a domain-controller computer at the MCC; in the other, the program verifies that the user is authorized to have access to the audio streams. Once both access-control checks are completed, the audio software presents a graphical display that includes audio-stream-selection buttons and volume-control sliders. The user can select all or any subset of the available audio streams and can adjust the volume of each stream independently. The audio-player program spawns a "read" process for the selected stream(s); the spawned process sends a "multicast-join" request for the selected streams to the router(s), which respond by sending the encrypted multicast packets to the spawned process. The spawned process receives the encrypted multicast packets and sends decrypted packets to the audio-driver software. As the user changes volume or muting settings, interrupts are sent to the spawned process to change the corresponding attributes sent to the audio-driver software. The total latency of this system, the time from origination of the audio signals to generation of sound at a listener's computer, lies between four and six seconds.
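The multicast step described, which keeps the sender's load independent of audience size, can be sketched with standard sockets. The group address and port below are assumptions, and the encryption and access-control layers of the MCC system are omitted.

```python
import socket
import struct

GROUP, PORT = "239.1.2.3", 5004      # assumed multicast group and port

def send_frames(frames):
    """Push audio frames to the multicast group; every joined listener
    receives a copy, so one transmission serves any number of clients."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
    for frame in frames:             # frame: bytes of one encoded chunk
        sock.sendto(frame, (GROUP, PORT))

def listen():
    """Generator yielding raw frames; a real client would decrypt and
    hand them to the audio driver, as the MCC player does."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # Ask the kernel and routers to add this host to the multicast group
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    while True:
        frame, _ = sock.recvfrom(4096)
        yield frame
```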
47 CFR 10.520 - Common audio attention signal.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 47 Telecommunication 1 2010-10-01 2010-10-01 false Common audio attention signal. 10.520 Section 10.520 Telecommunication FEDERAL COMMUNICATIONS COMMISSION GENERAL COMMERCIAL MOBILE ALERT SYSTEM Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment...
Code of Federal Regulations, 2012 CFR
2012-01-01
... hearing to be conducted by telephone or audio-visual telecommunication; (10) Require each party to provide... prior to any deposition to be conducted by telephone or audio-visual telecommunication; (11) Require that any hearing to be conducted by telephone or audio-visual telecommunication be conducted at...
Code of Federal Regulations, 2014 CFR
2014-04-01
... Relations DEPARTMENT OF STATE PUBLIC DIPLOMACY AND EXCHANGES WORLD-WIDE FREE FLOW OF AUDIO-VISUAL MATERIALS... certification of United States produced audio-visual materials under the provisions of the Beirut Agreement... staff with authority to issue Certificates or Importation Documents. Audio-visual materials—means: (1...
22 CFR 61.3 - Certification and authentication criteria.
Code of Federal Regulations, 2014 CFR
2014-04-01
... AUDIO-VISUAL MATERIALS § 61.3 Certification and authentication criteria. (a) The Department shall certify or authenticate audio-visual materials submitted for review as educational, scientific and... of the material. (b) The Department will not certify or authenticate any audio-visual material...
Code of Federal Regulations, 2013 CFR
2013-04-01
... Relations DEPARTMENT OF STATE PUBLIC DIPLOMACY AND EXCHANGES WORLD-WIDE FREE FLOW OF AUDIO-VISUAL MATERIALS... certification of United States produced audio-visual materials under the provisions of the Beirut Agreement... staff with authority to issue Certificates or Importation Documents. Audio-visual materials—means: (1...
22 CFR 61.3 - Certification and authentication criteria.
Code of Federal Regulations, 2013 CFR
2013-04-01
... AUDIO-VISUAL MATERIALS § 61.3 Certification and authentication criteria. (a) The Department shall certify or authenticate audio-visual materials submitted for review as educational, scientific and... of the material. (b) The Department will not certify or authenticate any audio-visual material...
9 CFR 202.110 - Rule 10: Prehearing conference.
Code of Federal Regulations, 2013 CFR
2013-01-01
... conference by audio-visual telecommunication: (i) Is necessary to prevent prejudice to a party; (ii) Is... presiding officer determines that a prehearing conference conducted by audio-visual telecommunication would... conducted by audio-visual telecommunication unless the presiding officer determines that conducting the...
9 CFR 202.110 - Rule 10: Prehearing conference.
Code of Federal Regulations, 2010 CFR
2010-01-01
... conference by audio-visual telecommunication: (i) Is necessary to prevent prejudice to a party; (ii) Is... presiding officer determines that a prehearing conference conducted by audio-visual telecommunication would... conducted by audio-visual telecommunication unless the presiding officer determines that conducting the...
Code of Federal Regulations, 2012 CFR
2012-04-01
... Relations DEPARTMENT OF STATE PUBLIC DIPLOMACY AND EXCHANGES WORLD-WIDE FREE FLOW OF AUDIO-VISUAL MATERIALS... certification of United States produced audio-visual materials under the provisions of the Beirut Agreement... staff with authority to issue Certificates or Importation Documents. Audio-visual materials—means: (1...
Code of Federal Regulations, 2011 CFR
2011-01-01
... hearing to be conducted by telephone or audio-visual telecommunication; (10) Require each party to provide... prior to any deposition to be conducted by telephone or audio-visual telecommunication; (11) Require that any hearing to be conducted by telephone or audio-visual telecommunication be conducted at...
22 CFR 61.3 - Certification and authentication criteria.
Code of Federal Regulations, 2012 CFR
2012-04-01
... AUDIO-VISUAL MATERIALS § 61.3 Certification and authentication criteria. (a) The Department shall certify or authenticate audio-visual materials submitted for review as educational, scientific and... of the material. (b) The Department will not certify or authenticate any audio-visual material...
Audio-Tutorial Instruction in Medicine.
ERIC Educational Resources Information Center
Boyle, Gloria J.; Herrick, Merlyn C.
This progress report concerns an audio-tutorial approach used at the University of Missouri-Columbia School of Medicine. Instructional techniques such as slide-tape presentations, compressed speech audio tapes, computer-assisted instruction (CAI), motion pictures, television, microfiche, and graphic and printed materials have been implemented,…
Spatial Audio on the Web: Or Why Can't I hear Anything Over There?
NASA Technical Reports Server (NTRS)
Wenzel, Elizabeth M.; Schlickenmaier, Herbert (Technical Monitor); Johnson, Gerald (Technical Monitor); Frey, Mary Anne (Technical Monitor); Schneider, Victor S. (Technical Monitor); Ahumada, Albert J. (Technical Monitor)
1997-01-01
Auditory complexity, freedom of movement, and interactivity are not always possible in a "true" virtual environment, much less in web-based audio. However, many of the perceptual and engineering constraints (and frustrations) that researchers, engineers, and listeners have experienced with virtual audio remain relevant to spatial audio on the web. My talk will discuss some of these engineering constraints and their perceptual consequences, and attempt to relate these issues to implementation on the web.
A review of lossless audio compression standards and algorithms
NASA Astrophysics Data System (ADS)
Muin, Fathiah Abdul; Gunawan, Teddy Surya; Kartiwi, Mira; Elsheikh, Elsheikh M. A.
2017-09-01
Over the years, lossless audio compression has gained popularity as researchers and businesses have become more aware of the need for better quality and growing storage demands. This paper analyses the various lossless audio coding algorithms and standards used and available in the market, focusing on Linear Predictive Coding (LPC) because of its popularity and robustness in audio compression; other prediction methods are compared to verify this. Advanced representations of LPC, such as LSP decomposition techniques, are also discussed.
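For context on the LPC focus: an order-p predictor estimates each sample as a linear combination of the p preceding samples, and lossless codecs entropy-code the prediction residual, which has far lower energy than the signal itself. Below is a minimal sketch of the classic autocorrelation method with the Levinson-Durbin recursion; the frame length, predictor order, and test signal are illustrative choices, not values from the paper.

```python
import numpy as np

def lpc(frame: np.ndarray, order: int) -> np.ndarray:
    """Estimate LPC coefficients [1, a1, ..., ap] via Levinson-Durbin."""
    # Autocorrelation of the frame up to lag `order`.
    r = np.array([frame[: len(frame) - k] @ frame[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this step of the recursion.
        acc = r[i] + a[1:i] @ r[1:i][::-1]
        k = -acc / err
        a[1:i] = a[1:i] + k * a[1:i][::-1]
        a[i] = k
        err *= 1.0 - k * k
    return a

# Toy usage: a noisy 1 kHz sine at 8 kHz sampling, order-8 predictor.
fs = 8000
t = np.arange(256) / fs
x = np.sin(2 * np.pi * 1000 * t) + 0.01 * np.random.randn(256)
a = lpc(x, order=8)
residual = np.convolve(a, x)[: len(x)]  # prediction error signal
print(residual[16:].std())  # small once the predictor has "warmed up"
```

The residual is what a lossless codec would quantize-free entropy-code; the better the predictor fits the signal, the fewer bits the residual costs.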
Bishop, Laura; Goebl, Werner
2017-07-21
Ensemble musicians often exchange visual cues in the form of body gestures (e.g., rhythmic head nods) to help coordinate piece entrances. These cues must communicate beats clearly, especially if the piece requires interperformer synchronization of the first chord. This study aimed to (1) replicate prior findings suggesting that points of peak acceleration in head gestures communicate beat position and (2) identify the kinematic features of head gestures that encourage successful synchronization. It was expected that increased precision of the alignment between leaders' head gestures and first note onsets, increased gesture smoothness, magnitude, and prototypicality, and increased leader ensemble/conducting experience would improve gesture synchronizability. Audio/MIDI and motion capture recordings were made of piano duos performing short musical passages under assigned leader/follower conditions. The leader of each trial listened to a particular tempo over headphones, then cued their partner in at the given tempo, without speaking. A subset of motion capture recordings were then presented as point-light videos with corresponding audio to a sample of musicians who tapped in synchrony with the beat. Musicians were found to align their first taps with the period of deceleration following acceleration peaks in leaders' head gestures, suggesting that acceleration patterns communicate beat position. Musicians' synchronization with leaders' first onsets improved as cueing gesture smoothness and magnitude increased and prototypicality decreased. Synchronization was also more successful with more experienced leaders' gestures. These results might be applied to interactive systems using gesture recognition or reproduction for music-making tasks (e.g., intelligent accompaniment systems).
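As a rough illustration of the kinematic feature at the heart of this study, the sketch below extracts acceleration peaks from a one-dimensional head-position trace, the kind of signal a gesture-recognition front end for an intelligent accompaniment system might consume. The sampling rate, minimum peak spacing, and prominence threshold are assumptions made for illustration, not values reported by the authors.

```python
import numpy as np
from scipy.signal import find_peaks

def acceleration_peaks(position: np.ndarray, fs: float,
                       min_gap_s: float = 0.25) -> np.ndarray:
    """Return sample indices of acceleration peaks in a 1-D head-position trace.

    position: head position (e.g., vertical, in metres) sampled at fs Hz.
    min_gap_s: minimum spacing between reported peaks (illustrative default).
    """
    # Two numerical derivatives approximate acceleration; the factor fs
    # converts per-sample differences to per-second rates.
    accel = np.gradient(np.gradient(position) * fs) * fs
    magnitude = np.abs(accel)
    # Keep only prominent peaks at least min_gap_s apart.
    peaks, _ = find_peaks(magnitude,
                          distance=max(1, int(min_gap_s * fs)),
                          prominence=magnitude.std())
    return peaks
```

Per the study's finding, followers aligned their taps with the deceleration phase just after such peaks, so a system reproducing cueing gestures would place the intended beat slightly after each detected peak.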